10 If & Functions
Updated for S2026? No.
10.1 If statements
The conditional statements we’ve worked with so far have evaluated our data row by row: if this condition holds in this row, take this action. Sometimes we want to extend this and say: if a certain condition holds, run this block of code.
We can do this via if statements. Just as with a conditional on a single line of code, we start with a logical statement, one that produces a single TRUE or FALSE, and R will only run the code if the statement is TRUE.
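The word "single" matters here. A comparison on a whole vector returns one TRUE or FALSE per element, which is what we want when working row by row but not what an if statement expects (recent versions of R stop with an error if the condition has more than one value). A minimal sketch, using made-up numbers and the hypothetical names `votes`, `big`, `any.big`, and `all.big`, of how `any()` and `all()` collapse many values into one:

```r
# Made-up vote counts, purely for illustration
votes <- c(3477, 2727, 3130)

# A comparison on a vector gives one TRUE/FALSE per element
big <- votes > 3000
print(big)

# any() and all() each collapse the vector to a single TRUE or FALSE
any.big <- any(big)   # is at least one element TRUE?
all.big <- all(big)   # is every element TRUE?

# A single TRUE/FALSE is safe to use as the condition of an if statement
if (any.big) {
  print("At least one count is above 3000")
}
```
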
For example, here is a basic if statement:
operation <- "add"
if (operation == "add") {
print("I'm going to add:")
4+4
}
#> [1] "I'm going to add:"
#> [1] 8
But if we do the following, nothing is printed:
operation <- "multiply"
if (operation == "add") {
print("I'm going to add:")
4+4
}
Nothing happened here, and that is by design. R thought: in order for me to run this code chunk, this logical statement has to be TRUE. It’s not, so I’ll skip it.
But if we set up a second if statement it will work:
if (operation == "multiply") {
print("I'm going to multiply:")
4*4
}
#> [1] "I'm going to multiply:"
#> [1] 16
We can even run both at the same time and only get the one that evaluates as TRUE.
operation <- "multiply"
if (operation == "add") {
print("I'm going to add:")
4+4
}
if (operation == "multiply") {
print("I'm going to multiply:")
4*4
}
#> [1] "I'm going to multiply:"
#> [1] 16
operation <- "add"
if (operation == "add") {
print("I'm going to add:")
4+4
}
#> [1] "I'm going to add:"
#> [1] 8
if (operation == "multiply") {
print("I'm going to multiply:")
4*4
}
Right now our if statement says: if the condition is TRUE, run the code; if it’s not, do nothing. But oftentimes we want to say: if the condition is TRUE, run this code; if it’s not, run this other code.
We can accomplish this by adding an “else”:
if(2+2==4){
print("code chunk 1")
} else {
print("code chunk 2")
}
#> [1] "code chunk 1"
if(2+2==5){
print("code chunk 1")
} else {
print("code chunk 2")
}
#> [1] "code chunk 2"
So if we wanted a chunk of code that adds if we tell it to, but otherwise multiplies:
operation <- "add"
if (operation == "add") {
print("I'm going to add:")
4+4
} else {
print("I'm going to multiply:")
4*4
}
#> [1] "I'm going to add:"
#> [1] 8
operation <- "subtract"
if (operation == "add") {
print("I'm going to add:")
4+4
} else {
print("I'm going to multiply:")
4*4
}
#> [1] "I'm going to multiply:"
#> [1] 16
The nice thing about these is that they can be chained as many times as needed using else if ():
operation <- "subtract"
if (operation == "add") {
print("I'm going to add:")
4+4
} else if (operation=="multiply") {
print("I'm going to multiply:")
4*4
} else {
print("Please enter a valid operator.")
}
#> [1] "Please enter a valid operator."
Where is this sort of thing helpful? This is a bit of a “you’ll know it when you see it” situation. Where I use them most is inside loops. For example, in election work I am often loading in many snapshots of data from across an election night, and I might only run certain functions when a state has reached a certain threshold of completeness.
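To make the election-night idea concrete, here is a rough sketch of an if statement inside a loop. Everything in it is invented for illustration: the `snapshot` data frame, its `pct.in` column (the share of the expected vote counted so far), and the 0.9 `threshold` are assumptions, not real data or a real workflow.

```r
# Hypothetical snapshot of how much of the vote has been counted in each state
snapshot <- data.frame(
  state = c("AK", "AZ", "GA"),
  pct.in = c(0.35, 0.92, 0.99),
  stringsAsFactors = FALSE
)

# Assumed cutoff: only analyze states with at least 90% of the vote counted
threshold <- 0.9

# Collect the states that are complete enough to analyze
ready <- character(0)

for (i in 1:nrow(snapshot)) {
  if (snapshot$pct.in[i] >= threshold) {
    # This branch only runs for states that pass the threshold
    ready <- c(ready, snapshot$state[i])
    print(paste(snapshot$state[i], "is complete enough to analyze"))
  } else {
    print(paste(snapshot$state[i], "is still too early"))
  }
}

print(ready)
```

The loop visits every state, but the code inside the first set of braces only runs for the states where the condition is TRUE.
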
Here is a worked example using real data. These are the 2020 presidential election results by county:
pres <- rio::import("https://github.com/marctrussler/IDS-Data/raw/main/CountyPresData2020.Rds")
#> Warning: Missing `trust` will be set to FALSE by default
#> for RDS in 2.0.0.
What if I want to record the winning margin in each state? That is: regardless of whether Trump or Biden carried the state, how much did the winner win by?
First, let’s write code that finds the Democratic and Republican vote share in each state:
head(pres)
#> state county fips.code biden.votes trump.votes
#> 1 AK ED 1 02901 3477 3511
#> 2 AK ED 10 02910 2727 8081
#> 3 AK ED 11 02911 3130 7096
#> 4 AK ED 12 02912 2957 7893
#> 5 AK ED 13 02913 2666 4652
#> 6 AK ED 14 02914 4261 6714
#> other.votes
#> 1 326
#> 2 397
#> 3 402
#> 4 388
#> 5 395
#> 6 468
pres$total.votes <- pres$biden.votes + pres$trump.votes + pres$other.votes
state <- unique(pres$state)
dem.perc <- NA
rep.perc <- NA
i <- 1
for(i in 1:length(state)){
dem.perc[i] <- sum(pres$biden.votes[pres$state==state[i]])/ sum(pres$total.votes[pres$state==state[i]])
rep.perc[i] <- sum(pres$trump.votes[pres$state==state[i]])/ sum(pres$total.votes[pres$state==state[i]])
}
results <- cbind.data.frame(state, dem.perc, rep.perc)
Pretty good! But to also calculate the winning margin in each state, our code will need to differ depending on the values of those two columns. Specifically, if the Democrat is the winner then the winning margin is dem-rep, but if the Republican is the winner then the winning margin is rep-dem. We can accomplish this via an if statement:
state <- unique(pres$state)
dem.perc <- NA
rep.perc <- NA
winner.margin <- NA
i <- 1
for(i in 1:length(state)){
dem.perc[i] <- sum(pres$biden.votes[pres$state==state[i]])/ sum(pres$total.votes[pres$state==state[i]])
rep.perc[i] <- sum(pres$trump.votes[pres$state==state[i]])/ sum(pres$total.votes[pres$state==state[i]])
if(dem.perc[i]>rep.perc[i]){
winner.margin[i] <- dem.perc[i]- rep.perc[i]
} else {
winner.margin[i] <- rep.perc[i]-dem.perc[i]
}
}
results <- cbind.data.frame(state,dem.perc, rep.perc, winner.margin)
results <- results[order(results$winner.margin),]
Why would this have been wrong?
state <- unique(pres$state)
dem.perc <- NA
rep.perc <- NA
winner.margin <- NA
i <- 1
for(i in 1:length(state)){
dem.perc[i] <- sum(pres$biden.votes[pres$state==state[i]])/ sum(pres$total.votes[pres$state==state[i]])
rep.perc[i] <- sum(pres$trump.votes[pres$state==state[i]])/ sum(pres$total.votes[pres$state==state[i]])
if(dem.perc[i]>rep.perc[i]){
winner.margin[i] <- dem.perc[i]- rep.perc[i]
}
winner.margin[i] <- rep.perc[i]-dem.perc[i]
}
results <- cbind.data.frame(state,dem.perc, rep.perc, winner.margin)
results <- results[order(results$winner.margin),]
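One way to check your answer is to run both versions on made-up vote shares where the correct margins are obvious. The sketch below uses two hypothetical states, one won by the Democrat and one by the Republican. The names `with.else` and `without.else` are introduced here just to keep the two results side by side.

```r
# Made-up vote shares: state 1 is a Democratic win, state 2 a Republican win
dem.perc <- c(0.55, 0.40)
rep.perc <- c(0.43, 0.58)

with.else <- NA
without.else <- NA

for (i in 1:length(dem.perc)) {
  # Version with else: exactly one of the two assignments runs
  if (dem.perc[i] > rep.perc[i]) {
    with.else[i] <- dem.perc[i] - rep.perc[i]
  } else {
    with.else[i] <- rep.perc[i] - dem.perc[i]
  }

  # Version without else: the last line runs on every pass through the loop,
  # so it overwrites whatever the if statement just stored
  if (dem.perc[i] > rep.perc[i]) {
    without.else[i] <- dem.perc[i] - rep.perc[i]
  }
  without.else[i] <- rep.perc[i] - dem.perc[i]
}

print(with.else)     # both margins are positive
print(without.else)  # the Democratic-won state gets a negative margin
```

Without the else, the final assignment is not tied to any condition, so every state ends up with rep-dem no matter who won.
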