Reputation: 1335
I want to apply a function to the rows of a data frame. The function is conditional on the value in one column being greater than the value in another column. If the condition is met, I multiply the elements from two other columns and store the result in a new column. If the condition is not met, no multiplication takes place and an original value is copied to the new column.
Create some data:
var0 <- c("A", "B", "C", "D", "E")
var1 <- rep(c(105,200), each = 5)
var2 <- c(110:114, 25:29)
var3 <- rep(c(560,135), each = 5)
var4 <- rep(c(0.5,0.2), each = 5)
my_df <- as.data.frame(cbind(var0, var1, var2, var3, var4))
Have a look at the data:
var0 var1 var2 var3 var4
1 A 105 110 560 0.5
2 B 105 111 560 0.5
3 C 105 112 560 0.5
4 D 105 113 560 0.5
5 E 105 114 560 0.5
6 A 200 25 135 0.2
7 B 200 26 135 0.2
8 C 200 27 135 0.2
9 D 200 28 135 0.2
10 E 200 29 135 0.2
My attempt at writing the code:
apply(my_df, 1, function(x) {
if(x$var3 > x$var1) {
x$output <- x$var2 * x$var4
} else {
x$output <- x$var2
}
return(x)
})
What the result should look like:
var0 var1 var2 var3 var4 output
1 A 105 110 560 0.5 55.0
2 B 105 111 560 0.5 55.5
3 C 105 112 560 0.5 56.0
4 D 105 113 560 0.5 56.5
5 E 105 114 560 0.5 57.0
6 A 200 25 135 0.2 25.0
7 B 200 26 135 0.2 26.0
8 C 200 27 135 0.2 27.0
9 D 200 28 135 0.2 28.0
10 E 200 29 135 0.2 29.0
Because var3 is greater than var1 in the first 5 rows, the output there is var2 * var4. In the last 5 rows the condition is not met, so var2 is simply copied to the output column.
Upvotes: 1
Views: 8301
Reputation: 3492
var0 <- c("A", "B", "C", "D", "E")
var1 <- rep(c(105,200), each = 5)
var2 <- c(110:114, 25:29)
var3 <- rep(560,135, 5)
var4 <- rep(c(0.5,0.2), each = 5)
To avoid the numbers being converted to factors, I am using cbind.data.frame() instead of as.data.frame(cbind(...)):
my_df <- cbind.data.frame(var0, var1, var2, var3, var4)
> str(my_df)
'data.frame': 10 obs. of 5 variables:
$ var0: Factor w/ 5 levels "A","B","C","D",..: 1 2 3 4 5 1 2 3 4 5
$ var1: num 105 105 105 105 105 200 200 200 200 200
$ var2: int 110 111 112 113 114 25 26 27 28 29
$ var3: num 560 560 560 560 560 560 560 560 560 560
$ var4: num 0.5 0.5 0.5 0.5 0.5 0.2 0.2 0.2 0.2 0.2
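As a minimal sketch of why the construction matters: cbind() on a mix of character and numeric vectors builds a character matrix first, so every column of the resulting data frame is character (or factor under the pre-4.0 default of stringsAsFactors = TRUE). The short names v0, v1, m, bad and good below are illustrative only and are not part of the original example.

```r
# Two small vectors of different types (illustrative names)
v0 <- c("A", "B")
v1 <- c(105, 200)

# cbind() returns a matrix, and a matrix holds a single type,
# so the numbers are coerced to character here
m <- cbind(v0, v1)
is.character(m)

# Converting that matrix to a data frame keeps the character type
bad <- as.data.frame(cbind(v0, v1), stringsAsFactors = FALSE)
class(bad$v1)

# Building the data frame directly keeps v1 numeric
good <- data.frame(v0, v1, stringsAsFactors = FALSE)
class(good$v1)
```

With the columns still numeric, comparisons such as var3 > var1 and products such as var2 * var4 behave as arithmetic rather than failing or comparing strings.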
I then use ifelse() to create the new column:
> my_df$output <- ifelse(my_df$var3 > my_df$var1, my_df$var2 * my_df$var4, my_df$var2)
> my_df
var0 var1 var2 var3 var4 output
1 A 105 110 560 0.5 55.0
2 B 105 111 560 0.5 55.5
3 C 105 112 560 0.5 56.0
4 D 105 113 560 0.5 56.5
5 E 105 114 560 0.5 57.0
6 A 200 25 560 0.2 5.0
7 B 200 26 560 0.2 5.2
8 C 200 27 560 0.2 5.4
9 D 200 28 560 0.2 5.6
10 E 200 29 560 0.2 5.8
Note that I was not getting the same values in var3 as yours, so I changed var3 to match the values given in the question:
> var3 <- c(rep(560,5),rep(135,5))
> var3
[1] 560 560 560 560 560 135 135 135 135 135
> my_df <- cbind.data.frame(var0, var1, var2, var3, var4)
> my_df$output <- ifelse(my_df$var3 > my_df$var1, my_df$var2 * my_df$var4, my_df$var2)
> my_df
var0 var1 var2 var3 var4 output
1 A 105 110 560 0.5 55.0
2 B 105 111 560 0.5 55.5
3 C 105 112 560 0.5 56.0
4 D 105 113 560 0.5 56.5
5 E 105 114 560 0.5 57.0
6 A 200 25 135 0.2 25.0
7 B 200 26 135 0.2 26.0
8 C 200 27 135 0.2 27.0
9 D 200 28 135 0.2 28.0
10 E 200 29 135 0.2 29.0
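As a cross-check, the same column can be built with logical indexing, which makes explicit what ifelse() does element by element. This is only a sketch: the data frame is rebuilt here with data.frame() so the block runs on its own, and the names cond, output_by_index and output_by_ifelse are introduced purely for illustration.

```r
# Rebuild the example data; data.frame() keeps numeric columns numeric
my_df <- data.frame(
  var0 = rep(c("A", "B", "C", "D", "E"), times = 2),
  var1 = rep(c(105, 200), each = 5),
  var2 = c(110:114, 25:29),
  var3 = rep(c(560, 135), each = 5),
  var4 = rep(c(0.5, 0.2), each = 5)
)

# Logical mask: TRUE for rows where var3 exceeds var1
cond <- my_df$var3 > my_df$var1

# Start from a copy of var2, then overwrite only the rows meeting the condition
output_by_index <- as.numeric(my_df$var2)
output_by_index[cond] <- my_df$var2[cond] * my_df$var4[cond]

# The ifelse() one-liner should give the same values
output_by_ifelse <- ifelse(cond, my_df$var2 * my_df$var4, my_df$var2)
all.equal(output_by_index, output_by_ifelse)
```

Both approaches work on whole columns at once, so no explicit loop over rows is needed.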
Upvotes: 1
Reputation: 521103
You don't need to use an apply() function here; you can just use ifelse():
df$output <- ifelse(df$var3 > df$var1, df$var2*df$var4, df$var2)
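For context, here is a hedged sketch of why the apply() attempt in the question fails, and of what a row-wise version could look like if one is really wanted. It assumes df holds the same data as my_df, rebuilt with data.frame() so the block is self-contained; row_classes, res and row_wise are illustrative names.

```r
# Same data as in the question, with numeric columns kept numeric
df <- data.frame(
  var0 = rep(c("A", "B", "C", "D", "E"), times = 2),
  var1 = rep(c(105, 200), each = 5),
  var2 = c(110:114, 25:29),
  var3 = rep(c(560, 135), each = 5),
  var4 = rep(c(0.5, 0.2), each = 5),
  stringsAsFactors = FALSE
)

# apply() converts the data frame with as.matrix(); because var0 is
# character, every row reaches the function as a character vector
row_classes <- apply(df, 1, class)
unique(row_classes)

# `$` is not defined for atomic vectors, so x$var3 raises an error;
# tryCatch() captures the message instead of stopping the script
res <- tryCatch(
  apply(df, 1, function(x) x$var3 > x$var1),
  error = function(e) conditionMessage(e)
)
res

# A row-wise alternative: mapply() walks the columns in parallel and
# passes each value with its original type
row_wise <- mapply(
  function(v1, v2, v3, v4) if (v3 > v1) v2 * v4 else v2,
  df$var1, df$var2, df$var3, df$var4
)

# Matches the vectorised ifelse() result
all.equal(row_wise, ifelse(df$var3 > df$var1, df$var2 * df$var4, df$var2))
```

The ifelse() form remains the simpler choice here, since it avoids calling a function once per row.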
Upvotes: 2