Reputation: 11
I have a data frame, vinaSums2:
Treatment Dose (ppm) variable sum sum2 percent score
1 0 #DFA1 12 19 0.63 NA
2 0 #DFA2 6 19 0.32 NA
3 0 #DFA3 0 19 0.00 NA
4 0 #DFA4 1 19 0.05 NA
5 0 Bug/Discard 0 19 0.00 NA
6 150 #DFA1 12 20 0.60 NA
7 150 #DFA2 5 20 0.25 NA
8 150 #DFA3 3 20 0.15 NA
9 150 #DFA4 0 20 0.00 NA
10 150 Bug/Discard 0 20 0.00 NA
11 300 #DFA1 14 19 0.74 NA
12 300 #DFA2 1 19 0.05 NA
13 300 #DFA3 2 19 0.11 NA
14 300 #DFA4 1 19 0.05 NA
15 300 Bug/Discard 1 19 0.05 NA
I would like to transform the values of vinaSums2$sum based on their respective values of vinaSums2$variable and place the result in vinaSums2$score. I have tried variations of this:
vinaSums2$score = apply(vinaSums2[,c(2,3)], 1, function(x){ifelse(x[1] == "#DFA1", x[2]*1, ifelse(x[1] == "#DFA2", x[2]*0.8, ifelse(x[1] == "#DFA3", x[2]*0.6, ifelse(x[1] == "#DFA4", x[2]*0.4, x[2]*0.2))))})
Which results in the error:
Error during wrapup: non-numeric argument to binary operator
Error: no more error handlers available (recursive errors?); invoking 'abort' restart
I don't understand how apply() is working with the function and the data. In my mind, x[1] should return the "variable" for that row and x[2] should return the sum for that row. But I obviously don't know what apply() is doing under the hood.
Any help would be appreciated.
Upvotes: 1
Views: 40
Reputation: 79228
In Base R, you could do:
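# Named vector mapping each known category to its multiplier
vec <- c("#DFA1" = 1, "#DFA2" = 0.8, "#DFA3" = 0.6, "#DFA4" = 0.4)
# Look up by name; as.character() guards against variable being a factor,
# which would otherwise index by integer code instead of by name
vs <- vec[as.character(vinaSums2$variable)]
# Categories not listed in vec (e.g. Bug/Discard) come back as NA
vs[is.na(vs)] <- 0.2
vinaSums2$score <- vs * vinaSums2$sum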
vinaSums2
Treatment.Dose..ppm. variable sum sum2 percent score
1 0 #DFA1 12 19 0.63 12.0
2 0 #DFA2 6 19 0.32 4.8
3 0 #DFA3 0 19 0.00 0.0
4 0 #DFA4 1 19 0.05 0.4
5 0 Bug/Discard 0 19 0.00 0.0
6 150 #DFA1 12 20 0.60 12.0
7 150 #DFA2 5 20 0.25 4.0
8 150 #DFA3 3 20 0.15 1.8
9 150 #DFA4 0 20 0.00 0.0
10 150 Bug/Discard 0 20 0.00 0.0
11 300 #DFA1 14 19 0.74 14.0
12 300 #DFA2 1 19 0.05 0.8
13 300 #DFA3 2 19 0.11 1.2
14 300 #DFA4 1 19 0.05 0.4
15 300 Bug/Discard 1 19 0.05 0.2
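The lookup above works because indexing a named vector with character strings matches by name and returns NA where no name matches. A minimal sketch on made-up stand-in vectors (weights, variable, sums and f are illustrative names, not part of the original data) also shows why a factor should be converted with as.character() first: a factor index is treated as its integer codes, not its labels.

```r
# Hypothetical stand-in data, not the real vinaSums2
weights  <- c("#DFA1" = 1, "#DFA2" = 0.8, "#DFA3" = 0.6, "#DFA4" = 0.4)
variable <- c("#DFA2", "Bug/Discard", "#DFA4")
sums     <- c(5, 1, 1)

# Character index: matched by name, NA where the name is unknown
w <- unname(weights[variable])
w[is.na(w)] <- 0.2          # default multiplier for unlisted categories
score <- w * sums

# Factor index: R uses the underlying integer codes, so the result
# depends on level order rather than on the labels
f <- factor(variable, levels = c("Bug/Discard", "#DFA2", "#DFA4"))
by_code <- unname(weights[f])                 # positions 2, 1, 3
by_name <- unname(weights[as.character(f)])   # same as the character lookup
```

Here by_code picks the first three multipliers by position, while by_name gives the intended result with NA for the unlisted category.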
Using tidyverse you could do:
library(tidyverse)
vec<-c("#DFA1" = 1, "#DFA2"=0.8, "#DFA3"=0.6, "#DFA4" = 0.4)
vinaSums2 %>%
  mutate(v = factor(variable, names(vec), vec),  # known categories relabelled with their multipliers
         v = as.numeric(as.character(fct_explicit_na(v, "0.2"))),  # unlisted categories become 0.2
         score = v * sum) %>%
  select(-v)
Upvotes: 1
Reputation: 388982
You don't need apply since ifelse is vectorized.
vinaSums2$score <- with(vinaSums2,
  ifelse(variable == "#DFA1", sum,
  ifelse(variable == "#DFA2", sum * 0.8,
  ifelse(variable == "#DFA3", sum * 0.6,
  ifelse(variable == "#DFA4", sum * 0.4, sum * 0.2)))))
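As for why the apply() attempt in the question fails: apply() converts a data frame to a matrix before iterating, and a matrix holds a single type, so mixing a character column with a numeric one yields a character matrix. Each row then reaches the function as a character vector, and multiplying x[2] raises the "non-numeric argument to binary operator" error. A minimal sketch on a made-up stand-in data frame (df is hypothetical, not the real vinaSums2):

```r
# Hypothetical stand-in with the same two column types as vinaSums2[, c(2, 3)]
df <- data.frame(variable = c("#DFA1", "#DFA2", "Bug/Discard"),
                 sum = c(12, 6, 1),
                 stringsAsFactors = FALSE)

# apply() works on as.matrix(df); the numeric column is turned into strings
m <- as.matrix(df)
is.character(m)

# Every row handed to the function is therefore a character vector
row_types <- apply(df, 1, function(x) class(x))

# Arithmetic on x[2] fails; capture the error message instead of stopping
res <- tryCatch(apply(df, 1, function(x) x[2] * 0.8),
                error = function(e) conditionMessage(e))
```

Converting inside the function, e.g. as.numeric(x[2]), would avoid the error, but the vectorized ifelse() above avoids the matrix conversion altogether.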
You can also use case_when from dplyr, which is helpful if you have a lot of such conditions.
library(dplyr)
vinaSums2 %>%
  mutate(score = case_when(variable == "#DFA1" ~ sum * 1,
                           variable == "#DFA2" ~ sum * 0.8,
                           variable == "#DFA3" ~ sum * 0.6,
                           variable == "#DFA4" ~ sum * 0.4,
                           TRUE ~ sum * 0.2))
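If dplyr is not available, a long chain of conditions can also be collapsed in base R with match(), which returns the position of each value in a vector of keys. This is only a sketch under assumptions: keys, mult, variable and sums are illustrative names and values, and the default multiplier is stored as the last element of mult so that nomatch can point to it.

```r
# Hypothetical stand-in data, not the real vinaSums2
variable <- c("#DFA1", "#DFA3", "Bug/Discard", "#DFA4")
sums     <- c(12, 3, 1, 1)

keys <- c("#DFA1", "#DFA2", "#DFA3", "#DFA4")
mult <- c(1, 0.8, 0.6, 0.4, 0.2)   # last entry is the default multiplier

# Position of each value in keys; unmatched values get the default's position
idx <- match(variable, keys, nomatch = length(mult))
score <- sums * mult[idx]
```

Adding a category then means extending keys and mult rather than nesting another condition.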
Upvotes: 2