Reputation: 173
I'm trying to simplify the following nested ifelse code using sapply or lapply (I still can't distinguish them).
My goal is to allocate points based on placement, as shown below.
df$Point <- ifelse(df$Placement_v2 <= 1, 10,
            ifelse(df$Placement_v2 <= 10, 9,
            ifelse(df$Placement_v2 <= 25, 8,
            ifelse(df$Placement_v2 <= 50, 7, 1))))
This code works okay, but I want to make a data frame and simplify my code above using sapply or lapply (or any other function).
I've tried this code, but it is not working as expected: only the rows with placement 1 get 10 points, and all other rows end up with 1.
<2nd code>
df$Point <- sapply(df2$Placement, function(x) ifelse(df$Placement_v2 <= x, df2$Point[df2$Placement == x], 1 ) )
How can I solve this problem?
Upvotes: 1
Views: 1795
Reputation: 32558
You could create a data frame with values and replacements. Then you can use cut to look up the appropriate value.
dict = data.frame(replacement = c(10, 9, 8, 7, 1, 1),
values = c(0, 1, 10, 25, 50, 1e5))
#DATA
set.seed(42)
placement = sample(1:100, 15)
cbind(placement,
new_placement = dict$replacement[as.integer(cut(placement, breaks = dict$values))])
# placement new_placement
# [1,] 92 1
# [2,] 93 1
# [3,] 29 7
# [4,] 81 1
# [5,] 62 1
# [6,] 50 7
# [7,] 70 1
# [8,] 13 8
# [9,] 61 1
#[10,] 65 1
#[11,] 42 7
#[12,] 91 1
#[13,] 83 1
#[14,] 23 8
#[15,] 40 7
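One caveat with the breaks above: cut returns NA for anything outside them, so a placement at or below 0 (or above 1e5) gets no points. A minimal sketch of a variation, assuming you want the same open-ended behaviour as the nested ifelse in the question, is to use -Inf and Inf as the outer breaks. The names lookup_breaks, lookup_points, and assign_points are made up for illustration.

```r
# Upper bound of each bracket and the points it earns (taken from the question).
lookup_breaks <- c(-Inf, 1, 10, 25, 50, Inf)
lookup_points <- c(10, 9, 8, 7, 1)

# cut() returns a factor of right-closed intervals, e.g. (1,10];
# as.integer() turns that factor into the bracket index,
# which is then used to index the points vector.
assign_points <- function(placement) {
  lookup_points[as.integer(cut(placement, breaks = lookup_breaks))]
}

assign_points(c(-3, 1, 2, 10, 25, 26, 50, 51, 1e6))
# 10 10  9  9  8  7  7  1  1
```

Because the intervals are closed on the right, a placement of exactly 10 falls in (1,10] and gets 9 points, matching the `<=` comparisons in the question.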
Upvotes: 0
Reputation: 3240
There are a few ways to go about this. I'll use data.table.
library(data.table)
set.seed(123)
df <- data.table(Placement_v2 = runif(200, -10, 100))
First option: move the evaluation out to a function, and then lapply the function to your Placement_v2 column. This has the benefit of being much cleaner than your nested ifelse statements.
funky <- function(x) {
  if (x <= 1) {
    val <- 10
  } else if (x <= 10) {
    val <- 9
  } else if (x <= 25) {
    val <- 8
  } else if (x <= 50) {
    val <- 7
  } else {
    val <- 1
  }
  return(val)
}
df[, Point := unlist(lapply(Placement_v2, funky))]
Result:
Placement_v2 Point
1: 21.633527 8
2: 76.713565 1
3: 34.987461 7
4: 87.131914 1
5: 93.451401 1
---
196: 41.318597 7
197: 34.751585 7
198: 62.515336 1
199: 6.758128 9
200: 53.015376 1
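Since the question mentions not being able to tell sapply and lapply apart, here is a small base R sketch of the difference, using the same thresholds as funky() above. The function name points_for and the sample placements are invented for illustration.

```r
# Same thresholds as funky() above, repeated so this block runs on its own.
points_for <- function(x) {
  if (x <= 1) 10 else if (x <= 10) 9 else if (x <= 25) 8 else if (x <= 50) 7 else 1
}

placements <- c(0.5, 6.8, 21.6, 35.0, 76.7)

as_list   <- lapply(placements, points_for)              # always returns a list
as_vector <- sapply(placements, points_for)              # simplifies to a numeric vector here
checked   <- vapply(placements, points_for, numeric(1))  # errors unless each result is one number

str(as_list)   # a list of 5 single numbers
as_vector
# 10  9  8  7  1
```

That is why the lapply call above needs unlist() before the result can be stored as a column, while sapply or vapply would not.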
I would instead approach this by subsetting the data and assigning to each subset. You could do this by specifying each subset explicitly: df[Placement_v2 <= 1], df[Placement_v2 > 1 & Placement_v2 <= 10], etc. But if you assign in the correct order, from the widest bracket to the narrowest, each later assignment overwrites the earlier one, and you can avoid the two-sided comparisons.
df[, Point := 1]
df[Placement_v2 <= 50, Point := 7]
df[Placement_v2 <= 25, Point := 8]
df[Placement_v2 <= 10, Point := 9]
df[Placement_v2 <= 1, Point := 10]
Which gives the same result:
Placement_v2 Point
1: 21.633527 8
2: 76.713565 1
3: 34.987461 7
4: 87.131914 1
5: 93.451401 1
---
196: 41.318597 7
197: 34.751585 7
198: 62.515336 1
199: 6.758128 9
200: 53.015376 1
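If you would rather not depend on data.table, the same ordered-overwrite idea can be sketched in base R. This is only an illustration on a plain data.frame (an assumption, since the example above uses a data.table), and it checks the result against the nested ifelse from the question.

```r
set.seed(123)
df <- data.frame(Placement_v2 = runif(200, -10, 100))

# Start with the default, then overwrite from the widest bracket to the
# narrowest, so each later assignment wins for the rows it covers.
df$Point <- 1
df$Point[df$Placement_v2 <= 50] <- 7
df$Point[df$Placement_v2 <= 25] <- 8
df$Point[df$Placement_v2 <= 10] <- 9
df$Point[df$Placement_v2 <= 1]  <- 10

# Reference result: the nested ifelse from the question.
nested <- ifelse(df$Placement_v2 <= 1, 10,
          ifelse(df$Placement_v2 <= 10, 9,
          ifelse(df$Placement_v2 <= 25, 8,
          ifelse(df$Placement_v2 <= 50, 7, 1))))

all(df$Point == nested)
# TRUE
```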
Upvotes: 1