Scijens

Reputation: 561

Use case_when function of dplyr with connection between two datasets

I have the following two example datasets and want to use dplyr's case_when function with conditions that combine both datasets:

game_data <- data.frame(player = c(1,1,1,2,2,2,3,3,3), level = c(1,2,3,1,2,3,1,2,3), score=c(0,150,170,80,100,110,75,100,0))
> game_data
  player level score
1      1     1     0
2      1     2   150
3      1     3   170
4      2     1    80
5      2     2   100
6      2     3   110
7      3     1    75
8      3     2   100
9      3     3     0
> 
> range_data <- data.frame(level = c(1,2,3), Point1 = c(20,70,140), Point2 = c(40,80,180), Point3 = c(60,90,220))
> range_data
  level Point1 Point2 Point3
1     1     20     40     60
2     2     70     80     90
3     3    140    180    220
> 

I now want to use the ranges between the points in the second dataset to create a new variable in the game_data dataset, based on which range each score falls into for its level. For example, player 1 scores 150 in level 2, so the new variable PointRange should show "Range4" because 150 is higher than Point3 for that level (90).

I've tried the following but it doesn't work:

result <- game_data %>%
  mutate(PointRange = case_when(level == range_data$level & score <  range_data$point1 ~ "Range1",
                                level == range_data$level & score >= range_data$point1 & score < data$point2 ~ "Range2",
                                level == range_data$level & score >= range_data$point2 & score <= data$point3 ~ "Range3",
                                level == range_data$level & score >= range_data$point3 ~ "Range4"))

How can I manage this? Thanks in advance!

Upvotes: 0

Views: 503

Answers (1)

jasbner

Reputation: 2283

Since you are matching on the level column, you can simply inner_join on that column and then work from a single data frame.
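
To see why a join is needed, here is a minimal base-R sketch, assuming the two data frames from the question (the variable names missing_col, aligned, shuffled and misaligned are made up for the illustration). Comparing against range_data$level directly does not look up each row's level; R recycles the three-element vector position by position. The attempt also refers to point1, while the column is named Point1, and to data$point2, where data is not one of the two data frames.

```r
game_data <- data.frame(player = c(1,1,1,2,2,2,3,3,3),
                        level  = c(1,2,3,1,2,3,1,2,3),
                        score  = c(0,150,170,80,100,110,75,100,0))
range_data <- data.frame(level = c(1,2,3), Point1 = c(20,70,140),
                         Point2 = c(40,80,180), Point3 = c(60,90,220))

# Column names are case-sensitive: `point1` does not exist, so `$` returns NULL.
missing_col <- range_data$point1
is.null(missing_col)   # TRUE

# `==` recycles range_data$level (length 3) across the 9 rows by position.
# It only lines up here because the rows happen to cycle through levels 1, 2, 3.
aligned <- game_data$level == range_data$level
all(aligned)           # TRUE, but only by coincidence

# Reorder the rows and the positional comparison no longer matches.
shuffled <- game_data[order(game_data$score), ]
misaligned <- shuffled$level == range_data$level
all(misaligned)        # FALSE
```

Joining on level attaches the right thresholds to every row regardless of row order.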

case_when evaluates its conditions in order and uses the first one that matches, so you must proceed from the most specific to the most general.

library(dplyr)

game_data %>%
  inner_join(range_data, by = "level") %>%   # attach each level's thresholds to its rows
  mutate(PointRange = case_when(score >= Point3 ~ "Range4",
                                score >= Point2 ~ "Range3",
                                score >= Point1 ~ "Range2",
                                score <  Point1 ~ "Range1")) %>%
  select(-Point1, -Point2, -Point3)          # drop the helper columns again

#  player level score PointRange
#1      1     1     0     Range1
#2      1     2   150     Range4
#3      1     3   170     Range2
#4      2     1    80     Range4
#5      2     2   100     Range4
#6      2     3   110     Range1
#7      3     1    75     Range4
#8      3     2   100     Range4
#9      3     3     0     Range1
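
If it helps to see the same logic without dplyr, here is a rough base-R sketch, assuming the data frames from the question. It is an alternative illustration rather than the approach above: match() stands in for the join, nested ifelse() calls stand in for case_when, and idx, p1, p2 and p3 are helper names invented for the example. The highest threshold is checked first, mirroring the order of the conditions above.

```r
game_data <- data.frame(player = c(1,1,1,2,2,2,3,3,3),
                        level  = c(1,2,3,1,2,3,1,2,3),
                        score  = c(0,150,170,80,100,110,75,100,0))
range_data <- data.frame(level = c(1,2,3), Point1 = c(20,70,140),
                         Point2 = c(40,80,180), Point3 = c(60,90,220))

# Look up each row's thresholds by level (the base-R analogue of the join).
idx <- match(game_data$level, range_data$level)
p1 <- range_data$Point1[idx]
p2 <- range_data$Point2[idx]
p3 <- range_data$Point3[idx]

# Test the highest threshold first; each later test only sees rows that
# failed the earlier ones, just like the ordered case_when conditions.
game_data$PointRange <- ifelse(game_data$score >= p3, "Range4",
                        ifelse(game_data$score >= p2, "Range3",
                        ifelse(game_data$score >= p1, "Range2", "Range1")))
game_data$PointRange
```

This should give the same PointRange column as the output shown above.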

Upvotes: 1
