Reputation: 3670
Consider the following dataframe:
name <- c("Sally", "Dave", "Aaron", "Jane", "Michael")
rank <- c(1,2,1,2,3)
df <- data.frame(name, rank, stringsAsFactors = FALSE)
I'd like to create a grouping variable (event) based on the rank column, as such:
event <- c("Hurdles", "Hurdles", "Long Jump", "Long Jump", "Long Jump")
df_desired <- data.frame(name, rank, event, stringsAsFactors = FALSE)
There are lots of examples of going the other way (making a ranking variable based on a group) but I can't seem to find one doing what I'd like.
It's possible to use filter, full_join and then fill as shown below, but is there a simpler way?
library(tidyverse)
df <- df %>%
  mutate(order = row_number())
df_1 <- df %>%
  filter(rank == 1)
df_1$event <- c("Hurdles", "Long Jump")
df %>%
  filter(rank != 1) %>%
  mutate(event = as.character(NA)) %>%
  full_join(df_1, by = c("order", "name", "rank", "event")) %>%
  arrange(order) %>%
  fill(event) %>%
  select(-order)
Upvotes: 1
Views: 58
Reputation: 887991
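We can use cumsum to create the index: every rank == 1 marks the start of a new event, so cumsum(rank == 1) gives the event number for each row.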
library(dplyr)
df %>%
  mutate(event = c("Hurdles", "Long Jump")[cumsum(rank == 1)])
#     name rank     event
#1   Sally    1   Hurdles
#2    Dave    2   Hurdles
#3   Aaron    1 Long Jump
#4    Jane    2 Long Jump
#5 Michael    3 Long Jump
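To see why this works, it can help to look at the intermediate index on its own. This is a minimal sketch in base R using the rank values from the question; the variable names starts and idx are only for illustration.

```r
rank <- c(1, 2, 1, 2, 3)

# TRUE wherever a new event begins (rank resets to 1)
starts <- rank == 1

# Running count of the starts: the event number of each row
idx <- cumsum(starts)
idx
# [1] 1 1 2 2 2

# Indexing the label vector by idx repeats each label across its block
c("Hurdles", "Long Jump")[idx]
# [1] "Hurdles"   "Hurdles"   "Long Jump" "Long Jump" "Long Jump"
```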
Or in base R (just in case)
df$event <- c("Hurdles", "Long Jump")[cumsum(df$rank == 1)]
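If the data might not be as tidy as in the question, the same idea could be wrapped in a small function with a few sanity checks. This is only a sketch: label_blocks is a hypothetical helper name, and it assumes that every event starts with rank 1 and that the labels are supplied in the order the events appear.

```r
# Hypothetical helper (not from the original answer): label each block of rows
# that starts at rank == 1 with the corresponding entry of `labels`.
label_blocks <- function(rank, labels) {
  # An NA rank would make the running count NA from that row onward
  stopifnot(!anyNA(rank))
  idx <- cumsum(rank == 1)
  # Rows before the first rank == 1 would get index 0 and be silently dropped
  stopifnot(length(rank) == 0 || idx[1] == 1)
  # More blocks than labels would produce NA labels
  stopifnot(max(idx, 0) <= length(labels))
  labels[idx]
}

name <- c("Sally", "Dave", "Aaron", "Jane", "Michael")
rank <- c(1, 2, 1, 2, 3)
df <- data.frame(name, rank, stringsAsFactors = FALSE)

df$event <- label_blocks(df$rank, c("Hurdles", "Long Jump"))
```

The checks stop with an error instead of returning a misaligned or partly NA column, which is easier to spot than a wrong label.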
Upvotes: 4