fega

Reputation: 35

Label consecutive runs of a sequence in a column with consecutive letters

I have the following data:

df <- data.frame(week = rep(seq(1, 4, by=1), times = 3) )

   week
1     1
2     2
3     3
4     4
5     1
6     2
7     3
8     4
9     1
10    2
11    3
12    4

I want to label each consecutive run of 1:4 with a letter, so that the result is this:

   week episode
1     1       a
2     2       a
3     3       a
4     4       a
5     1       b
6     2       b
7     3       b
8     4       b
9     1       c
10    2       c
11    3       c
12    4       c

I have tried the following, but it does not distinguish the separate consecutive runs of the sequence 1:4:

data.frame(df, episode = letters[cumsum(c(1L, diff(df$week) > 1L))]) 
   week episode
1     1       a
2     2       a
3     3       a
4     4       a
5     1       a
6     2       a
7     3       a
8     4       a
9     1       a
10    2       a
11    3       a
12    4       a

Upvotes: 2

Views: 106

Answers (3)

tmfmnk

Reputation: 39858

A different dplyr possibility could be:

library(dplyr)

df %>%
 mutate(episode = letters[gl(n()/4, 4)])

   week episode
1     1       a
2     2       a
3     3       a
4     4       a
5     1       b
6     2       b
7     3       b
8     4       b
9     1       c
10    2       c
11    3       c
12    4       c

Or the same with base R:

df$episode <- letters[gl(length(df$week)/4, 4)]

Or:

df %>%
 mutate(episode = letters[ceiling(seq_along(week)/4)])

Or the same with base R:

df$episode <- letters[ceiling(seq_along(df$week)/4)]

Upvotes: 1

IceCreamToucan

Reputation: 28675

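You can use rowid() from the data.table package. It numbers the occurrences of each value, so the nth occurrence of every week value gets the nth letter: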

library(data.table)
setDT(df)

df[, episode := letters[rowid(week)]]

#     week episode
#  1:    1       a
#  2:    2       a
#  3:    3       a
#  4:    4       a
#  5:    1       b
#  6:    2       b
#  7:    3       b
#  8:    4       b
#  9:    1       c
# 10:    2       c
# 11:    3       c
# 12:    4       c
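
For readers without data.table, the same occurrence counting can be sketched in base R with ave(). This is an illustrative equivalent, not part of the answer above; the intermediate name occurrence is made up for the example, and it assumes every run contains each week value exactly once.

```r
# Same input as in the question
df <- data.frame(week = rep(seq(1, 4, by = 1), times = 3))

# For each distinct week value, number its occurrences 1, 2, 3, ...
# (this plays the role of rowid(week) in the data.table version)
occurrence <- ave(df$week, df$week, FUN = seq_along)

# The nth occurrence of a week value belongs to the nth episode
df$episode <- letters[occurrence]
```

Like the rowid() version, this stops matching the runs if a week is missing from one of them, because the counts for that value then fall behind the others.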

Upvotes: 1

akrun

Reputation: 887088

If the weeks are already in order, just take the cumulative sum of the logical vector week == 1:

library(dplyr)
df %>% 
    mutate(episode =  letters[cumsum(week == 1)])
#   week episode
#1     1       a
#2     2       a
#3     3       a
#4     4       a
#5     1       b
#6     2       b
#7     3       b
#8     4       b
#9     1       c
#10    2       c
#11    3       c
#12    4       c

Or using base R (without any additional packages):

df$episode <- letters[cumsum(df$week == 1)]
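
If a run might not begin at week 1, a possible variation is to start a new episode whenever week decreases. This is only a sketch: it assumes the values rise within a run and drop at the start of the next one, and the name new_run is made up for the example. It is also close to the attempt in the question, which tested diff(df$week) > 1L, a condition that is never true for this data because the difference at each restart is -3.

```r
# Same input as in the question
df <- data.frame(week = rep(seq(1, 4, by = 1), times = 3))

# TRUE at the first row and wherever week drops relative to the previous row
new_run <- c(TRUE, diff(df$week) < 0)

# Counting the run starts so far gives the episode index for each row
df$episode <- letters[cumsum(new_run)]
```

As with the other answers, indexing letters gives NA beyond 26 episodes.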

Upvotes: 2
