Reputation: 77
For a dataframe like this:
cat val
1 aaa 0.05638315
2 aaa 0.25767250
3 aaa 0.30776611
4 aaa 0.46854928
5 aaa 0.55232243
6 bbb 0.17026205
7 bbb 0.37032054
8 bbb 0.48377074
9 bbb 0.54655860
10 bbb 0.81240262
11 ccc 0.28035384
12 ccc 0.39848790
13 ccc 0.62499648
14 ccc 0.76255108
15 ccc 0.88216552
I want to assign repeating sequence numbers to the rows within each group: the numbers run from 1 to 3 only, and then the sequence starts again from 1 within the same group:
cat val num
1 aaa 0.05638315 1
2 aaa 0.25767250 2
3 aaa 0.30776611 3
4 aaa 0.46854928 1
5 aaa 0.55232243 2
6 bbb 0.17026205 1
7 bbb 0.37032054 2
8 bbb 0.48377074 3
9 bbb 0.54655860 1
10 bbb 0.81240262 2
11 ccc 0.28035384 1
12 ccc 0.39848790 2
13 ccc 0.62499648 3
14 ccc 0.76255108 1
15 ccc 0.88216552 2
How can I achieve it?
Upvotes: 0
Views: 750
Reputation: 70256
Here's a classic split / apply / combine approach:
df <- unsplit(lapply(split(df, df$cat), function(x)
cbind(x, id = rep(1:3, length.out = nrow(x)))), df$cat)
# cat val id
# 1 aaa 0.05638315 1
# 2 aaa 0.25767250 2
# 3 aaa 0.30776611 3
# 4 aaa 0.46854928 1
# 5 aaa 0.55232243 2
# 6 bbb 0.17026205 1
# 7 bbb 0.37032054 2
# 8 bbb 0.48377074 3
# 9 bbb 0.54655860 1
# 10 bbb 0.81240262 2
# 11 ccc 0.28035384 1
# 12 ccc 0.39848790 2
# 13 ccc 0.62499648 3
# 14 ccc 0.76255108 1
# 15 ccc 0.88216552 2
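The piece doing the work here is rep() with length.out, which cycles through 1:3 and cuts the result off at the requested length. A quick sketch of that behaviour on its own (the helper name cycle_ids is made up for illustration and is not part of the answer's code):

```r
# Hypothetical helper: cycle through 1..k until n values have been produced
cycle_ids <- function(n, k = 3) rep(seq_len(k), length.out = n)

cycle_ids(5)  # 1 2 3 1 2
cycle_ids(7)  # 1 2 3 1 2 3 1
cycle_ids(2)  # 1 2
```

Because the truncation is handled by length.out, groups whose size is not a multiple of 3 need no special treatment.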
And a dplyr alternative:
library(dplyr)
df %>% group_by(cat) %>% mutate(id = rep(1:3, length.out = n()))
And a data.table alternative, too:
library(data.table)
setDT(df)
df[, id := rep(1:3, length.out = .N), by = cat]
Upvotes: 2
Reputation: 1481
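Here is a solution that I find elegant because it is concise. tapply returns one vector per group, so the result is flattened with unlist before being assigned (this relies on the rows already being sorted by cat):
df=data.frame(cat=rep(letters[1:3],each=5),val=rnorm(3*5))
df[,"n"] <- unlist(tapply(df[,"val"],df[,"cat"],function(vec) rep.int(1:3,times=ceiling(length(vec)/3))[seq_along(vec)]),use.names=FALSE)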
df
with result
> df
cat val n
1 a -0.01160222 1
2 a 0.13296221 2
3 a -0.19907366 3
4 a -0.52969178 1
5 a 0.05834779 2
6 b 1.06572206 1
7 b 1.23418529 2
8 b -2.53532404 3
9 b -0.77518265 1
10 b -1.35705148 2
11 c -1.16828739 1
12 c -0.32130593 2
13 c 0.98217935 3
14 c 0.31917671 1
15 c 0.89867657 2
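If the groups have different sizes or the rows are not sorted by cat, base R's ave() may be a more robust fit, since it writes each group's result back to that group's original row positions. A minimal sketch, assuming a made-up data frame d (names and values are for illustration only):

```r
# Made-up data: unequal group sizes, rows deliberately not sorted by cat
d <- data.frame(cat = c("b", "a", "a", "b", "a", "a", "b"),
                val = c(0.4, 0.1, 0.2, 0.5, 0.3, 0.6, 0.7))

# ave() applies FUN within each group and puts the results back
# in the original row positions
d$n <- ave(seq_along(d$cat), d$cat,
           FUN = function(i) rep(1:3, length.out = length(i)))
d$n  # 1 1 2 2 3 1 3
```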
Upvotes: 0
Reputation: 1975
This should do the trick. Get the unique values of cat in your data.frame, extract the rows for each one, and attach an integer vector that cycles through the sequence (1, 2, 3). The count restarts from 1 for each cat.
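df <- data.frame(cat = c(rep("aaa", 5), rep("bbb", 2), rep("ccc", 4), rep("ddd", 7)),
                 val = rnorm(n = 18))
df$num <- do.call(c, lapply(unique(df$cat), function(i) {
  # rows belonging to the current category
  slice <- df[df$cat == i, ]
  # repeat 1:3 enough times to cover the group, then trim to the group size
  rep(1:3, 1 + as.integer(nrow(slice) / 3))[1:nrow(slice)]
}))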
The final result is the following (first rows shown):
cat val num
1 aaa -0.20791826 1
2 aaa 1.95733315 2
3 aaa 1.01099852 3
4 aaa 0.25355751 1
5 aaa 0.70946906 2
6 bbb 1.60555603 1
7 bbb -0.05718921 2
8 ccc 0.13465897 1
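To check every group at once rather than reading the printed rows, one option is to collapse each group's numbers into a single string. This is only a sketch: it rebuilds the same example data, and set.seed() plus the name check are additions for illustration.

```r
set.seed(1)  # only so the made-up val column is reproducible
df <- data.frame(cat = c(rep("aaa", 5), rep("bbb", 2), rep("ccc", 4), rep("ddd", 7)),
                 val = rnorm(n = 18))
df$num <- do.call(c, lapply(unique(df$cat), function(i) {
  slice <- df[df$cat == i, ]
  rep(1:3, 1 + as.integer(nrow(slice) / 3))[1:nrow(slice)]
}))

# Collapse each group's numbers into one string to see where the count restarts
check <- tapply(df$num, df$cat, paste, collapse = "")
check
# aaa: "12312", bbb: "12", ccc: "1231", ddd: "1231231"
```

Note that this approach, like the code above it, assumes the rows of each cat are stored next to each other.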
Upvotes: 0