Reputation: 377
Answers or pointers in the right direction are appreciated.
- I have a dataset that is organized by group (id).
- There is a column (trial) that indicates the trial the data corresponds to. Trial values run from 1 up to some maximum, and each value can be repeated a variable number of times (e.g., 1122234444).
- The trial sequence is repeated within each group: within each id, trial runs through its sequence, restarts at 1, and runs through the sequence again, some number of times.
I need to know how many times the trial sequence has been repeated within each group of id.
The desired output is the variable "repetition".
The "repetition" variable should start at 1 and stay at 1 until the trial sequence restarts at 1, at which point it should move to 2 to indicate that the trial sequence is on its 2nd repeat, and so on.
The max number of trials, ids, and the number of repetitions are always variable, but the trial sequence always goes (repeating at variable length) 1,2,3,....
id <- sort(rep(c("a", "b"), each = 4, times = 2))
trial <- rep(1:2, each = 2 , times = 2)
repetition <- rep(1:2, each = 4, times = 2)
df <- data.frame(id, trial, repetition)
   id trial repetition
1   a     1          1
2   a     1          1
3   a     2          1
4   a     2          1
5   a     1          2
6   a     1          2
7   a     2          2
8   a     2          2
9   b     1          1
10  b     1          1
11  b     2          1
12  b     2          1
13  b     1          2
14  b     1          2
15  b     2          2
16  b     2          2
Upvotes: 2
Views: 372
Reputation: 51592
Here is an idea using dplyr together with splitstackshape. We first use new = cumsum(c(1, diff(trial) != 0)) to give each run of consecutive identical trial values its own index. We then group by id and new and count the rows in each run (new1). We slice to keep the first row of each run and use cumsum(trial == 1) within each id to get the repetition. Finally, we use the splitstackshape function expandRows, which replicates each row by the count stored in new1. We finish by tidying a bit with select and ungroup.
library(dplyr)
library(splitstackshape)
df %>%
  mutate(new = cumsum(c(1, diff(trial) != 0))) %>%
  group_by(id, new) %>%
  mutate(new1 = n()) %>%
  slice(1L) %>%
  group_by(id) %>%
  mutate(repetition = cumsum(trial == 1)) %>%
  expandRows('new1') %>%
  select(-new) %>%
  ungroup()
# A tibble: 16 × 3
# id trial repetition
# <fctr> <int> <int>
#1 a 1 1
#2 a 1 1
#3 a 2 1
#4 a 2 1
#5 a 1 2
#6 a 1 2
#7 a 2 2
#8 a 2 2
#9 b 1 1
#10 b 1 1
#11 b 2 1
#12 b 2 1
#13 b 1 2
#14 b 1 2
#15 b 2 2
#16 b 2 2
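The same run-based idea (collapse runs, count the runs that start at 1, expand back) can be sketched in base R with rle and ave, without extra packages. This is only a sketch under the question's stated assumption that every repetition starts at trial 1; rep_from_runs is a hypothetical helper name, not part of any package.

```r
# Hypothetical helper: repetition number for one id's trial vector.
rep_from_runs <- function(trial) {
  runs <- rle(trial)                       # collapse consecutive duplicates
  rep_per_run <- cumsum(runs$values == 1)  # each run of 1s starts a new repetition
  rep(rep_per_run, times = runs$lengths)   # expand back to one value per row
}

# Example data shaped like the question's data frame
id <- rep(c("a", "b"), each = 8)
trial <- rep(rep(1:2, each = 2), times = 4)
df <- data.frame(id, trial)

# Apply the helper separately within each id
df$repetition <- ave(df$trial, df$id, FUN = rep_from_runs)
```

Because ave applies the function per id, the counter restarts at 1 for every group, which matches the group_by(id) step above.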
Upvotes: 1
Reputation: 149
I assumed your data looks something like this:
trial=rep(c(1,1,2,2,2,3,4,4,4,4,1,2,2,2,2,2,3,3,3,4,5,5,5,6,6,7,1,1,2,3,3,4,5,6,7,7,7),2)
# First half of the rows belongs to id "a", second half to id "b"
id=c(rep("a",length(trial)/2),rep("b",length(trial)/2))
df=data.frame(id,trial,repetition=numeric(length(trial)))
Then this code does what you are asking for as far as I understood:
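counter = 1
for (i in seq_len(nrow(df))) {
  if (i > 1) {
    if (df$id[i - 1] != df$id[i]) {
      # New id: restart the repetition count
      counter = 1
    } else if (df$trial[i - 1] > df$trial[i]) {
      # Trial number dropped: the sequence has restarted
      counter = counter + 1
    }
    df$repetition[i] = counter
  } else {
    # The first row is always repetition 1
    df$repetition[i] = 1
  }
}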
In my data frame the repetition column already exists, but this also works if the data frame df doesn't have the repetition column yet. It will be added by the code in the loop if it doesn't exist yet.
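For larger data frames the same rule (increase the counter whenever trial drops, reset it for each id) can probably be expressed without an explicit loop. A minimal sketch, assuming the rows are already ordered within each id; count_repetitions is a hypothetical helper name. Note one difference: ave groups by id value, whereas the loop resets whenever the id changes between adjacent rows, so the two agree only when each id's rows are contiguous.

```r
# Hypothetical helper: vectorised version of the counter logic.
count_repetitions <- function(id, trial) {
  ave(trial, id, FUN = function(x) {
    # First row of each id counts as 1; every drop in trial adds 1
    cumsum(c(1, diff(x) < 0))
  })
}

# Small example with variable run lengths and sequence lengths
id <- c(rep("a", 8), rep("b", 3))
trial <- c(1, 1, 2, 3, 1, 2, 2, 1, 1, 2, 1)
repetition <- count_repetitions(id, trial)
```

Unlike a check on trial == 1, this treats any decrease in trial as a restart, which is the same condition the loop uses.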
Upvotes: 1