I have two columns, ID and Trial. The ID column is filled with NAs. The Trial column starts at 0 and ends at an arbitrary number (e.g. 1232), after which the next trial sequence starts again at 0, and so on. My goal is to create a unique ID for each series of trials.
I am new to R and realize that there are several ways to solve this:
1. lapply (or rapply?) together with an abstract (?) function call or handle
2. nextElem from the iterators package
3. together with point 1 or 2, seq() based on some kind of iteration on subsets: ex_data[subset]
So far, I have figured out that the number of participants is:
N <- dim(filter(ex_data, Trial == 0))[1]
Or more elegantly:
N <- sum(ex_data$Trial == 0)  # every series starts with Trial == 0
In particular, I am struggling with the conditional part and with finding the most R-like solution.
Pseudocode:
IDs are 1:N
while IDs < N+1
    current + 1
    while column Trial is > 0
        ID is IDs[current]
        next Trial
    next Trial
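For reference, the intent of the pseudocode above could be written as an explicit loop roughly as follows. This is only a sketch: it assumes a data frame named dta with columns id and t (mirroring the example data in this question) and that the first row starts a series with a trial value of 0.

```r
# Rebuild the example data as a data frame (assumed layout: columns id and t)
t <- c(0:5, 0:7, 0:12)
dta <- data.frame(id = NA_integer_, t = t)

current <- 0L                        # ID of the series currently being filled
for (i in seq_len(nrow(dta))) {
  if (dta$t[i] == 0) {               # a trial value of 0 marks the start of a new series
    current <- current + 1L
  }
  dta$id[i] <- current               # every row gets the ID of its series
}
```

A single pass with one counter is enough here; the nested while loops of the pseudocode are not needed because each row only has to know whether it starts a new series.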
How do I decide when to use loops rather than more compact expressions like the apply family? Specifically, how do I generate a new sequence based on a nearly cyclic column?
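One possible vectorized approach, offered as a sketch rather than the definitive answer: since each series restarts at 0, a running count of the zeros identifies the series. The snippet assumes a data frame named dta with columns id and t, and that the very first row has a trial value of 0 (otherwise the leading rows would get ID 0).

```r
# Rebuild the example data as a data frame (assumed layout: columns id and t)
t <- c(0:5, 0:7, 0:12)
dta <- data.frame(id = NA_integer_, t = t)

# (t == 0) is TRUE exactly where a new series begins;
# cumsum() turns those flags into a running series counter
dta$id <- cumsum(dta$t == 0)

N <- max(dta$id)              # number of series
trials_per_id <- table(dta$id)  # rows per series: 6, 8 and 13 for this data
```

For a matrix built with cbind(), the same idea would be dta[, "id"] <- cumsum(dta[, "t"] == 0). When a result depends only on a cumulative property of a column, as it does here, a vectorized function such as cumsum() is usually shorter and faster than either an explicit loop or lapply.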
Example Data (for generation see below)
      id  t
 [1,] NA  0
 [2,] NA  1
 [3,] NA  2
 [4,] NA  3
 [5,] NA  4
 [6,] NA  5
 [7,] NA  0
 [8,] NA  1
 [9,] NA  2
[10,] NA  3
[11,] NA  4
[12,] NA  5
[13,] NA  6
[14,] NA  7
[15,] NA  0
[16,] NA  1
[17,] NA  2
[18,] NA  3
[19,] NA  4
[20,] NA  5
[21,] NA  6
[22,] NA  7
[23,] NA  8
[24,] NA  9
[25,] NA 10
[26,] NA 11
[27,] NA 12
# Generate Example Data
t <- c(0:5, 0:7, 0:12)
id <- rep(NA, length(t))
dta <- cbind(id, t)
# Optional: convert to a tibble (tbl_df() is deprecated in dplyr; tibble::as_tibble() replaces it)
# dta <- tibble::as_tibble(dta)