noumenal

Reputation: 1237

How can I replace a data.table column with a sequence that depends on another column?

I have two columns, ID and Trial. The ID column is filled with NAs. The Trial column starts at 0 and ends on an arbitrary number (e.g. 1232), whereupon the next trial sequence commences with 0, and so on. My goal is to create a unique ID for each series of trials.

I am new to R and realize that there are several ways to solve this.

So far, I have figured out that the number of participants is:

N <- dim(filter(ex_data, Trial == 0))[1]

Or more elegantly:

N <- ex_data[Trial == 0, .N]

What I am struggling with is the conditional part, and with finding the most R-like solution.

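Pseudocode, written out as an R loop:

ID <- integer(nrow(ex_data))
current <- 0L
for (i in seq_along(ID)) {
    # Trial == 0 marks the start of a new series, so move to the next ID
    if (ex_data$Trial[i] == 0) current <- current + 1L
    ID[i] <- current
}
ex_data$ID <- ID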

How do I decide when to use loops rather than more compact expressions such as the apply family? Specifically, how do I generate a new sequence based on a nearly cyclic column?

Example Data (for generation see below)


      id  t
 [1,] NA  0
 [2,] NA  1
 [3,] NA  2
 [4,] NA  3
 [5,] NA  4
 [6,] NA  5
 [7,] NA  0
 [8,] NA  1
 [9,] NA  2
[10,] NA  3
[11,] NA  4
[12,] NA  5
[13,] NA  6
[14,] NA  7
[15,] NA  0
[16,] NA  1
[17,] NA  2
[18,] NA  3
[19,] NA  4
[20,] NA  5
[21,] NA  6
[22,] NA  7
[23,] NA  8
[24,] NA  9
[25,] NA 10
[26,] NA 11
[27,] NA 12


# Generate Example Data
t <- c(0:5, 0:7, 0:12)
id <- rep(NA, length(t))
dta <- cbind(id, t)
# Optional (using dtplyr)
# dta <- tbl_df(dta)

Upvotes: 0

Views: 146

Answers (2)

akrun

Reputation: 887118

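We can use data.table methods. `!Trial` is TRUE only where Trial is 0, so its cumulative sum increases by one at the start of each series: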

ex_data[, ID := cumsum(!Trial)]
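
To check the one-liner above, here is a minimal sketch on the question's example values. It assumes `ex_data` is a data.table with columns `ID` and `Trial` (the names used in the question text), that every series starts at exactly 0, and that `Trial` has no NAs; a missing value would turn every later ID into NA.

```r
library(data.table)

# Rebuild the question's example as a data.table (column names are assumed)
ex_data <- data.table(ID = NA_integer_, Trial = c(0:5, 0:7, 0:12))

# !Trial is TRUE only where Trial == 0; cumsum counts the resets seen so far
ex_data[, ID := cumsum(!Trial)]

# Number of rows per generated ID: 6, 8 and 13 for IDs 1, 2 and 3
ex_data[, .N, by = ID]
```

Because `:=` modifies the table by reference, no copy of `ex_data` is made, which matters for long trial logs.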

Upvotes: 4

joel.wilson

Reputation: 8413

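A base R solution: `Trial == 0` flags the first row of each series, and the cumulative sum of those flags numbers the series 1, 2, 3, and so on: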

ex_data$ID <- cumsum(ex_data$Trial==0 )
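
As a sketch of how this applies to the question's generated data: `cbind()` there produces a matrix, on which `$` does not work, so the example below first converts it to a data frame. The column names `id` and `t` come from the question's generator; the conversion step is an assumption about how the data would be prepared.

```r
# The question's generator builds a matrix, so convert it before using `$`
t <- c(0:5, 0:7, 0:12)
id <- rep(NA, length(t))
dta <- as.data.frame(cbind(id, t))

# A new series starts wherever t is 0; cumsum numbers the series 1, 2, 3
dta$id <- cumsum(dta$t == 0)

# Rows per id: 6, 8 and 13
table(dta$id)
```

The whole column is computed in one vectorised call, so no explicit loop over rows is needed.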

Upvotes: 0
