KTWillow
KTWillow

Reputation: 307

Data.table: Add rows for missing combinations of 2 factors without losing associated descriptive factors

I have a data table with multiple factors, for example:

dt <- data.table(station=c(1,1,2,2,3), station.type=c("X","X","Y","Y","Y"), stage=c("A","B","A","B","A"), value=10:14)

   station station.type stage value
1:       1            X     A    10
2:       1            X     B    11
3:       2            Y     A    12
4:       2            Y     B    13
5:       3            Y     A    14

Each station is associated with a type (My actual data has over 50 stations and 10 types). In the example, the combination station 3 / stage B is missing. I want to add rows for the missing combinations, while retaining the type associated with the station.

I started from Matt Dowle's answer to this question: Fastest way to add rows for missing values in a data.frame?

setkey(dt, station, stage)
dt[CJ(station, stage, unique=TRUE)]

   station station.type stage value
1:       1            X     A    10
2:       1            X     B    11
3:       2            Y     A    12
4:       2            Y     B    13
5:       3            Y     A    14
6:       3           NA     B    NA

But then I have to do another merge with the original data table to fill in the type for each station.

Is there a way to it all in one line - something like:

dt[CJ(cbind(station, station.type), stage, unique=TRUE)]

(of course this doesn't work because CJ takes vectors as arguments)

Upvotes: 3

Views: 390

Answers (1)

Arun
Arun

Reputation: 118859

Here's one way:

dt[, .SD[.(stage=c("A", "B")), on="stage"], by=.(station, station.type)]

Upvotes: 4

Related Questions