Reputation: 29
I would like to subset a database like this in two parts:
df <- data.frame(ID = c(1,1,1,1,1,2,2,2,3,3,3),
ins =c(1,2,3,4,5,3,2,4,5,8,9),
Ytx = c(NA,NA,1998,NA,NA,NA,NA,NA,NA,2011,NA))
ID ins Ytx
1 1 NA
1 2 NA
1 3 1998
1 4 NA
1 5 NA
2 3 NA
2 2 NA
2 4 NA
3 5 NA
3 8 2011
3 9 NA
The first should look like this (all the values coming after the occurrence of Ytx grouped by ID):
ID ins Ytx
1 3 1998
1 4 NA
1 5 NA
3 8 2011
3 9 NA
and the other one made by what is left:
ID ins Ytx
1 1 NA
1 2 NA
2 3 NA
2 2 NA
2 4 NA
3 5 NA
Thank you for your support
Upvotes: 0
Views: 66
Reputation: 83235
Two other alternatives with data.table:
# convert 'df' tot a 'data.table'
library(data.table)
setDT(df)
# alternative 1
split(df, df[, !!cumsum(!is.na(Ytx)), by = ID]$V1)
# alternative 2
split(df, df[, !!Reduce(`+`, !is.na(Ytx), accumulate = TRUE), by = ID]$V1)
which both give:
$`FALSE` ID ins Ytx 1: 1 1 NA 2: 1 2 NA 3: 2 3 NA 4: 2 2 NA 5: 2 4 NA 6: 3 5 NA $`TRUE` ID ins Ytx 1: 1 3 1998 2: 1 4 NA 3: 1 5 NA 4: 3 8 2011 5: 3 9 NA
Upvotes: 2
Reputation: 25225
Here is an option using data.table
:
setDT(df)[, rn := .I]
idx <- df[,
if (any(!is.na(Ytx)))
.I[seq(.N) >= match(TRUE, !is.na(Ytx))]
, ID]$V1
df[idx]
:
ID ins Ytx rn
1: 1 3 1998 3
2: 1 4 NA 4
3: 1 5 NA 5
4: 3 8 2011 10
5: 3 9 NA 11
df[-idx]
:
ID ins Ytx rn
1: 1 1 NA 1
2: 1 2 NA 2
3: 2 3 NA 6
4: 2 2 NA 7
5: 2 4 NA 8
6: 3 5 NA 9
data:
library(data.table)
df <- data.frame(ID = c(1,1,1,1,1,2,2,2,3,3,3),
ins =c(1,2,3,4,5,3,2,4,5,8,9),
Ytx = c(NA,NA,1998,NA,NA,NA,NA,NA,NA,2011,NA))
Upvotes: 0
Reputation: 389012
In base R, we can use ave
and split
. This will divide data into list of two. First one is the data before value of Ytx
and the second one is rows after Ytx
.
split(df, with(df, ave(!is.na(Ytx), ID, FUN = cumsum)))
#Or if you may have multiple Ytx per ID
#split(df, with(df, ave(!is.na(Ytx), ID, FUN = cumsum)) > 0)
#$`0`
# ID ins Ytx
#1 1 1 NA
#2 1 2 NA
#6 2 3 NA
#7 2 2 NA
#8 2 4 NA
#9 3 5 NA
#$`1`
# ID ins Ytx
#3 1 3 1998
#4 1 4 NA
#5 1 5 NA
#10 3 8 2011
#11 3 9 NA
Upvotes: 2