Reputation: 1532
I have a dataset with nested structures in R (some cells are arrays from its original JSON structure).
set.seed(123)
data = list()
data$nested_df_1 = data.frame(a = letters[1:10]
, b = round(rnorm(10), 0))
data$nested_df_2 = list()
data$nested_df_2$nested_df_2_1 = data.frame(c = letters[11:20]
, d = sample(-100:100, 10))
Now I want to subset the whole list data so that it only includes all instances (= all rows in all structures) where data$nested_df_1$b >= 0.
> data$nested_df_1
a b
1 a -1
2 b 0
3 c 2
4 d 0
5 e 0
6 f 2
7 g 0
8 h -1
9 i -1
10 j 0
Thus, rows 1, 8 and 9 would need to be removed from the whole structure (i.e. from both data$nested_df_1 and data$nested_df_2$nested_df_2_1).
If I just wanted this for the data$nested_df_1 dataframe, I could do:
data$nested_df_1 = data$nested_df_1[data$nested_df_1$b >= 0, ]
(The indices remain constant, i.e. if row_i in data$nested_df_1 meets the criterion, then this is also true for row_i in data$nested_df_2$nested_df_2_1.)
But how can I do the subset for the whole nested structure?
Upvotes: 1
Views: 73
Reputation: 887971
We can create a logical index from data$nested_df_1$b and loop through the list: if an element is a data.frame, subset it directly; otherwise, loop through that nested list and subset each of its elements (this assumes the list nesting has a depth of 2).
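# logical index: TRUE for the rows to keep
i1 <- data$nested_df_1$b >= 0
# data frames are subset directly; nested lists are looped over one level deeper
lapply(data, function(x) if(is.data.frame(x)) subset(x, i1) else
lapply(x, function(y) subset(y, i1)))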
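If the nesting depth is not known in advance, the same idea can be written recursively. The following is only a sketch: subset_nested and keep are hypothetical names introduced here, and the sample values are a fixed stand-in for the question's data (b is copied from the printed output, d is arbitrary). It assumes every data.frame in the structure has the same number of rows in the same order, and it returns anything that is neither a data.frame nor a list unchanged.

```r
# Hypothetical helper (not from the question): recursively subset every
# data.frame found in a nested list, using one shared logical row index.
subset_nested <- function(x, keep) {
  if (is.data.frame(x)) {
    # check data.frame first, because a data.frame is also a list
    x[keep, , drop = FALSE]
  } else if (is.list(x)) {
    # plain list: descend one level and apply the same rule to each element
    lapply(x, subset_nested, keep = keep)
  } else {
    # anything else is returned as-is
    x
  }
}

# Fixed stand-in for the question's data (b taken from the printed output)
data <- list()
data$nested_df_1 <- data.frame(a = letters[1:10],
                               b = c(-1, 0, 2, 0, 0, 2, 0, -1, -1, 0))
data$nested_df_2 <- list()
data$nested_df_2$nested_df_2_1 <- data.frame(c = letters[11:20],
                                             d = 1:10)

keep <- data$nested_df_1$b >= 0
result <- subset_nested(data, keep)
```

Here result has the same list structure and names as data, with rows 1, 8 and 9 dropped from every data.frame. Indexing with x[keep, , drop = FALSE] rather than subset() is a deliberate choice in this sketch, since subset() evaluates its condition inside the data frame and could pick up a column that happens to share the index's name.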
Upvotes: 1