Reputation: 1120
It is going to be hard to give you a reproducible example, but in general this should be an easy task for many of you. My brain has not yet switched on after the midday coffee.
I have a list of 20-30 data frames. I would like to extract specific rows from each data frame. The pattern is very repetitive.
From the first data frame, let's call it LD1, I would like to take rows 1:8, and from every next data frame the row numbers will be higher by 8, so 9:16 for the second, 17:24 for the third, etc.
I would like to keep original names of these data frames.
Can someone switch on a light in my brain?
Upvotes: 1
Views: 110
Reputation: 349
Minimal reproducible example:
# works also if you have matrices instead of data frames
genDF.a <- genDF.b <- genDF.c <- data.frame(matrix(rep(1:100, 2), nrow = 100))
myList <- list(a = genDF.a, b = genDF.b, c = genDF.c)
Now the answer to your question:
# put the indices of the rows you want to extract on a list
myInds <- lapply(0:2, function(i) (1:8)+(8*(i)))
# use mapply to loop over both the list of data frames and the list of indices
# (SIMPLIFY = FALSE keeps the result as a plain list of data frames)
mapply(function(M, ind) M[ind,], myList, myInds, SIMPLIFY = FALSE)
Edit based on the comment of @Sotos
# use Map to loop over both, the list of matrices and the list of indices
Map(function(M, ind) M[ind,], myList, myInds)
You will obtain a list with the desired rows of each data frame, along with the names from the original list.
I filled the rows of the data frames with their corresponding index so that it is easy to check it works.
The output:
$a
X1 X2
1 1 1
2 2 2
3 3 3
4 4 4
5 5 5
6 6 6
7 7 7
8 8 8
$b
X1 X2
9 9 9
10 10 10
11 11 11
12 12 12
13 13 13
14 14 14
15 15 15
16 16 16
$c
X1 X2
17 17 17
18 18 18
19 19 19
20 20 20
21 21 21
22 22 22
23 23 23
24 24 24
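The hard-coded `0:2` above matches the three-element example list. A minimal sketch, assuming every data frame has at least `8 * length(myList)` rows, that derives the offsets from the list length instead (the names `block` and `res` are illustrative and not part of the answer):

```r
# Example data as in the answer: values equal the row index
genDF <- data.frame(matrix(rep(1:100, 2), nrow = 100))
myList <- list(a = genDF, b = genDF, c = genDF)

block <- 8  # rows taken from each data frame (assumed block size)

# One index vector per list element: 1:8, 9:16, 17:24, ...
myInds <- lapply(seq_along(myList) - 1, function(i) seq_len(block) + block * i)

# Map() keeps the names of its first list argument
res <- Map(function(M, ind) M[ind, , drop = FALSE], myList, myInds)

names(res)   # "a" "b" "c"
res$b$X1     # 9 10 11 12 13 14 15 16
```

Because the offsets come from `seq_along(myList)`, the same lines should work unchanged for a list of 20-30 data frames.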
Upvotes: 1
Reputation: 39858
One option involving purrr and dplyr could be:
library(purrr)
library(dplyr)
map2(.x = lst,
     .y = split(1:nrow(lst[[1]]),
                cut(1:nrow(lst[[1]]), c(0, cumsum(rep(5, length(lst)-1)), Inf))),
     ~ .x %>%
       filter(row_number() %in% .y))
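With the sample data below (mtcars, 32 rows each) and a block size of 5, the number of rows returned per data frame is the following. The last data frame receives all remaining rows because the final break is Inf: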
$df1
[1] 5
$df2
[1] 5
$df3
[1] 5
$df4
[1] 17
This could be made slightly more compact:
df_nrow <- 1:nrow(lst[[1]])
n <- 5
map2(.x = lst,
     .y = split(df_nrow,
                cut(df_nrow, c(0, cumsum(rep(n, length(lst)-1)), Inf))),
     ~ .x %>%
       filter(row_number() %in% .y))
Sample data:
lst <- list(df1 = mtcars,
df2 = mtcars,
df3 = mtcars,
df4 = mtcars)
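The `split()`/`cut()` step above decides which row numbers go with which data frame. A small base R sketch of that step alone, using the values from this answer (32 rows, blocks of 5, four data frames); the variable names `n_rows`, `k`, `breaks` and `groups` are illustrative:

```r
n_rows <- 32   # number of rows in mtcars
n <- 5         # block size used in the answer
k <- 4         # number of data frames in lst

# Breaks at 0, 5, 10, 15, Inf give the intervals (0,5], (5,10], (10,15], (15,Inf]
breaks <- c(0, cumsum(rep(n, k - 1)), Inf)
groups <- split(seq_len(n_rows), cut(seq_len(n_rows), breaks))

lengths(groups)   # 5 5 5 17: the last group absorbs all remaining rows
groups[[2]]       # 6 7 8 9 10
```

If the last block should also have exactly `n` rows, one option would be to replace `Inf` with `n * k` and drop the unassigned rows, but that is a design choice the answer does not make.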
Upvotes: 1
Reputation: 10375
Using mapply:
df <- list(a = mtcars, b = mtcars, c = mtcars)
ix <- list(1:8, 9:16, 17:24)
mapply(function(x, y) {list(x[y, ])}, x = df, y = ix)
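One caveat with plain row indexing as above: if a data frame has fewer rows than its index range asks for, R returns rows filled with NA rather than raising an error. A hedged sketch of a guard, assuming that silently truncating is the desired behaviour (the shortened third data frame is made up for illustration):

```r
# The third data frame is deliberately too short for rows 17:24
df <- list(a = mtcars, b = mtcars, c = head(mtcars, 20))
ix <- list(1:8, 9:16, 17:24)

# Indexing past the last row still returns 8 rows; the missing ones are all NA
nrow(df$c[ix[[3]], ])

# Keep only the indices that exist in each data frame
res <- Map(function(x, y) x[y[y <= nrow(x)], , drop = FALSE], df, ix)
sapply(res, nrow)   # a = 8, b = 8, c = 4
```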
Upvotes: 0
Reputation: 51592
One idea is to use Map and create the indices using a simple mathematical formula, which will work for any number of data frames in your list, i.e.
Map(function(x, y) x[seq(8) + y * 8, , drop = FALSE], l2, 0:(length(l2) - 1))
which gives,
$v1
v1
1 444
2 52
3 345
4 48
5 375
6 491
7 10
8 126
$v1
v1
9 57
10 354
11 239
12 205
13 273
14 172
15 345
16 293
$v1
v1
17 366
18 487
19 423
20 194
21 18
22 476
23 151
24 382
$v1
v1
25 131
26 245
27 10
28 41
29 248
30 104
31 163
32 187
$v1
v1
33 335
34 44
35 442
36 362
37 470
38 145
39 384
40 257
where l2 is,
dput(l2)
list(v1 = structure(list(v1 = c(444L, 52L, 345L, 48L, 375L, 491L,
10L, 126L, 231L, 124L, 494L, 476L, 213L, 208L, 35L, 327L, 294L,
467L, 39L, 295L, 12L, 49L, 201L, 335L, 72L, 204L, 453L, 299L,
157L, 355L, 380L, 348L, 309L, 117L, 404L, 304L, 222L, 287L, 500L,
406L, 340L, 166L, 442L, 256L, 354L, 269L, 98L, 245L, 471L, 253L,
15L, 130L, 434L, 329L, 465L, 18L, 346L, 389L, 185L, 238L)), row.names = c(NA,
-60L), class = "data.frame"), v1 = structure(list(v1 = c(67L,
461L, 68L, 420L, 59L, 291L, 391L, 275L, 57L, 354L, 239L, 205L,
273L, 172L, 345L, 293L, 236L, 304L, 70L, 410L, 91L, 204L, 343L,
386L, 400L, 482L, 221L, 190L, 340L, 328L, 367L, 36L, 95L, 229L,
98L, 148L, 255L, 490L, 101L, 480L, 113L, 122L, 330L, 31L, 276L,
18L, 192L, 243L, 178L, 240L, 297L, 75L, 381L, 144L, 71L, 208L,
76L, 46L, 146L, 373L)), row.names = c(NA, -60L), class = "data.frame"),
v1 = structure(list(v1 = c(344L, 200L, 282L, 236L, 404L,
201L, 286L, 185L, 479L, 46L, 32L, 124L, 365L, 297L, 66L,
483L, 366L, 487L, 423L, 194L, 18L, 476L, 151L, 382L, 240L,
261L, 346L, 345L, 85L, 332L, 179L, 67L, 87L, 415L, 98L, 480L,
320L, 307L, 141L, 224L, 27L, 432L, 103L, 23L, 370L, 306L,
153L, 78L, 418L, 186L, 459L, 162L, 59L, 484L, 20L, 385L,
216L, 116L, 99L, 301L)), row.names = c(NA, -60L), class = "data.frame"),
v1 = structure(list(v1 = c(358L, 233L, 343L, 121L, 22L, 230L,
461L, 430L, 246L, 19L, 155L, 303L, 197L, 276L, 44L, 264L,
102L, 243L, 153L, 385L, 89L, 49L, 360L, 148L, 131L, 245L,
10L, 41L, 248L, 104L, 163L, 187L, 5L, 179L, 341L, 322L, 250L,
210L, 223L, 103L, 80L, 151L, 263L, 310L, 34L, 275L, 165L,
328L, 71L, 364L, 454L, 336L, 249L, 205L, 284L, 419L, 113L,
185L, 416L, 298L)), row.names = c(NA, -60L), class = "data.frame"),
v1 = structure(list(v1 = c(393L, 346L, 227L, 242L, 61L, 264L,
106L, 326L, 278L, 150L, 397L, 398L, 199L, 478L, 430L, 134L,
297L, 291L, 341L, 436L, 47L, 94L, 275L, 419L, 448L, 180L,
24L, 440L, 135L, 260L, 472L, 158L, 335L, 44L, 442L, 362L,
470L, 145L, 384L, 257L, 6L, 333L, 429L, 149L, 62L, 173L,
109L, 330L, 492L, 286L, 328L, 178L, 197L, 367L, 282L, 426L,
466L, 111L, 123L, 251L)), row.names = c(NA, -60L), class = "data.frame"))
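The `drop = FALSE` in the `Map` call above matters here because each data frame in `l2` has a single column. A short sketch of the difference, using a made-up one-column data frame `one_col`:

```r
one_col <- data.frame(v1 = c(444L, 52L, 345L, 48L))

# Without drop = FALSE, a one-column data frame collapses to a plain vector
class(one_col[1:2, ])                  # "integer"

# With drop = FALSE, the result stays a data frame and keeps its row names
class(one_col[1:2, , drop = FALSE])    # "data.frame"
```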
Upvotes: 3
Reputation: 168
You can use lapply and modify it to your needs. A working example:
# create some sample data
sample_list <- lapply(1:30, function(i) {
tibble::tibble(x = i * 1:1000, y = 2 * x)
})
# number of rows to extract/skip
skip_no <- 8
# use lapply with an anonymous function
lapply(seq_along(sample_list), function(i) {
  # first row to extract from the i-th data frame: 1, 9, 17, ...
  # (for i == 1 this evaluates to 1, so no special case is needed)
  current_index <- (i - 1) * skip_no + 1
  sample_list[[i]][current_index:(current_index + skip_no - 1), ]
})
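Looping over positions as above returns an unnamed list, while the question asks to keep the original names. A minimal sketch of one way to carry the names through, assuming the list is named (the names `LD1` to `LD3` and the helper variables `idx` and `start` are illustrative):

```r
# Small named example list
sample_list <- list(
  LD1 = data.frame(x = 1:100),
  LD2 = data.frame(x = 1:100),
  LD3 = data.frame(x = 1:100)
)
skip_no <- 8

# Iterate over a named index vector so that lapply() returns a named list
idx <- setNames(seq_along(sample_list), names(sample_list))
res <- lapply(idx, function(i) {
  start <- (i - 1) * skip_no + 1
  sample_list[[i]][start:(start + skip_no - 1), , drop = FALSE]
})

names(res)   # "LD1" "LD2" "LD3"
res$LD2$x    # 9 10 11 12 13 14 15 16
```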
Upvotes: 0