Reputation: 95
I'm new to writing for loops, but this dataset is too large to handle one data frame at a time (though I'm open to suggestions). I have a list of dataframes (A), and I am trying to rbind each of them to specific, corresponding columns in another dataframe (B). The dataframes in A each have 3 columns, and dataframe B has 6 columns. I want to rbind the first df in A with B columns 1, 5 & 6; the second df in A with B columns 2, 5 & 6, and so on.
Some example data:
B = data.frame("x" = c(1:3),"y"=c(1:3),"z"=c(1:3),"presence" = "y", "clust" = c(1:3))
A <- list()
for (i in 1:3) {
  A[[i]] <- data.frame("value" = 1 - B[,i],
                       "presence" = "n",
                       "clust" = rownames(B))
}
I would like to end up with 3 dfs, each with 3 columns (x,presence,clust then y,presence,clust etc). I am trying to rbind them with a for loop like so:
combined <- list()
for (i in 1:2) {
  combined[[i]] <- rbind(A[[i]],B[,c(i,3,4)])
}
Error in match.names(clabs, names(xi)) : names do not match previous names
Essentially, how do I rbind the columns by index rather than name?
Upvotes: 2
Views: 71
Reputation: 79184
This solution applies type.convert() to A to adjust data types, then uses map() and bind_rows() to append selected columns from B to each dataframe in A.
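library(dplyr)
library(purrr)
# imap() passes the list position as .y, so element i of A is paired
# with column i of B (renamed to value) plus presence and clust.
A |>
  type.convert(as.is = TRUE) |>
  imap(~ bind_rows(.x, B |>
    select(value = all_of(names(B)[.y]), presence, clust)
  ))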
[[1]]
value presence clust
1 0 n 1
2 -1 n 2
3 -2 n 3
4 1 y 1
5 2 y 2
6 3 y 3
[[2]]
value presence clust
1 0 n 1
2 -1 n 2
3 -2 n 3
4 1 y 1
5 2 y 2
6 3 y 3
[[3]]
value presence clust
1 0 n 1
2 -1 n 2
3 -2 n 3
4 1 y 1
5 2 y 2
6 3 y 3
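To see why the type.convert() step is there, it may help to compare column classes before and after the conversion. The sketch below rebuilds the example data from the question using base R only (A_conv is just an illustrative name). clust in A comes from rownames(B), so it starts out as character, while clust in B is integer, and bind_rows() will not combine a character column with an integer one.

```r
# Rebuild the example data from the question
B <- data.frame(x = 1:3, y = 1:3, z = 1:3, presence = "y", clust = 1:3)
A <- lapply(1:3, function(i) {
  data.frame(value = 1 - B[, i], presence = "n", clust = rownames(B))
})

# Before conversion: clust is character in A but integer in B
class(A[[1]]$clust)   # "character"
class(B$clust)        # "integer"

# type.convert() on the list is applied to every column of every data frame
A_conv <- type.convert(A, as.is = TRUE)
class(A_conv[[1]]$clust)   # "integer"
```

After the conversion the clust columns agree, so the row bind goes through.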
Upvotes: 0
Reputation: 93938
Match the names up inline with a Mapping function:
Map(\(a,b) rbind(setNames(a, names(b)), b),
A, lapply(1:3, \(x) B[c(x,4,5)]))
##[[1]]
## x presence clust
##1 0 n 1
##2 -1 n 2
##3 -2 n 3
##4 1 y 1
##5 2 y 2
##6 3 y 3
##
##[[2]]
## y presence clust
##1 0 n 1
##2 -1 n 2
##3 -2 n 3
##4 1 y 1
##5 2 y 2
##6 3 y 3
##
##[[3]]
## z presence clust
##1 0 n 1
##2 -1 n 2
##3 -2 n 3
##4 1 y 1
##5 2 y 2
##6 3 y 3
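Since the question started from a for loop, here is a minimal sketch of the same rename-then-rbind idea in loop form. It assumes the example data from the question, where presence and clust are columns 4 and 5 of B; with the real data those two fixed indices would need adjusting (the question mentions columns 5 and 6). The names a_i and b_i are only illustrative.

```r
# Rebuild the example data from the question
B <- data.frame(x = 1:3, y = 1:3, z = 1:3, presence = "y", clust = 1:3)
A <- lapply(1:3, function(i) {
  data.frame(value = 1 - B[, i], presence = "n", clust = rownames(B))
})

combined <- vector("list", length(A))
for (i in seq_along(A)) {
  b_i <- B[, c(i, 4, 5)]                # column i plus presence and clust
  a_i <- setNames(A[[i]], names(b_i))   # rename so rbind() can match names
  combined[[i]] <- rbind(a_i, b_i)
}

names(combined[[2]])   # "y" "presence" "clust"
```

rbind() matches data frame columns by name, so copying the names from the B subset onto each element of A is what makes binding "by index" work.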
Upvotes: 2
Reputation: 19191
First, rename the value column to x, y, or z in each data frame:
A_ <- lapply(seq_along(A), \(x) {
names(A[[x]])[1] <- names(B[x]); A[[x]]
})
This enables the use of rbindlist from data.table, which binds by variable name when fill = TRUE.
# Keep only the columns that exist in x (x|y|z, presence, clust)
lapply(A_, \(x)
  data.table::rbindlist(list(x, B), fill = TRUE)[, names(x), with = FALSE])
[[1]]
x presence clust
<num> <char> <char>
1: 0 n 1
2: -1 n 2
3: -2 n 3
4: 1 y 1
5: 2 y 2
6: 3 y 3
[[2]]
y presence clust
<num> <char> <char>
1: 0 n 1
2: -1 n 2
3: -2 n 3
4: 1 y 1
5: 2 y 2
6: 3 y 3
[[3]]
z presence clust
<num> <char> <char>
1: 0 n 1
2: -1 n 2
3: -2 n 3
4: 1 y 1
5: 2 y 2
6: 3 y 3
Note, use as.data.frame within lapply to get back data.frames.
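As a rough sketch of that note, the conversion can be done inside the same lapply call. This assumes data.table is installed and reuses the example data from the question; res, d, and bound are illustrative names only.

```r
library(data.table)

# Rebuild the example data from the question
B <- data.frame(x = 1:3, y = 1:3, z = 1:3, presence = "y", clust = 1:3)
A <- lapply(1:3, function(i) {
  data.frame(value = 1 - B[, i], presence = "n", clust = rownames(B))
})

# Rename the first column of each element of A to x, y, or z
A_ <- lapply(seq_along(A), function(i) {
  setNames(A[[i]], c(names(B)[i], "presence", "clust"))
})

res <- lapply(A_, function(d) {
  bound <- rbindlist(list(d, B), fill = TRUE)   # bind by column name
  as.data.frame(bound)[names(d)]                # plain data.frame, 3 columns
})

class(res[[1]])   # "data.frame"
```

Subsetting after as.data.frame() uses ordinary data.frame indexing, so no data.table-specific syntax is needed for the column selection.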
Upvotes: 3