Evelyn Abbott

Reputation: 95

use for loop to rbind dataframes of different sizes

I'm new to writing for loops, but this dataset is too large to handle one dataframe at a time (though I'm open to suggestions). I have a list of dataframes (A), and I am trying to rbind each of them to specific, corresponding columns in another dataframe (B). The dataframes in A each have 3 columns, and dataframe B has 6 columns. I want to rbind the first df in A with B columns 1, 5 & 6; the second df in A with B columns 2, 5 & 6; and so on.

Some example data:

B = data.frame("x" = c(1:3),"y"=c(1:3),"z"=c(1:3),"presence" = "y", "clust" = c(1:3))

A <- list()
for (i in 1:3) {
  A[[i]] <- data.frame("value" = 1 - B[,i],
                       "presence" = "n",
                       "clust" = rownames(B))
}

I would like to end up with 3 dfs, each with 3 columns (x,presence,clust then y,presence,clust etc). I am trying to rbind them with a for loop like so:

combined <- list()
for (i in 1:2) {
  combined[[i]] <- rbind(A[[i]],B[,c(i,3,4)])
}

Error in match.names(clabs, names(xi)) : names do not match previous names

Essentially, how do I rbind the columns by index rather than name?

Upvotes: 2

Views: 71

Answers (3)

TarJae

Reputation: 79184

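This solution applies type.convert() to A so that clust has the same type as in B, then uses imap() and bind_rows() to append the matching value column of B (selected by position), together with presence and clust, to each dataframe in A.

library(dplyr)
library(purrr)

A |> 
  type.convert(as.is = TRUE) |> 
  # .x is the i-th dataframe in A, .y is its index i (A is unnamed)
  imap(~ bind_rows(.x, B |> 
                     select(value = all_of(.y), presence, clust)
                   ) 
       )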
[[1]]
  value presence clust
1     0        n     1
2    -1        n     2
3    -2        n     3
4     1        y     1
5     2        y     2
6     3        y     3

[[2]]
  value presence clust
1     0        n     1
2    -1        n     2
3    -2        n     3
4     1        y     1
5     2        y     2
6     3        y     3

[[3]]
  value presence clust
1     0        n     1
2    -1        n     2
3    -2        n     3
4     1        y     1
5     2        y     2
6     3        y     3
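
The type.convert() step matters because clust is built from rownames(B) and is therefore character in A, while it is integer in B, and bind_rows() will not combine columns of incompatible types. A minimal base R sketch of just that step, assuming the example A and B from the question (A_conv is a name introduced here for illustration):

```r
# Rebuild the example data from the question
B <- data.frame(x = 1:3, y = 1:3, z = 1:3, presence = "y", clust = 1:3)
A <- lapply(1:3, function(i) {
  data.frame(value = 1 - B[, i], presence = "n", clust = rownames(B))
})

# clust differs in type between the two sources
class(A[[1]]$clust)
class(B$clust)

# type.convert() has a list method, so it can be applied to the whole list;
# as.is = TRUE keeps non-numeric strings as character instead of factor
A_conv <- type.convert(A, as.is = TRUE)
class(A_conv[[1]]$clust)
```

After the conversion, clust is integer in both places and presence is still character, so the columns line up by type as well as by name.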

Upvotes: 0

thelatemail

Reputation: 93938

Match the names up inline with a mapping function such as Map():

Map(\(a,b) rbind(setNames(a, names(b)), b),
    A, lapply(1:3, \(x) B[c(x,4,5)]))

##[[1]]
##   x presence clust
##1  0        n     1
##2 -1        n     2
##3 -2        n     3
##4  1        y     1
##5  2        y     2
##6  3        y     3
##
##[[2]]
##   y presence clust
##1  0        n     1
##2 -1        n     2
##3 -2        n     3
##4  1        y     1
##5  2        y     2
##6  3        y     3
##
##[[3]]
##   z presence clust
##1  0        n     1
##2 -1        n     2
##3 -2        n     3
##4  1        y     1
##5  2        y     2
##6  3        y     3
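
Since the question asked for a for loop, the same name-matching idea can also be written that way. This is only a sketch, assuming the 5-column example B from the question (so presence and clust sit at positions 4 and 5; adjust the indices for the real data); a_i and b_i are names introduced here for illustration:

```r
# Rebuild the example data from the question
B <- data.frame(x = 1:3, y = 1:3, z = 1:3, presence = "y", clust = 1:3)
A <- lapply(1:3, function(i) {
  data.frame(value = 1 - B[, i], presence = "n", clust = rownames(B))
})

combined <- vector("list", length(A))
for (i in seq_along(A)) {
  b_i <- B[, c(i, 4, 5)]               # i-th value column plus presence and clust
  a_i <- setNames(A[[i]], names(b_i))  # rename A's columns by position to match
  combined[[i]] <- rbind(a_i, b_i)     # names now agree, so rbind() succeeds
}

names(combined[[2]])
```

Renaming by position before calling rbind() is what sidesteps the "names do not match previous names" error, because rbind() on dataframes matches columns by name.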

Upvotes: 2

Andre Wildberg

Reputation: 19191

First, rename the value column in each dataframe of A to the matching column name in B (x, y, z):

A_ <- lapply(seq_along(A), \(x) {
  names(A[[x]])[1] <- names(B[x]); A[[x]]
})

This enables the use of rbindlist() from data.table, which binds by column name when fill = TRUE.

lapply(A_, \(x) 
  data.table::rbindlist(list(x, B), fill = TRUE)[, seq_along(A_), with = FALSE])
[[1]]
       x presence  clust
   <num>   <char> <char>
1:     0        n      1
2:    -1        n      2
3:    -2        n      3
4:     1        y      1
5:     2        y      2
6:     3        y      3

[[2]]
       y presence  clust
   <num>   <char> <char>
1:     0        n      1
2:    -1        n      2
3:    -2        n      3
4:     1        y      1
5:     2        y      2
6:     3        y      3

[[3]]
       z presence  clust
   <num>   <char> <char>
1:     0        n      1
2:    -1        n      2
3:    -2        n      3
4:     1        y      1
5:     2        y      2
6:     3        y      3

Note: wrap the result in as.data.frame() inside the lapply() call to get data.frames back instead of data.tables.

Upvotes: 3
