Shrilaxmi M S

Reputation: 171

Combine all Rdata files in a directory that contain an object with the same name

I want to combine my Rdata files, all of which contain an object with the same name, into one object and save it to a new combined Rdata file in a directory, similar to this thread. I have not been able to do it and keep getting an error. Can anyone suggest a simple way to do this in R? I am new to R and cannot figure it out.

all.files = c("data1.Rdata", "data1.Rdata", "data1.Rdata")

mylist<- lapply(all.files, function(x) {
  load(file = x)
  get(ls()[ls()!= "filename"])
})

names(mylist) <- all.files

Upvotes: 1

Views: 618

Answers (2)

patL

Reputation: 2299

If I understood correctly, you want to combine all of those .RData files into a single data.frame.

One option is to list all files in your working directory that have the extension .RData, load and combine them using rbind:

# pattern is a regular expression: escape the dot and anchor at the end of the name
ll <- list.files(pattern = '\\.RData$')

res <- do.call(rbind,
               lapply(ll, function(x) {
                 
                 load(file = x)
                 get(ls())
               }))

Now we can check the top 6 rows:

head(res)

#     chrom     start       end                gid         gname                tid strand
#32590   chr7  45574608  45574777 ENSMUSG00000085214 0610005C13Rik ENSMUST00000130094      -
#109006  chr4 154023688 154023891 ENSMUSG00000078350 1190007F08Rik ENSMUST00000143047      -
#475764 chr15  83365029  83365513 ENSMUSG00000075511 1700001L05Rik ENSMUST00000178628      -
#448806 chr13  31567474  31567610 ENSMUSG00000038408 1700018A04Rik ENSMUST00000150418      -
#11159   chr6 147694981 147695041 ENSMUSG00000085077 1700049E15Rik ENSMUST00000152737      +
#339243 chr12  22958352  22960254 ENSMUSG00000073164 2410018L13Rik ENSMUST00000149246      -
#             class biotype byname.uniq bygid.uniq
#32590  altAcceptor lincRNA        TRUE       TRUE
#109006 altAcceptor lincRNA        TRUE       TRUE
#475764 altAcceptor lincRNA        TRUE       TRUE
#448806 altAcceptor lincRNA        TRUE       TRUE
#11159  altAcceptor lincRNA        TRUE       TRUE
#339243 altAcceptor lincRNA        TRUE       TRUE

and the bottom 6 too:

tail(res)
#       chrom     start       end                gid gname                tid strand
#189235  chr6  90373711  90373841 ENSMUSG00000034430  Zxdc ENSMUST00000113539      +
#563026 chr11  72916473  72916587 ENSMUSG00000055670 Zzef1 ENSMUST00000069395      +
#563046 chr11  72916473  72916587 ENSMUSG00000055670 Zzef1 ENSMUST00000152481      +
#158407  chr3 152449013 152449128 ENSMUSG00000039068  Zzz3 ENSMUST00000106101      +
#158450  chr3 152449013 152449128 ENSMUSG00000039068  Zzz3 ENSMUST00000106103      +
#158465  chr3 152449016 152449128 ENSMUSG00000039068  Zzz3 ENSMUST00000089982      +
#             class        biotype byname.uniq bygid.uniq
#189235 altAcceptor protein_coding       FALSE      FALSE
#563026 altAcceptor protein_coding       FALSE      FALSE
#563046 altAcceptor protein_coding       FALSE      FALSE
#158407 altAcceptor protein_coding       FALSE      FALSE
#158450 altAcceptor protein_coding       FALSE      FALSE
#158465 altAcceptor protein_coding       FALSE      FALSE

You can also check the dimensions:

dim(res)
#24279    11

Edit

This worked on R 4.0.3. It seems that it fails on R 4.1.1. I'll edit the answer with a new solution.
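
A minimal sketch of a variant that does not depend on get(ls()). Inside the anonymous function, ls() also returns the argument name x, so get() can receive more than one name, which may be what newer R versions reject. The sketch loads each file into its own environment and uses the object names that load() returns. The helper name load_one, the temporary directory, and the demo files are assumptions made for illustration, and it assumes every file holds exactly one data.frame.

```r
# Demo data: two small .RData files, each holding one object named "dat"
# (file names and contents are made up for illustration).
tmp <- tempfile("rdata_demo")
dir.create(tmp)
dat <- data.frame(id = 1:2, value = c("a", "b"))
save(dat, file = file.path(tmp, "part1.RData"))
dat <- data.frame(id = 3:4, value = c("c", "d"))
save(dat, file = file.path(tmp, "part2.RData"))

# Hypothetical helper: load one file into a private environment and
# return the single object stored in it.
load_one <- function(path) {
  env <- new.env()
  obj_names <- load(file = path, envir = env)  # load() returns the object names
  stopifnot(length(obj_names) == 1)            # assumption: one object per file
  get(obj_names, envir = env)
}

ll <- list.files(tmp, pattern = "\\.RData$", full.names = TRUE)
res <- do.call(rbind, lapply(ll, load_one))
dim(res)
```

Because each file is loaded into a fresh environment, nothing from a previous file or from the function's own arguments can be picked up by accident.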

Upvotes: 1

Nick

Reputation: 349

I typically do this with a loop:

FileVector <- c("data1.Rdata", "data1.Rdata", "data1.Rdata")
Res <- vector(mode = "list",
              length = length(FileVector))

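for (m1 in seq_along(FileVector)) {
  # load() returns the names of the objects it restored
  FilesLoaded <- load(file = FileVector[m1],
                      verbose = FALSE)
  # keep the object only if this file contained one named "filename"
  if ("filename" %in% FilesLoaded) {
    Res[[m1]] <- get("filename")
  }
  # clean up so objects from one file cannot leak into the next iteration
  rm(list = FilesLoaded)
}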

This gives us a list. We can add other checks to the loop, for example to skip data that has a particular value in some column, or to verify that each new piece of data contains certain column names. You can also wrap your load() call in a try() call if you have real-world concerns such as some data files not having been generated properly. Then we just slam it all together with do.call():

# Null positions will be dropped
Res <- do.call(rbind,
               Res)
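
The extra checks mentioned above could look roughly like the following sketch. It wraps load() in try() and keeps a data frame only if it has the required columns. The column names in RequiredCols, the temporary directory, and the three demo files are assumptions made up for illustration; "filename" is the object name used in the loop above.

```r
# Demo files (made up): one good file, one missing a column, one corrupt.
tmp <- tempfile("rdata_checks")
dir.create(tmp)
filename <- data.frame(chrom = "chr1", start = 10L)
save(filename, file = file.path(tmp, "good.Rdata"))
filename <- data.frame(chrom = "chr2")
save(filename, file = file.path(tmp, "nostart.Rdata"))
writeLines("not an R data file", file.path(tmp, "broken.Rdata"))

FileVector <- list.files(tmp, pattern = "\\.Rdata$", full.names = TRUE)
RequiredCols <- c("chrom", "start")   # assumed column names
Res <- vector(mode = "list", length = length(FileVector))

for (m1 in seq_along(FileVector)) {
  env <- new.env()
  # try() keeps the loop going when a file cannot be read
  Loaded <- try(suppressWarnings(load(file = FileVector[m1], envir = env)),
                silent = TRUE)
  # skip unreadable files and files without an object named "filename"
  if (inherits(Loaded, "try-error") || !("filename" %in% Loaded)) next
  Current <- get("filename", envir = env)
  # keep only data that has every required column
  if (all(RequiredCols %in% colnames(Current))) Res[[m1]] <- Current
}

# Positions that were skipped are still NULL and are dropped by rbind()
Res <- do.call(rbind, Res)
```

Loading into a fresh environment on each iteration replaces the rm() cleanup step, since nothing is ever written to the global environment.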

It's usually advantageous to build your vector of file names with something like list.files() and specifying the pattern = argument.

Generally that might look something like:

FileVector <- list.files(path = "~/my/directory",
                         full.names = TRUE,
                         pattern = "mypattern")
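
To finish the task from the question, the combined object can be written to a new Rdata file with save(). This is only a sketch: the data frame below is a made-up stand-in for the combined result, and the output path OutFile is a hypothetical location.

```r
# Stand-in for the combined result (made-up values).
Res <- data.frame(chrom = c("chr1", "chr2"), start = c(10L, 20L))

# Hypothetical output location; replace with a real directory.
OutFile <- file.path(tempdir(), "combined.Rdata")
save(Res, file = OutFile)

# Reload into a separate environment to confirm what was written.
Check <- new.env()
LoadedNames <- load(OutFile, envir = Check)
isTRUE(all.equal(Check$Res, Res))
```

Note that load() restores the object under the name it was saved with, here Res, so the name chosen at save() time is the one later code has to look for.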

Upvotes: 0
