goclem

Reputation: 954

Merging dataframes by names using R

I want to create a function that merges the data frames whose names contain a given character string. In the following example, myfun("A") would merge the data frames whose names contain "A", that is, A1 and A2, leaving B1 out.

A1=data.frame(id=paste0("id",1:10),var1=letters[sample(1:26,10)])
A2=data.frame(id=paste0("id",1:10),var2=LETTERS[sample(1:26,10)])
B1=data.frame(id=paste0("id",1:10),var3=letters[sample(1:26,10)])

My best try (which does not work):

myfun=function(my.pattern){
  dfs=ls(,pattern=paste(my.pattern)) # Getting the list of dataframes whose name contains the pattern
  merged_df=merge(dfs[1],dfs[2],by=id) # Merging those dataframes
  return(merged_df)
}

Upvotes: 0

Views: 180

Answers (2)

Timo Kvamme

Reputation: 2964

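I use this function often. It takes a directory and a criterion, reads every file in that directory whose path matches the criterion as a .csv, and stacks the results row-wise with rbind (so the files need the same columns).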

combine_csv <- function(dir, criterion1 = "subject") {
    # List everything in the directory, with full paths
    dir <- list.files(dir, full.names = TRUE)
    cat_string <- c() # initialize character vector of matching paths

    for (i in dir) {
            if (grepl(criterion1, i)) {
                    cat_string <- c(cat_string, i)
            }
    }
    # Read each matching file and stack the rows into one data frame
    tables <- lapply(cat_string, read.csv, header = TRUE)
    data <- do.call(rbind, tables)
    return(data)
}
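A minimal usage sketch, assuming a throwaway directory and hypothetical file names (subject_01.csv, subject_02.csv, other.csv). The function is repeated in a condensed, vectorized form only so that the block runs on its own. Note that the criterion is matched against the full path, so a directory name containing the criterion would match every file in it.

```r
# Condensed equivalent of combine_csv(), repeated so this block is self-contained
combine_csv <- function(dir, criterion1 = "subject") {
  files <- list.files(dir, full.names = TRUE)
  matched <- files[grepl(criterion1, files)]          # keep paths matching the criterion
  tables <- lapply(matched, read.csv, header = TRUE)  # one data frame per file
  do.call(rbind, tables)                              # stack rows; columns must agree
}

# Hypothetical demo data in a temporary directory
demo_dir <- file.path(tempdir(), "combine_csv_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.csv(data.frame(id = 1:2, score = c(10, 20)),
          file.path(demo_dir, "subject_01.csv"), row.names = FALSE)
write.csv(data.frame(id = 3:4, score = c(30, 40)),
          file.path(demo_dir, "subject_02.csv"), row.names = FALSE)
write.csv(data.frame(id = 9, score = 99),
          file.path(demo_dir, "other.csv"), row.names = FALSE)

# Only the two "subject" files are read; other.csv is skipped
combined <- combine_csv(demo_dir, criterion1 = "subject")
```

This stacks rows rather than joining on a key, so it addresses a different case from merging A1 and A2 by id.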

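It can also be customized to require several criteria at once, by replacing the condition inside the loop (criterion2 being an extra argument you add to the function):

if (grepl(criterion1, i) && grepl(criterion2, i))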

The way I usually use it is to look through a parent folder and check whether its subfolders contain the .csv files I'm looking for:

    # parent_dir_content is assumed to hold the full paths of the subfolders
    for (i in seq_along(parent_dir_content)) {
            cur_dir <- parent_dir_content[i]
            if (grepl(criterion1, cur_dir)) {
                    cur_files <- list.files(cur_dir, full.names = TRUE)
                    # seq_along() skips the loop cleanly when a folder is empty
                    for (j in seq_along(cur_files)) {
                            cur_file <- cur_files[j]
                            if (grepl(criterion2, cur_file)) {
                                    cat_string <- c(cat_string, cur_file)
                            }
                    }
            }
    }
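The nested loops above could also be sketched with list.files(recursive = TRUE), which descends into subfolders itself. This is an alternative under stated assumptions, not the original code: find_csv, parent_dir and the folder layout are hypothetical, and criterion1 is checked against the immediate folder name while criterion2 is checked against the file name.

```r
# find_csv() is a hypothetical helper, not part of the original answer
find_csv <- function(parent_dir, criterion1, criterion2) {
  # List every .csv below parent_dir, descending into subfolders
  files <- list.files(parent_dir, pattern = "\\.csv$",
                      recursive = TRUE, full.names = TRUE)
  # criterion1 applies to the containing folder, criterion2 to the file name
  keep <- grepl(criterion1, basename(dirname(files))) &
          grepl(criterion2, basename(files))
  files[keep]
}

# Hypothetical layout: groupA/subject_01.csv, groupA/notes.csv, groupB/subject_02.csv
parent_dir <- file.path(tempdir(), "find_csv_demo")
dir.create(file.path(parent_dir, "groupA"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(parent_dir, "groupB"), recursive = TRUE, showWarnings = FALSE)
write.csv(data.frame(id = 1), file.path(parent_dir, "groupA", "subject_01.csv"), row.names = FALSE)
write.csv(data.frame(id = 2), file.path(parent_dir, "groupA", "notes.csv"), row.names = FALSE)
write.csv(data.frame(id = 3), file.path(parent_dir, "groupB", "subject_02.csv"), row.names = FALSE)

# Paths of the .csv files in folders matching "groupA" whose names match "subject"
hits <- find_csv(parent_dir, criterion1 = "groupA", criterion2 = "subject")
```

The resulting vector of paths can then be passed to lapply(hits, read.csv) and do.call(rbind, ...) as in combine_csv.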

Upvotes: 0

akrun

Reputation: 887118

ls returns only the names of the matching objects, so we can pass those names to mget to retrieve the objects themselves as a list, and then merge the list elements one after another with Reduce:

myfun <- function(my.pattern){
 v1 <- ls(pattern=my.pattern, envir=parent.frame())
 Reduce(function(...) merge(..., by = 'id'), mget(v1, envir=parent.frame()))
}

myfun('A\\d+')
#     id var1 var2
#1   id1    d    R
#2  id10    c    V
#3   id2    z    E
#4   id3    w    W
#5   id4    l    U
#6   id5    y    X
#7   id6    h    P
#8   id7    n    H
#9   id8    f    O
#10  id9    g    A
# (the letters vary between runs because the example data uses sample())
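To make the difference from the attempt in the question concrete: ls() gives a character vector of names, while mget() gives a named list of the objects, which is what merge needs. Below is a small reproducible sketch using the data from the question. merge_by_pattern is a hypothetical variant of myfun (not the code above) that additionally skips any matching object that is not a data frame, and set.seed is added only so the run is repeatable.

```r
set.seed(1)  # only to make the sampled letters reproducible
A1 <- data.frame(id = paste0("id", 1:10), var1 = letters[sample(1:26, 10)])
A2 <- data.frame(id = paste0("id", 1:10), var2 = LETTERS[sample(1:26, 10)])
B1 <- data.frame(id = paste0("id", 1:10), var3 = letters[sample(1:26, 10)])

# merge_by_pattern() is a hypothetical variant of myfun()
merge_by_pattern <- function(my.pattern, envir = parent.frame()) {
  force(envir)                                    # resolve the caller's environment up front
  nms <- ls(pattern = my.pattern, envir = envir)  # character vector of object names
  objs <- mget(nms, envir = envir)                # named list of the objects themselves
  dfs <- Filter(is.data.frame, objs)              # ignore anything that is not a data frame
  Reduce(function(x, y) merge(x, y, by = "id"), dfs)
}

# The anchored pattern matches A1 and A2 but not B1
merged <- merge_by_pattern("^A\\d+$")
names(merged)  # "id" "var1" "var2"
```

Anchoring the pattern with ^ and $ is a design choice here: an unanchored "A" would also pick up any other object in the environment whose name happens to contain that letter.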

Upvotes: 3
