chan chong
chan chong

Reputation: 145

Using merge command in r for merging depending upon column values

So, I have several dataframes like this

1 2 a
2 3 b
3 4 c
4 5 d
3 5 e
......

1 2 j
2 3 i
3 4 t
3 5 r
.......

2 3 t
2 4 g
6 7 i
8 9 t
......

What I want is, I want to merge all of these files into one single file showing the values of third column for each pair of values in columns 1 and columns 2 and 0 if that pair is not present.

So, the output for this will be, since, there are three files (there are more)

1 2 aj0
2 3 bit
3 4 ct0
4 5 d00
3 5 er0
6 7 00i
8 9 00t
......

What I did was combine all my text .txt files in a single list. Then,

L <- lapply(seq_along(L), function(i) { 
  L[[i]][, paste0('DF', i)] <-  1
  L[[i]] 
})

Which will indicate the presence of a value when we will be merging them. I don't know how to proceed further. Any inputs will be great. Thanks!

Upvotes: 0

Views: 81

Answers (1)

Ramnath
Ramnath

Reputation: 55695

Here is one way to do it with Reduce

# function to generate dummy data
gen_data<- function(){
  data.frame(
    x = 1:3,
    y = 2:4,
    z = sample(LETTERS, 3, replace = TRUE)
  )
}

# generate list of data frames to merge
L <- lapply(1:3, function(x) gen_data())

# function to merge by x and y and concatenate z
f <- function(x, y){
  d <- merge(x, y, by = c('x', 'y'), all = TRUE)
  # set merged column to zero if no match is found
  d[['z.x']] = ifelse(is.na(d[['z.x']]), 0, d[['z.x']])
  d[['z.y']] = ifelse(is.na(d[['z.y']]), 0, d[['z.y']])
  d$z <- paste0(d[['z.x']], d[['z.y']])
  d['z.x'] <- d['z.y'] <- NULL
  return(d)
}



# merge data frames
Reduce(f, L)

Upvotes: 2

Related Questions