Append values from column 2 to values from column 1

Question

In R, I have two data frames (A and B) that share columns (1, 2 and 3). Column 1 holds a unique identifier and is the same in both data frames; columns 2 and 3 hold different information. I'm trying to merge these two data frames into one new data frame that has columns 1, 2 and 3, in which the values of columns 2 and 3 are combined: i.e. column 2 of the new data frame contains [data frame A column 2 + data frame B column 2].

Example:

dfA <- data.frame(Name = c("John","James","Peter"),
                  Score = c(2,4,0),
                  Response = c("1,0,0,1","1,1,1,1","0,0,0,0"))

dfB <- data.frame(Name = c("John","James","Peter"),
                  Score = c(3,1,4),
                  Response = c("0,1,1,1","0,1,0,0","1,1,1,1"))

dfA:
   Name Score Response
1  John     2  1,0,0,1
2 James     4  1,1,1,1
3 Peter     0  0,0,0,0

dfB:
   Name Score Response
1  John     3  0,1,1,1
2 James     1  0,1,0,0
3 Peter     4  1,1,1,1

Should result in:

dfNew <- data.frame(Name = c("John","James","Peter"),
                    Score = c(5,5,4),
                    Response = c("1,0,0,1,0,1,1,1","1,1,1,1,0,1,0,0","0,0,0,0,1,1,1,1"))

dfNew:
   Name Score        Response
1  John     5 1,0,0,1,0,1,1,1
2 James     5 1,1,1,1,0,1,0,0
3 Peter     4 0,0,0,0,1,1,1,1

I've tried merge, but that simply appends the columns (much like cbind).

Is there a way to do this, without having to cycle through all columns, like:

dfNew <- data.frame(Name = dfA$Name)  # start from the shared identifier column
dfNew$Score <- dfA$Score + dfB$Score
dfNew$Response <- paste(dfA$Response, dfB$Response, sep=",")

The added difficulty is, as you might have noticed, that for some columns we need to use addition, whereas others require concatenation separated by a comma (the columns requiring addition are formatted as numerical, the others as text, which might make it easier?)

Thanks in advance!

PS. The string 1,0,0,1,0,1,1,1 etc. captures the response per trial – this example has 8 trials to which participants can either respond correctly (1) or incorrectly (0); the final score is collected under Score. Just to explain why my data/example looks the way it does.

Brian Diggs · Accepted Answer

I would approach this with a for loop over the column names you want to merge. Given your example data:

cols <- c("Score", "Response")

dfNew <- dfA[,"Name",drop=FALSE]
for (n in cols) {
  switch(class(dfA[[n]]),
         "numeric" = {dfNew[[n]] <- dfA[[n]] + dfB[[n]]},
         "factor"=, "character" = {dfNew[[n]] <- paste(dfA[[n]], dfB[[n]], sep=",")})
}

This is basically the idea you had, but in a loop. Each column's class is checked: numeric columns are added, while character or factor columns are concatenated as strings. You could get a similar result with two vectors of names, one for the numeric columns and one for the character columns, but the switch is easier to extend if you have other data types as well (though I don't know what they might be). The major drawback of this method is that it assumes the rows of the two data frames are in the same order with regard to Name. The next solution doesn't make that assumption:

dfNew <- merge(dfA, dfB, by="Name")
for (n in cols) {
  switch(class(dfA[[n]]),
         "numeric" = {dfNew[[n]] <- dfNew[[paste0(n,".x")]] + dfNew[[paste0(n,".y")]]},
         "factor"=, "character" = {dfNew[[n]] <- paste(dfNew[[paste0(n,".x")]], dfNew[[paste0(n,".y")]], sep=",")})
  dfNew[[paste0(n,".x")]] <- NULL
  dfNew[[paste0(n,".y")]] <- NULL
}

Same general idea as before, but this uses merge to make sure the rows are correctly aligned, and then works on the merged columns within dfNew (merge suffixes their names with ".x" and ".y"). The extra steps at the end of the loop remove the separate columns once they have been combined. As a bonus, any other columns not listed in cols are carried along.
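
If you need this in more than one place, the merge-based loop could be wrapped in a small function. The following is only a sketch under a few assumptions: combine_frames is a hypothetical name, not part of base R; it tests columns with is.numeric() rather than switch(class(...)), so integer columns would be added as well; and it keeps only the key column plus the columns listed in cols. The rows of dfB are deliberately shuffled here to show that the alignment comes from merge and not from row order.

```r
# Hypothetical helper (an illustration, not part of base R): merge two data
# frames on a key column, add the numeric columns and paste the others.
combine_frames <- function(a, b, key = "Name", cols = setdiff(names(a), key)) {
  # merge() aligns rows by key; shared non-key columns get .x / .y suffixes
  merged <- merge(a, b, by = key, suffixes = c(".x", ".y"))
  out <- merged[, key, drop = FALSE]
  for (n in cols) {
    x <- merged[[paste0(n, ".x")]]
    y <- merged[[paste0(n, ".y")]]
    out[[n]] <- if (is.numeric(x) && is.numeric(y)) {
      x + y                   # numeric (including integer) columns: add
    } else {
      paste(x, y, sep = ",")  # character or factor columns: concatenate
    }
  }
  out
}

dfA <- data.frame(Name = c("John", "James", "Peter"),
                  Score = c(2, 4, 0),
                  Response = c("1,0,0,1", "1,1,1,1", "0,0,0,0"))

# Same data as dfB in the question, but with the rows in a different order
dfB <- data.frame(Name = c("Peter", "James", "John"),
                  Score = c(4, 1, 3),
                  Response = c("1,1,1,1", "0,1,0,0", "0,1,1,1"))

dfNew <- combine_frames(dfA, dfB)
```

Note that merge sorts the result by the key column by default, so the rows of dfNew will not necessarily come back in the order of dfA.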
