Curious

Reputation: 549

How to combine two columns in a data frame element by element?

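I need to combine two columns of a data frame element by element. Each cell holds a " , "-separated list, and the i-th item of col1 should be joined with the i-th item of col2. I tried paste(), but it simply concatenates the two whole strings in each row, which is not what I need: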

#sample data
df <- data.frame ("col1" = c("red|",
                             "blue| , red|", 
                             "blue| , red| , yellow|"), 
                  "col2" = c("green",
                             "yellow , blue",
                             "black , red , blue"))

#this is what I tried:
df$new <- paste(df$col1, df$col2, sep = " , ")

#output for each row:
# "red| , green"           
# "blue| , red| , yellow , blue"            
# "blue| , red| , yellow| , black , red , blue"

#below is the desired output:
df$correct_output <- c("red|green",
                       "blue|yellow , red|blue",
                       "blue|black , red|red , yellow|blue")

Upvotes: 1

Views: 52

Answers (2)

Yosi Hammer

Reputation: 588

df <- data.frame ("col1" = c("red|",
                             "blue| , red|", 
                             "blue| , red| , yellow|"), 
                  "col2" = c("green",
                             "yellow , blue",
                             "black , red , blue"),
                  stringsAsFactors = FALSE)  # keep columns as character so strsplit() works
parts1 <- strsplit(df$col1, ' , ')
parts2 <- strsplit(df$col2, ' , ')
# join parts from the two columns pairwise, row by row
n <- nrow(df)
col3 <- lapply(seq_len(n), function(i) paste0(parts1[[i]], parts2[[i]]))
# collapse the joined parts into a single string per row
df$col3 <- sapply(col3, function(x) paste(x, collapse = ' , '))
df

                    col1               col2                               col3
1                   red|              green                          red|green
2           blue| , red|      yellow , blue             blue|yellow , red|blue
3 blue| , red| , yellow| black , red , blue blue|black , red|red , yellow|blue
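
The split-and-paste steps above can also be folded into one helper and applied row by row with mapply(). This is only a minimal sketch: combine_pairs is a hypothetical name that is not part of the original answer, and it assumes both columns hold the same number of " , "-separated parts in every row (paste0() would silently recycle the shorter side otherwise).

```r
# Sample data from the question; stringsAsFactors = FALSE keeps the columns as character
df <- data.frame(col1 = c("red|", "blue| , red|", "blue| , red| , yellow|"),
                 col2 = c("green", "yellow , blue", "black , red , blue"),
                 stringsAsFactors = FALSE)

# Hypothetical helper: split both strings on the separator, paste the
# pieces pairwise, then re-join the pairs with the same separator
combine_pairs <- function(a, b, sep = " , ") {
  paste0(strsplit(a, sep, fixed = TRUE)[[1]],
         strsplit(b, sep, fixed = TRUE)[[1]],
         collapse = sep)
}

# Apply the helper to each (col1, col2) pair; USE.NAMES = FALSE returns a plain character vector
df$col3 <- mapply(combine_pairs, df$col1, df$col2, USE.NAMES = FALSE)
df$col3
```

Using fixed = TRUE treats the separator as a literal string, which avoids surprises if the separator ever contains a regex metacharacter such as "|".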

Upvotes: 0

AntoniosK

Reputation: 16121

#sample data
df <- data.frame ("col1" = c("red|",
                             "blue| , red|", 
                             "blue| , red| , yellow|"), 
                  "col2" = c("green",
                             "yellow , blue",
                             "black , red , blue"))

library(tidyverse)

df %>%
  group_by(id = row_number()) %>%           # group by a row id (useful to reshape)
  separate_rows(col1, col2, sep=" ,") %>%   # separate based on comma and add new rows
  unite(col, col1, col2, sep="") %>%        # combine corresponding values
  summarise(correct = paste0(gsub(" ", "", col), collapse = ", ")) %>% # remove any spaces and combine values
  bind_cols(df, .) %>%                      # bind original dataset
  select(-id)                               # remove id column

#                     col1               col2                          correct
# 1                   red|              green                        red|green
# 2           blue| , red|      yellow , blue            blue|yellow, red|blue
# 3 blue| , red| , yellow| black , red , blue blue|black, red|red, yellow|blue

Upvotes: 2
