emiliamcl

Reputation: 11

Concatenation between rows

I have a data frame with two columns:

C1 <- c("abcd > de > efg", "hij > kl > iiii", "aa", "a > bbb")
C2 <- c("1980","1982","1989","1989")

df <- data.frame(C1, C2, stringsAsFactors = FALSE)

My goal is to prefix every element of C1 with the corresponding value of C2, like this:

result <- c("1980abcd > 1980de > 1980efg", "1982hij > 1982kl > 1982iiii", "1989aa", "1989a > 1989bbb")

How can I do that? Thanks.

Upvotes: 0

Views: 69

Answers (3)

talat

Reputation: 70256

Here's an approach that doesn't require splitting each string and pasting back together:

mapply(function(x, y) gsub("(^|\\s)(?=[a-z]+)", paste0("\\1", y), x, perl = TRUE),
       df$C1, df$C2, USE.NAMES = FALSE)
#[1] "1980abcd > 1980de > 1980efg" "1982hij > 1982kl > 1982iiii"
#[3] "1989aa"                      "1989a > 1989bbb" 

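The regular expression pattern (^|\\s)(?=[a-z]+) matches either the beginning of the string or a whitespace character, in both cases only when lower-case letters follow (the lookahead (?=[a-z]+) checks this without consuming them). The replacement paste0("\\1", y) puts back whatever was matched and appends the corresponding C2 entry.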
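To see the pattern at work outside of mapply, here is a minimal sketch on a single string. The names `x`, `year`, `prefixed` and `mixed` are made up for this illustration and are not part of the answer above.

```r
# Illustrative values only: one C1 entry and its C2 entry
x <- "abcd > de > efg"
year <- "1980"

# Matches the start of the string, or a whitespace character, but only
# when a lower-case letter follows. "\\1" re-inserts what was matched
# (nothing, or the space), and the year is appended right after it.
prefixed <- gsub("(^|\\s)(?=[a-z]+)", paste0("\\1", year), x, perl = TRUE)
print(prefixed)
# [1] "1980abcd > 1980de > 1980efg"

# The lookahead only accepts [a-z], so a piece starting with an
# upper-case letter is left without a prefix.
mixed <- gsub("(^|\\s)(?=[a-z]+)", paste0("\\1", year), "Abc > de", perl = TRUE)
print(mixed)
# [1] "Abc > 1980de"
```

If the real data can contain upper-case letters or digits at the start of a piece, the character class in the lookahead would need widening accordingly.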


Here's a purrr alternative:

library(purrr)
strsplit(df$C1, " > ") %>% map2_chr(df$C2, ~paste(.y, .x, sep = "", collapse=" > "))
#[1] "1980abcd > 1980de > 1980efg" "1982hij > 1982kl > 1982iiii"
#[3] "1989aa"                      "1989a > 1989bbb" 

Upvotes: 1

Sotos

Reputation: 51582

One way via base R is to split the C1 vector and use mapply to paste the pieces with C2, i.e.

v1 <- mapply(function(x, y) paste(paste0(x, y), collapse = ' > '), C2, strsplit(C1, ' > '))

unname(v1)
#[1] "1980abcd > 1980de > 1980efg" "1982hij > 1982kl > 1982iiii" "1989aa"   "1989a > 1989bbb"

NOTE: The result of mapply (i.e. v1) is a named vector; it takes its names from C2. Hence I used unname to get to your desired structure. However, note that a named vector is still a vector and will behave as such.
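As a small sketch of the naming behaviour described in the note, the snippet below compares the named result with one built using USE.NAMES = FALSE. The helper name `prefix_row` and the variables `named` and `plain` are hypothetical, introduced only for this illustration.

```r
C1 <- c("abcd > de > efg", "hij > kl > iiii", "aa", "a > bbb")
C2 <- c("1980", "1982", "1989", "1989")

# Hypothetical helper: prefix each piece with the year, then re-join
prefix_row <- function(year, parts) paste(paste0(year, parts), collapse = " > ")

# C2 is an unnamed character vector passed first, so mapply uses it as names
named <- mapply(prefix_row, C2, strsplit(C1, " > "))
print(names(named))
# [1] "1980" "1982" "1989" "1989"

# USE.NAMES = FALSE skips the names, so unname() is not needed afterwards
plain <- mapply(prefix_row, C2, strsplit(C1, " > "), USE.NAMES = FALSE)
print(identical(plain, unname(named)))
# [1] TRUE
```

Either route gives the same character vector; which one to use is mostly a matter of taste.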

Upvotes: 1

Nicolas Rosewick

Reputation: 1998

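Using strsplit, apply and paste:

    library(dplyr)  # provides tibble()
    # Store the split pieces of C1 as a list column next to C2
    df <- tibble(C1 = strsplit(C1, " > "), C2)

    # x holds one row (a list with elements C1 and C2): prefix each piece with C2, then re-join
    res <- unlist(apply(df, 1, function(x) paste(paste(x$C2, x$C1, sep = ""), collapse = " > ")))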
    # [1] "1980abcd > 1980de > 1980efg" "1982hij > 1982kl > 1982iiii" "1989aa"                     
    # [4] "1989a > 1989bbb" 

Upvotes: 0
