Piyush Shah

Reputation: 321

Concatenating multiple columns with similar names in R

I have a data frame db1 with, say, 30 variables. Ten of them have sequential names: X1, X2, ..., X10. All of these X variables are character columns, and I want to concatenate them. I could of course do

db1$new <- with(db1, paste(X1, X2, X3, X4, X5, X6, X7, X8, X9, X10))

But this is tedious, and it breaks as soon as a new file has a different number of X variables. So I need a method that picks the columns by name. I tried

zz1 <- paste(grep('^X',names(db1), value = TRUE))
zz2 <- paste("db1$",zz1,sep="",collapse = ",")

The second statement builds the variable names separated by commas. I then tried merging using

db1$new <- paste(db1$Terms,zz2,collapse = ",")

This did not work, because R did not understand that zz2 held variable names; it treated it as a plain string. What can I do?

Upvotes: 1

Views: 649

Answers (3)

tyluRp

Reputation: 4768

One way with tidyr and dplyr:

library(dplyr)
library(tidyr)

unite(db1, "var", starts_with("x"), sep = "")

#   var z1
# 1 aaa  a
# 2 bbb  b

This unites every column whose name starts with "x" and stores the result in a new variable named var. Note that starts_with ignores case by default, so it matches X1, X2, ... as well.

If the data has other variables starting with "x" that should not be concatenated (e.g. "xvar"), replace starts_with with matches and use a regular expression. Thanks to MKR for the suggestion:

unite(db1, "var", matches("^x\\d+"), sep = "")

#   var z1 xvar
# 1 aaa  a    a
# 2 bbb  b    b

Data:

db1 <- data.frame(x1 = c("a", "b"), 
                  x2 = c("a", "b"),
                  z1 = c("a", "b"),
                  x3 = c("a", "b"))
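
To check which columns a pattern would pick up before uniting, the two patterns can be tried against the column names with base grep. This is only a sketch: the cols vector is a made-up set of names that includes the "xvar" case mentioned above, and grep is case-sensitive by default (unlike starts_with and matches), so ignore.case = TRUE would be needed to mimic them exactly.

```r
# Hypothetical column names, including an unrelated "xvar" column
cols <- c("x1", "x2", "z1", "x3", "xvar")

# Plain prefix match: also catches "xvar"
prefix_hits <- grep("^x", cols, value = TRUE)

# "x" followed by digits only: skips "xvar"
numbered_hits <- grep("^x\\d+$", cols, value = TRUE)

print(prefix_hits)
print(numbered_hits)
```

Anchoring the pattern with $ is slightly stricter than the "^x\\d+" used in the answer, which would also accept a name such as "x1_old".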

Upvotes: 2

PKumar

Reputation: 11128

Use do.call with paste0. With a dataset like the one below (borrowed from @MKR's answer):

df <- structure(list(id = 1:2, X1 = c("a", "b"), X2 = c("a", "b"), 
        X3 = c("a", "b")), .Names = c("id", 
    "X1", "X2", "X3"), row.names = c(NA, -2L), class = "data.frame")

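# df[cols] (no comma) stays a data frame even if only one column matches,
# so do.call always receives a list of columns
df$pastecol <- do.call("paste0", df[grep("^X\\d+$", names(df))])

Output:

#> df$pastecol <- do.call("paste0", df[grep("^X\\d+$", names(df))])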
#> df
#  id X1 X2 X3 pastecol
#1  1  a  a  a      aaa
#2  2  b  b  b      bbb
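
Since the question asks for something that keeps working when the number of X columns changes, the same do.call idea could be wrapped in a small function. This is a minimal sketch under assumptions: concat_matching, its pattern default, and the sep argument are names made up for illustration, not part of the answer above.

```r
# Hypothetical helper: paste together all columns whose names match `pattern`
concat_matching <- function(data, pattern = "^X\\d+$", sep = "") {
  cols <- grep(pattern, names(data), value = TRUE)
  if (length(cols) == 0) stop("no columns match pattern: ", pattern)
  # data[cols] is a list of columns; unname() avoids a column name
  # accidentally being read as a paste() argument such as `sep`
  do.call(paste, c(unname(as.list(data[cols])), sep = sep))
}

df <- data.frame(id = 1:2, X1 = c("a", "b"), X2 = c("a", "b"),
                 X3 = c("a", "b"), stringsAsFactors = FALSE)

df$pastecol <- concat_matching(df)            # no separator
df$spaced   <- concat_matching(df, sep = " ") # like paste()'s default
print(df)
```

The function works for any number of matching columns, including a single one, because the columns are selected with list-style indexing rather than df[, cols].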

Upvotes: 1

MKR

Reputation: 20095

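One option is to select the columns with dplyr's select and then use apply. Older code used select_(.dots = ...) for this, but select_ is deprecated; all_of takes the same character vector of names.

#data
db1 <- data.frame(id = 1:2, x1 = c("a", "b"), x2 = c("a", "b"),
                  x3 = c("a", "b"))

library(tidyverse)

# pick the x1, x2, ... columns by name, then paste each row together
db1$new <- db1 %>%
  select(all_of(grep("^x\\d+", names(db1), value = TRUE))) %>%
  apply(1, paste, collapse = "")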

db1
# Result
#  id x1 x2 x3 new
#1  1  a  a  a aaa
#2  2  b  b  b bbb

Upvotes: 2
