egnha

Reputation: 1197

Correct way to concatenate lists of strings in R

What is the idiomatic way to do the following string concatenation in R?

Given two vectors of strings, such as the following,

titles <- c("A", "B")
sub.titles <- c("x", "y", "z")

I want to produce the vector

full.titles <- c("A_x", "A_y", "A_z", "B_x", "B_y", "B_z")

Obviously, this could be done with two for-loops. However, I would like to know what an “idiomatic” (i.e., elegant and natural) solution would be in R.

In Python, an idiomatic solution might look like this:

titles = ['A', 'B']
subtitles = ['x', 'y', 'z']
full_titles = ['_'.join([title, subtitle])
               for title in titles for subtitle in subtitles]

Does R allow for a similar degree of expressiveness?

Remark

The consensus among the solutions proposed thus far is that the idiomatic way to do this in R is, basically,

full.titles <- c(t(outer(titles, sub.titles, paste, sep = "_")))

Interestingly, this has an (almost) literal translation in Python:

full_titles = list(map('_'.join, product(titles, subtitles)))

where product is the cartesian-product function from the itertools module. However, in Python, such a use of map is considered more convoluted—that is, less expressive—than the equivalent use of list comprehension, as above.

Upvotes: 0

Views: 4279

Answers (6)

Colonel Beauvel

Reputation: 31161

Using do.call combined with paste and expand.grid

sort(do.call(paste, c(sep='_', expand.grid(titles, sub.titles))))
#[1] "A_x" "A_y" "A_z" "B_x" "B_y" "B_z"

Or using tidyr::unite combined with expand.grid

unite(expand.grid(titles, sub.titles), Res, everything()) %>% .$Res

Upvotes: 3

Scott Warchal

Reputation: 1027

apply(expand.grid(titles, sub.titles), 1, paste, collapse = "_")

expand.grid creates a data frame with one row for each combination of titles and sub.titles.
apply then goes through that data frame row by row and pastes the values in each row together.
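
To make the two steps above concrete, here is a minimal sketch that keeps the intermediate result in a variable. The names grid and by.row are only illustrative, and stringsAsFactors = FALSE is an added assumption to keep the columns as plain character vectors.

```r
titles <- c("A", "B")
sub.titles <- c("x", "y", "z")

# One row per combination; the first argument (titles) varies fastest.
grid <- expand.grid(titles, sub.titles, stringsAsFactors = FALSE)

# apply() coerces the data frame to a character matrix and pastes each row.
by.row <- apply(grid, 1, paste, collapse = "_")

# Because titles varies fastest, the order is A_x, B_x, A_y, B_y, A_z, B_z.
print(by.row)
```

The result contains all six combinations, but not in the order asked for in the question; wrapping it in sort() is one way to recover that order for this particular input.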

Upvotes: 2

Zahiro Mor

Reputation: 1718

grid <- expand.grid(titles, sub.titles)
full.titles <- paste0(grid$Var1, '_', grid$Var2)
full.titles
#[1] "A_x" "B_x" "A_y" "B_y" "A_z" "B_z"

Upvotes: 1

tblznbits

Reputation: 6778

This code also works, though the elements come out in column-major order: as.vector(outer(titles, sub.titles, FUN=paste, sep="_"))

outer applies a function to every pair of elements drawn from the two vectors and returns the results as a matrix. So it'll take each element from titles and combine it with each element from sub.titles. The default function is multiplication, but we change that default by passing a new function to the FUN parameter. Any additional arguments for that function (here sep) are supplied after FUN and passed through to it. So we're telling R to paste the first element from titles together with each element from sub.titles, separating the two with a "_", and then to do the same with the second element from titles.
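
As a rough illustration of the explanation above, the sketch below keeps the matrix that outer returns and flattens it in both directions. The variable names m, col.major and row.major are made up for this example.

```r
titles <- c("A", "B")
sub.titles <- c("x", "y", "z")

# 2 x 3 character matrix: one row per title, one column per sub-title.
m <- outer(titles, sub.titles, FUN = paste, sep = "_")

# as.vector() reads a matrix column by column.
col.major <- as.vector(m)

# Transposing first makes it read title by title instead.
row.major <- as.vector(t(m))

print(col.major)
print(row.major)
```

Only the transposed version matches the order of full.titles in the question, which is why the version quoted in the question's remark includes t().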

Upvotes: 1

Miff

Reputation: 7941

There are a couple of ways to go about this, either using the outer() function to apply your function to every pair of elements from the two vectors (an outer product, not a matrix product), along the lines of:

outer(titles, sub.titles, paste, sep='_')

and then wrangling it from a matrix into a vector, or converting your input to a data frame of all combinations using expand.grid():

do.call(paste, c(expand.grid(titles, sub.titles, stringsAsFactors=FALSE), sep='_'))
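
If the exact order from the question matters, one possible variation on the expand.grid() approach is to pass sub.titles first, so that it is the component that varies fastest, and then paste the columns back in title-first order. This is only a sketch; the column names sub and title are hypothetical labels chosen for readability.

```r
titles <- c("A", "B")
sub.titles <- c("x", "y", "z")

# The first argument varies fastest, so list sub.titles first.
grid <- expand.grid(sub = sub.titles, title = titles, stringsAsFactors = FALSE)

# Paste title before sub-title to get the "A_x" form.
full.titles <- paste(grid$title, grid$sub, sep = "_")

print(full.titles)
```

This avoids both the transpose needed with outer() and a sort() call, which would reorder titles that are not already in alphabetical order.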

Upvotes: 5

J_F

Reputation: 10352

Try this code:

unlist(lapply(seq_along(titles), function(i) paste(titles[i], sub.titles, sep = "_")))

Upvotes: 1
