R: Concatenate strings from one column based on repeating strings in another column

Question

I have a data frame that looks like the example below.

element	string
x	abc
y	def
z	ghi
z	jkl
x	mno
y	pqr
z	stu
x	vwx
y	yza
z	bcd
z	efg
z	hij
x	klm
y	nop
z	qrs
z	tuv
z	wxy

The values in the string column all vary, but the values in the element column always follow an x-y-z pattern, although the number of z's varies. I would like to take the strings from each x-y-z set and concatenate them, so the string column of the data frame above would look like this:

string
abc def ghi jkl
mno pqr stu
vwx yza bcd efg hij
klm nop qrs tuv wxy

I was thinking there might be a way to do this using dplyr::rowwise, but the variable number of z rows in each set is tripping me up when I try to work out something that might work.

deschen · Accepted Answer

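The tricky part is that you need to group by chunks of x/y/z. One approach is to build a running id that increases by one each time element sorts before the element in the previous row, that is, each time the pattern wraps from z back to x; the first row, which has no previous value, starts the first group. Once you have that id to group by, you can simply summarize and concatenate the strings.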

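library(tidyverse)

# Small test data: one x-y-z-z set followed by one x-y-z set
df <- data.frame(element = c(letters[24:26], 'z', letters[24:26]),
                 string = c('abc', 'def', 'ghi', 'ijk', 'lmn', 'o', 'p'))

df %>%
  # Start a new id whenever element sorts before the previous row's element
  # (z back to x); missing = 1 covers the first row, where lag() is NA.
  # The < comparison assumes element is character, not factor.
  mutate(id = cumsum(if_else(element < lag(element), 1, 0, missing = 1))) %>%
  group_by(id) %>%
  # Join each group's strings into one space-separated value
  summarize(strong = str_c(string, collapse = ' '), .groups = 'drop')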

With the above test data, this gives:

# A tibble: 2 x 2
     id strong         
  <dbl> <chr>          
1     1 abc def ghi ijk
2     2 lmn o p        
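
As a variation on the approach above, here is a minimal sketch applied to the data from the question. It assumes every set starts with exactly one x row, so the group id can be built by counting x's with cumsum(element == "x") instead of comparing each row with the previous one. The names result and id are only illustrative, and paste() is used in place of str_c() so that dplyr is the only package needed.

```r
library(dplyr)

# Data from the question: four x-y-z sets with a varying number of z rows
df <- data.frame(
  element = c("x", "y", "z", "z",
              "x", "y", "z",
              "x", "y", "z", "z", "z",
              "x", "y", "z", "z", "z"),
  string  = c("abc", "def", "ghi", "jkl",
              "mno", "pqr", "stu",
              "vwx", "yza", "bcd", "efg", "hij",
              "klm", "nop", "qrs", "tuv", "wxy")
)

result <- df %>%
  # Assumption: each set begins with a single "x", so counting the x rows
  # seen so far gives every row the number of the set it belongs to
  mutate(id = cumsum(element == "x")) %>%
  group_by(id) %>%
  # Collapse each set's strings into one space-separated string
  summarise(string = paste(string, collapse = " "), .groups = "drop")

result$string
# [1] "abc def ghi jkl"     "mno pqr stu"         "vwx yza bcd efg hij"
# [4] "klm nop qrs tuv wxy"
```

Because this version tests for equality rather than ordering, it does not depend on x, y and z sorting alphabetically, and it should also work if element happens to be stored as a factor.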
