BleepBloop

Reputation: 29

R: Concatenate strings from one column based on repeating strings in another column

I have a data frame that looks like the example below.

element string
x abc
y def
z ghi
z jkl
x mno
y pqr
z stu
x vwx
y yza
z bcd
z efg
z hij
x klm
y nop
z qrs
z tuv
z wxy

All the values in the string column vary, but the values in the element column always follow an x-y-z pattern, although the number of z's varies. I would like to take the values in the string column from each x-y-z set and concatenate them, so the string column for the data frame above would look like this:

string
abc def ghi jkl
mno pqr stu
vwx yza bcd efg hij
klm nop qrs tuv wxy

I was thinking there might be a way to do this using dplyr::rowwise? The variable number of z rows per set is tripping me up, though, in figuring out something that might work...

Upvotes: 0

Views: 1274

Answers (1)

deschen

Reputation: 10996

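The tricky part is that you need to group by chunks of x/y/z. Below is one approach: start a new id whenever element is alphabetically smaller than the element in the previous row (i.e. when a z is followed by an x). Once you have an id to group by, you can simply summarize and concatenate the strings.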

library(tidyverse)

df <- data.frame(element = c(letters[24:26], 'z', letters[24:26]),
                 string = c('abc', 'def', 'ghi', 'ijk', 'lmn', 'o', 'p'))

df %>%
  mutate(id = cumsum(if_else(element < lag(element), 1, 0, missing = 1))) %>%
  group_by(id) %>%
  summarize(string = str_c(string, collapse = ' '), .groups = 'drop')

With the above test data, this gives:

# A tibble: 2 x 2
     id string         
  <dbl> <chr>          
1     1 abc def ghi ijk
2     2 lmn o p        
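
If every set is guaranteed to start with exactly one x row, as the question describes, the id can probably also be built without relying on alphabetical order, by counting the x rows seen so far. The following is only a sketch under that assumption, run on the full example data from the question. It is a variant of the approach above, not a replacement for it, and it uses base paste() in place of str_c() so that only dplyr is needed.

```r
library(dplyr)

# The example data from the question, rebuilt by hand
df <- data.frame(
  element = c("x", "y", "z", "z",
              "x", "y", "z",
              "x", "y", "z", "z", "z",
              "x", "y", "z", "z", "z"),
  string = c("abc", "def", "ghi", "jkl",
             "mno", "pqr", "stu",
             "vwx", "yza", "bcd", "efg", "hij",
             "klm", "nop", "qrs", "tuv", "wxy"),
  stringsAsFactors = FALSE
)

result <- df %>%
  # Assumption: every set starts with an "x", so the running
  # count of "x" rows identifies the set each row belongs to
  mutate(id = cumsum(element == "x")) %>%
  group_by(id) %>%
  # Collapse each set's strings into one space-separated string
  summarize(string = paste(string, collapse = " "), .groups = "drop")

print(result$string)
# [1] "abc def ghi jkl"     "mno pqr stu"         "vwx yza bcd efg hij"
# [4] "klm nop qrs tuv wxy"
```

The trade-off is that this version breaks if a set ever lacks its x row, whereas the lag() comparison above only needs the elements within a set to be in non-decreasing alphabetical order.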

Upvotes: 0
