Reputation: 29
I have a data frame that looks like the example below.
element | string |
---|---|
x | abc |
y | def |
z | ghi |
z | jkl |
x | mno |
y | pqr |
z | stu |
x | vwx |
y | yza |
z | bcd |
z | efg |
z | hij |
x | klm |
y | nop |
z | qrs |
z | tuv |
z | wxy |
All the strings in the string column vary, but the values in the element column always follow an x-y-z pattern, although the number of z's varies. I would like to take the strings in the string column from each x-y-z set and concatenate them, so that the string column in the data frame above would look like this:
string |
---|
abc def ghi jkl |
mno pqr stu |
vwx yza bcd efg hij |
klm nop qrs tuv wxy |
I was thinking there might be a way to do this using dplyr::rowwise? The varying number of z rows in each set is tripping me up, though, in figuring out something that might work.
Upvotes: 0
Views: 1274
Reputation: 10996
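The tricky part is that you need to group by chunks of x/y/z. Below is one approach: the id goes up by one whenever the element value sorts before the previous row's value, which happens each time a new x follows a z. Once you have that id to group by, you can simply summarize and concatenate the strings.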
library(tidyverse)

# Small test data: two x-y-z sets, the first with an extra z
df <- data.frame(element = c(letters[24:26], 'z', letters[24:26]),
                 string = c('abc', 'def', 'ghi', 'ijk', 'lmn', 'o', 'p'))

df %>%
  # Start a new id whenever element drops back below the previous row's value
  mutate(id = cumsum(if_else(element < lag(element), 1, 0, missing = 1))) %>%
  group_by(id) %>%
  # Join the strings of each set with a single space
  summarize(string = str_c(string, collapse = ' '), .groups = 'drop')

With the above test data, this gives:

# A tibble: 2 x 2
     id string
  <dbl> <chr>
1     1 abc def ghi ijk
2     2 lmn o p
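To check the idea against the data in the question, here is a minimal sketch that types that table in by hand. It swaps in a simpler way of building the id, `cumsum(element == "x")`, which assumes every set starts with exactly one x (as the question describes); if that does not hold for your real data, the `lag()` comparison above is the safer choice. The names `df` and `result` are just for illustration.

```r
library(dplyr)

# The example data from the question, entered by hand
df <- data.frame(
  element = c("x", "y", "z", "z",
              "x", "y", "z",
              "x", "y", "z", "z", "z",
              "x", "y", "z", "z", "z"),
  string = c("abc", "def", "ghi", "jkl",
             "mno", "pqr", "stu",
             "vwx", "yza", "bcd", "efg", "hij",
             "klm", "nop", "qrs", "tuv", "wxy")
)

result <- df %>%
  # Assumption: each set begins with exactly one "x", so counting the
  # x rows seen so far numbers the sets 1, 2, 3, ...
  mutate(id = cumsum(element == "x")) %>%
  group_by(id) %>%
  # Join the strings of each set with a single space
  summarize(string = paste(string, collapse = " "), .groups = "drop")

result$string
```

`result$string` should then hold the four concatenated strings asked for in the question, one per x-y-z set.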
Upvotes: 0