jmb277

Reputation: 568

In R: combining rows that are almost duplicates into one row with the differing element combined

I have a data.frame of the form:

my_df <- data.frame(id = c(1, 1, 2, 3), 
                 title = c("YourMa", "YourMa", "MyMa", "HisMa"), 
                autqty = c(2, 2, 1, 1), 
                   aut = c("Steve", "Joe", "Albert", "Kevin"), 
                  pubb = c("Good", "Good", "Meh", "Fan"))

which looks like:

> my_df
id  title   autqty aut    pubb
1   YourMa     2   Steve  Good
1   YourMa     2   Joe    Good
2   MyMa       1   Albert Meh
3   HisMa      1   Kevin  Fan

Note that for id 1, all of the info is the same except for the aut entry. My goal is to slim down my_df so that the aut values are combined into one element:

  id  title  autqty  aut         pubb
   1 YourMa    2     Steve, Joe  Good
   2 MyMa      1     Albert      Meh
   3 HisMa     1     Kevin       Fan

Note: this is a smaller version of my original data. I'd like to be able to handle any number of aut values per id.

Upvotes: 0

Views: 80

Answers (1)

JasonWang

Reputation: 2434

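Use group_by and summarise from dplyr: group by every column that is identical across the near-duplicate rows, then paste the aut values together within each group: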

library(dplyr)

my_df %>% 
  group_by(id, title, autqty, pubb) %>%   # columns shared by the near-duplicate rows
  summarise(aut = paste(aut, collapse = ", ")) %>%   # join each group's authors
  ungroup()

# A tibble: 3 × 5
     id  title autqty   pubb        aut
  <dbl> <fctr>  <dbl> <fctr>      <chr>
1     1 YourMa      2   Good Steve, Joe
2     2   MyMa      1    Meh     Albert
3     3  HisMa      1    Fan      Kevin
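
If dplyr is not available, roughly the same grouping can be sketched in base R with aggregate(). This is an alternative, not the method used above. The three-author data below is made up to illustrate the "any number of aut values" case, and the name combined is hypothetical.

```r
# Illustrative data (an assumption): id 1 now has three authors instead of two.
my_df <- data.frame(id = c(1, 1, 1, 2, 3),
                    title = c("YourMa", "YourMa", "YourMa", "MyMa", "HisMa"),
                    autqty = c(3, 3, 3, 1, 1),
                    aut = c("Steve", "Joe", "Ann", "Albert", "Kevin"),
                    pubb = c("Good", "Good", "Good", "Meh", "Fan"),
                    stringsAsFactors = FALSE)

# Group by the columns that are identical across near-duplicate rows and
# paste the aut values of each group into one string.
# Extra arguments such as collapse are passed on to FUN.
combined <- aggregate(aut ~ id + title + autqty + pubb, data = my_df,
                      FUN = paste, collapse = ", ")

# aggregate() does not promise rows in id order, so sort explicitly.
combined <- combined[order(combined$id), ]
print(combined)
```

Because paste(..., collapse = ", ") accepts a vector of any length, neither version needs to know in advance how many authors a group has.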

Upvotes: 5
