user7016618

Reputation: 93

Concatenating all rows within a group using dplyr

Suppose I have a dataframe like this:

hand_id card_id card_name card_class
A       1       p          alpha
A       2       q          beta
A       3       r          theta
B       2       q          beta
B       3       r          theta
B       4       s          gamma
C       1       p          alpha
C       2       q          beta 

I would like to concatenate card_id, card_name, and card_class into a single row per hand (A, B, C), so the result would look something like this:

hand_id  combo_1  combo_2  combo_3
A        1-2-3    p-q-r    alpha-beta-theta
B        2-3-4    q-r-s    beta-theta-gamma
....

I attempted to do this using group_by and mutate, but I can't get it to work:

    data <- read_csv('data.csv')
    byHand <- group_by(data, hand_id) %>%
      mutate(combo_1 = paste(card_id), 
             combo_2 = paste(card_name),
             combo_3 = paste(card_class))

Thank you for your help.

Upvotes: 9

Views: 12004

Answers (4)

Conor

Reputation: 143

If your data contains NAs, you can call na.omit() inline within str_c() (from the stringr package). Wrapping the column in unique() works the same way if you only want the distinct values.

data:

  hand_id card_id card_name card_class
  <chr>     <dbl> <chr>     <chr>     
1 A             1 p         alpha     
2 A             2 q         beta      
3 A             3 r         theta     
4 A            NA NA        NA        
5 B             2 q         beta      
6 B             3 r         theta     
7 B             4 s         gamma     
8 C             1 p         alpha     
9 C             2 q         beta      

code:

library(dplyr)
library(stringr)

data %>%
  group_by(hand_id) %>%
  summarize(card_id = str_c(na.omit(card_id), collapse = "-"),
            card_name = str_c(na.omit(card_name), collapse = "-"),
            card_class = str_c(na.omit(card_class), collapse = "-"))

output:

  hand_id card_id card_name card_class     
* <chr>   <chr>   <chr>     <chr>          
1 A       1-2-3   p-q-r     alpha-beta-the…
2 B       2-3-4   q-r-s     beta-theta-gam…
3 C       1-2     p-q       alpha-beta  
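
The unique() variant mentioned above can be sketched in the same way. This is a minimal illustration, assuming a small made-up data frame (`hands` is hypothetical, not the asker's data) in which one hand holds a repeated card and an NA:

```r
library(dplyr)
library(stringr)

# Hypothetical sample: hand A holds "q" twice plus an NA, hand B holds "r" twice
hands <- data.frame(
  hand_id   = c("A", "A", "A", "A", "B", "B"),
  card_name = c("p", "q", "q", NA, "r", "r"),
  stringsAsFactors = FALSE
)

# Drop NAs first, then keep only the distinct values before collapsing
distinct_cards <- hands %>%
  group_by(hand_id) %>%
  summarize(card_name = str_c(unique(na.omit(card_name)), collapse = "-"))

print(distinct_cards)
```

Calling na.omit() before unique() means NA is never treated as one of the distinct values.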

Upvotes: 0

akrun

Reputation: 886938

Here is another option, using data.table:

library(data.table)
setDT(data)[, lapply(.SD, paste, collapse="-") , by = hand_id]
#     hand_id card_id card_name       card_class
#1:       A   1-2-3     p-q-r alpha-beta-theta
#2:       B   2-3-4     q-r-s beta-theta-gamma
#3:       C     1-2       p-q       alpha-beta

Upvotes: 4

parksw3

Reputation: 659

I'm not very familiar with dplyr, so here's my attempt without it:

library(readr)  # provides read_csv()

df <- read_csv('data.csv')

res <- lapply(split(df, df$hand_id),function(x){
    sL <- apply(x[,-1], 2, function(y) paste(y, collapse = "-"))
    d <- data.frame(x$hand_id[1], rbind(sL))
    names(d) <- c("hand_id", "combo_1", "combo_2", "combo_3")
    return(d)
})
res <- do.call("rbind",res)
rownames(res) <- NULL

Here's the output:

##   hand_id combo_1 combo_2          combo_3
## 1       A   1-2-3   p-q-r alpha-beta-theta
## 2       B   2-3-4   q-r-s beta-theta-gamma
## 3       C     1-2     p-q       alpha-beta

Upvotes: 0

zacdav

Reputation: 4671

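You were kind of close! Use summarise() instead of mutate() so that each group collapses to a single row, and pass collapse = "-" to paste() so the values are joined into one string: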

library(readr)  # read_csv() comes from readr; tidyr is not needed here
library(dplyr)

data <- read_csv('data.csv')
byHand <- group_by(data, hand_id) %>%
    summarise(combo_1 = paste(card_id, collapse = "-"), 
              combo_2 = paste(card_name, collapse = "-"),
              combo_3 = paste(card_class, collapse = "-"))

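Or, using summarise_each(), which applies the same function to every non-grouping column (it is deprecated in newer dplyr releases):

byHand <- group_by(data, hand_id) %>%
    summarise_each(funs(paste(., collapse = "-")))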
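
With dplyr 1.0 or later, the same per-column collapse can be written with across(), which supersedes summarise_each() and funs(). The following is only a sketch under that version assumption; it rebuilds the question's sample rows inline with data.frame() as a stand-in for data.csv:

```r
library(dplyr)

# Sample rows from the question, rebuilt inline (assumption: stands in for data.csv)
data <- data.frame(
  hand_id    = c("A", "A", "A", "B", "B", "B", "C", "C"),
  card_id    = c(1, 2, 3, 2, 3, 4, 1, 2),
  card_name  = c("p", "q", "r", "q", "r", "s", "p", "q"),
  card_class = c("alpha", "beta", "theta", "beta", "theta", "gamma", "alpha", "beta"),
  stringsAsFactors = FALSE
)

# everything() selects all non-grouping columns; each is collapsed per hand_id
byHand <- data %>%
  group_by(hand_id) %>%
  summarise(across(everything(), ~ paste(.x, collapse = "-")))

print(byHand)
```

The result keeps the original column names (card_id, card_name, card_class) rather than combo_1 to combo_3; rename them afterwards if the combo_* names are needed.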

Upvotes: 20
