Hash
Hash

Reputation: 57

Convert duplicate rows to separate columns in R

I have the following dataset:

    ID      Date    Op  SS  type    
1   AR02    03-14   NA  SS  ET
2   AR02    03-14   NA  SS  ET  
3   AR02    06-14   NA  SS  ET  
4   AR03    31-14   NA  SS  ET  
5   AR08    03-14   NA  SS  ET  
6   AR09    03-14   NA  SS  EN  

Since I have duplicate rows, I would like to get a table that appends all the duplicate rows as separate columns. Something like this.

    ID      Date    Op  SS  type ID     Date    Op  SS  type  ID    Date    Op  SS  type
1   AR02    03-14   NA  SS  ET   AR02   03-14   NA  SS  ET    AR02  06-14   NA  SS  ET  
2   AR03    31-14   NA  SS  ET   NA     NA      NA  NA  NA    NA    NA      NA  NA  NA  
3   AR08    03-14   NA  SS  ET   NA     NA      NA  NA  NA    NA    NA      NA  NA  NA  
4   AR09    03-14   NA  SS  EN   NA     NA      NA  NA  NA    NA    NA      NA  NA  NA

I am trying to do this in R. I tried melt, transpose (t), aggregate functions but somehow unable to get the right fix. Any help will be appreciated.

Upvotes: 1

Views: 613

Answers (1)

akrun
akrun

Reputation: 887038

One option would be to convert the 'data.frame' to 'data.table' (setDT(df1), grouped by 'ID', we paste the rows together, then use cSplit to split the pasted column ('V1') to separate columns. Note that I am not repeating the 'ID' column as it is already unique for a row.

 library(splitstackshape)
 library(data.table)
 DT <- setDT(df1)[, do.call(paste, c(.SD, list(collapse=', '))) , ID]
 DT1 <- cSplit(DT, 'V1', sep='[ ,]+', fixed=FALSE, stripWhite=TRUE)
 setnames(DT1, 2:ncol(DT1), rep(names(df1)[-1], 3))
 DT1
 #     ID  Date Op SS type  Date Op SS type  Date Op SS type
 #1: AR02 03-14 NA SS   ET 03-14 NA SS   ET 06-14 NA SS   ET
 #2: AR03 31-14 NA SS   ET    NA NA NA   NA    NA NA NA   NA
 #3: AR08 03-14 NA SS   ET    NA NA NA   NA    NA NA NA   NA
 #4: AR09 03-14 NA SS   EN    NA NA NA   NA    NA NA NA   NA

It is not recommended to have duplicated column names, so

 setnames(DT1,  make.unique(names(DT1)))

Upvotes: 3

Related Questions