Reputation: 57
I have the following dataset:
ID Date Op SS type
1 AR02 03-14 NA SS ET
2 AR02 03-14 NA SS ET
3 AR02 06-14 NA SS ET
4 AR03 31-14 NA SS ET
5 AR08 03-14 NA SS ET
6 AR09 03-14 NA SS EN
Since I have duplicate rows, I would like to get a table that appends all the duplicate rows as separate columns. Something like this.
ID Date Op SS type ID Date Op SS type ID Date Op SS type
1 AR02 03-14 NA SS ET AR02 03-14 NA SS ET AR02 06-14 NA SS ET
2 AR03 31-14 NA SS ET NA NA NA NA NA NA NA NA NA NA
3 AR08 03-14 NA SS ET NA NA NA NA NA NA NA NA NA NA
4 AR09 03-14 NA SS EN NA NA NA NA NA NA NA NA NA NA
I am trying to do this in R. I tried melt, transpose (t), aggregate functions but somehow unable to get the right fix. Any help will be appreciated.
Upvotes: 1
Views: 613
Reputation: 887038
One option would be to convert the 'data.frame' to 'data.table' (setDT(df1)
, grouped by 'ID', we paste
the rows together, then use cSplit
to split the pasted column ('V1') to separate columns. Note that I am not repeating the 'ID' column as it is already unique for a row.
library(splitstackshape)
library(data.table)
DT <- setDT(df1)[, do.call(paste, c(.SD, list(collapse=', '))) , ID]
DT1 <- cSplit(DT, 'V1', sep='[ ,]+', fixed=FALSE, stripWhite=TRUE)
setnames(DT1, 2:ncol(DT1), rep(names(df1)[-1], 3))
DT1
# ID Date Op SS type Date Op SS type Date Op SS type
#1: AR02 03-14 NA SS ET 03-14 NA SS ET 06-14 NA SS ET
#2: AR03 31-14 NA SS ET NA NA NA NA NA NA NA NA
#3: AR08 03-14 NA SS ET NA NA NA NA NA NA NA NA
#4: AR09 03-14 NA SS EN NA NA NA NA NA NA NA NA
It is not recommended to have duplicated column names, so
setnames(DT1, make.unique(names(DT1)))
Upvotes: 3