Reputation: 21
I would like to know how to use R to merge rows in one set of data.
Currently my data looks like this:
Text 1 Text 2 Text 3 Text 4
Bob Aba Abb Abc
Robert Aba Abb Abc
Fred Abd Abe Abf
Martin Abg Abh Abi
If text two and text 3 are both the same for two rows (as in rows 1 & 2) I would like to make it into one row with more columns for the other data.
Text 1 Text 1a Text 2 Text 3 Text 4 Text 4a
Bob Robert Aba Abb Abc Abd
Fred NA Abd Abe Abf NA
Martin NA Abg Abh Abi NA
I did something similar with joining two separate sets of data and merging them using join
join=join(Data1, Data2, by = c('Text2'), type = "full", match = "all")
but I can't work out how to do it for duplicates within one set of data.
I think it might be possible to use aggregate but I have not used it before, my attempt was:
MyDataAgg=aggregate(MyData, by=list(MyData$Text1), c)
but when I try I am getting an output that looks like this on summary:
1 -none- numeric
1 -none- numeric
2 -none- numeric
or this on structure:
$ Initials :List of 12505
..$ 1 : int 62
..$ 2 : int 310
..$ 3 : int 504
I would also like to be able to combine rows using matching elements of two variables.
Upvotes: 2
Views: 662
Reputation: 121588
I don't think you can reshape or aggregate because :
Here a a manual attempt using by
to process by key, and rbind.fill
to aggregate all the list together. Each by
step , is creating a one-row data.frame having (Text2,Text3) as key.
do.call(plyr::rbind.fill,by(dat,list(dat$Text2,dat$Text3),
function(d){
## change all other columns to a one row data.frame
dd <- as.data.frame(as.list(rapply(d[,-c(2,3)],as.character)))
## the tricky part : add 1 to a name like Text1 to become Text11 ,
## this is import to join data.frames formed by by
names(dd) <- gsub('(Text[0-9]$)','\\11',names(dd))
## add key to to the row
cbind(unique(d[,2:3]),dd)
}))
Text2 Text3 Text11 Text12 Text41 Text42
1 Aba Abb Bob Robert Abc Abd
2 Abd Abe Fred <NA> Abf <NA>
3 Abg Abh Martin <NA> Abi <NA>
Upvotes: 1