Reputation: 905
I have a data.frame that looks like this:
Element1  Element2  Value  Index
a         cf        0.14   1
a         ng        0.25   1
a         ck        0.12   1
a         rt        0.59   1
a         pl        0.05   1
b         gh        0.02   2
b         er        0.91   2
b         jk        0.87   2
c         qw        0.23   3
c         po        0.15   3
I would like the following output:
Element_a1  Element_a2  Value_a  Element_b1  Element_b2  Value_b
a           cf          0.14     b           gh          0.02
a           ng          0.25     b           er          0.91
a           ck          0.12     b           jk          0.87
a           rt          0.59     NA          NA          NA
a           pl          0.05     NA          NA          NA
and so on...
I used the "split" function to split the initial data.frame by the "Index" column, but I cannot turn the result (a list of data.frames) into a single data.frame as desired, because the individual data.frames have different numbers of rows. I tried (with rbind.fill from the plyr package)
x = do.call(rbind.fill, spl)
as suggested in another post, but this returns a data.frame like the initial one.
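To make the problem reproducible, here is a minimal sketch that rebuilds the example data and shows why row-binding cannot give the wide layout. The name dat is an assumption (it matches what the answers use), and base rbind stands in for rbind.fill here, since both stack data.frames row by row.

```r
# Example data from the table above; the name `dat` is an assumption.
dat <- data.frame(
  Element1 = rep(c("a", "b", "c"), times = c(5, 3, 2)),
  Element2 = c("cf", "ng", "ck", "rt", "pl", "gh", "er", "jk", "qw", "po"),
  Value    = c(0.14, 0.25, 0.12, 0.59, 0.05, 0.02, 0.91, 0.87, 0.23, 0.15),
  Index    = rep(1:3, times = c(5, 3, 2)),
  stringsAsFactors = FALSE
)

# Split into a list of three data.frames, one per Index value.
spl <- split(dat, dat$Index)

# Row-binding stacks the pieces on top of each other again,
# so the result has the same shape as the input.
x <- do.call(rbind, spl)
dim(dat)
dim(x)
```

Getting the groups side by side needs a column-wise bind, which in turn needs every piece to have the same number of rows.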
Upvotes: 0
Views: 278
Reputation: 81743
Here's a way to do it:
nRow <- max(table(dat$Element1))                   # maximum number of rows in a group
spl2 <- by(dat, dat$Element1, FUN = function(x) {
  if (nRow > nrow(x)) {                            # insufficient number of rows?
    subdat <- dat[seq_len(nRow - nrow(x)), ]       # create a data frame
    subdat[ , ] <- NA                              # fill it with NAs
    return(rbind(x, subdat))                       # bind it to the subset and return the result
  }
  return(x)                                        # return the subset as it is
})
result <- do.call(cbind, spl2)                     # bind all subsets together
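As a quick check, the sketch below runs this approach on the question's sample data and then drops the Index columns, which the desired output does not contain. The construction of dat and the name wanted are assumptions for illustration only.

```r
# Sample data rebuilt from the question; `dat` is an assumed name.
dat <- data.frame(
  Element1 = rep(c("a", "b", "c"), times = c(5, 3, 2)),
  Element2 = c("cf", "ng", "ck", "rt", "pl", "gh", "er", "jk", "qw", "po"),
  Value    = c(0.14, 0.25, 0.12, 0.59, 0.05, 0.02, 0.91, 0.87, 0.23, 0.15),
  Index    = rep(1:3, times = c(5, 3, 2)),
  stringsAsFactors = FALSE
)

nRow <- max(table(dat$Element1))                 # largest group size
spl2 <- by(dat, dat$Element1, FUN = function(x) {
  if (nRow > nrow(x)) {
    subdat <- dat[seq_len(nRow - nrow(x)), ]     # template rows with the right columns
    subdat[ , ] <- NA                            # blank them out
    return(rbind(x, subdat))                     # pad the group to nRow rows
  }
  x
})
result <- do.call(cbind, spl2)
dim(result)                                      # one row per position, four columns per group

# Hypothetical follow-up: remove every column whose name ends in "Index".
wanted <- result[, !grepl("Index$", names(result))]
dim(wanted)
```

With three groups and five rows in the largest one, result should have 5 rows and 12 columns, and wanted should keep 9 of them.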
Upvotes: 2
Reputation: 21315
I would use split and then cbind them together, post-padding. I borrow the cbindPad function from "combining two data frames of different lengths":
cbindPad <- function(...){
  args <- list(...)
  n <- sapply(args,nrow)
  mx <- max(n)
  pad <- function(x, mx){
    if (nrow(x) < mx){
      nms <- colnames(x)
      padTemp <- matrix(NA,mx - nrow(x), ncol(x))
      colnames(padTemp) <- nms
      return(rbind(x,padTemp))
    }
    else{
      return(x)
    }
  }
  rs <- lapply(args,pad,mx)
  return(do.call(cbind,rs))
}
## assume your data is in a data.frame called dat
dat_split <- split(dat, dat$Element1)
out <- do.call( cbindPad, dat_split )
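A more compact variant of the same split, pad and cbind idea is sketched below. It is not the cbindPad function itself: it relies on the fact that indexing a data.frame with row numbers past its last row yields rows of NA. The construction of dat and the names parts, padded and out2 are assumptions, and the Index column is dropped up front to match the desired output.

```r
# Sample data rebuilt from the question; `dat` is an assumed name.
dat <- data.frame(
  Element1 = rep(c("a", "b", "c"), times = c(5, 3, 2)),
  Element2 = c("cf", "ng", "ck", "rt", "pl", "gh", "er", "jk", "qw", "po"),
  Value    = c(0.14, 0.25, 0.12, 0.59, 0.05, 0.02, 0.91, 0.87, 0.23, 0.15),
  Index    = rep(1:3, times = c(5, 3, 2)),
  stringsAsFactors = FALSE
)

# Split only the columns that should appear in the wide result.
parts <- split(dat[, c("Element1", "Element2", "Value")], dat$Element1)
mx <- max(sapply(parts, nrow))          # number of rows in the largest group

# Row indices beyond nrow(x) produce all-NA rows, which pads each group to mx rows.
padded <- lapply(parts, function(x) {
  x <- x[seq_len(mx), ]
  rownames(x) <- NULL                   # discard the "NA", "NA.1", ... row names
  x
})

out2 <- do.call(cbind, padded)          # groups side by side
out2
```

The columns come out in group order (a, then b, then c), so they can be renamed afterwards to the Element_a1, Element_a2, Value_a pattern if needed.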
Upvotes: 1