Geo-sp

Reputation: 1704

aggregate data frame and remove NA

I have a data frame that I want to reshape so that there is only one row for each value of obs. Here is the example data:

data <- data.frame("obs" = c('1','1','1','2','2'),
                   "value1" = c(1,NA,NA,NA,NA),
                   "value2" = c(NA,NA,3,1,NA),
                   "value3" = c(NA,2,NA,NA,5))

data looks like this:

  obs value1 value2 value3
   1      1     NA     NA
   1     NA     NA      2
   1     NA      3     NA
   2     NA      1     NA
   2     NA     NA      5

and I want to reshape it into:

  obs value1 value2 value3
    1      1      3      2
    2     NA      1      5

Thanks!

Upvotes: 0

Views: 847

Answers (3)

joran

Reputation: 173547

This is how I would do this, using plyr:

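library(plyr)

# Return the non-NA value(s) of a column, or NA if every value is missing
foo <- function(x){
    if (all(is.na(x))) return(NA)
    else return(x[!is.na(x)])
}

# Apply foo to every non-grouping column within each obs group
ddply(data,.(obs),colwise(foo))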

And this of course assumes that you really do only have at most one non-NA value in each column for each value of obs.
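One way to check that assumption before aggregating is to count the non-NA values per column within each obs group. This is only a minimal base R sketch; the names counts and has_duplicates are made up for illustration, and the data frame is the one from the question:

```r
data <- data.frame(obs = c('1','1','1','2','2'),
                   value1 = c(1,NA,NA,NA,NA),
                   value2 = c(NA,NA,3,1,NA),
                   value3 = c(NA,2,NA,NA,5))

# Count the non-NA values in each column within each obs group
counts <- aggregate(data[-1], by = list(obs = data$obs),
                    FUN = function(x) sum(!is.na(x)))

# TRUE if some obs/column combination holds more than one non-NA value
has_duplicates <- any(counts[-1] > 1)
```

If has_duplicates is FALSE, each group has at most one non-NA value per column and the approach above should give exactly one row per obs.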

If this isn't the case, and you want to take the mean of multiple values, you might try doing as Justin suggested:

mean(x[!is.na(x)])
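For comparison, here is a rough base R sketch of the averaging variant that does not need plyr. It assumes the data frame from the question; the helper name mean_or_na is hypothetical, and it returns NA rather than NaN when a group has no non-NA values:

```r
data <- data.frame(obs = c('1','1','1','2','2'),
                   value1 = c(1,NA,NA,NA,NA),
                   value2 = c(NA,NA,3,1,NA),
                   value3 = c(NA,2,NA,NA,5))

# Mean of the non-NA values, or NA if the whole group is missing
mean_or_na <- function(x) {
    if (all(is.na(x))) NA_real_ else mean(x, na.rm = TRUE)
}

# The non-formula interface of aggregate() passes NAs through to FUN
# instead of dropping rows that contain them
res <- aggregate(data[-1], by = list(obs = data$obs), FUN = mean_or_na)
```

Note that the formula interface, aggregate(. ~ obs, data, ...), uses na.action = na.omit by default and would drop every row that contains an NA, which is all of them here.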

Upvotes: 2

Tyler Rinker

Reputation: 109844

A base solution:

out <- lapply(split(data, data$obs), function(x) {
    ans <- lapply(x[, -1], na.omit)
    data.frame(obs = x[1, 1], t(sapply(ans, "[", 1)))
})

do.call(rbind, out)

## > do.call(rbind, out)
##   obs value1 value2 value3
## 1   1      1      3      2
## 2   2     NA      1      5

Upvotes: 2

eddi

Reputation: 49448

library(data.table)
dt = data.table(data)

# Within each obs group, keep only the non-NA values of every column
dt[, lapply(.SD, function(x) x[!is.na(x)]), by = obs]

If you have multiple entries per value for a given observation, this will use R's recycling logic to fill the rest.

Upvotes: 4
