Reputation: 31
I need to aggregate rows of a data frame with MANY rows in this manner:
Lets say I have two rows that I want to merge (as they have the same first element):
x1=c(ID1,1,3,2,1,NA);
x2=c(ID1,2,2,3,NA,NA);
I want to merge their values so, that the result will in the case of clash (clash=both rows have a non-NA value in the element) prefer the value in a row with higher value in 2nd element (that is x2).
So in the example the result of the aggregation of these particular two rows will be:
x12=c(ID1,2,2,3,1,NA)
Can anyone please help?
I tried aggregate()
, but it offers just summary statistics like mean, max etc.
Upvotes: 0
Views: 385
Reputation: 23210
You can use SQL via the package sqldf
.
Using sql
you can group by
ID and select
the max
of the values you mentioned.
require(data.table)
require(sqldf)
x1 <- c(1,1,2,1,NA)
x2 <- c(1,3,3,NA,NA)
x <- data.table(rbind(x1,x2))
colnames(x) <- c("ID", "x1", "x2", "x3", "x4")
sqldf("select ID, max(x1) as x1, max(x2) as x2, max(x3) as x3, max(x4) as x4 from x group by ID")
ID x1 x2 x3 x4
1 1 3 3 1 NA
Upvotes: 0
Reputation: 1204
You can use max
as a function for the aggregate
function.
Data:
x1=c('ID1',1,2,1,NA)
x2=c('ID1',3,3,NA,NA)
data <- data.frame(rbind(x1, x2), stringsAsFactors = FALSE)
Aggregating:
aggregate(data[,2:5], by = list(name = data$X1), FUN = max, na.rm = TRUE)
Result:
name X2 X3 X4 X5
1 ID1 3 3 1 <NA>
Upvotes: 1