Reputation: 99
The following dataframe is a subset of a bigger df, which contains duplicated information
df<-data.frame(Caught=c(92,134,92,134),
Discarded=c(49,47,49,47),
Units=c(170,170,220,220),
Hours=c(72,72,72,72),
Colour=c("red","red","red","red"))
In Base R, I would like to get the following:
df_result<-data.frame(Caught=226,
Retained=96,
Units=390,
Hours=72,
colour="red")
So basically the results is the sum of unique values for columns Caught, Retained, Units and leaving the same value for Hours and colour (Caught=92+134, Retained=49+47, Units= 170+220, Hours=72, colour="red)
However, I intend to do this in a much bigger data.frame with several columns. My idea was to apply a function based on column names as:
l <- lapply(df, function(x) {
if(names(x) %in% c("Caught","Discarded","Units"))
sum(unique(x))
else
unique(x)
})
as.data.frame(l)
However, this does not work, as I am not entirely sure how to extract vector names when using lapply()
and other functions such as this.
I have tried withouth succes to implement by()
, apply()
functions.
Thanks
Upvotes: 0
Views: 39
Reputation: 405
A proposition:
df <-data.frame(Caught=c(92,134,92,134),
Discarded=c(49,47,49,47),
Units=c(170,170,220,220),
Hours=c(72,72,72,72),
Colour=c("red","red","red","red"))
df
#> Caught Discarded Units Hours Colour
#> 1 92 49 170 72 red
#> 2 134 47 170 72 red
#> 3 92 49 220 72 red
#> 4 134 47 220 72 red
df_results <- data.frame(Caught = sum(unique(df$Caught)),
Discarded = sum(unique(df$Discarded)),
Units = sum(unique(df$Units)),
Hours = unique(df$Hours),
Colour = unique(df$Colour))
df_results
#> Caught Discarded Units Hours Colour
#> 1 226 96 390 72 red
# Created on 2021-02-23 by the reprex package (v0.3.0.9001)
Regards,
Upvotes: 0
Reputation: 5429
Asking for Base R:
l <- lapply( df, function(n) {
if( is.numeric(n) )
sum( unique(n) )
else
unique( n )
})
as.data.frame(l)
This solution takes advantage of the fact that data.frames are really just lists of vectors.
It produces this:
# Caught Discarded Units Hours Colour
# 226 96 390 72 red
Upvotes: 1