DD chen

Reputation: 189

R: apply two different functions in summarise_at

I have the following modified mtcars:

mtcars2 <- mtcars
mtcars2[1, 2] <- NA
mtcars2 <- mtcars2[, c("vs", "cyl", "disp")]

I want to group by vs, sum the remaining columns while ignoring NA values, and apply length to count the rows in each group.

I tried this:

mtcars3 <- mtcars2 %>% 
  group_by(vs) %>% 
  summarise_at(vars(names(mtcars2[-1])), list(Total = sum, n = length), na.rm = TRUE)

This fails because the na.rm argument is also passed to length, which does not accept it.

The result I want looks like this:

mtcars3 <- mtcars2 %>% group_by(vs) %>% 
                      summarise_at(vars(names(mtcars2[-1])), list( Total = sum, n = length))
# patch the NA total for the vs == 0 group (row 1) by hand
mtcars3[1, 2] <- sum(mtcars2$cyl[mtcars2$vs == 0], na.rm = TRUE)
res <- mtcars3 %>% mutate(n = cyl_n)%>% select(-disp_n, -cyl_n)
res

Do you have any idea how to do this in one step?

Thanks!

Upvotes: 1

Views: 43

Answers (2)

akrun

Reputation: 887118

We can also use data.table, where .N gives the number of rows in each group:

library(data.table)
as.data.table(mtcars2)[, c(lapply(.SD, sum, na.rm = TRUE), .(n = .N)) , vs]

Upvotes: 0

Ronak Shah

Reputation: 388982

You can pass na.rm to sum only by wrapping it in a formula:

library(dplyr)

mtcars2 %>% 
  group_by(vs) %>% 
  summarise_at(vars(-group_cols()), 
              list(Total = ~sum(., na.rm = TRUE), n = length))


# A tibble: 2 x 5
#     vs cyl_Total disp_Total cyl_n disp_n
#  <dbl>     <dbl>      <dbl> <int>  <int>
#1     0       128      5529.    18     18
#2     1        64      1854.    14     14
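
Some context on why the formula version helps: extra arguments given to summarise_at are forwarded to every function in the list, and length accepts only a single argument, so na.rm has to be bound to sum alone. A minimal base R sketch of that difference (the vector x and the variable names are made up for illustration):

```r
x <- c(4, NA, 6)

# sum() accepts na.rm and skips the missing value
total <- sum(x, na.rm = TRUE)

# length() takes exactly one argument, so adding na.rm raises an error;
# try() captures it instead of stopping the script
length_failed <- inherits(try(length(x, na.rm = TRUE), silent = TRUE), "try-error")

# length() counts every element, NA included
n <- length(x)
```

Note that length counts NA entries as well, so n here is the group size rather than the number of non-missing values.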

If you want only one column for n, add a column of ones and sum it along with the others:

mtcars2 %>% 
  mutate(n = 1) %>%
  group_by(vs) %>% 
  summarise_at(vars(-group_cols()), list(Total = ~sum(., na.rm = TRUE)))


# A tibble: 2 x 4
#     vs cyl_Total disp_Total n_Total
#  <dbl>     <dbl>      <dbl>   <dbl>
#1     0       128      5529.      18
#2     1        64      1854.      14
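
As a side note, summarise_at is superseded in more recent dplyr releases. A sketch of the same idea with across() and n(), assuming dplyr 1.0.0 or later is installed (the name res is only illustrative):

```r
library(dplyr)

# rebuild the data from the question
mtcars2 <- mtcars
mtcars2[1, 2] <- NA
mtcars2 <- mtcars2[, c("vs", "cyl", "disp")]

res <- mtcars2 %>%
  group_by(vs) %>%
  summarise(
    # sum cyl and disp, ignoring NA; .names appends the _Total suffix
    across(c(cyl, disp), ~ sum(.x, na.rm = TRUE), .names = "{.col}_Total"),
    # n() returns the number of rows in the current group
    n = n()
  )
```

This should give a single n column directly, without the helper column of ones, and na.rm only ever reaches sum.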

Upvotes: 1
