Reputation: 3663
I want to sum up all but one numerical column in this dataframe.
Group, Registered, Votes, Beans
A, 111, 12, 100
A, 111, 13, 200
A, 111, 14, 300
I want to group this by Group
, summing up all the columns except Registered
.
summarise_if(
.tbl = group_by(
.data = x,
Precinct
),
.predicate = is.numeric,
.funs = sum
)
Problem here is the result is a data frame that sums ALL the numeric columns, including Registered
. How do I sum all but Registered
?
The output I want would look like this
Group, Registered, Votes, Beans
A, 111, 39, 600
Upvotes: 19
Views: 22712
Reputation: 28685
Edit:
Modified versions of the two methods below for dplyr version >= 1, since summarise_at
is superseded
df %>%
summarise(across(where(is.numeric) & !Registered, sum))
df %>%
summarise(across(-Registered, sum))
Original Answer:
I would use summarise_at
, and just make a logical vector which is FALSE
for non-numeric columns and Registered
and TRUE
otherwise, i.e.
df %>%
summarise_at(which(sapply(df, is.numeric) & names(df) != 'Registered'), sum)
If you wanted to just summarise all but one column you could do
df %>%
summarise_at(vars(-Registered), sum)
but in this case you have to check if it's numeric also.
Notes:
sapply(df, is.numeric)
with sapply(df, function(x) is.numeric(x) & !is.factor(x))
Upvotes: 19
Reputation: 293
I needed something similar, so by using the answer of @akrun This answer above I did
df <- as_tibble(df)
df %>%
select(-Type) %>%
summarise_all(sum)
Where "Type" is the non-numeric (char) column for example in the iris data set, Type is the "Species" columns. So I got the sum of all the other columns which happens to be numeric.
Upvotes: 1
Reputation: 887118
We can use summarise_if
library(dplyr)
df %>%
select(-Registered) %>%
summarise_if(is.numeric, sum)
# Votes Beans
#1 39 600
Upvotes: 4
Reputation: 16121
dt = read.table(text = "
Group Registered Votes Beans
A 111 12 100
A 111 13 200
A 111 14 300
", header=T)
library(dplyr)
# specify grouping variables
v1 = "Group"
v2 = "Registered"
dt %>%
group_by_(v1, v2) %>%
summarise_all(sum) %>%
ungroup()
# # A tibble: 1 x 4
# Group Registered Votes Beans
# <fct> <int> <int> <int>
# 1 A 111 39 600
Note that I have to assume that within each Group
value there's a unique Registered
value, so you can group by both variables, instead of grouping only by Group
and keeping the unique value of Registered
.
Upvotes: 2