Reputation: 500
I'm sure this has been answered somewhere, but I've been trying different methods all day without success. Below is a part of my large dataframe. By group ('legal'), I want to sum 3 columns and create a new column of each groups' total.
My dataframe:
legal y2015 y2016 y2017
NE2 0 1 1
NE2 0 1 1
SW4 1 0 0
SW4 1 0 0
Desired result:
legal y2015 y2016 y2017 total
NE2 0 1 1 2
NE2 0 1 1 2
SW4 1 0 0 1
SW4 1 0 0 1
I've tried by, ColSums, ddply, etc. and get various errors, often that my replacement has fewer rows than my data. The examples I'm working off of are all a bit different than what I'm trying to achieve.
Upvotes: 0
Views: 50
Reputation: 680
df$total = rowSums(df[,2:4])
I'm subsetting df
to avoid summing strings. Better to use dplyr::select
or subset
to specify columns, so you can avoid numeric indexing.
EDIT: noticed you said "by group", which confuses me since your example has multiple rows for each group. If you actually wanted
legal y2015 y2016 y2017 total
NE2 0 2 2 4
SW4 2 0 0 2
you can first use dplyr
.
df = df %>% group_by(legal) %>%
summarise(y2015 = sum(y2015),
y2016 = sum(y2016),
y2017 = sum(y2017))
Then use rowSums
on the result
Upvotes: 1