Emily
Emily

Reputation: 500

By group, create new column by summing part of row

I'm sure this has been answered somewhere, but I've been trying different methods all day without success. Below is a part of my large dataframe. By group ('legal'), I want to sum 3 columns and create a new column of each groups' total.

My dataframe:

 legal y2015 y2016 y2017
  NE2    0     1      1
  NE2    0     1      1
  SW4    1     0      0 
  SW4    1     0      0

Desired result:

 legal y2015 y2016 y2017  total
  NE2    0     1      1     2
  NE2    0     1      1     2
  SW4    1     0      0     1
  SW4    1     0      0     1

I've tried by, ColSums, ddply, etc. and get various errors, often that my replacement has fewer rows than my data. The examples I'm working off of are all a bit different than what I'm trying to achieve.

Upvotes: 0

Views: 50

Answers (1)

spinodal
spinodal

Reputation: 680

df$total = rowSums(df[,2:4])

I'm subsetting df to avoid summing strings. Better to use dplyr::select or subset to specify columns, so you can avoid numeric indexing.

EDIT: noticed you said "by group", which confuses me since your example has multiple rows for each group. If you actually wanted

 legal y2015 y2016 y2017  total
  NE2    0     2      2     4
  SW4    2     0      0     2

you can first use dplyr.

df = df %>% group_by(legal) %>% 
summarise(y2015 = sum(y2015), 
          y2016 = sum(y2016), 
          y2017 = sum(y2017))

Then use rowSums on the result

Upvotes: 1

Related Questions