Biocrazy
Biocrazy

Reputation: 401

Add row with group sum in new column at the end of group category

I have been searching this information since yesterday but so far I could not find a nice solution to my problem.

I have the following dataframe:

CODE    CONCEPT   P. NR.    NAME    DEPTO.  PRICE
 1       Lunch     11       John    SALES   160
 1       Lunch     11       John    SALES   120
 1       Lunch     11       John    SALES   10
 1       Lunch     13       Frank    IT     200
 2      Internet   13       Frank    IT     120

and I want to add a column with the sum of rows by group, for instance, the total amount of concept: Lunch, code: 1 by name in order to get an output like this:

CODE    CONCEPT   P. NR.    NAME    DEPTO.   PRICE  TOTAL
 1       Lunch     11       John    SALES     160   NA
 1       Lunch     11       John    SALES     120   NA
 1       Lunch     11       John    SALES     10    290
 1       Lunch     13       Frank    IT       200   200
 2      Internet   13       Frank    IT       120   120

So far, I tried with:

aggregate(PRICE~NAME+CODE, data = df, FUN = sum)

But this retrieves just the total of the concepts like this:

NAME    CODE    TOTAL
John     1      290
Frank    1      200
Frank    2      120

And not the table with the rest of the data as I would like to have it. I also tried adding an extra column with NA but somehow I cannot paste the total in a specific row position.

Any suggestions? I would like to have something I can do in BaseR. Thanks!!

Upvotes: 2

Views: 748

Answers (2)

Ronak Shah
Ronak Shah

Reputation: 388907

In base R you can use ave to add new column. We insert the sum of group only if it is last row in the group.

df$TOTAL <- with(df, ave(PRICE, CODE, CONCEPT, PNR, NAME, FUN = function(x) 
                ifelse(seq_along(x) == length(x), sum(x), NA)))

df
#  CODE  CONCEPT PNR  NAME DEPTO. PRICE TOTAL
#1    1    Lunch  11  John  SALES   160    NA
#2    1    Lunch  11  John  SALES   120    NA
#3    1    Lunch  11  John  SALES    10   290
#4    1    Lunch  13 Frank     IT   200   200
#5    2 Internet  13 Frank     IT   120   120

Similar logic using dplyr

library(dplyr)
df %>%
   group_by(CODE, CONCEPT, PNR, NAME) %>%
   mutate(TOTAL = ifelse(row_number() == n(), sum(PRICE) ,NA))

Upvotes: 4

Tim Biegeleisen
Tim Biegeleisen

Reputation: 521028

For a base R option, you may try merging the original data frame and aggregate:

df2 <- aggregate(PRICE~NAME+CODE, data = df, FUN = sum)
out <- merge(df[ , !(names(df) %in% c("PRICE"))], df2, by=c("NAME", "CODE"))
out[with(out, order(CODE, NAME)), ]

   NAME CODE  CONCEPT PNR  DEPT PRICE
1 Frank    1    Lunch  13    IT   200
3  John    1    Lunch  11 SALES   290
4  John    1    Lunch  11 SALES   290
5  John    1    Lunch  11 SALES   290
2 Frank    2 Internet  13    IT   120

Upvotes: 0

Related Questions