Lara Vandooren
Lara Vandooren

Reputation: 11

R: add another column with the sum of a column, if all other are the same

I want to add another column to my dataframe in R, containing the sum of the amount of orders, when the other columns are the same.

For example if this is my data frame:

orderdate / orderhour / deliverydate / deliveryhour / orders / New column: sum


wednesday / 12:00 / Friday / 17:00 / 5 / 0
wednesday / 12:00 / Friday / 17:00 / 2 / 7
thursday / 13:00 / Friday / 17:00 / 3 / 3
thursday/ 13:00 / Friday / 19:00 / 1 / 1

The last column is the one I want to add. In the end, the rows with zero can be removed, so I no longer have duplicate data. So only if all columns match (except orders) the sum of the orders should be taken. In the other cases the same value can be kept.

Can someone please help me?

Thanks in advance!

Upvotes: 0

Views: 413

Answers (1)

AntoniosK
AntoniosK

Reputation: 16121

# example data
df = read.table(text = "
orderdate/orderhour/deliverydate/deliveryhour/orders
wednesday/12:00/Friday/17:00/5 
wednesday/12:00/Friday/17:00/2 
thursday/13:00/Friday/17:00/3 
thursday/13:00/Friday/19:00/1
", sep="/", header=T, stringsAsFactors=F)

library(dplyr)

df %>%
  group_by(orderdate, orderhour, deliverydate, deliveryhour) %>% # for each combination of those variables
  mutate(sum_orders = ifelse(n() > 1 &                           # if there are multiple rows
                             row_number() == 1,                  # and this is row 1
                             0,                                  # use a zero
                             cumsum(orders))) %>%                # else use the cum sum
  ungroup()                                                      # forget the grouping

# # A tibble: 4 x 6
#   orderdate orderhour deliverydate deliveryhour orders sum_orders
#   <chr>     <chr>     <chr>        <chr>         <dbl>      <dbl>
# 1 wednesday 12:00     Friday       17:00          5.00       0   
# 2 wednesday 12:00     Friday       17:00          2.00       7.00
# 3 thursday  13:00     Friday       17:00          3.00       3.00
# 4 thursday  13:00     Friday       19:00          1.00       1.00

Upvotes: 1

Related Questions