Kalenji

Reputation: 407

R - New column in data frame with aggregated value based on three conditions

Suppose I have the data frame:

df <- data.frame(Year = rep(1:3, each = 5)
                 , Terminal = c(1,1,1,1,1,1,2,2,2,2,2,2,2,1,2)
                 , day = c (1,1,1,1,1,1,2,2,2,2,2,2,2,1,2)
                 , Capacity = sample(1:15))

and I am trying to add a column "X" that holds the sum of Capacity for each combination of Year, day and Terminal.

[Image: original df]

[Image: desired outcome]

I use the code below to do the calculations:

aggregate(Capacity ~ Terminal + Year + day , data=df, FUN=sum)

and

as.data.table(df)[, sum(Capacity), by = .(Terminal, Year, day)]

but when I try to create the new column, it only contains the values 1 or 2 and not the sum. It also gives the warning below. The code I use for X is df["X"] <- aggregate(Capacity ~ Terminal + Year + day , data=df, FUN=sum)

Warning message: In `[<-.data.frame`(`*tmp*`, "X", value = list(Terminal = c(1, 1, : provided 4 variables to replace 1 variables

Upvotes: 0

Views: 45

Answers (1)

akrun

Reputation: 887951

aggregate returns a summarised data frame (one row per group) rather than creating a new column. We can use mutate from dplyr instead:

library(dplyr)
df %>%
   group_by(Year, day, Terminal) %>%
   mutate(X = sum(Capacity))

For the data.table approach, we need to use := to create a new column:

as.data.table(df)[, X := sum(Capacity), by = .(Terminal, Year, day)]
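One caveat: as.data.table(df) builds a new data.table, so the line above leaves df itself unchanged unless the result is assigned to something. A minimal sketch, assuming it is acceptable to convert df in place with setDT() (the set.seed() call is an assumption, added only to make sample() reproducible):

```r
library(data.table)

set.seed(42)  # assumption: only here so the example is reproducible
df <- data.frame(Year = rep(1:3, each = 5),
                 Terminal = c(1,1,1,1,1,1,2,2,2,2,2,2,2,1,2),
                 day = c(1,1,1,1,1,1,2,2,2,2,2,2,2,1,2),
                 Capacity = sample(1:15))

# setDT() converts df to a data.table by reference (no copy is made)
setDT(df)

# := adds X to df itself; every row receives the sum of its group
df[, X := sum(Capacity), by = .(Terminal, Year, day)]
```

After this, df is a data.table with the extra column X; setDF(df) converts it back to a plain data frame if needed.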

Or with ave from base R

df$X <- with(df, ave(Capacity, Year, day, Terminal, FUN = sum))
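To illustrate why the direct assignment in the question fails, the sketch below inspects the shape of the aggregate() result and then merges it back onto the original rows. The names sums and merged and the set.seed() call are assumptions for illustration. The 1 or 2 seen in the question appears to be the first column of the aggregate() result (Terminal) being recycled into X.

```r
set.seed(42)  # assumption: only here so the example is reproducible
df <- data.frame(Year = rep(1:3, each = 5),
                 Terminal = c(1,1,1,1,1,1,2,2,2,2,2,2,2,1,2),
                 day = c(1,1,1,1,1,1,2,2,2,2,2,2,2,1,2),
                 Capacity = sample(1:15))

# aggregate() returns one row per group and four columns,
# so it cannot be assigned directly to a single column of df
sums <- aggregate(Capacity ~ Terminal + Year + day, data = df, FUN = sum)
dim(sums)

# rename the summed column, then merge it back onto every original row
# (note: merge() may change the row order)
names(sums)[names(sums) == "Capacity"] <- "X"
merged <- merge(df, sums, by = c("Terminal", "Year", "day"))

# ave() yields the same per-row group sums while keeping the original order
df$X <- with(df, ave(Capacity, Year, day, Terminal, FUN = sum))
```

The merge() route is mainly useful when the aggregated table is needed anyway; otherwise ave() is shorter and preserves row order.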

Upvotes: 2
