christian_db
christian_db

Reputation: 3

New computed column replicating SUMIFS from excel

I have a data frame (df) as follows:

>date       fx      mkt_val
>03/01/2016 AUD     50
>03/01/2016 AUD     75
>03/01/2016 USD     100
>03/01/2016 USD     150
>03/02/2016 AUD     50
>03/02/2016 AUD     500
>03/02/2016 USD     150
>03/02/2016 USD     275

I want to be able effectively replicate the SUMIFS function in excel, so that I get a new column that contains the sum of each fx value for each date. Such that the output would look as follows:

>date       fx  mkt_val sumifs
>03/01/2016 AUD   50    125
>03/01/2016 AUD   75    125
>03/01/2016 USD   100   250
>03/01/2016 USD   150   250
>03/02/2016 AUD   50    550
>03/02/2016 AUD   500   550
>03/02/2016 USD   150   425
>03/02/2016 USD   275   425

Any points to the right direction or code would help! Thank you.

Upvotes: 0

Views: 66

Answers (2)

G. Grothendieck
G. Grothendieck

Reputation: 269644

Try ave. No packages are used.

transform(DF, sumif = ave(mkt_val, date, fx, FUN = sum))

giving:

        date  fx mkt_val sumif
1 03/01/2016 AUD      50   125
2 03/01/2016 AUD      75   125
3 03/01/2016 USD     100   250
4 03/01/2016 USD     150   250
5 03/02/2016 AUD      50   550
6 03/02/2016 AUD     500   550
7 03/02/2016 USD     150   425
8 03/02/2016 USD     275   425

Note: the input in reproducible form is:

Lines <- "date       fx      mkt_val
03/01/2016 AUD     50
03/01/2016 AUD     75
03/01/2016 USD     100
03/01/2016 USD     150
03/02/2016 AUD     50
03/02/2016 AUD     500
03/02/2016 USD     150
03/02/2016 USD     275"
DF <- read.table(text = Lines, header = TRUE, as.is = TRUE)

Upvotes: 2

Andrew Lavers
Andrew Lavers

Reputation: 4378

Using dplyr ibraries. Please (in general) provide data using dput or other directly reproducible way, to make it easier to reproduce.

df <- read.delim(text="
date fx mkt_val
03/01/2016 AUD 50
03/01/2016 AUD 75
03/01/2016 USD 100
03/01/2016 USD 150
03/02/2016 AUD 50
03/02/2016 AUD 500
03/02/2016 USD 150
03/02/2016 USD 275
", sep=" ", row.names=NULL)

library(dplyr)

# summarize
summary <- df %>% 
  group_by(date, fx) %>%
  summarize(sumifs = sum(mkt_val)) 

# join back to the original
result <- df %>% 
  inner_join(summary, by=c("date", "fx"))

result

        date  fx mkt_val sumifs
1 03/01/2016 AUD      50    125
2 03/01/2016 AUD      75    125
3 03/01/2016 USD     100    250
4 03/01/2016 USD     150    250
5 03/02/2016 AUD      50    550
6 03/02/2016 AUD     500    550
7 03/02/2016 USD     150    425
8 03/02/2016 USD     275    425

Upvotes: 0

Related Questions