user3424575
user3424575

Reputation: 111

Adding values from 2 different dataframes to make a new one?

Sorry, couldn't think of a better way to phrase the question.

Basically, how would I go about merging something similar to the following dataframes

2014:

Team  Goals.Scored   Goals.Conceded
MUFC      2              4
MCFC      3              0
LFC       1              6

2015:

Team  Goals.Scored   Goals.Conceded
MUFC    4                   3
MCFC    3                   2
AVFC    1                   5

into something like this, that would automatically add the values for matching Teams

Overall

Team  Goals.Scored  Goals.Conceded
MUFC      6             7
MCFC      6             2
LFC       1             6
AVFC      1             5

I was originally using a simple command along the lines of

new.df = 2014$Goals.Scored + 2015$Goals.Scored

but I think that only works if the teams are in the same order in both data frames, and if they're the same teams across the years? Was thinking of doing a for loop but have been unsuccessful thus far.

Would appreciate any help, thanks.

Upvotes: 2

Views: 57

Answers (1)

akrun
akrun

Reputation: 886998

If the two datasets are 'df1', 'df2', we place it in a list, rbind it with rbindlist from data.table, grouped by 'Team', we get the sum of the Subset of Data.table (.SD)

library(data.table)
rbindlist(list(df1, df2))[,lapply(.SD, sum) ,Team]
#   Team Goals.Scored Goals.Conceded
#1: MUFC            6              7
#2: MCFC            6              2
#3:  LFC            1              6
#4: AVFC            1              5

Or using dplyr, we rbind the dataset (bind_rows), group by 'Team', and get the sum of the columns using summarise_each.

library(dplyr)
bind_rows(df1, df2) %>%
           group_by(Team) %>% 
           summarise_each(funs(sum))

Or a bit more compact code will be aggregate from base R (would be slower compared to the other methods described above for big datasets). We can use the formula with . suggesting all the other variables on the lhs of ~.

aggregate(.~Team, rbind(df1, df2), sum)
#  Team Goals.Scored Goals.Conceded
#1 AVFC            1              5
#2  LFC            1              6
#3 MCFC            6              2
#4 MUFC            6              7

It can be otherwise written as aggregate(cbind(Goals.Scored, Gaols.Conceded)~Team,....)

Upvotes: 4

Related Questions