Reputation: 339
I have time-dependent data with a "year" column giving the year of each observation. I have a second variable, with one value per year, that I'd like to subtract from the first variable wherever the years match.
library(dplyr)
a1 = data.frame(year = 2000:2005, y=0:5)
b1 = data.frame(year = 2000:2005, y=0:5)
ab = rbind(a1,b1)
c1 = data.frame(year = 2000:2005, x = 10:15)
# my best attempt - does not work
result <- ab %>% group_by(year) %>% mutate(diff = year - c1[year])
What I expect is that result has an entry with year = 2000, y = 0, and a new column diff = -10.
However, I can't seem to make that work using dplyr.
How can this be accomplished using dplyr?
Upvotes: 0
Views: 226
Reputation: 1708
Is there a difference between a1 and b1? They look the same.
How about this?
d <- left_join(ab, c1, by = "year") %>%
  mutate(diff = y - x)
This gives me the following, which seems to solve your problem:
   year y  x diff
1  2000 0 10  -10
2  2001 1 11  -10
3  2002 2 12  -10
4  2003 3 13  -10
5  2004 4 14  -10
6  2005 5 15  -10
7  2000 0 10  -10
8  2001 1 11  -10
9  2002 2 12  -10
10 2003 3 13  -10
11 2004 4 14  -10
12 2005 5 15  -10
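As for why the original attempt fails: as far as I can tell, c1[year] treats the year values (2000, 2001, ...) as column positions in c1, so it does not look up the row for that year. If you would rather not carry the x column around, a lookup with match() inside mutate() should give the same diff. This is only a minimal sketch that reuses the data frames from the question and assumes each year appears exactly once in c1:

```r
library(dplyr)

# Data frames from the question
a1 <- data.frame(year = 2000:2005, y = 0:5)
b1 <- data.frame(year = 2000:2005, y = 0:5)
ab <- rbind(a1, b1)
c1 <- data.frame(year = 2000:2005, x = 10:15)

# match() returns, for each row of ab, the position of its year in c1$year;
# indexing c1$x with those positions lines up one x value per row of ab.
# Assumption: years are unique in c1. A year missing from c1 would yield NA.
result <- ab %>%
  mutate(diff = y - c1$x[match(year, c1$year)])
```

No group_by() is needed here because the lookup is already done row by row. The left_join() version is probably the better choice if you also want x in the result.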
Upvotes: 1