Yellow_truffle
Yellow_truffle

Reputation: 923

multiplying column from data frame 1 by a condition found in data frame 2

I have two separate data frame and what I am trying to do is that for each year, I want to check data frame 2 (in the same year) and multiply a column from data frame 1 by the found number. So for example, imagine my first data frame is:

year <- c(2001,2003,2001,2004,2006,2007,2008,2008,2001,2009,2001)
price <- c(1000,1000,1000,1000,1000,1000,1000,1000,1000,1000,1000)
df <- data.frame(year, price)

   year price
1  2001  1000
2  2003  1000
3  2001  1000
4  2004  1000
5  2006  1000
6  2007  1000
7  2008  1000
8  2008  1000
9  2001  1000
10 2009  1000
11 2001  1000

Now, I have a second data frame which includes inflation conversion rate (code from @akrun)

ref_inf <- c(2,3,1,2.2,1.3,1.5,1.9,1.8,1.9,1.9)
ref_year<- seq(2010,2001)
inf_data <- data.frame(ref_year,ref_inf)
inf_data<-inf_data %>% 
  mutate(final_inf = cumprod(1 + ref_inf/100))
   ref_year ref_inf final_inf
1      2010     2.0  1.020000
2      2009     3.0  1.050600
3      2008     1.0  1.061106
4      2007     2.2  1.084450
5      2006     1.3  1.098548
6      2005     1.5  1.115026
7      2004     1.9  1.136212
8      2003     1.8  1.156664
9      2002     1.9  1.178640
10     2001     1.9  1.201035

What I want to do is that for example for the first row of data frame 1, it's the year 2001, so I go and found a conversion for the year 2001 from data frame 2 which is 1.201035 and then multiply the price in a data frame 1 by this found conversion rate. So the result should look like this:

   year price  after_conv
1  2001  1000  1201.035
2  2003  1000  1156.664
3  2001  1000  1201.035
4  2004  1000  1136.212
5  2006  1000  1098.548
6  2007  1000  1084.450
7  2008  1000  1061.106
8  2008  1000  1061.106
9  2001  1000  1201.035
10 2009  1000  1050.600
11 2001  1000  1201.035

is there any way to do this without using else and if commands?

Upvotes: 1

Views: 56

Answers (2)

Len Greski
Len Greski

Reputation: 10855

Since the data is already being processed by dplyr, we can also solve this problem with dplyr. A dplyr based solution joins the data with the reference data by year and calculates after_conv.

year <- c(2001,2003,2001,2004,2006,2007,2008,2008,2001,2009,2001)
price <- c(1000,1000,1000,1000,1000,1000,1000,1000,1000,1000,1000)
df <- data.frame(year, price)
library(dplyr)
ref_inf <- c(2,3,1,2.2,1.3,1.5,1.9,1.8,1.9,1.9)
ref_year<- seq(2010,2001)
inf_data <- data.frame(ref_year,ref_inf)
inf_data %>% 
     mutate(final_inf = cumprod(1 + ref_inf/100)) %>%
     rename(year = ref_year) %>%
     left_join(df,.) %>%
     mutate(after_conv = price * final_inf ) %>%
     select(year,price,after_conv)

We use left_join() to keep the data ordered in the original order of df as well as ensure rows in inf_data only contribute to the output if they match at least one row in df. We use . to reference the data already in the pipeline as the right side of the join, merging in final_inf so we can use it in the subsequent mutate() function. We then select() to keep the three result columns we need.

...and the output:

Joining, by = "year"
   year price after_conv
1  2001  1000   1201.035
2  2003  1000   1156.664
3  2001  1000   1201.035
4  2004  1000   1136.212
5  2006  1000   1098.548
6  2007  1000   1084.450
7  2008  1000   1061.106
8  2008  1000   1061.106
9  2001  1000   1201.035
10 2009  1000   1050.600
11 2001  1000   1201.035

We can save the result to the original df by writing the result of the pipeline to df.

inf_data %>% 
     mutate(final_inf = cumprod(1 + ref_inf/100)) %>%
     rename(year = ref_year) %>%
     left_join(df,.) %>%
     mutate(after_conv = price * final_inf ) %>%
     select(year,price,after_conv) -> df

Upvotes: 1

akrun
akrun

Reputation: 887118

We can do a join on the 'year' with 'ref_year' and create the new column by assigning (:=) the output of product of 'price' and 'final_inf'

library(data.table)
setDT(df)[inf_data,  after_conv := price * final_inf, on = .(year = ref_year)]

-output

df
#    year price after_conv
# 1: 2001  1000   1201.035
# 2: 2003  1000   1156.664
# 3: 2001  1000   1201.035
# 4: 2004  1000   1136.212
# 5: 2006  1000   1098.548
# 6: 2007  1000   1084.450
# 7: 2008  1000   1061.106
# 8: 2008  1000   1061.106
# 9: 2001  1000   1201.035
#10: 2009  1000   1050.600
#11: 2001  1000   1201.035

Upvotes: 1

Related Questions