Chemjong
Chemjong

Reputation: 47

Mutating join in r and writing function

I come from excel background. Vlookup has been answered before but that doesn't help me to solve my problem.

  1. I have dataframe named data1 & vlook. I am trying to vlookup vlook$Jul.16 against data1 for the month of 2016-07-01 only and so on.I also want to fill vlook$data with same value vlook$Jul.16 if there is match otherwise NO.
  2. Actual csv file has 50,000+ rows and 12 months. I am also looking for advise which could eliminate writing similar code 12 times.

datas:

df <- structure(list(Market = c("DOM", "INT", "INT", "INT", "DOM", 
"INT", "INT", "DOM", "DOM", "DOM", "INT", "DOM", "INT", "INT", 
"DOM", "DOM", "INT", "DOM", "DOM", "INT", "INT", "DOM", "DOM", 
"INT"), Code = c("ZNO", "ZBE", "ZAC", "ZJY", "DCC", "ZXL", "ZNO", 
"ZBE", "ZAC", "ZJY", "DCC", "ZXL", "YNO", "ZNO", "ZNO", "YNO", 
"DCC", "YNO", "ZNO", "ZBE", "ZAC", "ZJY", "ZXL", "ZNO"), Departure = c("2016-07-01", 
"2016-07-01", "2016-07-01", "2016-07-01", "2016-07-01", "2016-07-01", 
"2016-07-01", "2016-07-01", "2016-07-01", "2016-07-01", "2016-07-01", 
"2016-07-01", "2016-08-01", "2016-08-01", "2016-08-01", "2016-08-01", 
"2016-08-01", "2016-08-01", "2016-08-01", "2016-09-01", "2016-09-01", 
"2016-09-01", "2016-09-01", "2016-09-01"), Amount.per.km = c(0.36, 
0.34, 0.23, 0.26, 0.18, 0.28, 1.08, 0.38, 0.24, 0.34, 0.27, 0.13, 
0.11, 0.22, 0.13, 0.06, 0.76, 0.26, 0.23, 0.57, 0.18, 0.09, 0.84, 
0.28)), .Names = c("Market", "Code", "Departure", "Amount.per.km"
), class = "data.frame", row.names = c(NA, -24L))

vlook <- structure(list(Jul.16 = c("ZNO", "ZBE", "YNO", "ZAC", "ZJY", 
"DCC", "ZXL"), Data = c(NA, NA, NA, NA, NA, NA, NA), Aug.16 = c("ZNO", 
"ZBE", "YNO", "ZAC", "", "", ""), Data.1 = c(NA, NA, NA, NA, 
NA, NA, NA)), .Names = c("Jul.16", "Data", "Aug.16", "Data.1"
), class = "data.frame", row.names = c(NA, -7L))

My expected end result:

expected <- structure(list(Jul.16 = c("ZNO", "ZBE", "YNO", "ZAC", "ZJY", 
"DCC", "ZXL"), Data = c("ZNO", "ZBE", "NO", "ZAC", "ZJY", "DCC", 
"ZXL"), Aug.16 = c("ZNO", "ZBE", "YNO", "ZAC", "", "", ""), Data.1 = c("ZNO", 
"NO", "YNO", "NO", "", "", "")), .Names = c("Jul.16", "Data", 
"Aug.16", "Data.1"), class = "data.frame", row.names = c(NA, 
-7L))

My attempt to vlookup:

vlook %>% 
  select(Jul.16) %>% 
  left_join(.,data1, by = c("Jul.16" = "Code")) %>% 
  filter(Departure == "2016-07-01")

vlook %>% 
      select(Aug.16) %>% 
      left_join(.,data1, by = c("Aug.16" = "Code")) %>% 
      filter(Departure == "2016-08-01")

The above code correctly join two dataframes for given month. However this doesn't show me code which has no match. E.g. vlook$Jul.16 has "YNO" but data1 doesn't have "YNO" for month of July hence my code doesn't show me "YNO" as NA.

I also need help in putting values in vlook$Data and vlook$Data.1 for given month just like my expected output.

This is related to my work. I'll very grateful for your help. Thank you.

Upvotes: 0

Views: 205

Answers (1)

MrFlick
MrFlick

Reputation: 206308

Your vlook table is in a very un-R-like data format which makes this quite messy. You could do something like this:

vlook %>% 
    mutate(Data=ifelse(Jul.16 %in% filter(df, Departure=="2016-07-01")$Code, Jul.16,"NO")) %>% 
    mutate(Data.1=ifelse(Aug.16 %in% filter(df, Departure=="2016-08-01")$Code, Aug.16,"NO"))

Upvotes: 3

Related Questions