Reputation: 23
I'm working with a fairly large data set (100k rows) and want to replicate the Excel Index Match function in R Studio.
I'm looking for a way to create a new column that will pull a value from an existing column, if 3 values from three different columns match 3 values from 3 other columns.
Specifically, regarding the example below, I'm looking to create a new column '1994_Number' that pulls the value from '1995_Number' if all of the three columns '1994_Address', '1994_ZipCode' and '1994_Bank Name' match '1995_Address', '1995_ZipCode' and '1995_Bank Name'. Desired result in red.
Any help would be appreciated a lot.
Upvotes: 1
Views: 404
Reputation: 506
Possible solution to your problem with dplyr
:
library(dplyr)
dat <- data.frame(`1994_Adress` = c("123 Road", "123 Road", "321 Road"),
`1994_ZipCode` = c(99999, 99999, 11111),
`1994_Bank Name` = c("JPM", "JPM", "WF"),
`1995_Adress` = c("123 Road", "1234 Road", "321 Road"),
`1995_ZipCode` = c(99999, 99999, 11111),
`1995_Bank Name` = c("JPM", "JPM", "WF"),
`1995_Number` = c(1, 2, 3), check.names = F, stringsAsFactors = F)
dat <- dat %>%
mutate(`1994_Number` = case_when(`1994_Adress` == `1995_Adress` &
`1994_ZipCode` == `1995_ZipCode` &
`1994_Bank Name` == `1995_Bank Name` ~ `1995_Number`))
Upvotes: 2