coomie
coomie

Reputation: 411

R - Lookup values in one df using a conditional statement on a different df

I have two data frames. I can get R to do what I want using nested for loop but I'm sure there is a more efficient method.

Here is my loop:

for (x in 1:dim(data)[1]){
  for (y in 1:dim(Zone_Factor)[1]){
    if(data[x,"ZONE"] == Zone_Factor[y,"Zone"] & data[x,"BLOCK"] == Zone_Factor[y,"Zone_Number"]){
      data[x,"Zone_Factor"] <- Zone_Factor[y,"Factor"]
    }
  }
}

"data" is my main data frame (~100k rows) and "Zone_Factor" is a small table which I'm extracting data from based on values in "data". I'm much more familiar with VBA than R and basically I want to do a vlookup on "Zone_Factor" using "data$ZONE" and "data$BLOCK". I played around with which() but had no luck (my which() function posted below)

Zone_Factor[which(Zone_Factor[,"Zone"] == data$ZONE & Zone_Factor[,"Zone_Number"] == data$BLOCK), "Factor"]

This loop runs in about 2 minutes but I'm assuming there is a much more efficient way of doing this. What can I do instead of using a nested loop?

Upvotes: 0

Views: 172

Answers (1)

bshelt141
bshelt141

Reputation: 1223

Without a reproducible example, it looks like you're just trying to merge two data frames by two fields.

Creating two dummy data frames:

library(dplyr)

data <- data.frame(ZONE = c("Tim", "Brad"),
               BLOCK = c(32828, 32801))

Zone_Factor <- data.frame(Zone = c("Tim", "Tim", "John", "John", "Brad", "Brad"),
                      Zone_Number = c(32827, 32828, 32806, 32807, 32801, 32802),
                      Factor = c(10, 5, 20, 15, 30, 25))

Merging them together based on your naming conventions:

data <- data %>%
  mutate(uid = paste0(ZONE, "-", BLOCK))

Zone_Factor <- Zone_Factor %>%
  mutate(uid = paste0(Zone, "-", Zone_Number)) %>%
  select(-Zone, -Zone_Number)

final_data <- left_join(data, Zone_Factor, by = "uid")
final_data$uid <- NULL

print(final_data)
#  ZONE BLOCK Factor
#1  Tim 32828      5
#2 Brad 32801     30

Upvotes: 1

Related Questions