Reputation: 723
I have three data frames that need to be merged. The competitor names differ slightly between the data frames. For instance, one data frame may omit the space between a middle and last name, while another displays the person's name correctly (example: Sarah JaneDoe vs. Sarah Jane Doe). So I used the fuzzyjoin package. When I run the code below, it keeps running and never finishes. I can't figure out how to fix this.
Can you identify where I went wrong?
library(fuzzyjoin)
library(tidyverse)

temp1 = read.csv('https://raw.githubusercontent.com/bandcar/bjj/main/temp1.csv')
stats = read.csv('https://raw.githubusercontent.com/bandcar/bjj/main/stats.csv')
winners = read.csv('https://raw.githubusercontent.com/bandcar/bjj/main/winners.csv')

# perform fuzzy matching full join
star = stringdist_join(temp1, stats,
                       by = 'Name',            # match based on Name
                       mode = 'full',          # use full join
                       method = "jw",          # use jw distance metric
                       max_dist = 99,
                       distance_col = 'dist') %>%
  group_by(Name.x)
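One possible explanation, offered as a guess rather than a confirmed diagnosis: the "jw" (Jaro-Winkler) distance only ranges from 0 to 1, so max_dist = 99 accepts every pair of names and the join returns one row for every combination of rows in the two data frames. On data frames of any real size that can look like the code never finishes. The sketch below uses small made-up data frames (left and right, and the wins column, are hypothetical stand-ins for temp1 and stats) and an assumed threshold of 0.15 that would need tuning on the real names.

```r
library(fuzzyjoin)
library(dplyr)

# Hypothetical toy data standing in for temp1 and stats
left  <- data.frame(Name = c("Sarah JaneDoe", "John Smith", "Ana Lima"))
right <- data.frame(Name = c("Sarah Jane Doe", "Jon Smith", "Carlos Souza"),
                    wins = c(3, 1, 2))

# With max_dist = 99 every pair passes the jw threshold,
# so the result has one row per pair of names (3 x 3 here)
all_pairs <- stringdist_join(left, right,
                             by = "Name",
                             mode = "full",
                             method = "jw",
                             max_dist = 99,
                             distance_col = "dist")
nrow(all_pairs)

# A threshold inside the 0-1 range keeps only close matches.
# 0.15 is an assumed starting point, not a recommended value.
close_pairs <- stringdist_join(left, right,
                               by = "Name",
                               mode = "inner",
                               method = "jw",
                               max_dist = 0.15,
                               distance_col = "dist")

# Keep only the closest candidate for each left-hand name
best_match <- close_pairs %>%
  group_by(Name.x) %>%
  slice_min(dist, n = 1, with_ties = FALSE) %>%
  ungroup()

best_match
```

If unmatched rows from both sides still need to be kept, mode = "full" can be used with the smaller threshold; the row count then stays close to the size of the inputs instead of their product.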
Upvotes: 0
Views: 54