Reputation: 967
I have two dataframes
master
P_ID C_ID D_ID ...
520686814 1222 1281
336892632 1232 8181 ...
and
str(master$P_ID) : integer64 [1:12000] 520686814 336892632 140101133 85433815 339389222
Also
Dimension
p_id year gender state
520686814 2017 1 A
336892632 2016 2 B
and
str(Dimension$p_id) : num [1:9915433] 520686814 336892632 636774575 248552775...
So essentially I can see that there are records matching in both dataframes for a p_id. So I do a join using
joined <- merge(master, Dimension, by.x = "P_ID", by.y = "p_id")
And to my surprise, I get no records in joined. Does the join don't work on different data types even though they work while subsetting?
Upvotes: 1
Views: 464
Reputation: 819
Assuming you're not performing any mathematical operations on them you could (some would argue should) store IDs as character strings instead. It's my understanding that this is a safer approach. Additionally, merging on strings is faster than merging on numerics.
master$P_ID <- as.character(master$P_ID)
Dimension$p_id <- as.character(Dimension$p_id)
joined <- merge(master, Dimension, by.x = "P_ID", by.y = "p_id")
The code above should get it done. It doesn't answer the question of why your merge fails but does provide you with a workaround.
Upvotes: 1