Reputation: 1039
In the below example, userids
is my reference data frame and userdata
is the data frame where the replacements should take place.
> userids <- data.frame(USER=c('Ann','Jim','Lee','Bob'),ID=c(1,2,3,4))
> userids
USER ID
1 Ann 1
2 Jim 2
3 Lee 3
4 Bob 4
> userdata <- data.frame(INFO=c('foo','bar','foo','bar'), ID=c('Bob','Jim','Ann','Lee'),AGE=c('43','33','53','26'), FRIENDID=c('Ann',NA,'Lee','Jim'))
> userdata
INFO ID AGE FRIENDID
1 foo Bob 43 Ann
2 bar Jim 33 NA
3 foo Ann 53 Lee
4 bar Lee 26 Jim
How do I replace ID and FRIENDID in userdata
with the ID corresponding to USER in userids
?
The desired output:
INFO ID AGE FRIENDID
1 foo 4 43 1
2 bar 2 33 NA
3 foo 1 53 3
4 bar 3 26 2
Upvotes: 18
Views: 10576
Reputation: 108
Here's a possible solution, which will also work on datasets with multiple records of each ID, though we will need to coerce the ID and FRIENDID variables to character first:
> userdata$ID <- sapply(userdata$ID, function(x){gsub(x, userids[userids$USER==x, 2], x)})
> userdata$FRIENDID <- sapply(userdata$FRIENDID, function(x){gsub(x, userids[userids$USER==x, 2], x)})
Upvotes: 0
Reputation: 110034
This is a possibility:
library(qdap)
userdata$FRIENDID <- lookup(userdata$FRIENDID, userids)
userdata$ID <- lookup(userdata$ID, userids)
or to win the one line prize:
userdata[, c(2, 4)] <- lapply(userdata[, c(2, 4)], lookup, key.match=userids)
Upvotes: 2
Reputation: 121608
Here a try using sqldf
to get the result as a multiple join on differents columns.
library(sqldf)
sqldf('SELECT d.INFO,d.AGE,i1.ID ,i2.ID FRIENDID
FROM
userdata d
INNER JOIN
userids i1 ON (i1.USER=d.FRIENDID)
INNER JOIN
userids i2 ON (i2.USER=d.ID)')
INFO AGE ID FRIENDID
1 foo 43 1 4
2 foo 53 3 1
3 bar 26 2 3
But this this removes NA lines! maybe someone can suggest me something on how to deal with NA!
EDIT
Thanks to G. Grothendieck comment, replacing the INNER by LEFT we get the result.
sqldf('SELECT d.INFO,d.AGE,i1.ID ,i2.ID FRIENDID
FROM
userdata d
LEFT JOIN
userids i1 ON (i1.USER=d.FRIENDID)
LEFT JOIN
userids i2 ON (i2.USER=d.ID)')
INFO AGE ID FRIENDID
1 foo 43 1 4
2 bar 33 NA 2
3 foo 53 3 1
4 bar 26 2 3
Upvotes: 0
Reputation: 118879
Use match
:
userdata$ID <- userids$ID[match(userdata$ID, userids$USER)]
userdata$FRIENDID <- userids$ID[match(userdata$FRIENDID, userids$USER)]
Upvotes: 26