Reputation: 197
Here’s my problem which has been driving me crazy all morning.
So, I have two tables “vessel” and “target”
v_registry<-c("","GBR000B11824","GBR000B10110","GBR000C17779","","GBR000C16255")
v_pln<-c("WH4","","BRD5","B291","LI8","UL78")
v_rss<-c("C19926","B11824","","C17779","A16190","C16255")
v_asset<- c(104892,104902,104905,104916,104919,104920)
vessel<-data.frame(v_registry,v_pln,v_rss,v_asset,stringsAsFactors=FALSE)
t_registry<-c("GBR000C19926","GBR000B11824","","","GBR000A16190","")
t_pln<-c("","","BRD5","B291","LI8","")
t_rss<-c("C19926","","","","","C16255")
target<-data.frame(t_registry,t_pln,t_rss,stringsAsFactors=FALSE)
target<-target[sample(nrow(target)),]
The vessel table has ID information about vessels. The target table is quite wide with lots of other data not needed here in the example. What I’m trying to achieve is to copy across the “t_asset” column (which is the only complete ID field) to the target table. The problem is that none of my tables is complete and I need to make this operation based on three different fields.
Below are a couple of attempts to do it. The sample line is just to shuffle it because for some strange reason if it is ordered it will work. The second attempt returns only a logical value and I haven’t managed to get the element instead of the logical value.
#Attempt 1
target$t_asset<-
vessel$v_asset[match(target$t_registry,vessel$v_registry,incomparables = "")|
match(target$t_pln,vessel$v_pln,incomparables = "")|
match(target$t_rss,vessel$v_rss,incomparables = "")]
#Attempt 2
target$t_asset<-
(vessel$v_asset[match(target$t_registry,vessel$v_registry,incomparables = "")]|
vessel$v_asset[match(target$t_pln,vessel$v_pln,incomparables = "")]|
vessel$v_asset[match(target$t_rss,vessel$v_rss,incomparables = "")])
The expected output is(rows might look different because of the shuffle):
> target
t_registry t_pln t_rss t_asset
1 GBR000C19926 C19926 104892
2 GBR000B11824 104902
3 BRD5 104905
4 B291 104916
5 GBR000A16190 LI8 104919
6 C16255 104920
Any ideas on how to solve it?
Cheers
Upvotes: 1
Views: 64
Reputation: 197
Both, earlier answers solved the example given. However, when applied in the real data set for some reason both gave errors.
So, finally I've manged to get some code giving the right answers and tested in the real data set. However, the code is not pretty and I'm sure it could be made more efficient.
# Creates three new columns each with an idependent match
target$t_asset_registry<-vessel$v_asset[match(target$t_registry,vessel$v_registry,incomparables = "")]
target$t_asset_pln<-vessel$v_asset[match(target$t_pln,vessel$v_pln,incomparables = "")]
target$t_asset_rss<-vessel$v_asset[match(target$t_rss,vessel$v_rss,incomparables = "")]
# an if statment to sumarize the results
target$asset<-ifelse(is.na(target$t_asset_registry),
ifelse(is.na(target$t_asset_pln),
ifelse(is.na(target$t_asset_rss),NA,target$t_asset_rss),
target$t_asset_pln),target$t_asset_registry)
The output is:
> target
t_registry t_pln t_rss t_asset_registry t_asset_pln t_asset_rss asset
4 B291 NA 104916 NA 104916
3 BRD5 NA 104905 NA 104905
6 C16255 NA NA 104920 104920
5 GBR000A16190 LI8 NA 104919 NA 104919
1 GBR000C19926 C19926 NA NA 104892 104892
2 GBR000B11824 104902 NA NA 104902
Looking to the output it is clear what I was trying to achieve.If anyone as a clever way of accomplish the same result please post it.
Thanks to all helping out
Upvotes: 0
Reputation: 40648
Using merge
:
target$t_asset <- merge(target, vessel, by.x=1:3, by.y=1:3, all.y = T, sort = F)$v_asset
> target
t_registry t_pln t_rss t_asset
6 C16255 104892
1 GBR000C19926 C19926 104902
3 BRD5 104905
2 GBR000B11824 104916
5 GBR000A16190 LI8 104919
4 B291 104920
Upvotes: 1
Reputation: 60000
# Find which rows from vessel are the match for target
x <- mapply( match , MoreArgs=list(incomparables="") , target , vessel )
# Remove the NA's and incase you have more than one piece of information
# available (multiple matches), reduce to a single number
idx <- apply(x,1, function(x) unique( x[!is.na(x) ] ))
# Use the matches to get the id field from vessel
target$t_asset <- vessel$v_asset[idx]
target
# t_registry t_pln t_rss t_asset
#3 BRD5 104905
#2 GBR000B11824 104902
#4 B291 104916
#1 GBR000C19926 C19926 104892
#6 C16255 104920
#5 GBR000A16190 LI8 104919
Upvotes: 1