Reputation: 10626
df
App1 App2
PS(id-123) id-345
HR(id-345) id-789
Web(id-567) id-123
Trading(id-789) id-345
etc
Columns in data frame are not ordered or may or maynot match the same row values. For example, on row one, PS(id-123) does not match id-345. Values on App1 and App2 freely input.
I need to go throguh App2 and if the value on App2 matches the value between Parentheis in Apps, need to replace that value in App2 with the value from App1.
This is what my final df should look like:
App1 App2 App3
PS(id-123) id-345 HR(id-345)
HR(id-345) id-789 Trading(id-789)
Web(id-567) id-123 PS(id-123)
Trading(id-789) id-345 HR(id-345)
search App1 based on value from Apps2 and if matched put the value in App3 column?
Any ideas how can approach this in R?
Upvotes: 1
Views: 96
Reputation: 89057
This is short and works with your current data:
transform(df, App3 = App1[sapply(App2, grep, App1)])
# App1 App2 App3
# 1 PS(id-123) id-345 HR(id-345)
# 2 HR(id-345) id-789 Trading(id-789)
# 3 Web(id-567) id-123 PS(id-123)
# 4 Trading(id-789) id-345 HR(id-345)
To make it a little more robust, you could replace App2
above with paste0("(", App2, ")")
, also replace sapply
with vapply
and the expected output: integer(1)
.
Edit:
Since it appears some ids do not have a match, an approach based on match
might be more appropriate:
transform(df, App3 = App1[match(App2, sub(".*\\((id-.+)\\)", "\\1", App1))])
where sub(".*\\((id-.+)\\)", "\\1", App1)
is used to extract the interesting information from the App1
column.
Upvotes: 6
Reputation: 263352
flodel's is a more elegant but this might help understand the steps.
> idx <- sapply(dat$App2, function(x) grep(x, dat$App1) )
> dat$App3b <- dat$App1[idx]
> dat
App1 App2 App3 App3b
1 PS(id-123) id-345 HR(id-345) HR(id-345)
2 HR(id-345) id-789 Trading(id-789) Trading(id-789)
3 Web(id-567) id-123 PS(id-123) PS(id-123)
4 Trading(id-789) id-345 HR(id-345) HR(id-345)
Upvotes: 1