Using the value in one column to specify from which row to retrieve a value for a new column

Question

I'm looking for an automated way of converting this:

dat = tribble(
    ~a, ~b, ~c
    , 'x', 1, 'y'
    , 'y', 2, NA
    , 'q', 4, NA
    , 'z', 3, 'q'
)

to:

tribble(
    ~a, ~b, ~d
    , 'x', 1, 2
    , 'z', 3, 4
)

So, the column c in dat encodes which row in dat to look at to grab a value for a new column d, and if c is NA, toss that row from the output. Any tips?

Ronak Shah · Accepted Answer

We can join dat with itself, matching column c against column a.

library(dplyr)

dat %>%
  inner_join(dat %>% select(-c) %>% rename(d = 'b'), 
             by = c('c' = 'a'))


# A tibble: 2 x 4
#  a         b c         d
#  <chr> <dbl> <chr> <dbl>
#1 x         1 y         2
#2 z         3 q         4
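The join above keeps the lookup column c, while the output requested in the question has only a, b and d. A minimal sketch of one way to get that exact shape, assuming dat is built as in the question; the name result is just for illustration:

```r
library(dplyr)

# Same data as in the question
dat <- tribble(
  ~a,  ~b, ~c,
  'x', 1,  'y',
  'y', 2,  NA,
  'q', 4,  NA,
  'z', 3,  'q'
)

result <- dat %>%
  # Right-hand side: only the key (a) and the value to fetch (b, renamed to d)
  inner_join(dat %>% select(a, d = b), by = c('c' = 'a')) %>%
  # Drop the lookup column once it has been used
  select(-c)

result
```

Rows where c is NA drop out because inner_join only keeps rows with a match, and no value of a is NA.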

In base R, we can do this with merge:

merge(dat, dat[-3], by.x = 'c', by.y = 'a')
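Note that merge returns the columns as c, a, b.x, b.y and sorts the rows by the key. A rough sketch of tidying that up to match the requested output, assuming a plain data.frame version of dat; the names merged and result are hypothetical:

```r
# Same data as in the question, as a base data.frame
dat <- data.frame(
  a = c('x', 'y', 'q', 'z'),
  b = c(1, 2, 4, 3),
  c = c('y', NA, NA, 'q'),
  stringsAsFactors = FALSE
)

# Self-merge: each row's c is matched against a in the copy without column c
merged <- merge(dat, dat[-3], by.x = 'c', by.y = 'a')

# Reorder rows by b, keep only the wanted columns, and rename them
result <- merged[order(merged$b.x), c('a', 'b.x', 'b.y')]
names(result) <- c('a', 'b', 'd')

result
```

Ordering by b.x is only a convenience that happens to reproduce the row order in the question; if the original row order matters in general, it may be safer to carry an explicit row index through the merge.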
