How to find indices of specific rows in dataframe r

Question

I have a dataframe, A, which looks like this:

col1   col2   col3
NL     6      9
UK     5      5
US     9      7

and I have a dataframe, B, which consists of a subset of the rows of A and looks like this:

col1   col2   col3
NL     6      9
UK     5      5

Now, I want to find the indices of the rows from B in A, so it should return 1 and 2. Does anyone know how to do this?

EDIT: Next, I also want to find the indices of the rows in A when B contains only the first two columns. In that case it should also return 1 and 2. Does anyone have an idea how to do this?

akrun · Accepted Answer

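Generally, match returns the position of the first match of each element. In our case, one approach is to paste the columns of each row together into a single string and get the index with match (here df1 is the full dataset, A, and df2 is the subset, B):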

match(do.call(paste, df2), do.call(paste, df1))
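
As a self-contained illustration, here is a minimal sketch using the data from the question. The object names df1 and df2 and the column name col1 (rather than "col 1") are assumptions made for the example.

```r
# Assumed reconstruction of the question's data: df1 plays the role of A,
# df2 the role of B (its first two rows).
df1 <- data.frame(col1 = c("NL", "UK", "US"),
                  col2 = c(6, 5, 9),
                  col3 = c(9, 5, 7))
df2 <- df1[1:2, ]

# Collapse every row into one string ("NL 6 9", "UK 5 5", ...) and look up
# where each row string of df2 first occurs among the row strings of df1.
idx <- match(do.call(paste, df2), do.call(paste, df1))
idx
# [1] 1 2
```

Note that paste separates values with a space by default, so two different rows could in principle collapse to the same string if the values themselves contain spaces. Passing a separator that cannot occur in the data is one way around that.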

If the two datasets share only a subset of their columns, get the vector of common column names with intersect, subset both datasets to those columns, paste, and get the index with match:

nm1 <- intersect(names(df1), names(df2))
match(do.call(paste, df2[nm1]), do.call(paste, df1[nm1]))
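
A small sketch of the subset-of-columns case, again with assumed data: df2 here has only col1 and col2, plus one hypothetical row ("FR") that does not occur in df1, to show what match returns when there is no match.

```r
# Assumed data: df1 as in the question, df2 with only the first two columns.
df1 <- data.frame(col1 = c("NL", "UK", "US"),
                  col2 = c(6, 5, 9),
                  col3 = c(9, 5, 7))
df2 <- data.frame(col1 = c("NL", "UK", "FR"),   # "FR" is a made-up non-matching row
                  col2 = c(6, 5, 1))

# Columns present in both datasets
nm1 <- intersect(names(df1), names(df2))

# Compare rows on the shared columns only; rows without a match give NA
idx_sub <- match(do.call(paste, df2[nm1]), do.call(paste, df1[nm1]))
idx_sub
# [1]  1  2 NA
```

Because match returns only the first hit, duplicated key combinations in df1 would each resolve to the earliest matching row.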

Another option is a join: create a row index in both datasets, join them on the shared columns, and extract the row index that comes from the second dataset:

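library(dplyr)
df2 %>%
  mutate(rn = row_number()) %>%              # row index in df2 (becomes rn.x)
  left_join(df1 %>%
              mutate(rn = row_number()),     # row index in df1 (becomes rn.y)
            by = c('col1', 'col2', 'col3')) %>%
  pull(rn.y)                                 # positions of df2's rows in df1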
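
If dplyr is not available, roughly the same idea can be sketched in base R with merge. This is a swapped-in alternative, not the code above, and the data and the rn column name are assumptions for the example.

```r
# Assumed data: df1 as in the question, df2 its first two rows.
df1 <- data.frame(col1 = c("NL", "UK", "US"),
                  col2 = c(6, 5, 9),
                  col3 = c(9, 5, 7))
df2 <- df1[1:2, ]

# Record each row's position in df1 before joining
df1$rn <- seq_len(nrow(df1))

# Inner join on the shared columns; rn carries df1's row positions through
joined <- merge(df2, df1, by = c("col1", "col2", "col3"))
idx_join <- joined$rn
```

Unlike match, merge may reorder the rows (it sorts on the key columns by default), so the indices are not guaranteed to come back in the order of df2.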
