LfB

Reputation: 17

How can I select rows from a dataframe that match any of the elements in a vector?

I am using the dataframe below:

Product.Name = c('BRILINTA','BRILINTA','Brilinta 6','Brilinta 9')
NDC = c(00186077739,00186077660,00186077739,00186077760)
df = cbind(Product.Name,NDC)

I have the two vectors below:

ticagrelor_ndc = c(00186077660,186077739,186077694,186077708,186077760,5515496180,5515496188,6923811346,6923811341)
ticagrelor_name = c('ticagrelor','Brilinta 6','Brilinta 9','Brilinta','BRILINTA')

I would like to select the rows of the data frame where df$Product.Name matches any element of ticagrelor_name, or where df$NDC matches any element of ticagrelor_ndc.

I have tried the following:

df[(NDC %in% ticagrelor_ndc) | (Product.Name %in% ticagrelor_name)]
df[sapply(1:nrow(input_data), function(x) all(input_data$NDC %in% ndc_list)),]
subset(df,NDC %in% ndc_list | Product.Name %in% name_list)

Actual results:

1) Matching df$Product.Name to ticagrelor_name works perfectly. 2) Matching df$NDC to ticagrelor_ndc does not work at all.

Expected result: I would like to be able to match based upon df$Product.Name and df$NDC.

Upvotes: 0

Views: 44

Answers (1)

andrew_reece

Reputation: 21274

Make sure df is actually a data frame rather than the matrix that cbind() returns (see @neilfws's comment), then use the OR (|) operator in filter().

library(tidyverse)

df %>% filter(Product.Name %in% ticagrelor_name | NDC %in% ticagrelor_ndc)

# A tibble: 3 x 2
  Product.Name       NDC
  <chr>            <dbl>
1 BRILINTA     186077739
2 BRILINTA     186077660
3 Brilinta 9   186077760
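
To see why the data frame check matters, here is a minimal sketch in base R, assuming the two vectors from the question. cbind() on a character vector and a numeric vector coerces both to a common type and returns a character matrix, while data.frame() keeps each column's own type. The names m and d are only illustrative and do not appear in the question.

```r
Product.Name <- c('BRILINTA', 'BRILINTA', 'Brilinta 6', 'Brilinta 9')
NDC <- c(00186077739, 00186077660, 00186077739, 00186077760)

# cbind() coerces both vectors to one type, giving a character matrix
m <- cbind(Product.Name, NDC)
is.data.frame(m)   # FALSE
is.character(m)    # TRUE

# data.frame() keeps each column's own type
d <- data.frame(Product.Name, NDC)
is.data.frame(d)   # TRUE
is.numeric(d$NDC)  # TRUE
```

On the matrix, the `$` operator is not available, which is one likely reason the df$NDC comparison in the question did not behave as expected.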

Note: Your provided data doesn't actually seem to produce any failed matches. Here's a modified dataset to demonstrate how the "Brilinta 6" row is filtered out when it matches neither ticagrelor_name nor ticagrelor_ndc:

Product.Name <- c('BRILINTA', 'BRILINTA', 'Brilinta 6', 'Brilinta 9')
NDC <- c(00186077739, 00186077660, 00186077739, 00186077760)
ticagrelor_ndc <- c(00186077660, 186077694, 186077708, 186077760,
                    5515496180, 5515496188, 6923811346, 6923811341)
ticagrelor_name <- c('ticagrelor', 'Brilinta 9', 'Brilinta', 'BRILINTA')
df <- data.frame(Product.Name, NDC)
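
If you would rather not load the tidyverse, the same OR condition can be sketched in base R. This is only an illustration on the modified dataset above, repeated here so the block runs on its own. The names keep and result are hypothetical helpers, not part of the original answer.

```r
Product.Name <- c('BRILINTA', 'BRILINTA', 'Brilinta 6', 'Brilinta 9')
NDC <- c(186077739, 186077660, 186077739, 186077760)
ticagrelor_ndc <- c(186077660, 186077694, 186077708, 186077760,
                    5515496180, 5515496188, 6923811346, 6923811341)
ticagrelor_name <- c('ticagrelor', 'Brilinta 9', 'Brilinta', 'BRILINTA')
df <- data.frame(Product.Name, NDC)

# Logical vector: TRUE where either the name or the NDC is in its lookup vector
keep <- df$Product.Name %in% ticagrelor_name | df$NDC %in% ticagrelor_ndc

# Row subsetting needs the trailing comma: df[rows, columns]
result <- df[keep, ]
print(result)
```

The trailing comma in df[keep, ] is what selects rows. The first attempt in the question left it out.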

Upvotes: 1
