Anirudh Murali
Anirudh Murali

Reputation: 107

How to search for a string in an entire data.frame

I have the following table where I have Item name for spare parts of a car. I have the ITEM code for the particular part manufactured by the Car Manufacturers, I also have the corresponding ITEM Code for the same part manufactured by Part Manufacturers.

I periodically get an input, where I get only the ITEM Codes which have been sold. How do I identify which part was sold.

> trial
# A tibble: 6 x 5
  Name         `OEM Part` `OES 1 Code`   `OES 2 Code` `OES 3 Code`
  <chr>        <chr>      <chr>          <chr>        <chr>       
1 Brakes       231049A76  1910290/230023 NA           NA          
2 Cables       2410ASD12  NA             219930       3213Q23     
3 Tyres        9412HJ12   231233         NA           NA          
4 Suspension   756634K71  782320/880716  NA           NA          
5 Ball Bearing 2IW2WD23   231224         NA           NA          
6 Clutches     9304JFW3   NA             QQW223       23RQR3      

Say if I have an input of the following values

> item_code <- c("231049A76", "1910290", "1910290", "23RQR3")

I need the following output

Name
Brakes
Brakes
Brakes
Clutches

NOTE: The 1910290 and 230023 are separate parts; both of them are brakes with slight modification.

Upvotes: 3

Views: 85

Answers (3)

Thomas Guillerme
Thomas Guillerme

Reputation: 1867

Here is an example similar to yours using base:

## Create a dummy matrix
example <- cbind(matrix(1:4, 4,1), matrix(letters[1:20], 4, 4))
colnames(example) <- c("names", "W", "X", "Y", "Z")
#     names W   X   Y   Z  
#[1,] "1"   "a" "e" "i" "m"
#[2,] "2"   "b" "f" "j" "n"
#[3,] "3"   "c" "g" "k" "o"
#[4,] "4"   "d" "h" "l" "p"

This table is similar to yours where the names are in the first column and the patterns to match in the other ones.

## The pattern of interest
pattern <- c("a","e", "f", "p")

For this pattern we expect the following result: "1","1","2","4".

## Detecting the pattern per row
matching_rows <- row(example[,-1])[example[,-1] %in% pattern]
#[1] 1 1 2 4

## Returning the rows with the pattern
example[matching_rows,1]
#[1] "1" "1" "2" "4"

Upvotes: 1

Ronak Shah
Ronak Shah

Reputation: 389175

A not-so efficient method with sapply and apply where we find out which row in trial holds item_code and then get it's corresponding Name value.

sapply(item_code, function(x)   
            trial$Name[apply(trial[-1], 1,  function(y)  any(grepl(x, y)))])

# 231049A76    1910290    1910290     23RQR3 
#  "Brakes"   "Brakes"   "Brakes" "Clutches" 

If you do not need the names, set USE.NAMES = FALSE in sapply.

Upvotes: 3

alistaire
alistaire

Reputation: 43354

If you reshape your data to long form, you can use a join:

library(tidyverse)

trial <- tibble(Name = c("Brakes", "Cables", "Tyres", "Suspension", "Ball Bearing", "Clutches"), 
                `OEM Part` = c("231049A76", "2410ASD12", "9412HJ12", "756634K71", "2IW2WD23", "9304JFW3"), 
                `OES 1 Code` = c("1910290/230023", NA, "231233", "782320/880716", "231224", NA), 
                `OES 2 Code` = c(NA, "219930", NA, NA, NA, "QQW223"), 
                `OES 3 Code` = c(NA, "3213Q23", NA, NA, NA, "23RQR3"))

trial_long <- trial %>% 
    gather('code_type', 'code', -Name) %>%    # reshape to long form
    separate_rows(code) %>%    # separate double values
    drop_na(code)    # drop unnecessary NA rows

# join to filter and duplicate
trial_long %>% 
    right_join(tibble(code = c("231049A76", "1910290", "1910290", "23RQR3")))
#> # A tibble: 4 x 3
#>   Name     code_type  code     
#>   <chr>    <chr>      <chr>    
#> 1 Brakes   OEM Part   231049A76
#> 2 Brakes   OES 1 Code 1910290  
#> 3 Brakes   OES 1 Code 1910290  
#> 4 Clutches OES 3 Code 23RQR3

Upvotes: 6

Related Questions