Reta

Reputation: 383

Replace multiple strings in a column conditionally, based on 2 other columns, in R

I'm trying to replace text in a column based on 2 other columns in my data using R.

I have this data with these columns:

Id  City     Street           Street_Type     Street_Category
1   Dallas   State Route 315  Street          Street
2   Dallas   State Route 82   State Highways  Street
3   SF       State St         Street          Street
4   NY city  Corss St         Street          Street
5   SD       Steven Pkwy      Street          Street
6   LA       Harlem Pkwy      Parkway         Parkway

And I want my data to look like this:

Id  City     Street           Street_Type     Street_Category
1   Dallas   State Route 315  Street          State Highways
2   Dallas   State Route 82   State Highways  State Highways
3   SF       State St         Street          Street
4   NY city  Corss St         Street          Street
5   SD       Steven Pkwy      Street          Parkway
6   LA       Harlem Pkwy      Parkway         Parkway

I want to change the existing column Street_Category as follows. If column Street contains the text "State Route" and column Street_Type is "Street", replace the text in Street_Category with "State Highways". If column Street contains the text "Pkwy" and column Street_Type is "Street", replace the text in Street_Category with "Parkway".

I have a large dataset with many different values that need to be replaced in the same way as in this example. How can I apply this across all the datasets I have? I also want the matching to be case-sensitive and not too broad. For example, I don't want to change the Street_Category of "State St" to "State Highways" just because it contains the word "State".

I used this code to create the Street_Type column, but it led to the wrong classification in Street_Category.

df <- df %>%
  mutate(Street_Type = case_when(
    # "St" also matches "State Route ...", so this first branch catches those rows
    str_detect(Street, "St") ~ "Street",
    str_detect(Street, " State Route") ~ "State Highways",
    str_detect(Street, "Route") ~ "State Highways",
    str_detect(Street, "Pkwy") ~ "Parkway",
    TRUE ~ "No type"
  ))

But it gave me the first output shown above, so I tried this code to replace the column based on 2 other columns, following the answer in this link:

df[Street == " State Route" &  Street_Type == "Street", Street_Category == "State Highways"]
df[Street == " Pkwy" &  Street_Type == "Street", Street_Category == "Parkway"]

But I get the error message:

Error in `[.data.frame`(df, Street == " State Route" & Street_Type ==  : 
  object 'Street_Category' not found

What am I missing here? I'd be so thankful if you could point out the error I'm making.

Upvotes: 0

Views: 413

Answers (1)

Ronak Shah

Reputation: 388817

You can try -

library(dplyr)
library(stringr)

df %>%
  mutate(Street_Category = case_when(
      str_detect(Street, 'State Route') & Street_Type == 'Street' ~ "State Highways", 
      str_detect(Street, 'Pkwy') &  Street_Type == 'Street' ~ "Parkway", 
      TRUE ~ Street_Category))

#  Id    City          Street    Street_Type Street_Category
#1  1  Dallas State Route 315         Street  State Highways
#2  2  Dallas  State Route 82 State Highways          Street
#3  3      SF        State St         Street          Street
#4  4 NY city        Corss St         Street          Street
#5  5      SD     Steven Pkwy         Street         Parkway
#6  6      LA     Harlem Pkwy        Parkway         Parkway
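
As for the error in the question: inside `df[...]`, bare names such as Street and Street_Category are not looked up as columns of a plain data frame, and `==` only compares, it does not assign. A minimal base R sketch of the same replacement follows. It assumes the sample `df` from the "data" section (recreated here so the block runs on its own), and the helper vectors `is_route` and `is_pkwy` are illustrative names only.

```r
# Same sample data as in the question, recreated so this block is self-contained
df <- data.frame(
  Id = 1:6,
  City = c("Dallas", "Dallas", "SF", "NY city", "SD", "LA"),
  Street = c("State Route 315", "State Route 82", "State St",
             "Corss St", "Steven Pkwy", "Harlem Pkwy"),
  Street_Type = c("Street", "State Highways", "Street",
                  "Street", "Street", "Parkway"),
  Street_Category = c("Street", "Street", "Street",
                      "Street", "Street", "Parkway"),
  stringsAsFactors = FALSE
)

# grepl() is case-sensitive by default; fixed = TRUE matches the pattern literally
is_route <- grepl("State Route", df$Street, fixed = TRUE) & df$Street_Type == "Street"
is_pkwy  <- grepl("Pkwy", df$Street, fixed = TRUE) & df$Street_Type == "Street"

# Refer to columns through df$ and assign with <- on the selected rows
df$Street_Category[is_route] <- "State Highways"
df$Street_Category[is_pkwy]  <- "Parkway"

print(df)
```

Note that the dplyr pipeline above only prints its result. To keep the change, assign it back, for example `df <- df %>% mutate(...)`.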

data

It is easier to help if you provide data in a reproducible format, for example the output of dput(df).

df <- structure(list(Id = 1:6, City = c("Dallas", "Dallas", "SF", "NY city", 
"SD", "LA"), Street = c("State Route 315", "State Route 82", 
"State St", "Corss St", "Steven Pkwy", "Harlem Pkwy"), Street_Type = c("Street", 
"State Highways", "Street", "Street", "Street", "Parkway"), Street_Category = c("Street", 
"Street", "Street", "Street", "Street", "Parkway")), row.names = c(NA, -6L), class = "data.frame")

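Since the question mentions many more replacements of this kind, one possible way to scale the idea is to keep the rules in a small lookup table and loop over it. This is only a sketch under assumptions: the `rules` table and the `apply_rules()` helper are hypothetical names, not part of dplyr or stringr, and the `\\b` word boundaries are an assumption about how strict the matching should be. Base R `grepl()` is used to keep the sketch free of dependencies; `str_detect(Street, pattern)` could be substituted.

```r
# Sample data from this answer, recreated so the block is self-contained
df <- data.frame(
  Id = 1:6,
  City = c("Dallas", "Dallas", "SF", "NY city", "SD", "LA"),
  Street = c("State Route 315", "State Route 82", "State St",
             "Corss St", "Steven Pkwy", "Harlem Pkwy"),
  Street_Type = c("Street", "State Highways", "Street",
                  "Street", "Street", "Parkway"),
  Street_Category = c("Street", "Street", "Street",
                      "Street", "Street", "Parkway"),
  stringsAsFactors = FALSE
)

# Hypothetical rule table: regex pattern, required Street_Type, new category
rules <- data.frame(
  pattern  = c("\\bState Route\\b", "\\bPkwy\\b"),
  type     = c("Street", "Street"),
  category = c("State Highways", "Parkway"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: applies each rule in order, so a later rule
# overwrites an earlier one when both match the same row
apply_rules <- function(data, rules) {
  for (i in seq_len(nrow(rules))) {
    hit <- grepl(rules$pattern[i], data$Street) &
      data$Street_Type == rules$type[i]
    data$Street_Category[hit] <- rules$category[i]
  }
  data
}

result <- apply_rules(df, rules)
print(result)
```

New replacements then only need a new row in `rules`. Because the patterns are case-sensitive and require the whole phrase, "State St" is left untouched.
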
Upvotes: 1
