Reputation: 1105
So I have a dataframe like so:
Occupation
ELECTRICIAN
ROAD ELECTRICIAN
ELECTRICIAN
FARMER
GRASS ELECTRICIAN
POLE ELECTRICIANS
ELECTRICIAN
INSPECTOR
I would like any cell that contains ELECTRICIAN to become just ELECTRICIAN, regardless of whatever else is in the cell.
The final result should be:
Occupation
ELECTRICIAN
ELECTRICIAN
ELECTRICIAN
FARMER
ELECTRICIAN
ELECTRICIAN
ELECTRICIAN
INSPECTOR
I tried the following, but it did not work:
ifelse(grep('CONDUCTOR',df$Occupation , value=TRUE), "CONDUCTOR",df$Occupation)
Upvotes: 0
Views: 57
Reputation: 166
Here's a tidyverse solution using the stringr package.
library(stringr)
df$Occupation <- str_replace_all(df$Occupation, ".*ELECTRICIAN.*", "ELECTRICIAN")
There's some debate over whether tidyverse solutions are "preferred," but I personally prefer them. I think the function names are much more intuitive both for you and whoever may be reading your code. I also think it's concise and direct, doing exactly what you want to do.
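For anyone without stringr installed, the same pattern should also work with base R's sub(). This is only a rough sketch on a toy vector (the name occupation is made up for illustration and stands in for df$Occupation); it also shows why the ".*" on both sides matters.

```r
# Toy vector standing in for df$Occupation (values taken from the question)
occupation <- c("ELECTRICIAN", "ROAD ELECTRICIAN", "FARMER",
                "POLE ELECTRICIANS", "INSPECTOR")

# The leading and trailing ".*" make the pattern consume the whole string,
# so the entire cell is replaced, not just the matched word
cleaned <- sub(".*ELECTRICIAN.*", "ELECTRICIAN", occupation)

# Without the ".*" wrappers only the word itself is substituted,
# which leaves the surrounding text in place
unchanged <- sub("ELECTRICIAN", "ELECTRICIAN", occupation)

print(cleaned)
print(unchanged)
```

Rows that do not contain the word (FARMER, INSPECTOR) pass through untouched in both cases.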
Upvotes: 2
Reputation: 388982
Using grep, you can get the indices in Occupation where "ELECTRICIAN" is present and replace those values.
df$Occupation[grep('ELECTRICIAN', df$Occupation)] <- 'ELECTRICIAN'
df
# Occupation
#1 ELECTRICIAN
#2 ELECTRICIAN
#3 ELECTRICIAN
#4 FARMER
#5 ELECTRICIAN
#6 ELECTRICIAN
#7 ELECTRICIAN
#8 INSPECTOR
Upvotes: 1
Reputation: 39595
I would suggest this approach. It is better to use grepl(), as it returns logical values that ifelse() can use directly as its test:
#Data
df <- structure(list(Occupation = c("ELECTRICIAN", "ROAD ELECTRICIAN",
"ELECTRICIAN", "FARMER", "GRASS ELECTRICIAN", "POLE ELECTRICIANS",
"ELECTRICIAN", "INSPECTOR")), row.names = c(NA, -8L), class = "data.frame")
Code:
#Code
df$Occupation <- ifelse(grepl('ELECTRICIAN', df$Occupation), 'ELECTRICIAN', df$Occupation)
Output:
Occupation
1 ELECTRICIAN
2 ELECTRICIAN
3 ELECTRICIAN
4 FARMER
5 ELECTRICIAN
6 ELECTRICIAN
7 ELECTRICIAN
8 INSPECTOR
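To make the difference concrete, here is a small sketch comparing what grep() and grepl() return on a shortened toy vector (the variable names are just for illustration). It suggests why the original attempt with grep(value = TRUE) inside ifelse() could not work: the test argument was a vector of matching strings, not one TRUE/FALSE per row.

```r
# Shortened toy vector, using values from the question
occupation <- c("ELECTRICIAN", "ROAD ELECTRICIAN", "FARMER", "INSPECTOR")

# grep(value = TRUE) returns only the matching strings (length 2 here)
matches <- grep("ELECTRICIAN", occupation, value = TRUE)

# grep() without value returns the integer positions of the matches
positions <- grep("ELECTRICIAN", occupation)

# grepl() returns one TRUE/FALSE per element, the shape ifelse() expects
flags <- grepl("ELECTRICIAN", occupation)

result <- ifelse(flags, "ELECTRICIAN", occupation)
print(result)
```

Because flags has the same length as occupation, ifelse() can pick row by row between the replacement and the original value.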
Upvotes: 2