Reputation: 89
I have the following data frame df in R:
kyid industry amount
112 Apparel 345436
234 APPEARELS 234567
213 apparels 345678
345 Airlines 235678
123 IT 456789
124 IT 897685
I want to replace the incorrectly written values in industry, such as Apparel or APPEARELS, with Apparels.
I tried creating a list of patterns and running it through a loop:
l<-c('Apparel ','APPEARELS','apparels')
for(i in range(1:3)){
df$industry<-gsub(pattern=l[i],"Apparels",df$industry)
}
It is not working: only one element changes.
But when I run the statement on its own, it raises no error and it works:
df$industry<-gsub(pattern=","Apparels",df$industry)
But this is a large dataset, so I need this to work in R. Please help.
Upvotes: 0
Views: 553
Reputation: 54287
While range returns a sequence in Python, in R it returns the minimum and maximum of a vector:
range(1:3)
# [1] 1 3
Instead, you could use 1:3, seq(1,3), or seq_along(l), which all return
# [1] 1 2 3
Also note the difference between 'Apparel' and 'Apparel ' (the second has a trailing space, so it will not match a value that has none).
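To see how these two issues combine, here is a small sketch that replays the loop from the question on a plain character vector. The vector `industry` and the bookkeeping variable `visited` are introduced only for illustration, and the sketch assumes the values in the data carry no trailing whitespace.

```r
# Patterns exactly as written in the question (first one has a trailing space)
l <- c('Apparel ', 'APPEARELS', 'apparels')

# Hypothetical stand-in for df$industry, assuming no trailing whitespace
industry <- c("Apparel", "APPEARELS", "apparels", "Airlines", "IT", "IT")

visited <- c()
for (i in range(1:3)) {          # range(1:3) is c(1, 3), so index 2 is skipped
  visited <- c(visited, i)
  industry <- gsub(pattern = l[i], "Apparels", industry)
}

visited
# [1] 1 3
industry
# [1] "Apparel"   "APPEARELS" "Apparels"  "Airlines"  "IT"        "IT"
```

Index 1 uses 'Apparel ' and matches nothing, index 2 is never visited, and only index 3 ('apparels') makes a replacement, which fits the "only one element changes" symptom.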
So, with both issues fixed:
df<-read.table(header=T, text="kyid industry amount
112 Apparel 345436
234 APPEARELS 234567
213 apparels 345678
345 Airlines 235678
123 IT 456789
124 IT 897685")
l<-c('Apparel','APPEARELS','apparels')
for(i in seq_along(l)){
df$industry<-gsub(pattern=l[i],"Apparels",df$industry)
}
df
# kyid industry amount
# 1 112 Apparels 345436
# 2 234 Apparels 234567
# 3 213 Apparels 345678
# 4 345 Airlines 235678
# 5 123 IT 456789
# 6 124 IT 897685
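One thing to keep in mind: gsub matches substrings, so running the loop a second time would turn "Apparels" into "Apparelss", because 'Apparel' matches inside it. If the goal is to replace whole values only, a different technique, exact matching with %in%, may be a safer fit. The following is a minimal sketch under that assumption; the name `variants` is hypothetical and the data frame is rebuilt here only to keep the example self-contained.

```r
# Same data as above, built directly so the example stands alone
df <- data.frame(
  kyid = c(112, 234, 213, 345, 123, 124),
  industry = c("Apparel", "APPEARELS", "apparels", "Airlines", "IT", "IT"),
  amount = c(345436, 234567, 345678, 235678, 456789, 897685),
  stringsAsFactors = FALSE
)

# Hypothetical name: the misspellings to normalise
variants <- c("Apparel", "APPEARELS", "apparels")

# Replace only values that are exactly equal to one of the variants
df$industry[df$industry %in% variants] <- "Apparels"

# Running it again changes nothing, since "Apparels" is not in variants
df$industry[df$industry %in% variants] <- "Apparels"

df$industry
# [1] "Apparels" "Apparels" "Apparels" "Airlines" "IT"       "IT"
```

This is vectorised, so it needs no loop, and it does not interpret the variants as regular expressions.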
Upvotes: 1
Reputation: 28379
Using sub without a loop, by joining the patterns with |:
l <- c("Apparel" , "APPEARELS", "apparels")
# Using OPs data
sub(paste(l, collapse = "|"), "Apparels", df$industry)
# [1] "Apparels" "Apparels" "Apparels" "Airlines" "IT" "IT"
I'm using sub instead of gsub as there's only one occurrence of the pattern in each string (at least in the example).
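The difference between sub and gsub, and the effect of anchoring the combined pattern, can be sketched as follows. The test vector `x` and the variables `unanchored` and `anchored` are made up for illustration; anchoring with ^ and $ is an optional variation, not part of the answer above.

```r
l <- c("Apparel", "APPEARELS", "apparels")

# Hypothetical inputs: a misspelling, an already-correct value, a repeated match
x <- c("Apparel", "Apparels", "apparels and apparels")

unanchored <- paste(l, collapse = "|")        # "Apparel|APPEARELS|apparels"
anchored <- paste0("^(", unanchored, ")$")    # matches whole strings only

first_only <- sub(unanchored, "Apparels", x)   # first match per string
all_matches <- gsub(unanchored, "Apparels", x) # every match per string
whole_only <- sub(anchored, "Apparels", x)     # only exact whole-string matches

first_only
# [1] "Apparels"              "Apparelss"             "Apparels and apparels"
all_matches
# [1] "Apparels"              "Apparelss"             "Apparels and Apparels"
whole_only
# [1] "Apparels"              "Apparels"              "apparels and apparels"
```

With the unanchored pattern, an already-correct "Apparels" gains an extra "s" because 'Apparel' matches inside it; the anchored version leaves it alone.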
Upvotes: 2