Reputation: 767
Given a large data frame with a column whose distinct values are
(ONE, TWO, THREE, FOUR, FIVE, SIX, SEVEN, EIGHT),
I want to replace some of the values. For example, every occurrence of 'ONE' should be replaced by '1' and
'FOUR' -> '2SQUARED'
'FIVE' -> '5'
'EIGHT' -> '2CUBED'
Other values should remain as they are.
Looping over the rows with IF/ELSE will run forever. How can I apply a vectorized solution? Is match() the correct way to go?
Upvotes: 0
Views: 162
Reputation: 99321
Your column is probably a factor. Give this a try. Using rnso's data, I'd recommend you first create two vectors: the values to change from and the values to change to.
from <- c("FOUR", "FIVE", "EIGHT")
to <- c("2SQUARED", "5", "2CUBED")
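Then replace the factor levels directly on the data frame (doing the assignment inside with() would only change a temporary copy and leave data untouched):
levels(data$vals)[match(from, levels(data$vals))] <- to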
This gives
data
# vals
# 1 ONE
# 2 TWO
# 3 THREE
# 4 2SQUARED
# 5 5
# 6 SIX
# 7 SEVEN
# 8 2CUBED
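For anyone who wants to run this without scrolling down for rnso's data, here is a minimal self-contained sketch of the same idea. It rebuilds the example column as a factor and adds a guard (my addition, not part of the answer above) that skips any entry of from that is not an existing level, since match() returns NA for those and an NA index would make the assignment fail.

```r
# Rebuild the example column as a factor (same values as in the question).
data <- data.frame(
  vals = factor(c("ONE", "TWO", "THREE", "FOUR", "FIVE", "SIX", "SEVEN", "EIGHT"))
)

from <- c("FOUR", "FIVE", "EIGHT")
to   <- c("2SQUARED", "5", "2CUBED")

# Positions of the old labels among the factor levels; NA if a label is absent.
idx  <- match(from, levels(data$vals))
keep <- !is.na(idx)

# Rename only the levels that exist. Every row carrying one of those
# levels changes at once, so no loop over the rows is needed.
levels(data$vals)[idx[keep]] <- to[keep]

result <- as.character(data$vals)
print(result)
```

Because only the level labels are rewritten, the cost depends on the number of distinct values rather than on the number of rows.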
Upvotes: 0
Reputation: 92282
Using @rnso data set
library(plyr)
transform(data, vals = mapvalues(vals,
c('ONE', 'FOUR', 'FIVE', 'EIGHT'),
c('1','2SQUARED', '5', '2CUBED')))
# vals
# 1 1
# 2 TWO
# 3 THREE
# 4 2SQUARED
# 5 5
# 6 SIX
# 7 SEVEN
# 8 2CUBED
Upvotes: 0
Reputation: 24535
Try the following using base R:
data = structure(list(vals = structure(c(4L, 8L, 7L, 3L, 2L, 6L, 5L,
1L), .Label = c("EIGHT", "FIVE", "FOUR", "ONE", "SEVEN", "SIX",
"THREE", "TWO"), class = "factor")), .Names = "vals", class = "data.frame", row.names = c(NA,
-8L))
initial = c('ONE', 'FOUR', 'FIVE', 'EIGHT')
final = c('1','2SQUARED', '5', '2CUBED')
myfn = function(ddf, init, fin){
  refdf = data.frame(init, fin)
  ddf$new = refdf[match(ddf$vals, init), 'fin']
  ddf$new = as.character(ddf$new)
  ndx = which(is.na(ddf$new))
  ddf$new[ndx] = as.character(ddf$vals[ndx])
  ddf
}
myfn(data, initial, final)
vals new
1 ONE 1
2 TWO TWO
3 THREE THREE
4 FOUR 2SQUARED
5 FIVE 5
6 SIX SIX
7 SEVEN SEVEN
8 EIGHT 2CUBED
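The same match() lookup can be written more compactly without the intermediate data frame. The following is only a sketch: replace_values is a hypothetical helper name, not a base R function, and it returns a character vector rather than a factor.

```r
# Hypothetical helper: values of x found in `from` are replaced by the
# corresponding entry of `to`; everything else is kept unchanged.
replace_values <- function(x, from, to) {
  stopifnot(length(from) == length(to))
  x   <- as.character(x)   # accept factor or character input
  idx <- match(x, from)    # NA where x has no replacement
  hit <- !is.na(idx)
  x[hit] <- to[idx[hit]]
  x
}

vals    <- c("ONE", "TWO", "THREE", "FOUR", "FIVE", "SIX", "SEVEN", "EIGHT")
initial <- c("ONE", "FOUR", "FIVE", "EIGHT")
final   <- c("1", "2SQUARED", "5", "2CUBED")

replaced <- replace_values(vals, initial, final)
print(replaced)
```

On a data frame this could be used as data$vals <- replace_values(data$vals, initial, final); wrap the result in factor() if the column should stay a factor.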
Upvotes: 0