Reputation: 1032
I have a data frame that looks like this:
C1 C2
0 1
2 -1
1 1
-1 2
0 0
and I want to replace every -1 with 'minus', 0 with 'nc', 1 with 'plus1', and 2 with 'plus2'. I know how to replace the numbers one at a time using 'gsub', but I do not know how to replace them all at once. As an example, for 0 and -1 this is my code:
gsub(df, '0', 'nc');gsub(df, '-1', 'minus')
Thanks in advance,
Upvotes: 1
Views: 2285
Reputation: 887118
If you don't have any values other than the ones specified for conversion, this also works:
lvls <- c('minus', 'nc', 'plus1', 'plus2') #create a vector for specifying the levels of factor.
Convert each column to a factor, specify the labels as lvls, and convert it back to character if you want character columns.
df[] <- lapply(df, function(x) as.character(factor(x, labels=lvls)))
df
# C1 C2
#1 nc plus1
#2 plus2 minus
#3 plus1 plus1
#4 minus plus2
#5 nc nc
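One caveat with the factor approach above: factor(x, labels=lvls) relies on every column containing all four values, otherwise the number of labels no longer matches the number of levels. A minimal sketch that passes the levels explicitly, assuming the only codes are -1, 0, 1 and 2 (the data frame is rebuilt here so the snippet runs on its own):

```r
# Example data from the question
df <- data.frame(C1 = c(0L, 2L, 1L, -1L, 0L),
                 C2 = c(1L, -1L, 1L, 2L, 0L))

lvls <- c('minus', 'nc', 'plus1', 'plus2')

# Assumption: the codes are exactly -1:2. Naming the levels explicitly
# keeps the code-to-label mapping fixed even if a column lacks some codes.
df[] <- lapply(df, function(x) as.character(factor(x, levels = -1:2, labels = lvls)))
df
```

Any value outside -1:2 would become NA with this version rather than shifting the labels.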
Also, in case you want an option with gsub, there is mgsub in qdap, which takes vectors as search terms and replacements.
library(qdap)
pat <- -1:2
replacer <- c('minus', 'nc', 'plus1', 'plus2')
v1 <- mgsub(pat, replacer, as.matrix(df)) #on the original dataset
dim(v1) <- dim(df)
df[] <- v1
df
# C1 C2
#1 nc plus1
#2 plus2 minus
#3 plus1 plus1
#4 minus plus2
#5 nc nc
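If installing qdap is not an option, the same idea can be sketched in base R by looping over the patterns with plain gsub. This is only an illustration, not what mgsub does internally; the anchors ^ and $ are my addition so that '1' does not match inside '-1' or inside an already replaced 'plus1', and `out` is just a name chosen for the result.

```r
# Example data from the question
df <- data.frame(C1 = c(0L, 2L, 1L, -1L, 0L),
                 C2 = c(1L, -1L, 1L, 2L, 0L))

pat <- -1:2
replacer <- c('minus', 'nc', 'plus1', 'plus2')

# Work on a character copy so gsub sees strings
out <- df
out[] <- lapply(df, as.character)

# One anchored gsub pass per search term
for (i in seq_along(pat)) {
  rx <- paste0('^', pat[i], '$')  # e.g. '^-1$' matches the whole cell only
  out[] <- lapply(out, function(x) gsub(rx, replacer[i], x))
}
out
```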
# data used in the examples above
df <- structure(list(C1 = c(0L, 2L, 1L, -1L, 0L), C2 = c(1L, -1L, 1L,
2L, 0L)), .Names = c("C1", "C2"), class = "data.frame", row.names = c(NA,
-5L))
Upvotes: 1
Reputation: 121568
No need to use regular expressions here. Matrix subsetting and replacement within a simple loop is enough. Note that for replacement it is generally better to use a for loop than the *apply family of functions.
from <- -1:2
to <- c('minus', 'nc', 'plus1', 'plus2')
for (i in seq_along(from)) df[df == from[i]] <- to[i]
df
# C1 C2
#1 nc plus1
#2 plus2 minus
#3 plus1 plus1
#4 minus plus2
#5 nc nc
Upvotes: 1
Reputation: 92292
Something like this, maybe? Here I basically create a "legend" once and then use match over the whole data frame in order to replace the values in all the columns.
temp <- data.frame(A = (-1:2), B = c('minus', 'nc', 'plus1', 'plus2'))
df[] <- lapply(df, function(x) temp[match(x, temp$A), "B"])
df
# C1 C2
# 1 nc plus1
# 2 plus2 minus
# 3 plus1 plus1
# 4 minus plus2
# 5 nc nc
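The same "legend" idea can also be sketched with a named character vector instead of a helper data frame. This is a variant, not the code above; `lookup` is a name I made up, and it assumes every value in df is one of -1, 0, 1 or 2 (anything else would come back as NA).

```r
# Example data from the question
df <- data.frame(C1 = c(0L, 2L, 1L, -1L, 0L),
                 C2 = c(1L, -1L, 1L, 2L, 0L))

# Names are the old values (as strings), elements are the new labels
lookup <- c('-1' = 'minus', '0' = 'nc', '1' = 'plus1', '2' = 'plus2')

# Index the lookup by name; unname() drops the leftover names
df[] <- lapply(df, function(x) unname(lookup[as.character(x)]))
df
```

The columns come back as plain character vectors, which avoids the factor columns that a data frame legend can produce in older R versions.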
Upvotes: 3