Reputation: 199
I would like to replace several factor values in a data.frame with a new value (one that is not among the existing levels). For instance:
au1 <- c('deb', 'art', 'deb', 'seb', 'deb', 'deb', 'mar', 'mar', 'joy', 'deb')
au2 <- c('art', 'deb', 'soy', 'deb', 'joy', 'ani', 'deb', 'deb', 'nem', 'mar')
au3 <- c('mar', 'lio', 'mil', 'mar', 'ani', 'lul', 'nem', 'art', 'deb', 'tat')
tata <- data.frame(au1, au2, au3)
I would like to replace every 'deb' and 'joy' with 'XXX'.
I can't find a way to do that. I struggle both with adding a level to a whole data.frame and with using %in% c('', '') on a data.frame.
Any ideas?
Upvotes: 2
Views: 2456
Reputation: 109864
Here's an approach using the NAer function from the qdap package:
library(qdap)
tata[apply(tata, 2, '%in%', c('deb', 'joy'))] <- NA
NAer(tata, "XXX")
##    au1 au2 au3
## 1  XXX art mar
## 2  art XXX lio
## 3  XXX soy mil
## 4  seb XXX mar
## 5  XXX XXX ani
## 6  XXX ani lul
## 7  mar XXX nem
## 8  mar XXX art
## 9  XXX nem XXX
## 10 XXX mar tat
Upvotes: 0
Reputation: 132706
A data.frame is a list. You cannot simply change the levels for a whole list; you need to iterate over its columns and change the levels of each one:
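as.data.frame(
  lapply(tata, function(x) {
    # Rename the matching levels (assumes x is a factor); levels that end up
    # with the same name are merged into one.
    levels(x)[levels(x) %in% c("deb", "joy")] <- "XXX"
    x
  }))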
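For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. It makes two assumptions that are not in the original post: the columns are created as factors explicitly (since R 4.0.0, data.frame() no longer converts strings to factors by default), and the helper name recode_levels is hypothetical, chosen only for illustration.

```r
# Rebuild the question's data with factor columns. stringsAsFactors = TRUE is
# an explicit assumption here; R >= 4.0.0 would otherwise keep character columns.
au1 <- c('deb', 'art', 'deb', 'seb', 'deb', 'deb', 'mar', 'mar', 'joy', 'deb')
au2 <- c('art', 'deb', 'soy', 'deb', 'joy', 'ani', 'deb', 'deb', 'nem', 'mar')
au3 <- c('mar', 'lio', 'mil', 'mar', 'ani', 'lul', 'nem', 'art', 'deb', 'tat')
tata <- data.frame(au1, au2, au3, stringsAsFactors = TRUE)

# Hypothetical helper: rename every level listed in `from` to `to`.
# Non-factor columns are returned unchanged.
recode_levels <- function(x, from, to) {
  if (is.factor(x)) {
    levels(x)[levels(x) %in% from] <- to
  }
  x
}

# Assigning into res[] keeps the data.frame structure and column names.
res <- tata
res[] <- lapply(tata, recode_levels, from = c("deb", "joy"), to = "XXX")
res
```

Because the replacement happens on the levels rather than on the values, there is no need to add 'XXX' as a level beforehand.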
Upvotes: 2
Reputation: 98429
You could use the mapvalues() function from the plyr package. Since you want to do this for multiple columns, you can combine it with sapply(). This solution works if all columns in your data frame are factors.
library(plyr)
xx <- as.data.frame(sapply(tata, mapvalues,
                           from = c("deb", "joy"), to = c("XXX", "XXX")))
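If the columns are character vectors rather than factors (the default for data.frame() since R 4.0.0), a base R sketch along the following lines might be enough. This is an alternative to mapvalues(), not part of plyr, and the name tata_chr is made up for the example.

```r
# Same data as in the question, kept as character columns (an assumption;
# this is what data.frame() produces by default in R >= 4.0.0).
au1 <- c('deb', 'art', 'deb', 'seb', 'deb', 'deb', 'mar', 'mar', 'joy', 'deb')
au2 <- c('art', 'deb', 'soy', 'deb', 'joy', 'ani', 'deb', 'deb', 'nem', 'mar')
au3 <- c('mar', 'lio', 'mil', 'mar', 'ani', 'lul', 'nem', 'art', 'deb', 'tat')
tata_chr <- data.frame(au1, au2, au3, stringsAsFactors = FALSE)

# Apply %in% column by column and overwrite the matching entries.
tata_chr[] <- lapply(tata_chr, function(x) {
  replace(x, x %in% c("deb", "joy"), "XXX")
})
tata_chr
```

With character columns no level has to be added first, so %in% can be used directly on each column.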
Upvotes: 5