jejuba

Reputation: 199

How to replace multiple factor levels in a whole data.frame in R

I would like to replace several factor levels throughout a data.frame with a new value that is not yet one of the levels. For instance:

au1 <- c('deb', 'art', 'deb', 'seb', 'deb', 'deb', 'mar', 'mar', 'joy', 'deb')
au2 <- c('art', 'deb', 'soy', 'deb', 'joy', 'ani', 'deb', 'deb', 'nem', 'mar')
au3 <- c('mar', 'lio', 'mil', 'mar', 'ani', 'lul', 'nem', 'art', 'deb', 'tat')

tata <- data.frame(au1, au2, au3, stringsAsFactors = TRUE)

I would like to replace every 'deb' and 'joy' with 'XXX'.

I can't find a way to do that. I struggle with adding a level to a whole data.frame and with using %in% c('', '') on a data.frame.

Any ideas?

Upvotes: 2

Views: 2456

Answers (3)

Tyler Rinker

Reputation: 109864

Here's an approach using the NAer function from the qdap package:

library(qdap)

tata[apply(tata, 2,  '%in%', c('deb', 'joy'))] <- NA
NAer(tata, "XXX")

##    au1 au2 au3
## 1  XXX art mar
## 2  art XXX lio
## 3  XXX soy mil
## 4  seb XXX mar
## 5  XXX XXX ani
## 6  XXX ani lul
## 7  mar XXX nem
## 8  mar XXX art
## 9  XXX nem XXX
## 10 XXX mar tat
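
If you would rather not depend on qdap, the same idea can be sketched in base R. This is not part of the answer above, just an assumed equivalent: go through a character matrix, replace the matching cells, and convert back. `m` and `res` are names made up for the example, and stringsAsFactors = TRUE is only there so the columns are factors on R >= 4.0 too.

```r
# Example data from the question
au1 <- c('deb', 'art', 'deb', 'seb', 'deb', 'deb', 'mar', 'mar', 'joy', 'deb')
au2 <- c('art', 'deb', 'soy', 'deb', 'joy', 'ani', 'deb', 'deb', 'nem', 'mar')
au3 <- c('mar', 'lio', 'mil', 'mar', 'ani', 'lul', 'nem', 'art', 'deb', 'tat')
tata <- data.frame(au1, au2, au3, stringsAsFactors = TRUE)

# A data.frame of factors becomes a character matrix
m <- as.matrix(tata)

# Replace every matching cell in one go
m[m %in% c("deb", "joy")] <- "XXX"

# Back to a data.frame with factor columns
res <- as.data.frame(m, stringsAsFactors = TRUE)
res
```

This only makes sense when every column holds the same kind of values, since as.matrix() coerces everything to one type.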

Upvotes: 0

Roland

Reputation: 132706

A data.frame is a list, so you cannot change the levels of all its columns in one step. You need to iterate over the columns:

as.data.frame(
  lapply(tata, function(x) {
    levels(x)[levels(x) %in% c("deb", "joy")] <- "XXX"
    x
  }))
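
A small variation on the code above, as a sketch: assigning into `tata[]` modifies the columns in place and keeps the data.frame, so the as.data.frame() wrapper is not needed. The stringsAsFactors = TRUE argument is an assumption added so the columns are factors on R >= 4.0, where the default changed.

```r
# Example data from the question
au1 <- c('deb', 'art', 'deb', 'seb', 'deb', 'deb', 'mar', 'mar', 'joy', 'deb')
au2 <- c('art', 'deb', 'soy', 'deb', 'joy', 'ani', 'deb', 'deb', 'nem', 'mar')
au3 <- c('mar', 'lio', 'mil', 'mar', 'ani', 'lul', 'nem', 'art', 'deb', 'tat')
tata <- data.frame(au1, au2, au3, stringsAsFactors = TRUE)

# tata[] <- ... replaces the columns but keeps the data.frame itself
tata[] <- lapply(tata, function(x) {
  # Renaming both levels to "XXX" merges them into a single level
  levels(x)[levels(x) %in% c("deb", "joy")] <- "XXX"
  x
})

# Inspect the result: all columns are still factors
str(tata)
```

Because only the levels are renamed, this assumes every column is a factor. For character columns, levels(x) is NULL and nothing gets replaced.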

Upvotes: 2

Didzis Elferts

Reputation: 98429

You could use the mapvalues() function from the plyr package. Since you want to do this for multiple columns, you can also use sapply(). This solution works if all columns in your data frame are factors.

library(plyr)
xx <- as.data.frame(sapply(tata, mapvalues,
                           from = c("deb", "joy"), to = c("XXX", "XXX")))
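
A possible variant of the same idea, sketched with lapply() and `tata[]` so that each column stays a factor. The warn_missing = FALSE argument is my addition: it silences the message plyr prints when a `from` value does not occur in a column (au3 has no 'joy'). stringsAsFactors = TRUE is likewise an assumption so the example also runs on R >= 4.0.

```r
library(plyr)

# Example data from the question
au1 <- c('deb', 'art', 'deb', 'seb', 'deb', 'deb', 'mar', 'mar', 'joy', 'deb')
au2 <- c('art', 'deb', 'soy', 'deb', 'joy', 'ani', 'deb', 'deb', 'nem', 'mar')
au3 <- c('mar', 'lio', 'mil', 'mar', 'ani', 'lul', 'nem', 'art', 'deb', 'tat')
tata <- data.frame(au1, au2, au3, stringsAsFactors = TRUE)

# Apply mapvalues() column by column; extra arguments are passed through lapply()
tata[] <- lapply(tata, mapvalues,
                 from = c("deb", "joy"), to = c("XXX", "XXX"),
                 warn_missing = FALSE)

tata
```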

Upvotes: 5
