Jamie

Reputation: 553

Using setDT to collapse multiple cells

I have a language variable in my dataset that looks similar to this (keep in mind there are a lot more languages than shown below):

> dput(dt$LanguageDSC)
c("English", "English", "English", "Portuguese", "English", "English", 
"English", "English", "English", "Mandarin", "English", "English", 
"English", "English", "English", "English", "English", "English", 
"English", "English", "English", "English", "English", "English", 
"English", "English", "English", "English", "English", "English", 
"English", "English", "English", "English", "English", "English", 
"English", "English", "English", "English", "English", "English", 
"English", "Spanish", "English", "English", "English", "English", 
"English", "English", "English", "English", "English", "English", 
"English", "English", "English", "English", "English", "English", 
"English", "English", "English", "English", "English", "English", 
"English", "English", "English", "English", "English", "English", 
"English", "Spanish", "Spanish", "English", "English", "English", 
"English", "English", "English", "English", "English", "English", 
"English", "English", "English", "English", "Arabic", "Spanish", 
"English", "English", "English", "English", "English", "English", 
"English", "English", "English", "English")

Since my dataset has around 30 different languages, I want to collapse some of the language values into fewer categories. I want the following categories:

English
Spanish
Cantonese
Mandarin
Vietnamese 
Other (all other languages)

So far I have this, but it only classifies values as 'English' or 'Other'. How can I modify it to also keep the other 4 languages listed above?

setDT(dt)[!(LanguageDSC == "English"), LanguageDSC := "Other"]

Upvotes: 1

Views: 40

Answers (1)

akrun

Reputation: 887501

We can use %in% with ! to select every language that is not in the set we want to keep, and recode those rows to "Other":

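library(data.table)

# Languages to keep as their own categories
slt_langs <- c("English", "Spanish", "Cantonese",
               "Mandarin", "Vietnamese")

# Recode every language not in slt_langs to "Other" (modifies dt by reference)
setDT(dt)[!(LanguageDSC %in% slt_langs),
          LanguageDSC := "Other"]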
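To check the result, here is a minimal self-contained sketch of the same approach on a small toy table. The toy values and the name keep_langs are illustrative assumptions, not the asker's data:

```r
library(data.table)

# Toy data standing in for the asker's `dt` (values are illustrative only)
dt <- data.frame(
  LanguageDSC = c("English", "Portuguese", "Mandarin", "Spanish",
                  "Arabic", "Cantonese", "Vietnamese", "English")
)

# Hypothetical name for the set of languages to keep as-is
keep_langs <- c("English", "Spanish", "Cantonese", "Mandarin", "Vietnamese")

# Convert to data.table by reference, then overwrite every value
# that is not in keep_langs with "Other"
setDT(dt)[!(LanguageDSC %in% keep_langs), LanguageDSC := "Other"]

# Count rows per collapsed category
dt[, .N, by = LanguageDSC]
```

Note that %in% returns FALSE for NA, so missing values would also be recoded to "Other" with this condition. If that is not wanted, one option is to add !is.na(LanguageDSC) & in front of the negated %in% test.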

Upvotes: 0
