Reputation: 179
Silly example df, "cat":
species  color  tail_length
calico   brown  6
calico   gray   6
tabby    multi  5
tabby    brown  5
Suppose I want to create a new variable, personality. The values here will be recoded based on tail_length, but will also be conditional upon the species and color of the cat. So the ideal final df would look like this:
species  color  tail_length  personality
calico   brown  6            mean
calico   gray   6            nice
tabby    multi  5            mean
tabby    brown  5            nice
At present, I'm using this code:
library(car)
cat$personality <- recode(cat$tail_length, "6='mean'; 5='nice'")
cat$personality[cat$species=="calico" & cat$color=="brown"] <- "mean"
cat$personality[cat$species=="calico" & cat$color=="gray"] <- "nice"
cat$personality[cat$species=="tabby" & cat$color=="multi"] <- "mean"
cat$personality[cat$species=="tabby" & cat$color=="brown"] <- "nice"
My main question is this: is there a simpler way to do this/consolidate these functions into one? Given that I made up this example data on the fly, please take it with a grain of salt when answering. Thanks! As an R beginner, I really appreciate your help.
Upvotes: 3
Views: 2945
Reputation: 109864
Here's one approach using qdap and qdapTools (CRAN packages that I maintain):
library(qdap); library(qdapTools)
key <- list(
mean = c( "calico.gray", "tabby.brown"),
nice = c("calico.brown", "tabby.multi")
)
dat[["personality"]] <- paste2(dat[1:2]) %l% key
dat
## species color tail_length personality
## 1 calico brown 6 nice
## 2 calico gray 6 mean
## 3 tabby multi 5 nice
## 4 tabby brown 5 mean
Basically, you create a key that's a named list based on the combined columns. Then %l% acts as a hash table lookup.
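For readers who would rather not add a package dependency, the same lookup idea can be sketched in base R with a named character vector. This is only an illustration, not part of qdap: the name key_vec is made up here, the sample data is rebuilt from the question, and the mapping copies the key list above.

```r
# Sample data from the question (kept as character, not factor)
dat <- data.frame(
  species     = c("calico", "calico", "tabby", "tabby"),
  color       = c("brown", "gray", "multi", "brown"),
  tail_length = c(6, 6, 5, 5),
  stringsAsFactors = FALSE
)

# Hypothetical named vector: names are "species.color" keys,
# values are the personalities (same mapping as the key list above)
key_vec <- c(
  calico.gray  = "mean",
  tabby.brown  = "mean",
  calico.brown = "nice",
  tabby.multi  = "nice"
)

# Build the combined key for each row, then index the vector by name
combined <- paste(dat$species, dat$color, sep = ".")
dat$personality <- unname(key_vec[combined])
dat
```

Any species/color combination that is missing from key_vec comes back as NA, which makes gaps in the key easy to spot.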
Upvotes: 1
Reputation: 263332
This is really just a merge operation. (Furthermore, you have overspecified the criteria, since species and tail_length are completely dependent. But since it's only an example, that may not be an issue.) Let's say your first dataframe is dat and the criteria dataframe is lookup. Then all you need to do is:
> merge(dat, lookup)
species color tail_length personality
1 calico brown 6 mean
2 calico gray 6 nice
3 tabby brown 5 nice
4 tabby multi 5 mean
Not a very interesting or dramatic result, because it looks just like the lookup dataframe, but give it something a bit larger and:
> merge( rbind(dat,dat,dat) , lookup)
species color tail_length personality
1 calico brown 6 mean
2 calico brown 6 mean
3 calico brown 6 mean
4 calico gray 6 nice
5 calico gray 6 nice
6 calico gray 6 nice
7 tabby brown 5 nice
8 tabby brown 5 nice
9 tabby brown 5 nice
10 tabby multi 5 mean
11 tabby multi 5 mean
12 tabby multi 5 mean
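The two data frames are not defined above, so here is one way they might be built. This is a sketch under assumptions: the contents of lookup are inferred from the desired output in the question, and it is keyed on species and color only (the answer's own lookup may also have carried tail_length).

```r
# Data from the question
dat <- data.frame(
  species     = c("calico", "calico", "tabby", "tabby"),
  color       = c("brown", "gray", "multi", "brown"),
  tail_length = c(6, 6, 5, 5),
  stringsAsFactors = FALSE
)

# Assumed lookup table: one row per species/color combination
lookup <- data.frame(
  species     = c("calico", "calico", "tabby", "tabby"),
  color       = c("brown", "gray", "multi", "brown"),
  personality = c("mean", "nice", "mean", "nice"),
  stringsAsFactors = FALSE
)

# Join on the shared key columns; merge() sorts the result by those keys
result <- merge(dat, lookup, by = c("species", "color"))
result
```

Note that merge() returns the rows sorted by the key columns rather than in their original order, and by default it drops rows of dat that have no match in lookup. Passing all.x = TRUE keeps them, with NA for personality.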
Upvotes: 0
Reputation: 13833
There isn't much you can do here because, at the end of the day, you still need to specify the conditions and new variables to assign.
However, you can cut down on the boilerplate code by using within:
library(car)  # provides recode()
# within() returns a modified copy, so assign the result back
cat <- within(cat, {
personality <- recode(tail_length, "6='mean'; 5='nice'")
personality[species == "calico" & color == "brown"] <- "mean"
personality[species == "calico" & color == "gray"] <- "nice"
personality[species == "tabby" & color == "multi"] <- "mean"
personality[species == "tabby" & color == "brown"] <- "nice"
})
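As a self-contained illustration of the same pattern, the sketch below rebuilds the sample data and swaps car's recode for base R's ifelse, so no package is needed. The name cat_df is made up here (it also avoids shadowing base R's cat function), and only the two combinations that differ from the tail_length default are overridden.

```r
# Sample data from the question; cat_df is a hypothetical name
cat_df <- data.frame(
  species     = c("calico", "calico", "tabby", "tabby"),
  color       = c("brown", "gray", "multi", "brown"),
  tail_length = c(6, 6, 5, 5),
  stringsAsFactors = FALSE
)

cat_df <- within(cat_df, {
  # Default based on tail_length: 6 -> "mean", anything else -> "nice"
  personality <- ifelse(tail_length == 6, "mean", "nice")
  # Override only the combinations that deviate from the default
  personality[species == "calico" & color == "gray"] <- "nice"
  personality[species == "tabby" & color == "multi"] <- "mean"
})
cat_df
```

With only four combinations this saves little, but the default-plus-exceptions layout keeps the number of condition lines down as the rules grow.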
Upvotes: 0