Mist
Mist

Reputation: 1948

Clean data with a mapping table

I am trying to clean data with a mapping table and looking for the best way to achieve that.

Here's some sample data:

df <- data.frame(Q1 = c("Yes", "Yes", "Non"),
                 Q2 = c("Non", "No", "Oui"))

I have a mapping table for converting those answers to TRUE/FALSE

map <- data.frame(answer  = c("Yes", "Oui", "No", "Non"),
                  mapping = c(TRUE, TRUE, FALSE, FALSE))

This is what I want as output

out <- data.frame(Q1 = c(TRUE, TRUE, FALSE),
                  Q2 = c(FALSE, FALSE, TRUE))

I've tried this approach:

out <- merge(df,  map, by.x = "Q1", by.y = "answer", all.x = TRUE)
out <- merge(out, map, by.x = "Q2", by.y = "answer", all.x = TRUE)
out <- out[ , 3:4]
names(out) <- c("Q1", "Q2")

But this seems kind-of clumsy as you create new columns, delete the old and then rename. Is there a better way to do this; like an option to merge into an existing column?

Upvotes: 0

Views: 381

Answers (2)

thelatemail
thelatemail

Reputation: 93813

Or use Map or something similar to loop through df and grab the corresponding output from the mapping variable.

Map(function(x,m,o) o[match(x,m)], df, map["answer"], map["mapping"])
#$Q1
#[1]  TRUE  TRUE FALSE
#
#$Q2
#[1] FALSE FALSE  TRUE

Wrap it all in data.frame if you need to convert the result.

Upvotes: 2

Roman Luštrik
Roman Luštrik

Reputation: 70643

You could try plyr::mapvalues.

> mapvalues(df$Q1, from = map$answer, to = map$mapping)
The following `from` values were not present in `x`: Oui, No
[1] TRUE  TRUE  FALSE
Levels: FALSE TRUE

Upvotes: 0

Related Questions