Reputation: 961
I would like to transform the values of a given column using some mapping function. Example:
df <- data.frame(A = 1:5, B = sample(1:20, 10))
df
A B
1 1 17
2 2 5
3 3 3
4 4 11
5 5 19
6 1 16
7 2 4
8 3 7
9 4 6
10 5 9
My goal is to map the elements of column A as follows:
1 -> "tt"
2 -> "ff"
3 -> "ss"
4 -> "fs"
5 -> "sf"
I have written the following:
mappingList <- c("tt", "ff", "ss", "fs", "sf")
df$A <- unlist(lapply(df$A, function(x){replace(x, x>0, mappingList[x])}))
df
A B
1 tt 17
2 ff 5
3 ss 3
4 fs 11
5 sf 19
6 tt 16
7 ff 4
8 ss 7
9 fs 6
10 sf 9
The code above works fine.
Now let's assume another dataframe where column A is not made of integers 1,2,3,4,5 but rather any other 'generic' items, say:
df <- data.frame(A = paste("str",1:5,sep=""), B = sample(1:20, 10))
or
df <- data.frame(A = seq(5, 25, by=5), B = sample(1:20, 10))
Question: How would you write the mapping?
Upvotes: 0
Views: 595
Reputation: 887058
Try:
mappingList[df$A]
#[1] "tt" "ff" "ss" "fs" "sf" "tt" "ff" "ss" "fs" "sf"
For the two other datasets:
df1 <- data.frame(A = paste("str",1:5,sep=""), B = sample(1:20, 10))
df2 <- data.frame(A = seq(5, 25, by=5), B = sample(1:20, 10))
# factor() is needed when A is a character column (the default since R 4.0.0)
mappingList[as.numeric(factor(df1$A))]
#[1] "tt" "ff" "ss" "fs" "sf" "tt" "ff" "ss" "fs" "sf"
mappingList[as.numeric(factor(df2$A))]
#[1] "tt" "ff" "ss" "fs" "sf" "tt" "ff" "ss" "fs" "sf"
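Note that as.numeric(factor(...)) only works because the sorted factor levels happen to line up with the order of mappingList. A minimal sketch of a more explicit variant uses match() against a vector of the original values. The name keys is an assumption introduced here for illustration, not something from the question:

```r
mappingList <- c("tt", "ff", "ss", "fs", "sf")
# Hypothetical helper vector: the original values, in the same order as mappingList
keys <- seq(5, 25, by = 5)

set.seed(1)  # only so that column B is reproducible
df2 <- data.frame(A = seq(5, 25, by = 5), B = sample(1:20, 10))

# match() returns the position of each element of df2$A within keys;
# values that do not appear in keys come back as NA
mapped <- mappingList[match(df2$A, keys)]
mapped
#[1] "tt" "ff" "ss" "fs" "sf" "tt" "ff" "ss" "fs" "sf"
```

For df1 the same pattern should work with keys <- paste("str", 1:5, sep = "").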
Upvotes: 0
Reputation: 193517
Did you look at factor?
df$A_2 <- factor(df$A, levels = 1:5, labels = c("tt", "ff", "ss", "fs", "sf"))
df
# A B A_2
# 1 1 17 tt
# 2 2 5 ff
# 3 3 3 ss
# 4 4 11 fs
# 5 5 19 sf
# 6 1 16 tt
# 7 2 4 ff
# 8 3 7 ss
# 9 4 6 fs
# 10 5 9 sf
Basically, your levels argument should have the original values to match, and your labels argument should have the replacement values.
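The same idea should carry over to the other two data frames from the question by listing their original values in levels. A sketch, assuming the values are exactly those shown in the question (df1, df2 and new_labels are names chosen here, and B is simplified to 1:10):

```r
new_labels <- c("tt", "ff", "ss", "fs", "sf")

df1 <- data.frame(A = paste("str", 1:5, sep = ""), B = 1:10)
df2 <- data.frame(A = seq(5, 25, by = 5), B = 1:10)

# levels: the original values to match; labels: the replacements, in the same order
df1$A_2 <- factor(df1$A, levels = paste("str", 1:5, sep = ""), labels = new_labels)
df2$A_2 <- factor(df2$A, levels = seq(5, 25, by = 5), labels = new_labels)

# as.character() turns the factor into plain strings if that is what you need
as.character(df1$A_2)
as.character(df2$A_2)
```

Any value of A that is not listed in levels becomes NA, which makes unexpected inputs easy to spot.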
You could also create a look-up table with a named vector.
Example:
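# stringsAsFactors = TRUE: the indexing below relies on A being a factor
# (this was the default before R 4.0.0)
df <- data.frame(A = paste("str",1:5,sep=""), B = sample(1:20, 10), stringsAsFactors = TRUE)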
NamedVec <- setNames(paste("str",1:5,sep=""), c("tt", "ff", "ss", "fs", "sf"))
NamedVec
# tt ff ss fs sf
# "str1" "str2" "str3" "str4" "str5"
NamedVec[df$A]
# tt ff ss fs sf tt ff ss fs sf
# "str1" "str2" "str3" "str4" "str5" "str1" "str2" "str3" "str4" "str5"
names(NamedVec[df$A])
# [1] "tt" "ff" "ss" "fs" "sf" "tt" "ff" "ss" "fs" "sf"
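Since R 4.0.0, data.frame() keeps strings as character by default, and indexing a named vector with a character vector looks elements up by name rather than by position. A sketch of a lookup keyed by the original values, which does not depend on that default (lookup is a name introduced here for illustration):

```r
df <- data.frame(A = paste("str", 1:5, sep = ""), B = 1:10)

# Names are the original values, elements are the replacements
lookup <- setNames(c("tt", "ff", "ss", "fs", "sf"), paste("str", 1:5, sep = ""))

# as.character() forces a lookup by name even if A is a factor;
# unname() drops the names carried over from the lookup vector
df$A_2 <- unname(lookup[as.character(df$A)])
df$A_2
# [1] "tt" "ff" "ss" "fs" "sf" "tt" "ff" "ss" "fs" "sf"
```

The same pattern should also cover the numeric column, because as.character(5) gives "5", so the lookup names would then be as.character(seq(5, 25, by = 5)).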
Upvotes: 2