Reputation: 14222
I want to replace all numbers in a dataframe with words/strings. Each number should always map to the same word, e.g. all instances of the number 5 should be replaced with 'banana', all instances of the number 10 with 'kiwi', and so on.
Here is a sample dataframe. Rownames and colnames are numbers too:
# 1 2 3 4 5 6
#1 7 7 7 7 7 7
#2 5 5 5 5 5 5
#3 4 4 4 4 4 4
#4 8 8 8 8 8 8
#5 1 1 1 1 1 1
#6 2 2 2 2 2 2
#7 6 6 6 6 3 3
#8 3 3 3 3 6 6
#9 10 10 10 10 10 10
#10 11 11 11 11 11 11
#11 12 12 12 12 12 12
#12 9 9 9 9 9 9
Here is the sample data (mydf) for reproducing this:
mydf<-structure(c(7, 5, 4, 8, 1, 2, 6, 3, 10, 11, 12, 9, 7, 5, 4, 8,
1, 2, 6, 3, 10, 11, 12, 9, 7, 5, 4, 8, 1, 2, 6, 3, 10, 11, 12,
9, 7, 5, 4, 8, 1, 2, 6, 3, 10, 11, 12, 9, 7, 5, 4, 8, 1, 2, 3,
6, 10, 11, 12, 9, 7, 5, 4, 8, 1, 2, 3, 6, 10, 11, 12, 9), .Dim = c(12L,
6L), .Dimnames = list(c("1", "2", "3", "4", "5", "6", "7", "8",
"9", "10", "11", "12"), c("1", "2", "3", "4", "5", "6")))
Here is a dataframe (mydata) I constructed showing which number should be replaced with which word/fruit:
mydata <- data.frame(nums = c(1:12))
mydata$fruits<-c("apple", "pear", "orange", "melon", "banana", "grape", "pineapple", "mango", "lemon", "kiwi", "guava", "peach")
I have tried looking through similarly named threads, but they mainly discuss changing certain parts of dataframes (e.g. specific variables or specific observations), not the contents of the whole dataframe.
I tried using multiple gsub commands, but this doesn't work for multiple reasons. I guess I need a function that can be applied across all variables in the df, but I'm not sure which one.
The final result should look something like this:
1 2 3 4 5 6
1 "pineapple" "pineapple" "pineapple" "pineapple" "pineapple" "pineapple"
2 "banana" "banana" "banana" "banana" "banana" "banana"
3 "melon" "melon" "melon" "melon" "melon" "melon"
4 "mango" "mango" "mango" "mango" "mango" "mango"
5 "apple" "apple" "apple" "apple" "apple" "apple"
6 "pear" "pear" "pear" "pear" "pear" "pear"
7 "grape" "grape" "grape" "grape" "orange" "orange"
8 "orange" "orange" "orange" "orange" "grape" "grape"
9 "kiwi" "kiwi" "kiwi" "kiwi" "kiwi" "kiwi"
10 "guava" "guava" "guava" "guava" "guava" "guava"
11 "peach" "peach" "peach" "peach" "peach" "peach"
12 "lemon" "lemon" "lemon" "lemon" "lemon" "lemon"
Ideally, the quote marks would not be visible, though I'm not sure whether this is possible.
Upvotes: 3
Views: 677
Reputation: 99391
replace() might work for you here.
replace(mydf, seq_along(mydf), mydata[[2]][mydf])
# 1 2 3 4 5 6
# 1 "pineapple" "pineapple" "pineapple" "pineapple" "pineapple" "pineapple"
# 2 "banana" "banana" "banana" "banana" "banana" "banana"
# 3 "melon" "melon" "melon" "melon" "melon" "melon"
# 4 "mango" "mango" "mango" "mango" "mango" "mango"
# 5 "apple" "apple" "apple" "apple" "apple" "apple"
# 6 "pear" "pear" "pear" "pear" "pear" "pear"
# 7 "grape" "grape" "grape" "grape" "orange" "orange"
# 8 "orange" "orange" "orange" "orange" "grape" "grape"
# 9 "kiwi" "kiwi" "kiwi" "kiwi" "kiwi" "kiwi"
# 10 "guava" "guava" "guava" "guava" "guava" "guava"
# 11 "peach" "peach" "peach" "peach" "peach" "peach"
# 12 "lemon" "lemon" "lemon" "lemon" "lemon" "lemon"
And it can be wrapped with as.data.frame() to remove quotes if necessary.
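To see why this works: mydata[[2]][mydf] uses the numbers in mydf as positions into the fruit column, and replace() writes the results back into every cell while keeping the dimensions and dimnames. A minimal sketch on a smaller matrix (small_m and lookup are illustrative names and values, not from the question):

```r
# Small stand-ins for mydf and mydata (hypothetical names and values)
small_m <- matrix(c(2, 1, 3, 3, 1, 2), nrow = 3,
                  dimnames = list(c("1", "2", "3"), c("1", "2")))
lookup <- data.frame(nums = 1:3)
lookup$fruits <- c("apple", "pear", "orange")

# Index the fruit column by the cell values, then overwrite every cell.
# replace() keeps the dim and dimnames of small_m; the result is a
# character matrix.
res <- replace(small_m, seq_along(small_m), lookup[[2]][small_m])

# Printing via a data frame shows the values without quotes
res_df <- as.data.frame(res)
print(res_df)
```

Note that this relies on the numbers being valid positions into the fruit column, i.e. the lookup table is sorted and has no gaps.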
Upvotes: 0
Reputation: 110072
Another possible approach:
library(qdapTools)
as.data.frame(apply(mydf, 2, lookup, mydata))
## 1 2 3 4 5 6
## 1 pineapple pineapple pineapple pineapple pineapple pineapple
## 2 banana banana banana banana banana banana
## 3 melon melon melon melon melon melon
## 4 mango mango mango mango mango mango
## 5 apple apple apple apple apple apple
## 6 pear pear pear pear pear pear
## 7 grape grape grape grape orange orange
## 8 orange orange orange orange grape grape
## 9 kiwi kiwi kiwi kiwi kiwi kiwi
## 10 guava guava guava guava guava guava
## 11 peach peach peach peach peach peach
## 12 lemon lemon lemon lemon lemon lemon
Upvotes: 0
Reputation: 42689
As the fruits are in the correct order and are indexed by 1:12, you can use the entries of mydf to index into mydata$fruits:
apply(mydf, 2, function(x) mydata$fruits[x])
If the values are not in the correct order, or do not cover all possible values (have "holes"), you can use a factor to translate:
apply(mydf, 2, function(x) factor(x, levels=mydata$nums, labels=mydata$fruits))
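As a sketch of the "holes" case: suppose the lookup table is out of order and skips some numbers (the values below are made up for illustration). Direct indexing would pick the wrong entries, whereas factor() pairs each level with its label. The as.character() wrapper is an addition here, to make the conversion from factor to string explicit rather than relying on apply() to coerce.

```r
# Hypothetical lookup table: unordered, and 2 and 4 are missing
lookup <- data.frame(nums = c(5, 1, 3))
lookup$fruits <- c("banana", "apple", "orange")

m <- matrix(c(1, 3, 5, 5, 3, 1), nrow = 3,
            dimnames = list(c("1", "2", "3"), c("1", "2")))

# Translate column by column; any value absent from lookup$nums
# would become NA
res <- apply(m, 2, function(x) {
  as.character(factor(x, levels = lookup$nums, labels = lookup$fruits))
})
print(res)
```

Because the function returns an unnamed vector, apply() keeps the column names but drops the row names; they can be restored with rownames(res) <- rownames(m) if needed.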
Upvotes: 0
Reputation: 27408
You can do this with match, which takes a lookup vector (here mydata$nums) and returns, for each element of another vector (here the values in mydf), its position in that lookup vector.
mydf[] <- mydata$fruits[match(mydf, mydata$nums)]
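To make the mechanics concrete, here is a small sketch with made-up values (m and lookup are hypothetical stand-ins): match() gives, for each cell, the row of the lookup table whose number equals that cell, or NA if there is none, and assigning into m[] keeps the dimensions and dimnames.

```r
# Hypothetical lookup table whose keys are not 1, 2, 3, ...
lookup <- data.frame(nums = c(10, 20, 30))
lookup$fruits <- c("kiwi", "guava", "peach")

# 99 has no entry in the lookup table
m <- matrix(c(20, 10, 30, 99), nrow = 2,
            dimnames = list(c("1", "2"), c("1", "2")))

# Position of each cell value within lookup$nums (NA when not found)
pos <- match(m, lookup$nums)

# Overwrite the cells in place; m becomes a character matrix
m[] <- lookup$fruits[pos]
print(m)
```

Unlike direct indexing, this does not require the numbers to be consecutive or sorted, since the lookup goes through the nums column.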
If you coerce to a data.frame, quotes aren't visible when you print the object to screen:
as.data.frame(mydf)
# 1 2 3 4 5 6
# 1 pineapple pineapple pineapple pineapple pineapple pineapple
# 2 banana banana banana banana banana banana
# 3 melon melon melon melon melon melon
# 4 mango mango mango mango mango mango
# 5 apple apple apple apple apple apple
# 6 pear pear pear pear pear pear
# 7 grape grape grape grape orange orange
# 8 orange orange orange orange grape grape
# 9 kiwi kiwi kiwi kiwi kiwi kiwi
# 10 guava guava guava guava guava guava
# 11 peach peach peach peach peach peach
# 12 lemon lemon lemon lemon lemon lemon
Whether or not you coerce to data.frame, you can supply quote=FALSE to write.table or write.csv to prevent quotes appearing around character strings in the exported file.
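A short sketch of the export case, using a temporary file so nothing is written to the working directory (the matrix below is a made-up stand-in for the converted mydf). For on-screen printing of the matrix itself, print(x, quote = FALSE) is a base R alternative to coercing to a data.frame.

```r
# Hypothetical character matrix standing in for the converted mydf
m <- matrix(c("pear", "apple", "kiwi", "guava"), nrow = 2,
            dimnames = list(c("1", "2"), c("1", "2")))

# On screen: suppress the quote marks without changing the object
print(m, quote = FALSE)

# On disk: quote = FALSE is passed through to write.table()
out <- tempfile(fileext = ".csv")
write.csv(m, out, quote = FALSE)

# Inspect the raw lines of the exported file
lines <- readLines(out)
print(lines)
```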
Upvotes: 4