qalis

Reputation: 1533

How to map an arbitrary set of strings to integers and back in R?

In arbitrary data sets there are string columns, e.g. Species in the iris set. I have to convert those to small integers for ML purposes (matrix operations, so numbers only), and reverse the conversion after the calculations. For example: {"setosa" -> 1, "versicolor" -> 2, "virginica" -> 3}.

I iterate through the columns and check the type of each column (the mode of its first element). If it's character (the only problematic mode), I want to get the set of distinct values from that column (e.g. the 3 species in the iris set), map them to consecutive integers (so I'll really have a matrix instead of a data frame), and reverse the mapping after the calculations (e.g. show predicted values in the target set as strings, not my arbitrarily mapped integers). I think I need a list mapping each column index (I don't know in advance which columns will be mapped) to a map (string -> integer) for that particular column.

Upvotes: 2

Views: 1628

Answers (1)

user2554330

Reputation: 44887

Do something like this:

fac <- factor(charvar)
num <- as.numeric(fac)
# Do some manipulation of num, producing newnum
newcharvar <- levels(fac)[newnum]

For example,

> fac <- factor(iris$Species)
> num <- as.numeric(fac)
> head(num)
[1] 1 1 1 1 1 1
> newnum <- num[c(1, 100)]
> newnum
[1] 1 2
> levels(fac)[newnum]
[1] "setosa"     "versicolor"
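
To apply the same idea to every string column of a data frame at once, something along these lines might work. It is only a sketch: `encode_columns` and `decode_column` are hypothetical helper names, not base R functions, and the mappings are keyed by column name rather than by column index, which is an assumption made here for readability.

```r
# Hypothetical helper (not part of base R): replace every character or
# factor column with integer codes and remember the levels of each one.
encode_columns <- function(df) {
  mappings <- list()
  for (col in names(df)) {
    if (is.character(df[[col]]) || is.factor(df[[col]])) {
      fac <- factor(df[[col]])
      mappings[[col]] <- levels(fac)  # position in this vector = integer code
      df[[col]] <- as.integer(fac)
    }
  }
  list(data = df, mappings = mappings)
}

# Hypothetical helper: turn integer codes for one column back into strings.
decode_column <- function(codes, mappings, col) {
  mappings[[col]][codes]
}

enc <- encode_columns(iris)
m <- as.matrix(enc$data)                # every column is numeric now
codes <- enc$data$Species[c(1, 100)]    # stand-in for values computed on m
decode_column(codes, enc$mappings, "Species")
```

The last line gives back "setosa" and "versicolor", matching the example above. Anything computed on `m` has to be a whole number within the range of codes before it is decoded, since it is used as an index into the stored levels.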

Upvotes: 1
