Reputation: 1533
Arbitrary data sets often contain string columns, e.g. Species in the iris data set. I have to convert those to small integers for ML purposes (matrix operations, so numbers only) and reverse the conversion after the calculations. For example: {"setosa" -> 1, "versicolor" -> 2, "virginica" -> 3}.
I iterate through the columns and check the type of each column (the mode of its first element). If it's character (the only problematic mode), I want to get the set of distinct values from that column (e.g. the 3 species in the iris set), map them to consecutive integers (so I'll really have a matrix instead of a data frame), and reverse the mapping after the calculations (e.g. show predicted values in the target set as strings, not my arbitrary mapped integers). I think I need a list that maps each column index (I don't know in advance which columns will be mapped) to a map (string -> integer) for that particular column.
Upvotes: 2
Views: 1628
Reputation: 44887
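Convert the column to a factor, work with its integer codes, and index into the factor's levels to map the codes back to strings: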
fac <- factor(charvar)
num <- as.numeric(fac)
# Do some manipulation of num, producing newnum
newcharvar <- levels(fac)[newnum]
For example, with the iris data set:
> fac <- factor(iris$Species)
> num <- as.numeric(fac)
> head(num)
[1] 1 1 1 1 1 1
> newnum <- num[c(1, 100)]
> newnum
[1] 1 2
> levels(fac)[newnum]
[1] "setosa" "versicolor"
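To cover the "list of per-column mappings" part of the question, the same factor idea could be applied to every string column at once. The following is only a minimal sketch: `encode_columns` and `decode_column` are hypothetical helper names (not part of base R), the mapping list is keyed by column name rather than column index, and the small data frame is made-up illustration data.

```r
# Hypothetical helper: replace every character/factor column with integer
# codes and remember the levels of each converted column.
encode_columns <- function(df) {
  mappings <- list()
  for (name in names(df)) {
    column <- df[[name]]
    if (is.character(column) || is.factor(column)) {
      fac <- factor(column)
      # Position in this vector is the integer code of the string.
      mappings[[name]] <- levels(fac)
      df[[name]] <- as.integer(fac)
    }
  }
  # All columns are numeric now, so this yields a numeric matrix.
  list(data = as.matrix(df), mappings = mappings)
}

# Hypothetical helper: turn integer codes of one column back into strings.
decode_column <- function(codes, mappings, name) {
  mappings[[name]][codes]
}

# Illustration data (assumed, not from the question).
df <- data.frame(length = c(5.1, 7.0, 6.3),
                 species = c("setosa", "versicolor", "virginica"),
                 stringsAsFactors = FALSE)

enc <- encode_columns(df)
decoded <- decode_column(enc$data[, "species"], enc$mappings, "species")
```

Here `enc$data` is the matrix to feed into the calculations, and `enc$mappings` holds one character vector per converted column, so predicted codes can be translated back with `decode_column`. Columns that were never converted simply have no entry in `enc$mappings`.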
Upvotes: 1