Reputation: 679
Consider the data frame
a = c(0, 1, 3, 5, 6, 0, 1, 3, 6, 12)
b = c(letters[5:9], letters[2:6])
c = data.frame(var1 = a, var2 = b)
I want to convert all values in the data frame to consecutive integer factor levels starting from 1
and use these as numeric values in a computation (in reality I don't do this for the letters; I only added them to explain my problem ;) ).
With some help (Converting numeric values of multiple columns to factor levels that are consecutive integers in (descending) order), I did this through:
c[] = lapply(c, function(x) {levels(x) <- 1:length(unique(x)); x})
Unfortunately, this only replaces the values with their respective factor levels for the character column var2, but not for the numeric column var1 (notice the 0 in column var1):
> c
var1 var2
1 0 4
2 1 5
3 3 6
4 5 7
...
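The difference between the two columns can be reproduced on standalone vectors. This is a minimal sketch, assuming var2 is stored as a factor and var1 as a plain numeric vector; x_num and x_fac are made-up names for illustration.

```r
x_num <- c(0, 1, 3)                  # plain numeric vector, like var1
x_fac <- factor(c("e", "f", "g"))    # factor, like var2 (assumed)

# On a non-factor, `levels<-` only attaches a "levels" attribute;
# the stored values stay as they were.
levels(x_num) <- 1:3

# On a factor, `levels<-` relabels the existing levels.
levels(x_fac) <- 1:3

is.factor(x_num)      # still not a factor
as.numeric(x_num)     # original values, attribute dropped
as.character(x_fac)   # relabelled values
```

So the numeric column keeps its original numbers, which would explain the 0 that remains in var1.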
To alleviate the problem, I converted all columns to character when creating c:
c = as.data.frame(sapply(data.frame(var1 = a, var2 = b), as.character))
This yields
var1 var2
1 1 4
2 2 5
3 4 6
4 5 7
5 6 8
6 1 1
7 2 2
8 4 3
9 6 4
10 3 5
The problem here, however, is that the value 12 (c[10,'var1']) in column var1 is treated as the 3rd value (it gets assigned factor level 3, after levels 1 and 2 for values 0 and 1) rather than the last value (factor level 6, because it is the largest numeric value in var1).
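The misplaced 12 can be reproduced in isolation. This is a small sketch, assuming the levels come from sorting the values as character strings; vals is a made-up name.

```r
vals <- c(0, 1, 3, 5, 6, 12)

# Strings are compared character by character, so "12" sorts
# right after "1" and before "3".
as_text <- sort(as.character(vals))
as_text

# Numeric sorting keeps 12 as the largest value.
as_number <- sort(vals)
as_number
```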
Is there a way to assign factor levels based on the numeric ordering while at the same time replacing the numeric values with their factor levels?
Upvotes: 1
Views: 997
Reputation: 886948
Based on the description, it seems the OP wants to change the levels to numeric values starting from 1. This can be done using match:
c[] <- lapply(c, function(x) factor(match(x, sort(unique(x)))))
c
# var1 var2
#1 1 4
#2 2 5
#3 3 6
#4 4 7
#5 5 8
#6 1 1
#7 2 2
#8 3 3
#9 5 4
#10 6 5
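To see what the match call does for a single column, the steps can be unrolled as in the sketch below. The names x, lookup and codes are only for illustration, and the last line is an alternative that should give the same codes for a numeric vector, not something taken from the question.

```r
x <- c(0, 1, 3, 5, 6, 0, 1, 3, 6, 12)

# Distinct values in numeric order: 0 1 3 5 6 12
lookup <- sort(unique(x))

# For each element of x, its position in lookup
codes <- match(x, lookup)
codes

# Alternative sketch: factor() on a numeric vector also orders
# its levels numerically, so the integer codes agree.
alt <- as.integer(factor(x))
```

If the result is needed for arithmetic, as the question suggests, leaving out the outer factor() keeps codes as a plain integer vector.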
Data used:
a <- c(0, 1, 3, 5, 6, 0, 1, 3, 6, 12)
b <- c(letters[5:9], letters[2:6])
c <- data.frame(var1 = a, var2 = b)
Based on the code in the comments, another option to replace str_pad is:
c <- data.frame(var1 = sprintf("%02d", a), var2=b, stringsAsFactors=FALSE)
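Zero-padding helps because the padded strings sort in the same order as the numbers. The sketch below assumes the values are non-negative whole numbers below 100, so a width of 2 is enough; padded and codes are made-up names.

```r
a <- c(0, 1, 3, 5, 6, 0, 1, 3, 6, 12)

# Pad to two digits: 0 becomes "00", 12 stays "12"
padded <- sprintf("%02d", a)

# Character-sorted levels now follow the numeric order
padded_levels <- levels(factor(padded))
padded_levels

# Integer codes derived from those levels
codes <- as.integer(factor(padded))
codes
```

Larger values would need a wider format such as "%03d".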
Upvotes: 2