Reputation: 11
I have a dataframe like this:
from to func
94019567899392 94019567898224 f1
94019567899392 94019567896800 f1
94019567900608 94019567899392 f4
Now I want to map the numeric values to something simpler. The mapping needs to be consistent across the first two columns, so the same original value always gets the same new value:
from to func
1 3 f1
1 4 f1
2 1 f4
Or to characters (I don't mind which):
from to func
A C f1
A D f1
B A f4
How can I do that in R?
Upvotes: 0
Views: 710
Reputation: 2170
Sounds like the factor format is what you are looking for. If you convert a vector to a factor, all the unique values are turned into 'levels', which are represented under the hood as integers. Converting the factor to numeric exposes those integer codes:
> bigNumbers <- c(94019567899392, 94019567898224,
+ 94019567899392, 94019567896800,
+ 94019567900608, 94019567899392)
> factor(bigNumbers)
[1] 94019567899392 94019567898224 94019567899392 94019567896800 94019567900608 94019567899392
Levels: 94019567896800 94019567898224 94019567899392 94019567900608
> as.numeric(factor(bigNumbers))
[1] 3 2 3 1 4 3
As mentioned in the comments, applying this to each column separately doesn't give consistent codes across columns, since each column gets its own set of levels.
Instead, take the part of the data.frame that we want to convert and turn it into a matrix, so that all values are pooled. Then do the factor -> numeric transformation on the whole matrix and place the result back into the data.frame.
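To make the problem concrete, here is a small sketch using the values from the question. The vector names from, to and sharedLevels are only illustrative. Converting each column on its own gives the same original value different codes, while passing one shared set of levels to factor() keeps the codes aligned:

```r
# Values copied from the question's first two columns
from <- c(94019567899392, 94019567899392, 94019567900608)
to   <- c(94019567898224, 94019567896800, 94019567899392)

# Converted separately, each vector gets its own levels
fromAlone <- as.integer(factor(from))  # 1 1 2
toAlone   <- as.integer(factor(to))    # 2 1 3
# 94019567899392 is coded 1 in 'from' but 3 in 'to'

# One shared set of levels (sorted unique values of both vectors)
sharedLevels <- sort(unique(c(from, to)))
fromShared <- as.integer(factor(from, levels = sharedLevels))  # 3 3 4
toShared   <- as.integer(factor(to,   levels = sharedLevels))  # 2 1 3
```

With shared levels, 94019567899392 is coded 3 in both vectors.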
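# Example data: x and y share one set of IDs; z must stay untouched
x <- data.frame(x = c(94019567899392, 94019567899392, 94019567900608),
                y = c(94019567898224, 94019567896800, 94019567899392),
                z = 1:3)
convertedColumns <- 1:2
# Pool the selected columns so that factor() sees every value at once
toConvert <- as.matrix(x[, convertedColumns])
result <- matrix(as.numeric(factor(toConvert)), ncol = length(convertedColumns))
# Write the codes back; index result by position so any column selection works
for (i in seq_along(convertedColumns)) {
  x[[convertedColumns[i]]] <- result[, i]
}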
x
x y z
1 3 2 1
2 3 1 2
3 4 3 3
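Note that factor() numbers the values in sorted order, so the codes above differ from the ones shown in the question. If the numbering should follow order of first appearance (first column first), one possible alternative is match() against the unique pooled values. This is only a sketch: the names df, idCols, ids, simple and lettered are made up for the example, and the LETTERS step only works for up to 26 distinct values.

```r
# Example data copied from the question
df <- data.frame(
  from = c(94019567899392, 94019567899392, 94019567900608),
  to   = c(94019567898224, 94019567896800, 94019567899392),
  func = c("f1", "f1", "f4")
)

idCols <- c("from", "to")  # columns that share one set of IDs

# Distinct IDs in order of first appearance, reading 'from' before 'to'
ids <- unique(unlist(df[idCols], use.names = FALSE))

# Replace every ID by its position in 'ids'
simple <- df
simple[idCols] <- lapply(df[idCols], match, table = ids)
# simple$from is 1 1 2 and simple$to is 3 4 1

# Optional: letters instead of integers (at most 26 distinct IDs)
lettered <- simple
lettered[idCols] <- lapply(simple[idCols], function(i) LETTERS[i])
# lettered$from is "A" "A" "B" and lettered$to is "C" "D" "A"
```

For this data the result matches both target tables in the question.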
Upvotes: 1