Reputation: 355
I have a dataframe of values for multiple variables, and I want to replace every numeric value with a letter that labels the numeric range it falls into. I do NOT want equal-width ranges, so cut() is not an option as far as I understand.
In the following code, if I generate the dataframe and then run any one or two of the replacement commands, they do exactly what I want them to do. But when I run them all together, the final table populates with all "f" values.
#Generate test dataframe
test1 <- data.frame(replicate(10, sample(0:1000, 100, replace = TRUE)))
#Duplicate dataframe so you can go back and reality check category labels against original data
test<-data.frame(test1)
#These are my replacement commands
test[test <10] <- "a"
test[test >=10 & test <25] <- "b"
test[test >=25 & test <50] <- "c"
test[test >=50 & test <100] <- "d"
test[test >=100 & test <500] <- "e"
test[test >=500] <- "f"
Run any single replacement command on its own and you'll see the matching values replaced with the corresponding letter. All I want is for this to happen to all values, in all columns, of this dataset. The ultimate purpose is to create a frequency table of the variables by the specified ranges.
Upvotes: 1
Views: 58
Reputation: 887153
We can use cut() to create the labels by specifying the breaks. For multiple columns, use lapply() from base R to loop over the columns, apply cut(), and assign the result back to the dataset of interest.
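# Run this on the original numeric data, not after the replacement commands above.
# right = FALSE closes each interval on the left, e.g. [10, 25), matching the >= and < conditions in the question.
test[] <- lapply(test, function(x)
  cut(x, breaks = c(-Inf, 10, 25, 50, 100, 500, Inf), labels = letters[1:6], right = FALSE))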
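As for why the original commands end in all "f": a likely explanation is that the first replacement turns each affected column into character, so the later conditions compare strings instead of numbers. Here is a minimal sketch of that, followed by the frequency table the question asks for. The data frame df and its values are made up for illustration, and right = FALSE is assumed so that the bins match the >= and < conditions in the question.

```r
# Hypothetical toy data; `df` and its values are illustrative only.
df <- data.frame(x = c(5, 10, 30, 600), y = c(24, 25, 99, 100))

# 1) Why the sequential replacements fail: assigning a letter converts the
#    affected column to character, so later comparisons are string
#    comparisons rather than numeric ones.
broken <- df
broken[broken < 10] <- "a"
class(broken$x)   # "character": x contained a value below 10
class(broken$y)   # "numeric": y had no value below 10, so it was untouched

# 2) Binning the numeric data in a single pass avoids the problem.
#    right = FALSE gives left-closed intervals such as [10, 25).
breaks <- c(-Inf, 10, 25, 50, 100, 500, Inf)
binned <- df
binned[] <- lapply(df, function(col)
  cut(col, breaks = breaks, labels = letters[1:6], right = FALSE))

# 3) Frequency table per column. Every column shares the levels a-f,
#    so the result is a matrix with one row per label.
freq <- sapply(binned, table)
freq
```

Because cut() returns factors with the same six levels in every column, labels that never occur still appear in the table with a count of 0.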
Upvotes: 2