Reputation: 568
Example:
> x <- c(1e6,2e6,500e3,10e6,233e3,50e3)
> cut(x,c(0,250e3,500e3,1e6,2.5e6,99e6),dig.lab=10)
[1] (500000,1000000] (1000000,2500000] (250000,500000] (2500000,99000000] (0,250000]
[6] (0,250000]
Levels: (0,250000] (250000,500000] (500000,1000000] (1000000,2500000] (2500000,99000000]
Is there a way I can use the cut function with custom labels? I would like it to show labels such as: (0,250k],(250k,500k],(500k,1m],(1m,2.5m],(2.5m,inf]
Is there a way to do this?
Upvotes: 1
Views: 1642
Reputation: 160447
As @user20650 said, labels=
will allow you to do it manually (and for that, it is a dupe question of Adding labels to cut function in R).
If you want a programmatic way to convert levels to this SI (or IEC) format, we can borrow from utils:::format.object_size
and define our own function:
#' Kilo, Mega, Giga
#'
#' Convert numbers to SI or IEC format.
#'
#' @param x numeric
#' @param standard one of "SI" (1000-based) or "IEC" (1024)
#' @param digits number of significant digits to round to
#' @param sep character, what to put between the number and the letter
#' @param suffix character what to put immediately after the K/M/G
#' (e.g., "b" or "B")
#' @return character
#' @export
KMG <- function(x, standard = "SI", digits = 1L, sep = "", suffix = "") {
known_bases <- c(legacy = 1024, IEC = 1024, SI = 1000)
known_units <- list(SI = c("", "k", "M", "G", "T", "P",
"E", "Z", "Y"), IEC = c("", "Ki", "Mi", "Gi",
"Ti", "Pi", "Ei", "Zi", "Yi"), legacy = c("", "K",
"M", "G", "T", "P"))
standard <- match.arg(standard, c("auto", names(known_bases)))
if (is.null(digits))
digits <- 1L
base <- known_bases[[standard]]
units_map <- known_units[[standard]]
powers <- rep(0L, length(x))
powers[x > 0] <- pmin(as.integer(log(x[x > 0], base = base)), length(units_map) - 1L)
units <- paste0(units_map[powers + 1L], suffix)
paste(round(x/base^powers, digits = digits), units, sep = sep)
}
(Edited to vectorize the function and remove "LEGACY"
, and add a suffix=
argument.)
Its use:
x <- c(1e6,2e6,500e3,10e6,233e3,50e3)
cuts <- cut(x, c(0,250e3,500e3,1e6,2.5e6,99e6), dig.lab=10)
lvls <- levels(cuts)
lvls
# [1] "(0,250000]" "(250000,500000]" "(500000,1000000]" "(1000000,2500000]" "(2500000,99000000]"
gre <- gregexpr("[[:digit:].]+", lvls)
regmatches(lvls, gre) <-
lapply(regmatches(lvls, gre), function(lvl) KMG(as.numeric(lvl)))
lvls
# [1] "(0,250k]" "(250k,500k]" "(500k,1M]" "(1M,2.5M]" "(2.5M,99M]"
levels(cuts) <- lvls
cuts
# [1] (500k,1M] (1M,2.5M] (250k,500k] (2.5M,99M] (0,250k] (0,250k]
# Levels: (0,250k] (250k,500k] (500k,1M] (1M,2.5M] (2.5M,99M]
Upvotes: 3