Museful

Reputation: 6959

How to format numbers for minimal string length

For graphical annotations with tight space constraints, I would like to format numbers in a way that minimizes each number's representation string length. For example, the powers of 10 should be formatted like this:

as.character seems to do almost this, but unfortunately it puts a redundant leading zero in single-digit exponents, and it inserts a redundant '+' before positive exponents.

> as.character(10^(-5:5))
 [1] "1e-05" "1e-04" "0.001" "0.01"  "0.1"   "1"     "10"    "100"   "1000"  "10000" "1e+05"

So instead of 1e5, for example, we get 1e+05, which is almost double in length.

Upvotes: 1

Views: 168

Answers (1)

Simon O'Hanlon

Reputation: 59980

How about using a regex to remove the unwanted characters...

gsub( "\\+|(?<=\\+|\\-)0" , "" , 10^(-5:5) , perl = TRUE )
#[1] "1e-5"  "1e-4"  "0.001" "0.01"  "0.1"   "1"     "10"    "100"   "1000" 
#[10] "10000" "1e5"
  • \\+ removes the +
  • (?<=...)0 uses a zero-width look-behind assertion: it removes a 0 only when that 0 is preceded by whatever is in ..., in this case \\+|\\-, which is either + or -

The | operator is alternation: a match on either pattern is replaced. The "" in the second argument of gsub replaces each match with nothing.
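To see what each half of the pattern contributes, here is a small sketch that applies the two alternatives separately and then together. The input strings are just literals chosen for illustration (the "-0.5" entry is an added assumption, included to show a side effect on negative decimals).

```r
# Literal strings like those as.character() produced in the question,
# plus "-0.5" (an illustrative extra input, not from the question)
x <- c("1e-05", "1e+05", "0.001", "10000", "-0.5")

# First alternative only: drop every '+'
plus_only <- gsub("\\+", "", x)

# Second alternative only: drop a '0' that directly follows '+' or '-'
zero_only <- gsub("(?<=\\+|\\-)0", "", x, perl = TRUE)

# Both alternatives combined, as in the answer
both <- gsub("\\+|(?<=\\+|\\-)0", "", x, perl = TRUE)

print(plus_only)
print(zero_only)
print(both)
```

Note that the look-behind also fires on the 0 in "-0.5", turning it into "-.5", so this pattern is best suited to non-negative inputs.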

EDIT: Building on ideas raised in the discussion, here is a ready-to-go solution:

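formatBrief <- function(x){
    # Save the caller's options and restore them on exit,
    # rather than forcing scipen back to 0
    old <- options(scipen=-5)
    on.exit(options(old))
    # Scientific form: strip '+' and leading zeros from the exponent
    sci <- gsub( "(?<=e)\\+?0*|(?<=e-)0*" , "" , x , perl=TRUE)
    # Fixed-point form
    options(scipen=5)
    fp <- as.character(x)
    # Keep whichever representation is shorter
    return(ifelse(nchar(sci)<nchar(fp),sci,fp))
}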

> formatBrief(10^(-5:5))
 [1] "1e-5" "1e-4" "1e-3" "0.01" "0.1"  "1"    "10"   "100"  "1e3"  "1e4"  "1e5" 
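If you would rather not touch global options at all, a variant along the following lines might work. It is only a sketch: formatBrief2 is a hypothetical name, and it assumes that calling format() on one value at a time with scientific = TRUE or FALSE gives the two candidate strings, so it does not rely on how as.character() reacts to scipen.

```r
# Hypothetical alternative to formatBrief(); leaves options() untouched.
formatBrief2 <- function(x) {
    one <- function(v) {
        # Scientific form, e.g. "1e+05", then strip '+' and leading exponent zeros
        sci <- format(v, scientific = TRUE)
        sci <- gsub("(?<=e)\\+?0*|(?<=e-)0*", "", sci, perl = TRUE)
        # Fixed-point form, e.g. "100000"
        fp <- format(v, scientific = FALSE)
        # Keep the shorter string; ties go to the fixed-point form
        if (nchar(sci) < nchar(fp)) sci else fp
    }
    # Format element by element so format() does not pad to a common width
    vapply(x, one, character(1))
}

res <- formatBrief2(10^(-5:5))
print(res)
```

Formatting one element at a time is slower than a vectorised call, but it avoids the common-width padding that format() applies to whole vectors.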

Upvotes: 2
