SlightlyBuilt

Reputation: 147

Convert integers to decimal values

I have a set of integer data in the range 1:10000. I need to bring them into the range 0:1.

For example, converting 2 to 0.2, 14 to 0.14, 128 to 0.128, etc. (note that I don't want to scale the values).

Any suggestions on how to do this for all the data at once?

Upvotes: 5

Views: 274

Answers (2)

Rich Scriven

Reputation: 99371

The non-mathy way would be to prepend a decimal point with paste0(), then coerce back to numeric.

x <- c(2, 14, 128, 1940, 140, 20000)
as.numeric(paste0(".", x))
# [1] 0.200 0.140 0.128 0.194 0.140 0.200

Update 1: There was some interest in the timings of the two originally posted methods. According to the following benchmarks, they are about the same.

library(microbenchmark)

x <- 1:1e5
microbenchmark(
    david   = { david <- x/10^nchar(x) },
    richard = { richard <- as.numeric(paste0(".", x)) }
)
# Unit: milliseconds
#     expr      min       lq     mean   median       uq       max neval
#    david 88.94391 89.18379 89.70962 89.40736 89.71012  99.68126   100
#  richard 87.89776 88.17234 89.38383 88.44439 88.77052 105.06066   100

identical(richard, david)
# [1] TRUE

Update 2: I also remembered that sprintf() is often faster than paste0(), so we can use the following instead.

as.numeric(sprintf(".%d", x))

Using the same x as above and comparing only these two choices, sprintf() shows a clear improvement in timing over paste0(), as shown below.

microbenchmark(
     paste0 = as.numeric(paste0(".", x)),
    sprintf = as.numeric(sprintf(".%d", x))
)
# Unit: milliseconds
#      expr      min       lq     mean   median       uq      max neval
#    paste0 87.89413 88.41606 90.25795 88.82484 89.65674 107.8080   100
#   sprintf 61.16524 61.23328 62.26202 61.29192 61.48316  79.1202   100
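One thing to watch with the paste0() route: under R's default options("scipen") setting, some large round numbers are converted to character in scientific notation (1e5 becomes "1e+05"), which would change the pasted string. The following is only a sketch, assuming x holds non-negative whole numbers; the value 100000 and the names digits and result are illustrative additions, not part of the original answer.

```r
# Assumption: x holds non-negative whole numbers. 100000 is included because
# R may convert it to "1e+05" when the default scipen setting is in effect.
x <- c(2, 14, 128, 100000)

# format() with scientific = FALSE writes plain digits only;
# trim = TRUE drops the left padding format() adds to align widths.
digits <- format(x, scientific = FALSE, trim = TRUE)

# Prepend the decimal point and coerce back to numeric, as in the answer.
result <- as.numeric(paste0(".", digits))
print(result)
```

The sprintf(".%d", x) form shown above writes plain digits as well, so it should not need this extra step.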

Upvotes: 10

David Arenburg

Reputation: 92300

I would simply do

x <- c(2, 14, 128, 1940, 140, 20000)
x/10^nchar(x)
## [1] 0.200 0.140 0.128 0.194 0.140 0.200

But a much faster approach (which avoids the conversion to character) was suggested by @Frank:

x/10^ceiling(log10(x))
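A caveat worth checking before relying on the log10() version: for exact powers of ten, ceiling(log10(x)) is one less than the number of digits, so 1, 10, 100 and so on come out as 1 rather than 0.1. The sketch below illustrates this and shows one possible variant using floor(log10(x)) + 1. The variant and the names via_ceiling and via_floor are assumptions added for illustration, not something from the original answer, and it is only meant for values of the size in the question (1:10000).

```r
# Exact powers of ten are where the two digit counts disagree.
x <- c(1, 10, 100, 128, 1000)

# Approach from the answer: ceiling(log10(x)) is 0, 1, 2, 3, 3 here,
# so 1, 10, 100 and 1000 are each divided by themselves.
via_ceiling <- x / 10^ceiling(log10(x))

# Hypothetical variant: floor(log10(x)) + 1 is 1, 2, 3, 3, 4 here,
# i.e. the digit count, so powers of ten map to 0.1.
via_floor <- x / 10^(floor(log10(x)) + 1)

print(via_ceiling)
print(via_floor)
```

For inputs that are never exact powers of ten, the two expressions give the same result.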

Benchmark

library(microbenchmark)

set.seed(123)
x <- sample(1e8, 1e6)

microbenchmark(
  david = x/10^nchar(x),
  davidfrank = x/10^ceiling(log10(x)),
  richard1 = as.numeric(paste0(".", x)),
  richard2 = as.numeric(sprintf(".%d", x))
)

# Unit: milliseconds
#       expr       min        lq      mean    median        uq       max neval cld
#      david  691.0513  822.6482 1052.2473  956.5541 1153.4779 2391.7856   100  b 
# davidfrank  130.0522  164.3227  255.8397  197.3158  339.3224  576.2255   100 a  
#   richard1 1130.5160 1429.8314 1972.2624 1689.8454 2473.6409 4791.0558   100   c
#   richard2  712.8357  926.8013 1181.5349 1103.1661 1315.4459 2753.6795   100  b 
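Before comparing timings, it can help to confirm that the four expressions agree on the data of interest. A minimal sketch on the small example vector from above follows; the names results and agree are illustrative, and all.equal() is used instead of identical() so that tiny floating-point differences between string parsing and division would not count as a mismatch.

```r
x <- c(2, 14, 128, 1940, 140, 20000)

# The four expressions from the benchmark, evaluated once each.
results <- list(
  david      = x / 10^nchar(x),
  davidfrank = x / 10^ceiling(log10(x)),
  richard1   = as.numeric(paste0(".", x)),
  richard2   = as.numeric(sprintf(".%d", x))
)

# Compare every other result against the first, within numeric tolerance.
agree <- vapply(
  results[-1],
  function(r) isTRUE(all.equal(r, results$david)),
  logical(1)
)
print(agree)
```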

Upvotes: 10
