Tetlanesh

Reputation: 403

Numbers formatted in R to look better (thousands separators, etc) are no longer numbers?

Let me start by saying that I'm new to R. With that out of the way, let's get to it.

Let's say I have a numeric vector:

> eve
 [1] 208999990 208999990 208999995 208999997 208999998 209499990 209499999 209999986 209999997 210000000 210000000 210000000 210000000 210000000 210000000
[16] 217000000 217998986 217998988 218000000 218500000 218999994 218999997 218999998 223999900 223999900 223999945 223999945 223999945 223999999 224199999
[31] 224199999 224199999 224199999 224200000 224799999 224999977 225100004 226998997 226998998 226998998 226999999 227000000 227000000 227000000 227000000
[46] 227000000 227399967 227700100 227798981 228199990 229199988 230000000 230278899 234388500 234388582 235000000 235999999 236388592 236388593 236388599
[61] 236388599 236388599 236388599 236388599 236388600 236388655 236989874 238388583 244000000 246992877 247992884 247997972 247997979 250000000 250000000
[76] 250000000 255000000 261000000 265000000 280000000 285000000

I'm using formatC to make it look more readable/pretty and actually see the digits after the decimal place (default R behaviour seems to hide decimal places for large numbers?):

> eve<-formatC(eve, decimal.mark=",", big.mark=" ", digits = 2, format = "f")
> eve
 [1] "208 999 989,99" "208 999 990,00" "208 999 994,99" "208 999 997,00" "208 999 998,00" "209 499 989,99" "209 499 999,00" "209 999 985,99" "209 999 996,99"
[10] "209 999 999,89" "209 999 999,92" "209 999 999,93" "209 999 999,95" "209 999 999,98" "209 999 999,99" "216 999 999,97" "217 998 985,77" "217 998 987,55"
[19] "218 000 000,00" "218 500 000,00" "218 999 994,00" "218 999 997,00" "218 999 997,99" "223 999 900,00" "223 999 900,00" "223 999 944,65" "223 999 944,72"
[28] "223 999 944,95" "223 999 998,99" "224 199 998,59" "224 199 998,69" "224 199 998,77" "224 199 998,80" "224 199 999,93" "224 799 998,77" "224 999 976,99"
[37] "225 100 004,00" "226 998 996,78" "226 998 997,88" "226 998 997,99" "226 999 998,98" "227 000 000,00" "227 000 000,00" "227 000 000,00" "227 000 000,00"
[46] "227 000 000,00" "227 399 966,91" "227 700 099,99" "227 798 980,71" "228 199 990,00" "229 199 987,98" "230 000 000,00" "230 278 898,81" "234 388 500,00"
[55] "234 388 582,00" "235 000 000,00" "235 999 999,00" "236 388 591,91" "236 388 592,78" "236 388 598,93" "236 388 598,94" "236 388 598,95" "236 388 598,96"
[64] "236 388 598,97" "236 388 600,00" "236 388 655,00" "236 989 873,90" "238 388 582,81" "244 000 000,00" "246 992 877,00" "247 992 884,00" "247 997 972,00"
[73] "247 997 978,98" "249 999 999,99" "250 000 000,00" "250 000 000,00" "254 999 999,99" "261 000 000,00" "264 999 999,99" "280 000 000,00" "285 000 000,00"

The problem is that I can't do any numerical operations on that vector anymore due to it being of character type:

> class(eve)
[1] "character"
> typeof(eve)
[1] "character"

Is there a way in R to keep numbers displayed in a neat format and still be able to run numerical operations on them?

I know I can run all operations on the original vector and only display formatted values when needed via formatting functions, but that seems like a waste of time to me. When looking at numbers, especially large ones, you often lose sight of the actual values they represent. You can't tell how big a number is unless you count the exact number of digits, and you can't tell whether one value is 10 times bigger or smaller than another, because it's all blurred without proper formatting.

Upvotes: 2

Views: 5002

Answers (1)

Roland

Reputation: 132706

You need to understand the difference between the internal representation of an object and how it is printed.
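
To make that distinction concrete, here is a minimal sketch (the vector `x` and the name `shown` are made up for illustration). It keeps the numeric vector untouched and stores the formatted strings separately, so only the display copy is of type character. It also uses the `digits` argument of `print()`, since by default R prints about 7 significant digits, which is likely why the decimals appeared to be hidden.

```r
# Hypothetical example data; not the vector from the question
x <- c(208999990.5, 209499999, 210000000)

# Formatting returns a NEW character vector; x itself is not modified
shown <- formatC(x, decimal.mark = ",", big.mark = " ", digits = 2, format = "f")

class(x)      # still numeric
class(shown)  # character, for display only

# Arithmetic keeps working on the original vector
avg <- mean(x)

# More significant digits for one print call, without changing the data
print(x, digits = 12)
```

The point is that overwriting `eve` with the result of `formatC()` is what discarded the numbers; assigning the formatted result to a different name avoids that.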

I don't agree that defining a class is overkill here:

num <- c(208999990, 308999990, 408999995)
class(num) <- c("niceprint", class(num))

#print method: shows the formatted strings, returns the numeric object unchanged
print.niceprint <- function(x, decimal.mark=",", big.mark=" ", digits = 2, ...) {
  print(formatC(unclass(x), decimal.mark=decimal.mark, big.mark=big.mark, digits = digits, format = "f"))
  invisible(x)
}

#for printing in data.frames
format.niceprint <- function(x, decimal.mark=",", big.mark=" ", digits = 2, ...) {
  formatC(unclass(x), decimal.mark=decimal.mark, big.mark=big.mark, digits = digits, format = "f")
}

num
#[1] "208 999 990,00" "308 999 990,00" "408 999 995,00"

data.frame(x=num, y=2*num)
#                 x                y
#1 208 999 990,0000 417 999 980,0000
#2 308 999 990,0000 617 999 980,0000
#3 408 999 995,0000 817 999 990,0000

#a matrix
t(num)
#     [,1]             [,2]             [,3]            
#[1,] "208 999 990,00" "308 999 990,00" "408 999 995,00"
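
As a quick check that the classed vector is still numeric underneath, the following sketch re-creates `num` and runs a few operations on it. The names `total`, `doubled` and `plain` are only illustrative. As far as I know, element-wise arithmetic such as `num * 2` carries the class attribute along, while summary functions such as `sum()` return a plain number, so treat the comments as expectations rather than guarantees.

```r
# Re-create the classed vector so this snippet runs on its own
num <- c(208999990, 308999990, 408999995)
class(num) <- c("niceprint", class(num))

typeof(num)       # storage mode is still "double"
is.numeric(num)   # numeric operations are still allowed

total   <- sum(num)      # a plain numeric value
doubled <- num * 2       # element-wise arithmetic; class attribute is kept
plain   <- unclass(num)  # strip the class to get the bare numbers back

inherits(doubled, "niceprint")
```

If the pretty printing ever gets in the way, `unclass(num)` gives back the ordinary numeric vector.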

Upvotes: 4
