Reputation: 243
I am having trouble getting the desired number of decimal places from summarise. Here is a simple example:
test2 <- data.frame(c("a","a","b","b"), c(245,246,247,248))
library(dplyr)
colnames(test2) <- c("V1","V2")
group_by(test2,V1) %>% summarise(mean(V2))
The dataframe is:
V1 V2
1 a 245
2 a 246
3 b 247
4 b 248
The output is:
V1 `mean(V2)`
<fctr> <dbl>
1 a 246
2 b 248
I would like it to give me the means including the decimal place (i.e. 245.5 and 247.5)
Upvotes: 23
Views: 36637
Reputation: 4879
Here is one solution:
test2 <- data.frame(c("a", "a", "b", "b"), c(245, 246, 247, 248))
library(dplyr)
colnames(test2) <- c("V1", "V2")
group_by(test2, V1) %>%
dplyr::summarise(mean(V2)) %>%
dplyr::mutate_if(is.numeric, format, 1)
#> # A tibble: 2 x 2
#> V1 `mean(V2)`
#> <fct> <chr>
#> 1 a 245.5
#> 2 b 247.5
Created on 2018-01-20 by the reprex package (v0.1.1.9000).
If you want to keep the column numeric:
test2 <- data.frame(c("a", "a", "b", "b"), c(245, 246, 247, 248))
library(dplyr)
colnames(test2) <- c("V1", "V2")
group_by(test2, V1) %>%
dplyr::summarise(mean(V2)) %>%
as.data.frame(.) %>%
dplyr::mutate_if(is.numeric, round, 1)
This gives:
V1 mean(V2)
1 a 245.5
2 b 247.5
And with another example (from @Matifou):
tab <- tibble(x = c(0.1234, 1.1234, 10.1234, 100.1234, 1000.1234))
tab %>%
as.data.frame(.) %>%
dplyr::mutate_if(is.numeric, round, 2)
This gives:
x
1 0.12
2 1.12
3 10.12
4 100.12
5 1000.12
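A variation on the above, as a minimal sketch: the rounding can also be done inside summarise() itself, which avoids the separate mutate_if() step. The column name mean_V2 is my own choice for illustration, not something from the question.

```r
library(dplyr)

test2 <- data.frame(V1 = c("a", "a", "b", "b"), V2 = c(245, 246, 247, 248))

# Round inside summarise() and give the result a plain column name
# (mean_V2 is a hypothetical name chosen for this example)
res <- test2 %>%
  group_by(V1) %>%
  summarise(mean_V2 = round(mean(V2), 1))

# The column stays numeric; converting to a data.frame only changes how it prints
as.data.frame(res)
```

Because the result is still numeric, it can be used in further calculations, unlike the format() version.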
Upvotes: 9
Reputation: 8880
Because you are using dplyr tools, the resulting output is actually a tibble, which by default prints numbers with 3 significant digits (see the option pillar.sigfig). This is not the same as the number of digits after the decimal point. To get the latter, simply convert the result to a data.frame with as.data.frame().
Note that tibble's concept of significant digits is somewhat complicated: it does not indicate how many digits are shown after the decimal point, but rather the minimum number of digits needed to represent the number to a given accuracy (I think 99.9%, see discussion here).
This means the number of digits printed depends on the "size" of your number:
library(tibble)
packageVersion("tibble")
#> [1] '2.1.3'
packageVersion("pillar")
#> [1] '1.4.2'
tab <- tibble(x = c(0.1234, 1.1234, 10.1234, 100.1234, 1000.1234))
options(pillar.sigfig=3)
tab
#> # A tibble: 5 x 1
#> x
#> <dbl>
#> 1 0.123
#> 2 1.12
#> 3 10.1
#> 4 100.
#> 5 1000.
options(pillar.sigfig=4)
tab
#> # A tibble: 5 x 1
#> x
#> <dbl>
#> 1 0.1234
#> 2 1.123
#> 3 10.12
#> 4 100.1
#> 5 1000.
as.data.frame(tab)
#> x
#> 1 0.1234
#> 2 1.1234
#> 3 10.1234
#> 4 100.1234
#> 5 1000.1234
Created on 2019-08-21 by the reprex package (v0.3.0)
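Applied to the data from the question, a rough sketch of the same idea: the means are stored in full, and only the tibble print method shortens them. The column name mean_V2 is an assumption of mine for readability.

```r
library(dplyr)

test2 <- data.frame(V1 = c("a", "a", "b", "b"), V2 = c(245, 246, 247, 248))

# mean_V2 is a hypothetical column name chosen for this example
res <- test2 %>%
  group_by(V1) %>%
  summarise(mean_V2 = mean(V2))

# The stored values keep their decimals; extract the column to see them
res$mean_V2          # plain numeric vector
pull(res, mean_V2)   # dplyr equivalent of the line above

# Temporarily raise the significant digits used when printing tibbles
old <- options(pillar.sigfig = 4)
print(res)
options(old)         # restore the previous setting
```

So nothing is lost in the computation itself; the choice is only between changing the print option and converting with as.data.frame().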
Upvotes: 18
Reputation: 2289
I think the simplest solution is the following:
test2 <- data.frame(c("a","a","b","b"), c(245,246,247,248))
library(dplyr)
colnames(test2) <- c("V1","V2")
group_by(test2,V1) %>% summarise(`mean(V2)` = sprintf("%0.1f",mean(V2)))
# A tibble: 2 x 2
V1 `mean(V2)`
<fct> <chr>
1 a 245.5
2 b 247.5
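One caveat worth illustrating, as a small base R sketch: sprintf() returns a character string, so the column is fine for display but not for further arithmetic. The variable names below are made up for the example.

```r
# Mean of the first group from the question
m <- mean(c(245, 246))

as_text <- sprintf("%0.1f", m)  # character: suitable for display only
as_num  <- round(m, 1)          # numeric: still usable in calculations

class(as_text)
class(as_num)

as_num + 1      # arithmetic works on the numeric version
# as_text + 1   # would fail: non-numeric argument to binary operator

# Character values also sort as text rather than as numbers
sort(sprintf("%0.1f", c(9.5, 100.5)))
```

If the values are needed later for sorting or computation, rounding (or changing only how the result is printed) is probably the safer route.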
Upvotes: 1