Reputation: 33
This question was previously asked: How to round numbers of a data frame in R preserving the sum?
I also want to use this function in dplyr. It rounds to the requested number of digits while preserving the sum:
round_preserve_sum <- function(x, digits = 0) {
  up <- 10 ^ digits
  x <- x * up
  y <- floor(x)
  indices <- tail(order(x - y), round(sum(x)) - sum(y))
  y[indices] <- y[indices] + 1
  y / up
}
Here is a dataframe:
df <- data.frame(SAND = c(0.00000, 28.00000, 27.27273),
SILT = c(45.45455, 35.00000, 34.34343),
CLAY = c(54.54545, 37.00000, 38.38384))
Using this function with these values separately, I get:
round_preserve_sum(c(0.00000, 45.45455, 54.54545), 0)
[1] 0 45 55
round_preserve_sum(c(28.00000, 35.00000, 37.00000), 0)
[1] 28 35 37
round_preserve_sum(c(27.27273, 34.34343, 38.38384), 0)
[1] 27 34 39
which all sum to 100.
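To make the steps concrete, here is the third row unrolled by hand for digits = 0. This is only an illustrative sketch of what the function does; the variable names frac, deficit and bump are my own and do not appear in the function.

```r
x <- c(27.27273, 34.34343, 38.38384)

y <- floor(x)                       # 27 34 38, which sums to 99
frac <- x - y                       # fractional part of each element
deficit <- round(sum(x)) - sum(y)   # units still missing from the rounded total

# hand one extra unit to the `deficit` elements with the largest fractional parts
bump <- tail(order(frac), deficit)
y[bump] <- y[bump] + 1

y       # 27 34 39
sum(y)  # 100
```

Only one unit is missing here, so only the element with the largest fractional part (the third one) is rounded up.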
When I use this function in dplyr:
df.Rd0 <- df %>%
  mutate(across(c(SAND, SILT, CLAY), ~round_preserve_sum(., 0)),
         Sum = SAND + SILT + CLAY)
I get:
SAND SILT CLAY Sum
1 0 46 55 101
2 28 35 37 100
3 27 34 38 99
Without the tilde:
df.Rd0 <- df %>%
  mutate(across(c(SAND, SILT, CLAY), round_preserve_sum(., 0)),
         Sum = SAND + SILT + CLAY)
I get this error message:
Error : Problem with `mutate()` input `..1`.
i `..1 = across(c(SAND, SILT, CLAY), round_preserve_sum(., 0))`.
x undefined columns selected
I guess the function is not programmed for vectors?
Upvotes: 3
Views: 102
Reputation: 886948
The ~ is a lambda expression, i.e. a short form for function(.x). If we don't use it, pass the function itself and supply its other parameters as named arguments:
library(dplyr)
df %>%
  mutate(across(c(SAND, SILT, CLAY), round_preserve_sum, digits = 0),
         Sum = SAND + SILT + CLAY)
-output
SAND SILT CLAY Sum
1 0 46 55 101
2 28 35 37 100
3 27 34 38 99
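As a quick check of the equivalence, the sketch below writes the same call with an explicit anonymous function and compares it with calling round_preserve_sum on each column by hand. It assumes R >= 4.1 for the \(x) shorthand. As far as I know, dplyr 1.1.0 and later deprecate passing extra arguments through the ... of across, so the anonymous-function form is the safer spelling there.

```r
library(dplyr)

round_preserve_sum <- function(x, digits = 0) {
  up <- 10 ^ digits
  x <- x * up
  y <- floor(x)
  indices <- tail(order(x - y), round(sum(x)) - sum(y))
  y[indices] <- y[indices] + 1
  y / up
}

df <- data.frame(SAND = c(0.00000, 28.00000, 27.27273),
                 SILT = c(45.45455, 35.00000, 34.34343),
                 CLAY = c(54.54545, 37.00000, 38.38384))

# anonymous function; same meaning as ~ round_preserve_sum(.x, 0)
by_col <- df %>%
  mutate(across(c(SAND, SILT, CLAY), \(x) round_preserve_sum(x, digits = 0)))

# across() hands each whole column to the function, so this is the same computation
by_hand <- data.frame(SAND = round_preserve_sum(df$SAND),
                      SILT = round_preserve_sum(df$SILT),
                      CLAY = round_preserve_sum(df$CLAY))

all.equal(by_col, by_hand)
```

Either way, the function sees one column at a time, so it preserves each column's total rather than each row's.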
The OP's manual calls, which sum to 100, were applied row-wise, not column-wise, whereas across loops over columns. To work on rows we need rowwise with c_across:
df %>%
  rowwise() %>%
  mutate(Sum = sum(round_preserve_sum(c_across(everything()), 0))) %>%
  ungroup()
-output
# A tibble: 3 x 4
SAND SILT CLAY Sum
<dbl> <dbl> <dbl> <dbl>
1 0 45.5 54.5 100
2 28 35 37 100
3 27.3 34.3 38.4 100
If we want to return the rounded columns along with the sum, use pmap from purrr:
library(purrr)
df %>%
  pmap_dfr(~ {
    tmp <- round_preserve_sum(c(...), 0)
    c(tmp, Sum = sum(tmp))
  })
# A tibble: 3 x 4
SAND SILT CLAY Sum
<dbl> <dbl> <dbl> <dbl>
1 0 45 55 100
2 28 35 37 100
3 27 34 39 100
This can be made faster with dapply from collapse:
library(collapse)
df <- dapply(df, MARGIN = 1, FUN = round_preserve_sum, 0)
df$Sum <- rowSums(df, na.rm = TRUE)
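If adding another package is not an option, the same row-wise idea can be sketched in base R with apply. This is an alternative offered for comparison only; I have not benchmarked it against the versions above. With MARGIN = 1, apply returns each row's result as a column, hence the t().

```r
round_preserve_sum <- function(x, digits = 0) {
  up <- 10 ^ digits
  x <- x * up
  y <- floor(x)
  indices <- tail(order(x - y), round(sum(x)) - sum(y))
  y[indices] <- y[indices] + 1
  y / up
}

df <- data.frame(SAND = c(0.00000, 28.00000, 27.27273),
                 SILT = c(45.45455, 35.00000, 34.34343),
                 CLAY = c(54.54545, 37.00000, 38.38384))

# apply() over rows yields one column per input row, so transpose back
rounded <- as.data.frame(t(apply(df, 1, round_preserve_sum, digits = 0)))
rounded$Sum <- rowSums(rounded)
rounded
```

The rounded rows should match the pmap output above, with every Sum equal to 100.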
Upvotes: 3