Reputation: 688
I'm trying to round numeric values in a data frame to the closest interval, using a different interval depending on how large the number is. I started with the code below (coming from an Excel mindset), but I'm stuck translating it into valid R. Note that round_any rounds a number to the closest multiple of the given interval (e.g. with an interval of 1, 5.13 -> 5 and 5.85 -> 6).
library(plyr)
DataFrame <- sapply(DataFrame, function(x) {
if(x>1) round_any(x,0.25),
if(x>5) round_any(x,0.5),
if(x>10) round_any(x,1),
else x})
Could you please help me out?
Upvotes: 1
Views: 1369
Reputation: 688
Thank you all for your help. Based on your responses, the following code worked for my data frame:
library(plyr)
library(dplyr)
DataFrame[] <- lapply(DataFrame, function(x) {
  round_any(
    x,
    case_when(
      x > 10 ~ 1.0,
      x > 5 ~ 0.50,
      x > 1 ~ 0.25,
      TRUE ~ 0.001
    )
  )
})
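For anyone who wants to try the same idea without plyr or dplyr, here is a rough base-R sketch. The helper name round_by_size and the example data are made up for illustration, and it assumes round_any with its default rounding function amounts to round(x / accuracy) * accuracy. It also restricts the rounding to numeric columns, which the code above does not do.

```r
# Hypothetical helper: base-R stand-in for round_any() + case_when()
round_by_size <- function(x) {
  # left.open = TRUE gives the bins (1, 5], (5, 10], (10, Inf),
  # which matches the strict ">" comparisons used above
  bin <- findInterval(x, c(1, 5, 10), left.open = TRUE)
  accuracy <- c(0.001, 0.25, 0.5, 1)[bin + 1]
  round(x / accuracy) * accuracy
}

# Made-up example data, not from the original post
DataFrame <- data.frame(
  a = c(0.28096, 1.2938, 5.545),
  b = c(10.21217, 19.09698, 5.7995),
  label = c("p", "q", "r")
)

# Only touch numeric columns; assigning into DataFrame[...] keeps
# the result a data frame instead of a list or matrix
is_num <- vapply(DataFrame, is.numeric, logical(1))
DataFrame[is_num] <- lapply(DataFrame[is_num], round_by_size)
DataFrame
```

Assigning with DataFrame[] <- lapply(...) (or into a column subset, as here) is what keeps the data frame shape; a plain sapply call would return a matrix instead.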
Upvotes: 0
Reputation: 11878
When using sapply on a data frame, you are iterating over the column vectors rather than individual values. As such, you should be looking at vectorized conditional logic functions: just using the standard if control flow isn't terribly useful, as it can only take scalar (length 1) conditions.
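To illustrate the point about scalar conditions, here is a small base-R sketch that picks the accuracy element-wise with nested ifelse calls. The helper name pick_accuracy and the sample values are hypothetical, and the last line assumes that round_any with its default rounding function is equivalent to round(x / accuracy) * accuracy. Note that the largest threshold has to be tested first: if x > 1 came first, as in the question, it would also catch values such as 7 or 12.

```r
# Hypothetical helper (not from the original post): choose an accuracy
# for every element of x at once
pick_accuracy <- function(x) {
  # Test the largest threshold first so that each value falls through
  # to the right branch
  ifelse(x > 10, 1,
         ifelse(x > 5, 0.5,
                ifelse(x > 1, 0.25, 0.001)))
}

x <- c(0.3, 1.3, 5.6, 12.4)
acc <- pick_accuracy(x)
acc

# Assumed to be the same arithmetic round_any() performs by default
rounded <- round(x / acc) * acc
rounded
```

Nested ifelse gets hard to read beyond a few branches, which is where a helper like case_when becomes attractive.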
In this case, plyr::round_any can take a vector as the accuracy argument; the dplyr function case_when could be useful here. From ?case_when:
This function allows you to vectorise multiple if and else if statements. It is an R equivalent of the SQL CASE WHEN statement.
Here's an example for the case of a single vector to be rounded:
set.seed(11)
# Generate some raw numbers
x <- runif(8, max = 20)
print(x, digits = 4)
#> [1] 5.54500 0.01037 10.21217 0.28096 1.29380 19.09698 1.72992 5.79950
# Round to differing accuracy
plyr::round_any(
  x,
  dplyr::case_when(
    x > 10 ~ 1.0,
    x > 5 ~ 0.50,
    x > 1 ~ 0.25,
    TRUE ~ 0.001
  )
)
#> [1] 5.500 0.010 10.000 0.281 1.250 19.000 1.750 6.000
Created on 2018-05-11 by the reprex package (v0.2.0).
Upvotes: 2