Reputation: 8474
I have a dataframe, and I wish to round all of the numbers (ready for export). This must be straightforward, but I am having problems because some parts of the dataframe are not numeric. For example, I want to round the figures to the nearest whole number in the example below:
ID = c("a","b","c","d","e")
Value1 = c("3.4","6.4","8.7","1.1","0.1")
Value2 = c("8.2","1.7","6.4","1.9","10.3")
df<-data.frame(ID,Value1,Value2)
Can anyone help me out? I can round individual columns (e.g., round(df$Value1, 2)), but I want to round a whole table that contains some columns which are not numeric.
Upvotes: 75
Views: 154463
Reputation: 1045
This solution is for dplyr 1.1.0 and later, where the "..." argument of across() has been deprecated. Practically speaking, this means that the options above that pass extra arguments through across(), such as the following, now raise a deprecation warning and should be rewritten:
df %>% mutate(across(where(is.numeric), round, digits=3))
First, I prefer to store the column names in a vector (though this is completely unnecessary; it just helps if the values will change over time):
v_make_decimal <- c("Value1", "Value2")
Then perform mutate(across()):
## round to 1 decimal place
df %>% mutate(across(any_of(v_make_decimal), ~ round(.x, digits = 1)))
As you mention that your data might be character, we can update this:
df %>% mutate(across(any_of(v_make_decimal), ~ round(as.numeric(.x), digits = 1)))
Note above the differences between the display and what is stored.
Upvotes: 0
Reputation: 101
If you are exporting to a LaTeX format, xtable() from the xtable package is the quickest way.
The code xtable(df, digits = 2) produces a LaTeX table with all numeric variables rounded to 2 digits.
Upvotes: 0
Reputation: 454
Here is an alternative. This function makes it easy to specify the actual rounding function and accepts unique digits value for each column:
rounddf <- function(x, digits = rep(2, ncol(x)), func = round) {
  # Recycle a single digits value across all columns
  if (length(digits) == 1) {
    digits <- rep(digits, ncol(x))
  } else if (length(digits) != ncol(x)) {
    digits <- c(digits, rep(digits[1], ncol(x) - length(digits)))
    warning('First value in digits repeated to match length.')
  }
  # Apply the rounding function to numeric columns only
  for (i in seq_len(ncol(x))) {
    if (class(x[, i])[1] == 'numeric') x[, i] <- func(x[, i], digits[i])
  }
  return(x)
}
It's posted (and sometimes updated) at https://github.com/sashahafner/jumbled
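The per-column digits idea can also be tried without a helper function. The following is only a sketch, not the author's code: the example data and the named vector `digits` are made up for illustration, and `Map()` pairs each listed column with its own digits value.

```r
# Hypothetical example data: one label column and two numeric columns
df <- data.frame(ID = c("a", "b", "c"),
                 Value1 = c(3.4532, 6.4133, 8.7124),
                 Value2 = c(8.2171, 1.7123, 6.4136))

# Assumed convention: names are the columns to round, values are the digits to keep
digits <- c(Value1 = 0, Value2 = 2)

# Map() calls round(column, digits) once per named column;
# columns not named in `digits` (here ID) are left untouched
df[names(digits)] <- Map(round, df[names(digits)], digits)

df
```

Naming the columns explicitly avoids having to supply a digits value for non-numeric columns.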
Upvotes: 2
Reputation: 4205
Note that some of the solutions proposed above do not preserve row names, so they are lost.
For example, try:
df <- data.frame(v1 = seq(1.11, 1.20, 0.01), v2 = letters[1:10])
row.names(df) = df$v2
and then, as suggested above, try:
data.frame( lapply(df, function(y) if(is.numeric(y)) round(y, 2) else y) )
Note that the row names are no longer there.
Akhmed's suggestion keeps row names because it works with replacements.
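The difference between rebuilding a data frame and replacing its columns in place can be checked side by side. This is a minimal sketch using data similar to the example above; the helper name `round_if_numeric` is made up for illustration.

```r
# Example data similar to the one above, with row names taken from v2
df <- data.frame(v1 = seq(1.11, 1.20, length.out = 10),
                 v2 = letters[1:10],
                 stringsAsFactors = FALSE)
row.names(df) <- df$v2

# Hypothetical helper: round numeric columns, pass everything else through
round_if_numeric <- function(y) if (is.numeric(y)) round(y, 1) else y

# Rebuilding the data frame from a list discards the original row names
rebuilt <- data.frame(lapply(df, round_if_numeric))

# Assigning into df[] replaces the columns but keeps the data frame's attributes
replaced <- df
replaced[] <- lapply(df, round_if_numeric)

rownames(rebuilt)   # default row names "1" to "10"
rownames(replaced)  # original row names "a" to "j"
```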
Upvotes: 2
Reputation: 6639
I think the neatest way of doing this now is using dplyr:
library(dplyr)
df %>%
mutate_if(is.numeric, round)
This will round all numeric columns in your dataframe to the nearest whole number.
Upvotes: 130
Reputation: 2954
The answers above point out a couple of stumbling blocks in the initial question that make it more complicated than just rounding multiple columns, primarily that data.frame() by default (in R versions before 4.0.0) converts the character-numbers to factors. The response by Ben details how to handle these issues, and applies purrr::dmap(). The purrr package has since been modified and the dmap function is deprecated (in favor of map_df()).
There is also a newer function, modify_if(), which can solve the problem of rounding multiple numeric columns, and so I wanted to update this answer.
I'll enter the data as numbers, adding a few more digits to round to make the example more broadly applicable:
df <- data.frame(ID = c("a","b","c","d","e"),
Value1 =c(3.4532897,6.41325,8.71235,1.115,0.115),
Value2 = c(8.2125,1.71235,6.4135,1.915,10.3235))
Using the purrr::modify_if() function:
purrr::modify_if(df, ~is.numeric(.), ~round(., 0))
ID Value1 Value2
1 a 3 8
2 b 6 2
3 c 9 6
4 d 1 2
5 e 0 10
Just change the digits argument of round() to the appropriate number of decimal places:
modify_if(df, ~is.numeric(.), ~round(., 2))
ID Value1 Value2
1 a 3.45 8.21
2 b 6.41 1.71
3 c 8.71 6.41
4 d 1.12 1.92
5 e 0.12 10.32
see http://purrr.tidyverse.org/ for further documentation on syntax
This could also be done in two steps using base R apply functions, by creating an index for the columns (numVars) and then standard indexing to modify only those columns:
numVars <- sapply(df, is.numeric)
ID Value1 Value2
FALSE TRUE TRUE
df[, numVars] <- lapply(df[, numVars], round, 0)
df
ID Value1 Value2
1 a 3 8
2 b 6 2
3 c 9 6
4 d 1 2
5 e 0 10
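One caveat with the base R version may be worth a sketch: when only one column is numeric, df[, numVars] drops to a plain vector, so lapply() would iterate over individual values rather than columns. Indexing with a single bracket argument, df[numVars], always returns a data frame. The data below is made up for illustration.

```r
# Hypothetical data with a single numeric column
df <- data.frame(ID = c("a", "b", "c"),
                 Value1 = c(3.4532897, 6.41325, 8.71235))

numVars <- sapply(df, is.numeric)

# df[numVars] stays a data frame even for one column, so lapply() sees columns
df[numVars] <- lapply(df[numVars], round, 0)

df
```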
Upvotes: 3
Reputation: 203
Here is a one-liner that I like using (this will apply the round function to only the columns of the class specified in the classes argument):
df2 <- rapply(object = df, f = round, classes = "numeric", how = "replace", digits = 0)
Upvotes: 14
Reputation: 837
I know this is a late reply, but I also had this same problem. After doing some searching I found this to be the most elegant solution:
data.frame(lapply(x, function(y) if(is.numeric(y)) round(y, 2) else y))
Solution originally from: Jean V. Adams Statistician U.S. Geological Survey Great Lakes Science Center 223 East Steinfest Road Antigo, WI 54409 USA
http://r.789695.n4.nabble.com/round-a-data-frame-containing-character-variables-td3732415.html
Upvotes: 20
Reputation: 42283
The other answers do not quite answer the OP's question, because they assume the example data is different from what the OP has provided.
If we read the question literally, we want a general solution that will find columns with digits in them (of any vector type), convert them to numeric, and then perform another numeric operation, such as rounding. We can use purrr::dmap and do it like this:
Here's the data as provided by the OP, where all cols are factors (an annoying default, but we can deal with it):
ID = c("a","b","c","d","e")
Value1 = c("3.4","6.4","8.7","1.1","0.1")
Value2 = c("8.2","1.7","6.4","1.9","10.3")
df<-data.frame(ID,Value1,Value2)
str(df)
'data.frame': 5 obs. of 3 variables:
$ ID : Factor w/ 5 levels "a","b","c","d",..: 1 2 3 4 5
$ Value1: Factor w/ 5 levels "0.1","1.1","3.4",..: 3 4 5 2 1
$ Value2: Factor w/ 5 levels "1.7","1.9","10.3",..: 5 1 4 2 3
We'll search for cols with digits in them, and make a dataframe of indices to mark the numerics:
library(dplyr)
library(purrr)
df_logical <-
df %>%
dmap(function(i) grepl("[0-9]", i))
df_logical
ID Value1 Value2
1 FALSE TRUE TRUE
2 FALSE TRUE TRUE
3 FALSE TRUE TRUE
4 FALSE TRUE TRUE
5 FALSE TRUE TRUE
str(df_logical)
'data.frame': 5 obs. of 3 variables:
$ ID : logi FALSE FALSE FALSE FALSE FALSE
$ Value1: logi TRUE TRUE TRUE TRUE TRUE
$ Value2: logi TRUE TRUE TRUE TRUE TRUE
Then we can use these indices to select a subset of the cols in the original dataframe and convert them to numeric, and do other things also (in this case, rounding):
df_numerics <-
map(1:ncol(df), function(i) ifelse(df_logical[,i],
as.numeric(as.character(df[,i])),
df[,i])) %>%
dmap(round, 0) %>%
setNames(names(df))
And we've got the desired result:
df_numerics
ID Value1 Value2
1 1 3 8
2 2 6 2
3 3 9 6
4 4 1 2
5 5 0 10
str(df_numerics)
'data.frame': 5 obs. of 3 variables:
$ ID : num 1 2 3 4 5
$ Value1: num 3 6 9 1 0
$ Value2: num 8 2 6 2 10
This could be useful in the case of a dataframe with a large number of columns, and where we have many character/factor type cols full of digits that we want as numeric, but it's too tedious to do by hand.
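For setups where dmap is not available, a base R sketch of the same idea might look like the following. It is not the code from this answer: the helper `looks_numeric` is a made-up name, it tests whether every value in a column parses as a number (rather than searching for digits), and it leaves non-numeric columns such as ID unchanged instead of converting them.

```r
# The OP's data, forcing factors to mimic the older data.frame() default
df <- data.frame(ID     = c("a", "b", "c", "d", "e"),
                 Value1 = c("3.4", "6.4", "8.7", "1.1", "0.1"),
                 Value2 = c("8.2", "1.7", "6.4", "1.9", "10.3"),
                 stringsAsFactors = TRUE)

# Hypothetical helper: TRUE if every value in the column parses as a number
# (a column containing NA or any non-numeric text is treated as non-numeric)
looks_numeric <- function(col) {
  parsed <- suppressWarnings(as.numeric(as.character(col)))
  !anyNA(parsed)
}

num_cols <- vapply(df, looks_numeric, logical(1))

# Convert via as.character() first so factors yield their labels, not level codes
df_rounded <- df
df_rounded[num_cols] <- lapply(df[num_cols], function(col) {
  round(as.numeric(as.character(col)), 0)
})

str(df_rounded)
```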
Upvotes: 4
Reputation: 3635
Recognizing that this is an old question and one answer is accepted, I would like to offer another solution since the question appears as a top-ranked result on Google.
A more general solution is to create a separate function that searches for all numerical variables and rounds them to the specified number of digits:
round_df <- function(df, digits) {
  # Logical index of the numeric columns
  nums <- vapply(df, is.numeric, FUN.VALUE = logical(1))
  # Round only those columns, leaving the rest untouched
  df[, nums] <- round(df[, nums], digits = digits)
  df
}
Once defined, you can use it as follows:
> round_df(df, digits=3)
Upvotes: 60
Reputation: 16277
First make sure your number columns are numeric:
ID = c("a","b","c","d","e")
Value1 = as.numeric(c("3.4","6.4","8.7","1.1","0.1"))
Value2 = as.numeric(c("8.2","1.7","6.4","1.9","10.3"))
df<-data.frame(ID,Value1,Value2, stringsAsFactors = FALSE)
Then, round only the numeric columns:
df[,-1] <-round(df[,-1],0) #the "-1" excludes column 1
df
ID Value1 Value2
1 a 3 8
2 b 6 2
3 c 9 6
4 d 1 2
5 e 0 10
Upvotes: 41
Reputation: 1961
Why don't you just use ID as the row name, and take out the quotation marks from the Value1 and Value2 data?
Try this instead:
ID = c("a","b","c","d","e")
Value1 = c(3.4,6.4,8.7,1.1,0.1)
Value2 = c(8.2,1.7,6.4,1.9,10.3)
df<-data.frame(ID,Value1,Value2,row.names=1) # use the first column (ID) as row names
> df
Value1 Value2
a 3.4 8.2
b 6.4 1.7
c 8.7 6.4
d 1.1 1.9
e 0.1 10.3
> str(df)
'data.frame': 5 obs. of 2 variables:
$ Value1: num 3.4 6.4 8.7 1.1 0.1
$ Value2: num 8.2 1.7 6.4 1.9 10.3
I am not sure what you want to do with the round, but you have some options in R:
?ceiling()
?floor()
?trunc()
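As a quick illustration of how these options differ from round(), here is a small sketch on a made-up vector that includes a negative value:

```r
# Hypothetical values chosen to show the differences, including a negative number
x <- c(3.4, 8.7, -1.7)

round(x)    # nearest whole number:  3  9 -2
ceiling(x)  # always rounds up:      4  9 -1
floor(x)    # always rounds down:    3  8 -2
trunc(x)    # rounds towards zero:   3  8 -1
```

Any of these can be applied column by column in the same way as round().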
Upvotes: 1