KT_1

Reputation: 8474

How to round a data.frame in R that contains some character variables?

I have a data frame and I wish to round all of the numbers (ready for export). This must be straightforward, but I am having problems because some parts of the data frame are not numeric. For example, I want to round the figures in the data below to the nearest whole number:

ID = c("a","b","c","d","e")
Value1 = c("3.4","6.4","8.7","1.1","0.1")
Value2 = c("8.2","1.7","6.4","1.9","10.3")
df<-data.frame(ID,Value1,Value2)

Can anyone help me out? I can round individual columns (e.g., round(df$Value1, 2)) but I want to round a whole table which contains some columns which are not numeric.

Upvotes: 75

Views: 154463

Answers (12)

glenninboston

Reputation: 1045

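This solution is for dplyr 1.1.0 and later, where the "..." argument of across() has been deprecated. Practically speaking, this means that the options above that pass extra arguments through across(), such as the following, now trigger a deprecation warning and may stop working in a future release: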

df %>% mutate(across(where(is.numeric), round, digits=3))

First, I prefer to store the column names in a vector (completely unnecessary, but it helps if the set of columns changes over time):

v_make_decimal <- c("Value1", "Value2")

Then perform mutate(across:

## round to 1 decimal place
df %>% mutate(across(any_of(v_make_decimal), ~round(.x, digits = 1)))

As you mention that your data might be character, we can update this:

df %>% mutate(across(any_of(v_make_decimal), ~round(as.numeric(.x), digits = 1)))

Note above the differences between the display and what is stored.
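The output that note refers to is not shown here, so as a rough illustration of the general point (my own sketch in base R, not the author's original example; the variable names are made up): round() changes the values that are stored, whereas a formatting function such as formatC() only produces text for display or export and leaves the underlying numbers alone.

```r
# Two example values with more precision than we want to show
x <- c(3.4532897, 6.41325)

# round() changes the stored values themselves
stored <- round(x, digits = 1)

# formatC() leaves x untouched and returns character strings for display/export
shown <- formatC(x, format = "f", digits = 1)

print(stored)  # numeric vector, precision discarded
print(shown)   # character vector, x itself still holds the full precision
```

Which one you want depends on whether downstream code should still see the full precision.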

Upvotes: 0

TheSimpliFire

Reputation: 101

If you are exporting to a LaTeX format, xtable() from the xtable package is the quickest way.

The code xtable(df,digits=2) produces a LaTeX table with all numeric variables rounded to 2 digits.

Upvotes: 0

sashahafner

Reputation: 454

Here is an alternative. This function makes it easy to specify the actual rounding function and accepts a separate digits value for each column:

rounddf <- function(x, digits = rep(2, ncol(x)), func = round) {
  if (length(digits) == 1) {
    digits <- rep(digits, ncol(x))
  } else if (length(digits) != ncol(x)) {
    digits <- c(digits, rep(digits[1], ncol(x) - length(digits)))
    warning('First value in digits repeated to match length.')
  }

  for(i in 1:ncol(x)) {
    if(class(x[, i])[1] == 'numeric') x[, i] <- func(x[, i], digits[i])
  }

  return(x)
}

It's posted (and sometimes updated) at https://github.com/sashahafner/jumbled

Upvotes: 2

Rtist

Reputation: 4205

Note that some of the solutions proposed above do not preserve row names, so they are lost.

For example, try:

df <- data.frame(v1 = seq(1.11, 1.20, 0.01), v2 = letters[1:10])
row.names(df) = df$v2

and then, as suggested above, try:

data.frame( lapply(df, function(y) if(is.numeric(y)) round(y, 2) else y) ) 

Note that the row names are no longer there.

Akhmed's suggestion keeps row names because it works with replacements.
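As a minimal sketch of that replacement idea applied to the lapply() approach (assuming a small made-up data frame; this is one possible variant, not code from any of the answers), assigning into df[] instead of rebuilding the data frame with data.frame() keeps the row names and other attributes:

```r
# Small example data frame with custom row names
df <- data.frame(
  v1 = c(1.111, 2.226, 3.334),
  v2 = c("a", "b", "c"),
  stringsAsFactors = FALSE
)
row.names(df) <- df$v2

# df[] <- ... replaces the columns in place, so row names are preserved
df[] <- lapply(df, function(y) if (is.numeric(y)) round(y, 2) else y)

print(rownames(df))
print(df)
```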

Upvotes: 2

user1165199

Reputation: 6639

I think the neatest way of doing this now is using dplyr:

library(dplyr)
df %>% 
 mutate_if(is.numeric, round)

This will round all numeric columns in your dataframe
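For newer dplyr versions, where the scoped verbs such as mutate_if() are superseded by across(), a rough equivalent might look like the sketch below. It assumes dplyr 1.0.0 or later and uses a made-up data frame; the lambda form is used so that the digits argument is passed to round() directly.

```r
library(dplyr)

# Example data with numeric columns stored as numbers
df <- data.frame(
  ID     = c("a", "b", "c"),
  Value1 = c(3.44, 6.46, 8.71),
  Value2 = c(8.26, 1.74, 10.31),
  stringsAsFactors = FALSE
)

# Round every numeric column to 1 decimal place; other columns are untouched
rounded <- df %>%
  mutate(across(where(is.numeric), ~ round(.x, digits = 1)))

print(rounded)
```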

Upvotes: 130

Matt L.

Reputation: 2954

The answers above point out a couple of stumbling blocks in the initial question that make it more complicated than just rounding multiple columns, primarily:

  1. Numbers were entered as characters, and
  2. data.frame() by default converts the character-numbers to factors (this was the default before R 4.0.0)

The response by Ben details how to handle these issues, and applies purrr::dmap(). The purrr package has since been modified and the dmap function is deprecated (in favor of map_df()).
There is also a newer function, modify_if() which can solve the problem of rounding multiple numeric columns, and so I wanted to update this answer.


I'll enter the data as numbers, adding a few more digits to round to make the example more broadly applicable:

df <- data.frame(ID = c("a","b","c","d","e"), 
                 Value1 =c(3.4532897,6.41325,8.71235,1.115,0.115), 
                 Value2 = c(8.2125,1.71235,6.4135,1.915,10.3235))

Using the purrr::modify_if() function:

purrr::modify_if(df, ~is.numeric(.), ~round(., 0))

  ID Value1 Value2
1  a      3      8
2  b      6      2
3  c      9      6
4  d      1      2
5  e      0     10

Just change the digits argument of round() to the desired number of decimal places:

modify_if(df, ~is.numeric(.), ~round(., 2))
  ID Value1 Value2
1  a   3.45   8.21
2  b   6.41   1.71
3  c   8.71   6.41
4  d   1.12   1.92
5  e   0.12  10.32

see http://purrr.tidyverse.org/ for further documentation on syntax

This could also be done in two steps using base R apply functions, by creating an index for the columns (numVars) and then standard indexing to modify only those columns:

numVars <- sapply(df, is.numeric)
numVars
   ID Value1 Value2 
FALSE   TRUE   TRUE 

df[, numVars] <- lapply(df[, numVars], round, 0)
df
  ID Value1 Value2
1  a      3      8
2  b      6      2
3  c      9      6
4  d      1      2
5  e      0     10

Upvotes: 3

trisaratops

Reputation: 203

Here is a one-liner that I like using: (this will apply the round function to only the columns of class type specified in the classes argument)

df2 <- rapply(object = df, f = round, classes = "numeric", how = "replace", digits = 0) 

Upvotes: 14

Ali

Reputation: 837

I know this is a late reply, but I also had this same problem. After doing some searching I found this to be the most elegant solution:

data.frame(lapply(x, function(y) if(is.numeric(y)) round(y, 2) else y)) 

Solution originally from: Jean V. Adams Statistician U.S. Geological Survey Great Lakes Science Center 223 East Steinfest Road Antigo, WI 54409 USA

http://r.789695.n4.nabble.com/round-a-data-frame-containing-character-variables-td3732415.html

Upvotes: 20

Ben

Reputation: 42283

The other answers do not quite answer the OP's question exactly because they assume the example data is different from what the OP has provided.

If we read the question literally, we want a general solution that finds columns with digits in them (of any vector type), converts them to numeric, and then performs another numeric operation, such as rounding. We can use purrr::dmap and do it like this:

Here's the data as provided by the OP, where all cols are factors (an annoying default, but we can deal with it):

ID = c("a","b","c","d","e")
Value1 = c("3.4","6.4","8.7","1.1","0.1")
Value2 = c("8.2","1.7","6.4","1.9","10.3")
df<-data.frame(ID,Value1,Value2)

str(df)
'data.frame':   5 obs. of  3 variables:
 $ ID    : Factor w/ 5 levels "a","b","c","d",..: 1 2 3 4 5
 $ Value1: Factor w/ 5 levels "0.1","1.1","3.4",..: 3 4 5 2 1
 $ Value2: Factor w/ 5 levels "1.7","1.9","10.3",..: 5 1 4 2 3

We'll search for cols with digits in them, and make a dataframe of indices to mark the numerics:

library(dplyr)
library(purrr)

df_logical <- 
df %>% 
  dmap(function(i) grepl("[0-9]", i))

df_logical
     ID Value1 Value2
1 FALSE   TRUE   TRUE
2 FALSE   TRUE   TRUE
3 FALSE   TRUE   TRUE
4 FALSE   TRUE   TRUE
5 FALSE   TRUE   TRUE

str(df_logical)
'data.frame':   5 obs. of  3 variables:
 $ ID    : logi  FALSE FALSE FALSE FALSE FALSE
 $ Value1: logi  TRUE TRUE TRUE TRUE TRUE
 $ Value2: logi  TRUE TRUE TRUE TRUE TRUE

Then we can use these indices to select a subset of the cols in the original dataframe and convert them to numeric, and do other things also (in this case, rounding):

df_numerics <- 
map(1:ncol(df), function(i) ifelse(df_logical[,i], 
                                      as.numeric(as.character(df[,i])), 
                                      df[,i])) %>% 
  dmap(round, 0) %>% 
  setNames(names(df)) 

And we've got the desired result:

df_numerics
  ID Value1 Value2
1  1      3      8
2  2      6      2
3  3      9      6
4  4      1      2
5  5      0     10

str(df_numerics)
'data.frame':   5 obs. of  3 variables:
 $ ID    : num  1 2 3 4 5
 $ Value1: num  3 6 9 1 0
 $ Value2: num  8 2 6 2 10

This could be useful in the case of a dataframe with a large number of columns, and where we have many character/factor type cols full of digits that we want as numeric, but it's too tedious to do by hand.
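Since dmap() is no longer available in current purrr, here is a rough base R sketch of the same idea, under my own assumptions: looks_numeric is a hypothetical helper name, and it treats a column as numeric only when every non-missing value parses as a number (stricter than the grepl("[0-9]") test above). Unlike the output above, it leaves the ID column as it was.

```r
# Data as in the question: numbers stored as character strings
df <- data.frame(
  ID     = c("a", "b", "c", "d", "e"),
  Value1 = c("3.4", "6.4", "8.7", "1.1", "0.1"),
  Value2 = c("8.2", "1.7", "6.4", "1.9", "10.3"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE when every non-missing value parses as a number.
# as.character() first, so factor columns are parsed by label, not by code.
looks_numeric <- function(col) {
  parsed <- suppressWarnings(as.numeric(as.character(col)))
  !anyNA(parsed[!is.na(col)])
}

num_like <- vapply(df, looks_numeric, logical(1))

# Convert and round only those columns; assigning into df[...] keeps the rest
df[num_like] <- lapply(df[num_like], function(col) {
  round(as.numeric(as.character(col)), 0)
})

str(df)
```

Note that a column consisting entirely of NA would also pass this check, so you may want an extra guard if your data can contain such columns.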

Upvotes: 4

akhmed

Reputation: 3635

Recognizing that this is an old question and one answer is accepted, I would like to offer another solution since the question appears as a top-ranked result on Google.

A more general solution is to create a separate function that searches for all numerical variables and rounds them to the specified number of digits:

round_df <- function(df, digits) {
  nums <- vapply(df, is.numeric, FUN.VALUE = logical(1))

  df[,nums] <- round(df[,nums], digits = digits)

  (df)
}

Once defined, you can use it as follows:

> round_df(df, digits=3)

Upvotes: 60

Pierre Lapointe

Reputation: 16277

First make sure your number columns are numeric:

ID = c("a","b","c","d","e")
Value1 = as.numeric(c("3.4","6.4","8.7","1.1","0.1"))
Value2 = as.numeric(c("8.2","1.7","6.4","1.9","10.3"))
df<-data.frame(ID,Value1,Value2, stringsAsFactors = FALSE)

Then, round only the numeric columns:

df[,-1] <-round(df[,-1],0) #the "-1" excludes column 1
df

  ID Value1 Value2
1  a      3      8
2  b      6      2
3  c      9      6
4  d      1      2
5  e      0     10

Upvotes: 41

Gago-Silva

Reputation: 1961

Why don't you just use ID as the row name?

... and take out the "'s from value1 and value2 data

Try this instead:

ID = c("a","b","c","d","e")
Value1 = c(3.4,6.4,8.7,1.1,0.1)
Value2 = c(8.2,1.7,6.4,1.9,10.3)

df<-data.frame(ID,Value1,Value2,row.names=TRUE)

> df
  Value1 Value2
a    3.4    8.2
b    6.4    1.7
c    8.7    6.4
d    1.1    1.9
e    0.1   10.3

> str(df)
'data.frame':   5 obs. of  2 variables:
 $ Value1: num  3.4 6.4 8.7 1.1 0.1
 $ Value2: num  8.2 1.7 6.4 1.9 10.3

I am not sure what you want to do with the round, but you have some options in R:

?ceiling()
?floor()
?trunc()
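To make the difference between these options concrete, here is a small sketch on a made-up vector (the values are only for illustration). Note that round() rounds exact halves to the nearest even number, which differs from the "always round half up" rule some people expect.

```r
# Example values, including negatives and exact halves
x <- c(-2.5, -1.2, 0.5, 1.5, 2.7)

up      <- ceiling(x)  # smallest integer not less than x
down    <- floor(x)    # largest integer not greater than x
toward0 <- trunc(x)    # drop the fractional part (round toward zero)
nearest <- round(x)    # nearest integer, exact halves go to the even neighbour

print(rbind(x, up, down, toward0, nearest))
```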

Upvotes: 1
