FitzKaos

Reputation: 391

R Tidyverse spread() function multiple decimal places truncation issue

I have noticed an issue with rounding in spread() (and, I assume, gather()), which I have re-created with some dummy data (below). When I use spread() on doubles with 4 or more decimal places, the output shows only 3 decimal places.

If anyone can shed some light on this, that would be very helpful, since I need to retain the 4 decimal place accuracy.

# Loading packages
library(tidyverse)

# Creating a dummy data set.
dummy_data <- tibble(
  day_of_week = c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday"),
  person = c("Jack", "Bob", "Bob", "Simon", "Simon"),
  value = c(0.2346, 0.7635, 0.7253, 0.7356, 0.1693)
)

# Spreading the data.
spread_data = dummy_data %>%
  spread(person, value)

Upvotes: 0

Views: 923

Answers (3)

Felix M

Reputation: 122

I recreated the dummy data in my R environment.

Indeed, when I run print(spread_data), I get:

  day_of_week    Bob   Jack  Simon
  <chr>        <dbl>  <dbl>  <dbl>
1 Friday      NA     NA      0.169
2 Monday      NA      0.235 NA
3 Thursday    NA     NA      0.736
4 Tuesday      0.764 NA     NA
5 Wednesday    0.725 NA     NA

However, if you access the values directly, for example with spread_data$Bob, you get:

[1]     NA     NA     NA 0.7635 0.7253

Here are your 4 digits! They never left. spread() does not round anything; the print method for tibbles simply shows 3 significant digits by default.
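A quick way to confirm this is to compare the stored values against the originals, and optionally ask the tibble printer for more digits. This is only a minimal sketch: it assumes the tibble and tidyr packages are installed, rebuilds the data from the question, and relies on the pillar.sigfig option (pillar is the package that formats tibble columns). The names stored and values_match are just illustrative.

```r
library(tibble)
library(tidyr)

# Same dummy data as in the question.
dummy_data <- tibble(
  day_of_week = c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday"),
  person = c("Jack", "Bob", "Bob", "Simon", "Simon"),
  value = c(0.2346, 0.7635, 0.7253, 0.7356, 0.1693)
)

spread_data <- spread(dummy_data, person, value)

# Collect every non-missing value from the spread columns and compare
# them with the original doubles (sorted so the order does not matter).
stored <- c(spread_data$Bob, spread_data$Jack, spread_data$Simon)
stored <- sort(stored[!is.na(stored)])
values_match <- identical(stored, sort(dummy_data$value))
print(values_match)

# Ask the tibble printer for 4 significant digits instead of the default 3.
# Note that this changes the option for the rest of the session.
options(pillar.sigfig = 4)
print(spread_data)
```

Setting the option only changes what is displayed; the underlying doubles are the same before and after.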

I don't recommend turning your values into factors as @saisaran suggests: you won't be able to use them as numbers afterwards.
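To illustrate why, here is a small base R sketch (no packages needed) using the values from the question. The variable names value_fct, codes and recovered are made up for the example.

```r
value <- c(0.2346, 0.7635, 0.7253, 0.7356, 0.1693)
value_fct <- as.factor(value)

# A factor is not a numeric vector, so arithmetic on it is not meaningful.
print(is.numeric(value_fct))

# as.numeric() on a factor returns the integer level codes,
# not the original numbers.
codes <- as.numeric(value_fct)
print(codes)

# Getting the numbers back requires a round trip through character.
recovered <- as.numeric(as.character(value_fct))
print(recovered)
```

So the factor version prints nicely, but every later calculation would need that extra conversion step.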


Edit: if you use print.data.frame(spread_data) instead of print(spread_data), you will get the output you need:

  day_of_week    Bob   Jack  Simon
1      Friday     NA     NA 0.1693
2      Monday     NA 0.2346     NA
3    Thursday     NA     NA 0.7356
4     Tuesday 0.7635     NA     NA
5   Wednesday 0.7253     NA     NA 

Source: https://community.rstudio.com/t/why-do-tibbles-and-data-frames-display-decimal-places-a-bit-differently/5722

Upvotes: 1

sai saran

Reputation: 757

The problem is with the data type, so I changed the value column to a factor:

# Convert the value column to a factor so every digit is printed.
dummy_data$value <- as.factor(dummy_data$value)
# Spreading the data.
spread_data = dummy_data %>%
  spread(person, value)

Output:

# A tibble: 5 x 4
  day_of_week Bob    Jack   Simon 
  <chr>       <fct>  <fct>  <fct> 
1 Friday      NA     NA     0.1693
2 Monday      NA     0.2346 NA    
3 Thursday    NA     NA     0.7356
4 Tuesday     0.7635 NA     NA    
5 Wednesday   0.7253 NA     NA   

Note: Be cautious when using factor-type data.

Upvotes: 0

NelsonGon

Reputation: 13319

Does this work for you?

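library(reshape2)  # provides melt() and dcast()
dummy_data %>%
  # Melt to long format, keeping person and day_of_week as identifiers.
  melt(id.vars = c("person", "day_of_week")) %>%
  # Cast to wide format with one column per person.
  dcast(value + day_of_week ~ person) %>%
  # Drop the helper column that was only needed for the cast.
  select(-value)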

You have several NAs but here is your result:

  day_of_week    Bob   Jack  Simon
1      Friday     NA     NA 0.1693
2      Monday     NA 0.2346     NA
3   Wednesday 0.7253     NA     NA
4    Thursday     NA     NA 0.7356
5     Tuesday 0.7635     NA     NA
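If you would rather stay within tidyr, the same reshape can be sketched with pivot_wider(), which supersedes spread() in tidyr 1.0.0 and later. This assumes a recent tidyr and the tibble package are installed; the name wide is just for illustration. Converting to a plain data frame before printing avoids the tibble printer's 3 significant digits.

```r
library(tibble)
library(tidyr)

# Same dummy data as in the question.
dummy_data <- tibble(
  day_of_week = c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday"),
  person = c("Jack", "Bob", "Bob", "Simon", "Simon"),
  value = c(0.2346, 0.7635, 0.7253, 0.7356, 0.1693)
)

# One column per person, filled from the value column.
wide <- pivot_wider(dummy_data, names_from = person, values_from = value)

# Print with the data.frame method, which shows more digits by default.
print(as.data.frame(wide))
```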

Upvotes: 0
