SimRock

Reputation: 239

pivot_wider() does not seem to work with missing values. How do I turn spread() into pivot_wider() when there are missing values?

Since the spread() function is being replaced by the new pivot_wider() function, I have been trying to switch to pivot_wider(), but it does not seem to work because of the missing values. Any help is much appreciated.

# This is an example I saw on the web

surveys <- read.csv("http://kbroman.org/datacarp/portal_data_joined.csv",
                    stringsAsFactors = FALSE)
library(dplyr)
library(tidyr)  # provides spread() and pivot_wider()

surveys %>%
  filter(taxa == "Rodent",
         !is.na(weight)) %>%
  group_by(sex, genus) %>%
  summarize(mean_weight = mean(weight)) %>%
  spread(sex, mean_weight)

# It gives me the following output, which is what I would like to get:
# A tibble: 10 x 4
   genus              V1      F      M
   <chr>           <dbl>  <dbl>  <dbl>
 1 Baiomys          NA     9.16   7.36
 2 Chaetodipus      19.8  23.8   24.7 
 3 Dipodomys        81.4  55.2   56.2 
 4 Neotoma         168.  154.   166.  
 5 Onychomys        23.4  26.8   26.2 
 6 Perognathus       6     8.57   8.20
 7 Peromyscus       19.9  22.5   20.6 
 8 Reithrodontomys  11.1  11.2   10.2 
 9 Sigmodon         70.3  71.7   61.3 
10 Spermophilus     NA    57    130  

Here is my attempt with pivot_wider():

surveys %>%
  filter(taxa == "Rodent",
         !is.na(weight)) %>%
  group_by(sex,genus) %>%
  summarize(mean_weight = mean(weight)) %>%
  pivot_wider(
    names_from = sex,
    values_from = mean_weight,
    names_repair = "minimal"
    )

It gives the following error:
Error: Column 1 must be named.
Use .name_repair to specify repair.
Run `rlang::last_error()` to see where the error occurred.

Upvotes: 1

Views: 3566

Answers (2)

Lizz Huntley

Reputation: 11

If you don't want to pre-process your data, pivot_wider() has recently gained new arguments to help with missing factor levels (see the tidyr NEWS: https://cran.r-project.org/web/packages/tidyr/news/news.html):

"pivot_wider() gains new names_expand and id_expand arguments for turning implicit missing factor levels and variable combinations into explicit ones. This is similar to the drop argument from spread()"
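
As a minimal sketch of what names_expand does, assuming tidyr 1.2.0 or later: the tibble below is made-up toy data (not the surveys data set), and the factor level "U" is a hypothetical level that never occurs in the rows. Note that this covers implicit missing factor levels; it does not by itself rename a column whose name would be an empty string.

```r
library(dplyr)
library(tidyr)

# Toy data (hypothetical): the factor level "U" is declared but never observed
toy <- tibble(
  genus = c("A", "A", "B"),
  sex = factor(c("F", "M", "F"), levels = c("F", "M", "U")),
  mean_weight = c(1, 2, 3)
)

# names_expand = TRUE creates a column for every factor level,
# including the unobserved "U", which is filled with NA
wide_expanded <- toy %>%
  pivot_wider(
    names_from = sex,
    values_from = mean_weight,
    names_expand = TRUE
  )

wide_expanded
```

Without names_expand = TRUE, the unobserved level would simply not get a column, which is the behaviour the drop argument used to control in spread().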

Upvotes: 1

Giovanni Colitti

Reputation: 2344

The missing values in sex are stored as empty strings, so replace them before pivoting; otherwise one of the new columns ends up with an empty name:

surveys %>%
  filter(taxa == "Rodent",
         !is.na(weight)) %>%
  group_by(sex,genus) %>%
  summarize(mean_weight = mean(weight)) %>%
  ungroup() %>% 
  mutate(sex = if_else(sex == "", "unknown", sex)) %>% 
  pivot_wider(
    names_from = sex,
    values_from = mean_weight,
    names_repair = "minimal"
  )
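
If you want to try the idea without downloading the CSV, here is a small self-contained sketch of the same fix. The tibble is made-up toy data with invented weights, and "unknown" is just the replacement label used above; any non-empty string should do.

```r
library(dplyr)
library(tidyr)

# Toy data (hypothetical values): one row has an empty string for sex
toy <- tibble(
  genus = c("Baiomys", "Baiomys", "Neotoma", "Neotoma", "Neotoma"),
  sex = c("F", "M", "", "F", "M"),
  mean_weight = c(9.2, 7.4, 168, 154, 166)
)

# Give the empty string a real label first, so every new column has a name
wide_fixed <- toy %>%
  mutate(sex = if_else(sex == "", "unknown", sex)) %>%
  pivot_wider(
    names_from = sex,
    values_from = mean_weight
  )

wide_fixed
```

Genera with no "unknown" row get NA in that column, which matches the NA entries in the spread() output from the question.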

Upvotes: 4
