Abe

Reputation: 415

Combining mutate, across, and as_date('yyyymmdd') results in all NA

I'm trying to combine mutate, across, and as_date, including the format argument in as_date. This is resulting in all NA values, as shown below.

library(tidyverse)
library(tis)

holidays(2023) %>% 
  data.frame() %>% 
  set_names('holiday_date') %>% 
  rownames_to_column() %>%
  pivot_wider(names_from = rowname,values_from = holiday_date) %>% 
  mutate(across(everything(), as.character)) %>%
  mutate(across(everything(), 
                #as_date                      # WORKS
                ~as_date(.,format="yyyymmdd") # DOESN'T WORK
  ))

This results in

# A tibble: 1 × 10
  NewYears MLKing GWBirthday Memorial Juneteenth Independence Labor  Columbus Thanksgiving
  <date>   <date> <date>     <date>   <date>     <date>       <date> <date>   <date>      
1 NA       NA     NA         NA       NA         NA           NA     NA       NA          
# ℹ 1 more variable: Christmas <date>

If I swap the commented out as_date line (labelled "WORKS") for the ~as_date line ("DOESN'T WORK"), I get the expected result, sans the desired formatting:

# A tibble: 1 × 10
  NewYears   MLKing     GWBirthday Memorial   Juneteenth Independence Labor      Columbus   Thanksgiving
  <date>     <date>     <date>     <date>     <date>     <date>       <date>     <date>     <date>      
1 2023-01-02 2023-01-16 2023-02-20 2023-05-29 2023-06-19 2023-07-04   2023-09-04 2023-10-09 2023-11-23  
# ℹ 1 more variable: Christmas <date>

Can someone tell me what is going wrong here and how to obtain the desired result?

Upvotes: 0

Views: 182

Answers (3)

The format argument of as_date() (and of base R's as.Date()) describes how to parse the input values into Date values; it is not the format of the output. The layout also has to be written with strptime codes: "%Y%m%d", not "yyyymmdd", which is why every value came back NA. Output formatting is the job of format(), but that turns your Date values into character strings.

If you stop at as.character() and skip as_date(), the values keep the desired "yyyymmdd" layout (written "%Y%m%d" in R), but they are character strings and not Date objects.
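To make the parse-versus-print distinction concrete, here is a minimal base R sketch. The input string is just one of the values from the question; nothing here depends on tis or the tidyverse.

```r
# Parsing: `format` describes how the INPUT text is laid out.
x <- "20230102"
d <- as.Date(x, format = "%Y%m%d")
class(d)                 # "Date"

# "yyyymmdd" contains no strptime codes, so the text cannot match and the result is NA.
bad <- as.Date(x, format = "yyyymmdd")
is.na(bad)               # TRUE

# Formatting: format() describes how the OUTPUT text should look
# and returns a character string, not a Date.
s <- format(d, "%Y%m%d")
class(s)                 # "character"
```

A Date always prints in ISO form (2023-01-02); to display it as 20230102 you have to accept a character column.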

Upvotes: 1

Mark

Reputation: 12548

The issue can be seen if you look at the data type of the output of holidays(2023):

typeof(holidays(2023)[[1]]) # [1] "double"

holidays(2023)[[1]] - 3 # [1] 20230099

Even though it looks like a date, or maybe even a string, it's actually just a number; 20230102 is 20 million, 230 thousand and 102, hence subtracting three gives you 20230099.

The solution is to convert it to character and then parse that string into a Date (a Date in R is itself a double under the hood, but that's a conversation for a different time).

This can all be done quite succinctly with enframe() and the values_fn argument of pivot_wider():

enframe(holidays(2023)) |>
  pivot_wider(values_fn = \(x) as.character(x) |> as.Date(format = '%Y%m%d'))
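The number-to-character-to-Date route can also be tried without tis installed. The sketch below assumes a hypothetical named double vector `h` as a stand-in for holidays(2023); only the two values shown in the question are used.

```r
# Hypothetical stand-in for holidays(2023): yyyymmdd stored as plain numbers.
h <- c(NewYears = 20230102, MLKing = 20230116)
typeof(h)                          # "double"

# Arithmetic on the raw numbers is ordinary subtraction, not date arithmetic.
raw_minus_3 <- h[["NewYears"]] - 3 # 20230099

# Convert to text first, then parse the text with strptime codes.
txt   <- as.character(h)           # "20230102" "20230116"
dates <- as.Date(txt, format = "%Y%m%d")

# Now subtraction moves back by calendar days.
date_minus_3 <- dates[1] - 3       # 2022-12-30
```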

Upvotes: 1

jeffreyohene

Reputation: 606

When you used pivot_wider() and then applied mutate(across(...)), the columns were converted to character, so as_date() has to parse those strings, and the format argument must describe their layout with strptime codes, which "yyyymmdd" does not.

Instead, convert the columns to character with mutate(across(...)) directly after pivot_wider(), and then use another mutate(across(...)) to convert them to dates. If you pass a format, write it with % codes such as "%Y-%m-%d".

holidays(2023) %>% 
  data.frame() %>% 
  set_names('holiday_date') %>% 
  rownames_to_column() %>%
  pivot_wider(names_from = rowname, values_from = holiday_date) %>% 
  mutate(across(everything(), as.character)) %>%
  mutate(across(everything(), as_date)) %>%  # Convert to dates without format
  mutate(across(everything(), 
                ~as_date(., format = "%Y-%m-%d")))  # format must be written in strptime codes
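For comparison, the column-wise conversion can be sketched in base R with the layout written as "%Y%m%d". The data frame `wide` and the helper `parse_ymd` are hypothetical names used only for illustration; `wide` stands in for the one-row table that pivot_wider() plus as.character produces.

```r
# Hypothetical one-row table of yyyymmdd strings, one column per holiday.
wide <- data.frame(
  NewYears = "20230102",
  MLKing   = "20230116",
  stringsAsFactors = FALSE
)

# Parser with the input layout given in strptime codes.
parse_ymd <- function(x) as.Date(x, format = "%Y%m%d")

# Apply it to every column while keeping the data frame shape.
wide[] <- lapply(wide, parse_ymd)
str(wide)
```

Inside the original pipeline, the equivalent lambda would be ~as_date(., format = "%Y%m%d") in the second mutate(across(...)).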

Upvotes: 1
