CodeMaster

Reputation: 449

Converting empty values to NULL in R - Handling date column

I have a simple data frame. Here is the output of dput(emp):

structure(list(name = structure(1L, .Label = "Alex", class = "factor"), 
    job = structure(1L, .Label = "", class = "factor"), Mgr = structure(1L, .Label = "", class = "factor"), 
    update = structure(18498, class = "Date")), class = "data.frame", row.names = c(NA, 
-1L))

I want to convert all empty values to NULL (NA in R).

The simplest way to achieve this is:

emp[emp==""] <- NA

This would normally work, of course, but I get the following error because of the date column:

Error in charToDate(x) : 
  character string is not in a standard unambiguous format

How can I convert all the other empty values to NULL without having to deal with the date column? Please note that the actual data frame has 30000+ rows.

Upvotes: 2

Views: 1987

Answers (4)

ThomasIsCoding

Reputation: 102309

You can try type.convert, as below:

type.convert(emp,as.is = TRUE)

which gives

  name job Mgr     update
1 Alex  NA  NA 2020-08-24

Upvotes: 4

Ivn Ant

Reputation: 135

You can try this with dplyr, where df stands for your data frame (emp in the question):

library(dplyr)

df %>% 
  mutate_at(vars(update),as.character) %>%
  na_if(.,"")

As mentioned by @Duck, you have to convert the date variable to character first.

Afterwards, you can convert it back to Date if you need to:

library(dplyr)

df %>% 
  mutate_at(vars(update),as.character) %>%
  na_if(.,"") %>%
  mutate_at(vars(update),as.Date)

Upvotes: 2

Karthik S

Reputation: 11594

See if this works:

> library(dplyr)
> library(purrr)
> emp <- structure(list(name = structure(1L, .Label = "Alex", class = "factor"), 
+                      job = structure(1L, .Label = "", class = "factor"), Mgr = structure(1L, .Label = "", class = "factor"), 
+                      update = structure(18498, class = "Date")), class = "data.frame", row.names = c(NA, 
+                                                                                                      -1L))
> emp
  name job Mgr     update
1 Alex         2020-08-24
> emp %>% mutate(update = as.character(update)) %>% map_df(~gsub('^$',NA, .x)) %>% mutate(update = as.Date(update)) %>% mutate(across(1:3, as.factor))
# A tibble: 1 x 4
  name  job   Mgr   update    
  <fct> <fct> <fct> <date>    
1 Alex  NA    NA    2020-08-24
> 

Upvotes: 1

Duck

Reputation: 39613

Try converting the date variable to character, making the replacement, and then converting it back to Date:

#Format date
emp$update <- as.character(emp$update)
#Replace
emp[emp=='']<-NA
#Reformat date
emp$update <- as.Date(emp$update)
 

Output:

  name  job  Mgr     update
1 Alex <NA> <NA> 2020-08-24
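
A variation on the same idea, as a minimal sketch only: instead of converting the date column back and forth, the replacement can be restricted to the factor and character columns, so the Date column is never compared with "". The helper name is_text is purely illustrative and does not come from the posts above; the data frame is rebuilt from the question's dput output.

```r
# Rebuild the example data frame from the question.
emp <- data.frame(
  name   = factor("Alex"),
  job    = factor(""),
  Mgr    = factor(""),
  update = as.Date("2020-08-24")
)

# Flag the factor/character columns; the Date column is left out.
# (is_text is a hypothetical helper name used only for this sketch.)
is_text <- vapply(
  emp,
  function(col) is.factor(col) || is.character(col),
  logical(1)
)

# Replace "" with NA in the flagged columns only.
# which() skips values that are already NA.
emp[is_text] <- lapply(emp[is_text], function(col) {
  col[which(col == "")] <- NA
  col
})

print(emp)
```

Because only the selected columns are rewritten, the update column keeps its Date class throughout. Note that the empty string remains a factor level in job and Mgr unless droplevels() is applied afterwards.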

Upvotes: 4
