Krokkie.za

Reputation: 19

Date format and R vs. regional settings

I'm struggling to get R to adhere to the date format / regional settings specified on my Windows machine.

I have:

> Sys.getlocale("LC_TIME")
[1] "English_United States.1252"

But when I check my Control Panel \ Regional and Language Settings, I have:

Format: English (United States)
Short date: M/d/yyyy

It appears that R ignores the date-specific settings entirely. Dates formatted by R seem to be hardcoded(?) to the fixed format "%Y-%m-%d":

format.Date(Sys.Date())
[1] "2019-01-25"

The problem arises when interacting with data from external programs that do adhere to the local regional settings, for example dates formatted by Microsoft Excel, and when trying to make the code generic across regional settings with different date formats.

I found this to work to some degree:

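library(magrittr)  # provides the %>% pipe

# readRegistry() is Windows-only (utils package)
sShortDate <- readRegistry("Control Panel\\International", "HCU")$sShortDate
# convert Windows date format strings into R date format strings
# (handles numeric fields only: d, dd, M, MM, yy, yyyy)
CPformat2Rformat <- function(f) {
  f %>%
    gsub("yyyy", "%Y", .) %>%
    gsub("yy",   "%y", .) %>%
    gsub("M+",   "%m", .) %>%   # "M+" so that "MM" becomes a single "%m"
    gsub("d+",   "%d", .)       # "d+" so that "dd" becomes a single "%d"
}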
sShortDate <- CPformat2Rformat(sShortDate)

# now we have the right format
format.Date(Sys.Date(), sShortDate)
# and the other way around;
strptime("01/25/2019", sShortDate)

Is this the best way to handle this? Or is there a better function that handles the ShortDate and LongDate formats?

Upvotes: 1

Views: 710

Answers (2)

r2evans

Reputation: 160607

If I read you correctly, you are asking why R does not display dates per your locale. I agree, it would be nice if you could control the format of a displayed Date object while preserving it as a Date object. But there is no real way to do that without a bit of work, such as providing your own extension.

Here's a hack: override the default print.Date method. It only works when a Date vector is printed directly on the console, not when it is embedded in another object such as a data.frame:

x <- data.frame(dt = Sys.Date(), dttm = Sys.time())
x$dt
# [1] "2019-01-25"
class(x$dt)
# [1] "Date"

print.Date <- function (x, max = NULL, ...) {
    fmt <- getOption("date.format", "%m/%d/%Y")
    if (is.null(max)) 
        max <- getOption("max.print", 9999L)
    if (max < length(x)) {
        print(format(x[seq_len(max)], format=fmt), max = max, ...)
        cat(" [ reached getOption(\"max.print\") -- omitted", 
            length(x) - max, "entries ]\n")
    }
    else if (length(x)) 
        print(format(x, format=fmt), max = max, ...)
    else cat(class(x)[1L], "of length 0\n")
    invisible(x)
}

x
#           dt                dttm
# 1 2019-01-25 2019-01-25 00:22:38
x$dt
# [1] "01/25/2019"
class(x$dt)
# [1] "Date"
options(date.format="%b %d, %Y")
x$dt
# [1] "Jan 25, 2019"

I defined the default to be "%m/%d/%Y" here, but that simply caters to your preference. A more generic version of this function might default to R's own "%Y-%m-%d" and let the user change it as needed.
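As a rough sketch of that more generic variant: the helper below (fmt_date is a made-up name, and "date.format" is just the option name used in this answer, not something base R defines) falls back to "%Y-%m-%d" until the user sets the option.

```r
# Hypothetical helper: format a Date using a user-settable option,
# falling back to R's ISO default when the option is not set.
# "date.format" is an option name invented in this answer, not a base R option.
fmt_date <- function(x, fmt = getOption("date.format", "%Y-%m-%d")) {
  format(x, format = fmt)
}

d <- as.Date("2019-01-25")

options(date.format = NULL)        # make sure the option is unset
fmt_date(d)                        # "2019-01-25"

options(date.format = "%m/%d/%Y")  # user opts in to a regional format
fmt_date(d)                        # "01/25/2019"
```

Because the default argument is evaluated at call time, changing the option takes effect on the next call without redefining the function.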

(I don't know off-hand how to make data.frame printing use this function. It is likely due to scope: the data.frame code is probably using the base method because of the search path of namespaces. If you ask "can I replace that one", it's been asked before, and typically the answer is along the lines of "this might work but ..." or "caveat emptor". In my experience it is never really perfectly resolved, especially when trying to override base functions.)
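One workaround that sidesteps the question of which method the data.frame printer ends up calling is to convert the Date columns to formatted strings just for display. This is only a sketch and not part of the hack above; show_dates is a hypothetical helper name, and the original data frame keeps its real Date columns.

```r
# Hypothetical helper: return a display copy of `df` in which every Date
# column has been replaced by formatted character strings.
show_dates <- function(df, fmt = "%m/%d/%Y") {
  is_date <- vapply(df, inherits, logical(1), what = "Date")
  if (any(is_date)) {
    df[is_date] <- lapply(df[is_date], format, format = fmt)
  }
  df
}

x <- data.frame(dt = as.Date("2019-01-25"), n = 1)
show_dates(x)   # dt is shown as 01/25/2019
class(x$dt)     # still "Date" in the original
```

The obvious cost is that the returned copy holds character columns, so it should be used for printing only and not for further date arithmetic.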

Upvotes: 1

Paul

Reputation: 2959

You can use the as.Date() function with an explicit format argument to parse different date formats, for example:

as.Date("01/25/2019", format='%m/%d/%Y')
as.Date("Jan-30-2019", format='%b-%d-%Y')
as.Date("25012019", format='%d%m%Y')

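will all give the same kind of generic object, a Date, regardless of how the input string was written. (Note that %b matches month names in the current locale.)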
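If the incoming format is not known in advance, one option is to try a list of candidate formats per element. The sketch below is an assumption-laden illustration: parse_date_any and its default format list are made up for this example, and ambiguous strings such as "01/02/2019" are resolved by whichever format comes first in the list.

```r
# Hypothetical helper: try each candidate format in turn and keep, for each
# element, the first format that parses successfully.
parse_date_any <- function(x, formats = c("%m/%d/%Y", "%d.%m.%Y", "%Y-%m-%d")) {
  out <- rep(as.Date(NA), length(x))
  for (fmt in formats) {
    todo <- is.na(out)                         # elements not parsed yet
    out[todo] <- as.Date(x[todo], format = fmt)
  }
  out
}

d <- parse_date_any(c("01/25/2019", "25.01.2019", "2019-01-25"))
format(d)   # all three elements print as "2019-01-25"
```

as.Date() also has a tryFormats argument, but it picks one format based on the first non-NA element, so the loop above is used here to handle mixed inputs.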

Upvotes: 0
