Nil_07

Reputation: 146

R: Regex to string detect date-time format in R

What would be a good way to detect a date-time in the following format?

14/07/2009 19:15:29

Is this a foolproof solution?

str_detect(s,regex("([0-9]{2}/[0-9]{2}/[0-9]{4}) [0-9]{2}:[0-9]{2}:[0-9]{2}"))

For example, for the format

14.07.2009

I have written the regex to be

str_detect(date,regex("([0-9]{2}\\.[0-9]{2}\\.[0-9]{4})"))

I only know the very basics of regex, in R or in general, so I would appreciate a simple approach with the logic explained in detail. Thanks in advance.

Upvotes: 0

Views: 873

Answers (1)

ktiu

Reputation: 2626

As a beginner, I sometimes found it helpful to assemble the pattern as follows:

c(
  "[0-9]{2}", # day
  "/",
  "[0-9]{2}", # month
  "/",
  "[0-9]{4}", # year
  " ",
  "[0-9]{2}", # hour
  ":",
  "[0-9]{2}", # minute
  ":",
  "[0-9]{2}"  # second
) |> paste(collapse = "")

This returns the pattern:

[1] "[0-9]{2}/[0-9]{2}/[0-9]{4} [0-9]{2}:[0-9]{2}:[0-9]{2}"

which you can then pass to str_detect():

stringr::str_detect("14/07/2009 19:15:29",
                    "[0-9]{2}/[0-9]{2}/[0-9]{4} [0-9]{2}:[0-9]{2}:[0-9]{2}")
# [1] TRUE
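
On the "foolproof" question, here is a minimal sketch, assuming stringr is installed. str_detect() returns TRUE when the pattern matches anywhere in the string, so the unanchored pattern also accepts strings with extra text around the date-time. Adding ^ and $ restricts the match to the whole string. The variable names are only illustrative, and the pattern still checks the shape alone, not whether the values form a real date or time:

```r
library(stringr)

# Pattern from above, without and with start/end anchors
unanchored <- "[0-9]{2}/[0-9]{2}/[0-9]{4} [0-9]{2}:[0-9]{2}:[0-9]{2}"
anchored <- paste0("^", unanchored, "$")

x <- c(
  "14/07/2009 19:15:29",      # exactly the expected format
  "id=14/07/2009 19:15:29;",  # date-time embedded in other text
  "99/99/2009 99:99:99"       # right shape, impossible values
)

loose <- str_detect(x, unanchored)
strict <- str_detect(x, anchored)

loose   # TRUE TRUE TRUE
strict  # TRUE FALSE TRUE
```

The last element shows the limit of a purely shape-based check: it passes even when anchored.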

Update (as per comments)

Here is how you could use the lubridate package. dmy_hms() parses date-times in your format:

lubridate::dmy_hms("14/07/2009 19:15:29")

# [1] "2009-07-14 19:15:29 UTC"

But it returns NA (with a warning) for invalid date-times:

lubridate::dmy_hms("14/07/2009 19:15:70") # invalid seconds

# [1] NA

So to validate you could do:

(! is.na(lubridate::dmy_hms("14/07/2009 19:15:29")))

# [1] TRUE
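
The same check can be wrapped up for a whole vector. This is a sketch assuming lubridate is installed; is_valid_dmy_hms is a hypothetical helper name, and quiet = TRUE only suppresses the "failed to parse" warning:

```r
# Hypothetical helper: TRUE where the string parses as day/month/year
# hour:minute:second, FALSE where lubridate returns NA
is_valid_dmy_hms <- function(x) {
  !is.na(lubridate::dmy_hms(x, quiet = TRUE))
}

checked <- is_valid_dmy_hms(c(
  "14/07/2009 19:15:29",  # valid
  "14/07/2009 19:15:70",  # invalid seconds
  "31/02/2009 10:00:00"   # 31 February does not exist
))

checked
# [1]  TRUE FALSE FALSE
```

lubridate's parsers are generally lenient about separators, so if the exact layout matters, a regex check on the layout can be combined with this validity check.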

Upvotes: 1
