Ruslan

Reputation: 423

How to deal with similar date formats using regular expressions?

I have two weird date ranges:

01.01-02.01.2022
01-02.01.2022

To extract the first date in the usual format (01.01.2022), I have to detect both situations using regular expressions:

For 01.01-02.01.2022:

\d{2}\.\d{2}-\d{2}\.\d{2}\.\d{4} 

And for 01-02.01.2022 (also 01-02/01/2022 or 01-02\01\22):

\d{2}-\d{2}[/\.\\]\d{2}[\./\\]\d{2,4}

The problem is that 01.01-02.01.2022 contains 01-02.01.2022, so the second expression also matches inside the first format.

What regular expression should I use so that 01.01-02.01.2022 is not matched by both expressions?

Of course it would be easy if my strings began with the date: a ^ in front of the expression would solve my problem. But sometimes the dates are wrapped in other text.

Upvotes: 1

Views: 51

Answers (1)

Wiktor Stribiżew

Reputation: 626748

You may use

(^|\D\.|[^0-9.])(\d{2}-\d{2}[/.\\]\d{2}[/.\\]\d{2,4})($|\D)

See the regex demo. Grab the Group 2 value.

Details

  • (^|\D\.|[^0-9.]) - Group 1: either start of string (^), a non-digit char and a dot (\D\.) or any char but digit and dot ([^0-9.])
  • (\d{2}-\d{2}[/.\\]\d{2}[/.\\]\d{2,4}) - Group 2 (this is what you need to extract): 2 digits, -, 2 digits, / or . or \, 2 digits, / or . or \, then two, three or four digits
  • ($|\D) - Group 3: end of string ($) or a non-digit char (\D)
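
As a rough illustration of the breakdown above, here is a minimal Python sketch that applies the pattern and reads Group 2. The function names find_short_range and first_date are hypothetical, and first_date assumes the asker wants the first day joined with the month and year of the range, keeping the original separators.

```python
import re

# Pattern from the answer, with the backslash inside each separator class escaped.
SHORT_RANGE = re.compile(
    r"(^|\D\.|[^0-9.])(\d{2}-\d{2}[/.\\]\d{2}[/.\\]\d{2,4})($|\D)"
)

def find_short_range(text):
    """Return the DD-DD<sep>MM<sep>YYYY range found in text, or None."""
    match = SHORT_RANGE.search(text)
    return match.group(2) if match else None

def first_date(short_range):
    """Hypothetical helper: build the first date of a matched range."""
    first_day, rest = short_range.split("-", 1)
    # rest looks like "02.01.2022": drop the second day, keep separator, month, year.
    return first_day + rest[2:]

for sample in ["01-02.01.2022", "from 01-02/01/2022 on", "01.01-02.01.2022"]:
    print(repr(sample), "->", find_short_range(sample))
```

The last sample is the long format from the question. The "01-02.01.2022" inside it is preceded by a digit and a dot, so none of the Group 1 alternatives can match there and the search finds nothing.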

If you mean to match 2 or 4 digits with \d{2,4}, you must replace it with (\d{4}|\d{2}) or \d{2}(\d{2})?.
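
To show the difference between \d{2,4} and the stricter alternation, here is a small Python sketch. The names LOOSE, STRICT and grab are made up for this example, and the non-capturing group (?:...) is my own choice so that the group numbering stays the same as in the pattern above.

```python
import re

PREFIX = r"(^|\D\.|[^0-9.])"
BODY = r"\d{2}-\d{2}[/.\\]\d{2}[/.\\]"
SUFFIX = r"($|\D)"

# Year part as 2, 3 or 4 digits.
LOOSE = re.compile(PREFIX + "(" + BODY + r"\d{2,4})" + SUFFIX)
# Year part as exactly 4 or exactly 2 digits.
STRICT = re.compile(PREFIX + "(" + BODY + r"(?:\d{4}|\d{2}))" + SUFFIX)

def grab(pattern, text):
    """Return Group 2 of the first match of pattern in text, or None."""
    match = pattern.search(text)
    return match.group(2) if match else None

for sample in ["01-02.01.22", "01-02.01.202", "01-02.01.2022"]:
    print(sample, "| loose:", grab(LOOSE, sample), "| strict:", grab(STRICT, sample))
```

A three-digit year such as 202 is accepted by the loose variant but rejected by the strict one, because ($|\D) will not let the match stop in the middle of a run of digits.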

Upvotes: 1
