Reputation: 659
I'm fetching some data out of my database to be outputted inside .csv
file. These outputted rows may or may not include two dates in multiple different formats, including:
YYYY-MM-DD (2019-02-01)
DD.MM.YYYY (01.02.2019)
D.M.YYYY (1.2.2019)
DD.MM.YY (01.02.19)
D.MM.YY (1.02.19)
D.MM.YYYY (1.02.2019)
These two dates are usually located at the end of the string, however there are few occasions where the dates are in the middle of the string. Here's a few example rows outputted from the database:
Product 1, 1.10.2018 - 31.12.2018 just a test string
Product 2 15.12.18-23.6.19
What I'd like to do now is to parse both of these dates in one format (DD.MM.YYYY) if found from the string into own variables.
If dates are not found in the string, the row can be ignored for now. Here's what I've tried so far:
<?php
function contains_date($str) {
if (preg_match('/\b(\d{4})-(\d{2})-(\d{2})\b/', $str, $matches)) {
if (checkdate($matches[2], $matches[3], $matches[1])) {
return true;
}
}
return false;
}
$i = 0;
$table = [];
while($row = $stmt->fetch()) {
if(contains_date($row['product'])) {
$product = preg_replace('/\s+/', '', $row['product']);
$date = substr($product, -21);
$periodStart = date('d.m.Y', strtotime(substr($date, 0, 10)));
$periodEnd = date('d.m.Y', strtotime(substr($date, 11)));
}
$table[$i]['product'] = $row['product'];
$table[$i]['startDate'] = $periodStart;
$table[$i]['endDate'] = $periodEnd;
$i++;
}
?>
This does work on cases where the date is in YYYY-MM-DD format and if the two dates are found at the end of the string. It does not cover all the cases where date format is different nor if the dates are found in the middle of the string. Any help would be much appreciated!
Upvotes: 2
Views: 329
Reputation: 626758
You may use identically named groups to match different date formats with one regex:
$rx = '/(?J)(?<!\d)(?:(?<year>\d{4})-(?<month>\d{1,2})-(?<day>\d{1,2})|(?<day>\d{1,2})\.(?<month>\d{1,2})\.(?<year>\d{2}(?:\d{2})?))(?!\d)/';
See the regex demo.
POIs:
J
modifier(?<!\d)
negative lookbehind at the start of the regex cancels the match after a digit (it matches a location that is not immediately preceded with a digit)(?!\d)
negative lookahead at the end of the regex cancels the match before a digit (it matches a location that is not immediately followed with a digit)$strs = ['YYYY-MM-DD (2019-02-01)', 'DD.MM.YYYY (01.02.2019)', 'D.M.YYYY (1.2.2019)','DD.MM.YY (01.02.19)','D.MM.YY (1.02.19)','D.MM.YYYY (1.02.2019)','Product 1, 1.10.2018 - 31.12.2018 just a test string','Product 2 15.12.18-23.6.19' ];
$rx = '/(?J)(?<!\d)(?:(?<year>\d{4})-(?<month>\d{1,2})-(?<day>\d{1,2})|(?<day>\d{1,2})\.(?<month>\d{1,2})\.(?<year>\d{2}(?:\d{2})?))(?!\d)/';
foreach ($strs as $s) {
echo "INPUT: $s\n";
if (preg_match_all($rx, $s, $matches, PREG_SET_ORDER, 0)) {
foreach ($matches as $m) {
echo "DAY: " . $m["day"] . "\nMONTH: " . $m["month"] . "\nYEAR: " . $m["year"] . "\n\n";
}
}
}
Output:
INPUT: YYYY-MM-DD (2019-02-01)
DAY: 01
MONTH: 02
YEAR: 2019
INPUT: DD.MM.YYYY (01.02.2019)
DAY: 01
MONTH: 02
YEAR: 2019
INPUT: D.M.YYYY (1.2.2019)
DAY: 1
MONTH: 2
YEAR: 2019
INPUT: DD.MM.YY (01.02.19)
DAY: 01
MONTH: 02
YEAR: 19
INPUT: D.MM.YY (1.02.19)
DAY: 1
MONTH: 02
YEAR: 19
INPUT: D.MM.YYYY (1.02.2019)
DAY: 1
MONTH: 02
YEAR: 2019
INPUT: Product 1, 1.10.2018 - 31.12.2018 just a test string
DAY: 1
MONTH: 10
YEAR: 2018
DAY: 31
MONTH: 12
YEAR: 2018
INPUT: Product 2 15.12.18-23.6.19
DAY: 15
MONTH: 12
YEAR: 18
DAY: 23
MONTH: 6
YEAR: 19
Upvotes: 2