Nicolas Manzanos

Reputation: 21

Split a log file using preg_split based on dates with PHP?

I have a txt file with this format:

14/12/2020 12:02:50
LOG_HERE_1 XXXXX

14/12/2020 12:04:55
LOG_HERE_2 XXXXX

14/12/2020 12:10:33
LOG_HERE_3 XXXXX

I need to split it with a regular expression on the timestamps (dd/mm/yyyy hh:mm:ss), while keeping each timestamp in the resulting array together with its log line. For example:

Array(
 [0] => '14/12/2020 12:02:50 LOG_HERE_1 XXXXX',
 [1] => '14/12/2020 12:04:55 LOG_HERE_2 XXXXX',
 [2] => '14/12/2020 12:10:33 LOG_HERE_3 XXXXX'
)

I tried this:

$array = preg_split('/(\d{2}\/\d{2}\/\d{4}\s\d{2}[:]\d{2}[:]\d{2})/', $data, -1, PREG_SPLIT_DELIM_CAPTURE);

but it shows me:

{
 0: "",
 1: "14/12/2020 12:02:50",
 2: "",
 3: "14/12/2020 12:04:55",
 4: "",
 5: "14/12/2020 12:10:33",
 6: ""
}

Upvotes: 1

Views: 255

Answers (1)

The fourth bird

Reputation: 163372

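With the flag PREG_SPLIT_DELIM_CAPTURE, you can match each line that starts with a date-time-like format, followed by all lines that do not start with one, using a negative lookahead (?!...). Because the whole match sits inside a capture group, preg_split returns each matched block as an array element: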

^(\d{2}/\d{2}/\d{4}\h\d{2}:\d{2}:\d{2}\b.*\R(?:(?!\d{2}/\d{2}/\d{4}\h\d{2}:\d{2}:\d{2}\b).*\R?)*)

If it is enough for a line to start with a date-like pattern (without the time), you could shorten it to:

^(\d{2}/\d{2}/\d{4}\b.*\R(?:(?!\d{2}/\d{2}/\d{4}\b).*\R?)*)

See a regex demo
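
As a rough sketch of how the shortened pattern behaves, the snippet below applies it to the sample from the question. The variable names ($data, $short, $blocks) are placeholders chosen for illustration, and the input string is simply the question's log typed out with \n line endings.

```php
<?php
// Sample input copied from the question (assumed to use \n line endings).
$data = "14/12/2020 12:02:50\nLOG_HERE_1 XXXXX\n\n"
      . "14/12/2020 12:04:55\nLOG_HERE_2 XXXXX\n\n"
      . "14/12/2020 12:10:33\nLOG_HERE_3 XXXXX\n";

// Shortened pattern: a line starting with dd/mm/yyyy, plus every following
// line that does not itself start with dd/mm/yyyy.
$short = '~^(\d{2}/\d{2}/\d{4}\b.*\R(?:(?!\d{2}/\d{2}/\d{4}\b).*\R?)*)~m';

// The captured block is returned as the delimiter; the empty pieces
// between the blocks are dropped by PREG_SPLIT_NO_EMPTY.
$blocks = preg_split($short, $data, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);

echo count($blocks), " blocks\n";
echo $blocks[0];
```

Each block still carries its trailing line breaks, because the pattern consumes them as part of the match.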

For example

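// Single quotes keep the backslashes in the pattern literal.
$pattern = '~^(\d{2}/\d{2}/\d{4}\h\d{2}:\d{2}:\d{2}\b.*\R(?:(?!\d{2}/\d{2}/\d{4}\h\d{2}:\d{2}:\d{2}\b).*\R?)*)~m';
// Keep the captured blocks and drop the empty strings between them.
$result = preg_split($pattern, $data, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
print_r($result);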

See a PHP demo, or a variant with the log on the same line as the date.

Output

Array
(
    [0] => 14/12/2020 12:02:50
LOG_HERE_1 XXXXX


    [1] => 14/12/2020 12:04:55
LOG_HERE_2 XXXXX


    [2] => 14/12/2020 12:10:33
LOG_HERE_3 XXXXX
)
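
The entries above still contain the original line breaks, whereas the question asks for one line per entry. One possible way to get there, as a minimal sketch, is to trim each entry and collapse runs of whitespace into a single space. The names $entries and $oneLine are illustrative, and the arrow function assumes PHP 7.4 or later.

```php
<?php
// Sample input copied from the question (assumed to use \n line endings).
$data = "14/12/2020 12:02:50\nLOG_HERE_1 XXXXX\n\n"
      . "14/12/2020 12:04:55\nLOG_HERE_2 XXXXX\n\n"
      . "14/12/2020 12:10:33\nLOG_HERE_3 XXXXX\n";

$pattern = '~^(\d{2}/\d{2}/\d{4}\h\d{2}:\d{2}:\d{2}\b.*\R(?:(?!\d{2}/\d{2}/\d{4}\h\d{2}:\d{2}:\d{2}\b).*\R?)*)~m';
$entries = preg_split($pattern, $data, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);

// Trim each entry, then replace every run of whitespace
// (including line breaks) with a single space.
$oneLine = array_map(
    fn($entry) => preg_replace('/\s+/', ' ', trim($entry)),
    $entries
);

print_r($oneLine);
```

Note that this also flattens any line breaks inside a multi-line log message, which may or may not be wanted.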

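Note that the date-like pattern only checks the shape of the text; it does not validate that the digits form a real date (for example, 99/99/9999 99:99:99 would also match).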
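
If the timestamps do need to be validated, one option is to parse them after splitting. The sketch below uses DateTime::createFromFormat and compares the round-tripped string with the input; isValidTimestamp is a hypothetical helper name, not part of PHP.

```php
<?php
// Hypothetical helper: true only if $stamp is a real "dd/mm/yyyy hh:mm:ss" date-time.
function isValidTimestamp(string $stamp): bool
{
    $dt = DateTime::createFromFormat('d/m/Y H:i:s', $stamp);

    // Out-of-range values such as 31/02 roll over to a later date instead of
    // failing, so the formatted result is compared with the original string.
    return $dt !== false && $dt->format('d/m/Y H:i:s') === $stamp;
}

var_dump(isValidTimestamp('14/12/2020 12:02:50'));
var_dump(isValidTimestamp('31/02/2020 12:02:50'));
```

The first call reports a valid timestamp; the second does not, because 31 February rolls over into March and no longer matches the input.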

Upvotes: 1
