Regex for column separated values (PHP)

Question

I have the following dataset, organized in columns:

startword:          blablabla:              blablabla:
123456              123456                  123456
important phrase 1  important phrase 2      important phrase 3

I want each important phrase as a separate match. At the moment, I have the following regex (which works fine; see a demo on regex101.com):

^startword:(?:.*\R){2}\K(?<important>.*)
# look for the startword, consume the two non-important lines and throw them away (\K)
# capture everything on the important line in a group called "important"

Expected output (an array actually):

match1 => important phrase 1
match2 => important phrase 2
match3 => important phrase 3

At the moment, I split the line programmatically in PHP (using two spaces as the delimiter), but I wonder if there's a better way to get the matches in groups directly (some \G magic?).
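For reference, the current two-step approach might look roughly like the sketch below. The group name "important" is taken from the regex comment above. The variable names, the m flag, and the use of preg_split with \h{2,} (instead of a literal two-space delimiter) are assumptions made for illustration.

```php
<?php
// Sample data copied from the question.
$data = <<<'TXT'
startword:          blablabla:              blablabla:
123456              123456                  123456
important phrase 1  important phrase 2      important phrase 3
TXT;

// Step 1: skip the header and number lines, capture the whole third line.
$pattern = '/^startword:(?:.*\R){2}\K(?<important>.*)/m';

$phrases = [];
if (preg_match($pattern, $data, $m) === 1) {
    // Step 2: split on runs of two or more horizontal whitespace characters,
    // so single spaces inside a phrase are left alone.
    $phrases = preg_split('/\h{2,}/', trim($m['important']), -1, PREG_SPLIT_NO_EMPTY);
}

print_r($phrases);
```

Splitting on \h{2,} rather than on exactly two spaces avoids empty entries when the gap between columns is wider than two characters.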

anubhava · Accepted Answer

You can use this single regex to capture all your matches:

(?:^startword:(?:.*\R){2}\K|(?!^)\G\h{2,})(.+?(?=\h{2}|$))


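\G asserts the position at the end of the previous match (or at the start of the string for the first match). The (?!^) in front of it rules out the start position, so that branch can only continue from a previous match.
The lookahead (?=\h{2}|$) makes the lazy .+? stop as soon as two horizontal whitespace characters or the end of the line follow the captured text.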
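A minimal PHP sketch of how this pattern could be applied with preg_match_all follows. The sample data is copied from the question. The m flag and the variable names are assumptions and are not stated in the answer itself.

```php
<?php
// Sample data copied from the question.
$data = <<<'TXT'
startword:          blablabla:              blablabla:
123456              123456                  123456
important phrase 1  important phrase 2      important phrase 3
TXT;

// First branch: find "startword:", skip two lines, reset the match start (\K).
// Second branch: continue right after the previous match (\G) and eat the
// column gap of two or more horizontal whitespace characters.
// Group 1 then lazily captures one phrase up to the next gap or end of line.
$pattern = '/(?:^startword:(?:.*\R){2}\K|(?!^)\G\h{2,})(.+?(?=\h{2}|$))/m';

preg_match_all($pattern, $data, $matches);

// Group 1 holds one entry per column of the important line.
$phrases = $matches[1];

print_r($phrases);
```

For the sample data, $phrases should contain the three phrases in column order. Single spaces inside a phrase do not end the capture because the lookahead requires at least two horizontal whitespace characters.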
