Regex how to stop capturing if there is more than one space

Question

I want to capture the name and code from this kind of table:

|   2 | Aix en Provence (Gare SNCF)                        | QXB  |
|   3 | Ajaccio                                            | AJA  |
|   4 | Angers                                             | ANE  |
|   5 | Angers (Gare SNCF)                                 | QXG  |

With \|\s+\d+\s\|\s([^|]+)\|\s(\w+)\s+\| I can extract the whole cell up to the next |, trailing spaces included.

However, I want to trim the first capture group.

So my question is: how can I tell a regex to stop capturing when there is more than one space between the words?

Here you have a playground.

Wiktor Stribiżew · Accepted Answer

You may turn the greedy + after the [^|] character class into a lazy one (+?) and add a \s* (zero or more whitespaces) pattern right after it.

Use

\|\s+\d+\s*\|\s*([^|]+?)\s*\|\s*(\w+)\s+\|
                ^^^^^^^^^^^

See the regex demo.

A lazily quantified subpattern first matches as few characters as possible and is extended one character at a time only when the subsequent subpatterns fail to match. The trailing whitespace that was formerly captured into Group 1 is therefore consumed by the \s* pattern, and Group 1 no longer contains it.
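To see the difference in practice, here is a minimal sketch in Python (the question does not name a language, so the choice of Python and its re module is an assumption). It runs the original greedy pattern and the lazy pattern over the sample table from the question. The variable names are illustrative only.

```python
import re

# Sample rows copied from the question.
TABLE = """\
|   2 | Aix en Provence (Gare SNCF)                        | QXB  |
|   3 | Ajaccio                                            | AJA  |
|   4 | Angers                                             | ANE  |
|   5 | Angers (Gare SNCF)                                 | QXG  |
"""

# Original pattern: the greedy [^|]+ runs up to the next pipe,
# so Group 1 keeps the padding spaces.
GREEDY = re.compile(r"\|\s+\d+\s\|\s([^|]+)\|\s(\w+)\s+\|")

# Pattern from the answer: the lazy [^|]+? stops as early as possible,
# and the following \s* consumes the padding outside the group.
LAZY = re.compile(r"\|\s+\d+\s*\|\s*([^|]+?)\s*\|\s*(\w+)\s+\|")

greedy_rows = GREEDY.findall(TABLE)
rows = LAZY.findall(TABLE)

for name, code in rows:
    # repr() makes any leftover whitespace visible.
    print(repr(name), code)
# 'Aix en Provence (Gare SNCF)' QXB
# 'Ajaccio' AJA
# 'Angers' ANE
# 'Angers (Gare SNCF)' QXG
```

Single spaces inside a name (as in "Aix en Provence") are kept, because the lazy group only gives up characters that \s* can consume immediately before the closing pipe. If changing the pattern is not an option, calling .rstrip() on the greedy Group 1 gives the same names for this input.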
