Reputation: 5640
I want to capture the name and the code from this kind of table:
| 2 | Aix en Provence (Gare SNCF) | QXB |
| 3 | Ajaccio | AJA |
| 4 | Angers | ANE |
| 5 | Angers (Gare SNCF) | QXG |
With \|\s+\d+\s\|\s([^|]+)\|\s(\w+)\s+\| I can extract everything up to the next |.
However, I want to trim the first capture group.
So my question is: how can I tell a regex to stop capturing when there is more than one space between words?
Upvotes: 1
Views: 161
Reputation: 626738
You may turn the greedy + after the [^|] character class into a lazy one and add a \s* (zero or more whitespace characters) pattern right after it.
Use
\|\s+\d+\s*\|\s*([^|]+?)\s*\|\s*(\w+)\s+\|
                ^^^^^^^^^^^
See the regex demo.
Since a lazily quantified subpattern is only expanded when the subsequent subpatterns fail to find a match, the trailing whitespace formerly captured into Group 1 is now consumed by the \s* pattern, and Group 1 no longer contains it.
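To see the difference in practice, here is a minimal sketch in Python (the question does not name a regex flavor, so Python's re module is an assumption). The sample rows are padded with extra spaces to mimic an aligned table, and the names ORIGINAL, FIXED, SAMPLE and extract_rows are hypothetical, introduced only for this illustration.

```python
import re

# Pattern from the question: the greedy [^|]+ also captures the padding spaces.
ORIGINAL = re.compile(r"\|\s+\d+\s\|\s([^|]+)\|\s(\w+)\s+\|")

# Pattern from the answer: the lazy [^|]+? followed by \s* leaves the padding
# outside Group 1.
FIXED = re.compile(r"\|\s+\d+\s*\|\s*([^|]+?)\s*\|\s*(\w+)\s+\|")

# Hypothetical sample input, padded the way an aligned table would be.
SAMPLE = (
    "| 2 | Aix en Provence (Gare SNCF)   | QXB |\n"
    "| 3 | Ajaccio                       | AJA |\n"
)

def extract_rows(pattern, text):
    """Return a list of (name, code) tuples for every table row in text."""
    return [(m.group(1), m.group(2)) for m in pattern.finditer(text)]

if __name__ == "__main__":
    print(extract_rows(ORIGINAL, SAMPLE))  # names keep their trailing spaces
    print(extract_rows(FIXED, SAMPLE))     # names are trimmed
```

If changing the pattern is not an option, calling str.strip() on Group 1 after matching would give the same trimmed names.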
Upvotes: 1