Reputation: 1754
Sorry for the confusing title, I'm a bit lost myself.
I have the following (sample of) strings that I retrieved while scraping government data:
- Responsable administrative et financière - 01 02 03 04 05
- Gestionnaire travaux d'entretien sur les monuments historiques (Titre 3 - fonctionnement) et contrôle scientifique et technique sur les MH inscrits - Ille-et-Vilaine - 01 02 03 04 05
- Conseiller
- Ingénieure des services culturels urbanisme et environnement
As you can see, some of them have a phone number (the last 10 digits after the final dash in the first two lines), and some don't.
I'm looking for a way to capture everything after the first dash in one group and, if the line ends with a dash followed by a phone number, to capture that phone number in a second group.
So, with my input, I'd like to get the following back:
group1: "Responsable administrative et financière"
group2: "01 02 03 04 05"
group1: "Gestionnaire travaux d'entretien sur les monuments historiques (Titre 3 - fonctionnement) et contrôle scientifique et technique sur les MH inscrits - Ille-et-Vilaine"
group2: "01 02 03 04 05"
group1: "Conseiller"
group1: "Ingénieure des services culturels urbanisme et environnement"
The closest I've been with regex is the following:
/- (.*)(?: - (.*))/gm
But I don't know where to go from there: if I add a "?" to make the second part optional, the first group matches everything. Demo here
How should I proceed?
Thank you in advance
Upvotes: 1
Views: 24
Reputation: 163217
You can match any character except a "-" in the second part, and make that part optional while the first part is non-greedy:
^- (.*?)(?: - ([^-\n]*))?$
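To check the pattern against the sample lines from the question, here is a minimal sketch in Python. The choice of Python, the names PATTERN, STRICT, SAMPLE and split_entries, and the stricter phone format (five two-digit pairs) are assumptions made for illustration; none of them come from the question or the answer.

```python
import re

# Pattern from the answer. re.MULTILINE plays the role of the "m" flag,
# so ^ and $ match at the start and end of every line.
PATTERN = re.compile(r"^- (.*?)(?: - ([^-\n]*))?$", re.MULTILINE)

# Hypothetical stricter variant: only accept five two-digit pairs as a phone
# number, so a trailing " - text" without digits stays in group 1.
STRICT = re.compile(r"^- (.*?)(?: - (\d{2}(?: \d{2}){4}))?$", re.MULTILINE)

SAMPLE = (
    "- Responsable administrative et financière - 01 02 03 04 05\n"
    "- Gestionnaire travaux d'entretien sur les monuments historiques "
    "(Titre 3 - fonctionnement) et contrôle scientifique et technique "
    "sur les MH inscrits - Ille-et-Vilaine - 01 02 03 04 05\n"
    "- Conseiller\n"
    "- Ingénieure des services culturels urbanisme et environnement\n"
)


def split_entries(text, pattern=PATTERN):
    """Return a (title, phone) pair per matching line; phone is None if absent."""
    return [(m.group(1), m.group(2)) for m in pattern.finditer(text)]


if __name__ == "__main__":
    for title, phone in split_entries(SAMPLE):
        print(repr(title), repr(phone))
```

On the sample, the first two lines yield a phone number in the second slot and the last two yield None. The answer's pattern treats any dash-free text after the last " - " as group 2, so if some lines can end in " - some text" without a phone number, the STRICT variant may be the safer choice.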
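- "^- " : start of the string (or of each line with the m flag), then match "- " literally
- "(.*?)" : capture group 1; match any character except a newline, as few as possible
- "(?:" : open a non-capturing group
- " - " : match literally
- "([^-\n]*)" : capture group 2; match zero or more characters other than "-" and a newline
- ")?" : close the non-capturing group and make it optional
- "$" : end of the string (or of each line with the m flag)
Upvotes: 2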