Reputation: 423
I am trying to write a Ruby regex to extract some data from a long string (HTML source code).
From the string below, I want to keep the four numbers (1, 11, 30, 90) and the first single-quoted string (blablabla):
AjouterRDV(1, 11, 30, 90, 'blablabla', '123' ... (it goes on) );
My regex currently works with the above example, but fails when the string contains an escaped apostrophe, as in:
AjouterRDV(1, 11, 30, 90, 'it\'s failing!', '123' ... (it goes on) );
Here is my regex with two example strings (one passing, one failing): Rubular
Upvotes: 1
Views: 437
Reputation: 2852
A simpler way (assumes you don't need to match anything past your captures):
AjouterRDV\((\d+),(\d+),(\d+),(\d+),'(.+?)',
See Rubular example
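For anyone who wants to try this outside Rubular, here is a minimal Ruby sketch of the same lazy-match idea. The `\s*` after each comma is my own assumption (the sample strings in the question have a space after each comma), and `extract_lazy` is a hypothetical helper name, not something from the answer.

```ruby
# Lazy-match pattern from the answer, with \s* added after each comma
# (an assumption, so that "1, 11" and "1,11" both match).
LAZY_RE = /AjouterRDV\((\d+),\s*(\d+),\s*(\d+),\s*(\d+),\s*'(.+?)',/

# Hypothetical helper: returns [[n1, n2, n3, n4], text] or nil if no match.
def extract_lazy(source)
  m = LAZY_RE.match(source)
  return nil unless m

  numbers = m.captures[0, 4].map(&:to_i)
  [numbers, m[5]]
end

plain   = "AjouterRDV(1, 11, 30, 90, 'blablabla', '123');"
escaped = "AjouterRDV(1, 11, 30, 90, 'it\\'s failing!', '123');"

p extract_lazy(plain)
p extract_lazy(escaped)
```

The lazy `.+?` stops at the first `',` it finds, so this only holds up as long as the quoted text itself never contains an apostrophe immediately followed by a comma.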
Upvotes: 3
Reputation: 213351
You can try this:
/AjouterRDV\( (\d+), (\d+), (\d+), (\d+), '((?:(?<=\\)[']|[^'])*)', .* \);$/ix
Here '((?:(?<=\\)[']|[^'])*)' matches a quoted string in which each character is either a ' preceded by \, or any character except '.
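As a rough illustration of how the lookbehind pattern could be used from Ruby, here is a small sketch. The `\s*` parts, the constant `RDV_RE`, and the helper `parse_rdv` are assumptions of mine for the example; in extended (`x`) mode, whitespace written in the pattern is ignored, so spaces in the input have to be matched explicitly.

```ruby
# Extended-mode version of the lookbehind pattern. Spaces in the input
# are matched with \s* (an assumption about the input format).
RDV_RE = /
  AjouterRDV\(
  \s*(\d+),\s*(\d+),\s*(\d+),\s*(\d+),\s*   # the four numbers
  '((?:(?<=\\)'|[^'])*)'                    # first quoted string
/x

# Hypothetical helper: returns the four numbers and the unescaped text,
# or nil when the input does not match.
def parse_rdv(source)
  m = RDV_RE.match(source)
  return nil unless m

  numbers = m.captures.first(4).map(&:to_i)
  text = m[5].gsub("\\'", "'") # turn each backslash-apostrophe back into an apostrophe
  [numbers, text]
end

p parse_rdv("AjouterRDV(1, 11, 30, 90, 'blablabla', '123');")
p parse_rdv("AjouterRDV(1, 11, 30, 90, 'it\\'s failing!', '123');")
```

One limitation worth knowing: the lookbehind only checks the single character before the apostrophe, so a quoted value that ends in an escaped backslash (`\\` right before the closing quote) would be misread as containing an escaped apostrophe.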
Upvotes: 2
Reputation: 16861
Hmmm, someone just posted a comment, but it seems to have been deleted. The proposal was:
AjouterRDV\( (\d+), (\d+), (\d+), (\d+), '((?<=\\)[']|[^'])*', .* \);$
which almost works, except that it doesn't capture the 5th group correctly: the capturing group itself is repeated, so it keeps only the last character it matched. For that you need:
AjouterRDV\( (\d+), (\d+), (\d+), (\d+), '((?:(?<=\\)[']|[^'])*)', .* \);$
which converts the repeated 'outer' group to a non-capturing group and wraps the whole repetition in a new capturing group, so that group 5 holds everything between the single quotes.
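The difference between the two versions is easy to see in isolation. The sketch below keeps only the quoted-string part of each pattern; the constant names are made up for the example.

```ruby
# Capturing group is repeated: after the match it holds only the
# last iteration, i.e. the last character inside the quotes.
REPEATED = /'((?<=\\)[']|[^'])*'/

# Non-capturing group is repeated, and a capturing group wraps the
# whole repetition: it holds everything inside the quotes.
WRAPPED = /'((?:(?<=\\)[']|[^'])*)'/

SAMPLE = "'blablabla'"

p REPEATED.match(SAMPLE)[1]
p WRAPPED.match(SAMPLE)[1]
```

With the sample above, the first pattern yields just the final "a", while the second yields the full "blablabla".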
Upvotes: 1