MrMills
MrMills

Reputation: 161

How do I extract the nth occurence of a regex match from a string?

I have a string in one cell of a Google sheet from which I wish to extract image filenames into separate cells in the sheet to isolate the filenames from the string. For instance ...

enter image description here

So, if the original string was ...

{[file_name_1.png][yes]}{[file_name_2.jpg][no]}{[file_name_3.png][no]}

... then I'd like the three following cells to have file_name_1.png, file_name_2.jpg, file_name_3.png in them.

I've tried the following ...

(.*?[A-Za-z0-9_]+\.(?:(?:png)|(?:jpg))){1} which gives {[file_name_1.png

(.*?[A-Za-z0-9_]+\.(?:(?:png)|(?:jpg))){2} which gives ][yes]}{[file_name_2.jpg

(.*?[A-Za-z0-9_]+\.(?:(?:png)|(?:jpg))){3} which gives ][no]}{[file_name_3.png

... so I feel like I'm nearly there. Could someone please help me understand where i'm going wrong? Thanks

Upvotes: 1

Views: 1865

Answers (1)

Wiktor Stribiżew
Wiktor Stribiżew

Reputation: 626709

You may use three capturing groups separated with .*? pattern that matches any 0 or more chars other than line break chars, as few as possible:

=REGEXEXTRACT(A1, "([A-Za-z0-9_]+\.(?:png|jpg)).*?([A-Za-z0-9_]+\.(?:png|jpg)).*?([A-Za-z0-9_]+\.(?:png|jpg))")

enter image description here

See the regex demo.

Details

  • ([A-Za-z0-9_]+\.(?:png|jpg)) - a capturing group matching 1 or more letters, digits or underscores, ., and then either png or jpg
  • .*? - matches 0 or more chars other than line break chars, as few as possible.

If the second and third file names are optional, wrap the .*? and the file pattern with an optional non-capturing group:

"([A-Za-z0-9_]+\.(?:png|jpg))(?:.*?([A-Za-z0-9_]+\.(?:png|jpg)))?(?:.*?([A-Za-z0-9_]+\.(?:png|jpg)))?"

Recap

To get the first match, just use

=REGEXEXTRACT(A1, "[A-Za-z0-9_]+\.(?:png|jpg)")

To get the second, use

=REGEXEXTRACT(A1, "(?:.*?([A-Za-z0-9_]+\.(?:png|jpg))){2}")

To get the third, use

=REGEXEXTRACT(A1, "(?:.*?([A-Za-z0-9_]+\.(?:png|jpg))){3}")

Upvotes: 1

Related Questions