CodeCamper
CodeCamper

Reputation: 6980

google sheets regextract nth occurence on new line

Example Data (all in one google sheet cell)

Test# 123
Bob# abc
how are you doing
John# test
... # ...

My goal is to return everything after the third # so in this example " test" I have fiddled with a lot of examples online but they seem to be incompatible with the google sheets version of regextract for some reason.

I can use (\n.*){4} to return the fourth line but that is no good because as you can see from the example data I do not know how many lines of data will be between a # and also that is not extracting from the # let alone the third #.

Goal: extract the third # to the end of the line including or excluding the # will do.

Here is an idea I have but surely the format is all butchered ((\n)(?=#.*)){3} I would expect this regex to grab the third # IF it was the beginning of the line but I can't even get that working let alone if it were to occur in the middle of the line was as was my example.

Upvotes: 1

Views: 1118

Answers (2)

player0
player0

Reputation: 1

also works:

=REGEXEXTRACT(INDEX(SPLIT(A1, "#"),,4), " (.+)")

0

Upvotes: 1

Wiktor Stribiżew
Wiktor Stribiżew

Reputation: 626728

You may use

=REGEXEXTRACT(A1, "^(?:[^#]*#){3}(.+)")

To get all text from the 3rd till the 4th # char:

=REGEXEXTRACT(A1, "^(?:[^#]*#){3}([^#]+)")

Details

  • ^ - start of string
  • (?:[^#]*#){3} - three occurrences of 0+ chars other than # and then # (the (?:...) is a non-capturing group, you need it to group a sequence of patterns without the need to return the text this group pattern matches)
  • (.+) - Capturing group 1 (REGEXEXTRACT returns the text captured into the group only if the group is specified): any 1+ chars other than line break chars
  • ([^#]+) - this captures into Group 1 any one or more chars other than #.

enter image description here

Upvotes: 2

Related Questions