Reputation: 59
In Bigquery, I am trying to find a way to extract particular segments of a string based on how many dashes come before it. The number of total dashes in the string will always be the same. For example, I could be looking for the string after the second dash and before the third dash in the following string:
abc-defgh-hij-kl-mnop
Currently, I am using the following regex to extract, which counts the dashes from the back:
([^-]+)(?:-[^-]+){2}$
The problem is that if there is nothing in between the dashes, the regex doesn't work. For example, something like this returns null:
abc-defgh-hij--mnop
Is there a way to use regex to extract a string after a certain number of dashes and cut it off before the subsequent dash?
Thank you!
Upvotes: 1
Views: 1704
Reputation: 173210
Below is for BigQuery Standrd SQL
The simplest way in your case is to use SPLIT and OFFSET as in below example
SELECT SPLIT(str, '-')[OFFSET(3)]
above will return empty string for abc-defgh-hij--mnop
to prevent error in case of calling non-existing element - better to use SAFE_OFFSET
SELECT SPLIT(str, '-')[SAFE_OFFSET(3)]
Upvotes: 3