Reputation: 576
I am trying to extract a item_subtype field from an URL.
This regex works fine in the to get the first item item_type
SELECT REGEXP_EXTRACT('info?item_type=icecream&item_subtype=chocolate/cookies%20cream,vanilla&page=1', r'item_type=(\w+)')
but what is the correct regex to get everything starting from 'chocolate' all the way to before the '&page1'
I have tried this, but can't seem to get it to work to go further
SELECT REGEXP_EXTRACT('info?item_type=icecream&item_subtype=chocolate/cookies%20cream,vanilla&page=1', r'item_subtype=(\w+[^Z])')
basically, I want to extract 'chocolate/cookies%20cream,vanilla'
Upvotes: 1
Views: 47
Reputation: 626728
In your case, \w+
only matches one or more letters, digits or underscores. Your expected values may contain other characters, too.
You may use
SELECT REGEXP_EXTRACT('info?item_type=icecream&item_subtype=chocolate/cookies%20cream,vanilla&page=1', r'item_subtype=([^&]+)')
See the regex demo.
Notes:
item_subtype=
- this string is matched as a literal char sequence([^&]+)
- a Capturing group 1 that matches and captures one or more chars other than &
into a separate memory buffer that is returned by REGEXP_EXTRACT
function.Upvotes: 1