Reputation: 83
May I ask your help in order to build a regular expression to be used on Google Big Query using REGEXP_EXTRACT that will parse the value of an url parameter identified by a specific key?
Let's suppose that the parameter I would like to parse has key value equal to "source". The parsing should:
Here are some example of desired behaviour (they should all provide "google" as output):
Thanks very much for any help!
Upvotes: 4
Views: 14213
Reputation: 11
The value of source
can be extracted as follows:
select regexp_extract("www.google.it?medium=cpc&source=google&keyword=foo&source=bing", "[?&]source=([^&]+)")
Upvotes: 0
Reputation: 7389
If you need to extract all parameters from a URL, you can also use REGEXP_EXTRACT_ALL as follows:
REGEXP_EXTRACT_ALL(query,r'(?:\?|&)((?:[^=]+)=(?:[^&]*))') as params
(Posting here because this question ranks highly on Google for "bigquery parse url query string", but the chosen answer only works for one parameter that is already defined).
This will return the result as an array (see How to extract URL parameters as ARRAY in Google BigQuery):
Upvotes: 6
Reputation: 2152
The first captured group in the match will be the value of the "source" parameter.
[?&]
Either ?
or &
source=
Literal text([^&]+)
A captured group containing 1 or more characters that are not &
Upvotes: 10