Jonk
Jonk

Reputation: 83

Extract parameter from url in Google BigQuery using regex

May I ask your help in order to build a regular expression to be used on Google Big Query using REGEXP_EXTRACT that will parse the value of an url parameter identified by a specific key?

Let's suppose that the parameter I would like to parse has key value equal to "source". The parsing should:

Here are some example of desired behaviour (they should all provide "google" as output):

Thanks very much for any help!

Upvotes: 4

Views: 14213

Answers (3)

Metin
Metin

Reputation: 11

The value of source can be extracted as follows:

select regexp_extract("www.google.it?medium=cpc&source=google&keyword=foo&source=bing", "[?&]source=([^&]+)")

Upvotes: 0

Hirnhamster
Hirnhamster

Reputation: 7389

If you need to extract all parameters from a URL, you can also use REGEXP_EXTRACT_ALL as follows:

REGEXP_EXTRACT_ALL(query,r'(?:\?|&)((?:[^=]+)=(?:[^&]*))') as params

(Posting here because this question ranks highly on Google for "bigquery parse url query string", but the chosen answer only works for one parameter that is already defined).

This will return the result as an array (see How to extract URL parameters as ARRAY in Google BigQuery):

BigQuery example

Upvotes: 6

Whothehellisthat
Whothehellisthat

Reputation: 2152

[?&]source=([^&]+)

The first captured group in the match will be the value of the "source" parameter.

  • [?&] Either ? or &
  • source= Literal text
  • ([^&]+) A captured group containing 1 or more characters that are not &

Upvotes: 10

Related Questions