Reputation: 27
I have a URL:
https://www.example.com/one/two/three/four/five/six/seven/x2-small-2
I'd like to extract the last part of the URL using 3 regex in Google Bigquery to get as return
This regex
REGEXP_EXTRACT(JSON_EXTRACT( A.resource_data,"$.Type" ), '"(.*)"') AS model,
gives
https://www.example.com/one/two/three/four/five/six/seven/x2-small-2
As far I understand, in Bigquery it's REGEX Re2
Thanks for your help ! Sorry, I'm a newbie to regex
This regex
REGEXP_EXTRACT(JSON_EXTRACT( A.resource_data,"$.Type" ), '".*/(.*)"') AS model,
gives
x2-small-2
I'd like at least to extract the last number (2). Any Idea ?
Upvotes: 1
Views: 198
Reputation: 54381
To get the last element, you use this pattern. (Sorry for the leaning toothpick syndrome, I don't know what the correct pattern building syntax in the technology you are using is).
/\/([^\/]+)$/
You will get one capture group with the whole last path part. See regex101.
You can then break this up into smaller pieces. If it's always the same format with the three fields, you could write your pattern for that too.
/\/([^\/]+)-(.+)-(.+)$/
This will give you three capture groups. See regex101.
Upvotes: 0
Reputation: 173210
Below is for BigQuery Standard SQL
#standardSQL
SELECT url,
parts[SAFE_OFFSET(0)] part1,
parts[SAFE_OFFSET(1)] part2,
parts[SAFE_OFFSET(2)] part3
FROM (
SELECT url, SPLIT(ARRAY_REVERSE(SPLIT(url, '/'))[OFFSET(0)], '-') parts
FROM `project.dataset.table`
)
if to apply to sample data from your question as in below example
#standardSQL
WITH `project.dataset.table` AS (
SELECT 'https://www.example.com/one/two/three/four/five/six/seven/x2-small-2' url
)
SELECT url,
parts[SAFE_OFFSET(0)] part1,
parts[SAFE_OFFSET(1)] part2,
parts[SAFE_OFFSET(2)] part3
FROM (
SELECT url, SPLIT(ARRAY_REVERSE(SPLIT(url, '/'))[OFFSET(0)], '-') parts
FROM `project.dataset.table`
)
output is
Row url part1 part2 part3
1 https://www.example.com/one/two/three/four/five/six/seven/x2-small-2 x2 small 2
Upvotes: 2