Reputation: 4881
I'm trying to come up with a regex that selects A|B|C from the first string below. The difficulty it that it's not always 3 values and the only static thing I know is, that I want to select all values that are left from the 7th occurrence of | from the end.
A|B|C|Lion|Zebra|Date|Whatever|Something|122|Scooby
A|Lion|Zebra|Date|Whatever|Something|122|Dooby
A|B|C|D|E|Lion|Zebra|Date|Whatever|Something|122|Doo
So the result should be:
A|B|C
A
A|B|C|D|E
Thanks
Upvotes: 0
Views: 1411
Reputation: 173028
You can use REGEXP_EXTRACT as
SELECT REGEXP_EXTRACT(str, r'^(.*)(?:\|[^|]+){7}$')
Note: this function available in both BigQuery Standard SQL and Legacy SQL, but using Standard SQL is highly recommended
Below example is for BigQuery Standard SQL and using dummy data from your question
#standardSQL
WITH `project.dataset.table` AS (
SELECT 'A|B|C|Lion|Zebra|Date|Whatever|Something|122|Scooby' str UNION ALL
SELECT 'A|Lion|Zebra|Date|Whatever|Something|122|Dooby' UNION ALL
SELECT 'A|B|C|D|E|Lion|Zebra|Date|Whatever|Something|122|Doo'
)
SELECT
REGEXP_EXTRACT(str, r'^(.*)(?:\|[^|]+){7}$') result
FROM `project.dataset.table`
as output is (as expected)
Row result
1 A|B|C
2 A
3 A|B|C|D|E
Upvotes: 2
Reputation: 37367
Try this pattern to delete matched part: ([a-zA-Z0-9]+\|?){7}$
.
Upvotes: 0