jimmy
jimmy

Reputation: 4881

Regex to match the beginning from the end

I'm trying to come up with a regex that selects A|B|C from the first string below. The difficulty it that it's not always 3 values and the only static thing I know is, that I want to select all values that are left from the 7th occurrence of | from the end.

A|B|C|Lion|Zebra|Date|Whatever|Something|122|Scooby
A|Lion|Zebra|Date|Whatever|Something|122|Dooby
A|B|C|D|E|Lion|Zebra|Date|Whatever|Something|122|Doo

So the result should be:

A|B|C
A
A|B|C|D|E

Thanks

Upvotes: 0

Views: 1411

Answers (2)

Mikhail Berlyant
Mikhail Berlyant

Reputation: 173028

You can use REGEXP_EXTRACT as

SELECT REGEXP_EXTRACT(str, r'^(.*)(?:\|[^|]+){7}$')

Note: this function available in both BigQuery Standard SQL and Legacy SQL, but using Standard SQL is highly recommended

Below example is for BigQuery Standard SQL and using dummy data from your question

#standardSQL
WITH `project.dataset.table` AS (
  SELECT 'A|B|C|Lion|Zebra|Date|Whatever|Something|122|Scooby' str UNION ALL
  SELECT 'A|Lion|Zebra|Date|Whatever|Something|122|Dooby' UNION ALL
  SELECT 'A|B|C|D|E|Lion|Zebra|Date|Whatever|Something|122|Doo'
)
SELECT 
  REGEXP_EXTRACT(str, r'^(.*)(?:\|[^|]+){7}$') result
FROM `project.dataset.table`   

as output is (as expected)

Row result   
1   A|B|C    
2   A    
3   A|B|C|D|E    

Upvotes: 2

Michał Turczyn
Michał Turczyn

Reputation: 37367

Try this pattern to delete matched part: ([a-zA-Z0-9]+\|?){7}$.

Demo

Upvotes: 0

Related Questions