gigaman
gigaman

Reputation: 27

Bigquery query to get all occurences that match a certain pattern after slash

I have a table in bigquery that has a column with many rows that contain strings that can be something like this

row 1 mmmmm hhhhh ccccc tttt /tst /kl:2 /aaaa  nnnn
row 2 ddd bb /lamp /mode:2 /nana
row 3 /dada

I need to catch all of the: tst, kl and aaaa, lamp, mode, nana, dada (meaning all of the words after slash)

How do i do that?

Tried something like this but it did not find

SELECT column1,
  SPLIT(REGEXP_REPLACE(column1,r'(\/.*?(\s|$))', ','))  AS regex_found  
  FROM table

Upvotes: 0

Views: 159

Answers (1)

Marcin Pietraszek
Marcin Pietraszek

Reputation: 3214

You could use query similar to this one. It returns single repeated columns tokens containing words that you're interested in:

SELECT
  REGEXP_EXTRACT_ALL(column, r'\/([^ :]+)') AS tokens
FROM
  UNNEST(['mmmmm hhhhh ccccc tttt /tst /kl:2 /aaaa  nnnn', 'ddd bb /lamp /mode:2 /nana', '/dada']) AS column

Upvotes: 2

Related Questions