user2227402
user2227402

Reputation: 207

Finding the location of one string within another string in Bigquery

I couldn't find a function in BigQuery query reference which looks for one string within a second one and returns the index of the location. Something like instr() in other SQL dialects. Is there any substitute or any technique to achieve this?

For example: Looking into "de" in "abcdef" will return 4.

Upvotes: 5

Views: 28377

Answers (4)

binarytrails
binarytrails

Reputation: 645

I'm late to the party but the BigQuery API changed, now the Regex syntax is as follow:

SELECT mydomains FROM `myproject.mydataset.mytable`
       where regexp_contains(mydomains, r'^(.*)example.*');

To answer the question with For example: Looking into "de" in "abcdef" will return 4., it would look like:

SELECT de FROM `myproject.mydataset.mytable`
       where regexp_contains(de, r'^(.*)abcdef.*');

REGEXP_MATCH is now part of Legacy SQL Functions and Operators as per the reference link.

Hope it helps the one! :)

Upvotes: 0

s2t2
s2t2

Reputation: 2686

The old answer is now deprecated and @carlos answer works:

STRPOS(string, substring) 

Upvotes: 10

Carlos Mendoza
Carlos Mendoza

Reputation: 29

The legacy SQL INSTR(str1,str2) function "Returns the one-based index of the first occurrence of a string." So that should work for you.

https://cloud.google.com/bigquery/docs/reference/legacy-sql

Upvotes: 2

Michael Manoochehri
Michael Manoochehri

Reputation: 7877

One way you can do this is with a Regular Expression extract (see reference here):

SELECT
  title, LENGTH(REGEXP_EXTRACT(title, r'^(.*)def.*')) + 1 AS location_of_fragment
FROM
  [publicdata:samples.wikipedia]
WHERE
  REGEXP_MATCH(title, r'^(.*)def.*')
LIMIT 10;

Returns:

Row title   location_of_fragment     
1   Austrian air defense    14   
2   Talk:Interface defeat   16   
3   High-definition television  6    
4   Talk:IAU definition of planet   10   
5   Wikipedia:Articles for deletion/Culture defines politics    41   
6   Wikipedia:WikiProject Spam/LinkReports/defenders.org    40   
7   Adenine phosphoribosyltransferase deficiency    35   
8   Stay-at-home defenceman 14   
9   Manganese deficiency (plant)    11   
10  High-definition television  6   

Upvotes: 8

Related Questions