Reputation: 207
I couldn't find a function in BigQuery query reference which looks for one string within a second one and returns the index of the location. Something like instr() in other SQL dialects. Is there any substitute or any technique to achieve this?
For example: Looking into "de" in "abcdef" will return 4.
Upvotes: 5
Views: 28377
Reputation: 645
I'm late to the party but the BigQuery API changed, now the Regex syntax is as follow:
SELECT mydomains FROM `myproject.mydataset.mytable`
where regexp_contains(mydomains, r'^(.*)example.*');
To answer the question with For example: Looking into "de" in "abcdef" will return 4., it would look like:
SELECT de FROM `myproject.mydataset.mytable`
where regexp_contains(de, r'^(.*)abcdef.*');
REGEXP_MATCH is now part of Legacy SQL Functions and Operators as per the reference link.
Hope it helps the one! :)
Upvotes: 0
Reputation: 2686
The old answer is now deprecated and @carlos answer works:
STRPOS(string, substring)
Upvotes: 10
Reputation: 29
The legacy SQL INSTR(str1,str2) function "Returns the one-based index of the first occurrence of a string." So that should work for you.
https://cloud.google.com/bigquery/docs/reference/legacy-sql
Upvotes: 2
Reputation: 7877
One way you can do this is with a Regular Expression extract (see reference here):
SELECT
title, LENGTH(REGEXP_EXTRACT(title, r'^(.*)def.*')) + 1 AS location_of_fragment
FROM
[publicdata:samples.wikipedia]
WHERE
REGEXP_MATCH(title, r'^(.*)def.*')
LIMIT 10;
Returns:
Row title location_of_fragment
1 Austrian air defense 14
2 Talk:Interface defeat 16
3 High-definition television 6
4 Talk:IAU definition of planet 10
5 Wikipedia:Articles for deletion/Culture defines politics 41
6 Wikipedia:WikiProject Spam/LinkReports/defenders.org 40
7 Adenine phosphoribosyltransferase deficiency 35
8 Stay-at-home defenceman 14
9 Manganese deficiency (plant) 11
10 High-definition television 6
Upvotes: 8