user22972173
user22972173

Reputation: 1

BigQuery optimalization of the query CASE when

I am trying to optimalize my query.

I will share it based on "books" example : My case is :

There are the same categories for books and library with paricular CODES

Category Code category
fantasy fan
fiction fic
biography bio

What i need to do :

  1. categorize Books to particular codes based on their names

  2. categorize data for "library codes" based on their names

  3. Check if books from particular category fall into same category as library category

I have data like this

Book names library
A Game of Thrones - fantasy //AA/Project/Biography/shelf/blablabio
A Game of Thrones - fantasy //AA/Project/Fantasy/hhhh/fan
Fan.The Lord of rings //AA/Project/biography-123/123
Fan.The Lord of rings //AA/Project//Fantasy/bio
Freddy Mercury.biography //AA/Project/biography/123fin
Steve Jobs.bio //AA/Project/fantasy555/567
The Handmaid's Tale.fic //AA/Project/fiction/890bio
Robinson Crusoe.novel //AA/Project/fin/555fin

So at the end i need to have logic IF Book Names contains same code as library it is fine and just list down Book names that have mismatches with library so here is my proposal of the query



With book_part as (
SELECT distinct
Book_names, 
library 
(CASE
when lower(Book_names) like '%fan%' then 'fan'
when Book_names like '%fic%'then 'fic'
when Book_names like '%bio%' then 'bio'
end ) as BOOK_code,
SPLIT(REPLACE(library, '/', ''),'/')[SAFE_OFFSET(2)] as Shelf,
FROM `table_name`),

with library_part (
Book_names, 
BOOK_code,
(CASE
when library like '%fan%' then 'fan'
when library like '%fic%'then 'fic'
when library like '%bio%' then 'bio'
end) as LIBRARY_CODE
from book_part)

Select *
from 
where BOOK_code = LIBRARY_CODE

and based on this I should get results of all mismatches of the codes :

Book names BOOK_CODE library Shelf LIBRARY_CODE
A Game of Thrones - fantasy fan //AA/Project/Biography/shelf/blablabio Biography bio
Fan.The Lord of rings fan //AA/Project/biography-123/123 biography-biography-123 bio
Steve Jobs.bio bio //AA/Project/fantasy555/567 fantasy555 fan
Robinson Crusoe.novel null //AA/Project/fin/555fin fin fin

Described above with example

Upvotes: 0

Views: 45

Answers (1)

Samuel
Samuel

Reputation: 3538

Please tell us why you need to optimize your query. For readability, a sub Select with an unnest looks better. You can also may create an UDF with this code.

WITH
 tbl AS (
 SELECT *
 FROM UNNEST(["Book fan fun"," Book bio fun", "BIOlogy Book"]) a)

SELECT
 *,
 (SELECT REPLACE(ANY_VALUE( IF (lower(a) LIKE test,test,NULL)),"%","") FROM UNNEST(["%fan%","%bio%","%fic%"]) AS test ) as tag,
 (SELECT ANY_VALUE( IF (regexp_contains(lower(a), '('||test||')'),test,NULL)) FROM UNNEST(["fan","bio","fic"]) AS test ) as tag2,
 (SELECT ANY_VALUE( IF (regexp_contains(lower(a), '('||test.tag||')'),test.descr,NULL)) 
  FROM UNNEST([struct("fan" as tag,"fantasy" as descr),struct("bio","biography"),struct("fic","fiction")]) AS test ) as tag3,
FROM
 tbl

However, your bio tag will trigger on words in the title such as BIOlogy.

Upvotes: 0

Related Questions