Mask
Mask

Reputation: 573

How to count the frequency of words from table column GoogleSQL Bigquery

I have a column from a Table which has certain words, for example.

MyTable

 Name    device_Used
-----  ----------------  
 a       iphone, Ipad, laptop
 b       mobile, laptop, android, desktop
 c       android, desktop, washing machine

Now I want to search a set of words for example [Ipad,android,laptop] from that table and count its frequency.

I tried doing this by reading some online solution and creating one for my own but got got errors. This is what I have tried.

WITH words AS (
  SELECT word
  FROM UNNEST([Ipad,android,laptop,desktop]) word
),
listcolumn AS (
  SELECT device
  FROM UNNEST(MyTable.device_Used) device
)
SELECT device, COUNT(word) AS cnt
FROM listcolumn
JOIN words
ON (device <> "")
GROUP BY device
ORDER BY cnt DESC;

I want to create a new table result like this :-

   device_Used      count
---------------   -----------
Ipad                1            
android             2
laptop              2
desktop             2

As I am new to this, You can Suggest a new code as the one I am using is manipulated by reading online suggestions.

Upvotes: 0

Views: 1149

Answers (1)

Mikhail Berlyant
Mikhail Berlyant

Reputation: 172994

Below is for BigQuery Standard SQL

#standardSQL
SELECT TRIM(LOWER(device_Used)) AS device_Used, COUNT(1) `count`
FROM `project.dataset.table`,
UNNEST(SPLIT(device_Used)) device_Used
GROUP BY device_Used
HAVING LOWER(device_Used) IN ('ipad','android','laptop')

If to apply to sample data in your question as in below example

#standardSQL
WITH `project.dataset.table` AS (
  SELECT 'a' Name, 'iphone, Ipad, laptop' device_Used UNION ALL
  SELECT 'b', 'mobile, laptop, android, desktop' UNION ALL
  SELECT 'c', 'android, desktop, washing machine' 
)
SELECT TRIM(LOWER(device_Used)) AS device_Used, COUNT(1) `count`
FROM `project.dataset.table`,
UNNEST(SPLIT(device_Used)) device_Used
GROUP BY device_Used
HAVING LOWER(device_Used) IN ('ipad','android','laptop')

result is

Row device_Used count    
1   ipad        1    
2   laptop      2    
3   android     2     

if you can provide comment what line does what it will be of great help

SELECT TRIM(LOWER(device_Used)) AS device_Used, COUNT(1) `count` -- (4) 
FROM `project.dataset.table`, -- (1)
UNNEST(SPLIT(device_Used)) device_Used -- (2)  
GROUP BY device_Used -- (3) 
HAVING LOWER(device_Used) IN ('ipad','android','laptop') -- (5)    
  1. select from table
  2. convert comma separated list into array and unnest it
  3. group by elements of unnested array
  4. remove trailing spaces and lower case to address case sensitivity
  5. just leaves in output those devices in the predefined list

Upvotes: 1

Related Questions