Reputation: 2825
I am trying to find occurrences of certain phrases in reddit comment bigquery 2015. The phrases were also obtained through running a query. I saved the table results so that the phrases could be accessed (stored in column):
SELECT c
FROM (
SELECT title
FROM [lucid-bond-156818:bquijob_658cefdf_15a212b8aaf] #the phrases I want to search within reddit comment bigquery
SELECT COUNT(title) as c in [fh-bigquery:reddit_comments.all_2015]
)
I'm a beginner in SQL so I'm having trouble fixing it, in the last line I basically want to tell the query to count the titles in the reddit comments 2015 data.
Upvotes: 1
Views: 5769
Reputation: 172944
To start with - try below
This query will give you list of titles along with count of comments this title appeared in. (i think this is what you asked for - if i understood you correctly)
#legacySQL
SELECT title, COUNT(1) AS occurrences
FROM [fh-bigquery:reddit_comments.2005] AS c
CROSS JOIN YourTableWithTitles
WHERE body CONTAINS title
GROUP BY title
ORDER BY occurrences DESC
Above assumes that you have all your titles stored in YourTableWithTitles
table in title
column
As you can see here, I am using [fh-bigquery:reddit_comments.2005]
table instead of [fh-bigquery:reddit_comments.all_2015]
view. This is to make query cheap while testing (2005 table has just 1075 rows whereas all_2015 view span over 668 million rows). When you make final version of it - you can switch back to that view
Also, as this view is written in Legacy SQL - the query that uses this view must also be in Legacy SQL, so I stick with it too
As you are beginner - it is not much a difference - but ideally when you get a little bit into it - you should check BigQuery Standard SQL
Upvotes: 2