Reputation: 25
I'm using Google BigQuery and trying to select all sample GitHub repositories that have a pom.xml file whose content includes the artifact ID ex-ex, e.g.
<artifactId>ex-ex</artifactId>
For this I have broken it down into 2 steps:
1) Find all pom.xml files
SELECT sample_repo_name FROM `bigquery-public-data.github_repos.sample_contents` WHERE sample_path LIKE 'pom.xml'
2) Select the repositories which contain ex-ex artifact (in the content table)
AND content LIKE '%ex-ex'
The 2nd part of the query does not work (no results found) and is likely due to some syntax error somewhere. Full query below:
SELECT sample_repo_name FROM `bigquery-public-data.github_repos.sample_contents` WHERE sample_path LIKE 'pom.xml' AND content LIKE '%ex-ex' LIMIT 1000
Would really appreciate help with this, thanks!
Upvotes: 0
Views: 143
Reputation: 3616
Have you tried '%ex-ex%'? Without the second %, you are only searching for records whose last 5 characters are 'ex-ex'. Adding content to the select in your first query and spot checking a few results, the content field appears to be XML (pom.xml, duh) and seems to end with </project>, and thus will probably never match '%ex-ex'.
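For reference, a sketch of what the full query might look like with the trailing wildcard added. A few things here are my own assumptions rather than something from the question: matching on the whole <artifactId> element (to avoid hits on longer names such as my-ex-ex-plugin), the extra '%/pom.xml' condition (to pick up pom.xml files in subdirectories), and DISTINCT (so a repository with several matching files is listed once). Drop any of them if they don't fit what you need.

```sql
-- Sketch only: column names are taken from the question's query.
SELECT DISTINCT sample_repo_name
FROM `bigquery-public-data.github_repos.sample_contents`
WHERE
  -- Assumption: accept a root-level pom.xml or one nested in a subdirectory.
  (sample_path = 'pom.xml' OR sample_path LIKE '%/pom.xml')
  -- A wildcard on BOTH sides, so the text can appear anywhere in the file.
  -- Assumption: match the full element instead of the bare string 'ex-ex'.
  AND content LIKE '%<artifactId>ex-ex</artifactId>%'
LIMIT 1000
```

If you only want the minimal change, replacing '%ex-ex' with '%ex-ex%' in your original query is enough to turn the "ends with" match into a "contains" match.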
Upvotes: 1