denica

Reputation: 321

Advanced Search on Github?

I would like to make an advanced search for repos under github for commits which match the conditions below:

I know that GitHub uses Lucene to perform its searches, but I can't find any documentation on the query syntax, and when I follow the Apache Lucene documentation I often end up with an "Invalid query syntax" message.

For my own query I already have the language, size and forks filters working with no problem, but I still can't find a working syntax for queries based on dates.

Do I have to include the timestamp in date queries?
Can I use date arithmetic such as NOW - 3MONTHS?
For example, how could I search for repos that were created between 4 months ago and now?

EDIT:

I talked to GitHub support and they told me that they use the Solr query syntax, which allows date range queries using calculations such as NOW - 4MONTHS. For some reason it doesn't work properly for me, or I just don't understand how these filters (created and pushed) operate.

Just to test it, I searched for any repos with JavaScript as the main language (both options selected from the combo boxes), then added the created filter, and got some strange results.

For the first search I tried to find any JavaScript repo created between 12 months ago and today.

created:[NOW-12MONTHS/DAY TO NOW/DAY]

That gives me a total of 233500 repos, with the "twitter/bootstrap" repo listed at the top.

For the second search I tried to find any JavaScript repo created between 24 months ago and today.

created:[NOW-24MONTHS/DAY TO NOW/DAY]

Not only does it give me fewer repos than before (11867 in total), but the "twitter/bootstrap" repo is no longer listed on the results page, which I think is wrong because my second search "contains" the first one. The first result has fewer watchers than "twitter/bootstrap", so if the results are ordered by watcher count, it is wrong not to have it at the top!

I'm not saying that there is a bug on the site; I just don't understand how calculations with date ranges work. I hope someone can help me clarify these issues.

Upvotes: 4

Views: 5549

Answers (3)

VonC

Reputation: 1329582

Note that since November 26th, 2012 ("Search Syntax Improvements", by Tim Pease), the Solr-style syntax for comparison and range criteria is no longer the only option.

Previously, searching for items with more than 10 stars looked like:

stars:[10 TO *]

Now it is:

stars:>10

However, the new range syntax does not support Solr-style date math such as NOW; you need to specify explicit dates, but without timestamps.

cats pushed:2012-04-30..2012-07-04
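Since the new syntax needs explicit dates, the "4 months ago to now" range from the question has to be computed before building the query. A minimal Python sketch of that computation follows; the helper names `months_ago` and `created_range` are hypothetical, and it assumes the `created` qualifier accepts the same `start..end` form shown for `pushed` above.

```python
import calendar
from datetime import date


def months_ago(today: date, months: int) -> date:
    """Return the date `months` calendar months before `today`.

    The day is clamped to the last day of the target month
    (e.g. one month before March 31 becomes the last day of February).
    """
    # Count months from year 0 so that divmod handles the year rollover.
    index = today.year * 12 + (today.month - 1) - months
    year, month_zero_based = divmod(index, 12)
    month = month_zero_based + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(today.day, last_day))


def created_range(today: date, months: int) -> str:
    """Build a `created:START..END` qualifier (assumed to mirror `pushed:`)."""
    start = months_ago(today, months)
    return f"created:{start.isoformat()}..{today.isoformat()}"


if __name__ == "__main__":
    # Prints a qualifier covering the last four months up to today.
    print(created_range(date.today(), 4))
```

The result can be appended to the rest of the search terms in the same way as the `pushed` range in the example.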


Update August 2013: there are now even more Search API examples:

 curl -ni "https://api.github.com/search/repositories?q=more+useful+keyboard" -H 'Accept: application/vnd.github.preview'

Stars and watchers are in a transition period. Until that transition is complete, you get the number of stars using the old terminology (i.e., "watchers_count").
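For reference, the same request as the curl command can be prepared from Python with the standard library alone. This is only a sketch: the URL and the `Accept` media type are copied from the curl example, the function name `build_search_request` is hypothetical, and the request is built but not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Endpoint and media type are taken from the curl example above.
SEARCH_URL = "https://api.github.com/search/repositories"
PREVIEW_MEDIA_TYPE = "application/vnd.github.preview"


def build_search_request(query: str) -> Request:
    """Prepare (but do not send) a repository search request."""
    # urlencode escapes spaces as '+', matching the curl URL.
    url = f"{SEARCH_URL}?{urlencode({'q': query})}"
    return Request(url, headers={"Accept": PREVIEW_MEDIA_TYPE})


if __name__ == "__main__":
    request = build_search_request("more useful keyboard")
    print(request.full_url)
    # To actually run the search, pass `request` to urllib.request.urlopen().
```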

Upvotes: 2

hoju

Reputation: 29482

Check the Solr documentation page for exact syntax: http://wiki.apache.org/solr/SolrQuerySyntax

For the date searches the syntax is like this:

created:[2008-01-01T00:00:00Z TO NOW]

Upvotes: 1

Mark Leighton Fisher

Reputation: 5703

It's ugly, but you could wrap a layer around the search that interprets these date queries specially. For example, rewriting "Created:[NOW-4MONTHS to NOW]" to "Created:[2012-01-21 TO 2012-05-20]" before passing the query to Lucene.

Among the problems you'll have with this approach:

  • You need to come up with the wrapper query syntax.
  • You need to parse the wrapper query syntax correctly.
  • You need to rewrite your wrapper query syntax correctly into Lucene's syntax.

As far as I know, a range query cannot have a subquery inside of it, so you might be able to just use regular expressions to detect your date range queries, especially if you can count on specific field name(s) for the date/time queries.
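A rough Python sketch of such a regex-based wrapper is shown below. Everything in it is an assumption made for illustration: the function name `rewrite_date_ranges`, the accepted wrapper syntax (only `NOW-<n>MONTHS TO NOW`), and the approximation of one month as 30 days, which happens to reproduce the dates in the example above.

```python
import re
from datetime import date, timedelta

# Matches e.g. "Created:[NOW-4MONTHS to NOW]" for any field name.
# Only this one wrapper form is recognised (an assumption of this sketch).
RANGE_RE = re.compile(
    r"(?P<field>\w+):\[NOW-(?P<count>\d+)MONTHS?\s+TO\s+NOW\]",
    re.IGNORECASE,
)

DAYS_PER_MONTH = 30  # assumption: a month is approximated as 30 days


def rewrite_date_ranges(query: str, today: date) -> str:
    """Replace NOW-relative ranges with explicit dates; leave the rest as-is."""

    def expand(match):
        days = DAYS_PER_MONTH * int(match.group("count"))
        start = today - timedelta(days=days)
        field = match.group("field")
        return f"{field}:[{start.isoformat()} TO {today.isoformat()}]"

    return RANGE_RE.sub(expand, query)


if __name__ == "__main__":
    print(rewrite_date_ranges("Created:[NOW-4MONTHS to NOW]", date(2012, 5, 20)))
```

Anything the pattern does not match passes through unchanged, so the wrapper can sit in front of the existing query path without affecting other filters.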

Upvotes: 2
