ErikR

Reputation: 52029

GitHub repo counts by language (and historical data)?

I'm interested in getting a count of GitHub repos for a certain set of languages (with historical data, if possible).

Here are things I've tried to start collecting the stats myself:

  1. Screen scraping a page like:

https://github.com/search?q=language%3Aperl&type=&ref=simplesearch

  2. Using the GitHub API:

https://api.github.com/legacy/repos/search/KEYWORD?language=perl

Unfortunately, the API call seems to require a KEYWORD to return any results. Also, I only need a count, not the metadata for each repo.

I'm also interested in historical data, and it seems that those stats might already be available somewhere.

Any ideas on better ways to get repo counts by language and/or historical data?

Upvotes: 5

Views: 3253

Answers (1)

bray

Reputation: 601

You can try this: https://api.github.com/search/repositories?q=language:Python
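As a rough sketch of how that URL could be used to get only a count: the search response is JSON with a total_count field, so the per-repo metadata can be ignored. The helper names below (build_search_url, extract_total_count, repo_count) are made up for illustration, and the per_page=1 parameter and User-Agent value are my own assumptions to keep the response small. Unauthenticated requests are rate-limited, so treat this as a starting point rather than a finished collector.

```python
import json
import urllib.parse
import urllib.request

SEARCH_URL = "https://api.github.com/search/repositories"


def build_search_url(language):
    """Build a search URL that filters by language only (no keyword)."""
    # per_page=1 is an assumption: we only want the count, not the items.
    query = urllib.parse.urlencode({"q": f"language:{language}", "per_page": 1})
    return f"{SEARCH_URL}?{query}"


def extract_total_count(body):
    """Pull the total_count field out of a search response body (JSON text)."""
    return json.loads(body)["total_count"]


def repo_count(language):
    """Fetch the number of repositories GitHub reports for a language."""
    request = urllib.request.Request(
        build_search_url(language),
        # Hypothetical User-Agent string; GitHub expects some value here.
        headers={"User-Agent": "repo-count-sketch"},
    )
    with urllib.request.urlopen(request) as response:
        return extract_total_count(response.read().decode("utf-8"))
```

Calling repo_count("perl") once per language on a schedule (say, daily) and storing the results would let you build up your own history going forward.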

You can also query the GitHub Archive dataset. With the BigQuery bq command-line tool, the query would be:

bq query 'SELECT repository_language, count(repository_language) as pushes
FROM [githubarchive:github.timeline]
WHERE type="CreateEvent" and repository_fork == "false"
GROUP BY repository_language
ORDER BY pushes DESC'

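This query counts CreateEvent records for non-fork repositories, grouped by language, which gives an approximate number of repos per language (reported in the pushes column). Note that CreateEvent is also emitted when a branch or tag is created, so the counts may be inflated.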

Upvotes: 6
