Reputation: 33
I want to sample random public GitHub repositories and download them one by one for statistical purposes. I tried the following PowerShell code:
[Net.ServicePointManager]::SecurityProtocol = [Net.SecurityProtocolType]::Tls12
$id = Get-Random -Minimum 0 -Maximum 500
$jsonContent = Invoke-WebRequest "http://api.github.com/repositories?since=$id" | ConvertFrom-Json
I do get a list of public repositories, but I can't limit how many are returned. I tried ?page=1&per_page=1, but it didn't work. I just want to parse the clone_url and pass it to git clone.
Any ideas? Other solutions for downloading random repos from GitHub are also welcome.
Upvotes: 2
Views: 302
Reputation: 3046
You were close. This should work:
[Net.ServicePointManager]::SecurityProtocol = [Net.SecurityProtocolType]::Tls12
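# Fetch one page of public repositories (up to 100 entries)
$jsonContent = Invoke-WebRequest "http://api.github.com/repositories?since=1" | ConvertFrom-Json
# Pick a random index into that list; -Maximum is exclusive, so this covers 0..Count-1
$id = Get-Random -Minimum 0 -Maximum $jsonContent.Count
# Request the chosen repository's own API URL and read its git_url
$gitUrl = Invoke-WebRequest -Uri ($jsonContent[$id].url) | ConvertFrom-Json | Select-Object -ExpandProperty git_url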
git clone $gitUrl
If you want to clone via SSH, be sure to change git_url to ssh_url.
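For a fixed x, the call to http://api.github.com/repositories?since=x always returns the same page of up to 100 repositories (those with IDs greater than x), so there is no need to randomize the number there; the random pick is made from the returned list instead.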
The list from the first call gives you the API URL of each of those (up to 100) repositories. Pick one of them at random, then make a second web request to that repository's API URL and read the URL to clone from out of the response.
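The same two-step idea can be wrapped in a pair of small functions. This is only a rough sketch: Select-RandomRepo and Get-RandomRepoCloneUrl are made-up helper names, not part of any module, and it assumes the per-repository response has a clone_url field (the one the question asks about) and that you stay within the API rate limit for unauthenticated requests. The -Since parameter is optional; pass a different repository ID if you want the pick to come from a different page of the list.

```powershell
# Hypothetical helper: return one random element from an already-parsed list.
function Select-RandomRepo {
    param(
        [Parameter(Mandatory)]
        [object[]] $Repos
    )
    # Get-Random -InputObject returns a single random element by default
    return Get-Random -InputObject $Repos
}

# Hypothetical helper: list one page of public repos, pick one, look up its clone URL.
function Get-RandomRepoCloneUrl {
    param(
        # Repository ID to start after; 1 mirrors the snippet above (assumption)
        [int] $Since = 1
    )
    [Net.ServicePointManager]::SecurityProtocol = [Net.SecurityProtocolType]::Tls12
    # Invoke-RestMethod parses the JSON response into objects directly
    $repos = Invoke-RestMethod -Uri "https://api.github.com/repositories?since=$Since"
    $repo = Select-RandomRepo -Repos $repos
    # Second request: the repository's own API URL returns its full record
    $details = Invoke-RestMethod -Uri $repo.url
    return $details.clone_url
}
```

With those defined, something like git clone (Get-RandomRepoCloneUrl) should clone one random repository from the page. Keeping the random pick in its own function means it can be tried against a hand-made list without touching the network.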
Upvotes: 4