Reputation: 1327
I need to retrieve the names of all the organizations that the contributors of a GitHub repository belong to. I know that the following API request gives the list of the first 100 contributors of a repository, ordered by their number of contributions:
https://api.github.com/repos/{owner}/{repo}/contributors?per_page=100&anon=true
However, this API call only shows the metadata from the first 100 contributors of a repo, and there is no information about which organizations they belong to.
What I need is to iterate through the list of all contributors and retrieve all the organizations each of them belongs to. Is there any way to get this information?
Upvotes: 0
Views: 433
Reputation: 1327
A GitHub user may belong to several organizations; an organization's administrator can accept or reject the user's membership. Separately, GitHub users can declare that they work for a particular company, and this information is displayed on their GitHub profile.
This information can be obtained via the GitHub API using bash commands (notably curl, jq, and grep). Here are working script examples of how to get this data for the GitHub repository ConsenSys/teku.
#!/bin/bash
# Get the organizations API URL of each user that contributed to the ConsenSys/teku project.
# Store the list of URLs in a txt file for post-processing.
# Anonymous contributors have no organizations_url, so grep keeps only real URLs.
curl --fail --silent --show-error 'https://api.github.com/repos/ConsenSys/teku/contributors?per_page=100&page=1&anon=true' | jq -r '.[].organizations_url' | grep 'https' > teku_contributors_organizations_url.txt
# Start from an empty data file so that reruns do not duplicate earlier results.
: > teku_contributors_organizations_url_data.txt
# Iterate over the list of URLs and append each curl result to the data file.
while read -r url; do
    content="$(curl -s "$url")"
    echo "$content" >> teku_contributors_organizations_url_data.txt
done < teku_contributors_organizations_url.txt
# Save the contributors' organizations to a txt file, ordered by number of occurrences.
grep '"login":' teku_contributors_organizations_url_data.txt | sort | uniq -c | sort -nr > teku_contributors_organizations.txt
OUTPUT
4 "login": "ConsenSys",
3 "login": "hyperledger",
3 "login": "arithm3tica",
3 "login": "PegaSysEng",
3 "login": "EntEthAlliance",
2 "login": "apache",
1 "login": "tmio",
1 "login": "splunkdlt",
1 "login": "solsuite",
1 "login": "sigp",
1 "login": "puniverse",
1 "login": "prrkl",
1 "login": "openethereum",
1 "login": "mana-ethereum",
1 "login": "jbosgi",
1 "login": "goerli",
1 "login": "exthereum",
1 "login": "ethsearch",
1 "login": "ethjs",
1 "login": "ethereum",
1 "login": "eth-clients",
1 "login": "eclipse",
1 "login": "deltap2p",
1 "login": "dappnode",
1 "login": "byz-f",
1 "login": "arquillian",
1 "login": "argentlabs",
1 "login": "Thera169",
1 "login": "InternetOfPeers",
1 "login": "Department-of-Decentralization",
1 "login": "ChainSafe",
1 "login": "Centareum"
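The script above only requests page=1, so it covers at most 100 contributors. A minimal sketch of how the remaining pages could be walked is shown below. The function names contributors_page_url and list_all_contributor_fields are hypothetical helpers, not part of the GitHub API, and the loop assumes that a page past the end comes back as an empty JSON array.

```bash
#!/bin/bash
# Sketch only: contributors_page_url and list_all_contributor_fields are
# hypothetical helper names introduced for illustration.

# Build the URL for one page of contributors (100 per page).
contributors_page_url() {
    local repo="$1" page="$2"
    printf 'https://api.github.com/repos/%s/contributors?per_page=100&page=%s&anon=true\n' "$repo" "$page"
}

# Print one field (for example organizations_url or url) for every contributor,
# requesting page after page until an empty page is returned.
list_all_contributor_fields() {
    local repo="$1" field="$2" page=1 raw count
    while :; do
        raw="$(curl --fail --silent --show-error "$(contributors_page_url "$repo" "$page")")" || return 1
        # Assumption: an empty array (or an empty body) means there are no more pages.
        count="$(printf '%s' "$raw" | jq 'length')"
        [ "${count:-0}" -eq 0 ] && break
        # Anonymous contributors lack the field, so null values are skipped.
        printf '%s' "$raw" | jq -r --arg f "$field" '.[] | .[$f] // empty'
        page=$((page + 1))
    done
}
```

For example, `list_all_contributor_fields ConsenSys/teku organizations_url > teku_contributors_organizations_url.txt` would replace the single curl call at the top of the script. Each extra page also means more per-user requests later on, so the rate limit is reached sooner.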
#!/bin/bash
# Get the GitHub API URL of each user that contributed to the ConsenSys/teku project.
# Store the list of URLs in a txt file for post-processing.
curl --fail --silent --show-error 'https://api.github.com/repos/ConsenSys/teku/contributors?per_page=100&page=1&anon=true' | jq -r '.[].url' | grep 'https' > teku_contributors_urls.txt
# Start from an empty data file so that reruns do not duplicate earlier results.
: > teku_contributors_data.txt
# Iterate over the list of URLs and append each curl result to the data file.
while read -r url; do
    content="$(curl -s "$url")"
    echo "$content" >> teku_contributors_data.txt
done < teku_contributors_urls.txt
# Save the list of companies declared by the contributors to a txt file, ordered by number of occurrences.
grep '"company":' teku_contributors_data.txt | sort | uniq -c | sort -nr > teku_contributors_companies.txt
OUTPUT
5 "company": "Consensys",
3 "company": "ConsenSys",
2 "company": "AlmavivA",
2 "company": "@tmio ",
2 "company": "@sushiswap ",
2 "company": "@eucrypt",
2 "company": "@element-fi",
2 "company": "@derivadex",
2 "company": "@blk-io",
2 "company": "@PegaSysEng | @ConsenSys",
2 "company": "@Consensys",
2 "company": "@ConsenSys @hyperledger ",
2 "company": "@ConsenSys ",
2 "company": "@ChainSafe ",
2 "company": "@ArgentLabs",
1 "company": "RedHat Inc.",
1 "company": "Hedera @hashgraph",
1 "company": "Ethereum",
1 "company": "Contract Worker for the Ethereum Foundation",
1 "company": "Base2 Cloud",
1 "company": "Baisysoft",
1 "company": "@coinbase",
1 "company": "@KiFoundation "
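In the output above the same company appears under several spellings ("Consensys", "ConsenSys", "@ConsenSys "), because the company field is typed in by each user. A rough sketch of one way to merge such variants before counting is shown below. The function name normalize_companies is a hypothetical helper, and the rules (lowercase, drop "@", trim spaces) are an assumption about what counts as the same company.

```bash
#!/bin/bash
# Sketch only: normalize_companies is a hypothetical helper name.
# Reads one company string per line on stdin and prints a normalized form:
# lowercased, with "@" removed and surrounding whitespace trimmed.
normalize_companies() {
    tr '[:upper:]' '[:lower:]' \
        | sed -e 's/@//g' -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' \
        | grep -v '^$'
}
```

It could be used as `jq -r '.company // empty' teku_contributors_data.txt | normalize_companies | sort | uniq -c | sort -nr`. Entries that name several companies, such as "@PegaSysEng | @ConsenSys", are still counted as a single string and would need to be split separately.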
NOTE: Each script sends one request per contributor, and unauthenticated API requests are limited to 60 per hour, so you may get the message “API rate limit exceeded.” This can be solved by authenticating with your GitHub account: add the flag -u <user>:<token> to curl.
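As an alternative to typing the -u flag on every call, the token can be passed in an Authorization header through a small wrapper. This is only a sketch: gh_auth_header and gh_curl are hypothetical helper names, and it assumes a personal access token has been exported in an environment variable called GITHUB_TOKEN.

```bash
#!/bin/bash
# Sketch only: gh_auth_header and gh_curl are hypothetical helper names.
# Assumption: a personal access token is stored in the GITHUB_TOKEN variable.

# Print the Authorization header if a token is set; fail otherwise.
gh_auth_header() {
    [ -n "${GITHUB_TOKEN:-}" ] && printf 'Authorization: Bearer %s\n' "$GITHUB_TOKEN"
}

# Call curl with the same flags as the scripts above, adding the header when available.
gh_curl() {
    local header
    if header="$(gh_auth_header)"; then
        curl --fail --silent --show-error -H "$header" "$@"
    else
        curl --fail --silent --show-error "$@"
    fi
}
```

Replacing curl with gh_curl in the loops should then send authenticated requests. Running `gh_curl https://api.github.com/rate_limit | jq '.resources.core'` is a quick way to check how many requests remain.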
Upvotes: 1