GitHub API: Repositories Contributed To

outoftime · Dec 21, 2013

Is there a way to access the data in the “Repositories contributed to” module on GitHub profile pages via the GitHub API? Ideally I'd like the entire list, not just the top five, which is apparently all you can get on the web.

Answer

Kyle Kelley · Dec 25, 2014

Using Google BigQuery with the GitHub Archive, I pulled all the repositories I made a pull request to using:

SELECT repository_url 
FROM [githubarchive:github.timeline]
WHERE payload_pull_request_user_login ='rgbkrk'
GROUP BY repository_url;

You can use a similar query to count the repositories you contributed to, as well as the languages they were in:

SELECT COUNT(DISTINCT repository_url) AS count_repositories_contributed_to,
       COUNT(DISTINCT repository_language) AS count_languages_in
FROM [githubarchive:github.timeline]
WHERE payload_pull_request_user_login ='rgbkrk';

If you're looking for overall contributions, which include issues reported, use:

SELECT COUNT(DISTINCT repository_url) AS count_repositories_contributed_to,
       COUNT(DISTINCT repository_language) AS count_languages_in
FROM [githubarchive:github.timeline]
WHERE actor_attributes_login = 'rgbkrk';

The difference there is the actor_attributes_login column, which comes from the Issue Events API.

You may also want to capture your own repos, which may not have any issues or PRs filed by you.
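
As a rough sketch of that last point, the filter can be widened to also match repositories owned by the login. The example below is only an illustration: it runs against an in-memory SQLite table with a few made-up rows rather than against BigQuery, the table name timeline is a stand-in, and repository_owner is an assumed column name that should be checked against the actual GitHub Archive schema before use.

```python
import sqlite3

# Hypothetical toy rows standing in for the GitHub Archive timeline table.
# repository_owner is an ASSUMED column name; the others come from the queries above.
ROWS = [
    # (repository_url, repository_owner, actor_attributes_login)
    ("https://github.com/alice/widgets", "alice", "rgbkrk"),  # activity on someone else's repo
    ("https://github.com/alice/widgets", "alice", "alice"),
    ("https://github.com/rgbkrk/notes", "rgbkrk", "bob"),     # own repo, activity only by others
    ("https://github.com/carol/tools", "carol", "carol"),     # unrelated
]

# Match rows where the login acted on the repo OR owns the repo,
# then collapse to one row per repository.
QUERY = """
SELECT repository_url
FROM timeline
WHERE actor_attributes_login = ?
   OR repository_owner = ?
GROUP BY repository_url
ORDER BY repository_url
"""


def repos_touched_or_owned(login, rows=ROWS):
    """Return sorted repository URLs the login acted on or owns (toy data only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE timeline ("
        "repository_url TEXT, repository_owner TEXT, actor_attributes_login TEXT)"
    )
    conn.executemany("INSERT INTO timeline VALUES (?, ?, ?)", rows)
    result = [url for (url,) in conn.execute(QUERY, (login, login))]
    conn.close()
    return result


if __name__ == "__main__":
    for url in repos_touched_or_owned("rgbkrk"):
        print(url)
```

If the column does exist under that name, the same WHERE clause should carry over to the BigQuery queries above, with the literal login in place of the ? placeholders.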