As far as I understand it, messages are associated with commits. But when you look at a repo on GitHub, it helpfully lists next to each file the message of the commit that last changed it.
I'd like to replicate that in a web view of a repo I have. Looking at the GitHub API, it seems to me the only way to get that info is to download all the commits (which can be paged) and work backwards from the most recent ones, assigning commit messages to the files in a local cache until you have a message for every file. That could mean going all the way back to the very first commit, if any of the files have not changed since then.
So the question is: is that the right way to do it? Won't that exhaust even the 5000 requests/hour quota?
Ok, after figuring out that what you need is the latest commit message for each file, here's what you can do.
First, get the list of files in your repository. To do this, you need to:
1) fetch the reference object of the branch that you want to list files for:
GET https://api.github.com/repos/:owner/:repo/git/refs/heads/:branch
You probably want the master branch, so this is an example of the request you will make:
https://api.github.com/repos/izuzak/pmrpc/git/refs/heads/master
The response you will get will look like this:
{
  "ref": "refs/heads/master",
  "url": "https://api.github.com/repos/izuzak/pmrpc/git/refs/heads/master",
  "object": {
    "sha": "fd6973f430a3367ad718ff049f1b075843913d6f",
    "type": "commit",
    "url": "https://api.github.com/repos/izuzak/pmrpc/git/commits/fd6973f430a3367ad718ff049f1b075843913d6f"
  }
}
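As a rough illustration, step 1 might look like this in Python using only the standard library. The helper names (`ref_url`, `get_json`, `commit_url_from_ref`) are made up for this sketch and are not part of any GitHub client; the request is unauthenticated, which is an assumption that only suits public repositories and low request volumes.

```python
import json
import urllib.request

API_ROOT = "https://api.github.com"


def ref_url(owner, repo, branch):
    """Build the URL of the reference object for a branch."""
    return f"{API_ROOT}/repos/{owner}/{repo}/git/refs/heads/{branch}"


def get_json(url):
    """GET a URL and decode the JSON body (this performs a network call)."""
    request = urllib.request.Request(
        url, headers={"Accept": "application/vnd.github+json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def commit_url_from_ref(ref):
    """Return the URL of the commit object that a reference points to."""
    return ref["object"]["url"]
```

Calling `commit_url_from_ref(get_json(ref_url("izuzak", "pmrpc", "master")))` would then give the commit URL used in the next step.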
2) fetch the commit object that the reference points to, using the object.url property of the response you received in the previous step:
GET https://api.github.com/repos/izuzak/pmrpc/git/commits/fd6973f430a3367ad718ff049f1b075843913d6f
The response you will get will look like this:
{
  "sha": "fd6973f430a3367ad718ff049f1b075843913d6f",
  "url": "https://api.github.com/repos/izuzak/pmrpc/git/commits/fd6973f430a3367ad718ff049f1b075843913d6f",
  "html_url": "https://github.com/izuzak/pmrpc/commits/fd6973f430a3367ad718ff049f1b075843913d6f",
  "author": {
    "name": "Ivan Zuzak",
    "email": "[email protected]",
    "date": "2013-04-09T08:55:45Z"
  },
  "committer": {
    "name": "Ivan Zuzak",
    "email": "[email protected]",
    "date": "2013-04-09T08:55:45Z"
  },
  "tree": {
    "sha": "f5f5de80f67dd794ffbd4abb855fb7d1a573660e",
    "url": "https://api.github.com/repos/izuzak/pmrpc/git/trees/f5f5de80f67dd794ffbd4abb855fb7d1a573660e"
  },
  "message": "fix typos",
  "parents": [
    {
      "sha": "d3617ae56dda793131e743b2ff394984bbab6ca3",
      "url": "https://api.github.com/repos/izuzak/pmrpc/git/commits/d3617ae56dda793131e743b2ff394984bbab6ca3",
      "html_url": "https://github.com/izuzak/pmrpc/commits/d3617ae56dda793131e743b2ff394984bbab6ca3"
    }
  ]
}
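Going from the reference to the commit's tree is just a matter of following two links. Here is a small Python sketch of that chain; `tree_url_for_ref` is a hypothetical helper, and `get_json` stands for any callable that takes a URL and returns the decoded JSON response (it is passed in so the logic can be tried without touching the network).

```python
def tree_url_for_ref(ref, get_json):
    """Follow a branch reference to its commit and return the commit's tree URL."""
    target = ref["object"]
    if target["type"] != "commit":
        # Branch references normally point at commits; anything else is unexpected here.
        raise ValueError(f"expected a commit, got {target['type']!r}")
    commit = get_json(target["url"])
    return commit["tree"]["url"]
```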
3) fetch the tree object of the commit object fetched in the previous step. You will do this by following the tree.url link provided in the response of the previous step:
GET https://api.github.com/repos/izuzak/pmrpc/git/trees/f5f5de80f67dd794ffbd4abb855fb7d1a573660e
The response will look like this:
{
  "sha": "f5f5de80f67dd794ffbd4abb855fb7d1a573660e",
  "url": "https://api.github.com/repos/izuzak/pmrpc/git/trees/f5f5de80f67dd794ffbd4abb855fb7d1a573660e",
  "tree": [
    {
      "mode": "100644",
      "type": "blob",
      "sha": "726f21a4adec8c24c2fab6cf5b455d094a8b21bf",
      "path": "LICENSE.markdown",
      "size": 568,
      "url": "https://api.github.com/repos/izuzak/pmrpc/git/blobs/726f21a4adec8c24c2fab6cf5b455d094a8b21bf"
    },
    {
      "mode": "100644",
      "type": "blob",
      "sha": "eb94760b81441b34a73d1b085d9f153ae48b0e63",
      "path": "README.markdown",
      "size": 5772,
      "url": "https://api.github.com/repos/izuzak/pmrpc/git/blobs/eb94760b81441b34a73d1b085d9f153ae48b0e63"
    },
    {
      "mode": "040000",
      "type": "tree",
      "sha": "2e72b217b8644ce6874cda03387a7ab2d8eee55e",
      "path": "examples",
      "url": "https://api.github.com/repos/izuzak/pmrpc/git/trees/2e72b217b8644ce6874cda03387a7ab2d8eee55e"
    },
    {
      "mode": "100644",
      "type": "blob",
      "sha": "64b0dbe4981759c0f9640c8e882c97c7324fc798",
      "path": "pmrpc.js",
      "size": 24546,
      "url": "https://api.github.com/repos/izuzak/pmrpc/git/blobs/64b0dbe4981759c0f9640c8e882c97c7324fc798"
    }
  ]
}
These are all the files and folders at the top level of the repository. Notice, however, that for folders you will need to recursively fetch the folder's tree object to get the list of files inside it. In the response above, examples is a folder, which you can see from the tree value of its type property. So, you would do another GET request on the url provided with the folder:
GET https://api.github.com/repos/izuzak/pmrpc/git/trees/2e72b217b8644ce6874cda03387a7ab2d8eee55e
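The folder-by-folder walk could be sketched in Python roughly as follows. `list_files` is a hypothetical helper name, and `get_tree` is assumed to be any callable that takes a tree URL and returns the decoded JSON tree object (one HTTP request per folder in practice).

```python
def list_files(tree, get_tree, prefix=""):
    """Yield the full path of every file (blob) reachable from a tree object."""
    for entry in tree["tree"]:
        path = prefix + entry["path"]
        if entry["type"] == "blob":
            yield path
        elif entry["type"] == "tree":
            # A folder: fetch its own tree object and descend into it.
            yield from list_files(get_tree(entry["url"]), get_tree, path + "/")
        # Other entry types (for example "commit", used for submodules) are skipped.
```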
An alternative approach is to get the list of all files (in all folders) with a single request by adding the recursive=1 parameter to the tree request, as described in the GitHub API documentation for Git trees. I suggest you use this approach, since it requires just one HTTP request.
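A sketch of the recursive variant in Python, with hypothetical helper names. One caveat worth handling: as far as I know, the tree response carries a `truncated` flag that is set when the repository is too large to list in one response, in which case you would have to fall back to fetching folders one at a time.

```python
def recursive_tree_url(owner, repo, tree_sha):
    """Build the URL that lists a whole tree, subfolders included, in one request."""
    return (
        f"https://api.github.com/repos/{owner}/{repo}"
        f"/git/trees/{tree_sha}?recursive=1"
    )


def paths_from_recursive_tree(tree):
    """Return the file paths from a recursive tree response."""
    if tree.get("truncated"):
        raise ValueError("tree listing was truncated; fetch folders individually")
    # In a recursive listing each path is already relative to the repository root.
    return [entry["path"] for entry in tree["tree"] if entry["type"] == "blob"]
```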
Next, now that you have the list of files and folders in the repo, you need the last commit that changed each of them. To get it, make this request:
GET https://api.github.com/repos/:owner/:repo/commits?path=FILE_OR_FOLDER_PATH
So, for example, this is a request to fetch the commits for the examples folder mentioned above:
GET https://api.github.com/repos/izuzak/pmrpc/commits?path=examples
The response you will get is a list of commit objects, newest first. Since you are interested in the last commit for the file, just look at the first object in that list and read its commit.message property to get the message you need:
[
  {
    "sha": "3437f015257683a86e3b973b3279754df9ac2b24",
    "commit": {
      "author": { ... },
      "committer": { ... },
      "message": "change mode",
      "tree": { ... },
      "url": "https://api.github.com/repos/izuzak/pmrpc/git/commits/3437f015257683a86e3b973b3279754df9ac2b24",
      "comment_count": 0
    },
    ...
  },
  {
    ...
  }
]
In this case, the message for the latest commit that changed the folder examples is "change mode".
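In Python, the per-path lookup might be sketched like this. The helper names are hypothetical. The `per_page=1` parameter is my addition, not something the steps above require: since only the first commit in the list matters, asking for a single result keeps the response small.

```python
from urllib.parse import urlencode


def commits_url(owner, repo, path):
    """Build the URL listing the commits that touched a file or folder."""
    query = urlencode({"path": path, "per_page": 1})
    return f"https://api.github.com/repos/{owner}/{repo}/commits?{query}"


def latest_message(commits):
    """Return the message of the newest commit in a commits response, if any."""
    if not commits:
        return None  # no commit on this branch touched the path
    return commits[0]["commit"]["message"]
```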
So, basically, you need to make 3 HTTP requests to fetch the list of files, and then 1 HTTP request for each file and folder. The bad news is that if you have lots of files, you will be making lots of HTTP requests. The good news is that you can cache responses so that you don't need to repeat requests when nothing has changed (see the GitHub API documentation on conditional requests for more info). Also, you will not be fetching all the commit messages at once; you will fetch them as the user navigates through the folders, just as GitHub does when you click on a folder. Thus you should be able to stay within the limit of 5000 requests per hour easily.
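One way to do the caching is with ETags: remember the ETag header of each response, send it back in an If-None-Match header next time, and reuse the stored body when the server answers 304 Not Modified. The following is only a sketch under those assumptions; `ETagCache` and `http_fetch` are names invented here, the cache lives in memory, and the fetch function is injectable so it can be swapped out.

```python
import json
import urllib.error
import urllib.request


def http_fetch(url, etag=None):
    """Return (status, etag, body); body is None on 304 Not Modified."""
    headers = {"Accept": "application/vnd.github+json"}
    if etag:
        headers["If-None-Match"] = etag
    request = urllib.request.Request(url, headers=headers)
    try:
        with urllib.request.urlopen(request) as response:
            return response.status, response.headers.get("ETag"), json.load(response)
    except urllib.error.HTTPError as error:
        if error.code == 304:  # urllib reports 304 as an HTTPError
            return 304, etag, None
        raise


class ETagCache:
    """In-memory cache that revalidates each URL with its stored ETag."""

    def __init__(self, fetch=http_fetch):
        self._fetch = fetch
        self._entries = {}  # url -> (etag, decoded body)

    def get(self, url):
        etag, cached = self._entries.get(url, (None, None))
        status, new_etag, body = self._fetch(url, etag)
        if status == 304:
            return cached  # unchanged since last time, reuse the stored body
        self._entries[url] = (new_etag, body)
        return body
```

With this, `ETagCache().get(url)` behaves like a plain fetch the first time and a cheap revalidation afterwards.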
Hope this helps! And let me know if you find an easier way to do this :). I don't know if there's a way to achieve this with just 1-2 requests, which is probably what you expected.