Get all file names from a GitHub repo through the GitHub API

Anton Antonov · Jul 29, 2014

Is it possible to get all the file names from repository using the GitHub API?

I'm currently trying to tinker with this using PyGithub, but I'm totally OK with making the request manually as long as it works.

My algorithm so far is:

  1. Get the user repo names
  2. Get the user repo that matches a certain description
  3. ??? get repo file names?

Answer

Chris · Jul 29, 2014

This has to be relative to a particular commit, since some files may be present in some commits and absent in others. So before you can look at files, you'll need to pick a commit using something like List commits on a repository:

GET /repos/:owner/:repo/commits

If you're just interested in the latest commit on a branch you can set the sha parameter to the branch name:

sha (string): SHA or branch to start listing commits from.
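
As a rough sketch of this first step in Python (the language the question already uses via PyGithub), the request could be made with the standard library alone. The helper names commits_url, get_json and latest_commit_sha are made up for illustration, and per_page=1 is added on the assumption that only the newest commit is wanted:

```python
import json
import urllib.parse
import urllib.request

API_ROOT = "https://api.github.com"


def commits_url(owner, repo, branch=None, per_page=1):
    """Build the 'List commits' URL, optionally starting from a branch or SHA."""
    params = {"per_page": per_page}
    if branch is not None:
        # The sha parameter accepts a branch name as well as a commit SHA.
        params["sha"] = branch
    return f"{API_ROOT}/repos/{owner}/{repo}/commits?{urllib.parse.urlencode(params)}"


def get_json(url):
    """Fetch a URL and decode its JSON body (unauthenticated, so rate-limited)."""
    request = urllib.request.Request(
        url, headers={"Accept": "application/vnd.github+json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def latest_commit_sha(owner, repo, branch=None, fetch=get_json):
    """Return the SHA of the first commit in the listing, which is the newest."""
    commits = fetch(commits_url(owner, repo, branch))
    return commits[0]["sha"]
```

Something like latest_commit_sha("octocat", "Hello-World", "master") would then give the hash to use in the next step. The fetch argument exists only so that the network call can be swapped out, for example in tests.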

Once you have a commit hash, you can inspect that commit:

GET /repos/:owner/:repo/git/commits/:sha

which should return something like this (truncated from GitHub's documentation):

{
  "sha": "...",
  "...",
  "tree": {
    "url": "https://api.github.com/repos/octocat/Hello-World/git/trees/691272480426f78a0138979dd3ce63b77f706feb",
    "sha": "691272480426f78a0138979dd3ce63b77f706feb"
  },
  "...": "..."
}

Look at the hash of its tree (the tree is essentially the commit's directory contents). In this case, the hash is 691272480426f78a0138979dd3ce63b77f706feb. Now we can finally request the contents of that tree:

GET /repos/:owner/:repo/git/trees/:sha
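
A small sketch of these two steps, reading the tree hash off the commit object and building the tree request. The function names are hypothetical, and the sample commit is the truncated one shown above, with the elided fields left out:

```python
API_ROOT = "https://api.github.com"


def tree_sha_of_commit(commit):
    """Return the root tree hash from a Git commit object."""
    return commit["tree"]["sha"]


def tree_url(owner, repo, tree_sha):
    """Build the URL for GET /repos/:owner/:repo/git/trees/:sha."""
    return f"{API_ROOT}/repos/{owner}/{repo}/git/trees/{tree_sha}"


# Truncated commit object, as in the sample response above.
commit = {
    "sha": "...",
    "tree": {
        "url": "https://api.github.com/repos/octocat/Hello-World/git/trees/691272480426f78a0138979dd3ce63b77f706feb",
        "sha": "691272480426f78a0138979dd3ce63b77f706feb",
    },
}

print(tree_url("octocat", "Hello-World", tree_sha_of_commit(commit)))
# prints https://api.github.com/repos/octocat/Hello-World/git/trees/691272480426f78a0138979dd3ce63b77f706feb
```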

The output from GitHub's example for this request is:

{
  "sha": "9fb037999f264ba9a7fc6274d15fa3ae2ab98312",
  "url": "https://api.github.com/repos/octocat/Hello-World/trees/9fb037999f264ba9a7fc6274d15fa3ae2ab98312",
  "tree": [
    {
      "path": "file.rb",
      "mode": "100644",
      "type": "blob",
      "size": 30,
      "sha": "44b4fc6d56897b048c772eb4087f854f46256132",
      "url": "https://api.github.com/repos/octocat/Hello-World/git/blobs/44b4fc6d56897b048c772eb4087f854f46256132"
    },
    {
      "path": "subdir",
      "mode": "040000",
      "type": "tree",
      "sha": "f484d249c660418515fb01c2b9662073663c242e",
      "url": "https://api.github.com/repos/octocat/Hello-World/git/blobs/f484d249c660418515fb01c2b9662073663c242e"
    },
    {
      "path": "exec_file",
      "mode": "100755",
      "type": "blob",
      "size": 75,
      "sha": "45b983be36b73c0788dc9cbcb76cbb80fc7bb057",
      "url": "https://api.github.com/repos/octocat/Hello-World/git/blobs/45b983be36b73c0788dc9cbcb76cbb80fc7bb057"
    }
  ]
}

As you can see, we have some blobs, which correspond to files, and some additional trees, which correspond to subdirectories. To list every file, repeat the tree request for each subtree, recursively.
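
The recursive walk could be sketched roughly like this, again with the standard library only. list_files and get_json are illustrative names, not part of any library, and the sketch assumes each response has the shape shown above:

```python
import json
import urllib.request

API_ROOT = "https://api.github.com"


def get_json(url):
    """Fetch a URL and decode its JSON body (unauthenticated, so rate-limited)."""
    request = urllib.request.Request(
        url, headers={"Accept": "application/vnd.github+json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def list_files(owner, repo, tree_sha, prefix="", fetch=get_json):
    """Yield the full path of every blob under a tree, descending into subtrees."""
    url = f"{API_ROOT}/repos/{owner}/{repo}/git/trees/{tree_sha}"
    for entry in fetch(url)["tree"]:
        path = prefix + entry["path"]
        if entry["type"] == "blob":
            # A blob is a file: report its path relative to the root tree.
            yield path
        elif entry["type"] == "tree":
            # A tree is a subdirectory: request it and walk its entries too.
            yield from list_files(owner, repo, entry["sha"], path + "/", fetch)
        # Any other entry type (such as a submodule) is skipped.
```

This makes one request per directory, which can add up on large repositories. The trees endpoint also accepts a recursive query parameter that returns nested entries in a single response; that response carries a truncated flag, so very large trees may still need the manual walk.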