Where does Bitbucket Server store the repository?

Taher Stenberg · Apr 23, 2018

Background:

Some time ago my company decided to host our own Atlassian products, and our source code lives on a Bitbucket Server instance that we host ourselves. Now our VM (Windows Server 2012) is falling apart and we can't reach Bitbucket from the outside.

As no one has developed on our product for a little while, we are not sure whether the code that we have locally is up to date.

Problem:

We need to get the latest version of our source code even though we can't get Bitbucket Server up and running.

I have connected to the server using remote desktop but can't seem to find the files.

Question

What is the default location of the Git repo? Is there a folder, other than "git", that I can search for that should get me to the right place?

BTW

We are moving to a hosted service...

All the best /Taher

Answer

daveruinseverything · Apr 27, 2018

Full disclosure: I work for Atlassian and was a Premier Support Engineer for Bitbucket Server / Bitbucket Data Center for two years. That out of the way, let's get down to it:

Locating your repository

Bitbucket Server repositories are stored on disk at $BITBUCKET_HOME/shared/data/repositories/<id>, where <id> is the repository ID as found in the repository table in your database. Let's say you had a repository with the slug "jira" and the project key "ATLAS". (The slug is the URL-safe name you see in your git clone URL; the project key is the uppercase abbreviation for your project name.) To find the repository ID you could run the following database query:

SELECT repository.id, repository.name, project.project_key
  FROM repository
  JOIN project ON project.id = repository.project_id
 WHERE repository.slug = 'jira'
   AND project.project_key = 'ATLAS';

With the ID in hand, you can locate the repository folder from the path above.

If, for some reason, your database is not available, there's another way. Each repository contains a repository-config file which, amongst other things, contains the repository and project names. Its contents might look something like this:

#>***********************************************
# THIS FILE IS MAINTAINED BY ATLASSIAN BITBUCKET
# IT CONTAINS NO USER-SERVICEABLE PARTS.
#>***********************************************
[bitbucket]
    hierarchy = 8597e1f873a45c2b9d3f
    project = ATLAS
    repository = jira

You could use any number of bash one-liners to find the repo you're interested in. Here's a simple one I've used before, that just searches on repository name:

find "$BITBUCKET_HOME/shared/data/repositories" -mindepth 2 -maxdepth 2 -type f -name repository-config -print0 | xargs -0 grep "repository = jira"

For Windows you could use findstr in combination with other Windows command line utilities to achieve a similar effect.
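If you would rather see every repository at once instead of searching for one name, the repository-config lookup can be sketched as a small shell function. This is only a sketch: it assumes every repository-config follows the layout of the sample above, and list_bitbucket_repos is a made-up helper name, not something Bitbucket provides. On a Windows server it should be runnable from Git Bash, assuming Git for Windows is installed.

```shell
# Hypothetical helper: print "<id><TAB><project>/<repository>" for each repository
# found under a Bitbucket home directory.
# Assumption: each <id> folder contains a repository-config file laid out like
# the sample shown earlier ("project = ..." and "repository = ..." lines).
list_bitbucket_repos() {
    local repos_dir="$1/shared/data/repositories"
    local cfg id project repo
    for cfg in "$repos_dir"/*/repository-config; do
        [ -f "$cfg" ] || continue   # skip when the glob matched nothing
        # The numeric folder name is the repository ID.
        id="$(basename "$(dirname "$cfg")")"
        # Pull the values out of the "key = value" lines.
        project="$(sed -n 's/^[[:space:]]*project[[:space:]]*=[[:space:]]*//p' "$cfg")"
        repo="$(sed -n 's/^[[:space:]]*repository[[:space:]]*=[[:space:]]*//p' "$cfg")"
        printf '%s\t%s/%s\n' "$id" "$project" "$repo"
    done
}
```

Calling list_bitbucket_repos "$BITBUCKET_HOME" would then print one line per repository, which makes it easy to pick out the <id> folder you need without the database.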

Using the repository

Repositories in Bitbucket Server are stored as bare repositories. Before going further it's important to understand the difference:

  • "Normal" non-bare repositories are the kind you see when you clone a repo with git clone <url> or create a new repo with git init. This kind of repo stores all of the actual git repository data, with all the history, revisions, objects etc. inside a .git subfolder in your working directory. This kind of repository has a "work tree", which basically means it has all of the files from whatever branch you are working on "checked out" and available inside your working directory. Run ls .git in a repo on your local machine and you'll see what the inside of a git repo looks like.
  • Bare repositories have no work tree, and no checked out files. What you would normally see inside a .git folder exists at the top level of a bare repo. ls inside any of the folders under $BITBUCKET_HOME/shared/data/repositories and you'll see the same kinds of files / folders you saw in your local repo's .git folder.

The reason Bitbucket stores repos as "bare" is simple: no development work is being done server-side, so there's no need to ever "check out" a particular branch or work tree to do work on.
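To confirm that a folder you found really is a bare repository before copying it anywhere, git itself can tell you. The following is a minimal sketch; is_bare_repo is a hypothetical helper name, and it assumes git is on the PATH and that you run it as a user allowed to read the folder.

```shell
# Hypothetical helper: succeeds (exit status 0) only when the given directory
# is a bare git repository.
# "git rev-parse --is-bare-repository" prints "true" or "false"; for a
# directory that is not a repository at all it prints an error, which is
# silenced here so the function simply reports failure.
is_bare_repo() {
    [ "$(git -C "$1" rev-parse --is-bare-repository 2>/dev/null)" = "true" ]
}
```

For example, is_bare_repo "$BITBUCKET_HOME/shared/data/repositories/<id>" && echo "bare" would print "bare" for a repository folder of the kind described above. It only reads from the repository, so it does not modify anything on the server.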

Now, it's very, very dangerous to try to perform any actions directly on a bare repository in Bitbucket Server. You're bypassing the entire server mechanism and all of its internal business logic, and you're liable to break things if you try. Even in the best case, you could introduce changes that are not accounted for in Bitbucket's database and create inconsistencies that will be a lot of work to recover from.

What you want to do is the same thing you would always do: clone it! The great thing about git is that it's so flexible. You can git clone from a file path just as well as you can from a URL. Find a way to access your bare repository from a workstation directly, either by mounting the disk remotely, or simply by copying the relevant <id> repo folder to your local machine. Then clone it! git clone $BITBUCKET_HOME/shared/data/repositories/<id> mylocalworkingcopy will clone your bare repo into a new folder named mylocalworkingcopy and check out the default branch. You can do whatever you need to do here.
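The clone step can be sketched as follows, assuming you have already copied the <id> folder off the server. rescue_clone is a hypothetical helper name, and both paths are whatever you choose on your workstation.

```shell
# Hypothetical helper: clone a copied bare repository into a normal working copy.
#   $1  path to the bare repository folder (a copy of .../repositories/<id>)
#   $2  destination folder for the new working copy
rescue_clone() {
    local bare_repo="$1" dest="$2"
    # Cloning from a file path works the same way as cloning from a URL.
    git clone "$bare_repo" "$dest" || return 1
    # List every branch that came across, with its latest commit, so it can be
    # compared against the code people still have locally.
    git -C "$dest" branch --remotes --verbose
}
```

Something like rescue_clone /path/to/copied/<id> mylocalworkingcopy would leave the default branch checked out in mylocalworkingcopy, with the other branches available as origin/<branch>.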

Warning: don't push your changes back. Pushing will work, but either you'll be pushing changes to a dummy copy, or you'll be pushing changes directly to the server, bypassing Bitbucket, and introducing commits, branches and objects that aren't being tracked by Bitbucket Server. This is the inconsistent state I was referring to above. It's something you can recover from, but it's a pain in the neck and best avoided.
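One way to guard against an accidental push is to point the clone's push URL at something that is not a repository, while leaving the fetch URL alone. This is a sketch of that idea, not an official Bitbucket procedure; disable_push and the placeholder value are made up for illustration.

```shell
# Hypothetical helper: make "git push" to origin fail in the given working copy.
# "git remote set-url --push" only changes the push URL (remote.origin.pushurl),
# so fetching from the original path keeps working.
disable_push() {
    git -C "$1" remote set-url --push origin DISABLED_do_not_push
}
```

After running disable_push mylocalworkingcopy, a push to origin should fail because the placeholder is not a valid repository, so nothing reaches the bare copy by mistake.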