How to avoid reinstalling dependencies for each job in GitLab CI

Tamlyn · Nov 4, 2015

I'm using GitLab CI 8.0 with gitlab-ci-multi-runner 0.6.0. I have a .gitlab-ci.yml file similar to the following:

before_script:
  - npm install

server_tests:
  script: mocha

client_tests:
  script: karma start karma.conf.js

This works, but it means the dependencies are installed from scratch before each test job. For a large project with many dependencies, this adds considerable overhead.

In Jenkins I would use one job to install the dependencies, TAR them up, and publish the archive as a build artefact that is then copied to downstream jobs. Would something similar work with GitLab CI? Is there a recommended approach?

Answer

Tamlyn · Dec 3, 2015

Update: I now recommend using artifacts with a short expire_in. This is superior to cache because the artifact only has to be written once per pipeline, whereas the cache is updated after every job. Also, the cache is per runner, so if you run your jobs in parallel on multiple runners it's not guaranteed to be populated, unlike artifacts, which are stored centrally.
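
A minimal sketch of what the artifacts approach could look like for the jobs in the question. The stage name `setup`, the job name `install_dependencies` and the one-hour expiry are illustrative assumptions, not values from my actual config; it also assumes a GitLab version recent enough to support `expire_in`.

```yaml
stages:
  - setup   # assumed stage name
  - test

# Runs once per pipeline and publishes node_modules as an artifact
install_dependencies:
  stage: setup
  script:
    - npm install --quiet
  artifacts:
    paths:
      - node_modules/
    # keep it short: the artifact is only needed while the pipeline runs
    expire_in: 1 hour

# Jobs in later stages download artifacts from earlier stages by default,
# so node_modules is already in place when these scripts start
server_tests:
  stage: test
  script: mocha

client_tests:
  stage: test
  script: karma start karma.conf.js
```

Note that the global before_script with npm install is gone here, otherwise every test job would still reinstall.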


GitLab CI 8.2 adds runner caching, which lets you reuse files between builds. However, I've found this to be very slow.
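
For reference, the built-in caching is enabled with the `cache` keyword. This is only a rough sketch of the simplest setup, assuming node_modules is the only directory worth caching:

```yaml
# Ask the runner to save node_modules after each job
# and restore it before the next one
cache:
  paths:
    - node_modules/

before_script:
  # still runs in every job, but has little to do
  # when node_modules was restored from the cache
  - npm install --quiet
```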

Instead I've implemented my own caching system using a bit of shell scripting:

before_script:
  # MD5 hash of package.json; changes whenever the declared dependencies change
  - PACKAGE_HASH=$(md5sum package.json | cut -d' ' -f1)
  # path to the cache archive for this set of dependencies
  - DEPS_CACHE=/tmp/dependencies_${PACKAGE_HASH}.tar.gz
  # restore node_modules from the archive if it exists; otherwise install and create it
  - if [ -f "$DEPS_CACHE" ];
    then
      tar zxf "$DEPS_CACHE";
    else
      npm install --quiet;
      tar zcf "$DEPS_CACHE" ./node_modules;
    fi

This will run before every job in your .gitlab-ci.yml and only install your dependencies if package.json has changed or the cache file is missing (e.g. first run, or file was manually deleted). Note that if you have several runners on different servers, they will each have their own cache file.
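
One caveat with the script above: if two jobs run at the same time on the same server, one of them can see a cache file that the other is still writing. A possible variation, sketched here and not something I run myself, writes the archive to a temporary file first and then renames it into place. The TMP_CACHE variable and the temporary file name pattern are made up for illustration.

```yaml
before_script:
  - PACKAGE_HASH=$(md5sum package.json | cut -d' ' -f1)
  - DEPS_CACHE=/tmp/dependencies_${PACKAGE_HASH}.tar.gz
  - if [ -f "$DEPS_CACHE" ];
    then
      tar zxf "$DEPS_CACHE";
    else
      npm install --quiet;
      TMP_CACHE=$(mktemp /tmp/dependencies_tmp.XXXXXX);
      tar zcf "$TMP_CACHE" ./node_modules;
      mv "$TMP_CACHE" "$DEPS_CACHE";
    fi
```

Because the rename happens within /tmp, other jobs either see no cache file or a complete one.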

You may want to clear out the cache file on a regular basis in order to get the latest dependencies. We do this with the following cron entry:

@daily find /tmp -maxdepth 1 -name 'dependencies_*.tar.gz' -type f -mtime +1 -delete