GitLab CI preserve environment between build stages

Philip O'Brien · Aug 10, 2016

I am working on a Python project and using Miniconda to manage my environment. I am using GitLab CI with the following configuration:

stages:
  - build
  - test 

build:
  stage: build
  script:
    - if hash $HOME/miniconda/bin/conda 2>/dev/null; 
      then
        export PATH="$HOME/miniconda/bin:$PATH";
      else
        wget http://repo.continuum.io/miniconda/Miniconda-latest-Linux-x86_64.sh -O miniconda.sh;
        bash miniconda.sh -b -p $HOME/miniconda;
        export PATH="$HOME/miniconda/bin:$PATH";
      fi
    - conda update --yes conda

test:
  stage: test
  script:
    - conda env create --quiet --force --file environment.yml
    - source activate myenv
    - nosetests --with-coverage --cover-erase --cover-package=mypackage --cover-html
    - pylint --reports=n tests/test_final.py
    - pep8 tests/test_final.py
    - grep pc_cov cover/index.html | egrep -o "[0-9]+\%" | awk '{ print "covered " $1;}'

I assumed (incorrectly) that my build stage would set up the environment in which my test stage then runs. From this question and this GitLab issue, I see that

each job defined in .gitlab-ci.yml is run as separate build (where we assume that there's no history)

But the alternative of lumping everything together in one stage isn't appealing:

stages:
  - test 

test:
  stage: test
  script:
    - if hash $HOME/miniconda/bin/conda 2>/dev/null; 
      then
        export PATH="$HOME/miniconda/bin:$PATH";
      else
        wget http://repo.continuum.io/miniconda/Miniconda-latest-Linux-x86_64.sh -O miniconda.sh;
        bash miniconda.sh -b -p $HOME/miniconda;
        export PATH="$HOME/miniconda/bin:$PATH";
      fi
    - conda update --yes conda
    - conda env create --quiet --force --file environment.yml
    - source activate myenv
    - nosetests --with-coverage --cover-erase --cover-package=mypackage --cover-html
    - pylint --reports=n tests/test_final.py
    - pep8 tests/test_final.py
    - grep pc_cov cover/index.html | egrep -o "[0-9]+\%" | awk '{ print "covered " $1;}'

The only other option I can think of is to put the environment creation steps in before_script, but it seems redundant to recreate the same environment before every job.

Answer

tmt · Aug 10, 2016

The independence of the jobs is a design feature. You might have noticed that GitLab's interface lets you re-run a single job, which wouldn't be possible if the jobs depended on each other.

I don't know exactly what Miniconda does, but if it builds a virtual environment in specific folders, you can use cache to preserve the contents of those folders between jobs. However, you cannot fully rely on it, because the documentation states:

The cache is provided on a best-effort basis, so don't expect that the cache will be always present. For implementation details, please check GitLab Runner.
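
As a rough illustration of the cache idea, the fragment below keeps a Miniconda directory between jobs. It is only a sketch under assumptions: cache paths are resolved relative to the project directory, so it assumes Miniconda is installed into a miniconda/ folder inside the checkout rather than $HOME/miniconda as in the question, and the key name is hypothetical. A conda installation records its install prefix, so this probably only helps when the project is checked out at the same path on every run.

```yaml
# Sketch only: share a project-local Miniconda install between jobs.
cache:
  key: miniconda        # hypothetical key; any stable string works
  paths:
    - miniconda/        # relative to the project directory
```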

Since your job absolutely depends on the environment being built, you would need a mechanism that detects whether the (cached) environment exists and re-creates it only if needed.
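
One way such a check could look is sketched below. It assumes that a miniconda/ directory inside the project is listed under cache paths, that environment.yml defines an environment named myenv (the name used in the question), and that named environments live under miniconda/envs/. None of this is confirmed by the question, so treat the paths as placeholders.

```yaml
test:
  stage: test
  before_script:
    # Install Miniconda only when no usable cached copy is present.
    - |
      if [ ! -x "$CI_PROJECT_DIR/miniconda/bin/conda" ]; then
        wget http://repo.continuum.io/miniconda/Miniconda-latest-Linux-x86_64.sh -O miniconda.sh
        bash miniconda.sh -b -p "$CI_PROJECT_DIR/miniconda"
      fi
    - export PATH="$CI_PROJECT_DIR/miniconda/bin:$PATH"
    # Create the environment only when its directory is missing (assumed location).
    - |
      if [ ! -d "$CI_PROJECT_DIR/miniconda/envs/myenv" ]; then
        conda env create --quiet --file environment.yml
      fi
    - source activate myenv
  script:
    - nosetests --with-coverage --cover-erase --cover-package=mypackage --cover-html
```

Because the checks run in every job, a missing cache costs only the time of a fresh install instead of failing the job.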

I think you are on the right path in trying to separate the environment setup from the jobs, because it could save a lot of time if you decide one day to run different tests simultaneously (jobs in the same stage run in parallel).
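
To illustrate the parallel case, the sketch below splits the question's test commands into two jobs in the same stage. The job names unit_tests and lint are hypothetical, the environment setup is omitted for brevity, and the jobs only actually run at the same time if enough runners (or enough runner concurrency) are available.

```yaml
stages:
  - test

# Both jobs belong to the "test" stage, so they can run side by side.
unit_tests:
  stage: test
  script:
    - nosetests --with-coverage --cover-erase --cover-package=mypackage --cover-html

lint:
  stage: test
  script:
    - pylint --reports=n tests/test_final.py
    - pep8 tests/test_final.py
```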