I have the below Dockerfile.
FROM ubuntu:14.04
MAINTAINER Samuel Alexander <[email protected]>
RUN apt-get -y install software-properties-common
RUN apt-get -y update
# Install Java.
RUN echo oracle-java8-installer shared/accepted-oracle-license-v1-1 select true | debconf-set-selections
RUN add-apt-repository -y ppa:webupd8team/java
RUN apt-get -y update
RUN apt-get install -y oracle-java8-installer
RUN rm -rf /var/lib/apt/lists/*
RUN rm -rf /var/cache/oracle-jdk8-installer
# Define working directory.
WORKDIR /work
# Define commonly used JAVA_HOME variable
ENV JAVA_HOME /usr/lib/jvm/java-8-oracle
# JAVA PATH
ENV PATH /usr/lib/jvm/java-8-oracle/bin:$PATH
# Install maven
RUN apt-get -y update
RUN apt-get -y install maven
# Install Open SSH and git
RUN apt-get -y install openssh-server
RUN apt-get -y install git
# clone Spark
RUN git clone https://github.com/apache/spark.git
WORKDIR /work/spark
RUN mvn -DskipTests clean package
# clone and build zeppelin fork
RUN git clone https://github.com/apache/incubator-zeppelin.git
WORKDIR /work/incubator-zeppelin
RUN mvn clean package -Pspark-1.6 -Phadoop-2.6 -DskipTests
# Install Supervisord
RUN apt-get -y install supervisor
RUN mkdir -p var/log/supervisor
# Configure Supervisord
COPY conf/supervisord.conf /etc/supervisor/conf.d/supervisord.conf
# bash
RUN sed -i s#/home/git:/bin/false#/home/git:/bin/bash# /etc/passwd
EXPOSE 8080 8082
CMD ["/usr/bin/supervisord"]
While building image it failed in step 23 i.e.
RUN mvn clean package -Pspark-1.6 -Phadoop-2.6 -DskipTests
Now when I rebuild it starts to build from step 23 as docker is using cache.
But if I want to rebuild the image from step 21 i.e.
RUN git clone https://github.com/apache/incubator-zeppelin.git
How can I do that? Is deleting the cached image is the only option? Is there any additional parameter to do that?
You can rebuild the entire thing without using the cache by doing a
docker build --no-cache -t user/image-name
To force a rerun starting at a specific line, you can pass an arg that is otherwise unused. Docker passes ARG values as environment variables to your RUN command, so changing an ARG is a change to the command which breaks the cache. It's not even necessary to define it yourself on the RUN line.
FROM ubuntu:14.04
MAINTAINER Samuel Alexander <[email protected]>
RUN apt-get -y install software-properties-common
RUN apt-get -y update
# Install Java.
RUN echo oracle-java8-installer shared/accepted-oracle-license-v1-1 select true | debconf-set-selections
RUN add-apt-repository -y ppa:webupd8team/java
RUN apt-get -y update
RUN apt-get install -y oracle-java8-installer
RUN rm -rf /var/lib/apt/lists/*
RUN rm -rf /var/cache/oracle-jdk8-installer
# Define working directory.
WORKDIR /work
# Define commonly used JAVA_HOME variable
ENV JAVA_HOME /usr/lib/jvm/java-8-oracle
# JAVA PATH
ENV PATH /usr/lib/jvm/java-8-oracle/bin:$PATH
# Install maven
RUN apt-get -y update
RUN apt-get -y install maven
# Install Open SSH and git
RUN apt-get -y install openssh-server
RUN apt-get -y install git
# clone Spark
RUN git clone https://github.com/apache/spark.git
WORKDIR /work/spark
RUN mvn -DskipTests clean package
# clone and build zeppelin fork, changing INCUBATOR_VER will break the cache here
ARG INCUBATOR_VER=unknown
RUN git clone https://github.com/apache/incubator-zeppelin.git
WORKDIR /work/incubator-zeppelin
RUN mvn clean package -Pspark-1.6 -Phadoop-2.6 -DskipTests
# Install Supervisord
RUN apt-get -y install supervisor
RUN mkdir -p var/log/supervisor
# Configure Supervisord
COPY conf/supervisord.conf /etc/supervisor/conf.d/supervisord.conf
# bash
RUN sed -i s#/home/git:/bin/false#/home/git:/bin/bash# /etc/passwd
EXPOSE 8080 8082
CMD ["/usr/bin/supervisord"]
And then just run it with a unique arg:
docker build --build-arg INCUBATOR_VER=20160613.2 -t user/image-name .
To change the argument with every build, you can pass a timestamp as the arg:
docker build --build-arg INCUBATOR_VER=$(date +%Y%m%d-%H%M%S) -t user/image-name .
or:
docker build --build-arg INCUBATOR_VER=$(date +%s) -t user/image-name .
As an aside, I'd recommend the following changes to keep your layers smaller, the more you can merge the cleanup and delete steps on a single RUN
command after the download and install, the smaller your final image will be. Otherwise your layers will include all the intermediate steps between the download and cleanup:
FROM ubuntu:14.04
MAINTAINER Samuel Alexander <[email protected]>
RUN DEBIAN_FRONTEND=noninteractive \
apt-get -y install software-properties-common && \
apt-get -y update
# Install Java.
RUN echo oracle-java8-installer shared/accepted-oracle-license-v1-1 select true | debconf-set-selections && \
add-apt-repository -y ppa:webupd8team/java && \
apt-get -y update && \
DEBIAN_FRONTEND=noninteractive \
apt-get install -y oracle-java8-installer && \
apt-get clean && \
rm -rf /var/lib/apt/lists/* && \
rm -rf /var/cache/oracle-jdk8-installer && \
# Define working directory.
WORKDIR /work
# Define commonly used JAVA_HOME variable
ENV JAVA_HOME /usr/lib/jvm/java-8-oracle
# JAVA PATH
ENV PATH /usr/lib/jvm/java-8-oracle/bin:$PATH
# Install maven
RUN apt-get -y update && \
DEBIAN_FRONTEND=noninteractive \
apt-get -y install
maven \
openssh-server \
git \
supervisor && \
apt-get clean && \
rm -rf /var/lib/apt/lists/*
# clone Spark
RUN git clone https://github.com/apache/spark.git
WORKDIR /work/spark
RUN mvn -DskipTests clean package
# clone and build zeppelin fork
ARG INCUBATOR_VER=unknown
RUN git clone https://github.com/apache/incubator-zeppelin.git
WORKDIR /work/incubator-zeppelin
RUN mvn clean package -Pspark-1.6 -Phadoop-2.6 -DskipTests
# Configure Supervisord
RUN mkdir -p var/log/supervisor
COPY conf/supervisord.conf /etc/supervisor/conf.d/supervisord.conf
# bash
RUN sed -i s#/home/git:/bin/false#/home/git:/bin/bash# /etc/passwd
EXPOSE 8080 8082
CMD ["/usr/bin/supervisord"]