In Python, you install application dependencies and (preferably) the application itself with the pip tool. When run during an image build, pip installs everything under /usr, so there is no straightforward way to copy the artifacts (the app and the dependencies installed by pip) into the next build stage.
The solution I came up with is to coerce pip to install everything into a dedicated directory. There are many ways of doing so, but in my experiments I found that installing with the --user flag and setting PYTHONUSERBASE appropriately is the most convenient way to install both Python libraries and app binaries (e.g. entrypoint scripts).
In the end it's quite straightforward, and I wonder why I couldn't find any formal guides on this.
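To make the mechanism concrete, here is a minimal Python sketch of what PYTHONUSERBASE changes. It starts a child interpreter with the variable set and asks the standard site module where the user base and the user site-packages directory live, which is where pip install --user places files. The helper name user_dirs_for is made up for this illustration; /pyroot is the directory used later in this post.

```python
import os
import subprocess
import sys


def user_dirs_for(base):
    """Return (user base, user site-packages) as seen with PYTHONUSERBASE=base."""
    # The site module computes these paths at interpreter startup,
    # so we query a fresh child process instead of the current one.
    env = dict(os.environ, PYTHONUSERBASE=base)
    probe = "import site; print(site.getuserbase()); print(site.getusersitepackages())"
    result = subprocess.run(
        [sys.executable, "-c", probe],
        env=env, capture_output=True, text=True, check=True,
    )
    user_base, user_site = result.stdout.splitlines()
    return user_base, user_site


if __name__ == "__main__":
    base, site_packages = user_dirs_for("/pyroot")
    print(base)
    print(site_packages)
```

On a Linux image the second path should be a versioned site-packages directory under /pyroot/lib, which is why copying $PYROOT/lib into the next stage is enough to carry the libraries over.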
One caveat I came across later: if packages are already installed system-wide as dependencies of pip/pipenv/setuptools, pip will not reinstall them under /pyroot (the dedicated directory used below), so they will be missing from the production image. This is the reason for using the --ignore-installed flag.
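One way to spot this problem in the builder stage is to check where a module would actually be imported from. The following is only a sketch: the function name installed_under and the module names in the loop are illustrative assumptions, not part of the sample project.

```python
import importlib.util
import os


def installed_under(module_name, root):
    """Return True if module_name would be imported from a file below root."""
    spec = importlib.util.find_spec(module_name)
    # Missing, built-in and namespace modules have no single file location.
    if spec is None or not spec.has_location or spec.origin is None:
        return False
    root = os.path.realpath(root)
    origin = os.path.realpath(spec.origin)
    return os.path.commonpath([root, origin]) == root


if __name__ == "__main__":
    # Hypothetical list: put your own top-level dependencies here.
    for name in ["setuptools", "Crypto"]:
        print(name, installed_under(name, "/pyroot"))
```

A dependency that reports False here resolves from the system location (or not at all), so it would not be copied along with $PYROOT/lib.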
Without further ado, let's see how it can be done.
Setup
Let's use a sample Python Hello World project that contains a proper setup.py to install both the app's libs and the entrypoint script.
Note: I urge you to use setup.py even if you don't plan to distribute your app. Simply copying your Python sources into a Docker image will eventually break: you may end up copying __pycache__ directories, tests, test fixtures, etc. Having a working setup.py makes it easy to use your app as an installable component in other apps/images.
Let's set up our test environment:
git clone git@github.com:haizaar/python-helloworld.git
cd python-helloworld/
# Add an artificial requirement to make the example more realistic
echo pycrypto==2.6.1 > requirements.txt
The Dockerfile
All the "magic" happens below. I've added inline comments to make it easier to read.
FROM alpine:3.8 AS builder
ENV LANG C.UTF-8
# This is our runtime
RUN apk add --no-cache python3
RUN ln -sf /usr/bin/pip3 /usr/bin/pip
RUN ln -sf /usr/bin/python3 /usr/bin/python
# This is dev runtime
RUN apk add --no-cache --virtual .build-deps build-base python3-dev
# Using latest versions, but pinning them
RUN pip install --upgrade pip==19.0.1
RUN pip install --upgrade setuptools==40.4.1
# This is where pip will install to
ENV PYROOT /pyroot
# A convenience to have console_scripts in PATH
ENV PATH $PYROOT/bin:$PATH
ENV PYTHONUSERBASE $PYROOT
# THE MAIN COURSE #
WORKDIR /build
# Install dependencies
COPY requirements.txt ./
RUN pip install --user --ignore-installed -r requirements.txt
# Install our application
COPY . ./
RUN pip install --user .
####################
# Production image #
####################
FROM alpine:3.8 AS prod
# This is our runtime, again
# It would be better to refactor this into a separate base image to avoid duplicating instructions
RUN apk add --no-cache python3
RUN ln -sf /usr/bin/pip3 /usr/bin/pip
RUN ln -sf /usr/bin/python3 /usr/bin/python
ENV PYROOT /pyroot
ENV PATH $PYROOT/bin:$PATH
ENV PYTHONPATH $PYROOT/lib/python
# This is crucial for pkg_resources to work
ENV PYTHONUSERBASE $PYROOT
# Finally, copy artifacts
COPY --from=builder $PYROOT/lib/ $PYROOT/lib/
# In most cases we don't need entry points provided by other libraries
COPY --from=builder $PYROOT/bin/helloworld_in_python $PYROOT/bin/
CMD ["helloworld_in_python"]
Let's see that it works:
$ docker build -t pyhello .
$ docker run --rm -ti pyhello
Hello, world
As I mentioned before, it's really straightforward. So far I've packaged one of our real apps with this approach, and it works well.
Using pipenv?
If you use pipenv, which I like a lot, you can happily apply the same approach. It's a bit tricky to coerce pipenv to install into a separate directory, but the following does the trick:
# THE MAIN COURSE #
WORKDIR /build
# Install dependencies
COPY Pipfile Pipfile.lock ./
# --ignore-installed is vital to re-install packages that are already present
# (e.g. brought by pipenv dependencies) into $PYROOT
# We end up using pip for the actual install because of https://github.com/pypa/pipenv/issues/4453
RUN set -ex && \
export HOME=/tmp && \
pipenv lock -r | pip install --user --ignore-installed -r /dev/stdin
# Install our application
COPY . ./
RUN pip install --user .
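The pipe above relies on pipenv lock -r printing the locked dependencies in requirements.txt format, which pip then reads from /dev/stdin. As a rough illustration of that conversion (not pipenv's actual implementation), the sketch below turns the "default" section of a parsed Pipfile.lock into requirement lines. The function name lock_to_requirements and the sample data are invented for this example, and the sketch deliberately ignores hashes, extras, custom indexes and VCS or editable entries.

```python
import json


def lock_to_requirements(lock, section="default"):
    """Turn one section of a parsed Pipfile.lock into requirements.txt-style lines."""
    lines = []
    for name, entry in sorted(lock.get(section, {}).items()):
        version = entry.get("version")
        if version is None:
            # VCS, path or editable entries carry no pinned version; skipped in this sketch.
            continue
        line = name + version  # the lock file stores versions as e.g. "==2.6.1"
        if entry.get("markers"):
            line += "; " + entry["markers"]
        lines.append(line)
    return lines


if __name__ == "__main__":
    sample = json.loads('{"default": {"pycrypto": {"version": "==2.6.1"}}}')
    print("\n".join(lock_to_requirements(sample)))
```

For the sample data this prints pycrypto==2.6.1, the same line that was written to requirements.txt in the setup step.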
Thanks. This was very helpful for figuring out how to make Docker work with Pipenv. I do want to point out two errors (I think) in the pipenv instructions:
Pipefile.lock should be Pipfile.lock (it took me a while to figure out why it kept saying the file didn't exist).
Missing "RUN" command before pipenv install:
RUN PIP_USER=1 PIP_IGNORE_INSTALLED=1 pipenv install --system --deploy
Thanks for the remarks - fixed! I'm glad to hear it helped someone. I surely missed such a guide when I was digging into it myself originally.
I think this is the first time I have ever commented on an article, and I just wanted to say that it is awesome!
I'm happy it was helpful and thanks for the feedback.
Hi, can you provide a full example for pipenv? I can't get it working.
What error are you getting exactly?
Recent pipenv versions broke support for the PIP_IGNORE_INSTALLED env var. I've fixed the example to address that.