Tuesday, February 25, 2014

dh-virtualenv - the ultimate way of deploying python apps

It's been a while since I've had to deploy a complete Python application. This is not a web app, just a stand-alone Python daemon. It's not even online, so AWS Elastic Beanstalk, Heroku and similar platforms are not relevant.

There is the good old virtualenv + pip install -r requirements.txt way. But it requires custom scriptology to roll out new code, run before/after update scripts and do all the other tasks that we programmers like to leave to the "integration" stage of the project :). So I googled to check whether there is any good news on this subject.

And there is! - dh-virtualenv. IMHO it is as good as it gets if you use a Debian-based distro. In a nutshell, it wraps your whole virtualenv together with your code into a single deb package. So deployment boils down to:

  • Copy the resulting deb file to the target machine
  • Run sudo dpkg -i mypackage.deb
  • Run sudo apt-get install -f - this one automatically installs any missing dependencies of your package
That's it. No need for any custom install/upgrade/uninstall tinkering - just make sure your Debian package behaves as a good citizen. And the dpkg system already has tons of tools for system integration, like installing cron files and config files, creating directories, etc. If you have Python scripts installed through setup.py, then after installation you can run them just like this:
/usr/share/python/<YOUR PACKAGE NAME>/bin/myscript
And it will automatically use your venv's python and all your packages! Convenient, isn't it?
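
The three deployment steps above are short enough to type by hand, but they are also easy to script if you deploy often. Here is a minimal sketch in Python. It assumes ssh/scp access and passwordless sudo on the target; the host name, the /tmp staging directory, the -y flag and the helper names are my own placeholders, not something dh-virtualenv provides.

```python
import os
import posixpath
import shlex
import subprocess


def deploy_commands(deb_path, host, remote_dir="/tmp"):
    """Build the argv lists for copying and installing the deb on `host`."""
    remote_deb = posixpath.join(remote_dir, os.path.basename(deb_path))
    # dpkg -i exits non-zero when dependencies are missing, so both commands
    # are chained with ';' and apt-get install -f gets to finish the job.
    install = "sudo dpkg -i {0}; sudo apt-get install -f -y".format(
        shlex.quote(remote_deb)
    )
    return [
        ["scp", deb_path, "{0}:{1}".format(host, remote_deb)],
        ["ssh", host, install],
    ]


def deploy(deb_path, host, dry_run=True):
    """Print the commands (dry run) or execute them one by one."""
    for argv in deploy_commands(deb_path, host):
        if dry_run:
            print(" ".join(shlex.quote(arg) for arg in argv))
        else:
            subprocess.check_call(argv)
```

Calling deploy("mypackage.deb", "deploy@target") only prints what would be run; pass dry_run=False to actually execute it.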

Using dh-virtualenv is straightforward, especially if you have some background in Debian packaging. I just followed the tutorial. However, there are some quirks you'd better be aware of:

Caching pip downloads

Each time you run dpkg-buildpackage, dh-virtualenv will download all of the packages from your requirements.txt. If you have a dozen packages, that becomes really annoying and lengthens your build time significantly.

To make things worse, pip gets random Connection reset by peer errors. It's a lesser problem when installing packages interactively, because pip uses its cache and the install attempt usually succeeds on the second try. But if you hit this with dh-virtualenv, the next build will start from the beginning. I found myself running dpkg-buildpackage 6-7 times until the download succeeded - definitely not an option.

The cure is to run your own local, transparent PyPI mirror. I've used devpi - it's sort of a transparent caching proxy daemon for PyPI. Just follow its setup steps and you'll have it running in no time. The next step is to make dh-virtualenv aware of it. Here is my debian/rules that introduces dh-virtualenv to devpi:

#!/usr/bin/make -f
%:
	dh $@ --with python-virtualenv

override_dh_virtualenv:
	dh_virtualenv --pypi-url http://localhost:3141/root/pypi/

(Note that this is a Makefile: the indented lines must start with a tab character, not spaces!)
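
Before kicking off a long build, it can be handy to check that the devpi server is actually answering. A minimal sketch, assuming devpi listens on the same localhost:3141 address used in debian/rules above; index_is_up is a hypothetical helper of mine, not part of devpi or dh-virtualenv.

```python
import urllib.request


def index_is_up(url, timeout=5):
    """Return True if an HTTP GET on the index URL succeeds."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # Covers refused connections, timeouts and HTTP error statuses:
        # urllib's URLError and HTTPError both derive from OSError.
        return False
```

For example, index_is_up("http://localhost:3141/root/pypi/") can gate the call to dpkg-buildpackage in whatever build script you use.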

Build your package on a server with a similar Debian/Ubuntu version

For example, building your package on Ubuntu 13.10 and deploying it to Ubuntu 12.04 LTS would not work if your code uses C extensions (even through imported packages). This issue is general to virtualenv and is not specific to dh-virtualenv. For example, if you create a virtualenv and copy it to another, binary-incompatible server, then chances are that you'll be able to run python from the copied venv, but will NOT be able to import, say, ssl. This is because the python executable is embedded into the virtualenv, but the standard libraries are not; and of course importing a module compiled against another version of the interpreter would not work.
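
A cheap way to catch this kind of breakage early is an import smoke test run with the virtualenv's own interpreter right after installing the package on the target. The sketch below is my own; the module list is only an example and should name the compiled modules your application really depends on.

```python
import importlib


def failed_imports(module_names):
    """Try to import each module; return {name: error message} for failures."""
    failures = {}
    for name in module_names:
        try:
            importlib.import_module(name)
        except ImportError as exc:
            failures[name] = str(exc)
    return failures


def main():
    """Report broken imports and return a shell-style exit code."""
    # ssl is the example from the text; sqlite3 and zlib are further
    # C-backed standard modules picked purely for illustration.
    broken = failed_imports(["ssl", "sqlite3", "zlib"])
    for name, error in sorted(broken.items()):
        print("cannot import {0}: {1}".format(name, error))
    return 1 if broken else 0
```

Run it with the python binary that sits in the same bin directory as your installed scripts, so that it tests the deployed virtualenv and not the system interpreter.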

The problem does not end here - if you upgrade python on your production server (through apt) as part of a security upgrade or similar, it can break your virtualenv!

The bottom line - if you are going to deploy to a certain version of Debian/Ubuntu, then have a separate "builder" server with the same OS where you build your package, and note:

  • If you run apt-get upgrade - run it on both servers
  • If you upgrade python as part of the upgrade - better rebuild and redeploy your app
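
To notice when the two servers have drifted apart, you can record the interpreter version on the builder and compare it on the target. A minimal sketch with hypothetical helper names; it only compares what the interpreter reports about itself and says nothing about other shared libraries.

```python
import platform


def interpreter_fingerprint():
    """Describe the running interpreter: implementation, version, machine."""
    return " ".join([
        platform.python_implementation(),
        platform.python_version(),
        platform.machine(),
    ])


def fingerprints_match(built_with, running_on):
    """True when builder and target report the same interpreter."""
    return built_with.strip() == running_on.strip()
```

One way to use it is to write interpreter_fingerprint() into a file during the build, ship that file in the package, and compare it against a fresh fingerprint when the daemon starts.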

Hooking your own post-build commands

If you need custom operations performed before the deb file is packed, it may not be obvious where to hook them. Defining an override_dh_install target will probably not help, because it runs before dh_virtualenv - your Python files will simply not be installed yet. So the solution I've found is to hook them into the dh_virtualenv override:
#!/usr/bin/make -f
%:
	dh $@ --with python-virtualenv

override_dh_virtualenv:
	dh_virtualenv --pypi-url http://localhost:3141/root/pypi/
	# !! your commands go here !!

Final words

I think that this project is a big leap forward in deploying Python packages in the most convenient and dependency-managed manner. I hope you'll enjoy it as much as I do.