Introduction to Packaging¶
This document describes the current state of packaging in Python using Distribution Utilities (“Distutils”) and its extensions from the end-user’s point-of-view, describing how to extend the capabilities of a standard Python installation by building packages and installing third-party packages, modules and extensions.
Python is known for it’s “batteries included” philosophy and has a rich standard library. However, being a popular language, the number of third party packages is much larger than the number of standard library packages. So it eventually becomes necessary to discover how packages are used, found and created in Python.
It can be tedious to manually install extra packages that one needs or requires. Therefore, Python has a packaging system that allows people to distribute their programs and libraries in a standard format that makes it easy to install and use them. In addition to distributing a package, Python also provides a central service for contributing packages. This service is known as The Python Package Index (PyPI). Information about The Python Package Index (PyPI) will be provided throughout this documentation. This allows a developer to distribute a package to the greater community with little effort.
This documentation aims to explain how to install packages and create packages for the end-user, while still providing references to advanced topics.
The Packaging Ecosystem¶
A package is simply a directory with an
__init__.py file inside it. For example:
$ mkdir mypackage $ cd mypackage $ touch __init__.py $ echo "# A Package" > __init__.py $ cd ..
This creates a package that can be imported using the
>>> import mypackage >>> mypackage.__file__ 'mypackage/__init__.py'
Discovering a Python Package¶
Using packages in the current working directory only works for
small projects in most cases. Using the working directory as a package location
usually becomes a problem when distributing packages for larger systems.
distutils was created to install packages into the
PYTHONPATH with little difficulty. The
sys.path in code, is a list of locations to look for Python packages.
>>> import sys >>> sys.path ['', '/usr/local/lib/python2.6', '/usr/local/lib/python2.6/site-packages', ...] >>> import mypackage >>> mypackage.__file__ 'mypackage/__init__.py'
The first value, the null or empty string, in
sys.path is the current
working directory, which is what allows the packages in the current working
directory to be found.
PYTHONPATH values will likely be different from those
Explicitly Including a Package Location¶
The convention way of manually installing packages is to put them in the
.../site-packages/ directory. But one may want to install Python
modules into some arbitrary directory. For example, your site may have a
convention of keeping all software related to the web server application under
/www. Add-on Python modules might then belong in
/www/pythonx.y/, and in order to import them, this directory must be
sys.path. There are several different ways to add the directory.
TODO Better define where the
.../site-packages/ directory is
The most convenient way is to add a path configuration file to a directory
that’s already in Python’s path, which could be the
directory. Path configuration files have an extension of
.pth, and each
line must contain a single path that will be appended to
the new paths are appended to
sys.path, modules in the added directories
will not override standard modules. This means you can’t use this mechanism for
installing fixed versions of standard modules.)
Paths can be absolute or relative, in which case they’re relative to the
directory containing the
.pth file. See the documentation of
site module for more information.
In addition there are two environment variables that can modify
PYTHONHOME sets an alternate value for the prefix of the Python
installation. For example, if
PYTHONHOME is set to
/www/python/lib/python2.6/, the search path will be set to
PYTHONPATH variable can be set to a list of paths that will be
added to the beginning of
sys.path. For example, if
/www/python:/opt/py, the search path will begin with
['', '/www/python', '/opt/py', ...].
Directories must exist in order to be added to
site module removes paths that don’t exist.
sys.path is just a regular Python list, so any Python application
can modify it by adding or removing entries.
The zc.buildout package
modifies the sys.path in order to include all packages relative to a
zc.buildout package is often used to build large projects
that have external build requirements.
Python file layout¶
A Python installation has a
site-packages directory inside the
module directory. This directory is where user installed packages are
.pth file in this directory is maintained which
contains paths to the directories where the extra packages are
For details on the
.pth file, please refer to modifying Python’s
In short, when a new package is installed using
distutils or one of
its extenders, the contents of the package are dropped into
site-packages directory and then the name of the new package
directory is added to a
.pth file. This allows Python upon the next
startup to see the new package.
Benefits of packaging¶
While it’s possible to unpack tarballs and manually put them into your Python installation directories (see Explicitly Including a Package Location), using a package management system gives you some significant benefits. Here are some of the obvious ones:
- Dependency management
- Often, the package you want to install requires that others be there. A
package management system can automatically resolve dependencies and make
your installation pain free and quick. This is one of the basic facilities
distutils. However, other extensions to
distutilsdo a better job of installing dependencies. (see Distribute)
- Package managers can maintain lists of things installed and other metadata like the version installed etc. which makes is easy for the user to know what are the things his system has. (see Pip Installs Python (Pip))
- Package managers can give you push button ways of removing a package from your environment. (see Pip Installs Python (Pip))
- Find packages by searching a package index for specific terminology. (see Pip Installs Python (Pip))
Current State of Packaging¶
distutils modules is part of the standard library and will be
until Python 3.3. The
distutils module will be discontinued in Python
distutils2 (note the number two) will be backwards compatible
for Python 2.4 onward; and will be part of the standard library in
distutils module provides the basics for packaging Python.
distutils module is riddled with problems, which is
why a small group of python developers are working on
distutils2 is complete it is recommended that the
Developer either use pure
distutils or the `Distribute package
<distribute_info_>`_ for packaging Python software.
In the mean time, if a package requires the
setuptools package, it is our
recommendation that you install the Distribute package, which provides a more
up to date version of
setuptools than does the original Setuptools package.
In the future
distutils2 will replace
distutils, which will also remove the need for Distribute. And as stated before
distutils will be removed from
the standard library. For more information, please refer to the
Future of Packaging.
Please use the Distribute package rather than the Setuptools package because there are problems in this package that can and will not be fixed.
Creating a Micro-Ecosystem with
Here we have a small digression to briefly discuss Virtual Environments, which will be covered later in this guide. In most situations,
site-packages directory is part of the system Python installation and
not writable by unprivileged users. Also, it’s useful to have a solid reliable
installation of the language which we can use. Experimental packages shouldn’t
be mixed with the stable ones if we want to keep this quality. In order to
achieve this, most Python developers use the virtualenv
package which allows people to create a virtual installation of Python. This
site-packages directory in an user writable area. The
site-packages directory located in the Virtual Environments is in addition to
the global one. While orthogonal to the whole package installation process, it’s
an extremely useful and natural way to work and so the whole thing will be
mentioned again. The installation and usage of
virtualenv is covered in
Virtual Environments document.