PEP: 376
Title: Changing the .egg-info structure
Version: $Revision$
Last-Modified: $Date$
Author: Tarek Ziadé
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 22-Feb-2009
Python-Version: 2.7, 3.1
Post-History:


Abstract
========

This PEP proposes various enhancements for Distutils:

- A new format for the .egg-info structure.
- An install script to install a package in Python.
- An uninstall script to uninstall a package in Python.

Rationale
=========

There are three problems right now in the way packages are installed
in Python:

- There are too many ways to install a package in Python.
- There is no way to uninstall a package.
- There is no API to get the metadata of installed packages.

How packages are installed
--------------------------

Right now, when a package is installed in Python using the Distutils
`install` command, the `install_egg_info` subcommand is called to
create an `.egg-info` file in the site-packages directory, right
beside the package itself.

For example, if the `zlib` package is installed, two elements will be
installed in `site-packages`::

    - zlib
    - zlib-2.5.2-py2.6.egg-info

Where `zlib` is the package itself, and `zlib-2.5.2-py2.6.egg-info` is
a file containing the package metadata as described in PEP 314. This
file corresponds to the file called `PKG-INFO`, built by the `sdist`
command.

The problem is that many people use `easy_install` (setuptools) or
`pip` to install their packages, and these third-party tools do not
install packages in the same way that Distutils does:

- `easy_install` creates an `EGG-INFO` directory inside an `.egg`
  directory, and adds a `PKG-INFO` file inside this directory, amongst
  other files.
- `pip` creates an `.egg-info` directory inside the site-packages
  directory beside the package, and adds a `PKG-INFO` file inside it.

They both add other files in the `EGG-INFO` or `.egg-info` directory,
and create or modify `.pth` files.
`pip` also creates one `.pth` file per installed package, which may
slow down Python's initialisation.

The uninstall command
---------------------

Python doesn't provide any `uninstall` command. To uninstall a
package, you have to be a power user: remove the package directory
from the right site-packages directory, then check the right `.pth`
files. And this method differs depending on the tools you are using.

The worst issue is that you depend on the way the packager created
the package. When you call `python setup.py install`, the package is
not installed the same way depending on the tool the packager used
(distutils or setuptools).

But there is common behavior: files are copied into your
installation. And there is a way to keep track of these files, so
that they can be removed.

Installing a package
--------------------

There are too many different ways to install a package in Python:

- by hand, by getting a distribution and running the install command
- using `easy_install`, the script provided by setuptools
- using `pip`

The problem is that they do not install the package the same way, and
Python should provide one and only one way to do it.

What this PEP proposes
----------------------

To address those issues, this PEP proposes a few changes:

- a new `.egg-info` structure using a directory;
- a list of elements this directory holds;
- some new functions in `pkgutil`;
- the addition of an install and an uninstall script.

.egg-info becomes a directory
=============================

The first change would be to make `.egg-info` a directory and let it
hold the `PKG-INFO` file built by the `write_pkg_file` method.

This change will not impact Python itself, because this file is not
used anywhere yet in the standard library, so no deprecation is
needed.

It will impact the `setuptools` and `pip` projects, but since they
already work with a directory that contains a `PKG-INFO` file, the
change will be small.
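
What this means for a tool that reads metadata can be sketched as
follows. This is only an illustration under stated assumptions, not
part of the proposal: the helper name `read_pkg_info` is hypothetical,
and it assumes `PKG-INFO` uses the RFC 822-style header format of
PEP 314, so that the standard `email` parser can read it. The helper
accepts either the current single-file `.egg-info` or the proposed
directory layout.

```python
import io
import os
from email.parser import Parser


def read_pkg_info(egg_info_path):
    """Return the PKG-INFO headers for an .egg-info file or directory.

    Hypothetical helper for illustration; not a Distutils API.
    """
    if os.path.isdir(egg_info_path):
        # Proposed layout: metadata lives in <name>.egg-info/PKG-INFO.
        pkg_info = os.path.join(egg_info_path, 'PKG-INFO')
    else:
        # Current Distutils layout: the .egg-info path is the file itself.
        pkg_info = egg_info_path
    with io.open(pkg_info, encoding='utf-8') as f:
        # Only the header block is needed to read fields such as Name.
        return Parser().parse(f, headersonly=True)
```

A caller can then read a field such as `read_pkg_info(path)['Version']`
without caring which of the two layouts is on disk.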

For example, if the `zlib` package is installed, two elements will be
installed in `site-packages`::

    - zlib
    - zlib-2.5.2-py2.6.egg-info/
        PKG-INFO

Implementing this change requires modifying the `install_egg_info`
command in Distutils.

Adding MANIFEST and RECORD in the .egg-info directory
=====================================================

Some files can be added inside the `.egg-info` directory at
installation time. They will all be UPPERCASE files.

- the `MANIFEST` file built by the `sdist` command. Notice that some
  fixes were made recently on the default file names added in
  `MANIFEST` when `MANIFEST.in` is not provided (see #2279 for
  instance).

- the `RECORD` file will hold the list of installed files. These
  correspond to the files listed by the `record` option of the
  `install` command, and will always be generated. This will allow
  uninstall, as explained later in this PEP.

The `install` command will record installed files in the `RECORD`
file by default.

The `sdist` module will introduce an `EGG_INFO_FILES` constant to list
all files located in the `.egg-info` directory::

    from collections import namedtuple

    EggInfos = namedtuple('EggInfos', 'manifest record pkg_info')

    # files added in egg-info
    EGG_INFO_FILES = EggInfos('MANIFEST', 'RECORD', 'PKG-INFO')

Back to our `zlib` example, we will have::

    - zlib
    - zlib-2.5.2-py2.6.egg-info/
        PKG-INFO
        MANIFEST
        RECORD

XXX See if we want to keep the 2.5.2-py2.6 part

New functions in pkgutil
========================

To use the `.egg-info` directory content, we need to add a set of APIs
to the standard library. The best place to put these APIs seems to be
`pkgutil`.

The new functions added in the package are:

- get_egg_info(pkg_name) -> path or None

  Scans all site-packages directories and looks for all
  `pkg_name.egg-info` directories. Returns the directory path that
  contains a PKG-INFO that matches `pkg_name` for the `name` metadata.

  Notice that there should be at most one result.
  If more than one path matches `pkg_name`, a DistutilsError is
  raised.

  If the directory is not found, returns None.

- get_metadata(pkg_name) -> DistributionMetadata or None

  Uses `get_egg_info` to get the `PKG-INFO` file, and returns a
  `DistributionMetadata` instance that contains the metadata.
  This will require a small change in `DistributionMetadata` (see
  #4908).

- get_egg_info_file(pkg_name, filename) -> file object or None

  Uses `get_egg_info` and gets any file inside the directory, named
  by `filename`.

  `filename` can be any value found in
  `distutils.sdist.EGG_INFO_FILES`.

Let's use it with our `zlib` example::

    >>> from pkgutil import get_egg_info, get_metadata, get_egg_info_file
    >>> get_egg_info('zlib')
    '/opt/local/lib/python2.6/site-packages/zlib-2.5.2-py2.6.egg-info'
    >>> metadata = get_metadata('zlib')
    >>> metadata.version
    '2.5.2'
    >>> from distutils.sdist import EGG_INFO_FILES
    >>> get_egg_info_file('zlib', EGG_INFO_FILES.manifest).read()
    some
    ...
    files

Adding an install and an uninstall script
=========================================

`easy_install` and `pip` do basically the same work, besides other
features that are not discussed here:

- they look for the package at PyPI
- they download it and build it
- they install it inside the site-packages directory
- they add an entry in a .pth file

A new script called `install.py` is added in a new directory called
`scripts` in Distutils, and lets people run an installation using::

    $ python -m distutils.scripts.install zlib

An uninstall command is added as well. It removes the files recorded
in the `RECORD` file. The removal will warn about files that no longer
exist, and will not take care of side effects, such as the removal of
a file used by another element of the system.

XXX work to be done here: specification of the two commands
(probably a mix of easy_install and pip)

Copyright
=========

This document has been placed in the public domain.


..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End: