PEP: 252
Title: Making Types Look More Like Classes
Version: $Revision$
Author: guido@python.org (Guido van Rossum)
Status: Draft
Type: Standards Track
Python-Version: 2.2
Created: 19-Apr-2001
Post-History:

Abstract

    This PEP proposes changes to the introspection API for types that
    make them look more like classes. For example, type(x) will be
    equivalent to x.__class__ for most built-in types. When C is
    x.__class__, x.meth(a) will be equivalent to C.meth(x, a), and
    C.__dict__ contains descriptors for x's methods and other
    attributes.

    The PEP also introduces a new approach to specifying attributes,
    using attribute descriptors, or descriptors for short.
    Descriptors unify and generalize several different common
    mechanisms used for describing attributes: a descriptor can
    describe a method, a typed field in the object structure, or a
    generalized attribute represented by getter and setter functions.

Introduction

    One of Python's oldest language warts is the difference between
    classes and types. For example, you can't directly subclass the
    dictionary type, and the introspection interface for finding out
    what methods and instance variables an object has is different for
    types and for classes.

    Healing the class/type split is a big effort, because it affects
    many aspects of how Python is implemented. This PEP concerns
    itself with making the introspection API for types look the same
    as that for classes. Other PEPs will propose making classes look
    more like types, and subclassing from built-in types; these topics
    are not on the table for this PEP.

Introspection APIs

    Introspection concerns itself with finding out what attributes an
    object has. Python's very general getattr/setattr API makes it
    impossible to guarantee that there always is a way to get a list
    of all attributes supported by a specific object, but in practice
    two conventions have appeared that together work for almost all
    objects.
    I'll call them the class-based introspection API and the
    type-based introspection API; class API and type API for short.

    The class-based introspection API is used primarily for class
    instances; it is also used by Jim Fulton's ExtensionClasses. It
    assumes that all data attributes of an object x are stored in the
    dictionary x.__dict__, and that all methods and class variables
    can be found by inspection of x's class, written as x.__class__.
    Classes have a __dict__ attribute, which yields a dictionary
    containing methods and class variables defined by the class
    itself, and a __bases__ attribute, which is a tuple of base
    classes that must be inspected recursively. Some assumptions here
    are:

    - attributes defined in the instance dict override attributes
      defined by the object's class;

    - attributes defined in a derived class override attributes
      defined in a base class;

    - attributes in an earlier base class (meaning occurring earlier
      in __bases__) override attributes in a later base class.

    (The last two rules together are often summarized as the
    left-to-right, depth-first rule for attribute search.)

    The type-based introspection API is supported in one form or
    another by most built-in objects. It uses two special attributes,
    __members__ and __methods__. The __methods__ attribute, if
    present, is a list of method names supported by the object. The
    __members__ attribute, if present, is a list of data attribute
    names supported by the object.

    The type API is sometimes combined with a __dict__ that works the
    same way as for instances (e.g., for function objects in Python
    2.1, f.__dict__ contains f's dynamic attributes, while
    f.__members__ lists the names of f's statically defined
    attributes).

    Some caution must be exercised: some objects don't list their
    "intrinsic" attributes (e.g.
    __dict__ and __doc__) in __members__, while others do; sometimes
    attribute names occur both in __members__ or __methods__ and as
    keys in __dict__, in which case it's anybody's guess whether the
    value found in __dict__ is used or not.

    The type API has never been carefully specified. It is part of
    Python folklore, and most third party extensions support it
    because they follow examples that support it. Also, any type that
    uses Py_FindMethod() and/or PyMember_Get() in its tp_getattr
    handler supports it, because these two functions special-case the
    attribute names __methods__ and __members__, respectively.

    Jim Fulton's ExtensionClasses ignore the type API, and instead
    emulate the class API, which is more powerful. In this PEP, I
    propose to phase out the type API in favor of supporting the class
    API for all types.

    One argument in favor of the class API is that it doesn't require
    you to create an instance in order to find out which attributes a
    type supports; this in turn is useful for documentation
    processors. For example, the socket module exports the SocketType
    object, but this currently doesn't tell us what methods are
    defined on socket objects. Using the class API, SocketType shows
    us exactly what the methods for socket objects are, and we can
    even extract their docstrings, without creating a socket. (Since
    this is a C extension module, the source-scanning approach to
    docstring extraction isn't feasible in this case.)

Specification of the class-based introspection API

    Objects may have two kinds of attributes: static and dynamic. The
    names and sometimes other properties of static attributes are
    knowable by inspection of the object's type or class, which is
    accessible through obj.__class__ or type(obj). (I'm using type
    and class interchangeably, because that's the goal of the
    exercise.)

    (XXX static and dynamic are lousy names, because the "static"
    attributes may actually behave quite dynamically.)
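
    As a rough illustration of the argument for the class API made
    above, the following sketch lists the methods of a built-in type
    and the first line of each docstring without creating an instance
    to inspect. It assumes a Python 3 interpreter in which built-in
    types already expose a __dict__ in the way this PEP proposes; the
    helper name public_methods is hypothetical and not part of the
    proposal, and it looks only at the type's own __dict__, not at
    its bases.

```python
# Sketch only: assumes a Python 3 interpreter where built-in types
# expose __dict__ and where type(x) agrees with x.__class__.

def public_methods(tp):
    """Map public callable names in tp.__dict__ to a one-line summary.

    Hypothetical helper: it inspects tp.__dict__ only, not the bases.
    """
    result = {}
    for name, descr in tp.__dict__.items():
        if name.startswith('_') or not callable(descr):
            continue
        doc = (descr.__doc__ or '').strip()
        # Keep the first docstring line, or an empty string if none.
        result[name] = doc.splitlines()[0] if doc else ''
    return result


x = []
# For a built-in type, both spellings name the same meta-object.
assert type(x) is x.__class__

# No list instance is needed to discover the methods of lists.
methods = public_methods(list)
for name in sorted(methods):
    print(name, '--', methods[name])
```

    The same call works for any type that exposes a __dict__, which
    is what makes the class API attractive for documentation tools.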

    The names and values of dynamic attributes are typically stored in
    a dictionary, and this dictionary is typically accessible as
    obj.__dict__. The rest of this specification is more concerned
    with discovering the names and properties of static attributes
    than with dynamic attributes.

    Examples of dynamic attributes are instance variables of class
    instances, module attributes, etc. Examples of static attributes
    are the methods of built-in objects like lists and dictionaries,
    and the attributes of frame and code objects (c.co_code,
    c.co_filename, etc.). When an object with dynamic attributes
    exposes these through its __dict__ attribute, __dict__ is a static
    attribute.

    In the discussion below, I distinguish two kinds of objects:
    regular objects (e.g. lists, ints, functions) and meta-objects.
    Meta-objects are types and classes. Meta-objects are also regular
    objects, but we're mostly interested in them because they are
    referenced by the __class__ attribute of regular objects (or by
    the __bases__ attribute of meta-objects).

    The class introspection API consists of the following elements:

    - the __class__ and __dict__ attributes on regular objects;

    - the __bases__ and __dict__ attributes on meta-objects;

    - precedence rules;

    - attribute descriptors.

    1. The __dict__ attribute on regular objects

       A regular object may have a __dict__ attribute. If it does,
       this should be a mapping (not necessarily a dictionary)
       supporting at least __getitem__, keys(), and has_key(). This
       gives the dynamic attributes of the object. The keys in the
       mapping give attribute names, and the corresponding values give
       their values.

       Typically, the value of an attribute with a given name is the
       same object as the value corresponding to that name as a key in
       the __dict__. In other words, obj.__dict__['spam'] is obj.spam.
       (But see the precedence rules below; a static attribute with
       the same name *may* override the dictionary item.)

    2.
       The __class__ attribute on regular objects

       A regular object may have a __class__ attribute. If it does,
       this references a meta-object. A meta-object can define static
       attributes for the regular object whose __class__ it is.

    3. The __dict__ attribute on meta-objects

       A meta-object may have a __dict__ attribute, of the same form
       as the __dict__ attribute for regular objects (mapping, etc).
       If it does, the keys of the meta-object's __dict__ are names of
       static attributes for the corresponding regular object. The
       values are attribute descriptors; we'll explain these later.
       (An unbound method is a special case of an attribute
       descriptor.)

       Because a meta-object is also a regular object, the items in a
       meta-object's __dict__ correspond to attributes of the
       meta-object; however, some transformation may be applied, and
       bases (see below) may define additional dynamic attributes. In
       other words, mobj.spam is not always mobj.__dict__['spam'].
       (This rule contains a loophole because for classes, if
       C.__dict__['spam'] is a function, C.spam is an unbound method
       object.)

    4. The __bases__ attribute on meta-objects

       A meta-object may have a __bases__ attribute. If it does, this
       should be a sequence (not necessarily a tuple) of other
       meta-objects, the bases. An absent __bases__ is equivalent to
       an empty sequence of bases. There must never be a cycle in the
       relationship between meta-objects defined by __bases__
       attributes; in other words, the __bases__ attributes define an
       inheritance tree, where the root of the tree is the __class__
       attribute of a regular object, and the leaves of the tree are
       meta-objects without bases. The __dict__ attributes of the
       meta-objects in the inheritance tree supply attribute
       descriptors for the regular object whose __class__ is at the
       top of the inheritance tree.

    5. Precedence rules

       When two meta-objects in the inheritance tree both define an
       attribute descriptor with the same name, the left-to-right
       depth-first rule applies. (XXX define rigorously.)
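
       One possible reading of the left-to-right depth-first rule,
       sketched in Python. This is an illustration under stated
       assumptions, not the rigorous definition the XXX note asks
       for: the function name find_descriptor and the classes A, B,
       C and D are hypothetical, and the search uses only the
       __dict__ and __bases__ attributes described in items 3 and 4.

```python
def find_descriptor(meta, name):
    """Look up name in meta.__dict__, then in each base, depth first.

    Hypothetical helper. An absent __bases__ is treated as an empty
    sequence of bases, as item 4 specifies.
    """
    namespace = getattr(meta, '__dict__', {})
    if name in namespace:
        return namespace[name]
    for base in getattr(meta, '__bases__', ()):
        try:
            # Exhaust the leftmost base (and its bases) before
            # moving on to the next base.
            return find_descriptor(base, name)
        except AttributeError:
            continue
    raise AttributeError(name)


class A:
    def spam(self):
        return 'A.spam'

    def eggs(self):
        return 'A.eggs'


class B(A):
    pass


class C(A):
    def spam(self):
        return 'C.spam'


class D(B, C):
    pass


# Search order under this rule is D, B, A, ..., so A's definition of
# spam is found before C's.
assert find_descriptor(D, 'spam') is A.__dict__['spam']
assert find_descriptor(D, 'eggs') is A.__dict__['eggs']
```

       Note that current Python interpreters linearize the bases
       differently, so D().spam() there resolves to C.spam; the
       sketch follows the rule as worded in this draft.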

       When a dynamic attribute (one defined in a regular object's
       __dict__) has the same name as a static attribute (one defined
       by a meta-object in the inheritance tree rooted at the regular
       object's __class__), the dynamic attribute *usually* wins, but
       for some attributes the meta-object may specify that the static
       attribute overrides the dynamic attribute.

       (We can't have a simple rule like "static overrides dynamic" or
       "dynamic overrides static", because some static attributes
       indeed override dynamic attributes, e.g. a key '__class__' in
       an instance's __dict__ is ignored in favor of the statically
       defined __class__ pointer, but on the other hand most keys in
       inst.__dict__ override attributes defined in inst.__class__.
       The mechanism whereby a meta-object can specify that a
       particular attribute has precedence is not yet specified.)

    6. Attribute descriptors

       This is where it gets interesting -- and messy. Attribute
       descriptors (descriptors for short) are stored in the
       meta-object's __dict__, and have two uses: a descriptor can be
       used to get or set the corresponding attribute value on the
       (non-meta) object, and it has an additional interface that
       describes the attribute for documentation or introspection
       purposes.

       There is little prior art in Python for designing the
       descriptor's interface, either for getting/setting the value or
       for describing the attribute otherwise, except for some trivial
       properties (e.g. it's reasonable to assume that __name__ and
       __doc__ should be the attribute's name and docstring). I will
       propose such an API below.

       If an object found in the meta-object's __dict__ is not an
       attribute descriptor, backward compatibility dictates its
       semantics. This basically means that if it is a Python function
       or an unbound method, the attribute is a method; otherwise, it
       is the default value for a data attribute.
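
       The precedence between dynamic and static attributes, and the
       backward-compatible treatment of plain objects found in a
       class __dict__, can be observed with an ordinary class. This
       is a small sketch, assuming a Python 3 interpreter; the class
       C and its attribute names are made up for the illustration.

```python
class C:
    default = 42            # not a function: default for a data attribute

    def meth(self):         # a function: the attribute is a method
        return 'from class'


x = C()

# With an empty instance __dict__, both names are found via the class.
assert x.default == 42
assert x.meth() == 'from class'

# Most keys in the instance __dict__ override what the class defines.
x.__dict__['default'] = 'instance value'
x.__dict__['meth'] = 'shadowed'
assert x.default == 'instance value'
assert x.meth == 'shadowed'

# A '__class__' key in the instance __dict__ is ignored in favor of
# the statically defined __class__ pointer.
x.__dict__['__class__'] = int
assert x.__class__ is C

# The class itself is unaffected by the per-instance entries.
assert C().meth() == 'from class'
```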

       Backwards compatibility also dictates that (in the absence of a
       __setattr__ method) it is legal to assign to an attribute that
       corresponds to a method, and that this creates a data attribute
       shadowing the method for this particular instance. However,
       these semantics are only required for backwards compatibility
       with regular classes.

    The introspection API is a read-only API. We don't define the
    effect of assignment to any of the special attributes (__dict__,
    __class__ and __bases__), nor the effect of assignment to the
    items of a __dict__. Generally, such assignments should be
    considered off-limits. An extension of this PEP may define some
    semantics for some such assignments. (Especially because
    currently instances support assignment to __class__ and __dict__,
    and classes support assignment to __bases__ and __dict__.)

Specification of the attribute descriptor API

    Attribute descriptors have the following attributes. In the
    examples, x is an object, C is x.__class__, x.meth() is a method,
    and x.ivar is a data attribute or instance variable.

    - name: the original attribute name. Note that because of
      aliasing and renaming, the attribute may be known under a
      different name, but this is the name under which it was born.
      Example: C.meth.name == 'meth'.

    - doc: the attribute's documentation string.

    - objclass: the class that declared this attribute. The
      descriptor only applies to objects that are instances of this
      class (this includes instances of its subclasses). Example:
      C.meth.objclass is C.

    - kind: either "method" or "data". This distinguishes between
      methods and data attributes. The primary operation on a method
      attribute is to call it. The primary operations on a data
      attribute are to get and to set it. Example: C.meth.kind ==
      'method'; C.ivar.kind == 'data'.

    - default: for optional data attributes, this gives a default or
      initial value.
      XXX Python has two kinds of semantics for referencing "absent"
      attributes: this may raise an AttributeError, or it may produce
      a default value stored somewhere in the class. There could be a
      flag that distinguishes between these two cases. Also, there
      could be a flag that tells whether it's OK to delete an
      attribute (and what happens then -- a default value takes its
      place, or it's truly gone).

    - attrclass: for data attributes, this can be the class of the
      attribute value, or None. If this is not None, the attribute
      value is restricted to being an instance of this class (or of a
      subclass thereof). If this is None, the attribute value is not
      constrained. For method attributes, this should normally be
      None (a class is not sufficient information to describe a method
      signature). If and when optional static typing is added to
      Python, the meaning of this attribute may change to describe the
      type of the attribute.

    - signature: for methods, an object that describes the signature
      of the method. Signature objects will be described further
      below.

    - readonly: Boolean indicating whether assignment to this
      attribute is disallowed. This is usually true for methods.
      Example: C.meth.readonly == 1; C.ivar.readonly == 0.

    - get(): a function of one argument that retrieves the attribute
      value from an object. Examples: C.ivar.get(x) ~~ x.ivar;
      C.meth.get(x) ~~ x.meth.

    - set(): a function of two arguments that sets the attribute value
      on the object. If readonly is set, this method raises a
      TypeError exception. Example: C.ivar.set(x, y) ~~ x.ivar = y.

    - call(): for method descriptors, this is a function of at least
      one argument that calls the method. The first argument is the
      object whose method is called; the remaining arguments
      (including keyword arguments) are passed on to the method.
      Example: C.meth.call(x, 1, 2) ~~ x.meth(1, 2).

    - bind(): for method descriptors, this is a function of one
      argument that returns a "bound method object".
      This in turn can be called exactly like the method should be
      called (in fact this is what is returned for a bound method).
      This is the same as get(). Example: C.meth.bind(x) ~~ x.meth.

    For convenience, __name__ and __doc__ are defined as aliases for
    name and doc. Also for convenience, calling the descriptor can do
    one of three things:

    - Calling a method descriptor is the same as calling its call()
      method. Example: C.meth(x, 1, 2) ~~ x.meth(1, 2).

    - Calling a data descriptor with one argument is the same as
      calling its get() method. Example: C.ivar(x) ~~ x.ivar.

    - Calling a data descriptor with two arguments is the same as
      calling its set() method. Example: C.ivar(x, y) ~~ x.ivar = y.

    Note that this specification does not define how to create
    specific attribute descriptors. This is up to the individual
    attribute descriptor implementations, of which there may be many.

Specification of the signature object API

    XXX

Discussion

    XXX

Examples

    XXX

Backwards compatibility

    XXX

Compatibility of C API

    XXX

Warnings and Errors

    XXX

Implementation

    A partial implementation of this PEP is available from CVS as a
    branch named "descr-branch". To experiment with this
    implementation, proceed to check out Python from CVS according to
    the instructions at http://sourceforge.net/cvs/?group_id=5470 but
    add the arguments "-r descr-branch" to the cvs checkout command.
    (You can also start with an existing checkout and do "cvs update
    -r descr-branch".) For some examples of the features described
    here, see the file Lib/test/test_descr.py.

    Note: the code in this branch goes beyond this PEP; it is also on
    the way to implementing PEP 253 (Subtyping Built-in Types).

References

    XXX

Copyright

    This document has been placed in the public domain.


Local Variables:
mode: indented-text
indent-tabs-mode: nil
End: