Securing Python
#####################################################################

Status
///////////////////////////////////////

+ Dangerous types (`Constructors`_) [done]

  - file

    * Create PyFile_Init() from file_init() [done]
    * Switch current C-level uses of 'file' constructor to use
      PyFile_Type.tp_new() and PyFile_Init(). [done]

      + built-in open() [done]
      + bz2 module [done]

    * Expose PyFile_Init() in objcap module so that file subclasses are
      actually worth something. [done]
    * Create PyFile_Safe*() version of C API that goes through open()
      built-in.

  - code [done]

    * Add objcap.code_new() function [done]

  - frame

    * Do not allow importing the 'sys' module to get to sys._getframe(),
      sys._current_frames(), or to set a trace or profile function. [done]

  - object() [done]

    * Remove object.__subclasses__ (`Mutable Shared State`_) [done]

+ Sandboxed versions of built-ins (`Sanitizing Built-In Types`_)

  - open()
  - __import__() / PEP 302 importer (`Imports`_) [done]

    * Make sure importing built-in modules can be blocked.
    * Make sure that no abilities are exposed by importers, since they will
      be accessible through the inherited sys data dict by any created
      interpreters.

      + Do not inject the full sys module.
      + Most likely will need to wrap the built-in importer so as to be able
        to effectively block access to sys.

    * Could make __import__ self-contained such that 'sys' is not directly
      referenced.

      + Allows hiding sys.path by storing it in the self-contained object.
      + Could expose other objects if desired.
      + Could make it so that there is only exposure of certain functions
        that allow removal from a whitelist but no additions.
      + Other things can be options that can only be turned off.
      + Importers could be made like this, with the one-way narrowing of
        abilities.

  - execfile()

    * Force it to go through open().

      + Prevents opening unauthorized files.
      + Prevents its use as a way to probe the filesystem.

    * Or just promote its removal.

  - exit()

    * Have SystemExit exit the process only if no other interpreters are
      running.

+ Filesystem path hiding (`Filesystem Information`_)
+ Tweaked stdlib modules

  - mini 'sys' module (`Making the ``sys`` Module Safe`_)
  - genericpath module (for os.path when C modules are blocked)
  - socket (`Safe Networking`_)
  - thread (only if worried about thread resource starvation,
    interrupt_main() not being per-interpreter, and stack_size() being
    considered dangerous)

+ Create sandboxed interpreter stdlib module

  - Be able to specify built-ins [done]
  - Set 'sys' module settings [done]
  - Set 'sys.modules' [done]
  - API

    * Python [done]
    * C

  - Securely handle exceptions being raised in the sub-interpreter [done]
  - Redirect output [done]
  - Inherit abilities provided by the creating interpreter ONLY

    * Only get importers as provided by the creating interpreter.
    * Only get built-ins as provided by the creating interpreter.

+ Tear out old restricted mode code.


Introduction
///////////////////////////////////////

As of Python 2.5, Python does not support any form of security model for executing arbitrary Python code in some form of protected interpreter.  While one can use such things as ``exec`` and ``eval`` to garner a very weak form of sandboxing, they do not provide any thorough protection against malicious code.  This should be rectified.

This document attempts to lay out what would be needed to secure Python in such a way as to allow arbitrary Python code to execute in a sandboxed interpreter without worry of that interpreter providing access to any resource of the operating system without being given explicit authority to do so.

Throughout this document several terms are going to be used.  A "sandboxed interpreter" is one where the built-in namespace is not the same as that of an interpreter whose built-ins were unaltered, which is called an "unprotected interpreter".  A "bare interpreter" is one where the built-in namespace has been stripped down to the bare minimum needed to run any form of basic Python program.  This means that only the atomic types (i.e., syntactically supported types), ``object``, and the exceptions provided by the ``exceptions`` module are in the built-in namespace.  There have also been no imports executed in the interpreter.

The "security domain" is the boundary at which security is cared about.  For this discussion, it is the interpreter.  Anything that happens within a security domain is considered open and unprotected.  But any action that tries to cross the boundary of the security domain is where the security model and its protections come in.

The "powerbox" is the thing that possesses the ultimate power in the system for giving out abilities.  In our case it is the Python process.  No interpreter can possess any ability that the overall process does not have.  It is up to the Python process to initially hand out abilities to interpreters, to use either for themselves or to give to interpreters they create themselves.  This means that we care about interpreter<->interpreter interactions along with interpreter<->process interactions.


Rationale
///////////////////////////////////////

Python is used extensively as an embedded language within existing programs.  These applications oftentimes need to let their users run Python code written by someone else, while being able to trust that no unintended harm will come to their system regardless of how much they trust the code being executed.  For instance, think of an application that supports a plug-in system with Python as the language used for writing plug-ins.  You do not want to have to examine every plug-in you download to make sure that it does not alter your filesystem, if you can help it.  With a proper security model and implementation in place, this hindrance of having to examine all code you execute should be alleviated.


Approaches to Security
///////////////////////////////////////

There are essentially two types of security: who-I-am security and what-I-have security.

Who-I-Am Security
========================

With who-I-am security (a.k.a. permissions-based security), the ability to use a resource requires providing who you are, validating that you are allowed to access the resource you are requesting, and then performing the requested action on the resource.

The ACL security system on most UNIX filesystems is who-I-am security.  When you want to open a file, say ``/etc/passwd``, you make the function call to open the file.  Within that function, it fetches the ACL for the file, finds out who the caller is, checks whether the caller is on the ACL for opening the file, and then proceeds to either deny access or return an open file object.

What-I-Have Security
========================

In contrast to who-I-am security, what-I-have security never requires knowing who is requesting a resource.  If you know what resources are allowed or needed when you begin a security domain, you can just have the powerbox pass in those resources to begin with and not provide a function to retrieve them.

This alleviates the worry of providing a function that wields more power than the security domain should ever have, and that could be abused if its security were breached.  But if you don't know the exact resources needed ahead of time, you pass in a proxy to the resource that checks its arguments to make sure they are valid in terms of allowed usage of the protected resource.  With this approach you are only doing argument validation, where the validation happens to be related to security.  No identity check is needed at any point.

Using our file example, the program trying to open a file is given directly, at creation time, the open file objects it will need to work with.  A proxy to the full-powered open function can be used if you need wildcard-style support for opening files.  But it works just as well, if not better, to pass in all needed file objects at the beginning, when the set of allowed files is known, so as to not even risk exposing the file-opening function.

This illustrates a subtle but key difference between who-I-am and what-I-have security.  For who-I-am security, you must know who the caller is and check that the arguments are valid for that caller.  For what-I-have security, you only have to validate the arguments.


Object-Capabilities
///////////////////////////////////////

What-I-have security is a super-set of the object-capabilities security model.  The belief here is in POLA (Principle Of Least Authority): you give a program exactly what it needs, and no more.  By providing a function that can open any file and that relies on identity to decide whether to open it, you are still providing a fully capable function; faking one's identity is all that is required to circumvent security.  It also means that if you accidentally run code that performs actions you did not expect (e.g., deleting all your files), there is no way to stop it, since it operates with *your* permissions.

Using POLA and object-capabilities, you only give access to resources to the extent that someone needs them.  This means that if a program only needs access to a single file, you give it only a function that can open that single file.  If you accidentally run code that tries to delete all of your files, it can only delete the one file you authorized the program to open.

Object-capabilities use the reference graph of objects to provide the security of accessing resources.  If you do not have a reference to a resource (or a reference to an object that can reference the resource), you cannot access it, period.  You can provide conditional access by using a proxy between code and a resource, but the proxy itself still requires a reference to the resource.  This means that your security model can be viewed simply by drawing out on a whiteboard the interactions between your security domains: any connection between domains is a possible security issue unless a proxy is put in place to mediate between the two.

This leads to a much cleaner implementation of security.  By not having to change internal code in the interpreter to perform identity checks, you can instead shift the burden of security to proxies, which are much more flexible and have less of an adverse effect on the interpreter directly (assuming you have the basic requirements for object-capabilities met).

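To make the contrast concrete, below is a minimal, purely illustrative sketch of the what-I-have approach: instead of exposing a general file-opening function, the trusted code (the powerbox) hands untrusted code a proxy for exactly one file.  The ``ReadOnlyFileProxy`` class, the ``run_untrusted()`` helper, and the file path are all invented for this example; as discussed in the next section, a real proxy would have to be implemented in C so that its wrapped reference stays hidden.

::

    class ReadOnlyFileProxy(object):
        """Illustrative proxy granting read-only access to one file object.

        In pure Python the wrapped file can still be reached through
        introspection, so a real proxy would live in a C extension module.
        """

        def __init__(self, fileobj):
            self._file = fileobj  # the single resource this code may use

        def read(self, size=-1):
            return self._file.read(size)

        def close(self):
            self._file.close()


    def run_untrusted(untrusted_func):
        # The powerbox decides which file the untrusted code gets and passes
        # in the proxy; no open() function is ever exposed to the code.
        allowed = ReadOnlyFileProxy(open('/tmp/scratch.txt'))
        return untrusted_func(allowed)

Note that no identity check appears anywhere in the sketch; the only "security logic" is the choice of which references to hand over in the first place.
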
Difficulties in Python for Object-Capabilities
//////////////////////////////////////////////

In order to provide the proper protection of references that object-capabilities require, you must set up a secure perimeter defense around your security domain.  The domain can be anything: objects, interpreters, processes, etc.  The point is that the domain is where you draw the line for allowing arbitrary access to resources.  This means that if the interpreter is the security domain, then anything within an interpreter can be expected to be freely shared, but beyond that boundary, reference access is strictly controlled.

Three key requirements for providing a proper perimeter defence are private namespaces, immutable state shared across domains, and unforgeable references.  Unfortunately, Python meets only one of the three requirements by default (you cannot forge a reference in Python code).

Problem of No Private Namespace
===============================

Typically, in languages that are statically typed (like C++), you have public and private attributes on objects.  Those private attributes provide a private namespace for the class and its instances that is not accessible by other objects.

The Python language has no such thing as a persistent, private namespace.  The language has the philosophy that if exposing something to the programmer could provide some use, then it is exposed.  This has led to Python having a wonderful amount of introspection abilities.  Unfortunately, this makes the possibility of a private namespace non-existent.

This poses an issue for providing proxies for resources, since there is no way in Python code to hide the reference to a resource.  It also makes providing security at the object level using object-capabilities impossible in pure Python code without changing the language (e.g., protecting nested scoped variables from external introspection).

Luckily, the Python virtual machine *does* provide a private namespace, albeit not for pure Python source code.  If you use the Python/C language barrier in extension modules, you can provide a private namespace by using the struct allocated for each instance of an object.  This provides a way to create proxies, written in C, that can protect resources properly.  Throughout this document, when mentioning proxies, it is assumed they have been implemented in C.

Problem of Mutable Shared State
===============================

Another problem that Python's introspection abilities cause is that of mutable shared state.  At the interpreter level, there has never been a concerted effort to isolate state shared between all interpreters running in the same Python process.  Sometimes this is for performance reasons, sometimes because it is just easier to implement this way.  Regardless, sharing of state that can be influenced by another interpreter is not safe for object-capabilities.

To rectify the situation, some changes will be needed to some built-in objects in Python.  It should mostly consist of abstracting or refactoring certain abilities out to an extension module so that access can be protected using import guards.  For instance, as it stands now, ``object.__subclasses__()`` will return a list of all of its subclasses, regardless of what interpreter the subclass was defined in.

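The reach of this particular leak is easy to demonstrate from pure Python.  The snippet below (the class name and attribute are invented for the example) shows how any code in the process can rediscover, through the shared ``object`` type, a class defined elsewhere:

::

    class PluginSecret(object):
        """Stands in for a class defined by some other code in the process."""
        token = "not meant to be visible elsewhere"

    # Walking the subclass list of the shared ``object`` type exposes the
    # class (and anything reachable from it) to unrelated code.
    leaked = [cls for cls in object.__subclasses__()
              if cls.__name__ == "PluginSecret"]
    print(leaked[0].token)

Within a single interpreter this is harmless introspection; the problem described above is that the same walk works across interpreters because the ``object`` type is shared process-wide.
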
Threat Model
///////////////////////////////////////

The threat that this security model is attempting to handle is the execution of arbitrary Python code in a sandboxed interpreter such that the code in that interpreter is not able to harm anything outside of itself unless explicitly allowed to.  This means that:

* An interpreter cannot gain abilities the Python process possesses without explicitly being given those abilities.

  + With the Python process being the powerbox, if an interpreter could gain whatever abilities it wanted to, the security domain would be completely breached.

* An interpreter cannot influence another interpreter directly at the Python level without that being explicitly allowed.

  + This includes preventing communication with another interpreter.
  + Mutable objects cannot be shared between interpreters without explicit allowance for it.
  + "Explicit allowance" includes the importation of C extension modules, because a technical detail requires that these modules not be re-initialized per interpreter, meaning that all interpreters in a single Python process share the same C extension modules.

* An interpreter cannot use operating system resources without being explicitly given those resources.

  + This includes importing modules, since that requires the ability to use the resource of the filesystem.
  + This is mediated by having to go through the process to gain the abilities in the OS that the process possesses.

In order to accomplish these goals, certain things must be made true.

* The Python process is the powerbox.

  + It controls the initial granting of abilities to interpreters.

* A bare Python interpreter is always trusted.

  + Python source code that can be created in a bare interpreter is always trusted.
  + Python source code created within a bare interpreter cannot crash the interpreter.

* Python bytecode is always distrusted.

  + Malicious bytecode can bring down an interpreter.

* Pure Python source code is always safe on its own.

  + Python source code is not able to violate the restrictions placed upon the interpreter it is running in.
  + Possibly malicious abilities are derived from C extension modules, built-in modules, and unsafe types implemented in C, not from pure Python source.

* A sub-interpreter started by another interpreter does not inherit any state.

  + The sub-interpreter starts out with a fresh global namespace and whatever built-ins it was initially given.


Guiding Principles
///////////////////////////////////////

To begin, the Python process garners all power as the powerbox.  It is up to the process to initially hand out access to resources and abilities to interpreters.  This might take the form of an interpreter with all abilities granted (i.e., a standard interpreter as launched when you execute Python), which then creates sub-interpreters with sandboxed abilities.  Another alternative is only creating interpreters with sandboxed abilities (i.e., Python being embedded in an application that only uses sandboxed interpreters).

No security measure should ever have to ask who an interpreter is.  This means that what abilities an interpreter has should not be stored at the interpreter level when the security can be provided at the Python level.

This means that while a memory cap may have a per-interpreter setting that is checked (because access to the operating system's memory allocator is not exposed at the Python level), protecting files and imports should not have such low-level, per-interpreter protection (because those can have extension module proxies that provide the security).

This means that security is based on possessing the authority to do something through a reference to an object that can perform the action.  That object will most likely decide whether to carry out the action based on the arguments passed in (whether that is an opaque token, a file path allowed to be opened, etc.).

For common-case security measures, the Python standard library (stdlib) should provide a simple way to put those measures in place.  Most commonly this will take the form of factory functions that create instances of proxies protecting key resources.

Backwards-compatibility will not be a hindrance upon the design or implementation of the security model.  Because the security model will inherently remove resources and abilities that existing code expects, it is not reasonable to expect all existing code to work in a sandboxed interpreter.

Keeping Python "pythonic" is required for all design decisions.  In general, being pythonic means that something fits the general design guidelines of the Python programming language (run ``import this`` from a Python interpreter to see the basic ones).  If removing an ability leads to something being unpythonic, it will not be done unless there is an extremely compelling reason to do so.  This does not mean existing pythonic code must continue to work, but the spirit of being pythonic will not be compromised in the name of the security model.  While this might lead to a weaker security model, this is a price that must be paid in order for Python to continue to be the language that it is.


Implementation
///////////////////////////////////////

Restricting what is in the built-in namespace and safe-guarding the interpreter (which includes safe-guarding the built-in types) is where the majority of the security will come from.  Imports and the ``file`` type are both part of the standard namespace and must be restricted in order for any security implementation to be effective.  The built-in types which are needed for basic Python usage (e.g., ``object``, code objects, etc.) must be made safe to use in a sandboxed interpreter, since they are easily accessible and yet required for Python to function.

The rest of the security for Python will come in the form of protecting physical resources.  Resources that can be exhausted in a denial-of-service (DoS) attack should be protected when it can be done in a platform-agnostic fashion.  This means, for instance, that memory should be protected but CPU usage cannot be.

Abilities of a Standard Sandboxed Interpreter
=============================================

In the end, a standard sandboxed interpreter allows code running within it to do certain things and not others.  Below is a list of abilities that will or will not be allowed in the default instance of a sandboxed interpreter, compared to an unprotected interpreter that has not imported any modules.  These protections can be tweaked by using proxies to allow certain extended abilities to be accessible.

* You cannot open any files directly.
* Importation

  + You can import any pure Python module.
  + You cannot import any Python bytecode module.
  + You cannot import any C extension module.
  + You cannot import any built-in module.

* You cannot find out any information about the operating system you are running on.
* Only safe built-ins are provided.

Implementation Details
========================

An important point to keep in mind when reading about the implementation details for the security model is that these are general changes and are not special to any type of interpreter, sandboxed or otherwise.  That means that if a change to a built-in type is suggested and it does not involve a proxy, that change is meant Python-wide for *all* interpreters.

Imports
-------

A proxy for protecting imports will be provided.  This is done by setting the proper import-related values in ``sys``: sys.path, sys.meta_path, sys.path_hooks, and sys.path_importer_cache.

It must be warned that importing any C extension module is dangerous.  Not only are they able to circumvent security measures by executing C code, but they share state across interpreters.  Because an extension module's init function is only called once for the Python *process*, its initial state is set only once.  This means that if some mutable object is exposed at the module level, a sandboxed interpreter could mutate that object and return; if the creating interpreter then accesses the mutated object, it is essentially communicating with, and/or acting on behalf of, the sandboxed interpreter.  This violates the perimeter defence.  No one should import extension modules blindly.

Bytecode files will be flat-out disallowed.  Because malicious bytecode can be created that can crash the interpreter, all bytecode files will be ignored.

Implementing Phase 2 of PEP 302
+++++++++++++++++++++++++++++++

Currently Python's built-in importer is monolithic in that __import__ will automatically import a .py file, a .pyc file, an extension module, or a built-in module if no custom importer handles the import.  This does not give one the flexibility needed to control imports at the level of file type.  In order to be able to prevent extension module imports and .pyc file imports, the current import machinery will be refactored into PEP 302 importers.  This will allow for better control over what can and cannot be imported.

Implementing Import in Python
+++++++++++++++++++++++++++++

The import machinery should be rewritten in Python.  The C code is considered delicate and does not lend itself to being read.  There is also not a very strong definition of the import rules.  Rewriting import in Python would help clarify the semantics of imports.

This rewrite will require some bootstrapping in order for the code to be loaded into the process without itself requiring importation, but that should be doable.  Some care must also be taken to avoid circular dependencies on the modules needed to handle importing (e.g., importing ``sys`` where that import itself invokes the import machinery).

Interaction with another interpreter that might provide an import function must also be dealt with.  One cannot simply expose, through importation, the modules that the import machinery itself needs.  This can be handled by allowing the powerbox's import function to have those modules directly injected into its global namespace.  But there is also the issue of using the proper ``sys.modules`` for storing the modules already imported.  You do not want to inject the ``sys`` module of the powerbox and have all imports end up in its ``sys.modules`` instead of in the ``sys.modules`` of the interpreter making the call.  This must be dealt with in some fashion (injecting per-call, having a factory function create a new import function based on an interpreter passed in, etc.).

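A minimal sketch of the factory-function idea mentioned above follows.  It is purely illustrative: the names (``make_import``, ``allowed_modules``, ``loaders``) and the simplified loader calling convention are assumptions for this example, not a settled API, and a real implementation would need C-level help to keep its references private.

::

    def make_import(allowed_modules, loaders):
        """Create a self-contained __import__ replacement.

        ``allowed_modules`` maps names to module objects injected by the
        creating interpreter; ``loaders`` is a sequence of callables
        (simplified stand-ins for PEP 302 loaders) handed down by it.  The
        cache below plays the role of a per-interpreter ``sys.modules``, so
        the powerbox's own ``sys`` module never has to be exposed.
        """
        cache = dict(allowed_modules)

        def __import__(name, globals=None, locals=None, fromlist=None,
                       level=-1):
            if name in cache:
                return cache[name]
            for loader in loaders:
                module = loader(name)
                if module is not None:
                    cache[name] = module
                    return module
            raise ImportError("import of %r is not allowed" % (name,))

        return __import__

An interpreter would install such a function as its ``__import__`` built-in, and could narrow (but never widen) the abilities it passes on before creating any sub-interpreter of its own.
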
Python/import.c Notes for PEP 302 Phase 2
+++++++++++++++++++++++++++++++++++++++++

__import__ -> PyImport_ImportModuleLevel() -> import_module_level()

import_module_level()
    -> get_parent()
        Find out if the import is occurring in a package and return the
        parent module in the package.
    -> load_next()
        Find out the next module to import in the chain of a dotted name
        (e.g., ``b`` in ``a.b.c`` if ``a`` is already imported but ``b`` is
        not).

load_next() -> import_submodule()

import_submodule()
    -> find_module()
        * Check meta_path, path_hooks/path_importer_cache, else the file
          location for a built-in import.
    -> load_module()
        * Execute loading of code based on the type of import:

          * PY_SOURCE -> path_hook
          * PY_COMPILED -> path_hook
          * C_EXTENSION -> path_hook
          * PKG_DIRECTORY -> path_hook

            Could it be worked so that if a package is found, an empty
            module is put in sys.modules and then pkg.__init__ is imported?

          * C_BUILTIN -> meta_path
          * PY_FROZEN -> meta_path
          * IMP_HOOK = importer -> path_hook

Changes to find_module():

* Make it only use PEP 302 resolution.
* Rip out the part that finds the file/directory and make it the importer
  that finds files.  Then have different file types register their file
  extension and a factory function to call to get the importer to return
  (if the proper file type is found): objects for py source, bytecode,
  packages, and extension modules.  (The path_hooks factory function comes
  from the part that just verifies that a sys.path entry is a directory.)

Changes to load_module():

* Rip out the switch and have it only do loaders.
* Change the py source, bytecode, package, and extension module loaders to
  be individual loaders.
* Change the built-in and frozen module loaders to be meta_path loaders.

Sanitizing Built-In Types
-------------------------

Python contains a wealth of built-in types.  These are used at such a basic level that they are easily accessible to any Python code.  They are also shared amongst all interpreters in a Python process.  This means all built-in types need to be made safe (e.g., no mutable shared state) so that they can be used by any and all interpreters in a single Python process.  Several aspects of built-in types need to be examined.

Constructors
++++++++++++

Almost all of Python's built-in types contain a constructor that allows code to create a new instance of a type as long as you have the type itself.  Unfortunately this does not work well in an object-capabilities system without either providing a proxy to the constructor or just removing it when access to such a constructor should be controlled.

The plan is to remove select constructors of the types that are dangerous and either relocate them to an extension module as factory functions or create a new built-in that acts as a generic factory function for all types, missing constructor or not.  The former approach allows the protection to be enforced by an import proxy: just don't allow the extension module to be imported.  The latter approach would allow either a unique constructor per type, or more generic built-in(s) for construction (e.g., introducing a ``construct()`` function that takes a type and any arguments to pass when constructing an instance of that type), with proxies providing the security.

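As a rough illustration of the latter approach (the ``construct()`` idea; the registry and ``register_constructor()`` helper are hypothetical, not a decided API), the real constructors could be held in a registry that only trusted code, or an import-guarded extension module, can reach:

::

    # Hypothetical registry populated by trusted code before the dangerous
    # constructors are removed from the types themselves.
    _constructors = {}

    def register_constructor(type_, factory):
        """Record the real way to build instances of ``type_``."""
        _constructors[type_] = factory

    def construct(type_, *args, **kwargs):
        """Generic factory standing in for removed type constructors."""
        try:
            factory = _constructors[type_]
        except KeyError:
            raise TypeError("construction of %s instances is not allowed"
                            % type_.__name__)
        return factory(*args, **kwargs)

A sandboxed interpreter would see only ``construct()`` (or a proxy wrapping it), never the registry, which in practice would have to live behind the Python/C barrier to keep its references private.
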
Some might consider this unpythonic; Python very rarely separates the constructor of an object from the class/type and requires that you go through a function.  But there is some precedent for not using a type's constructor to get an instance of a type.  The ``file`` type, for instance, typically has its instances created through the ``open()`` function.  This slight shift for certain types to have their (dangerous) constructor not on the type but in a function is considered an acceptable compromise.

Types whose constructors are considered dangerous are:

* ``file``

  + Will definitely use the ``open()`` built-in.

* code objects
* XXX sockets?

Filesystem Information
++++++++++++++++++++++

When running code in a sandboxed interpreter, POLA suggests that you do not want to expose information about your environment on top of protecting its use.  This means that filesystem paths typically should not be exposed.  Unfortunately, Python exposes file paths all over the place that will need to be hidden:

* Modules

  + ``__file__`` attribute

* Code objects

  + ``co_filename`` attribute

* Packages

  + ``__path__`` attribute

* XXX

XXX how to expose safely?  ``path()`` built-in?

Mutable Shared State
++++++++++++++++++++

Because built-in types are shared between interpreters, they cannot expose any mutable shared state.  Unfortunately, as it stands, some do.  Below is a list of types that share some form of dangerous state, how they share it, and how to fix the problem:

* ``object``

  + ``__subclasses__()`` function

    - Remove the function; it is never seen used in real-world code.

Perimeter Defences Between a Created Interpreter and Its Creator
------------------------------------------------------------------

The plan is to allow interpreters to instantiate sandboxed interpreters safely.  By using the creating interpreter's abilities to provide abilities to the created interpreter, you make sure there is no escalation in abilities.

But by creating a sandboxed interpreter and passing code into it, you open up possible ways of getting back to the creating interpreter or escalating privileges.  Those ways are:

* A ``__del__`` method created in the sandboxed interpreter whose object is cleaned up in the unprotected interpreter.

  - XXX Watch out for objects being set in __builtin__.__dict__ and thus not cleaned up until the interpreter object is deleted, and thus possibly executed in the creator's environment!

* Using frames to walk the frame stack back to another interpreter.
* XXX A generator's execution frame?

Making the ``sys`` Module Safe
------------------------------

The ``sys`` module is an odd mix of both information about and settings for the interpreter.  Because of this dichotomy, some very useful but innocuous information is stored in the module along with things that should not be exposed to sandboxed interpreters.  This includes settings that are global to the Python process along with settings that are specific to each interpreter.

This means that the ``sys`` module needs to have its safe information separated out from the unsafe settings.  This will allow an import proxy to let through the safe information but block the ability to set values.  This separation will also require some reworking of the underpinnings of how interpreters are created, as currently Py_NewInterpreter() sets an interpreter's sys module dict to one that is shared by *all* interpreters.

XXX separate modules, ``sys.settings`` and ``sys.info``, or strip ``sys`` to settings and put info somewhere else (interpreter?)?  Or provide a method that will create a faked sys module that has the safe values copied into it?

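As a rough, illustrative sketch of that last option (the ``make_safe_sys`` name and the exact whitelist are assumptions for the example, drawn from the "safe attributes" list below):

::

    import sys
    import types

    # A subset of the safe attributes enumerated below; illustrative only.
    _SAFE_SYS_ATTRS = ('builtin_module_names', 'byteorder', 'copyright',
                       'hexversion', 'maxint', 'maxunicode', 'version',
                       'api_version', 'version_info')

    def make_safe_sys():
        """Build a stripped-down module exposing only whitelisted values."""
        safe = types.ModuleType('sys')
        for name in _SAFE_SYS_ATTRS:
            if hasattr(sys, name):
                setattr(safe, name, getattr(sys, name))
        return safe

Per-interpreter values such as ``modules`` and ``path`` could not simply be copied this way; they would have to be set per interpreter by whatever machinery creates it.
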
The safe attributes are:

* builtin_module_names : Modules/config.c:PyImport_Inittab
  Information about what might be blocked from importation.
* byteorder : Python/sysmodule.c:_PySys_Init()
  Needed for networking.
* copyright : Python/getcopyright.c:Py_GetCopyright()
  Set to a string about the interpreter.
* displayhook() : per-interpreter (Python/sysmodule.c:sys_displayhook()) (?)
* excepthook() : per-interpreter (Python/sysmodule.c:sys_excepthook()) (?)
* exc_info() : per-thread (Python/sysmodule.c:sys_exc_info()) (?)
* exc_clear() : per-thread (Python/sysmodule.c:sys_exc_clear()) (?)
* exit() : per-thread (Python/sysmodule.c:sys_exit())
  Raises SystemExit (XXX make sure this only exits the interpreter if multiple interpreters are running).
* getcheckinterval() : per-process (Python/ceval.c:_Py_CheckInterval)
  Returns an int.
* getdefaultencoding() : per-process (Objects/unicodeobject.c:PyUnicode_GetDefaultEncoding())
  Returns a string about the interpreter.
* getrefcount() : per-object
  Returns an int about the passed-in object.
* getrecursionlimit() : per-process (Python/ceval.c:Py_GetRecursionLimit())
  Returns an int about the interpreter.
* hexversion : Python/sysmodule.c:_PySys_Init()
  Set to an int about the interpreter.
* last_type : Python/pythonrun.c:PyErr_PrintEx()
  (XXX make sure it doesn't return a value from the creating interpreter)
* last_value : Python/pythonrun.c:PyErr_PrintEx()
  (XXX see last_type worry)
* last_traceback : Python/pythonrun.c:PyErr_PrintEx() (?)
* maxint : Objects/intobject.c:PyInt_GetMax()
  Set to an int that exposes ambiguous information about the computer.
* maxunicode : Objects/unicodeobject.c:PyUnicode_GetMax()
  Set to an int about the interpreter.
* meta_path : Python/import.c:_PyImportHooks_Init() (?)
* path_hooks : Python/import.c:_PyImportHooks_Init() (?)
* path_importer_cache : Python/import.c:_PyImportHooks_Init() (?)
* ps1 : Python/pythonrun.c:PyRun_InteractiveLoopFlags()
* ps2 : Python/pythonrun.c:PyRun_InteractiveLoopFlags()
* stdin : Python/sysmodule.c:_PySys_Init()
* stdout : Python/sysmodule.c:_PySys_Init()
* stderr : Python/sysmodule.c:_PySys_Init()
* tracebacklimit : (XXX don't know where it is set) (?)
* version : Python/sysmodule.c:_PySys_Init()
* api_version : Python/sysmodule.c:_PySys_Init()
* version_info : Python/sysmodule.c:_PySys_Init()
* warnoptions : Python/sysmodule.c:_PySys_Init() (?)
  (XXX per-process value)

The dangerous settings are:

* argv : Modules/main.c:Py_Main()
* subversion : Python/sysmodule.c:_PySys_Init()
* _current_frames() : per-thread (Python/sysmodule.c:sys_current_frames())
* __displayhook__ : Python/sysmodule.c:_PySys_Init() (?)
* __excepthook__ : Python/sysmodule.c:_PySys_Init() (?)
* dllhandle : Python/sysmodule.c:_PySys_Init()
* exc_type : Python/ceval.c:(re)?set_exc_info()
  Deprecated since 1.5.
* exc_value : Python/ceval.c:(re)?set_exc_info()
  Deprecated since 1.5.
* exc_traceback : Python/ceval.c:(re)?set_exc_info()
  Deprecated since 1.5.
* exec_prefix : Python/sysmodule.c:_PySys_Init()
  Exposes filesystem information.
* executable : Python/sysmodule.c:_PySys_Init()
  Exposes filesystem information.
* exitfunc : set by user
  Deprecated.
* _getframe() : per-thread (Python/sysmodule.c:sys_getframe())
* getwindowsversion() : per-process (Python/sysmodule.c:sys_getwindowsversion())
  Exposes OS information.
* modules : per-interpreter (Python/pythonrun.c:(Py_InitializeEx() | Py_NewInterpreter()))
* path : per-interpreter (Python/sysmodule.c:PySys_SetPath() called by Py_InitializeEx() and Py_NewInterpreter())
* platform : Python/sysmodule.c:_PySys_Init()
  Exposes OS information.
* prefix : Python/sysmodule.c:_PySys_Init()
  Exposes filesystem information.
* setcheckinterval() : per-process (Python/sysmodule.c:sys_setcheckinterval())
* setdefaultencoding() : per-process (Python/sysmodule.c:sys_setdefaultencoding() using PyUnicode_SetDefaultEncoding())
* setdlopenflags() : per-interpreter (Python/sysmodule.c:sys_setdlopenflags())
* setprofile() : per-thread (Python/sysmodule.c:sys_setprofile() using PyEval_SetProfile())
* setrecursionlimit() : per-process (Python/sysmodule.c:sys_setrecursionlimit() using Py_SetRecursionLimit())
* settrace() : per-thread (Python/sysmodule.c:sys_settrace() using PyEval_SetTrace())
* settscdump() : per-interpreter (Python/sysmodule.c:set_settscdump())
* __stdin__ : Python/sysmodule.c:_PySys_Init()
* __stdout__ : Python/sysmodule.c:_PySys_Init()
* __stderr__ : Python/sysmodule.c:_PySys_Init()
* winver : Python/sysmodule.c:_PySys_Init()
  Exposes OS information.

Protecting I/O
++++++++++++++

The ``print`` keyword and the built-ins ``raw_input()`` and ``input()`` use the values stored in ``sys.stdout`` and ``sys.stdin``.  By exposing these attributes to the creating interpreter, one can set them to safe objects, such as instances of ``StringIO``.

Safe Networking
---------------

XXX proxy on socket module, modify open() to be the constructor, etc.

Protecting Memory Usage
-----------------------

To protect memory, low-level hooks into Python's memory allocator are needed.  By hooking into the C API for memory allocation and deallocation, a *very* rough running count of used memory can be kept.  This count can be compared against a set memory cap to prevent sandboxed interpreters from using so much memory that it impacts the overall performance of the system.

Because this has no direct connection with object-capabilities and no form of exposure at the Python level, this feature can be safely implemented separately from the rest of the security model.

The existing APIs to protect are (as declared in Include/pymem.h and Include/objimpl.h; both header files must be thoroughly checked for public API and macros):

* PyObject_Malloc(), macros, & friends
* PyObject_New(), macros, & friends
* _PyObject_New(), & friends
* PyMem_Malloc(), macros, & friends
* PyObject_GC_New(), & friends

An implementation is in Python's svn repository under the bcannon-sandboxing branch.