Securing Python
#####################################################################

Status
///////////////////////////////////////

+ Dangerous types (`Constructors`_) [done]

  - file

    * Create PyFile_Init() from file_init() [done]
    * Switch current C-level uses of 'file' constructor to use
      PyFile_Type.tp_new() and PyFile_Init(). [done]

      + built-in open() [done]
      + bz2 module [done]

    * Expose PyFile_Init() in objcap module so that file subclasses are
      actually worth something. [done]
    * Create PyFile_Safe*() version of C API that goes through open()
      built-in.

  - code [done]

    * Add objcap.code_new() function [done]

  - frame

    * Do not allow importing the 'sys' module to get to sys._getframe(),
      sys._current_frames(), or to set a trace or profile function. [done]

  - object() [done]

    * Remove object.__subclasses__ (`Mutable Shared State`_) [done]

+ Sandboxed versions of built-ins (`Sanitizing Built-In Types`_)

  - open()
  - __import__() / PEP 302 importer (`Imports`_) [done]

    * Make sure importing built-in modules can be blocked.
    * Make sure that no abilities are exposed by importers, since they will
      be accessible through the inherited sys data dict by any created
      interpreters.

      + Do not inject the full sys module.
      + Most likely will need to wrap the built-in importer so as to be able
        to effectively block access to sys.

    * Could make __import__ self-contained such that 'sys' is not directly
      referenced.

      + Allows hiding sys.path by storing it in the self-contained object.
      + Could expose other objects if desired.
      + Could make it so that there is only exposure of certain functions
        that allow removal from a whitelist but no additions.
      + Other things can be options that can only be turned off.
      + Importers could be made like this, with the one-way narrowing of
        abilities.

  - execfile()

    * Force it to go through open().

      + Prevents opening unauthorized files.
      + Prevents its use as a way to probe the filesystem.

    * Or just promote its removal.

  - exit()

    * Have SystemExit exit the process only if no other interpreters are
      running.

+ Filesystem path hiding (`Filesystem Information`_)
+ Tweaked stdlib modules

  - mini 'sys' module (`Making the ``sys`` Module Safe`_)
  - genericpath module (for os.path when C modules are blocked)
  - socket (`Safe Networking`_)
  - thread (only if worried about thread resource starvation,
    interrupt_main() not being per-interpreter, and stack_size() being
    considered dangerous)

+ Create sandboxed interpreter stdlib module

  - Be able to specify built-ins [done]
  - Set 'sys' module settings [done]
  - Set 'sys.modules' [done]
  - API

    * Python [done]
    * C

  - Securely handle exceptions being raised in the sub-interpreter [done]
  - Redirect output [done]
  - Inherit abilities provided by the creating interpreter ONLY

    * Only get importers as provided by the creating interpreter.
    * Only get built-ins as provided by the creating interpreter.

+ Tear out old restricted mode code.


Introduction
///////////////////////////////////////

As of Python 2.5, Python does not support any form of security model for executing arbitrary Python code in some form of protected interpreter.  While one can use such things as ``exec`` and ``eval`` to garner a very weak form of sandboxing, they do not provide any thorough protection against malicious code.  This should be rectified.

This document attempts to lay out what would be needed to secure Python in such a way as to allow arbitrary Python code to execute in a sandboxed interpreter without worry of that interpreter providing access to any resource of the operating system without being given explicit authority to do so.

Throughout this document several terms are going to be used.  A "sandboxed interpreter" is one where the built-in namespace is not the same as that of an interpreter whose built-ins were unaltered, which is called an "unprotected interpreter".  A "bare interpreter" is one where the built-in namespace has been stripped down to the bare minimum needed to run any form of basic Python program.  This means that only the atomic types (i.e., syntactically supported types), ``object``, and the exceptions provided by the ``exceptions`` module are in the built-in namespace.  There have also been no imports executed in the interpreter.

The "security domain" is the boundary at which security is cared about.  For this discussion, it is the interpreter.  Anything that happens within a security domain is considered open and unprotected.  But any action that tries to cross the boundary of the security domain is where the security model and its protections come in.

The "powerbox" is the thing that possesses the ultimate power in the system for giving out abilities.  In our case it is the Python process.  No interpreter can possess any ability that the overall process does not have.  It is up to the Python process to initially hand out abilities to interpreters, to use either for themselves or to give to interpreters they create themselves.  This means that we care about interpreter<->interpreter interactions along with interpreter<->process interactions.


Rationale
///////////////////////////////////////

Python is used extensively as an embedded language within existing programs.  These applications oftentimes need to let their users run Python code written by someone else, while being able to trust that no unintended harm will come to their system regardless of how much they trust the code being executed.  For instance, think of an application that supports a plug-in system with Python as the language used for writing plug-ins.  You do not want to have to examine every plug-in you download to make sure that it does not alter your filesystem, if you can help it.  With a proper security model and implementation in place, this hindrance of having to examine all code you execute should be alleviated.


Approaches to Security
///////////////////////////////////////

There are essentially two types of security: who-I-am security and what-I-have security.

Who-I-Am Security
========================

With who-I-am security (a.k.a. permissions-based security), the ability to use a resource requires providing who you are, validating that you are allowed to access the resource you are requesting, and then performing the requested action on the resource.

The ACL security system on most UNIX filesystems is who-I-am security.  When you want to open a file, say ``/etc/passwd``, you make the function call to open the file.  Within that function, it fetches the ACL for the file, finds out who the caller is, checks whether the caller is on the ACL for opening the file, and then proceeds to either deny access or return an open file object.

What-I-Have Security
========================

In contrast to who-I-am security, what-I-have security never requires knowing who is requesting a resource.  If you know what resources are allowed or needed when you begin a security domain, you can just have the powerbox pass in those resources to begin with and not provide a function to retrieve them.

This alleviates the worry of providing a function that wields more power than the security domain should ever have, and that could be abused if its security were breached.  But if you don't know the exact resources needed ahead of time, you pass in a proxy to the resource that checks its arguments to make sure they are valid in terms of allowed usage of the protected resource.  With this approach you are only doing argument validation, where the validation happens to be related to security.  No identity check is needed at any point.

Using our file example, the program trying to open a file is given directly, at creation time, the open file objects it will need to work with.  A proxy to the full-powered open function can be used if you need wildcard-style support for opening files.  But it works just as well, if not better, to pass in all needed file objects at the beginning, when the set of allowed files is known, so as to not even risk exposing the file-opening function.

This illustrates a subtle but key difference between who-I-am and what-I-have security.  For who-I-am security, you must know who the caller is and check that the arguments are valid for that caller.  For what-I-have security, you only have to validate the arguments.


Object-Capabilities
///////////////////////////////////////

What-I-have security is a super-set of the object-capabilities security model.  The belief here is in POLA (Principle Of Least Authority): you give a program exactly what it needs, and no more.  By providing a function that can open any file and that relies on identity to decide whether to open it, you are still providing a fully capable function; faking one's identity is all that is required to circumvent security.  It also means that if you accidentally run code that performs actions you did not expect (e.g., deleting all your files), there is no way to stop it, since it operates with *your* permissions.

Using POLA and object-capabilities, you only give access to resources to the extent that someone needs them.  This means that if a program only needs access to a single file, you give it only a function that can open that single file.  If you accidentally run code that tries to delete all of your files, it can only delete the one file you authorized the program to open.

Object-capabilities use the reference graph of objects to provide the security of accessing resources.  If you do not have a reference to a resource (or a reference to an object that can reference the resource), you cannot access it, period.  You can provide conditional access by using a proxy between code and a resource, but the proxy itself still requires a reference to the resource.  This means that your security model can be viewed simply by drawing out on a whiteboard the interactions between your security domains: any connection between domains is a possible security issue unless a proxy is put in place to mediate between the two.

This leads to a much cleaner implementation of security.  By not having to change internal code in the interpreter to perform identity checks, you can instead shift the burden of security to proxies, which are much more flexible and have less of an adverse effect on the interpreter directly (assuming you have the basic requirements for object-capabilities met).

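To make the contrast concrete, below is a minimal, purely illustrative sketch of the what-I-have approach: instead of exposing a general file-opening function, the trusted code (the powerbox) hands untrusted code a proxy for exactly one file.  The ``ReadOnlyFileProxy`` class, the ``run_untrusted()`` helper, and the file path are all invented for this example; as discussed in the next section, a real proxy would have to be implemented in C so that its wrapped reference stays hidden.

::

    class ReadOnlyFileProxy(object):
        """Illustrative proxy granting read-only access to one file object.

        In pure Python the wrapped file can still be reached through
        introspection, so a real proxy would live in a C extension module.
        """

        def __init__(self, fileobj):
            self._file = fileobj  # the single resource this code may use

        def read(self, size=-1):
            return self._file.read(size)

        def close(self):
            self._file.close()


    def run_untrusted(untrusted_func):
        # The powerbox decides which file the untrusted code gets and passes
        # in the proxy; no open() function is ever exposed to the code.
        allowed = ReadOnlyFileProxy(open('/tmp/scratch.txt'))
        return untrusted_func(allowed)

Note that no identity check appears anywhere in the sketch; the only "security logic" is the choice of which references to hand over in the first place.
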
Difficulties in Python for Object-Capabilities
//////////////////////////////////////////////

In order to provide the proper protection of references that object-capabilities require, you must set up a secure perimeter defense around your security domain.  The domain can be anything: objects, interpreters, processes, etc.  The point is that the domain is where you draw the line for allowing arbitrary access to resources.  This means that if the interpreter is the security domain, then anything within an interpreter can be expected to be freely shared, but beyond that boundary, reference access is strictly controlled.

Three key requirements for providing a proper perimeter defence are private namespaces, immutable state shared across domains, and unforgeable references.  Unfortunately, Python meets only one of the three requirements by default (you cannot forge a reference in Python code).

Problem of No Private Namespace
===============================

Typically, in languages that are statically typed (like C++), you have public and private attributes on objects.  Those private attributes provide a private namespace for the class and its instances that is not accessible by other objects.

The Python language has no such thing as a persistent, private namespace.  The language has the philosophy that if exposing something to the programmer could provide some use, then it is exposed.  This has led to Python having a wonderful amount of introspection abilities.  Unfortunately, this makes the possibility of a private namespace non-existent.

This poses an issue for providing proxies for resources, since there is no way in Python code to hide the reference to a resource.  It also makes providing security at the object level using object-capabilities impossible in pure Python code without changing the language (e.g., protecting nested scoped variables from external introspection).

Luckily, the Python virtual machine *does* provide a private namespace, albeit not for pure Python source code.  If you use the Python/C language barrier in extension modules, you can provide a private namespace by using the struct allocated for each instance of an object.  This provides a way to create proxies, written in C, that can protect resources properly.  Throughout this document, when mentioning proxies, it is assumed they have been implemented in C.

Problem of Mutable Shared State
===============================

Another problem that Python's introspection abilities cause is that of mutable shared state.  At the interpreter level, there has never been a concerted effort to isolate state shared between all interpreters running in the same Python process.  Sometimes this is for performance reasons, sometimes because it is just easier to implement this way.  Regardless, sharing of state that can be influenced by another interpreter is not safe for object-capabilities.

To rectify the situation, some changes will be needed to some built-in objects in Python.  It should mostly consist of abstracting or refactoring certain abilities out to an extension module so that access can be protected using import guards.  For instance, as it stands now, ``object.__subclasses__()`` will return a list of all of its subclasses, regardless of what interpreter the subclass was defined in.

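The reach of this particular leak is easy to demonstrate from pure Python.  The snippet below (the class name and attribute are invented for the example) shows how any code in the process can rediscover, through the shared ``object`` type, a class defined elsewhere:

::

    class PluginSecret(object):
        """Stands in for a class defined by some other code in the process."""
        token = "not meant to be visible elsewhere"

    # Walking the subclass list of the shared ``object`` type exposes the
    # class (and anything reachable from it) to unrelated code.
    leaked = [cls for cls in object.__subclasses__()
              if cls.__name__ == "PluginSecret"]
    print(leaked[0].token)

Within a single interpreter this is harmless introspection; the problem described above is that the same walk works across interpreters because the ``object`` type is shared process-wide.
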
Threat Model
///////////////////////////////////////

The threat that this security model is attempting to handle is the execution of arbitrary Python code in a sandboxed interpreter such that the code in that interpreter is not able to harm anything outside of itself unless explicitly allowed to.  This means that:

* An interpreter cannot gain abilities the Python process possesses without explicitly being given those abilities.

  + With the Python process being the powerbox, if an interpreter could gain whatever abilities it wanted to, the security domain would be completely breached.

* An interpreter cannot influence another interpreter directly at the Python level without that being explicitly allowed.

  + This includes preventing communication with another interpreter.
  + Mutable objects cannot be shared between interpreters without explicit allowance for it.
  + "Explicit allowance" includes the importation of C extension modules, because a technical detail requires that these modules not be re-initialized per interpreter, meaning that all interpreters in a single Python process share the same C extension modules.

* An interpreter cannot use operating system resources without being explicitly given those resources.

  + This includes importing modules, since that requires the ability to use the resource of the filesystem.
  + This is mediated by having to go through the process to gain the abilities in the OS that the process possesses.

In order to accomplish these goals, certain things must be made true.

* The Python process is the powerbox.

  + It controls the initial granting of abilities to interpreters.

* A bare Python interpreter is always trusted.

  + Python source code that can be created in a bare interpreter is always trusted.
  + Python source code created within a bare interpreter cannot crash the interpreter.

* Python bytecode is always distrusted.

  + Malicious bytecode can bring down an interpreter.

* Pure Python source code is always safe on its own.

  + Python source code is not able to violate the restrictions placed upon the interpreter it is running in.
  + Possibly malicious abilities are derived from C extension modules, built-in modules, and unsafe types implemented in C, not from pure Python source.

* A sub-interpreter started by another interpreter does not inherit any state.

  + The sub-interpreter starts out with a fresh global namespace and whatever built-ins it was initially given.


Guiding Principles
///////////////////////////////////////

To begin, the Python process garners all power as the powerbox.  It is up to the process to initially hand out access to resources and abilities to interpreters.  This might take the form of an interpreter with all abilities granted (i.e., a standard interpreter as launched when you execute Python), which then creates sub-interpreters with sandboxed abilities.  Another alternative is only creating interpreters with sandboxed abilities (i.e., Python being embedded in an application that only uses sandboxed interpreters).

No security measure should ever have to ask who an interpreter is.  This means that what abilities an interpreter has should not be stored at the interpreter level when the security can be provided at the Python level.

This means that while a memory cap may have a per-interpreter setting that is checked (because access to the operating system's memory allocator is not exposed at the Python level), protecting files and imports should not have such low-level, per-interpreter protection (because those can have extension module proxies that provide the security).

This means that security is based on possessing the authority to do something through a reference to an object that can perform the action.  That object will most likely decide whether to carry out the action based on the arguments passed in (whether that is an opaque token, a file path allowed to be opened, etc.).

For common-case security measures, the Python standard library (stdlib) should provide a simple way to put those measures in place.  Most commonly this will take the form of factory functions that create instances of proxies protecting key resources.

Backwards-compatibility will not be a hindrance upon the design or implementation of the security model.  Because the security model will inherently remove resources and abilities that existing code expects, it is not reasonable to expect all existing code to work in a sandboxed interpreter.

Keeping Python "pythonic" is required for all design decisions.  In general, being pythonic means that something fits the general design guidelines of the Python programming language (run ``import this`` from a Python interpreter to see the basic ones).  If removing an ability leads to something being unpythonic, it will not be done unless there is an extremely compelling reason to do so.  This does not mean existing pythonic code must continue to work, but the spirit of being pythonic will not be compromised in the name of the security model.  While this might lead to a weaker security model, this is a price that must be paid in order for Python to continue to be the language that it is.


Implementation
///////////////////////////////////////

Restricting what is in the built-in namespace and safe-guarding the interpreter (which includes safe-guarding the built-in types) is where the majority of the security will come from.  Imports and the ``file`` type are both part of the standard namespace and must be restricted in order for any security implementation to be effective.  The built-in types which are needed for basic Python usage (e.g., ``object``, code objects, etc.) must be made safe to use in a sandboxed interpreter, since they are easily accessible and yet required for Python to function.

The rest of the security for Python will come in the form of protecting physical resources.  Resources that can be exhausted in a denial-of-service (DoS) attack should be protected when it can be done in a platform-agnostic fashion.  This means, for instance, that memory should be protected but CPU usage cannot be.

Abilities of a Standard Sandboxed Interpreter
=============================================

In the end, a standard sandboxed interpreter allows code running within it to do certain things and not others.  Below is a list of abilities that will or will not be allowed in the default instance of a sandboxed interpreter, compared to an unprotected interpreter that has not imported any modules.  These protections can be tweaked by using proxies to allow certain extended abilities to be accessible.

* You cannot open any files directly.
* Importation

  + You can import any pure Python module.
  + You cannot import any Python bytecode module.
  + You cannot import any C extension module.
  + You cannot import any built-in module.

* You cannot find out any information about the operating system you are running on.
* Only safe built-ins are provided.

Implementation Details
========================

An important point to keep in mind when reading about the implementation details for the security model is that these are general changes and are not special to any type of interpreter, sandboxed or otherwise.  That means that if a change to a built-in type is suggested and it does not involve a proxy, that change is meant Python-wide for *all* interpreters.

Imports
-------

A proxy for protecting imports will be provided.  This is done by setting the proper import-related values in ``sys``: sys.path, sys.meta_path, sys.path_hooks, and sys.path_importer_cache.

It must be warned that importing any C extension module is dangerous.  Not only are they able to circumvent security measures by executing C code, but they share state across interpreters.  Because an extension module's init function is only called once for the Python *process*, its initial state is set only once.  This means that if some mutable object is exposed at the module level, a sandboxed interpreter could mutate that object and return; if the creating interpreter then accesses the mutated object, it is essentially communicating with, and/or acting on behalf of, the sandboxed interpreter.  This violates the perimeter defence.  No one should import extension modules blindly.

Bytecode files will be flat-out disallowed.  Because malicious bytecode can be created that can crash the interpreter, all bytecode files will be ignored.

Implementing Phase 2 of PEP 302
+++++++++++++++++++++++++++++++

Currently Python's built-in importer is monolithic in that __import__ will automatically import a .py file, a .pyc file, an extension module, or a built-in module if no custom importer handles the import.  This does not give one the flexibility needed to control imports at the level of file type.  In order to be able to prevent extension module imports and .pyc file imports, the current import machinery will be refactored into PEP 302 importers.  This will allow for better control over what can and cannot be imported.

Implementing Import in Python
+++++++++++++++++++++++++++++

The import machinery should be rewritten in Python.  The C code is considered delicate and does not lend itself to being read.  There is also not a very strong definition of the import rules.  Rewriting import in Python would help clarify the semantics of imports.

This rewrite will require some bootstrapping in order for the code to be loaded into the process without itself requiring importation, but that should be doable.  Some care must also be taken to avoid circular dependencies on the modules needed to handle importing (e.g., importing ``sys`` where that import itself invokes the import machinery).

Interaction with another interpreter that might provide an import function must also be dealt with.  One cannot simply expose, through importation, the modules that the import machinery itself needs.  This can be handled by allowing the powerbox's import function to have those modules directly injected into its global namespace.  But there is also the issue of using the proper ``sys.modules`` for storing the modules already imported.  You do not want to inject the ``sys`` module of the powerbox and have all imports end up in its ``sys.modules`` instead of in the ``sys.modules`` of the interpreter making the call.  This must be dealt with in some fashion (injecting per-call, having a factory function create a new import function based on an interpreter passed in, etc.).

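A minimal sketch of the factory-function idea mentioned above follows.  It is purely illustrative: the names (``make_import``, ``allowed_modules``, ``loaders``) and the simplified loader calling convention are assumptions for this example, not a settled API, and a real implementation would need C-level help to keep its references private.

::

    def make_import(allowed_modules, loaders):
        """Create a self-contained __import__ replacement.

        ``allowed_modules`` maps names to module objects injected by the
        creating interpreter; ``loaders`` is a sequence of callables
        (simplified stand-ins for PEP 302 loaders) handed down by it.  The
        cache below plays the role of a per-interpreter ``sys.modules``, so
        the powerbox's own ``sys`` module never has to be exposed.
        """
        cache = dict(allowed_modules)

        def __import__(name, globals=None, locals=None, fromlist=None,
                       level=-1):
            if name in cache:
                return cache[name]
            for loader in loaders:
                module = loader(name)
                if module is not None:
                    cache[name] = module
                    return module
            raise ImportError("import of %r is not allowed" % (name,))

        return __import__

An interpreter would install such a function as its ``__import__`` built-in, and could narrow (but never widen) the abilities it passes on before creating any sub-interpreter of its own.
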
Python/import.c Notes for PEP 302 Phase 2
+++++++++++++++++++++++++++++++++++++++++

__import__ -> PyImport_ImportModuleLevel() -> import_module_level()

import_module_level()
    -> get_parent()
        Find out if the import is occurring in a package and return the
        parent module in the package.
    -> load_next()
        Find out the next module to import in the chain of a dotted name
        (e.g., ``b`` in ``a.b.c`` if ``a`` is already imported but ``b`` is
        not).

load_next() -> import_submodule()

import_submodule()
    -> find_module()
        * Check meta_path, path_hooks/path_importer_cache, else the file
          location for a built-in import.
    -> load_module()
        * Execute loading of code based on the type of import:

          * PY_SOURCE -> path_hook
          * PY_COMPILED -> path_hook
          * C_EXTENSION -> path_hook
          * PKG_DIRECTORY -> path_hook

            Could it be worked so that if a package is found, an empty
            module is put in sys.modules and then pkg.__init__ is imported?

          * C_BUILTIN -> meta_path
          * PY_FROZEN -> meta_path
          * IMP_HOOK = importer -> path_hook

Changes to find_module():

* Make it only use PEP 302 resolution.
* Rip out the part that finds the file/directory and make it the importer
  that finds files.  Then have different file types register their file
  extension and a factory function to call to get the importer to return
  (if the proper file type is found): objects for py source, bytecode,
  packages, and extension modules.  (The path_hooks factory function comes
  from the part that just verifies that a sys.path entry is a directory.)

Changes to load_module():

* Rip out the switch and have it only do loaders.
* Change the py source, bytecode, package, and extension module loaders to
  be individual loaders.
* Change the built-in and frozen module loaders to be meta_path loaders.

Sanitizing Built-In Types
-------------------------

Python contains a wealth of built-in types.  These are used at such a basic level that they are easily accessible to any Python code.  They are also shared amongst all interpreters in a Python process.  This means all built-in types need to be made safe (e.g., no mutable shared state) so that they can be used by any and all interpreters in a single Python process.  Several aspects of built-in types need to be examined.

Constructors
++++++++++++

Almost all of Python's built-in types contain a constructor that allows code to create a new instance of a type as long as you have the type itself.  Unfortunately this does not work well in an object-capabilities system without either providing a proxy to the constructor or just removing it when access to such a constructor should be controlled.

The plan is to remove select constructors of the types that are dangerous and either relocate them to an extension module as factory functions or create a new built-in that acts as a generic factory function for all types, missing constructor or not.  The former approach allows the protection to be enforced by an import proxy: just don't allow the extension module to be imported.  The latter approach would allow either a unique constructor per type, or more generic built-in(s) for construction (e.g., introducing a ``construct()`` function that takes a type and any arguments to pass when constructing an instance of that type), with proxies providing the security.

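As a rough illustration of the latter approach (the ``construct()`` idea; the registry and ``register_constructor()`` helper are hypothetical, not a decided API), the real constructors could be held in a registry that only trusted code, or an import-guarded extension module, can reach:

::

    # Hypothetical registry populated by trusted code before the dangerous
    # constructors are removed from the types themselves.
    _constructors = {}

    def register_constructor(type_, factory):
        """Record the real way to build instances of ``type_``."""
        _constructors[type_] = factory

    def construct(type_, *args, **kwargs):
        """Generic factory standing in for removed type constructors."""
        try:
            factory = _constructors[type_]
        except KeyError:
            raise TypeError("construction of %s instances is not allowed"
                            % type_.__name__)
        return factory(*args, **kwargs)

A sandboxed interpreter would see only ``construct()`` (or a proxy wrapping it), never the registry, which in practice would have to live behind the Python/C barrier to keep its references private.
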
Some might consider this unpythonic; Python very rarely separates the constructor of an object from the class/type and requires that you go through a function.  But there is some precedent for not using a type's constructor to get an instance of a type.  The ``file`` type, for instance, typically has its instances created through the ``open()`` function.  This slight shift for certain types to have their (dangerous) constructor not on the type but in a function is considered an acceptable compromise.

Types whose constructors are considered dangerous are:

* ``file``

  + Will definitely use the ``open()`` built-in.

* code objects
* XXX sockets?

Filesystem Information
++++++++++++++++++++++

When running code in a sandboxed interpreter, POLA suggests that you do not want to expose information about your environment on top of protecting its use.  This means that filesystem paths typically should not be exposed.  Unfortunately, Python exposes file paths all over the place that will need to be hidden:

* Modules

  + ``__file__`` attribute

* Code objects

  + ``co_filename`` attribute

* Packages

  + ``__path__`` attribute

* XXX

XXX how to expose safely?  ``path()`` built-in?

Mutable Shared State
++++++++++++++++++++

Because built-in types are shared between interpreters, they cannot expose any mutable shared state.  Unfortunately, as it stands, some do.  Below is a list of types that share some form of dangerous state, how they share it, and how to fix the problem:

* ``object``

  + ``__subclasses__()`` function

    - Remove the function; it is never seen used in real-world code.

Perimeter Defences Between a Created Interpreter and Its Creator
------------------------------------------------------------------

The plan is to allow interpreters to instantiate sandboxed interpreters safely.  By using the creating interpreter's abilities to provide abilities to the created interpreter, you make sure there is no escalation in abilities.

But by creating a sandboxed interpreter and passing code into it, you open up possible ways of getting back to the creating interpreter or escalating privileges.  Those ways are:

* A ``__del__`` method created in the sandboxed interpreter whose object is cleaned up in the unprotected interpreter.

  - XXX Watch out for objects being set in __builtin__.__dict__ and thus not cleaned up until the interpreter object is deleted, and thus possibly executed in the creator's environment!

* Using frames to walk the frame stack back to another interpreter.
* XXX A generator's execution frame?

Making the ``sys`` Module Safe
------------------------------

The ``sys`` module is an odd mix of both information about and settings for the interpreter.  Because of this dichotomy, some very useful but innocuous information is stored in the module along with things that should not be exposed to sandboxed interpreters.  This includes settings that are global to the Python process along with settings that are specific to each interpreter.

This means that the ``sys`` module needs to have its safe information separated out from the unsafe settings.  This will allow an import proxy to let through the safe information but block the ability to set values.  This separation will also require some reworking of the underpinnings of how interpreters are created, as currently Py_NewInterpreter() sets an interpreter's sys module dict to one that is shared by *all* interpreters.

XXX separate modules, ``sys.settings`` and ``sys.info``, or strip ``sys`` to settings and put info somewhere else (interpreter?)?  Or provide a method that will create a faked sys module that has the safe values copied into it?

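As a rough, illustrative sketch of that last option (the ``make_safe_sys`` name and the exact whitelist are assumptions for the example, drawn from the "safe attributes" list below):

::

    import sys
    import types

    # A subset of the safe attributes enumerated below; illustrative only.
    _SAFE_SYS_ATTRS = ('builtin_module_names', 'byteorder', 'copyright',
                       'hexversion', 'maxint', 'maxunicode', 'version',
                       'api_version', 'version_info')

    def make_safe_sys():
        """Build a stripped-down module exposing only whitelisted values."""
        safe = types.ModuleType('sys')
        for name in _SAFE_SYS_ATTRS:
            if hasattr(sys, name):
                setattr(safe, name, getattr(sys, name))
        return safe

Per-interpreter values such as ``modules`` and ``path`` could not simply be copied this way; they would have to be set per interpreter by whatever machinery creates it.
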
The safe attributes are:

* builtin_module_names : Modules/config.c:PyImport_Inittab
  Information about what might be blocked from importation.
* byteorder : Python/sysmodule.c:_PySys_Init()
  Needed for networking.
* copyright : Python/getcopyright.c:Py_GetCopyright()
  Set to a string about the interpreter.
* displayhook() : per-interpreter (Python/sysmodule.c:sys_displayhook()) (?)
* excepthook() : per-interpreter (Python/sysmodule.c:sys_excepthook()) (?)
* exc_info() : per-thread (Python/sysmodule.c:sys_exc_info()) (?)
* exc_clear() : per-thread (Python/sysmodule.c:sys_exc_clear()) (?)
* exit() : per-thread (Python/sysmodule.c:sys_exit())
  Raises SystemExit (XXX make sure this only exits the interpreter if multiple interpreters are running).
* getcheckinterval() : per-process (Python/ceval.c:_Py_CheckInterval)
  Returns an int.
* getdefaultencoding() : per-process (Objects/unicodeobject.c:PyUnicode_GetDefaultEncoding())
  Returns a string about the interpreter.
* getrefcount() : per-object
  Returns an int about the passed-in object.
* getrecursionlimit() : per-process (Python/ceval.c:Py_GetRecursionLimit())
  Returns an int about the interpreter.
* hexversion : Python/sysmodule.c:_PySys_Init()
  Set to an int about the interpreter.
* last_type : Python/pythonrun.c:PyErr_PrintEx()
  (XXX make sure it doesn't return a value from the creating interpreter)
* last_value : Python/pythonrun.c:PyErr_PrintEx()
  (XXX see last_type worry)
* last_traceback : Python/pythonrun.c:PyErr_PrintEx() (?)
* maxint : Objects/intobject.c:PyInt_GetMax()
  Set to an int that exposes ambiguous information about the computer.
* maxunicode : Objects/unicodeobject.c:PyUnicode_GetMax()
  Set to an int about the interpreter.
* meta_path : Python/import.c:_PyImportHooks_Init() (?)
* path_hooks : Python/import.c:_PyImportHooks_Init() (?)
* path_importer_cache : Python/import.c:_PyImportHooks_Init() (?)
* ps1 : Python/pythonrun.c:PyRun_InteractiveLoopFlags()
* ps2 : Python/pythonrun.c:PyRun_InteractiveLoopFlags()
* stdin : Python/sysmodule.c:_PySys_Init()
* stdout : Python/sysmodule.c:_PySys_Init()
* stderr : Python/sysmodule.c:_PySys_Init()
* tracebacklimit : (XXX don't know where it is set) (?)
* version : Python/sysmodule.c:_PySys_Init()
* api_version : Python/sysmodule.c:_PySys_Init()
* version_info : Python/sysmodule.c:_PySys_Init()
* warnoptions : Python/sysmodule.c:_PySys_Init() (?)
  (XXX per-process value)

The dangerous settings are:

* argv : Modules/main.c:Py_Main()
* subversion : Python/sysmodule.c:_PySys_Init()
* _current_frames() : per-thread (Python/sysmodule.c:sys_current_frames())
* __displayhook__ : Python/sysmodule.c:_PySys_Init() (?)
* __excepthook__ : Python/sysmodule.c:_PySys_Init() (?)
* dllhandle : Python/sysmodule.c:_PySys_Init()
* exc_type : Python/ceval.c:(re)?set_exc_info()
  Deprecated since 1.5.
* exc_value : Python/ceval.c:(re)?set_exc_info()
  Deprecated since 1.5.
* exc_traceback : Python/ceval.c:(re)?set_exc_info()
  Deprecated since 1.5.
* exec_prefix : Python/sysmodule.c:_PySys_Init()
  Exposes filesystem information.
* executable : Python/sysmodule.c:_PySys_Init()
  Exposes filesystem information.
* exitfunc : set by user
  Deprecated.
* _getframe() : per-thread (Python/sysmodule.c:sys_getframe())
* getwindowsversion() : per-process (Python/sysmodule.c:sys_getwindowsversion())
  Exposes OS information.
* modules : per-interpreter (Python/pythonrun.c:(Py_InitializeEx() | Py_NewInterpreter()))
* path : per-interpreter (Python/sysmodule.c:PySys_SetPath() called by Py_InitializeEx() and Py_NewInterpreter())
* platform : Python/sysmodule.c:_PySys_Init()
  Exposes OS information.
* prefix : Python/sysmodule.c:_PySys_Init()
  Exposes filesystem information.
* setcheckinterval() : per-process (Python/sysmodule.c:sys_setcheckinterval())
* setdefaultencoding() : per-process (Python/sysmodule.c:sys_setdefaultencoding() using PyUnicode_SetDefaultEncoding())
* setdlopenflags() : per-interpreter (Python/sysmodule.c:sys_setdlopenflags())
* setprofile() : per-thread (Python/sysmodule.c:sys_setprofile() using PyEval_SetProfile())
* setrecursionlimit() : per-process (Python/sysmodule.c:sys_setrecursionlimit() using Py_SetRecursionLimit())
* settrace() : per-thread (Python/sysmodule.c:sys_settrace() using PyEval_SetTrace())
* settscdump() : per-interpreter (Python/sysmodule.c:set_settscdump())
* __stdin__ : Python/sysmodule.c:_PySys_Init()
* __stdout__ : Python/sysmodule.c:_PySys_Init()
* __stderr__ : Python/sysmodule.c:_PySys_Init()
* winver : Python/sysmodule.c:_PySys_Init()
  Exposes OS information.

Protecting I/O
++++++++++++++

The ``print`` keyword and the built-ins ``raw_input()`` and ``input()`` use the values stored in ``sys.stdout`` and ``sys.stdin``.  By exposing these attributes to the creating interpreter, one can set them to safe objects, such as instances of ``StringIO``.

Safe Networking
---------------

XXX proxy on socket module, modify open() to be the constructor, etc.

Protecting Memory Usage
-----------------------

To protect memory, low-level hooks into Python's memory allocator are needed.  By hooking into the C API for memory allocation and deallocation, a *very* rough running count of used memory can be kept.  This count can be compared against a set memory cap to prevent sandboxed interpreters from using so much memory that it impacts the overall performance of the system.

Because this has no direct connection with object-capabilities and no form of exposure at the Python level, this feature can be safely implemented separately from the rest of the security model.

The existing APIs to protect are (as declared in Include/pymem.h and Include/objimpl.h; both header files must be thoroughly checked for public API and macros):

* PyObject_Malloc(), macros, & friends
* PyObject_New(), macros, & friends
* _PyObject_New(), & friends
* PyMem_Malloc(), macros, & friends
* PyObject_GC_New(), & friends

An implementation is in Python's svn repository under the bcannon-sandboxing branch.