VTK/Python Wrapper Enhancement
This project will improve the python wrappers bit-by-bit, with the goal of making the Python interface as close as possible to the original C++ interface. When each piece is finished, it will be marked as "done".
The original author of this document is David Gobbi. He can be reached on the VTK developers mailing list.
- 1 Enums and constants (mostly done as of July 31, 2010, finished on Dec 13, 2014)
- 2 GetTuple/SetTuple (done as of Aug 6, 2010)
- 3 Default value arguments (done as of Sept 17, 2010)
- 4 Reference arg wrapping (done as of Sept 17, 2010)
- 5 Wrapping of multi-dimensional arrays (done as of Sept 17, 2010)
- 6 Operator support (partly done as of May 19, 2011)
- 7 Hierarchies of special types (done as of May 20, 2011)
- 8 Templated type handling (done as of May 31, 2011)
- 9 Wrap Namespaces (partly done as of Nov 21, 2014)
- 10 Eliminate WRAP_SPECIAL
- 11 Wrapping istream and ostream
- 12 Pointer arg wrapping
- 13 Wrap vtkCommand and allow it to be subclassed
- 14 Wrap CallData for observer methods (done as of Feb 18, 2014)
- 15 The using directive (done as of May 15, 2015)
- 16 Python 3 (done as of Aug 31, 2015)
- 17 Improved installation
Enums and constants (mostly done as of July 31, 2010, finished on Dec 13, 2014)
The new vtkParse will parse enums and #define constants.
- The FileInfo struct must be expanded to hold these constants
- Constant values should be stored as strings to simplify typing*
- The wrappers will have to automatically add the class scope to enum constants.
- Type checking for named enums, instead of treating them like "int" (done on Nov 21, 2014)
* The strings can be written literally into the wrapper .cxx files where they will be evaluated as the correct type.
GetTuple/SetTuple (done as of Aug 6, 2010)
The wrappers should make a "special case" for vtkDataArray and wrap the GetTuple/SetTuple methods using the knowledge that the tuple size is equal to the number of components. The same can be done for the subclasses, with GetTupleValue/SetTupleValue. This is a change that could also be easily done for Tcl and Java. (done as of Aug 6, 2010 for Python wrappers).
Default value arguments (done as of Sept 17, 2010)
Default argument value support would require the following:
- vtkParse must store the default value in the FunctionInfo struct as a string*
- vtkWrapPython.c must use these default values to initialize parameter values
- vtkWrapPython.c must place a bar "|" before the default args in the ParseTuple format string
- some other small changes would be needed
* the default value must be stored as a string to accommodate all types and to accommodate simple mathematical expressions, e.g. the default might be SomeMethod(int param = VTK_CONST1 - VTK_CONST2). The string "VTK_CONST1 - VTK_CONST2" can be dropped directly into the wrapper CXX code where it will be evaluated.
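The placement of the bar can be sketched in plain Python. This is only an illustration: build_format and the (format_unit, default) pairs are hypothetical and are not part of vtkWrapPython.c. The units "i" and "d" are standard PyArg_ParseTuple format codes.

```python
def build_format(params):
    """Build a PyArg_ParseTuple-style format string.

    params is a list of (format_unit, default) pairs. default is None
    for a required parameter, or a string holding the C++ default
    expression (kept as text, the way the parser would store it).
    """
    pieces = []
    seen_default = False
    for unit, default in params:
        if default is not None and not seen_default:
            pieces.append("|")  # every unit after the bar is optional
            seen_default = True
        elif default is None and seen_default:
            raise ValueError("required parameter after a defaulted one")
        pieces.append(unit)
    return "".join(pieces)


# Hypothetical signature:
#   SomeMethod(double x, int param = VTK_CONST1 - VTK_CONST2)
signature = [("d", None), ("i", "VTK_CONST1 - VTK_CONST2")]
format_string = build_format(signature)
```

The default expression is never evaluated here. It stays a string, which is what allows it to be pasted into the generated C++ code unchanged.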
Reference arg wrapping (done as of Sept 17, 2010)
This is trivial to add; only a few lines would have to be added to vtkWrapPython. For a reference arg, the user has to pass a container object that supports both the sequence protocol and the number protocol. A new "mutable" object was created for this purpose.
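The idea of a container for a reference arg can be sketched in pure Python. The Mutable class and the divide function below are hypothetical stand-ins, not the wrapper's actual object, and only part of the number protocol is shown.

```python
class Mutable:
    """Hypothetical container for a value that is passed by reference."""

    def __init__(self, value):
        self._value = value

    def get(self):
        return self._value

    def set(self, value):
        self._value = value

    # A small part of the number protocol, so that the container can
    # be used where the plain value would be used.
    def __int__(self):
        return int(self._value)

    def __float__(self):
        return float(self._value)

    def __add__(self, other):
        return self._value + other


def divide(a, b, remainder):
    """Mimic a C++ method like: int Divide(int a, int b, int &remainder)."""
    quotient, rem = divmod(a, b)
    remainder.set(rem)  # write the result back through the container
    return quotient


rem = Mutable(0)
quotient = divide(17, 5, rem)
```

After the call, the caller reads the written-back value with rem.get() or converts the container with int(rem).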
Wrapping of multi-dimensional arrays (done as of Sept 17, 2010)
Python can unpack nested sequences, so reading multi-array args is easy. Writing back to them is a bit more complicated.
Sept 17: reading and writing are done. Reading requires that each element be passed as an arg to PyArg_ParseTuple, which is fine for small arrays but not for large ones. Instead of having PyArg_ParseTuple unpack the elements, a subroutine could be added to vtkPythonUtil.
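A subroutine of the kind suggested above could look roughly like the following pure-Python sketch. The names read_nested and write_nested are hypothetical, and a real implementation would live in the C++ wrapper code.

```python
def read_nested(seq, shape):
    """Flatten a nested sequence into a flat list, checking its shape."""
    if len(seq) != shape[0]:
        raise TypeError("expected a sequence of length %d" % shape[0])
    if len(shape) == 1:
        return list(seq)
    flat = []
    for item in seq:
        flat.extend(read_nested(item, shape[1:]))
    return flat


def write_nested(seq, shape, flat, offset=0):
    """Copy values from a flat list back into a nested mutable sequence."""
    if len(shape) == 1:
        for i in range(shape[0]):
            seq[i] = flat[offset + i]
        return offset + shape[0]
    for item in seq:
        offset = write_nested(item, shape[1:], flat, offset)
    return offset


matrix = [[1.0, 2.0], [3.0, 4.0]]
flat = read_nested(matrix, (2, 2))   # what the C++ method would receive
flat = [2.0 * v for v in flat]       # the C++ method modifies the array
write_nested(matrix, (2, 2), flat)   # results are written back in place
```

Writing back only works when the innermost sequences are mutable, which is why it is the more complicated direction.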
Operator support (partly done as of May 19, 2011)
The new vtkParse provides information about operator methods for the VTK classes. These operator methods can be mirrored in Python by:
- defining the appropriate "protocols" for special type objects
- defining the proper methods e.g. __setitem__() for vtkObjectBase objects
Each VTK special type will have its own python "type" object, and can thus support its own set of protocols that will be inherited by "subtypes". All vtkObjectBase objects have the same python "type" object so protocols cannot be used, but underscore methods can potentially be used.
Operators supported so far:
- the << operator for printing
- the < <= != etc. operators for comparisons
- the "[ ]" operator for indexing
Operators that could be supported in the future:
- the "[ ]" operator for mapping
- the "( )" operator for callable objects
- arithmetic operators (would require a lot of tedious work)
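On the Python side, the supported operators correspond to special methods. The sketch below uses a hypothetical Tuple3 class in place of a wrapped special type. The use of functools.total_ordering is a shortcut for this illustration, whereas generated wrappers would map each C++ comparison operator individually.

```python
from functools import total_ordering


@total_ordering
class Tuple3:
    """Hypothetical stand-in for a wrapped special type."""

    def __init__(self, x, y, z):
        self._data = [x, y, z]

    def __str__(self):
        # Plays the role of the C++ operator<< used for printing.
        return "(%g, %g, %g)" % tuple(self._data)

    def __eq__(self, other):
        # Mirrors operator== (and, through Python, operator!=).
        return self._data == other._data

    def __lt__(self, other):
        # Mirrors operator<; total_ordering derives <=, > and >=.
        return self._data < other._data

    def __getitem__(self, index):
        # Mirrors a const operator[] used for indexing.
        return self._data[index]

    def __setitem__(self, index, value):
        # Mirrors a non-const operator[] that returns a reference.
        self._data[index] = value


a = Tuple3(1, 2, 3)
b = Tuple3(1, 2, 4)
text = str(a)
```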
Hierarchies of special types (done as of May 20, 2011)
If each non-vtkObjectBase special type had its own PyTypeObject struct (generated by vtkWrapPython.c) then:
- These types could have a hierarchy via python's subclass system
- Type-specific protocols (number, sequence, buffer, etc.) could be supported, although this would require proper parsing of operators
Also: see VTK/WrapHierarchy for a tool that provides the entire hierarchy to the wrappers at compile-time. Note that the WrapHierarchy tool is of critical importance... without it, the wrappers cannot tell if a "vtkSomething *" parameter is a pointer to a vtkObjectBase object or to a special VTK type.
Templated type handling (done as of May 31, 2011)
Templated types should be made to look similar to C++, but with square brackets instead of angle brackets. E.g. vtkValue['float32', 3]( ) would create a vtkValue<float, 3>( ). In Python, the specialized types would be stored in a dictionary. There is already template support in vtkParse, so all the information about the templates is available to vtkWrapPython... it is just a matter of instantiating and wrapping the class templates.
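The dictionary of specialized types can be sketched in pure Python. TemplateDict and make_value_type are hypothetical helpers, and vtkValue is only the example name used above. In the real wrappers each entry would be a wrapped C++ instantiation rather than a Python class.

```python
class TemplateDict(dict):
    """Hypothetical dictionary that maps template args to wrapped types."""

    def __init__(self, name):
        super().__init__()
        self.name = name

    def __getitem__(self, key):
        # vtkValue['float32', 3] arrives here as the tuple ('float32', 3).
        if not isinstance(key, tuple):
            key = (key,)
        try:
            return super().__getitem__(key)
        except KeyError:
            message = "%s has no specialization for %r" % (self.name, key)
            raise KeyError(message) from None


def make_value_type(type_name, size):
    """Create a placeholder class for one template instantiation."""

    class Value:
        def __init__(self):
            self.components = [0] * size

    Value.__name__ = "vtkValue_%s_%d" % (type_name, size)
    return Value


vtkValue = TemplateDict("vtkValue")
for type_name in ("float32", "float64"):
    for size in (2, 3):
        vtkValue[(type_name, size)] = make_value_type(type_name, size)

v = vtkValue['float32', 3]()
```

Only the specializations that were instantiated ahead of time are available, so a lookup for any other combination raises KeyError.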
Wrap Namespaces (partly done as of Nov 21, 2014)
There are two ways that namespaces could be handled:
- as modules, with all items in the namespace placed within the dict of the module (this option was chosen)
- as class objects, with all items as attributes of the class (functions would be static methods of the class)
Only constants and enum types within the namespace are wrapped. Functions and classes in the namespace are not yet wrapped.
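The module option can be illustrated with the standard types.ModuleType class. The namespace name vtkExample and its contents are hypothetical, and the real wrappers create the module from C++ rather than from Python.

```python
import types


def make_namespace_module(name, constants):
    """Create a module whose dict holds the items of a C++ namespace."""
    module = types.ModuleType(name)
    module.__dict__.update(constants)
    return module


# Hypothetical C++ source:
#   namespace vtkExample {
#     const int MAX_ITEMS = 16;
#     enum Mode { FAST, SLOW };
#   }
vtkExample = make_namespace_module(
    "vtkExample", {"MAX_ITEMS": 16, "FAST": 0, "SLOW": 1})

max_items = vtkExample.MAX_ITEMS
```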
Eliminate WRAP_SPECIAL
The WRAP_SPECIAL (and WRAP_EXCLUDE) indicators in CMakeLists.txt are a pain to maintain. It would be possible to just let the python wrappers attempt to wrap everything, and if any types turn up as "unwrappable" they could be wrapped as opaque pointers. That way they could still be passed back and forth between fully-wrapped objects. The only caveat is that some classes in VTK are incomplete, i.e. missing method definitions in their .cxx files, but this would be a good opportunity to fix such classes, or place them in an "#ifndef __WRAP__" block.
Wrapping istream and ostream
The wrapper parser already identifies input and output streams as their own types. It would be straightforward to wrap these as Python file objects.
Pointer arg wrapping
Pointer arguments (as opposed to array arguments) are used for the following in VTK:
- passing data arrays, e.g. vtkIntArray::SetArray(int *data, vtkIdType size, int save)
- 'tuples' where the tuple size can change, e.g. vtkDataArray::SetTuple(double *tuple)
- return slots, e.g. vtkVariant::ToInt(bool *valid)
For the first two cases, there is an analogous situation for return values:
- int *vtkIntArray::GetPointer(vtkIdType offset)
The closest thing that Python has to a "pointer" is its buffer objects, such as array, string, and numeric array. The problem is that a python buffer always requires a size argument, but C++ rarely provides any hints about the size of the data object that a pointer is pointing to. Some heuristics would have to be applied:
- the wrappers can be made to look for vtkDataArray and properly handle their pointer methods
- other methods will need some sort of hinting.
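The size problem can be illustrated with the standard array and memoryview types. In this sketch the flat array stands in for the memory of a data array, and the hypothetical helper get_tuple_view uses the number of components as the size that the C++ pointer does not carry.

```python
from array import array


def get_tuple_view(buffer, index, num_components):
    """Return a writable view of one tuple inside a flat buffer."""
    view = memoryview(buffer)
    start = index * num_components
    if start < 0 or start + num_components > len(view):
        raise IndexError("tuple index out of range")
    # The slice shares memory with the buffer; nothing is copied.
    return view[start:start + num_components]


# Two tuples of three components each, stored contiguously.
data = array("d", [0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
second = get_tuple_view(data, 1, 3)
second[0] = 30.0  # writes through to the underlying array
```

Without the component count, the helper could not tell where the tuple ends, which is the situation the wrappers face for pointer methods outside of vtkDataArray.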
The "count" for pointer args should be hinted so that they can be properly wrapped. E.g.
- vtkVariant::ToInt(bool *vtkSingleValue(valid))
- vtkVariant::ToInt(bool *vtkOptionalSingleValue(valid)) - can be safely set to NULL
- vtkDataArray::SetTuple(double *vtkMultiValue(tuple, ->GetNumberOfComponents()))
In the last example, the name of the method to get the count is supplied in the hint. Recognizing these macros in vtkParse would be easy. Unfortunately, they make the C++ code very ugly.
Wrap vtkCommand and allow it to be subclassed
Right now, vtkCommand() cannot be used from python because it is an abstract class, and abstract VTK classes cannot be subclassed in Python. Even without vtkCommand, the VTK Command/Observer features can still be used in Python because the vtkObject's AddObserver method can take any python method as an argument, and the wrappers internally convert that python method into a vtkPythonCommand. Unfortunately, though, some flexibility is lost because these vtkCommand methods are lost:
I have done some work to remedy this, but the work is not yet complete. So far, I have:
- changed the CMake files so that vtkCommand is wrapped (but it is abstract and cannot be instantiated)
- made it possible to subclass vtkCommand in python (but the subclasses are abstract)
To make everything work, vtkCommand subclasses in python must actually be subclassed from vtkPythonCommand (which is concrete), and vtkPythonCommand must provide virtual function hooks so that Execute can be overridden as a virtual method. This is only possible if vtkPythonCommand is provided with a "PyObject *" slot for its pythonic other half, so that when Execute is called it can search the python dict to see if the method has been overridden.
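The intended behaviour can be mocked up in pure Python. PythonCommand below is a hypothetical stand-in for vtkPythonCommand, and invoke plays the part of the C++ code that dispatches an event. The real mechanism would perform the override check from C++ through the stored "PyObject *".

```python
class PythonCommand:
    """Hypothetical concrete base class standing in for vtkPythonCommand."""

    def __init__(self, callback=None):
        self._callback = callback

    def Execute(self, caller, event, call_data=None):
        # Default behaviour: forward to the stored callable, if any.
        if self._callback is not None:
            self._callback(caller, event)

    def is_overridden(self):
        """Report whether a subclass supplies its own Execute."""
        return type(self).Execute is not PythonCommand.Execute


class LoggingCommand(PythonCommand):
    """A subclass written by the user, overriding Execute."""

    def __init__(self):
        super().__init__()
        self.events = []

    def Execute(self, caller, event, call_data=None):
        self.events.append(event)


def invoke(command, caller, event):
    """Play the role of the C++ side dispatching an event."""
    command.Execute(caller, event)


command = LoggingCommand()
invoke(command, None, "ModifiedEvent")
```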
Wrap CallData for observer methods (done as of Feb 18, 2014)
A method that is passed to vtkObject.AddObserver() in python takes two args (O, s) where "O" is the observed object, and "s" is a string that gives the event type. The corresponding C++ method takes a third object: a void pointer called "CallData" that contains extra information about the event. The CallData is usually NULL, but sometimes contains useful information such as an error message, a pointer to a vtkObject, or a pointer to a numeric value. The python observer methods should be made to take an optional third argument, which will be the CallData automatically resolved to the correct type. It will be tricky to achieve this in a backwards-compatible manner because there is a lot of existing code that will break if passed a third argument, but judicious error detection within vtkPythonCommand.cxx can work around this by attempting to call the method with three parameters first, and then retrying with two parameters if a TypeError occurred with the usual parameter-count error text and if the traceback is exactly one level deep.
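The retry strategy can be sketched in Python, although the real logic would be written in C++ inside vtkPythonCommand.cxx. The function call_observer is hypothetical, and this simplified version checks only the traceback depth, not the error text.

```python
def call_observer(callback, caller, event, call_data):
    """Call with three args first, then fall back to two args."""
    try:
        return callback(caller, event, call_data)
    except TypeError as exc:
        # A traceback that is one level deep means the error came from
        # the call itself, before any code in the callback ran. A
        # deeper traceback is a genuine error inside the callback.
        if exc.__traceback__.tb_next is not None:
            raise
    return callback(caller, event)


calls = []


def old_style(obj, event):
    calls.append(("old", event))


def new_style(obj, event, data):
    calls.append(("new", event, data))


def broken(obj, event, data):
    return len(5)  # raises a TypeError from inside the callback


call_observer(old_style, None, "ModifiedEvent", None)
call_observer(new_style, None, "ErrorEvent", "message")

try:
    call_observer(broken, None, "ErrorEvent", "message")
    propagated = False
except TypeError:
    propagated = True
```

The traceback check is what keeps a TypeError raised inside a three-argument callback from being mistaken for a parameter-count mismatch.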
The using directive (done as of May 15, 2015)
In C++, when a class overrides a superclass method, then all superclass signatures of that method will be shadowed. In order to avoid this, the "using" directive can be used to bring them into the subclass namespace:
    using Superclass::SetColor;   // bring in e.g. SetColor(r,g,b)
    void SetColor(double color);  // override SetColor(color)
Exactly the same shadowing occurs in python, and because the wrappers ignore the "using" directive, the above code does not fix the shadowing problem for the wrappers. In order for the wrappers to apply the "using" directive, the following must be done:
- When vtkWrapPython parses a header, it must recursively parse superclass headers
- If "using" is encountered, items should be brought in from the superclass namespace
- For recursive parsing, see vtkParse_SetRecursive() in vtkParse.y and preprocessor_directive() in vtkParse.l
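The shadowing itself is easy to reproduce in pure Python. The class names below are hypothetical, and because Python has no overloading, the "using" behaviour is only emulated here by falling back to the superclass method. The generated wrappers would instead merge the superclass signatures into the wrapped method.

```python
class Actor:
    def SetColor(self, r, g, b):
        self.color = (r, g, b)


class GrayActor(Actor):
    def SetColor(self, gray):
        # This definition shadows Actor.SetColor(r, g, b).
        self.color = (gray, gray, gray)


class FixedGrayActor(Actor):
    def SetColor(self, *args):
        # Emulates "using Superclass::SetColor": signatures that are
        # not handled here are passed on to the superclass.
        if len(args) == 1:
            gray = args[0]
            self.color = (gray, gray, gray)
        else:
            super().SetColor(*args)


try:
    GrayActor().SetColor(1.0, 0.0, 0.0)
    shadowed = False
except TypeError:
    shadowed = True  # the three-argument signature is no longer reachable

actor = FixedGrayActor()
actor.SetColor(1.0, 0.0, 0.0)
rgb = actor.color
actor.SetColor(0.5)
gray = actor.color
```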
Python 3 (done as of Aug 31, 2015)
Python 3 introduced major API changes for strings and ints, and minor API changes elsewhere.
- It will be possible to support Python 2.3 through Python 3 (done: supports 2.5, 2.6, 2.7 and 3.2+)
- Older versions of python will have to be dropped (Python 2.2 is a grey area)
- "Classic" classes are gone in Python 3, so PyVTKObject might not work anymore
- PyIntObject and PyStringObject are absent in Python 3
- Python 3 uses unicode for all strings (ramifications for VTK?): the Python wrappers assume VTK uses utf-8
- A new, multi-dimensional buffer interface exists (plus memoryview): memoryview can be used with vtkDataArray
- Small changes to PyTypeObject (done)
- Language changes: some examples will have to be rewritten (done)
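Two of the points above, utf-8 strings and the multi-dimensional buffer interface, can be tried out with the Python 3 standard library alone. In this sketch an array.array stands in for the memory of a data array, so no VTK classes are involved.

```python
from array import array

# Python 3 strings are unicode; utf-8 bytes are what a C++ API taking
# "const char *" would receive under the utf-8 assumption.
name = "Größe"
encoded = name.encode("utf-8")

# Six doubles viewed as two tuples of three components each.
values = array("d", [0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
flat = memoryview(values)

# A 1-D view must be cast to bytes before it can be cast to a new shape.
grid = flat.cast("B").cast("d", shape=[2, 3])
grid_list = grid.tolist()

# Both views share memory with the array, so this write shows up in grid.
flat[4] = 40.0
```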
Improved installation
There has been interest in improving the installation of the vtk python modules, including:
- Installing the modules within the existing python path (can vary from system to system, sometimes impossible for user installs, must be overridden for embedding e.g. ParaView)
- Creating a wheel (whl) binary installer for use with pip
The sysconfig module for Python 2.7 and 3.2+ can be useful here (it contains tons of info, run "python -m sysconfig" for a dump).
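As a small illustration of what sysconfig offers, the snippet below looks up the directories where python modules are normally installed. Which of these an installer should use is left open here, and the EXT_SUFFIX variable may be unset on some builds.

```python
import sysconfig

# Installation directories for the running interpreter.
paths = sysconfig.get_paths()
pure_python_dir = paths["purelib"]   # modules written in pure python
extension_dir = paths["platlib"]     # compiled extension modules

# File name suffix for compiled extension modules, if it is defined.
ext_suffix = sysconfig.get_config_var("EXT_SUFFIX")
```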