VTK/Tutorials/New Pipeline

From KitwarePublic
< VTK‎ | Tutorials
Jump to: navigation, search


This section will introduce the classes and concepts used in the new VTK pipeline. You should be familiar with the very basic design of the old pipeline before delving into this. The new pipeline was designed to reduce complexity while at the same time provide more flexibility. In the old pipeline the pipeline functionality and mechanics were contained in the data object and filters. In the new pipeline this functionality is contained in a new class (and its subclasses) called vtkExecutive.

Introduction

There are four key classes that make up the new pipeline. They are:

provides the flexibility to grow. Most of the methods and meta information storage make use of this class. vtkInformation is a map-based data structure that supports heterogeneous key-value operations with compile time type checking. There is also a vtkInformationVector class for storing vectors of information objects. When passing information up or down the pipeline (or from the executive to the algorithm) this is the class to use.

in the past this class both stored data and handled some of the pipeline logic. In the new pipeline this class is only supposed to store data. In practice there are some pipeline methods in the class for backwards compatibility so that all of VTK doesn’t break, but the goal is that vtkDataObject should only be about storing data. vtkDataObject has an instance of vtkInformation that can be used to store key-value pairs in. For example the current extent of the data object is stored in there but the whole extent is not, because that is a pipeline attribute containing information about a specific pipeline topology .

an algorithm is the new superclass for all filters/sources/sinks in VTK. It is basically the replacement for vtkSource. Like vtkDataObject, vtkAlgorithm should know nothing about the pipeline and should only be an algorithm/function/filter. Call it with the correct arguments and it will produce results. It also has a vtkInformation instance that describes the properties of the algorithm and it has information objects that describe its input and output port characteristics. The main method of an algorithm is ProcessRequest.

contains the logic of how to connect and execute a pipeline. This class is the superclass of all executives. Executives are distributed (as opposed to centralized) and each filter/algorithm has its own executive that communicates with other executives. vtkExecutive has a subclass called vtkDemandDrivenPipeline which in turn has a subclass called vtkStreamingDemandDrivenPipeline. vtkStreamingDemandDrivenPipeline provides most of the functionality that was found in the old VTK pipeline and is the default executive for all algorithms if you do not specify one.


Let us first look at the vtkAlgorithm class. It may seem odd that a class with no notion of a pipeline has one key method called ProcessRequest. At its simplest, an algorithm has a basic function to take input data and produce output data. This is a down-stream request (specifically REQUEST_DATA) that all algorithms should implement. Requests for information or data flowing from input to output are called downstream requests. Requests for information or data flowing from output to input are called upstream requests. But algorithms can do more than just produce data; they also have characteristics or metadata that they can provide. For example, an algorithm can provide information about what type of output it will produce when you execute it. An imaging algorithm might only be capable of producing double results. The algorithm can specify this by responding to another down-stream request called REQUEST_INFORMATION. Consider the following code fragment:

        int vtkMyAlgorithm::ProcessRequest(
           vtkInformation       *request,
           vtkInformationVector **inputVector,
           vtkInformationVector *outputVector)
        {
           // generate the data
           if(request->Has(vtkDemandDrivenPipeline::REQUEST_INFORMATION()))
              {
              // specify that the output (only one for this filter) will be double
              vtkInformation* outInfo = outputVector->GetInformationObject(0);
              outInfo->Set(vtkDataObject::SCALAR_TYPE(),VTK_DOUBLE);
              return 1;
              }
           return this->Superclass::ProcessRequest(request, inputVector,outputVector);
        }

This method takes three information objects as input. The first is the request which specifies what you are asking the algorithm to do. Typically this is just one key such as REQUEST_INFORMATION. The next two arguments are information vectors: one for the inputs to this algorithm and one for the outputs of this algorithm. In the above example no input information was used. The output information vector was used to get the information object associated with the first output of this algorithm. Into that information was placed a key-value pair specifying that it would produce results of type double. Any requests that the algorithm doesn’t handle should be forwarded to the superclass.

The pipeline topology in the new pipeline is a little different from the old one. In the new pipeline you connect the output port of one algorithm to the input port of another algorithm. For example,

        alg1->SetInputConnection( inPort, alg2->GetOutputPort( outPort ))

Using this terminology a port is like a pin on an integrated circuit. An algorithm has some input ports and some output ports. A connection is a “connection” between two ports. So to connect two algorithms you make a connection between the output port of one algorithm and the input port of another algorithm. Any input port in the new pipeline can be specified as 'repeatable' and or 'optional':

  1. Repeatable means that more than one connection can be made to this input port (such as for append filters).
  2. Optional means that the input is not required for execution.

While ports can be repeatable there is still a need for multiple ports. Your algorithm should have multiple ports when the concept, data type, or semantics of a port are different. So AppendFilter only needs one repeatable input port because it treats all of its inputs the same. Glyph in contrast should have two ports, one for the input points and one for the glyph model because these are two distinct concepts. Of course the old style of SetInput, GetOutput connections will work with existing algorithms as well.

So in the new pipeline outputs are referred to by port number while inputs are referred to by both their port number and connection number (because a single input port can have more than one connection)

Typical Pipeline Execution

Let us take a quick look at the typical execution of a pipeline in VTK.

  • First the use instantiates an Algorithm such as vtkImageGradient, when the algorithm is instantiated it will automatically create a default executive and connect the algorithm and executive together. The algorithm will also call FillInputPortInformation and FillOutputPortInformation on each input and output port respectively. These two methods should setup the static characteristics of the input and output ports such as data type requirements and whether the port is optional or repeatable. For example, by default all subclasses of vtkImageAlgorithm are assumed to take an vtkImageData as input:


        int vtkImageAlgorithm::FillInputPortInformation(int vtkNotUsed(port), vtkInformation*info)
        {
          info->Set(vtkAlgorithm::INPUT_REQUIRED_DATA_TYPE(), "vtkImageData");
          return 1;
        }


This can be overridden by subclasses as required.

  • Once the algorithm and executive are instantiated they will be connected to other algorithm executive pairs. If the new SetInputConnection signature is used (and it should be used) then this just stores the connectivity information. The old VTK pipeline used the data objects to store connectivity and thus required that the data objects be instantiated prior to calling GetOutput.
  • Once the entire pipeline is instantiated and connected it will be executed (typically as the result of a Render() call) Typically the first request an algorithm will see is upon execution is REQUEST_DATA_OBJECT (see vtkExecutive, vtkDataObject and their subclasses for all the possible keys and requests.) This request asks the algorithm to create an instance of vtkDataObject (or appropriate subclass) for all of its output ports. If you handle this request you can set the output ports output data to be whatever type you want (see vtkDataSetAlgorithm for an example), if the algorithm does not handle the request then by default the executive will look at the output port information to see what type the output port has been set to (from FillOutputPortInformation) If this is a concrete type then it will instantiate an instance of that type and use it. If it isn’t concrete then (I can’t remember if it just fails with an error or has another fallback position)

Request Information

At this point the pipeline is instantiated, connected, and the data objects have been instantiated to store the data. The next request an algorithm will typically see is REQUEST_INFORMATION. This request asks the Algorithm to provide as much information as it can about what the output data will look like once the algorithm has generated it. Typically an algorithm will look at the information provided about its inputs and try to specify what it can about its outputs. In many image processing filters quite a bit can be specified such as the WHOLE_EXTENT, SCALAR_TYPE, SCALAR_NUMBER_OF_COMPONENTS, ORIGIN, SPACING, etc. The rule here is to provide or compute as much information as you can without actually executing (or reading in the entire data file) and without taking up significant CPU time. For example an image reader should read the header information from the file to get what information it can out of it, but it should not read in the entire image so that it can compute the scalar range of the data. When providing information about an output, an algorithm is not limited to the current information keys (such as WHOLE_EXTENT) that are provided by VTK. Part of the new pipeline design is that it can be easily extended. You can define your own keys and then in the REQUEST_INFORMATION request you can add those keys and their values to the output information objects. You can also specify that you want those keys to be passed down (or up) the pipeline by adding them to the KEYS_TO_COPY. That way you could have a specialized reader that populates the information with special keys and then have a writer or mapper downstream that uses those keys.

By default executives will first copy the input’s information to the output’s information. You only need to handle the cases where it is different.

Request Update Extent

The next request is typically REQUEST_UPDATE_EXTENT. To fulfill this request an algorithm should take the update extents in its output’s information and then set the correct update extents in its input’s information. As with REQUEST_INFORMATION there is a default behavior by the executive and you only need to handle cases where it changes. (see vtkImageGradient for an example)

Request Data

Finally REQUEST_DATA will be called and the algorithm should fill in the output ports data objects.

Executives

The class hierarchy for the mainstream VTK executives is as follows.

This is a graph with borders and nodes. Maybe there is an Imagemap used so the nodes may be linking to some Pages.

Below, we provide some information about these executives that is not necessarily found in VTK User's Guide. If you are not familiar with VTK's pipeline execution model, you should read the Managing Pipeline Execution chapter in the book.

vtkExecutive

This class is the superclass for all executives. It is pretty abstract and provides little functionality. Important ones:

  • An executive has an algorithm
  • An executive manages (i.e. has) the input and output information objects of the algorithm. This means that the pipeline graph is stored by a set of executives not algorithms

The most important function is vtkExecutive is ProcessRequest(). This method does two things:

  1. Forward the request upstream (if the FORWARD_DIRECTION is vtkExecutive::RequestUpstream) or downstream (if the FORWARD_DIRECTION is vtkExecutive::RequestDownstream (note: this is not implemented).
  2. Pass the request to the algorithm by calling CallAlgorithm() which calls ProcessRequest() on the algorithm. This can happen before and/or after the request is forwarding depending on whether ALGORITHM_BEFORE_FORWARD and/or ALGORITHM_AFTER_FORWARD is set.

CallAlgorithm() calls CopyDefaultInformation() before passing the request to the algorithm. The goal of this function is to copy certain information (such as update requests or meta-information) from output to input (when the algorithm is invoked before forwarding - for example, in REQUEST_UPDATE_EXTENT) or from input to output (when the algorithm is invoked after forwarding - for example in REQUEST_INFORMATION).

Note on centralized executives: This implementation does not allow us to use centralized executives (one executive that manages more than one algorithm) because the executive has an algorithm AND the executive stores the pipeline graph. However, it is possible to create a meta-executive (an executive to rule them all) that is centralized. This executive would have to manage the flow of information itself. This can be done by subclassing the executive class to disable forwarding. The centralized executive can the delegate the handling of each pass to the distributed executives but manage forwarding itself.

vtkDemandDrivenPipeline

This executive implements a demand-driven (pull) pipeline. It recognizes 3 passes in ProcessRequest():

  1. REQUEST_DATA_OBJECT: This is where the algorithm is supposed to create its output data objects. It is a ALGORITHM_AFTER_FORWARD pass. After forwarding the request upstream, the executive calls ExecuteDataObject() (a virtual member function). This first calls CallAlgorithm() and then CheckDataObject(). If CallAlgorithm() does not create output data objects, CheckDataObject() tries to create them based on a vtkDataObject::DATA_TYPE_NAME defined in the output port information.
  2. REQUEST_INFORMATION: This is where the algorithm is supposed to provide meta-data. It is a ALGORITHM_AFTER_FORWARD pass. After forwarding the request upstream, the executive calls ExecuteInformation() (a virtual member function). Note: For backwards compatibility purposes, ExecuteInformation() calls CopyInformationToPipeline() on the output data object. In the old pipeline, the meta-information was provided by setting it on the output data object. This copies such meta-information from the data objects to the output information.
  3. REQUEST_DATA: This is where the algorithm is supposed to provide (heavy) data. It is a ALGORITHM_AFTER_FORWARD pass. After forwarding the request upstream, the executive calls ExecuteData() (a virtual member function). ExecuteData() calls ExecuteDataStart(), CallAlgorithm() and ExecuteDataEnd(). ExecuteDataStart() takes case of initializing the outputs whereas ExecuteDataEnd() performs finalization such as marking the outputs as generated by calling DataHasBeenGenerated().

Note that all of these passes make sure to skip execution if the required information was already generated and if nothing upstream changed.

vtkStreamingDemandDrivenPipeline

This executive adds support for streaming to vtkDemandDrivenPipeline. Streaming is usually performed by processing a subset of a dataset in each invocation of the pipeline and accumulating the results. Currently, vtkStreamingDemandDrivenPipeline supports 3 ways of subsetting the data:

  1. Extents: Extents are only applicable to structured datasets. An extent is a logical subset of a structured dataset defined by providing min and max IJK values (VOI).
  2. Pieces: Pieces are only applicable to unstructured datasets. A piece is a subset of an unstructured dataset. What a "piece" means is determined by the producer of such dataset.
  3. Time steps: The pipeline can request a particular time step at each invocation for time series data.

vtkStreamingDemandDrivenPipeline adds 2 new pipeline passes to support streaming:

  1. REQUEST_UPDATE_EXTENT: This pass is where the consumer asks for a particular subset from its input. It is a ALGORITHM_BEFORE_FORWARD pass. This is where algorithms receive extent/piece and time step request from their consumer and copy or modify this request upstream. In this pass, if the algorithms produces unstructured data but consumes structured data, the executive uses an extent translator to automatically convert the piece request to an extent request.
  2. REQUEST_UPDATE_EXTENT_INFORMATION: This optional pass is where meta-information specific to the particular subset is requested. It is a ALGORITHM_AFTER_FORWARD pass. This pass is commonly used for dynamic streaming where the consumer fetches meta-information such as bounds or scalar range for a particular piece before deciding whether to update it.

vtkCompositeDataPipeline

This executive adds support for iterating over multiple blocks and/or time steps. This is best described using an example pipeline. For example,

This is a graph with borders and nodes. Maybe there is an Imagemap used so the nodes may be linking to some Pages.

Here, the Ensight reader always produces a multi-block dataset whereas the contour filter can only handle vtkDataSet and subclasses. As a result, this pipeline would produce a run-time error if the executive is vtkStreamingDemandDrivenPipeline. vtkCompositeDataPipeline deals with this issue by looping over the leaf nodes of the multi-block dataset and performing a full pipeline invocation of the contour filter for each block. Similarly, when the pipeline is something like the following

This is a graph with borders and nodes. Maybe there is an Imagemap used so the nodes may be linking to some Pages.

the executive knows to invoke the reader for 2 time steps (because the particle tracer always need 2 time steps for interpolation) and gather the result in a vtkTemporalDataSet.

vtkCompositeDataPipeline does all of this by overriding a significant portion of the execution mechanism when it needs to iterate over blocks and/or time steps.

Converting an Existing Filter to the New Pipeline

The best approach at this point is to find a VTK filter similar to your filter and then copy that. An alternate approach is to follow the instructions below. The new pipeline implementation does include a backwards compatibility layer for old filters. Specifically vtkProcessObject, vtkSource, and their related subclasses are still present and working. Most filters should work with the new pipeline without any changes. The most common problems with the backwards compatibility layer involve filters that manipulate the pipeline. If your filter overrides UpdateData or UpdateInformation you will probably have to make some changes. If your filter uses an internal pipeline then you might need to make some changes, otherwise you should be OK. Now ideally you would convert your filter to the new pipeline. There is a script that you can run that will help you to convert your filter. The script doesn’t do everything but it will help get you going in the right direction. You can run the script on an existing class as follows:

       cd VTK/MYClasses
       cmake –D CLASS=vtkMyClass –P ../Utilities/Upgrading/NewPipeConvert.cmake

One of the effects this script might have is to change the superclass of your class. There are some convenience superclasses to make writing algorithms a little easier. In the old pipeline there were classes such as vtkImageToImageFilter. Some of the classes designed for the new pipeline include:

       vtkPolyDataAlgorithm: for algorithms that produce vtkPolyData
       vtkImageAlgorithm: for algorithms that take and or produce vtkImageData
       vtkThreadedImageAlgorithm: a subclass of vtkImageAlgorithm that implements

multithreading These classes have some defaults that can be easily changed in your subclass. The first default is that the subclass will take on input and produce one output. This is typically specified in the constructor using SetNumberOfInputPorts(1) and SetNumberOfOutputPorts(1). If your subclass doesn’t take an input then in its constructor just call SetNumberOfInputPorts(0). Another assumption that is made is that all the input port and output ports take vtkPolyData (for vtkPolyDataAlgorithm, vtkImageData for vtkImageAlgorithm etc). Again your subclass can override this by providing its own implementation of FillInputPortInformation or FillOutputPortInformation. These superclasses typically provide an implementation of ProcessRequest that handles REQUEST_INFORMATION and REQUEST_DATA request by invoking virtual functions called RequestData and RequestInformation. They also typically provide default implementations of RequestData that call the older style ExecuteData functions to make converting your old filters easier.