VTK XML Formats

This page briefly documents details of the VTK XML file formats to help those attempting to create home-grown writers. It is not intended as a complete or authoritative document. See the VTK User's Guide or the VTK file formats documentation for more information about the format. We encourage developers to use the official writers through the C++, C, or Fortran interfaces provided by VTK instead of relying on the information in this document.

The VTKFile Element

The top-level document element of any VTK XML file is called "VTKFile":

 <VTKFile type="..." version="version" byte_order="byte-order" ...>
   ...rest of file...
 </VTKFile>

The version number is specific to each type of data set represented in the file. For normal (non-composite) data set types, the possible version numbers are:

  • 0.1: Readable by VTK >= 4
  • 1.0: Readable by VTK >= 6

The byte-order specified may be either "LittleEndian" or "BigEndian" and indicates the byte order used for any binary data in the file.

File format version 1.0 supports an additional VTKFile attribute:

 <VTKFile ... version="1.0" ... header_type="header-type">

The header-type specified may be either "UInt32" or "UInt64" and indicates the integer type used in binary data headers (see below). If no such attribute is specified (as in version 0.1) the header type is UInt32.
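
As a rough illustration of how a home-grown reader or writer might act on these attributes, the sketch below maps the byte_order and header_type values onto format codes for Python's standard struct module. The function name header_format and the two lookup tables are hypothetical and not part of VTK.

```python
import struct

# Attribute values from the VTKFile element mapped to struct format codes.
_BYTE_ORDER = {"LittleEndian": "<", "BigEndian": ">"}
_HEADER_TYPE = {"UInt32": "I", "UInt64": "Q"}


def header_format(byte_order, header_type=None):
    """Return the struct format for one binary-header integer.

    header_type=None models a file without a header_type attribute
    (as in version 0.1), which falls back to UInt32.
    """
    return _BYTE_ORDER[byte_order] + _HEADER_TYPE[header_type or "UInt32"]


fmt = header_format("LittleEndian")            # "<I"
fmt64 = header_format("BigEndian", "UInt64")   # ">Q"
print(struct.calcsize(fmt), struct.calcsize(fmt64))  # prints "4 8"
```

An unknown attribute value raises KeyError here, which is one simple way to reject files this sketch does not understand.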

Appended Data Section

The appended data section is stored in an XML element just before the end of the file. A file with an AppendedData element has this form:

 <VTKFile ...>
   ...
   <AppendedData encoding="raw">
     _...[DATA]...
   </AppendedData>
 </VTKFile>

Note the literal underscore ('_') at the beginning of the data. This character separates the whitespace to its left from the data to its right. Extra whitespace after the data has no effect. The contents of the AppendedData section can be base-64 encoded to produce a fully valid XML file, but may also be left raw. In this document we assume the raw encoding because it is simpler.

DataArray elements elsewhere in the file reference the AppendedData section like this:

 <DataArray ... format="appended" offset="0"/>

The value in the "offset" attribute is the file offset in bytes beyond the leading underscore. An offset of "0" means the first character after the underscore. Each DataArray's data are stored in a contiguous block. The block can be either compressed or uncompressed, but by default it is uncompressed (compression is global to the file and marked by an attribute on the VTKFile element).
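
The offset rule can be sketched as follows, assuming the whole file has been read into memory as bytes and that the first underscore after the AppendedData start tag is the separator. The helper name appended_data_start is hypothetical, and the sample bytes are made up for illustration.

```python
def appended_data_start(raw):
    """Return the index of the first byte after the leading underscore.

    `raw` is the complete file content as bytes. A DataArray with
    offset="k" then begins at appended_data_start(raw) + k.
    """
    tag = raw.find(b"<AppendedData")
    if tag < 0:
        raise ValueError("no AppendedData element found")
    tag_end = raw.index(b">", tag)          # end of the start tag
    underscore = raw.index(b"_", tag_end)   # separator before the data
    return underscore + 1


# Tiny stand-in file; the "data" is plain ASCII so it is easy to inspect.
sample = (b'<VTKFile type="ImageData">\n'
          b'<AppendedData encoding="raw">\n'
          b'  _ABCDEF\n'
          b'</AppendedData>\n'
          b'</VTKFile>')
start = appended_data_start(sample)
print(sample[start:start + 1])      # offset="0" -> b'A'
print(sample[start + 2:start + 3])  # offset="2" -> b'C'
```

A file with raw appended data is generally not well-formed XML, so this sketch scans the bytes directly instead of handing the file to an XML parser.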

Uncompressed Data

A block representing a data array without compression has this format:

 [#bytes][DATA]

where "[#bytes]" is an integer value specifying the number of bytes in the block of data following it. The rest of the data immediately follow it. The byte count integer type is specified by the "header_type" attribute at the top of the file (UInt32 if no type specified). The byte count and all data are in the byte order specified by the "byte_order" attribute at the top of the file. In general a raw, uncompressed AppendedData element has the form

 <AppendedData encoding="raw">
   _<n1><data1><n2><data2>...
 </AppendedData>

where the "<n1>"-style tokens indicate header_type-d integer byte counts and "<data1>"-style tokens indicate the corresponding blocks of binary data.
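
A minimal sketch of this layout, using Python's struct module, is shown below. The helper names pack_block and unpack_block are hypothetical, and the default format "<I" assumes byte_order="LittleEndian" with a UInt32 header.

```python
import struct


def pack_block(data, fmt="<I"):
    """Prefix `data` (bytes) with its byte count: [#bytes][DATA]."""
    return struct.pack(fmt, len(data)) + data


def unpack_block(section, offset, fmt="<I"):
    """Return the data of the block starting `offset` bytes into `section`.

    `section` holds the bytes that follow the leading underscore.
    """
    (n_bytes,) = struct.unpack_from(fmt, section, offset)
    start = offset + struct.calcsize(fmt)
    return section[start:start + n_bytes]


# Two arrays: three float32 values and two int32 values.
data1 = struct.pack("<3f", 1.0, 2.0, 3.0)
data2 = struct.pack("<2i", 7, 8)
block1 = pack_block(data1)
section = block1 + pack_block(data2)

offset1 = 0            # value for the first DataArray's offset attribute
offset2 = len(block1)  # 4-byte count + 12 data bytes = 16
print(struct.unpack("<2i", unpack_block(section, offset2)))  # prints "(7, 8)"
```

Each array's offset is the running total of the sizes of all preceding blocks, byte counts included.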

Compressed Data

Binary data in the file are compressed when the VTKFile element is of the form

 <VTKFile ... compressor="vtkZLibDataCompressor">

The data corresponding to a data array are stored in a set of blocks, each of which is compressed using the zlib library. The block structure allows semi-random access without decompressing all data. In uncompressed form, all the blocks have the same size, except possibly the last block, which might be smaller. The data for one array begin with a header of the form

 [#blocks][#u-size][#p-size][#c-size-1][#c-size-2]...[#c-size-#blocks][DATA]

Each token is an integer value whose type is specified by "header_type" at the top of the file (UInt32 if no type specified). The token meanings are:

 [#blocks] = Number of blocks
 [#u-size] = Block size before compression
 [#p-size] = Size of last partial block (zero if it is not needed)
 [#c-size-i] = Size in bytes of block i after compression

The [DATA] portion stores all compressed blocks contiguously, one after another. The offset from the beginning of the data section to the beginning of a block is the sum of the compressed sizes of the preceding blocks, as given in the header.
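
The header layout and the block-offset computation can be sketched in Python as below. The function names compress_array and read_block are hypothetical, the block size of 1024 bytes is an arbitrary choice for illustration, and the sketch assumes that the output of Python's zlib.compress is the stream format the reader expects.

```python
import struct
import zlib


def compress_array(data, block_size=1024, fmt="<I"):
    """Build [#blocks][#u-size][#p-size][#c-size-i...][DATA] for `data`."""
    chunks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    compressed = [zlib.compress(chunk) for chunk in chunks]
    partial = len(data) % block_size  # zero when no partial block is needed
    header = [len(chunks), block_size, partial] + [len(c) for c in compressed]
    return b"".join(struct.pack(fmt, v) for v in header) + b"".join(compressed)


def read_block(section, offset, index, fmt="<I"):
    """Decompress only block `index` of the array starting at `offset`."""
    size = struct.calcsize(fmt)
    (n_blocks,) = struct.unpack_from(fmt, section, offset)
    # Compressed sizes follow the three leading header integers.
    c_sizes = [struct.unpack_from(fmt, section, offset + (3 + i) * size)[0]
               for i in range(n_blocks)]
    data_start = offset + (3 + n_blocks) * size
    # Sum the compressed sizes of the preceding blocks to find this one.
    start = data_start + sum(c_sizes[:index])
    return zlib.decompress(section[start:start + c_sizes[index]])


data = bytes(range(256)) * 10          # 2560 bytes: two full blocks + 512
section = compress_array(data)
print(struct.unpack_from("<3I", section, 0))  # prints "(3, 1024, 512)"
print(read_block(section, 0, 2) == data[2048:])  # prints "True"
```

Reading a single block this way touches only the header and that block's compressed bytes, which is the semi-random access the block structure is meant to allow.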