Talk:Titan Data Structures
vtkTable parent class
vtkDataObject provides the methods SetFieldData and GetFieldData, which could be used to set or get the columns of the table. Columns can currently hold primitive data types, and vtkStringArray has recently been added to VTK.
We could also consider inheriting from vtkDataSet, which contains mechanisms for assigning coordinates to each row of the table. These could be initialized to zero and modified when the table is graphically presented. However, vtkDataSet also has the concept of cells, which tables lack, since all rows are independent of each other.
vtkTable should probably extend vtkDataObject, and then be converted to a subclass of vtkDataSet using a filter which determines coordinates for each row.
--Jeff 10:03, 2 Jun 2006 (EDT)
vtkGraph structure
I see many similarities between the vtkPolyData structure and what is required for vtkGraph. The points function like vertices, and the cells function like edges. Though edges normally have exactly two associated vertices, the vtkCellArray in vtkPolyData is flexible enough to handle hyperedges, or arbitrary sets of vertices, using cell types with arbitrary numbers of vertices (such as polygons). This could be useful for describing partitions or clusters of the graph. vtkCellTypes allows random access to edge endpoints, and using vtkCellLinks we can look up the edges (cells) adjacent to a certain vertex (point). However, it seems we would need one more array containing the starting index in vtkCellLinks for each point, so that we can randomly access the neighbors of any vertex.
All this to say, we could make vtkGraph directly extend vtkPolyData with added functions like AddVertex (which would essentially add a point), AddEdge (adds a cell), GetAdjacentVertices, etc. and add functionality to make adjacent edge lookup fast. The other option is to start from scratch and inherit from vtkDataObject, and have filters which convert it to a vtkPolyData.
Jeff 11:04, 2 Jun 2006 (EDT)
- After talking with Berk, we found that the vtkCellLinks is in fact random access, so adjacent edge lookup would be fast with no additional changes. However, we also discussed that the first step should be to create an ideal graph API without thinking about how to fit it into VTK. After that, we should look at how to incorporate it into VTK.
- Jeff 13:08, 2 Jun 2006 (EDT)
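As a rough illustration of the adjacency lookup discussed in this thread, here is a minimal standalone C++ sketch. It does not use VTK: SketchGraph, BuildLinks, and the offsets/links arrays are hypothetical names, and AddVertex, AddEdge, and GetAdjacentVertices are only borrowed from the proposal above. The offsets array plays the role of the per-point starting index, which is what makes neighbor lookup random access.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a graph stored like vtkPolyData:
// vertices play the role of points, edges the role of two-point cells.
struct SketchGraph {
  std::size_t numVertices = 0;
  std::vector<std::pair<std::size_t, std::size_t>> edges;  // edge id -> endpoints

  // Link structure: the edges incident to vertex v are
  // links[offsets[v]] .. links[offsets[v + 1] - 1].
  std::vector<std::size_t> offsets;
  std::vector<std::size_t> links;

  std::size_t AddVertex() { return numVertices++; }

  std::size_t AddEdge(std::size_t u, std::size_t v) {
    assert(u < numVertices && v < numVertices);
    edges.emplace_back(u, v);
    return edges.size() - 1;
  }

  // Rebuild the offsets and links arrays in O(V + E).
  void BuildLinks() {
    offsets.assign(numVertices + 1, 0);
    for (const auto& e : edges) {
      ++offsets[e.first + 1];
      ++offsets[e.second + 1];
    }
    // Prefix sum turns per-vertex counts into starting indices.
    for (std::size_t v = 0; v < numVertices; ++v) {
      offsets[v + 1] += offsets[v];
    }
    links.assign(offsets[numVertices], 0);
    std::vector<std::size_t> next(offsets.begin(), offsets.end() - 1);
    for (std::size_t id = 0; id < edges.size(); ++id) {
      links[next[edges[id].first]++] = id;
      links[next[edges[id].second]++] = id;
    }
  }

  // Neighbors of v. Assumes BuildLinks() was called after the last AddEdge().
  std::vector<std::size_t> GetAdjacentVertices(std::size_t v) const {
    assert(v < numVertices);
    std::vector<std::size_t> result;
    for (std::size_t i = offsets[v]; i < offsets[v + 1]; ++i) {
      const auto& e = edges[links[i]];
      result.push_back(e.first == v ? e.second : e.first);
    }
    return result;
  }
};
```

In this sketch the links are rebuilt in one pass rather than updated on every AddEdge; whether an ideal graph API should maintain them incrementally is left open.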
Heterogeneous data
One issue with the current format is that it assumes homogeneous data is contained in each data structure. That is, a graph consists of a table of vertices and a table of edges, and similarly for the other structures. This is a simple, compact approach, which would likely work well in dealing with database queries. However, it cannot efficiently handle the case where there are multiple types of objects with different properties. For example, if we are reading an XML file into a hierarchy, we should not have to read all items into a single table, since this would require making columns for all possible attributes, wasting space. I was attempting to avoid requiring an "entity" class altogether, and just work with tables, but this may not be possible for the functionality we need.
Jeff 09:34, 8 Jun 2006 (EDT)
- Prior to the meeting today, I modified the dataset such that both heterogeneous and homogeneous data could be stored in a set. However, at the meeting we decided to assume that the data can always be stored in tabular format (See Titan Developer Meeting 06/08/2006).
- Jeff 18:45, 8 Jun 2006 (EDT)
Column-data format
As per Titan Developer Meeting 06/08/2006, I have modified the data structures so that they assume all datasets may be represented as columns of data. I also made vtkAbstractGraph, which both vtkGraph and vtkHierarchy inherit from.
Jeff 12:15, 9 Jun 2006 (EDT)
Table filters vs. table containment
Originally I had thought that vtkGraph and vtkHierarchy would contain vtkTable objects, but on reflection it seems contrary to VTK for one data object to contain other data objects. So instead, I propose filters vtkTablesToGraphFilter and vtkTableToHierarchyFilter that would use fields in the tables to automatically generate links between objects.
Jeff 12:15, 9 Jun 2006 (EDT)
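To make the filter idea above a little more concrete, here is a minimal standalone sketch of how fields in two tables might be turned into links. It is not VTK code: ColumnSketch and TablesToEdges are hypothetical names, the columns are plain string vectors, and the choice to skip edge rows that name an unknown vertex is an assumption, not something decided in this discussion.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for one table column holding string values.
using ColumnSketch = std::vector<std::string>;

// Builds edges as pairs of vertex-table row indices.
// vertexIds:   key column of the vertex table
// edgeSources: column of the edge table naming each edge's source vertex
// edgeTargets: column of the edge table naming each edge's target vertex
std::vector<std::pair<std::size_t, std::size_t>> TablesToEdges(
    const ColumnSketch& vertexIds,
    const ColumnSketch& edgeSources,
    const ColumnSketch& edgeTargets) {
  // Map each vertex key to its row; the first occurrence of a key wins.
  std::unordered_map<std::string, std::size_t> rowOf;
  for (std::size_t i = 0; i < vertexIds.size(); ++i) {
    rowOf.emplace(vertexIds[i], i);
  }

  std::vector<std::pair<std::size_t, std::size_t>> edges;
  const std::size_t rows = std::min(edgeSources.size(), edgeTargets.size());
  for (std::size_t r = 0; r < rows; ++r) {
    const auto source = rowOf.find(edgeSources[r]);
    const auto target = rowOf.find(edgeTargets[r]);
    // Assumption: rows that reference an unknown vertex are skipped.
    if (source != rowOf.end() && target != rowOf.end()) {
      edges.emplace_back(source->second, target->second);
    }
  }
  return edges;
}
```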
Arrays in vtkVariant: vtk<Type>Array vs. <Type>*
vtkVariant should support not only atomic types, but arrays of atomic types. Since there are already nicely managed classes for lists of values (i.e. vtk<Type>Array, where <Type> is any of a number of atomic types), it would seem logical to store a pointer to the appropriate type of vtk<Type>Arrays inside vtkVariant. This also allows some nice conversion properties and a generic interface to the arrays. For example, vtkVariant could have a GetArray() which would return a vtkAbstractArray pointer with generic functions which work for all arrays. If the arrays were instead implemented as pointers (i.e. float*, double*, etc.), the GetArray() function would need to return a void pointer which would need to be cast to the appropriate type before use. However, vtk<Type>Arrays are not as efficient as just storing pointers along with the size of the array.
Jeff 16:19, 16 Jun 2006 (EDT)
- At the Titan Developer Meeting 06/22/2006, it was generally agreed that using vtk<Type>Array types makes sense. It was additionally agreed that the interface should provide access to a vtkAbstractArray instead of requiring functions for each subtype. This also allows vtkVariant to automatically work when new array types are created, without having to make new member functions for each type.
- Jeff 14:33, 22 Jun 2006 (EDT)
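The design agreed on above can be sketched as follows. This is a standalone illustration, not the VTK classes themselves: AbstractArraySketch, TypedArraySketch, and VariantSketch are hypothetical stand-ins for vtkAbstractArray, vtk<Type>Array, and vtkVariant, and GetValueAsString is an invented method used only to show a type-independent call.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for vtkAbstractArray: a type-independent interface.
class AbstractArraySketch {
public:
  virtual ~AbstractArraySketch() = default;
  virtual std::size_t GetNumberOfValues() const = 0;
  virtual std::string GetValueAsString(std::size_t i) const = 0;
};

// Hypothetical stand-in for the vtk<Type>Array family (numeric T only).
template <typename T>
class TypedArraySketch : public AbstractArraySketch {
public:
  void InsertNextValue(const T& value) { values_.push_back(value); }
  std::size_t GetNumberOfValues() const override { return values_.size(); }
  std::string GetValueAsString(std::size_t i) const override {
    assert(i < values_.size());
    return std::to_string(values_[i]);
  }

private:
  std::vector<T> values_;
};

// Hypothetical variant that models only the array-valued case.
class VariantSketch {
public:
  explicit VariantSketch(std::shared_ptr<AbstractArraySketch> array)
    : array_(std::move(array)) {}

  // One accessor serves every array type, so no cast from void* is needed.
  AbstractArraySketch* GetArray() const { return array_.get(); }

private:
  std::shared_ptr<AbstractArraySketch> array_;
};
```

A new TypedArraySketch<T> instantiation works with VariantSketch without any change to VariantSketch, which is the property the meeting notes describe.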
vtkBLOB and vtkDocument
As discussed in Titan Developer Meeting 06/22/2006, we are considering the addition of two new data types, vtkBLOB and vtkDocument. vtkBLOB would exist to hold large pieces of data such as images or video, while vtkDocument holds a text document. Both would inherit from vtkDataObject in order to take advantage of the metadata capabilities of vtkInformation.
Jeff 17:27, 22 Jun 2006 (EDT)
vtkTree
It would be good to have an iterator that can do either depth first or breadth first traversals of the tree.
--Pat 19:49, 11 Jul 2006 (EDT)
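One possible shape for such an iterator, as a minimal standalone sketch: a single pending container is used as a stack for depth-first order and as a queue for breadth-first order. TreeSketch, TreeIteratorSketch, TraversalMode, and CollectOrder are hypothetical names and are not part of vtkTree.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

enum class TraversalMode { DepthFirst, BreadthFirst };

// Hypothetical tree: children[v] lists the child ids of vertex v in order.
struct TreeSketch {
  std::size_t root = 0;
  std::vector<std::vector<std::size_t>> children;
};

class TreeIteratorSketch {
public:
  // The tree must outlive the iterator, since only a reference is kept.
  TreeIteratorSketch(const TreeSketch& tree, TraversalMode mode)
    : tree_(tree), mode_(mode) {
    if (tree_.root < tree_.children.size()) {
      pending_.push_back(tree_.root);
    }
  }

  bool HasNext() const { return !pending_.empty(); }

  std::size_t Next() {
    assert(HasNext());
    std::size_t vertex;
    if (mode_ == TraversalMode::DepthFirst) {
      // Stack behavior: take the newest entry, then push the children in
      // reverse so that the first child is visited first (pre-order).
      vertex = pending_.back();
      pending_.pop_back();
      const auto& kids = tree_.children[vertex];
      for (auto it = kids.rbegin(); it != kids.rend(); ++it) {
        pending_.push_back(*it);
      }
    } else {
      // Queue behavior: take the oldest entry, then append the children.
      vertex = pending_.front();
      pending_.pop_front();
      for (std::size_t child : tree_.children[vertex]) {
        pending_.push_back(child);
      }
    }
    return vertex;
  }

private:
  const TreeSketch& tree_;
  TraversalMode mode_;
  std::deque<std::size_t> pending_;
};

// Convenience helper: runs the iterator to completion.
std::vector<std::size_t> CollectOrder(const TreeSketch& tree, TraversalMode mode) {
  std::vector<std::size_t> order;
  TreeIteratorSketch it(tree, mode);
  while (it.HasNext()) {
    order.push_back(it.Next());
  }
  return order;
}
```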
Changes to vtkTable
I have made the following changes to vtkTable (As a result of Titan Developer Meeting 07/27/2006):
- The type for the number of columns and column indices was changed to vtkIdType for a more uniform API. Behind the scenes, the index is always converted to an int before accessing the internal vtkFieldData structure, since it uses ints internally for column indices. A potential problem arises if vtkIdType is 64 bits and the user attempts to add more than <math>2^{31} \approx 2.1 \times 10^9</math> (about 2 billion) columns: this would fail even though the argument type suggests it should work. If users start adding this many columns, the correct course of action would be to modify vtkFieldData.
- InsertRow(vtkIdType) and InsertRow(vtkIdType,vtkVariantArray*) have been removed. In VTK arrays, "insert" means "set". Hence, these functions were somewhat confusing (they did not actually "insert" a row but "set" one), and it is unclear whether they are required.
- The name of InsertNextRow() was changed to InsertNextBlankRow() for clarity.
Jeff 15:21, 27 Jul 2006 (EDT)
- I renamed the vtkTable functions that access a column by name instead of by index to {OldFunction}ByName(). This is due to an error that popped up after changing column indices to vtkIdType. On systems that use a 64-bit vtkIdType, if an int literal such as 0 is passed to a column-accessing function (e.g. table->RemoveColumn(0)), the compiler cannot choose between RemoveColumn(vtkIdType) and RemoveColumn(const char*), because 0 converts equally well to a vtkIdType and to a null pointer. Renaming the second to RemoveColumnByName(const char*) eliminates this problem.
- Jeff 08:36, 28 Jul 2006 (EDT)
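The overload problem described above can be reproduced outside VTK. The following is a minimal standalone sketch, assuming a 64-bit id type: IdTypeSketch and TableSketch are hypothetical stand-ins for vtkIdType and vtkTable, and only the two method names RemoveColumn and RemoveColumnByName come from the change itself.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a 64-bit vtkIdType.
using IdTypeSketch = long long;

// Hypothetical stand-in for vtkTable that tracks column names only.
class TableSketch {
public:
  void AddColumn(const std::string& name) { names_.push_back(name); }

  IdTypeSketch GetNumberOfColumns() const {
    return static_cast<IdTypeSketch>(names_.size());
  }

  const std::string& GetColumnName(IdTypeSketch index) const {
    assert(index >= 0 && index < GetNumberOfColumns());
    return names_[static_cast<std::size_t>(index)];
  }

  void RemoveColumn(IdTypeSketch index) {
    assert(index >= 0 && index < GetNumberOfColumns());
    names_.erase(names_.begin() + static_cast<std::ptrdiff_t>(index));
  }

  // If this were declared as an overload RemoveColumn(const char*), the call
  // table.RemoveColumn(0) would be rejected as ambiguous: the int literal 0
  // needs an integral conversion to reach long long and a null pointer
  // conversion to reach const char*, and neither conversion ranks higher.
  void RemoveColumnByName(const char* name) {
    if (name == nullptr) {
      return;
    }
    for (std::size_t i = 0; i < names_.size(); ++i) {
      if (names_[i] == name) {
        names_.erase(names_.begin() + static_cast<std::ptrdiff_t>(i));
        return;
      }
    }
  }

private:
  std::vector<std::string> names_;
};
```

With an int-sized id type the literal 0 is an exact match for the index overload, which is why the ambiguity only appeared on 64-bit vtkIdType builds.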
vtkVariantArray usage
These are the current uses of vtkVariantArray (As discussed at Titan Developer Meeting 07/27/2006):
- In vtkTable, to retrieve or set a row of data.
- In vtkTable, to handle cases where columns in the table are of type vtkVariantArray.
- In vtkTreeFieldAggregator, to handle cases where the column on which to aggregate is a vtkVariantArray.
- In vtkVariant, to properly convert a vtkVariantArray to a string / numeric (by returning the first element).
- In vtkTreeToQtModelAdapter, to handle cases where an attribute array of a tree is a vtkVariantArray.
- In TestGraph, TestTable in order to test vtkVariantArray.
- In vtkDelimitedTextReader to insert rows into the table.
vtkVariantArray objects are hardly ever created (they are only created in vtkTable, vtkDelimitedTextReader, and testing code). Most code pertaining to vtkVariantArray exists to handle cases where a field is of that type. vtkAbstractArray does not have the functionality needed (e.g. getting a double value or a string value at a particular index).
Jeff 16:30, 27 Jul 2006 (EDT)
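To illustrate the kind of per-index access mentioned above, here is a minimal standalone sketch of a table row held as an array of variant-like values. ValueSketch and RowSketch are hypothetical stand-ins for vtkVariant and vtkVariantArray, limited to numbers and strings, and the conversion rules shown are assumptions rather than the actual vtkVariant behavior.

```cpp
#include <cstdlib>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for vtkVariant, limited to two payload kinds.
class ValueSketch {
public:
  ValueSketch(double number) : isNumber_(true), number_(number) {}
  ValueSketch(const std::string& text)
    : isNumber_(false), number_(0.0), text_(text) {}

  // Numeric view: strings are parsed with strtod, which yields 0.0 when
  // the text does not start with a number.
  double ToDouble() const {
    return isNumber_ ? number_ : std::strtod(text_.c_str(), nullptr);
  }

  // String view: numbers are formatted with default stream precision.
  std::string ToString() const {
    if (!isNumber_) {
      return text_;
    }
    std::ostringstream out;
    out << number_;
    return out.str();
  }

private:
  bool isNumber_;
  double number_;
  std::string text_;
};

// Hypothetical stand-in for vtkVariantArray used as one table row:
// element i is the value of column i in that row.
using RowSketch = std::vector<ValueSketch>;
```

A caller can then read row[i].ToDouble() or row[i].ToString() without knowing the column's underlying type.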