Fast Path For Temporal Data
Revision as of 02:12, 17 July 2007
Problem Definition
The new time support in VTK is described here. It contains the following excerpt:
...requesting the data for one cell across many timesteps would still be very slow compared to what it could be for a more optimized path. To address this need in the future we plan on creating a fast-path for such requests. This fast path will be implemented for key readers and the array calculator initially. It is still unclear exactly how this will be implemented. But it will effectively be a separate pipeline possibly connecting to a different output port. Another option is to have a special information request that returns the data as meta-information as opposed to first class data.
The purpose of this page is to begin a discussion on how to implement such a "fast-path".
Exodus Example
An example of this optimized path can be seen in the exodus API. It contains ex_get_xxx_time() functions that read the values of a node/element variable for a single node/element over a specified range of time steps. When a filter wants a node's or element's variable value over time (e.g. the vtkExtractDataOverTime filter), instead of re-executing the pipeline for each time step, it would send one request upstream to the reader which, in the case of the exodus reader, would then call the appropriate ex_get_xxx_time() function. The problem then becomes how to propagate the data back to the filter. Should it use a separate pipeline or send it back in a vtkInformation key-to-value map?
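To make the cost difference concrete, here is a minimal sketch that contrasts the two access patterns. It does not call the exodus library; MockTemporalFile, ReadTimeStep, ReadNodeOverTime, and ExtractSlow are hypothetical names invented for this illustration, and time steps are 0-based here purely for convenience.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical in-memory stand-in for a time-varying file (not the exodus API).
// values[t][n] is the value of one nodal variable at time step t for node n.
struct MockTemporalFile {
  std::vector<std::vector<double>> values;
  mutable int readCalls = 0;  // counts simulated reader executions

  // Slow path: one call returns the whole field for a single time step.
  const std::vector<double>& ReadTimeStep(std::size_t step) const {
    assert(step < values.size());
    ++readCalls;
    return values[step];
  }

  // Fast path: one call returns a single node's values over [first, last].
  std::vector<double> ReadNodeOverTime(std::size_t node, std::size_t first,
                                       std::size_t last) const {
    assert(first <= last && last < values.size());
    ++readCalls;
    std::vector<double> out;
    out.reserve(last - first + 1);
    for (std::size_t t = first; t <= last; ++t) {
      assert(node < values[t].size());
      out.push_back(values[t][node]);
    }
    return out;
  }
};

// What a filter does without a fast path: one reader execution per time step.
std::vector<double> ExtractSlow(const MockTemporalFile& file, std::size_t node,
                                std::size_t first, std::size_t last) {
  std::vector<double> out;
  for (std::size_t t = first; t <= last; ++t) {
    out.push_back(file.ReadTimeStep(t)[node]);
  }
  return out;
}
```

Both paths return the same series; the difference in this mock is that the slow path triggers one reader call per time step while the fast path triggers exactly one.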
Proposed Algorithm
1. The exodus reader advertises a special key that tells filters it supports a fast-path for extracting data over time.
2. If a filter supports fast-paths, it will check its input pipeline information to see if it has this key. If it does, it creates an information request to send upstream to the reader, telling it the type of variable (node or element), the id of the node/element, and the range of time steps to return.
3. The reader listens for this request and responds to it by calling ex_get_(elem|nodal)_var_time() for each enabled (nodal/element) variable array. For each one, it will add a new array to its output vtkFieldData, where the array name is formatted as "{ARRAY_NAME}OverTime" (e.g. "TemperatureOverTime") to ensure no conflicts occur with the names of arrays in the vtkPointData or vtkCellData.
4. Back at the filter, it will unpack the "{ARRAY_NAME}OverTime" arrays in the field data and copy them onto the output point/cell data arrays of the filter (changing the array names back to the original ones).
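The four steps above can be sketched without VTK as follows. This is only an illustration under stated assumptions: MockReader, FastPathRequest, ArrayMap, and ExtractOverTime are hypothetical names, a plain std::map stands in for vtkFieldData and the filter's output arrays, a boolean stands in for the advertised key, and only nodal variables are handled.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Array name -> values; a stand-in for vtkFieldData / point data in this sketch.
using ArrayMap = std::map<std::string, std::vector<double>>;

const std::string kOverTimeSuffix = "OverTime";

// Hypothetical request a filter sends upstream (step 2).
struct FastPathRequest {
  bool isNodal;           // true: nodal variable, false: element variable
  std::size_t id;         // node or element id (0-based in this sketch)
  std::size_t firstStep;  // inclusive
  std::size_t lastStep;   // inclusive
};

// Hypothetical reader. nodalData[name][t][id] holds the stored values.
struct MockReader {
  bool supportsFastPath = true;  // step 1: the advertised capability
  std::map<std::string, std::vector<std::vector<double>>> nodalData;

  // Step 3: answer the request with one "<name>OverTime" array per variable.
  ArrayMap HandleRequest(const FastPathRequest& req) const {
    ArrayMap fieldData;
    if (!req.isNodal) return fieldData;  // element variables omitted for brevity
    for (const auto& entry : nodalData) {
      const std::vector<std::vector<double>>& steps = entry.second;
      assert(req.firstStep <= req.lastStep && req.lastStep < steps.size());
      std::vector<double> series;
      for (std::size_t t = req.firstStep; t <= req.lastStep; ++t) {
        series.push_back(steps[t].at(req.id));
      }
      fieldData[entry.first + kOverTimeSuffix] = series;
    }
    return fieldData;
  }
};

// Steps 2 and 4: check for support, send the request, restore original names.
ArrayMap ExtractOverTime(const MockReader& reader, const FastPathRequest& req) {
  ArrayMap output;
  if (!reader.supportsFastPath) return output;  // a real filter would fall back
  const ArrayMap fieldData = reader.HandleRequest(req);
  const std::size_t n = kOverTimeSuffix.size();
  for (const auto& entry : fieldData) {
    const std::string& name = entry.first;
    if (name.size() > n && name.compare(name.size() - n, n, kOverTimeSuffix) == 0) {
      output[name.substr(0, name.size() - n)] = entry.second;
    }
  }
  return output;
}
```

One thing the sketch makes visible is that the renaming in steps 3 and 4 is a pure round trip, so a variable whose own name already ends in "OverTime" would still be handled correctly as long as the filter strips exactly one suffix.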
Questions
What should the name of the information key be that the reader advertises? TEMPORAL_DATA_FAST_PATH?
What form should the request that the filter sends upstream take? The request needs to encapsulate the data type, the point/cell id, and the time step range. I'm not sure how to do this using vtkInformationKeys. If there was a vtkInformationStringVectorKey type I could encode each value as a string...
- One question this discussion doesn't seem to answer is how this mechanism interacts with the pipeline executives. Will it be a separate request (on the same level as REQUEST_DATA or REQUEST_INFORMATION)? If not, which request type will it be included in? In any event, which executive will generate requests of this type?
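One possible encoding, offered only as a sketch, is to pack the request into a flat vector of integers, which could then be carried by an integer-vector information key such as vtkInformationIntegerVectorKey instead of strings. The names TemporalRequest, VariableKind, Pack, and Unpack are hypothetical, and the field order is an assumption made for this example.

```cpp
#include <cassert>
#include <vector>

// Hypothetical variable kinds; edge and face kinds could be appended later.
enum VariableKind { NODAL = 0, ELEMENT = 1 };

// Hypothetical request contents: data type, id, and inclusive time step range.
// Note: int is used for the id here; real ids may need a wider type.
struct TemporalRequest {
  VariableKind kind;
  int id;
  int firstStep;
  int lastStep;
};

// Assumed layout: [kind, id, firstStep, lastStep].
std::vector<int> Pack(const TemporalRequest& r) {
  return {static_cast<int>(r.kind), r.id, r.firstStep, r.lastStep};
}

TemporalRequest Unpack(const std::vector<int>& v) {
  assert(v.size() == 4);
  assert(v[0] == NODAL || v[0] == ELEMENT);
  assert(v[2] <= v[3]);
  TemporalRequest r;
  r.kind = static_cast<VariableKind>(v[0]);
  r.id = v[1];
  r.firstStep = v[2];
  r.lastStep = v[3];
  return r;
}
```

A fixed-layout integer vector avoids string parsing, at the cost of both sides having to agree on the field order.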
Should we even allow the filter to request a range of time steps? The exodus API supports it, which is why I included it.
- I also think that it should be included. One use case for this is plots comparing different datasets. Rather than force the user to trim unneeded extents where two datasets' simulation times don't overlap, we should be able to request only those times where they do overlap.
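The overlap use case can be sketched as follows, assuming each dataset exposes a sorted list of its time values. OverlapRange and StepRangeWithin are hypothetical helpers written for this illustration; they compute the shared time interval and then the inclusive range of step indices each dataset would request.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Computes the time interval [lo, hi] covered by both datasets.
// Returns false if either list is empty or the intervals do not overlap.
bool OverlapRange(const std::vector<double>& a, const std::vector<double>& b,
                  double& lo, double& hi) {
  if (a.empty() || b.empty()) return false;
  lo = std::max(a.front(), b.front());
  hi = std::min(a.back(), b.back());
  return lo <= hi;
}

// Finds the inclusive step index range [first, last] of `times` inside [lo, hi].
// Returns false if no time step falls in the interval.
bool StepRangeWithin(const std::vector<double>& times, double lo, double hi,
                     std::size_t& first, std::size_t& last) {
  assert(std::is_sorted(times.begin(), times.end()));
  auto begin = std::lower_bound(times.begin(), times.end(), lo);
  auto end = std::upper_bound(times.begin(), times.end(), hi);
  if (begin >= end) return false;
  first = static_cast<std::size_t>(begin - times.begin());
  last = static_cast<std::size_t>(end - times.begin()) - 1;
  return true;
}
```

Each dataset's filter would then request only its own [first, last] step range rather than every time step it stores.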
Should the filter be able to request a specific array to extract data from (instead of having the reader output data for all enabled arrays)?
- This would certainly increase the efficiency of the reader since it would eliminate disk seeks to page in data which may never be used.
As far as I can tell, the filters that should support this fast-path are vtkExtractDataOverTime and vtkExtractArraysOverTime. Are there any others?
The proposal mentions cell and node data, but what about the edge and face data?
How will this work when running pvserver in parallel? Will only the rank0 process respond? Or will each process look up the node/cell values in its piece of the dataset? This is something that must be decided if the fast path is not an "out-of-band" call. If it is an out-of-band request, then it is not as important. --Dcthomp 03:12, 17 July 2007 (EDT)