Fast Path For Temporal Data

==Problem Definition==
The new time support in VTK is described [http://www.vtk.org/Wiki/VTK/Time_Support here]. It contains the following excerpt:

''...requesting the data for one cell across many timesteps would still be very slow compared to what it could be for a more optimized path. To address this need in the future we plan on creating a fast-path for such requests. This fast path will be implemented for key readers and the array calculator initially. It is still unclear exactly how this will be implemented. But it will effectively be a separate pipeline possibly connecting to a different output port. Another option is to have a special information request that returns the data as meta-information as opposed to first class data. ''




The purpose of this page is to begin a discussion on how to implement such a "fast-path".


==Exodus Example==
An example of this optimized path can be seen in the exodus API. It contains ex_get_xxx_time() functions that read the values of a node/element variable for a single node/element through a specified number of time steps. When a filter wants a node's/element's variable value over time (e.g. the vtkExtractDataOverTime filter), instead of re-executing for each timestep, it would send one request upstream to the reader which, in the case of the exodus reader, would then call the appropriate ex_get_xxx_time() method. The problem then becomes how to propagate the data back to the filter. Should it use a separate pipeline or send it back in a vtkInformation key-to-value map?
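The difference between the two access patterns can be sketched with an in-memory mock. Everything below is illustrative only: the function names are invented, the "file" is a plain array, and the real Exodus analog of the fast path is ex_get_nodal_var_time(). The point is the shape of the request, one call covering a range of time steps for a single entity, rather than one read per time step:

```cpp
#include <cstddef>
#include <vector>

// Mock of a time-series file: values[t][n] is the value of one nodal
// variable at time step t for node n. In a real Exodus file each time
// step is a separate record, which is why per-time-step reads are slow.
using TimeSeries = std::vector<std::vector<double>>;

// Slow path: one "read" per time step, extracting a single node's value
// from each full time step (re-executing the pipeline for every step).
std::vector<double> slowPath(const TimeSeries& file, std::size_t node) {
    std::vector<double> result;
    for (const auto& step : file) {
        result.push_back(step.at(node));
    }
    return result;
}

// Fast path: one call that returns a node's value over a range of time
// steps, analogous in shape to Exodus's ex_get_nodal_var_time().
std::vector<double> fastPath(const TimeSeries& file, std::size_t node,
                             std::size_t begStep, std::size_t endStep) {
    std::vector<double> result;
    for (std::size_t t = begStep; t <= endStep; ++t) {
        result.push_back(file.at(t).at(node));
    }
    return result;
}
```

In memory the two loops cost the same; against a real on-disk file the slow path pays a full read (and a pipeline re-execution) per time step, while the fast path is a single request the reader can satisfy directly.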


==Proposed Algorithm==


1. The exodus reader advertises a special key that tells filters it supports a fast-path for extracting data over time.


2. If a filter supports fast-paths, it will check its input pipeline information to see if it has this key. If it does, it creates an information request to send upstream to the reader, telling it the type of variable (node or element), the id of the node/element, and the range of time steps to return.
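Steps 1 and 2 amount to a handshake through pipeline information. A minimal sketch, using a plain string map as a stand-in for vtkInformation (the key names mirror the ones proposed in the discussion below; none of this is real VTK API):

```cpp
#include <map>
#include <string>

// Stand-in for vtkInformation: a simple string key/value map.
using Information = std::map<std::string, std::string>;

// Step 1: the reader advertises fast-path support on its output
// information.
void readerAdvertiseFastPath(Information& outInfo) {
    outInfo["FAST_PATH_FOR_TEMPORAL_DATA"] = "1";
}

// Step 2: a fast-path-aware filter checks its input information; if the
// key is present it fills in the request to be propagated upstream.
// Returns false when no fast path is available, so the filter can fall
// back to looping over time steps.
bool filterRequestFastPath(Information& inInfo,
                           const std::string& objectType,  // "POINT"/"CELL"
                           long objectId) {
    if (inInfo.count("FAST_PATH_FOR_TEMPORAL_DATA") == 0) {
        return false;
    }
    inInfo["FAST_PATH_OBJECT_TYPE"] = objectType;
    inInfo["FAST_PATH_OBJECT_ID"] = std::to_string(objectId);
    return true;
}
```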


3. The reader listens for this request and responds to it by calling ex_get_(elem|nodal)_var_time() for each enabled nodal/element variable array. For each one, it adds a new array to its output vtkFieldData, with the array name formatted as "{ARRAY_NAME}OverTime" (e.g. "TemperatureOverTime") to avoid conflicts with the names of arrays in the vtkPointData or vtkCellData.


4. Back at the filter, it will unpack the "{ARRAY_NAME}OverTime" arrays from the field data and copy them onto the filter's output point/cell data arrays, changing the array names back to the originals.
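The naming convention in steps 3 and 4 is a simple suffix round trip. A sketch (the helper names are hypothetical; only the "OverTime" suffix comes from the proposal above):

```cpp
#include <string>

// Suffix the reader appends in step 3 and the filter strips in step 4.
const std::string kSuffix = "OverTime";

// Step 3: "Temperature" -> "TemperatureOverTime" for the field data.
std::string toOverTimeName(const std::string& arrayName) {
    return arrayName + kSuffix;
}

// Step 4: recover the original array name; returns the input unchanged
// if the suffix is absent (i.e. the array is not a fast-path result).
std::string fromOverTimeName(const std::string& fieldName) {
    if (fieldName.size() >= kSuffix.size() &&
        fieldName.compare(fieldName.size() - kSuffix.size(),
                          kSuffix.size(), kSuffix) == 0) {
        return fieldName.substr(0, fieldName.size() - kSuffix.size());
    }
    return fieldName;
}
```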




==Questions==


:<font color="green">One question this discussion doesn't seem to answer is how this mechanism interacts with the pipeline executives. Will it be a separate request (on the same level as REQUEST_DATA or REQUEST_INFORMATION)? If not, which request type will it be included in? In any event, which executive will generate requests of this type?</font>
::<font color="blue">The following keys will be added to vtkStreamingDemandDrivenPipeline: FAST_PATH_FOR_TEMPORAL_DATA (set by the reader on its output information), FAST_PATH_OBJECT_TYPE ("CELL", "POINT", "EDGE", etc.), FAST_PATH_ID_TYPE ("GLOBAL" or "INDEX"), and FAST_PATH_OBJECT_ID. In the filter's handling of a REQUEST_UPDATE_EXTENT request, these keys will be added to its input information and thus propagated upstream. In the reader's handling of REQUEST_DATA requests, it will extract these keys. In its own handling of REQUEST_DATA requests, the filter will unpack the temporal arrays from its input field data and pass them to its output point data. The pipeline will actually have to be executed a couple of times in order for all this to happen. --[[User:Etstant|Eric]] 12:34, 17 July 2007 (EDT)</font>


Should we even allow the filter to request a range of time steps? The exodus API supports it which is why I included it.
:<font color="green">I also think that it should be included. One use case for this is plots comparing different datasets. Rather than force the user to trim unneeded extents where two datasets' simulation times don't overlap, we should be able to request only those times where they do overlap.</font>


Should the filter be able to specify a specific array to extract data from (instead of having the reader output data for all enabled arrays)?
:<font color="green">This would certainly increase the efficiency of the reader since it would eliminate disk seeks to page in data which may never be used.</font>


:<font color="green">The proposal mentions cell and node data, but what about the edge and face data?</font>
::<font color="purple">I think a bigger issue is how to map from the VTK point/cell id that the user has selected to the correct id in the correct exodus file. This is probably a technical detail that Eric will have to figure out (and a detail that will change once we move to multiblock). Once it is figured out, the edge/face data should just fall out.  It is my understanding that those are defined only in edge and face sets, which means that the user will be selecting a cell variable on that part of the output. So, again, this is just reverse mapping back to the appropriate exodus identifier. --[[User:Kmorel|Ken]] 09:45, 17 July 2007 (EDT)</font>


:<font color="green">How will this work when running pvserver in parallel? Will only the rank0 process respond? Or will each process look up the node/cell values in its piece of the dataset? This is something that must be decided if the fast path is not an "out-of-band" call. If it is an out-of-band request, then it is not as important. --[[User:Dcthomp|Dcthomp]] 03:12, 17 July 2007 (EDT)</font>
 
::<font color="purple">A big part of this will be determined by how the reverse mapping happens. Most likely, each process will have to hold information about the part of the data that it read so that it could do the reverse mapping of the local cells. However, it may be a good idea to transfer all the data to the root process. Ideally it would work either way, but we would be less likely to run into bugs or technical issues if the data was just transferred to node 0. --[[User:Kmorel|Ken]] 09:45, 17 July 2007 (EDT)</font>

Latest revision as of 22:58, 27 July 2007
