Time Support

From ParaQ Wiki
==Overview==
For some time now (groaaan, a pun), we've known that we need 'time support' in ParaView/ParaQ. There have been some discussions and we now have a plan to move forward.


==What is time support?==
This is our first question to address.  Certainly, ParaView supports 'time', in that it reads data files that have a notion of time (in particular, timesteps), and ParaView supports animations that show time sequences.  However, there are many improvements, large and small, that need to be made.  Several specific cases are not supported by the current ParaView architecture:

{| cellpadding="2" cellspacing="4" style="background:#efefef"
|-
! style="background:#abcdef" | ID
! style="background:#abcdef" | Requirement
! style="background:#abcdef" | Where we stand now
! style="background:#abcdef" | What the heck do we need it for?
|-
| valign="top" | 1
| valign="top" | Multiple timesteps in a single pipeline.
| valign="top" | Currently, ParaView supports a single data set in a pipeline, and there is no support for iterating over several 'timesteps' of data at a filter.
| valign="top" |
<ul><li>Any filter operation that needs previous or future timesteps to calculate its results.  Example: a filter that calculates a time-based gradient (acceleration).</li>
<li>Memory efficiency (perhaps).  If we want to visualize multiple timesteps of data (think of looking at a multiple-exposure photograph of a ball bouncing), it might be more efficient to have a single pipeline with several timesteps worth of data, rather than having either a) several inputs, which are ganged together or b) several copies of the same pipeline, with inputs set to the appropriate timestep.</li>
</ul>
|-
| bgcolor="#abcdef" height="1" |
| bgcolor="#abcdef" height="1" |
| bgcolor="#abcdef" height="1" |
| bgcolor="#abcdef" height="1" |
|-
| valign="top" | 2
| valign="top" | Query of 'time series' data for particular nodes or elements.
| valign="top" | The Exodus reader supports time-based query, but there is no standard method for transporting this data along the pipeline.  There is no standard for discovering or dealing with this data.
| valign="top" | Graphing data for a particular element over time (the extent of time in the source file).
|-
| bgcolor="#abcdef" height="1" |
| bgcolor="#abcdef" height="1" |
| bgcolor="#abcdef" height="1" |
| bgcolor="#abcdef" height="1" |
|-
| valign="top" | 3
| valign="top" | Support for dynamic timestep requests along the pipeline.
| valign="top" | Nothing there now.
| valign="top" | A filter may have to request a specific set of timesteps from downstream filters, in order to do its internal calculations, or to fulfill an upstream request (either for output data or for timesteps).
|}
== Use Cases ==
Below is a list of use cases, in the voice of the customer, without suggesting an implementation. "Must have" features are required by users in order for them to use ParaQ.


=== Must Have ===


# Plot the values of one-to-many elements over a specified range of time (client side graphing).
# Animations of time-series data that handle simulations with nonuniform time steps.  This includes ''optional'' interpolation of node(cell) variables to intermediate times to avoid choppy playback or nonuniform playback (i.e., varying playback speed as integration step size is varied).
# Get a window of data (all elements between times T1 and T2).
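The "window of data" case above is straightforward to sketch. Below is a minimal, hypothetical illustration (plain Python, not the ParaView API; the names <tt>window</tt> and <tt>timesteps</tt> are invented for this example) of selecting the timesteps that fall between times T1 and T2:

```python
# Sketch: select the timesteps that fall inside a requested window [t1, t2].
# Purely illustrative -- not a ParaView/VTK call.
import bisect

def window(timesteps, t1, t2):
    """Return the (sorted) timesteps t with t1 <= t <= t2."""
    lo = bisect.bisect_left(timesteps, t1)
    hi = bisect.bisect_right(timesteps, t2)
    return timesteps[lo:hi]

steps = [0.0, 0.1, 0.25, 0.4, 0.7, 1.0]   # nonuniform, as in real simulations
print(window(steps, 0.1, 0.7))            # -> [0.1, 0.25, 0.4, 0.7]
```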


=== Very Desirable ===
# Take time series data for one-to-many elements, run that through a filter (such as calculator), and then plot that (client side graphing).
# Ghosting (e.g. see the path of a bouncing ball).
# Plot the values of nodal (or cell-centered) variables taken on over a space curve ''and'' over a specified range of time. The space curve may be a line or a circle.
# Having an ''optional''  "snap to nodes" feature for the spacetime plots above (Need 3) would be very useful.
# Having an ''optional''  "geometry inference" feature for defining circular curves for the spacetime plots above (Need 3) would be very useful.
# Min/max, and other statistical operations.  The operation can be for a single element over time or over all elements over time.  Want to know the time at which a variable meets or exceeds a given value (either at specified node(s), or for any node in the entire mesh). An example is thermal situations where we need to know whether or when a particular temperature-sensitive component fails.
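The min/max item above includes "the time at which a variable meets or exceeds a given value". As a toy sketch (plain Python; <tt>first_exceedance</tt> and the time-to-values mapping are invented names, not part of any reader):

```python
# Sketch of the threshold query above: find the first time at which any
# node's value meets or exceeds a threshold, e.g. a component failure
# temperature.  `series` maps time -> list of per-node values.

def first_exceedance(series, threshold):
    """Return (time, node_index) of the first sample >= threshold, or None."""
    for t in sorted(series):
        for i, v in enumerate(series[t]):
            if v >= threshold:
                return t, i
    return None

series = {0.0: [10.0, 20.0], 0.5: [15.0, 80.0], 1.0: [90.0, 85.0]}
print(first_exceedance(series, 75.0))   # -> (0.5, 1)
```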
 
=== Desirable ===
 
# Provide time series data to a filter that requires multiple time-steps as input (DSP), e.g.: for time step N, average the data for N-1, N, and N+1.
# Support alternatives to streamlines - streamline: plot the path through a field at an instant in time.  Streak-/path-lines: plot the position of a particle for every instant in time.
# Calculate the integral of a cut plane over time, i.e. the integral of a plane (or something) at each time step.
# Would like time interpolation (as required for animations in Need 1) to be nonlinear (i.e., parabolic) so that it matches the interpolation assumed by the integration technique used by the solver. Note that different solvers have different integration schemes. Predictor-corrector methods do not necessarily specify a unique interpolant, but other methods do.
# Per time step field data.  That is, data that could be a field of a data set that changes over time.  We should also be able to represent field data that does ''not'' change over time.
# Calculate envelopes (convex hull) over time.  Calculate unions and intersections over time, i.e. "what volume was ever occupied by an object", and "what volume was always occupied by an object".
# Retrieve non-contiguous ranges of time-varying data.
# Readers report temporal bounds.  Time is part of the extents.
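The first "Desirable" item (DSP-style averaging over timesteps N-1, N, and N+1) can be sketched as follows. This is purely illustrative: the data layout (a list of per-timestep value arrays) and the name <tt>temporal_average</tt> are invented for the example, not a proposed API:

```python
# Sketch of the DSP use case: for timestep n, average the per-point values
# over timesteps [n-radius, n+radius], clamped to the available range.

def temporal_average(data, n, radius=1):
    lo = max(0, n - radius)
    hi = min(len(data) - 1, n + radius)
    window = data[lo:hi + 1]                 # the timesteps actually needed
    npts = len(window[0])
    return [sum(step[i] for step in window) / len(window) for i in range(npts)]

data = [[0.0, 1.0], [3.0, 4.0], [6.0, 7.0]]  # three timesteps, two points each
print(temporal_average(data, 1))             # -> [3.0, 4.0]
```

Note that computing the result for timestep N requires requesting timesteps N-1 and N+1 from upstream, which is exactly the "dynamic timestep request" requirement from the table above.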
 
== Timesteps ==
 
Timesteps are point-samples of continuous time that are defined for a specific data set.  They need not be contiguous or uniformly spaced, and in fact often aren't, since many simulations adjust their sample rate dynamically based on the simulation content.  When more than one data set is present in a visualization, the timesteps in the data sets may represent completely different ranges and rates of time.  Thus, it is important to have a single time "reference" that is uniform across all data sets.  This document assumes that "time" is a continuous, absolute quantity represented as a real number, and that a "timestep" is an individual sample at a particular point in time.
 
Because time can vary continuously, it is possible to display a time for which there aren't any matching dataset timesteps.  Behavior in this case can range from displaying nothing, to displaying the nearest timestep, to interpolating between timesteps.
 
Playback of an animation means varying ''time'', instead of varying ''timesteps''.  By sampling time at regular intervals, animations can play back at the correct rate, even if the timesteps in a dataset don't represent regular intervals.  When dealing with multiple datasets, the correct data can automatically be displayed, as long as the dataset timesteps share a common reference (if they don't, simple offset and scale transformations are trivial to implement and use).  In the case of a single dataset, matching the animation sample points to the timesteps in the dataset can provide backwards-compatible behavior.
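The sampling behavior described above can be sketched in a few lines. <tt>bracket</tt> and <tt>nearest</tt> are hypothetical helper names for this illustration, not ParaView functions:

```python
# Sketch: map a continuous animation time t onto a dataset's discrete,
# possibly nonuniform timesteps -- either snapping to the nearest step or
# finding the bracketing pair (the input to any interpolation scheme).
import bisect

def bracket(timesteps, t):
    """Return the pair of timesteps surrounding t (clamped at the ends)."""
    i = bisect.bisect_left(timesteps, t)
    if i == 0:
        return timesteps[0], timesteps[0]
    if i == len(timesteps):
        return timesteps[-1], timesteps[-1]
    return timesteps[i - 1], timesteps[i]

def nearest(timesteps, t):
    a, b = bracket(timesteps, t)
    return a if t - a <= b - t else b

steps = [0.0, 0.1, 0.4, 1.0]          # nonuniform simulation output
print(nearest(steps, 0.45))           # -> 0.4
print(bracket(steps, 0.45))           # -> (0.4, 1.0)
```

If two datasets lack a common time reference, the same machinery applies after a simple per-dataset transform of the form scale*t + offset.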
 
== Current Workarounds ==
 
* Plotting time-varying elements - the Exodus reader has functionality that allows single node or cell values to be requested, but this can only be used to plot time-varying data if the data resides in an Exodus file on the client (client-server won't work), and it limits plotting to the source data (the data cannot be filtered).  Effectively this is an "out-of-band" request where the consumer of time-varying data must be hard-coded to seek out and query the Exodus reader directly.
* Plotting multiple timesteps - in the case of displaying the path of a bouncing ball, multiple copies of a pipeline can be created, with each input displaying the same file, but at a different timestep.  This solution cannot scale well since there will be many (perhaps 100s) of sources referencing the same dataset, with corresponding resource consumption.  This approach also leads to additional complexity in managing multiple pipelines - does the user see / work with all of the pipelines, or is there a management layer that hides them, leaving the user to manage a single "virtual" pipeline instead?
* Using multiple inputs - DSP-like filters that perform operations across multiple timesteps (e.g: averaging) could be coded with multiple inputs, with each input connected to a different source, each displaying the same file, but at a different timestep. Again, this approach does not scale well due to the memory requirements and management issues associated with multiple otherwise-identical pipelines.  It also introduces complexity in the coding of filters.
 
== Proposed Design ==
 
Currently, VTK assumes that the contents of the pipeline represent a single timestep. VTK's time support consists of four main keys. TIME_STEPS is a vector of doubles that lists all the time steps that are available (in increasing order) from a source/filter. This is how a reader reports downstream what timesteps are available. Currently, I believe only a couple of readers actually set this key. UPDATE_TIME_INDEX is a key that is used to request the data for a specific time index; the index corresponds to an index in the TIME_STEPS vector. DATA_TIME_INDEX and DATA_TIME are keys that indicate, for a DataObject, what index (int) and time (double) that data corresponds to.
 
This mechanism is limited in that it can only handle discrete time steps, but some analytical sources may be able to produce data for any time requested. To address this need, a new information key called TIME_RANGE will be created that defines a T1-to-T2 range of time that the source can provide. Likewise, the old mechanism only allowed requesting a single time step by index. This has two problems: some file formats are more efficient at returning multiple time steps at once, and requesting by index creates confusion with respect to branching filters. A new information key, UPDATE_TIME_STEPS (a vector of doubles), will be created that allows requesting temporal data by time, as well as requesting multiple time steps at once.
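The request model above can be illustrated with a toy sketch. This is ''not'' the VTK information-key API; the class and attribute names simply play the roles of TIME_STEPS, TIME_RANGE, and UPDATE_TIME_STEPS, and the snap-to-nearest and clamp-to-range behaviors are placeholder policy choices:

```python
# Toy model: a discrete source advertises TIME_STEPS; an analytic source
# advertises a continuous TIME_RANGE.  Both answer a request carrying a
# vector of times (the role of UPDATE_TIME_STEPS).

class DiscreteSource:
    def __init__(self, steps):
        self.time_steps = sorted(steps)          # role of TIME_STEPS

    def request(self, update_time_steps):
        # Snap each requested time to the nearest available step.
        return [min(self.time_steps, key=lambda s: abs(s - t))
                for t in update_time_steps]

class AnalyticSource:
    time_range = (0.0, 10.0)                     # role of TIME_RANGE

    def request(self, update_time_steps):
        # Any time inside the range can be produced exactly; out-of-range
        # requests are clamped here (a placeholder choice).
        lo, hi = self.time_range
        return [min(max(t, lo), hi) for t in update_time_steps]

src = DiscreteSource([0.0, 0.5, 1.0])
print(src.request([0.1, 0.6]))               # -> [0.0, 0.5]
print(AnalyticSource().request([0.1, 0.6]))  # -> [0.1, 0.6]
```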
 
The result is that a temporal request will return a multiblock dataset, with each time step being an independent data object. This will also support returning multiple time steps of multiblock data (multi-multi-block :). To this end, time support will require using the composite data pipeline executive, which will be extended to include time support.
 
Interpolation of time-varying data will be provided by filters written for that purpose; this minimizes the coding impact on sources (which can stay simple), and allows for different types of interpolation to suit user requirements / resources.  Interpolation filters would be an example of a filter that expands the temporal extents of a data request, just as a convolution image filter expands a request's geometric extents. Likewise, a shift-scale filter will be created that passes the data through but shifts and scales the time. This is useful for visualizing multiple datasets that, while intended to be time-aligned, are not so due to simple temporal origin and/or scale issues.
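As a sketch of how an interpolation filter expands the temporal extents of a request: to produce data at an arbitrary time t it needs the two bracketing timesteps from upstream. The code below is illustrative only (linear interpolation over a dict of timestep-to-value, with invented names), not the proposed filter:

```python
# Sketch: an "interpolation filter" that, given a request for time t,
# expands the upstream request to the two bracketing timesteps and blends
# them linearly -- analogous to a convolution filter padding its extents.
import bisect

def interpolate(upstream, t):
    steps = sorted(upstream)
    i = bisect.bisect_left(steps, t)
    if i == 0 or (i < len(steps) and steps[i] == t):
        return upstream[steps[min(i, len(steps) - 1)]]   # exact hit / clamp low
    if i == len(steps):
        return upstream[steps[-1]]                       # clamp high
    t0, t1 = steps[i - 1], steps[i]      # the expanded upstream request
    w = (t - t0) / (t1 - t0)
    return (1 - w) * upstream[t0] + w * upstream[t1]

data = {0.0: 10.0, 1.0: 30.0}
print(interpolate(data, 0.25))   # -> 15.0
```

A shift-scale filter is even simpler: it would pass data through unchanged while reporting its times as scale*t + offset.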
 
One side effect of the above is that requesting the data for one cell across many timesteps would still be very slow compared to what it could be with a more optimized path. To address this need, we plan on creating a fast path for such requests. This fast path will be implemented for key readers and the array calculator initially. It is still unclear exactly how this will be implemented, but it will effectively be a separate pipeline, possibly connecting to a different output port. Another option is to have a special information request that returns the data as meta-information, as opposed to first-class data.
 
== Responsibilities of a Filter ==
 
With the above design in place, most operations on time-agnostic filters will be handled by the executive, which will loop the time steps through each filter. Time-aware filters will be passed the multiblock dataset if desired, and can then operate on multiple time steps at once. A time-aware consumer is responsible for making reasonable requests upstream: the executive will not crop a temporal request to fit in memory. If a temporal consumer asks for a million timesteps, each a gigabyte in size, the executive will simply pass the request upstream. So, as with geometric extents, try to request what is needed, balanced against efficiency. In a time-dependent dataset, avoid assuming much consistency between timesteps; many things can change between timesteps, including node positions, connectivity, the set of fields, etc.
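The division of labor above can be sketched as a toy model (plain Python, not the executive's real interface; <tt>executive_run</tt>, the <tt>time_aware</tt> flag, and the dict-as-multiblock stand-in are all invented for illustration):

```python
# Sketch: a time-agnostic filter is looped over each timestep by the
# "executive"; a time-aware filter receives the whole collection (here a
# dict of time -> data, standing in for a multiblock dataset) at once.

def executive_run(filt, blocks):
    if getattr(filt, "time_aware", False):
        return filt(blocks)                         # sees all timesteps
    return {t: filt(d) for t, d in blocks.items()}  # looped per timestep

double = lambda d: 2 * d                            # time-agnostic operation

def span(blocks):                                   # time-aware: needs all steps
    return max(blocks.values()) - min(blocks.values())
span.time_aware = True

blocks = {0.0: 1.0, 0.5: 4.0, 1.0: 2.0}
print(executive_run(double, blocks))   # -> {0.0: 2.0, 0.5: 8.0, 1.0: 4.0}
print(executive_run(span, blocks))     # -> 3.0
```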

Latest revision as of 10:19, 7 August 2006
