Time Support

From ParaQ Wiki
== Overview ==

For some time now (groaaan, a pun), we've known that we need 'time support' in ParaView/ParaQ. There have been some discussions, and we now have a plan to move forward.


== What is time support? ==

This is our first question to address. Certainly, ParaView supports 'time', in that it reads data files that have a notion of time (in particular, timesteps). And ParaView supports animations that show time sequences. However, there are many improvements, large and small, that need to be made.


== Use Cases ==

Below is a list of use cases, in the voice of the customer, without suggesting an implementation. "Must have" features are required by users in order for them to use ParaQ.
 
=== Must Have ===
 
# Plot the values of one-to-many elements over a specified range of time (client side graphing).
# Animations of time-series data that handle simulations with nonuniform time steps.  This includes ''optional'' interpolation of node(cell) variables to intermediate times to avoid choppy playback or nonuniform playback (i.e., varying playback speed as integration step size is varied).
# Get a window of data (all elements between times T1 and T2).
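The "window of data" use case above can be stated concretely: given samples tagged with times, select everything whose time falls in [T1, T2]. A minimal illustrative sketch in plain Python (the data layout and function name are invented for illustration; this is not ParaView code):

```python
# Illustrative sketch only: select all samples whose time falls in [t1, t2].
# The layout (a list of (time, value) pairs) is an assumption, not ParaView's.

def time_window(samples, t1, t2):
    """Return the samples whose time t satisfies t1 <= t <= t2."""
    return [(t, v) for (t, v) in samples if t1 <= t <= t2]

samples = [(0.0, "a"), (0.5, "b"), (1.25, "c"), (2.0, "d")]
print(time_window(samples, 0.5, 1.5))  # [(0.5, 'b'), (1.25, 'c')]
```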


=== Very Desirable ===


# Take time series data for one-to-many elements, run that through a filter (such as calculator), and then plot that (client side graphing).
# Ghosting (e.g. see the path of a bouncing ball).
# Plot the values of nodal (or cell-centered) variables taken on over a space curve ''and'' over a specified range of time. The space curve may be a line or a circle.
# Having an ''optional'' "snap to nodes" feature for the spacetime plots above (Need 3) would be very useful.
# Having an ''optional'' "geometry inference" feature for defining circular curves for the spacetime plots above (Need 3) would be very useful.
# Min/max, and other statistical operations.  The operation can be for a single element over time or over all elements over time.  Want to know the time at which a variable meets or exceeds a given value (either at specified node(s), or for any node in the entire mesh). An example is thermal situations where we need to know whether or when a particular temperature-sensitive component fails.
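The threshold use case above (when does a temperature-sensitive component fail?) can be sketched as a scan over the time history. Plain Python, with an invented data layout (`history` maps each time to a list of per-node values); this is illustrative only, not a ParaView feature:

```python
# Illustrative sketch: find the first time at which any node's value meets or
# exceeds a threshold (e.g. a failure temperature).

def first_crossing(history, threshold):
    """Return (time, node_index) of the first sample >= threshold, or None."""
    for t in sorted(history):
        for i, value in enumerate(history[t]):
            if value >= threshold:
                return (t, i)
    return None

history = {0.0: [20.0, 21.0], 1.0: [35.0, 35.0], 2.0: [90.0, 120.0]}
print(first_crossing(history, 100.0))  # (2.0, 1)
```

Restricting the inner loop to specified node indices covers the "at specified node(s)" variant.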


=== Desirable ===


# Provide time series data to a filter that requires multiple time steps as input (DSP), e.g., for time step N, average the data for N-1, N, and N+1.
# Support alternatives to streamlines - streamline: plot the path through a field at an instant in time.  Streak-/path-lines: plot the position of a particle for every instant in time.
# Calculate the integral of a cut plane over time, i.e., the integral of a plane (or something) at each time step.
# Would like time interpolation (as required for animations in Need 1) to be nonlinear (i.e., parabolic) so that it matches the interpolation assumed by the integration technique used by the solver. Note that different solvers have different integration schemes. Predictor-corrector methods do not necessarily specify a unique interpolant, but other methods do.
# Per-time-step field data; that is, data that could be a field of a data set that changes over time. We should also be able to represent field data that does ''not'' change over time.
# Calculate envelopes (convex hull) over time.  Calculate unions and intersections over time, i.e. "what volume was ever occupied by an object", and "what volume was always occupied by an object".
# Retrieve non-contiguous ranges of time-varying data.
# Readers report temporal bounds.  Time is part of the extents.
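The DSP-style item above (for time step N, average the data for N-1, N, and N+1) amounts to a three-point temporal moving average. A minimal illustrative sketch, with plain Python lists standing in for per-timestep data arrays (not a real filter implementation):

```python
# Illustrative sketch of the DSP-style use case: for each time step N, average
# the data for steps N-1, N, and N+1 (endpoints use whichever neighbours exist).

def temporal_average(steps):
    """Three-point moving average across a list of per-timestep scalars."""
    out = []
    for n in range(len(steps)):
        window = steps[max(0, n - 1):n + 2]  # up to 3 neighbouring steps
        out.append(sum(window) / len(window))
    return out

print(temporal_average([0.0, 3.0, 6.0, 9.0]))  # [1.5, 3.0, 6.0, 7.5]
```

Note that producing output for step N requires input from steps N-1 and N+1, which is exactly why such filters need to request multiple time steps from upstream.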


== Timesteps ==

Timesteps are point-samples of continuous time that are defined for a specific data set. They need not be contiguous or uniformly spaced, and in fact often aren't, since many simulations adjust their sample rate dynamically based on the simulation content. When more than one data set is present in a visualization, the timesteps in the data sets may represent completely different ranges and rates of time. Thus, it is important to have a single time "reference" that is uniform across all data sets. This document assumes that "time" is a continuous, absolute quantity represented as a real number, and that a "timestep" is an individual sample at a particular point in time.


Because time can vary continuously, it is possible to display a time for which there aren't any dataset timesteps that match. Behavior in this case can range from displaying nothing, to displaying the nearest timestep, to interpolating between timestep data.


Playback of an animation means varying ''time'', instead of varying ''timesteps''. By sampling time at regular intervals, animations can play back at the correct rate, even if the timesteps in a dataset don't represent regular intervals. When dealing with multiple datasets, the correct data can automatically be displayed, as long as the dataset timesteps share a common reference (if they don't, simple offset and scale transformations are trivial to implement and use). In the case of a single dataset, matching the animation sample points to the timesteps in the dataset can provide backwards-compatible behavior.
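Resolving a continuous animation time against a dataset's discrete timesteps can be sketched concretely: either snap to the nearest step, or find the bracketing steps plus an interpolation weight. An illustrative sketch in plain Python (function names invented; not ParaView code):

```python
import bisect

# Illustrative sketch: map a continuous time t onto a sorted list of discrete
# dataset timesteps, either by snapping or by producing interpolation weights.

def nearest_step(timesteps, t):
    """Index of the timestep closest to t (timesteps sorted ascending)."""
    i = bisect.bisect_left(timesteps, t)
    if i == 0:
        return 0
    if i == len(timesteps):
        return len(timesteps) - 1
    return i if timesteps[i] - t < t - timesteps[i - 1] else i - 1

def bracket(timesteps, t):
    """(lo, hi, w) such that t = (1 - w) * timesteps[lo] + w * timesteps[hi];
    times outside the sampled range clamp to the first/last step."""
    i = bisect.bisect_left(timesteps, t)
    if i == 0:
        return (0, 0, 0.0)
    if i == len(timesteps):
        return (len(timesteps) - 1, len(timesteps) - 1, 0.0)
    lo, hi = i - 1, i
    return (lo, hi, (t - timesteps[lo]) / (timesteps[hi] - timesteps[lo]))

steps = [0.0, 0.1, 0.4, 1.0]
print(nearest_step(steps, 0.3))  # 2  (0.4 is closer than 0.1)
print(bracket(steps, 0.7))       # (2, 3, w) with w close to 0.5
```

A per-dataset offset/scale applied to `t` before the lookup gives the simple transformations mentioned above for datasets without a common reference.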


== Current Workarounds ==


* Plotting time-varying elements - the Exodus reader has functionality that allows single node or cell values to be requested, but this can only be used to plot time-varying data if the data resides in an Exodus file on the client (client-server won't work), and it limits plotting to the source data (the data cannot be filtered). Effectively this is an "out-of-band" request where the consumer of time-varying data must be hard-coded to seek out and query the Exodus reader directly.
* Plotting multiple timesteps - in the case of displaying the path of a bouncing ball, multiple copies of a pipeline can be created, with each input displaying the same file, but at a different timestep.  This solution cannot scale well since there will be many (perhaps 100s) of sources referencing the same dataset, with corresponding resource consumption.  This approach also leads to additional complexity in managing multiple pipelines - does the user see / work with all of the pipelines, or is there a management layer that hides them, leaving the user to manage a single "virtual" pipeline instead?
* Using multiple inputs - DSP-like filters that perform operations across multiple timesteps (e.g. averaging) could be coded with multiple inputs, with each input connected to a different source, each displaying the same file, but at a different timestep. Again, this approach does not scale well due to the memory requirements and management issues associated with multiple otherwise-identical pipelines. It also introduces complexity in the coding of filters.


== Proposed Design ==
 
Currently, VTK assumes that the contents of the pipeline represent a single timestep. VTK's time support consists of four main keys. TIME_STEPS is a vector of doubles that lists all the time steps that are available (in increasing order) from a source/filter; this is how a reader reports downstream what timesteps are available. Currently, I believe only a couple of readers actually set this key. UPDATE_TIME_INDEX is a key that is used to request the data for a specific time index; the index corresponds to an index in the TIME_STEPS vector. DATA_TIME_INDEX and DATA_TIME are keys that indicate, for a DataObject, what index (int) and time (double) the data corresponds to.
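To make the mechanics concrete, here is a schematic model of the index-based mechanism in plain Python. The dict stands in for an information object, and the class and field layout are invented for illustration; this models the behavior described above and is not the actual VTK API:

```python
# Schematic model (not VTK code): a source advertises TIME_STEPS, answers one
# UPDATE_TIME_INDEX request at a time, and tags its output with
# DATA_TIME_INDEX / DATA_TIME.

TIME_STEPS = "TIME_STEPS"
UPDATE_TIME_INDEX = "UPDATE_TIME_INDEX"
DATA_TIME_INDEX = "DATA_TIME_INDEX"
DATA_TIME = "DATA_TIME"

class IndexedSource:
    def __init__(self, times):
        self.info = {TIME_STEPS: sorted(times)}  # advertised downstream

    def update(self, request):
        index = request[UPDATE_TIME_INDEX]
        time = self.info[TIME_STEPS][index]
        # The produced data object records which index/time it represents.
        return {DATA_TIME_INDEX: index, DATA_TIME: time, "arrays": None}

source = IndexedSource([0.0, 0.5, 1.0])
data = source.update({UPDATE_TIME_INDEX: 2})
print(data[DATA_TIME])  # 1.0
```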
 
This mechanism is limited in that it can only handle discrete time steps, but some analytical sources may be able to produce data for any time requested. To address this need, a new information key called TIME_RANGE will be created that defines a T1-to-T2 range of time that the source can provide. Likewise, the old mechanism only allowed requesting a single time step, by index. This has two problems: some file formats are more efficient at returning multiple time steps at once, and requesting by index creates confusion with respect to branching filters. A new information key, UPDATE_TIME_STEPS (a vector of doubles), will be created that allows temporal data to be requested by time, as well as supporting requests for multiple time steps at once.
 
The result is that a temporal request will return a multiblock dataset, with each time step being an independent data object. This will also support returning multiple time steps of multiblock data (multi-multi-block :). To this end, time support will require using the composite data pipeline executive, which will be extended to include time support.
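The proposed request/response shape can be modeled in the same schematic style: the request carries a list of times (UPDATE_TIME_STEPS), an analytic source advertises a continuous TIME_RANGE, and the result is a "multiblock" (here, simply a list) with one block per requested time. Again an illustrative model with invented names, not the VTK API:

```python
import math

# Schematic model (not VTK code) of the proposed time-based request.

TIME_RANGE = "TIME_RANGE"
UPDATE_TIME_STEPS = "UPDATE_TIME_STEPS"
DATA_TIME = "DATA_TIME"

class AnalyticSource:
    """A source that can produce data for any time within its range."""
    def __init__(self, t_min, t_max):
        self.info = {TIME_RANGE: (t_min, t_max)}

    def update(self, request):
        lo, hi = self.info[TIME_RANGE]
        blocks = []
        for t in request[UPDATE_TIME_STEPS]:
            assert lo <= t <= hi, "requested time outside TIME_RANGE"
            # One independent data object per requested time step.
            blocks.append({DATA_TIME: t, "value": math.sin(t)})
        return blocks  # the "multiblock" result

source = AnalyticSource(0.0, 10.0)
result = source.update({UPDATE_TIME_STEPS: [0.0, 1.5, 3.0]})
print([b[DATA_TIME] for b in result])  # [0.0, 1.5, 3.0]
```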
 
Interpolation of time-varying data will be provided by filters written for that purpose; this minimizes the coding impact on sources (which can stay simple), and allows for different types of interpolation to suit user requirements and resources. Interpolation filters are an example of a filter that expands the temporal extents of a data request, just as a convolution image filter expands a request's geometric extents. Likewise, a shift-scale filter will be created that passes the data through but shifts and scales the time. This is useful for visualizing multiple datasets that, while intended to be time-aligned, are not, due to simple temporal origin and/or scale issues.
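The two filters described above can be sketched on toy data. The linear-interpolation sketch makes the "expanded temporal extents" point visible: serving one output time needs two upstream timesteps. Plain Python with invented names; real interpolation filters would operate on whole datasets, and the design above explicitly allows nonlinear schemes too:

```python
# Illustrative sketches (not VTK code) of a temporal interpolation filter and
# a temporal shift-scale filter, operating on {time: value} samples.

def interpolate(samples, t):
    """Linearly interpolate between the two timesteps bracketing t.
    Note: one output time requires *two* upstream timesteps, i.e. the
    filter expands the temporal extent of the upstream request."""
    times = sorted(samples)
    for lo, hi in zip(times, times[1:]):
        if lo <= t <= hi:
            w = (t - lo) / (hi - lo)
            return (1 - w) * samples[lo] + w * samples[hi]
    raise ValueError("t outside the sampled range")

def shift_scale_times(samples, shift=0.0, scale=1.0):
    """Pass the data through unchanged, but remap each sample's time."""
    return {t * scale + shift: v for t, v in samples.items()}

samples = {0.0: 10.0, 2.0: 30.0}
print(interpolate(samples, 0.5))              # 15.0
print(shift_scale_times(samples, shift=1.0))  # {1.0: 10.0, 3.0: 30.0}
```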
 
One side effect of the above is that requesting the data for one cell across many timesteps would still be very slow compared to what it could be with a more optimized path. To address this need, we plan on creating a fast path for such requests. This fast path will be implemented for key readers and the array calculator initially. It is still unclear exactly how this will be implemented, but it will effectively be a separate pipeline, possibly connecting to a different output port. Another option is to have a special information request that returns the data as meta-information, as opposed to first-class data.


== Responsibilities of a Filter ==

With the above design in place, most operations involving time-agnostic filters will be handled by the executive, which will loop the time steps through each filter. Time-aware filters will be passed the multiblock dataset if desired and can then operate on multiple time steps at once. A time-aware consumer is responsible for making reasonable requests upstream. The executive will not crop a temporal request to fit in memory: if a temporal consumer asks for a million timesteps, each a gigabyte in size, then the executive will pass the request upstream. So, as with geometric extents, try to request what is needed, balanced with efficiency. In a time-dependent dataset, avoid assuming much consistency between timesteps; many things can change between timesteps, including node positions, connectivity, the set of fields, etc.
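The executive behavior described above can be sketched schematically: for a time-agnostic filter, the executive loops the requested time steps through the filter one block at a time, while a time-aware filter receives all blocks at once. Plain Python with invented names, illustrating the control flow only, not the actual executive implementation:

```python
# Schematic model (not VTK code) of the executive's time-looping behavior.
# `blocks` is a list of per-timestep data objects (the "multiblock").

def run_filter(filter_fn, blocks, time_aware=False):
    if time_aware:
        return filter_fn(blocks)           # filter sees all time steps at once
    return [filter_fn(b) for b in blocks]  # executive loops over time steps

# A time-agnostic filter: it only ever sees one timestep's data.
double = lambda block: {"time": block["time"], "value": 2 * block["value"]}

blocks = [{"time": 0.0, "value": 1.0}, {"time": 1.0, "value": 4.0}]
print([b["value"] for b in run_filter(double, blocks)])  # [2.0, 8.0]
```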

Latest revision as of 10:19, 7 August 2006
