Time Support


Overview

For some time now (groaaan, a pun), we've known that we need 'time support' in ParaView/ParaQ. There have been some discussions, and we now have a plan for moving forward.

What is time support?

This is our first question to address. Certainly, ParaView supports 'time', in that it reads data files that have a notion of time (in particular, timesteps). And ParaView supports animations that show time sequences. However, there are many improvements, large and small, that need to be made.

Use Cases

Below is a list of use cases, in the voice of the customer, without suggesting an implementation. "Must have" features are required by users in order for them to use ParaQ.

Must Have

  1. Plot the values of one-to-many elements over a specified range of time (client side graphing).
  2. Animations of time-series data that handle simulations with nonuniform time steps. This includes optional interpolation of node (or cell) variables to intermediate times to avoid choppy or nonuniform playback (i.e., playback speed varying as the integration step size varies).
  3. Get a window of data (all elements between times T1 and T2).

Very Desirable

  1. Take time series data for one-to-many elements, run that through a filter (such as calculator), and then plot that (client side graphing).
  2. Ghosting (e.g. see the path of a bouncing ball).
  3. Plot the values of nodal (or cell-centered) variables taken on over a space curve and over a specified range of time. The space curve may be a line or a circle.
  4. Having an optional "snap to nodes" feature for the spacetime plots above (Need 3) would be very useful.
  5. Having an optional "geometry inference" feature for defining circular curves for the spacetime plots above (Need 3) would be very useful.
  6. Min/max and other statistical operations, either for a single element over time or over all elements over time. Users want to know the time at which a variable meets or exceeds a given value, either at specified node(s) or at any node in the entire mesh. An example is a thermal analysis where we need to know whether, or when, a particular temperature-sensitive component fails.

Desirable

  1. Provide time series data to a filter that requires multiple time steps as input (DSP), e.g., for time step N, average the data for N-1, N, and N+1.
  2. Support alternatives to streamlines. A streamline plots the path through a field at a single instant in time; streak- and path-lines plot the position of a particle at every instant in time.
  3. Calculate the integral of a cut plane over time, i.e., the integral over the plane (or other geometry) at each time step.
  4. Would like time interpolation (as required for animations in Need 1) to be nonlinear (e.g., parabolic) so that it matches the interpolation assumed by the integration technique used by the solver. Note that different solvers use different integration schemes; predictor-corrector methods do not necessarily specify a unique interpolant, but other methods do.
  5. Per-timestep field data, i.e., field data attached to a dataset that changes over time. We should also be able to represent field data that does not change over time.
  6. Calculate envelopes (convex hull) over time. Calculate unions and intersections over time, i.e. "what volume was ever occupied by an object", and "what volume was always occupied by an object".
  7. Retrieve non-contiguous ranges of time-varying data.
  8. Readers report temporal bounds. Time is part of the extents.

Timesteps

Timesteps are point-samples of continuous time that are defined for a specific dataset. They need not be contiguous or uniformly spaced, and in fact often aren't, since many simulations adjust their sample rate dynamically based on the simulation content. When more than one dataset is present in a visualization, the timesteps in the datasets may represent completely different ranges and rates of time. Thus, it is important to have a single time "reference" that is uniform across all datasets. This document assumes that "time" is a continuous, absolute quantity represented as a real number, and that a "timestep" is an individual sample at a particular point in time.

Because time can vary continuously, it is possible to display a time for which no dataset timestep matches. Behavior in this case can range from displaying nothing, to displaying the nearest timestep, to interpolating the timestep data.

Playback of an animation means varying time, not varying timesteps. By sampling time at regular intervals, animations can play back at the correct rate even if the timesteps in a dataset don't represent regular intervals. When dealing with multiple datasets, the correct data can automatically be displayed as long as the dataset timesteps share a common reference (if they don't, simple offset and scale transformations are trivial to implement and use). In the case of a single dataset, matching the animation sample points to the timesteps in the dataset can provide backwards-compatible behavior.
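
As an illustration, mapping a continuous animation time onto the nearest available timestep (with an offset/scale to bring a dataset's private time reference into the shared one) might look like the following sketch. The function and parameter names are invented for this example; this is not existing ParaView code.

  // Sketch: map a continuous animation time to the nearest timestep.
  // "offset" and "scale" bring a dataset's private time reference into
  // the shared, absolute time used by the animation.
  #include <algorithm>
  #include <vector>

  int NearestTimestep(const std::vector<double>& steps, // sorted ascending, non-empty
                      double animationTime,
                      double offset = 0.0, double scale = 1.0)
  {
    const double t = (animationTime - offset) / scale; // into dataset time
    std::vector<double>::const_iterator it =
      std::lower_bound(steps.begin(), steps.end(), t);
    if (it == steps.begin()) return 0;
    if (it == steps.end()) return static_cast<int>(steps.size()) - 1;
    const int hi = static_cast<int>(it - steps.begin());
    // Choose whichever neighboring timestep is closer to t.
    return (t - steps[hi - 1] <= steps[hi] - t) ? hi - 1 : hi;
  }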

Current Workarounds

  • Plotting time-varying elements - the Exodus reader has functionality that allows single node or cell values to be requested, but this can only be used to plot time-varying data if the data resides in an Exodus file on the client (client-server won't work), and it limits plotting to the source data (the data cannot be filtered). Effectively this is an "out-of-band" request where the consumer of time-varying data must be hard-coded to seek-out and query the Exodus reader directly.
  • Plotting multiple timesteps - in the case of displaying the path of a bouncing ball, multiple copies of a pipeline can be created, with each input displaying the same file, but at a different timestep. This solution cannot scale well since there will be many (perhaps 100s) of sources referencing the same dataset, with corresponding resource consumption. This approach also leads to additional complexity in managing multiple pipelines - does the user see / work with all of the pipelines, or is there a management layer that hides them, leaving the user to manage a single "virtual" pipeline instead?
  • Using multiple inputs - DSP-like filters that perform operations across multiple timesteps (e.g., averaging) could be coded with multiple inputs, with each input connected to a different source, each displaying the same file but at a different timestep. Again, this approach does not scale well due to the memory requirements and management issues associated with multiple otherwise-identical pipelines. It also complicates the coding of filters.

Proposed Design

Currently, VTK assumes that the contents of the pipeline represent a single timestep. Downstream filters cannot request a specific time or times, because the choice of which timestep to provide is determined by the dataset source.

This is not quite correct. As of about a year ago (and I believe this is still where things stand), VTK's time support consists of four main keys. TIME_STEPS is a vector of doubles that lists all the time steps that are available (in increasing order) from a source/filter; this is how a reader reports downstream what timesteps are available. Currently I believe only a couple of readers actually set this key. UPDATE_TIME_INDEX is a key that is used to request the data for a specific time index; the index corresponds to an index in the TIME_STEPS vector. DATA_TIME_INDEX and DATA_TIME are keys that indicate, for a DataObject, what index (int) and time (double) that data corresponds to. Martink 10:30, 18 Jul 2006 (EDT)
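
For illustration, a reader participating in this scheme might look roughly like the sketch below. The key names come from the comment above; where the keys live (here assumed to be vtkStreamingDemandDrivenPipeline and vtkDataObject) and the surrounding reader code (MyReader, its TimeValues member) are assumptions made for this example.

  // Sketch of a reader using the four time keys described above.
  int MyReader::RequestInformation(vtkInformation*,
                                   vtkInformationVector**,
                                   vtkInformationVector* outputVector)
  {
    vtkInformation* outInfo = outputVector->GetInformationObject(0);
    // Report the available timesteps downstream, in increasing order.
    outInfo->Set(vtkStreamingDemandDrivenPipeline::TIME_STEPS(),
                 &this->TimeValues[0],
                 static_cast<int>(this->TimeValues.size()));
    return 1;
  }

  int MyReader::RequestData(vtkInformation*,
                            vtkInformationVector**,
                            vtkInformationVector* outputVector)
  {
    vtkInformation* outInfo = outputVector->GetInformationObject(0);
    // Honor a request for a specific time index, defaulting to 0.
    int index = 0;
    if (outInfo->Has(vtkStreamingDemandDrivenPipeline::UPDATE_TIME_INDEX()))
    {
      index = outInfo->Get(vtkStreamingDemandDrivenPipeline::UPDATE_TIME_INDEX());
    }
    // ... read the mesh and arrays for TimeValues[index] ...
    // Stamp the produced data with the index and time it corresponds to.
    vtkDataObject* output = outInfo->Get(vtkDataObject::DATA_OBJECT());
    output->GetInformation()->Set(vtkDataObject::DATA_TIME_INDEX(), index);
    output->GetInformation()->Set(vtkDataObject::DATA_TIME(),
                                  this->TimeValues[index]);
    return 1;
  }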

Downstream filters can limit the spatial extents of a request so that only a subset of the available data is returned. We propose that the extents mechanism be expanded to include the concept of temporal extents, allowing a downstream filter to request data within a given range (or ranges) of time. A data source would respond to such a request by returning a (multi-block?) dataset containing zero-to-many timesteps' worth of data falling within the requested range of time. The timesteps returned would be the "raw" timesteps within the file that meet the time-extent requirements.
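
To make the proposal concrete, a consumer might phrase such a request along the following lines. UPDATE_TIME_RANGE is a hypothetical key invented for this sketch; it does not exist in VTK.

  // Hypothetical sketch of the proposed temporal-extent request.
  double timeRange[2] = { 0.5, 2.75 }; // [T1, T2], in absolute time
  vtkInformation* outInfo = reader->GetOutputInformation(0);
  outInfo->Set(UPDATE_TIME_RANGE(), timeRange, 2); // proposed key, not real
  reader->Update();
  // The source would answer with a multi-block dataset containing one
  // block per raw timestep whose time value falls within [T1, T2].
  vtkMultiBlockDataSet* blocks =
    vtkMultiBlockDataSet::SafeDownCast(reader->GetOutputDataObject(0));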

So, as you can see from the above, right now time is handled one step at a time. We could instead specify time as a range and return a multiblock (or, in the case of spatial multiblock, a temporal multiblock of spatial multiblocks). My gut feeling is that I'd rather keep requesting one time step at a time, as it requires no new executives or special handling except by the filters that operate on multiple timesteps directly. Martink 11:09, 18 Jul 2006 (EDT)

Data sources could be queried to determine the timestep counts and values, for use by the UI layer.

This capability already exists in the pipeline, although many readers do not handle it (so let us say it sort of exists). It is just a matter of adding it to any readers or temporal sources we care about. Martink 10:45, 18 Jul 2006 (EDT)

Datasets containing multiple timesteps would have to provide additional data such as the time value for each timestep, plus flags to show whether a given timestep was "raw" or "interpolated".

An additional key could be added to indicate whether the data is interpolated. Is this important? The reason I ask is that right now VTK can do all sorts of spatial interpolation, and we never mark the data as spatially interpolated. Martink 10:45, 18 Jul 2006 (EDT)

Interpolation of time-varying data could be provided by filters written for that purpose; this would minimize the coding impact on sources (which can stay simple) and allows for different types of interpolation to suit user requirements and resources. Interpolation filters would be an example of a filter that expands the temporal extents of a data request, just as a convolution image filter expands a request's geometric extents.
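
The core of such an interpolation filter could be as simple as the following sketch of linear interpolation between two raw timesteps. The names are illustrative; a real filter would operate on vtkDataArrays.

  // Sketch: linearly interpolate a data array between raw timesteps
  // t0 and t1 to approximate values at the requested time t.
  #include <cstddef>
  #include <vector>

  std::vector<double> InterpolateLinear(const std::vector<double>& a0, double t0,
                                        const std::vector<double>& a1, double t1,
                                        double t)
  {
    // Assumes the same tuple count and ordering at both timesteps --
    // exactly the consistency a general filter must not rely on (see
    // the filter responsibilities below), so a real filter would check.
    const double w = (t - t0) / (t1 - t0);
    std::vector<double> out(a0.size());
    for (std::size_t i = 0; i < a0.size(); ++i)
      out[i] = (1.0 - w) * a0[i] + w * a1[i];
    return out;
  }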

I like the idea of a filter handling interpolation of data. There is a limitation right now in how the pipeline supports time in that a sink cannot request an arbitrary time. It must request an index. Now an interpolation filter could still exist where you specify by setting ivars what time steps or intervals it should interpolate to. The alternative would be to add the notion of an UPDATE_TIME in addition to UPDATE_INDEX. This raises some questions if we were to add it. Should a reader provide any data if an UPDATE_TIME is requested that doesn't match a time it has (exactly? to some epsilon?) And what level of accuracy is considered a match for time? I'm just worried about heterogeneous environments where we are sending time as a double from one platform to another.Martink 10:45, 18 Jul 2006 (EDT)

Filters that act as "passthroughs" for data, but offset and/or scale temporal extents, could be used when displaying datasets whose time references do not match.
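
Conceptually, such a pass-through only has to rewrite time values in both directions; a minimal sketch (invented names, not an existing class):

  // Sketch: a time shift/scale pass-through. Requests travelling
  // upstream are mapped into the input's time reference; time values
  // reported downstream are mapped back into the shared reference.
  struct TimeTransform
  {
    double Scale;
    double Offset;
    // shared (animation) time -> input dataset time, for requests
    double ToInputTime(double t) const { return (t - Offset) / Scale; }
    // input dataset time -> shared time, for reported timestep values
    double ToSharedTime(double t) const { return t * Scale + Offset; }
  };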

One concern I have here is the performance of querying 100s of time steps for a single point or cell. Treating time like a first-class extent (a fourth dimension) would be fast, but would require a lot of rewriting of VTK to handle four-dimensional data properly, and it would be a backwards-compatibility challenge. Treating time as single steps (or as multiblock) will be super slow for single-cell queries across 100s of timesteps. For that case I feel like I'd like to return the data as a single DataArray in a single DataObject, with some tag that indicates that the scalar data is temporal. Any thoughts on this? Martink 11:09, 18 Jul 2006 (EDT)
In thinking about this some more, I think introducing a new object and key might work. Specifically, to address performance issues when requesting many timesteps for a single cell, create a new class that will hold the cell and the data arrays. Add a new request that specifies a vector of time indexes and a cell ID. The return of the Information pass would load that information, fill in the data, and return it. All of this would be done as a key in the information objects, not as a first-class vtkDataObject, so the main data path would remain unchanged. This would be a meta-information request; only a few readers and filters would respond to it and handle it. While this could be fairly fat meta-information, it should be high performance. For cases where a filter is required that does not support a time series of data, the fallback of requesting one timestep at a time could be used (very slow). An alternative approach would be to create a new extent for requesting just one cell. Getting 1000 time steps would still require 1000 passes of the pipeline, but at least it would not require 1000 passes requesting all the data, rather 1000 passes requesting just one cell and its data. This has the advantage that most filters could work with it without a lot of redesign. Martink 12:46, 19 Jul 2006 (EDT)
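
A rough sketch of what that meta-information payload could look like (purely hypothetical; none of these names exist in VTK):

  // Hypothetical meta-information for a fast "one cell over many
  // timesteps" query, carried in the information objects rather than
  // travelling the main data path as a vtkDataObject.
  #include <vector>

  struct CellTimeSeriesRequest
  {
    long CellId;                   // the cell being queried (vtkIdType in practice)
    std::vector<int> TimeIndexes;  // which timesteps are wanted
  };

  struct CellTimeSeriesReply
  {
    long CellId;
    std::vector<double> Times;     // time value for each requested index
    std::vector<double> Values;    // scalar value for each requested index
  };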

Responsibilities of a Filter

With the above design in-place, it would become the responsibility of every filter to:

  • Pass incoming temporal extent requests upstream without modification, unless implementing specific time-related functionality.
  • Operate on (iterate over) every timestep within the resulting dataset, unless implementing specific time-related functionality (see the sketch after this list).
  • Provide consistent, stable results for every timestep within a dataset, unless implementing specific time-related functionality.
  • Provide well-defined results regardless of the number of timesteps within a dataset.
  • Avoid assuming any consistency between timesteps within a dataset. Anything can change between timesteps, including node positions, connectivity, the set of fields, etc.
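
Under a multi-block representation of multiple timesteps, "iterate over every timestep" would amount to a loop like the following sketch, where ProcessOneTimestep stands in for a filter's real per-timestep work:

  // Sketch: apply a filter's per-timestep operation to each block of
  // a temporal multi-block dataset. ProcessOneTimestep is hypothetical.
  #include <vtkDataSet.h>
  #include <vtkMultiBlockDataSet.h>

  void ProcessOneTimestep(vtkDataSet* step); // the filter's real work

  void ProcessAllTimesteps(vtkMultiBlockDataSet* input)
  {
    for (unsigned int i = 0; i < input->GetNumberOfBlocks(); ++i)
    {
      vtkDataSet* step = vtkDataSet::SafeDownCast(input->GetBlock(i));
      if (!step) continue;
      // Assume nothing is consistent between blocks: point counts,
      // connectivity, and the available arrays may all differ.
      ProcessOneTimestep(step);
    }
  }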