Client Side Delivery

From ParaQ Wiki
Revision as of 16:17, 1 February 2007 by DaveDemarle (talk | contribs) (update with newest revision of the architecture)

The goal of client side delivery is to provide functions that can be called from a Python script to perform a parallel aggregation of data, with the result obtained on the client. The bulk of the implementation is already provided by the vtkSMGenericViewDisplayProxy (GVD) class, which is used throughout ParaView to obtain data for display within a view. We have modified the internals of the GVD, created a new class (vtkMinMax), and added a new Python function (fetch) to the paraview module that automates the use of the combined whole. The GVD is a proxy for the pipeline shown in the figure below.

[Figure: P3 csd.png, the GVD pipeline]

The vtkReductionFilter class within the GVD performs the communication needed to aggregate data on the server. The filter defers all data manipulation to two user-provided algorithms. The PreGatherHelper is a user-specified vtkAlgorithm that each node in the server executes in parallel before sending its data to the root node. Because it can shrink each node's contribution before any communication takes place, the PreGather algorithm is essential for scalability. The reduction filter then issues MPI calls to gather the output of each node onto the root node. Finally, it runs the PostGatherHelper algorithm on the gathered intermediate results and leaves the result on the root node. In much of ParaView 3 the PreGatherHelper is null and the PostGatherHelper is a vtkAppendPolyData filter, which creates a single vtkPolyData containing the data from every node.
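The three stages described above can be mimicked without VTK or MPI. The following is a minimal sketch in plain Python in which each server node is modeled as a list of values. The names simulate_reduction, append_all, and local_max are hypothetical and exist only for this illustration; they are not part of the ParaView API. The sketch also counts how many values reach the root, which illustrates why a PreGatherHelper matters for scalability.

```python
def simulate_reduction(node_data, pre_gather=None, post_gather=None):
    """Mimic pre-gather -> gather -> post-gather on in-memory lists.

    node_data   : list of per-node value lists (stand-ins for per-node datasets)
    pre_gather  : optional callable run on each node's list before the gather
    post_gather : optional callable run on the root over the gathered lists
    Returns (result, number_of_values_sent_to_root).
    """
    # Stage 1: every node runs the pre-gather helper locally (in parallel on a
    # real server; sequentially in this sketch).
    if pre_gather is not None:
        intermediate = [pre_gather(values) for values in node_data]
    else:
        intermediate = [list(values) for values in node_data]

    # Stage 2: gather the intermediate results on the root node.
    gathered = intermediate
    sent = sum(len(values) for values in gathered)

    # Stage 3: the root runs the post-gather helper over all gathered pieces.
    if post_gather is not None:
        result = post_gather(gathered)
    else:
        result = gathered
    return result, sent


def append_all(pieces):
    """Stand-in for an append filter: concatenate every node's piece."""
    return [v for piece in pieces for v in piece]


def local_max(values):
    """Stand-in for a reducing helper: keep a single value."""
    return [max(values)]


nodes = [[1, 5, 3], [9, 2, 4], [7, 8, 6]]

# Null pre-gather plus append: every value travels to the root.
appended, sent_all = simulate_reduction(nodes, None, append_all)

# Reducing on each node first: only one value per node travels to the root.
reduced, sent_few = simulate_reduction(
    nodes, local_max, lambda pieces: local_max(append_all(pieces)))

print(appended, sent_all)
print(reduced, sent_few)
```

With a reducing pre-gather step the amount of data gathered grows with the number of nodes rather than with the size of the dataset.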

The vtkClientServerMoveData portion of the GVD pipeline exists primarily on the root node of the server and the client. Its function is to serialize the data on the server and transmit it to the client.

The vtkUpdateSupressor is used to prevent redundant pipeline updates and to communicate piece and extent information to each node in the server.
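As a rough illustration of both roles of the update suppressor, the sketch below uses plain Python and does not reproduce the real VTK class. UpdateSuppressorSketch, force_update, read_piece, and the upstream_mtime argument are all hypothetical names chosen for this example. Each node asks for its own piece of the data, and the upstream pipeline is re-executed only when the request or the upstream modification time has changed.

```python
class UpdateSuppressorSketch:
    """Toy stand-in: runs the upstream pipeline only when its request changed."""

    def __init__(self, upstream):
        self.upstream = upstream   # callable(piece, num_pieces) -> data
        self.executions = 0        # number of times upstream really executed
        self._last_request = None
        self._cached = None

    def force_update(self, piece, num_pieces, upstream_mtime):
        """Return this node's piece, re-executing upstream only if needed."""
        request = (piece, num_pieces, upstream_mtime)
        if request != self._last_request:
            self._cached = self.upstream(piece, num_pieces)
            self._last_request = request
            self.executions += 1
        return self._cached


DATA = list(range(10))  # stand-in for the whole dataset


def read_piece(piece, num_pieces):
    """Hand node `piece` its contiguous share of DATA."""
    start = piece * len(DATA) // num_pieces
    stop = (piece + 1) * len(DATA) // num_pieces
    return DATA[start:stop]


node1 = UpdateSuppressorSketch(read_piece)
first = node1.force_update(1, 3, upstream_mtime=1)   # executes upstream
again = node1.force_update(1, 3, upstream_mtime=1)   # suppressed, served from cache
newer = node1.force_update(1, 3, upstream_mtime=2)   # upstream changed, runs again
print(first, node1.executions)
```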



New work

vtkReductionFilter
This filter has been modified to allow a pass-through of any individual node's data. This is useful when, for instance, the data of interest has already been collected on the root node. The filter has also been modified to use the PreGatherHelper for scalability during parallel reduction operations.
vtkMinMax
This is a new filter that performs a reduction operation on the attribute data of its input. As of this writing the operation is one of MIN, MAX, or SUM. The filter iterates through whatever datasets it is given and calls a templated operate function on each value. The templated function is necessary to avoid converting values to double, which would lose precision on arrays that are not double-valued. The filter accepts one or more input datasets on its first input port and always produces a single vtkPolyData output that contains exactly one point and one cell. It is designed to be used as both the PreGather and PostGather algorithms described above. In that case each node does a local reduction and then the root node does a global reduction on the intermediate results.
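The precision concern behind the templated operate function is easy to reproduce. The following is a minimal sketch in plain Python, where Python's int stands in for a 64-bit integer array type and float stands in for a C++ double. The helper names max_via_double and max_native are hypothetical and are not taken from vtkMinMax.

```python
def max_via_double(values):
    """Reduce after converting every value to double (lossy above 2**53)."""
    return int(max(float(v) for v in values))


def max_native(values):
    """Reduce in the array's own type, as a templated function would."""
    return max(values)


big = 2**53  # beyond this magnitude a double cannot represent every integer
values = [big, big + 1]

# The native reduction keeps the exact maximum; the round trip through
# double collapses big + 1 onto big.
print(max_native(values) == big + 1)
print(max_via_double(values) == big + 1)
```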
paraview.fetch
This is a new function that has been added to the paraview.py module. It automates the use of the above components and encapsulates many operations within a single function call. The code for the fetch function is included below:
def fetch(input, arg=None):
    """
    A convenience method that moves data from the server to the client,
    optionally performing some operation on the data as it moves.
    The input argument is the name of the (proxy for a) source or filter
    whose output is needed on the client.

    You can use fetch to do three things.
    If arg is None (the default) then the output is brought to the client
    unmodified. In parallel runs the Append or AppendPolyData filter merges
    the data on each processor into one dataset.

    If arg is an integer then one particular processor's output is brought to
    the client. In serial runs the arg is ignored. If you have a filter that
    computes results in parallel and brings them to the root node, then set
    arg to be 0.

    If arg is an algorithm, for example vtkMinMax, the algorithm will be applied
    to the data to obtain some result. In parallel runs the algorithm will be
    run on each processor to make intermediate results and then again on the
    root processor over all of the intermediate results to create a
    global result.
    """

    # create the pipeline that reduces and transmits the data
    gvd = CreateProxy("displays", "GenericViewDisplay")
    gvd.SetInput(input)

    if arg is None:
        # no argument: append the pieces from every node
        print("getting appended")
        if input.GetDataInformation().GetDataClassName() == "vtkPolyData":
            print("use append poly data filter")
            gvd.SetReductionType(1)
        else:
            print("use append filter")
            gvd.SetReductionType(2)

    elif isinstance(arg, int):
        # integer argument: pass through the data of that one node
        print("getting node %d" % arg)
        gvd.SetReductionType(3)
        gvd.SetPreGatherHelper(None)
        gvd.SetPostGatherHelper(None)
        gvd.SetPassThrough(arg)

    else:
        # algorithm argument: run it on every node, then again on the root
        print("applying operation")
        gvd.SetReductionType(3)
        gvd.SetPreGatherHelper(arg)
        gvd.SetPostGatherHelper(arg)
        gvd.SetPassThrough(-1)

    # go!
    gvd.UpdateVTKObjects()
    gvd.Update()
    return gvd.GetOutput()
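To see the three modes of fetch side by side without a running server, the sketch below applies the same dispatch to plain Python lists, one list per node. fetch_sketch is a hypothetical stand-in and not the paraview.fetch function itself; an ordinary callable takes the place of the algorithm proxy.

```python
def fetch_sketch(node_data, arg=None):
    """Mirror the three modes of fetch on a list of per-node value lists."""
    if arg is None:
        # Append mode: merge every node's piece into one dataset.
        return [v for piece in node_data for v in piece]
    if isinstance(arg, int):
        # Pass-through mode: return one node's piece unmodified.
        return list(node_data[arg])
    # Operation mode: apply the operation on each node, then once more on the
    # root over the intermediate results.
    intermediate = [arg(piece) for piece in node_data]
    return arg(intermediate)


nodes = [[0.1, 0.4], [0.9, 0.2], [0.3, 0.7]]
print(fetch_sketch(nodes))        # everything, appended
print(fetch_sketch(nodes, 0))     # node 0 only
print(fetch_sketch(nodes, max))   # global maximum
print(fetch_sketch(nodes, min))   # global minimum
```

Applying the same operation twice only works when a reduction of partial results equals the reduction of the whole, which holds for min, max, and sum. An operation such as a mean would need a different intermediate form.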

Example use

With the above infrastructure, data can be obtained from a client-side pvpython session with the following code:

import paraview

# connect to a running pvserver with one or more nodes
paraview.ActiveConnection = paraview.Connect("localhost", 11111)

# create a sample data set
sphere = paraview.CreateProxy("sources", "SphereSource")
elev = paraview.CreateProxy("filters", "ElevationFilter")
elev.SetInput(sphere)
elev.SetLowPoint(0, -.5, 0)
elev.SetHighPoint(0, .5, 0)

# choose the operation to perform in parallel
# (named max_op so that Python's built-in max is not shadowed)
max_op = paraview.CreateProxy("filters", "MinMax")
max_op.SetOperation("MAX")
max_op.UpdateVTKObjects()

# fetch the entire data set
out = paraview.fetch(elev)
print(out.GetBounds())

# fetch one node's portion of the data set
out = paraview.fetch(elev, 0)
print(out.GetBounds())

# calculate the maximum attribute values and show the results
out = paraview.fetch(elev, max_op)
arr = out.GetPointData().GetArray("Elevation")
print(arr.GetValue(0))

Status

Currently there is a known bug that occurs when the Pre and PostGather helpers within the GVD change the vtkDataSet type. For example, using a vtkAppendFilter on image data produces an unstructured grid dataset, which results in warning messages and incomplete results. The same effect can be observed when computing the Min of any non-polygonal dataset, because vtkMinMax produces a vtkPolyData with a single point and cell regardless of its input type. This bug will be fixed shortly.

To progress further we need to better define the set of aggregation operations that we need. The min/max/sum operation is admittedly very simple. What makes it so is that it is geometry independent and simply discards all of the input geometry it is given. Specific use cases that involve geometry aggregation are needed.