Client Side Delivery: Difference between revisions
DaveDemarle (talk | contribs) m (add info about collection automation) |
DaveDemarle (talk | contribs) mNo edit summary |
||
(10 intermediate revisions by the same user not shown) | |||
Line 1: | Line 1: | ||
The goal of client side delivery is to provide functions that can be called from a python script to perform a parallel aggregation of data | The goal of client side delivery is to provide functions that can be called from a python script to perform a parallel aggregation of data from the server to the client. The bulk of the implementation occurs inside the '''vtkSMGenericViewDisplayProxy''' (GVD) class. This class is used throughout paraview3 to obtain data to display within a view. We have made modifications to the internals of the GVD and created a new class and a new python function that automates the use of the combined whole. | ||
The GVD is a proxy for the pipeline shown in the figure below. | |||
{| border="0" | |||
|- bgcolor="#abcdef" | |||
|- | |||
| | |||
[[Image:client_side_delivery.png]] | |||
|| | |||
The '''vtkReductionFilter''' class within the GVD does the communication needed to perform data aggregation on the server. This filter defers all data manipulation to two user provided algorithms. The '''PreGatherHelper''' is a user specified vtkAlgorithm that each node in the server executes in parallel _before_ sending their data to the root node. The PreGather algorithm is essential for scalability. The reduction filter then issues MPI calls to gather the output of each node to the root node in the server. Next it runs the '''PostGatherHelper''' algorithm on the intermediate results and leaves the result on the root node. In much of paraview3 the PreGatherHelper is null and the PostGatherHelper is a vtkPolyDataAppend filter which creates a single vtkPolyData with primitives and attributes from every node. | |||
The '''vtkClientServerMoveData''' portion of the GVD pipeline exists primarily on the root node of the server and the client. Its function is to serialize the data on the server and transmit it to the client. | |||
; | The '''vtkUpdateSupressor''' is used to prevent redundant pipeline updates and to communicate piece and extent information to each node in the server. | ||
|} | |||
== New work == | |||
;vtkReductionFilter: This filter has been modified to allow a passthrough of any individual nodes data. This is useful when the data of interest has already been collected to the root node for instance. The PreGatherHelper has been added for scalability during parallel reduction operations. | |||
;vtkMinMax: This is a new filter that performs an operation on the attribute data of its input. As of this writing the operation is one of MIN, MAX or SUM. The filter iterates through whatever datasets it is given and calls a templated operate function on each value. The templated function is necessary to avoid double conversion and loss of precision on non double valued arrays. This filter can take one or more input datasets on its first input port and always produces a single vtkPolyData output that contains exactly one point and one cell. This filter is designed to be used as both the PreGatherHelper and PostGatherHelper algorithms above. In this case each node does a local reduction and then the root node does a global reduction on the intermediate results. | |||
;paraview.Fetch: This is a new function that has been added to the paraview.py module. It automates the use of the above components and encapsulates many operations within a single function call. The code for the Fetch function is included below: | |||
def Fetch(input, arg=None): | |||
""" | |||
A convenience method that moves data from the server to the client, | |||
optionally performing some operation on the data as it moves. | |||
The input argument is the name of the (proxy for a) source or filter | |||
whose output is needed on the client. | |||
You can use fetch to do three things. | |||
If arg is None (the default) then the output is brought to the client | |||
unmodified. In parallel runs the Append or AppenPolyData filters merges the | |||
data on each processor into one dataset. | |||
If arg is an integer then one particular processor's output is brought to | |||
the client. In serial runs the arg is ignored. If you have a filter that | |||
computes results in parallel and brings them to the root node, then set | |||
arg to be 0. | |||
If arg is an algorithm, for example vtkMinMax, the algorithm will be applied | |||
to the data to obtain some result. In parallel runs the algorithm will be | |||
run on each processor to make intermediate results and then again on the | |||
root processor over all of the intermediate results to create a | |||
global result. | |||
""" | |||
import types | |||
#create the pipeline that reduces and transmits the data | |||
gvd = CreateProxy("displays", "GenericViewDisplay") | |||
gvd.SetInput(input) | |||
if arg == None: | |||
print "getting appended" | |||
if input.GetDataInformation().GetDataClassName() == "vtkPolyData": | |||
print "use append poly data filter" | |||
gvd.SetReductionType(1) | |||
else: | |||
print "use append filter" | |||
gvd.SetReductionType(2) | |||
elif type(arg) is types.IntType: | |||
print "getting node %d" % arg | |||
gvd.SetReductionType(3) | |||
gvd.SetPreGatherHelper(None) | |||
gvd.SetPostGatherHelper(None) | |||
gvd.SetPassThrough(arg) | |||
else: | |||
print "applying operation" | |||
gvd.SetReductionType(3) | |||
gvd.SetPreGatherHelper(arg) | |||
gvd.SetPostGatherHelper(arg) | |||
gvd.SetPassThrough(-1) | |||
#go! | |||
gvd.UpdateVTKObjects() | |||
gvd.Update() | |||
return gvd.GetOutput() | |||
== Example use == | == Example use == | ||
With the above infrastucture data can be obtained easily from a client side pvpython session as in the following example. | |||
import paraview | import paraview | ||
#connect to a running pvserver | #connect to a running pvserver with one or more nodes | ||
paraview.ActiveConnection = paraview.Connect("localhost", 11111) | paraview.ActiveConnection = paraview.Connect("localhost", 11111) | ||
#create a sample data set | #create a sample data set | ||
sphere = paraview.CreateProxy("sources", "SphereSource") | sphere = paraview.CreateProxy("sources", "SphereSource") | ||
elev = paraview.CreateProxy("filters", "ElevationFilter") | elev = paraview.CreateProxy("filters", "ElevationFilter") | ||
elev.SetInput(sphere) | elev.SetInput(sphere) | ||
elev.SetLowPoint(0,-.5,0) | |||
elev.SetHighPoint(0,.5,0) | |||
# | #choose the operation to perform in parallel | ||
max = paraview.CreateProxy("filters", "MinMax") | |||
max.SetOperation("MAX") | |||
max.UpdateVTKObjects() | |||
#the | #fetch the entire data set | ||
out = | out = paraview.Fetch(elev) | ||
print out.GetBounds() | |||
#fetch one node's portion of the data set | |||
out = paraview.Fetch(elev, 0) | |||
print out.GetBounds() | |||
#calculate the maximum attribute values and show the results | |||
out = paraview.Fetch(elev, max) | |||
arr = out.GetPointData().GetArray("Elevation") | arr = out.GetPointData().GetArray("Elevation") | ||
arr.GetValue(0) | arr.GetValue(0) | ||
== Status == | |||
Currently there is a bug that happens when the GVD attempts to change the vtkDataSet type within its Pre and PostGather helpers. Using a vtkAppendFilter on image data will produce an unstructured grid dataset which results in warning messages and incomplete results. The same effect can be observed when computing the Max of any non polygonal dataset. We hope to address this bug shortly. | |||
Beyond this issue, to progress further we need to better define the set of aggregation operations that we need. The min/max/sum operation is admittedly very simple and what makes it so is that it is geometry independent and simply throws away all of the input geometry it is given. Specific use cases that involve geometry aggregation are needed. |
Latest revision as of 15:42, 7 February 2007
The goal of client side delivery is to provide functions that can be called from a python script to perform a parallel aggregation of data from the server to the client. The bulk of the implementation occurs inside the vtkSMGenericViewDisplayProxy (GVD) class. This class is used throughout paraview3 to obtain data to display within a view. We have made modifications to the internals of the GVD and created a new class and a new python function that automates the use of the combined whole.
The GVD is a proxy for the pipeline shown in the figure below.
The vtkReductionFilter class within the GVD does the communication needed to perform data aggregation on the server. This filter defers all data manipulation to two user provided algorithms. The PreGatherHelper is a user specified vtkAlgorithm that each node in the server executes in parallel _before_ sending their data to the root node. The PreGather algorithm is essential for scalability. The reduction filter then issues MPI calls to gather the output of each node to the root node in the server. Next it runs the PostGatherHelper algorithm on the intermediate results and leaves the result on the root node. In much of paraview3 the PreGatherHelper is null and the PostGatherHelper is a vtkPolyDataAppend filter which creates a single vtkPolyData with primitives and attributes from every node. The vtkClientServerMoveData portion of the GVD pipeline exists primarily on the root node of the server and the client. Its function is to serialize the data on the server and transmit it to the client. The vtkUpdateSupressor is used to prevent redundant pipeline updates and to communicate piece and extent information to each node in the server. |
New work
- vtkReductionFilter
- This filter has been modified to allow a passthrough of any individual nodes data. This is useful when the data of interest has already been collected to the root node for instance. The PreGatherHelper has been added for scalability during parallel reduction operations.
- vtkMinMax
- This is a new filter that performs an operation on the attribute data of its input. As of this writing the operation is one of MIN, MAX or SUM. The filter iterates through whatever datasets it is given and calls a templated operate function on each value. The templated function is necessary to avoid double conversion and loss of precision on non double valued arrays. This filter can take one or more input datasets on its first input port and always produces a single vtkPolyData output that contains exactly one point and one cell. This filter is designed to be used as both the PreGatherHelper and PostGatherHelper algorithms above. In this case each node does a local reduction and then the root node does a global reduction on the intermediate results.
- paraview.Fetch
- This is a new function that has been added to the paraview.py module. It automates the use of the above components and encapsulates many operations within a single function call. The code for the Fetch function is included below:
def Fetch(input, arg=None): """ A convenience method that moves data from the server to the client, optionally performing some operation on the data as it moves. The input argument is the name of the (proxy for a) source or filter whose output is needed on the client. You can use fetch to do three things. If arg is None (the default) then the output is brought to the client unmodified. In parallel runs the Append or AppenPolyData filters merges the data on each processor into one dataset. If arg is an integer then one particular processor's output is brought to the client. In serial runs the arg is ignored. If you have a filter that computes results in parallel and brings them to the root node, then set arg to be 0. If arg is an algorithm, for example vtkMinMax, the algorithm will be applied to the data to obtain some result. In parallel runs the algorithm will be run on each processor to make intermediate results and then again on the root processor over all of the intermediate results to create a global result. """ import types #create the pipeline that reduces and transmits the data gvd = CreateProxy("displays", "GenericViewDisplay") gvd.SetInput(input) if arg == None: print "getting appended" if input.GetDataInformation().GetDataClassName() == "vtkPolyData": print "use append poly data filter" gvd.SetReductionType(1) else: print "use append filter" gvd.SetReductionType(2) elif type(arg) is types.IntType: print "getting node %d" % arg gvd.SetReductionType(3) gvd.SetPreGatherHelper(None) gvd.SetPostGatherHelper(None) gvd.SetPassThrough(arg) else: print "applying operation" gvd.SetReductionType(3) gvd.SetPreGatherHelper(arg) gvd.SetPostGatherHelper(arg) gvd.SetPassThrough(-1) #go! gvd.UpdateVTKObjects() gvd.Update() return gvd.GetOutput()
Example use
With the above infrastucture data can be obtained easily from a client side pvpython session as in the following example.
import paraview #connect to a running pvserver with one or more nodes paraview.ActiveConnection = paraview.Connect("localhost", 11111) #create a sample data set sphere = paraview.CreateProxy("sources", "SphereSource") elev = paraview.CreateProxy("filters", "ElevationFilter") elev.SetInput(sphere) elev.SetLowPoint(0,-.5,0) elev.SetHighPoint(0,.5,0) #choose the operation to perform in parallel max = paraview.CreateProxy("filters", "MinMax") max.SetOperation("MAX") max.UpdateVTKObjects() #fetch the entire data set out = paraview.Fetch(elev) print out.GetBounds() #fetch one node's portion of the data set out = paraview.Fetch(elev, 0) print out.GetBounds() #calculate the maximum attribute values and show the results out = paraview.Fetch(elev, max) arr = out.GetPointData().GetArray("Elevation") arr.GetValue(0)
Status
Currently there is a bug that happens when the GVD attempts to change the vtkDataSet type within its Pre and PostGather helpers. Using a vtkAppendFilter on image data will produce an unstructured grid dataset which results in warning messages and incomplete results. The same effect can be observed when computing the Max of any non polygonal dataset. We hope to address this bug shortly.
Beyond this issue, to progress further we need to better define the set of aggregation operations that we need. The min/max/sum operation is admittedly very simple and what makes it so is that it is geometry independent and simply throws away all of the input geometry it is given. Specific use cases that involve geometry aggregation are needed.