[Paraview] distributed stream tracer scalability issue

Tue Aug 25 09:07:08 EDT 2009

Hi Burlen,

The only reason I implemented it the way it is was a time constraint. It
can be made much more scalable (hint hint ;-) ).

As for the issue with building links, we are working on incorporating
the BSP tree locator as an alternative to vtkPointSet::FindCell().
Thanks John for contributing it.
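
To illustrate why a spatial locator helps with "which cell contains this
point?" queries, here is a rough sketch in plain Python. It is not VTK
code and does not reproduce vtkPointSet::FindCell() or the BSP tree
locator: it uses a uniform grid of bins as a simple stand-in, and the
names (BinLocator, find_cell_brute_force, bin_size) are made up for the
example. Cells are assumed to be axis-aligned 2D boxes.

```python
from collections import defaultdict


class BinLocator:
    """Toy 2D locator: buckets axis-aligned box cells into a uniform grid."""

    def __init__(self, cells, bin_size):
        # cells: list of (xmin, ymin, xmax, ymax); the list index is the cell id.
        self.cells = cells
        self.bin_size = bin_size
        self.bins = defaultdict(list)
        for cell_id, (x0, y0, x1, y1) in enumerate(cells):
            # Register the cell in every bin its bounding box overlaps.
            for i in range(self._bin(x0), self._bin(x1) + 1):
                for j in range(self._bin(y0), self._bin(y1) + 1):
                    self.bins[(i, j)].append(cell_id)

    def _bin(self, coord):
        # Index of the bin containing this coordinate.
        return int(coord // self.bin_size)

    def find_cell(self, x, y):
        # Only test the cells registered in the bin that holds the query point.
        for cell_id in self.bins.get((self._bin(x), self._bin(y)), []):
            x0, y0, x1, y1 = self.cells[cell_id]
            if x0 <= x <= x1 and y0 <= y <= y1:
                return cell_id
        return -1  # the point is outside every cell


def find_cell_brute_force(cells, x, y):
    # Reference implementation: test every cell in turn.
    for cell_id, (x0, y0, x1, y1) in enumerate(cells):
        if x0 <= x <= x1 and y0 <= y <= y1:
            return cell_id
    return -1


# A 10 x 10 grid of unit cells; cell id = row * 10 + column.
cells = [(i, j, i + 1, j + 1) for j in range(10) for i in range(10)]
locator = BinLocator(cells, bin_size=2.0)
print(locator.find_cell(3.5, 7.25), find_cell_brute_force(cells, 3.5, 7.25))
```

Both calls return the same cell id, but the locator only inspects the
handful of cells registered in one bin, which is the property that matters
when a streamline asks this question at every integration step.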

-berk

On Tue, Aug 25, 2009 at 4:42 AM, John Biddiscombe <biddisco at cscs.ch> wrote:
> Burlen
>
> I have had performance issues with the Distributed Stream Tracer, but I
> found that its lack of optimization for parallel operation was not the
> main trouble. If you are using Unstructured Grids and they are large (in
> my case 20 million cells in a block), then most of the time is taken by
> building the cell links which are used by FindCell to find the cell in
> which an integration point lies. I modified the stream tracer
> interpolation to use a BSP tree (or CellLocator) and found a huge
> improvement in execution time (minutes instead of hours).
>
> Secondly, the parallelization of the stream tracer is an inherent problem.
> One cannot integrate the streamline in block 2 until it has reached a
> boundary in block 1: one must wait until the streamline traverses one block
> before passing it to the next. In practice, the implementation could be
> improved with more intelligent seeding and sending/receiving of streamline
> seeds etc. between iterations.
>
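The block-to-block dependency can be sketched with a toy model. This is
not the Distributed Stream Tracer's actual algorithm, only an illustration
under simple assumptions: a 1D domain cut into consecutive blocks,
streamlines that always move to the right by a fixed positive step, and
one exchange of streamlines per round. The function name
trace_across_blocks and its parameters are made up for the example.

```python
def trace_across_blocks(seeds, block_edges, step=0.25):
    """Advance 1D 'streamlines' to the right through consecutive blocks.

    block_edges: sorted boundaries, e.g. [0, 1, 2, 3] defines three blocks.
    step: fixed positive integration step.
    Returns, for each round, the set of block ids that had work to do.
    """
    n_blocks = len(block_edges) - 1

    def owner(x):
        # Id of the block containing x, or None if x left the domain.
        for b in range(n_blocks):
            if block_edges[b] <= x < block_edges[b + 1]:
                return b
        return None

    # Each block keeps a queue of positions waiting to be integrated.
    queues = {b: [] for b in range(n_blocks)}
    for x in seeds:
        b = owner(x)
        if b is not None:
            queues[b].append(x)

    busy_per_round = []
    while any(queues.values()):
        busy_per_round.append({b for b, q in queues.items() if q})
        outgoing = []
        for b, q in queues.items():
            for x in q:
                # Integrate until the streamline leaves this block.
                while x < block_edges[b + 1]:
                    x += step
                outgoing.append(x)
            q.clear()
        # Exchange step: hand each streamline to the block that now holds it.
        for x in outgoing:
            b = owner(x)
            if b is not None:
                queues[b].append(x)
    return busy_per_round


print(trace_across_blocks([0.1], [0, 1, 2, 3]))
print(trace_across_blocks([0.1, 1.1, 2.1], [0, 1, 2, 3]))
```

With a single seed only one block is busy in each round, so the other
blocks sit idle. Seeding every block keeps more of them busy in the first
round, which is the kind of gain smarter seeding and exchange could bring.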
> The Particle tracer code could be modified to produce streamlines in a serial
> or distributed manner and ought to give a 'reasonably' optimal solution to
> the problem - but in fact the chaps at Kitware are at the moment (they tell
> me) in the process of revamping the streamline code to make use of
> CellLocators - and for this reason I recently committed my BSP tree code.
>
> Here's how to check your bottleneck.
> Find a large StructuredGrid dataset which is loaded in parallel. Generate
> streamlines. Time it. Convert the grid to UnstructuredGrid and do the same.
> If test 1 takes 1 minute and test 2 takes 1 hour, then it isn't the
> parallelization that's the real issue, but the grid being used.
>
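The comparison above could be scripted along these lines. This is only a
sketch: timed and likely_bottleneck are hypothetical helpers, the two
streamline runs themselves are not shown, and the ratio_threshold of 10
is an arbitrary assumption rather than a recommended value.

```python
import time


def timed(fn, *args, **kwargs):
    """Run fn and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def likely_bottleneck(structured_seconds, unstructured_seconds,
                      ratio_threshold=10.0):
    """Guess where the time goes from the two timings.

    If the unstructured run is far slower than the structured one, the grid
    type (cell location) is the likely culprit, not the parallelization.
    """
    ratio = unstructured_seconds / structured_seconds
    if ratio >= ratio_threshold:
        return "grid"
    return "not the grid"


# The example from the message: 1 minute versus 1 hour is a factor of 60.
print(likely_bottleneck(60.0, 3600.0))
```

In practice one would wrap each streamline run in timed(...) and pass the
two elapsed times to likely_bottleneck(...).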
> JB
>
>> We've been using the distributed stream tracer to generate 100s-1000s of
>> stream lines per time step. It's very slow, and it doesn't scale at all.
>> The class comments say as much. I'm sure there is a reason why this
>> implementation was chosen. Is there something that generally prevents a
>> real parallel implementation? Is there a better implementation available
>> out there?
>>
>> There is this post a while back
>> http://www.paraview.org/pipermail/paraview/2009-July/012959.html
>>
>> What's the status?
>>
>> Thanks
>> Burlen
>>
>> _______________________________________________
>> Powered by www.kitware.com
>>
>> Visit other Kitware open-source projects at
>> http://www.kitware.com/opensource/opensource.html
>>
>> Please keep messages on-topic and check the ParaView Wiki at:
>> http://paraview.org/Wiki/ParaView
>>
>> Follow this link to subscribe/unsubscribe:
>> http://www.paraview.org/mailman/listinfo/paraview
>
>
> --
> John Biddiscombe,                            email:biddisco @ cscs.ch
> http://www.cscs.ch/
> CSCS, Swiss National Supercomputing Centre  | Tel:  +41 (91) 610.82.07
> Via Cantonale, 6928 Manno, Switzerland      | Fax:  +41 (91) 610.82.82