Hi, <br> I have recently gotten Burlen's code and updated it to work with the latest ParaView. Aside from vtkstd, there are a few other backward-incompatible VTK changes (see the VTK 6.0 section on the VTK wiki). But it is not too much work. I will be happy to send either of you my code changes if you need a reference.<br>
<br>Leo<br><br><br><div class="gmail_quote">On Fri, Jun 8, 2012 at 10:25 AM, Stephan Rogge <span dir="ltr"><<a href="mailto:Stephan.Rogge@tu-cottbus.de" target="_blank">Stephan.Rogge@tu-cottbus.de</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Someone told me that you have to clear your build directory completely and<br>
start a fresh PV build.<br>
<br>
Stephan<br>
<br>
-----Original Message-----<br>
From: burlen [mailto:<a href="mailto:burlen.loring@gmail.com" target="_blank">burlen.loring@gmail.com</a>]<br>
Sent: Friday, June 8, 2012 16:21<br>
<div><div>To: Stephan Rogge<br>
Cc: 'Yuanxin Liu'; <a href="mailto:paraview@paraview.org" target="_blank">paraview@paraview.org</a><br>
Subject: Re: [Paraview] Parallel Streamtracer<br>
<br>
Hi Stephan,<br>
<br>
Oh, thanks for the update, I wasn't aware of these changes. I have been<br>
working with 3.14.1.<br>
<br>
Burlen<br>
<br>
On 06/08/2012 01:47 AM, Stephan Rogge wrote:<br>
> Hello Burlen,<br>
><br>
> thank you very much for your post. I really would like to test your<br>
> plugin, so I've started to build it. Unfortunately I got a lot of<br>
> compiler errors (e.g. vtkstd isn't used in the PV master anymore).<br>
> Which PV version is your plugin based on?<br>
><br>
> Regards,<br>
> Stephan<br>
><br>
> -----Original Message-----<br>
> From: Burlen Loring [mailto:<a href="mailto:bloring@lbl.gov" target="_blank">bloring@lbl.gov</a>]<br>
> Sent: Thursday, June 7, 2012 17:54<br>
> To: Stephan Rogge<br>
> Cc: 'Yuanxin Liu'; <a href="mailto:paraview@paraview.org" target="_blank">paraview@paraview.org</a><br>
> Subject: Re: [Paraview] Parallel Streamtracer<br>
><br>
> Hi Stephan,<br>
><br>
> I experienced the scaling behavior you report when I was working on a<br>
> project that required generating millions of streamlines for a<br>
> topological mapping algorithm interactively in ParaView. To get the<br>
> required scaling I wrote a stream tracer that uses a load-on-demand<br>
> approach with a tunable block cache, so that all ranks could integrate<br>
> any streamline and stay busy throughout the entire computation. It was<br>
> very effective on our data, and I've used it to integrate 30 million<br>
> streamlines in about 10 minutes on 256 cores. If you really need<br>
> better scalability than the distributed-data tracing approach<br>
> implemented in PV, you might take a look at our work. The downside of<br>
> our approach is that, in order to provide the demand loading, the<br>
> reader has to implement a VTK object that gives the integrator direct<br>
> access to I/O functionality. In case you're interested, the stream<br>
> tracer class is vtkSQFieldTracer and our reader is vtkSQBOVReader.<br>
> The latest release can be found here<br>
> <a href="https://github.com/burlen/SciberQuestToolKit/tarball/SQTK-20120531" target="_blank">https://github.com/burlen/SciberQuestToolKit/tarball/SQTK-20120531</a><br>
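[Editor's note: the load-on-demand idea described above can be sketched roughly as below. This is an illustrative Python sketch only, not the actual vtkSQFieldTracer/vtkSQBOVReader code; the `load_block` callable and class name are hypothetical stand-ins for the reader's I/O interface.]<br>

```python
from collections import OrderedDict

class BlockCache:
    """Illustrative LRU cache of data blocks loaded on demand.

    `load_block` is a hypothetical stand-in for reader I/O; in the real
    plugin the reader exposes direct I/O access to the integrator.
    """
    def __init__(self, load_block, capacity=8):
        self.load_block = load_block   # callable: block_id -> block data
        self.capacity = capacity       # the tunable cache size
        self.cache = OrderedDict()     # block_id -> data, in LRU order

    def get(self, block_id):
        if block_id in self.cache:
            self.cache.move_to_end(block_id)    # mark most recently used
        else:
            if len(self.cache) >= self.capacity:
                self.cache.popitem(last=False)  # evict least recently used
            self.cache[block_id] = self.load_block(block_id)
        return self.cache[block_id]

# With such a cache, any rank can integrate any streamline: it loads
# whichever blocks the trajectory enters instead of owning a fixed
# subdomain, so no rank sits idle waiting for a handoff.
```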
><br>
> Burlen<br>
><br>
> On 06/04/2012 02:21 AM, Stephan Rogge wrote:<br>
>> Hello Leo,<br>
>><br>
>> ok, I took the "disk_out_ref.ex2" example data set and did some time<br>
>> measurements. Remember, my machine has 4 Cores + HyperThreading.<br>
>><br>
>> My first observation is that PV seems to have a problem with<br>
>> distributing the data when the Multi-Core option (GUI) is enabled.<br>
>> When PV is started with built-in Multi-Core, I was not able to apply<br>
>> a stream tracer with more than 1000 seed points (PV freezes and never<br>
>> comes back). When the pvserver processes were started manually, on<br>
>> the other hand, I was able to use up to 100,000 seed points. Is this<br>
>> a bug?<br>
>><br>
>> Now let's have a look at the scaling performance. As you suggested,<br>
>> I've used the D3 filter to distribute the data among the processes.<br>
>> The stream tracer execution time for 10,000 seed points:<br>
>><br>
>> ## Builtin: 10.063 seconds<br>
>> ## 1 MPI process (no D3): 10.162 seconds<br>
>> ## 4 MPI processes: 15.615 seconds<br>
>> ## 8 MPI processes: 14.103 seconds<br>
>><br>
>> and 100,000 seed points:<br>
>><br>
>> ## Builtin: 100.603 seconds<br>
>> ## 1 MPI process (no D3): 100.967 seconds<br>
>> ## 4 MPI processes: 168.1 seconds<br>
>> ## 8 MPI processes: 171.325 seconds<br>
>><br>
>> I cannot see any positive scaling behavior here. Maybe this example<br>
>> is not appropriate for scaling measurements?<br>
>><br>
>> One more thing: I've visualized the vtkProcessId and saw that the<br>
>> whole vector field is partitioned. I thought that each streamline was<br>
>> integrated in its own process, but it seems that this is not the case.<br>
>> This could explain my scaling issues: for small vector fields the<br>
>> synchronization overhead becomes too large and decreases the overall<br>
>> performance.<br>
>> My suggestion is to have a parallel StreamTracer which is built for a<br>
>> single machine with several threads. Could it be worthwhile to<br>
>> randomly distribute the seeds over all available (local) processes?<br>
>> Of course, each process would have access to the whole vector field.<br>
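[Editor's note: the suggestion above can be sketched as follows. If every local worker sees the whole vector field, seeds can simply be scattered among the workers and each streamline is integrated entirely by the worker that owns its seed, with no communication during integration. An illustrative Python sketch, not ParaView code:]<br>

```python
import random

def distribute_seeds(seeds, n_workers, rng_seed=0):
    """Randomly scatter seed points over n_workers local workers.

    Assumes each worker can read the whole vector field, so a
    streamline never needs to be handed off between subdomains.
    """
    rng = random.Random(rng_seed)  # fixed seed for reproducibility
    buckets = [[] for _ in range(n_workers)]
    for p in seeds:
        buckets[rng.randrange(n_workers)].append(p)
    return buckets

# Example: 1000 seeds over 8 workers; every seed lands in exactly one
# bucket, and each worker integrates only its own bucket.
buckets = distribute_seeds(list(range(1000)), 8)
```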
>><br>
>> Cheers,<br>
>> Stephan<br>
>><br>
>><br>
>><br>
>> From: Yuanxin Liu [mailto:<a href="mailto:leo.liu@kitware.com" target="_blank">leo.liu@kitware.com</a>]<br>
>> Sent: Friday, June 1, 2012 16:13<br>
>> To: Stephan Rogge<br>
>> Cc: Andy Bauer; <a href="mailto:paraview@paraview.org" target="_blank">paraview@paraview.org</a><br>
>> Subject: Re: [Paraview] Parallel Streamtracer<br>
>><br>
>> Hi, Stephan,<br>
>> I did measure the performance at some point and was able to get<br>
>> fairly decent speedup with more processors, so I am surprised you<br>
>> are seeing such huge latency.<br>
>><br>
>> Of course, the performance is sensitive to the input. It is also<br>
>> sensitive to how readers distribute data. So, one thing you might<br>
>> want to try is to attach the "D3" filter to the reader.<br>
>><br>
>> If that doesn't help, I will be happy to get your data and take a<br>
>> look.<br>
>> Leo<br>
>><br>
>> On Fri, Jun 1, 2012 at 1:54 AM, Stephan<br>
>> Rogge<<a href="mailto:Stephan.Rogge@tu-cottbus.de" target="_blank">Stephan.Rogge@tu-cottbus.de</a>><br>
>> wrote:<br>
>> Leo,<br>
>><br>
>> As I mentioned in my initial post of this thread: I used the<br>
>> up-to-date master branch of ParaView. Which means I have already used<br>
>> your implementation.<br>
>><br>
>> I can imagine that parallelizing this algorithm is very tough. And I<br>
>> can see that distributing the calculation over 8 processes does not<br>
>> lead to nice scaling.<br>
>><br>
>> But I don't understand the huge amount of latency when using the<br>
>> StreamTracer in CAVE mode with two viewports and two pvserver<br>
>> processes on the same machine (with an extra machine for the client).<br>
>> I guess the tracer filter is applied for each viewport separately?<br>
>> This would be ok as long as both filter executions run in parallel,<br>
>> and I doubt that this is the case.<br>
>> Can you help to clarify my problem?<br>
>><br>
>> Regards,<br>
>> Stephan<br>
>><br>
>><br>
>> From: Yuanxin Liu [mailto:<a href="mailto:leo.liu@kitware.com" target="_blank">leo.liu@kitware.com</a>]<br>
>> Sent: Thursday, May 31, 2012 21:33<br>
>> To: Stephan Rogge<br>
>> Cc: Andy Bauer; <a href="mailto:paraview@paraview.org" target="_blank">paraview@paraview.org</a><br>
>> Subject: Re: [Paraview] Parallel Streamtracer<br>
>><br>
>> It is in the current VTK and ParaView master. The class is<br>
>> vtkPStreamTracer.<br>
>><br>
>> Leo<br>
>> On Thu, May 31, 2012 at 3:31 PM, Stephan<br>
>> Rogge<<a href="mailto:stephan.rogge@tu-cottbus.de" target="_blank">stephan.rogge@tu-cottbus.de</a>><br>
>> wrote:<br>
>> Hi, Andy and Leo,<br>
>><br>
>> thanks for your replies.<br>
>><br>
>> Is it possible to get this new implementation? I would like to give it a try.<br>
>><br>
>> Regards,<br>
>> Stephan<br>
>><br>
>> On May 31, 2012, at 17:48, Yuanxin Liu<<a href="mailto:leo.liu@kitware.com" target="_blank">leo.liu@kitware.com</a>> wrote:<br>
>> Hi, Stephan,<br>
>> The previous implementation only has serial performance: it<br>
>> traces the streamlines one at a time and never starts a new<br>
>> streamline until the previous one finishes. With communication<br>
>> overhead, it is not surprising that it got slower.<br>
>><br>
>> My new implementation lets the processes work on different<br>
>> streamlines simultaneously and should scale much better.<br>
>><br>
>> Leo<br>
>><br>
>> On Thu, May 31, 2012 at 11:27 AM, Andy Bauer<<a href="mailto:andy.bauer@kitware.com" target="_blank">andy.bauer@kitware.com</a>><br>
>> wrote:<br>
>> Hi Stephan,<br>
>><br>
>> The parallel stream tracer uses the partitioning of the grid to<br>
>> determine which process does the integration. When a streamline<br>
>> exits the subdomain of a process, there is a search to see if it<br>
>> enters a subdomain assigned to any other process before concluding<br>
>> that it has left the entire domain.<br>
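[Editor's note: the handoff described above can be illustrated with a toy 1-D sketch. This is not VTK code; each "rank" here owns an interval, and integration hops to whichever rank's subdomain the line enters, terminating only when no subdomain contains the point.]<br>

```python
def owner(x, subdomains):
    """Return the rank whose subdomain [lo, hi) contains x, or None."""
    for rank, (lo, hi) in enumerate(subdomains):
        if lo <= x < hi:
            return rank
    return None  # the point has left the entire domain

def trace(x, velocity, subdomains, dt=0.1, max_steps=1000):
    """Integrate a toy 1-D 'streamline', recording the owning rank per step."""
    path = []
    for _ in range(max_steps):
        rank = owner(x, subdomains)  # search the subdomains for the point
        if rank is None:
            break                    # left the whole domain: stop tracing
        path.append(rank)
        x += dt * velocity(x)        # one Euler step inside that rank
    return path

# Two ranks own [0,1) and [1,2); a line starting at 0.8 with unit
# velocity is handed off from rank 0 to rank 1, then exits at x >= 2.
path = trace(0.8, lambda x: 1.0, [(0.0, 1.0), (1.0, 2.0)])
```

The handoff search is what makes the algorithm sensitive to the partitioning: a streamline crossing many subdomains serializes work across ranks.<br>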
>><br>
>> Leo, copied here, has been improving the streamline implementation<br>
>> inside of VTK so you may want to get his newer version. It is a<br>
>> pretty tough algorithm to parallelize efficiently without making any<br>
>> assumptions on the flow or partitioning.<br>
>><br>
>> Andy<br>
>><br>
>> On Thu, May 31, 2012 at 4:16 AM, Stephan<br>
>> Rogge<<a href="mailto:Stephan.Rogge@tu-cottbus.de" target="_blank">Stephan.Rogge@tu-cottbus.de</a>><br>
>> wrote:<br>
>> Hello,<br>
>><br>
>> I have a question related to the parallelism of the stream tracer: if<br>
>> I understand the code correctly, each line integration (trace) is<br>
>> processed in its own MPI process. Right?<br>
>><br>
>> To test the scalability of the stream tracer I've loaded a<br>
>> structured (curvilinear) grid, applied the filter with a seed<br>
>> resolution of 1500, and checked the timings in single-core and<br>
>> multi-core (Multi Core enabled in the PV GUI) situations.<br>
>><br>
>> I was really surprised that multi-core slows the execution down to 4<br>
>> seconds; single-core takes only 1.2 seconds. Data migration cannot<br>
>> explain that behavior (0.5 seconds). What is the problem here?<br>
>> Please see attached some statistics...<br>
>><br>
>> Data:<br>
>> * Structured (Curvilinear) Grid<br>
>> * 244030 Cells<br>
>> * 37 MB Memory<br>
>><br>
>> System:<br>
>> * Intel i7-2600K (4 Cores + HT = 8 Threads)<br>
>> * 16 GB Ram<br>
>> * Windows 7 64 Bit<br>
>> * ParaView (master-branch, 64 bit compilation)<br>
>><br>
>> #################################<br>
>> Single Thread (Seed resolution 1500):<br>
>> #################################<br>
>><br>
>> Local Process<br>
>> Still Render, 0.014 seconds<br>
>> RenderView::Update, 1.222 seconds<br>
>> vtkPVView::Update, 1.222 seconds<br>
>> Execute vtkStreamTracer id: 2184, 1.214 seconds<br>
>> Still Render, 0.015 seconds<br>
>><br>
>> #################################<br>
>> Eight Threads (Seed resolution 1500):<br>
>> #################################<br>
>><br>
>> Local Process<br>
>> Still Render, 0.029 seconds<br>
>> RenderView::Update, 4.134 seconds<br>
>> vtkSMDataDeliveryManager: Deliver Geome, 0.619 seconds<br>
>> FullRes Data Migration, 0.619 seconds<br>
>> Still Render, 0.042 seconds<br>
>> OpenGL Dev Render, 0.01 seconds<br>
>><br>
>><br>
>> Render Server, Process 0<br>
>> RenderView::Update, 4.134 seconds<br>
>> vtkPVView::Update, 4.132 seconds<br>
>> Execute vtkStreamTracer id: 2193, 3.941 seconds<br>
>> FullRes Data Migration, 0.567 seconds<br>
>> Dataserver gathering to 0, 0.318 seconds<br>
>> Dataserver sending to client, 0.243 seconds<br>
>><br>
>> Render Server, Process 1<br>
>> Execute vtkStreamTracer id: 2193, 3.939 seconds<br>
>><br>
>> Render Server, Process 2<br>
>> Execute vtkStreamTracer id: 2193, 3.938 seconds<br>
>><br>
>> Render Server, Process 3<br>
>> Execute vtkStreamTracer id: 2193, 4.12 seconds<br>
>><br>
>> Render Server, Process 4<br>
>> Execute vtkStreamTracer id: 2193, 3.938 seconds<br>
>><br>
>> Render Server, Process 5<br>
>> Execute vtkStreamTracer id: 2193, 3.939 seconds<br>
>><br>
>> Render Server, Process 6<br>
>> Execute vtkStreamTracer id: 2193, 3.938 seconds<br>
>><br>
>> Render Server, Process 7<br>
>> Execute vtkStreamTracer id: 2193, 3.939 seconds<br>
>><br>
>> Cheers,<br>
>> Stephan<br>
>><br>
>><br>
>> _______________________________________________<br>
>> Powered by <a href="http://www.kitware.com" target="_blank">www.kitware.com</a><br>
>><br>
>> Visit other Kitware open-source projects at<br>
>> <a href="http://www.kitware.com/opensource/opensource.html" target="_blank">http://www.kitware.com/opensource/opensource.html</a><br>
>><br>
>> Please keep messages on-topic and check the ParaView Wiki at:<br>
>> <a href="http://paraview.org/Wiki/ParaView" target="_blank">http://paraview.org/Wiki/ParaView</a><br>
>><br>
>> Follow this link to subscribe/unsubscribe:<br>
>> <a href="http://www.paraview.org/mailman/listinfo/paraview" target="_blank">http://www.paraview.org/mailman/listinfo/paraview</a><br>
>><br>
>><br>
>><br>
>><br>
>><br>
><br>
<br>
<br>
</div></div></blockquote></div><br>