Sorry for the confusion; I'll try to explain my situation in more detail.

The first example shows that pvserver is launchable when only one machine is involved as the pvserver. I could connect to it using any binary version of the ParaView client from any machine.

In the second example, I'm trying to run a parallel server using both machines. I added the MPI bin directory to PATH in my .bashrc file and got the following console output:
    yewyong@vrc1:~/installer/ParaView-3.4.0_build/bin> mpirun -hostfile ../../openmpi-1.2.8_build/etc/quadcore_frodo -np 6 ./pvserver --use-offscreen-rendering
    Password:
    [192.168.0.10][0,1,1][btl_tcp_endpoint.c:572:mca_btl_tcp_endpoint_complete_connect] connect() failed with errno=110
    [192.168.0.10][0,1,3][btl_tcp_endpoint.c:572:mca_btl_tcp_endpoint_complete_connect] connect() failed with errno=110

After that the machine gives no further response. When I press Ctrl-C, the following appears:

    mpirun: killing job...

    mpirun noticed that job rank 0 with PID 20731 on node 192.168.0.10 exited on signal 15 (Terminated).
    5 additional processes aborted (not shown)

Any ideas what's going on?

Thanks,

On Fri, Nov 7, 2008 at 12:08 AM, John M. Patchett <patchett@lanl.gov> wrote:
<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><div>
On Nov 6, 2008, at 8:34 AM, Moreland, Kenneth wrote:

I am a bit confused by your question. Are you saying the first example works and the second does not? I am also confused by what you are trying to set up. Are you trying to run the server on one of your machines and the client on the other, or are you trying to run a parallel server using both machines?

In the second example, it looks like your OpenMPI install is messed up. In the 4th line (1st error), it complains about not finding orted, which is a daemon that the OpenMPI version of mpirun uses to launch processes. It does not look like pvserver is being run at all.
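(As an illustration of what "not finding orted" means in practice: this version of OpenMPI starts orted on each remote node through a non-interactive ssh/rsh shell, so a quick check along the following lines, using the remote node's address from this thread, shows whether that shell can locate the daemon at all:)

    # Hypothetical diagnostic: can a non-interactive remote shell find orted?
    ssh 192.168.0.10 which orted
    # And what PATH does that shell actually see?
    ssh 192.168.0.10 'echo $PATH'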
I think Ken is correct. It would appear that you don't have your bash environment set up correctly. The path to the MPI bin directory needs to be in your .profile, .bashrc, or whatever dot file is read when OpenMPI launches your job, and LD_LIBRARY_PATH needs to point at your OpenMPI lib directory. You can test that the PATH component is set correctly for your type of login by running a command like:

    ssh 192.168.0.16 which mpirun

Whether LD_LIBRARY_PATH is right will become obvious once your pvserver executable is actually launched. (The man page for bash on your system explains the order in which the dot files are read for interactive and non-interactive shells.)
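For illustration only, and assuming the OpenMPI build lives under ~/installer/openmpi-1.2.8_build on each node (as the paths earlier in this thread suggest), the .bashrc additions might look roughly like this:

    # Assumed install prefix; adjust to the actual OpenMPI location on each machine
    export PATH="$HOME/installer/openmpi-1.2.8_build/bin:$PATH"
    export LD_LIBRARY_PATH="$HOME/installer/openmpi-1.2.8_build/lib:$LD_LIBRARY_PATH"

Note that many distributions' default .bashrc returns early for non-interactive shells, so these exports need to appear before any such early exit in order for the ssh-launched orted to see them.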
At the very end of this mail I'm pasting the "module show" output of a very (very) old OpenMPI environment module.

Hope this helps,
-John.

[Ken's message continues:]
I cannot claim to have a lot of experience in setting up MPI installs, especially ones on heterogeneous nodes like you are setting up. My advice would be to first make sure that you can launch a job locally on each machine, then try to launch a job only on the opposite computer (i.e. launch only on the 32-bit machine from the 64-bit machine and vice versa), and then finally launch a single job on both.
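(As a rough sketch of that progression: the addresses below are the two nodes mentioned in this thread, and running hostname instead of pvserver is just a cheap way to confirm that MPI can start remote processes at all.)

    # 1. Launch locally on each machine, one at a time
    mpirun -np 2 hostname

    # 2. From one machine, launch only on the opposite one (and vice versa)
    mpirun -host 192.168.0.10 -np 1 hostname

    # 3. Launch a single job spanning both machines
    mpirun -host 192.168.0.16,192.168.0.10 -np 2 hostname

Once all three work, swap ./pvserver back in with the hostfile as before.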
-Ken

On 11/6/08 1:32 AM, "yewyong" <uyong81@gmail.com> wrote:

Hi all,
I managed to compile ParaView 3.4.0 on both a 64-bit dual-core machine and a 32-bit quad-core machine, but I'm having trouble starting pvserver when linking the two machines. Below is the console output:
<font face="Calibri, Verdana, Helvetica, Arial"><span style="font-size: 11pt;"><i><a>yewyong@frodo:~/installer/ParaView-3.4.0_build/bin</a>> ../../openmpi-1.2.8_build/bin/mpirun -hostfile ../../openmpi-1.2.8_build/etc/frodo -np 2 ./pvserver --use-offscreen-rendering<br>
Listen on port: 11111<br> Waiting for client...<br> ^Cmpirun: killing job...<br> </i><br> <i>mpirun noticed that job rank 0 with PID 27000 on node <a href="http://192.168.0.16" target="_blank">192.168.0.16</a> <<a href="http://192.168.0.16" target="_blank">http://192.168.0.16</a>> exited on signal 15 (Terminated).<br>
1 additional process aborted (not shown)<br> </i></span></font></blockquote><font face="Calibri, Verdana, Helvetica, Arial"><span style="font-size: 11pt;"><br> everything looks good when there is no linking of 2 machines. (this is the 64bit dual core machine)<br>
But:

    yewyong@frodo:~/installer/ParaView-3.4.0_build/bin> ../../openmpi-1.2.8_build/bin/mpirun -hostfile ../../openmpi-1.2.8_build/etc/frodo_quadcore -np 6 ./pvserver --use-offscreen-rendering
    Password:
    bash: orted: command not found
    [frodo:27009] [0,0,0] ORTE_ERROR_LOG: Timeout in file base/pls_base_orted_cmds.c at line 275
    [frodo:27009] [0,0,0] ORTE_ERROR_LOG: Timeout in file pls_rsh_module.c at line 1166
    [frodo:27009] [0,0,0] ORTE_ERROR_LOG: Timeout in file errmgr_hnp.c at line 90
    [frodo:27009] ERROR: A daemon on node 192.168.0.10 failed to start as expected.
    [frodo:27009] ERROR: There may be more information available from
    [frodo:27009] ERROR: the remote shell (see above).
    [frodo:27009] ERROR: The daemon exited unexpectedly with status 127.
    [frodo:27009] [0,0,0] ORTE_ERROR_LOG: Timeout in file base/pls_base_orted_cmds.c at line 188
    [frodo:27009] [0,0,0] ORTE_ERROR_LOG: Timeout in file pls_rsh_module.c at line 1198
    --------------------------------------------------------------------------
    mpirun was unable to cleanly terminate the daemons for this job. Returned value Timeout instead of ORTE_SUCCESS.
    --------------------------------------------------------------------------

I can't start pvserver when two machines are involved.
<a href="http://192.168.0.16" target="_blank">192.168.0.16</a> <<a href="http://192.168.0.16" target="_blank">http://192.168.0.16</a>> = 64bit dual core<br> <a href="http://192.168.0.10" target="_blank">192.168.0.10</a> <<a href="http://192.168.0.10" target="_blank">http://192.168.0.10</a>> = 32bit quad core<br>
<br> Anyone faced this before? <br> Thanks in advance for helping.<br> <br> Cheers,<br> yewyong<br> VRC UM <br> <br> <br> </span></font></blockquote><font face="Calibri, Verdana, Helvetica, Arial"><span style="font-size: 11pt;"><br>
Kenneth Moreland
Sandia National Laboratories
email: kmorel@sandia.gov
phone: (505) 844-8919
fax: (505) 845-0833
    [dq001 ~]$ module show openmpi
    -------------------------------------------------------------------
    /usr/local/Modules/modulefiles/openmpi/1.0.1curr:

    prepend-path  PATH                  /usr/local/packages/openmpi/current/installs/mpi_get_ompi_1.0/mpi_install_infiniband/install/bin
    prepend-path  MANPATH               /usr/local/packages/openmpi/current/installs/mpi_get_ompi_1.0/mpi_install_infiniband/install/man
    prepend-path  LD_LIBRARY_PATH       /usr/local/packages/openmpi/current/installs/mpi_get_ompi_1.0/mpi_install_infiniband/install/lib
    prepend-path  LIBRARY_PATH          /usr/mellanox/lib
    prepend-path  PATH                  /usr/mellanox/bin
    prepend-path  CPLUS_INCLUDE_PATH    /usr/local/packages/openmpi/current/installs/mpi_get_ompi_1.0/mpi_install_infiniband/install/include
    prepend-path  C_INCLUDE_PATH        /usr/local/packages/openmpi/current/installs/mpi_get_ompi_1.0/mpi_install_infiniband/install/include
    prepend-path  LIBRARY_PATH          /usr/local/packages/openmpi/current/installs/mpi_get_ompi_1.0/mpi_install_infiniband/install/lib
    prepend-path  LIBRARY_PATH          /usr/mellanox/lib
    setenv        MPI_ROOT              /usr/local/packages/openmpi/current/installs/mpi_get_ompi_1.0/mpi_install_infiniband/install
    setenv        LDLIBS                -lmpi
    setenv        PV_MPI_INCLUDE_PATH   /usr/local/packages/openmpi/current/installs/mpi_get_ompi_1.0/mpi_install_infiniband/install/include/openmpi/ompi
    setenv        PV_MPI_LIBRARY        /usr/local/packages/openmpi/current/installs/mpi_get_ompi_1.0/mpi_install_infiniband/install/lib/libmpi.a
    setenv        PV_MPI_EXTRA_LIBRARY  /usr/lib64/libmtl_common.so;/usr/lib64/libvapi.so;/usr/lib64/libmpga.so;/usr/local/packages/openmpi/current/installs/mpi_get_ompi_1.0/mpi_install_infiniband/install/lib/libmpi_cxx.a;/usr/local/packages/openmpi/current/installs/mpi_get_ompi_1.0/mpi_install_infiniband/install/lib/libopal.a;/usr/local/packages/openmpi/current/installs/mpi_get_ompi_1.0/mpi_install_infiniband/install/lib/liborte.a;/usr/local/packages/openmpi/current/installs/mpi_get_ompi_1.0/mpi_install_infiniband/install/lib/libmpi.a;/usr/local/packages/openmpi/current/installs/mpi_get_ompi_1.0/mpi_install_infiniband/install/lib/libmpi_cxx.a;/usr/local/packages/openmpi/current/installs/mpi_get_ompi_1.0/mpi_install_infiniband/install/lib/libopal.a;/usr/local/packages/openmpi/current/installs/mpi_get_ompi_1.0/mpi_install_infiniband/install/lib/liborte.a;/usr/lib64/libutil.a /usr/lib64/libc.a
    setenv        PV_MPI_NUMPROC_FLAG   -np
    setenv        PV_MPIRUN             /usr/local/packages/openmpi/current/installs/mpi_get_ompi_1.0/mpi_install_infiniband/install/bin/mpirun
    -------------------------------------------------------------------