<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body bgcolor="#ffffff" text="#000000">
Hi,<br>
I took the CMakeCache.txt from the static build, toggled
BUILD_SHARED_LIBS to ON, recompiled, and it is working!<br>
Previously I had added some Python bindings and other stuff too...<br>
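<br>
For anyone repeating this, here is a minimal Python sketch of flipping a boolean entry in the text of a CMakeCache.txt. It assumes the entry is stored in the usual NAME:BOOL=VALUE form; the helper name set_cache_bool is made up for illustration, and the message above does not say how the option was actually toggled.<br>
<pre>
```python
def set_cache_bool(cache_text, name, value):
    """Return cache_text with the NAME:BOOL=... entry set to ON or OFF.

    Hypothetical helper: assumes the entry already exists as NAME:BOOL=VALUE.
    """
    prefix = name + ":BOOL="
    new_value = "ON" if value else "OFF"
    out, found = [], False
    for line in cache_text.splitlines():
        if line.startswith(prefix):
            line = prefix + new_value  # rewrite only the matching entry
            found = True
        out.append(line)
    if not found:
        raise KeyError(name)
    return "\n".join(out) + "\n"
```
</pre>
In practice, re-running cmake with -DBUILD_SHARED_LIBS=ON in the build directory is probably the safer route, since it also regenerates the build files.<br>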
<br>
pratik<br>
On Thursday 28 April 2011 02:38 PM, pratik wrote:
<blockquote cite="mid:4DB92E72.1050405@gmail.com" type="cite">
Also, I wrote to a person who seemed to have had the same problem with SGI
MPT, and this is what he wrote back (he did not use ParaView, but ran a
cluster of SGI Altix systems). I don't know much about MPI, so I'm still
looking into this, but I thought it might be of some assistance to the
ParaView developers if they try to see why this problem is occurring:<br>
<div class="moz-text-plain" wrap="true"
style="font-family: -moz-fixed; font-size: 12px;" lang="x-unicode">
<pre wrap="">On Wed, Apr 27, 2011 at 02:53:49PM +1000, pratik for help wrote:
</pre>
<blockquote type="cite" style="color: rgb(0, 0, 0);">
<pre wrap=""><span class="moz-txt-citetags">> </span>The startup mechanism for SGI MPI jobs is quite complex and depends on
<span class="moz-txt-citetags">> </span>the type of executable you are running. If you encounter errors such as
<span class="moz-txt-citetags">> </span>ctrl_connect/connect: Connection refused
<span class="moz-txt-citetags">> </span>or
<span class="moz-txt-citetags">> </span>mpirun: MPT error (MPI_RM_sethosts): err=-1: could not run executable
<span class="moz-txt-citetags">> </span>(case #3)
<span class="moz-txt-citetags">> </span>contact us for an explanation.
<span class="moz-txt-citetags">> </span>
<span class="moz-txt-citetags">> </span>Can you please explain why such errors occur? I am running paraview on a
<span class="moz-txt-citetags">> </span>sgi altix cluster and am getting the exact same error!
</pre>
</blockquote>
<pre wrap="">I worked in some depth on MPT while we had our Altix. Here are the
details that I remember.
During startup, mpirun will listen on a certain IP/port. It puts the
IP/port into an environment variable (MPI_ENVIRONMENT, perhaps? I
forget, it starts with MPI_* though), and then starts the worker
processes. The worker processes (actually, 1 "shepherd" process per
node) will examine $MPI_ENVIRONMENT, and then using those details,
connect back to the mpirun process. This connection is then used to
communicate job details, as well as stdin/out/err.
The error indicates that this connection could not be made. The main
reasons are, either the $MPI_ENVIRONMENT variable hasn't been
propagated properly, or some other process has already connected to
the mpirun (the mpirun will stop listening once it receives the right
number of connections), usually because some other MPI program has
already connected (eg. if the MPI worker program is somehow run
twice), or perhaps if there is a firewall or (TCP/IP) networking issue
between the remote worker nodes and the node running mpirun.
I hope that helps.
Kev
<div class="moz-txt-sig">--
Dr Kevin Pulo <a
moz-do-not-send="true" class="moz-txt-link-abbreviated"
href="mailto:kevin.pulo@anu.edu.au">kevin.pulo@anu.edu.au</a>
Academic Consultant / Systems Programmer <a
moz-do-not-send="true" class="moz-txt-link-abbreviated"
href="http://www.kev.pulo.com.au">www.kev.pulo.com.au</a>
NCI NF / ANU SF +61 2 6125 7568
</div></pre>
</div>
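<br>
The startup handshake described above can be sketched roughly in Python. This is only an illustration of the general pattern (a launcher that listens, advertises its address in an environment variable, and stops listening after the expected number of connections), not SGI MPT's actual implementation. The variable name DEMO_LAUNCHER_ADDR and the function names are made up for the example.<br>
<pre>
```python
import os
import socket
import threading

def launcher(expected, ready, results):
    # Play the launcher role: listen on an ephemeral loopback port.
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))
    srv.listen(expected)
    host, port = srv.getsockname()
    # HYPOTHETICAL variable name; the real MPT variable is not known here.
    os.environ["DEMO_LAUNCHER_ADDR"] = "%s:%d" % (host, port)
    ready.set()
    for _ in range(expected):
        conn, _addr = srv.accept()
        results.append(conn.recv(64).decode())
        conn.close()
    srv.close()  # stop listening once the expected number has connected

def shepherd(name):
    # Play the per-node shepherd role: read the address and connect back.
    host, port = os.environ["DEMO_LAUNCHER_ADDR"].split(":")
    try:
        with socket.create_connection((host, int(port)), timeout=2) as c:
            c.sendall(name.encode())
        return "connected"
    except OSError as exc:
        return "failed: %s" % exc

def run_demo():
    ready, results = threading.Event(), []
    t = threading.Thread(target=launcher, args=(2, ready, results))
    t.start()
    ready.wait()
    statuses = [shepherd("node0"), shepherd("node1")]
    t.join()
    # A worker started a second time arrives after the launcher stopped listening.
    late = shepherd("node0-duplicate")
    return sorted(results), statuses, late
```
</pre>
Calling run_demo() lets two shepherds connect and then shows a late, duplicate shepherd failing to connect. That is one of the failure modes listed in the reply; a missing environment variable or a firewall would be the others.<br>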
<br>
On Thursday 28 April 2011 02:33 PM, pratik wrote:
<blockquote cite="mid:4DB92D67.30109@gmail.com" type="cite">Hi, <br>
Also, can you please tell me how I can rebuild ParaView with the
*static* library of the plugin (i.e. the .a file)? Although this is a
very inelegant way to solve the problem, I just want the functionality
of the TensorGlyph plugin. <br>
<br>
pratik <br>
On Thursday 28 April 2011 02:09 PM, pratik wrote: <br>
<blockquote type="cite">Hi Utkarsh, <br>
So...do you have a hunch about what may be going on? I'm sorry if I have been
troubling you a lot, but this is really the last stage in getting PV
working on the cluster: as I said before, the build with
BUILD_SHARED_LIBS off worked perfectly, but the one with that option on
did not.... <br>
The thing that bothers me is that it is definitely not something wrong
with SGI MPT, since one build of pvserver is working fine. Having
come this far, it is driving me crazy that it still does not
work :( <br>
<br>
If you need any more information please do let me know. Once again
thanks for all the help. <br>
<br>
pratik <br>
On Wednesday 27 April 2011 08:35 PM, pratik wrote: <br>
<blockquote type="cite">I think it is : <br>
<a moz-do-not-send="true" class="moz-txt-link-abbreviated"
href="mailto:pratikm@annapurna:%7E/source/ParaView/ParaView-3.10.1/NEWBUILD/bin">pratikm@annapurna:~/source/ParaView/ParaView-3.10.1/NEWBUILD/bin</a>>
ldd
/home/pratikm/source/ParaView/ParaView-3.10.1/BUILD/bin/pvserver
|grep mp <br>
libmpi++abi1002.so =>
/opt/sgi/mpt/mpt-1.23/lib/libmpi++abi1002.so (0x00002b61473a3000) <br>
libmpi.so => /opt/sgi/mpt/mpt-1.23/lib/libmpi.so
(0x00002b61474d0000) <br>
libsma.so => /opt/sgi/mpt/mpt-1.23/lib/libsma.so
(0x00002b6147854000) <br>
libxmpi.so => /opt/sgi/mpt/mpt-1.23/lib/libxmpi.so
(0x00002b614ed46000) <br>
libimf.so => /opt/intel/Compiler/11.1/038/lib/intel64/libimf.so
(0x00002b6153a16000) <br>
libsvml.so =>
/opt/intel/Compiler/11.1/038/lib/intel64/libsvml.so
(0x00002b6153d69000) <br>
libintlc.so.5 =>
/opt/intel/Compiler/11.1/038/lib/intel64/libintlc.so.5
(0x00002b6153f80000) <br>
<a moz-do-not-send="true" class="moz-txt-link-abbreviated"
href="mailto:pratikm@annapurna:%7E/source/ParaView/ParaView-3.10.1/NEWBUILD/bin">pratikm@annapurna:~/source/ParaView/ParaView-3.10.1/NEWBUILD/bin</a>>
ldd
/home/pratikm/source/ParaView/ParaView-3.10.1/NEWBUILD/bin/pvserver
|grep mp <br>
libmpi++abi1002.so =>
/opt/sgi/mpt/mpt-1.23/lib/libmpi++abi1002.so (0x00002ac9ae446000) <br>
libmpi.so => /opt/sgi/mpt/mpt-1.23/lib/libmpi.so
(0x00002ac9ae573000) <br>
libsma.so => /opt/sgi/mpt/mpt-1.23/lib/libsma.so
(0x00002ac9ae8f7000) <br>
libxmpi.so => /opt/sgi/mpt/mpt-1.23/lib/libxmpi.so
(0x00002ac9aee0f000) <br>
<a moz-do-not-send="true" class="moz-txt-link-abbreviated"
href="mailto:pratikm@annapurna:%7E/source/ParaView/ParaView-3.10.1/NEWBUILD/bin">pratikm@annapurna:~/source/ParaView/ParaView-3.10.1/NEWBUILD/bin</a>>
ldd
/home/pratikm/install/bin/pvserver |grep mp <br>
libmpi.so => /usr/lib64/libmpi.so (0x00002b0a9c9e3000) <br>
<br>
These are precisely the libraries I specified; the first one is the
pvserver with shared libs enabled, the second one with shared libs disabled,
and the last one is the "installed" pvserver (the installed version of
the pvserver with shared libs enabled). <br>
I am not quite sure why the path has changed for the installed one,
but I am 90% sure that /usr/lib64/libmpi.so
refers to the same SGI MPI lib. <br>
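<br>
Comparing such listings by eye is error-prone, so here is a small Python sketch that parses ldd-style output and reports libraries that two binaries resolve to different paths. The function names parse_ldd and differing_paths are made up for illustration, and the parser only handles the "name => path (address)" line form shown above.<br>
<pre>
```python
def parse_ldd(text):
    """Map each library name to the path ldd resolved it to (None if not found)."""
    resolved = {}
    for line in text.splitlines():
        if "=>" not in line:
            continue
        name, _, rest = line.partition("=>")
        target = rest.strip()
        if target.startswith("not found"):
            resolved[name.strip()] = None
        elif target and not target.startswith("("):
            # Keep the path, drop the trailing load address.
            resolved[name.strip()] = target.split()[0]
    return resolved

def differing_paths(ldd_a, ldd_b):
    """Return {lib: (path_a, path_b)} for shared names resolved to different paths."""
    a, b = parse_ldd(ldd_a), parse_ldd(ldd_b)
    common = sorted(set(a).intersection(b))
    return {lib: (a[lib], b[lib]) for lib in common if a[lib] != b[lib]}
```
</pre>
Fed the first and last listings above, this would flag libmpi.so as resolving to /opt/sgi/mpt/mpt-1.23/lib/libmpi.so in one case and /usr/lib64/libmpi.so in the other. Whether those are really the same file could then be checked with os.path.realpath on both paths, assuming one of them is a symlink.<br>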
<br>
pratik <br>
On Wednesday 27 April 2011 07:26 PM, Utkarsh Ayachit wrote: <br>
<blockquote type="cite">Do a "pvserver --ldd", is it using the
correct mpi libraries? <br>
<br>
Utkarsh <br>
<br>
On Wed, Apr 27, 2011 at 8:43 AM, pratik <a moz-do-not-send="true"
class="moz-txt-link-rfc2396E" href="mailto:pratik.mallya@gmail.com">&lt;pratik.mallya@gmail.com&gt;</a>
wrote:
<br>
<blockquote type="cite">Also, I tried to start the pvserver
(with shared libraries enabled) on just the head node: <br>
pratikm@annapurna:~/install/bin> /usr/bin/mpirun -v -np 2 /home/pratikm/install/bin/pvserver <br>
MPI: libxmpi.so 'SGI MPT 1.23 03/28/09 11:45:59' <br>
MPI: libmpi.so 'SGI MPT 1.23 03/28/09 11:43:39' <br>
<br>
and it just hangs there! <br>
<br>
pratik <br>
On Wednesday 27 April 2011 06:05 PM, pratik wrote: <br>
<blockquote type="cite">Oh! I'm sorry about that.... <br>
The client stalls indefinitely, but the server stops executing.
Since I am running PV under PBS, the output file of the mpirun gives this: <br>
MPI: libxmpi.so 'SGI MPT 1.23 03/28/09 11:45:59' <br>
MPI: libmpi.so 'SGI MPT 1.23 03/28/09 11:43:39' <br>
MPI Environmental Settings <br>
MPI: MPI_DSM_DISTRIBUTE (default: not set) : 1 <br>
ctrl_connect/connect: Connection refused <br>
ctrl_connect/connect: Connection refused <br>
ctrl_connect/connect: Connection refused <br>
ctrl_connect/connect: Connection refused <br>
ctrl_connect/connect: Connection refused <br>
ctrl_connect/connect: Connection refused <br>
ctrl_connect/connect: Connection refused <br>
ctrl_connect/connect: Connection refused <br>
MPI: MPI_COMM_WORLD rank 2 has terminated without calling
MPI_Finalize() <br>
MPI: aborting job <br>
<br>
Attached is the CMakeCache of my server if you want to look at it. <br>
<br>
pratik <br>
On Wednesday 27 April 2011 05:55 PM, Utkarsh Ayachit wrote: <br>
<blockquote type="cite">You need to be more specific about
the "something" that's going wrong
before anyone can provide any additional information. <br>
<br>
Utkarsh <br>
<br>
On Wed, Apr 27, 2011 at 3:26 AM,
pratik <a moz-do-not-send="true" class="moz-txt-link-rfc2396E"
href="mailto:pratik.mallya@gmail.com">&lt;pratik.mallya@gmail.com&gt;</a>
wrote: <br>
<blockquote type="cite">Hi, <br>
I built 2 versions of PV on the SGI Altix cluster here (SGI MPT MPI): one
with BUILD_SHARED_LIBS enabled and one without. Now, the static pvserver
functions properly (I am accessing it through my laptop via the reverse
connection method), BUT the one with shared libs enabled does not! Can this
behaviour be explained? (The second one fails to establish a
connection...something wrong with pvserver.) <br>
I have EXACTLY the same CMakeCache on both builds EXCEPT for the
BUILD_SHARED_LIBS option. <br>
I know that there are many, many things that could go wrong in a cluster
installation, so any hints/experience/hunch as to what is going on is
welcome. <br>
<br>
pratik <br>
</blockquote>
</blockquote>
</blockquote>
<br>
</blockquote>
</blockquote>
<br>
</blockquote>
<br>
</blockquote>
<br>
</blockquote>
<br>
</blockquote>
<br>
</body>
</html>