[Paraview] vtkSocket bugs
Burlen Loring
bloring at lbl.gov
Sun Feb 27 17:18:58 EST 2011
Hi,
While installing ParaView on Nautilus,
http://www.nics.tennessee.edu/computing-resources/nautilus, I hit a bug
in vtkSocket that prevents ParaView from running on this machine. While
tracking this down I uncovered a couple of related issues.
The main issue is that vtkSocket does not handle EINTR. EINTR occurs
when a signal is caught by the application during a blocking socket
call. While ParaView does not make use of signals, they are used for
asynchronous communication by some SGI-specific libraries on Nautilus
that are linked in with SGI MPI. Because the rank 0 pvserver spends
quite a bit of its time blocked in socket calls, it only takes a few
tens of seconds for EINTR to occur. When faced with EINTR, ParaView
silently exits, leaving the user wondering what the heck happened. Which
brings me to the second issue, a lack of error reporting in vtkSocket.
To solve the first issue, vtkSocket has to handle EINTR. How EINTR
should be handled depends on the specific socket call. For all calls
except connect, the call can simply be restarted. An interrupted connect
can't be restarted on all Unix systems, so instead one must block in a
select call when connect fails with EINTR. To be portable across Unix
systems, one should handle EINTR in all socket calls, even simple ones
like setsockopt and getsockopt.
The second issue, error reporting, applies to all socket-related errors
in general. My feeling is that when a socket call fails, vtkSocket
should print a message using vtkErrorMacro, errno, and strerror (or the
Windows equivalent) at the point of failure. I think this should be done
inside vtkSocket for two reasons. First, this is the only place one can
safely assume errno has relevant information. Second, vtkSocket has been
implemented to return a single error code, -1, so returning the real
error code would change the API and break existing code, including
ParaView. Not to mention that the values for error codes are apparently
different on Windows and Unix.
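
To make the idea concrete, here is a small sketch of reporting at the
point of failure while keeping the -1 return value. It is Unix-only and
self-contained, so std::cerr stands in for vtkErrorMacro; the names
SocketErrorMessage and SendChecked are hypothetical and do not come from
the attached patches.

```cpp
#include <cerrno>
#include <cstring>
#include <iostream>
#include <sstream>
#include <string>
#include <sys/socket.h>
#include <sys/types.h>

// Hypothetical helper: format "<call> failed: <strerror text> (errno N)".
std::string SocketErrorMessage(const char* call, int err)
{
  std::ostringstream oss;
  oss << call << " failed: " << std::strerror(err)
      << " (errno " << err << ")";
  return oss.str();
}

// Hypothetical wrapper: report the failure where it happens, but still
// return only -1 so existing callers are unaffected.
// Simplification: a partial send is treated as success here.
int SendChecked(int fd, const void* data, size_t len)
{
  const ssize_t n = send(fd, data, len, 0);
  if (n == -1)
  {
    // Save errno first; later calls (even I/O on cerr) may overwrite it.
    const int err = errno;
    // Inside vtkSocket this line would be a vtkErrorMacro call instead.
    std::cerr << SocketErrorMessage("send", err) << std::endl;
    return -1;
  }
  return 0;
}
```

Calling SendChecked on an invalid descriptor returns -1 as before, but
now also prints the reason, which is the behavior argued for above.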
I took a stab at fixing these issues; patches are attached. I tested
them on my workstation, Nautilus, and a laptop running XP. I ran a
dashboard on my Linux workstation and didn't see any related issues.
Would someone at KW mind taking a look at the changes and seeing if they
could be made permanent?
By the way, after testing all socket calls for error returns I uncovered
a third bug: vtkSocket::Close didn't set the descriptor ivar to -1,
which resulted in vtkSocket::~vtkSocket calling close on a closed
socket. Not a disastrous error, but this reinforces my opinion that the
returns should be tested and error messages printed.
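
The third bug can be illustrated with a stripped-down stand-in class.
SocketSketch below is hypothetical and only models descriptor ownership;
it is not the vtkSocket code or the attached patch.

```cpp
#include <unistd.h>

// Hypothetical stand-in for vtkSocket, reduced to descriptor ownership.
class SocketSketch
{
public:
  explicit SocketSketch(int fd) : Descriptor(fd) {}
  ~SocketSketch() { this->Close(); }

  SocketSketch(const SocketSketch&) = delete;
  SocketSketch& operator=(const SocketSketch&) = delete;

  // Close the descriptor once and mark it invalid, so that a later
  // Close() or the destructor does not call close() a second time.
  void Close()
  {
    if (this->Descriptor < 0)
    {
      return;
    }
    close(this->Descriptor);
    this->Descriptor = -1;
  }

  int GetDescriptor() const { return this->Descriptor; }

private:
  int Descriptor;
};
```

Without the reset to -1, the destructor would call close on a stale
descriptor number, which fails with EBADF at best and could close an
unrelated descriptor if the number has been reused in the meantime.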
Thanks
Burlen
-------------- next part --------------
A non-text attachment was scrubbed...
Name: vtkSocket.cxx.patch
Type: text/x-patch
Size: 17745 bytes
Desc: not available
URL: <http://www.paraview.org/pipermail/paraview/attachments/20110227/bcb38894/attachment-0002.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: vtkSocket.h.patch
Type: text/x-patch
Size: 331 bytes
Desc: not available
URL: <http://www.paraview.org/pipermail/paraview/attachments/20110227/bcb38894/attachment-0003.bin>