Revision as of 09:43, 5 October 2005

Client-Server Connections in ParaQ


Background:
The precise mechanism of connecting a client to a server in ParaQ should be designed into the ParaQ client. The current mechanism of launching the server and the client with a script (when on a Unix client), and having users do it manually on the Windows side, is not an acceptable design for ParaQ.

The computational and visualization clusters at Sandia do not currently support the client connecting directly to a running server. Because the job submission, network, and name resolution setups at Sandia are very similar to those at other laboratories and universities, a general ‘reverse connect’ functionality would probably be useful for the larger ParaQ community.

This document will try to address the desired functionality for ‘reverse’ connection scenarios. This document will also try to address the issue of multiple clients connected to a server, and multiple servers connected to a single client.

General:
Users would probably expect ParaQ to come up by default in ‘stand alone’ or ‘localhost’ mode. They can then use ParaQ on data from their local workstation.

Users would also probably expect ParaQ to have the ability to connect to a specific server and read data off of that server.

Use Cases:
UC1) The user’s environment is set up similarly to Sandia’s and a reverse connection is necessary.
UC2) The user’s environment allows direct connections to a running server but doing a reverse connection is also totally fine.
UC3) The user’s environment allows direct connections to a running server but for some architectural reason will not allow a reverse connection.
UC4) Same as UC1/UC2/UC3 but user would like to connect multiple servers to a client.
UC5) Same as UC1/UC2/UC3 but user would like to connect multiple clients to a server.
UC6) Same as UC1/UC2/UC3 but the user would like to disconnect from the server and reconnect at a later time.

Use Case Assumption:
As you may have noticed, the use cases above are specifically worded to ‘deprecate’ the use of the normal forward connections. If UC3 is determined to be a significant use case for the larger ParaQ community, then the proposed approach may have to be altered to address that specific use case.


Ken's peanut gallery: I think deprecating forward connections is a bad idea. Outside of Sandia, forward connections are probably the rule rather than the exception. For example, consider a student/employee connecting from home to a ParaView server running on a cluster at school/work. My home, like many others, connects to the internet with a dynamic IP address with no hostname to resolve it. Furthermore, my home computer sits behind a firewall that blocks all incoming connections (since I don't host any services). If you force users to negotiate all of that to make a reverse connection, I expect lots of frustrated users.


Proposed Approach:
The proposed approach tries to address the fact that every installation will have different server names, different ways to connect to those servers, and different ways of launching the ParaView server.

Specifically, the use of a server.xml file in a specific location of the installation is proposed. The XML file will contain all the information relevant to the available servers, including:


1) The server name (perhaps fully qualified “foo.sandia.gov”)
2) The displayed name (“foo”)
3) Whether the queue is interactive or not (to be displayed in menu)*
4) The connection protocol (rsh, ssh, other?)
5) The script to be run once connected (“paraview_server_go”)
6) The arguments to that script (nodes, times, case_number)

The menu items and the corresponding connect dialog box will be constructed from this XML. For instance, if I say connect to ‘redrage’, a dialog box may pop up asking for a Kerberos password, the number of nodes, the time, and a case number. If I connect to ‘testcluster’, the dialog may ask only for the number of nodes and the time. We will want the UI construction to be ‘extendable’, so that if some cluster needs another variable, such as which queue to submit to, the XML can define a new variable that the UI will ask the user to type in.
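
To make the idea concrete, here is a minimal sketch of what a server.xml entry and the code that derives the menu and dialog fields from it might look like. The element and attribute names (`servers`, `server`, `argument`, `prompt`, and so on), the `example.gov` host names, and the helper functions are all assumptions made for illustration; the proposal does not define a schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical server.xml content. Every element and attribute name here is
# an assumption for illustration, not a defined ParaQ schema.
SERVER_XML = """
<servers>
  <server name="redrage.example.gov" display="redrage" interactive="false"
          protocol="ssh" script="paraview_server_go">
    <argument name="nodes" prompt="Number of nodes"/>
    <argument name="time" prompt="Time"/>
    <argument name="case_number" prompt="Case number"/>
  </server>
  <server name="testcluster.example.gov" display="testcluster"
          interactive="true" protocol="ssh" script="paraview_server_go">
    <argument name="nodes" prompt="Number of nodes"/>
    <argument name="time" prompt="Time"/>
  </server>
</servers>
"""


def load_servers(xml_text):
    """Return a dict keyed by display name describing each server entry."""
    servers = {}
    for node in ET.fromstring(xml_text).findall("server"):
        servers[node.get("display")] = {
            "host": node.get("name"),
            "interactive": node.get("interactive") == "true",
            "protocol": node.get("protocol"),
            "script": node.get("script"),
            # One dialog field per <argument>, in document order.
            "fields": [(a.get("name"), a.get("prompt"))
                       for a in node.findall("argument")],
        }
    return servers


def menu_label(display, entry):
    """Menu text that flags non-interactive queues to the user."""
    return display if entry["interactive"] else display + " (batch queue)"
```

Because the dialog fields come from the `<argument>` elements, a site that needs an extra variable (a queue name, say) would only add one more element to its XML; the client code would not change.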

Ken's peanut gallery: I agree that this configurable XML approach is the way to go. In addition to the connections specified in the XML, there should be at least two more: connect to an existing server, and wait for a server's connection. The two are roughly equivalent to the client/server connection modes that exist for ParaView now.

The approach described above should address UC1 and UC2. The problem with UC3 is that launching a server batch job does not lend itself to knowing when the server has actually been started (i.e. you don’t know how long to wait before the client can connect in). In the UC1 and UC2 cases this is not an issue because the client will simply ‘wait’ until the server starts up and connects back.
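
The reverse-connect handshake can be sketched with plain TCP sockets. This is only an illustration under assumptions: the function names are hypothetical, the "server" is simulated by a delayed thread on the same machine, and nothing here reflects actual ParaQ or ParaView code.

```python
import socket
import threading


def client_wait_for_server(listener, timeout=30.0):
    """Client side of a reverse connection: block until the server calls back."""
    listener.settimeout(timeout)
    conn, _addr = listener.accept()
    return conn


def server_connect_back(client_host, client_port):
    """Server side: once the batch job starts, connect to the waiting client."""
    return socket.create_connection((client_host, client_port), timeout=10.0)


def demo():
    # The client listens first; port 0 lets the OS pick a free port.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    host, port = listener.getsockname()

    # Stand-in for the batch job, which starts at some later, unknown time.
    def fake_server():
        with server_connect_back(host, port) as s:
            s.sendall(b"server ready")

    starter = threading.Timer(0.2, fake_server)
    starter.start()

    chunks = []
    with client_wait_for_server(listener) as conn:
        # Read until the simulated server closes its end of the connection.
        while True:
            data = conn.recv(1024)
            if not data:
                break
            chunks.append(data)
    starter.join()
    listener.close()
    return b"".join(chunks)
```

The client never needs to know when the job leaves the queue; it simply blocks in `accept()` until the server calls back or the timeout expires.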

Ken's peanut gallery: From the user's standpoint, what is the difference between waiting for a server to connect and polling a connection to see if a server has started yet? I don't think the forward connect is an issue. Just keep trying to connect until it works.
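
Ken's "keep trying until it works" suggestion could look roughly like the following. This is a sketch, not an existing ParaView routine: the function name, the retry count, and the delay are arbitrary assumptions.

```python
import socket
import time


def connect_with_retry(host, port, attempts=20, delay=0.5):
    """Try a forward connection repeatedly until the server accepts.

    Raises ConnectionError if the server never comes up within `attempts`.
    """
    last_error = None
    for _ in range(attempts):
        try:
            return socket.create_connection((host, port), timeout=5.0)
        except OSError as err:
            # Server not listening yet (or unreachable); wait and try again.
            last_error = err
            time.sleep(delay)
    raise ConnectionError(
        f"server at {host}:{port} never came up"
    ) from last_error
```

From the user's point of view the experience matches the reverse-connect case: the client shows a "waiting for server" state until the call returns or gives up.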

Multiple Servers:
UC4: I am not sure exactly what the usage of multiple servers might look like in the end product, but the proposed approach appears to lend itself quite nicely to multiple servers. The user might, for example, specify a window before connecting to a server, so that each window is tied to a different server (I am really not sure).

Multiple Clients:
UC5: This is the tricky one. If the server always connects back to the client, there seems to be no way to have multiple clients connect into one server. What on the surface seems like a bad thing is actually a good thing.

    Security Sidebar:
    Let’s talk about the other case, where the server is ‘open’ and clients can simply connect in. The server is running with the permissions of the user that started it, and you surely don’t want anyone with a client to be able to connect arbitrarily. Suppose you put something in place like a secret key: in practice everyone will choose ‘123’, and you haven’t really protected ‘need to know’ issues in any significant way (by the way, this is one of the reasons why I propose the deprecation of UC3). In general I like reverse connect because the person starting the server explicitly specifies which client to connect back to, and because that client is ‘waiting’, the connection is established immediately. When the server is ‘open’ and accepting connections, you have no real way of saying ‘accept connections from this client but not that one’. Also, inevitably, the server is simply ‘hanging out’ for a while before you get around to starting your client, which is also a bit of a security hole.

So, in order to connect another client, the user of the ‘initiating’ client has to explicitly say ‘connect to client blah’; again, in my opinion this is a good thing. At that point the user can also specify whether that client is ‘read/view only’ or can control/write to the server state, and settle issues like writing to the server’s disk (again, the new client may not belong to the same user who started the server). The server then connects to the specified client (which, because UC3 is deprecated, is waiting in reverse connect mode).

UC6: When connecting to the server after all clients have disconnected, the above approach will not work. A solution to this problem is to write out a small "cookie" file that contains enough information to reestablish the connection. This has the potential to solve many of the problems with connecting to servers. It solves the problem of specifying the server on a cluster running multiple servers. It also can solve the security problem of forward connections given above. The "secret key" could be randomly generated when the server starts, and that key can be placed in the cookie. Thus, access to the server is basically limited to people with access to the cookie file.
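
A minimal sketch of the cookie idea follows, assuming a JSON file and a hex-encoded random key. The file layout, field names, and function names are all hypothetical. The owner-only file mode is a POSIX convention; most of those mode bits are ignored on Windows.

```python
import hmac
import json
import os
import secrets


def write_cookie(path, host, port):
    """Server side: record how to reach this server plus a random secret key."""
    cookie = {"host": host, "port": port, "key": secrets.token_hex(32)}
    # Create the file readable and writable by the owner only (POSIX modes).
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as f:
        json.dump(cookie, f)
    return cookie


def read_cookie(path):
    """Client side: load the connection details written by the server."""
    with open(path) as f:
        return json.load(f)


def key_matches(expected_key, presented_key):
    """Constant-time comparison of the key a reconnecting client presents."""
    return hmac.compare_digest(expected_key, presented_key)
```

On reconnect, a client would read the cookie, open a forward connection to the recorded host and port, and present the key; the server would accept the session only if `key_matches` returns true. Because the key is generated at server start rather than chosen by a person, the ‘123’ problem from the security sidebar does not arise.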

Challenges:
Whenever you ‘submit’ a job to a computation cluster you are in tricky waters. The challenges to this type of design include at least the following items.
1) Which ssh do I use? Are there issues about using a particular (site specific) ssh? Can I use some general OpenSSH library?
2) What are the issues with taking in a user’s password?
3) What are the issues with grabbing the client host name? Is there a nice cross platform way of doing this? Will it be fully qualified? Can the server use it to connect back? Ken: Lots of issues. For starters, the client may not have a host name at all (or it's not registered in any DNS). If the client and server are not on the same LAN, the client may not even have a valid IP address.
4) What if you have two clients on the same host (Solved by deprecating UC1. Just kidding.)?
5) What if you have two servers on the same cluster (solved by deprecating UC3)?
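
For challenge 3, Python's standard library shows what a client can learn about its own name in a cross-platform way. This is a sketch for exploring the problem, not a solution: the function name and the returned fields are assumptions, and, as Ken notes, a name that resolves locally may still be useless to a server on another network.

```python
import socket


def describe_client_host():
    """Collect what the client can learn locally about its own name."""
    short_name = socket.gethostname()
    fqdn = socket.getfqdn()
    try:
        address = socket.gethostbyname(short_name)
    except OSError:
        # The name does not resolve at all, which can happen on home machines.
        address = None
    return {
        "hostname": short_name,
        "fqdn": fqdn,
        # A dot is only a heuristic; it does not prove the name is in any DNS.
        "looks_qualified": "." in fqdn,
        "address": address,
    }
```

For challenge 4, a reverse-connecting client can sidestep port clashes between two clients on one host by binding to port 0 and reporting the port the operating system assigned.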


* Showing whether a queue is interactive would be very helpful at a place like Sandia, where we have several clusters with interactive queues for vis but many others without. It is a nice indication to users that they may have to wait a long time on non-interactive compute platforms. It will also mean that users start to demand an interactive queue on the platforms that don’t currently have an interactive vis queue.