Simplifying Remote Visualization for HPC sites
Currently ParaView uses pvsc files for configuring connections to servers. Although powerful, pvsc files are really complicated for site maintainers to use. The goal of this page is to consolidate ideas so that it's possible for users to simply download ParaView and then submit jobs to any of the major supercomputing sites and visualize their datasets.
===server profile===
A paraview server profile is a combination of a [[#host profile]] and a [[#launch profile]]. The host profile says <i>where</i> the server is launched; the launch profile says <i>how</i> the server is launched.
===component launcher===
In the paraview client, you would no longer specify a custom command for launching pvserver. Instead, the paraview client starts the component launcher. Depending on what is specified in the [[#host profile]], the component launcher is started directly on localhost, or it is started on a remote host using an [[#ssh client|ssh command]]. The component launcher [[#connecting to the client|connects back to the client]] with a tcp socket. The connection is made directly using host:port or through a reverse [[#ssh tunnel]].
The component launcher receives a [[#launch profile]] from the paraview client and uses it to launch pvserver. Visit's component launcher can also launch a meta-data server that delivers filesystem listings to the client; this makes it convenient to browse files without having to launch pvserver.
Using the component launcher you only have to run ssh once (authenticate once), and from there the component launcher can launch your servers. If you are using [[#ssh tunnel|ssh tunneling]], the ssh tunnel stays open as long as the component launcher is running. The component launcher also solves the [[#gateway ports]] issue.
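To make the connect-back concrete, here is a minimal python sketch of a component launcher connecting back to the client and waiting for a launch profile. The wire format (a READY line followed by the profile xml on one line) is a hypothetical illustration, not an existing ParaView protocol:
<pre>
# Sketch: the component launcher connects back to the paraview client
# (directly, or to the local end of a reverse ssh tunnel) and waits
# for the launch profile. The message format here is hypothetical.
import socket
import sys

def run_component_launcher(client_host, client_port):
    sock = socket.create_connection((client_host, client_port))
    sock.sendall(b"LAUNCHER_READY\n")
    # Receive the launch profile xml sent by the client, one line.
    profile_xml = sock.makefile().readline()
    # ... parse profile_xml and launch pvserver accordingly ...
    return profile_xml

if __name__ == "__main__":
    run_component_launcher(sys.argv[1], int(sys.argv[2]))
</pre>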
===host profile===
A host profile is specified in xml and consists of key:value pairs. The simplest case for a host profile is:
<pre>host: localhost</pre>
A more common case is:
<pre>
user: pmarion
host: artemis.princeton.edu
use-tunnel: true
paraview-install: /path/to/paraview
</pre>
Using the host profile we know how to start the component launcher on the remote host (of course, it might just be localhost). The component launcher's location follows from the paraview-install path in the host profile, and the component launcher knows how to find the other binaries such as pvserver.
Of course, the average user will never see the xml. Host profiles are distributed with paraview; users just need dialogs in paraview to customize the information. To make it even simpler, we might split the host profile into a <i>host profile</i> and a <i>user profile</i>. Users may only need to specify their username and maybe an ssh key file.
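For illustration, here is a sketch of how a host profile could be turned into the command that starts the component launcher. The key names match the example above; the bin/ layout inside paraview-install and the tunnel port numbers are assumptions:
<pre>
# Sketch: build the command that starts the component launcher on the
# host described by a host profile (possibly just localhost).
def launch_command(profile):
    launcher = profile["paraview-install"] + "/bin/component_launcher"
    if profile.get("host", "localhost") == "localhost":
        return [launcher]
    cmd = ["ssh"]
    if "user" in profile:
        cmd += ["-l", profile["user"]]
    if profile.get("use-tunnel") == "true":
        # Reverse-forward a remote port back to the listening client;
        # real code would pick the remote port at runtime.
        cmd += ["-R", "33333:localhost:11111"]
    return cmd + [profile["host"], launcher]

profile = {"user": "pmarion", "host": "artemis.princeton.edu",
           "use-tunnel": "true", "paraview-install": "/path/to/paraview"}
print(" ".join(launch_command(profile)))
</pre>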
===launch profile===
A launch profile is specified in xml and consists of key:value pairs. A simple serial launch profile might be:
<pre>
type: serial
extra-args: --foo
</pre>
A launch profile for a parallel run might be:
<pre>
type: parallel
mpi-command: mpirun
nprocs: 8
machine-file: /path/to/machinefile
</pre>
If pvserver should be run as a scheduled job on a queuing system, the launch profile would contain more fields:
<pre>
launcher: qsub
partition: debug
time-limit: 01:00
procs-per-node: 2
account: 001234
</pre>
Launch profiles would be distributed with paraview. The launch profile could contain site-specific key:value pairs if needed. As with the host profile, users don't have to edit any xml; paraview provides dialogs for modifying the fields.
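As a concrete illustration, here is a sketch of rendering the queued launch profile above into a PBS job script. The #PBS option spellings, the added nprocs key, and the pvserver path are assumptions for the example:
<pre>
# Sketch: turn a queued launch profile into a job script for qsub.
# Assumes time-limit is HH:MM and that nprocs is also provided.
def make_job_script(profile, pvserver="/path/to/paraview/bin/pvserver"):
    nodes = int(profile["nprocs"]) // int(profile["procs-per-node"])
    return "\n".join([
        "#!/bin/sh",
        "#PBS -q %s" % profile["partition"],
        "#PBS -l walltime=%s:00" % profile["time-limit"],
        "#PBS -l nodes=%d:ppn=%s" % (nodes, profile["procs-per-node"]),
        "#PBS -A %s" % profile["account"],
        "mpirun -np %s %s" % (profile["nprocs"], pvserver),
    ])

profile = {"partition": "debug", "time-limit": "01:00", "account": "001234",
           "procs-per-node": "2", "nprocs": "8"}
print(make_job_script(profile))
</pre>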
Here is how visit handles the launch profile: the component launcher parses the launch profile and converts the key:value pairs into a list of standard command line arguments. It then calls a perl script named internallauncher and passes the arguments. This perl script (3.3K lines!) takes the arguments and does all the required setup. It sets environment variables, resolves paths, and contains all the logic for constructing the mpirun command line or writing a job script for the specified scheduler. The script is full of special cases for the sites where visit runs, things like:
<pre>
if ( $IsRunningOnFranklinNERSC || $IsRunningOnHopperNERSC || $IsRunningOnJaguar_ORNL )
{
    push @mpicmd, "$visitbindir/env LD_LIBRARY_PATH=$visitlibdir:$visitarchdir/system_libs";
}
</pre>
I propose a slightly different solution. The component launcher would call a python script and pass it the launch profile xml. The python script constructs a LaunchProfile object from the xml, with convenient getters/setters, and passes the object to a standard launch routine. The launch routine could call custom routines for the sites that paraview officially supports, and a site maintainer could easily insert their own launch routine.
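Here is a minimal sketch of that proposal. The flat xml layout and the site-hook registry are assumptions for the example, not an existing ParaView format:
<pre>
# Sketch of the proposed python launch path. The xml layout and the
# site-hook registry are illustrative assumptions.
import xml.etree.ElementTree as ET

class LaunchProfile(object):
    def __init__(self, xml_text):
        # Store each child element as a key:value pair.
        self._values = dict((e.tag, e.text) for e in ET.fromstring(xml_text))
    def get(self, key, default=None):
        return self._values.get(key, default)
    def set(self, key, value):
        self._values[key] = value

# Site maintainers register custom launch routines here.
site_launch_routines = {}

def standard_launch(profile):
    custom = site_launch_routines.get(profile.get("site"))
    if custom:
        return custom(profile)
    if profile.get("type") == "parallel":
        return [profile.get("mpi-command", "mpirun"),
                "-np", profile.get("nprocs", "1"), "pvserver"]
    return ["pvserver"] + (profile.get("extra-args") or "").split()

xml_text = ("<launch-profile><type>parallel</type>"
            "<nprocs>8</nprocs></launch-profile>")
print(standard_launch(LaunchProfile(xml_text)))
</pre>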
===ssh client===
For mac & linux we would default to 'ssh' but allow the user to specify a path to their preferred ssh client. For windows, the visit source contains a cmakeified Putty client that we could use by default, while still allowing the user to enter their preferred ssh client.
===interactive passwords===
If you launch paraview from a terminal and paraview forks off an ssh process that connects to a host requiring interactive password authentication, then the ssh password prompt will appear in the terminal. It is possible to use some trickery to allocate a pty and run ssh inside it: you can then capture the password prompt, display a qt dialog to the user, and send the user's password to ssh through the pty.
There is a python module called 'pexpect' (written in pure python, depends on the pty module) that makes it easy to do this on mac/linux. A common way to do it from c++ is to use libexpect, but that has a tcl dependency. Visit contains code that attempts to do this without using libexpect, but it did not work on my machine. Visit's documentation says their code may not work on 'modern' linuxes. I don't see why we couldn't make it work though, because the python pexpect module works just fine. We can handle interactive passwords on Windows through the cmakeified Putty client, using Visit's code.
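For example, a pexpect sketch of intercepting the prompt (the prompt patterns are illustrative, and ask_user stands in for a qt password dialog):
<pre>
# Sketch: run ssh inside a pty with pexpect and intercept the prompt.
# ask_user() is a placeholder for a qt password dialog.
import getpass
import pexpect

def ssh_with_prompt(user, host, command, ask_user=getpass.getpass):
    child = pexpect.spawn("ssh %s@%s %s" % (user, host, command))
    index = child.expect(["[Pp]assword:", "passcode", pexpect.EOF],
                         timeout=60)
    if index in (0, 1):
        # Forward whatever the user typed in the dialog to ssh's pty.
        child.sendline(ask_user("ssh prompt: "))
    child.expect(pexpect.EOF)
    return child.before  # output of the remote command

print(ssh_with_prompt("pmarion", "artemis.princeton.edu", "hostname"))
</pre>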
===token passcode===
Many sites require you to enter a passcode from a token. This must be done interactively, so either the prompt appears in the terminal, or we try to be fancy and allocate a pty for ssh.
===connecting to the client===
The component launcher and pvserver have to be able to connect back to the client. (If you are using an [[#ssh tunnel]], this section is not relevant.) After you ssh to a remote host, the SSH_CLIENT environment variable usually contains the originating IP address. Alternatively, you can use the result of `hostname` on the client computer, or the user could manually specify their machine's host:port in the host profile.
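A small sketch of recovering the client address on the remote side from SSH_CLIENT, falling back to a host-profile value (the fallback behavior is an assumption):
<pre>
# Sketch: figure out where to connect back to. SSH_CLIENT looks like
# "192.0.2.10 51234 22": client ip, client port, server port.
import os

def client_address(profile_host=None):
    ssh_client = os.environ.get("SSH_CLIENT")
    if ssh_client:
        return ssh_client.split()[0]
    # Fall back to a host the user specified in the host profile.
    return profile_host

print(client_address("myworkstation.example.edu"))
</pre>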
===ssh tunnel===
The user will often want to use an ssh tunnel when launching on a remote host. To create the ssh tunnel we will have to pick a random port number on the remote host; that port is forwarded to the client machine, where the paraview client is listening for a reverse connection.
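A sketch of constructing the reverse-tunnel arguments (the port range is illustrative, and real code would retry if the chosen remote port is already in use):
<pre>
# Sketch: pick a random remote port and build the ssh -R option that
# reverse-forwards it to the paraview client's listening port.
import random

def reverse_tunnel_args(client_port=11111):
    remote_port = random.randint(10000, 40000)
    # ssh -R <remote_port>:localhost:<client_port> forwards connections
    # made to remote_port on the remote host back to the client.
    return remote_port, ["-R", "%d:localhost:%d" % (remote_port, client_port)]

remote_port, args = reverse_tunnel_args()
print(remote_port, args)
</pre>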
===gateway ports===
When you use an ssh tunnel, the tunnel only accepts connections to the tunnel port from localhost. This becomes a problem when there is a reverse ssh tunnel from the login node to the client computer, and pvserver executes on compute nodes. The ssh tunnel on the login node will not accept connections from a compute node (it only accepts connections from localhost) unless GatewayPorts is enabled in the sshd_config, and it is not normally enabled. The problem can be solved by using the [[#component launcher]] to act as a port forwarder. The component launcher forks off a port-forwarding process, a bridge, that accepts connections from pvserver and forwards the traffic to the ssh tunnel. Code exists in visit for doing this. A simple port-forwarder can also be written in python, which may or may not perform as well as pure c code.
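A minimal python version of such a bridge (single connection, blocking relay on two threads; the ports are placeholders and the performance is untested):
<pre>
# Sketch: bridge process on the login node. Accept a connection from
# pvserver and relay bytes to the local end of the ssh tunnel.
import socket
import threading

def relay(src, dst):
    # Copy bytes one way until the connection closes.
    while True:
        data = src.recv(4096)
        if not data:
            break
        dst.sendall(data)

def forward(listen_port, tunnel_port):
    server = socket.socket()
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind(("", listen_port))   # reachable from compute nodes
    server.listen(1)
    conn, _ = server.accept()
    tunnel = socket.create_connection(("localhost", tunnel_port))
    threading.Thread(target=relay, args=(conn, tunnel)).start()
    relay(tunnel, conn)

forward(22222, 33333)  # placeholder ports
</pre>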
===ssh security===
Martin et al. have an article about avoiding ssh tunneling and instead using ssh directly to talk to pvserver. We should see how we can support that.