Simplifying Remote Visualization for HPC sites

From ParaQ Wiki
Revision as of 09:05, 28 May 2010 by Patmarion

Currently ParaView uses pvsc files to configure server connections. Although powerful, these files are complicated for site maintainers to use. The goal of this page is to consolidate ideas so that users can simply download ParaView, submit jobs to any of the major supercomputing sites, and visualize their datasets.


server profile

A ParaView server profile is a combination of a #host profile and a #launch profile. The host profile says where the server is launched; the launch profile says how it is launched.

component launcher

In the ParaView client, you would no longer specify a custom command for launching pvserver. Instead, the ParaView client starts the component launcher. Depending on what is specified in the #host profile, the component launcher is started either directly on localhost or on a remote host using an ssh command. The component launcher connects back to the client over a TCP socket. The connection is made either directly using host:port or through a reverse #ssh tunnel.
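As a rough sketch of how the client might decide between a local start and an ssh start, the function below builds the command line from a host profile held as a plain dictionary. The launcher binary name `bin/pvlauncher` and its `--connect` flag are hypothetical; nothing in this proposal fixes them yet.

```python
import posixpath
import shlex


def launcher_command(host_profile, connect_to, extra_ssh_args=()):
    """Return the argv list that starts the component launcher.

    connect_to is the "host:port" the launcher should connect back to.
    """
    # "bin/pvlauncher" and "--connect" are hypothetical names.
    launcher = posixpath.join(
        host_profile.get("paraview-install", ""), "bin", "pvlauncher")
    launcher_argv = [launcher, "--connect", connect_to]

    host = host_profile.get("host", "localhost")
    if host == "localhost":
        return launcher_argv  # start directly, no ssh involved

    destination = host
    if "user" in host_profile:
        destination = host_profile["user"] + "@" + host
    # ssh hands the remote command to a shell, so quote each argument.
    remote_command = " ".join(shlex.quote(arg) for arg in launcher_argv)
    return ["ssh", *extra_ssh_args, destination, remote_command]
```

The `extra_ssh_args` parameter is where tunnel options such as `-R` would be passed when the host profile asks for a tunnel.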

The component launcher receives a #launch profile from the ParaView client and uses it to launch pvserver. VisIt's component launcher can also launch a metadata server that delivers filesystem listings to the client, which would make it convenient to browse files without having to launch pvserver.

Using the component launcher you only have to run ssh one time (authenticate once), and from there the component launcher can launch your servers. If you are using ssh tunneling, the ssh tunnel will stay open as long as the component launcher is running. Also, the component launcher solves the #gateway ports issue.

backward compatibility

ParaView should continue to support pvsc files. When creating a #server profile, you could specify that this server configuration follows the old style. In this case ParaView will use the old code path instead of using the component launcher.

host profile

A host profile is specified in xml and consists of key:value pairs. The simplest case for a host profile is:

host: localhost

A more common case is:

user: pmarion
host: artemis.princeton.edu
use-tunnel: true
paraview-install: /path/to/paraview

Using the host profile we know how to start the component launcher on the remote host (of course, it might just be localhost). The path to the component launcher is specified in the host profile. The component launcher will know how to find the other binaries such as pvserver.
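To make the XML form concrete, here is one possible encoding of the host profile above and a few lines of Python that read it back into key:value pairs. The element names (`host-profile` and one child element per key) are an assumption for illustration, not a settled schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML layout: one child element per key.
HOST_PROFILE_XML = """
<host-profile>
  <user>pmarion</user>
  <host>artemis.princeton.edu</host>
  <use-tunnel>true</use-tunnel>
  <paraview-install>/path/to/paraview</paraview-install>
</host-profile>
"""


def parse_profile(xml_text):
    """Flatten a profile document into a dict of key:value strings."""
    root = ET.fromstring(xml_text)
    return {child.tag: (child.text or "").strip() for child in root}


host_profile = parse_profile(HOST_PROFILE_XML)
```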

Of course, the average user will never see the XML. Host profiles are distributed with ParaView, and users just need dialogs in ParaView to customize the information. To make it even simpler, we might split the host profile into a host profile and a user profile. Users may only need to specify their username and perhaps an ssh key file.

launch profile

A launch profile is specified in xml and consists of key:value pairs. A simple serial launch profile might be:

type: serial
extra-args: --foo

A launch profile for a parallel run might be:

type: parallel
mpi-command: mpirun
nprocs: 8
machine-file: /path/to/machinefile
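The two profiles above could be turned into a pvserver command line roughly as follows. This is a sketch that assumes an mpirun accepting `-np` and `-machinefile`; other MPI launchers spell these options differently, which is exactly the kind of site-specific detail a launch routine has to absorb.

```python
import shlex


def pvserver_command(launch_profile, pvserver="pvserver"):
    """Build the pvserver argv for a serial or parallel launch profile."""
    extra = shlex.split(launch_profile.get("extra-args", ""))
    if launch_profile.get("type") != "parallel":
        return [pvserver, *extra]

    # Assumed flags: -np and -machinefile, as accepted by common mpirun builds.
    cmd = [launch_profile.get("mpi-command", "mpirun"),
           "-np", str(launch_profile.get("nprocs", 1))]
    if "machine-file" in launch_profile:
        cmd += ["-machinefile", launch_profile["machine-file"]]
    return cmd + [pvserver, *extra]
```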

If pvserver should be run as a scheduled job on a queuing system, the launch profile would contain more fields:

launcher: qsub
partition: debug
time-limit: 01:00
procs-per-node: 2
account: 001234
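For a scheduled launch, the launch routine would write a job script from these fields. The sketch below assumes a PBS/Torque-style scheduler (`#PBS -q`, `-l walltime`, `-A`, `-l nodes=N:ppn=P`); other schedulers need different directives. The mapping of `partition` to a queue name and the explicit `nprocs` argument are assumptions for illustration, and the time limit is passed through verbatim.

```python
def pbs_job_script(launch_profile, nprocs, command):
    """Return the text of a PBS-style job script for the given profile."""
    procs_per_node = int(launch_profile.get("procs-per-node", 1))
    nodes = -(-nprocs // procs_per_node)  # ceiling division

    lines = ["#!/bin/sh"]
    if "partition" in launch_profile:
        lines.append("#PBS -q " + launch_profile["partition"])
    if "time-limit" in launch_profile:
        # Passed through as written; the scheduler decides how to read it.
        lines.append("#PBS -l walltime=" + launch_profile["time-limit"])
    if "account" in launch_profile:
        lines.append("#PBS -A " + launch_profile["account"])
    lines.append("#PBS -l nodes=%d:ppn=%d" % (nodes, procs_per_node))
    lines.append(command)
    return "\n".join(lines) + "\n"
```

The resulting text would be written to a file and handed to the command named in the `launcher` field (qsub here).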

Launch profiles would be distributed with ParaView. A launch profile could contain site-specific key:value pairs if needed. As with the host profile, users don't have to edit any XML; ParaView provides dialogs for modifying the fields.

Here is how VisIt handles the launch profile. The component launcher parses the launch profile and converts the key:value pairs into a list of standard command line arguments. It then calls a Perl script named internallauncher and passes the arguments. This Perl script (3.3K lines!) takes the arguments and does all the required setup: it sets environment variables, resolves paths, and contains all the logic for constructing the mpirun command line or writing a job script for the specified scheduler. The script is full of special cases for the sites where VisIt runs, things like:

  if ( $IsRunningOnFranklinNERSC || $IsRunningOnHopperNERSC || $IsRunningOnJaguar_ORNL )
  {
    push @mpicmd, "$visitbindir/env LD_LIBRARY_PATH=$visitlibdir:$visitarchdir/system_libs";
  }


I propose a slightly different solution. The component launcher would call a python script and pass the launch profile xml. The python script constructs a LaunchProfile object from the xml that has convenient getters/setters, and passes the object to a standard launch routine. The launch routine could call custom routines for the sites that paraview officially supports, or a site maintainer could write their own launch routine.
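A minimal sketch of that idea, assuming the launch profile XML uses one child element per key and carries a hypothetical `site` field to select a custom routine. The class and function names here are placeholders, and the launch routines only return a command line instead of running it.

```python
import xml.etree.ElementTree as ET


class LaunchProfile:
    """Key:value view of a launch profile with simple getters/setters."""

    def __init__(self, fields):
        self._fields = dict(fields)

    @classmethod
    def from_xml(cls, xml_text):
        root = ET.fromstring(xml_text)
        return cls({c.tag: (c.text or "").strip() for c in root})

    def get(self, key, default=None):
        # Optional fields simply fall back to the default.
        return self._fields.get(key, default)

    def set(self, key, value):
        self._fields[key] = str(value)


def standard_launch(profile):
    """Default routine: build the pvserver command line."""
    if profile.get("type") == "parallel":
        return [profile.get("mpi-command", "mpirun"),
                "-np", profile.get("nprocs", "1"), "pvserver"]
    return ["pvserver"]


SITE_LAUNCHERS = {}  # site name -> custom launch routine


def site_launcher(name):
    """Decorator a site maintainer would use to register a routine."""
    def register(func):
        SITE_LAUNCHERS[name] = func
        return func
    return register


@site_launcher("example-site")  # hypothetical site name
def example_site_launch(profile):
    # Site quirk handled in one place: wrap the standard command in env.
    return ["env", "LD_LIBRARY_PATH=/hypothetical/libs"] + standard_launch(profile)


def launch(xml_text):
    profile = LaunchProfile.from_xml(xml_text)
    routine = SITE_LAUNCHERS.get(profile.get("site"), standard_launch)
    return routine(profile)
```

Compared with one large script full of site checks, each special case lives in its own small routine, and a site without special needs never touches the registry.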

ssh client

For Mac and Linux we would default to 'ssh' but allow the user to specify a path to their preferred ssh client. For Windows, the VisIt source contains a CMake-ified PuTTY client that we could use by default, while still allowing the user to specify their preferred ssh client.

interactive passwords

If you launch ParaView from a terminal and ParaView forks off an ssh process that connects to a host requiring interactive password authentication, then the ssh password prompt will appear in the terminal. On Mac, however, users commonly run ParaView from an app bundle, not a terminal. It is possible to allocate a pty and run ssh inside it; you can then capture the password prompt, display a Qt dialog to the user, and send the user's password to ssh through the pty. We can handle interactive passwords on Windows through the CMake-ified PuTTY client, using VisIt's code.

There is a Python module called 'pexpect' (written in pure Python, depends on the pty module) that makes it easy to run ssh inside a pty on Mac/Linux. A common way to do it from C++ is to use libexpect, but that has a Tcl dependency. VisIt contains code that attempts to do this without using libexpect, but it did not work on my machine. VisIt's documentation says their code may not work on 'modern' Linuxes. I don't see why we couldn't make it work, though, because the Python pexpect module works just fine.
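To show the pty trick itself, here is a small sketch that uses only the standard pty module rather than pexpect. It runs a command inside a pty, waits for a password prompt, and writes the password back. The prompt pattern and the function name are assumptions; in ParaView the password would come from a Qt dialog rather than a function argument. This only works on Mac/Linux.

```python
import os
import pty


def run_with_password(argv, password, prompt=b"assword:"):
    """Run argv inside a pty, answer the first password prompt,
    and return everything the command printed."""
    pid, fd = pty.fork()
    if pid == 0:
        # Child: the pty is now the controlling terminal, which is
        # where ssh writes its password prompt.
        try:
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)  # only reached if exec failed

    output = b""
    answered = False
    while True:
        try:
            chunk = os.read(fd, 1024)
        except OSError:
            break  # Linux reports EIO once the child closes the pty
        if not chunk:
            break  # other platforms report end-of-file instead
        output += chunk
        if not answered and prompt in output:
            os.write(fd, password.encode() + b"\n")
            answered = True
    os.close(fd)
    os.waitpid(pid, 0)
    return output
```

A real implementation would also need to handle host key confirmation prompts, wrong passwords, and timeouts, which pexpect makes considerably easier.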

Another solution for mac and linux would be to fork a new terminal process that executes the ssh command. On linux (with gnome):

gnome-terminal --command "ssh user@host ./launch_paraview"

We could probably find a solution on Mac using Terminal or AppleScript + Terminal. For Linux users without gnome-terminal (or another popular terminal), we could fall back to requiring that they run ParaView from the command line.

token passcode

Many sites require you to enter a passcode from a token. You have to do this interactively. See above for discussion of handling interactive prompts.

connecting to the client

The component launcher and pvserver have to be able to connect back to the client. If you are using an #ssh tunnel then this section is not relevant. After you ssh to a remote host, the SSH_CLIENT environment variable usually contains the originating IP address. Alternatively, you can use the result of `hostname` on the client computer, or the user could manually specify their machine's host:port in the host profile.
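On the remote side, the lookup could be as simple as the sketch below. SSH_CLIENT holds three space-separated fields (client address, client port, server port), so the first field is the one we want. The `client-host` profile key and the rule that a manual entry wins over SSH_CLIENT are assumptions for illustration.

```python
import os


def client_host(host_profile, environ=None):
    """Return the address the launcher should connect back to, or None."""
    environ = os.environ if environ is None else environ
    # A manually specified host (hypothetical "client-host" key) wins.
    if host_profile.get("client-host"):
        return host_profile["client-host"]
    # SSH_CLIENT looks like "192.0.2.10 51234 22".
    fields = environ.get("SSH_CLIENT", "").split()
    if fields:
        return fields[0]
    return None
```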

ssh tunnel

The user will often want to use an ssh tunnel when launching on a remote host. To create the ssh tunnel we will have to pick a random port number on the remote host that will be forwarded to the client machine, where the ParaView client is listening for a reverse connection.
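A sketch of the port selection, assuming we draw from the dynamic port range (49152 to 65535) and let ssh tell us if the port is already taken. `-R` requests the reverse forward, and the `ExitOnForwardFailure` option makes ssh exit when it cannot bind the remote port, so the client could retry with another number. The function name is a placeholder.

```python
import random


def reverse_tunnel_args(client_port, rng=random):
    """Pick a remote port and return it with the matching ssh options."""
    remote_port = rng.randint(49152, 65535)
    args = [
        "-o", "ExitOnForwardFailure=yes",
        # remote_port on the remote host -> client_port on this machine
        "-R", "%d:localhost:%d" % (remote_port, client_port),
    ]
    return remote_port, args
```

The component launcher would then be told to connect to localhost on the chosen remote port.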

gateway ports

When you use an ssh tunnel, the tunnel only accepts connections to the tunnel port from localhost. This becomes a problem when there is a reverse ssh tunnel from the login node to the client computer and pvserver executes on compute nodes. The ssh tunnel on the login node will not accept connections from a compute node (it only accepts connections from localhost) unless GatewayPorts is enabled in the sshd_config, and it is not normally enabled. The problem can be solved by using the #component launcher as a port forwarder. The component launcher forks off a port-forwarding process, a bridge, that accepts connections from pvserver and forwards the traffic to the ssh tunnel. Code exists in VisIt for doing this. A simple port forwarder can also be written in Python, which may or may not perform as well as pure C code.
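For reference, a simple Python bridge might look like the sketch below. It uses threads instead of a forked process to keep the example short, listens on all interfaces so compute nodes can reach it, and relays each connection to the tunnel port on localhost. The function names are placeholders and performance has not been measured.

```python
import socket
import threading


def _pump(src, dst):
    """Copy bytes from src to dst until src closes."""
    try:
        while True:
            data = src.recv(65536)
            if not data:
                break
            dst.sendall(data)
    except OSError:
        pass
    finally:
        try:
            dst.shutdown(socket.SHUT_WR)  # pass the end-of-stream along
        except OSError:
            pass


def _bridge(conn, tunnel_addr):
    """Relay one pvserver connection to the ssh tunnel, both directions."""
    upstream = socket.create_connection(tunnel_addr)
    back = threading.Thread(target=_pump, args=(upstream, conn), daemon=True)
    back.start()
    _pump(conn, upstream)
    back.join()
    conn.close()
    upstream.close()


def start_bridge(tunnel_port, listen_host="0.0.0.0", listen_port=0):
    """Start forwarding to localhost:tunnel_port; return the listening socket."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind((listen_host, listen_port))
    server.listen()

    def accept_loop():
        while True:
            try:
                conn, _ = server.accept()
            except OSError:
                break  # listening socket was closed
            threading.Thread(target=_bridge,
                             args=(conn, ("127.0.0.1", tunnel_port)),
                             daemon=True).start()

    threading.Thread(target=accept_loop, daemon=True).start()
    return server
```

pvserver on a compute node would then connect to the login node on the bridge's port (available from `getsockname()` on the returned socket) instead of the tunnel port.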

compiling the component launcher

The #component launcher could be a C++ application, or it could be written in pure Python. If it were pure Python, it would be nice to also use Python for the client-side code that communicates with the component launcher. This introduces a Python requirement for ParaView client builds that plan to use the component launcher.

If the component launcher is written in C++, should it use VTK? For many sites we will already have a full ParaView build, so there is no issue. For sites that require cross compiling, we currently build a small pvHostTools target on the login node, then use these tools to cross compile ParaView for the compute nodes. The pvHostTools target goes as far as building vtkCommon. Therefore the component launcher probably shouldn't require any more than vtkCommon. VTK's socket communicator classes are in vtkParallel, but the component launcher is simple enough that it should get by without those extras.

optional fields

It will be helpful to allow fields to be optional. Certain fields only apply to some users or styles of launch.

ssh security

Martin et al. have an article about avoiding ssh tunneling and instead using ssh directly to talk to pvserver. We should see how we can support that.