Parallel I/O: Difference between revisions

Revision as of 13:59, 29 January 2007

As the data sets we process with ParaView get bigger, we are finding that I/O accounts for a significant portion of the wait time in ParaView for many of our users. We had originally assumed that moving our clusters to parallel disk drives with greater overall read performance would largely fix the problem. Unfortunately, in practice we have found that our read rates from the storage system are often far below its potential.

This document analyzes the parallel I/O operations performed by VTK and ParaView (specifically reads since they are by far the most common), hypothesizes on how these might impede I/O performance, and proposes a mechanism that will improve the I/O performance.

File Access Patterns

VTK and ParaView, by design, support many different readers with many different formats. Rather than iterate over every reader that exists or could exist, we categorize their file access patterns here. We also try to identify which readers perform which access patterns. Readers that do not read in parallel data will not be considered for obvious reasons. Also out of consideration are readers that are thin wrappers over other I/O libraries (e.g. hdf5). We have little control over how these readers behave other than to report problems and hope they get fixed.

Common File

There are some cases where all the processes in a parallel job will each read in the entire contents of a single file. The parallel data readers are usually smart enough to only read in the portion of the data that they need. However, there is usually a collection of "metadata" that all processes require before they can read in the actual data. This metadata includes things like domain extents, number formats, and what data is attached. Oftentimes this metadata is all packaged up into its own file (along with pointers to the Individual Files that hold the actual data).

In this case, every process independently reads the entire metadata file because all the readers are designed to perform reads with no communication amongst them. This is so that the readers will also work in a sequential parallel read mode where data is broken up over time rather than across processors and there are no other processors with which to collaborate.
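The access pattern can be sketched as follows. This is only an illustration in Python, not VTK or ParaView code: the `read_metadata` and `simulate_common_file_reads` functions, the file contents, and the process count are all hypothetical, and the processes are simulated one after another in a single program rather than run in parallel.

```python
import os
import tempfile

def read_metadata(path):
    """Read the whole metadata file, as each process does on its own."""
    with open(path, "rb") as f:
        return f.read()

def simulate_common_file_reads(path, num_processes):
    """Simulate every process reading the full file with no communication.

    Returns the total number of bytes requested from storage.
    """
    total_bytes = 0
    for _rank in range(num_processes):
        data = read_metadata(path)  # identical request from every process
        total_bytes += len(data)
    return total_bytes

# Hypothetical metadata file: a few lines pointing at per-piece data files.
with tempfile.NamedTemporaryFile("w", suffix=".meta", delete=False) as tmp:
    for piece in range(4):
        tmp.write(f"piece_{piece}.dat\n")
    metadata_path = tmp.name

file_size = os.path.getsize(metadata_path)
total_requested = simulate_common_file_reads(metadata_path, num_processes=64)
os.remove(metadata_path)

print(f"file size: {file_size} bytes, total requested: {total_requested} bytes")
```

The total requested from storage grows linearly with the number of processes, even though every process asks for exactly the same bytes.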

Why it is inefficient:

The storage system is being bombarded with a bunch of similar read requests at about the same time. All the requests are likely to be near each other (with respect to offsets in the file) and hence on the same disk. Unless the read requests are synchronized well and the parallel storage device has really good caching that works across multiple clients (neither of which is very likely), the storage system will be forced to constantly move the head on the disk to satisfy all the incoming requests without starving any of them. Moving the head causes a delay measured in milliseconds, which slows the reading to a crawl.
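To get a feel for the scale of the problem, here is a rough back-of-the-envelope estimate in Python. The numbers (64 processes, 100 read requests each, 5 ms per head movement) are assumptions chosen for illustration, not measurements, and the model assumes the worst case in which every interleaved request forces the head to move.

```python
def worst_case_seek_delay(num_processes, requests_per_process, seek_ms):
    """Total seek delay in seconds if every interleaved request moves the head."""
    total_requests = num_processes * requests_per_process
    return total_requests * seek_ms / 1000.0

# Assumed values for illustration only.
delay_s = worst_case_seek_delay(num_processes=64,
                                requests_per_process=100,
                                seek_ms=5.0)
print(f"{delay_s:.1f} s spent seeking")  # prints "32.0 s spent seeking"
```

Under these assumptions the disk spends about half a minute doing nothing but moving its head, regardless of how small the metadata file actually is.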

What readers do this:

All the readers that have a metadata file use this access pattern. This includes the PVDReader as well as all of the XMLP*Readers and the pvtkfile (partitioned legacy VTK files).

The spy plot reader used to have a metadata file that listed all of the actual data files. Does it still have this? Also, is there an EnSight equivalent to this (maybe the SOS file)?

--Ken 13:59, 29 Jan 2007 (EST)

Contents

File Access Patterns

Common File

Monolithic File

Monolithic File with a Common Header

Individual Files

Parallel I/O Layer
