Data Management for Batch Systems

Paper: 205
Session: C (poster)
Presenter: Davis, Mark E., SURA/Jefferson Lab, Newport News
Keywords: data management, hierarchical storage management, mass storage


Data Management for Batch Systems

Ian Bird (igb@jlab.org, 757-269-7105)
Rita Chambers (chambers@jlab.org, 757-269-7514)
Mark E. Davis (davis@jlab.org, 757-269-7027)
Andy Kowalski (kowalski@jlab.org, 757-269-6224)
Sandy Philpott (philpott@jlab.org, 757-269-7152)
Dave Rackley (rackley@jlab.org, 757-269-7041)
Roy Whitney (whitney@jlab.org, 757-269-7536)

SURA/Jefferson Lab
12000 Jefferson Ave.
Newport News, VA 23606
USA
Fax: 757-269-7053


Abstract:

By late 1997, the Thomas Jefferson National Accelerator Facility ("Jefferson
Lab") will collect over one terabyte of raw data per day of accelerator
operation from three concurrently operating Experimental Halls. Raw data is
stored in a central mass storage library using StorageTek Redwood tape
transports. Data is then made available for reconstruction on a batch CPU
farm consisting of a variety of dedicated UNIX workstations. Given an
average of 2.5 passes through each data set, 120-150 days of operation per
year, and aggregate reconstructed output, potentially as large as the input
data, to be stored in the central library, the data management operation
must move over a petabyte of information through the system each year.
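
As a rough check of this figure (a back-of-the-envelope sketch only, assuming
roughly 1 TB of raw data per day, 150 days of operation, reconstructed output
equal in volume to the raw input, and 2.5 read passes over both raw and
reconstructed data sets):

    # Back-of-the-envelope estimate of annual data movement; all inputs are
    # assumptions taken from the figures quoted above, not measurements.
    raw_per_day_tb = 1.0    # ~1 TB of raw data per day of operation
    days_per_year = 150     # upper end of the 120-150 days of operation
    read_passes = 2.5       # average number of passes through each data set

    raw_tb = raw_per_day_tb * days_per_year       # raw data written to the silo
    recon_tb = raw_tb                             # reconstructed output assumed equal in size
    writes_tb = raw_tb + recon_tb                 # data written into the library
    reads_tb = read_passes * (raw_tb + recon_tb)  # data read back for processing

    total_tb = writes_tb + reads_tb
    print(f"approximate annual data movement: {total_tb:.0f} TB")  # ~1050 TB, i.e. over a petabyte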

This paper will focus on the designs considered for moving data between
the central data silo and the batch farm CPUs. Solutions considered
include local RAID or single disks per batch node, central staging RAID
and longer-term RAID work areas, NFS-served RAID for reads versus writes,
and multiple host connections to a central RAID. The final solution must
meet the aggregate data movement requirement with a cost-effective
implementation. It must also take into account the I/O and networking
throughput of individual batch nodes and central data servers,
optimizations of the data movement algorithms to avoid redundant data
transfers while still providing flexibility for researchers' data
requirements, and cross-platform access to data sets.
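
One of the optimizations noted above is avoiding redundant transfers from
the silo when a requested data set already resides on a staging disk. The
sketch below illustrates the idea only; the staging path and the
tape-retrieval helper are hypothetical placeholders, not the actual
Jefferson Lab interfaces:

    import os
    import shutil

    STAGE_DIR = "/stage"    # hypothetical central staging RAID area

    def fetch_from_tape(data_set, dest):
        """Placeholder for retrieval from the tape silo; the real call is
        made through the mass storage system's own interface."""
        raise NotImplementedError("site-specific tape retrieval goes here")

    def stage_in(data_set, work_dir):
        """Copy a data set into a batch node's work area, pulling it from
        tape only if it is not already present on the staging disk."""
        staged = os.path.join(STAGE_DIR, data_set)
        if not os.path.exists(staged):       # avoid a redundant tape mount
            fetch_from_tape(data_set, staged)
        shutil.copy(staged, work_dir)        # local copy for the batch job
        return os.path.join(work_dir, data_set)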

By mid-1997, the hardware implementation will include a batch CPU farm
consisting of HP, IBM, and Sun workstations providing 150 SPECint95 of
processing power. The initial data distribution method will use
direct-attached UltraSCSI RAID subsystems on data servers available to
batch nodes via switched 100 Mbps Ethernet. The storage and retrieval of
both input and output data sets will be controlled by a locally developed
database-driven scheduling system interfaced to the commercial products
Open Storage Manager (Computer Associates) and Load Sharing Facility
(Platform Computing). We will provide results of our data throughput
testing for a variety of configurations, as well as an overview of the
design in progress to support the 300 SPECint95 farm required by late
1997. At full operational capacity, we anticipate using
multi-host-connected, high-performance RAID systems as well as Fibre
Channel interfaces to support the laboratory's data requirements.
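
As an illustration of how a database-driven scheduler can tie staging to
batch submission, the sketch below submits a job to LSF only after its
input data set has been marked as staged. The table layout, state names,
and use of SQLite are illustrative assumptions; bsub is LSF's standard
batch submission command, and the staging step that would go through Open
Storage Manager is assumed to have already updated the job's state:

    import sqlite3
    import subprocess

    def submit_ready_jobs(db_path):
        """Submit to LSF every job whose input data set has been staged.
        Table and column names are illustrative, not the production schema."""
        db = sqlite3.connect(db_path)
        rows = db.execute(
            "SELECT job_id, script, input_path FROM jobs WHERE state = 'staged'"
        ).fetchall()
        for job_id, script, input_path in rows:
            # bsub is LSF's batch submission command; the reconstruction
            # script receives the location of its staged input.
            subprocess.run(["bsub", script, input_path], check=True)
            db.execute("UPDATE jobs SET state = 'submitted' WHERE job_id = ?",
                       (job_id,))
        db.commit()
        db.close()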