High-Speed Distributed Data Handling for HENP

Paper: 328
Session: C (talk)
Speaker: Johnston, William, Ernest Orlando Lawrence Berkeley National Laboratory (LBL), Berkeley
Keywords: ATM, data management, hierarchical storage management



High-Speed Distributed Data Handling for HENP

W. Greiman, W. E. Johnston, C. McParland, D. Olson, B. Tierney

Lawrence Berkeley National Laboratory
Berkeley, CA, 94720, USA

Abstract

This paper describes a project whose goal is to demonstrate a
scalable solution to the problem of high-bandwidth data
handling for high-performance analysis of high-energy and
nuclear physics data. The STAR experiment at RHIC is used as
the basis for a realistic example in this project. The approach
is based on a distributed parallel storage system which collects
data from the detector and serves data to a distributed cluster
of symmetric multi-processor computers.
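As a rough illustration of the distributed-parallel-storage idea, the sketch below stripes a byte stream in fixed-size blocks round-robin across several servers and reassembles it on read. The block size, in-memory "servers", and function names are invented for this example; they are not the project's actual storage system.

```python
STRIPE_BLOCK = 4  # bytes per block; toy value chosen for illustration


def stripe(data, num_servers, block=STRIPE_BLOCK):
    """Scatter fixed-size blocks of `data` round-robin across servers.

    Returns a list of per-server block lists, standing in for the
    independent storage servers of a distributed parallel storage system.
    """
    servers = [[] for _ in range(num_servers)]
    for n, i in enumerate(range(0, len(data), block)):
        servers[n % num_servers].append(data[i:i + block])
    return servers


def gather(servers):
    """Reassemble the original stream by reading the servers round-robin.

    Because blocks were scattered round-robin, reading block j from each
    server in turn restores the original order; in a real system these
    reads would proceed in parallel over the network.
    """
    blocks = []
    depth = max(len(s) for s in servers)
    for j in range(depth):
        for s in servers:
            if j < len(s):
                blocks.append(s[j])
    return b"".join(blocks)
```

Striping lets the aggregate read bandwidth scale with the number of servers, which is the property the experiment relies on to sustain detector-rate data flows.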

Simulated STAR data are injected into the system at realistic
rates (10-20 MBy/s), and the STAR analysis framework running on
the SMPs is used to demonstrate processing with realistic
analysis algorithms. Two basic modes of operation are being
investigated: an event-reconstruction mode, which accesses the
data sequentially, and a physics-analysis mode, which accesses
the data randomly or through a filter.
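The two access styles can be sketched as follows. The `EventStore`, `read_event`, and index names here are hypothetical stand-ins for illustration, not the STAR framework's API.

```python
class EventStore:
    """Toy in-memory stand-in for an event store (illustrative only)."""

    def __init__(self, events):
        self.events = events
        self.num_events = len(events)
        # An event index; in practice it could be sparse or reordered.
        self.index = list(range(len(events)))

    def read_event(self, i):
        return self.events[i]


def reconstruct_all(store):
    """Reconstruction-style pass: one sequential sweep over every event."""
    return [store.read_event(i) for i in range(store.num_events)]


def analyze_selected(store, predicate):
    """Analysis-style pass: random/filtered access via the event index,
    keeping only the events that satisfy the selection predicate."""
    selected = []
    for i in store.index:
        event = store.read_event(i)
        if predicate(event):
            selected.append(event)
    return selected
```

The distinction matters for the storage system: the sequential mode rewards large streaming reads, while the filtered mode stresses random access and index lookups.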

The computing-experiment architecture is based on an OC-12 ATM
metropolitan-area network, which may be configured with a
diameter of up to about 1000 km, separating the simulated
detector from the storage systems (whose components are also
scattered throughout this network). The processing elements
include Sun E-4000 and Ultra SMPs, DEC Alpha SMPs, and SGI SMPs,
all of which are likewise scattered around the network. Most of
this environment is currently in place, and the distributed
storage system is integrated with the analysis code.
High-data-rate experiments are expected to start by early 1997.

The goal of the work is to demonstrate that the networks we
expect to be in place at the turn of the century will support
distributed systems that can collect and process all of the
event data in real time, so that decisions on how to organize
the data on tertiary storage, and which subsets to keep in
on-line storage, can be made before committing to an off-line
storage strategy. Conceivably, such real-time processing could
also feed information back to the experiment operators, so that
near-real-time adjustments to the experiment parameters could be
made based on the preliminary analysis.