Application of Data Compression to HEP Data

Paper: 431
Session: C (poster)
Presenter: Schindler, Michael, Vienna University of Technology, Vienna
Keywords: algorithms, network performance, data acquisition systems, large systems, mass storage

Application of Data Compression to HEP Data

Michael Schindler
TU Wien, Karlsplatz 13/1861, A-1040 Wien, Austria


Data compression is widely used in data storage and transmission today. Typical
compression ratios range from 50% for text to >99% for MPEG-compressed
movies. High energy physics data has proved to be different: compression is
typically between 10 and 50%. Sparse readout data (zero-suppressed data) seems
to be a major problem for standard algorithms in software (gzip and others)
and hardware (tape drives).
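The effect can be illustrated with a small synthetic sketch (not data from any
real experiment): a mostly-zero raw readout compresses very well with a
general-purpose compressor such as zlib (the library behind gzip), while the
same hits after zero suppression, stored as (channel address, ADC value)
records, are already close to their entropy limit and compress poorly. All
parameters (channel count, occupancy) are illustrative assumptions.

```python
import random
import zlib

random.seed(1)

N_CHANNELS = 65536   # hypothetical detector size (fits a 2-byte address)
OCCUPANCY = 0.05     # assume 5% of channels fire per event

# Raw readout: one byte per channel, almost all zeros.
raw = bytearray(N_CHANNELS)
hits = random.sample(range(N_CHANNELS), int(N_CHANNELS * OCCUPANCY))
for ch in hits:
    raw[ch] = random.randrange(1, 256)   # random ADC value

# Zero suppression: keep only (2-byte address, 1-byte value) records.
suppressed = bytearray()
for ch in sorted(hits):
    suppressed += ch.to_bytes(2, "little") + bytes([raw[ch]])

raw_ratio = len(zlib.compress(bytes(raw))) / len(raw)
sup_ratio = len(zlib.compress(bytes(suppressed))) / len(suppressed)

print(f"raw (mostly zeros): compressed to {raw_ratio:.0%} of original size")
print(f"zero-suppressed:    compressed to {sup_ratio:.0%} of original size")
```

The zero-suppressed stream is far smaller to begin with, but what remains is
dominated by near-random addresses and ADC values, so a second, generic
compression pass gains little.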

On the other hand, HEP experiments produce enormous amounts of data: the KLOE
experiment at the DAFNE accelerator (Frascati) will produce about 50 MByte/sec,
which is still harmless compared to the 2.5 GByte/sec that ALICE at the LHC
(CERN) might produce. Permanent storage space does not come for free, so there
is some demand for data compression.

This talk will address some of the problems with HEP data and their
implications for the design of planned and future DAQ systems. It will also
give a brief overview of data compression basics to provide the necessary
background for the discussed problems and possible solutions. Those solutions
range from the electronic readout sequence, to the question of which data
analysis should be done online, to which parts of the compression could be
done in hardware. The benefits and drawbacks of possible data compression
locations in a DAQ architecture will also be addressed.

I would prefer to give a rather general talk and leave technical details to
the poster session, to which I have also submitted this abstract.

Some references:
CERN ALICE Internal Note 1996-03
Co-author of H. Beker's talk at the DAQ conference in Osaka