Type safe physics data analysis

Paper: 173
Session: A (talk)
Speaker: Katayama, Nobu, KEK, Tsukuba
Keywords: algorithms, analysis, C++



Type safe physics data analysis

Nobu Katayama

Abstract

Physics data analysis is typically done in stages. First,
the physics objects are reconstructed from the raw data taken
from the detector. The data summary tape (DST) is then written,
and the physicists are ready for the final physics data
analysis. A physicist writes programs to analyze the
DST based on the hypotheses he/she is trying to prove. For
example, in the study of B meson decays, the physicist
might want to study the B0 meson decaying into a J/psi
and a K short meson. As neither the J/psi nor the K short meson
is stable in the detector, the physicist tries to observe their
decays into a pair of leptons and a pair of charged pions,
respectively. These decays (or decay chains) occur with certain
branching fractions. (Some are known and compiled in the
Particle Data Table; others are not yet known.) A typical
procedure for physics data analysis is therefore as follows:
using a programming language of his/her choice, the physicist
describes the decay chains and finds such candidates in the
events. He/she then studies the properties of the candidates
using histograms and/or n-tuples, visualizing them with
tools such as PAW. The programming language used to describe
the physics hypotheses is typically FORTRAN. Some physicists
have used other languages such as C and Pascal. Several
specialized programming languages have been invented for physics
analysis. The most famous one is KAL, developed at DESY. The
author has written a package called CABS and presented it at
previous CHEP conferences. The package has been used
successfully by members of the CLEO experiment. Recently C++ has
been getting a lot of attention and is considered suitable for
physics data analysis because physics objects such as
tracks and particles can be abstracted into classes. It is
very natural to use classes for the physics objects. However,
when analyzing a very complicated decay chain, or many decay
chains at the same time, it would be very desirable if each
decay chain could be made into a class. A new technique was
developed to meet this requirement. Using the inheritance and
template features of C++, this package allows the user to create
a new decay chain class by declaring a decay chain like
``DefinitionTS2<Dzero, Kaon, Pion> Dzero_list(Kminus_list,
piplus_list, Dzero::analyze);'' or ``DefinitionTS2<Dstar, Dzero,
Pion> Dstar_list(Dzero_list, piplus_list, Dstar::analyze);''.
As Dzero, Kaon, and so on are user-defined classes,
the user can add attributes of his/her choice and use them in
the analysis. Tasks such as looping over all candidates without
multiple counting are taken care of by the base classes. With
this technique the C++ compiler will catch most of the mistakes
that the user might make, such as using one decay daughter
in place of another. If the user uses FORTRAN, or even C++ with
generic particle classes, such errors are not caught by the
compiler, as the daughters of the decay chain are all built
from the same type.
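
The idea of giving each decay chain its own class can be sketched as
follows. This is a minimal illustration under stated assumptions, not
the package's actual implementation: the names DecayList2, Momentum,
track_id and demo_dzero_count, the signature of analyze, and the energy
cut are all hypothetical and were chosen only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical four-momentum, reduced to what this sketch needs.
struct Momentum { double e, px, py, pz; };

inline Momentum operator+(const Momentum& a, const Momentum& b) {
    return Momentum{a.e + b.e, a.px + b.px, a.py + b.py, a.pz + b.pz};
}

// Distinct daughter types, so a Kaon cannot be passed where a Pion is expected.
struct Kaon { Momentum p; int track_id; };
struct Pion { Momentum p; int track_id; };

// User-defined decay chain class for D0 -> K- pi+.
// The constructor fixes both the types and the order of the daughters.
struct Dzero {
    Momentum p;
    Kaon kaon;
    Pion pion;

    Dzero(const Kaon& k, const Pion& pi) : p(k.p + pi.p), kaon(k), pion(pi) {}

    // Hypothetical user selection: an energy cut chosen only for illustration.
    static bool analyze(const Dzero& d) { return d.p.e > 1.0; }
};

// Generic two-body candidate list (assumed name). It loops over all daughter
// pairs, skips pairs that reuse the same track, and keeps what the user accepts.
template <class Parent, class D1, class D2>
class DecayList2 {
public:
    typedef bool (*Selector)(const Parent&);

    DecayList2(const std::vector<D1>& first, const std::vector<D2>& second,
               Selector accept) {
        for (const D1& a : first) {
            for (const D2& b : second) {
                if (a.track_id == b.track_id) continue;  // same track used twice
                Parent candidate(a, b);
                if (accept(candidate)) candidates_.push_back(candidate);
            }
        }
    }

    std::size_t size() const { return candidates_.size(); }
    const std::vector<Parent>& candidates() const { return candidates_; }

private:
    std::vector<Parent> candidates_;
};

// The compiler accepts the intended daughter order and rejects the swapped one.
static_assert(std::is_constructible<Dzero, Kaon, Pion>::value,
              "Dzero is built from a Kaon and a Pion");
static_assert(!std::is_constructible<Dzero, Pion, Kaon>::value,
              "swapping the daughters must not compile");

// Builds a small Dzero list from made-up tracks and returns its size.
inline std::size_t demo_dzero_count() {
    std::vector<Kaon> kminus_list = {
        Kaon{Momentum{1.0, 0.0, 0.0, 0.5}, 1},
        Kaon{Momentum{0.25, 0.0, 0.0, 0.125}, 2}};
    std::vector<Pion> piplus_list = {
        Pion{Momentum{0.5, 0.0, 0.0, 0.25}, 2},
        Pion{Momentum{0.25, 0.0, 0.0, 0.125}, 3}};

    DecayList2<Dzero, Kaon, Pion> dzero_list(kminus_list, piplus_list,
                                             Dzero::analyze);
    assert(dzero_list.size() <= kminus_list.size() * piplus_list.size());
    return dzero_list.size();
}
```

With these made-up tracks, demo_dzero_count() returns 2: of the four
kaon-pion pairs, one reuses track 2 and is skipped by the list class,
and one fails the energy cut. Writing DecayList2<Dzero, Pion, Kaon>
instead would fail to compile, because Dzero has no constructor taking
a Pion followed by a Kaon. With a single generic particle type the same
slip would compile without complaint.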