PROD2NTU: A CWN Production Ntuple for KLOE
- Motivations
Column Wise Ntuples (CWN) allow the creation of standard DSTs that make
the analysis quick, reliable and easy for all students and collaborators
who prefer not to start from scratch with the KLOE data structure.
The advantages of CWN with respect to the old Row Wise Ntuple are:
- Faster access to data.
- Variables are not restricted to floating point data: Integer and
Character variables can also be used. Furthermore, for each variable a
specific bit range can be selected at booking time to reduce the
disk-space occupancy.
- Ntuples of variable dimension can be created on an event by event
basis. This is possible because the Ntuple can be organized in different
data blocks made of variable-length arrays.
Example:
- Variable Name/Description
- Nclu / number of EMC clusters;
- Enecl(Nclu) / array of dimension Nclu storing the energy of
each EMC cluster.
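The saving offered by a variable-size block can be illustrated with a minimal Python sketch. It is not part of PROD2NTU: the 4-byte word size, the upper limit of 100 clusters and the helper names are assumptions chosen for illustration only, and real HBOOK storage adds overheads of its own.

```python
WORD_BYTES = 4    # assumption: one 32-bit word per unpacked value
MAX_NCLU = 100    # assumption: hypothetical upper limit on Nclu


def variable_block_bytes(nclu):
    """Bytes for one event when Enecl is stored with dimension Nclu."""
    return WORD_BYTES * (1 + nclu)  # Nclu itself plus Nclu energies


def fixed_block_bytes(max_nclu=MAX_NCLU):
    """Bytes for one event when Enecl is always stored at full size."""
    return WORD_BYTES * (1 + max_nclu)


# Hypothetical cluster multiplicities for four events.
nclu_per_event = [2, 3, 2, 5]
variable_total = sum(variable_block_bytes(n) for n in nclu_per_event)
fixed_total = fixed_block_bytes() * len(nclu_per_event)
print(variable_total, fixed_total)
```

Under these assumptions the variable-size layout stores only the clusters actually present in each event, which is where the gain over a fixed-size row comes from.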
Clearly, the size of the data stored in these DSTs depends on what really
needs to be saved in them. At the moment, we have not yet used the bit
packing facility offered by CWN.
The Ntuple we propose is organized in data blocks. On request, an option
to select which blocks are booked and filled can be added.
The selection could depend on the user's wishes or on the specific
analysis topic.
Saving all the information available in the first version of the program
(Prod2ntu Version 1.0), we obtained the following:
- With 50000 generated Bhabha events the final yb file has a size
of around 524 Mbytes. We filtered out 50% of the file by requiring a
Bhabha trigger to be satisfied and then filled the Ntuple.
- The size of the produced Ntuple is around 65 Mbytes.
- 50% of the Ntuple contains information on the reconstructed calorimeter
hits. For more general analyses this data block can be dropped,
reducing the size to around 30 Mbytes. This is a factor 8 smaller than
the original yb file.
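The size figures quoted above can be cross-checked with a few lines of Python. This is only a sketch: it assumes, which the text does not state, that the yb size scales linearly with the number of events kept by the trigger filter, and the variable names are illustrative.

```python
YB_FILE_MB = 524.0        # yb file for 50000 generated Bhabha events
TRIGGER_FRACTION = 0.5    # fraction of the file kept by the Bhabha trigger
NTUPLE_FULL_MB = 65.0     # Ntuple with all data blocks
NTUPLE_REDUCED_MB = 30.0  # Ntuple without the calorimeter hits block

# Assumption: yb size scales linearly with the number of events kept.
yb_selected_mb = YB_FILE_MB * TRIGGER_FRACTION

ratio_full = yb_selected_mb / NTUPLE_FULL_MB
ratio_reduced = yb_selected_mb / NTUPLE_REDUCED_MB
print(f"selected yb: {yb_selected_mb:.0f} MB, "
      f"full ratio: {ratio_full:.1f}, reduced ratio: {ratio_reduced:.1f}")
```

With this assumption the reduced Ntuple comes out between 8 and 9 times smaller than the selected part of the yb file, in line with the quoted factor of about 8. Compared with the full 524 Mbyte file the ratio would be about twice as large.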
Even if this Ntuple cannot be created for each run, it can still be
useful as a monitoring tool for a sub-sample of the produced events or,
otherwise, as a tool in the final analysis stage after event filtering.
Another important point to be addressed, which is "mostly" KLOE specific,
is the need to unify tools and coding practice in order to minimize the
number of personal retrieval functions, common blocks and, finally, CWN
Ntuples.
Before we sink into a chaos of zillions of interesting Ntuples of uncertain
origin, we should explore whether it is possible to satisfy "most" of our
needs with a unified data set. The proposal of this standard PROD2NTU
(production Ntuple) does not forbid the existence of personal
"more-specialized" CWN; on the contrary, it just forces us to
create a common working ground in order to simplify everyday standard
analysis life.
We stress that:
- The scheme of this Ntuple can easily be extended and/or
reduced. Variable names can be discussed.
- The retrieval functions used in this code can be useful for
any further analysis package and can minimize the need for, and
the existence of, personal common blocks, since they use
"structures" as input/output arguments.
- Only one big, "well known and organized" common block is needed
to transfer this information to PAW.
PROD2NTU CWN data blocks' layout
The proposed Ntuple is organized in sub-blocks. Many of
these blocks are fully implemented and do not need any
further extension. Some of them still need to be cleaned up
and/or implemented.
- General Event Information
- Calorimeter Clusters
- Calorimeter Cells
- Tracks connected to a Vertex
- Vertices
- All reconstructed Tracks
- MC Geant Information
- Track to Cluster association
- Drift chamber hits (still to be implemented)
- Neutral Vertex (still to be implemented)
- Global Fit (still to be implemented)
PROD2NTU MODULE
The module is organized mainly as a booking entry point plus a set of
calls to unpacking routines in the event entry point.
These retrieval routines can also be useful on their own.
The basic include files are:
- evtstruct.cin
- emcstruct.cin
- celestru.cin
- vtxstru.cin
- trkstru.cin
- tclostruct.cin
- cfhistruct.cin
- geanfistruct.cin
The basic retrieval functions are:
- getclustru.kloe
- getcelestru.kloe
- getmcstru.kloe
- trkv2stru.kloe
- gettclostru.kloe
- getcfhistru.kloe
- getevcl.kloe
The b_j dictionary of PROD2NTU is really simple.
The module can be linked together with your own and/or the production
modules, and can be used in the a_c path after all the other relevant
modules that produce clusters, tracks and so on.
Here is an example of usage and opening of an Ntuple.
This module has been inserted in the package area K_TLS
(setup tls development). To link it with your program,
just add the following line to your option file:
$K_TLS/tlslib.olb.
PAW-USAGE of PROD2NTU
Up to now, we have succeeded in building, without any problems, Ntuples of
around 30000 events filled with all the information described above.
There is a limit on the size of the RZ-files that can be created, but in
general this can be overcome by enlarging the maximum number of records
allowed when opening the RZ-file (for instance, typically you can use
in A_C>> HIST/OPEN/MAX_NREC=40000 OUT.HBOOK).
Whenever, for huge datasets, you encounter major problems in getting your
Ntuple to work, CHAIN Ntuples to add statistics from smaller files.
Let's call file1.hbook, file2.hbook the smaller RZ-files created from two
subsamples of your dataset. To chain them:
- PAW> chain -pntu
- PAW> chain pntu file1.hbook file2.hbook ...
Before chaining Ntuples, check that their layout is the same.
Different versions of prod2ntu, or different user selections of the
data blocks in the Ntuple, can result in a different PAW Common
Block Layout. To check whether this is the case, create the PAWC-Include
files for all the Ntuples you need to connect via chain, as follows:
- PAW> chain -pntu
- PAW> chain pntu file1.hbook
- PAW> n/uwfun 1 prod2ntu1.inc
- PAW> chain -pntu
- PAW> chain pntu file2.hbook
- PAW> n/uwfun 1 prod2ntu2.inc
Then diff the include files generated by PAW. There should be
NO differences before merging the two Ntuples. If there are any,
the results of your analysis will make no sense.
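Any text comparison tool can be used for this check. As one possibility, a minimal Python sketch based on the standard difflib module is given here. The function names are hypothetical, and a plain text comparison also flags differences in comment lines, so any reported difference still has to be inspected by eye.

```python
import difflib
from pathlib import Path


def include_diff(path_a, path_b):
    """Return unified-diff lines between two include files.

    The list is empty when the two files are identical.
    """
    lines_a = Path(path_a).read_text().splitlines()
    lines_b = Path(path_b).read_text().splitlines()
    return list(difflib.unified_diff(lines_a, lines_b,
                                     fromfile=str(path_a),
                                     tofile=str(path_b),
                                     lineterm=""))


def same_layout(path_a, path_b):
    """True when the two include files show no textual difference."""
    return not include_diff(path_a, path_b)
```

Calling same_layout("prod2ntu1.inc", "prod2ntu2.inc") in the directory holding the two generated files should return True before the Ntuples are chained. If it does not, printing the lines returned by include_diff shows where the layouts differ.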
Once you have chained your Ntuples, RECOMPILE all the
PAW functions needed. Example:
your files are the simple file1.f and file2.f, which calculate the
total energy and the total momentum respectively.
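The contents of file1.f are not reproduced here. As an illustration only, the logic of a total-energy function can be sketched in Python using the Nclu and Enecl variables of the cluster block. The real function is Fortran code compiled by COMIS, and the assumption that it sums exactly these variables is ours.

```python
def total_energy(nclu, enecl):
    """Sum the energies of the first nclu EMC clusters of one event."""
    return sum(enecl[:nclu])


# Hypothetical event with three clusters (energies in arbitrary units).
event_nclu = 3
event_enecl = [100.0, 200.0, 50.0]
print(total_energy(event_nclu, event_enecl))
```

Only the first Nclu entries are summed, so any unused slots of a larger array do not contribute to the result.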
The following instructions compile them assuming they reside in the area
where you are running paw:
- PAW> cd //pntu/prod2ntu
- PAW> n/uwfun 1 prod2ntu.inc
- PAW> application comis quit
- comis> !file file1.f
- comis> !file file2.f
- comis> quit
- comis> quit
To show how the Ntuple works just plot the total energy and total momentum:
- PAW> n/plot 1.file1(0.)
- PAW> n/plot 1.file2(0.)
- and so on ...
Results of this operation can be seen in the plots of Etot and Ptot,
obtained by running prod2ntu on 10000 Bhabha events.
This page is preliminary and still under development !
S. Miscetti 26-May-1997
Mail to:
miscetti@hpcalc.lnf.infn.it