PROD2NTU: A CWN Production Ntuple for KLOE


Motivations

The use of Column Wise Ntuples (CWN) allows the creation of standard DSTs that make analysis quick, reliable and easy for all students and collaborators who prefer not to start from scratch with the KLOE data structure.

The advantages of CWN with respect to the old Row Wise Ntuple are:

  • Faster access to data.
  • Variables are not restricted to floating point data: integer and character variables can be used. Furthermore, a specific bit range can be selected for each variable at booking time to reduce disk-space occupancy.
  • Ntuples can have a variable dimension on an event-by-event basis. This is possible because the Ntuple can be organized in different data blocks made of variable-length arrays. Example:

    Variable Name/Description
    Nclu / number of EMC clusters;
    Enecl(Nclu) / array of dimension Nclu storing the energy of each EMC cluster.

    Clearly, the size of the data stored in these DSTs depends on what really needs to be saved. At the moment we have not yet used the bit-packing facility offered by CWN. The Ntuple we propose is organized in data blocks. On request, an option to select which blocks are booked and filled can be added; the selection could depend on the user's wishes or on the specific analysis topic. Saving all the information available in the first version of the program (Prod2ntu Version 1.0), we obtained the following:

  • With 50000 generated Bhabha events, the final yb file has a size of around 524 Mbytes. We required a Bhabha trigger to be satisfied, which filtered out 50% of the file, and then filled the Ntuple.
  • The size of the produced Ntuple is around 65 Mbytes. 50% of the Ntuple holds the information on the reconstructed calorimeter hits. For more general analyses this data block can be dropped, reducing the size to around 30 Mbytes. This is a factor 8 smaller than the original yb file.
  • Even if this Ntuple cannot be created for each run, it can still be useful as a monitoring tool for a sub-sample of the produced events or as a tool in the final analysis stage after event filtering.
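
    The size figures quoted above can be turned into reduction factors with a few lines of arithmetic. The Python sketch below uses only the approximate sizes given in the text; the variable names are ours, and the "like-for-like" comparison rests on the assumption that the triggered half of the yb file corresponds to the events written to the Ntuple.

```python
# Approximate sizes quoted in the text, in Mbytes.
YB_FILE = 524.0         # yb file for 50000 generated Bhabha events
NTUPLE_FULL = 65.0      # Ntuple with all data blocks filled
NTUPLE_REDUCED = 30.0   # Ntuple without the calorimeter hits block
TRIGGER_FRACTION = 0.5  # fraction of the yb file kept by the Bhabha trigger


def reduction_factor(original_mb, reduced_mb):
    """Return how many times smaller reduced_mb is than original_mb."""
    return original_mb / reduced_mb


# Whole-file comparison: Ntuple size against the full yb file.
whole_full = reduction_factor(YB_FILE, NTUPLE_FULL)
whole_reduced = reduction_factor(YB_FILE, NTUPLE_REDUCED)

# Like-for-like comparison (assumption): only the triggered half of the
# yb file corresponds to the events that ended up in the Ntuple.
yb_triggered = YB_FILE * TRIGGER_FRACTION
like_full = reduction_factor(yb_triggered, NTUPLE_FULL)
like_reduced = reduction_factor(yb_triggered, NTUPLE_REDUCED)

for label, value in [
    ("full Ntuple vs whole yb file", whole_full),
    ("reduced Ntuple vs whole yb file", whole_reduced),
    ("full Ntuple vs triggered yb events", like_full),
    ("reduced Ntuple vs triggered yb events", like_reduced),
]:
    print(f"{label}: factor {value:.1f}")
```

    With the quoted numbers, the whole-file factors come out near 8 and 17, and the like-for-like factors near 4 and 9.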

    Another important point, which is "mostly" KLOE specific, is the need to unify tools and code writing in order to minimize the number of personal retrieval functions, common blocks and, finally, CWN Ntuples. Before we sink into a chaos of zillions of interesting Ntuples of uncertain origin, we should explore whether it is possible to satisfy "most" of our needs with a unified data set. The proposal of this standard PROD2NTU (production Ntuple) does not forbid the existence of personal "more-specialized" CWNs; on the contrary, it just forces us to create a common working ground in order to simplify everyday standard analysis life.

    We stress that:

  • The scheme of this Ntuple can easily be extended and/or reduced. Variable names can be discussed.
  • The retrieval functions used in this code can be useful for any further analysis package and can minimize the need for, and/or the existence of, personal common blocks, since they use "structures" as input/output arguments.
  • Only one big, "well known and organized" common block is needed to transfer this information to PAW.

PROD2NTU CWN data block layout

    The proposed Ntuple is organized in sub-blocks. Many of these blocks are fully implemented and do not need any further extension. Some of them still need to be cleaned up and/or implemented.

  • General Event Information
  • Calorimeter Clusters
  • Calorimeter Cells
  • Tracks connected to a Vertex
  • Vertices
  • All reconstructed Tracks
  • MC Geant Information
  • Track to Cluster association
  • Drift chamber hits (not yet implemented)
  • Neutral Vertex (not yet implemented)
  • Global Fit (not yet implemented)

PROD2NTU MODULE

    The module is mainly organized as a booking entry point plus a set of calls to unpacking routines in the event entry point. These retrieval routines can also be useful on their own. The basic include files are:

    evtstruct.cin
    emcstruct.cin
    celestru.cin
    vtxstru.cin
    trkstru.cin
    tclostruct.cin
    cfhistruct.cin
    geanfistruct.cin

    The basic retrieval functions are:

    getclustru.kloe
    getcelestru.kloe
    getmcstru.kloe
    trkv2stru.kloe
    gettclostru.kloe
    getcfhistru.kloe
    getevcl.kloe

    The b_j dictionary of PROD2NTU is really simple. The module can be linked together with your own modules and/or the production modules, and can be used in the a_c path after all the other relevant modules that produce clusters, tracks and so on. Here is an example of usage and opening of an Ntuple. This module has been inserted in the package area K_TLS (setup tls development). To link it with your program, just add the following line to your option file:

    $K_TLS/tlslib.olb.


    PAW-USAGE of PROD2NTU

    Up to now, we have succeeded in building, without any problems, Ntuples of around 30000 events filled with all the information described above. There is a limit on the size of the RZ files that can be created, but in general it can be overcome by enlarging the maximum number of records allowed when opening the RZ file (for instance, you can typically use in A_C>> HIST/OPEN/MAX_NREC=40000 OUT.HBOOK). Whenever, for huge datasets, you encounter major problems in getting your Ntuple to work, CHAIN Ntuples to add statistics from smaller files. Let's call file1.hbook and file2.hbook the smaller RZ files created from two subsamples of your dataset. To chain them, just type:

    PAW> chain -pntu
    PAW> chain pntu file1.hbook file2.hbook ...

    Before chaining Ntuples, check that their layout is the same. Different versions of prod2ntu, or different user selections of the data blocks in the Ntuple, can result in a different PAW common block layout. To check whether this is the case, create the PAWC include files for all the Ntuples you need to connect via chain, as follows:

    PAW> chain -pntu
    PAW> chain pntu file1.hbook
    PAW> n/uwfun 1 prod2ntu1.inc
    PAW> chain -pntu
    PAW> chain pntu file2.hbook
    PAW> n/uwfun 1 prod2ntu2.inc
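
    The two include files written by the commands above should be identical if the Ntuple layouts match. Any diff tool will do for the comparison; as a minimal sketch, it can also be scripted in Python with the standard difflib module. The function names below are ours and are not part of PAW or prod2ntu. Every line-level difference is reported, comment lines included.

```python
import difflib
from pathlib import Path


def layout_differences(inc_a, inc_b):
    """Return the unified-diff lines between two include files.

    The list is empty when the two files are line-by-line identical.
    """
    lines_a = Path(inc_a).read_text().splitlines()
    lines_b = Path(inc_b).read_text().splitlines()
    return list(
        difflib.unified_diff(
            lines_a, lines_b,
            fromfile=str(inc_a), tofile=str(inc_b),
            lineterm="",
        )
    )


def same_layout(inc_a, inc_b):
    """True when the two include files show no differences at all."""
    return not layout_differences(inc_a, inc_b)
```

    For example, same_layout("prod2ntu1.inc", "prod2ntu2.inc") should return True before the two files are chained; if it does not, printing the lines returned by layout_differences shows where the layouts diverge.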

    Then run a diff on the include files generated by PAW. You should have NO differences before merging the two Ntuples. If this is not the case, the results of your analysis will make no sense. Once you have chained your Ntuples, RECOMPILE all the PAW functions needed. Example:

    Suppose your files are the simple file1.f and file2.f, which calculate the total energy and the total momentum respectively. The following instructions compile them, assuming they reside in the area where you are running PAW:

    PAW> cd //pntu/prod2ntu
    PAW> n/uwfun 1 prod2ntu.inc
    PAW> application comis quit
    comis> !file file1.f
    comis> !file file2.f
    comis> quit
    comis> quit

    To show how the Ntuple works, just plot the total energy and the total momentum:

    PAW> n/plot 1.file1(0.)
    PAW> n/plot 1.file2(0.)
    and so on ...

    Results of this operation can be seen in the following plots, Etot and Ptot, obtained by running prod2ntu on 10000 Bhabha events.
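
    The kind of per-event quantity that a function such as file1.f computes can also be sketched outside PAW. The Python fragment below is only an illustration under assumptions: it takes the total energy to be the sum of Enecl over the Nclu clusters of each event (the actual content of file1.f is not shown on this page), the class and function names are ours, and the event values are invented rather than taken from KLOE data.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ClusterBlock:
    """Variable-length block: nclu is the number of valid entries in enecl."""
    nclu: int = 0
    enecl: List[float] = field(default_factory=list)


def total_energy(block):
    """Sum the energies of the first nclu clusters of one event."""
    return sum(block.enecl[:block.nclu])


# Invented example events (not KLOE data): the block length changes
# from event to event, as in a CWN with variable-length arrays.
events = [
    ClusterBlock(nclu=2, enecl=[510.0, 505.0]),
    ClusterBlock(nclu=3, enecl=[500.0, 300.0, 200.0]),
    ClusterBlock(nclu=0, enecl=[]),
]

totals = [total_energy(event) for event in events]
print(totals)
```

    Each event carries only as many energies as it has clusters, which is what keeps a variable-length block compact.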


    In WORK: This page is preliminary and still under development !

    S. Miscetti 26-May-1997
    Mail to: miscetti@hpcalc.lnf.infn.it