Notes on MC meeting, 23-Jan-03
==============================

Agenda
------
1. Work completed
2. Work in progress, status reports
3. Discussion items

EVERYONE THE LEAST BIT INTERESTED IN USING THE OUTPUT FROM THIS MC
PRODUCTION CAMPAIGN, WHICH NOW INCLUDES PHI -> ALL DECAYS, SHOULD
ATTENTIVELY READ AT LEAST PART (3) ON THE DISCUSSION CONCERNING THE
PRODUCTION SCHEME.

1. Work completed
=================
The following items have been completed, in the sense that as much work
has been done as can be done without testing the entire production
chain:

- ISR and phi decay simulation (M. Antonelli)
- Bank-reduction code for DST's (M. Moulson)
- Trigger simulation parameters (M. Palutan et al.)
- GEANFI on IBM (S. Giovannella)

2. Work in progress, status reports
===================================

DC geometry (S. Dell'Agnello, A. Antonelli)
-------------------------------------------
All planned DC geometry modifications as per previous presentations
(see e.g., 20-Dec-02 KGM presentation) have been implemented. The
remaining item for discussion is whether or not to implement the global
shift by -1 cm in y.

For technical reasons, it is difficult to implement this shift. The
KLOE and local DC coordinate systems currently coincide. The DC volume
can be moved in GEANT so that the entry point of the track in the KLOE
system can be calculated. Antonella and Simone will investigate the
feasibility of transforming the track entry point into the local DC
system for the DC tracking (which is handled in dedicated code), and
transforming it back into the KLOE system on exit, so that GEANT
tracking can be continued. The DTFS banks will also have to be
transformed from the local DC system into the KLOE system. Antonella
and Simone will also investigate alternative solutions if necessary.

Diagnostics to prove that no problems arise as a result of lowering the
DC (volume conflicts in GEANT, etc.) are very important and may
represent a non-trivial effort.

Calorimeter geometry (S. Miscetti)
----------------------------------
The calorimeter also needs to be moved, by about -0.4 cm. The global
shift should be straightforward and will be implemented.

Currently, the 24 barrel modules are positioned by successive rotations
and are arranged in a regular pattern. It is known that there are
module-by-module offsets in the actual positions. After the global
shift has been taken care of, the feasibility of implementing these
offsets will be studied.

Update of beam position, sqrt(s) information
--------------------------------------------
There are many 2001 runs without BPOS information. Simone will look
into closing the holes.

There are many 2002 runs without information on sqrt(s). Matt, in some
combination with the pi-pi-gamma group, will attempt to run BVLAB on
the 2002 data as was done for 2001 data.

Calorimeter response (S. Miscetti)
----------------------------------
Stefano reported on studies of the calorimeter response aimed mainly at
understanding the accuracy and precision of the energy response as a
function of position, energy, etc. His presentation is available on
this web site.

Background selection module, calorimeter background (S. Miscetti)
-----------------------------------------------------------------
A preliminary skeleton of the SELBKG A/C module now exists. This module
selects gamma-gamma events, writes a CELE bank with the uncorrelated
(i.e., background) calorimeter elements, and writes the DC hits into an
output file.

The weights used to downscale events with exactly one background
cluster as a function of polar angle and energy are input to the
module. We need to study how these weights change with time. We also
need to decide how these weights will be stored (for example, in the
database), or whether they will instead be calculated on the fly by the
background isolation procedure.

Most other aspects of the development are in good shape. A version-zero
module should be in the library within a week or so.

DC background insertion (M. Moulson)
------------------------------------
The situation is essentially the same as it was at the last meeting. A
relatively complete framework has been implemented. There are still a
few residual items to work on (see 23-Jan-03 presentation by
M. Moulson), totaling perhaps a week of work.

Wire sag in DC (S. Dell'Agnello, A. Antonelli)
----------------------------------------------
The wire-tension measurements recently conducted on several layers
indicate a further loss of tension. Relative to the previous round of
measurements (2000), which cover only the external (> 25th) layers of
the DC, the wire sag increases by circa 100 um. Relative to the 1998
measurements, which cover the whole chamber, increases in the wire sag
of 100 to 200 um are seen (in general the increase is greater for the
external layers).

The plan is to use the 2003 wire-sag values in generation. The current
set of measurements is highly incomplete. A complete set of
measurements for all layers would be highly desirable. At a minimum,
more measurements are necessary for the layer intervals in which the
wire sag changes rapidly. The plan is to use the best information
available when generation starts, interpolating with reference to
previous measurements where necessary.

Generators, constants, and other MC tuning (C. Bloise)
------------------------------------------------------
A preliminary version of the radiative KL3 generator has been developed
by Fabrizio Scuri and the Bern group. This will be inserted into GEANFI
by Fabrizio and Caterina next week.

Paolo Gauzzi is working on an improved a0-gamma generator. This work is
not yet finished. Further developments are expected next week.

Mario's phi decay generator has been fully implemented and seems to be
working well.

Numerous constants related to particle masses, branching ratios, Dalitz
plot slopes, etc. have been comprehensively updated by Caterina. A
summary is available on this web site (see documents related to the
23-Jan-03 meeting).

Modifications to the DB interface are needed in order to obtain the run
number to simulate and the values of sqrt(s), x_phi and p_phi to be
used.

DB2 database modifications (I. Sfiligoi)
----------------------------------------
Igor raised the question as to whether an MC run is specific to a
particular run number in real life, or to a particular raw file. This
has implications in terms of the most efficient way to organize the new
tables in the database.

After some discussion, it became clear that what is simulated is
actually a raw file. While the generation process knows nothing about
the background per se (which is added at reconstruction time), and
while sqrt(s), etc. are presumed to be constant over a given run, the
integrated luminosity corresponding to the background file must be
known in advance in order to decide how many events to simulate.
Moreover, it is unlikely that a particular MC file will be
reconstructed more than once with different background levels. One or
more MC files are generated specifically for each raw file, or in some
cases, for a group of raw files, and this locks in the background files
to be used. The only situation foreseen for the "replacement" of the
background corresponding to a given generated MC run is an upgrade of
the selection algorithm, in which case the .mcr files would presumably
be replaced (i.e., the old .mcr files would not be kept). So, the most
direct link is to the background files.

Before beginning coding, Igor will sketch a written proposal describing
the new tables which we can all review calmly.

3. Discussion items: Revisions to production model
==================================================
The original proposal was to produce 500 pb-1 (500 Mevents) of
KS -> all, KL -> all events. Discussions in December led to the revised
proposal to first generate 100 pb-1 of phi -> all events (300M)
followed by the remaining 400 pb-1 of KS -> all, KL -> all events
(400M).
This revised proposal would satisfy the immediate needs of a larger
group of people, or at least give them something to get started on.
This also means that a larger number of people would be looking at the
MC output sooner, which implies a more rapid shakedown of the
production machinery.

However, the revisions to the production model raise a host of issues,
most of which were resolved at the meeting.

- Temporal profile: The original proposal called for the 500M KS,KL
  events to be evenly distributed in time over all 2001--2002 data. The
  issue was raised as to whether the 100 pb-1 of phi -> all and the
  400 pb-1 of KS,KL -> all should each separately reflect the global
  time profile. In other words, the 100 pb-1 of phi -> all would have a
  run-condition and background profile (i.e., a run-number profile)
  which reflects the ~500 pb-1 of 2001--2002 running, but with a ~1/5
  downscale. The 400 pb-1 of KS,KL -> all would have a similar profile,
  with a ~4/5 downscale. To use 500 pb-1 equivalent of KS,KL data, the
  two sets could simply be summed. There was a general consensus that
  this is almost certainly the way it should be done.

- Streaming: The proposal to produce KS,KL -> all events didn't call
  for a streaming mechanism, since the output basically corresponded to
  a particular datarec stream (ksl) in any case. Note that a
  substreaming scheme based on the MC-true event type was and continues
  to be proposed, as discussed below; nevertheless, event
  classification in the datarec sense is not involved.

  The 300M phi -> all events (100 pb-1) will need to be divided up
  along the lines of the DST's. Ideally, these DST's should match the
  datarec DST's they simulate as closely as possible. This means that
  the specific algorithms run for each type of datarec DST should be
  run for MC DST's. The only way to do this without running into the
  problems seen with overlapping events in the datarec streams is to
  make DST's in separate A_C jobs, one for each DST stream.
  As in the case of the datarec DST's, this means 4 separate DST jobs
  (kpm, ksl, rpi, rad). Hence, the MC reconstruction job ends with an
  .mcr file, which is then immediately presented as input to the four
  MC DST production processes.

  The main difference between DST's for data and MC events is that in
  the latter case, the event topology is known a priori, and events of
  a given type need to be retained in the DST independently of whether
  or not they were recognized as such. In the data DST's, all events
  with a certain EVCL tag are kept, and the subsequent, stream-specific
  algorithms (t0's, retracking for K+K- events, etc.) are applied on
  this basis. In the MC DST's, this type of decision will likewise be
  made on the basis of the EVCL tag. However, all events belonging to a
  particular stream on the basis of MC truth will be retained in the
  output in any case.

  As an example, consider the KSL stream produced as part of the
  phi -> all production. All events in which the phi decayed into KS,KL
  in the MC will be present in the KSL DST. All events which were
  recognized by any of the EVCL algorithms for this stream will
  likewise be present. For those events that were recognized as
  KS -> pi+pi-, the step-1 t0 algorithm and neutral vertex
  reconstruction will be applied. The user of the KSL DST then has
  available all events which were generated as KSL events (for
  efficiencies) and all events which were generated as something else
  and which made it through EVCL as KSL events (for background
  studies).

- Substreaming: The group has slowly converged on the issue of whether
  or not to divide up the production itself, or its outputs, so that
  the end product is compact files containing a particular type of
  event. With reference to Matt's presentations, e.g., at the 17-Dec-02
  meeting or the 20-Dec-02 KGM, the proposals can be summarized as:

  1.) Combined production and output of KS,KL -> all
  2.) Differentiated production and output by KS decay mode, and
      possibly by whether or not the KL decays in calorimeter.
  3.) Combined production and differentiated output along the above
      lines.

  As of the 17-Dec-02 meeting, it was becoming clear that the third
  option was the only one that was satisfactory to everyone. This
  decision was formalized during the meeting.

  The general principles regarding the substreaming are:

  - The substreaming is done rigorously on the basis of MC truth.
  - The sum of the substreams gives all generated events for a stream.

  The latter point means that if it is inconvenient to have the events
  divided up for a particular analysis (for example, for a background
  study), the user is free to simply sum the files together for the
  analysis. The former point means that there are no complications from
  overlaps when doing so.

  At the moment, the substreams and streams to be used for the
  phi -> all production are as follows:

  1. KPM stream
     No substreams

  2. KSL stream
     KS -> pi+pi- substream
     KS -> pi0pi0 substream

     Caterina will investigate the possibility of unambiguously marking
     in the .mcr files if the KL decays in the DC, in the EmC, or
     outside of the detector. If this can be done, then the above two
     substreams may be further divided into two substreams each
     corresponding to KL decays inside or outside of the DC.

     The issue of how to handle the rarer KS decays (in particular
     semileptonics) was not discussed at the meeting. In keeping with
     the above two general principles, there are two options: split
     them out into (a) separate substream(s), or insert them into one
     of the KS -> pi+pi- or KS -> pi0pi0 substreams. The former
     solution risks creating many zero-length files, which may or may
     not be a problem, but doesn't seem like a good idea. A formal
     decision remains to be made.

     What about the fact that the t0 step-1 algorithm runs only for
     EVCL-identified KS -> pi+pi- events?
     Since we never split the DST's on this basis anyway, it seems like
     this algorithm should simply be applied to EVCL-identified
     KS -> pi+pi- events regardless of what substream they end up in.

  3. RPI stream
     No substreams

  4. RAD stream
     Charged substream
     Neutral substream

     Note that the RAD stream is a bit different from the KSL stream,
     in that the charged and neutral substreams already exist in
     datarec DST's and are actually differentiated by the event
     classification, rather than the MC truth. This is in contrast with
     the general principles regarding the substreaming outlined above.
     This may not be so serious, since the charged/neutral division
     probably satisfies the same criteria (or almost does): the sum of
     the two substreams equals all events with no overlaps. However, we
     need input from the users of the RAD stream to better define these
     substreams, especially with regard to the MC truth:

     - If the substreaming is based on EVCL, then some events truly of
       one type may end up in the other substream (i.e., the substreams
       can't be used to calculate EVCL-related efficiency terms).
     - If the substreaming is based on MC truth, some events classified
       in one way may end up in the other substream, which is not how
       it works for data and could be inconvenient.
     - If the substreaming is based on the union of EVCL+MC truth, then
       there may be overlaps, which hopefully will be small in number.
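
As a rough illustration of the three RAD options listed above, the
sketch below splits a handful of toy events each way and checks the two
general principles (sum of substreams = all events, no overlaps). It is
not taken from any KLOE code: the Event record, its field names, the
function names, and the toy events are all hypothetical, and it assumes
that every RAD event carries exactly one charged/neutral label from MC
truth and exactly one from EVCL.

```python
from dataclasses import dataclass

KINDS = ("charged", "neutral")


@dataclass(frozen=True)
class Event:
    # Hypothetical record; field names are illustrative only.
    event_id: int
    mc_truth: str  # "charged" or "neutral", as generated
    evcl: str      # "charged" or "neutral", as classified


def split_by_evcl(events):
    """Option 1: substream chosen by the event classification."""
    return {k: {e.event_id for e in events if e.evcl == k} for k in KINDS}


def split_by_truth(events):
    """Option 2: substream chosen by MC truth."""
    return {k: {e.event_id for e in events if e.mc_truth == k} for k in KINDS}


def split_by_union(events):
    """Option 3: an event enters a substream if EVCL or MC truth says so."""
    return {
        k: {e.event_id for e in events if k in (e.evcl, e.mc_truth)}
        for k in KINDS
    }


def overlap(substreams):
    """Event ids that appear in both substreams."""
    return substreams["charged"] & substreams["neutral"]


def is_partition(events, substreams):
    """True if the substreams sum to all events with no overlaps."""
    all_ids = {e.event_id for e in events}
    covered = substreams["charged"] | substreams["neutral"]
    return covered == all_ids and not overlap(substreams)


# Toy sample: event 3 is generated as charged but classified as neutral.
events = [
    Event(1, "charged", "charged"),
    Event(2, "neutral", "neutral"),
    Event(3, "charged", "neutral"),
    Event(4, "neutral", "neutral"),
]

for name, split in (("EVCL", split_by_evcl),
                    ("MC truth", split_by_truth),
                    ("union", split_by_union)):
    s = split(events)
    print(name, "partition:", is_partition(events, s),
          "overlap:", sorted(overlap(s)))
```

In this toy sample, the EVCL and MC-truth splits both give clean
partitions but place event 3 in different substreams, while the union
split puts event 3 in both, which is the overlap mentioned in the last
bullet.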