The KLOE relational database

Preface

KLOE has two distinct databases:

a HEPDB database, also known as the offline database, since it is used to store the detector geometry and calibration data
a relational database, also known as the online and data handling database, since it contains the computing system catalog, the DAQ configuration parameters and the file bookkeeping data.

In this document, only the second database will be described.

Introduction

The KLOE relational database is managed by the IBM DB2 Universal Database v5 which is a standard-compliant RDBMS. However, as most of the database management systems, it has its own extensions and its own limitations; for more information, have a look at:

The data in a relational database are organized in tables. Each table is an independent entity made up of a defined number of columns and a variable number of rows. A column is a set of values of the same data type. Each row contains an entry in the table. To access the data, users make queries on the tables. The rows are not necessarily ordered within a table, so if order is important, it must be specified in the query.

Since a DBMS normally manages a large amount of data, dumb linear search in a table is not feasible. For this reason, each table has normally one or more indexes. An index is a subset of columns that is maintained ordered at all times, allowing much faster accesses on tables when the search condition of a query contains the initial columns of the index. Special indexes can also be used to ensure uniqueness inside the table (considering only the index columns); the preffered unique index been called the primary key.

To minimize the redundancy of the data, a single piece of information can be split between several tables. To ensure that the pieces fit together, foreign keys are used. A foreign key is a link between the child and the parent table, preventing the deletion of a parent row needed by the child.

To access the whole information, a user has to join different tables (usually using the columns specified in the foreign key). Since this can be a bit tedious, views are often used to hide the complexity away from the user. A view is a logical view on the data; it does not contain any data by itself, but instead gets the data on-the-fly from other tables when needed. To be more precise, it is just a shorthand for a complex query.

Tables and views themselves are grouped in schemas. A schema is just a collection of tables and views used to provide a logical classification. Tables (and/or views) in different schemas are normally used for different purposes.

Database organization

The KLOE relational database can be classified using two different approaches:

Based on the type of data:

The computing system catalog: contains the description of all the computing, disk and tape resources present in the KLOE environment; see the dedicated page for more details.
The DAQ configuration parameters: contains the DAQ related data, such as FEE and DAQ machines used for each run, the FEE and trigger configuration, etc.; see the dedicated page for more details.
The run and file bookkeeping catalog: contains the descriptions of the DAQ and MC runs and the descriptions and location of all the files created during the lifetime of KLOE; see the dedicated page for more details.
DB2 DBMS internal data: DB2 maintains a set of tables containing the current state of the DBMS.; see the official DB2 page for more details .
Test and monitoring data: There are several tables that were created by the database administrator to store test and/or monitoring data. Regular user should never access them, since they can change at any time.

Based on the access pattern:

(Mostly) read-only data: These tables, residing in the descript and sysIBM schemas, are almost read-only; i.e. data are added very rarely and deleted event less frequently.
Interval based data: These tables, residing in the kloemaps schema, have rows that are valid for a range of DAQ runs. When a new data is added, the previous “current” row is updated.
To know if x is the specified range, the following condition must be checked:
(Start_Of_Validity<=x) and ((x<End_Of_Validity) of (End_of_Validity is NULL))
Logging data: These tables, residing in the logger schema, contain the description of all the DAQ and MC runs and of all the created files. Rows are added quite frequently, but very rarely deleted.
Position data: These tables, residing in the status schema, contain the location of the files on disks. Rows are added and deleted quite frequently.
Test and monitoring data: Since these tables are used essentially for testing, not much can be told about them.

A general purpose table browser is also available here.

FAQ (Frequently Asked Questions)

None yet.

Please send questions to Igor Sfiligoi