# ISTITUTO NAZIONALE DI FISICA NUCLEARE

Sezione di Napoli

INFN/TC-97/06 6 March 1997

F. Fabozzi, P. Parascandolo: **DESIGN OF AN HIGH SPEED CLOCK DISTRIBUTION NETWORK** 


## DESIGN OF AN HIGH SPEED CLOCK DISTRIBUTION NETWORK

F. Fabozzi, P. Parascandolo

INFN - Sezione di Napoli, Mostra d'Oltremare, Pad. 20, I-80125 Naples, Italy

#### Abstract

In this note the analog effects that degrade the integrity of digital signals are discussed, with reference to the implementation of a reliable distribution network for a 60 MHz clock on a card with a 64-bit data bus.

## 1. INTRODUCTION

Until about a decade ago, with the exception of RF modules, systems generally operated at clock rates that never posed clock distribution problems to the designer. However, with the introduction of new processors both in the PC arena and in dedicated controllers, clock rates now reach the range of hundreds of MHz. Signal integrity is therefore becoming of paramount importance and poses a new challenge to the digital designer, since a good understanding of analog techniques is a must for a successful design.

Clock distribution is clearly of the utmost importance, as the clock is the highest-frequency signal of a module. To avoid timing failures, clocks are specified with the tightest tolerances, since systems have to operate synchronously despite the unavoidable variations in the manufacturing process (part skew) and the noise that degrades signal integrity.

In general, the digital designer must take care that signals arrive at their load with the best edges, preserving the pulse width, while generating the lowest possible noise. With a 64-bit system bus and a clock above 50 MHz on an overcrowded board, other effects such as crosstalk, reflections and ground bounce must also be considered, as they could prevent reliable operation.

This note describes the timing environment and the performance of a clock distribution network developed for the FIFO board of the BaBar experiment, which works with a 59.5 MHz clock and a 64-bit system bus. The methods presented are general and can be applied to different systems.

## 2. THE FIFO BOARD OF THE BABAR IFR

The BaBar detector [1-2] has been designed to study the CP violation in B meson decays predicted by the Standard Model. The detector is composed of five subdetectors: a Silicon Vertex Detector, a Drift Chamber, a Particle Identification System, an Electromagnetic Calorimeter and a magnet with an Instrumented Flux Return (IFR).



Fig. 1

The IFR is a Resistive Plate Chamber (RPC) detector [3] whose task is to identify muons and neutral hadrons. The required detection performance implies a large number of strips (about 50K channels), which must be extensively buffered while the trigger decision is made. A front end card (FEC) [4-5], acting both as discriminator and as data buffer, mounted on the detector and serving 16 channels, has been produced (more than 3K cards). Data from the FECs will be transmitted to 52 FIFO boards (IFB) located in 8 crates close to the apparatus, which will provide data buffering. These crates will also house ICB modules, whose function is to test the front end cards, and ITB modules, which provide the time of occurrence of the hits (Fig. 1).

Data acquired in the crates by the IFB and ITB must then be transmitted to the Data Acquisition system (DAQ).

The BaBar collaboration has chosen to multiplex a whole crate readout section into a single high speed transmission line going to standard DAQ Read Out Modules (ROM), common to all the sub-detectors and located in the Electronic House (EH). The approved solution for the transmission line consists of a fiber optic transmission system at a speed of 1.2 GHz (CLINK & DLINK) [6]. The fiber optic system serves both to transfer data from the detector to the DAQ (DLINK) and to transmit clock (59.5 MHz), trigger and control signals to the IFR sub-system crates (CLINK).

Although the average acquisition frequency foreseen at BaBar is just 2 kHz, peaks of up to 0.4 MHz are possible. A FIFO board (IFB) operating as a buffer memory is therefore necessary to act as an interface towards both the front end cards and the data acquisition system.

Each IFB serves 64 FEC channels. The clock signal for the BaBar subdetectors is fixed at 59.5 MHz. Commands (12 bit) and test data (64 bit) reach the module via the Din line. Data are output serially via the Dout line. The design of the IFB relies heavily upon MACH210-7 PLDs. Seven such components are employed to decode the commands, to assure correct operation of the board and to manage the interface to the front end.

For the IFB module, reliable operation down to a clock period of 14 nsec is required. This design goal can be achieved, since the MACH210 manufacturer's catalog specifies a setup time $t_{\rm s}$ from input, I/O or feedback to clock of 5.5 nsec when the input register is of D type, and a clock-to-output delay $t_{\rm co}$ of 5 nsec. Their sum, $t_{\rm co} + t_{\rm s} = 10.5$ nsec, leaves margin within the 14 nsec period.

A first series of prototypes is currently under test, and preliminary results are satisfactory: the board operates down to a 12 nsec clock period, well beyond the required 14 nsec.
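The timing argument above can be checked with a few lines of arithmetic. The following is only an illustrative sketch: it uses the catalog figures quoted in the text ($t_{\rm s}$ = 5.5 nsec, $t_{\rm co}$ = 5 nsec) and a simple register-to-register model in which whatever remains of the clock period is available for interconnect delay and clock uncertainty. The function name and the model are assumptions made for illustration, not part of the original design.

```python
def timing_slack_ns(period_ns, t_co_ns, t_s_ns):
    """Slack left in one clock period after clock-to-output delay and setup time."""
    return period_ns - (t_co_ns + t_s_ns)


T_S_NS = 5.5   # MACH210 setup time quoted in the text
T_CO_NS = 5.0  # MACH210 clock-to-output delay quoted in the text

# Period of the 59.5 MHz BaBar clock, in nanoseconds.
nominal_period_ns = 1e3 / 59.5

for label, period_ns in [("nominal 59.5 MHz clock", nominal_period_ns),
                         ("14 nsec design goal", 14.0),
                         ("12 nsec prototype result", 12.0)]:
    slack_ns = timing_slack_ns(period_ns, T_CO_NS, T_S_NS)
    print(f"{label}: period {period_ns:.2f} nsec, slack {slack_ns:.2f} nsec")
```

At the 14 nsec design goal the slack is 3.5 nsec, which has to absorb the interconnect delay and the clock uncertainty.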

## 3. THE TIMING ENVIRONMENT

The design of the timing environment starts from the performance target required of the board (a 14 nsec clock period).

The clock generator is standard for the whole BaBar detector. It is guaranteed to reach each board with a precision of +/- 0.5 nsec, comprising skew, jitter and pulse width distortion.

The first step in the design process is to decide how many loads are to be clocked at the same time, as this affects how the clock has to be distributed to the different PLDs. For the system to operate reliably on every cycle, the tolerance on the arrival time of the clock signal at each component must be predicted with a good degree of accuracy. This in turn limits the choice of the clock driver. Two types of clock driver architecture exist: a buffer type and a feedback type (PLL). In this design we have opted for a buffer type architecture, where the master clock is input to the device and the output buffers redrive it to the loads. The clock driver, however, introduces skew of its own (device or internal skew).

Next, the designer must face the problem of ensuring that the PLDs get their clock without waveform degradation and in close synchronization with each other. This implies that the differences in the propagation time of the clock signals on the board must be kept to a minimum, otherwise skew will arise (external skew). External skew can arise from different mechanisms. The first is a difference in the lengths of the clock traces which, by careful design, can be kept to a minimum. Furthermore, variations in the PCB manufacturing process can cause a spread in the delay of the interconnect. Moreover, different input capacitances of the PLDs load the lines differently and hence can slow down the clock edges, causing skew.

Last, the designer must address the problem of jitter, which appears random and causes variations in the skew. Jitter comes from unavoidable changes in the propagation delay of the components due to noise sources and power supply variations.

Once the target specifications, the clock generator tolerance and the number of loads are known, a timing specification of the board can be written down and analyzed. To achieve reliable operation with a 14 nsec clock period, the tolerance must be contained within +/- 1.5 nsec, since the clock generator itself bears a +/- 0.5 nsec skew. It is clear that:

Tolerance ≥ device skew + ext. skew + jitter

To be conservative, allowing a part-to-part skew of 1 nsec for the clock distributor device, 0.5 nsec is left for external skew.

The overall tolerance budget indicates that if the noise is kept low through a careful layout of the PCB, operation at a minimum clock period of 14 nsec can be achieved.
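The budget in this section can be written as a small bookkeeping routine. This is a minimal sketch, assuming the 1.5 nsec tolerance is treated as a single budget from which the allocations are subtracted. The helper name is hypothetical, and the remainder has to cover both external skew and jitter, which is why the noise must be kept low.

```python
def remaining_budget_ns(tolerance_ns, *allocations_ns):
    """Return what is left of a skew budget after the listed allocations."""
    remaining = tolerance_ns - sum(allocations_ns)
    if remaining < 0:
        raise ValueError("allocations exceed the tolerance budget")
    return remaining


TOLERANCE_NS = 1.5    # board-level tolerance quoted in the text
DEVICE_SKEW_NS = 1.0  # part-to-part skew allowed for the clock driver

left_for_external_ns = remaining_budget_ns(TOLERANCE_NS, DEVICE_SKEW_NS)
print(f"left for external skew and jitter: {left_for_external_ns:.1f} nsec")
```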

## 4. CLOCK DISTRIBUTION NETWORK

To distribute the clock with a buffer architecture in a system operating at a frequency above 50 MHz, special-purpose devices are necessary, since with standard components like the 74F244 the skew can be very large. In fact, for the 74F244 the skew can be calculated by subtracting the minimum propagation delay from the maximum, which gives 4 nsec.

Most manufacturers have therefore developed special-purpose devices designed to drive PCB clock traces with low output skew. One such device is the CDC391 clock driver. This component, manufactured by Texas Instruments, is specified with a part-to-part skew of 1 nsec. This means that any two outputs of two different devices cannot present a skew in excess of this value. In addition, the skew between the output pins of the same device is guaranteed to be under 0.5 nsec. Yet another problem arises from the low impedance of the clock transmission line. Indeed, to assure reliable operation the output pin of the driver must reach a valid level on the initial wavefront of both edges. This requires a high drive current, preferably symmetrical, in both the high and the low state. In this respect, the CDC391 is capable of high drive, since its I<sub>OL</sub> and I<sub>OH</sub> are both specified at 48 mA. Moreover, it can be shown that the device can be used to switch lower-impedance loads, since it can operate up to 96 mA, driving loads down to 30 Ohm. The CDC391 fits this application particularly well, since it incorporates just the 6 outputs that are needed.
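The need for a high drive current on a low impedance line follows from Ohm's law applied to the initial wavefront: a current I launched into a line of characteristic impedance Z0 produces a step of I x Z0. The sketch below is a simplified illustration under stated assumptions: the driver is treated as an ideal current source, the series termination resistor is ignored, and the function names are hypothetical. The 48 mA, 96 mA and 30 Ohm figures are those quoted in the text.

```python
def launched_step_v(drive_current_ma, line_impedance_ohm):
    """Voltage step launched into a line by a given drive current (V = I * Z0)."""
    return drive_current_ma * 1e-3 * line_impedance_ohm


def required_current_ma(step_v, line_impedance_ohm):
    """Drive current needed to launch a given voltage step into the line."""
    return step_v / line_impedance_ohm * 1e3


# Compare the specified and the maximum drive current on a 30 Ohm line.
for current_ma in (48.0, 96.0):
    step_v = launched_step_v(current_ma, 30.0)
    print(f"{current_ma:.0f} mA into 30 Ohm launches a {step_v:.2f} V step")
```

The lower the line impedance, the more current is needed to launch the same initial step, which is why a 10 layer board with traces near 30 Ohm is demanding for the driver.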

In order to achieve the best results, we have also opted for a point-to-point distribution network (Fig. 2), with each output of the CDC391 driving just one PLD through a series termination, i.e. an SMD resistor placed right at the output pin of the clock driver.



Fig. 2

As noted, external skew can be due to different trace lengths, improper line terminations, different capacitive loading, and different threshold voltages at the loads. A point-to-point distribution is obviously the best remedy against the first two problems; moreover, by assuring the best possible transition edges at the load, it also helps in minimizing the effect of possible differences in the threshold voltage.

As transmission line theory dictates, the cause of transmission line effects is impedance mismatch. The mismatch can be due to a difference between the impedance of the transmission line and that of the load, or it can arise from impedance discontinuities or from unequal loading of the transmission lines (i.e. clock lines driving different numbers of loads). Since a voltage reflection is produced each time the impedance along a transmission line changes, great care must be given to the PCB layout. The clock distribution network should have no stubs or vias, as they add considerably to impedance discontinuities. Further, the PCB traces should always keep the same width, also at the corners.
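The size of the reflection produced at an impedance change is given by the standard voltage reflection coefficient, (Z_load - Z0) / (Z_load + Z0). The sketch below evaluates it for a few illustrative cases. The 33 Ohm line impedance is of the order reported for this board, while the other load values and the function name are assumptions chosen only to show the trend.

```python
import math


def reflection_coefficient(z_load_ohm, z0_ohm):
    """Fraction of the incident voltage reflected where a z0 line meets z_load."""
    if math.isinf(z_load_ohm):
        return 1.0  # open end: the whole wave is reflected in phase
    return (z_load_ohm - z0_ohm) / (z_load_ohm + z0_ohm)


Z0_OHM = 33.0  # illustrative trace impedance

cases = {
    "matched load": 33.0,
    "open end (high impedance input)": math.inf,
    "short circuit": 0.0,
    "step to a 50 Ohm section": 50.0,
}
for name, z_load_ohm in cases.items():
    gamma = reflection_coefficient(z_load_ohm, Z0_OHM)
    print(f"{name}: reflection coefficient {gamma:+.2f}")
```

A matched boundary reflects nothing, while an unterminated high impedance input reflects the full wave.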

Since the rise and fall times of our clock driver are under 1 nsec, a trace of just 5 cm has to be considered a transmission line. Further, radiated noise must also be taken into account, because in generating such fast edges the component produces frequency components in the range of 1 GHz. This introduces crosstalk problems in the routing and also affects the choice of the power supply bypass capacitors. To limit the former, clock signals must be routed as far as possible from the other critical signals. The problem of power supply bypassing, instead, concerns both the choice of the bypass capacitor technology and its placement. X7R SMD capacitors exhibit the best behavior in the hundreds of MHz range and should be preferred. They should also be connected with multiple vias to both the ground and the power supply planes to reduce parasitic inductance.
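Two common rules of thumb give a feeling for these numbers. The sketch below is illustrative only and rests on assumptions not stated in this note: a propagation velocity of about 15 cm/nsec (typical of an inner layer in FR-4 type material), the 0.5 / t_r estimate for the highest significant frequency of an edge, and a criterion that treats a trace as a transmission line when its one-way delay exceeds one quarter of the rise time. Other authors use different fractions, so the threshold should be read as indicative.

```python
def knee_frequency_ghz(rise_time_ns):
    """Rule-of-thumb highest significant frequency of an edge (0.5 / t_r)."""
    return 0.5 / rise_time_ns


def trace_delay_ns(length_cm, velocity_cm_per_ns=15.0):
    """One-way propagation delay of a trace; the velocity is an assumed value."""
    return length_cm / velocity_cm_per_ns


def behaves_as_transmission_line(length_cm, rise_time_ns,
                                 velocity_cm_per_ns=15.0, fraction=0.25):
    """True if the one-way delay exceeds the chosen fraction of the rise time."""
    delay_ns = trace_delay_ns(length_cm, velocity_cm_per_ns)
    return delay_ns > fraction * rise_time_ns


print(f"delay of a 5 cm trace: {trace_delay_ns(5.0):.2f} nsec")
print("5 cm trace, 1 nsec edge:", behaves_as_transmission_line(5.0, 1.0))
print(f"knee frequency for a 0.5 nsec edge: {knee_frequency_ghz(0.5):.1f} GHz")
```

With these assumptions a 5 cm trace already qualifies as a transmission line for a 1 nsec edge, and edges well under 1 nsec carry spectral content approaching 1 GHz.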

In our design, to cope with the clock distribution problems on the overcrowded IFB board, the number of layers had to be increased to 10. One of these layers is almost completely devoted to the clocks and other time-critical signals, with ground planes above and below. As the number of layers increases, the characteristic impedance of a line goes down, posing further problems to the clock drivers, which must drive a line impedance of the order of 35 Ohm. Obviously, we have tried to keep the characteristic impedance as high as possible for a 10 layer board.

We have measured line impedances of about 33 Ohm (Fig. 3) on our PCB, employing a Tektronix 11801B Digital Sampling Oscilloscope equipped with a special-purpose probe.



Fig. 3

## 5. PLACEMENT OF THE PLDS

Even though clock distribution is a primary issue, the PLDs cannot simply be placed so that their distances from the clock driver are all the same. Other critical signals exist, so that on many other traces the "time of flight" must be taken into account. Critical are all those signals which, produced at the output of a PLD on a clock edge, must feed an external D flip-flop in such a way that a decision can be taken on the next clock edge.

We have opted for the layout shown in Fig. 4.



Fig. 4

## 6. CLOCK MEASUREMENTS

Clearly, high quality instrumentation is of the utmost importance when performing measurements on a challenging clock. To measure jitter or a "time of flight", it is necessary to have an instrument whose performance is better by one order of magnitude. We started our tests with the HP 8133, a 3 GHz pulse generator whose pulse stability (i.e. rms jitter) is under 10 ps. With this highly stable clock we have tested the clock driver, measuring the jitter on each output. Fig. 5 shows an example of a clock waveform measured at the clock input of the Cmddec PLD on our PCB (all the others are the same). The frequency of this clock is 120 MHz, twice the typical operating frequency of the board. Measurements have been performed with a purpose-built high bandwidth probe consisting of a $1\,{\rm k}\Omega$ resistor in series with an RG174 cable terminated in 50 Ohm at the scope (TDS744, 2 GS/sec).



Fig. 5
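The resistive probe described in this section forms a simple voltage divider, so its loading and attenuation follow directly from the two resistances. The sketch below assumes that the terminated 50 Ohm cable looks like a pure 50 Ohm resistance to the series resistor and neglects parasitic capacitance. The function names are hypothetical, and the 0.1 V scope reading is an arbitrary value used only to show the scaling.

```python
def resistive_probe(series_ohm, termination_ohm):
    """Return (load seen by the circuit in Ohm, attenuation ratio) of the probe."""
    load_ohm = series_ohm + termination_ohm
    attenuation = load_ohm / termination_ohm
    return load_ohm, attenuation


def node_voltage_v(scope_reading_v, attenuation):
    """Scale a scope reading back to the voltage at the probed node."""
    return scope_reading_v * attenuation


load_ohm, attenuation = resistive_probe(1000.0, 50.0)
print(f"probe loads the node with {load_ohm:.0f} Ohm, attenuation {attenuation:.0f}:1")
print(f"a 0.1 V scope reading corresponds to {node_voltage_v(0.1, attenuation):.1f} V")
```

The probe loads the clock line with about 1 kOhm, far above the line impedance, at the price of a 21:1 attenuation of the signal seen by the scope.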

## 7. SUMMARY

In this note we have described the clock distribution network for the IFB board, developed for data taking with the IFR of the BaBar detector. We have outlined the challenges that a clock distribution network poses to the designer, and we have shown measurements on the PCB of both the characteristic impedance of two clock traces and a high frequency (120 MHz) clock waveform.

## REFERENCES

- [1] Letter of Intent for the Study of CP Violation and Heavy Flavour Physics at PEPII, BaBar collaboration, SLAC Report SLAC-433, June 1994.
- [2] BaBar Technical Design Report, BaBar Collaboration, SLAC Report SLAC-R-95-457, March 1995.
- [3] R. Santonico and R. Cardarelli, Nucl. Instr. and Meth. 187, 377 (1981).
- [4] N. Cavallo et al., "A possible front-end readout scheme for the resistive plate chamber detector at BaBar", INFN/TC 95/07.
- [5] N. Cavallo et al., "Front-end card design for the RPC detector at BaBar", INFN/TC 96/22.
- [6] BaBar Note 281, BaBar DAQ Group, Draft 4/15/96.