# A 1GSAMPLE/s 6-BIT FLASH A/D CONVERTER WITH A COMBINED CHOPPING AND AVERAGING TECHNIQUE FOR REDUCED DISTORTION IN $0.18 \mu \mathrm{~m}$ CMOS 

A Thesis<br>by<br>NIKOLAOS STEFANOU<br>Submitted to the Office of Graduate Studies of<br>Texas A\&M University in partial fulfillment of the requirements for the degree of MASTER OF SCIENCE

May 2005

Major Subject: Electrical Engineering

Approved as to style and content by:

Sameer Sonkusale
(Chair of Committee)


Alexander Parlos
(Member)

Costas Georghiades
(Member)

Chanan Singh
(Head of Department)


#### Abstract

A 1GSample/s 6-bit Flash A/D Converter with a Combined Chopping and Averaging Technique for Reduced Distortion in $0.18 \mu \mathrm{~m}$ CMOS. (May 2005)

Nikolaos Stefanou, Diploma, National Technical University of Athens<br>Chair of Advisory Committee: Dr. Sameer Sonkusale


Hard disk drive applications require a high Spurious Free Dynamic Range (SFDR), 6-bit Analog-to-Digital Converter (ADC) at conversion rates of 1 GHz and beyond. This work proposes a robust, fault-tolerant scheme to achieve high SFDR in an averaging flash $\mathrm{A} / \mathrm{D}$ converter using comparator chopping. Chopping of comparators in a flash A/D converter was never previously implemented due to lack of feasibility in implementing multiple, uncorrelated, high speed random number generators. This work proposes a novel array of uncorrelated truly binary random number generators working at 1 GHz to chop all comparators.

Chopping randomizes the residual offset left after averaging, further pushing the dynamic range of the converter. This enables higher accuracy and lower bit-error rate for high speed disk-drive read channels. Power consumption and area are reduced because of the relaxed design requirements for the same linearity.

The technique has been verified in MATLAB simulations of a 6-bit 1 GSample/s flash ADC in the presence of process gradients with non-zero mean offsets as high as 60 mV and potentially serious spot offset errors as high as 1 V for a 2 V peak-to-peak input signal. The proposed technique exhibits an improvement of over 15 dB compared to pure averaging flash converters for all cases.

The circuit-level simulation results, for a 1 V peak-to-peak input signal, demonstrate superior performance. The reported ADC was fabricated in the TSMC $0.18 \mu \mathrm{~m}$ CMOS process. It occupies $8.79 \mathrm{~mm}^{2}$ and consumes about 400 mW from a 1.8 V power supply at 1 GHz. The targeted SFDR performance for the fabricated chip is at least 45 dB for a 256 MHz input sine wave sampled at 1 GHz, about a 10 dB improvement on the 6-bit flash ADCs in the literature.

To my parents

## ACKNOWLEDGMENTS

I would like to thank Dr. Sameer Sonkusale, for giving me the opportunity to work under him. I would like to thank him for his friendly and encouraging attitude and the support he provided. He has been a good teacher and a good advisor. I would also like to express my gratitude to my committee members Dr. Jose Silva-Martinez, Dr. Costas Georghiades and Dr. Alexander Parlos for their time and support. Thanks are due to my friends and all of my current and former colleagues. Lastly, I would like to thank my parents and sister. Without their support, I would have never come this far.

## TABLE OF CONTENTS

I INTRODUCTION
A. Flash A/D Converter
B. Design Issues with Flash A/D Converters
C. Offset Cancellation Techniques
D. Thesis Layout

II RESISTIVE AVERAGING AND CHOPPING TECHNIQUE
A. Resistive Averaging

1. Limitations of Resistive Averaging

B. Chopping
C. Simulation Results for Averaging and Chopping and Averaging Combined Techniques

III HIGH SPEED ARRAY OF TRULY BINARY RANDOM GENERATORS
A. Introduction to Random Number Generators
B. Voltage Controlled Ring Oscillator
C. Design of the RNG Array
D. Results
E. Conclusions

IV DESIGN AND IMPLEMENTATION
A. Sample and Hold
B. Comparator Design

1. Preamplifier
2. $2^{\text {nd }}$ Preamplifier and $1^{\text {st }}$ Latch
3. $2^{\text {nd }}$ Latch
4. SR Latch
5. Chopping Switches
6. Averaging
7. Array of Random Number Generators
8. Comparator Performance

C. Resistor Ladder
D. Clock Generator
E. Digital Encoder

1. Bubble Correction
2. Gray Encoded ROM
3. Gray Decoder
4. Decimation

F. Output Buffers
G. Layout
H. Post Layout Performance

V TESTING BOARD AND TEST SETUP
A. Testing Board
B. Test Setup

VI CONCLUSIONS

REFERENCES

VITA

## LIST OF TABLES

I Simulated Results for $R_{2} / R_{1}=1$.

II Simulated Results for $R_{2} / R_{1}=0.1$.

III Simulated Results for $R_{2} / R_{1}=0.01$.

IV Statistical Averages and FIPS 140-1 Tests.

V Table of Truth for SR NOR Based Latch.

VI Clock Loads.

## LIST OF FIGURES

1 A Hard Disk Drive Read Channel.

2 A Flash Analog-to-Digital Converter.

3 Input and Output Offset Storage (IOS and OOS) Cancellation Techniques. The offset of the amplifier is stored at every conversion cycle at a capacitor, at the input for IOS and the output for OOS. In IOS the input referred offset is reduced by A times, where A is the amplifier gain, and in OOS it is completely removed.

4 An Averaging Flash Analog-to-Digital Converter. Averaging schemes use the outputs of neighboring active input amplifiers to increase the effective gate area and in this way reduce the effect of offset voltages.

5 An Interpolation by 2 Flash Analog-to-Digital Converter. In this scheme a zero crossing is obtained by interpolating between two reference levels. The number of input amplifiers can be reduced depending on the number of times an interpolation takes place. However, the size of the input transistors of the amplifier needs to be increased.

6 A 7-bit Folding Analog-to-Digital Converter. This architecture uses analog preprocessing to transform the input signal into a repetitive output signal to be applied to the fine converter. Number of comparators is drastically reduced compared to flash ADC architectures.

7 Chopping of a Comparator. The input and output terminals are swapped using a random dither sequence r. Any possible offset is on average zero.

8 Block Diagram of the Proposed ADC.

9 Averaging Implementation.

10 Effect of Averaging.

11 Effect of Averaging and Chopping.

12 Chopping Implementation.

13 Spectrum Plots of the Simulated Cases for $R_{2} / R_{1}=1$.

14 Spectrum Plots of the Simulated Cases for $R_{2} / R_{1}=0.1$.

15 Spectrum Plots of the Simulated Cases for $R_{2} / R_{1}=0.01$.

16 RNG Technique Using Direct Sampling.

17 Oscillator Based RNG.

18 Ring Oscillator.

19 Conventional Voltage Controlled Ring Oscillator.

20 Voltage Controlled Ring Oscillator Using Variable Resistors.

21 Implementation of Voltage Controlled Ring Oscillator Using Variable Resistors.

22 Period Spread of the Two VCOs.

23 (a) The Novel RNG Implementation. Many such RNGs are used in parallel to generate an array, (b) Decorrelating XOR and Array Implementation.

24 Block Diagram of the Proposed ADC.

25 Sample and Hold Implementation.

26 Sample and Hold Dynamic Performance (Circuit Level Simulation) with Fin $=256.8359375 \mathrm{MHz}$ at 1 GHz Sampling, 1024pt FFT.

27 Sample and Hold Dynamic Performance (Postlayout Simulation) with Fin $=256.8359375 \mathrm{MHz}$ at 1 GHz Sampling, 1024pt FFT.

28 Chopped Comparator.

29 $1^{\text {st }}$ Preamplifier (4-input).

30 Chopped 4-input Preamplifier.

31 $2^{\text {nd }}$ Preamplifier and $1^{\text {st }}$ Latch.

32 $2^{\text {nd }}$ Latch.

33 SR Latch with NOR Configuration.

34 SR Flip-Flop with NOR Configuration.

35 Chopping Switch.

37 Comparator Overdrive Recovery.

38 Clock Generator.

39 Bubble Correction Using NAND Gates.

40 A $4 \times 4$ NOR ROM.

41 Gray to Binary Decoder and Decimation Stage.

42 1/16 Clock Divider.

43 Pad and Probe Parasitic Model.

44 Output Buffer.

45 Output Buffer Performance.

46 Chip Layout.

47 ADC Dynamic Performance (Postlayout Simulation of the Chip) with Fin $=256.8359375 \mathrm{MHz}$ at 1 GHz Sampling, 1024pt FFT.

48 Testing Board Schematic.

49 PCB Bottom Layer.

50 PCB Top Layer.

51 Test Setup.

## CHAPTER I

## INTRODUCTION

High speed analog-to-digital converters are used in many signal processing applications. One such application is a hard disk drive read channel, where a low cost, 6-bit flash ADC with high dynamic range at speeds higher than 1 GHz is desired.

Figure 1 shows the block diagram of an ADC embedded in the Extended Partial Response Class IV (EPR4) PRML front-end hard disk read channel. Accurate timing and gain control require dynamic range higher than 45 dB for the 1 GHz ADC. Moreover, in order to achieve a low bit-error-rate (BER), a high dynamic range is needed.

## A. Flash A/D Converter

A flash ADC is illustrated in Figure 2. An n-bit flash ADC has $2^{n}-1$ comparators. Each comparator compares the input $V_{in}$ with the reference voltage $V_{ref(i)}$ that corresponds to its position. The reference voltages divide the full scale of the input swing at $2^{n}-1$ equally spaced taps. The comparison is made using a set of amplifiers and latches. After comparison, the $2^{n}-1$ comparators produce a thermometer code, which consists of consecutive 1s followed by 0s. The transition point from 1s to 0s gives the corresponding input amplitude. The thermometer code is then converted into an n-bit binary code.
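The conversion described above can be sketched in a few lines of Python. This is only an illustrative model of an ideal flash ADC, not the circuit of this thesis: the function names, the unipolar 0 V to 1 V input range and the simple transition search (no bubble correction or Gray coding) are assumptions made for the example.

```python
def flash_thermometer(v_in, v_fs=1.0, n_bits=6):
    """Return the thermometer code of an ideal n-bit flash ADC.

    Comparator i (i = 1 .. 2**n_bits - 1) outputs 1 when v_in exceeds
    its reference tap i * v_fs / 2**n_bits (assumed unipolar range).
    """
    n_comp = 2 ** n_bits - 1
    lsb = v_fs / 2 ** n_bits
    return [1 if v_in > i * lsb else 0 for i in range(1, n_comp + 1)]


def thermometer_to_binary(thermometer, n_bits=6):
    """Locate the 1-to-0 transition and return it as an n-bit binary string."""
    code = 0
    for bit in thermometer:
        if bit == 0:
            break          # first 0 marks the transition point
        code += 1
    return format(code, f"0{n_bits}b")


thermometer = flash_thermometer(0.51)
binary_code = thermometer_to_binary(thermometer)
print(sum(thermometer), binary_code)
```

For a 0.51 V input on a 1 V full scale, the 32 comparators whose taps lie at or below 0.5 V output 1 and the remaining 31 output 0.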

This thesis follows the style and format of IEEE Journal of Solid-State Circuits.


Fig. 1. A Hard Disk Drive Read Channel.

## B. Design Issues with Flash A/D Converters

Most digital signal processing circuits are implemented in a digital CMOS process, and as a result the embedded flash ADC has to be built using the same process. Consequently, the use of transistors with poor linearity and matching is common. This results in large offsets and nonlinearities at the comparators of a flash ADC, introducing spurious tones that limit signal-to-noise-and-distortion ratio (SNDR) and spurious-free dynamic range (SFDR) performance.

Fig. 2. A Flash Analog-to-Digital Converter.

In a flash ADC, quantization level errors mostly come from resistor mismatches in the resistor reference ladder, which exhibit a spatial gradient distribution with non-zero mean offsets, and from comparator input offsets, mainly due to transistor threshold voltage mismatches. The latter is usually the dominating source of offset in low-voltage applications due to reduced signal swings and quantization step size. The difference $\Delta V_{T}$ between the threshold voltages of a pair of MOS transistors is described by its standard deviation [1], [2], [3]:

$$
\begin{equation*}
\sigma_{\Delta V T}=\frac{A_{V T}}{\sqrt{W L}} \tag{1.1}
\end{equation*}
$$

Usually the comparator input offset is dominated by this mismatch ($\sigma_{off} \approx \sigma_{\Delta V T}$). At high speed there is a need to use smaller transistors, which increases the offsets introduced by the comparators. Furthermore, in minimum-length transistors, this offset is usually much greater than predicted by Eq. 1.1. $A_{V T}$ for $0.18 \mu \mathrm{~m}$ CMOS is $5 \mathrm{~mV} \cdot \mu \mathrm{m}$ [3] and tends to decrease with newer technologies. Thus, input offsets with standard deviations in the range of 10 mV to 20 mV are typical. In a 6-bit ADC with a 1 V signal swing, the quantization step size is 15.625 mV, which is comparable to the input offsets. This limits the resolution of the flash ADC to less than 5 bits.
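As a quick numerical check of these figures, the sketch below evaluates Eq. 1.1 next to the quantization step. The transistor dimensions are hypothetical values chosen only for illustration, and the function names are not from the thesis; only $A_{VT} = 5$ mV·µm, the 1 V swing and the 6-bit resolution come from the text.

```python
import math


def sigma_vt_mv(a_vt_mv_um, w_um, l_um):
    """Eq. 1.1: threshold mismatch standard deviation in mV."""
    return a_vt_mv_um / math.sqrt(w_um * l_um)


def lsb_mv(v_swing, n_bits):
    """Quantization step size in mV for a given input swing in volts."""
    return 1000.0 * v_swing / 2 ** n_bits


A_VT = 5.0  # mV*um, value quoted in the text for 0.18 um CMOS

# Hypothetical input-pair sizes (W, L) in micrometres
sigma_small = sigma_vt_mv(A_VT, 0.5, 0.18)
sigma_large = sigma_vt_mv(A_VT, 1.0, 0.25)
step = lsb_mv(1.0, 6)

print(f"sigma (0.5 um x 0.18 um): {sigma_small:.2f} mV")
print(f"sigma (1.0 um x 0.25 um): {sigma_large:.2f} mV")
print(f"LSB for 1 V swing, 6 bits: {step:.3f} mV")
```

With these assumed sizes the mismatch lands in the 10 mV to 20 mV range mentioned above, which is of the same order as the 15.625 mV step.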

## C. Offset Cancellation Techniques

Offset cancellation techniques are needed to improve the resolution of the ADC. Various input and output offset storage (IOS and OOS) schemes (Figure 3), along with background offset cancellation techniques, have been proposed in the literature [4], but they are suited only to low-speed applications. Moreover, they require cascaded gain stages to reduce offsets to an acceptable level.

At high speeds, averaging and interpolation between adjacent comparators are used to mitigate the effect of offsets (Figures 4 and 5) [5], [6], [7], [8]. However, averaging and interpolation techniques require over-range amplifiers towards the ends of the array. Moreover, the performance is limited by the overall number of preamplifiers that are in the linear region for a given input [5], [8].

Folding architectures can be used to reduce the area, but complex analog preprocessing and the higher bandwidth requirement of the folder design limit their dynamic performance in high speed applications (Figure 6) [9], [10], [8].

Fig. 3. Input and Output Offset Storage (IOS and OOS) Cancellation Techniques. The offset of the amplifier is stored at every conversion cycle at a capacitor, at the input for IOS and the output for OOS. In IOS the input referred offset is reduced by A times, where A is the amplifier gain, and in OOS it is completely removed.

Fig. 4. An Averaging Flash Analog-to-Digital Converter. Averaging schemes use the outputs of neighboring active input amplifiers to increase the effective gate area and in this way reduce the effect of offset voltages.

Fig. 5. An Interpolation by 2 Flash Analog-to-Digital Converter. In this scheme a zero crossing is obtained by interpolating between two reference levels. The number of input amplifiers can be reduced depending on the number of times an interpolation takes place. However, the size of the input transistors of the amplifier needs to be increased.

Fig. 6. A 7-bit Folding Analog-to-Digital Converter. This architecture uses analog preprocessing to transform the input signal into a repetitive output signal to be applied to the fine converter. Number of comparators is drastically reduced compared to flash ADC architectures.

Chopping of comparators (Figure 7) was theoretically analyzed in [11] for $\Delta \Sigma$ modulators and flash A/D converters but never implemented in practice due to lack of feasibility in implementing multiple, uncorrelated random number generators.

## D. Thesis Layout

This work proposes the use of chopped comparators in a traditional averaging flash ADC to improve the linearity [12]. To facilitate chopping, a novel array of binary random number generators is proposed [13]. These ideas have led to a high speed ADC with exceptional dynamic performance [14].

This thesis reports on a 6-bit 1 GSample/s flash ADC with averaging and chopping combined (Figure 8). The analog part of the 6-bit flash ADC consists of 63 comparator slices (plus an additional 12 slices for averaging termination), each one using a 4-input chopped differential preamplifier, a second preamplifier and two latches. In addition, a sample and hold is used. At the back end, the digital part includes bubble correction, a Gray encoder, a Gray-to-binary decoder and an array of random number generators to perform chopping at the input of the 4-input preamplifier.

Fig. 7. Chopping of a Comparator. The input and output terminals are swapped using a random dither sequence $r$. Any possible offset is on average zero.

Chapter II describes the resistive averaging and chopping techniques and their limitations. Simulation results of the proposed technique using MATLAB are also shown in the same chapter. Chapter III presents the array of truly binary random number generators used for chopping. Chapter IV presents the architecture of the ADC and a detailed description of its building blocks as well as the layout design. Chapter V describes the testing board and test setup. Finally, conclusions are drawn in Chapter VI.


Fig. 8. Block Diagram of the Proposed ADC.

## CHAPTER II

## RESISTIVE AVERAGING AND CHOPPING TECHNIQUE ${ }^{\dagger}$

## A. Resistive Averaging

Resistive averaging is a popular technique for improving the deteriorated performance of a flash ADC due to the offsets of the preamplifiers in the comparators. It was first introduced by Kattmann and Barrow [15]. The main idea of averaging is to average error sources by connecting resistors between the output stages of neighboring amplifiers in a flash ADC architecture. In an infinite averaging resistive network the differential non-linearity (DNL) and integral non-linearity (INL) are significantly improved without the need of high tolerance averaging resistors and without affecting input capacitance and power dissipation. The resistive network implemented on an array of preamplifiers of a flash ADC is shown in Figure 9.

To average the offsets, the outputs of the amplifiers are interconnected via averaging resistors $R_{2}$. The offset reduction depends on $R_{2}$ and the output impedance of the amplifiers, $R_{1}$. $R_{X}$ is the resistance seen from each node $n_{i}$ towards either end, assuming an infinite resistive array, and is given by the following expression derived in [15]:

$$
\begin{equation*}
R_{X}=R_{2}+R_{1} \| R_{X} \Rightarrow R_{X}=\frac{R_{2}+\sqrt{R_{2}^{2}+4 R_{1} R_{2}}}{2} \tag{2.1}
\end{equation*}
$$

Fig. 9. Averaging Implementation.

The behavior of the amplifiers can be modeled by a controlled voltage source $A\left(V_{in}-V_{ref(i)}+V_{off(i)}\right)$ with an output impedance $R_{1}$ [6], where $V_{off(i)}$ is the input-referred offset attributed to a variety of sources such as resistor mismatch and preamplifier offset. In [6] it is shown that the equivalent offset voltage at output node $n_{i}$ depends not only on its nearest offset voltage source $V_{off(i)}$ but also on all the other offset voltages of the neighboring preamplifiers through a reduced weighting function:

$$
\begin{equation*}
V_{off\left(n_{i}\right)}=K_{a}\left(V_{off(i)}+\sum_{j=1}^{\infty} K_{b}^{j} \cdot V_{off(i+j)}+\sum_{j=1}^{\infty} K_{b}^{j} \cdot V_{off(i-j)}\right) \tag{2.2}
\end{equation*}
$$

where

$$
K_{a}=\frac{R_{X}}{R_{X}+2 R_{1}}, \qquad K_{b}=\frac{R_{1} \| R_{X}}{R_{1} \| R_{X}+R_{2}}
$$

Assuming that all offset voltage sources are uncorrelated with variance $\sigma_{off}^{2}$, the equivalent variance at each node $n_{i}$ was shown to be [6]:

$$
\begin{equation*}
\sigma_{off(n)}^{2}=K_{a}^{2} \cdot\left(1+2 \sum_{j=1}^{\infty} K_{b}^{2 j}\right) \cdot \sigma_{off}^{2}=K_{a}^{2} \cdot \frac{1+K_{b}^{2}}{1-K_{b}^{2}} \cdot \sigma_{off}^{2} \tag{2.3}
\end{equation*}
$$

Eq. 2.3 shows that the equivalent offset at the input of the amplifier is reduced through averaging. The effect of averaging can be seen in Figure 10, where the transfer function of an ADC is illustrated before and after averaging (thin and thick solid lines, respectively). If the offsets are randomly distributed, the transfer function after averaging gets closer to the ideal one.
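To get a feel for the numbers, Eqs. 2.1 to 2.3 can be evaluated directly. The sketch below simply computes the expressions as written for the three $R_{2}/R_{1}$ ratios used later in this chapter; the function name and the normalization $R_{1}=1$ are assumptions made for the example, and the infinite-array idealization of the equations still applies.

```python
import math


def averaging_factors(r1, r2):
    """Evaluate R_X (Eq. 2.1), K_a, K_b (Eq. 2.2) and the variance
    ratio sigma_off(n)^2 / sigma_off^2 (Eq. 2.3) for an infinite array."""
    rx = (r2 + math.sqrt(r2 ** 2 + 4.0 * r1 * r2)) / 2.0
    r1_par_rx = r1 * rx / (r1 + rx)            # R_1 || R_X
    k_a = rx / (rx + 2.0 * r1)
    k_b = r1_par_rx / (r1_par_rx + r2)
    variance_ratio = k_a ** 2 * (1.0 + k_b ** 2) / (1.0 - k_b ** 2)
    return rx, k_a, k_b, variance_ratio


for ratio in (1.0, 0.1, 0.01):
    rx, k_a, k_b, var_ratio = averaging_factors(1.0, ratio)   # R_1 normalized to 1
    print(f"R2/R1={ratio:<5} R_X={rx:.4f} K_a={k_a:.4f} "
          f"K_b={k_b:.4f} variance ratio={var_ratio:.4f}")
```

Smaller $R_{2}/R_{1}$ pushes $K_{b}$ towards 1, so more neighbors contribute with significant weight, while $K_{a}$ shrinks.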

## 1. Limitations of Resistive Averaging

Eqs. 2.1, 2.2 and 2.3 assume an infinite resistive network. In actual implementations, this assumption is invalid towards the ends of the array, which makes averaging a less attractive solution. Averaging is only effective for the preamplifiers in the middle of the array. Towards the ends of the array it becomes ineffective, since fewer preamplifiers are involved in the weighted summation of offsets. To solve this problem, over-range preamplifiers are added at the ends of the array, at the expense of more power consumption and area.

The offset reduction is still limited by the fact that averaging includes only the few neighboring preamplifiers that are in a linear region of operation for the applied input signal. This considerably reduces the effect of averaging in the presence of gradient and spot offset errors. Linear gradient errors introduce offsets with nonzero mean, due to transistor or resistor mismatch with a linear distribution across the comparator array. Spot offset errors are more abrupt discontinuities, usually large offsets localized on a few comparators. They usually result from poor layout, localized fabrication errors or stuck-at faults. Averaging is much less effective for large spot offset errors.

Fig. 10. Effect of Averaging (Transfer Function of ADC).

Usually, averaging is required at the output of each preamplifier stage in the comparator array. It has been shown that, under ideal conditions, optimum averaging can lower the random offset by up to a factor of 3 [5]. This means that the input transistors in the preamplifier array can be 9 times smaller in area (Eq. 1.1) than in flash converters with no averaging. However, in high speed applications, where transistors occupy the least possible area, and in technologies with large process variations, poor device reliability and poor matching, this reduction may not be enough.
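The factor of 9 follows from the square-root dependence in Eq. 1.1. As a short worked step, assuming the offset is dominated by threshold mismatch and writing $(WL)_{avg}$ and $(WL)_{0}$ (symbols introduced here only for illustration) for the gate areas with and without averaging, a device that may tolerate three times the raw offset satisfies:

$$
\frac{A_{VT}}{\sqrt{(WL)_{avg}}}=3 \cdot \frac{A_{VT}}{\sqrt{(WL)_{0}}} \quad \Rightarrow \quad (WL)_{avg}=\frac{(WL)_{0}}{9}
$$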

We propose chopping alongside averaging to eliminate, on average, any residual offset through randomization. The use of chopping relaxes this design requirement further, owing to its randomizing effect on the overall offset, however large, and yields lower harmonics in the output spectrum.

## B. Chopping

Chopping has traditionally been used in operational amplifiers and filters to reduce the effect of DC offsets and $1/f$ noise by randomly swapping the input (and therefore output) terminals [16], [17]. It was also proposed for dynamic element matching to reduce the distortion of multi-bit quantization in $\Delta \Sigma$ converters [11].


Fig. 11. Effect of Averaging and Chopping.

The main idea of chopping is to swap the two inputs of the comparator using a random dither sequence. The swapping causes each comparator to have two offsets that are chosen randomly and independently of the input signal. Figure 10 showed that averaging reduces the equivalent comparator offsets. Since chopping essentially alternates the sign of the equivalent offset of each comparator, the overall transfer function will lie between the two thick lines (solid and dashed) in Figure 11 and on average will be close to the ideal curve.

Fig. 12. Chopping Implementation.

Implementation of chopping is illustrated in Figure 12, where $V_{in}$ is the instantaneous value of the input to the flash ADC, $r$ is the instantaneous value of the $\pm 1$ random sequence, which modulates the offset by changing the connection of the switches $S_{1}$ and $S_{2}$, $V_{ref}$ and $V_{off}$ are the reference voltage and the offset voltage of the comparator, respectively, and $y$ is the output of the comparator. Using chopping, the output sequence $y$ of the comparator is given by:

$$
y= \begin{cases}1, & V_{\text {in }}-V_{\text {ref }}-r \cdot V_{o f f}>0  \tag{2.4}\\ 0, & \text { otherwise }\end{cases}
$$

The resulting expected value of the input $V_{X}$ at a comparator is then equal to:

$$
\begin{align*}
E\left[V_{X}\right] & =E\left[V_{\text {in }}-V_{\text {ref }}-r V_{o f f}\right]=E\left[V_{\text {in }}-V_{r e f}\right]-E\left[r V_{o f f}\right] \\
& =E\left[V_{\text {in }}-V_{r e f}\right]-E[r] E\left[V_{o f f}\right]=E\left[V_{\text {in }}-V_{r e f}\right] \tag{2.5}
\end{align*}
$$

where $E[r]=0$ and $r$ is independent of $V_{off}$.

This means that chopping at the inputs of the comparator converts the systematic offset into a random offset with an expected value of 0 (the same as the ideal case).

However, the power contribution from the offset remains unchanged; it appears in the output spectrum only as an increased noise floor and not as spurious tones. The mean-square value of $V_{X}$, which also gives a measure of the input-referred power contribution of the offset, is given by:

$$
\begin{gather*}
E\left[V_{X}^{2}\right]=E\left[\left(V_{\text {in }}-V_{r e f}-r V_{o f f}\right)^{2}\right]=E\left[\left(V_{\text {in }}-V_{r e f}\right)^{2}\right]+E\left[r^{2} V_{o f f}^{2}\right]-2 E\left[r\left(V_{\text {in }}-V_{r e f}\right) V_{o f f}\right] \\
=E\left[\left(V_{\text {in }}-V_{r e f}\right)^{2}\right]+E\left[r^{2}\right] E\left[V_{o f f}^{2}\right]-2 E[r] E\left[\left(V_{\text {in }}-V_{r e f}\right) V_{o f f}\right] \\
=E\left[\left(V_{\text {in }}-V_{r e f}\right)^{2}\right]+E\left[V_{o f f}^{2}\right] \tag{2.6}
\end{gather*}
$$

where $E\left[r^{2}\right]=\frac{1}{2}+\frac{1}{2}=1$ is the variance of $r$.

Without chopping, the offset contributes to spurious tones, resulting from periodic sampling of the offsets. After chopping, the offset contribution of the flash will only result in an increased noise floor [11]. Compared to the IOS and OOS techniques, where the offset is reduced deterministically through offset sampling, chopping does not eliminate the offset. Instead it provides an average close-to-zero mean offset performance, irrespective of the amount of absolute offset at the input of the comparators. This is desirable in many communication applications where spurious free dynamic range (SFDR) is critical to the performance of the overall system.
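The two moments in Eqs. 2.5 and 2.6 can be checked numerically for a single comparator. The following is a minimal Monte Carlo sketch, assuming NumPy is available; the reference voltage, offset value, sample count and seed are arbitrary illustrative choices, and the dither is drawn from a software pseudo-random generator rather than the hardware RNG array of this thesis.

```python
import numpy as np


def chopped_comparator(v_in, v_ref, v_off, r):
    """Eq. 2.4: output 1 where v_in - v_ref - r * v_off > 0, else 0."""
    return (v_in - v_ref - r * v_off > 0).astype(int)


rng = np.random.default_rng(0)            # arbitrary seed
n = 100_000
v_ref, v_off = 0.5, 0.02                  # illustrative values in volts

r = rng.choice([-1, 1], size=n)           # +/-1 dither, independent of v_in
v_in = rng.uniform(0.0, 1.0, size=n)
y = chopped_comparator(v_in, v_ref, v_off, r)

mean_offset = np.mean(r * v_off)          # Eq. 2.5: tends to 0
offset_power = np.mean((r * v_off) ** 2)  # Eq. 2.6: stays at v_off**2

print(f"mean effective offset: {mean_offset:.2e} V")
print(f"offset power: {offset_power:.2e} V^2 (v_off^2 = {v_off**2:.2e})")
```

The mean effective offset shrinks with the number of samples, while the offset power is unchanged, which is the noise-floor behavior described above.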

## C. Simulation Results for Averaging and Chopping and Averaging Combined Techniques

A 6-bit flash ADC with a 2 V input swing (31.25 mV quantization step size) and bubble correction in the digital encoder was simulated for different types of input-referred offsets. 20 amplifiers are assumed to be in the linear region for averaging, and 10 over-range amplifiers are used to improve linearity at the ends of the array.

Several cases with different $\sigma_{off}$ and gradients were simulated. It is shown that the improvement strongly depends on the value of the averaging resistor $R_{2}$ and the output impedance of the preamplifier $R_{1}$. The optimum value is not the same for different cases. The results for input-referred offsets with a standard deviation of $\sigma_{off}=20 \mathrm{~mV}$, with or without a gradient of -60 mV to 60 mV from the top to the bottom comparator, are summarized in Tables I, II and III for the cases of $R_{2} / R_{1}=1$, $R_{2} / R_{1}=0.1$ and $R_{2} / R_{1}=0.01$, respectively. We also show the results for an extreme case where there are spot offset errors as large as 1 V at five consecutive comparators, along with a standard deviation of $\sigma_{off}=20 \mathrm{~mV}$. The spectrum plots of the simulated cases are shown in Figures 13, 14 and 15 for the different $R_{2} / R_{1}$ ratios, where a 4096-point fast Fourier transform (FFT) was used.

We see that for the case without gradient, averaging gives an improvement in the SFDR of 4.02 dB, 8.37 dB and 9.94 dB for $R_{2} / R_{1}=1$, $R_{2} / R_{1}=0.1$ and $R_{2} / R_{1}=0.01$, respectively. Using chopping along with averaging gives an improvement of 14.17 dB, 14.31 dB and 12.28 dB, respectively, for the same cases.

For the case of offsets due to linear gradients the results are even more impressive. In this case averaging does not cancel out the odd harmonics, because of the nonzero mean of the offsets, whereas chopping whitens the effect of offsets. With averaging only, the SFDR improvement is less than 1 dB for all of $R_{2} / R_{1}=1$, $R_{2} / R_{1}=0.1$ and $R_{2} / R_{1}=0.01$. Using chopping along with averaging can improve the SFDR by approximately 16 dB for all the $R_{2} / R_{1}$ ratios.

For the extreme case of large spot errors we see that without any calibration the SFDR is 8.76 dBc . Using only the averaging technique the SFDR improved by 12.08 dB for $R_{2} / R_{1}=0.01$ and by using both techniques the improvement is about 27 dB even for such an extremely defective case. Thus chopping introduces fault tolerance in the converter.
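For readers who want to experiment with the effect of chopping on the output spectrum, the sketch below sets up a much simplified behavioral model in Python with NumPy. It is not the MATLAB model used for the tables: it omits the averaging network and over-range amplifiers, uses a ones-counting encoder in place of bubble correction, and the input frequency bin, amplitude, seed and function names are assumptions made for the example. No particular SFDR values are implied.

```python
import numpy as np


def flash_adc(v_in, offsets, r=None, v_fs=2.0, n_bits=6):
    """Behavioral flash ADC with per-comparator offsets.

    v_in: (N,) samples within +/- v_fs/2; offsets: (2**n_bits - 1,) volts;
    r: optional (N, 2**n_bits - 1) array of +/-1 chopping signs (Eq. 2.4).
    Returns integer codes obtained by counting comparator outputs equal to 1.
    """
    n_comp = 2 ** n_bits - 1
    lsb = v_fs / 2 ** n_bits
    refs = (np.arange(1, n_comp + 1) - 2 ** (n_bits - 1)) * lsb
    if r is None:
        r = np.ones((v_in.size, n_comp))       # no chopping
    decisions = v_in[:, None] - refs[None, :] - r * offsets[None, :] > 0
    return decisions.sum(axis=1)


def sfdr_db(codes, signal_bin):
    """SFDR in dB from a coherently sampled record (no window needed)."""
    spectrum = np.abs(np.fft.rfft(codes - codes.mean())) ** 2
    signal = spectrum[signal_bin]
    spectrum[signal_bin] = 0.0                 # exclude the fundamental
    return 10.0 * np.log10(signal / spectrum.max())


rng = np.random.default_rng(1)                 # arbitrary seed
n, cycles = 4096, 1051                         # odd cycle count: coherent sampling
v_in = 0.99 * np.sin(2.0 * np.pi * cycles * np.arange(n) / n)

offsets = rng.normal(0.0, 0.020, size=63)      # sigma_off = 20 mV
r = rng.choice([-1.0, 1.0], size=(n, 63))      # independent dither per comparator

codes_plain = flash_adc(v_in, offsets)
codes_chop = flash_adc(v_in, offsets, r)
sfdr_plain = sfdr_db(codes_plain, cycles)
sfdr_chop = sfdr_db(codes_chop, cycles)
print(f"SFDR without chopping: {sfdr_plain:.1f} dB")
print(f"SFDR with chopping:    {sfdr_chop:.1f} dB")
```

Adding a linear gradient or a few large spot offsets to `offsets` gives a rough way to explore the other cases discussed in this section.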

Furthermore, as mentioned above, we assumed a large number of amplifiers in the linear region and used many over-range amplifiers. This favors optimum performance for the averaging technique. Simulating the same cases without over-range amplifiers, which increases the nonlinearity at the ends of the comparator array, we observed that although the performance of the averaging technique deteriorated considerably, chopping along with averaging could still achieve about the same performance as with over-range preamplifiers. In particular, for the case with gradient the combined technique attained an SFDR of 48 dB for $R_{2} / R_{1}=0.01$. Therefore chopping relaxes the requirements for proper termination and over-range amplifiers, contributing to lower power consumption and area.

Table I. Simulated Results for $R_{2} / R_{1}=1$.

|  | No Calibration | Averaging | Chopping and <br> Averaging |
| :---: | :---: | :---: | :---: |
| Input referred offsets with $\sigma=20 \mathrm{~mV}$ |  |  |  |
| SFDR | 42.60 dBc | 46.64 dBc | 57.07 dBc |
| SNDR | 31.09 dB | 34.92 dB | 34.07 dB |
| Input referred offsets with a Gradient <br> from -60 mV to 60 mV with $\sigma=20 \mathrm{~mV}$ |  |  |  |
| SFDR | 35.42 dBc | 36.24 dBc | 49.16 dBc |
| SNDR | 29.24 dB | 30.89 dB | 29.32 dB |
| Input referred offsets with $\sigma=20 \mathrm{~mV}$ <br> and 5 consecutive comparators have 1 V offset |  |  |  |
| SFDR | 8.76 dBc | 11.29 dBc | 20.84 dBc |
| SNDR | 6.48 dB | 8.55 dB | 13.08 dB |

Table II. Simulated Results for $R_{2} / R_{1}=0.1$.

|  | No Calibration | Averaging | Chopping and <br> Averaging |
| :---: | :---: | :---: | :---: |
| Input referred offsets with $\sigma=20 \mathrm{~mV}$ |  |  |  |
| SFDR | 42.60 dBc | 50.97 dBc | 56.91 dBc |
| SNDR | 31.09 dB | 37.33 dB | 36.28 dB |
| Input referred offsets with a Gradient <br> from -60 mV to 60 mV with $\sigma=20 \mathrm{~mV}$ |  |  |  |
| SFDR | 35.42 dBc | 36.12 dBc | 51.04 dBc |
| SNDR | 29.24 dB | 31.41 dB | 31.67 dB |
| Input referred offsets with $\sigma=20 \mathrm{~mV}$ <br> and 5 consecutive comparators have 1 V offset |  |  |  |
| SFDR | 8.76 dBc | 19.29 dBc | 31.69 dBc |
| SNDR | 6.48 dB | 14.37 dB | 18.27 dB |

Table III. Simulated Results for $R_{2} / R_{1}=0.01$.

|  | No Calibration | Averaging | Chopping and <br> Averaging |
| :---: | :---: | :---: | :---: |
| Input referred offsets with $\sigma=20 \mathrm{~mV}$ |  |  |  |
| SFDR | 42.60 dBc | 52.54 dBc | 54.88 dBc |
| SNDR | 31.09 dB | 37.66 dB | 36.61 dB |
| Input referred offsets with a Gradient <br> from -60 mV to 60 mV with $\sigma=20 \mathrm{~mV}$ |  |  |  |
| SFDR | 35.42 dBc | 35.89 dBc | 50.76 dBc |
| SNDR | 29.24 dB | 31.53 dB | 31.59 dB |
| Input referred offsets with $\sigma=20 \mathrm{~mV}$ <br> and 5 consecutive comparators have 1 V offset |  |  |  |
| SFDR | 8.76 dBc | 22.07 dBc | 35.70 dBc |
| SNDR | 6.48 dB | 18.56 dB | 20.61 dB |

Fig. 13. Spectrum Plots of the Simulated Cases for $R_{2} / R_{1}=1$.

Fig. 14. Spectrum Plots of the Simulated Cases for $R_{2} / R_{1}=0.1$.

Fig. 15. Spectrum Plots of the Simulated Cases for $R_{2} / R_{1}=0.01$.

Another advantage of combined averaging and chopping is that averaging alone needs a smaller $R_{2} / R_{1}$ ratio than the proposed technique to achieve its best possible performance. Decreasing $R_{2}$ increases the nonlinearities at the ends of the comparator array, so the termination or the over-range amplifiers become inefficient. For most of the cases a ratio greater than 0.1 was adequate for the combined chopping and averaging technique, while for averaging alone a ratio less than 0.01 was often needed to see improvement in some of the worse cases.

From Tables I, II and III above we note that chopping does not improve the SNDR beyond what averaging alone achieves, since it only converts the spurious tones into white noise without reducing the overall noise power. This is also apparent in the spectrum plots of the cases discussed above in Figures 13, 14 and 15.
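The observation that chopping redistributes the error power rather than removing it can be illustrated with a minimal numerical sketch. It assumes an idealized comparator whose fixed input-referred offset is multiplied by a random $\pm 1$ chopping sequence; the 20 mV offset, the seed and the helper name `chop_offset` are illustrative and not taken from the design.

```python
import random

def chop_offset(offset_v, n_samples, seed=0):
    """Return the offset seen at each sample when its sign is chopped at random."""
    rng = random.Random(seed)
    return [offset_v if rng.getrandbits(1) else -offset_v for _ in range(n_samples)]

offset_v = 0.020      # hypothetical 20 mV input-referred offset
n_samples = 20000
chopped = chop_offset(offset_v, n_samples)

# Without chopping the offset is a constant error (a fixed, signal-correlated term).
mean_unchopped = offset_v
power_unchopped = offset_v ** 2

# With chopping the average error tends to zero, but the error power is unchanged.
mean_chopped = sum(chopped) / n_samples
power_chopped = sum(v * v for v in chopped) / n_samples

print(f"mean:  {mean_unchopped:.6f} V -> {mean_chopped:.6f} V")
print(f"power: {power_unchopped:.3e} V^2 -> {power_chopped:.3e} V^2")
```

In this simplified picture the average error vanishes while its power stays the same, which is consistent with an improved SFDR and an essentially unchanged SNDR.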

To integrate chopping into the comparator array, special design considerations have to be made for the front-end preamplifier of the comparator and for the chopping switches, which have to be linear and very fast. Most importantly, an array of uncorrelated random number generators is needed to perform the chopping. In the next chapter we propose a novel technique to implement an array of uncorrelated truly binary random number generators working at 1 GHz.

## CHAPTER III

## HIGH SPEED ARRAY OF TRULY BINARY RANDOM GENERATORS $\dagger$

A basic and important block of the proposed analog-to-digital converter is the array of random number generators (RNGs) needed to chop the inputs of the preamplifier of each comparator independently. For this application an array of different and uncorrelated binary random number generators, working at 1 GHz, is needed.

In this chapter, we give an introduction to RNGs and present a technique to produce many high speed, uncorrelated, truly binary RNGs using minimal area and power. The technique relies on the phase noise and jitter of voltage controlled oscillators (VCOs) to generate random numbers using the oscillator sampling technique. To obtain true randomness, the frequency of oscillation of the VCOs is controlled by other RNGs in the array, increasing the jitter spread much more than in conventional designs. Parallel high speed uncorrelated random sequences are tested at 1 GHz in a CMOS $0.18 \mu \mathrm{~m}$ process.

## A. Introduction to Random Number Generators

Integrated random number generators (RNGs) are often needed in applications such as cryptography, statistical simulation, built-in self-test of digital and mixed signal systems and dynamic element matching for Digital-to-Analog Converters. Furthermore, RNGs are also needed for mobile applications where they have a tight area and power budget.

Many RNG techniques have been proposed in the literature. Pseudo RNGs such as the Linear Feedback Shift Register (LFSR) [18] have been used in mobile computing chips, but the random sequences they provide are deterministic and of low quality, making them unsuitable for cryptography. They are suitable for chopping, but an LFSR with a sequence long enough to yield spectrally pure random numbers requires more digital hardware.

[^1]

Fig. 16. RNG Technique Using Direct Sampling.

Currently, true RNGs have achievable data rates of less than 100 MHz . Chaos based RNGs provide good randomness but require complicated circuits using large area and high power while only low data rates can be obtained [19], [20], [21].

A common true RNG technique uses direct sampling of an amplified noise signal (Figure 16) [22]. In this technique the noise source, usually thermal noise from a resistor, is on the order of $\mu \mathrm{V}$, making the RNG very sensitive to signal coupling [23]. Consequently, these RNGs are affected by deterministic disturbances, need careful design techniques and are power hungry, since a very high gain, high bandwidth amplifier is needed for every RNG.

Fig. 17. Oscillator Based RNG.

Oscillator based RNGs have a clear advantage over all the above-mentioned techniques, even in the presence of sinusoidal signal coupling. Oscillator based RNGs use the timing jitter or oscillator drift found in ring oscillators as a source of randomness. Two or more oscillators are combined to produce a random bit stream. In Figure 17, a low frequency oscillator samples the output of a high frequency oscillator using a D flip-flop. The level of randomness depends on the mean frequency separation of the oscillators and the amount of achievable drift [22]. The latter depends directly on the jitter to mean period ratio of the oscillator used. If the low frequency oscillator period has a standard deviation much greater than the fast oscillator period, then the states for two successive sampling times can be considered uncorrelated and therefore the output bit stream is random in nature [23].

Another concern in our design is the achievable data rate of the output bit stream. This depends on the nature of the oscillator used in the design. The simplest oscillators are ring oscillators, which can be designed to have poor phase noise and high jitter, but experimental results have shown that in a CMOS $0.18 \mu \mathrm{~m}$ process ring oscillators have a jitter to mean period ratio lower than $10^{-4}$ [23]. Therefore, to produce a 1 Gbps random stream, at least a 10 THz oscillator is needed, which is infeasible.
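The 10 THz figure follows from a short calculation, sketched below. The sketch assumes the limiting case in which the period of the fast oscillator is just equal to the rms jitter of the 1 GHz sampling oscillator (the condition in the text asks for the jitter to be much larger, so this is a lower bound); the variable names are illustrative.

```python
# Upper bound on the jitter to mean period ratio of a ring oscillator [23].
jitter_to_period = 1e-4
# Target bit rate, equal to the frequency of the sampling (slow) oscillator.
bit_rate_hz = 1e9

sampling_period_s = 1.0 / bit_rate_hz
jitter_rms_s = jitter_to_period * sampling_period_s

# Assumption: the fast oscillator period must not exceed the rms jitter.
fast_period_max_s = jitter_rms_s
fast_freq_min_hz = 1.0 / fast_period_max_s

print(f"rms jitter of sampling clock: {jitter_rms_s * 1e12:.2f} ps")
print(f"minimum fast oscillator frequency: {fast_freq_min_hz / 1e12:.1f} THz")
```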

To overcome this problem, we introduce the novel idea of mixing several sources of jitter in order to obtain a high jitter to mean period spread. One way to achieve this is to use voltage controlled oscillators controlled by other noise sources. Since the use of resistors as noise sources needs large amplification and is sensitive to coupling, resistors are not practical for our design at 1 GHz. Moreover, for numerous applications such as built-in testing and dithering in A/D converters, there is a need to generate many RNGs in parallel with little cross-correlation between them. For this purpose, we use a random sequence generated by another RNG of the array to control the frequency and phase noise of the VCO. Parallel implementation and negligible cross-correlation are achieved through random mixing of the RNGs from the array using a simple circuit topology.

The Voltage Controlled Ring Oscillator design is described in the next section. In Section C the RNG and the additional logic needed for the design of an array are presented, while the corresponding simulation results are discussed in Section D and conclusions are drawn in Section E.

## B. Voltage Controlled Ring Oscillator

Figure 18 shows a simple ring oscillator architecture, formed by connecting an odd number of inverting gain stages in a ring. The frequency of oscillation is given by [24]:

$$
\begin{equation*}
f_{o s c}=\frac{1}{T}=\frac{1}{2 n \tau_{i n v}} \tag{3.1}
\end{equation*}
$$



Fig. 18. Ring Oscillator.
where $n$ is the number of stages and $\tau_{i n v}$ is the delay of one inverter.
Usually, a differential implementation is used to improve common mode rejection of substrate coupled noise and power supply noise, thus decreasing jitter and phase noise. For our implementation, these effects are desirable as an additional source of randomness and therefore the simple CMOS inverter was used. However, they introduce some amount of correlation in the output, which can be removed digitally using an XOR decorrelating circuit [25].

Several techniques for Voltage Controlled Ring Oscillators have been reported. A simple way to control the frequency of a ring oscillator is to vary the supply voltage, but in that case the high and low output levels vary too. Moreover, the frequency drift is not large enough. A common approach is to vary the control current as illustrated in Figure 19, but there the frequency drift is small and the oscillator output amplitude varies as well.

In [26] a ring oscillator based VCO was proposed by adding a variable resistor at the input terminal of each inverter, as shown in Figure 20. In this approach, the oscillating frequency is varied by varying the resistance. Assuming that the transconductances $G_{M}$ and parasitic capacitances $C_{G}$ of the NMOS and PMOS transistors of the inverters are equal, the delay of each inverter stage will be approximately [26]:

$$
\begin{equation*}
\tau_{i n v}=\frac{C_{G}\left(1+G_{M} R_{V}\right)}{G_{M}} \tag{3.2}
\end{equation*}
$$



Fig. 19. Conventional Voltage Controlled Ring Oscillator.


Fig. 20. Voltage Controlled Ring Oscillator Using Variable Resistors.


Fig. 21. Implementation of Voltage Controlled Ring Oscillator Using Variable Resistors.
where $R_{V}$ is the variable resistor. Substituting (3.2) into (3.1), the oscillation frequency is given by [26]:

$$
\begin{equation*}
f_{o s c}=\frac{G_{M}}{2 n C_{G}\left(1+G_{M} R_{V}\right)} \tag{3.3}
\end{equation*}
$$

With this approach a larger frequency drift can be achieved than with conventional designs.
The implementation of this circuit is shown in Figure 21. The variable resistor is implemented using complementary PMOS and NMOS transistors and the inverter is a simple CMOS inverter. The resistance of the complementary MOS switches is varied by controlling their gate voltage. This implementation was used since it provided the best frequency control and stable output levels.

In our implementation, an additional degree of randomness was obtained by the proposed novel idea of combining two VCOs for each RNG, one with three and the other with five inverter stages. Their frequencies ranged from 0.862 GHz to 1.188 GHz (period range $0.841 \mathrm{~ns}-1.159 \mathrm{~ns}$) and from 0.549 GHz to 0.751 GHz (period range $1.331 \mathrm{~ns}-1.819 \mathrm{~ns}$) respectively, when the control voltage ranged from 0 to 0.4 V.

## C. Design of the RNG Array

If the scheme of Figure 17 is used to produce samples at 1 GHz, a very high frequency oscillator is needed, which is difficult to achieve in a CMOS $0.18 \mu \mathrm{~m}$ process. Moreover, the mean frequency separation and the jitter spread of the oscillators are not enough to produce a random sequence at such high speeds.

In our implementation, two VCOs are combined to obtain a sequence with a very high jitter to mean period ratio. Their combined period spread is about 1 ns and the distribution resembles a uniform one as opposed to a Gaussian, since there are two mean periods with about 0.5 ns separation (see Figure 22). This property allows a 1 GHz random sequence to be obtained by sampling the combined VCO output at 1 GHz (1 ns period). The uniform distribution is obtained by controlling the frequency of the VCOs with complementary random sequences provided by another RNG in the array, such that their frequencies shift in opposite directions, providing sufficient jitter spread and increased phase noise.
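The quoted spread and separation can be checked against the VCO frequency ranges given in the previous section. The sketch below takes the "mean period" of each VCO to be the midpoint of its period range, which is an assumption made here for illustration; the variable names are likewise illustrative.

```python
def period_ns(freq_ghz):
    """Period in nanoseconds of a frequency given in GHz."""
    return 1.0 / freq_ghz

vco3_ghz = (0.862, 1.188)   # three-stage VCO frequency limits
vco5_ghz = (0.549, 0.751)   # five-stage VCO frequency limits

vco3_ns = sorted(period_ns(f) for f in vco3_ghz)
vco5_ns = sorted(period_ns(f) for f in vco5_ghz)

# Assumption: mean period approximated by the midpoint of each period range.
mean3_ns = sum(vco3_ns) / 2
mean5_ns = sum(vco5_ns) / 2

separation_ns = mean5_ns - mean3_ns
combined_spread_ns = vco5_ns[1] - vco3_ns[0]

print(f"3-stage periods: {vco3_ns[0]:.3f} ns to {vco3_ns[1]:.3f} ns")
print(f"5-stage periods: {vco5_ns[0]:.3f} ns to {vco5_ns[1]:.3f} ns")
print(f"separation of mean periods: {separation_ns:.3f} ns")
print(f"combined period spread: {combined_spread_ns:.3f} ns")
```

Under this midpoint assumption the separation comes out slightly above 0.5 ns and the combined spread slightly below 1 ns, in line with the approximate values stated in the text.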

The design of a typical RNG in the array is illustrated in Figure 23a. The outputs of the two VCOs are followed by two inverters acting as buffers to reduce the rise and fall times. The XOR operation combines the two random sequences to yield an output of higher statistical quality. The XOR operation is preferred due to the equal probability of 0s and 1s at the output. The output of the XOR gate is sampled at a 1 GHz clock rate by a D flip-flop to yield the random output bit stream. Finally, an XOR gate is used in the configuration shown in Figure 23b to remove any possible correlation between consecutive bits [25].

Fig. 22. Period Spread of the Two VCOs.

Fig. 23. (a) The Novel RNG Implementation. Many such RNGs are used in parallel to generate an array, (b) Decorrelating XOR and Array Implementation.

To produce the controlling signals for the VCOs in each RNG, we use the Q and Q' outputs of other RNGs in the array (Figure 23b). To increase the rise and fall times of the controlling signals, so that they are uniformly distributed, we used small inverters before feeding them to the control terminals. A voltage divider was used after the inverters to change the range from 0 to 1.8 V to 0 to 0.4 V.
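The structure of one RNG can be sketched with a simple phase-domain behavioral model. This is not the circuit-level simulation used for the results: it assumes ideal square-wave VCOs that switch instantly between their two frequency limits, a small Gaussian phase jitter (`jitter_cycles`, a hypothetical value), and Python's pseudo-random generator as a stand-in for the Q output of the neighbouring RNG. The decorrelating XOR of Figure 23b is omitted, so the model is not expected to reproduce the measured statistics.

```python
import random

T_S_NS = 1.0              # sampling period of the 1 GHz clock
F3_GHZ = (0.862, 1.188)   # three-stage VCO frequency limits quoted in the text
F5_GHZ = (0.549, 0.751)   # five-stage VCO frequency limits quoted in the text

def sample_rng(n_bits, seed=1, jitter_cycles=0.01):
    """Phase-domain model of one RNG: two VCOs, XOR, sampling at 1 GHz."""
    rng = random.Random(seed)
    phase3, phase5 = rng.random(), rng.random()   # arbitrary start phases in cycles
    bits = []
    for _ in range(n_bits):
        ctrl = rng.getrandbits(1)     # stand-in for the Q output of another RNG
        f3 = F3_GHZ[ctrl]             # Q shifts the three-stage VCO one way
        f5 = F5_GHZ[1 - ctrl]         # Q' shifts the five-stage VCO the other way
        phase3 += f3 * T_S_NS + rng.gauss(0.0, jitter_cycles)
        phase5 += f5 * T_S_NS + rng.gauss(0.0, jitter_cycles)
        out3 = (phase3 % 1.0) < 0.5   # square wave, high during the first half cycle
        out5 = (phase5 % 1.0) < 0.5
        bits.append(int(out3 != out5))  # XOR of both VCOs, sampled by the flip-flop
    return bits

bits = sample_rng(20000)
ones_fraction = sum(bits) / len(bits)
print(f"fraction of ones: {ones_fraction:.3f}")
```

The model only illustrates how the complementary control moves the two VCO frequencies in opposite directions before the XOR and the sampling flip-flop; any statement about statistical quality requires the circuit simulation results reported in the next section.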

Each RNG consumes about 0.6 mW and occupies negligible area compared to the rest of the design. This is also an advantage over previous designs.

## D. Results

Three RNGs were simulated in Cadence for the results shown in Table IV. The output of the first RNG (RNG1) was fed to the controlling terminals of the second RNG (RNG2); similarly, the third RNG (RNG3) used the output of RNG2 as its controlling signals, and its output was fed to RNG1. Sequences of 20000 samples were generated. The statistical averages were calculated using the 1 and -1 representation so that they are easier to compare. Along with the statistical averages, the three RNGs were validated using the FIPS 140-1 tests [27]. The FIPS 140-1 tests are provided by the National Institute of Standards and Technology in accordance with the security requirements for cryptographic modules and are explained below.

- The Monobit Test

1. Count the number of ones in the 20000 bit stream. Denote this quantity by $X$.
2. The test is passed if $9654<X<10346$.

- The Poker Test

1. Divide the 20000 bit stream into 5000 contiguous 4 bit segments. Count and store the number of occurrences of each of the 16 possible 4 bit values. Denote $f(i)$ as the number of each 4 bit value $i$ where $0 \leq i \leq 15$.
2. Evaluate the following:
$X=(16 / 5000) *\left(\sum_{i=0}^{15}[f(i)]^{2}\right)-5000$
3. The test is passed if $1.03<X<57.4$.

- The Runs Test

1. A run is defined as a maximal sequence of consecutive bits of either all ones or all zeros, which is part of the 20000 bit sample stream. The incidences of runs (for both consecutive zeros and consecutive ones) of all lengths ( $\geq$ 1) in the sample stream should be counted and stored.
2. The test is passed if the number of runs that occur (of lengths 1 through 6 ) is each within the corresponding interval specified below. This must hold for both the zeros and ones; that is, all 12 counts must lie in the specified interval. For the purpose of this test, runs of greater than 6 are considered to be of length 6 .

| Length of Run | Required Interval |
| :---: | :---: |
| 1 | $2267-2733$ |
| 2 | $1079-1421$ |
| 3 | $502-748$ |
| 4 | $223-402$ |
| 5 | $90-223$ |
| $6+$ | $90-223$ |

- The Long Run Test

1. A long run is defined to be a run of length 34 or more (of either zeros or ones).
2. On the sample of 20000 bits, the test is passed if there are NO long runs.
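The four tests listed above can be sketched in a few functions. This is a minimal illustration written from the description in this section, not the code used to produce Table IV; the function names are illustrative, the run intervals are treated as inclusive, and Python's pseudo-random generator is used only as an example input.

```python
import random
from itertools import groupby

N_BITS = 20000
# Required interval per run length; key 6 stands for "6 or more".
RUN_INTERVALS = {1: (2267, 2733), 2: (1079, 1421), 3: (502, 748),
                 4: (223, 402), 5: (90, 223), 6: (90, 223)}

def monobit_test(bits):
    ones = sum(bits)
    return 9654 < ones < 10346

def poker_test(bits):
    counts = [0] * 16
    for i in range(0, N_BITS, 4):
        nibble = 8 * bits[i] + 4 * bits[i + 1] + 2 * bits[i + 2] + bits[i + 3]
        counts[nibble] += 1
    x = (16 / 5000) * sum(c * c for c in counts) - 5000
    return 1.03 < x < 57.4

def run_lengths(bits):
    """Yield (bit value, run length) for every maximal run."""
    for value, group in groupby(bits):
        yield value, sum(1 for _ in group)

def runs_test(bits):
    counts = {(value, length): 0 for value in (0, 1) for length in RUN_INTERVALS}
    for value, length in run_lengths(bits):
        counts[(value, min(length, 6))] += 1   # runs longer than 6 count as 6
    return all(RUN_INTERVALS[length][0] <= count <= RUN_INTERVALS[length][1]
               for (value, length), count in counts.items())

def long_run_test(bits):
    return all(length < 34 for _, length in run_lengths(bits))

def fips_140_1(bits):
    if len(bits) != N_BITS:
        raise ValueError("the FIPS 140-1 tests are defined on 20000 bits")
    return {"monobit": monobit_test(bits), "poker": poker_test(bits),
            "runs": runs_test(bits), "long_run": long_run_test(bits)}

# Example input: a software pseudo-random generator stands in for an RNG output.
prng = random.Random(2005)
example_bits = [prng.getrandbits(1) for _ in range(N_BITS)]
print(fips_140_1(example_bits))
```

A strictly alternating sequence is a useful sanity check: it passes the monobit and long run tests but fails the poker and runs tests, since only one 4 bit value and only runs of length 1 occur.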

Table IV shows that the mean and $\sigma$ values are close to ideal and that there is no significant autocorrelation (correlation between the sequence and its version delayed by 1, 2, 3 or 4 samples). Moreover, nearly all of the FIPS 140-1 tests [27], which give a measure of the quality of the random number generators, were passed; the only exceptions in Table IV are the counts of length-2 runs of ones for RNG1 (1443) and RNG2 (1425), which lie slightly above the upper limit of 1421. The quality of the RNG is comparable to that of existing schemes reported in the literature, with a 10X improvement in speed. The results also indicate little or no cross-correlation between the RNGs in the array.
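The statistical averages in Table IV can be computed along the following lines. The exact estimators used for the table are not stated, so the definitions below (population standard deviation and the plain average of lagged products in the $\pm 1$ representation) are assumptions, and the function names are illustrative. The last lines check that the mean and $\sigma$ reported for RNG1 are consistent with its monobit count.

```python
import math

def to_bipolar(bits):
    """Map bits {0, 1} to the {-1, +1} representation."""
    return [2 * b - 1 for b in bits]

def mean(xs):
    return sum(xs) / len(xs)

def sigma(xs):
    m = mean(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / len(xs))

def correlation(xs, ys, lag=0):
    """Average of xs[i] * ys[i + lag]; with ys = xs this is the autocorrelation."""
    n = len(xs) - lag
    return sum(xs[i] * ys[i + lag] for i in range(n)) / n

# Consistency check on Table IV, RNG1: 9770 ones out of 20000 bits.
ones, n_bits = 9770, 20000
mean_rng1 = 2 * ones / n_bits - 1
sigma_rng1 = math.sqrt(1 - mean_rng1 ** 2)
print(f"mean = {mean_rng1:.4f}, sigma = {sigma_rng1:.4f}")
```

For a $\pm 1$ sequence the mean follows directly from the number of ones, and $\sigma=\sqrt{1-\text{mean}^{2}}$, which reproduces the RNG1 entries of -0.0230 and 0.9997.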

## E. Conclusions

In this chapter a technique for producing many uncorrelated random bit streams at high speed was presented. The technique is based on the jitter and phase noise of VCOs implemented using simple ring oscillators. The overall circuit provides true random bit streams at 1 GHz of good statistical quality at low cost, small area and low power consumption. The results show negligible cross-correlation between the different RNGs in the array and that the technique passes most of the FIPS 140-1 tests. The circuit was implemented in a CMOS $0.18 \mu \mathrm{~m}$ process.

Table IV. Statistical Averages and FIPS 140-1 Tests.

| Test | Pass Range | RNG1 | RNG2 | RNG3 |
| :---: | :---: | :---: | :---: | :---: |
| Mean |  | -0.0230 | -0.0278 | -0.0203 |
| $\sigma$ |  | 0.9997 | 0.9996 | 0.9998 |
| Autocorrelation 1 |  | -0.0113 | -0.0081 | 0.0089 |
| Autocorrelation 2 |  | -0.0242 | -0.0180 | -0.0269 |
| Autocorrelation 3 |  | 0.0394 | 0.0491 | 0.0372 |
| Autocorrelation 4 |  | 0.0110 | -0.0056 | 0.0079 |
| Monobit | $9654-10346$ | 9770 | 9722 | 9797 |
| Poker | $1.03-57.4$ | 41.72 | 36.56 | 28.83 |
| Runs1 0s | $2267-2733$ | 2460 | 2469 | 2375 |
| Runs1 1s | $2267-2733$ | 2532 | 2522 | 2403 |
| Runs2 0s | $1079-1421$ | 1383 | 1343 | 1336 |
| Runs2 1s | $1079-1421$ | 1443 | 1425 | 1370 |
| Runs3 0s | $502-748$ | 569 | 567 | 620 |
| Runs3 1s | $502-748$ | 551 | 572 | 664 |
| Runs4 0s | $223-402$ | 284 | 285 | 268 |
| Runs4 1s | $223-402$ | 249 | 244 | 255 |
| Runs5 0s | $90-223$ | 190 | 197 | 177 |
| Runs5 1s | $90-223$ | 159 | 149 | 133 |
| Runs6+ 0s | $90-223$ | 170 | 179 | 180 |
| Runs6+ 1s | $90-223$ | 123 | 129 | 131 |
| Long Run | 0 | 0 | 0 | 0 |
|  |  | RNG1,RNG2 | RNG1,RNG3 | RNG2,RNG3 |
| Cross-correlation |  | -0.0038 | -0.0057 | 0.0089 |

## CHAPTER IV

## DESIGN AND IMPLEMENTATION

The 1GSample/s 6-bit Flash A/D Converter with the Combined Chopping and Averaging Technique described has been implemented in a CMOS $0.18 \mu \mathrm{~m}$ process. In this chapter the building blocks of the ADC (Figure 24) are discussed.

The analog part of the 6-bit flash ADC consists of 63 comparator slices (plus an additional 12 dummy slices for averaging termination purposes), each one using a 4-input chopped differential preamplifier, a second preamplifier and two latches. In addition, a Sample and Hold is used.

The digital part includes Bubble Correction, a Gray Encoder, a Gray to Binary Decoder and an array of Random Number Generators to perform chopping at the input of the 4-input preamplifier.

## A. Sample and Hold

In flash ADCs, although the input is sampled directly at the comparators, it is essential to use a Sample-and-Hold (S/H) at the input, especially for high speed sampling rates. The $\mathrm{S} / \mathrm{H}$ improves the dynamic performance of the ADC because by holding the analog input unchanged during one sampling cycle, the errors due to skews in clock delivery, limited input bandwidth prior to latch regeneration, signal dependent dynamic nonlinearity and aperture jitter are removed [5], [8].

For the intended application and for 6-bit resolution a sample-and-hold with an SFDR of at least 55 dBc is needed. Closed loop configurations provide good linearity and dynamic range but cannot achieve very high speeds. In order to operate at 1 GHz , the $\mathrm{S} / \mathrm{H}$ has to be a simple open-loop configuration. The best solution is to use a simple NMOS switch connected to a sampling capacitor, followed by a buffer.


Fig. 24. Block Diagram of the Proposed ADC.


Fig. 25. Sample and Hold Implementation.

In Figure 25 the implementation of the Sample-and-Hold is shown. For the intended speed complementary switches cannot be used. We use an NMOS switch because it is fast enough, but the common mode has to be very low in order to achieve high linearity and dynamic range. For this reason the common mode at the input is 300 mV for a $1 V_{p-p}$ differential input (each single-ended signal lies in the range 50 mV to 550 mV). The NMOS switch is followed by a dummy switch driven by the complementary sampling clock. The dummy switch is mainly used to decrease the charge injection and lower the common mode jump of the active switch [5], [28]. Since both the source and drain of the dummy switch are connected to the hold node, the size of the dummy NMOS is chosen to be half the size of the NMOS switch and then tuned through simulations. Finally, a PMOS source follower with the source tied to the well, to eliminate the nonlinear body effect, is used as a buffer. The transistors are sized so that the sample-and-hold can drive the capacitive load of the following comparator stages (preamplifiers and chopping switches), which is approximately 2.6 pF for each single-ended branch. The PMOS source follower raises the common mode to around 1 V.

The Sample-and-Hold was simulated at a 1 GHz sampling rate with an input of $F_{\text {in }}=256.8359375 \mathrm{MHz}$. The circuit level and postlayout simulations are shown in Figures 26 and 27, respectively. The Sample-and-Hold has an SFDR of about 60 dBc and an SNDR of about 59 dB in both cases. The power consumption is about 36 mW.
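The unusual input frequency is consistent with coherent sampling for the 1024-point FFT used in Figures 26 and 27, although the text does not state this explicitly. The sketch below checks that the FFT record contains an integer number of input cycles and that this number shares no common factor with the record length; exact rational arithmetic is used to avoid rounding, and the variable names are illustrative.

```python
from fractions import Fraction
from math import gcd

fs_hz = Fraction(1_000_000_000)      # 1 GHz sampling rate
fin_hz = Fraction("256835937.5")     # 256.8359375 MHz input frequency
n_fft = 1024                         # FFT length

# Number of input cycles contained in one FFT record.
cycles = fin_hz * n_fft / fs_hz

is_coherent = cycles.denominator == 1
is_coprime = is_coherent and gcd(cycles.numerator, n_fft) == 1

print(f"cycles per record: {cycles}")
print(f"integer number of cycles: {is_coherent}")
print(f"cycles and FFT length coprime: {is_coprime}")
```

With an integer, coprime number of cycles the input tone falls exactly on one FFT bin and every sample in the record has a distinct phase, so no window is needed to read SFDR and SNDR from the spectrum.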

## B. Comparator Design ${ }^{\dagger}$

In Figure 28 the comparator's architecture level design is shown. The reported comparator has superior performance at speeds as high as 1.25 GHz. The main components of the design that merit special consideration are the preamplifier, the random number generators and the switches used for chopping. The remaining blocks, the $2^{\text {nd }}$ preamplifier and the latches, are typical designs for high speed comparators found in the literature [5], [8].

[^2]

Fig. 26. Sample and Hold Dynamic Performance (Circuit Level Simulation) with $F_{\text {in }}=256.8359375 \mathrm{MHz}$ at 1 GHz Sampling, 1024pt FFT.

Fig. 27. Sample and Hold Dynamic Performance (Postlayout Simulation) with $F_{\text {in }}=256.8359375 \mathrm{MHz}$ at 1 GHz Sampling, 1024pt FFT.

Fig. 28. Chopped Comparator.

Averaging was used in the $1^{\text {st }}$ preamplifier only, as shown in Figure 28, because of the relaxed requirements due to chopping. Also chopping is performed only on the $1^{s t}$ preamplifier, otherwise it would affect the averaging stage.

## 1. Preamplifier

In order to achieve high conversion speed, a dual differential amplifier was used, as shown in Figure 29 [5], [8]. The advantage of this architecture is that the comparison of the input and reference voltages is continuous, so there is no need to first sample the difference between the input and reference voltages, which would increase the delay. Moreover, it requires fewer switches and consequently reduces the switching noise in the system.

This architecture amplifies the differences $\mathrm{V}_{\text {in+ }}-\mathrm{V}_{\text {ref+ }}$ and $\mathrm{V}_{\text {ref- }}-\mathrm{V}_{\text {in- }}$ respectively and then adds them together. The sources of offset in the preamplifier are the mismatch between the input transistors of each differential pair and the mismatch between the two differential pairs. Compared to the two-input preamplifier design, the offset contribution due to mismatch is increased by a factor of $\sqrt{2}$ [8].

Fig. 29. $1^{\text {st }}$ Preamplifier (4-input).

Fig. 30. Chopped 4-input Preamplifier.

The chopping for this preamplifier is more complicated than for the preamplifier in Figure 12 because of the four-input structure. We use two different random sequences to completely randomize the offset. In Figure 30 one random sequence (r1) simultaneously chops $\mathrm{V}_{\text {in+ }}$ with $\mathrm{V}_{\text {ref+ }}$ and $\mathrm{V}_{\text {ref- }}$ with $\mathrm{V}_{\text {in- }}$, randomizing the offset due to the mismatch of the input transistors of each differential pair. The output of the preamplifier is inverted and hence the same sequence is used to chop the output. The other random sequence (r2) simultaneously chops $\mathrm{V}_{\text {in+ }}$ with $\mathrm{V}_{\text {ref- }}$ and $\mathrm{V}_{\text {ref+ }}$ with $\mathrm{V}_{\text {in- }}$, thereby randomizing the offset due to mismatch between the pairs. This chopping does not affect the output of the preamplifier and hence does not require chopping again with this sequence (r2) at the output.

A reset switch is inserted between the two output nodes as shown in Figure 29 [5]. When the preceding sample-and-hold circuit is in track mode the reset switch is turned on to erase the residual voltage from the previous sample and eliminate hysteresis due to memory. The switch is off when the preceding sample-and-hold is in the hold mode. The chopping occurs during the reset cycle and therefore does not affect the settling of the amplifier's output.

The reset switch relaxes the Gain and Gain Bandwidth (GBW) requirement of the amplifier [5]. The higher the DC gain the lower the required GBW. An increase in the load resistors (diode connected PMOS loads) to increase the voltage gain will lower the output common mode, which can cause the input transistors to enter the triode region. In this design the DC gain is 3 and the GBW is 2.8 GHz . The output common mode is about 1V again to ease the design of the next amplification stage. The reported 4 -input preamplifier consumes about 0.2 mW .

## 2. $2^{\text {nd }}$ Preamplifier and $1^{\text {st }}$ Latch

In order to overcome the dynamic offset in the regenerative latch a second preamplifier is needed. The $2^{\text {nd }}$ preamplifier can be combined with a latch as shown in Figure 31. The $2^{\text {nd }}$ preamplifier and the $1^{\text {st }}$ latch consist of an input differential pair and a latch pair respectively, both sharing diode connected PMOS loads [8].

The $2^{\text {nd }}$ preamplifier reduces the offset of the latch as seen at the input by its gain. The major speed limitation of the $2^{\text {nd }}$ preamplifier and the $1^{\text {st }}$ latch is overdrive recovery. As in the previous preamplifier design, reset switches are inserted to overcome this problem. The CLK signal is the same clock used to reset the $1^{\text {st }}$ preamplifier and CLKN is its complementary clock. In reset mode, CLK is low and the output amplifies the input signal, while a shorting switch at the latch erases the memory of previous decision [5].

## 3. $2^{\text {nd }}$ Latch

In [5] a high speed latch has been reported, which is shown in Figure 32. The circuit uses a single phase clock (CLK) and provides rail-to-rail output to the SR latch. The $2^{\text {nd }}$ latch regenerates the output of the $1^{\text {st }}$ latch further for increased speed.

As in the previous cases of the $1^{\text {st }}$ preamplifier and of the $2^{\text {nd }}$ preamplifier and $1^{\text {st }}$ latch, the main speed limitation is overdrive recovery. A fast reset scheme is introduced by resetting the output through two parallel discharge paths. In the next half clock cycle, during regeneration, a pair of cross coupled CMOS inverters steers the tail current from one side to the other, providing fast regeneration [5].

Fig. 31. $2^{\text {nd }}$ Preamplifier and $1^{\text {st }}$ Latch.

Fig. 32. $2^{\text {nd }}$ Latch.

Table V. Truth Table for the NOR Based SR Latch.

| $\mathbf{S}$ | $\mathbf{R}$ | $\mathbf{Q}$ | $\mathbf{Q N}$ |
| :---: | :---: | :---: | :---: |
| 0 | 0 | Q (latch) | QN (latch) |
| 0 | 1 | 0 | 1 |
| 1 | 0 | 1 | 0 |
| 1 | 1 | 0 | 0 |

## 4. SR Latch

The SR Latch used is shown in Figure 33. It is a simple NOR gate based SR flip-flop (see Figure 34). Its truth table is given in Table V.
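The entries of Table V can be reproduced with a small gate-level sketch of two cross-coupled NOR gates. The model is purely logical (no delays or metastability); both gates are updated simultaneously until the outputs settle, and the function names are illustrative.

```python
def nor(a, b):
    """Two-input NOR gate on 0/1 values."""
    return int(not (a or b))

def sr_nor_latch(s, r, q, qn):
    """Return the settled (Q, QN) of a NOR based SR latch.

    q and qn are the outputs before S and R are applied.
    """
    for _ in range(4):                 # a few iterations are enough to settle
        q_next = nor(r, qn)
        qn_next = nor(s, q)
        if (q_next, qn_next) == (q, qn):
            break
        q, qn = q_next, qn_next
    return q, qn

# Apply every input combination starting from the stored state Q=1, QN=0.
for s, r in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    q, qn = sr_nor_latch(s, r, q=1, qn=0)
    print(f"S={s} R={r} -> Q={q} QN={qn}")
```

With S = R = 0 the latch keeps its previous state, and with S = R = 1 both outputs are forced low, as listed in the last row of Table V.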

## 5. Chopping Switches

Fast and linear switches are needed for 1 GHz operation. The applied signals are $1 \mathrm{Vp}-\mathrm{p}$ differential (or $0.5 \mathrm{Vp}-\mathrm{p}$ single ended) with a common mode of 1 V. NMOS or PMOS switches alone are not linear enough for the given input common mode (which is usually around the middle of the supply voltage). On the other hand, complementary switches are not fast enough and are not suitable for 1 GHz operation. Conventional bootstrapping techniques cannot be used because of their area overhead.

Therefore, we use a simple boosting technique proposed in [29] as shown in Figure 35. The NMOS transistor is a quasi-floating-gate transistor used as an analog switch. It has its gate weakly tied to one given voltage (Vboost) through a large resistor implemented by a PMOS transistor as shown in Figure 35. The gate of the switch is coupled to the clock signal through a small capacitor. The capacitor performs a level shift of approximately Vboost.


Fig. 33. SR Latch with NOR Configuration.


Fig. 34. SR Flip-Flop with NOR Configuration.

In our design a boost voltage of 1 V is used. The voltage applied at the gate of the switch therefore ranges from 1 V to 2.8 V, and $\mathrm{V}_{G S}$ is always below 0.25 V in the OFF state and always higher than 1.55 V in the ON state. This provides sufficient voltage overdrive for a low on-resistance of the switch during the on-phase.
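The gate-source limits quoted above follow from the signal range and the level-shifted clock. The sketch below assumes an ideal level shift equal to the boost voltage, a clock swinging between 0 V and the 1.8 V supply, and a source terminal that sits at the single-ended signal voltage; the variable names are illustrative.

```python
v_boost = 1.0                      # boost voltage (V)
v_clk_low, v_clk_high = 0.0, 1.8   # assumed clock levels (V)
v_cm = 1.0                         # signal common mode (V)
v_swing_se = 0.5                   # single-ended peak-to-peak swing (V)

# Gate voltage after an ideal level shift of v_boost.
v_gate_off = v_boost + v_clk_low
v_gate_on = v_boost + v_clk_high

# Range of the signal seen at the source of the switch.
v_sig_min = v_cm - v_swing_se / 2
v_sig_max = v_cm + v_swing_se / 2

# Worst cases: largest V_GS while off, smallest V_GS while on.
vgs_off_max = v_gate_off - v_sig_min
vgs_on_min = v_gate_on - v_sig_max

print(f"gate voltage: {v_gate_off:.2f} V to {v_gate_on:.2f} V")
print(f"V_GS (off) <= {vgs_off_max:.2f} V, V_GS (on) >= {vgs_on_min:.2f} V")
```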

## 6. Averaging

The MATLAB simulations for chopping and averaging combined show that the averaging resistor should be around 0.1 to 1 times the resistive load of the amplifier. The smaller the averaging resistor the stronger the averaging effect, but the nonlinearities at the ends of the array increase. Moreover, decreasing the averaging resistor increases the preamplifier settling time due to the decrease in the amplifier gain. In [30] it is shown that when $R_{2} / R_{1}=0.1$ the error correction is close to optimum, but it is advisable to choose a slightly higher ratio to prevent a steep rolloff in the signal gain [30]. In this design a ratio of $R_{2} / R_{1}=0.25$ is chosen.


Fig. 35. Chopping Switch.

The PMOS loads of the $1^{\text {st }}$ preamplifier have a resistance of $80 \mathrm{k} \Omega$ and the averaging resistors were selected to be $20 \mathrm{k} \Omega$.

In addition, [30] presents an extensive analysis of the optimum number of dummy preamplifiers. The optimum is achieved when the dummy preamplifiers are $1 / 3$ of the total preamplifiers (active + dummies). But even when this fraction is in the range of $1 / 6$ to $1 / 2$ the averaging result is close to optimum. This means that even when the dummy preamplifiers are as few as $1 / 6$ of the total number of preamplifiers the nonlinearities are sufficiently removed.

Furthermore, in differential systems it is possible to terminate the averaging network using cross-connection (Moebius band averaging compensation) as shown in Figure 36 [8], [30], which improves the nonlinearity compensation. The averaging network in this case behaves like an infinite resistive array, since one edge cross-connects to the other edge. However, even in this case a few dummy preamplifiers are needed [30].

In this design Moebius band averaging compensation has been used, with 12 dummy preamplifiers (6 on each edge). The number of dummy preamplifiers used is therefore $\frac{12}{63+12}=\frac{1}{6.25}$ of the total number of preamplifiers.

## 7. Array of Random Number Generators

With the preamplifier architecture described, two random number generators (RNGs) are needed for each comparator to enable the chopping algorithm. Assuming that only 20 preamplifiers are in the linear region, 40 RNGs would be enough for the whole ADC. For symmetry reasons we used an array of 42 RNGs, such that every RNG is used in exactly 3 comparators. For example, RNG1 and RNG2 are used for chopping comparators 1, 22 and 43, and so on.
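One assignment that is consistent with this description is sketched below. The text gives only the first group (comparators 1, 22 and 43 sharing RNG1 and RNG2); extending it with a modulo-21 pattern to the remaining comparators is an assumption, and the function name `rng_pair` is illustrative.

```python
from collections import Counter

N_COMPARATORS = 63
N_RNGS = 42
N_PAIRS = N_RNGS // 2      # 21 pairs, two RNGs per comparator

def rng_pair(comparator):
    """Return the 1-based RNG indices chopping a 1-based comparator index."""
    pair = (comparator - 1) % N_PAIRS
    return 2 * pair + 1, 2 * pair + 2

# Count how many comparators each RNG drives.
usage = Counter()
for comparator in range(1, N_COMPARATORS + 1):
    usage.update(rng_pair(comparator))

print("comparators 1, 22, 43 use:", rng_pair(1), rng_pair(22), rng_pair(43))
print("RNGs used:", len(usage), "uses per RNG:", set(usage.values()))
```

With this pattern, comparators sharing a pair of RNGs are 21 positions apart, which exceeds the roughly 20 preamplifiers assumed to be in the linear region at any time.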

As mentioned before, each RNG consumes about 0.6 mW and occupies negligible area compared to the rest of the design.


Fig. 36. Cross Connection in Differential Averaging Resistive Networks (Moebius Band Averaging Compensation).


Fig. 37. Comparator Overdrive Recovery.
## 8. Comparator Performance

The proposed comparator can successfully resolve differences of 15 mV (6-bit resolution for $1 \mathrm{Vp}-\mathrm{p}$ input swing) at 1.25 GHz . The comparator without the 2 RNGs consumes about 1.8 mW and in total consumes (with RNGs) about 3 mW .
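The figures quoted above can be cross-checked with a few lines of arithmetic. The sketch assumes an ideal 6-bit quantizer spanning the 1 Vp-p differential input, so that one LSB is the full scale divided by $2^{6}$, and takes the 0.6 mW per RNG from the RNG section; the variable names are illustrative.

```python
full_scale_v = 1.0        # differential input swing (Vp-p)
n_bits = 6

# One LSB of an ideal 6-bit quantizer.
lsb_v = full_scale_v / 2 ** n_bits

p_comparator_mw = 1.8     # comparator without RNGs
p_rng_mw = 0.6            # one RNG
n_rngs = 2                # two chopping sequences per comparator

p_total_mw = p_comparator_mw + n_rngs * p_rng_mw

print(f"1 LSB = {lsb_v * 1e3:.3f} mV")
print(f"comparator with RNGs: {p_total_mw:.1f} mW")
```

One LSB is about 15.6 mV, so resolving a 15 mV difference corresponds to slightly better than 6-bit resolution, and the power of the comparator with its two RNGs adds up to the quoted 3 mW.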

To test the dynamic performance of the comparator, an overdrive recovery test was performed. Figure 37 shows the overdrive recovery test at a conversion speed of 1.25 GHz. In this test, the input changes randomly from 500 mV to $-15 \mathrm{mV}$ ($+\mathrm{V}_{\text {full-scale }}$ to $-1$ LSB for a $1 \mathrm{Vp}-\mathrm{p}$ input swing and 6-bit resolution). The plot shows that the comparator switches polarity correctly for the appropriate inputs.

The chopped comparator presented in this section can be used in many other applications and other Analog-to-Digital architectures. Offsets at the comparator are randomized, giving a close-to-zero average offset. The offsets only appear as white noise in the output power spectral density. Due to chopping, the proposed comparator is expected to provide designers with a much higher dynamic range in any $A / D$ converter system that suffers from nonlinearity due to comparator offsets.

## C. Resistor Ladder

In a flash Analog-to-Digital Converter the reference voltages for all the comparators are generated from a resistor ladder. In the case of a continuous time system, like this design, the input signal and the reference voltage are connected directly to the differential pair of the $1^{\text {st }}$ preamplifier. The differential pair couples the input signal and the reference voltage applied to the gate terminals through the $C_{g s}$ capacitance of each MOS transistor, since the source terminals of the transistors are tied together. To avoid significant reference ladder feedthrough, the maximum impedance of the reference ladder has to be calculated.

For a single differential pair the capacitance between the input signal and the ladder is equal to $\frac{1}{2} C_{g s}$, [8]. With $n_{s}$ stages in parallel the total input capacitance loading the ladder becomes:

$$
\begin{equation*}
C_{\text {intot }}=n_{s} \frac{1}{2} C_{g s} \tag{4.1}
\end{equation*}
$$

The signal feedthrough is largest at the middle of the ladder. A good estimate of the maximum ladder resistance is given in [8]:

$$
\begin{equation*}
R_{\text {laddermax }}=\frac{4 \frac{V_{\text {mid }}}{V_{\text {in }}}}{\pi f_{\text {in }} C_{\text {intot }}}=\frac{4 \Phi}{\pi 2^{n} f_{\text {in }} C_{\text {intot }}} \tag{4.2}
\end{equation*}
$$

where $\Phi$ determines the amount of input signal feedthrough in LSBs, $f_{i n}$ the maximum input frequency and $n$ the number of bits.

For the 4-input preamplifier there are two choices for the resistor ladder. One is to use a single ladder, in which case $n_{s}$ in Eq. 4.1 doubles; the other is to use two separate ladders, one for each input pair of the 4-input preamplifier, with the same High and Low taps. In both cases the total resistance calculated from Eq. 4.2 is the same. In this design two separate ladders were used to ease the layout.

The $C_{g s}$ of each input transistor of the 4 -input preamplifier is around 15 fF . With $n_{s}=75, \Phi=1 L S B$, and $f_{i n}=500 \mathrm{MHz}$, Eq. 4.1 and Eq. 4.2 give:

$$
\begin{equation*}
R_{\text {laddermax }}=70 \Omega \tag{4.3}
\end{equation*}
$$

Two resistor ladders of $70 \Omega$ each were used, with $V_{\text {refHi }}-V_{\text {refLo }}=586 \mathrm{mV}$ (0.5 V across the 63 comparators and the remainder across the dummy ones). In the designed chip the High and Low voltage taps are chip inputs so that they can be trimmed externally. The resistor ladders consume 17 mW in total.
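The arithmetic behind Eq. 4.3 can be retraced with a short sketch that evaluates Eq. 4.1 and Eq. 4.2 using the values quoted above ($n_s = 75$, $C_{gs} = 15$ fF, $\Phi = 1$ LSB, $n = 6$, $f_{in} = 500$ MHz). The function names are illustrative only.

```python
import math

def ladder_input_capacitance(n_stages, c_gs):
    """Eq. 4.1: total capacitance loading the ladder, in farads."""
    return n_stages * 0.5 * c_gs

def max_ladder_resistance(phi_lsb, n_bits, f_in, c_in_tot):
    """Eq. 4.2: largest ladder resistance for a feedthrough of phi_lsb LSBs."""
    return 4.0 * phi_lsb / (math.pi * 2**n_bits * f_in * c_in_tot)

c_in_tot = ladder_input_capacitance(n_stages=75, c_gs=15e-15)
r_max = max_ladder_resistance(phi_lsb=1.0, n_bits=6, f_in=500e6, c_in_tot=c_in_tot)
print(f"C_intot     = {c_in_tot * 1e15:.1f} fF")
print(f"R_laddermax = {r_max:.1f} ohm")
```

The result is slightly above $70 \Omega$, consistent with the rounded value in Eq. 4.3.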

## D. Clock Generator

The clock generator that provides the complementary clocks CLK and CLKN at 1 GHz is shown in Figure 38. Its input is an off-chip differential sine wave at 1 GHz, swinging from 0 V to 1.8 V ($1.8 \mathrm{~V}_{p-p}$) around a 0.9 V common mode. CMOS inverters convert the sine wave into a square wave. The inverters are sized according to the clock loads summarized in Table VI, so that the rise and fall times are less than 120 ps.

Fig. 38. Clock Generator.

Table VI. Clock Loads.

|  | CLK | CLKN |
| :---: | :---: | :---: |
| S/H | 1072 fF | 360 fF |
| 1st Preamplifier | 24 fF | 0 fF |
| 2nd Preamplifier | 0 fF | 36 fF |
| 1st Latch | 12 fF | 6 fF |
| 2nd Latch | 12 fF | 0 fF |
| Total Comparator | 48 fF | 42 fF |
| Total Slices | 3024 fF | 2646 fF |
| RNG Array | 0 fF | 672 fF |
| Total | 4.1 pF | 3.7 pF |

Clock jitter degrades the SNR of the ADC. The SNR of an ideal sampling system limited only by clock jitter is given by [31]:

$$
\begin{equation*}
\mathrm{SNR}=-20 \log \left(2 \pi f_{i n} \delta T_{r m s}\right) \tag{4.4}
\end{equation*}
$$

where $\delta T_{r m s}$ is the rms clock jitter and $f_{i n}$ the maximum input frequency. The ideal SNR of a 6-bit ADC is about 38 dB. To obtain that SNR at an input frequency of 500 MHz, Eq. 4.4 requires an rms clock jitter of less than 4 ps.
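As a cross-check of the jitter budget, the sketch below inverts Eq. 4.4 for the 38 dB target at 500 MHz. The ideal-SNR helper uses the standard $6.02n + 1.76$ dB approximation for an $n$-bit quantizer, which is not stated in the text and is included here as an assumption; the function names are illustrative.

```python
import math

def ideal_snr_db(n_bits):
    """Standard approximation for an ideal n-bit quantizer with a full-scale sine."""
    return 6.02 * n_bits + 1.76

def jitter_limited_snr_db(f_in, jitter_rms):
    """Eq. 4.4: SNR of an ideal sampler limited only by rms clock jitter."""
    return -20.0 * math.log10(2.0 * math.pi * f_in * jitter_rms)

def max_jitter_rms(f_in, snr_db):
    """Eq. 4.4 solved for the rms jitter that gives the requested SNR."""
    return 10.0 ** (-snr_db / 20.0) / (2.0 * math.pi * f_in)

jitter = max_jitter_rms(f_in=500e6, snr_db=38.0)
print(f"ideal 6-bit SNR  = {ideal_snr_db(6):.2f} dB")
print(f"max rms jitter   = {jitter * 1e12:.2f} ps")
print(f"SNR at that limit = {jitter_limited_snr_db(500e6, jitter):.1f} dB")
```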

## E. Digital Encoder

The digital encoder transforms the thermometer output code of the flash ADC into binary code. It comprises bubble correction, a Gray-encoded ROM, a Gray-to-binary decoder, and decimation buffers.

## 1. Bubble Correction

A comparator in an ADC can become metastable, meaning that by the end of the conversion period its output has not yet reached a decision about the sample. This happens especially when the difference between the input and the reference is close to zero. Moreover, comparator metastability and input offsets can cause bubbles in the thermometer code (a zero inside a series of ones, or vice versa). Bubbles make the transition point (from ones to zeros) hard to detect, since there might be more than one. Furthermore, if more than one transition point drives a ROM with binary output, the error can be very large [31].

A simple and effective way to compensate for bubbles is to use NAND gates in the configuration of Figure 39 [31]. The Q output of each comparator (taken from the SR latch) drives one input of a 3-input NAND gate, and the QN outputs of the two comparators above it drive the other two inputs. Each gate is followed by an inverter to produce a 1-of-n output code.
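The gate arrangement just described can be modeled behaviorally as follows. This is a minimal sketch, not the circuit of Figure 39: it assumes index 0 is the comparator at the bottom of the ladder, and that the two missing inputs above the top comparator are tied to logic 0 (the boundary handling is an assumption).

```python
def one_of_n(thermometer):
    """Convert comparator Q outputs to a 1-of-n code with bubble suppression.

    thermometer[i] is the Q output of comparator i (index 0 at the bottom).
    Line i fires when Q[i] = 1 and the two comparators above it are 0,
    i.e. a 3-input NAND of Q[i], QN[i+1], QN[i+2] followed by an inverter.
    """
    n = len(thermometer)
    padded = list(thermometer) + [0, 0]  # assumed: nothing above the top fires
    out = []
    for i in range(n):
        q = padded[i]
        qn_up1 = 1 - padded[i + 1]
        qn_up2 = 1 - padded[i + 2]
        nand = 1 - (q & qn_up1 & qn_up2)
        out.append(1 - nand)  # inverter after the NAND gate
    return out

clean = [1, 1, 1, 1, 0, 0, 0, 0]
bubbled = [1, 1, 1, 0, 1, 0, 0, 0]  # a single bubble next to the transition
print(one_of_n(clean))    # [0, 0, 0, 1, 0, 0, 0, 0]
print(one_of_n(bubbled))  # [0, 0, 0, 0, 1, 0, 0, 0]
```

With a plain two-input detector (Q[i] and QN[i+1]) the bubbled pattern would assert two lines; the third input removes the lower one.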

## 2. Gray Encoded ROM

Metastability and bubbles in flash A/D converters can be suppressed by using Gray encoding as an intermediate stage between the thermometer (or 1-of-n) code and the binary output [31]. For 6-bit converters it is common to use either a Gray-encoded ROM or a Gray encoder built from gates.

In this ADC a NOR ROM was designed. A NOR ROM consists of PMOS pull-up and NMOS pull-down devices, see Figure 40 [32]. It is called a NOR ROM because each bit line is nothing other than a pseudo-NMOS NOR gate with the word lines as inputs. An $N \times M$ ROM is a combination of $M$ NOR gates with at most $N$ inputs (for a fully populated column). For this encoder a $63 \times 6$ ROM is needed.
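The logical contents of such a $63 \times 6$ table can be sketched as below. The mapping is an assumption made for illustration: row $i$ (selected when 1-of-n line $i$ is active) is taken to hold the Gray word for output code $i+1$. The sketch describes only the stored words, not which cells contain NMOS devices, since that depends on the output polarity of the pseudo-NMOS bit lines.

```python
def gray(value):
    """Binary-reflected Gray code of a non-negative integer."""
    return value ^ (value >> 1)

def build_gray_rom(n_bits=6):
    """One n_bits-wide Gray word (MSB first) per 1-of-n line.

    Rows cover output codes 1 .. 2**n_bits - 1 (assumed mapping).
    """
    rows = []
    for code in range(1, 2**n_bits):
        g = gray(code)
        rows.append([(g >> bit) & 1 for bit in reversed(range(n_bits))])
    return rows

rom = build_gray_rom()
print(len(rom), "rows x", len(rom[0]), "bits")
print("first row:", rom[0])
print("last row: ", rom[-1])
```

Adjacent rows differ in exactly one bit, which is the property that limits the error when two neighboring word lines are active at once.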


Fig. 39. Bubble Correction Using NAND Gates.


Fig. 40. A $4 \times 4$ NOR ROM.

To keep the cell size and the bit line capacitance small, the pull-down device (NMOS) has to be as small as possible. Furthermore, the resistance of the pull-up device (PMOS) should be larger than the pull-down resistance to guarantee a sufficiently low output level [32].

Initially, the PMOS pull-up and NMOS pull-down transistors were sized at $0.6 \mu \mathrm{~m} / 0.2 \mu \mathrm{~m}$ and $0.4 \mu \mathrm{~m} / 0.2 \mu \mathrm{~m}$ respectively. After the post-layout simulations, however, the width of both transistors was increased by a factor of 15, because the bit lines in the layout had to run along the whole vertical dimension of the chip to cover the full range of comparators, which drastically increased their capacitive load and hence the bit line propagation delay. The word line delay is negligible since each word line is loaded by only 6 NMOS devices.

## 3. Gray Decoder

After the ROM, the 6-bit Gray-encoded output has to be converted to binary code. The Gray-to-binary decoder is shown in Figure 41. D flip-flops sample and latch the ROM output at 1 GHz, and the binary output is available after the XOR gate stage.
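The conversion performed by the XOR stage can be sketched as below, assuming the usual Gray-to-binary relation in which the binary MSB equals the Gray MSB and every lower binary bit is the XOR of the next higher binary bit with the corresponding Gray bit. The exact gate arrangement of Figure 41 is not reproduced here.

```python
def gray_to_binary(gray_bits):
    """Decode a Gray word (MSB first) into binary bits (MSB first)."""
    binary = [gray_bits[0]]
    for g in gray_bits[1:]:
        binary.append(binary[-1] ^ g)  # one XOR gate per lower bit
    return binary

def bits_to_int(bits):
    """Interpret an MSB-first bit list as an unsigned integer."""
    value = 0
    for b in bits:
        value = (value << 1) | b
    return value

gray_word = [1, 0, 0, 0, 0, 0]
print(bits_to_int(gray_to_binary(gray_word)))  # 63
```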

## 4. Decimation

High-speed signals are difficult to measure with standard logic analyzers. For this reason, the output signals are decimated to $1 / 16$ of the clock speed, as shown in Figure 41.

The clock divider that provides the $1 / 16$ clock is shown in Figure 42. The signals are sampled and latched with D flip-flops clocked by the $1 / 16$ clock (Figure 41). After this stage a pair of inverters drives the inputs of the output buffers.
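The effect of the decimation stage can be sketched behaviorally. The divider model below, the MSB of a 4-bit counter, is an assumption used only to show the 16:1 relationship; it is not the circuit of Figure 42.

```python
def divide_by_16_clock(n_cycles):
    """Level of a divided clock for each full-rate cycle.

    Modeled as the MSB of a 4-bit counter: 8 cycles low, 8 cycles high.
    """
    return [(cycle >> 3) & 1 for cycle in range(n_cycles)]

def decimate(samples, factor=16):
    """Keep one sample per `factor` input samples, as a register
    clocked by the divided clock would."""
    return samples[::factor]

f_clk = 1e9
codes = list(range(64))  # stand-in for 64 consecutive ADC output codes
slow = decimate(codes)
print(slow)
print(f"decimated rate = {f_clk / 16 / 1e6:.1f} MHz")
```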


Fig. 41. Gray to Binary Decoder and Decimation Stage.


Fig. 42. 1/16 Clock Divider.

## F. Output Buffers

The chip pads and the probes used to connect the test points to a logic analyzer have large parasitics. The model of the chip pad and the probe used for the simulations is shown in Figure 43.

An output buffer is needed to drive the large parasitics shown in Figure 43. The measured outputs are the 6-bit output of the ADC, decimated to $1 / 16$ of the 1 GHz clock rate (62.5 MHz), and 6 RNGs from the array. The RNG outputs were not decimated because it is important to measure their performance at 1 GHz. The output buffer is shown in Figure 44. It is a differential amplifier whose resistive load is applied outside the chip, as shown in Figure 43 after the pad model. $R_{\text {buff }}$ is an external resistor used to adjust the performance of the output buffer.

The output buffers are designed to drive the load with a 370 mV swing for a 1 GHz signal. Most logic analyzers need a swing of at least $500-600 \mathrm{mV}$, but these loads cannot be driven with that swing at 1 GHz. Moreover, testing the RNGs at full speed will require state-of-the-art equipment. For slower signals the swing can reach 600 mV. The transients that show the output buffer performance are given in Figure 45.


Fig. 43. Pad and Probe Parasitic Model.

## G. Layout

The final layout of the chip is shown in Figure 46, with the different blocks of the ADC marked. The chip measures approximately $4.2 \mathrm{~mm} \times 2.1 \mathrm{~mm}$ and occupies $8.8 \mathrm{~mm}^{2}$. The pads are located toward the edges of the chip, and the remaining empty space is filled with metal layers to fulfil the metal density requirements.

## H. Post Layout Performance

A post-layout simulation was run to check the integrity and performance of the designed chip. The dynamic performance was simulated at a 1 GHz sampling rate with an input of $F_{\text {in }}=256.8359375 \mathrm{MHz}$. The SFDR was 44.4 dBc, the SNDR was 36.1 dB, and the effective number of bits was 5.7, see Figure 47. This was a draft simulation to test the chip before fabrication. The high and low reference voltage taps, which are applied externally, could have been adjusted more carefully to obtain an even better dynamic performance.


Fig. 44. Output Buffer.


Fig. 45. Output Buffer Performance.


Fig. 46. Chip Layout.


Fig. 47. ADC Dynamic Performance (Postlayout Simulation of the Chip) with $F_{\text {in }}=256.8359375 \mathrm{MHz}$ at 1 GHz Sampling, 1024pt FFT.
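Two of the post-layout numbers can be cross-checked with a short sketch. It assumes the standard relation ENOB = (SNDR - 1.76) / 6.02, which the text does not state explicitly, and it computes the FFT bin on which the input tone falls for the 1024-point FFT at 1 GHz; the function names are illustrative.

```python
def enob(sndr_db):
    """Effective number of bits from SNDR (standard approximation)."""
    return (sndr_db - 1.76) / 6.02

def coherent_bin(f_in, f_s, n_fft):
    """FFT bin index on which a tone at f_in falls."""
    return f_in * n_fft / f_s

print(f"ENOB    = {enob(36.1):.2f} bits")
print(f"FFT bin = {coherent_bin(256.8359375e6, 1e9, 1024):.4f}")
```

The tone lands on an integer, odd bin, which suggests that the input frequency was chosen for coherent sampling so that no windowing is needed.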

## CHAPTER V

## TESTING BOARD AND TEST SETUP

## A. Testing Board

A PCB testing board was designed to test the fabricated chip. The schematic is illustrated in Figure 48.

The input signal and clock generators provide single-ended signals, which are converted to differential form by balun transformers.

On the board and inside the chip, separate analog and digital supplies are used for both ground and Vdd to isolate the analog plane from the higher noise on the digital supplies. The two grounds are connected at a single point on the board through a ferrite bead, and the two Vdd supplies are likewise shorted at a single point.

Decoupling capacitors are used to filter power supply noise. Capacitors in the range of $0.01 \mu F$ to $0.1 \mu F$ are usually connected between the power and ground supplies. However, these capacitors do not provide high-frequency decoupling or filtering. The frequency range that needs to be decoupled is calculated by:

$$
\begin{equation*}
\text { Bandwidth }=\frac{1}{\pi T_{\text {rise }}}=\frac{1}{\pi \times 120 \mathrm{~ps}} \approx 2.7 \mathrm{~GHz} \tag{5.1}
\end{equation*}
$$

An additional capacitor with a resonance frequency near the value given by Eq. 5.1 is needed. In this design the power supplies are decoupled with a high-K ceramic $0.1 \mu F$ capacitor and an electrolytic $10 \mu F$ capacitor, both with low Equivalent Series Resistance (ESR). Other external voltages, such as Vref High and Vref Low, are decoupled to analog ground.
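The bandwidth in Eq. 5.1 can be reproduced numerically, and the same sketch shows why a larger capacitor alone does not cover that range. The self-resonance formula is the standard series-LC expression, and the 1 nH parasitic inductance is a hypothetical value chosen only for illustration; it is not taken from the design or from any datasheet.

```python
import math

def edge_bandwidth(t_rise):
    """Eq. 5.1: bandwidth associated with a given rise time, in Hz."""
    return 1.0 / (math.pi * t_rise)

def self_resonance(c, esl):
    """Series-LC self-resonant frequency of a capacitor with parasitic
    inductance esl (standard formula, not from the text)."""
    return 1.0 / (2.0 * math.pi * math.sqrt(esl * c))

bw = edge_bandwidth(120e-12)
f_res = self_resonance(c=0.1e-6, esl=1e-9)  # 1 nH is a hypothetical ESL
print(f"bandwidth to decouple     = {bw / 1e9:.2f} GHz")
print(f"0.1 uF self-resonance     = {f_res / 1e6:.1f} MHz (assumed 1 nH ESL)")
```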

External current sources are implemented through variable resistors connected to power supplies. The ADC and RNG outputs are connected to a 38-pin MICTOR connector, which provides low capacitive load and low-voltage high-speed performance.


Fig. 48. Testing Board Schematic.


Fig. 49. PCB Bottom Layer.


Fig. 50. PCB Top Layer.

A Printed Circuit Board (PCB) was manufactured in a 2-layer, no mask, plated through, tin lead re-flow, 0.062" FR-4 laminate process with standard 1 oz. finished copper weight. The PCB bottom and top layers are shown in Figures 49 and 50 respectively. The board has an area of $4.7 \mathrm{in} \times 4.8 \mathrm{in}$, or $11.9 \mathrm{~cm} \times 12.2 \mathrm{~cm}$.

## B. Test Setup

The test setup used to measure the high-speed ADC is shown in Figure 51. The signal generator should be phase-locked to the clock generator. The outputs from the MICTOR connector are connected to a logic analyzer or a mixed-signal oscilloscope through a low-voltage high-speed probe. The results are then analyzed on a PC using MATLAB.
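The kind of post-processing applied to the captured codes can be sketched as follows. The text states that the analysis is done in MATLAB; this sketch uses Python instead and is not the author's script. It replaces measured data with an ideal 6-bit quantizer driven by a coherent full-scale sine (bin 263 of a 1024-point record, matching the post-layout test tone) and estimates SNDR and SFDR from a plain DFT. All names are illustrative.

```python
import cmath
import math

def dft_power(samples):
    """Power in bins 1 .. N/2-1 of a real sequence (plain O(N^2) DFT).

    DC and the Nyquist bin are excluded; power[k - 1] belongs to bin k.
    """
    n = len(samples)
    twiddle = [cmath.exp(-2j * math.pi * k / n) for k in range(n)]
    power = []
    for k in range(1, n // 2):
        acc = sum(x * twiddle[(k * i) % n] for i, x in enumerate(samples))
        power.append(abs(acc) ** 2)
    return power

def sndr_sfdr_db(codes, signal_bin):
    """SNDR and SFDR in dB for a record with a single coherent tone."""
    power = dft_power(codes)
    signal = power[signal_bin - 1]
    others = power[:signal_bin - 1] + power[signal_bin:]
    sndr = 10 * math.log10(signal / sum(others))
    sfdr = 10 * math.log10(signal / max(others))
    return sndr, sfdr

# Ideal 6-bit quantizer as a stand-in for captured ADC output codes.
N, BIN, BITS = 1024, 263, 6
codes = []
for i in range(N):
    v = math.sin(2 * math.pi * BIN * i / N)  # full scale, -1 .. +1
    codes.append(min(2**BITS - 1, int((v + 1) / 2 * 2**BITS)))

sndr, sfdr = sndr_sfdr_db(codes, BIN)
enob_bits = (sndr - 1.76) / 6.02
print(f"SNDR = {sndr:.1f} dB, SFDR = {sfdr:.1f} dB, ENOB = {enob_bits:.2f} bits")
```

Because the tone is coherent with the record length, no window is applied. With decimated data the same analysis applies as long as the decimated sequence is still coherent with the record length.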


Fig. 51. Test Setup.

## CHAPTER VI

## CONCLUSIONS

This work describes a robust and fault-tolerant 1 GSample/s 6-bit flash ADC with a very high spurious-free dynamic range (SFDR). By using chopping together with averaging, the proposed design achieves results not previously reported for high-speed flash ADCs.

Averaging alone reduces the impact of comparator offset errors, but it has limitations. It is effective only for the preamplifiers in the middle of the array; towards the ends of the array it becomes ineffective because fewer preamplifiers take part in the weighted summation of offsets. The offset reduction is further limited because averaging involves only a few neighboring preamplifiers, namely those operating in their linear region for the applied input signal. This considerably reduces the benefit of averaging in the presence of gradient and spot offset errors, and simulations of averaging alone under these kinds of offsets show minimal improvement.

Using chopping together with averaging improves the spectral performance by randomizing any residual offsets, which relaxes the design requirements and reduces power consumption and area. The spurious tones associated with offsets, however large, are converted into white noise that only raises the noise floor. Thus, the SFDR can be improved dramatically while the SNR stays at about the same level. Even in cases of severe spot offset errors or linear gradient offsets, the chopping technique shows superior performance.

Chopping of all comparators is integrated in the flash ADC design through a very simple and novel array of uncorrelated truly random number generators working at 1 GHz. The new RNG array consists of simple, minimum-sized digital gates and consumes 0.6 mW per RNG. The proposed array configuration can be expanded to any number of RNGs that operate uncorrelated and at very high speeds. The described technique passes all the FIPS 140-1 tests required by the security requirements for cryptographic modules issued by the National Institute of Standards and Technology. The proposed RNG array provides random sequences of good statistical quality at low cost and at speeds much higher than any other RNG implementation in the literature.

The proposed flash ADC has been verified at the simulation level (MATLAB, circuit-level, and post-layout) and shows superior performance. The ADC is expected to be robust against various types of errors, as the results show more than 15 dB of improvement over the raw flash ADC.

The design has been fabricated in a TSMC $0.18 \mu \mathrm{~m}$ CMOS process. The chip occupies $8.79 \mathrm{~mm}^{2}$ and consumes about 400 mW from a 1.8 V power supply at 1 GHz, as verified through post-layout simulations. The results on the fabricated chip will be published in a leading journal in the near future.

## REFERENCES

[1] M.J.M. Pelgrom, A.C.J Duinmaijer, and A.P.G. Welbers, "Matching properties of MOS transistors," IEEE Journal of Solid-State Circuits, vol. 24, no. 5, pp. 1433-1439, October 1989.
[2] M.J.M. Pelgrom, H.P. Tuinhout, and M. Vertregt, "Transistor matching in analog CMOS applications," in International Electron Devices Meeting, December 1998, pp. 915-918.
[3] K. Uyttenhove and M.S.J. Steyaert, "Speed-power-accuracy tradeoff in high-speed CMOS ADCs," IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing, vol. 49, no. 4, pp. 280-287, April 2002.
[4] B. Razavi and B.A. Wooley, "Design techniques for high-speed, high-resolution comparators," IEEE Journal of Solid-State Circuits, vol. 27, no. 12, pp. 1916-1926, December 1992.
[5] M. Choi and A.A. Abidi, "A 6-b 1.3-Gsample/s A/D converter in $0.35-\mu \mathrm{m}$ CMOS," IEEE Journal of Solid-State Circuits, vol. 36, no. 12, pp. 1847-1858, December 2001.
[6] P.C.S. Scholtens and M. Vertregt, "A 6-b 1.6-Gsample/s flash ADC in $0.18-\mu \mathrm{m}$ CMOS using averaging termination," IEEE Journal of Solid-State Circuits, vol. 37, no. 12, pp. 1599-1609, December 2002.
[7] K. Sushihara, H. Kimura, Y. Okamoto, K. Nishimura, and A. Matsuzawa, "A 6 b 800 MSample/s CMOS A/D converter," in IEEE International Solid-State Circuits Conference, February 2000, pp. 428-429.
[8] R.J. van de Plassche, CMOS Integrated Analog-to-Digital and Digital-to-Analog Converters, 2nd Edition. Boston: Kluwer Academic Publishers, 2003.
[9] B. Nauta and A.G.W. Venes, "A 70-MS/s 110-mW 8-b CMOS folding and interpolating A/D converter," IEEE Journal of Solid-State Circuits, vol. 30, no. 12, pp. 1302-1308, December 1995.
[10] A.G.W. Venes and R.J. van de Plassche, "An 80-MHz, 80-mW, 8-b CMOS folding A/D converter with distributed track-and-hold preprocessing," IEEE Journal of Solid-State Circuits, vol. 31, no. 12, pp. 1846-1853, December 1996.
[11] E. Fogleman and I. Galton, "A dynamic element matching technique for reduced-distortion multibit quantization in delta-sigma ADCs," IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing, vol. 48, no. 2, pp. 158-170, February 2001.
[12] N. Stefanou and S.R. Sonkusale, "An average low offset comparator for 1.25 GSample/s ADC in $0.18 \mu \mathrm{~m}$ CMOS," in 11th IEEE International Conference on Electronics, Circuits and Systems, ICECS 2004, December 2004, pp. 246-249.
[13] N. Stefanou and S.R. Sonkusale, "High speed array of oscillator-based truly binary random number generators," in IEEE International Symposium on Circuits and Systems, ISCAS 2004, May 2004, vol. 1, pp. I-505-8.
[14] N. Stefanou and S.R. Sonkusale, "Achieving higher dynamic range in flash A/D converters," in IEEE International SOC Conference, September 2004, pp. 175-176.
[15] K. Kattmann and J. Barrow, "A technique for reducing differential non-linearity errors in flash A/D converters," in IEEE International Solid-State Circuits Conference, February 1991, pp. 170-171.
[16] M.A.T. Sanduleanu, A.J.M. Van Tuijl, R.F. Wassenaar, M.C. Lammers, and H. Wallinga, "A low noise, low residual offset, chopped amplifier for mixed level applications," in IEEE International Conference on Electronics, Circuits and Systems, September 1998, vol. 2, pp. 333-336.
[17] A.T.K. Tang, "A $3 \mu$V-offset operational amplifier with $20 \mathrm{nV} / \sqrt{\mathrm{Hz}}$ input noise PSD at DC employing both chopping and autozeroing," in IEEE International Solid-State Circuits Conference, February 2002, vol. 1, pp. 386-387.
[18] New Wave Instruments, Linear Feedback Shift Registers Implementation, M-Sequence Properties, Feedback Tables. [Online]. Available: http://www.newwaveinstruments.com/resources/articles/m_sequence_linear_feedback_shift_register_lfsr.htm.
[19] A. Gerosa, R. Bernardini, and S. Pietri, "A fully integrated 8-bit, 20 MHz, truly random numbers generator, based on a chaotic system," in IEEE 2001 Southwest Symposium on Mixed-Signal Design, February 2001, pp. 152-157.
[20] T. Stojanovski and L. Kocarev, "Chaos-based random number generators-part I: analysis [cryptography]," IEEE Transactions on Circuits and Systems I: Fundamental Theory and Applications, vol. 48, no. 3, pp. 281-288, March 2001.
[21] T. Stojanovski, J. Pihl, and L. Kocarev, "Chaos-based random number generators. part II: practical realization," IEEE Transactions on Circuits and Systems I: Fundamental Theory and Applications, vol. 48, no. 3, pp. 382-385, March 2001.
[22] C.S. Petrie and J.A. Connelly, "Modeling and simulation of oscillator-based random number generators," in IEEE International Symposium on Circuits and Systems, May 1996, vol. 4, pp. 324-327.
[23] M. Bucci, L. Germani, R. Luzzi, A. Trifiletti, and M. Varanonuovo, "A high-speed oscillator-based truly random number source for cryptographic applications on a smart card IC," IEEE Transactions on Computers, vol. 52, no. 4, pp. 403-409, April 2003.
[24] D.A. Johns and K. Martin, Analog Integrated Circuit Design. New York: John Wiley \& Sons, Inc., 1997.
[25] M. Bucci, L. Germani, R. Luzzi, P. Tommasino, A. Trifiletti, and M. Varanonuovo, "A high speed truly IC random number source for smart card microcontrollers," in Proceedings of the 2003 10th IEEE International Conference on Electronics, Circuits and Systems, September 2002, vol. 1, pp. 239-242.
[26] N. Retdian, S. Takagi, and N. Fujii, "Voltage controlled ring oscillator with wide tuning range and fast voltage swing," in Proceedings. 2002 IEEE Asia-Pacific Conference on ASIC, August 2002, pp. 201-204.
[27] National Institute of Standards and Technology, "Security requirements for cryptographic modules, FIPS 140-1," Tech. Rep., January 1994.
[28] C. Eichenberger and W. Guggenbuhl, "Dummy transistor compensation of analog MOS switches," IEEE Journal of Solid State Circuits, vol. 24, no. 4, pp. 1143-1146, August 1989.
[29] F. Munoz, J. Ramirez-Angulo, A. Lopez-Martin, R.G. Carvajal, A. Torralba, B. Palomo, and M. Kachare, "Analogue switch for very low-voltage applications," Electronics Letters, vol. 39, no. 9, pp. 701-702, May 2003.
[30] Hui Pan and A.A. Abidi, "Spatial filtering in flash A/D converters," IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing, vol. 50, no. 8, pp. 424-436, August 2003.
[31] Behzad Razavi, Principles of Data Conversion System Design. Piscataway, NJ: IEEE Press, 1995.
[32] Jan M. Rabaey, Anantha Chandrakasan, and Borivoje Nikolic, Digital Integrated Circuits: A Design Perspective, 2nd Edition. Upper Saddle River, NJ: Pearson Education, 2003.

## VITA

Nikolaos Stefanou received the Diploma degree in Electrical and Computer Engineering from the National Technical University of Athens, Athens, Greece, in 2001. In August 2001, he joined the Electrical Engineering department at Texas A\&M University, College Station, TX. His research interests include Analog and Mixed Signal Circuit Design, Data Converters and High Speed applications.

Address:
37 Rododafnis St.,
Athens, 15125
Greece.
email:
nstef@ee.tamu.edu, nstefanou@gmail.com


[^0]:    ${ }^{\dagger}$ © 2004 IEEE. Reprinted, with permission, from "Achieving higher dynamic range in flash A/D converters", N. Stefanou and S.R. Sonkusale, IEEE International SOC Conference, September 2004, pp. 175-176.

[^1]:    ${ }^{\dagger}$ © 2004 IEEE. Reprinted, with permission, from "High speed array of oscillator-based truly binary random number generators", N. Stefanou and S.R. Sonkusale, IEEE International Symposium on Circuits and Systems, ISCAS 2004, May 2004, vol. 1, pp. I-505-8.

[^2]:    ${ }^{\dagger}$ © 2004 IEEE. Reprinted, with permission, from "An average low offset comparator for 1.25 GSample/s ADC in $0.18 \mu \mathrm{~m}$ CMOS", N. Stefanou and S.R. Sonkusale, in 11th IEEE International Conference on Electronics, Circuits and Systems, ICECS 2004, December 2004, pp. 246-249.

