# DESIGN OF A HIGHLY LINEAR RF FRONT-END, A LOW-SPUR FRACTIONAL-N FREQUENCY SYNTHESIZER AND A HIGH-SPEED, LOW-POWER DATA CONVERTER

### A Dissertation

by

## JUNNING JIANG

Submitted to the Graduate and Professional School of Texas A&M University in partial fulfillment of the requirements for the degree of

## DOCTOR OF PHILOSOPHY

Chair of Committee: Jose Silva-Martinez

Co-Chair of Committee: Aydin Karsilayan

Committee Members: Duncan M. Walker, Sung Il Park

Head of Department: Miroslav M. Begovic

December 2021

Major Subject: Electrical Engineering

Copyright 2021 Junning Jiang

#### ABSTRACT

High-performance and low-power complementary metal-oxide-semiconductor (CMOS) circuit design techniques have been widely investigated in both industry and academia. A 5G receiver requires a wideband, linear RF front-end that consumes minimal power, along with a high-speed, low-power analog-to-digital converter (ADC) to digitize the signal. Both the RF front-end and the ADC need clock sources with low jitter and low spur levels to function properly.

The first project is a wideband, linear RF front-end targeting the 3-6 GHz band with a 200 MHz baseband bandwidth for 5G applications. A cross-coupled common-gate (CG) low noise transconductance amplifier (LNTA) with resistive degeneration was implemented to enhance linearity and reduce noise. A minimally-invasive filter improved out-of-band filtering. An input-referred third-order intercept point (IIP3) of 15.1 dBm and a 1 dB compression point ($P_{1dB}$) of 3.0 dBm were demonstrated over 3 to 6 GHz. The noise figure was less than 5.3 dB at 3 MHz offset. The power consumption was 69.6 mW.

The second project is a time-to-digital converter (TDC) assisted charge pump (CP) phase-locked loop (PLL) targeting a 2.4-3.9 GHz output range with a reference spur below -100 dBc and out-of-band fractional spurs below -90 dBc. With the TDC, a charge pump with a 4-bit current digital-to-analog converter (DAC) was used to suppress the reference spurs, and a digital phase processor filtered out the fractional spurs. The measured reference spur and out-of-band fractional spurs were -108 dBc and -95 dBc, respectively. The root mean square (rms) jitter was 247 fs in fractional-N mode. The power consumption was 15.94 mW at 3.3 GHz.

The third project is a 14-bit 1 GS/s pipelined analog-to-digital converter. A current-reuse telescopic amplifier with a class-C slew rate booster was used as the residue amplifier in the switched-capacitor MDAC of each stage. Foreground and background calibrations were used to compensate for inter-stage gain errors, nonlinearities, memory effects, and dynamic nonlinearities. The power consumption was 56.0 mW. After digital calibration, the ADC achieved a Schreier FoM of 168.4 dB and a Walden FoM of 24.6 fJ/conversion-step at the Nyquist frequency.
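As a rough consistency check on the ADC figures quoted above, the two figures of merit can be related through the SNDR. The sketch below assumes the commonly used definitions (Schreier FoM = SNDR + 10·log10(BW/P), Walden FoM = P / (2^ENOB · fs), ENOB = (SNDR − 1.76)/6.02) and a Nyquist bandwidth of fs/2. The SNDR is not stated in the abstract; it is inferred here from the reported Schreier FoM, and all function names are illustrative.

```python
import math

def sndr_from_schreier(fom_s_db, bandwidth_hz, power_w):
    """Invert the (assumed) Schreier FoM definition to recover SNDR in dB."""
    return fom_s_db - 10 * math.log10(bandwidth_hz / power_w)

def walden_fom(sndr_db, fs_hz, power_w):
    """Walden FoM in joules per conversion-step, using the (assumed) ENOB formula."""
    enob = (sndr_db - 1.76) / 6.02
    return power_w / (2 ** enob * fs_hz)

# Values reported in the abstract.
power_w = 56.0e-3        # 56.0 mW
fs_hz = 1.0e9            # 1 GS/s
fom_s_db = 168.4         # Schreier FoM

# Assumption: signal bandwidth equals the Nyquist bandwidth fs/2.
bandwidth_hz = fs_hz / 2

sndr_db = sndr_from_schreier(fom_s_db, bandwidth_hz, power_w)
fom_w = walden_fom(sndr_db, fs_hz, power_w)

print(f"Inferred SNDR: {sndr_db:.1f} dB")
print(f"Walden FoM:    {fom_w * 1e15:.1f} fJ/conversion-step")
```

Under these assumptions the inferred SNDR is roughly 69 dB, and the resulting Walden FoM lands close to the reported 24.6 fJ/conversion-step, so the two reported figures of merit are mutually consistent.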

# DEDICATION

To my parents.

#### ACKNOWLEDGMENTS

I would like to thank everyone sincerely and wholeheartedly for their encouragement and support during my Ph.D. student life, especially during the days of the COVID-19 pandemic.

To begin with, I would like to express my deepest gratitude to my Ph.D. advisor, Dr. Jose Silva-Martinez, for his help and advice during my Ph.D. career. He helped me understand profound circuit and system projects in several areas, which have led me into the semiconductor world. During our group meetings, his critical thinking, intuitive understanding, as well as rigorous analysis always enhanced my knowledge of the circuit world. It is my belief that these techniques will benefit my career as an engineer, and I will always value his guidance throughout my time at the university and from this point forward.

I also want to thank Dr. Aydin Karsilayan, my Ph.D. co-advisor, for his suggestions regarding the circuits and systems. His passion for teaching gave me enough courage when I was trying to work with other people. I felt comfortable sharing my knowledge with other people, learning from their feedback, and making progress together with them.

At the same time, I wish to express my thanks to Dr. Duncan Walker, Dr. Sung Il Park, and Dr. Peng Li for their dedication and for serving on my Ph.D. committee. Their perspectives from different areas were beneficial and relevant to me.

I feel humbled to be part of Dr. Silva-Martinez's research group, where I have met a lot of group members. I want to thank Dr. Dadian Zhou and Tanwei Yan for spending their valuable time working with me on multiple projects. I also want to thank Dr. Carlos Briseno-Vidrios, Dr. Alexander Edward, Dr. Negar Rashidi, Dr. Haoyu Qian, Dr. Yu-Chung Lo, Dr. Qiyuan Liu, Dr. Eric Park, Dr. Jian Shao, Haiyue Yan, Sungjun Yoon, Shangfeng Qiu and other group members for their willingness to share their friendship and working experiences. It is my life treasure to learn from all who helped me as we worked together on the projects that led to this moment. For those still working in our group, I also want to express my best wishes for your academic lives.

During my days on campus, I also met many students from other groups. Dr. Xiaosen Liu, Dr. Congyin Shi, Dr. Haitao Tong, Dr. Fernando Aviles, Dr. Adrianna Sanabria Borbon, Dr. Kunzhi Yu, Dr. Shengchang Cai, Dr. Xin Zhan, Dr. Haibin Hu, Jichen Lu, Shan He, Yin Luo, Lili Yu, Yuanming Zhu, Po-Hsuan Chang, Tong Liu, and Ruida Liu: I sincerely thank you for all your kindness during my Ph.D. life. I also want to thank Tammy Carda, Anni Brunker, Ella Gallagher, Katie Bryan, and Melissa Sheldon of the TAMU Electrical and Computer Engineering Department for their help during my life as a Ph.D. candidate.

To my parents, I wish to thank you for your persistent encouragement during my Ph.D. years, especially as the COVID-19 pandemic progressed. Your support has given me the confidence I needed to overcome difficulties. Finally, I would like to express my gratitude towards everyone who helped me obtain this achievement.

I also want to express my heartfelt condolences to Dr. Edgar Sanchez-Sinencio's family and Dr. Alex Edward's family. Their contributions will be remembered.

### CONTRIBUTORS AND FUNDING SOURCES

### Contributors

This work was supervised by a dissertation committee consisting of Professors Jose Silva-Martinez, Aydin Karsilayan, and Sung Il Park of the Department of Electrical and Computer Engineering and Professor Duncan Walker of the Department of Computer Science and Engineering. All work for the dissertation was completed independently by the student.

### Funding Sources

Graduate study was supported by the National Science Foundation under Grant Number EECS 1547447.

# NOMENCLATURE

| LNTA             | Low Noise Transconductance Amplifier       |
|------------------|--------------------------------------------|
| TIA              | Trans-Impedance Amplifier                  |
| NF               | Noise Figure                               |
| IIP3             | Input-referred Third-Order Intercept Point |
| P<sub>1dB</sub>  | 1 dB Compression Point                     |
| PLL              | Phase Locked Loop                          |
| FIR              | Finite Impulse Response                    |
| MAF              | Moving Average Filter                      |
| TDC              | Time-to-Digital Converter                  |
| DTC              | Digital-to-Time Converter                  |
| CP               | Charge Pump                                |
| DAC              | Digital-to-Analog Converter                |
| VCO              | Voltage-Controlled Oscillator              |
| ADC              | Analog-to-Digital Converter                |
| MDAC             | Multiplying Digital-to-Analog Converter    |
| SC               | Switched-Capacitor                         |
| OTA              | Operational Transconductance Amplifier     |
| SFDR             | Spurious Free Dynamic Range                |
| SQNR             | Signal-to-Quantization-Noise Ratio         |
| SNDR             | Signal-to-Noise-and-Distortion Ratio       |
| FoM              | Figure of Merit                            |

# TABLE OF CONTENTS

| DEDICATION       iii         ACKNOWLEDGMENTS       iv         CONTRIBUTORS AND FUNDING SOURCES       vi         NOMENCLATURE       vii         TABLE OF CONTENTS       viii         LIST OF FIGURES       xi         LIST OF TABLES.       xvi         1. INTRODUCTION       1         1.1.1       A Wideband and Highly Linear RF Front-End Design       1         1.1.2       A Fractional-N PLL with Reference and Fractional Spur Reduction Techniques       2         1.2       Thesis Organization       3         2.       A 3-TO-6 GHZ HIGHLY LINEAR RECEIVER WITH OVER +3.0 DBM IN-BAND PIDB AND 200-MHZ BASEBAND BANDWIDTH SUITABLE FOR 5G WIRELESS AND COGNITIVE RADIO APPLICATIONS       4         2.1       Introduction       4         2.2       Proposed Receiver Circuit Blocks       11         2.2.2       Proposed Low Noise Transconductance Amplifier       11         2.2.2       A sasive Mixer       22         2.2.4       Trans-impedance Amplifier (TIA)       23 | ABSTRACT ii                                                                                                                                                                       |                |  |
|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------|--|
| ACKNOWLEDGMENTS       iv         CONTRIBUTORS AND FUNDING SOURCES       vi         NOMENCLATURE       vii         TABLE OF CONTENTS       viii         LIST OF FIGURES       xi         LIST OF FIGURES       xvi         1. INTRODUCTION       1         1.1.1       A Wideband and Highly Linear RF Front-End Design       1         1.1.2       A Fractional-N PLL with Reference and Fractional Spur Reduction Techniques       2         1.2       Thesis Organization       3         2.       A 3-TO-6 GHZ HIGHLY LINEAR RECEIVER WITH OVER +3.0 DBM IN-BAND P <sub>IDB</sub> AND 200-MHZ BASEBAND BANDWIDTH SUITABLE FOR 5G WIRELESS AND COGNITIVE RADIO APPLICATIONS       4         2.1       Introduction       4         2.2       Proposed Receiver Circuit Blocks       11         2.2.3       Passive Mixer       22         2.2.4       Trans-impedance Amplifier (TIA)       23                                                                                              | DEDICATION iii                                                                                                                                                                    |                |  |
| CONTRIBUTORS AND FUNDING SOURCES       vi         NOMENCLATURE       vii         TABLE OF CONTENTS       viii         LIST OF FIGURES       xi         LIST OF TABLES.       xvi         1. INTRODUCTION       1         1.1.1 A Wideband and Highly Linear RF Front-End Design       1         1.1.2 A Fractional-N PLL with Reference and Fractional Spur Reduction Techniques       2         1.2 Thesis Organization       3         2. A 3-TO-6 GHZ HIGHLY LINEAR RECEIVER WITH OVER +3.0 DBM IN-BAND P <sub>1DB</sub> AND 200-MHZ BASEBAND BANDWIDTH SUITABLE FOR 5G WIRELESS AND COGNITIVE RADIO APPLICATIONS       4         2.1 Introduction       4         2.2 Proposed Receiver Circuit Blocks       11         2.2.1 Source Degenerated Transconductance Amplifier       11         2.2.2 Prossed Low Noise Transconductance Amplifier       22         2.2.4 Trans-impedance Amplifier (TIA)       23                                                                           | ACKNOWLEDGMENTS                                                                                                                                                                   | iv             |  |
| NOMENCLATURE       vii         TABLE OF CONTENTS       viii         LIST OF FIGURES       xi         LIST OF TABLES.       xvi         1. INTRODUCTION       1         1.1.1 A Wideband and Highly Linear RF Front-End Design       1         1.1.2 A Fractional-N PLL with Reference and Fractional Spur Reduction Techniques       2         1.2 Thesis Organization       3         2. A 3-TO-6 GHZ HIGHLY LINEAR RECEIVER WITH OVER +3.0 DBM IN-BAND PliDB AND 200-MHZ BASEBAND BANDWIDTH SUITABLE FOR 5G WIRELESS AND COGNITIVE RADIO APPLICATIONS       4         2.1 Introduction       4         2.2 Proposed Receiver Circuit Blocks       11         2.2.1 Source Degenerated Transconductance Amplifier       11         2.2.2 Proposed Low Noise Transconductance Amplifier       22         2.2.4 Trans-impedance Amplifier (TIA)       23                                                                                                                                       | CONTRIBUTORS AND FUNDING SOURCES                                                                                                                                                  | vi             |  |
| TABLE OF CONTENTS       viii         LIST OF FIGURES       xi         LIST OF TABLES       xvi         1. INTRODUCTION       1         1.1.1 Motivation of High-Performance and Power-Efficient Circuit Design       1         1.1.1 A Wideband and Highly Linear RF Front-End Design       1         1.1.2 A Fractional-N PLL with Reference and Fractional Spur Reduction Techniques       2         1.1.3 A High-Speed Low-Power pipelined ADC       2         1.2 Thesis Organization       3         2. A 3-TO-6 GHZ HIGHLY LINEAR RECEIVER WITH OVER +3.0 DBM IN-BAND P <sub>1DB</sub> AND 200-MHZ BASEBAND BANDWIDTH SUITABLE FOR 5G WIRELESS AND COGNITIVE RADIO APPLICATIONS       4         2.1 Introduction       4         2.2 Proposed Receiver Circuit Blocks       11         2.2.2 Proposed Low Noise Transconductance Amplifier       11         2.2.3 Passive Mixer       22         2.2.4 Trans-impedance Amplifier (TIA)       23                                         | NOMENCLATURE v                                                                                                                                                                    | 'ii            |  |
| LIST OF FIGURES       xi         LIST OF TABLES.       xvi         1. INTRODUCTION.       1         1.1. Motivation of High-Performance and Power-Efficient Circuit Design       1         1.1.1 A Wideband and Highly Linear RF Front-End Design       1         1.1.2 A Fractional-N PLL with Reference and Fractional Spur Reduction Techniques       2         1.1.3 A High-Speed Low-Power pipelined ADC       2         1.2 Thesis Organization       3         2. A 3-TO-6 GHZ HIGHLY LINEAR RECEIVER WITH OVER +3.0 DBM IN-BAND P <sub>1DB</sub> AND 200-MHZ BASEBAND BANDWIDTH SUITABLE FOR 5G WIRELESS AND COGNITIVE RADIO APPLICATIONS       4         2.1 Introduction       4         2.2 Proposed Receiver Circuit Blocks       11         2.2.1 Source Degenerated Transconductance Amplifier       11         2.2.2 Proposed Low Noise Transconductance Amplifier       22         2.2.4 Trans-impedance Amplifier (TIA)       23                                             | TABLE OF CONTENTS vi                                                                                                                                                              | iii            |  |
| LIST OF TABLES       xvi         1. INTRODUCTION       1         1.1.1 Motivation of High-Performance and Power-Efficient Circuit Design       1         1.1.1 A Wideband and Highly Linear RF Front-End Design       1         1.1.2 A Fractional-N PLL with Reference and Fractional Spur Reduction Techniques       2         1.1.3 A High-Speed Low-Power pipelined ADC       2         1.2 Thesis Organization       3         2. A 3-TO-6 GHZ HIGHLY LINEAR RECEIVER WITH OVER +3.0 DBM IN-BAND PIDB AND 200-MHZ BASEBAND BANDWIDTH SUITABLE FOR 5G WIRELESS AND COGNITIVE RADIO APPLICATIONS       4         2.1 Introduction       4         2.2 Proposed Receiver Circuit Blocks       11         2.2.1 Source Degenerated Transconductance Amplifier       11         2.2.2 Proposed Low Noise Transconductance Amplifier       22         2.2.4 Trans-impedance Amplifier (TIA)       23                                                                                           | LIST OF FIGURES                                                                                                                                                                   | xi             |  |
| 1. INTRODUCTION.       1         1.1 Motivation of High-Performance and Power-Efficient Circuit Design       1         1.1.1 A Wideband and Highly Linear RF Front-End Design       1         1.1.2 A Fractional-N PLL with Reference and Fractional Spur Reduction Techniques       2         1.1.3 A High-Speed Low-Power pipelined ADC       2         1.2 Thesis Organization       3         2. A 3-TO-6 GHZ HIGHLY LINEAR RECEIVER WITH OVER +3.0 DBM IN-BAND PIDB AND 200-MHZ BASEBAND BANDWIDTH SUITABLE FOR 5G WIRELESS AND COGNITIVE RADIO APPLICATIONS       4         2.1 Introduction       4         2.2 Proposed Receiver Circuit Blocks       11         2.2.1 Source Degenerated Transconductance Amplifier       11         2.2.2 Proposed Low Noise Transconductance Amplifier       16         2.2.3 Passive Mixer       22         2.2.4 Trans-impedance Amplifier (TIA)       23                                                                                        | LIST OF TABLES xv                                                                                                                                                                 | vi             |  |
| 1.1       Motivation of High-Performance and Power-Efficient Circuit Design       1         1.1.1       A Wideband and Highly Linear RF Front-End Design       1         1.1.2       A Fractional-N PLL with Reference and Fractional Spur Reduction Techniques       2         1.1.3       A High-Speed Low-Power pipelined ADC       2         1.2       Thesis Organization       3         2.       A 3-TO-6 GHZ HIGHLY LINEAR RECEIVER WITH OVER +3.0 DBM IN-BAND P <sub>1DB</sub> AND 200-MHZ BASEBAND BANDWIDTH SUITABLE FOR 5G WIRELESS AND COGNITIVE RADIO APPLICATIONS       4         2.1       Introduction       4         2.2       Proposed Receiver Circuit Blocks       11         2.2.2       Proposed Low Noise Transconductance Amplifier       11         2.2.3       Passive Mixer       22         2.2.4       Trans-impedance Amplifier (TIA)       23                                                                                                                | 1. INTRODUCTION                                                                                                                                                                   | 1              |  |
| 1.1.2       A Fractional-N PLL with Reference and Fractional Spur Reduction Techniques       2         1.1.3       A High-Speed Low-Power pipelined ADC       2         1.2       Thesis Organization       3         2.       A 3-TO-6 GHZ HIGHLY LINEAR RECEIVER WITH OVER +3.0 DBM IN-BAND P <sub>1DB</sub> AND 200-MHZ BASEBAND BANDWIDTH SUITABLE FOR 5G WIRELESS AND COGNITIVE RADIO APPLICATIONS       4         2.1       Introduction       4         2.2       Proposed Receiver Circuit Blocks       11         2.2.1       Source Degenerated Transconductance Amplifier       11         2.2.2       Proposed Low Noise Transconductance Amplifier       16         2.2.3       Passive Mixer       22         2.2.4       Trans-impedance Amplifier (TIA)       23                                                                                                                                                                                                              | 1.1Motivation of High-Performance and Power-Efficient Circuit Design1.1.1A Wideband and Highly Linear RF Front-End Design                                                         | 1<br>1         |  |
| 1.2       Thesis Organization       3         2. A 3-TO-6 GHZ HIGHLY LINEAR RECEIVER WITH OVER +3.0 DBM IN-BAND       P1DB AND 200-MHZ BASEBAND BANDWIDTH SUITABLE FOR 5G WIRELESS         AND COGNITIVE RADIO APPLICATIONS       4         2.1       Introduction       4         2.2       Proposed Receiver Circuit Blocks       11         2.2.1       Source Degenerated Transconductance Amplifier       11         2.2.2       Proposed Low Noise Transconductance Amplifier       16         2.2.3       Passive Mixer       22         2.2.4       Trans-impedance Amplifier (TIA)       23                                                                                                                                                                                                                                                                                                                                                                                          | <ul> <li>1.1.2 A Fractional-N PLL with Reference and Fractional Spur Reduction Techniques</li> <li>1.1.3 A High-Speed Low-Power pipelined ADC</li> </ul>                          | 2<br>2         |  |
| 2. A 3-TO-6 GHZ HIGHLY LINEAR RECEIVER WITH OVER +3.0 DBM IN-BAND         P1DB AND 200-MHZ BASEBAND BANDWIDTH SUITABLE FOR 5G WIRELESS         AND COGNITIVE RADIO APPLICATIONS         4         2.1 Introduction.         4         2.2 Proposed Receiver Circuit Blocks         11         2.2.1 Source Degenerated Transconductance Amplifier         11         2.2.2 Proposed Low Noise Transconductance Amplifier         16         2.2.3 Passive Mixer         22         2.2.4 Trans-impedance Amplifier (TIA)                                                                                                                                                                                                                                                                                                                                                                                                                                                                      | 1.2 Thesis Organization                                                                                                                                                           | 3              |  |
| 2.1Introduction                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               | 2. A 3-TO-6 GHZ HIGHLY LINEAR RECEIVER WITH OVER +3.0 DBM IN-BAND<br>P <sub>1DB</sub> AND 200-MHZ BASEBAND BANDWIDTH SUITABLE FOR 5G WIRELESS<br>AND COGNITIVE RADIO APPLICATIONS |                |  |
| 2.2.3       Passive Mixer       22         2.2.4       Trans-impedance Amplifier (TIA)       23                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               | <ul> <li>2.1 Introduction</li></ul>                                                                                                                                               | 4              |  |
| 2.2.5 Second-Order Minimally-Invasive Filter                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  | 2.2.3Passive Mixer22.2.4Trans-impedance Amplifier (TIA)22.2.5Second-Order Minimally-Invasive Filter2                                                                              | 22<br>23<br>25 |  |
| 2.3 Experimental Results    30      2.4 Conclusions and Summary    42                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         | 2.3 Experimental Results                                                                                                                                                          | 50<br>12       |  |

| 3.     | A 2.3<br>TION | 3-3.9 GHZ FRACTIONAL-N PLL WITH CHARGE PUMP AND TDC CALIBRA-<br>NS FOR REFERENCE AND FRACTIONAL SPUR REDUCTION | 45  |
|--------|---------------|----------------------------------------------------------------------------------------------------------------|-----|
|        | 3.1           | Introduction                                                                                                   | 45  |
|        | 3.2           | PLL as a Dynamic System                                                                                        | 48  |
|        | 33            | A Overview of a Fractional-N PLL system                                                                        | 50  |
|        | 5.5           | 3 3 1 Reference Clock Source                                                                                   | 51  |
|        |               | 3.3.2 Phase-and-Frequency Detector                                                                             | 52  |
|        |               | 3 3 3 Charge Pump Design Challenges                                                                            | 54  |
|        |               | 3 3 4 Voltage Controlled Oscillator                                                                            | 59  |
|        |               | 3.3.5 Loop Filter                                                                                              | 60  |
|        |               | 3.3.6 Feedback Frequency Divider                                                                               | 62  |
|        |               | 3.3.7 Division Ratio Control Block                                                                             | 62  |
|        | 3.4           | Proposed PLL Architecture                                                                                      | 65  |
|        | 011           | 3.4.1 VCO Realization                                                                                          | 66  |
|        |               | 3.4.2 Charge Pump with 4-bit Current DAC Calibration Scheme                                                    | 67  |
|        |               | 3.4.3 Loop Filter Design                                                                                       | 70  |
|        |               | 3.4.4 Programmable Fractional Divider                                                                          | 71  |
|        |               | 3.4.5 Proposed Division Ratio Control                                                                          | 71  |
|        |               | 3.4.6 Digital Phase Processor                                                                                  | 72  |
|        |               | 3.4.7 Transmission Gate Based Phase Frequency Detector (TG-PFD)                                                | 77  |
|        | 3.5           | Proposed PLL system calibration process                                                                        | 78  |
|        |               | 3.5.1 Free-running VCO characterization                                                                        | 79  |
|        |               | 3.5.2 Sub-ranging TDC and DTC Linearity Calibration                                                            | 80  |
|        |               | 3.5.3 Charge Pump Calibration                                                                                  | 82  |
|        |               | 3.5.4 Fractional-N Mode with DPP in the Loop                                                                   | 84  |
|        |               | 3.5.5 Background Calibration of PLL in Fractional-N Mode                                                       | 85  |
|        | 3.6           | Measurement Results                                                                                            | 86  |
|        | 3.7           | Conclusions and Summary                                                                                        | 90  |
| 4. A 1 |               | -BIT 1GS/S LOW POWER PIPELINED ADC WITH COMPREHENSIVE FOREGROUND AND BACKGROUND CALIBRATIONS | 93  |
|        | 4.1           | Introduction                                                                                                   | 93  |
|        | 4.2           | An Overview of Pipelined ADC Structure                                                                         | 99  |
|        | 4.3           | An Overview of Analog and Digital Calibrations                                                                 | 106 |
|        | 4.4           | Proposed Pipelined ADC with Analog and Digital Calibration Techniques                                          | 115 |
|        |               | 4.4.1 A Current-reuse OTA with Slew Rate Boosting Circuit                                                      | 118 |
|        |               | 4.4.2 Foreground analog and digital calibration techniques                                                     | 125 |
|        |               | 4.4.3 Background Calibration Techniques                                                                        | 132 |
|        | 4.5           | Experimental Results                                                                                           | 137 |
|        | 4.6           | Conclusions and Summary                                                                                        | 143 |
| 5.     |               | SUMMARY AND CONCLUSIONS                                                                                        | 145 |
|        | 5.1           | Conclusions                                                                                                    | 145 |
|        | 5.2           | Future work                                                                                                    | 146 |
| REFERENCES |           |                                                                                                                | 148 |

# LIST OF FIGURES

| FIGURE | | Page |
|--------|---|------|
| 2.1 | Illustration of frequency and time allocation in cognitive radio systems with power level indicated. | 5 |
| 2.2 | Measured power spectral density during the busiest hour of one entire day (between 3:00 p.m. and 4:00 p.m. in [3]), and signal power when integrated in 500 MHz bands. | 6 |
| 2.3 | Structure comparisons between LNA-first, LNTA-first and mixer-first | 7 |
| 2.4 | Proposed wideband highly linear receiver system. | 9 |
| 2.5 | Comparison of LNTA structures | 9 |
| 2.6 | Common gate LNTA (a) conventional (b) with source degeneration resistor $R_X$ | 12 |
| 2.7 | Small signal model of CG LNTA degenerated by resistor $R_X$ and load impedance $Z_{D1}$ | 14 |
| 2.8 | Input signal levels over each node labeled for (a) single-ended LNTA with resistive degeneration (b) proposed cross-coupled differential LNTA with resistive degeneration. | 17 |
| 2.9 | Simplified circuit for calculating NF of cross-coupled and resistive degenerated CG LNTA. | 17 |
| 2.10 | Simulated and calculated NF of proposed LNTA over different $R_X$ values | 19 |
| 2.11 | Simulated LNTA IIP3 and power consumption over different $R_X$ values | 20 |
| 2.12 | Simulated small signal current over input voltage as function of frequency | 21 |
| 2.13 | Calculated and simulated IIP3 for different $R_X$ values | 21 |
| 2.14 | A two-stage OTA with feed-forward compensation. | 23 |
| 2.15 | A two-stage high speed OTA with feed-forward compensation. The connection including the feedback elements $R_F$ and $C_F$ as well as elements connected at the OTA input are shown in the bottom-right corner. | 24 |
| 2.16 | Schematic of receiver's baseband showing second-order minimally invasive filter | 25 |
| 2.17 | Pseudo differential OTA employing complementary transconductors. | 26 |
| 2.18 | Single-ended small signal model of proposed baseband filter | 27 |
| 2.19 | Simulated magnitudes of impedances of TIA, filter, $C_3$ and nodal total (LNTA output impedance > 100 $\Omega$, passive mixer on < 10 $\Omega$, for 3 to 6 GHz) | 28 |
| 2.20 | Micrograph of the chip.                                                                                                                                          | 30 |
| 2.21 | Measurement setup of the chip.                                                                                                                                   | 30 |
| 2.22 | Simulated and measured (de-embedded) $S_{11}$ of the receiver.                                                                                                   | 31 |
| 2.23 | Simulated and measured NF of the receiver                                                                                                                        | 32 |
| 2.24 | Simulated and measured conversion gain: the -3dB frequency is found above 200 MHz.                                                                               | 33 |
| 2.25 | Simulated and measured in-band linearity around LO of 3 GHz                                                      | 34 |
| 2.26 | Measured out-of-band linearity around LO of 3GHz                                                                                                                 | 34 |
| 2.27 | Conversion gain compression over in-band and out-of-band blockers                                                | 36 |
| 2.28 | NF over blocker power at different frequency offset $\Delta f$ without clock source phase noise filtering.                                                       | 37 |
| 2.29 | Measured system $P_{1dB}$ versus blocker frequency offset $\Delta f$                                                                                             | 38 |
| 2.30 | Measured IIP3 versus frequency offset $\Delta f$                                                                                                                 | 39 |
| 2.31 | Simulated EVM performance for 0 dBm input power (QAM-64)                                                                                                         | 40 |
| 2.32 | Measured receiver's performance: gain, NF, IIP3 and P <sub>1dB</sub>                                                                                             | 41 |
| 3.1  | Effects of spurs on a wideband system's SNR.                                                                                                                     | 45 |
| 3.2  | Proposed DPP assisted PLL architecture                                                                                                                           | 47 |
| 3.3  | PLL structure.                                                                                                                                                   | 48 |
| 3.4  | PLL system diagram.                                                                                                                                              | 48 |
| 3.5  | A charge pump fractional-N PLL.                                                                                  | 50 |
| 3.6  | Design challenges of a charge pump fractional-N PLL.                                                                                                             | 51 |
| 3.7  | Diagram of a PFD.                                                                                                                                                | 52 |
| 3.8  | Timing diagram of a PFD.                                                                                         | 53 |
| 3.9  | Transfer function of PFD's noise.                                                                                | 53 |
| 3.10 | A simple charge pump                                                                                             | 54 |
| 3.11 | Transfer function of charge pump's noise.                                                | 55 |
| 3.12 | Output ripple from charge pump                                                           | 56 |
| 3.13 | Period ripple at charge pump output                                                      | 57 |
| 3.14 | Transfer function of VCO's noise.                                                        | 59 |
| 3.15 | Second order loop filter.                                                                | 60 |
| 3.16 | Transfer function of loop filter's noise                                                 | 61 |
| 3.17 | Proposed DPP assisted PLL architecture                                                   | 65 |
| 3.18 | Proposed LC-VCO with complementary cross-coupled pair                                    | 66 |
| 3.19 | Proposed charge pump with 4-bit current DAC                                              | 68 |
| 3.20 | Charge pump's main path working process.                                                 | 69 |
| 3.21 | Comparison between current mismatch and timing mismatch.                                 | 69 |
| 3.22 | A third order loop filter design with/without ground bonds                               | 71 |
| 3.23 | Proposed digital phase processor architecture.                                           | 72 |
| 3.24 | Frequency and phase response of the PLL with analog part only and MAF and FIR activated. | 74 |
| 3.25 | Proposed sub-ranging TDC structure                                                       | 75 |
| 3.26 | Proposed sub-ranging DTC structure                                                       | 76 |
| 3.27 | Proposed PLL calibration process.                                                        | 78 |
| 3.28 | Free-running VCO characterization.                                                       | 79 |
| 3.29 | Proposed sub-ranging TDC and DTC calibration                                             | 80 |
| 3.30 | Patterns in phase domain for TDC and DTC calibration.                                    | 80 |
| 3.31 | Charge pump calibration by finding the minimum static phase error                        | 82 |
| 3.32 | Reference spur level and static phase error                                              | 83 |
| 3.33 | DPP is included in the loop.                                                             | 84 |
| 3.34 | Chip photo and testbench.                                                                | 86 |
| 3.35 | Phase noise measurement.                                                   | 87  |
| 3.36 | Comparison of measured phase noise with calculated components              | 88  |
| 3.37 | Reference spur measurement.                                                | 89  |
| 3.38 | Fractional spur measurement results.                                       | 90  |
| 4.1  | ADC's high-frequency Schreier's FoM versus speed                           | 96  |
| 4.2  | A conventional pipelined ADC.                                              | 99  |
| 4.3  | Timing diagram of a pipeline ADC                                           | 100 |
| 4.4  | A 2.5-bit stage in sampling phase and evaluation phase                     | 101 |
| 4.5  | 2.5-bit residue amplifier ideal transfer curve                             | 102 |
| 4.6  | Typical residue amplifier used in pipelined ADCs                           | 104 |
| 4.7  | Digital calibration for inter-stage gain errors and non-linearities.       | 107 |
| 4.8  | 2.5-bit residue amplifier ideal transfer curve                             | 109 |
| 4.9  | 2.5-bit residue amplifier transfer curve with lower loop gain              | 109 |
| 4.10 | 2.5-bit residue amplifier transfer curve with comparator offset in sub-ADC | 110 |
| 4.11 | 2.5-bit residue amplifier transfer curve with capacitor mismatch in MDAC   | 111 |
| 4.12 | 2.5-bit residue amplifier transfer curve with amplifier distortions        | 112 |
| 4.13 | Simplified block diagram of ADC architecture.                              | 115 |
| 4.14 | Boot-strapped switch with tunable boot-strapped dummies                    | 116 |
| 4.15 | Proposed current-reuse OTA.                                                | 118 |
| 4.16 | Equivalent circuit of proposed current-reuse OTA with level shifting       | 119 |
| 4.17 | Comparison of single stage OTA structures                                  | 120 |
| 4.18 | Current reuse amplifier in a switched-cap MDAC.                            | 121 |
| 4.19 | Proposed current-reuse OTA with slew-rate booster                          | 123 |
| 4.20 | Closed loop simulation of proposed OTA. | 124 |
| 4.21 | Foreground calibration scheme; LDO with glitch reduction is used in the first two stages | |
| 4.22 | Using an FIR filter to generate flat in-band frequency response | |
| 4.23 | A foreground calibration method through PSO algorithm | |
| 4.24 | A low glitch LDO based on pre-charging technique | 130 |
| 4.25 | Gain and non-linearity calibration through small input power | |
| 4.26 | High-frequency distortion calibration through small input power | |
| 4.27 | Gain and non-linearity calibration through large input power | 134 |
| 4.28 | Adaptive background calibration schemes that include measuring out-of-band power, two tone test and frequency response compensation | |
| 4.29 | Chip photo with measurement set-up | |
| 4.30 | Measured spectrum before calibration | |
| 4.31 | Measured spectrum after calibration | |
| 4.32 | DNL and INL before calibration | |
| 4.33 | DNL and INL after calibration | |
| 4.34 | SNDR/SFDR versus normalized input amplitude in dB | 141 |

# LIST OF TABLES

| TABLE |                                               | Page |
|-------|-----------------------------------------------|------|
| 2.1   | Integrated power in 500 MHz Bands.            | 5    |
| 2.2   | Comparison with LNA first receivers.          | 42   |
| 2.3   | Comparison with mixer-first receivers.        | 44   |
| 3.1   | Comparison with Low Spur PLLs.                | 91   |
| 4.1   | Comparisons between different OTA structures. | 120  |
| 4.2   | SNDR over different calibration techniques.   | 142  |
| 4.3   | Comparison with pipelined ADCs                | 143  |

#### 1. INTRODUCTION

### 1.1 Motivation of High-Performance and Power-Efficient Circuit Design

With the rapid evolution of wireless communication standards from 3G to 5G, data throughput has increased dramatically. Supported by advancing semiconductor technologies, researchers and engineers are boosting system performance with power-efficient techniques. Since many portable devices, such as cellphones and Bluetooth earbuds, are powered by batteries, extending battery life by reducing power consumption has always been the goal of numerous projects and products. This dissertation focuses on high-performance and low-power circuit design techniques related to the 5G wireless communication system. It covers three fundamental building blocks of a 5G wireless receiver: a wideband and highly linear RF front-end, a fractional-N PLL with reference and fractional spur reduction techniques, and a high-speed low-power pipelined ADC.

### 1.1.1 A Wideband and Highly Linear RF Front-End Design

From LTE to 5G applications, such as WiFi-6, the total channel bandwidth has increased from 20 MHz to 160 MHz, which demands innovative circuit designs that support higher bandwidth without consuming excessive power. The large increase in baseband bandwidth leads to significant changes in system specifications, which inevitably requires new ideas at the system and circuit levels to achieve the desired performance. For a system with a 160 MHz bandwidth in a sub-6 GHz band, it is not easy to find a highly selective RF bandpass filter. Thus, a highly linear front-end is needed to process the in-band information and tolerate the out-of-band blockers.

This RF front-end, based on a mainstream CMOS technology, was designed and tested to achieve high in-band linearity, wide baseband bandwidth and out-of-band blocker tolerance simultaneously. A high $P_{1dB}$ was demonstrated on silicon without consuming excessive power. Different RF front-end structures and LNAs are reviewed to provide a comprehensive comparison between designs.

#### 1.1.2 A Fractional-N PLL with Reference and Fractional Spur Reduction Techniques

A PLL is a fundamental part of a wireless system, serving as the clock source for RF front-ends and ADCs. The performance of the PLL has a significant impact on the receiver's in-band and out-of-band performance. High phase noise and spurs lead to SNDR reductions in RF front-ends and ADCs. In a charge pump PLL, phase noise and spur levels do not trade off strongly against each other, so spur suppression techniques should be realized without introducing additional phase noise.

A fractional-N charge pump PLL targeting reduced reference spurs, fractional spurs and out-of-band noise is presented in this part of the dissertation. Detailed system-level descriptions and spur reduction techniques are included. Comparisons with state-of-the-art charge pump PLLs and all-digital PLLs highlight the performance of this design.

### 1.1.3 A High-Speed Low-Power Pipelined ADC

ADCs are indispensable parts of many electronic devices. Various system requirements lead to different ADC structures and specifications. For biomedical applications, a delta-sigma modulator with over 14-bit resolution is needed to detect signals below the $\mu V$ level. Meanwhile, for a narrowband wireless system, such as the Internet of things (IoT), a pipelined or successive approximation register (SAR) ADC with 8 to 10-bit resolution is enough to support the low data rate. Complex modulation schemes, such as QAM-256 in 5G, require ADCs with higher resolutions. For other applications, such as a modern high-speed wireline system, a high-speed 8-bit time-interleaved SAR ADC can cover the expected PAM-4 modulation within a certain channel loss.

In this project, a high-speed and low-power pipelined ADC with comprehensive foreground and background calibrations is described. For a sample rate of 1 gigasample per second (1 GS/s), the gain and bandwidth of the residue amplifier must be optimized. Due to the low intrinsic gain of the CMOS technology used, inter-stage gain calibration in the digital domain is crucial to achieve the desired system performance. Error sources in this pipelined data converter are discussed in detail. Both analog and digital calibrations are introduced to remove these errors. Background calibrations are also implemented to maintain good SNDR over PVT variations. Comparisons with state-of-the-art ADCs demonstrate the performance of this design.

### 1.2 Thesis Organization

This dissertation is organized as follows. Chapter 2 presents the comprehensive analysis and design of a wideband RF front-end, including an introduction to the system structure and comparisons with other structures. The fractional-N PLL with spur reduction techniques is presented in Chapter 3, where the PLL calibration is discussed systematically and in detail. Chapter 4 describes the design considerations of a high-speed low-power pipelined ADC with both foreground and background calibration techniques. Digital techniques are incorporated to help with the analog calibrations.

# 2. A 3-TO-6 GHZ HIGHLY LINEAR RECEIVER WITH OVER +3.0 DBM IN-BAND P<sub>1DB</sub> AND 200-MHZ BASEBAND BANDWIDTH SUITABLE FOR 5G WIRELESS AND COGNITIVE RADIO APPLICATIONS\*

### 2.1 Introduction

To achieve a broadband and blocker-resilient wireless communication system, the fifth generation (5G) system is currently under continuous evolution and development [1, 2]. This dedicated report revealed the possibility of opportunistically utilizing under-utilized bands in the 3 to 6 GHz range for high-speed radio links, which occupy a baseband bandwidth of over 200 MHz. Compared with the 20 MHz bandwidth of a 4G LTE system, WiFi-6 and 5G mobile require a bandwidth of 160 MHz and over 100 MHz, respectively. Meanwhile, a cognitive radio system needs an agile method for dynamically sensing and detecting the status of the broadband spectrum to make the most of unoccupied bands. All of these demand wideband wireless systems.

As depicted in Fig. 2.1, a frequency-agile cognitive radio front-end is required to sense the bands occupied by primary users, find empty frequency spots and allocate them to secondary users [3]. Because the empty frequency spots change dynamically and randomly over time, the front-end for the cognitive radio application should be able to sweep through the desired frequency bands swiftly to make the most of the available spectrum resources. Fig. 2.2 shows the power spectral density measured over one month in the 3 to 6 GHz range by the Microsoft Technology Policy Group's Spectrum CityScape project [4]. During the peak time (between 3:00 and 4:00 p.m. on a busy weekday), the integrated power over the 3 to 6 GHz bandwidth was -0.09 dBm, which is close to 0 dBm. Meanwhile, the highest total power in a single 500 MHz sub-band was -6.29 dBm, as listed in Table 2.1. In this case, for wideband receivers that need to process information concurrently without using highly selective RF bandpass filters, the required P<sub>1dB</sub> is over -6.0 dBm for a 500 MHz bandwidth and 0 dBm for a 3 GHz bandwidth, respectively.

<sup>\*</sup>Part of the data reported in this chapter is reprinted with permission from [1], "A 3-6 GHz Highly Linear I-Channel Receiver with Over +3.0-dBm In-Band P<sub>1dB</sub> and 200-MHz Baseband Bandwidth Suitable for 5G Wireless and Cognitive Radio Applications" by Junning Jiang et al., 2019, IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 66, no. 8, pp. 3134-3147, Copyright ©2019, IEEE

Figure 2.1: Illustration of frequency and time allocation in cognitive radio systems with power level indicated.

| Frequency Range (GHz) | Integrated Power (dBm) |
|-----------------------|------------------------|
| 3.0-3.5               | -9.04                  |
| 3.5-4.0               | -7.17                  |
| 4.0-4.5               | -8.45                  |
| 4.5-5.0               | -8.14                  |
| 5.0-5.5               | -8.88                  |
| 5.5-6.0               | -6.29                  |

Table 2.1: Integrated power in 500 MHz Bands.
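As a quick consistency check on the numbers above, the sub-band powers in Table 2.1 can be summed in the linear domain and converted back to dBm. The sketch below is only illustrative: the function and variable names are hypothetical and are not part of the original work, and the six values are copied directly from Table 2.1.

```python
import math

# Integrated power per 500 MHz sub-band, copied from Table 2.1 (dBm).
SUB_BAND_DBM = {
    "3.0-3.5": -9.04,
    "3.5-4.0": -7.17,
    "4.0-4.5": -8.45,
    "4.5-5.0": -8.14,
    "5.0-5.5": -8.88,
    "5.5-6.0": -6.29,
}


def dbm_to_mw(p_dbm):
    """Convert a power level in dBm to milliwatts."""
    return 10 ** (p_dbm / 10)


def mw_to_dbm(p_mw):
    """Convert a power in milliwatts to dBm."""
    return 10 * math.log10(p_mw)


def total_power_dbm(powers_dbm):
    """Sum powers given in dBm by adding them in the linear (mW) domain."""
    return mw_to_dbm(sum(dbm_to_mw(p) for p in powers_dbm))


total = total_power_dbm(SUB_BAND_DBM.values())
worst_sub_band = max(SUB_BAND_DBM.values())

print(f"Total power, 3 to 6 GHz: {total:.2f} dBm")
print(f"Largest 500 MHz sub-band: {worst_sub_band:.2f} dBm")
```

The linear sum comes out at roughly -0.1 dBm, which agrees with the -0.09 dBm quoted in the text to within the rounding of the table entries, and the largest single sub-band is the 5.5 to 6.0 GHz entry.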

The RF filters should be considered before choosing a frequency plan. For 5G front-ends with highly selective RF filters, a channel aggregation plan is usually used, i.e., multiple narrowband systems work in parallel to support a wideband system. Although the linearity and bandwidth requirements are significantly relaxed in each narrowband system, excessive power consumption and mismatches among different channels need to be considered in advance. This methodology cannot fully exploit the benefits of modern high-speed CMOS technologies. If highly selective RF filters are not used, the wireless system needs to rely completely on the front-end to tolerate input signals and blockers simultaneously. Since this is a wideband system, out-of-band information needs to be filtered after receiving the broadband signal. This wideband plan greatly reduces hardware complexity, and innovative ideas for achieving a highly linear front-end are incorporated in this project.

Figure 2.2: Measured power spectral density during the busiest hour of one entire day (between 3:00 p.m. and 4:00 p.m. in [3]), and signal power when integrated in 500 MHz bands.

As shown in Fig. 2.3, three mainstream RF front-end structures are compared. The labels "P", "I" and "V" on the y-axis denote power, current and voltage, respectively. For the low-noise amplifier (LNA) first structure, the LNA usually has a gain over 10 dB to suppress the noise from the mixer. This structure is usually not blocker tolerant because the required voltage gain of the LNA amplifies blockers, which can strongly distort the LNA and mixer. For the LNTA-first structure, as long as the LNTA is linear for the desired input power level, the output is in current mode and will not saturate at the LNTA's output. The trans-impedance amplifier (TIA) is linear because the input baseband current does not generate excessive swing at the TIA output. This structure requires the noise figure (NF), linearity, conversion gain and power consumption to be optimized together. For example, a larger conversion gain reduces linearity at the output of the TIA, making the system less tolerant to blockers. The mixer-first structure can only provide narrowband matching, so the out-of-band components of the wideband input signal are reflected due to the poor out-of-band matching.

Figure 2.3: Structure comparisons between LNA-first, LNTA-first and mixer-first.

The mixer-first structure is also sensitive to the phase noise of the clock sources, and the input antenna is modulated by local oscillator (LO) leakage. Although out-of-band blockers are reflected, this structure cannot tolerate high in-band power, so a high in-band IIP3 and $P_{1dB}$ cannot be achieved. In sum, the LNTA-first structure is the best candidate for the proposed highly linear and wideband applications.
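The blocker-amplification argument against the LNA-first structure can be illustrated with a small numerical sketch. It converts a blocker power into a peak voltage before and after an LNA with 10 dB of gain. The 50 ohm reference impedance at both nodes, the 0 dBm blocker level and all function names are assumptions chosen for illustration; they are not taken from the measured design.

```python
import math

R_REF_OHM = 50.0  # assumed reference impedance (illustrative only)


def dbm_to_vpeak(p_dbm, r_ohm=R_REF_OHM):
    """Peak voltage of a sinusoid delivering p_dbm into r_ohm."""
    p_watt = 10 ** ((p_dbm - 30) / 10)
    return math.sqrt(2 * p_watt * r_ohm)


def level_after_gain_dbm(p_in_dbm, gain_db):
    """Signal level after a stage with the given gain in dB."""
    return p_in_dbm + gain_db


blocker_in_dbm = 0.0  # assumed blocker level at the LNA input
lna_gain_db = 10.0    # "gain over 10 dB" for an LNA-first receiver

v_in = dbm_to_vpeak(blocker_in_dbm)
v_out = dbm_to_vpeak(level_after_gain_dbm(blocker_in_dbm, lna_gain_db))

print(f"Blocker at LNA input : {v_in:.3f} V peak")
print(f"Blocker at LNA output: {v_out:.3f} V peak")
```

Under these assumptions the blocker swing grows by a factor of about 3.16 at the LNA output, which shows why a voltage-mode first stage is pushed toward compression, whereas a current-mode LNTA output avoids building up this swing.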

There are numerous RF receivers with different architectures, and most of them have a baseband bandwidth of less than 20 MHz for a 4G standard [5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]. Among these topologies, [6] implements digital techniques to enhance the front-end's robustness against out-of-band interference. By using a multiple-path cancellation technique, [7] achieves a low noise figure (NF). [8] improves the front-end's ability to reduce blockers at harmonic frequencies. [9][10][11] also implement techniques to improve their tolerance to blockers through active sensing and cancellation. A reconfigurable receiver in [12] provides both low noise amplifier (LNA) first and LNA-less modes to tolerate up to a 0 dBm out-of-band (OOB) blocker. [13] strengthens spectrum selectivity by using a mixer-first structure and a pole-pair adjustment technique in baseband. The work in [14] implements a configurable baseband bandwidth, and [15] employs higher-order filtering to further enhance the system's tolerance to OOB blockers. In this project, a wideband receiver targets a 3-to-6 GHz RF range, a 200 MHz baseband bandwidth and a P<sub>1dB</sub> over 0 dBm, which makes it capable of processing a total power over 0 dBm. The proposed receiver is mainly intended to work as a frequency-agile spectrum sensing node for cognitive or opportunistic radios; the direct conversion receiver built from this front-end can also be configured as a wideband 5G receiver, which needs 100 MHz for mobile and 160 MHz for WiFi applications.

This wideband receiver consists of an LNTA, a current-mode passive mixer [16], a TIA and minimally-invasive baseband filters, as shown in Fig. 2.4. The main contributions of this work are associated with the linear LNTA and the second-order minimally-invasive baseband filter. Mainstream technologies with short-channel transistors come with higher channel noise and a lower power supply simultaneously; hence, the design of the LNTA is critical in this wideband wireless system.

For the different LNTA structures in Fig. 2.5, common source (CS) with inductive degeneration is only suitable for narrowband applications, such as GSM. The inductor $L_G$ provides gain boosting to the LNTA, and NF is reduced at the cost of low linearity. The common gate (CG) structure has acceptable linearity; however, its NF is limited by the structure [17]. The LNTA with resistive feedback, similar to a TIA, provides good linearity through closed-loop operation, which requires additional power. This structure also needs an auxiliary path for noise cancellation [17], which further increases power consumption. Meanwhile, to drive a current-mode passive mixer, the LNTA with resistive feedback needs a voltage-to-current conversion stage, which requires extra design effort and introduces distortion. The resistive feedback LNTA therefore consumes excessive power without offering significant advantages over the common gate structure. Thus, the CG structure is the candidate for the wideband LNTA design, and its linearity and NF need to be further improved.

Figure 2.4: Proposed wideband highly linear receiver system.

Figure 2.5: Comparison of LNTA structures.

Since a large amount of down-converted current is injected into the TIA, a low input impedance and a moderate trans-impedance gain prevent the TIA from generating nonlinear components. To reduce the mixer's distortion, the total impedance at the mixer's output is kept low over a wide frequency range by paralleling the TIA,  $C_3$  and the minimally-invasive filter.

For the filter's implementation, a second-order minimally-invasive topology is preferred. The filter will be transparent to in-band signals and will provide a low-impedance path for out-of-band signals. The capacitor  $C_3$  helps maintain a low nodal impedance at the TIA input at higher frequencies, thereby absorbing the very high frequency components after the mixer.

To prove the properties of the proposed LNTA and minimally-invasive filter within the receiver chain, only the I channel is implemented. The proposed concepts can be extrapolated to a direct conversion receiver. The chapter is organized as follows. First, transistor-level implementations of the blocks are discussed with a focus on the LNTA, the second-order minimally-invasive filter and system-level optimizations. Second, comprehensive measurement results on silicon are provided. Finally, discussions and conclusions pertaining to this receiver are presented.

#### 2.2 Proposed Receiver Circuit Blocks

#### 2.2.1 Source Degenerated Transconductance Amplifier

The LNTA determines the noise floor of the system [5, 17]. The CS architecture with inductive degeneration is popular for narrowband systems, but it is not suitable for broadband applications. The LNTA architecture with resistive feedback has been widely studied and used in recent publications [18]. Its linearity improvement is a result of feedback. However, it provides a low output impedance and can only be coupled to a mixer working in the voltage mode. In addition, the resistive feedback LNTA presents an NF larger than  $1 + \gamma$  unless a noise cancellation technique is employed [17], where  $\gamma$  is the fitting parameter of the transistor's channel noise, whose value can be as large as 2 in many cases. This topology is usually power hungry, even if advanced technologies are used, due to potential stability issues related to its closed-loop operation. The common source and common gate (CS-CG) LNTA requires a higher transconductance in the CS branch to reduce NF, which leads to second-order distortion due to the unbalanced properties of the two paths. Second-order distortion can generate DC offsets, which can be converted to third-order components by the nonlinear behavior of later stages. On the other hand, the signal directly goes to the transistor's gate or source, which limits its linearity. Trading off system-level linearity, NF, bandwidth and power consumption, the best choice is the differential CG LNTA.

The CG LNTA structure shows inherently wideband impedance matching and higher linearity compared to the CS LNTA with inductive degeneration [17], [19]. With the required input matching, it can be shown that the NF of a CG LNTA is bounded by  $1 + \gamma$  [17]. Unfortunately, when short channel transistors are used to build the CG LNTA, the NF is larger than  $10\log_{10}(1 + \gamma) = 4.77$ dB if  $\gamma = 2$ . The non-zero gate resistance  $R_G$  of the transistor and the limited quality factor of  $L_S$  and  $L_D$  will make the NF higher than 5 dB. In this prototype, a CG LNTA-based architecture is proposed with simultaneous improvements in NF and linearity.

The conventional CG LNTA displayed in Fig. 2.6(a) has a fundamental trade-off between input impedance matching and NF. Both parameters are dictated by the transconductance  $(g_m)$  of a CG



Figure 2.6: Common gate LNTA (a) conventional (b) with source degeneration resistor  $R_X$ .

transistor. To overcome this trade-off, a degeneration resistor  $R_X$  is inserted between the input and the CG LNTA as shown in Fig. 2.6(b). The voltage swing across the nonlinear CG transistor is reduced by the presence of  $R_X$ , which improves the linearity for a limited transistor overdrive voltage. In addition, the NF of the resistively degenerated CG LNTA is smaller than  $1 + \gamma$  since the thermal noise of a resistor is smaller than that generated by a transistor with an equivalent resistance equal to  $1/g_m$ , especially when  $\gamma > 2$ . The input matching condition and NF of the proposed CG LNTA can be briefly expressed as,

$$R_S = R_X + \frac{1}{g_m} \tag{2.1}$$

$$NF = 1 + \frac{R_X}{R_S} + \frac{\gamma}{g_m R_S} \tag{2.2}$$

where  $R_S$  is the driving impedance from the antenna. Since the input matching requires (2.1) to be satisfied, and  $\gamma > 2$ , it can be concluded that the noise contribution due to the input matching devices is reduced by the amount  $(\gamma - 1)R_X/R_S$  for short channel devices. For instance, with  $\gamma = 2$ , the NF is below 4.0 dB when  $R_X/R_S > 0.5$ .
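The trade-off described by (2.1) and (2.2) can be checked numerically. The following is a minimal sketch, assuming a perfectly matched input and an ideal resistor; the function name `cg_lnta_nf_db` is introduced here for illustration only. Substituting (2.1) into (2.2) gives $NF = 1 + \gamma - (\gamma - 1)R_X/R_S$, which is what the function evaluates.

```python
import math


def cg_lnta_nf_db(rx_over_rs: float, gamma: float = 2.0) -> float:
    """NF in dB from (2.2) with the matching condition (2.1) substituted.

    rx_over_rs is the ratio R_X / R_S; gamma is the channel-noise
    fitting parameter (2.0 is an assumed short-channel value).
    """
    if not 0.0 <= rx_over_rs < 1.0:
        raise ValueError("R_X/R_S must lie in [0, 1) for a matched CG stage")
    # (2.1) gives 1/(g_m R_S) = 1 - R_X/R_S, so (2.2) becomes
    # NF = 1 + gamma - (gamma - 1) * R_X/R_S.
    nf_linear = 1.0 + gamma - (gamma - 1.0) * rx_over_rs
    return 10.0 * math.log10(nf_linear)


for ratio in (0.0, 0.25, 0.5, 0.7):
    print(f"R_X/R_S = {ratio:.2f} -> NF = {cg_lnta_nf_db(ratio):.2f} dB")
```

With $\gamma = 2$, the sketch reproduces the 4.77 dB bound at $R_X = 0$ and falls just under 4.0 dB at $R_X/R_S = 0.5$.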

If the impedance of the load inductor  $L_S$  and the transistor output impedance  $r_{ds}$  are both ignored, it can be shown that the input impedance looking into node P in Fig. 2.6(b) is expressed as,

$$\begin{aligned} Z_{in} &= \frac{\frac{1}{g_m} \frac{1}{s(C_{GS} + C_{SB} + C_{Par})}}{\frac{1}{g_m} + \frac{1}{s(C_{GS} + C_{SB} + C_{Par})}} + R_X \\ &= R_S - \left(\frac{\frac{s(C_{GS} + C_{SB} + C_{Par})}{g_m}}{1 + \frac{s(C_{GS} + C_{SB} + C_{Par})}{g_m}}\right) \left(\frac{1}{g_m}\right) \end{aligned} \tag{2.3}$$

where  $C_{GS}$  and  $C_{SB}$  are the gate-source and source-bulk capacitances, respectively, and  $C_{Par}$  is the parasitic capacitance from the source node to ground. If (2.1) holds, then (2.3) indicates that  $C_{GS} + C_{SB} + C_{Par}$  has a minimal effect on impedance matching if the pole frequency is well beyond the frequency of interest and if  $1/g_m < R_S$ . However, since the transistor is built in a deep N-well (DNW) in this technology, the parasitic capacitor  $C_{Par}$  lowers the pole frequency to the range of 50 GHz, where  $C_{GS}$ ,  $C_{SB}$  and  $C_{Par}$  are estimated as 39.87 fF, 5.07 fF and 161.2 fF, respectively.  $C_{Par}$  comes from the fixed layout patterns required by the technology. Also, a larger metal area is used to reduce the gate resistance of the transistor and to carry more current safely. All these factors contribute to  $C_{Par}$ . The pole frequency drops to around 10 GHz when the transistor's length and width are doubled. Therefore, a transistor length over 80 nm is not preferred for this design.
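As a quick sanity check of the pole location implied by (2.3), the sketch below evaluates $g_m / (2\pi(C_{GS} + C_{SB} + C_{Par}))$ with the capacitances quoted above. The value $1/g_m = 15\ \Omega$ is an assumption borrowed from the $R_X = 35\ \Omega$ design point; the helper name `source_pole_hz` is hypothetical.

```python
import math


def source_pole_hz(gm_s: float, c_gs_f: float, c_sb_f: float, c_par_f: float) -> float:
    """Pole frequency (Hz) of the source node in (2.3): g_m / (2*pi*C_total)."""
    c_total = c_gs_f + c_sb_f + c_par_f
    return gm_s / (2.0 * math.pi * c_total)


GM = 1.0 / 15.0  # assumption: 1/g_m = 15 ohm (R_X = 35 ohm, R_S = 50 ohm)
f_pole = source_pole_hz(GM, 39.87e-15, 5.07e-15, 161.2e-15)
f_no_par = source_pole_hz(GM, 39.87e-15, 5.07e-15, 0.0)

print(f"pole with C_Par:    {f_pole / 1e9:.1f} GHz")
print(f"pole without C_Par: {f_no_par / 1e9:.1f} GHz")
```

Under this assumption the pole lands near 50 GHz, and removing $C_{Par}$ from the sum shows how strongly the deep N-well parasitic dominates the result.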

The presence of a finite  $r_{ds}$  affects the overall input impedance since it interacts with the amplifier's source and drain nodes. If the drain of the CG stage is loaded with a low impedance, it will not introduce significant differences in (2.1), (2.2) and (2.3);  $g_m$  can simply be replaced by  $g_m + g_{ds}$ . For example, for  $R_X = 35 \Omega$ ,  $1/g_m = 15 \Omega$  and a  $g_{ds}/g_m$  ratio of 0.2, the overall LNTA input impedance decreases by 2.5  $\Omega$ . However, if the CG stage is loaded by an impedance higher than  $1/g_m$ , the LNTA input impedance increases [17]. On the other hand, the source degeneration approach linearizes the transistor with inherently linear passive components. When weak desired signals are processed in the presence of strong out-of-band blockers, this LNTA still functions well since the resistor effectively reduces the voltage swing across the transistor terminals by a factor of



Figure 2.7: Small signal model of CG LNTA degenerated by resistor  $R_X$  and load impedance  $Z_{D1}$ .

 $1 + g_m R_X$ . The inherent local feedback present in this resistive degeneration topology makes the LNTA's performance robust over PVT variations. Since the transistor's  $g_m$  is a nonlinear function of the transistor's overdrive voltage, the  $I_{DS}$  can be expressed from the items from  $g_m$ , such as second order  $g'_m$  and third order  $g''_m$ [20][21]. Also the  $g_{ds}$  is included, which is reflected in Fig. 2.7

$$I_{DS} = g_m V_{GS} + g'_m V_{GS}^2 + g''_m V_{GS}^3 + g_{ds} (V_D - V_S)$$
(2.4)

For  $V_D$  and  $V_X$ ,

$$V_D = -I_{DS}ZD1 \tag{2.5}$$

$$V_X = V_S - I_{DS}RX \tag{2.6}$$

Since the gate (G) of the transistor is AC-grounded, the target is to obtain the relationship between  $I_{DS}$  and  $V_X$  in the following form,

$$I_{DS} = -(bV_X + b'V_X^2 + b''V_X^3) \tag{2.7}$$

Through calculations, and assuming that the higher-order terms are small in comparison with the second-order and third-order terms, it can be derived that

$$b = \frac{\frac{g_m + g_{ds}}{1 + g_{ds}Z_{D1}}}{1 + \frac{g_m + g_{ds}}{1 + g_{ds}Z_{D1}}R_X} \tag{2.8}$$

$$b' = \frac{\frac{g'_m}{1 + g_{ds} Z_{D1}}}{\left(1 + \frac{g_m + g_{ds}}{1 + g_{ds} Z_{D1}} R_X\right)^3} \tag{2.9}$$

$$b'' = \frac{\frac{g''_m}{1+g_{ds}Z_{D1}} - 2\frac{\frac{{g'_m}^{2}R_X}{(1+g_{ds}Z_{D1})^2}}{1+\frac{g_m+g_{ds}}{1+g_{ds}Z_{D1}}R_X}}{\left(1 + \frac{g_m+g_{ds}}{1+g_{ds}Z_{D1}}R_X\right)^4} \tag{2.10}$$

If a two-tone input signal with equal amplitudes, such as  $A\cos(\omega_1 t) + A\cos(\omega_2 t)$ , is applied at the LNTA input, the output current  $I_{DS}$  can be expressed as,

$$I_{DS}(t) = b(A\cos(\omega_{1}t) + A\cos(\omega_{2}t)) + b'(A\cos(\omega_{1}t) + A\cos(\omega_{2}t))^{2} + b''(A\cos(\omega_{1}t) + A\cos(\omega_{2}t))^{3} \tag{2.11}$$

where  $g_m$  is the small signal transconductance of the transistor, and  $g'_m$  and  $g''_m$  are the first-order and second-order derivatives of the transistor  $g_m$  with respect to  $V_{GS}$ , respectively.  $g_m$ ,  $g'_m$ ,  $g''_m$  and  $g_{ds}$  are evaluated at the selected operating point. Much like [21], nonlinear terms from  $g_m$  and a linear  $g_{ds}$  are assumed.  $Z_{D1}$  is the equivalent load impedance looking out from the drain of the CG LNTA. When a cascode stage is implemented,  $Z_{D1}$  equals the impedance looking into the source of the cascode transistor. If no cascode is used,  $Z_{D1}$  is  $L_D$  in parallel with the mixer's input impedance and the CG stage's output capacitance. The IIP3 is a widely used parameter [17] for evaluating small signal system linearity. The expression for IIP3 can be obtained from (2.11) as

$$IIP3 = \sqrt{\frac{4}{3}\frac{b}{b''}} = \sqrt{\frac{4}{3}\frac{\frac{g_m + g_{ds}}{1 + g_{ds}Z_{D1}}}{\frac{g''_m}{1 + g_{ds}Z_{D1}} - 2\frac{\frac{{g'_m}^{2} R_X}{(1 + g_{ds}Z_{D1})^2}}{1 + \frac{g_m + g_{ds}}{1 + g_{ds}Z_{D1}}R_X}}} \times \left(1 + \frac{g_m + g_{ds}}{1 + g_{ds}Z_{D1}}R_X\right)^{\frac{3}{2}} \tag{2.12}$$

The LNTA's IIP3 without source degeneration is

$$IIP3_{transistor} = \sqrt{\frac{4}{3} \frac{g_m + g_{ds}}{g_m''}} \tag{2.13}$$

Since  $g_m$ ,  $g'_m$  and  $g''_m$  are strong functions of the transistor's overdrive voltage, extensive simulations can help to select the required bias voltages. In this design, the overdrive voltage is set at 100 mV, which leads to an IIP3 of around 12.5 dBm when  $R_X = 0$ . From (2.12), for the same overdrive voltage, the LNTA IIP3 improves with  $R_X$ . For instance, if  $g_m R_X = 2.3$ , the LNTA's IIP3 increases by around 15.5 dB when  $g_{ds}$  is taken as 0. When  $Z_{D1}$  is small, with  $g_{ds}/g_m = 0.2$ , a 17.9 dB IIP3 increment should be observed. Besides, the proposed circuit breaks the direct relationship between  $g_m$  and input impedance matching in (2.1). Enlarging  $R_X$  improves linearity, but this approach is limited by the impedance matching condition (2.1). The linearity enhancement also comes with a power consumption penalty.
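The degeneration benefit quoted above can be reproduced with a short numerical sketch. It evaluates only the $\left(1 + G R_X\right)^{3/2}$ factor of (2.12), with $G = (g_m + g_{ds})/(1 + g_{ds}Z_{D1})$, and therefore assumes that the $g'_m$ contribution in (2.10) is negligible; the function name `degeneration_iip3_gain_db` is hypothetical.

```python
import math


def degeneration_iip3_gain_db(gm: float, rx: float, gds: float = 0.0, zd1: float = 0.0) -> float:
    """IIP3 improvement (dB) from the (1 + G*R_X)^(3/2) factor in (2.12).

    Assumes the g'_m term is negligible, so only the degeneration
    factor separates (2.12) from the undegenerated case (2.13).
    """
    g_eff = (gm + gds) / (1.0 + gds * zd1)
    # IIP3 in (2.12) is a voltage amplitude, hence 20*log10.
    return 20.0 * math.log10((1.0 + g_eff * rx) ** 1.5)


# Normalised example: g_m = 1 S and R_X = 2.3 ohm give g_m * R_X = 2.3.
gain_db = degeneration_iip3_gain_db(gm=1.0, rx=2.3)
print(f"IIP3 improvement for g_m*R_X = 2.3, g_ds = 0: {gain_db:.2f} dB")
```

For $g_m R_X = 2.3$ and $g_{ds} = 0$, the factor evaluates to roughly 15.5 dB, in line with the figure stated in the text.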

#### 2.2.2 Proposed Low Noise Transconductance Amplifier

To further reduce the NF of the single CG LNTA, a fully differential LNTA with the cross-coupled technique is adopted as shown in Fig. 2.8(b). In comparison with the single-ended LNTA in Fig. 2.8(a), when the input matching condition is kept, the signal swings over  $M_{CG1A}$  and  $M_{CG1B}$  in Fig. 2.8(b) are the same as the swing over  $M_{CG}$  in Fig. 2.8(a). Additionally, even-order components are suppressed when differential output signals are processed, which benefits the system's large signal linearity [21]. The cross-coupled structure decreases the NF through the noiseless  $g_m$  boosting mechanism [22] [23].



Figure 2.8: Input signal levels over each node labeled for (a) single-ended LNTA with resistive degeneration (b) proposed cross-coupled differential LNTA with resistive degeneration.



Figure 2.9: Simplified circuit for calculating NF of cross-coupled and resistive degenerated CG LNTA.

A detailed step-by-step derivation of the cross-coupled LNTA's NF follows. A simplified circuit is depicted in Fig. 2.9.  $I_{n,DM}$  ( $\sqrt{4kT\gamma g_m}$ ) and  $I_{n,DX}$  ( $\sqrt{4kT/R_X}$ ) are the noise currents of  $M_{CG1A}$  and  $R_X$ , respectively. Employing KCL, it can be shown that

$$\frac{g_m}{1+g_m R_X} (V_X - V_Y) = \frac{V_Y}{0.5R_S} = I_{RF+} \tag{2.14}$$

From  $R_S = R_X + 1/g_m$  and (2.14),  $V_X = 3V_Y$  is derived. Since the noise currents  $I_{n,DM}$  and  $I_{n,DX}$  are statistically independent, the principle of superposition is applied to analyze them separately. Starting with  $I_{n,DM}$ ,

$$g_m(V_Y - V_Z) + I_{n,DM} = \frac{V_Z - V_X}{R_X} = \frac{V_X}{0.5R_S} = I_{RF+} \tag{2.15}$$

and for  $I_{n,DX}$ ,

$$g_m(V_Y - V_Z) = \frac{V_Z - V_X}{R_X} + I_{n,DX} = \frac{V_X}{0.5R_S} = I_{RF+} \tag{2.16}$$

The noise currents  $I_{n,DM}$  and  $I_{n,DX}$  are observed as  $I_{n,diff,DM}$  and  $I_{n,diff,DX}$ , respectively, at the differential output  $I_{RF+}$  and  $I_{RF-}$ . The total noise at the differential output is

$$\begin{aligned} \overline{I_{n,diff}^{2}} &= \overline{I_{n,diff,DM}^{2}} + \overline{I_{n,diff,DX}^{2}} \\ &= \left(\frac{1}{2g_{m}R_{S}}\right)^{2}\overline{I_{n,DM}^{2}} + \left(\frac{g_{m}R_{X}}{2g_{m}R_{S}}\right)^{2}\overline{I_{n,DX}^{2}} \\ &= \left(\frac{1}{2g_{m}R_{S}}\right)^{2}4kT\gamma g_{m} + \left(\frac{g_{m}R_{X}}{2g_{m}R_{S}}\right)^{2}\frac{4kT}{R_{X}} \end{aligned} \tag{2.17}$$

Thus, by considering the two arms present in the differential implementation, the NF can be expressed as follows, which is also in accordance with [21],

$$NF = 1 + \frac{2\left(\frac{1}{4}\frac{4kT\gamma g_m}{g_m^2 R_S^2} + \frac{1}{4}\frac{4kTR_X}{R_S^2}\right)}{\frac{4kT}{R_S}} = 1 + \frac{R_X}{2R_S} + \frac{\gamma}{2g_m R_S} \tag{2.18}$$

When the input matching condition (2.1) is maintained, the NF decreases as  $R_X$  increases, as depicted in Fig. 2.10. For  $\gamma = 2$  and  $R_S = 50 \Omega$ , NF = 4.77 dB when  $R_X = 0$ ; however, if



Figure 2.10: Simulated and calculated NF of proposed LNTA over different  $R_X$  values.

 $R_X = 35\Omega$ , then a gm of 66.7 mS is required, and the estimated LNTA's NF is 2.17 dB. The results from this equation fits with the Cadence simulation results when  $\gamma = 2$ . Meanwhile,  $M_{CG1A}$ ,  $M_{CG1B}$ ,  $M_{CG2A}$  and  $M_{CG2B}$ 's width over length ratio is 96 um/40 nm.  $L_S(4nH)$  and  $L_D(6nH)$  are center-tapped octagonal inductors with a quality factor over 12.5 and 10, respectively, from 3 to 6 GHz. Metal-Oxide-Metal (MOM) capacitors are used to implement  $C_{P1}$  with a quality factor of more than 22 from 3 to 6 GHz.

To simulate the IIP3 of the LNTA over different  $R_X$  values, the LNTA's output is loaded with an AC-coupled 20  $\Omega$  resistor to model the input impedance of the passive mixer terminated with a low impedance TIA. The input matching condition of the LNTA is kept, and a constant overdrive voltage of 100 mV over  $M_{CG1A}$  and  $M_{CG1B}$  is maintained.  $M_{CG2A}$  and  $M_{CG2B}$  have a gate bias voltage of 1.8 V, which is the same as  $V_{DD}$ . The source degeneration resistor  $R_X$  is swept from 0 to 45  $\Omega$  while (2.1) is satisfied. The transistors are thin-oxide devices in a deep N-well, which avoids reliability issues over high-voltage swings and achieves substrate noise isolation. Fig. 2.11 shows that the simulated IIP3 improvement and power consumption



Figure 2.11: Simulated LNTA IIP3 and power consumption over different  $R_X$  values.

increase as a function of the source degeneration resistor. Compared to the case of  $R_X = 0$ , which consumes around 10 mW, the power consumption doubles and the IIP3 improves by more than 5 dB when an  $R_X$  of 25  $\Omega$  is used. When  $R_X$  is set at 35  $\Omega$ , the LNTA's IIP3 increases by 10 dB at a power consumption of 29.4 mW with a DC current of 16.3 mA. Fig. 2.12 shows the simulated small-signal output current over input voltage versus frequency for the cross-coupled, resistively degenerated LNTA. As shown in Fig. 2.29(b), the differential output currents are observed at the  $I_{RF+}$  and  $I_{RF-}$  nodes. The variation of the LNTA small-signal transconductance is less than 2 mS in the 3 to 6 GHz frequency range, and the 3-dB bandwidth is 6.2 GHz, ranging from 2.2 to 8.4 GHz. Fig. 2.13 compares the simulated IIP3 with the IIP3 calculated using (2.12) for a CG LNTA with a cascode.  $Z_{D1} = 1/(g_m + g_{ds})$  is the case when the CG LNTA is loaded with a cascode transistor, as in a single branch of the cross-coupled LNTA in Fig. 2.29(b). The impedance looking into the cascode stage with its small-signal load is  $1/(g_m + g_{ds})$  [17]. The simulated IIP3 fits well with the


Figure 2.12: Simulated small signal current over input voltage as function of frequency.



Figure 2.13: Calculated and simulated IIP3 for different  $R_X$  values.

calculated IIP3 when  $R_X$  is less than  $20 \Omega$ . When  $R_X$  is larger than  $25 \Omega$ , the IIP3 improvement from resistive degeneration is less than expected; it does not grow as fast as the model in (2.12) predicts. The model in (2.12) assumes a linear  $g_{ds}$  and a linear load looking into the cascode stage of the CG LNTA. Due to the nature of short channel transistors, other small signal sources exist, such as  $g_{dg}V_{DG}$  [24], which make the IIP3 derivation more complex. Hence, the authors concluded that the linearity model derived from [21] generates an optimistic prediction of the LNTA IIP3 growth with resistive degeneration. Compared to a long channel design, short channel transistors offer a lower intrinsic gain and a reduced linearity benefit from resistive degeneration, but a better frequency response. Applying a more complex model to approximate the transfer function of a short channel transistor is beyond the scope of this work. In this design,  $g_m = 64.3$ mA/V,  $g'_m = -3.4$ mA/V$^2$  and  $g''_m = 67$ mA/V$^3$  with the overdrive voltage set at 100 mV.

Based on the observed trade-offs between linearity, noise and power consumption,  $R_X$  was set at 35  $\Omega$ ; this selection allowed the LNTA to achieve a systematic in-band IIP3 and P<sub>1dB</sub> over 22.5 and 7.0 dBm, respectively, even under process variations and preset design margins, which is comparable with the LNAs reported in [25], [26]. An  $R_X$  of 45  $\Omega$  requires a power supply higher than 1.8 V and excessive power consumption. Overall, this topology exhibits outstanding linearity over a wide input power range, does not require additional linearity compensation methodologies, and provides reliability and robustness through its linear feedback properties.

#### 2.2.3 Passive Mixer

A fully differential passive mixer is used, and its linearity surpasses that of the other building blocks provided the single-ended impedance looking into the TIA is under 10  $\Omega$  [18] [27]. System simulation results indicate that the mixer contributes a conversion gain of approximately -4.2 dB to the system, which is 0.3 dB lower than the theoretical conversion gain of  $2/\pi$  for the differential passive mixer [17]. A high LNTA output impedance and a low TIA input impedance prevent a larger insertion loss in the fully differential current-mode mixer. The mixer transistors are sized to provide current conduction capabilities similar to those of the LNTA transistors, thereby avoiding large-signal distortion.
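The comparison with the theoretical conversion gain can be made explicit with a two-line calculation. The sketch below converts the ideal $2/\pi$ current conversion gain to dB and compares it with the simulated -4.2 dB; the variable names are illustrative.

```python
import math

# Ideal conversion gain of a differential passive mixer: 2/pi (current ratio).
ideal_gain_db = 20.0 * math.log10(2.0 / math.pi)

SIMULATED_GAIN_DB = -4.2  # value quoted from the system simulation
shortfall_db = ideal_gain_db - SIMULATED_GAIN_DB

print(f"ideal 2/pi gain: {ideal_gain_db:.2f} dB")
print(f"shortfall:       {shortfall_db:.2f} dB")
```

The ideal value is about -3.92 dB, so the simulated mixer falls short by roughly 0.3 dB.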

#### 2.2.4 Trans-impedance Amplifier (TIA)



Figure 2.14: A two-stage OTA with feed-forward compensation.

To maintain a low input impedance at the TIA's input and handle a large amount of current from the mixer, a wideband and high-gain operational transconductance amplifier (OTA) is required. The TIA's feedback resistor  $R_F$  in Fig. 2.4 was set to 200  $\Omega$  to reach the desired trade-off between conversion gain, linearity and NF. Large conversion gains are not possible in this application because large input signals are expected. For example, a -10 dBm in-band blocker will generate a 530 mV differential peak-to-peak swing at the TIA output. A larger receiver conversion gain would saturate the TIA's output. The limited conversion gain of the receiver makes it even more important to reduce the TIA's noise contribution. A two-stage amplifier with feedforward compensation [28] was used in this TIA. Its simplified schematic is shown in Fig. 2.14. The open loop gain is described as,

$$\frac{V_{out}(s)}{V_{in}(s)} = -(A_{V1}A_{V2} + A_{V2a})\frac{1 + \frac{sA_{V2a}}{(A_{V1}A_{V2} + A_{V2a})\omega_{p1}}}{\left(1 + \frac{s}{\omega_{p1}}\right)\left(1 + \frac{s}{\omega_{p2}}\right)} \tag{2.19}$$

In this case,  $A_{V1} = g_{m1}R_{O1}$ ,  $A_{V2} = g_{m2}R_{O2}$  and  $A_{V2a} = g_{m2a}R_{O2}$ .  $\omega_{p1}$  and  $\omega_{p2}$  are located at  $1/R_{O1}C_{O1}$  and  $1/R_{O2}C_{O2}$ , respectively, and  $\omega_{p1}$  is the dominant pole. The zero at  $(A_{V1}A_{V2}/A_{V2a}+1)\omega_{p1}$  was designed to match  $\omega_{p2}$  to enhance the OTA's bandwidth, phase margin and transient response. The transconductance  $g_{m1}$  determines the baseband noise at the OTA output, especially flicker noise.
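The pole-zero cancellation described above can be illustrated numerically. The sketch below evaluates (2.19) and places $\omega_{p2}$ exactly on the feedforward zero, so the response collapses to a single-pole roll-off. All gain and pole values are illustrative assumptions rather than design values from this work, and the function names are hypothetical.

```python
import math


def feedforward_zero(av1: float, av2: float, av2a: float, wp1: float) -> float:
    """Zero of (2.19) in rad/s: (A_V1*A_V2/A_V2a + 1) * w_p1."""
    return (av1 * av2 / av2a + 1.0) * wp1


def ota_open_loop_gain(s: complex, av1: float, av2: float, av2a: float,
                       wp1: float, wp2: float) -> complex:
    """Evaluate the open-loop gain (2.19) at complex frequency s."""
    a0 = av1 * av2 + av2a
    wz = feedforward_zero(av1, av2, av2a, wp1)
    return -a0 * (1.0 + s / wz) / ((1.0 + s / wp1) * (1.0 + s / wp2))


# Illustrative values only (assumptions, not taken from the design).
AV1, AV2, AV2A = 10.0, 8.0, 4.0
WP1 = 2.0 * math.pi * 10e6
WP2 = feedforward_zero(AV1, AV2, AV2A, WP1)  # non-dominant pole placed on the zero

s = 2j * math.pi * 200e6
h = ota_open_loop_gain(s, AV1, AV2, AV2A, WP1, WP2)
single_pole = -(AV1 * AV2 + AV2A) / (1.0 + s / WP1)

print(f"zero / w_p1 ratio: {WP2 / WP1:.1f}")
print(f"|H| at 200 MHz: {abs(h):.3f}, single-pole model: {abs(single_pole):.3f}")
```

When the zero coincides with $\omega_{p2}$, the two-stage response matches the single-pole model at every frequency, which is the mechanism behind the improved phase margin.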

In this case, the size of the transistors in  $g_{m1}$  was enlarged and optimized to reduce both thermal and flicker noise. The transconductance  $g_{m2}$  must be able to source and sink the current flowing through the feedback resistor  $R_F$  and the amplifier's load; therefore, more power was budgeted for the second stage. The transconductor  $g_{m2a}$  was used to compensate for the non-dominant pole. Design



Figure 2.15: A two-stage high speed OTA with feed-forward compensation. The connection including the feedback elements  $R_F$  and  $C_F$  as well as elements connected at the OTA input are shown in the bottom-right corner.

details of the OTA in the TIA are shown in Fig. 2.15. The NMOS loads  $M_{N1A}$  and  $M_{N1B}$  with resistor feedback maintain the functionality of the  $g_{m1}$  stage through local common-mode feedback. A fast common-mode feedback loop was implemented on  $M_{P2A}$  and  $M_{P2B}$  to achieve wideband operation. The amplifier's DC gain was around 39.4 dB, and its unity gain frequency was around 1.1 GHz, while consuming 18.0 mW. The unity gain frequency was chosen to be five times the desired baseband bandwidth of 200 MHz, which ensured a low and flat baseband input impedance. The high gain, high bandwidth and low noise requirements for the OTA were achieved at the cost of increased power consumption. The input referred noise is 4.15, 1.47 and 0.83  $nV/\sqrt{Hz}$  at 100 kHz, 1 MHz and 10 MHz, respectively. Noise performance was dominated by the flicker noise component up to a 3 MHz bandwidth. Flicker noise was dominated by the transistors  $M_{1A}$ ,  $M_{1B}$ ,  $M_{N1A}$ , and  $M_{N1B}$ . The design of the OTA was closely tied to the elements used in the filter, especially  $C_3$  and the input impedance of the minimally-invasive baseband filter. The feedback capacitor  $C_F$  served two purposes: i) it provided first-order filtering to suppress the high-frequency components, and ii) it generated a zero in the loop gain that stabilized the closed loop.

#### 2.2.5 Second-Order Minimally-Invasive Filter



Figure 2.16: Schematic of receiver's baseband showing second-order minimally invasive filter.

The second-order minimally-invasive baseband filter provides a low-impedance path to drain the wideband out-of-band blocker current from the mixer. Its impact on in-band noise and nonlinearity is minimal because its impedance is much higher than the TIA input impedance for in-band signals below 200 MHz. The second-order filtering function is provided by the shunt impedance  $Z_{in}$  shown in Fig. 2.16. Unlike OTA1, OTA2 requires a high  $g_m$  to process signals from 200 MHz to 1 GHz. A pseudo-differential complementary OTA is used to achieve a larger  $g_m$  through current reuse. If an ideal OTA2 is assumed, from Fig. 2.16, its input impedance can be derived as,

$$Z_{in}(s) = \frac{1}{\left(s(C_1 + C_2)\right)\left(1 + s\frac{R_1C_1C_2}{C_1 + C_2}\right)} \tag{2.20}$$

Although this equation suggests that  $C_1$  can be made large, doing so will increase the amplifier's output voltage beyond its linear range since the amplifier functions as a differentiator whose gain is  $sR_1C_1$ . A higher  $C_1$  will lead to a higher gain at the OTA output. In (2.20),  $C_1$  and  $C_2$  have a similar effect. However, since  $C_2$  is placed between the input and output of the differentiator, it does not impose amplification limitations as strong as those of  $C_1$ . Therefore, making  $C_1 < C_2$  was found to be good design practice. When the signal frequency is lower than  $1/(R_1(C_1C_2/(C_1 + C_2)))$ ,  $Z_{in}$  approaches  $1/s(C_1 + C_2)$ , and the impedance behaves as a single capacitor. When the signal frequency is higher than  $1/(R_1(C_1C_2/(C_1 + C_2)))$ ,  $Z_{in}$  behaves as a super capacitor  $1/s^2C_2C_1R_1$ , which rolls off at -40 dB per decade over frequency.  $1/(R_1(C_1C_2/(C_1 + C_2)))$  was set to 100 MHz for this design. However, this equation implies stringent OTA requirements over the desired frequency range. To reduce power consumption, a pseudo-differential complementary OTA was



Figure 2.17: Pseudo differential OTA employing complementary transconductors.



Figure 2.18: Single-ended small signal model of proposed baseband filter.

used. The schematic is depicted in Fig. 2.17. The capacitors isolate the operating point of the TIA, and the resistive feedback self-biases the transistors  $M_{3A}$ ,  $M_{3B}$ ,  $M_{4A}$  and  $M_{4B}$ , maintaining them in the saturation region. The small signal single-ended transconductance of the pseudo-differential amplifier is  $g_m = g_{m3A} + g_{m4A}$ . The input impedance of the filter from the small signal model in Fig. 2.18 can be derived as,

$$Z_{in}(s) = \frac{1 + s \frac{(C_o + C_1 + C_2 + g_o R_1 C_1)}{g_m + g_o} + s^2 \frac{C_1(C_o + C_2)R_1}{g_m + g_o}}{(s(C_1 + C_2)) \left(1 + s \frac{R_1 C_1 C_2}{C_1 + C_2}\right) \left(1 + s \frac{C_o}{g_m + g_o}\right)} \tag{2.21}$$

Since  $C_o \ll C_1, C_2$ , the third pole can be ignored, and the filter's input impedance is approximated as,

$$Z_{in}(s) \approx \frac{1 + s \frac{(C_1 + C_2 + g_o R_1 C_1)}{g_m} + s^2 \frac{C_1 C_2 R_1}{g_m}}{(s(C_1 + C_2)) \left(1 + s \frac{R_1 C_1 C_2}{C_1 + C_2}\right)} \tag{2.22}$$

At low frequencies, the impedance of the filter is capacitive and is dominated by  $C_1 + C_2$ . The filter was designed such that the second pole around the passband edge produces an impedance roll-off of -40 dB per decade. This impedance dominates the overall impedance at the mixer output in the stop band and absorbs most of the near-band (>200 MHz) blockers, thereby improving the TIA's blocker tolerance. The OTA's transconductance plays a relevant role in the location of the zeros; thus, there is an evident trade-off between power and bandwidth. At much higher frequencies, the input impedance is close to  $1/g_m$ , which is shown in Fig. 2.19. For this design, the zeros were placed around 500 MHz. To ensure the functionality of the second-order filter in this prototype, resistor or capacitor arrays for wide range tunability were not implemented. For baseband signals



Figure 2.19: Simulated magnitudes of impedances of TIA, filter,  $C_3$  and nodal total (LNTA output impedance > 100  $\Omega$, passive mixer on-resistance < 10  $\Omega$, for 3 to 6 GHz).

below 200 MHz, as shown in Fig. 2.19, most of the mixer's output current flows into the TIA since it dominates the nodal impedance. At intermediate frequencies, the low impedance of the filter diverts most of the current until the low GHz frequency range is reached. For frequencies above 2 GHz, the capacitor  $C_3$  absorbs most of the current generated at the mixer's output. The total nodal impedance is maintained under 10  $\Omega$  up to 10 GHz thanks to the combination of the TIA, the filter and  $C_3$ . For 3 to 6 GHz, the LNTA output impedance is higher than 100  $\Omega$ , and the on-resistance of the passive mixer is less than 10  $\Omega$ . Thus, most of the out-of-band components above 400 MHz are filtered, and the TIA only processes the baseband signals. The TIA, the baseband filter and  $C_3$  maintain the nodal impedance under 10  $\Omega$  to guarantee the optimal operation of the mixer. The TIA was optimized for low noise and high in-band linearity, while the filter was optimized for operation in the range of 200 MHz to 1 GHz. The minimally-invasive filter shows a fast roll-off in the stop band, and on average it reduced the blockers by around 15 dB in the range of 200 MHz to 1 GHz. More rejection is possible if more power is allocated to the OTA in the minimally-invasive filter.
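The three impedance regions of the filter (capacitive at low frequency, a -40 dB per decade roll-off beyond the corner, and a floor of $1/g_m$ at very high frequency) can be reproduced from (2.22). The sketch below uses hypothetical component values chosen only so that $C_1 < C_2$ and the corner sits at 100 MHz, with the corner interpreted as $1/(2\pi R_1 C_1 C_2/(C_1 + C_2))$; none of these values are taken from the design.

```python
import math


def filter_zin(freq_hz: float, c1: float, c2: float, r1: float,
               gm: float, go: float = 0.0) -> complex:
    """Filter input impedance per the approximation (2.22)."""
    s = 2j * math.pi * freq_hz
    num = 1.0 + s * (c1 + c2 + go * r1 * c1) / gm + s ** 2 * c1 * c2 * r1 / gm
    den = s * (c1 + c2) * (1.0 + s * r1 * c1 * c2 / (c1 + c2))
    return num / den


# Hypothetical values for illustration only.
C1, C2, GM = 2e-12, 4e-12, 20e-3
F_CORNER = 100e6
R1 = 1.0 / (2.0 * math.pi * F_CORNER * C1 * C2 / (C1 + C2))

for f in (1e6, 10e6, 100e6, 1e9, 10e9, 100e9):
    print(f"{f / 1e6:9.0f} MHz: |Z_in| = {abs(filter_zin(f, C1, C2, R1, GM)):10.2f} ohm")
```

The printed magnitudes fall with frequency and then flatten near $1/g_m$ (50 ohm for the assumed $g_m$), which illustrates why a larger OTA transconductance, and hence more power, is needed to extend the rejection to higher frequencies.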

#### 2.3 Experimental Results



Figure 2.20: Micrograph of the chip.



Figure 2.21: Measurement setup of the chip.

The receiver was fabricated in a mainstream 40 nm technology. The chip microphotograph is shown in Fig. 2.20. The four main building blocks, namely the LNTA, the mixer, the TIA and the minimally-invasive baseband filter, are clearly identified in Fig. 2.20. A differential and low jitter clock buffer was integrated on-chip to drive the mixer with low jitter square wave clocks. A pair of linear output buffers was included to drive the bond-wire inductance and the input impedance of the test equipment. The chip area is  $0.75 \times 1.64\ \mathrm{mm}^2$ . The measurement setup of the chip is shown in Fig. 2.21. Two high frequency hybrids were used to convert the single-ended RF input and clock into differential signals. For the clock buffer used to drive the passive mixer, a current-mode logic (CML) stage was used to buffer and amplify the incoming differential clock signals. A chain of properly sized inverters converted the external sinusoidal input signal into a square wave. An Agilent E8267D PSG vector signal generator was used. A baseband off-chip balun was placed between the chip's output and the measurement instruments. An external clock source was used to feed clock signals into the clock buffer.



Figure 2.22: Simulated and measured (de-embedded)  $S_{11}$  of the receiver.

Fig. 2.22 shows the simulated and measured  $S_{11}$  parameter over the 1 to 8 GHz range. The effects of the coplanar waveguide PCB traces, SMA connectors, wire bonds and pads were de-embedded. The inductor models from the SONNET simulation software overestimated the inductance values and underestimated the quality factor and self-resonant frequency. The global minimum of the measured  $S_{11}$  is located at 4.2 GHz, while the simulated minimum is at 3.7 GHz. PVT variations and model inaccuracies also contribute to this difference. In simulations,  $L_S$  was chosen to have a differential inductance of 4.02 nH with a Q over 12 within the 3 to 6 GHz range. Its self-resonant frequency was around 22 GHz.  $L_D$  was chosen to have a differential inductance of 6.14 nH with a Q over 10 within 3 to 6 GHz, and its self-resonant frequency was around 18.4 GHz.



Figure 2.23: Simulated and measured NF of the receiver.

Fig. 2.23 shows the double-sideband NF with the LO frequency tuned at 3 GHz. The flicker-like noise dominates at frequencies under 2 MHz, while the NF is flat and under 5 dB at higher frequencies. Since the LNTA and current mode passive mixer are highly linear, distortions at the output of the passive mixer are small. Additionally, the nodal impedance shown in Fig. 2.19 ensures that the voltage swings at the mixer's input and output are very small. Meanwhile, the passive mixer was implemented in a fully differential mode. Furthermore, the phase noise contribution of the clock source and clock buffer was negligible in the absence of strong in-band blockers below 400 MHz.



Figure 2.24: Simulated and measured conversion gain: the -3dB frequency is found above 200 MHz.

The intended receiver baseband bandwidth is 200 MHz; hence, the system conversion gain is characterized in the frequency range of 20 kHz to 450 MHz. Fig. 2.24 shows an average conversion gain of 14.5 dB and 12.5 dB with an LO of 3 GHz and 6 GHz, respectively; the in-band conversion gain variations are less than 2.0 dB up to 200 MHz, and the conversion gain drops by 3.3 dB at 300 MHz. The in-band conversion gain was extensively characterized with the LO frequency covering the 3 to 6 GHz range. A flat in-band conversion gain with less than 2.0 dB variation was found throughout the desired range.

Fig. 2.25 shows the simulated and measured large-signal intermodulation distortion performance at the TIA output with an LO of 3 GHz. The two baseband tones were at 40 and 60 MHz. The measured in-band IIP3 of 15.1 dBm matches well with the simulated IIP3 of 16.2 dBm. The IM3 is -50 dBc for an input signal power as high as -10.0 dBm; these results demonstrate the superior linearity performance of the proposed receiver architecture. In Fig. 2.25, the receiver exhibits a flat in-band



Figure 2.25: Simulated and measured in-band linearity around LO of 3 GHz.

IIP3 across the 200 MHz bandwidth of the baseband TIA, and the IIP3 variations are found under 1.3 dB within this band.
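As a rough consistency check on the in-band numbers quoted above, the textbook two-tone extrapolation IIP3 ≈ P_in + |IM3|/2 can be applied to the single point of -50 dBc at -10.0 dBm. This sketch assumes the IM3 products follow the ideal 3:1 slope at that power level; the function name is illustrative and is not part of the original measurement procedure.

```python
def iip3_from_im3(p_in_dbm: float, im3_dbc: float) -> float:
    """Extrapolate IIP3 (dBm) from one two-tone point.

    Assumes an ideal 3:1 IM3 slope, so IIP3 = P_in + |IM3| / 2.
    """
    return p_in_dbm + abs(im3_dbc) / 2.0


# Point quoted in the text: IM3 of -50 dBc at -10.0 dBm input power.
estimate_dbm = iip3_from_im3(-10.0, -50.0)
print(f"Extrapolated in-band IIP3: {estimate_dbm:.1f} dBm")
```

The result of 15.0 dBm is close to the measured in-band IIP3 of 15.1 dBm reported above.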



Figure 2.26: Measured out-of-band linearity around LO of 3GHz.

Fig. 2.26 shows the out-of-band linearity characterization, with one in-band signal and one OOB (blocker) component. The LO frequency was set at 3.0 GHz. The in-band signal was tuned to 3.16 GHz (160 MHz from the LO frequency, and thus still within the 200 MHz receiver band after down-conversion), and the out-of-band signal was set at 3.40 GHz. The baseband third-order components are present at 80 MHz and 640 MHz. The in-band component at 80 MHz was used for IIP3 characterization. In Fig. 2.26, the IM3 is around -48 dBc for an input power of -5 dBm; the extrapolation of these results shows an IIP3 of around 21.0 dBm. The 5.9 dB improvement over the in-band IIP3 demonstrates the effectiveness of the minimally-invasive baseband filter, which absorbs the out-of-band power after the mixer.
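The location of the two third-order products in this test can be reproduced with a few lines of arithmetic. The sketch below computes the products 2f1 - f2 and 2f2 - f1 and refers them to baseband by subtracting the LO frequency; the helper name is hypothetical and frequencies are handled in MHz for convenience.

```python
def im3_baseband_mhz(f_lo_mhz, f1_mhz, f2_mhz):
    """Return the two third-order products after down-conversion.

    Both products are given as positive baseband offsets in MHz.
    """
    low = abs(2 * f1_mhz - f2_mhz - f_lo_mhz)
    high = abs(2 * f2_mhz - f1_mhz - f_lo_mhz)
    return low, high


# LO at 3.0 GHz, in-band tone at 3.16 GHz, out-of-band tone at 3.40 GHz.
products = im3_baseband_mhz(3000, 3160, 3400)
print(products)
```

With the tone frequencies stated in the text, this gives 80 MHz and 640 MHz, matching the third-order components quoted above.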

For in-band signals, the filter does not play a relevant role, and the power of both the signals and the blockers is processed by the TIA. Thus, in-band system linearity is a function of the performance of the LNTA, mixer and TIA. In comparison with [29], [30], the TIA contributes less to system nonlinearities because of the lower conversion gain. Meanwhile, for out-of-band components, the tone at 400 MHz and the high-frequency third-order component at 640 MHz are attenuated by the filter because its input impedance is smaller than that of the TIA (see Fig. 2.19), so more current flows into the filter. As a result, only 50% of the mixer output power flows through the TIA, and its nonlinear contribution is reduced. These results show that the nonlinear contribution of the minimally-invasive filter does not limit system performance. For an input power level below -5 dBm, the difference between the two third-order curves in Fig. 2.25 and Fig. 2.26 is around 8 dB. For an input power level above -5 dBm, the case of two in-band signals suffers from the folding of higher-order components, and the LNTA, mixer and TIA are the dominant sources of distortion. In the second case, the filter absorbs the out-of-band power, which relaxes the operation of the TIA. This result suggests that the linearity of the LNTA and mixer is superior to that of the TIA. In both cases, the receiver's IM3 is less than -38 dBc for an input signal power as large as -5 dBm.

In Fig. 2.27, the conversion gain compression with respect to in-band and out-of-band blockers is measured; $\Delta f$ indicates the RF blocker frequency offset from the RF signal. The frequencies



Figure 2.27: Conversion gain compression over in-band and out-of-band blockers.

of the LO and RF signals were set at 3.0 GHz and 3.01 GHz, respectively. The RF signal power was kept constant at -30 dBm. A two-dimensional sweep of the RF blocker frequency and power level was then performed: the blocker frequency was swept from 3.02 GHz to 3.31 GHz at different power levels, and the system's conversion gain compression was observed at each point. The benefit of the baseband filtering is evident from these results: when the blocker is at a $\Delta f$ of 300 MHz, the conversion gain compresses by only 0.7 dB for a blocker power as large as 5.0 dBm. This limitation is mainly due to the limited supply voltage in the presence of a very large input power. A 0 dBm in-band blocker across a 50 $\Omega$ resistor has a peak-to-peak voltage swing of 632 mV, which is equivalent to 57.5% and 35% of the 1.1 V and 1.8 V power supplies, respectively.
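The swing figures quoted above follow from the definition of dBm for a sinusoid across a resistive load. The short sketch below reproduces them; the function name is illustrative, and a pure sine wave across an ideal 50 Ω resistor is assumed.

```python
import math


def dbm_to_vpp(p_dbm: float, r_ohm: float = 50.0) -> float:
    """Peak-to-peak voltage of a sine wave delivering p_dbm into r_ohm."""
    p_watt = 10 ** (p_dbm / 10.0) * 1e-3   # dBm -> W
    v_rms = math.sqrt(p_watt * r_ohm)      # P = Vrms^2 / R
    return 2.0 * math.sqrt(2.0) * v_rms    # sine: Vpp = 2 * sqrt(2) * Vrms


vpp = dbm_to_vpp(0.0)
print(f"0 dBm into 50 ohm: {vpp * 1e3:.0f} mV peak-to-peak")
for vdd in (1.1, 1.8):
    print(f"  {100 * vpp / vdd:.1f}% of a {vdd} V supply")
```

This yields roughly 632 mV, or about 57.5% and 35% of the two supplies, in line with the values stated in the text.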

Fig. 2.28 shows the NF at 3 MHz with respect to the blocker power level. $f_{LO}$ was set at 3.0 GHz, and $\Delta f$ indicates the frequency difference between the blocker and the LO. The Agilent E8267D PSG signal generator and the clock buffer had in-band phase noise of -160 and -163.3 dBc/Hz



Figure 2.28: NF over blocker power at different frequency offset  $\Delta f$  without clock source phase noise filtering.

respectively. Multiple filters, including the ZAFBP-2793+ and VBFZ-3590+ bandpass filters, were used to reduce the clock source's phase noise by 7 dB at offsets more than 10 MHz away from the output frequency. Measurements were made with and without filters. OOB ($\Delta f$ > 400 MHz) 0 dBm blockers can be tolerated by the front-end with less than a 1.0 dB increase in NF. When no filters were used, an in-band 0 dBm blocker with $\Delta f$ at 10 MHz increased the system NF up to 15 dB, because the phase noise from the clock source and clock buffer is down-converted and the minimally-invasive filter has no effect in band; this reduces the dynamic range of the whole system. After filtering the phase noise from the clock source, a 0 dBm blocker at $\Delta f$ of 10 MHz increases the NF to 11 dB at 3 MHz.

Fig. 2.29 displays the measured system $P_{1dB}$ and IIP3 over different offset frequencies $\Delta f$. For the $P_{1dB}$ plot, $f_{LO}$ and $f_{RF}$ were fixed at 3.0 and 3.01 GHz, so $\Delta f$ here indicates the RF blocker frequency offset from the RF signal. The results come from the two-dimensional sweep of the blocker frequency and power. Here, $P_{1dB}$ was measured when the in-band signal conversion



Figure 2.29: Measured system  $P_{1dB}$  versus blocker frequency offset  $\Delta f$ .

gain was reduced by 1 dB due to in-band or out-of-band blockers. For the IIP3 plot, $f_{LO}$ was still 3.0 GHz. The two tones were located at 3.01 and 3.01 + $\Delta f$ GHz with the same power. A 10 MHz baseband component was used for all $\Delta f$ values, and IIP3 was measured across different $\Delta f$ values. In-band P<sub>1dB</sub> and IIP3 were almost constant, with variations under 1.2 dB. Once $\Delta f$ exceeds 200 MHz, P<sub>1dB</sub> and IIP3 benefit from the minimally-invasive filter and then from the capacitor $C_3$. As discussed above, the in-band linearity mainly depends on the LNTA, mixer and TIA chain, while the out-of-band linearity observed at the TIA output depends on the portion of the total current absorbed by the baseband filter and $C_3$. As shown in Fig. 2.25, for an input power higher than -5 dBm, IIP3 becomes a less valid indicator of system linearity because higher-order components begin to fold in. On the other hand, the in-band P<sub>1dB</sub> indicates how the system handles large blockers, such as 0 dBm.

Fig.2.30 compares the IIP3 simulation and measurement results. The  $f_{LO}$  was set at 3.0 GHz. The two-tone signals were  $f_{LO} + \Delta f$  and  $f_{LO} + 2\Delta f - f_{IM}$  so that the lower frequency IM3



Figure 2.30: Measured IIP3 versus frequency offset  $\Delta f$ .

component fell at $f_{LO} + f_{IM}$ and the down-converted signal fell at $f_{IM} = 10$ MHz. Both in-band and out-of-band IIP3 were always measured at 10 MHz. For $\Delta f$ under 100 MHz, the in-band IIP3 is dictated by the LNTA, mixer, and TIA. When $\Delta f$ is above 100 MHz, one of the tones falls out-of-band, and more mixer-output current is absorbed by the minimally-invasive filter. When $\Delta f$ goes higher than 200 MHz, both tones are out-of-band. The TIA then plays a minor role in the system's linearity, and the out-of-band IIP3 is determined by the LNTA, mixer, minimally-invasive filter and linear capacitor $C_3$ (as shown in Fig. 2.30). As in [24], the LNTA exhibits a higher IIP3 as the frequency spacing between the two tones increases, and it sets the maximum achievable system linearity. The measured overall linearity results match the simulation results, especially when $\Delta f$ is above 100 MHz.
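The tone placement described above keeps the observed IM3 product at a fixed baseband frequency regardless of $\Delta f$. The sketch below verifies this for a few offsets; the function name and the chosen offsets are illustrative assumptions, with all frequencies in MHz.

```python
def two_tone_plan(f_lo_mhz: float, delta_f_mhz: float, f_im_mhz: float = 10.0):
    """Return (f1, f2, baseband IM3) for the tone placement in the text.

    f1 = f_LO + delta_f, f2 = f_LO + 2*delta_f - f_IM, and the lower
    third-order product 2*f1 - f2 is referred to baseband.
    """
    f1 = f_lo_mhz + delta_f_mhz
    f2 = f_lo_mhz + 2 * delta_f_mhz - f_im_mhz
    im3_low = 2 * f1 - f2
    return f1, f2, im3_low - f_lo_mhz


# Illustrative offsets: in-band, one tone out-of-band, both tones out-of-band.
for delta_f in (50.0, 150.0, 300.0):
    f1, f2, baseband = two_tone_plan(3000.0, delta_f)
    print(f"delta_f={delta_f:5.0f} MHz  f1={f1:.0f}  f2={f2:.0f}  IM3 at {baseband:.0f} MHz")
```

In every case the lower IM3 product lands at $f_{IM} = 10$ MHz, which is why in-band and out-of-band IIP3 can be compared at the same baseband frequency.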

As a proof of concept, a complex I/Q system was built and simulated in Cadence. The receiver's error vector magnitude (EVM) performance for 0 dBm input power is shown in Fig. 2.31; the



Figure 2.31: Simulated EVM performance for 0 dBm input power (QAM-64).

QAM-64 EVM is in the range of 2.1%. I/Q mismatches were not included in simulation.

Fig. 2.32 shows the system performance across the receiver's entire bandwidth. The conversion gain decreases from 14.5 dB to 12.5 dB over the input frequency range of 3 to 6 GHz. The NF at a 3 MHz baseband frequency ranges from 5.0 to 5.8 dB. Over the entire band, the in-band IIP3 and $P_{1dB}$ are higher than 15.1 dBm and 3.0 dBm, respectively. The system's in-band linearity remains flat up to 200 MHz. The measured power consumption increases from 64.1 mW to 69.6 mW as the LO frequency goes from 3 GHz to 6 GHz. The increase comes from the clock buffer, whose power consumption rises with the clock frequency.



Figure 2.32: Measured receiver's performance: gain, NF, IIP3 and  $P_{1dB}$ .

| Ref.                        | [5]                                              | [6]                                  | [13]                                | [18]                  | This Work                                             |
|-----------------------------|--------------------------------------------------|--------------------------------------|-------------------------------------|-----------------------|-------------------------------------------------------|
| Tech.                       | 130 nm                                           | 65 nm                                | 45 nm SOI                           | 90 nm                 | 40 nm                                                 |
| LNA/LNTA<br>structure       | Cross-coupled<br>CG with<br>multiple<br>feedback | CG-CS with<br>current<br>reuse on CS | Mixer-first                         | Resistive<br>feedback | Cross-coupled<br>CG with<br>resistive<br>degeneration |
| RF Input Range<br>[GHz]     | 1-5.2                                            | 0.08-5.5                             | 0.2-8                               | 2.0-5.8               | 3.0-6.0                                               |
| Baseband BW<br>[MHz]        | 10                                               | 12                                   | 10                                  | 20                    | 200                                                   |
| NFdsb<br>[dB]               | 6.5–8.3<br>at 1 MHz                              | > 3.5<br>at 1 MHz                    | 2.3-5.4<br>(0.5-6 GHz<br>$f_{LO}$ ) | 12.2–13.8<br>at 5 MHz | 5.0–5.8<br>at 3 MHz                                   |
| IIP3 [dBm]<br>(in-band)     | ≥ -1.5                                           | 3.5                                  | 0-5                                 | ≥-1.5                 | 15.1-16.7                                             |
| IIP3 [dBm]<br>(out-of-band) | NA                                               | NA                                   | 39<br>$\Delta f/BW = 8$             | NA                    | 33<br>$\Delta f/BW = 2.5$                             |
| P<sub>1dB</sub><br>[dBm]   | -10                                              | -22                                  | -9<br>$\Delta f/BW = 1$             | -24 to -21            | 3.0-3.9                                               |
| B<sub>1dB</sub><br>[dBm]   | NA                                               | NA                                   | 12<br>$\Delta f/BW = 8$             | NA                    | 7.0<br>$\Delta f/BW = 2.5$                            |
| Conv. Gain<br>[dB]          | 22.4-24.3                                        | 34                                   | 21                                  | 18-23                 | 12.5-14.5                                             |
| Power<br>[mW]               | 48                                               | 60                                   | 50+30/GHz                           | 85                    | 64.1-69.6                                             |
| V <sub>DD</sub>             | 1.5(A)/1.2(D)                                    | 1.2                                  | 1.2                                 | 2.7                   | 1.8(A)/1.1(D)                                         |

Table 2.2: Comparison with LNA-first receivers.

### 2.4 Conclusions and Summary

The performance of the proposed receiver is compared with state-of-the-art receivers [5], [6], [13], [18] in Table 2.2. Benefiting from the advanced technology and the proposed structures, the receiver achieves a baseband BW larger than 200 MHz, an in-band $P_{1dB}$ larger than 3.0 dBm, and a total power consumption of less than 70 mW over the 3 GHz to 6 GHz frequency range. The LNTA, TIA and baseband filter consume 32, 18 and 8.4 mW, respectively. The power consumption of the clock buffer increases from 5.7 to 11.2 mW over this range.
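As a quick arithmetic check, the block-level power figures above can be summed and compared with the total power range reported for the receiver. The sketch assumes that the lower clock buffer figure corresponds to the 3 GHz LO and the higher one to the 6 GHz LO, as the measurement discussion suggests; the dictionary keys are only labels.

```python
# Block-level power figures quoted in the text, in mW.
core_mw = {"LNTA": 32.0, "TIA": 18.0, "baseband filter": 8.4}
# Assumed mapping of clock buffer power to the two ends of the LO range.
clock_buffer_mw = {"3 GHz LO": 5.7, "6 GHz LO": 11.2}

core_total = sum(core_mw.values())
totals = {lo: core_total + buf for lo, buf in clock_buffer_mw.items()}
for lo, total in totals.items():
    print(f"{lo}: {total:.1f} mW")
```

The sums come to 64.1 mW and 69.6 mW, consistent with the measured power range listed in Table 2.2.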

The proposed receiver can handle large blockers due to its very high in-band IIP3 and $P_{1dB}$ values, as shown in Table 2.2. In-band $P_{1dB}$ is a good indicator of a system's large-signal linearity, and this system outperforms previously reported receivers. For all in-band and out-of-band cases, the IM3 is under -38 dBc for input signal power as large as -5 dBm. For power above -5 dBm, $P_{1dB}$ is an effective parameter for receiver characterization. The measured in-band $P_{1dB}$ is above 3 dBm. The proposed system shows a remarkable tolerance to both in-band and out-of-band blockers. When the blocker is over 400 MHz away from the input component, reciprocal mixing with the LO phase noise has a much smaller impact on the system NF, as shown in Fig. 2.28, which is a result of the system's high linearity. Since the proposed receiver targets an input power of -5 dBm over a 200 MHz bandwidth, large conversion gain values cannot be used, in order to avoid system saturation. However, if the input power were less than -20 dBm, the conversion gain could be increased with a larger TIA feedback resistor $R_F$. To further reduce the impact of reciprocal mixing, the prior techniques developed in [31][32][33][34] should be studied for their applicability to wideband applications.

The proposed architecture is power efficient because it offers a baseband bandwidth over ten times higher than the receivers listed in Table 2.2, as well as outstanding linearity. Compared to the recently published mixer-first receivers in Table 2.3, this design still shows a better trade-off between in-band IIP3, conversion gain and power consumption, and its in-band $P_{1dB}$ is much better than that of the other designs. Future advances in this field are needed to reduce the impact of flicker noise from the TIA and improve the receiver's NF at low frequencies. Higher-order filtering can help to improve the rejection of out-of-band blockers. For future work, a complete direct-conversion receiver will be built to fully test the proposed receiver's system performance. To achieve NF and linearity similar to this design, the transconductance of the LNTA should be doubled, thus doubling the whole system's power consumption. Implementing I/Q paths will require on-chip or off-chip calibration of frequency-dependent I/Q imbalances, such as gain mismatch and group delay offsets.

| Ref.                        | [35]                   | [36]                              | [37]                                     | [38]                                     | This Work                                             |
|-----------------------------|------------------------|-----------------------------------|------------------------------------------|------------------------------------------|-------------------------------------------------------|
| Tech.                       | 28 nm                  | 40 nm                             | 22 nm FDX                                | 180 nm                                   | 40 nm                                                 |
| LNA/LNTA<br>structure       | Mixer-first            | LNTA with<br>parallel<br>feedback | Mixer-first<br>and LNTA<br>noise cancel. | Mixer-first<br>and noise<br>cancellation | Cross-coupled<br>CG with<br>resistive<br>degeneration |
| RF Input<br>Range<br>[GHz]  | 1.0-2.0                | 0.4-3.2                           | 1.0-6.0                                  | 0.2-1.2                                  | 3.0-6.0                                               |
| Baseband<br>BW [MHz]        | 130                    | 80                                | 175                                      | 22/18                                    | 200                                                   |
| NFdsb<br>[dB]               | 5.5<br>at 2 GHz LO     | 2.7-3.6                           | 2.5-5                                    | 3.4-4.0                                  | 5.0–5.8<br>at 3 MHz                                   |
| IIP3 [dBm]<br>(in-band)     | -12                    | -20                               | 9                                        | 14.5                                     | 15.1-16.7                                             |
| IIP3 [dBm]<br>(out-of-band) | 21<br>$\Delta f/BW = 3$ | 13<br>$\Delta f/BW = 5$          | 18<br>$\Delta f/BW = 4$                  | 25<br>$\Delta f/BW = 2.2$                | 33<br>$\Delta f/BW = 2.5$                             |
| P<sub>1dB</sub><br>[dBm]   | NA                     | -32                               | -10                                      | -10/-5                                   | 3.0-3.9                                               |
| B<sub>1dB</sub><br>[dBm]   | 3                      | -5<br>$\Delta f/BW = 5$           | 2<br>$\Delta f/BW = 4$                   | 2.4<br>$\Delta f/BW = 2.5$               | 7.0<br>$\Delta f/BW = 2.5$                            |
| Conv. Gain<br>[dB]          | 32.4                   | 36                                | 23                                       | 22.2/31.4                                | 12.5-14.5                                             |
| Power<br>[mW]               | 21.6<br>+7.8/GHz       | 58.5<br>+17.6/GHz                 | 172                                      | 65-155                                   | 64.1-69.6                                             |
| V <sub>DD</sub>             | 1.8/1.2                | 1.3/1.2                           | 1.2                                      | 1.8                                      | 1.8(A)/1.1(D)                                         |

Table 2.3: Comparison with mixer-first receivers.

# 3. A 2.3-3.9 GHZ FRACTIONAL-N PLL WITH CHARGE PUMP AND TDC CALIBRATIONS FOR REFERENCE AND FRACTIONAL SPUR REDUCTION\*

### 3.1 Introduction

Emerging 5G wireless systems demand a baseband bandwidth over 200 MHz, which requires high-performance fractional-N PLLs with low phase noise and low spur levels, since the receiver's signal-to-noise ratio (SNR) is directly affected by these nonidealities. As shown in Fig. 3.1, while



Figure 3.1: Effects of spurs on a wideband system's SNR.

the PLL's phase noise and in-band spurs reduce the SNR by mixing with the in-band signal, out-of-band spurs will also down-convert large out-of-band blockers into the baseband, which degrades the SNR at baseband [39]. Meanwhile, on the transmitter end, out-of-band spurs will also impact

<sup>\*</sup>Part of the data reported in this chapter is reprinted with permission from [39], "A 2.3-3.9 GHz Fractional-N Frequency Synthesizer with Charge Pump and TDC Calibration for Reduced Reference and Fractional Spurs" by Junning Jiang et al., 2021, IEEE Radio Frequency Integrated Circuits Symposium (RFIC), pp. 71-74, Copyright ©2021, IEEE

the SNR. However, this problem is less demanding than on the receiver end, since the out-of-band leakage on the transmitter end is controllable and predictable. Still, the out-of-band leakage needs to be regulated to meet the transmission mask requirements. To alleviate the aforementioned problems, spur reduction techniques need to be implemented in the design of a high-performance, low-noise PLL. The tradeoffs between the PLL's phase noise performance and its reference and fractional spur levels are not apparent. Thus, spur reduction should be achieved without introducing additional phase noise.

There are numerous PLL publications on low-spur techniques. [40] reports dithering and noise cancellation in the digital delta-sigma modulator (DDSM) inside the fractional-N PLL to minimize the power of the spurs; however, it requires training sequences. The analog CP-PLL in [41] contains a calibration algorithm in the digital divider to remove the fractional spurs, which shows several advantages over dithering. The all-digital PLLs in [42, 43] require time-to-digital converter (TDC) calibration, internal frequency multiplication and phase dithering to mitigate the spurs. These methods need complex digital hardware. [44] introduces the use of a noise-shaping TDC to reduce phase noise. The proposed architecture takes advantage of both the infinite voltage resolution of the CP-PLL, which helps to reduce reference spurs, and the digital processing capability of the digital PLL, which nullifies the power of fractional spurs of certain patterns. As indicated in Fig. 3.2, the proposed PLL is based on a CP-PLL and incorporates a digital phase processor (DPP). The DPP includes two 10-bit sub-ranging TDCs, two digital filters, a 10-bit sub-ranging digital-to-time converter (DTC) and an output multiplexer. Besides the DPP, the proposed PLL includes a transmission-gate phase frequency detector (TG-PFD), a charge pump with a 4-bit current DAC, an analog loop filter, an LC voltage-controlled oscillator and a programmable fractional divider with a third-order self-dithered DDSM realized in a multi-stage noise shaping (MASH) 1-1-1 structure.

Since reference and fractional spur reduction is the main aim of this PLL design, the DPP, charge pump and programmable divider are the essential blocks. This chapter is organized as follows. Section 3.2 describes general properties of a PLL. Section 3.3 provides detailed descriptions of the circuits and the proposed reference and fractional spur calibration techniques at the system level.



Figure 3.2: Proposed DPP assisted PLL architecture.

Section 3.4 gives transistor-level descriptions of the blocks, with a focus on the core blocks. Section 3.5 provides the silicon measurement results and the conclusions for the proposed PLL. The measurement results demonstrate the effectiveness of the proposed spur reduction techniques, with a reference spur level of -108.3 dBc and out-of-band spur levels below -95 dBc.

### 3.2 PLL as a Dynamic System



Figure 3.3: PLL structure.

It is useful to review the PLL's behavior from a control-system perspective [17, 45, 46]. In Fig. 3.3, a phase-locked loop locks the feedback signal's phase $\phi_F(t)$ to the input signal's phase $\phi_I(t)$ through negative feedback. This simplified model consists of only four blocks: a phase-and-frequency detector (PFD), a loop filter, a voltage-controlled oscillator (VCO) and a feedback divider. For a PLL without a divider, which is the special case of a feedback division ratio N = 1, the output phase $\phi_O(t)$ should be aligned with the input phase $\phi_I(t)$. This implicitly means that the input and output signals have the same frequency. The importance of the loop filter is highlighted, since it



Figure 3.4: PLL system diagram.

determines the static and dynamic performance of the loop [46]. The VCO provides only a single pole at the origin, and the rest of the poles and zeros are provided by the loop filter. The system diagram is shown in Fig. 3.4, where $\phi_I$ and $\phi_F$ are aligned. In this way, the closed-loop transfer function can be expressed as

$$\frac{\phi_O(s)}{\phi_I(s)} = \frac{K_{PFD}F(s)K_{VCO}}{s + K_{PFD}F(s)K_{VCO}/N} \tag{3.1}$$

The transfer function shows a low frequency gain given by N. The settling behavior depends on the filter's property F(s). Here the loop gain is

$$T_{PLL}(s) = \frac{K_{PFD}F(s)K_{VCO}}{sN} \tag{3.2}$$

As shown in (3.2), the loop gain is reduced by an increased feedback division ratio N. $K_{PFD}$ and $K_{VCO}$ are generally chosen with less flexibility [17]. For example, $K_{PFD}$ is $1/2\pi$ for a D-FF based PFD and $2/\pi$ for an XOR based PFD. $K_{VCO}$ depends on the required output frequency range, the passive components' quality factors and the varactors' sizes. There are technology-dependent constraints behind the design of the VCO. A small $K_{VCO}$ is more prone to blind bands, which requires finer granularity in the capacitor banks. On the other hand, a large $K_{VCO}$ is sensitive to the noise at the input [17]. Since $K_{VCO}$ is PVT sensitive, choosing a proper F(s) is essential to ensure the stability of the feedback loop and to avoid the risks of ringing and losing lock. The phase margin of the PLL has been widely investigated. A sufficient phase margin with proper locations of poles and zeros avoids peaking in the closed-loop transfer function [17], which is important in PLLs that require fast settling as well as in clock and data recovery (CDR) circuits that must avoid jitter peaking.
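Two properties stated above, the low-frequency closed-loop gain of N from (3.1) and the inverse dependence of the loop gain on N from (3.2), can be checked numerically. The sketch below is illustrative only: it assumes a trivial loop filter F(s) = 1 and a hypothetical $K_{VCO}$, neither of which is taken from the actual design, and the function names are made up for this example.

```python
import math


def closed_loop(s, k_pfd, k_vco, n, f_of_s):
    """Eq. (3.1): phi_O / phi_I for forward gain K_PFD*F(s)*K_VCO/s and feedback 1/N."""
    g = k_pfd * f_of_s(s) * k_vco
    return g / (s + g / n)


def loop_gain(s, k_pfd, k_vco, n, f_of_s):
    """Eq. (3.2): T_PLL(s) = K_PFD*F(s)*K_VCO / (s*N)."""
    return k_pfd * f_of_s(s) * k_vco / (s * n)


def flat_filter(s):
    """Assumed placeholder loop filter, F(s) = 1."""
    return 1.0


K_PFD = 1 / (2 * math.pi)        # D-FF based PFD gain quoted in the text
K_VCO = 2 * math.pi * 100e6      # rad/s/V, hypothetical value
s_low = 1j * 2 * math.pi * 1.0   # evaluate at 1 Hz, far below the loop bandwidth

for n in (10, 20):
    h = abs(closed_loop(s_low, K_PFD, K_VCO, n, flat_filter))
    t = abs(loop_gain(s_low, K_PFD, K_VCO, n, flat_filter))
    print(f"N={n}: |H| = {h:.3f}, |T| = {t:.3e}")
```

Under these assumptions the closed-loop magnitude approaches N at low frequency, while doubling N halves the loop gain, which is the trade-off the paragraph describes.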

### 3.3 An Overview of a Fractional-N PLL System



Figure 3.5: A charge pump fractional-N PLL.

Fractional-N PLLs can be categorized into analog and digital PLLs [17, 47]. A typical fractional-N analog PLL system, as shown in Fig. 3.5, consists of seven blocks: a reference clock source, a phase-and-frequency detector (PFD), a charge pump, a loop filter, a voltage-controlled oscillator (VCO), a feedback frequency divider and a feedback division ratio control block. All of these blocks have a significant impact on system performance. The frequency control word (FCW) sets the current division ratio through digital control bits. Typical indicators of a fractional-N PLL's performance are its output phase noise and spur levels. Generally, the noise contributions from the reference, VCO and delta-sigma modulator are the main contributors to the system's total phase noise. The noise-shaping property of the delta-sigma modulator inside the PLL greatly reduces its in-band noise components, and its out-of-band noise is shaped by the loop. At the same time, reference spurs (green) come from non-idealities of the charge pump, and fractional spurs (magenta) arise from the DDSM's fractional division pattern. Fractional spurs are closer to the output carrier



Figure 3.6: Design challenges of a charge pump fractional-N PLL.

than the reference spurs, and they need suppression techniques other than just changing the loop dynamics.

#### 3.3.1 Reference Clock Source

A PLL's reference clock can be generated from an external crystal oscillator or from the output of another PLL. The transfer function from the PLL's input to its output can be expressed as [17][48],

$$\frac{\phi_O(s)}{\phi_I(s)} = \frac{K_{PFD}I_{CP}Z(s)K_{VCO}}{s + K_{PFD}I_{CP}Z(s)K_{VCO}/N} \tag{3.3}$$

The loop filter converts the current from the charge pump into a voltage, so its impedance is expressed as Z(s). Compared with (3.1) and (3.2), the loop filter F(s) takes the form of $I_{CP}Z(s)$. The corresponding loop gain T(s) is then expressed as,

$$T(s) = \frac{K_{PFD}I_{CP}Z(s)K_{VCO}}{sN} \tag{3.4}$$

Fundamentally, the reference's phase noise is multiplied by $N^2$ and low-pass filtered at the PLL's output. Close-in spurs from the clock source will also appear at the PLL's output.
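To give a feel for the $N^2$ multiplication, the in-band penalty can be expressed in dB as $20\log_{10}N$. The sketch below uses N = 33 purely as an assumed example (for instance, a 3.3 GHz output from a hypothetical 100 MHz reference); the reference frequency is an illustrative assumption, not a value stated in this section.

```python
import math


def reference_noise_gain_db(n: float) -> float:
    """In-band gain applied to reference phase noise power: 10*log10(N^2)."""
    return 20.0 * math.log10(n)


N_EXAMPLE = 33  # assumed division ratio, for illustration only
gain_db = reference_noise_gain_db(N_EXAMPLE)
print(f"N = {N_EXAMPLE}: reference phase noise raised by {gain_db:.1f} dB in band")
```

For this assumed ratio the reference phase noise appears roughly 30 dB higher at the PLL output within the loop bandwidth, and doubling N adds about 6 dB more.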

#### 3.3.2 Phase-and-Frequency Detector



Figure 3.7: Diagram of a PFD.

A phase-and-frequency detector detects the phase error between the reference and feedback signals and acts as a linear gain block with a structure-dependent input range. It converts the phase difference into a pulse-width difference with an equivalent PFD gain $K_{PFD}$. For a typical D flip-flop (D-FF) based PFD, shown in Fig. 3.7, $K_{PFD}$ is $1/2\pi$. Fig. 3.8 shows a typical PLL in the locked state, where the reference signal leads the feedback signal. Thus, the up pulse at the PFD's output is wider than the down pulse. $T_{SPE}$ stands for the static phase error in the locked state, and $T_D$ is the delay between the moment the down pulse is set and the moment both pulses are reset. $T_{D1}$ is the clock-to-Q delay of the D-FF in Fig. 3.7. $T_D$ is usually described as the delay of the feedback path of a PFD. From Fig. 3.7, $T_D$ can be expressed as,

$$T_D = T_{rst-Q} + T_{nand2} + T_{inv} \tag{3.5}$$

$T_{nand2}$ and $T_{inv}$ are the delays of a two-input NAND gate (nand2) and an inverter, respectively. $T_{rst-Q}$ is the delay between reset and Q of the D-FF. $T_{PFD}$ is the width of the wider pulse and is the sum of $T_D$ and $T_{SPE}$. More importantly, if $T_{SPE}$ approaches zero, $T_{PFD}$ equals $T_D$. All these



Figure 3.8: Timing diagram of a PFD.

delays, as well as the pulses' rise and fall times, are PVT and technology dependent. To analyze the PFD's noise at the PLL's output, the diagram in Fig. 3.9 is used. By applying nodal analysis, the PFD's



Figure 3.9: Transfer function of PFD's noise.

output-noise transfer function to the PLL's output is expressed as,

$$\frac{\phi_O(s)}{\phi_{n,PFD}(s)} = \frac{I_{CP}Z(s)K_{VCO}/s}{1+T(s)} \tag{3.6}$$

For advanced technology nodes, the PFD usually contributes minimal phase noise at the output due to the fast transitions of the digital logic.

#### 3.3.3 Charge Pump Design Challenges



Figure 3.10: A simple charge pump.

The charge pump converts the difference from the time domain (pulse width) to the current domain, as in Fig. 3.10. Thus, the equivalent gain of a charge pump is usually the nominal average current of the charge pump, $I_{CP}$. As an analog block, the charge pump's non-idealities lead to additional noise and reference spurs. Various publications provide a detailed analysis of the charge pump's noise [17], [44]. Fundamentally, the thermal noise from the up and down current sources is filtered and eventually appears at the output. Meanwhile, the switched-capacitor noise from the switches will also be present at the PLL's output [44]. As the output noise is affected by the pulse width $T_{PFD}$, a smaller $T_{PFD}$ leads to reduced noise power. The transfer function from the pulsed charge pump noise to the PLL's output is expressed as follows,



Figure 3.11: Transfer function of charge pump's noise.

$$\frac{\phi_O(s)}{I_{n,CP}(s)} = \frac{Z(s)K_{VCO}/s}{1+T(s)} \tag{3.7}$$

Here the charge pump is not working in a continuous-time fashion since the switches are only turned on for a short period of time, which indicates a low duty cycle. Thus, the noise power of the charge pump is estimated as [49],

$$\overline{I_{n,CP}^2} \approx 8kT\gamma g_m \frac{T_{PFD}}{T_{REF}} \tag{3.8}$$

Although the up and down pulses in Fig. 3.8 have different widths, $T_{PFD}$ and $T_D$, for simplicity it is assumed that they have similar pulse widths $T_{PFD}$. It is also assumed that the two current sources have similar noise power spectral densities of $4kT\gamma g_m$. From (3.8), the charge pump's output noise is proportional to $T_{PFD}$, which is not directly related to the loop bandwidth of the PLL.
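The duty-cycle scaling in (3.8) can be illustrated numerically. In the sketch below, the transconductance, reference period, noise factor and temperature are hypothetical values chosen only to show the trend; none of them is taken from the actual charge pump design, and the function name is illustrative.

```python
BOLTZMANN = 1.380649e-23  # J/K


def cp_noise_psd(gm, t_pfd, t_ref, gamma=1.0, temp_k=300.0):
    """Eq. (3.8): approximate charge pump current noise PSD in A^2/Hz.

    Two current sources of 4kT*gamma*gm each, scaled by the duty cycle
    T_PFD / T_REF during which the switches conduct.
    """
    return 8.0 * BOLTZMANN * temp_k * gamma * gm * (t_pfd / t_ref)


# Hypothetical values, for illustration only.
GM = 1e-3       # S
T_REF = 10e-9   # s, i.e. an assumed 100 MHz reference
wide = cp_noise_psd(GM, 200e-12, T_REF)
narrow = cp_noise_psd(GM, 100e-12, T_REF)
print(f"T_PFD = 200 ps: {wide:.3e} A^2/Hz")
print(f"T_PFD = 100 ps: {narrow:.3e} A^2/Hz (ratio {wide / narrow:.1f})")
```

Halving the pulse width halves the estimated noise power, independent of the loop bandwidth, which is the point made in the paragraph above.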

Fig. 3.12 shows a simplified example of the output ripple from the charge pump, in which the UP pulse is wider than the DN pulse. Under the premise of zero net charge on the filter [17], the integral of the charging current over time should equal the integral of the discharging current over time. In this example, the charging current is taken with a positive sign and the discharging current with a negative sign. The charging current is $I_{UP}$, the charge pump current from the up side, and its charging time is $T_{PFD}$. The discharging current is $I_{DN}$, the charge pump current from the down side, and the discharging



Figure 3.12: Output ripple from charge pump.

time is $T_D$. Since $T_D < T_{PFD}$, zero net charge implies that $I_{DN} > I_{UP}$. Here $I_{UP}$ can also be expressed as $I_{CP}$ and $I_{DN}$ as $I_{CP} + \Delta I_{CP}$. For simplicity, assuming that the pole/zero pair of the filter is much lower than the reference frequency, the charging and discharging essentially happen on the top plate of $C_2$. The amplitude of the ripple voltage is then,

$$V_A = \frac{(T_{PFD} - T_D) * I_{CP}}{C_2} = \frac{T_{SPE} * I_{CP}}{C_2}$$
(3.9)

A non-zero $T_{SPE}$ allows the ripple $V_A$ to appear, as the charge pump current $I_{CP}$ raises the voltage on the top plate of $C_2$ during $T_{SPE}$. A lower $T_{SPE}$ leads to a lower $V_A$. From (3.3), where F(s) becomes $I_{CP}Z(s)$, $I_{CP}$ scales inversely with Z(s) for a fixed loop gain. Thus, increasing $I_{CP}$ requires a proportional increase of $C_2$ if constant loop dynamics are assumed, so the ratio $I_{CP}/C_2$ stays constant. From this simplified model, reducing $T_{SPE}$ is fundamental to minimizing the ripple voltage $V_A$ at the charge pump's output. Consider a periodic output ripple with a swing of $V_A$. Between 0 and $T_{REF}$, the ripple $V_{Rip}$ can be expressed as,

$$V_{Rip}(t) = \frac{V_A}{T_{SPE}} * t, 0 \le t \le T_{SPE}$$
 (3.10)


Figure 3.13: Periodic ripple at charge pump output.

$$V_{Rip}(t) = -\frac{V_A}{T_D} * (t - T_{PFD}), T_{SPE} \le t \le T_{PFD}$$
(3.11)

As shown in Fig. 3.8, $T_{SPE}$ is the static phase error, and $T_D$ and $T_{PFD}$ are the pulse widths of DN and UP, respectively. By taking a periodic Fourier analysis,

$$V_{Rip}(t) = \sum_{n=0}^{\infty} A_n e^{jn\Omega t}$$
(3.12)

the coefficient of the n-th order harmonic is,

$$A_{n} = \frac{2}{T_{REF}} * \int_{0}^{T_{REF}} V_{Rip}(t) e^{-jn\Omega t} dt$$
(3.13)

where $\Omega$ is $2\pi/T_{REF}$. After mathematical manipulations, $A_n$ can be separated into two parts, $A_{n,p1}$ and $A_{n,p2}$.

$$A_{n,p1} = \left(\frac{T_{SPE}e^{-jn\Omega T_{SPE}} + \frac{e^{-jn\Omega T_{SPE}} - 1}{jn\Omega}}{-jn\Omega}\right)\frac{2V_A}{T_{SPE}T_{REF}}$$
(3.14)

$$A_{n,p2} = \left(\frac{T_{PFD}e^{-jn\Omega T_{SPE}} - T_{SPE}e^{-jn\Omega T_{SPE}} + \frac{e^{-jn\Omega T_{PFD}} - e^{-jn\Omega T_{SPE}}}{jn\Omega}}{jn\Omega}\right)\frac{2V_A}{T_D T_{REF}}$$
(3.15)

It is non-trivial to find the magnitudes of $A_{n,p1}$ and $A_{n,p2}$ exactly, so an upper bound on the magnitude of their sum is used instead,

$$|A_n| \le \left(\frac{2V_A}{n^2 \Omega^2 T_{SPE}} + \frac{2V_A}{n^2 \Omega^2 T_D}\right) \frac{2}{T_{REF}}$$
(3.16)

Since $V_A/T_{SPE}$ equals $I_{CP}/C_2$, which is design dependent, the first term sets a design-dependent floor on this upper bound of $|A_n|$. However, as $T_{SPE}$ approaches zero, $e^{-jn\Omega T_{SPE}} - 1$ also approaches zero, which significantly reduces $V_A$ and $|A_n|$. Considering that $T_D$ usually needs to be larger than $T_{SPE}$ [50], $|A_n|$ can be minimized by making $T_{SPE}$ as close to zero as possible. In sum, minimizing $T_{SPE}$ leads to minimized reference spur levels. At the transistor level the situation is much more complex, but this analysis gives a starting point for reducing the reference spur level.
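As a sanity check of the ripple model, the first harmonic of the piecewise-linear ripple in (3.10)-(3.11) can be integrated numerically and compared with the bound in (3.16). The following is only a sketch: the ratio $I_{CP}/C_2$, the DN pulse width $T_D$ and the 47.2 MHz reference are assumed values, and the midpoint integration is an illustrative choice rather than the method used in this work.

```python
import cmath
import math


def ripple(t, v_a, t_spe, t_d):
    """Piecewise-linear ripple of (3.10)-(3.11), zero after T_PFD = T_SPE + T_D."""
    t_pfd = t_spe + t_d
    if t <= t_spe:
        return v_a * t / t_spe
    if t <= t_pfd:
        return -v_a * (t - t_pfd) / t_d
    return 0.0


def harmonic(n, v_a, t_spe, t_d, t_ref, steps=20000):
    """A_n of (3.13) by midpoint integration over [0, T_PFD]."""
    omega = 2.0 * math.pi / t_ref
    dt = (t_spe + t_d) / steps
    total = 0j
    for k in range(steps):
        t = (k + 0.5) * dt
        total += ripple(t, v_a, t_spe, t_d) * cmath.exp(-1j * n * omega * t)
    return 2.0 / t_ref * total * dt


def upper_bound(n, v_a, t_spe, t_d, t_ref):
    """Right-hand side of (3.16)."""
    omega = 2.0 * math.pi / t_ref
    k = 2.0 * v_a / (n * n * omega * omega)
    return (k / t_spe + k / t_d) * 2.0 / t_ref


T_REF = 1.0 / 47.2e6       # reference period in seconds (assumed)
SLOPE = 100e-6 / 100e-12   # assumed I_CP / C_2 in V/s, held constant
T_D = 70e-12               # assumed DN pulse width in seconds

for t_spe in (20e-12, 5e-12):
    v_a = SLOPE * t_spe    # ripple amplitude from (3.9)
    a1 = abs(harmonic(1, v_a, t_spe, T_D, T_REF))
    bound = upper_bound(1, v_a, t_spe, T_D, T_REF)
    print(f"T_SPE = {t_spe * 1e12:.0f} ps: |A_1| = {a1:.3e} V, bound = {bound:.3e} V")
```

For these assumed values the integrated $|A_1|$ stays below the bound and shrinks as $T_{SPE}$ is reduced with $I_{CP}/C_2$ held constant, which is the trend argued above.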

On the other hand, current source mismatches over different output voltages, clock feedthrough and charge injection from the switches add to the non-linear charge transfer, so additional spurs appear at the VCO's output as the ripple is up-converted by the varactor [17]. Minimizing the reference spurs has been one of the main goals of this project. Taking the D-FF PFD and charge pump as an example, the pulse width from the PFD, $T_{PFD}$, cannot be made arbitrarily small; otherwise the switches in the charge pump do not have enough time to open and close to channel current to the output. This is the typical dead-zone problem [50], and the usual solution is to lengthen the feedback path delay $T_D$ in the PFD to give the switches in the charge pump enough time to respond. However, this problem is greatly alleviated in advanced CMOS nodes, as the transistors respond to narrow pulses. Although short-channel devices with a higher $\gamma$ produce higher current noise, a much shorter $T_{PFD}$ also reduces the equivalent current noise in (3.8). From the noise point of view, short-channel devices therefore generate lower noise at the PLL's output.

In sum, the pulse width from the PFD, $T_{PFD}$, has a significant impact on the charge pump's performance. A longer feedback path in the PFD gives wider pulses that avoid dead zones, while narrow pulses help reduce the thermal noise from the current sources. The pulse widths also determine the portion of the current source mismatch that contributes to reference spurs. This is therefore a system-level optimization: the PFD and charge pump should be designed and optimized together. Minimizing the static phase error $T_{SPE}$ without introducing the dead-zone problem greatly reduces the reference spur level.

## 3.3.4 Voltage Controlled Oscillator

There are multiple types of voltage controlled oscillators in CMOS designs, such as ring oscillators and LC oscillators. A ring VCO has a compact layout and a wide tuning range but unsatisfactory phase noise performance [51]. In contrast, an LC-VCO has a larger layout and a smaller tuning range but better phase noise performance.



Figure 3.14: Transfer function of VCO's noise.

$$\frac{\phi_O(s)}{\phi_{n,VCO}(s)} = \frac{1}{1+T(s)}$$
(3.17)

The transfer function of the VCO's phase noise in the PLL is given in (3.17) and shows a high-pass characteristic. Techniques exist to reduce the phase noise of an LC-VCO by injection locking [52][53]. However, these locking mechanisms usually lead to additional spurs and are not applicable to low-spur applications [54]. There are also more complex structures that reduce the phase noise through transformer-based noise coupling, but these techniques need additional silicon area.

For the output spurs [55] coming from the ripple $|A_1|$ at the charge pump's output (3.16),

$$\frac{A_{spur}}{A_{carrier}} = \frac{1}{2} \frac{K_{VCO}|A_1|}{2\pi f_{ref}}$$
(3.18)

(3.18) gives the spur's amplitude relative to the output carrier's amplitude. $|A_1|$, calculated from (3.14) and (3.15), is the amplitude of the first-order harmonic at $f_{ref}$. Similarly, for $|A_2|$, the denominator in (3.18) changes to $2\pi \cdot 2f_{ref}$. (3.18) also indicates that the output spur level is proportional to $K_{VCO}$, a parameter that also sets the loop bandwidth. Reducing $K_{VCO}$ reduces the output spurs at the cost of a lower loop bandwidth, which from the loop transfer function's point of view is equivalent to reducing the charge pump current $I_{CP}$ or the loop filter impedance Z(s). In most cases, the reference spurs are placed out of band, where the loop gain is below 0 dB. Thus, the reference spur levels depend solely on the properties of the charge pump, loop filter and VCO. Close-in fractional spurs, which are in band, are affected by the loop's properties.

For this 40 nm technology, ultra-thick metal (UTM) is not available, so inductors and transformers can only be built with limited quality factors. The phase noise performance is limited by the quality factor of the inductor, and a typical complementary cross-coupled LC-VCO with tail current filtering [56] is implemented. It targets 2.4-4 GHz, with $K_{VCO}$ from 100 MHz/V to 170 MHz/V.
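To give a feel for the numbers behind (3.18), the sketch below converts a ripple amplitude into a spur level in dBc and inverts the relation for a target spur. It assumes that $K_{VCO}$ in (3.18) is expressed in rad/s/V (so a value in Hz/V is multiplied by $2\pi$) and uses an assumed 47.2 MHz reference; the function names are hypothetical.

```python
import math


def spur_dbc(kvco_hz_per_v, ripple_v, f_offset_hz):
    """Spur-to-carrier ratio of (3.18) in dBc, with K_VCO given in Hz/V."""
    kvco_rad = 2.0 * math.pi * kvco_hz_per_v  # assumed unit conversion
    ratio = 0.5 * kvco_rad * ripple_v / (2.0 * math.pi * f_offset_hz)
    return 20.0 * math.log10(ratio)


def max_ripple_v(kvco_hz_per_v, target_dbc, f_offset_hz):
    """Largest ripple harmonic amplitude that still meets a target spur level."""
    return 2.0 * f_offset_hz * 10.0 ** (target_dbc / 20.0) / kvco_hz_per_v


F_REF = 47.2e6  # assumed reference frequency in Hz

for kvco in (100e6, 170e6):
    limit = max_ripple_v(kvco, -100.0, F_REF)
    print(f"K_VCO = {kvco / 1e6:.0f} MHz/V: |A_1| must stay below {limit * 1e6:.2f} uV")
```

Under these assumptions a -100 dBc reference spur allows only a few microvolts of first-harmonic ripple on the control voltage, and the allowance shrinks in proportion as $K_{VCO}$ grows.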

## 3.3.5 Loop Filter



Figure 3.15: Second order loop filter.

The loop filter is the core part of a fractional-N PLL and is fundamental in determining the loop transfer function. The loop filter is realized with passive components, because active filters are noisy [46]. Taking a second-order loop filter (Fig. 3.15) as an example, $R_1$, $C_1$ and $C_2$ provide one pole at the origin and a pole-zero pair. The pole-zero pair determines the phase margin of the feedback system. Under process variations, such as 10% and 20% variations of $I_{CP}$ and $K_{VCO}$, a sufficient phase margin keeps the system from overshooting during the pull-in process.



Figure 3.16: Transfer function of loop filter's noise

$$\frac{\phi_O(s)}{V_{n,FIL}(s)} = \frac{K_{VCO}/s}{1+T(s)}$$
(3.19)

Since the capacitors are regarded as noiseless, $V_{n,FIL}$ mainly comes from the noise of resistor $R_1$ and is low-pass filtered by the RC network. $V_{n,FIL}$ is then band-pass filtered and appears at the PLL's output. More importantly, the unity gain frequency of the PLL should be low enough that the response to the discrete current pulses from the charge pump approximates the targeted continuous-time transfer function, which is described by the following condition,

$$f_{BW} \le \frac{f_{ref}}{20} \tag{3.20}$$

The reference frequency should be higher than the system bandwidth, usually by 10-20 times [46], for the whole system to have the desired analog transfer function.

#### 3.3.6 Feedback Frequency Divider

Designing a feedback divider is also an important topic in wireless systems. Fundamentally, the divider is a counter driven by the VCO's output. Compared with a programmable divider ranging from M to N, a programmable divider ranging from M/2 to N/2 preceded by a divide-by-2 prescaler, where M/2 and N/2 need to be integers, can only produce output frequencies in steps of $2f_{REF}$ instead of $f_{REF}$. Thus, the latter case has coarser frequency granularity.

The divider design is usually technology dependent. In fast CMOS technologies, the divider can be fully realized in either current-mode logic (CML) or digital logic [57]. D-FFs, such as true single-phase clock (TSPC) or transmission-gate D flip-flops (TG-DFF), are commonly used to implement frequency dividers. Slower technologies usually employ CML for the fast dividers and digital logic for the slow, programmable dividers in a cascade topology. Bandwidth extension techniques with passive components can also be applied to CML to further extend its operating frequency. Regenerative dividers [58] have been widely studied in millimeter-wave applications, such as 60 GHz and 75 GHz.

The granularity of a PLL's output frequency should be planned ahead. To alleviate the reduced granularity of a fixed prescaler, a dual-modulus prescaler can be used [17]. However, PVT variations need to be considered for a multi-modulus divider to work properly [17, 59]. Jitter accumulation also needs to be addressed when implementing the cascade structure. For this design, no prescaler is used, and low-power TSPCs provide full programmability of the division ratios.

#### 3.3.7 Division Ratio Control Block

The division ratio control block is used to generate integer or fractional division ratios. For integer division ratios, its output is fixed. Fractional division ratios are realized by changing the division ratio periodically. For example, alternating the division ratio between 71 and 72 gives an equivalent division ratio of 71.5, and a fractional spur appears at $f_{REF}/2$. However, if the divider has a granularity of 2 instead of 1, a pattern of 70, 72, 72, 72 is needed to obtain the division ratio of 71.5. The fractional spurs are determined by the pattern repetition period and here appear at multiples of $f_{REF}/4$. The frequency granularity of the PLL therefore has a significant impact on the frequencies of the fractional spurs.

Since fractional spurs are pattern dependent, numerous methods have been developed to reduce them. The fundamentals of these methods are generally as follows. The first step is to choose the proper integer and fractional division ratio; the fractional division ratio is realized by a repeated pattern of integer division ratios. The second step is to break the periodicity of the repeated pattern through dithering or other methods. Dithering reduces fractional spurs but increases the noise level. It does not change the average value of the division ratio pattern, so the fractional division ratio is unchanged. The final step is to shape the white noise from the dithering into out-of-band noise through techniques such as a DDSM, which leaves little noise at low frequencies. The contribution of the DDSM's noise to the PLL's output has been well studied [60],

$$\Phi_{n,dsmo} = S_{qn} \frac{(2\pi)^2}{|1 - z^{-1}|^2} \frac{1}{N^2} \left| N \frac{T(s)}{1 + T(s)} \right|^2$$
(3.21)

where $S_{qn}$ is the DDSM's noise spectral density and T(s) is the loop gain. The DDSM's noise density in the phase domain is described as [60],

$$S_{qn} = \frac{1}{12f_{ref}} \left| NTF(z) \right|^2.$$
(3.22)

NTF(z) is the DDSM's noise transfer function. Methods for reducing the shaped out-of-band noise have also been developed. Since this equivalent noise component is generated at the PLL's input, its out-of-band part is filtered by the loop. Realizing such functionality usually requires a DDSM because of its noise-shaping properties [61, 62, 63]. Without dithering, the DDSM generates repeated patterns whose periodicity is related to its filter order [62]. Injecting dither into different nodes of the DDSM produces differently shaped noise at the output. Unlike an analog delta-sigma modulator, where noise is sampled and processed periodically, the noise from the DDSM is low-pass filtered at the PLL's output, and the loop has different delays for different frequency components. Thus, even with a dithered DDSM, fractional spur patterns have still been reported at the fractional-N PLL's output due to loop dynamics [60]. Many publications have been devoted to reducing the noise and fractional spurs [62, 63]. The noise from the DDSM trades off against its remaining fractional spurs [60]. However, all these methods come with hardware overhead and extra power consumption. A self-dithered DDSM [60] is used in this prototype, and the out-of-band noise is partially reduced by the filters embedded in this PLL. The proposed fractional spur filter is also verified with this self-dithered DDSM.
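The relation between a repeated division pattern, its average division ratio and the lowest fractional spur offset, discussed at the start of this subsection, can be checked with a short sketch. The helper names are hypothetical and the 47.2 MHz reference is an assumed value; the patterns are the undithered examples from the text.

```python
from fractions import Fraction


def average_ratio(pattern):
    """Exact mean of a repeating integer division pattern."""
    return Fraction(sum(pattern), len(pattern))


def repetition_period(pattern):
    """Shortest period, in reference cycles, that regenerates the pattern."""
    n = len(pattern)
    for period in range(1, n + 1):
        if n % period == 0 and pattern == pattern[:period] * (n // period):
            return period


def lowest_spur_offset(pattern, f_ref_hz):
    """Fundamental frequency of the pattern repetition."""
    return f_ref_hz / repetition_period(pattern)


F_REF = 47.2e6  # assumed reference frequency in Hz

for pattern in ([71, 72], [70, 72, 72, 72]):
    ratio = float(average_ratio(pattern))
    offset = lowest_spur_offset(pattern, F_REF) / 1e6
    print(f"{pattern}: average {ratio}, lowest spur at {offset} MHz")
```

Both patterns average to 71.5, but the coarser divider granularity doubles the repetition period and moves the lowest fractional spur from $f_{REF}/2$ to $f_{REF}/4$.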

# 3.4 Proposed PLL Architecture



Figure 3.17: Proposed DPP assisted PLL architecture.

The proposed PLL includes a digital phase processor (DPP) to assist the PLL calibrations, as in Fig. 3.17. The DPP contains two TDCs, a DTC, a moving average filter and an FIR filter. The TDC helps the charge pump to minimize reference spurs by observing and minimizing $T_{SPE}$. The 4-bit current DAC is tuned to reach the minimal $T_{SPE}$. The systematic calibration steps are controlled by the DSP. For this PLL, a third-order architecture with a bandwidth of 250 kHz is used. The VCO is based on LC tanks with $K_{VCO}$ ranging from 100 MHz/V to 140 MHz/V.



Figure 3.18: Proposed LC-VCO with complementary cross-coupled pair.

# 3.4.1 VCO Realization

An LC-VCO optimized for phase noise performance, shown in Fig. 3.18, was used. An inductor at the bottom of the LC-VCO not only filtered out second-order harmonics but also increased the output impedance of the tail current source [56], which reduced phase noise through a higher output swing. In this technology, a spiral inductor built on Metal-8 with 0.023 $\Omega$/square at DC can only achieve a 750-pH differential inductance with a quality factor of 12.4 at 3 GHz. Meanwhile, a 1.1 V power supply is used for core devices, which limits the output voltage range of the charge pump and the output swing of the VCO. Thus, an LC-VCO with cross-coupled NMOS and PMOS pairs is implemented. Because the output swing is limited by the low-Q spiral inductor and the low power supply, burning more current in the LC tank cannot improve the VCO's phase noise performance. The VCO therefore consumes only 2.7 mA from the 1.1 V power supply. Since this VCO has a pair of cross-coupled PMOS transistors, noise from the power supply is up-converted and introduces additional phase noise. An RC circuit with high-frequency poles was used to filter the high-frequency noise from the LDO before it reaches the VCO power supply. The capacitor in the low-pass filter was realized with a high-frequency capacitor. The de-coupling capacitors were realized with a combination of varactors and MoM capacitors to ensure that high-frequency noise can pass through them.

#### 3.4.2 Charge Pump with 4-bit Current DAC Calibration Scheme

From [17], the VCO's reference spur is dictated by the reference ripple at the output of the loop filter, which can be minimized by optimizing the charge pump. Charge pumps built in less advanced technologies, such as 90 nm and 180 nm, generally focus on minimizing the current mismatch between the up-side $I_{UP}$ and the down-side $I_{DN}$. This is because these technologies do not have fast switches, and their charge pumps need wider voltage pulses to drive the current sources and avoid the dead-zone problem [50]. Thus, of all the high-frequency charge that reaches the top plate of $C_2$ of the loop filter 3.14, the charge from the current sources is the main part of the net charge, and balancing $I_{UP}$ and $I_{DN}$ minimizes the glitches. However, this approach cannot easily achieve a reference spur below -100 dBc, since glitches from other contributors are not addressed. Moreover, the noise is higher [44], since $T_{PFD}$ and gm are higher in (3.8).

For the charge pump realized in this 40 nm technology, shown in Fig. 3.19, the faster switches realized as transmission gates do not need a wide voltage pulse $T_{PFD}$ to avoid the dead zone. Thus, the output current pulses can be narrower. Here the net charge injected into the loop filter is separated into several categories: current from the sources $I_{UP}$ and $I_{DN}$, charge injection from the transmission gates, clock feedthrough from the transmission gates, and charge sharing from the two nodes $P_{UP}$ and $P_{DN}$. Taking the PMOS of the high-side transmission gate as an example (Fig. 3.20), when the voltage pulses turn on the transmission gate, holes begin to be absorbed into the channel



Figure 3.19: Proposed charge pump with 4-bit current DAC.

of the PMOS. These holes come from the parasitic node $P_{DN}$, the loop filter and the N-well of the PMOS. Meanwhile, the voltage pulses also feed through the $C_{GD}$ of the PMOS onto the top plate of $C_2$ in the loop filter. The NMOS also generates feedthrough and charge injection, which cause ripple at the output of the loop filter. When the channels are formed, the transistors' on-resistances fall to their desired values. Current from $I_{UP}$ and charge from node $P_{UP}$ begin to flow into the loop filter through the channel. A similar process happens for $I_{DN} + I_{DAC}$ and the $P_{DN}$ node. One difference is that the low-side transmission gates can also absorb electrons/holes from the high-side transmission gates. All of these contribute to the ripple on the loop filter. When the transmission gates are being turned off, their on-resistances rise and the electrons/holes in



Figure 3.20: Charge pump's main path working process.

the channels are released, which is followed by the charge injection and clock feedthrough of the turn-off stage.



Figure 3.21: Comparison between current mismatch and timing mismatch.

Since narrower voltage pulses are used, mismatches in charge injection and clock feedthrough begin to weigh more in the ripple voltage on the loop filter. A comparison between current mismatch and timing mismatch is shown in Fig. 3.21. From (3.16), a lower timing mismatch leads to minimum ripple at the filter's output, at which point the remaining charge injection and clock feedthrough come into the picture. A smaller timing mismatch leads to reduced ripple power and lower output reference spurs.

In the proposed charge pump in Fig. 3.19, two auxiliary OTAs are used to reduce switching glitches and to improve the matching between $I_{UP}$ and $I_{DN}$ [64] [50]. Different from [64], a 4-bit current DAC was designed for fine current error reduction, and no extra capacitors were used at $P_{UP}$ and $P_{DN}$. Worse spur performance was found when extra capacitors were placed at these two nodes, i.e., with larger $C_{PARUP}$ and $C_{PARDN}$. Since [64] is realized in a 130-nm technology, current mismatch there still weighs more than charge injection, clock feedthrough and charge sharing. Higher $C_{PARUP}$ and $C_{PARDN}$ lead to more charge sharing, which deteriorates the spur performance. The charge pump maintained good performance in post-layout simulations but was sensitive to supply noise and ground bounce. Therefore, a strong ground with multiple ground bonds and an on-chip LDO were used to support $I_{DN}$ and $I_{UP}$, respectively.

#### 3.4.3 Loop Filter Design

Since a QFN package was used, with each ground bond estimated at around 400 pH, ground bounce was expected at the charge pump's output. The output current pulses from the charge pump inject charge onto the top plate of $C_2$, and the resulting voltage difference across $R_1$ drives a current through $R_1$ that balances the voltages on the top plates of $C_1$ and $C_2$. However, as shown in Fig. 3.22, on-chip $C_1$ and $C_2$ would unavoidably see the inductance of the ground bonds, which could generate large glitches at the bottom plate of $C_2$. To alleviate this issue, $R_1$, $C_1$ and $C_2$ were realized with off-chip components, so that the bottom plates of $C_1$ and $C_2$ see a strong ground instead of the ground bond of the package. $R_2$ and $C_3$ were realized on-chip, close to the VCO. All on-chip blocks were implemented with separate, stronger ground connections through multiple ground bonds.



Figure 3.22: A third-order loop filter design with/without ground bonds (panel title: Fully on-chip loop filter).

## 3.4.4 Programmable Fractional Divider

The programmable fractional divider implements a division ratio given by the expression N + M/128, where N is the integer division ratio, ranging from 32 to 256, and M is the numerator of the fractional division ratio, ranging from 1 to 127. A dual-modulus prescaler is not suitable for the proposed low-spur PLL design because of PVT variations [17, 59]. Thus, a divider driven by a synchronous clock was implemented with low-power TSPCs as registers.
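The frequency step implied by the N + M/128 expression can be illustrated as follows. This is only a sketch: the 47.2 MHz reference is an assumed value, the function name is hypothetical, and M = 0 is accepted here as a stand-in for integer-N operation, which is an assumption beyond the 1 to 127 range stated above.

```python
F_REF = 47.2e6  # assumed reference frequency in Hz


def output_frequency(n_int, m_frac, f_ref_hz=F_REF):
    """Output frequency for a division ratio of N + M/128."""
    if not 32 <= n_int <= 256:
        raise ValueError("N must lie between 32 and 256")
    if not 0 <= m_frac <= 127:  # M = 0 models integer-N mode (assumption)
        raise ValueError("M must lie between 0 and 127")
    return (n_int + m_frac / 128) * f_ref_hz


step_hz = output_frequency(70, 1) - output_frequency(70, 0)
print(f"N = 70, M = 0: {output_frequency(70, 0) / 1e9:.4f} GHz")
print(f"frequency step per LSB of M: {step_hz / 1e3:.2f} kHz")
```

With this assumed reference, one LSB of M moves the output by $f_{REF}/128$, which is a few hundred kHz.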

# 3.4.5 Proposed Division Ratio Control

A third-order DDSM with self-dithering [60] was implemented as the division ratio controller to provide third-order shaping of the in-band quantization noise. No dedicated noise-cancellation DDSM was implemented in this prototype, and the functionality of the digital phase processor was fully examined.



Figure 3.23: Proposed digital phase processor architecture.

## 3.4.6 Digital Phase Processor

The digital phase processor (DPP) was designed together with the analog loop transfer function of the PLL. The DPP includes two 10-bit TDCs, a moving average filter (MAF), an FIR filter, a 10-bit DTC and an output multiplexer (MUX). The targeted resolution of the TDCs was 1 ps. The TDCs were used to observe $T_{SPE}$ so that the charge pump could reach the optimal reference spur performance. The DPP also samples the information in the phase domain and uses the MAF to generate digital notches that filter out fractional spurs at multiples of $f_{REF}/32$. The out-of-band noise from the DDSM is also filtered by the MAF. The PLL's open-loop transfer function $H_{OL}(s, z)$, including the DPP, can be expressed as,

$$H_{OL}(s,z) = H_A(s)H_D(z)H_{Delay}(s)$$
(3.23)

where $H_A(s)$ is the analog part of the loop, including the PFD, charge pump, loop filter and VCO. For simplicity, the transfer function of the frequency divider is also included in $H_A(s)$. $H_D(z)$ is the digital part, implemented in the DPP, and $H_{Delay}(s)$ is the total delay within the PLL, which mainly accounts for the analog buffers and digital circuit delays. For the analog $H_A(s)$,

$$H_A(s) = \frac{1}{2\pi} \frac{1 + sR_1C_1}{1 + sR_1\frac{C_1C_2}{C_1 + C_2}} \frac{1}{1 + sR_2C_3 + \frac{s^2C_1C_3R_1}{s^2C_1C_2R_1 + s(C_1 + C_2)}} \frac{I_{CP}}{s(C_1 + C_2)} \frac{K_{VCO}}{s} \frac{1}{N}$$
(3.24)

As shown in Fig. 3.22, $R_2C_3$ introduces a high-frequency pole, which lies above the unity gain frequency of 250 kHz. For the digital part,

$$H_D(z) = H_{MAF}(z)H_{FIR}(z) = \frac{\sum_{i=0}^{31} z^{-i}}{32}(14 - 13z^{-1})$$
(3.25)

where the transfer functions of the MAF and FIR filter are the two factors in (3.25). The MAF's transfer function $H_{MAF}(z)$ generates nulls at multiples of $f_{REF}/32$, which coincide with the fractional spurs at multiples of $f_{REF}/32$. The first null is located at $f_{REF}/32$, or 1.475 MHz, which is 5.9 times the unity gain frequency of 250 kHz. The negative phase from $H_{MAF}(z)$ is compensated by $H_{FIR}(z)$ to maintain the phase margin of the fractional-N PLL.

$$H_{Delay}(s) = e^{-sT_{DTot}} \tag{3.26}$$

The extra delay $T_{DTot}$ adds negative phase to the PLL. It includes the delay of all the blocks in the PLL. The unity gain frequency of 250 kHz corresponds to a time constant on the order of $\mu$s, so for delays on the order of hundreds of ps, the total loop delay does not introduce excessive negative phase.

As shown in Fig. 3.24, without the DPP, the loop's unity gain frequency is 249 kHz with a phase margin of 69.5°. The 32-tap MAF tracks the fractional-N spurs with its nulls at multiples of $f_{REF}/32$ but lowers the unity gain frequency to 234 kHz and the phase margin to 40.8°. The FIR filter, with its positive phase response, restores the unity gain frequency to 253 kHz with a phase margin of 64.1°. The first null generated by the MAF occurs at $f_{REF}/32$, where the excess phase also occurs. Since the denominator of the fractional division ratio is $2^n$, where n is an integer, the MAF should use a tap length of $2^n$, such as 32, 64 or 128.
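The behaviour of the digital factors in (3.25) and of the delay term in (3.26) can be explored with a small sketch. It assumes that the DPP is clocked at $f_{REF}$ and that $f_{REF}$ = 47.2 MHz, as inferred from the quoted null at 1.475 MHz; the 500 ps loop delay is a hypothetical value, and the analog part $H_A(s)$ is not modeled.

```python
import cmath
import math

F_REF = 32 * 1.475e6  # 47.2 MHz, inferred from the first MAF null (assumption)


def z_inverse(f_hz, f_s=F_REF):
    """z^-1 evaluated on the unit circle at frequency f_hz."""
    return cmath.exp(-2j * math.pi * f_hz / f_s)


def h_maf(f_hz, taps=32):
    """Moving average filter of (3.25): mean of the last `taps` samples."""
    z_inv = z_inverse(f_hz)
    return sum(z_inv ** i for i in range(taps)) / taps


def h_fir(f_hz):
    """Two-tap FIR of (3.25): 14 - 13 z^-1, unity gain at DC."""
    return 14 - 13 * z_inverse(f_hz)


def phase_deg(h):
    return math.degrees(cmath.phase(h))


def delay_phase_deg(f_hz, delay_s):
    """Phase of exp(-s T) in (3.26) at s = j 2 pi f."""
    return -360.0 * f_hz * delay_s


F_UG = 250e3  # unity gain frequency quoted in the text
print(f"|H_MAF| at f_REF/32: {abs(h_maf(F_REF / 32)):.2e}")
print(f"MAF phase at 250 kHz: {phase_deg(h_maf(F_UG)):.1f} deg")
print(f"FIR phase at 250 kHz: {phase_deg(h_fir(F_UG)):.1f} deg")
print(f"500 ps delay at 250 kHz: {delay_phase_deg(F_UG, 500e-12):.3f} deg")
```

Under these assumptions the MAF contributes roughly -30° near 250 kHz, the FIR filter gives back a positive phase of a little over 20°, and a delay of a few hundred ps costs only a small fraction of a degree, which is consistent with the trend of the phase margins reported above.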



Figure 3.24: Frequency and phase response of the PLL with analog part only and MAF and FIR activated.

However, the in-band performance must not be affected by the MAF. A MAF with more taps, such as 64 or 128, can track fractional spurs down to lower frequencies but leads to instability, because the in-band nulls cause drastic phase changes, as shown in Fig. 3.24. It is believed that close-in fractional spurs should be reduced by employing dithering techniques in the DDSM instead of by the MAF.

Besides the digital MAF and FIR filter, high-resolution TDCs and a DTC with a wide linear range are also needed. The authors of [65] use a gated ring oscillator to achieve an outstanding resolution. As demonstrated in [65], the quantization noise of the TDC should also be considered in an all-digital PLL, and this is also the case in this prototype. Since time amplification is difficult and needs a lot of calibration [66], a sub-ranging TDC is proposed; its architecture is shown in Fig. 3.25.



Figure 3.25: Proposed sub-ranging TDC structure.

The 10-bit TDC includes delay cells with 32 ps and 1 ps delays. The 32 ps delay cells decide $D_{TDC[9:5]}$. Unlike the residue in a conventional ADC, the residue phase in a TDC is difficult to store. The sub-ranging TDC therefore uses the coarse stage's output to select the proper residue input for the fine stage, and the residue phase is generated by quantizing the input again. $D_{TDC[9:5]}$ selects the output of the NAND-based multiplexer, and the residue phase is quantized by the 1 ps delay cells to generate $D_{TDC[4:0]}$. $D_{TDC_{-C}[9:5]}$ is compared with $D_{TDC[9:5]}$ to check the matching of the 32 ps delay cells. In summary, the proposed TDC works in a sub-ranging fashion, where the coarse and fine bits are decoded sequentially. For example, suppose $\Phi_{E0}$ is 46 ps ahead of $\Phi_{D0}$. After the first 32 ps delay cell, $\Phi_{E1}$ is 14 ps ahead of $\Phi_{D1}$, and after the second, $\Phi_{E2}$ is 18 ps behind $\Phi_{D2}$. $D_{TDC[9:5]}$ is thus generated, and the residue phase is 14 ps. After the fixed delay, which gives $D_{TDC[9:5]}$ enough time to control the NAND-based multiplexer, the residue phase of 14 ps is extracted, with $\Phi_{E1D}$ 14 ps ahead of $\Phi_{D1D}$, and fed to the 1 ps delay cells as $\Phi_{PE}$ and $\Phi_{PD}$. $D_{TDC[4:0]}$ is then decoded as 14, which completes the decoding of the input. All stages have been Monte Carlo simulated to ensure that their variations are under 1/2 LSB (0.5 ps). The DTC was implemented with a structure similar to that of the TDC, only with different



Figure 3.26: Proposed sub-ranging DTC structure.

inputs and outputs. The input $\Phi_{IN}$ is fed into two paths. After the first delay cell, $\Phi_{D1}$ is 32 ps behind $\Phi_{E1}$. Similarly, $\Phi_{D2}$ is 64 ps behind $\Phi_{E2}$. $D_{DTC[9:5]}$ selects one output from the multiple inputs with different delays. The output of the multiplexer is sent to the 1 ps delay cells. For example, if $\Phi_{PD0}$ is 96 ps behind $\Phi_{PE0}$, $\Phi_{PD1}$ will be 97 ps behind $\Phi_{PE1}$. Then $D_{DTC[4:0]}$ selects the output of the second multiplexer with fine delays. Thus, delays ranging from 1 ps to 703 ps with a granularity of 1 ps can be digitally controlled. The TDCs and DTC are built from tunable Vernier delay cells with a wide tuning range. Monte Carlo simulation results also confirm that the DTC's non-linearity is less than 0.5 ps. A major limitation is that the TDCs and DTC are sensitive to PVT variations [67, 68], especially power supply noise. Thus, extensive post-layout simulations with reduced IR drop and properly distributed de-coupling capacitors were used to ensure that the TDCs and DTC meet the targeted performance. During chip measurement, the separate bias voltages and power supplies were tuned to obtain the required TDC and DTC performance.
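The coarse/fine decoding described above can be summarized with an idealized behavioral sketch. It models only the arithmetic of the sub-ranging scheme with perfectly matched 32 ps and 1 ps cells, not the Vernier delay cells or the multiplexers; the function names are hypothetical, the ideal 5+5-bit code space is an assumption, and the 46 ps input is an illustrative value chosen so that the residue after one coarse cell is 14 ps.

```python
COARSE_STEP_PS = 32  # delay of one coarse cell
FINE_STEP_PS = 1     # delay of one fine cell


def tdc_encode(lead_ps):
    """Return (coarse, fine) codes for a non-negative time difference in ps."""
    if not 0 <= lead_ps < 32 * COARSE_STEP_PS:
        raise ValueError("input outside the ideal 10-bit range")
    coarse = int(lead_ps // COARSE_STEP_PS)         # D_TDC[9:5]
    residue_ps = lead_ps - coarse * COARSE_STEP_PS  # re-quantized by the fine stage
    fine = int(residue_ps // FINE_STEP_PS)          # D_TDC[4:0]
    return coarse, fine


def dtc_delay_ps(coarse, fine):
    """Delay produced by an ideal DTC for the given coarse and fine codes."""
    return coarse * COARSE_STEP_PS + fine * FINE_STEP_PS


coarse, fine = tdc_encode(46)
print(f"46 ps -> coarse code {coarse}, fine code {fine}")
print(f"DTC delay for codes (3, 1): {dtc_delay_ps(3, 1)} ps")
```

In this ideal model the DTC inverts the TDC exactly for every integer input. In the actual circuits the two are only as well matched as their delay cells, which is why matching and linearity receive separate attention in this design.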

# 3.4.7 Transmission Gate Based Phase Frequency Detector (TG-PFD)

To have a fast D flip-flop in the PFD, a transmission-gate based D flip-flop was used. The proposed TG-DFF is faster than a true single-phase clock (TSPC) D-FF and conventional latch-based ones. In post-layout simulations, the 67 ps feedback path of the TG-PFD was shorter than the 85 ps of a TSPC-based PFD and the 112 ps of a latch-based PFD. Because of its shorter rst-to-Q path, the PFD based on the TG-DFF was used. The TG-PFD was also optimized for jitter performance, and buffers were inserted between the TG-PFD and the charge pump to avoid heavy loading and low slew rates at the TG-PFD's output.

# 3.5 Proposed PLL system calibration process



Figure 3.27: Proposed PLL calibration process.

Critical PLL parameters require calibration for optimal performance. The calibration process consists of five steps, as in Fig. 3.27. First, the output frequencies of the free-running VCO are characterized for different capacitor bank configurations and varactor control voltages. Second, the sub-ranging TDCs and DTC are calibrated with predictable phase patterns from the programmable dividers triggered by the free-running VCO. Third, with the DPP bypassed and the loop operating in integer-N mode, the minimum static phase errors $T_{SPE}$ are identified by the TDCs for different division ratios, and the outputs are sent to the DSP. Fourth, the DPP is included in the fractional-N PLL's loop for fractional spur and out-of-band noise filtering, and interpolation is used to find the minimum static phase error for a fractional division ratio. Finally, background calibration is used to monitor the PLL's working status.

## 3.5.1 Free-running VCO characterization



Figure 3.28: Free-running VCO characterization.

As shown in Fig. 3.28, the frequency locking loop (FLL) helps to fully characterize the output frequencies of the VCO for different varactor control voltages and capacitor bank settings. The DAC marked in red provides the analog control voltages for the varactor. The DVCO controls the capacitor banks. An external counter with $f_{REF}/L$ captures the frequency transfer curve of the VCO with a frequency granularity of $f_{REF}/L$.
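The origin of the $f_{REF}/L$ granularity can be illustrated with a minimal sketch. It assumes that the counter simply counts whole VCO cycles during a gate of $L$ reference periods; the function name and the value $L = 1024$ are hypothetical and are not taken from the prototype.

```python
def estimate_vco_frequency(f_vco_hz, f_ref_hz, gate_periods):
    """Model an edge counter gated for `gate_periods` reference cycles (assumed scheme)."""
    gate_time = gate_periods / f_ref_hz        # gate duration in seconds
    edges = int(f_vco_hz * gate_time)          # only whole VCO cycles are counted
    return edges * f_ref_hz / gate_periods     # estimate is quantized to f_ref / gate_periods

f_ref = 47.2e6          # reference frequency used in the measurements
L = 1024                # hypothetical gate length
resolution = f_ref / L  # frequency granularity of the characterization
estimate = estimate_vco_frequency(3.3e9, f_ref, L)
```

Under this assumption, a longer gate (larger $L$) gives a finer frequency grid at the cost of a longer characterization time.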

# 3.5.2 Sub-ranging TDC and DTC Linearity Calibration



Figure 3.29: Proposed sub-ranging TDC and DTC calibration.



Figure 3.30: Patterns in phase domain for TDC and DTC calibration.

The TDCs are used not only to measure the static phase error $T_{SPE}$ of the closed-loop system, but also to perform the phase-domain processing of the PLL. The linearity of both the TDCs and the DTC is therefore important in reference and fractional spur calibration. Reference spurs are minimized by reducing the static phase error. Fractional spur performance is affected by the linearity of the TDC and DTC, i.e., by sampling errors: their non-linearities degrade fractional spur performance, and the matching between TDC and DTC affects the system dynamics. For example, a calibrated TDC with a step size of 1 ps and a DTC with a step size of 1.5 ps are equivalent to a gain stage with a gain of 1.5 in the PLL. Thus, the TDCs and DTC must be calibrated for good linearity and matching. The structural similarity of the TDCs and DTC greatly reduces their mismatch and the calibration effort. The non-linearity of the Vernier-type TDC originates from intra-stage and inter-stage mismatches.

As shown in Fig. 3.29, the characterized VCO and the divider can generate predictable phase patterns for TDC and DTC calibration. The DSP processes the digital output from the TDC and changes the FCW. For example, as in Fig. 3.30, a division ratio of 96.5 is realized by alternating between 96 and 97. The duration between two rising edges of the feedback signal is $96T_{VCO}$ when the feedback division ratio is 96 and $97T_{VCO}$ when it is 97. For a reference source with a period of $96.5T_{VCO}$, the phase difference between the reference and feedback signals alternates between $0.5T_{VCO}$ (166.7 ps) and $-0.5T_{VCO}$ (-166.7 ps) when the static phase error is zero. A non-zero static phase error changes the pattern to, for example, $0.51T_{VCO}$ (170 ps) and $-0.49T_{VCO}$ (-163.3 ps); however, the difference between the two phase values in this pattern remains fixed at $T_{VCO}$. Thus, predictable phase patterns can be generated from the fractional-N configuration, and the TDC can be calibrated to the targeted DNL and INL performance. In this prototype, similar to the calibration of a sub-ranging ADC, the LMS algorithm was used to calibrate the TDC to the required performance. Because noise from the VCO leads to errors, each division pattern was repeated multiple times to calibrate the TDCs and DTC properly.
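The numbers in this example can be reproduced with a short sketch. It takes the pattern exactly as described in the text (alternating between $+0.5T_{VCO}$ and $-0.5T_{VCO}$, shifted by the static phase error) and assumes a 3 GHz VCO, which is implied by the quoted 166.7 ps; the function name is hypothetical.

```python
T_VCO_PS = 1e6 / 3000.0  # VCO period in ps for an assumed 3 GHz VCO (about 333.3 ps)

def calibration_pattern(static_error_tvco, n_edges=4):
    """Phase differences (ps) presented to the TDC for a division ratio of 96.5.

    The ideal pattern alternates between +0.5 and -0.5 T_VCO, and a static
    phase error (given in units of T_VCO) shifts both values equally.
    """
    pattern = []
    for k in range(n_edges):
        ideal = 0.5 if k % 2 == 0 else -0.5
        pattern.append((ideal + static_error_tvco) * T_VCO_PS)
    return pattern

ideal = calibration_pattern(0.0)
shifted = calibration_pattern(0.01)
step_ideal = ideal[0] - ideal[1]        # spacing between the two phase values
step_shifted = shifted[0] - shifted[1]  # unchanged by the static phase error
```

Because the spacing between the two values stays at $T_{VCO}$ regardless of the static phase error, the pattern offers a known step for checking the TDC transfer curve.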



Figure 3.31: Charge pump calibration by finding the minimum static phase error.

#### 3.5.3 Charge Pump Calibration

As shown in Fig. 3.31, the charge pump needs to be calibrated for minimum reference spur level. A causal relationship between reference spur levels and static phase errors is observed, so the calibration algorithm in the DSP changes the code of $I_{DAC}$ to minimize the static phase error for optimal reference spur performance. This step records the optimal code for different division ratios, such as 70, 71, 72 and 73. During this step, the PLL works in integer-N mode.

The net error charge accumulated on the filter from charge injection and clock feedthrough depends not only on the bias voltages of the transmission gates but also on the timing of the pulses. A two-dimensional sweep, covering both the output voltage and the time domain, is required to jointly optimize the charge pump. To verify the causal relationship between static phase errors and reference spur levels over different output voltages, the DAC was swept from 0% to 40% of $I_{DN}$ in parallel with a fixed current source of 80% of $I_{DN}$; this scheme provides a tuning range of 20%. The static phase errors and reference spur levels were recorded. The most relevant results are displayed in Fig. 3.32: the global minimum of the spur level is found where the static phase error is minimum, and the spur level is under -125 dBc if the static phase error is kept under 15 ps. The static phase error is captured by the TDCs and processed by the DSP.



Figure 3.32: Reference spur level and static phase error.

First, when the filter's output is 550 mV, the voltage drops across the upper and lower transmission gates are similar. Thus, the charge pump mismatch is mainly determined by the small current mismatch and the charge sharing from $P_{UP}$ and $P_{DN}$. When the loop is locked, the static phase error, which is reflected in the phase difference between the UP and DN voltage pulses, should be minimized. Second, when the output voltage moves down, for example to 250 mV, the imperfect charge-injection cancellation of the transmission gates adds to the reference spur level. The loop is forced to keep the net charge at zero, which makes the DN pulse wider and leads to a larger static phase error. To reduce the ripple on the filter, a charging pulse integrated by the loop filter should be followed immediately by a compensating discharging pulse, so the timing mismatch between the two current pulses is crucial. An increased $I_{DN}$ makes the DN pulse narrower and thereby minimizes the static phase error. In other words, mismatches in charge injection through the transmission gates are partially canceled by deliberate mismatches between the current sources, and the current DAC is used for that purpose. As summarized in Fig. 3.32, reference spur calibration is realized by observing the digital representation of the static phase error $T_{SPE}$ between the UP and DN voltage pulses.
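The search performed in this step can be sketched as an exhaustive sweep over the current-DAC codes that keeps the code with the smallest static phase error. This is only an illustration under stated assumptions: the 4-bit code range follows the DAC described in the abstract, while `measure_static_phase_error` is a hypothetical stand-in for the TDC readout and uses a toy linear model rather than measured data.

```python
def measure_static_phase_error(code):
    """Stand-in for the TDC readout in ps; a toy linear model, not measured data."""
    return 4.0 * (code - 7)

def calibrate_charge_pump(measure, n_bits=4):
    """Return the DAC code that gives the smallest |static phase error|."""
    codes = range(2 ** n_bits)
    return min(codes, key=lambda c: abs(measure(c)))

# In integer-N mode, the sweep would be repeated for each division ratio of interest.
best_code = calibrate_charge_pump(measure_static_phase_error)
```

In the actual calibration the sweep would also have to cover different loop-filter output voltages, as discussed above.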

# 3.5.4 Fractional-N Mode with DPP in the Loop



Figure 3.33: DPP is included in the loop.

The DPP is incorporated in the loop for out-of-band fractional spur filtering, as in Fig. 3.33. Since the division ratio changes from integer to fractional-N mode, such as from 70 to 70.5, the optimal code for the current DAC needs to be interpolated to find the minimum static phase error. For example, when the optimal DAC code is 7 for a division ratio of 70 and 15 for a division ratio of 71, the optimal code for a fractional division ratio of 70.5 should be 11. The optimal codes for the integer division ratios come from the previous step. Dithering in the DDSM is turned on for fractional-N spur randomization.
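The interpolation in this example can be written as a small sketch. It assumes plain linear interpolation between the two neighboring integer division ratios, which matches the 7, 15 and 11 quoted above; the function name and the dictionary of stored codes are hypothetical.

```python
def interpolate_dac_code(ratio, optimal_codes):
    """Linearly interpolate the current-DAC code for a fractional division ratio.

    `optimal_codes` maps integer division ratios to the codes recorded during
    the integer-N charge pump calibration step.
    """
    lower = int(ratio)            # integer part of the (positive) division ratio
    frac = ratio - lower
    if frac == 0:
        return optimal_codes[lower]
    c0, c1 = optimal_codes[lower], optimal_codes[lower + 1]
    return round(c0 + frac * (c1 - c0))

codes = {70: 7, 71: 15}           # values taken from the example in the text
code_70p5 = interpolate_dac_code(70.5, codes)
```

The interpolated code is only a starting point; the background calibration can refine it once the loop is locked.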

#### 3.5.5 Background Calibration of PLL in Fractional-N Mode

The phase error between the UP and DN voltage pulses is continuously monitored through the TDC's digital output and used as a major metric for loop characterization, as shown in Fig. 3.33. The TDC readout continues to indicate the static phase error as the division ratio changes. Ideally, the average static phase error should approach zero during normal fractional-N operation. When a non-zero average static phase error is captured, the DAC is tuned until the minimum average static phase error is reached. A compact ADC (not implemented in this version) could be used to observe the quality of the filter's output voltage. Also, if the loop filter's output voltage is too low (less than 250 mV) or too high (larger than 850 mV), a change of the VCO's capacitor bank setting is enabled.

As shown in Fig. 3.29, a fractional-N PLL without dithering can generate predictable phase patterns. To track TDC and DTC variations, the DSP can bypass the DPP, deactivate the self-dithering of the DDSM and feed predictable patterns into the TDC to read out digital codes for the TDC's linearity calibration. Here the DTC's linearity is assumed to follow the TDC's variations. Except for changing the capacitor banks of the VCO, re-calibration of the charge pump, TDCs and DTC can be performed without breaking the feedback loop of the PLL. The background calibration is suitable for small and slow variations in the PLL. If a large variation occurs, the PLL needs to go through the previous calibration steps again.
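One iteration of the background monitoring described in this subsection might be sketched as follows. The 250 mV and 850 mV limits come from the text; the tolerance, the one-code step and the sign convention (a positive average error increases the DAC code) are assumptions made only for illustration, and the function name is hypothetical.

```python
from statistics import mean

V_MIN_MV, V_MAX_MV = 250, 850   # loop-filter output limits quoted in the text

def background_step(tdc_samples_ps, dac_code, vctrl_mv, tolerance_ps=1.0, n_bits=4):
    """One background-calibration iteration.

    Returns (new_dac_code, switch_cap_bank). The DAC code moves by one step
    when the average static phase error leaves the assumed tolerance band.
    """
    avg = mean(tdc_samples_ps)
    if avg > tolerance_ps:
        dac_code = min(dac_code + 1, 2 ** n_bits - 1)   # assumed sign convention
    elif avg < -tolerance_ps:
        dac_code = max(dac_code - 1, 0)
    switch_cap_bank = vctrl_mv < V_MIN_MV or vctrl_mv > V_MAX_MV
    return dac_code, switch_cap_bank

result = background_step([3.0, 2.0, 4.0], dac_code=7, vctrl_mv=550)
```

A request to switch the capacitor bank would be handled outside this routine, since it is the one action that cannot be performed without disturbing the loop.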

#### 3.6 Measurement Results



Figure 3.34: Chip photo and testbench.

Fig. 3.34 shows the measurement setup of the proposed PLL. Fabricated in TSMC 40-nm CMOS technology, the chip size is $3.0 \times 2.0 \ mm^2$. Different blocks have isolated power supplies, and their bias voltages are tuned separately. The digital readout from the TDC is labeled as the TDCout block and sent to the DSP. The digital control block controls the frequency division ratios of the fractional frequency divider and the capacitor banks in the VCO. A Keysight E8267D with a band-pass filter was used as the reference clock. A Keysight N9030B was used to measure the output spectrum and phase noise performance. The measured total power consumption was 15.9 mW when operating at 3.3 GHz.

Fig. 3.35 shows the measured phase noise performance at 3.319 GHz, with a reference frequency of 47.2 MHz and a fractional division ratio of 70+5/16. The DPP was activated to filter out the fractional spurs. The phase noise at 10 kHz, 100 kHz, 1 MHz and 10 MHz offsets is -108, -109, -125 and -129 dBc/Hz, respectively. Fig. 3.36 shows the main contributors to the PLL's output phase noise. The input noise of the external clock source contributes the majority of the low-frequency output phase noise. Starting from a 700 kHz frequency offset, the VCO's phase noise becomes the dominant contributor. The noise from the DDSM is attenuated by the digital filters, and far-out phase noise is reduced by the analog and digital filters in the loop. The measured rms jitter, integrated from 10 kHz to 40 MHz, was 243 fs, which was in good agreement with the simulation results. There are two visible spurs in the phase noise measurement in Fig. 3.35. The low-frequency spur comes from the limited resolution of the DTC and a very-low-frequency truncation error near 200 Hz that circulates inside the loop, while the high-frequency spur at 47.2 MHz is the reference spur. Accurate spur levels were read from the instrument in spectrum mode.



Figure 3.35: Phase noise measurement.

Figure 3.36: Comparison of measured phase noise with calculated components.

Fig. 3.37 records the output spur measured with the N9030B in spectrum analyzer mode. The reference spur levels were recorded after a long waiting time with a video bandwidth of 1 Hz. The instrument's noise floor in spectrum mode was measured at around -135 dBm. The PLL was working in fractional-N mode with an output frequency of 3.329 GHz, an input of 47.20 MHz and a division ratio of 70.5. Since the output power was -15.02 dBm and the reference spur was -123.47 dBm, the reference spur level relative to the carrier was -108.45 dBc at the lower band. Similar measurements were carried out to characterize the effectiveness of the proposed architecture in reducing reference and fractional spurs. Meanwhile, to ensure the robustness of the charge pump calibration, the charge pump was also tested with ±10% supply voltage variations. The proposed method still works well for power supplies ranging from 1.0 V to 1.2 V, and the current compensation method is valid for various mismatches among charge injection, clock feedthrough and charge sharing.
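The conversion from the two absolute spectrum-analyzer readings to a relative spur level is a simple subtraction in the logarithmic domain, shown here with the values quoted above (the function name is hypothetical).

```python
def spur_dbc(spur_dbm, carrier_dbm):
    """Spur level relative to the carrier, from two absolute power readings."""
    return spur_dbm - carrier_dbm

# Readings quoted in the text: carrier at -15.02 dBm, reference spur at -123.47 dBm.
ref_spur = spur_dbc(-123.47, -15.02)
power_ratio = 10 ** (ref_spur / 10)   # same result expressed as a linear power ratio
```

Because the measured spur sits only about 11.5 dB above the quoted -135 dBm instrument floor, the long averaging time and narrow video bandwidth matter for this reading.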

As summarized in Fig. 3.38, a fractional division ratio  $70 + 1/(2^N)$  was used with N swept from 1 to 7. Close-in fractional spurs at  $f_{REF}/128$  and  $f_{REF}/64$  are not suppressed by the loop.



Figure 3.37: Reference spur measurement.

Fractional spurs starting from $f_{REF}/32$ are reduced by over 18 dB. The imperfect cancellation is due to the limited TDC resolution employed in this prototype.
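The offsets of the swept fractional spurs follow directly from the division ratio. The sketch below assumes the usual relation that the fundamental fractional spur appears at the fractional part of the division ratio times $f_{REF}$, i.e., at $f_{REF}/2^N$ for a ratio of $70 + 1/2^N$; the function name is hypothetical.

```python
F_REF_HZ = 47.2e6   # reference frequency used in the measurements

def fractional_spur_offset(f_ref_hz, n):
    """Offset of the fundamental fractional spur for a ratio of 70 + 1/2**n."""
    return f_ref_hz / 2 ** n

# N is swept from 1 to 7 as in Fig. 3.38.
offsets = {n: fractional_spur_offset(F_REF_HZ, n) for n in range(1, 8)}
```

With $f_{REF} = 47.2$ MHz, the two close-in cases $f_{REF}/128$ and $f_{REF}/64$ fall at 368.75 kHz and 737.5 kHz, while $f_{REF}/32$ is at 1.475 MHz.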

Figure 3.38: Fractional spur measurement results.

Still, the in-band fractional spurs cannot be easily removed without impacting the PLL's loop dynamics and must be dithered in the DDSM. At the same time, due to the limited resolution of the TDC and DTC, the loop still exhibits a low-frequency spur in measurement, which is inversely proportional to the TDC's resolution and tightly related to the loop bandwidth. It appears as an in-band spur located at 231 kHz with a level of -87 dBc. In simulation, the spur is located at 184 kHz with a level of -83 dBc.

# 3.7 Conclusions and Summary

In this chapter, a 2.3-3.9 GHz PLL with charge pump and TDC calibration techniques for reference spur and fractional spur reduction was discussed in detail. In the experimental results, an average 38 dB reduction in reference spurs and an over 18 dB reduction in out-of-band fractional spurs demonstrated the effectiveness of the proposed method. Also, an rms jitter of less than 250 fs over 2.3-3.9 GHz met the targeted output phase noise performance. The TDCs were used both for reference spur reduction and for fractional spur filtering. The proposed architecture was compared with state-of-the-art PLLs in Table 3.1. In this table, PLLs with low spurs and advanced technology nodes were chosen for comparison. Compared with [40], this design achieved a lower worst in-band fractional spur and a comparable reference spur, with lower jitter and lower power consumption. [41] had lower jitter and comparable spur performance at the cost of higher power in 180 nm SiGe. [42] had lower jitter and a lower worst in-band fractional spur in an advanced 14 nm technology. [43] had the lowest power; however, its jitter and reference spur were higher. More importantly, its worst in-band fractional spur was not suppressed by the dithering and was 20 dB higher than in the rest of the papers. It can be concluded that the proposed architecture achieved a good balance between power consumption, rms jitter and spur reduction. Since many works focused on jitter reduction and noise cancellation, spur performance might not be the major concern for their applications. For the proposed PLL, a fast-varying wireless environment with wideband input was assumed. Any out-of-band spur would down-convert a high-power out-of-band signal and greatly distort the in-band signal. Thus, the spur reduction techniques were the highlights of this PLL.

| Ref.                            | [40]                               | [41]                              | [42]              | [43]                           | This Work                            |
|---------------------------------|------------------------------------|-----------------------------------|-------------------|--------------------------------|--------------------------------------|
| Tech.                           | 65 nm                              | 180 nm SiGe                       | 14 nm             | 65 nm                          | 40 nm                                |
| Arch.                           | ADPLL                              | CP-PLL                            | ADPLL             | ADPLL                          | CP-PLL with Digital                  |
| Spur Cancellation Tech.         | Ref. dithering/noise cancel (frac) | Algorithm in frac. divider (frac) | TDC Calib. (frac) | Frequency multiplier (integer) | TDC (frac), CP (integer) calibration |
| Supply (V)                      | 1.0                                | 1.2/3.3/5.0                       | NA                | 1.0                            | 0.9/1.1                              |
| Power (mW)                      | 18.1                               | 118                               | 13.4              | 9.52                           | 15.9                                 |
| f<sub>VCO</sub> (GHz)           | 3.0 - 5.2                          | 4.485                             | 2.69              | 5.0 - 5.4                      | 2.3 - 3.9                            |
| f<sub>REF</sub> (MHz)           | 32.0                               | 61.44                             | 26.0              | 50.0                           | 47.2                                 |
| Loop BW (kHz)                   | NA                                 | 100                               | NA                | NA                             | 250                                  |
| rms jitter (fs)                 | 1780                               | 166                               | 140               | 701                            | 247                                  |
| Out-of-band PN (dBc/Hz)         | -120 @ 3 MHz                       | -140 @ 3 MHz                      | -138 @ 3 MHz      | -130.6 @ 10 MHz                | -130 @ 3 MHz                         |
| Worst in-band frac spur (dBc)   | -62.47                             | -72                               | -78.6             | -42                            | -75.6                                |
| Ref spur (dBc)                  | -102.32                            | -110.0                            | -87.6             | -95.8                          | -108.3                               |
| Active area (mm<sup>2</sup>)    | 0.338                              | 13.2                              | 0.257             | 0.228                          | 0.42                                 |

Table 3.1: Comparison with Low Spur PLLs.

For future work, first, the VCO's performance can be further optimized with higher-Q spiral inductors built in the ultra-thick metal (UTM) layer. Packaging can also be improved by using flip-chip to reduce the inductance of the bond-wire connections. Second, the resolution of the proposed TDC can be further improved by phase-interpolation (PI) techniques, which can be regarded as an extension of the Vernier delay line. New phase-domain comparator structures are also needed for sub-ps resolution. [69] has already implemented PI to dynamically choose the phase information to reduce the spurs, which leads to the joint design of the TDC and divider, since information is easier to process in the phase domain than in the frequency domain; information in the frequency domain is amplified or integrated into the phase domain. Last but not least, as a feedback system, the stability of the PLL should be maintained at all times. The close-in spurs can only be removed by dithering the DDSM, whereas far-out spurs can be filtered. Algorithms can also be developed to reduce both the out-of-band noise and the spurs from the DDSM. Future work can also integrate an analog delta-sigma modulator to directly measure the spectrum at the VCO's input and guide the PLL's calibration. Compared to [70], which used a heterogeneous path to remove fractional spurs, this method provided a robust filtering method.

# 4. A 14-BIT 1GS/S LOW POWER PIPELINED ADC WITH COMPREHENSIVE FOREGROUND AND BACKGROUND CALIBRATIONS

# 4.1 Introduction

With the fast evolution of wireless and wireline systems, analog-to-digital converters (ADCs) have become essential parts of receivers, digitizing signals with high-order modulation and wide bandwidth to make the most of precious spectrum resources. For example, WiFi-6, a standard supporting 160-MHz bandwidth and QAM-256 modulation, has a maximum throughput of 10 Gbps, which is over 15 times the 600 Mbps maximum speed of WiFi-4. All these achievements are supported by high-speed, high-resolution and performance-optimized data converters. For a wireless system, a QAM-256 signal carries 8 bits per symbol, which is only a 33% increase over a QAM-64 signal. Meanwhile, for a wireline system, an ADC-based receiver supporting PAM-4 can directly double the system throughput over NRZ modulation, such as 112-Gb/s PAM-4 over 56-Gb/s NRZ signaling [71]. Thus, improving the ADC's speed and resolution while reducing its power consumption has always been the target of academic and industrial data converter research.

Several basic ADC types have been developed, combined with each other and used to generate various hybrid ADC structures. Flash, successive approximation register (SAR), pipelined and delta-sigma converters are the main prevailing types. Due to its own properties, each has its applicable range. For example, SAR ADCs [72, 73] have regained popularity in extremely high-speed, low-accuracy applications because of the evolution of CMOS technologies; SAR ADCs are easy to scale from an old technology node to a new one. Flash ADCs have become the sub-ADCs that quantize the input of every stage in pipelined ADCs and are not as widely used as standalone ADCs as in previous years. Delta-sigma ADCs [74, 75, 76] still hold their position in high-resolution applications because of their noise-shaping nature. A pipelined ADC has several advantages over the aforementioned structures. First, the analog input is quantized in a pipelined fashion, and there is no need to finish the quantization of the analog input in one clock cycle. Thus, the throughput of the system is increased and design parameters such as tracking bandwidth are relaxed. Second, the residue amplification makes the amplified residue easier for the subsequent stages to quantize, so the total power consumption is partially reduced. Third, the residue is generally amplified by a closed-loop amplifier, so this process is less sensitive to PVT variations and the output linearity is maintained. Finally, for less advanced technology nodes, such as 90 nm or 130 nm, it generally requires less effort to design a pipelined ADC [77] than a SAR ADC for medium to high accuracy (9-14 bits) and high sampling frequencies. The main reason is that when the power supply is high, such as 1.5 V, transistors are able to tolerate high swings without stress issues. 
Residue amplifiers with transistor stacks and bootstrapped switches can be designed without additional device-level concerns. However, transistors at less advanced nodes have limited cut-off frequencies ($f_t$), whereas SAR ADCs favor fast switches to perform multiple comparisons during one clock cycle. Fast transistors also have advantages in designing comparators with less kick-back noise through the Miller capacitance. A single SAR ADC, as opposed to a pipelined SAR ADC, does not need to perform residue amplification, so the high power supplies that support high-gain residue amplifiers are no longer needed. In summary, pipelined ADCs need enough voltage headroom to support the residue amplifiers, whereas SAR ADCs prefer fast transistors as switches.

In order to support next-generation high-speed wireless communication systems, a pipelined ADC with short-channel devices is chosen to support high data rates with high resolution. Beyond the ADC's speed and accuracy, power efficiency is also an important indicator of how efficiently the ADC can process data in battery-powered systems. The system's robustness over PVT variations should also be considered. The following sections address these aspects and give a comprehensive evaluation of the proposed pipelined data converter. Two figures of merit (FoMs) are mainly used to evaluate ADC performance: Walden's FoM and Schreier's FoM.

$$FoM_{Walden} = \frac{P}{2^{ENOB} \cdot f_s} \tag{4.1}$$

Here $P$ is the power consumption of the ADC and $f_s$ is the sampling frequency. The signal-to-noise-and-distortion ratio (SNDR) is measured from the output spectrum of the ADC by means of a fast Fourier transform (FFT).

$$SNDR = \frac{P_{signal}}{P_{noise} + P_{distortions}} \tag{4.2}$$

When a sinusoidal signal is applied at the ADC's input, the SNDR represents the ratio of the signal power $P_{signal}$ to the sum of the noise power $P_{noise}$ and the distortion power $P_{distortions}$. Since the signal power $P_{signal}$ occupies only one bin at the FFT's output, the integrated power of the remaining bins represents $P_{noise} + P_{distortions}$. The effective number of bits (ENOB) translates the ADC's SNDR, expressed in dB, into an equivalent number of bits.

$$ENOB = \frac{SNDR - 1.76}{6.02} \tag{4.3}$$
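As a rough illustration of Eqs. (4.2) and (4.3), the sketch below quantizes a coherently sampled full-scale sine with an ideal 8-bit quantizer, computes a direct DFT, and treats every non-DC bin other than the signal bin as noise plus distortion. The record length, number of cycles and resolution are arbitrary illustrative choices and do not describe the proposed ADC; the function names are hypothetical.

```python
import cmath
import math

def sndr_db(samples, signal_bin):
    """SNDR in dB: power in the signal bin over the power in all other non-DC bins."""
    n = len(samples)
    p_signal = 0.0
    p_rest = 0.0
    for k in range(1, n // 2):                      # skip DC, use the first Nyquist zone
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(samples))   # direct DFT of bin k
        if k == signal_bin:
            p_signal += abs(acc) ** 2
        else:
            p_rest += abs(acc) ** 2
    return 10 * math.log10(p_signal / p_rest)

def enob(sndr):
    """Effective number of bits from the SNDR in dB, as in Eq. (4.3)."""
    return (sndr - 1.76) / 6.02

# Ideal 8-bit mid-rise quantizer, 7 signal cycles in a 256-point record (coherent sampling).
N, CYCLES, BITS = 256, 7, 8
levels = 2 ** BITS
record = []
for i in range(N):
    v = math.sin(2 * math.pi * CYCLES * i / N)
    code = min(levels - 1, int((v + 1) / 2 * levels))
    record.append((code + 0.5) / levels * 2 - 1)    # reconstructed level in [-1, 1]

sndr = sndr_db(record, CYCLES)
bits = enob(sndr)
```

Coherent sampling (an integer number of cycles, coprime with the record length) keeps the signal in a single bin, which is the condition assumed in the text.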

The actual ENOB is lower than the nominal number of bits of an ADC. For example, a commercial 12-bit pipelined ADC usually has a measured ENOB of around 11 bits. Moreover, Schreier's FoM can be expressed as

$$FoM_{Schreier} = SNDR + 10\log_{10}\left(\frac{f_s/2}{P}\right) \tag{4.4}$$

According to these two FoMs, ADCs with higher SNDR or ENOB, larger bandwidth (higher sampling frequency) and lower power consumption have better FoMs. Better ADCs have a lower Walden's FoM, such as 20 fJ/conv., and a higher Schreier's FoM, such as 170 dB. As shown in Fig. 4.1, the ADC survey [78] includes the Schreier's FoM at the Nyquist frequency. State-of-the-art papers have demonstrated a Schreier's FoM of over 180 dB below a 100 MHz sampling frequency. The envelope predicts a 170 dB Schreier's FoM at a 1 GHz sampling frequency, which is approached by the measurement results of this design at 1 GS/s. For Nyquist ADCs, Walden's and Schreier's FoMs are measured at several input frequencies, ranging from low frequency (close to DC) to the Nyquist frequency. Generally, an ADC's performance degrades at higher frequencies due to limited bandwidth, settling error, aperture jitter and other effects. Most papers report their FoMs at low frequency and at Nyquist to allow a fair comparison with other publications [79, 80, 81, 82, 83, 84].



Figure 4.1: ADC's high-frequency Schreier's FoM versus speed.
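Both figures of merit in Eqs. (4.1) and (4.4) can be evaluated with a few lines of code. The operating point below (60 dB SNDR, 50 mW, 1 GS/s) is a hypothetical example chosen for round numbers and is not the measured result of this design; the function names are also hypothetical.

```python
import math

def enob_from_sndr(sndr_db):
    """Eq. (4.3): effective number of bits from the SNDR in dB."""
    return (sndr_db - 1.76) / 6.02

def fom_walden(power_w, sndr_db, fs_hz):
    """Eq. (4.1): energy per conversion step in joules."""
    return power_w / (2 ** enob_from_sndr(sndr_db) * fs_hz)

def fom_schreier(sndr_db, power_w, fs_hz):
    """Eq. (4.4): Schreier's FoM in dB, using the Nyquist bandwidth fs/2."""
    return sndr_db + 10 * math.log10((fs_hz / 2) / power_w)

# Hypothetical operating point, for illustration only.
sndr, power, fs = 60.0, 50e-3, 1e9
walden = fom_walden(power, sndr, fs)       # in J/conversion-step
schreier = fom_schreier(sndr, power, fs)   # in dB
```

For this example, Schreier's FoM evaluates to 160 dB and Walden's FoM to roughly 61 fJ/conv., which shows how far such a point would be from the 170 dB envelope mentioned above.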

Meanwhile, calibration techniques have been widely used in mixed-signal systems to reduce errors from analog non-idealities. Mathematically, a calibration algorithm tries to find the function $f_{Calibration}$ that maps the ADC's real transfer function $f_{ADCreal}$ onto the ADC's ideal transfer function $f_{ADCideal}$.

$$f_{ADCideal} = f_{ADCreal} f_{Calibration} \tag{4.5}$$

The calibration methods may not be fully functional because of several practical issues. First, the calibration function is always mathematically feasible but may not be implementable at the circuit level, or the cost of its realization may be excessive. For instance, when high resolution is needed in the algorithmic operations and they have to be resolved in a very short time, MATLAB algorithms are feasible but prohibitive for on-chip solutions. Second, accurate ADC characterization is not easy, especially for high-speed ADCs, where high-frequency limitations must be accounted for; measuring the errors requires a well-designed system with better resolution than the ADC under test, built in the same technology, which is a paramount challenge. The quantized ADC output is mixed with different error sources, such as noise and distortion. Since these error sources distort the estimate of the ADC's transfer function $f_{ADCreal}$, the corresponding $f_{Calibration}$ will not be accurate. Finally, the complexity of $f_{ADCreal}$ is high. Many errors are static, such as component mismatches, but there are also many dynamic errors, such as limited frequency response, slew-rate effects, finite GBW, device non-linearities, offsets, clock feedthrough, charge injection and others. ADCs are nevertheless required to maintain the targeted SNDR over the entire frequency range, and optimizing the SNDR and FoMs has been the goal of most ADC projects. A similar process is applied to channel estimation in a wireless system, where the zero-forcing method is difficult to realize without a priori knowledge of the channel matrix. An asymptotic method, such as minimum mean square error (MMSE), is used to estimate the channel matrix by reducing the bit error rate. Similarly, least mean square (LMS) filters are widely used in digital calibration to optimize the SNDR by reducing the errors. 
ADCs are widely implemented with digital calibration engines to optimize their performance. Analog methods are used in parallel to assist the digital calibration. Both analog and digital calibration methodologies will be applied in foreground and background to fully adjust the ADC over all kinds of variations, especially temperature variation in the background. Prevalent calibration algorithms and structures will be evaluated to justify the efficiency of the proposed pipelined ADC. It was found during this research program that, by leveraging analog and digital calibration simultaneously, better performance can be achieved than with standalone digital calibration.
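The role of an LMS filter in finding $f_{Calibration}$ can be illustrated with a deliberately simplified sketch, in which the only non-ideality is an unknown gain error and the calibration function is a single correction weight. This is a generic textbook LMS update under those assumptions, not the calibration engine of the proposed ADC; the function name, step size and gain value are hypothetical.

```python
import random

def lms_gain_calibration(gain_actual, mu=0.05, iterations=2000, seed=1):
    """Learn a weight w such that w * (gain_actual * x) approximates x."""
    rng = random.Random(seed)
    w = 1.0                               # initial correction weight
    for _ in range(iterations):
        x = rng.uniform(-1.0, 1.0)        # known training input (foreground calibration)
        y = gain_actual * x               # observed output with the gain error
        error = x - w * y                 # residual after digital correction
        w += mu * error * y               # LMS update along the error gradient
    return w

w = lms_gain_calibration(0.97)            # expected to settle near 1 / 0.97
```

In a real converter the error signal also contains noise and distortion, which is one reason the estimated calibration function is never exact, as noted above.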

Recent state-of-the-art publications address this challenge. [80] employed a pipelined ping-pong multiplying DAC (MDAC) with inter-symbol interference (ISI) calibration, which was also employed in [79, 81]. Theoretically, the ping-pong structure doubles the conversion speed, which increases Schreier's FoM by 6 dB. However, the memory effect due to incomplete charge transfer within the MDAC from previous phases is the major issue of the ping-pong structure. The equalization techniques used in [79, 80, 81] have significantly improved the SNDR to around 60 dB by compensating for the memory effect without spending excessive analog power. This technique is efficient in compensating for SNDR reductions at high frequencies. [82] incorporated comprehensive foreground and background calibrations. Kick-back noise reduction is the main contribution of that design and significantly increased the SNDR. A dithered DAC for inter-stage gain error calibration has been verified to be effective in extracting the inter-stage gain variations through background calibration. [82] reports the total power of all blocks of a modern ADC design. Still, the bottleneck of this pipelined ADC has been the residue amplifier, which dictates the performance of a pipelined ADC. A two-stage Miller-compensated amplifier with neutralization inductors was used in that paper to provide enough bandwidth for the ADC. [83] used a ring-amplifier-based residue amplifier and calibrated non-linearities in the background. The ring amplifier provides a high-GBW, class-AB output stage under a low power supply. However, it has potential stability issues for different signal levels. This 14-nm design demonstrates a Schreier's FoM close to 170 dB with high power efficiency. A background distortion monitor provides statistics of the observed circuit nodes for effective background calibration. [84] used a temperature-compensated, switched dynamic amplifier as the residue amplifier in a pipelined-SAR ADC. Again, high efficiency is the highlight of this design. The flipped voltage follower used in the residue amplifier structure supports a highly efficient class-AB output. This technique is suitable for applications with around 10-bit accuracy. Offsets and gain errors are calibrated in the foreground for this design, which achieved a Schreier's FoM of 168.2 dB. A 28-nm technology benefited this high-speed pipelined SAR design.

#### 4.2 An Overview of Pipelined ADC Structure



Figure 4.2: A conventional pipelined ADC.

A conventional architecture of a pipelined ADC is shown in Fig. 4.2. Non-overlapping clock phases $\phi_1$, $\phi_{1B}$, $\phi_2$ and $\phi_{2B}$ drive the pipelined operation of each stage. Each stage mainly consists of four blocks: a sample and hold (S/H), a sub-ADC, a DAC and a residue amplifier. Taking the switched-capacitor (SC) realization of the pipelined ADC as an example, on $\phi_1$ the input voltage is sampled on the sampling capacitors of the sub-ADC and on the capacitor banks of the MDAC for residue generation. Then, on $\phi_{1B}$, the sub-ADC quantizes the input signal and generates the digital output both for readout and for the DAC. The residue signal is generated at the output of the residue amplifier through decoding. Here $\phi_2$ triggers the second stage to sample the residue signal from the previous stage. The S/H, capacitor banks, DAC and residue amplifier together are usually called the MDAC. Meanwhile, as the residue amplifier's output settles, the following stage samples the residue onto the input capacitors of its sub-ADC and MDAC. In a pipelined ADC, consecutive stages can reuse the designs of the residue amplifier and capacitor banks. However, this does not maximize the power efficiency of the pipelined ADC. Thus, stage scaling comes into the picture, and back-end stages are usually scaled by a factor between 1/gain and $1/gain^2$ of the MDAC as a trade-off between noise and power consumption. Still, the similarities between stages reduce the design effort and accelerate the whole design process. Compared with the high-speed ADCs reviewed above, this design proposes a current-reuse telescopic amplifier with a class-C slew-rate boosting technique to improve the closed-loop residue amplifier's settling accuracy for large input signals. An equalization technique [79] is also implemented in this design to detect the equivalent frequency response and enhance the ADC's SNDR over the entire input range. 
Different from the extracting inter-stage gain errors from pseudo-random sequences, out-of-band signature signals (>fs/4) are used to calibrate inter-stage gain, non-linearities and frequency response, simultaneously.



Figure 4.3: Timing diagram of a pipeline ADC.

Fig. 4.3 shows a detailed timing diagram of the clocks needed to realize a pipelined ADC. The sampling phases $\phi_1$ and $\phi_2$ usually have duty cycles below 50% so that different stages can sample the signal. The evaluation phases $\phi_{1B}$ and $\phi_{2B}$ have duty cycles above 50%. Taking $\phi_{1B}$ as an example, $\phi_2$ samples the residue during the previous stage's evaluation phase $\phi_{1B}$. To avoid switching glitches, $\phi_{1B}$ must last longer than $\phi_2$. This timing diagram also indicates how $\phi_1$, $\phi_{1B}$, $\phi_2$ and $\phi_{2B}$ are generated from non-overlapping clock generation circuits. The present form of the pipelined structure evolved from the loop-unrolled version of the algorithmic ADC, which resolves multiple bits in one cycle and amplifies the residue for processing in the next cycle. As in the algorithmic ADC, each stage of the pipelined ADC (except the first) processes the residue from the previous stage. Ideally, the output digital codes are then combined with different weights; these radices are usually powers of 2, such as $2^{-2}$, $2^{-4}$, etc.



Figure 4.4: A 2.5-bit stage in sampling phase and evaluation phase.

The circuit-level design of a single stage in a 2.5-bit-per-stage, 14-bit pipelined ADC with an SC-MDAC is shown in Fig. 4.4. A half bit of redundancy is introduced to enhance the system's tolerance to offsets from the flash sub-ADC. In Fig. 4.4, a 2.5-bit stage needs a 3-bit ADC and a 2-bit DAC. More precisely, the 3-bit ADC actually contains 6 comparators, and the 2-bit capacitor DAC is usually realized with 4 unit capacitors. During $\phi_1$, the input signal is sampled both on the input capacitors of the sub-ADC and on the bottom plates of the capacitors in the MDAC. During $\phi_{1B}$, the flash quantizes the input and generates the digital output for digital calibration and residue amplification. The transfer function of the flip-around MDAC is

$$V_{res} = (V_{in} - DV_{ref})G \tag{4.6}$$

where G is the closed loop or inter-stage gain, D is the digital output from the flash sub-ADC, ranging from -3 to 3.
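The stage behaviour described by 4.6 can be sketched numerically. The snippet below is only an illustration, not the circuit used in this work: it assumes six comparator thresholds at ±1/8, ±3/8 and ±5/8 of $V_{ref}$ and a DAC step of $V_{ref}/4$ (so the subtracted level is $D V_{ref}/4$), and the function names `sub_adc` and `residue` are hypothetical.

```python
# Minimal sketch of an ideal 2.5-bit stage (assumed thresholds and DAC step).
THRESHOLDS = [-5/8, -3/8, -1/8, 1/8, 3/8, 5/8]  # six comparators, in units of V_ref


def sub_adc(v_in, v_ref=1.0):
    """Return D in -3..3 by counting the comparators whose threshold is exceeded."""
    return sum(v_in > t * v_ref for t in THRESHOLDS) - 3


def residue(v_in, gain=4.0, v_ref=1.0):
    """V_res = G * (V_in - D * V_ref / 4); the V_ref/4 DAC step is an assumption."""
    d = sub_adc(v_in, v_ref)
    return d, gain * (v_in - d * v_ref / 4)


if __name__ == "__main__":
    for v in (-0.9, -0.3, 0.0, 0.2, 0.7):
        d, v_res = residue(v)
        print(f"V_in={v:+.2f}  D={d:+d}  V_res={v_res:+.3f}")
```

Under these assumptions the residue stays within the input full scale for any input between $-V_{ref}$ and $V_{ref}$, and within half of it whenever the comparator decisions are ideal and the input is not in the outermost segments, which is the margin the half-bit redundancy relies on.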



Figure 4.5: 2.5-bit residue amplifier ideal transfer curve.

Fig. 4.5 shows the ideal segmented transfer curve of the 2.5-bit stage, where the inter-stage gain G is 4. The residue's full scale is the same as that of the input. In practice, the inter-stage gain G is expressed as follows,

$$G = \frac{\sum_{i=1}^{4} C_i}{C_4} \frac{1}{1 + \frac{1}{A} \frac{\sum_{i=1}^{4} C_i}{C_4}} = \frac{\frac{1}{\beta}}{1 + \frac{1}{A\beta}} \tag{4.7}$$

where A is the open-loop gain of the amplifier used as the residue amplifier and $\beta$ is the loop feedback factor of the 2.5-bit stage. The loop gain T is,

$$T = A\beta \tag{4.8}$$

Due to limited loop gain T, the closed loop gain G is smaller than the ideal value  $1/\beta$  and the gain error  $G_e$  is calculated as,

$$G_e = 1/\beta - G = \frac{1}{\beta} \frac{1}{1+T} \tag{4.9}$$

which is inversely proportional to $1 + T$. A larger loop gain T leads to a smaller inter-stage gain error $G_e$. The actual number of bits that each stage can resolve is $\log_2 G$. A reduced G requires the pipelined ADC to have more stages to achieve the desired resolution. As a trade-off, the effective radix between stages is less than 4 for this pipelined ADC.
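As a quick numerical check of 4.7 to 4.9, the following sketch evaluates the closed-loop gain, the gain error and the effective bits per stage for a few open-loop gains. The chosen gain values and the feedback factor of 1/4 (parasitic input capacitance ignored) are illustrative assumptions.

```python
import math


def closed_loop_gain(a_open, beta):
    """Closed-loop gain G from 4.7 for open-loop gain A and feedback factor beta."""
    return (1.0 / beta) / (1.0 + 1.0 / (a_open * beta))


def gain_error(a_open, beta):
    """Gain error G_e from 4.9, equal to (1/beta) / (1 + T) with T = A * beta."""
    return (1.0 / beta) / (1.0 + a_open * beta)


if __name__ == "__main__":
    beta = 0.25  # ideal gain of 4; parasitic input capacitance is ignored here
    for a_db in (40, 60, 80):
        a = 10 ** (a_db / 20)
        g = closed_loop_gain(a, beta)
        print(f"A={a_db} dB  G={g:.4f}  G_e={gain_error(a, beta):.5f}  "
              f"bits/stage={math.log2(g):.4f}")
```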

Apart from the static parameters, the pipelined ADC operates dynamically, so settling and tracking bandwidth are highly relevant parameters. For a standalone passive S/H stage with a switch and a capacitor, the tracking bandwidth is expressed as,

$$BW = \frac{1}{R_{on}C_L} \tag{4.10}$$

where $R_{on}$ and $C_L$ are the on-resistance of the switch and the load capacitance, respectively. The time constant $\tau$ is the product $R_{on}C_L$. The settling error for a step input into an RC circuit follows the step response of a single-pole system, which decays as $e^{-t/\tau}$. For example, to reach a settling error suitable for 10-bit accuracy (less than 0.1%), a settling time of at least $7\tau$ is needed. For a pipelined converter, the settling requirements are relaxed stage by stage. For example, in a 14-bit pipelined ADC whose first stage resolves 2.5 bits, the input settling accuracy needs to be at least 14-bit, while the settling accuracy at the first stage's output needs to be higher than 12-bit when the inter-stage gain is 4. A gain less than 4 requires a settling accuracy higher than 12-bit. For an operational transconductance amplifier (OTA) based residue amplifier, the equivalent small-signal circuit is shown in Fig. 4.6, and the



Figure 4.6: Typical residue amplifier used in pipelined ADCs.

equivalent time constant  $\tau$  can be expressed as,

$$\tau = RC = \left(\frac{1}{\beta g_m}\right) \left(C_L + \frac{C_I C_F}{C_I + C_F}\right) = \frac{C_I + C_L + \frac{C_I C_L}{C_F}}{g_m} \tag{4.11}$$

where $g_m$ is the transconductance of the OTA, and $C_I$, $C_F$ and $C_L$ are the DAC capacitance, the feedback capacitance and the total capacitance of the next stage, respectively. $R_O$ is the output impedance of the OTA; it is usually high and does not affect the equivalent time constant $\tau$ during settling. The front-end stages need additional $g_m$ to drive larger capacitances and to meet more stringent settling requirements. According to 4.11, increasing $g_m$ or increasing $\beta$ are the ways to obtain a smaller time constant $\tau$. Indeed, many front-end stages use 1.5-bit or 2.5-bit resolution, which keeps $\beta$ relatively large. A heuristic analysis of the parasitics at the OTA's input and output also follows from 4.11: $C_I$ and $C_L$ contribute equally to $\tau$. In addition, a larger $C_I$ reduces the feedback factor $\beta$, which makes the system less efficient.
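The settling budget can be illustrated with a short calculation. The sketch below computes $\tau$ from 4.11 and the number of time constants needed for N-bit settling, taking the criterion $e^{-t/\tau} \le 2^{-N}$ as an assumption (other margins, such as half an LSB, are also common). The component values are hypothetical and not taken from this design.

```python
import math


def settling_time_constants(n_bits):
    """Number of time constants so that exp(-t/tau) <= 2**(-n_bits)."""
    return n_bits * math.log(2)


def ota_time_constant(gm, c_i, c_f, c_l):
    """tau from 4.11: (1 / (beta * gm)) * (C_L + C_I * C_F / (C_I + C_F))."""
    beta = c_f / (c_i + c_f)
    return (c_l + c_i * c_f / (c_i + c_f)) / (beta * gm)


if __name__ == "__main__":
    # Hypothetical values: gm = 20 mS, C_I = 3 pF, C_F = 1 pF, C_L = 1 pF.
    tau = ota_time_constant(20e-3, 3e-12, 1e-12, 1e-12)
    for bits in (10, 12, 14):
        n_tau = settling_time_constants(bits)
        print(f"{bits}-bit: {n_tau:.2f} tau = {n_tau * tau * 1e12:.0f} ps")
```

With this criterion, 10-bit settling takes slightly less than $7\tau$, consistent with the figure quoted above.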

Scaling of the pipelined ADC stages depends on several parameters. First, a proper OTA structure needs to be selected to provide the desired voltage gain A. Second, the correct number of bits per stage must be chosen; a higher number of bits per stage gives better power efficiency at the cost of lower loop gain and hence reduced accuracy and linearity. Third, the capacitance scaling must also account for its impact on the input-referred kT/C noise. Since the OTA's gain is process dependent, the scaling factor of each stage is generally between $G$ and $G^2$.

Architectures reported in [85, 86] realized 200 to 260 MS/s pipelined ADCs with a current-mode MDAC, which is claimed to be more efficient than an SC-MDAC. These techniques use resistive feedback to obtain a feedback factor $\beta$ close to unity and thus a high loop gain. However, they require an efficient two-stage amplifier to provide boosted $g_m$ for a high loop gain. For high-speed applications, the two-stage amplifier is not fast enough to drive itself. Single-stage amplifiers, such as cascode and regulated-cascode amplifiers, are fast enough for high-frequency applications. Moreover, the feedback resistor together with the input resistance and capacitance forms a differentiator at high frequencies, which calls for dedicated compensation techniques to ensure closed-loop stability. For this design, a current-reuse telescopic amplifier with a slew-rate boosting circuit was developed, and a better Schreier's FoM was achieved at 1 GS/s.

#### 4.3 An Overview of Analog and Digital Calibrations

A large number of publications have been devoted to the calibration of ADCs [82, 87, 88, 89, 90, 91]. The core of all calibration algorithms [92] is to find the differences between the actual and the ideal transfer functions and to reduce these errors progressively. Taking digital calibration as an example, the SNDR is a good indicator of the overall system performance, with good ergodicity in amplitude, so maximizing the SNDR by adjusting the digital coefficients that calibrate inter-stage gain errors and non-linearities is the most direct and efficient way of optimizing the ADC's performance. Digital algorithms that improve the SNDR blindly are easy to implement.

Analog calibrations target specific kinds of errors. For example, capacitor mismatches of MDACs [91] can be found by analyzing the output code in response to a ramp input or by implementing self-calibration techniques. Digital calibration of inter-stage gain and non-linearities lumps these errors together as static errors and may not be very efficient for bandwidth-related errors. Analog calibration, such as extra programmable capacitor banks [91], helps reduce the mismatches and eases the pressure on digital calibration. For modern technology nodes, comparator offsets are less prominent than capacitor mismatches, which have been the dominant error source for pipelined ADCs.

As shown in Fig. 4.7, a typical digital calibration engine for inter-stage gain errors and non-linearities uses a polynomial to combine the output data. The word length of $f_{D_{out}}$ increases from 14 bits to 24 bits to accommodate the fractional-number multiplications, which avoids truncation errors in each coefficient.

$$f_{D_{out}} = \sum_{i=1}^{7} f_{D_i}(D_i) = \sum_{i=1}^{7} \left( \sum_{j=1}^{11} \alpha G_{i,j}(D_i)^j \right) \tag{4.12}$$

Figure 4.7: Digital calibration for inter-stage gain errors and non-linearities (off-chip digital calibration engine).

Here $f_{D_{out}}$ stands for the final output digital data after combining the polynomials $f_{D_i}(D_i)$ of all stages. Initially, there are 7 stages and the polynomials in 4.12 run up to the 11th order. The number of stages and the order of the polynomials are set by the design specifications. A high-order polynomial demands excessive digital power consumption. The goal of digital calibration is to optimize the SNDR of $f_{D_{out}}$, which can be expressed as,

$$\max_{\alpha G_{i,j} \in \mathbb{R}} SNDR(f_{D_{out}}) \tag{4.13}$$

The SNDR (4.2) of $f_{D_{out}}$ is obtained by applying a fast Fourier transform (FFT) to $f_{D_{out}}$ with a sinusoidal input signal. The digital calibration works by maximizing this SNDR through a search over the combinations of coefficients $\alpha G_{i,j}$. A brute-force traversal of all combinations is not realistic for digital implementation, but the range of the coefficients can be greatly reduced. For example, the inter-stage gain term associated with $\alpha G_{i,1}$ can be limited to between 3 and 4 for a 2.5-bit stage. Here 4 is the upper bound, since the actual inter-stage gain is smaller, and 3 is a lower bound that can be verified by circuit simulations. The rest of the coefficients can also be verified through circuit-level simulations, and certain coefficients, such as $\alpha G_{i,11}$, can be so small that they can be removed without affecting the SNDR optimization process.

Taking the per-stage polynomial $f_{D_i}(D_i)$ from 4.12 again, it is expressed as

$$f_{D_i}(D_i) = \sum_{j=1}^{11} \alpha G_{i,j}(D_i)^j \tag{4.14}$$

4.14 is a long polynomial, in which odd-order distortions are compensated by odd-order terms, such as $\alpha G_{i,3}(D_i)^3$, and even-order distortions are compensated by their even-order counterparts. For a properly designed ADC, the fully differential circuits and common-mode feedback networks usually suppress the even-order distortions. Thus, 4.14 can be reduced to

$$f_{D_i}(D_i) = \sum_{j=1}^{6} \alpha G_{i,2j-1}(D_i)^{2j-1} \tag{4.15}$$

Meanwhile, high-order odd harmonic distortion usually appears only when the system goes into a deeply non-linear state. For a closed-loop system, a fifth-order polynomial is enough to calibrate the odd-order distortions. In this way, 4.15 can be reduced to

$$f_{D_i}(D_i) = \sum_{j=1}^3 \alpha G_{i,2j-1}(D_i)^{2j-1} = \alpha G_{i,1}(D_i) + \alpha G_{i,3}(D_i)^3 + \alpha G_{i,5}(D_i)^5 \tag{4.16}$$

where $f_{D_i}(D_i)$ contains only three terms. Next, 4.12 is applied to analyze how the calibration handles the different error sources.
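The reduced polynomial can be written out directly in software. The sketch below evaluates the three-term polynomial of 4.16 for each stage and sums the stages as in 4.12; the function names and the coefficient values are hypothetical and serve only as an illustration of the data flow in the calibration engine.

```python
def stage_poly(d, coeffs):
    """Evaluate the odd-order polynomial of 4.16: a1*d + a3*d**3 + a5*d**5."""
    a1, a3, a5 = coeffs
    return a1 * d + a3 * d ** 3 + a5 * d ** 5


def combine(codes, coeff_table):
    """Sum the per-stage polynomials, as in 4.12, for one output sample."""
    return sum(stage_poly(d, c) for d, c in zip(codes, coeff_table))


if __name__ == "__main__":
    # Hypothetical coefficients for three stages with ideal gains of 4.
    table = [(1.0, 0.0, 0.0), (1 / 4, 0.0, 0.0), (1 / 16, 1e-4, 0.0)]
    print(combine([2, -1, 3], table))
```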

Taking a 2.5-bit stage as an example, the main contributors to system performance degradation are comparator offsets in the flash ADC, capacitor mismatches, reduced inter-stage gain and amplifier non-linearities. Comparing Fig. 4.9 with Fig. 4.8, the residue voltage is amplified less than expected. Ideally, the residue should be amplified by 4; with an inter-stage gain below 4, the digital system must combine the output codes using adjusted inter-stage gains. For the polynomial-based digital calibration system, the corresponding coefficients $\alpha G_{i,1}$ can be adjusted. Originally, $\alpha G_{i,1}$ is chosen as

$$\alpha G_{i,1} = \frac{1}{\prod_{j=2}^{i} G_{j-1}}, \quad \text{when } i \ge 2 \tag{4.17}$$



Figure 4.8: 2.5-bit residue amplifier ideal transfer curve.



Figure 4.9: 2.5-bit residue amplifier transfer curve with lower loop gain.

$\alpha G_{1,1}$ is 1, since no stage precedes the first stage. $G_i$ is the inter-stage gain of stage $i$. Theoretically, when the ideal inter-stage gain is 4, $\alpha G_{1,1}$ is 1, $\alpha G_{2,1}$ is 1/4, $\alpha G_{3,1}$ is 1/16 and so on. However, due to the reduced loop gain and the gain variations in each stage, $\alpha G_{2,1}$ is $1/G_1$ and $\alpha G_{3,1}$ is $1/(G_1G_2)$. Still, as expressed in 4.13, the digital calibration engine searches for the $\alpha G_{i,1}$ that maximize the SNDR, thus compensating for the gain errors of each stage. The reduced residue gain means less residue amplification, so more stages are needed to achieve the required ADC accuracy. Meanwhile, the reduced inter-stage gain also reduces the output swing at the cost of incomplete charge transfer.
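The search for the first-order coefficient can be illustrated with a behavioural toy model. The sketch below is not the calibration engine of this work: it assumes a single 2.5-bit stage with a hypothetical actual gain of 3.9 followed by an ideal 10-bit back-end quantizer, uses a coherently sampled sine, estimates the SNDR from the fundamental bin only (a pure-Python stand-in for a full FFT), and performs a plain grid search over the stage-2 weight between 1/4 and 1/3.

```python
import math

N, CYCLES = 2048, 127           # coherent sampling: 127 cycles in 2048 points
G1_ACTUAL = 3.9                 # hypothetical reduced inter-stage gain
THRESHOLDS = [-5/8, -3/8, -1/8, 1/8, 3/8, 5/8]


def first_stage(x):
    """Ideal 2.5-bit decision D1 and the residue amplified by the actual gain."""
    d1 = sum(x > t for t in THRESHOLDS) - 3
    return d1, G1_ACTUAL * (x - d1 / 4)


def backend(r, bits=10):
    """Ideal uniform quantizer (2/2**bits step) standing in for the later stages."""
    step = 2.0 / 2 ** bits
    return round(r / step) * step


def sndr_db(y):
    """SNDR from the fundamental bin; valid for coherent sampling only."""
    n = len(y)
    w = 2 * math.pi * CYCLES / n
    a = 2 / n * sum(v * math.sin(w * k) for k, v in enumerate(y))
    b = 2 / n * sum(v * math.cos(w * k) for k, v in enumerate(y))
    mean = sum(y) / n
    p_sig = (a * a + b * b) / 2
    p_tot = sum((v - mean) ** 2 for v in y) / n
    return 10 * math.log10(p_sig / (p_tot - p_sig))


def search_alpha(lo=0.25, hi=1 / 3, steps=84):
    """Grid search of the stage-2 weight between the 1/4 and 1/3 bounds."""
    x = [0.9 * math.sin(2 * math.pi * CYCLES * k / N) for k in range(N)]
    stage = [first_stage(v) for v in x]
    codes = [(d1, backend(r)) for d1, r in stage]
    return max(
        (lo + (hi - lo) * i / steps for i in range(steps + 1)),
        key=lambda al: sndr_db([d1 / 4 + al * d2 for d1, d2 in codes]),
    )


if __name__ == "__main__":
    alpha = search_alpha()
    print(f"best alpha = {alpha:.4f}, 1/G1 = {1 / G1_ACTUAL:.4f}")
```

In this simplified model the weight that maximizes the SNDR lands close to $1/G_1$, which mirrors the role of $\alpha G_{2,1}$ described above. A practical engine would search many coefficients jointly rather than one at a time.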



Figure 4.10: 2.5-bit residue amplifier transfer curve with comparator offset in sub-ADC.

In Fig. 4.10, the 2.5-bit stage has half-bit redundancy to tolerate comparator offsets. Comparator offsets lead to decoding errors, which cannot be compensated easily in a structure without redundancy. As demonstrated in [93], a structure without redundancy leads to missing codes in certain regions, which causes significant SNDR reduction if it happens in the front-end stages. Structures with half-bit redundancy, however, give the back-end stages the ability to compensate for the undecided decisions of the front-end stages. This redundancy makes this error less dominant in pipelined ADCs. With more advanced CMOS technologies, comparator offsets are greatly reduced. The flash ADC is usually designed, without excessive power consumption, to have offsets statistically below 1/2 LSB. Since the residue is amplified at each stage, the LSB corresponds to $V_R/4$.



Figure 4.11: 2.5-bit residue amplifier transfer curve with capacitor mismatch in MDAC.

As illustrated in Fig. 4.11, capacitor mismatches are difficult to calibrate digitally [94]. Compensating for such errors efficiently requires much higher-order polynomials [94]. The fundamental reason is that polynomials are not efficient at approximating piece-wise linear transfer functions. As shown in [93] and [94], capacitor mismatches in the MDAC can be calibrated by allocating coefficients $w_i$ to the individual capacitors. The main difficulty is finding $w_i$ effectively, since errors propagate and accumulate through the stages. By forcing a close-to-ideal ADC transfer curve, the $w_i$ can be found provided that the inter-stage gains are well calibrated. This technique has been verified in SAR ADCs [93], which do not have many residue amplification stages, and it is difficult to apply to pipelined ADCs. In this design, correction capacitor arrays are used together with digital coefficients to maximize the SNDR of the ADC. Measurement results demonstrate that the analog calibration alleviates the need to find $w_i$ and $G_i$ jointly and greatly accelerates the digital calibration algorithm.



Figure 4.12: 2.5-bit residue amplifier transfer curve with amplifier distortions.

As illustrated in Fig. 4.12, amplifier distortion occurs with large residue signals. The polynomial in 4.16 compensates for the residue amplifier's non-linearities by searching for the $\alpha G_{i,3}$ and $\alpha G_{i,5}$ of each stage that maximize the SNDR. Since each stage works in closed loop, the high-order coefficients, such as $\alpha G_{i,7}$ and $\alpha G_{i,9}$, are introduced first in the optimization algorithm; if these coefficients turn out to be trivial and do not noticeably affect the SNDR, the high-order terms are dropped.

In this work, the inter-stage gain error and the amplifier's non-linearities are calibrated digitally, and the capacitor mismatches are calibrated with an analog method. The comparator offsets are tolerated by the half-bit redundancy embedded in this structure. The coefficients of the digital calibration engine are searched to find the optimal SNDR. This problem is not convex over the whole span of the coefficients, so global optimization algorithms are needed to reach the global optimum. Such algorithms need a long iteration process and excessive hardware overhead to find the optimal coefficient set. All the errors calibrated so far are static non-linear errors that are not related to the signal frequency. Frequency-dependent errors, such as settling errors, also need to be analyzed and calibrated efficiently. Adding memory effects [80] and dynamic non-linearity correction [95] enlarges the coefficient set, and a brute-force search is usually not the most efficient way of finding the optimal set.

A hierarchical approach that weights the error sources differently makes the digital calibration effective. First, the inter-stage gain errors are the most important error sources for signals from low frequency to the Nyquist frequency. Second, for a closed-loop system, non-linearities are reduced by the loop gain, but static offsets combined with the non-linearities still generate harmonic distortion in the output spectrum; introducing non-linear terms with different coefficients helps remove these effects. After these two steps, the SNDR at low input frequencies has generally reached its peak value. However, signals at high frequencies still suffer from limited bandwidth, settling errors, memory effects, etc. Finally, introducing equalization [80] and a dynamic non-linear filter [95] to partially compensate for the signal-dependent errors completes the last stage of foreground calibration. The signal-dependent error calibration may have some effect on low-frequency signals, so a good trade-off between low-frequency and high-frequency SNDR must be found.

For example, suppose an ADC calibrated without signal-dependent calibration has a low-frequency SNDR of 75 dB and a high-frequency SNDR of 65 dB. If, after the signal-dependent calibration, the high-frequency SNDR reaches 72 dB while the low-frequency SNDR drops to 74 dB, the signal-dependent calibration is worthwhile. If instead it raises the high-frequency SNDR to only 67 dB while dropping the low-frequency SNDR to 68 dB, the signal-dependent calibration needs to be adjusted to avoid sacrificing the low-frequency SNDR. In summary, foreground calibration needs to improve the SNDR from low frequency to the Nyquist frequency and to find the optimal coefficient set hierarchically through an adaptive global optimization methodology.

Foreground calibrations are performed offline, while background calibrations are carried out online, with random input signals entering the ADC continuously. Background calibration algorithms therefore have to work in the presence of the unknown input signal. Pseudo-random numbers are injected into different nodes of the ADC [96, 97] to detect the inter-stage gain errors and harmonic distortions. These test signals must then be removed digitally without raising the ADC's noise floor, for example by digital noise cancellation [97]. The main advantage of pseudo-random sequences is their autocorrelation property: a pseudo-random sequence produces only a noise-like result when correlated with another signal, but produces a Dirac-like impulse when correlated with itself with perfect timing alignment. This property is widely used in 3G CDMA communication systems. However, this methodology has several drawbacks. Each stage of the pipelined ADC behaves as a bandwidth-limited system, which removes the high-frequency components of the pseudo-random sequence. Moreover, the comparators in each stage may generate decision errors, so a long pseudo-random sequence is needed to average out the errors. It is difficult to remove the pseudo-random test signal completely, which degrades the SNDR. The pseudo-random signal also occupies some voltage headroom, which reduces the input range. Still, a test signal is needed to track coefficient variations in the background. The hardware complexity depends on the number of coefficients that need to be calibrated in the background; more coefficients mean more hardware overhead. Most existing publications, such as [96, 98], focus on calibrating the most important parameters, the inter-stage gain errors, to maintain performance through background calibration.
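The autocorrelation property mentioned above can be checked with a short script. The sketch below generates one period of a 7-bit maximal-length sequence (the common PRBS7 polynomial $x^7 + x^6 + 1$, chosen here only as an example and not taken from the cited works), maps it to ±1, and evaluates its circular autocorrelation at several lags.

```python
def prbs7(seed=0x7F):
    """One period (127 chips) of the x^7 + x^6 + 1 sequence mapped to +/-1."""
    state, chips = seed, []
    for _ in range(127):
        new_bit = ((state >> 6) ^ (state >> 5)) & 1
        state = ((state << 1) | new_bit) & 0x7F
        chips.append(1 if new_bit else -1)
    return chips


def circular_correlation(a, b, lag):
    """Correlate a with b circularly shifted by `lag` samples."""
    n = len(a)
    return sum(a[k] * b[(k + lag) % n] for k in range(n))


if __name__ == "__main__":
    seq = prbs7()
    print([circular_correlation(seq, seq, lag) for lag in range(5)])
```

The correlation peaks only at zero lag, which is why the injected sequence can be separated from an uncorrelated input signal by correlating the ADC output with the known sequence.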

#### 4.4 Proposed Pipelined ADC with Analog and Digital Calibration Techniques



Figure 4.13: Simplified block diagram of ADC architecture.

To achieve 70 dB SNDR with an optimized Schreier's FoM (>168 dB), optimized analog circuits together with digital and analog calibrations in foreground and background are needed. As shown in Fig. 4.13, the 14-bit pipelined ADC consists of 6 stages of 2.5-bit MDAC followed by a 2-bit flash as the last stage. The input signal buffer was co-designed with the boot-strapped switches, which use tunable boot-strapped dummies for charge-injection cancellation to reduce signal-dependent harmonic components. By reducing the kickback noise into the source follower, the proposed tunable boot-strapped dummies greatly reduce the power consumption of the input buffer. The charge injection and clock feedthrough from the switches introduce signal-dependent distortion that varies with frequency, and the tunable boot-strapped dummies are effective in solving this problem.

Figure 4.14: Boot-strapped switch with tunable boot-strapped dummies.

As verified in [99], the gate voltage of the boot-strapped switch $M_{S1}$ in Fig. 4.14 should ideally track the source voltage with a constant difference of $V_{dd}$,

$$V_g = V_s + V_{dd} \tag{4.18}$$

However, due to the parasitic capacitance  $C_{PD1}$  on the top plate of the  $C_{BS1}$ , the effective voltage difference is reduced to

$$V_g = V_s + V_{dd} \frac{C_{BS1}}{C_{BS1} + C_{PD1}} \tag{4.19}$$

Thus, the parasitic capacitance reduces the boot-strapped voltage across $M_{S1}$ below $V_{dd}$. At the same time, the input signal level-shifted onto the gate has a reduced amplitude, which leads to non-ideal boot-strapping. For the channel charge $Q_{ch}$ stored in $M_{S1}$, a simplified first-order estimate is,

$$Q_{ch} = WLC_{OX}(V_{gs} - V_{th}) \tag{4.20}$$

where W is the transistor's width, L is the transistor's length, $C_{OX}$ is the gate-oxide capacitance per unit area and $V_{th}$ is the transistor's threshold voltage. Ideally, for the dummy switches $M_{D1}$ and $M_{D2}$ to cancel the charge injection and clock feedthrough of $M_{S1}$, they are chosen to have the same length as $M_{S1}$ and half its width. This is based on the premise that the channel charge of $M_{S1}$ splits evenly between the source and drain. However, in the structure shown in Fig. 4.14, the impedances of the buffer and of the first stage's sampling capacitors are different, so the channel charge released by $M_{S1}$ splits unevenly. To cancel the unbalanced channel charges, the boot-strapping capacitors $C_{BSD1}$ and $C_{BSD2}$ of the dummies are tunable, which changes their $V_{gs}$ levels and the charge they absorb. Simulations show that without the dummy switch $M_{D1}$, the channel charge of $M_{S1}$ is kicked back from the source follower and accumulates on the sampling capacitors, generating signal-dependent errors. The gate-to-source capacitance $C_{gs}$ of $M_{F1}$, the bond wire $L_1$ and the input matching resistor $R_{IN}$ can form a frequency-dependent resonant network, which is sensitive to high-frequency injection. Thus, $M_{D1}$ is necessary to absorb the charge from $M_{S1}$. In simulation, to reach the same SNDR from low to Nyquist input frequencies, $M_{F1}$ needs to consume 256.8 mW when the source follower has no dummy $M_{D1}$, compared with 21.4 mW in the design where $M_{D1}$ is used. The main reason is that $C_{gs}$ of $M_{F1}$ is then used to absorb the charge from $M_{S1}$.

With the tunable dummies, the SNDR of the sampled signal from the source follower reaches 84 dB for a low-frequency input and 78 dB for a Nyquist-frequency input, which satisfies the requirements of this design. The architecture tolerates only a narrow range of PVT variations, such as a 10% variation of $V_{dd}$. Without tunable dummies, the SNDR for a low-frequency input does not change much, but the SNDR for a Nyquist input can drop to 52 dB because of the signal-dependent clock feedthrough and charge injection. The sensitivity of this architecture can be alleviated by using more advanced technology nodes without high-voltage reliability issues. Since the desired on-resistance of $M_{S1}$ is realized with much smaller transistors in a more advanced node, the clock feedthrough and charge injection are greatly reduced. However, other problems then arise, such as lower reverse isolation, non-linearity, etc.
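The effect of the top-plate parasitic on the boot-strapped overdrive and on the channel charge can be estimated from 4.19 and 4.20. In the sketch below all device values (supply, threshold, dimensions, oxide capacitance and capacitor sizes) are hypothetical placeholders rather than values from this design.

```python
def bootstrapped_vgs(vdd, c_bs, c_pd):
    """Gate-source voltage from 4.19: Vdd * C_BS / (C_BS + C_PD)."""
    return vdd * c_bs / (c_bs + c_pd)


def channel_charge(w, l, c_ox, v_gs, v_th):
    """First-order channel charge from 4.20: W * L * C_OX * (V_gs - V_th)."""
    return w * l * c_ox * (v_gs - v_th)


if __name__ == "__main__":
    # All device values below are hypothetical placeholders.
    vdd, v_th = 1.8, 0.5                 # volts
    w, l, c_ox = 40e-6, 0.18e-6, 8e-3    # metres, metres, F/m^2
    for c_pd in (0.0, 50e-15, 100e-15):  # parasitic on the top plate of C_BS
        v_gs = bootstrapped_vgs(vdd, 500e-15, c_pd)
        q = channel_charge(w, l, c_ox, v_gs, v_th)
        print(f"C_PD={c_pd * 1e15:.0f} fF  V_gs={v_gs:.3f} V  Q_ch={q * 1e15:.2f} fC")
```

The same two expressions suggest why tuning the dummies' boot-strapping capacitors changes the charge they absorb: a different capacitor ratio changes $V_{gs}$ and therefore $Q_{ch}$ of the dummy.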

The input clock buffer generates non-overlapping clock signals for the different stages, and the measured rms jitter was less than 150 fs. Low-glitch LDOs were used to ensure high accuracy for both the first and second stages. A current-reuse telescopic OTA with a class-C slew-rate boosting circuit reduces the settling error for large swings at the MDAC's output. Comprehensive analog and digital calibration techniques have been used to improve the SNDR of this ADC from low to Nyquist input frequencies.

##### 4.4.1 A Current-reuse OTA with Slew-Rate Boosting Circuit



Figure 4.15: Proposed current-reuse OTA.

The telescopic OTA has been proven to be faster than the two-stage amplifier, and its high voltage gain is suitable for switched-capacitor applications. Details of the current-reuse telescopic OTA are shown in Fig. 4.15, where $V_{INN+}$ and $V_{INP+}$ are the positive inputs, and $V_{INN-}$ and $V_{INP-}$ are the negative inputs. The main amplifier is composed of two sub-amplifiers operating in parallel: the P-type differential pair $M_{P+/-}$ and the N-type differential pair $M_{N+/-}$, which together improve the amplifier's transconductance and settling time owing to its class-AB properties. Buffered by their cascode stages, the differential outputs arrive at $V_{OUT-}$ and $V_{OUT+}$, respectively. This OTA takes full advantage of the switched-capacitor MDAC, in that the input common-mode voltages for $V_{INN+}$ and $V_{INP+}$ can be set differently.



Figure 4.16: Equivalent circuit of proposed current-reuse OTA with level shifting.

The equivalent small-signal circuit of the proposed current-reuse OTA is shown in Fig. 4.16. The input signals $V_{IN+}$ and $V_{IN-}$ are level-shifted by the input common-mode voltages $V_{BCMP}$ and $V_{BCMN}$ so that they are properly amplified by the parallel P-type and N-type pairs. The level-shifting technique can be easily implemented in a switched-cap circuit.

Assuming the N-type differential pair $M_{N+/-}$ and the P-type differential pair $M_{P+/-}$ have similar small-signal transconductance gm and see a similar output capacitance $C_O$ at the differential output, the corresponding GBW of this OTA is,

$$GBW = \frac{2gm}{C_O} \tag{4.21}$$

which shows the benefit of current reuse: the GBW is doubled. The slew rate (SR) of this OTA is

$$SR = \frac{I_B}{C_O} \tag{4.22}$$

which is limited by the tail currents. It is instructive to compare the GBW, SR, output swing and efficiency of different OTA structures.

| Architecture | Telescopic amplifier | Folded cascode amplifier | Current reuse telescopic amplifier |
|--------------|----------------------|--------------------------|------------------------------------|
| GBW          | $gm/C_O$             | $gm/C_O$                 | $2gm/C_O$                          |
| Efficiency   | $gm/I_B$             | $gm/2I_B$                | $2gm/I_B$                          |
| Swing        | $V_{dd} - 5V_{dsat}$ | $V_{dd} - 4V_{dsat}$     | $V_{dd} - 6V_{dsat}$               |
| SR           | $I_B/C_O$            | $I_B/C_O$                | $I_B/C_O$                          |

Table 4.1: Comparisons between different OTA structures.



Figure 4.17: Comparison of single stage OTA structures

As shown in Fig. 4.17, the telescopic amplifier, the folded cascode amplifier and the current reuse telescopic amplifier are compared with each other.

Table 4.1 summarizes the comparison between the three structures. The telescopic amplifier and the folded cascode amplifier have the same GBW of $gm/C_O$, whereas the current reuse telescopic amplifier's GBW is twice that. Since the telescopic amplifier and the current reuse telescopic amplifier have the same static current, the efficiency of the current reuse telescopic amplifier is doubled to $2gm/I_B$, whereas the folded cascode amplifier has the lowest efficiency of $gm/2I_B$. The output swing is dictated by the number of stacked transistors, so the current reuse telescopic amplifier has the lowest output swing, while the folded cascode has the largest. Since all these structures have tail currents $I_B$, their output slew rate is limited. In summary, the current reuse telescopic amplifier is the best candidate for switched-cap MDAC applications. In real scenarios, however, the limited slew rate generates distortion when the input is a large-amplitude, high-frequency signal. Before taking the slew-rate-related distortion into consideration, it is necessary to show how the current reuse telescopic amplifier works with the switched-cap MDAC.
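The entries of Table 4.1 can be evaluated for a given bias point. The helper below simply encodes the GBW, efficiency and slew-rate expressions of the table; the function name and the example values of $gm$, $I_B$ and $C_O$ used to call it are hypothetical.

```python
def ota_metrics(gm, i_b, c_o):
    """GBW (rad/s), gm efficiency (1/V) and slew rate (V/s) per Table 4.1."""
    return {
        "telescopic": {"gbw": gm / c_o, "eff": gm / i_b, "sr": i_b / c_o},
        "folded cascode": {"gbw": gm / c_o, "eff": gm / (2 * i_b), "sr": i_b / c_o},
        "current reuse": {"gbw": 2 * gm / c_o, "eff": 2 * gm / i_b, "sr": i_b / c_o},
    }


if __name__ == "__main__":
    # Hypothetical bias point: gm = 10 mS, I_B = 1 mA, C_O = 1 pF.
    for name, m in ota_metrics(10e-3, 1e-3, 1e-12).items():
        print(f"{name:15s} GBW={m['gbw']:.2e} rad/s  gm/I={m['eff']:.1f} 1/V  "
              f"SR={m['sr']:.2e} V/s")
```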



Figure 4.18: Current reuse amplifier in a switched-cap MDAC.

As shown in Fig. 4.18, the current reuse amplifier is used as the residue amplifier in a switched-cap MDAC. For simplicity, only the half circuit is shown. Bootstrapped switches and transmission gates are labeled differently. Compared to Fig. 4.4, Fig. 4.18 shows the detailed operation of the proposed amplifier and the split capacitors $C_1$, $C_2$, $C_3$ and $C_F$. The input capacitors $C_1$, $C_2$, $C_3$ and the feedback capacitor $C_F$ are each split into two parts, and the part associated with the N-type differential pair is pre-charged to $V_{IN}$ or the proper reference voltage and to the respective common-mode level. During the sampling phase $\Phi_1$, the input signal $V_{IN}$ is sampled onto the bottom plates of all the capacitors against the relevant input common-mode voltages $V_{BCMP}$ and $V_{BCMN}$. The amplifier's linear range is maximized by biasing each differential pair at its optimal bias voltage, $V_{BCMN}$ for the N-type pair and $V_{BCMP}$ for the P-type pair. During the evaluation phase $\Phi_{1B}$, the bottom plates are connected to different reference voltages. Writing the charge conservation equations, the total charge during $\Phi_1$ is

$$\begin{aligned} Q_{\Phi_1} &= \frac{C_1 + C_2 + C_3 + C_F}{2} (V_{IN} - V_{BCMP}) + \frac{C_1 + C_2 + C_3 + C_F}{2} (V_{IN} - V_{BCMN}) \\ &= (C_1 + C_2 + C_3 + C_F) \left(V_{IN} - \frac{V_{BCMP} + V_{BCMN}}{2}\right) \end{aligned}$$
(4.23)

Similarly, for the evaluation phase  $\Phi_{1B}$ ,

$$Q_{\Phi_{1B}} = (C_1 + C_2 + C_3)DV_{REF} + C_F V_{OUT}$$
(4.24)

Equating  $Q_{\Phi_1}$  and  $Q_{\Phi_{1B}}$  yields the ideal transfer curve of the 2.5-bit stage shown in Fig. 4.5. The equivalent input common mode  $(V_{BCMP} + V_{BCMN})/2$  can be chosen as zero, so that no additional level shifting is needed. Note that this zero is relative to the global common-mode voltage  $V_{CMG}$. For this design,  $V_{CMG}$  is 1.1 V, and  $V_{BCMP}$  and  $V_{BCMN}$  are 1.5 V and 0.7 V, respectively, which makes the most of the P-pair and N-pair simultaneously. This embedded level shifting, obtained automatically, is a major benefit of the proposed current reuse telescopic amplifier.
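The charge balance between 4.23 and 4.24 can be checked numerically. The following is a minimal sketch that solves $Q_{\Phi_1} = Q_{\Phi_{1B}}$ for $V_{OUT}$; the function name, the unit capacitor values and the 0.45 V reference are illustrative assumptions, not values taken from the design.

```python
def mdac_residue(v_in, d, v_ref, c_in=(1.0, 1.0, 1.0), c_f=1.0, v_cm_eq=0.0):
    """Solve Q_phi1 = Q_phi1B (4.23 and 4.24) for V_OUT.

    c_in    : (C1, C2, C3), hypothetical unit capacitors by default
    v_cm_eq : (V_BCMP + V_BCMN) / 2, measured relative to V_CMG
    """
    c_sum = sum(c_in)
    q_sample = (c_sum + c_f) * (v_in - v_cm_eq)  # total charge in phi_1, from 4.23
    q_ref = c_sum * d * v_ref                    # reference charge in phi_1B, from 4.24
    return (q_sample - q_ref) / c_f


# Equivalent input common mode of this design, relative to V_CMG = 1.1 V
v_cm_eq = (1.5 + 0.7) / 2 - 1.1

# With D = 0 and equal capacitors the stage simply multiplies the input by 4
v_out = mdac_residue(v_in=0.05, d=0.0, v_ref=0.45)
print(v_cm_eq, v_out)
```

With $V_{BCMP}$ = 1.5 V and $V_{BCMN}$ = 0.7 V the equivalent common mode evaluates to zero relative to $V_{CMG}$ (up to floating-point rounding), which is why no extra level-shifting term appears in the residue.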

The current reuse telescopic amplifier forms the core of the proposed amplifier, and its settling accuracy meets the design specifications for low-frequency inputs. However, for high-frequency, large-signal inputs, the residue amplifier's slew rate drastically limits the settling accuracy. The ADC's performance is mainly dictated by the settling accuracy of each pipeline stage, for which the closed-loop transfer function is fundamental. The settling accuracy is decided only by the accuracy of the output voltage at the instant the following stage samples it. The shape of the settling process is not relevant, provided the residue amplifier settles within the required accuracy.



Figure 4.19: Proposed current-reuse OTA with slew-rate booster.

As shown in Fig. 4.19, the current-reuse telescopic amplifier is equipped with a class-C slew rate booster that provides current on demand. The large swing at the amplifier's input when processing large input signals triggers the class-C slew rate booster to deliver a large instantaneous current to the amplifier output for faster settling. While the high slew rate is mainly handled by the agile slew-rate booster, transistors  $M_{CN+/-}$  and  $M_{CP+/-}$, the final settling is determined by the main current reuse telescopic amplifier. The inputs of the pseudo-differential pairs of the booster are AC coupled to the cascode nodes of the dual-differential-pair telescopic amplifier and are biased at the onset of their turn-on voltages,  $V_{BNC}$  and  $V_{BPC}$. The proposed class-C slew rate boosting circuit does not increase the input capacitance of the main amplifier, thus the feedback factor is not reduced.



Figure 4.20: Closed loop simulation of proposed OTA.

Settling a large residue signal requires a higher slew rate. The effect of the class-C slew rate booster is shown in Fig. 4.20. The slew rate helper clearly makes the circuit settle faster than a traditional OTA, whose slew rate is limited by its bias current. For a pipelined ADC, the following stage only samples the settled voltage, so a nonlinear settling process does not affect the ADC's performance. The class-C slew-rate booster does not turn on when the input signal is small, and thus has minimal impact on the static power consumption. For this simulation, the power consumption increased from 18 mW to 22 mW when  $\beta$  was 1/4 and a 120-mV single-ended step was applied. Because of its high-pass property, the slew rate helper may retain residual charge from the previous sample, and this is exacerbated for high-frequency, large-signal inputs. The digital calibration system can calibrate this contribution as a memory effect. The class-C slew rate booster consumes only 2.6 mW of static power.

#### 4.4.2 Foreground analog and digital calibration techniques

Many foreground calibration techniques have been reported to correct the imperfections of analog circuits[100][101][102]. A typical digital calibration method applies a global optimization algorithm[94], such as particle swarm optimization (PSO), to adjust the coefficients for the best SNDR performance. This is a blind algorithm used for foreground calibration. The method is efficient and captures all errors as non-linear coefficients. Fundamentally, all calibration algorithms try to find the inverse function of the ADC 4.5. This method is helpful as long as frequency-dependent distortions are minimal. For example, if capacitor offsets are static and the OTA settles properly over all input frequencies, the non-linearities are not coupled with frequency-dependent errors. In this case, however, the analog circuits are generally over-designed and cannot achieve the best Walden's FoM or Schreier's FoM. It is therefore desirable to identify the individual errors first, so that the corresponding calibration algorithms can be generated. This is similar to the concept of independent component analysis (ICA) [103, 104]. This section gives a brief overview of existing analog and digital calibration algorithms[82, 105].

A comprehensive foreground analog and digital calibration is introduced. One highlight of the proposed design is that capacitor mismatches are reduced with an efficient analog method, while signal-dependent errors are identified and calibrated through out-of-band signal injection. Certain static offsets are coupled with the input signal; they are de-coupled and calibrated through both analog and digital calibration. Self-calibration is a useful tool that improves the SNDR progressively [106][107]. In this design, self-calibration helps with the capacitor mismatch tuning of the 1<sup>st</sup> and 2<sup>nd</sup> stages by gradually increasing the input amplitude.

The ADC performance improves by means of the calibration schemes; the foreground calibration scheme is depicted in Fig. 4.21. The analog calibration includes LDO glitch reduction through a pre-charging technique as well as capacitor mismatch reduction through a correction capacitor array  $C_{COR}$  in the 1st and 2nd stages. The foreground calibration technique incorporates a compensation polynomial for inter-stage gains  $\alpha G_{i,1}$, non-linearities  $\alpha G_{i,\{3,5,7\}}$, equalization  $\alpha E_{i,2j-1,k}$  [79, 80, 81], and dynamic non-linearities  $\alpha SD_{i,j}$  and  $\alpha S_{i,j}$  [95]. The coefficients



Figure 4.21: Foreground calibration scheme; LDO with glitch reduction is used in the first two stages.

 $\alpha G_{i,1}$  and  $\alpha G_{i,\{3,5,7\}}$  are shown in 4.16 and calibrate the inter-stage gain errors and non-linearities of each stage. The equalization coefficients  $\alpha E_{i,2j-1,k}$  borrow the concept of an FIR filter to compensate for the frequency response of the ADC.

As shown in Fig. 4.22, due to the limited bandwidth of the switches, residue amplifiers and parasitic RC delays, the cut-off frequency of the uncompensated ADC is fc, which is below the Nyquist frequency fs/2. The equalization method generates a flat frequency response by filtering the ADC's output with an FIR filter, such that the cut-off frequency of the compensated ADC is close to the Nyquist frequency. In this way,  $f_{D_i}(D_i)$  contains the data from previous samples,

$$f_{D_i}(D_i)[N] = \sum_{k=1}^{3} \sum_{j=1}^{3} \alpha E_{i,2j-1,k} (D_i[N-k])^{2j-1}$$
(4.25)

In 4.25,  $D_i[N-1]^3$  is the third-order component of the previous sample. The equation includes the FIR response to the first-order through fifth-order components of the previous samples. Again, the even-order harmonic distortions of the previous samples are assumed to be suppressed.
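The equalization term in 4.25 can be written as a short routine. This is a sketch under stated assumptions: the function name and the coefficient layout `alpha_e[j-1][k-1]` are hypothetical, and samples before the start of the record are taken as zero.

```python
def equalize(d, alpha_e):
    """Evaluate 4.25 for every sample of the stage output d.

    d       : list of digital samples D_i[n]
    alpha_e : alpha_e[j-1][k-1] holds alphaE_{i,2j-1,k} for j, k = 1..3
    Samples before the start of the record are treated as zero (assumption).
    """
    out = []
    for n in range(len(d)):
        acc = 0.0
        for k in range(1, 4):                      # three previous samples
            prev = d[n - k] if n - k >= 0 else 0.0
            for j in range(1, 4):                  # odd orders 1, 3 and 5
                acc += alpha_e[j - 1][k - 1] * prev ** (2 * j - 1)
        out.append(acc)
    return out


# Linear-only coefficients: the routine behaves as a plain 3-tap FIR filter
linear_only = [[0.5, 0.25, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
fir_out = equalize([1.0, 2.0, 3.0, 4.0], linear_only)
print(fir_out)
```

Setting all third- and fifth-order coefficients to zero leaves only the first-order taps, which is the plain FIR response of 4.26.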



Figure 4.22: Using an FIR filter to generate flat in-band frequency response.

$$f_{D_i}(D_i)[N] = \sum_{k=1}^{3} \alpha E_{i,k}(D_i[N-k])$$
(4.26)

If the non-linearities of the previous samples are neglected, 4.25 reduces to 4.26, which is a typical FIR filter response. Meanwhile, the dynamic non-linearity correction [95] compensates for signals with different slopes. The slope of the signal is

$$Slope[N] = \frac{D_i[N] - D_i[N-1]}{T}$$
 (4.27)

where T is the clock period, and  $D_i[N]$  and  $D_i[N-1]$  are the current and previous digital data, respectively. The dynamic error correction [95] is applied as

$$f_{D_i}(D_i)[N] = \sum_{j=1}^3 \alpha SD_{i,j} * D_i[N] * (Slope[N])^j + \sum_{j=1}^3 \alpha S_{i,j} * (Slope[N])^j$$
(4.28)

As seen from 4.28, when the signal's slope is high, the limited bandwidth and slew rate introduce high-order distortions that are related to the signal's slope. Thus, higher-order powers of the signal's time derivative can partially compensate for the distortion due to fast signal transitions. When the signal changes slowly, the slope is close to zero and the dynamic error correction does not impact the system's SNDR.
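The slope estimate of 4.27 and the dynamic error correction of 4.28 can be sketched as follows. The function name and coefficient lists are hypothetical, and the first sample is given zero slope because it has no predecessor.

```python
def dynamic_correction(d, t_clk, alpha_sd, alpha_s):
    """Evaluate 4.28 using the backward-difference slope of 4.27.

    alpha_sd[j-1] and alpha_s[j-1] hold alphaSD_{i,j} and alphaS_{i,j}, j = 1..3.
    The first sample has no predecessor, so its slope is set to zero (assumption).
    """
    out = []
    for n in range(len(d)):
        prev = d[n - 1] if n >= 1 else d[0]
        slope = (d[n] - prev) / t_clk              # 4.27
        acc = 0.0
        for j in range(1, 4):
            acc += alpha_sd[j - 1] * d[n] * slope ** j + alpha_s[j - 1] * slope ** j
        out.append(acc)
    return out


# A step followed by a flat region: only the transition sample is corrected
corr = dynamic_correction([0.0, 1.0, 1.0], t_clk=1.0,
                          alpha_sd=[0.1, 0.0, 0.0], alpha_s=[0.2, 0.0, 0.0])
print(corr)
```

The correction vanishes wherever consecutive samples are equal, matching the observation that slowly varying signals are left untouched.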

$$\begin{aligned} f_{D_{i}}(D_{i})[N] &= \sum_{j=1}^{3} \alpha G_{i,2j-1}(D_{i}[N])^{2j-1} \\ &+ \sum_{k=1}^{3} \sum_{j=1}^{3} \alpha E_{i,2j-1,k}(D_{i}[N-k])^{2j-1} \\ &+ \sum_{j=1}^{3} \alpha SD_{i,j} * D_{i}[N] * (Slope[N])^{j} + \sum_{j=1}^{3} \alpha S_{i,j} * (Slope[N])^{j} \end{aligned}$$
(4.29)

Taking all the possible coefficients together, the final calibration polynomial is shown in 4.29. The target of the calibration algorithm remains optimizing the SNDR over the entire input range. The foreground calibration engine applies a global optimization algorithm to find the coefficients that maximize the SNDR.

These coefficients are searched globally and adaptively. Fig. 4.23 shows the diagram of the PSO algorithm used in foreground calibration to optimize the SNDR performance. The inter-stage gain errors, capacitor mismatches, settling errors and other errors are gathered together for digital calibration. Signal-dependent calibration is used to compensate for the dynamic errors at high frequencies. The low-frequency SNDR should be maintained without being greatly reduced by the signal-dependent calibration.
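A generic particle swarm search of the kind described above can be sketched in a few lines. This is a textbook PSO, not the calibration engine of this work: the inertia and acceleration constants, the search bounds, and the quadratic toy cost (a stand-in for the negative SNDR of the corrected ADC output) are all assumptions made for illustration.

```python
import random


def pso_minimize(cost, dim, bounds=(-1.0, 1.0), n_particles=20, n_iter=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Textbook global-best PSO; returns (best_position, best_cost)."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_cost = [cost(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_cost[i])
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            c = cost(pos[i])
            if c < pbest_cost[i]:                  # update personal best
                pbest[i], pbest_cost[i] = pos[i][:], c
                if c < gbest_cost:                 # update global best
                    gbest, gbest_cost = pos[i][:], c
    return gbest, gbest_cost


# Toy cost: squared distance from hypothetical "true" coefficients.
TARGET = [0.3, -0.2]
best, best_cost = pso_minimize(
    lambda p: sum((a - b) ** 2 for a, b in zip(p, TARGET)), dim=2)
print(best, best_cost)
```

In an actual calibration the cost function would evaluate 4.29 on captured data and return the negative SNDR, so each cost evaluation is far more expensive than in this toy example.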

Since a strong ground plane is not assured for QFN packages with ground bonds, a glitch reduction technique is applied to the 1<sup>st</sup> and 2<sup>nd</sup> stages to ensure their output settling accuracy. As on-chip voltage references are demanding for ADC applications[108], it is imperative to guarantee the accuracy of the voltage references over different conditions. A voltage reference with less than 12-bit accuracy cannot ensure the SNDR performance of the first stage. Also, the reference ringing from the package bond wires is signal dependent and can reduce the ADC's SNDR. In this way, the


Figure 4.23: A foreground calibration method through PSO algorithm.

bond wires need to be taken into account to minimize ringing. Since the LDO's bandwidth is not large enough to track very fast variations, a large capacitor  $C_{LDO}$  is needed, and charge is drawn from  $C_{LDO}$.

As indicated in Fig. 4.24, the LDO provides  $V_R$  to the different capacitors and a large capacitor  $C_{LDO}$  holds the voltage. In this figure, the glitch reduction technique is applied to  $C_1$. During the evaluation phase  $\phi_{1B}$, the control signal  $\phi_{1B}$  connects the bottom plate of  $C_1$  so that its voltage changes from  $V_{IN}$  to the reference voltage  $V_R$. The bond wire  $L_{LDOG}$  attached to the bottom plate of the large capacitor  $C_{LDO}$  presents a high impedance during switching, which leads to ringing on the top plate. To counter this, an input-signal-dependent compensation charge is prepared to greatly reduce the switching ringing from  $C_{LDO}$. The compensation charge should compensate for



Figure 4.24: A low glitch LDO based on pre-charging technique.

the instantaneous charge transfer between the LDO and the bottom plate of  $C_1$. The total instantaneous charge needed from the LDO in  $\phi_{1B}$  is

$$Q_I = C_{1PAR} (V_{REF} - V_{IN})$$
(4.30)

where  $C_{1PAR}$  is the stray capacitance at the bottom plate of  $C_1$. The main source of the ringing is the instantaneous top-plate charging of  $C_{1PAR}$. For high-frequency glitches, the OTA cannot respond immediately and the top plate of  $C_1$  is close to open loop. Thus, the glitch reduction is not strongly related to  $C_1$. However, if the charge  $Q_I$  is compensated by another charge  $Q_{AUX}$  with the same magnitude and the opposite polarity, the ringing is greatly reduced.  $Q_{AUX}$  must contain  $-V_{IN}$  to cancel the signal-dependent charge. Suppose a glitch reduction capacitor  $C_{LDOCOR}$, shown in Fig. 4.24, is used to store  $Q_{AUX}$. During  $\phi_1$,

$$Q_{C_{LDOCOR}} = C_{LDOCOR}((-V_{IN}) - (V_{CM1}))$$
(4.31)

where  $-V_{IN}$  is sampled onto  $C_{LDOCOR}$  and the common-mode voltage applied to the bottom plate is  $V_{CM1}$. During  $\phi_{1B}$,

$$C_{1PAR}V_R + C_{LDOCOR}(V_R - V_{CM2}) = C_{1PAR}V_{IN} + C_{LDOCOR}((-V_{IN}) - (V_{CM1}))$$
(4.32)

4.32 gives the condition for charge balance, where  $C_{LDOCOR}$  is the pre-charging capacitor that provides  $Q_{AUX}$. The  $V_{IN}$-related terms cancel when  $C_{LDOCOR}$  is tuned close to  $C_{1PAR}$. The bottom plate of  $C_{LDOCOR}$  is switched from  $V_{CM1}$  to  $V_{CM2}$  between  $\phi_1$  and  $\phi_{1B}$  to accommodate the different common-mode voltages required in the two phases.  $V_{CM2}$  and  $V_{CM1}$  are naturally chosen as  $V_R$  and  $-V_R$, respectively.
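The cancellation claimed for 4.32 is easy to verify numerically. The sketch below evaluates both sides of 4.32; the 50 fF capacitances and the 0.45 V reference are hypothetical values chosen only to exercise the equation.

```python
def charge_imbalance(v_in, v_r, c_1par, c_ldocor, v_cm1, v_cm2):
    """Right-hand side minus left-hand side of 4.32 (zero means balanced)."""
    lhs = c_1par * v_r + c_ldocor * (v_r - v_cm2)
    rhs = c_1par * v_in + c_ldocor * ((-v_in) - v_cm1)
    return rhs - lhs


V_R = 0.45      # hypothetical reference voltage
C_PAR = 50e-15  # hypothetical stray capacitance C_1PAR

# Matched case: C_LDOCOR = C_1PAR, V_CM2 = V_R, V_CM1 = -V_R
matched = [charge_imbalance(v, V_R, C_PAR, C_PAR, -V_R, V_R)
           for v in (-0.3, 0.0, 0.3)]

# 10 % capacitor mismatch: a V_IN-dependent residue remains
mismatched = [charge_imbalance(v, V_R, C_PAR, 1.1 * C_PAR, -V_R, V_R)
              for v in (-0.3, 0.0, 0.3)]
print(matched, mismatched)
```

In the matched case the imbalance is zero for every input level, while a mismatch leaves a residue proportional to $(C_{1PAR} - C_{LDOCOR})(V_{IN} - V_R)$, which is why $C_{LDOCOR}$ is made tunable.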

Since the low-glitch LDO provides  $V_R$,  $V_{CM2}$  and  $V_{CM1}$  are supplied by separate additional LDOs to reduce the effect of self-loading. During  $\phi_{1B}$,  $C_{LDOCOR}$  is connected to  $C_{LDO}$  early, using  $\phi_{1BE}$, to perform the pre-charging. The phase difference between  $\phi_{1BE}$  and  $\phi_{1B}$  is also digitally tuned, since pre-charging too early can itself introduce unexpected ringing. In simulation with a full-scale Nyquist input, this technique reduces the peak value of the switching glitches from 1.34 mV to 41.2 uV, which is equivalent to a 30.2 dB improvement. Considering the single-ended peak swing of 450 mV, this corresponds to an equivalent SNDR improvement from 50.6 dB to 80.8 dB. This preliminary analysis overrates the impact of reference ringing, since  $C_1$  is not strongly related to this kind of ringing. The LDO still needs to provide charge for  $C_1$; however, this is relevant to settling accuracy, not glitch reduction. If the glitch reduction technique is also applied to  $C_1$, the charge and matching requirements will demand excessive power consumption. The actual accuracy of the ADC is dictated by the output settling accuracy, so the effectiveness of this technique needs to be verified by measurement. In the ADC measurements, this technique increases the SNDR at low and Nyquist frequencies by 2.9 to 6.1 dB. It is particularly effective in improving the SNDR at high frequencies, supporting an SNDR of up to 68.6 dB at the Nyquist frequency. Without the LDO glitch reduction technique, the Nyquist SNDR cannot easily exceed 62.1 dB, even after all the digital calibrations. In practice, the LDO ringing can be partially calibrated by the digital calibration techniques, which treat the ringing as a weakly signal-dependent gain error.
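The decibel figures quoted for the simulated glitch reduction follow directly from the voltage ratios. A short check, using only the 1.34 mV, 41.2 uV and 450 mV values stated above (the variable names are illustrative):

```python
import math


def db(voltage_ratio):
    """Convert a voltage ratio to decibels."""
    return 20.0 * math.log10(voltage_ratio)


GLITCH_BEFORE = 1.34e-3   # peak glitch without pre-charging, in volts
GLITCH_AFTER = 41.2e-6    # peak glitch with pre-charging, in volts
V_PEAK = 0.45             # single-ended peak swing, in volts

improvement_db = db(GLITCH_BEFORE / GLITCH_AFTER)
ratio_before_db = db(V_PEAK / GLITCH_BEFORE)
ratio_after_db = db(V_PEAK / GLITCH_AFTER)
print(improvement_db, ratio_before_db, ratio_after_db)
```

The three results agree with the quoted 30.2 dB, 50.6 dB and 80.8 dB to within about 0.1 dB of rounding.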

#### 4.4.3 Background Calibration Techniques

For a conventional pipelined ADC, the high loop gain minimizes incomplete charge transfer and non-linearities. Thus, most designs only monitor the inter-stage gain error with pseudo-random number injection. In this design, variations of the stage non-linearities are also calibrated in the background. Here, the variations of inter-stage gains, non-linearities, weak capacitor mismatches and other effects generate out-of-band spectrum regrowth. The background engine cannot easily separate these errors. The proposed methodology measures the ADC non-linearities while processing the signal by making use of the unoccupied high-frequency band. Most ADCs restrict the frequency range of the input signal to below the Nyquist frequency, e.g. fs/4 in many cases, which is half of the Nyquist frequency. Thus, the unused frequency range close to fs/2 can be used for background calibration. Instead of pseudo-random sequences, whose power is spread over the whole band, multiple-tone test signatures located in the unused band are employed. An adaptive digital background calibration algorithm is proposed in this project.



Figure 4.25: Gain and non-linearity calibration through small input power.

Assume the input signal only occupies the band up to fs/4, and test signature tones are injected between fs/4 and fs/2 to monitor the ADC non-linearities. Fig. 4.25 shows the injection of an out-of-band two-tone signal to measure the ADC non-linearities; calibration is achieved by observing and reducing the third- or fifth-order inter-modulation products. Suppressing the high-order inter-modulation products calibrates the inter-stage gain error and non-linearities simultaneously. Since the foreground calibration finds the global maximum of the SNDR by adjusting the coefficients, the background calibration only needs to search within limited spans to track the coefficient variations and maintain the optimal SNDR. With the small signal power assumed in Fig. 4.25, the inter-modulation products result from the ADC's non-linear response to the test signals. By running an FFT in the background, the power of the inter-modulation products is reduced by tuning the trained coefficients of the digital engine in 4.29. In the lab measurements, the ADC's SNDR was measured over the entire input frequency range once the background calibration engine had reached a stable point.
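The two-tone measurement can be illustrated with a toy model. In the sketch below, two coherent tones are placed between fs/4 and fs/2, passed through a memoryless cubic non-linearity standing in for the ADC, and the third-order inter-modulation product is read from a single DFT bin before and after a polynomial correction. The record length, tone bins, amplitude and the cubic coefficient are all hypothetical.

```python
import cmath
import math


def tone_amplitude(x, k):
    """Amplitude of the real tone in DFT bin k (single-bin DFT)."""
    n = len(x)
    acc = sum(x[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))
    return 2.0 * abs(acc) / n


def im3_dbc(x, k1, k2):
    """Lower IM3 product (bin 2*k1 - k2) relative to the tone in bin k1."""
    return 20.0 * math.log10(tone_amplitude(x, 2 * k1 - k2) / tone_amplitude(x, k1))


N = 1024            # record length; fs/4 is bin 256 and fs/2 is bin 512
K1, K2 = 300, 310   # hypothetical test-tone bins inside (fs/4, fs/2)
AMP = 0.2           # hypothetical tone amplitude (small-signal case)
A3 = 0.05           # hypothetical third-order error of the toy ADC

tones = [AMP * math.cos(2 * math.pi * K1 * i / N)
         + AMP * math.cos(2 * math.pi * K2 * i / N) for i in range(N)]
adc_out = [v + A3 * v ** 3 for v in tones]        # toy non-linear ADC
corrected = [v - A3 * v ** 3 for v in adc_out]    # third-order correction term

im3_before = im3_dbc(adc_out, K1, K2)
im3_after = im3_dbc(corrected, K1, K2)
print(im3_before, im3_after)
```

A background engine would adjust the correction coefficient in small steps and keep the value that lowers the measured inter-modulation level; here the ideal coefficient is applied directly to show the effect.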



Figure 4.26: High-frequency distortion calibration through small input power.

For frequency-dependent errors, an intuitive low-pass model with a single pole is assumed. As shown in Fig. 4.26, the out-of-band pole location changes with temperature; for example, when the temperature rises from 27 °*C* to 85 °*C*, the equivalent gain of the ADC starts to drop at a lower frequency. The background calibration engine applies multiple two-tone signals from fs/4 to fs/2 to track the gain reduction due to frequency-dependent factors. Thus, the pole in the frequency-dependent calibration model is tracked by tuning the trained coefficients of the digital engine in 4.29, of which the most relevant are the equalization and dynamic error correction coefficients. The test signals can be digitally filtered or canceled without interfering with the input signal, since they are all out-of-band. The low in-band signal level creates an opportunity to use the unused bands for digital background calibration. Since the test signature's swing is higher than that of the in-band components, the measured non-linearities and frequency-dependent errors are largely independent of the in-band signal.
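The single-pole model makes the expected gain droop easy to estimate. The following sketch evaluates the magnitude response at fs/4 and fs/2 for two pole locations; the 1 GS/s sampling rate is that of this ADC, while both pole frequencies are hypothetical placeholders for the 27 °C and 85 °C cases.

```python
import math


def single_pole_gain_db(f_hz, f_pole_hz):
    """Magnitude of a single-pole low-pass response, in dB."""
    return -10.0 * math.log10(1.0 + (f_hz / f_pole_hz) ** 2)


FS = 1e9                               # sampling rate of the ADC
POLES = {"27C": 1.2e9, "85C": 1.0e9}   # hypothetical pole locations

droop = {label: [single_pole_gain_db(f, fp) for f in (FS / 4, FS / 2)]
         for label, fp in POLES.items()}
print(droop)
```

Comparing the droop measured with the out-of-band tones against such a model gives the engine a direct handle on the pole movement.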



Figure 4.27: Gain and non-linearity calibration through large input power.

As shown in Fig. 4.27, when the signal power is close to the ADC's full scale, applying the test signal would saturate the ADC. For a bandwidth-limited signal, the ADC's non-linearities generate out-of-band spectrum regrowth. For example, a signal below fs/4 can generate components above fs/4 due to non-linearities. Thus, the out-of-band spectrum leakage is reduced in the background by observing the output spectrum through an FFT and tuning the trained coefficients of the digital engine in 4.29. Once the out-of-band power is reduced to an acceptable level or below the noise floor, the digital background calibration can reduce its training speed. Frequency-dependent calibration can also be performed, as there is a strong correlation between the shapes of the in-band and out-of-band power. For example, a flat in-band power with a skewed out-of-band power that has lower high-frequency components indicates a change of the pole of the low-pass model embedded in the signal-dependent calibration. Here no test signal is used, so there is no need to remove one. Generally, in this scenario, the background calibration cannot converge very fast, as the input power changes dynamically and the optimization is non-convex.



Figure 4.28: Adaptive background calibration schemes that includes measuring out-of-band power, two tone test and frequency response compensation.

Fig. 4.28 shows the whole picture of the adaptive background calibration, which depends on the signal power level. When the measured input signal power  $P_{IN}$  is higher than the predefined threshold power  $P_{TH}$, the background calibration engine focuses on reducing the out-of-band leakage power, which is generated by the ADC non-linearities, by tuning the coefficients in the polynomial. For  $P_{IN}$  lower than  $P_{TH}$, an out-of-band digital test signal is converted into analog form by an external high-performance DAC and used to accurately measure the ADC linearity; the case of two tones is exemplified in Fig. 4.28. The out-of-band two-tone test signal enables background calibration of the inter-stage gain and non-linearity, since the ideal test signature is available in digital form. The background calibration algorithm also allows the compensation of frequency limitations. The frequency response variations, mainly due to fluctuations in the sample-and-hold and residue amplifiers, are tracked by a fully adaptive digital filter; the out-of-band test signal is swept through the empty spectrum up to the Nyquist frequency to fully characterize the ADC's frequency response, as depicted in Fig. 4.28. Frequency-dependent calibration is then performed.
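The power-dependent choice between the two background modes can be summarized as a small decision routine. This is only a sketch of the logic described above: the function name, the dBFS units, the number of sweep steps and the handling of $P_{IN} = P_{TH}$ (treated here as the low-power case) are assumptions.

```python
def plan_background_calibration(p_in_dbfs, p_th_dbfs, fs_hz, n_steps=4):
    """Pick the background calibration mode from the measured input power.

    Above the threshold, only the out-of-band leakage is monitored and no
    test tones are injected. Otherwise, test-tone centre frequencies are
    spread over the unused band between fs/4 and fs/2.
    """
    if p_in_dbfs > p_th_dbfs:
        return {"mode": "out_of_band_leakage", "test_tones_hz": []}
    lo, hi = fs_hz / 4.0, fs_hz / 2.0
    step = (hi - lo) / (n_steps + 1)
    centers = [lo + step * (i + 1) for i in range(n_steps)]
    return {"mode": "two_tone_injection", "test_tones_hz": centers}


large_signal = plan_background_calibration(-3.0, -20.0, 1e9)
small_signal = plan_background_calibration(-40.0, -20.0, 1e9)
print(large_signal["mode"], small_signal["mode"], small_signal["test_tones_hz"])
```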

In summary, an adaptive digital background calibration algorithm was proposed for different input power levels. Inter-stage gain errors, non-linearities and frequency-dependent errors were calibrated. The test signal was adaptive and can be removed digitally without interfering with the in-band signal.

# 4.5 Experimental Results



Figure 4.29: Chip photo with measurement set-up.

The low-power pipelined ADC with analog and digital calibration techniques was fabricated in a TSMC 40 nm LP technology. The microphotograph of the chip is shown in Fig. 4.29, where the input buffer, seven stages, clock buffer and output low-voltage differential signaling (LVDS) interfaces are indicated. A Keysight N5172B is used as the input signal source. A Silicon Labs Si5341D provides the low-jitter clock for the ADC. The LVDS output is captured by a TSW1400 data capture card. Background calibration patterns, including two tones and a bandwidth-limited OFDM signal, are generated by the N5172B and a DAC34SH84 controlled by another TSW1400.

The most relevant measurement results are shown in Figs. 4.30 and 4.31. For a Nyquist-frequency input, an SFDR improvement from 65.34 dB to 76.14 dB is verified. The digital calibration includes all the techniques in 4.29. The harmonic distortions are greatly suppressed, especially the tones that are far from the input signal. These harmonics are signal dependent



Figure 4.30: Measured spectrum before calibration.



Figure 4.31: Measured spectrum after calibration

errors, which are reduced by the equalization and dynamic error correction techniques. Meanwhile, the calibration scheme provides an SNDR increase of over 12.6 dB. SFDR and SNDR were measured at different frequencies, showing an SNDR reduction of 2.4 dB across the entire ADC bandwidth. The SNDR of the background-calibrated ADC remains above 68.6 dB when the temperature varies from 27  $^{\circ}C$  to 85  $^{\circ}C$.



Figure 4.32: DNL and INL before calibration.

DNL and INL before and after calibration are reported in Figs. 4.32 and 4.33. Before calibration, the DNL is between +1.86 and -1.98 LSB and the INL is between +4.13 and -3.42 LSB. The proposed calibration techniques reduce the DNL to between +0.169 and -0.175 LSB and the INL to between +0.504 and -0.523 LSB. The calibrated DNL and INL are within 0.18 LSB and 0.53 LSB, respectively, which satisfies the design requirements.

As shown in Fig. 4.34, the SNDR and SFDR increase linearly with the input amplitude in dB. The peak SNDR of 71.3 dB and SFDR of 81.5 dB occur at 0 dB input amplitude, which is the



Figure 4.33: DNL and INL after calibration.

designated full-scale input. There are certain variations, such as a small SFDR peak at -40 dB input, which comes from the non-smooth response of the digital calibration engine. Overall, the calibration keeps a close-to-linear response of the SNDR/SFDR over input amplitude.

Table 4.2 records how the SNDR evolves as the calibration techniques are applied; BG is short for background. The complete calibration improves the SNDR of a low-frequency signal from 55.2 to 71.3 dB, and that of a Nyquist-frequency input from 46.6 to 68.9 dB. The system shows an improvement of over 16 dB at low frequency and 22 dB at the Nyquist frequency. The high-frequency input is more sensitive to LDO noise, thus the LDO glitch reduction technique exhibits more benefit there. As expected, inter-stage gain error calibration gives a significant SNDR improvement. The equalization and dynamic non-linearity calibration techniques are effective in improving the SNDR at high frequencies, compensating for the high-frequency errors from switches



Figure 4.34: SNDR/SFDR versus normalized input amplitude in dB.

and MDACs. Background calibration was also effective in recovering the SNDR of a low-frequency input from 65.4 to 71.2 dB and of a Nyquist input from 60.2 to 68.6 dB. All these results were close to the optimal results obtained in foreground.

| Calibration | Low Freq. SNDR (dB) | fs/4 Freq. SNDR (dB) | Nyq. Freq. SNDR (dB) |
|---|---|---|---|
| Original                                                                                   | 55.2           | 51.0            | 46.6            |
| Class-C,<br>(omitted later)                                                                | 56.3           | 52.4            | 50.2            |
| LDO glitch reduction,<br>(omitted later)                                                   | 59.2           | 57.4            | 56.3            |
| Inter-stage gain                                                                           | 68.3           | 65.4            | 62.8            |
| Inter-stage gain,<br>non-linearity                                                         | 71.3           | 69.7            | 64.7            |
| Inter-stage gain,<br>Capacitor mismatch | 70.1 | 68.2 | 64.0 |
| Inter-stage gain,<br>Capacitor mismatch,<br>non-linearity                                  | 71.6           | 69.7            | 64.8            |
| Inter-stage gain,<br>Capacitor mismatch,<br>non-linearity, Memory                          | 71.2           | 70.4            | 66.3            |
| Inter-stage gain,<br>Capacitor mismatch,<br>non-linearity, Memory<br>Dynamic non-linearity | 71.3           | 70.5            | 68.9            |
| BG Uncalibrated                                                                            | 65.4           | 63.2            | 60.2            |
| BG LDO glitch red.                                                                         | 67.3           | 64.6            | 61.2            |
| BG Inter-stage gain                                                                        | 70.2           | 67.3            | 63.2            |
| BG Inter-stage gain non-linearity                                                          | 71.2           | 69.4            | 64.5            |
| BG Inter-stage gain<br>non-linearity, Memory<br>Dynamic non-linearity | 71.2 | 70.2 | 68.6 |

Table 4.2: SNDR over different calibration techniques.

| Ref.                                                      | [80]                                          | [82]                                                         | [83]                                        | [84]                   | This Work                                                                    |
|-----------------------------------------------------------|-----------------------------------------------|--------------------------------------------------------------|---------------------------------------------|------------------------|------------------------------------------------------------------------------|
| Tech.                                                     | 40 nm                                         | 65 nm                                                        | 16 nm                                       | 28 nm                  | 40 nm                                                                        |
| Arch.                                                     | PIPE                                          | PIPE                                                         | TI PIPE                                     | PIPE SAR               | PIPE                                                                         |
| Sampling<br>rate (MHz).                                   | 2100                                          | 1000                                                         | 4000                                        | 1000                   | 1000                                                                         |
| SNDR (dB) | 52@Low | 69@Low | 62.9@Low | 61.36@Low | 71.3@Low |
| | 52@Nyq. | 66@Nyq. | 61.7@Nyq. | 60.02@Nyq. | 68.9@Nyq. |
| SFDR (dB)                                                 | 62@Nyq                                        | 86@Low                                                       | 80.3@Low                                    | 74.61@Low              | 81.5@Low                                                                     |
|                                                           |                                               | 83@Nyq.                                                      | 73.3@Nyq.                                   | 74.56@Nyq.             | 76.14@Nyq.                                                                   |
| FS (Vppd)                                                 | 1.4                                           | 2.0                                                          | 1.52                                        | 1.2                    | 1.8                                                                          |
| FoM (dB) | 148.4@low | 155.2@low | 169.2@low | 169.54@low | 170.8@low |
| | 147.4@Nyq. | 149.2@Nyq. | 166@Nyq. | 168.2@Nyq. | 168.4@Nyq. |
| Calibration                                               | Gain<br>non-linearity<br>Equalization<br>(FG) | Gain<br>non-linearity<br>Equalization<br>Kickback<br>(FG+BG) | Gain<br>Distortion<br>monitoring<br>(FG+BG) | Gain<br>Offset<br>(FG) | Gain<br>non-linearity<br>Equalization<br>Dynamic<br>non-linearity<br>(FG+BG) |
| Supply (V)                                                | 0.75/1.75                                     | 1.2/2.5/3.3                                                  | 0.9/1.8                                     | 1.0                    | 1.1/2.2/2.5                                                                  |
| Power (mW)                                                | 240                                           | 1200                                                         | 75                                          | 7.6                    | 56                                                                           |

Table 4.3: Comparison with pipelined ADCs.

#### 4.6 Conclusions and Summary

In this chapter, a 1 GS/s low-power, high-performance pipelined ADC with comprehensive foreground and background calibrations has been implemented. A high-speed complementary OTA with a dynamic slew-rate helper circuit successfully settles the MDAC with large input swings. The proposed low-noise voltage references are immune to ground bounce and have minimized glitches. Comprehensive foreground calibration techniques with both analog and digital parts effectively calibrate the ADC's non-linearities. The simulation and measurement results show that the settling error and non-linearities are effectively de-coupled and calibrated separately. Inspired by CMOS RF power amplifier calibration techniques, an out-of-band spectrum monitoring technique efficiently calibrates gain errors and non-linearities in the background, and the proposed background calibration methodology calibrates both simultaneously. Table 4.3 compares this work with state-of-the-art ADCs of comparable effective bandwidth. The measurement results show that, after foreground calibration, this ADC achieves an SFDR of 81.5 dB, an SNDR of 71.3 dB, a Schreier's FoM of 170.8 dB, and a Walden's FoM of 18.7 fJ/conv at a 21.4 MHz input while dissipating 56 mW. At the Nyquist frequency, the ADC achieves an SFDR of 76.1 dB, an SNDR of 68.9 dB, a Walden's FoM of 24.6 fJ/conv, and a Schreier's FoM of 168.4 dB.

In summary, a pipelined ADC with comprehensive foreground and background calibrations is verified in silicon. A stable voltage reference underpins the performance of this design. ADC designs are strongly technology dependent. For example, without a triple well in this 40 nm technology, it is difficult to realize high-swing OTAs without reliability issues; thus, to achieve the same SQNR, the input capacitances must be significantly enlarged, which degrades the FoM. The main bottleneck of this design, however, is the speed of the bootstrapped switches, which use long-channel devices and are therefore not fast enough. The high threshold voltage of short-channel devices also leads to certain design difficulties. As seen in the SNDR results, clock jitter (around 150 fs) degrades the SNDR at high input frequencies.

For future work, advanced technology nodes are better candidates for high-speed pipelined ADC design, since the OTAs and switches can be faster. An on-chip low-jitter PLL (<50 fs) is also preferred to maintain the SNDR close to the Nyquist frequency. Advanced packaging options, such as flip-chip with minimized ground bonds, help reduce the voltage reference design effort. Designing a digital engine that supports all the calibration methods is still a challenge and needs careful consideration. Separating the errors and de-coupling them from each other remains crucial to achieving the targeted SNDR performance. The analog and digital calibration techniques should be designed so that they address different errors separately and calibrate them efficiently.
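The preference for a sub-50 fs clock can be illustrated with the textbook jitter-limited SNR bound for a full-scale sine wave, SNR = -20 log10(2π f_in σ_t). The sketch below is a minimal illustration and not the analysis used in this work: it assumes sampling jitter is the only noise source, and the function names are hypothetical.

```python
import math


def jitter_limited_snr_db(f_in_hz: float, jitter_rms_s: float) -> float:
    """Upper bound on SNR (dB) set by rms sampling jitter alone, sine input."""
    return -20.0 * math.log10(2.0 * math.pi * f_in_hz * jitter_rms_s)


def max_jitter_for_snr_s(f_in_hz: float, snr_db: float) -> float:
    """Largest rms jitter (s) for which the jitter-only bound reaches snr_db."""
    return 10.0 ** (-snr_db / 20.0) / (2.0 * math.pi * f_in_hz)


if __name__ == "__main__":
    f_nyq = 491.1e6  # near-Nyquist input frequency reported for this ADC
    bound_db = jitter_limited_snr_db(f_nyq, 50e-15)
    jitter_fs = max_jitter_for_snr_s(f_nyq, 71.3) * 1e15
    print(f"Jitter-only bound with 50 fs rms: {bound_db:.1f} dB")
    print(f"Jitter for a 71.3 dB jitter-only bound: {jitter_fs:.0f} fs")
```

Under these assumptions, a 50 fs clock keeps the jitter-only bound at roughly 76 dB for the 491.1 MHz input, a few dB above the 71.3 dB SNDR measured at low input frequency.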

## 5. SUMMARY AND CONCLUSIONS

The rapid progress of 5G development in both academic research and industrial applications is bringing fast and robust wireless communication systems. High-performance CMOS circuits with reduced power consumption remain the prime focus of numerous research groups. Leading semiconductor companies now use more advanced nodes, such as 14 nm or 7 nm, to realize their high-performance circuits and systems. Every company is pursuing enhanced circuit performance and optimized power consumption in the system on a chip (SoC), and better trade-offs among different aspects are gradually being achieved.

#### 5.1 Conclusions

Three projects are presented in this dissertation: an RF front-end, a fractional-N PLL, and a pipelined data converter. The first project is a 3-6 GHz wideband, highly linear RF front-end with a 200 MHz baseband bandwidth. The prototype was fabricated in TSMC 40 nm technology. The chip area is $0.75 \times 1.64 \text{ mm}^2$. The RF front-end uses an LNTA to generate an RF current that is down-converted by passive mixers. The down-converted current components are filtered and then converted to voltages at the TIA output. The filter includes a current-mode minimally-invasive second-order filter and a capacitor. The measured double-sideband NF is 5.0 to 5.8 dB at 3 MHz offset. The measurement results demonstrate high linearity within the 200 MHz baseband bandwidth. The in-band IIP3 is 15.1 to 16.7 dBm, and the out-of-band IIP3 is 33 dBm when the two-tone spacing is over 500 MHz, which is equivalent to 2.5 times the baseband bandwidth. Meanwhile, the in-band P<sub>1dB</sub> is 3.0 to 3.9 dBm, and the out-of-band B<sub>1dB</sub> is 7.0 dBm at 2.5 times the baseband bandwidth. Since high input power is expected, this structure comes with a reduced conversion gain of 12.5 to 14.5 dB, which is limited by the power supply. The total power consumption ranges from 64.1 to 69.6 mW, and the analog and digital power supplies are 1.8 V and 1.1 V, respectively.

The second project focuses on high-quality clock generation through a high-performance fractional-N PLL. Using the same TSMC 40 nm technology, a 2.3 to 3.9 GHz fractional frequency synthesizer with charge pump and TDC calibrations for reduced reference and fractional spurs was developed and realized. The chip area is $3.0 \times 2.0 \text{ mm}^2$. Based on a charge pump fractional-N PLL, TDCs are used to reduce both reference spurs and fractional spurs. The calibrated TDC helps with charge pump calibration for minimum static phase error, which leads to the minimum reference spur level. The DPP then provides time-domain sampling and filters the out-of-band fractional spurs. The designed loop bandwidth is 250 kHz. The highest measured rms jitter is 255 fs, and the total power consumption is 15.9 mW. The out-of-band phase noise at 3 MHz offset is $-130 \text{ dBc/Hz}$. Using the proposed techniques, a $-108.3 \text{ dBc}$ reference spur is achieved at the lower sideband. Fractional spurs starting from $f_{REF}/32$ are suppressed by the DPP and the analog loop filter simultaneously, and a reduction of over 18 dB is demonstrated. The worst-case in-band fractional spur is $-75.6 \text{ dBc}$. A standard 1.1 V power supply is used.

The third project aims at a high-speed, low-power pipelined data converter. A 14-bit 1 GS/s low-power pipelined ADC is proposed. Fabricated in the same TSMC 40 nm technology, the chip occupies $2.1 \times 2.0 \text{ mm}^2$. A slew-rate-boosted current-reuse telescopic OTA is used as the residue amplifier. After comprehensive analog and digital calibrations, the pipelined ADC achieves an SNDR of 71.3 dB and an SFDR of 81.5 dB with a 21.4 MHz full-scale input. The corresponding Walden's FoM is 18.7 fJ/conv and Schreier's FoM is 170.8 dB. With a 491.1 MHz full-scale input, the SNDR is 68.9 dB and the SFDR is 76.1 dB; the Walden's FoM is 24.6 fJ/conv and Schreier's FoM is 168.4 dB. When the input buffer's power consumption is included, the Schreier's FoM at the Nyquist frequency drops by 1.41 dB.
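The quoted figures of merit can be cross-checked with the commonly used definitions: Walden's FoM = P / (2^ENOB · f_s) with ENOB = (SNDR − 1.76) / 6.02, and Schreier's FoM = SNDR + 10 log10(BW / P). The sketch below assumes a 1 GS/s sampling rate, a 500 MHz Nyquist bandwidth, and the 56 mW power reported for this ADC; the function names are illustrative only.

```python
import math


def enob_bits(sndr_db: float) -> float:
    """Effective number of bits from SNDR (dB)."""
    return (sndr_db - 1.76) / 6.02


def walden_fom_j_per_step(power_w: float, fs_hz: float, sndr_db: float) -> float:
    """Walden's FoM in joules per conversion step."""
    return power_w / (2.0 ** enob_bits(sndr_db) * fs_hz)


def schreier_fom_db(power_w: float, bw_hz: float, sndr_db: float) -> float:
    """Schreier's FoM in dB, using SNDR and the signal bandwidth."""
    return sndr_db + 10.0 * math.log10(bw_hz / power_w)


if __name__ == "__main__":
    power_w = 56e-3      # reported ADC power (input buffer excluded)
    fs_hz = 1e9          # 1 GS/s sampling rate
    bw_hz = fs_hz / 2.0  # assumed Nyquist bandwidth
    for label, sndr_db in (("21.4 MHz", 71.3), ("491.1 MHz", 68.9)):
        walden_fj = walden_fom_j_per_step(power_w, fs_hz, sndr_db) * 1e15
        schreier = schreier_fom_db(power_w, bw_hz, sndr_db)
        print(f"{label}: Walden {walden_fj:.1f} fJ/conv, Schreier {schreier:.1f} dB")
```

With these inputs the script reproduces the reported 18.7 fJ/conv and 170.8 dB at 21.4 MHz, and 24.6 fJ/conv and 168.4 dB at 491.1 MHz.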

#### 5.2 Future work

For future work, there are several considerations. First of all, the circuit design must meet system requirements, and its performance is limited by the available technology nodes. A system-level plan is essential.

For the RF front-end design, system calibration algorithms can be considered in the future to reduce I/Q mismatches over a large bandwidth, such as 400 MHz or 2 GHz. The trade-offs between noise figure, linearity, and conversion gain can be further investigated. A variable-gain system can be developed to accommodate different input signal levels and maximize the SNDR. This will make the system suitable for multiple standards.

For the fractional-N PLL, the VCO's phase noise performance can be improved by better on-chip spiral inductors or other VCO structures. Provided that lower phase noise is attainable, the proposed structure has the potential to achieve less than 100 fs rms jitter. The reference spur reduction technique applies only to analog PLLs, and an all-digital version needs innovative ideas. Fractional spur filtering is achieved by designing the digital DSM and digital filter together. Algorithms can be implemented to further randomize the fixed patterns without introducing excessive noise. Such designs are usually strongly related to the loop dynamics. TDCs with higher resolution also help fractional spur filtering, at the cost of more power. More advanced technology nodes allow faster TDCs and therefore stronger fractional spur filtering.

The focus of power-efficient pipelined data converters is the design of residue amplifiers. This is a system-level design process. For more advanced nodes, SAR ADCs are becoming popular and more scalable than the pipelined structure; however, their error sources are significantly different. The residue amplification in a pipelined ADC alleviates many design challenges. The stability of on-chip references also attracts significant attention, and there is still room for new ideas in designing LDOs for specific ADC structures. The foreground and background calibrations face trade-offs between granularity, efficiency, speed, and other parameters. For all calibrations, there must be direct or indirect indicators of errors. Direct measurement of SNDR is the most straightforward indicator of ADC performance, and weak non-linearities can be calibrated by tuning the digital coefficients. Background calibrations are generally used to track slow PVT variations and tune the coefficients gradually. Still, the background calibrations should be transparent to normal operation. ADC systems become more complex as more coefficients are included. Time-interleaved pipelined or SAR ADCs can achieve a larger bandwidth than a single-channel ADC. Additional innovative ideas are needed for the calibration of timing skews, bandwidth mismatches, gain errors, etc., both in the foreground and in the background.

#### REFERENCES

- [1] J. Jiang, J. Kim, A. I. Karsilayan, and J. Silva-Martinez, "A 3–6-ghz highly linear i-channel receiver with over +3.0-dbm in-band p<sub>1db</sub> and 200-mhz baseband bandwidth suitable for 5g wireless and cognitive radio applications," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 66, no. 8, pp. 3134–3147, 2019.
- [2] 5G Americas, "Spectrum Landscape for Mobile Services." Accessed: Nov. 2017. [online].
   Available: http://www.5gamericas.org/en/resources/white-papers, Nov. 2017.
- [3] S. Haykin, "Cognitive radio: brain-empowered wireless communications," *IEEE Journal on Selected Areas in Communications*, vol. 23, no. 2, pp. 201–220, 2005.
- [4] Microsoft, "Microsoft Spectrum Observatory." Accessed: Nov. 2017. [online]. Available: http://spectrum-observatory.cloudapp.net, Apr. 2014.
- [5] J. Kim and J. Silva-Martinez, "Low-power, low-cost cmos direct-conversion receiver frontend for multistandard applications," *IEEE Journal of Solid-State Circuits*, vol. 48, no. 9, pp. 2090–2103, 2013.
- [6] Z. Ru, N. A. Moseley, E. A. M. Klumperink, and B. Nauta, "Digitally enhanced software-defined radio receiver robust to out-of-band interference," *IEEE Journal of Solid-State Circuits*, vol. 44, no. 12, pp. 3359–3375, 2009.
- [7] D. Murphy, H. Darabi, A. Abidi, A. A. Hafez, A. Mirzaei, M. Mikhemar, and M.-C. F. Chang, "A blocker-tolerant, noise-cancelling receiver suitable for wideband wireless applications," *IEEE Journal of Solid-State Circuits*, vol. 47, no. 12, pp. 2943–2963, 2012.
- [8] D. Murphy, H. Darabi, and H. Xu, "A noise-cancelling receiver resilient to large harmonic blockers," *IEEE Journal of Solid-State Circuits*, vol. 50, no. 6, pp. 1336–1350, 2015.
- [9] Y. Xu, J. Zhu, and P. R. Kinget, "A blocker-tolerant rf front end with harmonic-rejecting n-path filter," *IEEE Journal of Solid-State Circuits*, vol. 53, no. 2, pp. 327–339, 2018.

- [10] J. Zhu, H. Krishnaswamy, and P. R. Kinget, "Field-programmable lnas with interferer-reflecting loop for input linearity enhancement," *IEEE Journal of Solid-State Circuits*, vol. 50, no. 2, pp. 556–572, 2015.
- [11] S. Youssef, R. van der Zee, and B. Nauta, "Active feedback technique for rf channel selection in front-end receivers," *IEEE Journal of Solid-State Circuits*, vol. 47, no. 12, pp. 3130–3144, 2012.
- [12] J. Borremans, G. Mandal, V. Giannini, B. Debaillie, M. Ingels, T. Sano, B. Verbruggen, and J. Craninckx, "A 40 nm cmos 0.4–6 ghz receiver resilient to out-of-band blockers," *IEEE Journal of Solid-State Circuits*, vol. 46, no. 7, pp. 1659–1671, 2011.
- [13] Y.-C. Lien, E. A. M. Klumperink, B. Tenbroek, J. Strange, and B. Nauta, "Enhanced-selectivity high-linearity low-noise mixer-first receiver with complex pole pair due to capacitive positive feedback," *IEEE Journal of Solid-State Circuits*, vol. 53, no. 5, pp. 1348–1360, 2018.
- [14] M. Tohidian, I. Madadi, and R. B. Staszewski, "3.8 a fully integrated highly reconfigurable discrete-time superheterodyne receiver," in 2014 IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC), pp. 1–3, 2014.
- [15] Y. Xu and P. R. Kinget, "A switched-capacitor rf front end with embedded programmable high-order filtering," *IEEE Journal of Solid-State Circuits*, vol. 51, no. 5, pp. 1154–1167, 2016.
- [16] A. Mirzaei, H. Darabi, J. C. Leete, X. Chen, K. Juan, and A. Yazdi, "Analysis and optimization of current-driven passive mixers in narrowband direct-conversion receivers," *IEEE Journal of Solid-State Circuits*, vol. 44, no. 10, pp. 2678–2688, 2009.
- [17] B. Razavi, RF Microelectronics (2nd Edition). USA: Prentice Hall Press, 2nd ed., 2011.
- [18] J.-H. C. Zhan, B. R. Carlton, and S. S. Taylor, "A broadband low-cost direct-conversion receiver front-end in 90 nm cmos," *IEEE Journal of Solid-State Circuits*, vol. 43, no. 5, pp. 1132–1137, 2008.

- [19] P. Rossi, A. Liscidini, M. Brandolini, and F. Svelto, "A variable gain rf front-end, based on a voltage-voltage feedback lna, for multistandard applications," *IEEE Journal of Solid-State Circuits*, vol. 40, no. 3, pp. 690–697, 2005.
- [20] H. Zhang and E. Sánchez-Sinencio, "Linearization techniques for cmos low noise amplifiers: A tutorial," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 58, no. 1, pp. 22–36, 2011.
- [21] W. Sansen, "Distortion in elementary transistor circuits," *IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing*, vol. 46, no. 3, pp. 315–325, 1999.
- [22] X. Li, S. Shekhar, and D. Allstot, "G<sub>m</sub>-boosted common-gate lna and differential colpitts vco/qvco in 0.18-μm cmos," *IEEE Journal of Solid-State Circuits*, vol. 40, no. 12, pp. 2609–2619, 2005.
- [23] I. Fabiano, M. Sosio, A. Liscidini, and R. Castello, "Saw-less analog front-end receivers for tdd and fdd," *IEEE Journal of Solid-State Circuits*, vol. 48, no. 12, pp. 3067–3079, 2013.
- [24] W.-H. Chen, G. Liu, B. Zdravko, and A. M. Niknejad, "A highly linear broadband cmos lna employing noise and distortion cancellation," *IEEE Journal of Solid-State Circuits*, vol. 43, no. 5, pp. 1164–1176, 2008.
- [25] B.-K. Kim, D. Im, J. Choi, and K. Lee, "A highly linear 1 ghz 1.3 db nf cmos low-noise amplifier with complementary transconductance linearization," *IEEE Journal of Solid-State Circuits*, vol. 49, no. 6, pp. 1286–1302, 2014.
- [26] H. M. Geddada, C.-T. Fu, J. Silva-Martinez, and S. S. Taylor, "Wide-band inductorless low-noise transconductance amplifiers with high large-signal linearity," *IEEE Transactions on Microwave Theory and Techniques*, vol. 62, no. 7, pp. 1495–1505, 2014.
- [27] C. Andrews and A. C. Molnar, "A passive mixer-first receiver with digitally controlled and widely tunable rf interface," *IEEE Journal of Solid-State Circuits*, vol. 45, no. 12, pp. 2696–2708, 2010.

- [28] B. Thandri and J. Silva-Martinez, "A robust feedforward compensation scheme for multistage operational transconductance amplifiers with no miller capacitors," *IEEE Journal of Solid-State Circuits*, vol. 38, no. 2, pp. 237–243, 2003.
- [29] R. Chen and H. Hashemi, "19.3 reconfigurable sdr receiver with enhanced front-end frequency selectivity suitable for intra-band and inter-band carrier aggregation," in 2015 IEEE International Solid-State Circuits Conference - (ISSCC) Digest of Technical Papers, pp. 1–3, 2015.
- [30] R. Chen and H. Hashemi, "A 0.5-to-3 ghz software-defined radio receiver using discrete-time rf signal processing," *IEEE Journal of Solid-State Circuits*, vol. 49, no. 5, pp. 1097–1111, 2014.
- [31] A. A. Rafi and T. R. Viswanathan, "Harmonic rejection mixing techniques using clock-gating," *IEEE Journal of Solid-State Circuits*, vol. 48, no. 8, pp. 1862–1874, 2013.
- [32] B. van Liempd, J. Borremans, E. Martens, S. Cha, H. Suys, B. Verbruggen, and J. Craninckx,
  "A 0.9 v 0.4–6 ghz harmonic recombination sdr receiver in 28 nm cmos with hr3/hr5 and
  iip2 calibration," *IEEE Journal of Solid-State Circuits*, vol. 49, no. 8, pp. 1815–1826, 2014.
- [33] S. Hameed and S. Pamarti, "24.6 a time-interleaved filtering-by-aliasing receiver front-end with >70db suppression at <4× bandwidth frequency offset," in 2017 IEEE International Solid-State Circuits Conference (ISSCC), pp. 418–419, 2017.
- [34] M. Mikhemar, D. Murphy, A. Mirzaei, and H. Darabi, "A cancellation technique for reciprocal-mixing caused by phase noise and spurs," *IEEE Journal of Solid-State Circuits*, vol. 48, no. 12, pp. 3080–3089, 2013.
- [35] G. Pini, D. Manstretta, and R. Castello, "Analysis and design of a 260-mhz rf bandwidth +22-dbm oob-iip3 mixer-first receiver with third-order current-mode filtering tia," *IEEE Journal of Solid-State Circuits*, vol. 55, no. 7, pp. 1819–1829, 2020.

- [36] M. A. Montazerolghaem, S. Pires, L. C. de Vreede, and M. Babaie, "6.5 a 3db-nf 160mhz-rf-bw blocker-tolerant receiver with third-order filtering for 5g nr applications," in 2021 IEEE International Solid-State Circuits Conference (ISSCC), vol. 64, pp. 98–100, 2021.
- [37] A. N. Bhat, R. van der Zee, S. Finocchiaro, F. Dantoni, and B. Nauta, "A baseband-matching-resistor noise-canceling receiver architecture to increase in-band linearity achieving 175mhz tia bandwidth with a 3-stage inverter-only opamp," in 2019 IEEE Radio Frequency Integrated Circuits Symposium (RFIC), pp. 155–158, 2019.
- [38] P. K. Sharma and N. Nallam, "Breaking the performance tradeoffs in n-path mixer-first receivers using a second-order baseband noise-canceling tia," *IEEE Journal of Solid-State Circuits*, vol. 55, no. 11, pp. 3009–3023, 2020.
- [39] J. Jiang, T. Yan, D. Zhou, A. I. Karsilayan, and J. Silva-Martinez, "A 2.3-3.9 ghz fractional-n frequency synthesizer with charge pump and tdc calibration for reduced reference and fractional spurs," in 2021 IEEE Radio Frequency Integrated Circuits Symposium (RFIC), pp. 71–74, 2021.
- [40] C.-R. Ho and M. S.-W. Chen, "A fractional-n digital pll with background-dither-noise-cancellation loop achieving <-62.5dbc worst-case near-carrier fractional spurs in 65nm cmos," in 2018 IEEE International Solid-State Circuits Conference - (ISSCC), pp. 394–396, 2018.
- [41] M. P. Kennedy, Y. Donnelly, J. Breslin, S. Tulisi, S. Patil, C. Curtin, S. Brookes, B. Shelly, P. Griffin, and M. Keaveney, "16.9 4.48ghz 0.18 μm sige bicmos exact-frequency fractional-n frequency synthesizer with spurious-tone suppression yielding a -80dbc in-band fractional spur," in 2019 IEEE International Solid-State Circuits Conference - (ISSCC), pp. 272–274, 2019.
- [42] C.-W. Yao, W. F. Loke, R. Ni, Y. Han, H. Li, K. Godbole, Y. Zuo, S. Ko, N.-S. Kim, S. Han, I. Jo, J. Lee, J. Han, D. Kwon, C. Kim, S. Kim, S. W. Son, and T. B. Cho, "24.8 a 14nm fractional-n digital pll with 0.14psrms jitter and -78dbc fractional spur for cellular rfics," in 2017 IEEE International Solid-State Circuits Conference (ISSCC), pp. 422–423, 2017.

- [43] H. Kim, J. Sang, H. Kim, Y. Jo, T. Kim, H. Park, and S. H. Cho, "14.4 a 5ghz –95dbc-reference-spur 9.5mw digital fractional-n pll using reference-multiplied time-to-digital converter and reference-spur cancellation in 65nm cmos," in 2015 IEEE International Solid-State Circuits Conference - (ISSCC) Digest of Technical Papers, pp. 1–3, 2015.
- [44] C.-M. Hsu, M. Z. Straayer, and M. H. Perrott, "A low-noise wide-bw 3.6-ghz digital ΔΣ fractional-n frequency synthesizer with a noise-shaping time-to-digital converter and quantization noise cancellation," *IEEE Journal of Solid-State Circuits*, vol. 43, no. 12, pp. 2776–2786, 2008.
- [45] K. Ogata, Modern Control Engineering. USA: Prentice Hall PTR, 4th ed., 2001.
- [46] F. Gardner, Phaselock Techniques. John Wiley & Sons, Ltd, 2005.
- [47] R. Staszewski, J. Wallberg, S. Rezeq, C.-M. Hung, O. Eliezer, S. Vemulapalli, C. Fernando, K. Maggio, R. Staszewski, N. Barton, M.-C. Lee, P. Cruise, M. Entezari, K. Muhammad, and D. Leipold, "All-digital pll and transmitter for mobile phones," *IEEE Journal of Solid-State Circuits*, vol. 40, no. 12, pp. 2469–2482, 2005.
- [48] X. Gao, E. A. M. Klumperink, M. Bohsali, and B. Nauta, "A low noise sub-sampling pll in which divider noise is eliminated and pd/cp noise is not multiplied by n<sup>2</sup>," *IEEE Journal of Solid-State Circuits*, vol. 44, no. 12, pp. 3253–3263, 2009.
- [49] X. Gao, E. A. M. Klumperink, P. F. J. Geraedts, and B. Nauta, "Jitter analysis and a benchmarking figure-of-merit for phase-locked loops," *IEEE Transactions on Circuits and Systems II: Express Briefs*, vol. 56, no. 2, pp. 117–121, 2009.
- [50] I. Young, J. Greason, and K. Wong, "A pll clock generator with 5 to 110 mhz of lock range for microprocessors," *IEEE Journal of Solid-State Circuits*, vol. 27, no. 11, pp. 1599–1607, 1992.

- [51] A. Abidi, "Phase noise and jitter in cmos ring oscillators," *IEEE Journal of Solid-State Circuits*, vol. 41, no. 8, pp. 1803–1816, 2006.
- [52] J. Lee and H. Wang, "Study of subharmonically injection-locked plls," *IEEE Journal of Solid-State Circuits*, vol. 44, no. 5, pp. 1539–1553, 2009.
- [53] B. Razavi, "A study of injection locking and pulling in oscillators," *IEEE Journal of Solid-State Circuits*, vol. 39, no. 9, pp. 1415–1424, 2004.
- [54] D. Coombs, A. Elkholy, R. K. Nandwana, A. Elmallah, and P. K. Hanumolu, "8.6 a 2.5-to-5.75ghz 5mw 0.3psrms-jitter cascaded ring-based digital injection-locked clock multiplier in 65nm cmos," in 2017 IEEE International Solid-State Circuits Conference (ISSCC), pp. 152–153, 2017.
- [55] T.-H. Lin and W. Kaiser, "A 900-mhz 2.5-ma cmos frequency synthesizer with an automatic sc tuning loop," *IEEE Journal of Solid-State Circuits*, vol. 36, no. 3, pp. 424–431, 2001.
- [56] E. Hegazi, H. Sjoland, and A. Abidi, "A filtering technique to lower lc oscillator phase noise," *IEEE Journal of Solid-State Circuits*, vol. 36, no. 12, pp. 1921–1930, 2001.
- [57] S. Cheng, H. Tong, J. Silva-Martinez, and A. I. Karsilayan, "A fully differential low-power divide-by-8 injection-locked frequency divider up to 18 ghz," *IEEE Journal of Solid-State Circuits*, vol. 42, no. 3, pp. 583–591, 2007.
- [58] J. Lee and B. Razavi, "A 40-ghz frequency divider in 0.18-μm cmos technology," *IEEE Journal of Solid-State Circuits*, vol. 39, no. 4, pp. 594–601, 2004.
- [59] J. Kang, P. Qin, X. Li, and T. Mo, "13 ghz programmable frequency divider in 65 nm cmos," in 2012 IEEE 11th International Conference on Solid-State and Integrated Circuit Technology, pp. 1–3, 2012.
- [60] Z. Xu, J. Lee, and S. Masui, "Self-dithered digital delta-sigma modulators for fractional-n pll," *IEICE Electron. Exp.*, vol. E94-C, no. 6, pp. 1065–1068, 2011.

- [61] Y. Donnelly and M. P. Kennedy, "Prediction of phase noise and spurs in a nonlinear fractional-N frequency synthesizer," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 66, no. 11, pp. 4108–4121, 2019.
- [62] D. Mai and M. P. Kennedy, "A design method for nested mash-sq hybrid divider controllers for fractional-n frequency synthesizers," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 65, no. 10, pp. 3279–3290, 2018.
- [63] H. Mo and M. P. Kennedy, "Masked dithering of mash digital delta-sigma modulators with constant inputs using linear feedback shift registers," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 63, no. 8, pp. 1131–1141, 2016.
- [64] S. Cheng, H. Tong, J. Silva-Martinez, and A. Karsilayan, "Design and analysis of an ultrahigh-speed glitch-free fully differential charge pump with minimum output current variation and accurate matching," *IEEE Transactions on Circuits and Systems II: Express Briefs*, vol. 53, no. 9, pp. 843–847, 2006.
- [65] M. Z. Straayer and M. H. Perrott, "A multi-path gated ring oscillator tdc with first-order noise shaping," *IEEE Journal of Solid-State Circuits*, vol. 44, no. 4, pp. 1089–1098, 2009.
- [66] M. Lee and A. A. Abidi, "A 9 b, 1.25 ps resolution coarse–fine time-to-digital converter in 90 nm cmos that amplifies a time residue," *IEEE Journal of Solid-State Circuits*, vol. 43, no. 4, pp. 769–777, 2008.
- [67] P. Dudek, S. Szczepanski, and J. Hatfield, "A high-resolution cmos time-to-digital converter utilizing a vernier delay line," *IEEE Journal of Solid-State Circuits*, vol. 35, no. 2, pp. 240–247, 2000.
- [68] J. Yu, F. F. Dai, and R. C. Jaeger, "A 12-bit vernier ring time-to-digital converter in 0.13 μm cmos technology," *IEEE Journal of Solid-State Circuits*, vol. 45, no. 4, pp. 830–842, 2010.
- [69] S.-Y. Hung and S. Pamarti, "6.4 a 0.5-to-2.5ghz multi-output fractional frequency synthesizer with 90fs jitter and -106dbc spurious tones based on digital spur cancellation," in 2019 IEEE International Solid-State Circuits Conference - (ISSCC), pp. 262–264, 2019.

- [70] K. J. Wang, A. Swaminathan, and I. Galton, "Spurious tone suppression techniques applied to a wide-bandwidth 2.4 ghz fractional-n pll," *IEEE Journal of Solid-State Circuits*, vol. 43, no. 12, pp. 2787–2797, 2008.
- [71] P. Mishra, A. Tan, B. Helal, C. Ho, C. Loi, J. Riani, J. Sun, K. Mistry, K. Raviprakash, L. Tse, M. Davoodi, M. Takefman, N. Fan, P. Prabha, Q. Liu, Q. Wang, R. Nagulapalli, S. Cyrusian, S. Jantzi, S. Scouten, T. Dusatko, T. Setya, V. Giridharan, V. Gurumoorthy, V. Karam, W. Liew, Y. Liao, and Y. Ou, "8.7 a 112gb/s adc-dsp-based pam-4 transceiver for long-reach applications with >40db channel loss in 7nm finfet," in 2021 IEEE International Solid-State Circuits Conference (ISSCC), vol. 64, pp. 138–140, 2021.
- [72] L. Kull, T. Toifl, M. Schmatz, P. A. Francese, C. Menolfi, M. Brändli, M. Kossel, T. Morf, T. M. Andersen, and Y. Leblebici, "A 3.1 mw 8b 1.2 gs/s single-channel asynchronous sar adc with alternate comparators for enhanced speed in 32 nm digital soi cmos," *IEEE Journal of Solid-State Circuits*, vol. 48, no. 12, pp. 3049–3058, 2013.
- [73] L. Kull, D. Luu, C. Menolfi, M. Braendli, P. A. Francese, T. Morf, M. Kossel, H. Yueksel, A. Cevrero, I. Ozkaya, and T. Toifl, "28.5 a 10b 1.5gs/s pipelined-sar adc with background second-stage common-mode regulation and offset calibration in 14nm cmos finfet," in 2017 IEEE International Solid-State Circuits Conference (ISSCC), pp. 474–475, 2017.
- [74] A. Edward, Q. Liu, C. Briseno-Vidrios, M. Kinyua, E. G. Soenen, A. I. Karşılayan, and J. Silva-Martinez, "A 43-mw mash 2-2 ct ΣΔ modulator attaining 74.4/75.8/76.8 db of sndr/snr/dr and 50 mhz of bw in 40-nm cmos," *IEEE Journal of Solid-State Circuits*, vol. 52, no. 2, pp. 448–459, 2017.
- [75] Q. Liu, A. Edward, D. Zhou, and J. Silva-Martinez, "A continuous-time mash 1-1-1 delta–sigma modulator with fir dac and encoder-embedded loop-unrolling quantizer in 40nm cmos," *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, vol. 26, no. 4, pp. 756–767, 2018.

- [76] C. Briseno-Vidrios, A. Edward, A. Shafik, S. Palermo, and J. Silva-Martinez, "A 75-mhz continuous-time sigma-delta modulator employing a broadband low-power highly efficient common-gate summing stage," *IEEE Journal of Solid-State Circuits*, vol. 52, no. 3, pp. 657–668, 2017.
- [77] Y. Chiu, P. Gray, and B. Nikolic, "A 14-b 12-ms/s cmos pipeline adc with over 100-db sfdr," *IEEE Journal of Solid-State Circuits*, vol. 39, no. 12, pp. 2139–2151, 2004.
- [78] B. Murmann, "ADC Performance Survey." Accessed: Jun. 2021. [online]. Available: http://web.stanford.edu/~murmann/adcsurvey.html, Jun. 2021.
- [79] J. Wu, A. Chou, T. Li, R. Wu, T. Wang, G. Cusmai, S.-T. Lin, C.-H. Yang, G. Unruh, S. R. Dommaraju, M. M. Zhang, P. T. Yang, W.-T. Lin, X. Chen, D. Koh, Q. Dou, H. M. Geddada, J.-J. Hung, M. Brandolini, Y. Shin, H.-S. Huang, C.-Y. Chen, and A. Venes, "27.6 a 4gs/s 13b pipelined adc with capacitor and amplifier sharing in 16nm cmos," in 2016 IEEE International Solid-State Circuits Conference (ISSCC), pp. 466–467, 2016.
- [80] J. Wu, C.-Y. Chen, T. Li, L. He, W. Liu, W.-T. Shih, S. S. Tsai, B. Chen, C.-S. Huang, B. J.-J. Hung, H. T. Hung, S. Jaffe, L. K. Tan, and H. Vu, "A 240-mw 2.1-gs/s 52-db sndr pipeline adc using mdac equalization," *IEEE Journal of Solid-State Circuits*, vol. 48, no. 8, pp. 1818–1828, 2013.
- [81] J. Wu, A. Chou, C.-H. Yang, Y. Ding, Y.-J. Ko, S.-T. Lin, W. Liu, C.-M. Hsiao, M.-H. Hsieh, C.-C. Huang, J.-J. Hung, K. Y. Kim, M. Le, T. Li, W.-T. Shih, A. Shrivastava, Y.-C. Yang, C.-Y. Chen, and H.-S. Huang, "A 5.4gs/s 12b 500mw pipeline adc in 28nm cmos," in 2013 Symposium on VLSI Circuits, pp. C92–C93, 2013.
- [82] A. M. A. Ali, H. Dinc, P. Bhoraskar, C. Dillon, S. Puckett, B. Gray, C. Speir, J. Lanford, J. Brunsilius, P. R. Derounian, B. Jeffries, U. Mehta, M. McShea, and R. Stop, "A 14 bit 1 gs/s rf sampling pipelined adc with background calibration," *IEEE Journal of Solid-State Circuits*, vol. 49, no. 12, pp. 2857–2867, 2014.

- [83] B. Hershberg, D. Dermit, B. van Liempd, E. Martens, N. Markulić, J. Lagos, and J. Craninckx, "A 4-gs/s 10-enob 75-mw ringamp adc in 16-nm cmos with background monitoring of distortion," *IEEE Journal of Solid-State Circuits*, pp. 1–1, 2021.
- [84] W. Jiang, Y. Zhu, M. Zhang, C.-H. Chan, and R. P. Martins, "3.2 a 7.6mw 1gs/s 60db sndr single-channel sar-assisted pipelined adc with temperature-compensated dynamic gmr-based amplifier," in 2019 IEEE International Solid-State Circuits Conference - (ISSCC), pp. 60–62, 2019.
- [85] D. Zhou, C. Briseno-Vidrios, J. Jiang, C. Park, Q. Liu, E. G. Soenen, M. Kinyua, and J. Silva-Martinez, "A 13-bit 260ms/s power-efficient pipeline adc using a current-reuse technique and interstage gain and nonlinearity errors calibration," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 66, no. 9, pp. 3373–3383, 2019.
- [86] C. Briseno-Vidrios, D. Zhou, S. Prakash, Q. Liu, A. Edward, E. G. Soenen, M. Kinyua, and J. Silva-Martinez, "A 44-fj/conversion step 200-ms/s pipeline adc employing current-mode mdacs," *IEEE Journal of Solid-State Circuits*, vol. 53, no. 11, pp. 3280–3292, 2018.
- [87] K. C. Dyer, J. P. Keane, and S. H. Lewis, "Calibration and dynamic matching in data converters: Part 1: Linearity calibration and dynamic-matching techniques," *IEEE Solid-State Circuits Magazine*, vol. 10, no. 2, pp. 46–55, 2018.
- [88] K. C. Dyer, J. P. Keane, and S. H. Lewis, "Calibration and dynamic matching in data converters: Part 2: Time-interleaved analog-to-digital converters and background-calibration challenges," *IEEE Solid-State Circuits Magazine*, vol. 10, no. 3, pp. 61–70, 2018.
- [89] A. M. A. Ali, A. Morgan, C. Dillon, G. Patterson, S. Puckett, P. Bhoraskar, H. Dinc, M. Hensley, R. Stop, S. Bardsley, D. Lattimore, J. Bray, C. Speir, and R. Sneed, "A 16-bit 250-ms/s if sampling pipelined adc with background calibration," *IEEE Journal of Solid-State Circuits*, vol. 45, no. 12, pp. 2602–2612, 2010.
- [90] A. M. A. Ali, H. Dinc, P. Bhoraskar, S. Bardsley, C. Dillon, M. McShea, J. P. Periathambi, and S. Puckett, "A 12-b 18-gs/s rf sampling adc with an integrated wideband track-and-hold amplifier and background calibration," *IEEE Journal of Solid-State Circuits*, vol. 55, no. 12, pp. 3210–3224, 2020.

- [91] H.-S. Lee, D. Hodges, and P. Gray, "A self-calibrating 15 bit cmos a/d converter," *IEEE Journal of Solid-State Circuits*, vol. 19, no. 6, pp. 813–819, 1984.
- [92] J. Li and U.-K. Moon, "Background calibration techniques for multistage pipelined adcs with digital redundancy," *IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing*, vol. 50, no. 9, pp. 531–538, 2003.
- [93] W. Liu, P. Huang, and Y. Chiu, "A 12b 22.5/45ms/s 3.0mw 0.059mm<sup>2</sup> cmos sar adc achieving over 90db sfdr," in *2010 IEEE International Solid-State Circuits Conference (ISSCC)*, pp. 380–381, 2010.
- [94] Y. Chiu, C. Tsang, B. Nikolic, and P. Gray, "Least mean square adaptive digital background calibration of pipelined analog-to-digital converters," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 51, no. 1, pp. 38–46, 2004.
- [95] B. Setterberg, K. Poulton, S. Ray, D. J. Huber, V. Abramzon, G. Steinbach, J. P. Keane, B. Wuppermann, M. Clayson, M. Martin, R. Pasha, E. Peeters, A. Jacobs, F. Demarsin, A. Al-Adnani, and P. Brandt, "A 14b 2.5gs/s 8-way-interleaved pipelined adc with background calibration and digital dynamic linearity correction," in *2013 IEEE International Solid-State Circuits Conference Digest of Technical Papers*, pp. 466–467, 2013.
- [96] U.-K. Moon and B.-S. Song, "Background digital calibration techniques for pipelined adcs," *IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing*, vol. 44, no. 2, pp. 102–109, 1997.
- [97] A. Panigada and I. Galton, "Digital background correction of harmonic distortion in pipelined adcs," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 53, no. 9, pp. 1885–1895, 2006.

- [98] J. Ming and S. Lewis, "An 8b 80msample/s pipelined adc with background calibration," in *2000 IEEE International Solid-State Circuits Conference. Digest of Technical Papers*, pp. 42–43, 2000.
- [99] A. Abo and P. Gray, "A 1.5-v, 10-bit, 14.3-ms/s cmos pipeline analog-to-digital converter," *IEEE Journal of Solid-State Circuits*, vol. 34, no. 5, pp. 599–606, 1999.
- [100] S. Lewis and P. Gray, "A pipelined 5-msample/s 9-bit analog-to-digital converter," *IEEE Journal of Solid-State Circuits*, vol. 22, no. 6, pp. 954–961, 1987.
- [101] B.-S. Song, M. Tompsett, and K. Lakshmikumar, "A 12-bit 1-msample/s capacitor error-averaging pipelined a/d converter," *IEEE Journal of Solid-State Circuits*, vol. 23, no. 6, pp. 1324–1333, 1988.
- [102] X. Wang, P. Hurst, and S. Lewis, "A 12-bit 20-ms/s pipelined adc with nested digital background calibration," in *Proceedings of the IEEE 2003 Custom Integrated Circuits Conference*, pp. 409–412, 2003.
- [103] S.-C. Lee, B. Elies, and Y. Chiu, "An 85db sfdr 67db sndr 8osr 240ms/s sd adc with nonlinear memory error calibration," in *2012 Symposium on VLSI Circuits (VLSIC)*, pp. 164–165, 2012.
- [104] W. Liu, P. Huang, and Y. Chiu, "A 12-bit 50-ms/s 3.3-mw sar adc with background digital calibration," in *Proceedings of the IEEE 2012 Custom Integrated Circuits Conference*, pp. 1–4, 2012.
- [105] H. Wang, X. Wang, P. J. Hurst, and S. H. Lewis, "Nested digital background calibration of a 12-bit pipelined adc without an input sha," *IEEE Journal of Solid-State Circuits*, vol. 44, no. 10, pp. 2780–2789, 2009.
- [106] E. Soenen and R. Geiger, "An architecture and an algorithm for fully digital correction of monolithic pipelined adcs," *IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing*, vol. 42, no. 3, pp. 143–153, 1995.

- [107] M. de Wit, K.-S. Tan, and R. Hester, "A low-power 12-b analog-to-digital converter with on-chip precision trimming," *IEEE Journal of Solid-State Circuits*, vol. 28, no. 4, pp. 455–461, 1993.
- [108] D. Zhou, J. Jiang, Q. Liu, E. G. Soenen, M. Kinyua, and J. Silva-Martinez, "A 245-ma digitally assisted dual-loop low-dropout regulator," *IEEE Journal of Solid-State Circuits*, vol. 55, no. 8, pp. 2140–2150, 2020.