# EFFICIENT AND LINEAR CMOS POWER AMPLIFIER AND FRONT-END DESIGN FOR BROADBAND FULLY-INTEGRATED 28-GHZ 5G PHASED ARRAYS

A Dissertation

by

# SHERIF ABDELHALIM MAHMOUD SHAKIB ROSHDY

Submitted to the Office of Graduate and Professional Studies of Texas A&M University in partial fulfillment of the requirements for the degree of

## DOCTOR OF PHILOSOPHY

| Chair of Committee,    | Kamran Entesari    |
|------------------------|--------------------|
| Co-chair of Committee, | Samuel Palermo     |
| Committee Members,     | Robert Nevels      |
|                        | Mahmoud El-Halwagi |
| Head of Department,    | Miroslav Begovic   |

August 2017

Major Subject: Electrical Engineering

Copyright 2017 Sherif Abdelhalim Mahmoud Shakib Roshdy

# ABSTRACT

Demand for data traffic on mobile networks is growing exponentially with time and on a global scale. The emerging fifth-generation (5G) wireless standard is being developed with millimeter-wave (mm-Wave) links as a key technological enabler to address this growth by a 2020 time frame. The wireless industry is currently racing to deploy mm-Wave mobile services, especially in the 28-GHz band. Previous widely-held perceptions of fundamental propagation limitations were overcome using phased arrays. Equally important for success of 5G is the development of low-power, broadband user equipment (UE) radios in commercial-grade technologies. This dissertation demonstrates design methodologies and circuit techniques to tackle the critical challenge of key phased array front-end circuits in low-cost complementary metal oxide semiconductor (CMOS) technology. Two power amplifier (PA) proof-of-concept prototypes are implemented in deeply scaled 28 nm and 40 nm CMOS processes, demonstrating state-of-the-art linearity and efficiency for extremely broadband communication signals. Subsequently, the 40 nm PA design is successfully embedded into a low-power fully-integrated transmit-receive front-end module.

The 28 nm PA prototype in this dissertation is the first reported linear, bulk CMOS PA targeting low-power 5G mobile UE integrated phased array transceivers. An optimization methodology is presented to maximize power added efficiency (PAE) in the PA output stage at a desired error vector magnitude (EVM) and range to address challenging 5G uplink requirements. Then, a source degeneration inductor in the optimized output stage is shown to further enable its embedding into a two-stage transformer-coupled PA. The inductor broadens the inter-stage impedance matching bandwidth and helps to reduce distortion. Designed and fabricated in 1P7M 28 nm bulk CMOS and using a 1 V supply, the PA achieves +4.2 dBm/9% measured $P_{out}$/PAE at -25 dBc EVM for a 250 MHz-wide, 64-QAM orthogonal frequency division multiplexing (OFDM) signal with 9.6 dB peak-to-average power ratio (PAPR). The PA also achieves 35.5%/10% PAE for continuous wave signals at saturation/9.6 dB back-off from saturation. To the best of the author's knowledge, these are the highest measured PAE values among published *K*- and *Ka*-band CMOS PAs to date.

To drastically extend the communication bandwidth in 28 GHz-band UE devices, and to explore the potential of CMOS technology for more demanding access point (AP) devices, the second PA is demonstrated in a 40 nm process. This design supports a signal radio frequency bandwidth (RFBW) >3× the state-of-the-art without degrading output power (i.e. range), PAE (i.e. battery life), or EVM (i.e. amplifier fidelity). The three-stage PA uses higher-order, dual-resonance transformer matching networks with bandwidths optimized for wideband linearity. Digital gain control of 9 dB range is integrated for phased array operation. The gain control is a needed functionality, but it is largely absent from reported high-performance mm-Wave PAs in the literature. The PA is fabricated in a 1P6M 40 nm CMOS LP technology with 1.1 V supply, and achieves  $P_{out}/PAE$  of +6.7 dBm/11% for an 8×100 MHz carrier aggregation 64-QAM OFDM signal with 9.7 dB PAPR. This PA therefore is the first to demonstrate the viability of CMOS technology to address even the very challenging 5G AP/downlink signal bandwidth requirement.

Finally, leveraging the developed PA design methodologies and circuits, a low-power transmit-receive phased array front-end module is fully integrated in 40 nm technology. In transmit-mode, the front-end maintains the excellent performance of the 40 nm PA, achieving $P_{out}$/PAE of +5.5 dBm/9% for the same $8 \times 100$ MHz carrier aggregation signal above. In receive-mode, a 5.5 dB noise figure (*NF*) and a minimum third-order input intercept point (*IIP*<sub>3</sub>) of -13 dBm are achieved. The performance of the implemented CMOS front-end is comparable to state-of-the-art publications and commercial products that were very recently developed in silicon germanium (SiGe) technologies for 5G communication.
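The logarithmic figures quoted above for the 28 nm PA can be put on a linear scale with a few lines of arithmetic. The following Python snippet is a minimal sketch, not part of the original work: the helper names are hypothetical, and because the PA gain is not stated in this abstract, the DC power is given only as an upper bound obtained from the definition PAE = ($P_{out}$ - $P_{in}$)/$P_{dc}$.

```python
def dbm_to_mw(p_dbm: float) -> float:
    """Convert a power level in dBm to milliwatts."""
    return 10.0 ** (p_dbm / 10.0)


def evm_db_to_percent(evm_db: float) -> float:
    """Convert an rms EVM given in dB (20*log10 of the amplitude ratio) to percent."""
    return 100.0 * 10.0 ** (evm_db / 20.0)


def peak_power_dbm(p_avg_dbm: float, papr_db: float) -> float:
    """Peak envelope power implied by an average power and a PAPR."""
    return p_avg_dbm + papr_db


def dc_power_upper_bound_mw(p_out_dbm: float, pae: float) -> float:
    """Upper bound on DC power drawn by the PA.

    PAE = (P_out - P_in) / P_dc, so P_dc = (P_out - P_in) / PAE <= P_out / PAE.
    The bound is tight only when the gain is high (P_in << P_out);
    the gain itself is an unknown here, not a value taken from the text.
    """
    return dbm_to_mw(p_out_dbm) / pae


if __name__ == "__main__":
    # Values quoted in the abstract for the 28 nm PA with the 64-QAM OFDM signal.
    p_out_dbm, pae, evm_db, papr_db = 4.2, 0.09, -25.0, 9.6
    print(f"Average output power: {dbm_to_mw(p_out_dbm):.2f} mW")
    print(f"EVM: {evm_db_to_percent(evm_db):.2f} % rms")
    print(f"Peak envelope power: {peak_power_dbm(p_out_dbm, papr_db):.1f} dBm")
    print(f"DC power (upper bound): {dc_power_upper_bound_mw(p_out_dbm, pae):.1f} mW")
```

The same helpers apply unchanged to the 40 nm PA and front-end figures by substituting the corresponding output power, PAE, and PAPR values.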

# DEDICATION

To my parents.

# ACKNOWLEDGMENTS

All thanks are foremost due to Allah, the Most Gracious, the Most Merciful, for enabling me to complete my PhD studies.

I express my deepest gratitude to my advisor, Professor Kamran Entesari, for his support throughout my studies, and for his continuous guidance and tireless effort in helping me to reach this achievement.

I also thank Professor Samuel Palermo, for his contributions to my development and learning experiences at Texas A&M both as an instructor and through our fruitful research collaboration. Special thanks are also due to my dissertation committee members, Professor Robert Nevels and Professor Mahmoud El-Halwagi, for their time, feedback, and valuable suggestions.

My life as a PhD student has certainly been enriched by the fellow graduate students whom I was so fortunate to meet and from whom I learned so much. I express my greatest thanks to Mohamed El-Kholy, Osama El-Hadidy, Hatem Osman, Ayman Ameen, Omar El-Sayed, Ahmed Helmy, Ramy Saad, and Amr Abuellil for all that I learned from them and for the wonderful time I had in their company.

The research in this dissertation is a result of the highly valuable support I received from Qualcomm Technologies, Inc., which was made possible by Vladimir Aparin and Jeremy Dunworth, to whom I am indebted. I also thank Hyun-Chul Park and Bon-Hyun Ku for our many technical discussions and for their helpful suggestions. I especially thank my friend and colleague Mohamed Elkholy for his help on our 40-nm millimeter-wave test chip.

Finally, I thank my parents Wafaa and Abdelhalim for their great dedication and sacrifice throughout my life. Nothing I ever say or do can repay their love and support, and their tireless effort and prayers for me to reach my full potential. I also thank my brother Husam for his advice, prayers, and unconditional love.

# CONTRIBUTORS AND FUNDING SOURCES

## Contributors

This work was supported by a dissertation committee consisting of Professor Kamran Entesari, and Professors Samuel Palermo and Robert Nevels of the Department of Electrical and Computer Engineering, as well as Professor Mahmoud El-Halwagi of the Department of Chemical Engineering.

The semi-empirical 5G link budget analysis in Chapter 2 (Section 2.2.1) was originally performed by Dr. Vladimir Aparin of Qualcomm Technologies, Inc. and was revised and re-organized by the student to cite appropriate references and collect the key theoretical equations in a compact and academic format. Dr. Hyun-Chul Park helped with the ground shield and top-level finishing of the physical layout of the 28-nm CMOS power amplifier prototype in Chapter 2. With the help of the student on theoretical and practical mm-wave circuit design practices, as well as with extensive computer-aided design enablement by the student, Dr. Mohamed Elkholy designed the low-noise amplifier and the phase shifter circuit blocks that are described in Sections 4.3.2 and 4.3.3. He also provided suggestions and ran initial circuit simulations for the integration of the front-end module in Section 4.4. Mr. Ozvaldo Alcala, Mr. Andrew Yang, and Mr. David Palmer of Qualcomm Technologies, Inc. provided test equipment support for automating the millimeter-wave single-tone, two-tone, and modulated signal error vector magnitude measurements presented in Chapters 3 and 4, and Mr. Martin Lim of Rohde & Schwarz, Inc. helped with code development for this automated testing.

All other work conducted for the dissertation was completed by the student independently.

## Funding Sources

This work was made possible by Qualcomm Technologies, Inc. in part through a professional internship that was provided to the student, and in part through its subsequent and resulting agreement to collaborate technically and to fund Professor Kamran Entesari's research group under the Texas Engineering Experiment Station contract for the project entitled "CMOS mm-wave transceivers."

Its contents are solely the responsibility of the authors and do not necessarily represent the official views of Qualcomm Technologies, Inc.

# TABLE OF CONTENTS

| Section | Title |
|---------|-------|
|         | ABSTRACT |
|         | DEDICATION |
|         | ACKNOWLEDGMENTS |
|         | CONTRIBUTORS AND FUNDING SOURCES |
|         | TABLE OF CONTENTS |
|         | LIST OF FIGURES |
|         | LIST OF TABLES |
| 1.      | INTRODUCTION AND LITERATURE REVIEW |
| 1.1     | Motivation |
| 2.      | A HIGHLY EFFICIENT AND LINEAR POWER AMPLIFIER FOR 28-GHZ 5G PHASED ARRAY RADIOS IN 28-NM CMOS |
| 2.1     | Introduction |
| 2.3     | Output Stage Optimization Methodology |
| 2.3.1   | Parameterized Output Stage |
| 2.3.2   | Optimization Procedure |
| 2.3.3   | Optimization Results |
| 2.4     | Inter-stage Impedance Matching |
| 2.4.1   | Physical Circuit Operation |
| 2.4.2   | Driver Load Impedance |
| 2.4.3   | Effect of $L_{deg}$ on Power Capability |
| 2.4.4   | Effect of $L_{deg}$ on Gain and Distortion |
| 2.5     | Circuit Implementation |
| 2.5.1   | Core Stages |
| 2.5.2   | Transformers |
| 2.6     | Experimental Results |
| 2.6.1   | Measured Data |
| 2.6.2   | Comparison with State-of-the-art |
| 2.6.3   | Comparison with 5G Requirements |
| 2.6.4   | Discussion |
| 2.7     | Conclusion |
| 3.      | A WIDEBAND LINEAR 28-GHZ POWER AMPLIFIER FOR POWER-EFFICIENT 5G PHASED ARRAYS IN 40-NM CMOS |
| 3.1     | Introduction |
|         | Experimental Results |
| 4.      | A 28-GHZ TRANSMIT-RECEIVE FRONT-END MODULE FOR 5G HANDSET PHASED ARRAYS IN 40-NM CMOS |
| 4.1     | Introduction |
| 4.2     | Transmit-receive Module Considerations |
| 4.2.1   | Link Budget |
| 4.2.2   | Beam Steering |
| 4.3     | Circuit Blocks |
| 4.3.1   | Power Amplifier |
| 4.3.2   | Low-noise Amplifier |
| 4.3.3   | Phase Shifter |
| 4.4     | Module Integration |
| 4.4.1   | Antenna Interface |
| 4.4.2   | Phase Shifter Interface |
| 4.5     | Experimental Results |
| 4.5.1   | Transmit Mode |
| 4.5.2   | Receive Mode |
| 4.5.3   | Performance Comparison |
| 4.6     | Conclusion |
| 5.      | CONCLUSION |
| 5.1     | Future Work |
|         | REFERENCES |

# LIST OF FIGURES

| FIGUR | E                                                                                                                                                                                                                                                                                                                                                                             | Page |
|-------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------|
| 1.1   | Illustration of fundamental single-transistor power amplifier operation                                                                                                                                                                                                                                                                                                       | 3    |
| 1.2   | Illustration of error vector.                                                                                                                                                                                                                                                                                                                                                 | 4    |
| 1.3   | Illustration of adjacent channel leakage.                                                                                                                                                                                                                                                                                                                                     | 5    |
| 1.4   | Illustration of continuous conduction of output signal by pseudo-differential power amplifier action (also known as "push-pull" action).                                                                                                                                                                                                                                      | 6    |
| 1.5   | Graphical/visual definition of sub-threshold (i.e. weak inversion) conduc-<br>tion as the gradual transition from cut-off to saturation.                                                                                                                                                                                                                                      | 7    |
| 2.1   | Illustration of link budget analysis use scenario in comparison of potential carrier frequencies for 5G systems. Reprinted from [1].                                                                                                                                                                                                                                          | 19   |
| 2.2   | Two-dimensional URA of UE or AP antennas for allowable physical size $d_x \times d_y$ of $26 \text{mm} \times 15 \text{mm}$ for UE array, $47 \text{mm} \times 47 \text{mm}$ for AP array: (a) UE or AP at arbitrary $f_c$ , and (b) number of antenna elements in UE (left axis) and AP (right axis) arrays; inset shows example for UE at $f_c$ =30GHz. Reprinted from [1]. | 20   |
| 2.3   | (a) Scatter of best published data and fitted trendlines for PAE $(f_c)$ and $P_{Tx,dc}(f_c)$ , (b) required average $P_{out}$ per element vs. $f_c$ for 64-QAM at different $BW_{sig}$ , and (c) required average $P_{out}$ per element vs. LOS range at 30GHz for QPSK and 64-QAM at different $BW_{sig}$ . Reprinted from [1].                                             | 22   |
| 2.4   | Parameterized output stage circuit for optimization: (a) power cell layout, (b) parameterized output stage circuit. Reprinted from [1].                                                                                                                                                                                                                                       | 25   |
| 2.5   | The two steps in the optimization procedure: (a) Step (1) - load-pull, (b) Step (2) - EVM simulation. Reprinted from [1].                                                                                                                                                                                                                                                     | 27   |
| 2.6   | Output stage transistor design chart for $V_{DD} = 1$ V: PAE (shaded) and average 64-QAM OFDM $P_{out}$ (line) contours plotted at an EVM of $-27$ dBc, i.e. at a 2dB margin from EVM <sub>req</sub> ; design choice indicated by a circle, and inset shows correspondence between $V_{GS} - V_t$ on x-axis and bias current density $J_{PA}$ . Reprinted from [1].           | 28   |

| 2 | 2.7  | Issues in using $m = 12$ , $V_{GS} - V_t = -150$ mV optimization result: (a)<br>Cascaded amplifier frequency response overly sensitive to PVT and mod-<br>eling accuracy, (b) AM-PM conversion and DA stage load modulation due<br>to $C_{gs}$ nonlinearity. Reprinted from [1].                                                                                                                                                                               | 30 |
|---|------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----|
| 2 | 2.8  | Single-ended model of differential-mode inter-stage matching: (a) circuit, (b) 1 <sup>st</sup> simplification (c) simplified model. Reprinted from [1]                                                                                                                                                                                                                                                                                                         | 31 |
| 2 | 2.9  | Smith chart trajectories for inter-stage matching to present $(100 + j0) \Omega$<br>differentially to DA: (a) $L_{deg} = 0$ , (b) $L_{deg}=28$ pH (i.e. 14pH single-<br>ended), and scatter of $Z_E$ due to independent Gaussian variations ( $\pm 3\sigma \equiv \pm 10\%$ ) in $C_{gs}$ and $C_D$ ; -13dB return loss region w.r.t ( $100 + j0$ ) $\Omega$ target<br>indicated with circle: (c) $L_{deg} = 0$ , and (d) $L_{deg}=28$ pH. Reprinted from [1]. | 33 |
| 2 | 2.10 | Micrographs of $12 \times 32 \times 1 \mu m/28$ nm transistor test structures used in<br>power capability verification: (a) $L_{deg} = 0$ , (b) $L_{deg}$ =14pH single-ended.<br>Reprinted from [1]                                                                                                                                                                                                                                                            | 34 |
| 2 | 2.11 | Simulated AM-AM/AM-PM response of the output stage using re-designed<br>lossless input matching for each $L_{deg} \in \{0, 5, 15, 25, 50\}$ ]pH at $W_{nMOS} = 12 \times 32 \times 1 \mu m$ and $J_{PA} = 12 \mu A / \mu m$ : (a) AM-AM response, and (b)<br>AM-PM response. Reprinted from [1].                                                                                                                                                               | 37 |
| 2 | 2.12 | Schematic of PA circuit: (a) two-stage transformer-coupled topology, (b) push-pull stage with capacitive neutralization capacitor $C_n$ and single-ended source degeneration inductor $L_{deg}$ . Reprinted from [1].                                                                                                                                                                                                                                          | 38 |
| 2 | 2.13 | Measured MOM neutralization capacitor characteristics: (a) capacitance<br>and quality factor, (b) shunt-equivalent loss resistance calculated from ca-<br>pacitance and quality factor. Reprinted from [1]                                                                                                                                                                                                                                                     | 41 |
| 2 | 2.14 | Output balun characterization: (a) test structure micrograph, (b) induc-<br>tances and quality factors (c) differential mode maximum available gain<br>(i.e. $\equiv$ transformer efficiency). Reprinted from [1]                                                                                                                                                                                                                                              | 50 |
| 2 | 2.15 | Die micrograph of fabricated two-stage PA. Reprinted from [1]                                                                                                                                                                                                                                                                                                                                                                                                  | 51 |
| 2 | 2.16 | Small- and large-signal CW signal measurement results for $J_{PA} = 12\mu$ A / $\mu$ m, $J_{DA} = 22\mu$ A/ $\mu$ m and 1V supply: (a) s-parameter results (b) best measured CW signal input power sweep at $f_c = 30$ GHz. Reprinted from [1].                                                                                                                                                                                                                | 52 |

| 2.17 | Swept large-CW-signal measurement results summary for $J_{PA} = 12\mu$ A / $\mu$ m, $J_{DA} = 22\mu$ A/ $\mu$ m: (a) key performance metrics over 27–31GHz for 1V supply, and (b) saturated performance metrics vs. supply voltage at 30GHz. Reprinted from [1]                                                                                                                                                                    | 53 |
|------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----|
| 2.18 | EVM measurement setup for 64-QAM OFDM signal. Reprinted from [1].                                                                                                                                                                                                                                                                                                                                                                  | 54 |
| 2.19 | Peak 64-QAM OFDM measured performance: output spectrum, ACPR, and constellation for $J_{PA} = 12\mu A/\mu m$ , $J_{DA} = 22\mu A/\mu m$ and 1V supply at $P_{out} = +4.2$ dBm, $BW_{sig} = 250$ MHz (1.5Gbps), achieving 9% PAE at EVM= -25dBc. Reprinted from [1].                                                                                                                                                                | 54 |
| 2.20 | Swept 64-QAM OFDM signal measurement results summary for $J_{PA} = 12\mu A/\mu m$ , $J_{DA} = 22\mu A/\mu m$ : (a) average $P_{out}$ and corresponding PAE vs. $f_c$ for 1V supply, (b) average $P_{out}$ and corresponding PAE supply voltage at $f_c = 30$ GHz. Reprinted from [1].                                                                                                                                              | 55 |
| 2.21 | Measured AM-AM/AM-PM characteristics of two-stage PA at $J_{DA} = 22\mu A/\mu m$ and corresponding Savitzky-Golay smoothed characteristics for three example $J_{PA}$ values: (a) $J_{PA} = 10.0\mu A/\mu m$ , (b) $J_{PA} = 12.9\mu A/\mu m$ , and (c) $J_{PA} = 16.5\mu A/\mu m$ . Reprinted from [1]                                                                                                                            | 56 |
| 2.22 | Key metrics of measured AM-AM/AM-PM characteristics at 30GHz for 1V supply and $J_{DA} = 22\mu A/\mu m$ before and after Savitzky-Golay smoothing: (a) $P_{1dB}$ , and (b) maximum AM-PM deviation w.r.t. small-signal for $P_{out} \leq P_{1dB}$ . Reprinted from [1].                                                                                                                                                            | 57 |
| 2.23 | EVM vs. average 64-QAM OFDM $P_{out}$ at 30GHz obtained using direct<br>measurement (setup in Fig. 2.18) for $BW_{sig} = \{150, 250\}$ MHz, and using<br>behavioral simulation with measured AM-AM/AM-PM characteristics of<br>the two-stage PA (i.e. $BW_{sig} \rightarrow 0$ ) for different $J_{PA}$ at $J_{DA} = 22\mu$ A/ $\mu$ m;<br>some example AM-AM/AM-PM characteristics are shown in Fig. 2.21.<br>Reprinted from [1]. | 58 |
| 2.24 | Simulated two-tone inter-modulation distortion at $J_{DA} = 22\mu A/\mu m$ , $J_{PA} = 12\mu A/\mu m$ for $\Delta f = \{20, 75, 125, 150250\}$ MHz at the amplifier center frequency: (a) lower $IM_3$ , (b) upper $IM_3$ . Reprinted from [1]                                                                                                                                                                                     | 59 |
| 2.25 | Measured two-tone inter-modulation distortion at $J_{DA} = 21\mu \text{A}/\mu\text{m}$ , $J_{PA} = 23.8\mu\text{A}/\mu\text{m}$ for $\Delta f = 20$ MHz across 27–31GHz center frequency: (a) lower $IM_3$ , (b) upper $IM_3$ , (c) lower $IM_5$ , and (d) upper $IM_5$ . Reprinted from [1].                                                                                                                                      | 60 |
|      |                                                                                                                                                                                                                                                                                                                                                                                                                                    |    |

| 3.1 | Three-stage PA: cascode VGA 1st stage (4×2dB digital gain steps), and capacitively-neutralized common-source 2nd and 3rd stages; power transistor size scaling indicated in units. Reprinted from [2]. | 65 |
|-----|-----|----|
| 3.2 | Optimization of linearity and PAE in stage 3 using spacing $\Delta f_{in}$ of the two resonance frequencies of inter-stage matching network (input of stage 3). Reprinted from [2]. | 66 |
| 3.3 | Die micrograph of 40 nm CMOS PA. Reprinted from [2]. | 67 |
| 3.4 | Measured s-parameters across digital gain states as well as the associated gain/phase errors vs. frequency. Reprinted from [2]. | 67 |
| 3.5 | Measured large CW signal power sweep results over 27–30GHz ( $P_{in,max} = -3.5$ dBm at all frequencies); CW Pout/PAE summaries at key power levels over 26–33GHz. Reprinted from [2]. | 68 |
| 3.6 | EVM measurement setup. (a) Block diagram. (b) Characterization of EVM floor over center frequency for the tested carrier aggregation waveforms. Worst-case EVM floor data measured by connecting SMW200A directly to FSW43 using only a cable ( $\approx$ 2-2.5dB loss at 30GHz); i.e. without CMOS DUT. At each center frequency, $P_{out}$ of SMW200A is increased until EVM is no longer noise-limited; then highest/worst EVM floor across component carriers is recorded. Reprinted from [2]. | 69 |
| 3.7 | EVM/PAE vs. Pout at 27GHz for 64-QAM OFDM with 1,4,8CC and Pout/PAE summaries vs. center frequency for -25dBc EVM; measured spectrum/ACLR for peak 8CC performance at 27GHz: 4.32Gbps, +6.7dBm, 11% PAE. Reprinted from [2]. | 70 |
| 3.8 | Summary of QPSK OFDM carrier aggregation measurements versus carrier frequency for $-16$ dBc EVM on each CC. (a) Average $P_{out}$ . (b) Average PAE. Reprinted from [2]. | 71 |
| 3.9 | Summary of QPSK OFDM carrier aggregation measurements versus carrier frequency for $-25$ dBc EVM on each CC. (a) Average $P_{out}$ . (b) Average PAE. Reprinted from [2]. | 72 |
| 3.10 | Summary of 64-QAM OFDM measurements versus carrier frequency for a single CC having different contiguous RFBW values at $-25$ dBc EVM. (a) Summary of average $P_{out}$ . (b) Summary of PAE. Reprinted from [2]. | 73 |

| 4.1  | Impact of quantization/random phase step errors on array performance. (a) Worst-case EVM degradation as a result of phase quantization error in digital PS. (b) Impact of random per-element phase errors on beam pointing angle accuracy. | 81 |
|------|------|----|
| 4.2  | Effect of passive phase shifter insertion loss on overall transmitter EVM. (a) Simulation scenario illustration. (b) 8-element phased array transmitter EVM versus phase shifter insertion loss. | 82 |
| 4.3  | Effect of passive phase shifter insertion loss on overall receiver noise figure and linearity. (a) Simulation scenario illustration. (b) 8-element phased array receiver $NF$ and $IIP_3$ versus phase shifter insertion loss. | 82 |
| 4.4  | Die micrographs of stand-alone front-end module component test circuits. (a) Power amplifier. (b) Low-noise amplifier. (c) Phase shifter. | 83 |
| 4.5  | Schematic of three-stage power amplifier. (a) Top level block diagram and relative stage scaling. (b) Stage 1: $4 \times 2$ dB cascode VGA. (c) Stages 2 and 3: common source stages with capacitive neutralization. | 85 |
| 4.6  | Schematic of three-stage LNA. (a) Top level block diagram. (b) Stage 1. (c) Stage 2. (d) Stage 3. | 87 |
| 4.7  | Schematic of three-bit phase shifter. (a) Block diagram. (b) 45° Cell. (c) 90° Cell. (d) 180° Cell. | 88 |
| 4.8  | Candidate topologies for antenna matching and transmit-receive switch. (a) Concept of $\lambda/4$ transformer topology used in [3,4]. (b) Transformer-based multiplexer topology of [5]. (c) Transformer-based topology in [6]. (d) Proposed topology. | 89 |
| 4.9  | Illustration of trade-off between Tx-mode bandwidth and Rx-mode $NF$ in design of PA balun for circuit of Fig. 4.8(d). (a) Configuring PA gain devices to replace shunt switch in Rx-mode; $V_{DD}$ center-tap is pulled down to ground to minimize gain device's on-resistance $R_{on,PA}$ . (b) Simplified $\pi$ -equivalent circuit in Rx-mode. (c) Smith chart trajectory of output impedance 'looking back' at antenna from point B in Rx-mode; red arrow indicates effect of tighter magnetic coupling in Tx balun. | 92 |
| 4.10 | Antenna port matching and transmit-receive switch. (a) Schematic. (b) 3D illustration of physical layout.                                                                                                                                                                                                                                                                                                                                                                                                                 | 93 |

| 4.11 | Illustration of signal path in the circuit of Fig. 4.10 for the Tx/Rx modes. (a) Schematic in Tx-mode. (b) 3D structure in Tx-mode. (c) Schematic in Rx-mode. (d) 3D structure in Rx-mode. | 94  |
|------|------|-----|
| 4.12 | Die micrographs of passive test structures for antenna interface. (a) Tx-mode: LNA port and TR switch shorted. (b) Rx-mode: PA port shorted, TR switch gate tied to ground. | 96  |
| 4.13 | Simulated and measured $IL$ and $RL$ of antenna interface passive test structures. (a) Tx-mode. (b) Rx-mode. | 97  |
| 4.14 | Phase shifter port matching and transmit-receive switch. (a) Schematic. (b) 3D illustration of physical layout. | 97  |
| 4.15 | Die micrograph of fully-integrated front-end module                                                                                                                                                                                                  | 98  |
| 4.16 | Measured s-parameters in Tx-mode across $9 \times 1$ dB PA gain steps for different PS phase state pairs. (a) States $\{0, 4\}$ . (b) States $\{1, 5\}$ . (c) States $\{2, 6\}$ . (d) States $\{3, 7\}$                                              | 100 |
| 4.17 | Measured gain step nonlinearity errors in Tx-mode across $9 \times 1$ dB PA gain steps for different PS phase states; r.m.s. error indicated with thick black line in each case. (a) State 0. (b) State 1. (c) State 2. (d) State 3                  | 100 |
| 4.18 | Measured s-parameters in Tx-mode across $7 \times 45^{\circ}$ PS phase steps for different PA gain settings: (a) PA gain state 0 (min. gain). (b) PA gain state 4. (c) PA gain state 9 (max. gain). | 101 |
| 4.19 | Measured errors in $7 \times 45^{\circ}$ PS phase steps in Tx-mode for different PA gain settings; r.m.s. error indicated with thick black line in each case. (a) PA gain state 0 (min. gain). (b) PA gain state 4. (c) PA gain state 9 (max. gain). | 101 |
| 4.20 | Measured CW power sweep results at maximum PA gain setting and PS phase state 0 at different CW frequencies. (a) 27GHz. (b) 28GHz. (c) 29GHz. and (d) 30GHz                                                                                          | 102 |
| 4.21 | Summary of measured CW power sweep results at maximum PA gain setting and PS phase state 0 versus CW frequency at key power back-off levels. (a) $P_{out}$ . (b) PAE                                                                                 | 103 |
| 4.22 | Summary of measured $P_{out}$ and PAE for carrier aggregation scenarios versus center frequency for EVM < -25dBc on each CC for different PS digital states. (a) $P_{out}$ for 1CC. (b) PAE for 1CC. (c) $P_{out}$ for 8CC. (d) PAE for 8CC.         | 104 |

| 4.23 | Measured s-parameters in Rx-mode across $8 \times 1$ dB LNA gain steps for different PS phase state pairs. (a) States $\{0, 4\}$ . (b) States $\{1, 5\}$ . (c) States $\{2, 6\}$ . (d) States $\{3, 7\}$                                                                                                                              | 106 |
|------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----|
| 4.24 | Measured gain step nonlinearity errors in Rx-mode across $8 \times 1$ dB LNA gain steps for different PS phase states; r.m.s. error indicated with thick black line in each case. (a) State 0. (b) State 1. (c) State 2. (d) State 3                                                                                                  | 107 |
| 4.25 | Measured s-parameters in Rx-mode across $7 \times 45^{\circ}$ PS phase steps for different LNA gain settings. (a) LNA gain state 0 (min. gain). (b) LNA gain state 3. (c) LNA gain state 8 (max. gain)                                                                                                                                | 107 |
| 4.26 | Measured errors in $7 \times 45^{\circ}$ PS phase steps in Rx-mode for different LNA gain settings; r.m.s. error indicated with thick black line in each case. (a) LNA gain state 0 (min. gain). (b) LNA gain state 3. (c) LNA gain state 8 (max. gain).                                                                              | 108 |
| 4.27 | Measured Rx-mode noise figure versus frequency across all LNA gain settings at PS phase state 0. | 108 |
| 4.28 | Summary of measured Rx-mode linearity performance versus frequency. (a) CW input $P_{1dB}$ results at minimum and maximum LNA gain settings and PS phase state 0 versus CW frequency. (b) Two-tone $IIP_3$ results at LNA gain settings $\{0, 1, 7, 8\}$ and PS phase state 0 versus center frequency of two-tone signal. | 109 |

# LIST OF TABLES

| TABLE |                                                                                                                                                                                   | Page |
|-------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------|
| 1.1   | Definitions of conventional linear PA modes of operation                                                                                                                          | 3    |
| 1.2   | Summary of recent literature on linear CMOS RF power amplifiers                                                                                                                   | 9    |
| 1.3   | Review of conventional transmit-receive switch literature                                                                                                                         | 12   |
| 1.4   | Summary of passive $LC$ delay cell based phase shifter literature review.                                                                                                         | 13   |
| 2.1   | Expressions and chosen values for various variables used in (2.1), Fig. 2.1, and Fig. 2.2. Reprinted from [1].                                                                    | 21   |
| 2.2   | Summary of targeted PA circuit specifications from [7]. Reprinted from [1].                                                                                                       | 24   |
| 2.3   | Comparison of 29GHz CW load-pull measurement results for single-ended transistor test structures: (a) $L_{deg} = 0$ , and (b) $L_{deg} = 14$ pH single-ended. Reprinted from [1]. | 34   |
| 2.4   | Summary of design values for two-stage power amplifier. Reprinted from [1].                                                                                                       | 39   |
| 2.5   | Comparison with state-of-the-art silicon mm-Wave PAs. Reprinted from [1].                                                                                                         | 46   |
| 3.1   | Comparison with state-of-the-art linear mm-wave silicon PAs for data communication. Reprinted from [2]. | 74   |
| 4.1   | Key specifications for circuit components of UE FEM and the corresponding measured performances achieved by stand-alone test circuits in this work. | 77   |
| 4.2   | Comparison of candidate topologies in Fig. 4.8 for antenna matching and transmit-receive switch. | 90   |
| 4.3   | Comparison with state-of-the-art front-ends for 28 GHz 5G communications. | 110  |

# 1. INTRODUCTION AND LITERATURE REVIEW

#### 1.1 Motivation

The emerging fifth-generation (5G) wireless standard is expected to bring an unprecedented increase in data rates, enabling new consumer and business applications that rely on wireless technology. Examples are virtual reality (VR) and augmented reality (AR), with their highly diverse set of sub-applications. The key technology behind the sought improvement in data rate is highly integrated millimeter-wave (mm-Wave) phased arrays for communication on both the air interface accessed by consumers through their user equipment (UE) devices, and on the back-haul side where access point (AP) devices first off-load data to be routed towards the core of the mobile network.

A fundamental challenge to the 5G vision is the design of highly power-efficient UE phased arrays that can cope with extremely broadband communication signals. Due to market forces, the UE devices must additionally be implemented in low-cost consumer-grade technologies, which makes their design even more challenging. Traditionally, mm-Wave phased array radios have been restricted to military applications, e.g. airborne radar units, and have been implemented in exotic compound semiconductor technologies. In such military applications, the key drivers are performance and reliability, while cost is less important. However, migration of military phased array development to silicon-germanium (SiGe) technologies to reduce their cost and increase their integration level is an emerging trend. While the more traditional microwave community insists that only compound semiconductors can meet the requirements of mm-Wave phased arrays, this migration trend is behind several forward-looking industrial developments of SiGe solutions for 5G communication that aim to displace their III-V semiconductor competition. The motivation of this research is to investigate the possibility of displacing both compound semiconductor and SiGe solutions by using the even cheaper CMOS technology for UE devices. The most challenging circuits to demonstrate are the radio front-end components: power amplifier (PA), low-noise amplifier (LNA), phase shifter (PS) and time-division duplexing (TDD) transmit-receive (TR) antenna switch. Therefore, this research focuses on power- and area-efficient CMOS implementations of these critical circuits and their integration into a phased array front-end module.

#### 1.2 Power Amplifier

This section discusses the fundamental operation of a power amplifier (PA), explains the concept of load-line matching and how it differs from conjugate matching, then presents a review of linear radio frequency (RF) PA literature, followed by the challenges associated with the design of CMOS PAs at very high millimeter wave (mm-Wave) frequencies.

#### 1.2.1 Classes of Operation

Figure 1.1 shows a single-transistor PA, operating in a linear mode, i.e. such that the output signal ideally contains a significant component that is a linearly amplified version of the input signal. The proportion of the RF signal cycle over which the transistor conducts an output current $i_D$ is expressed as an angle $\varphi$. The angle $\varphi$ depends on whether the total gate-to-source voltage $V_{GS}$ of the transistor exceeds its threshold voltage, and is controlled by the value of the direct current (d.c.) bias voltage $V_{dc}$ in Fig. 1.1. Table 1.1 shows the definitions of the conventional linear modes of operation for PA design according to the value of $\varphi$ in degrees.
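The relationship between bias voltage and conduction angle can be sketched numerically. The following is a minimal illustration only, assuming a sinusoidal gate drive $v_{GS}(\theta) = V_{dc} + A\cos\theta$ and an idealized device that conducts only while $v_{GS}$ exceeds its threshold. The function names, the amplitude `v_amp`, and all numeric values are hypothetical and are not taken from this work.

```python
import math


def conduction_angle_deg(v_dc, v_amp, v_th):
    """Conduction angle in degrees for v_GS(theta) = v_dc + v_amp*cos(theta).

    Assumes an idealized device that conducts only while v_GS > v_th.
    """
    if v_amp <= 0:
        raise ValueError("v_amp must be positive")
    x = (v_th - v_dc) / v_amp  # the device conducts while cos(theta) > x
    if x <= -1.0:
        return 360.0  # v_GS never drops below threshold
    if x >= 1.0:
        return 0.0  # v_GS never rises above threshold
    return math.degrees(2.0 * math.acos(x))


def pa_class(phi_deg, tol=1e-6):
    """Map a conduction angle to the conventional class labels A, AB, B, C."""
    if abs(phi_deg - 360.0) <= tol:
        return "A"
    if abs(phi_deg - 180.0) <= tol:
        return "B"
    return "AB" if phi_deg > 180.0 else "C"


if __name__ == "__main__":
    v_th, v_amp = 0.3, 0.2  # illustrative values only
    for v_dc in (0.6, 0.4, 0.3, 0.2):
        phi = conduction_angle_deg(v_dc, v_amp, v_th)
        print(f"V_dc = {v_dc:.1f} V: phi = {phi:.1f} deg, class {pa_class(phi)}")
```

Under these assumptions, lowering $V_{dc}$ from well above threshold to below it moves the stage from class A through class AB and class B into class C.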

Figure 1.1: Illustration of fundamental single-transistor power amplifier operation.

| Class of Operation | Conduction Angle (Degrees) |
|--------------------|----------------------------|
| A                  | $\varphi = 360$            |
| AB                 | $180 < \varphi < 360$      |
| B                  | $\varphi = 180$            |
| C                  | $\varphi < 180$            |

Table 1.1: Definitions of conventional linear PA modes of operation.

#### 1.2.2 Load Line Matching

The concept of a load-line match is important in the design of PAs at any frequency of operation. It marks a clear deviation of large-signal circuit design from the simpler theories used to treat small-signal circuits. The key point to understand is that a real transistor can only provide a finite output signal power. The upper bound on this power is the product of the maximum voltage swing across the transistor's drain–source terminals in Fig. 1.1 and its maximum drain current. The device simply cannot produce any more output power than this product. The load-line match is the procedure by which the load impedance of the transistor is synthesized to allow it to approach this upper bound on $P_{out}$. Since the load impedance is by definition the ratio of output voltage to output current, the optimum load resistance (also known as the "load-line resistance") is therefore defined as [8]:

$$\text{Optimum resistance} = \frac{\text{Maximum drain-to-source voltage swing}}{\text{Maximum drain current}} \tag{1.1}$$
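As a rough numerical illustration of (1.1), the sketch below evaluates the load-line resistance and shows why a load on either side of it leaves part of the device capability unused. The helper names and the voltage and current limits are hypothetical, and the load is assumed to be purely resistive.

```python
def load_line_resistance(v_swing_max, i_max):
    """Optimum (load-line) resistance per (1.1): maximum swing over maximum current."""
    if v_swing_max <= 0 or i_max <= 0:
        raise ValueError("both limits must be positive")
    return v_swing_max / i_max


def reachable_swings(r_load, v_swing_max, i_max):
    """Voltage and current swings reached with a resistive load r_load.

    Below the load-line resistance the device runs out of current first;
    above it, the device runs out of voltage first.
    """
    r_opt = load_line_resistance(v_swing_max, i_max)
    if r_load <= r_opt:
        return i_max * r_load, i_max  # current-limited
    return v_swing_max, v_swing_max / r_load  # voltage-limited


if __name__ == "__main__":
    v_max, i_max = 1.0, 0.05  # illustrative limits, not taken from this work
    print("R_opt =", load_line_resistance(v_max, i_max), "ohm")
    for r in (10.0, 20.0, 40.0):
        v, i = reachable_swings(r, v_max, i_max)
        print(f"R_L = {r:4.0f} ohm: swing product = {v * i:.4f} (bound {v_max * i_max:.4f})")
```

Only at the load-line resistance do the voltage and current swings reach their limits simultaneously, which is the sense in which this match differs from a small-signal conjugate match.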

#### 1.2.3 PA Linearity Metrics for Modulated Signals

The two key metrics for evaluating the linearity (i.e. fidelity) of an RF PA as it amplifies a complex modulated signal are the error vector magnitude (EVM) and the adjacent channel leakage ratio (ACLR).

#### 1.2.3.1 Error Vector Magnitude

EVM is essentially the inverse of the signal-to-noise ratio (SNR), but measured at the output of the digital receiver. Figure 1.2 shows the ideal constellation points for a quadrature phase shift keying (QPSK) signal (blue circles), as well as a received symbol with some deviation or error vector relative to the ideal point. The root mean square (r.m.s.) value of the error vector shown in Fig. 1.2 taken over an ensemble of received symbols is the EVM of the signal. For PA design, a fictitious, ideal RF down-converter followed by an ideal digital receiver is appended at the output of the RF PA in simulations to compute the EVM, which then characterizes the linearity of the PA as a stand-alone circuit without a real receiver. EVM is the metric used to indicate in-channel signal *quality*.



Figure 1.2: Illustration of error vector.
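The r.m.s. error-vector computation described above can be sketched as follows. This is a minimal example that assumes the error power is normalized to the average power of the ideal constellation points (one common convention; the normalization used for the measurements in this work is not restated here). The function names and test symbols are hypothetical.

```python
import cmath
import math


def evm_rms(received, ideal):
    """R.m.s. error vector magnitude, normalized to the r.m.s. ideal symbol magnitude."""
    if len(received) != len(ideal) or not ideal:
        raise ValueError("need two equal-length, non-empty symbol lists")
    err_power = sum(abs(r - s) ** 2 for r, s in zip(received, ideal))
    ref_power = sum(abs(s) ** 2 for s in ideal)
    return math.sqrt(err_power / ref_power)


def evm_db(received, ideal):
    """EVM expressed in dB relative to the reference symbol power."""
    ratio = evm_rms(received, ideal)
    return -math.inf if ratio == 0 else 20.0 * math.log10(ratio)


if __name__ == "__main__":
    a = 1.0 / math.sqrt(2.0)
    ideal = [complex(a, a), complex(-a, a), complex(-a, -a), complex(a, -a)]  # unit-power QPSK
    err_mag = 10 ** (-25 / 20)  # error vector length chosen to give -25 dB
    received = [s + err_mag * cmath.exp(1j * k) for k, s in enumerate(ideal)]
    print(f"EVM = {evm_db(received, ideal):.2f} dB")
```

Because every error vector in this synthetic example has the same length, the result depends only on that length and not on the error direction.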

#### 1.2.3.2 Adjacent Channel Leakage Ratio

The ACLR is the ratio of transmit power 'leaked' into the adjacent channel relative to the power within the desired channel. Figure 1.3 illustrates the computation of ACLR for both the first adjacent upper (i.e. higher frequency) channel,  $ACLR_{U,1}$  and the first adjacent lower (i.e. lower frequency) channel,  $ACLR_{L,1}$ . The leaked power density is integrated over the same bandwidth as the desired RF signal (RFBW in Fig. 1.3), but centered at the first adjacent channel. Notice that there is typically a non-zero guard band region that separates channels in wireless communication standards, which is indicated by the small spacing between the integration regions highlighted in yellow.



Figure 1.3: Illustration of adjacent channel leakage.
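The ACLR computation described above can be sketched as a band-power ratio. The example assumes a uniformly sampled power spectral density and a center-to-center channel spacing equal to the RF bandwidth plus the guard band. The function names and the synthetic spectrum are hypothetical.

```python
import math


def band_power(freqs, psd, f_lo, f_hi):
    """Integrate a uniformly sampled PSD over the bins with f_lo <= f < f_hi."""
    df = freqs[1] - freqs[0]
    return sum(p * df for f, p in zip(freqs, psd) if f_lo <= f < f_hi)


def aclr_db(freqs, psd, f_center, rfbw, guard):
    """Return (ACLR_L1, ACLR_U1) in dB relative to the desired-channel power."""
    spacing = rfbw + guard  # assumed center-to-center channel spacing
    half = rfbw / 2.0
    p_main = band_power(freqs, psd, f_center - half, f_center + half)
    p_low = band_power(freqs, psd, f_center - spacing - half, f_center - spacing + half)
    p_up = band_power(freqs, psd, f_center + spacing - half, f_center + spacing + half)
    return 10.0 * math.log10(p_low / p_main), 10.0 * math.log10(p_up / p_main)


if __name__ == "__main__":
    freqs = list(range(-200, 200))  # frequency offsets in MHz, 1 MHz bins
    # Synthetic spectrum: flat desired channel with a flat leakage floor 30 dB below it.
    psd = [1.0 if -50 <= f < 50 else 1e-3 for f in freqs]
    lower, upper = aclr_db(freqs, psd, f_center=0, rfbw=100, guard=10)
    print(f"ACLR_L1 = {lower:.1f} dB, ACLR_U1 = {upper:.1f} dB")
```

Both adjacent channels are integrated over the same bandwidth as the desired channel, so a flat leakage floor maps directly to the ACLR value.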

#### 1.2.4 Linearity Benefits of Differential Operation

Sub-threshold biasing is classically associated with class-C PAs, which are highly efficient but unfortunately very nonlinear. A combination of three factors allows a subthreshold-biased PA to operate linearly in this work (in order of significance): differential operation, gradual transistor d.c. cut-off, and limited transit frequency. This section discusses how these factors interact to yield favorable results in later chapters.

#### 1.2.4.1 Differential operation

The PAs in this work use a pseudo-differential topology (also known as a push-pull topology; see e.g. Section 10.1 of [8]). Effectively, signal rectification does not take place, as the differential pair allows the signal to be continuously trans-conducted from the input to output, even if one of the two differential arms is in cut-off during half of the RF cycle (for class-B bias). However, as shown by the conceptual waveforms in Fig. 1.4 for ideal transconducting devices, CW input, and sub-threshold biasing, the output voltage across the load resistor has 'kinks' that are responsible for the observed/expected distortion. If these kinks are smoothed out, the amplifier nonlinearity is reduced.



Figure 1.4: Illustration of continuous conduction of output signal by pseudo-differential power amplifier action (also known as "push-pull" action).
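The push-pull action and the origin of the kinks can be reproduced with a small sketch. It assumes idealized piecewise-linear transconductors with abrupt cut-off, driven by anti-phase inputs $\pm v_{in}$ around a common bias. The function names and voltages are hypothetical.

```python
def ideal_device_current(v_gs, v_th, gm):
    """Idealized transconductor with abrupt cut-off: zero current below v_th."""
    return gm * max(0.0, v_gs - v_th)


def push_pull_output(v_in, v_dc, v_th, gm=1.0):
    """Differential output current of a pseudo-differential pair driven by +/- v_in."""
    i_p = ideal_device_current(v_dc + v_in, v_th, gm)
    i_n = ideal_device_current(v_dc - v_in, v_th, gm)
    return i_p - i_n


if __name__ == "__main__":
    v_th = 0.4  # illustrative threshold voltage
    inputs = (-0.2, -0.05, 0.0, 0.05, 0.2)
    for label, v_dc in (("class-B bias", 0.4), ("sub-threshold bias", 0.3)):
        outputs = [push_pull_output(v, v_dc, v_th) for v in inputs]
        print(label, [round(i, 3) for i in outputs])
```

With the bias exactly at threshold, the two half-wave arms combine into a linear transfer characteristic. With the bias below threshold, a dead zone appears around the zero crossing, and its corners are the kinks referred to above for the idealized abrupt-cut-off case.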

#### 1.2.4.2 Gradual d.c. cutoff

Note that, in this work, the term "sub-threshold biasing" is not used in its classic sense from microwave PA design literature, which typically assumes ideal/abrupt cut-off in the transistors. The difference between ideal/abrupt cut-off and real/gradual cut-off is illustrated conceptually in Fig. 1.5, which shows a typical MOSFET d.c. $I_D$-$V_{GS}$ characteristic in the weak and moderate inversion regions near $V_{\text{threshold}}$ (see [9]):



Figure 1.5: Graphical/visual definition of sub-threshold (i.e. weak inversion) conduction as the gradual transition from cut-off to saturation.

Idealized/abrupt cut-off occurs exactly at the threshold voltage, while real/gradual cutoff exhibits sub-threshold conduction over an extended d.c.  $V_{GS}$  range near the threshold voltage as shown. Also, throughout the manuscript, the intrinsic device transconductance refers to the general definition:  $g_m \triangleq \partial I_D / \partial V_{GS}$ , which is non-zero in the sub-threshold region as shown by the non-zero slope of  $I_D$  in the above conceptual diagram.
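The contrast between abrupt and gradual cut-off can be illustrated numerically. The sketch below is not the device model used in this work: it assumes a square-law device for the abrupt case and an EKV-style smooth interpolation for the gradual case, with hypothetical values for the threshold voltage, the current factor `k`, the slope factor `n`, and the thermal voltage `u_t`. The transconductance is evaluated directly from the definition $g_m \triangleq \partial I_D / \partial V_{GS}$ using a central difference.

```python
import math


def id_abrupt(v_gs, v_th=0.4, k=1e-3):
    """Square-law drain current with ideal/abrupt cut-off at v_th."""
    return k * max(0.0, v_gs - v_th) ** 2


def id_gradual(v_gs, v_th=0.4, k=1e-3, n=1.3, u_t=0.0259):
    """EKV-style interpolation (illustrative assumption).

    Exponential below threshold, approaching the square law well above it.
    """
    scale = 2.0 * n * u_t
    softplus = math.log1p(math.exp((v_gs - v_th) / scale))
    return k * (scale * softplus) ** 2


def gm(i_d, v_gs, h=1e-6):
    """Transconductance dI_D/dV_GS from a central finite difference."""
    return (i_d(v_gs + h) - i_d(v_gs - h)) / (2.0 * h)


if __name__ == "__main__":
    for v_gs in (0.3, 0.4, 0.5, 0.9):
        print(f"V_GS = {v_gs:.1f} V: gm abrupt = {gm(id_abrupt, v_gs):.3e} S, "
              f"gm gradual = {gm(id_gradual, v_gs):.3e} S")
```

In this sketch the abrupt model has zero transconductance below threshold, whereas the gradual model retains a non-zero slope there, which is the property the text relies on.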

#### 1.2.4.3 Finite Transit Frequency

The 'switching' speed of any MOSFET is a function of its transit frequency $f_T$, and $f_T$ is itself a function of the bias point [10]. At mm-Wave frequencies, $f_T$ of even a hypothetical deep-submicron MOSFET having ideal/abrupt d.c. cut-off cannot be large enough for the transistor to reach complete cut-off instantaneously when the input voltage swings below the ideal/abrupt threshold voltage. Such an instantaneous jump to zero drain current would require the device to have appreciable transconductance at the harmonic frequencies of the waveform, which is not the case due to limited $f_T$. This argument applies even more strongly to sub-threshold biasing, where $f_T$ is reduced further.

The smoothing/filtering of unwanted distortion near the signal zero crossings in a push-pull stage depends strongly on the bias point $V_{GS}$, through the combined action of gradual d.c. cut-off near the threshold on one hand, and the limited/controlled device $f_T$ that lowers harmonic content on the other hand. Therefore, the joint optimization of device width and bias point in Chapter 2 is responsible for the achieved class-AB back-off linearity performance, and, based on the reported laboratory measurements, clearly does not yield class-C characteristics.

#### 1.2.5 Linear RF CMOS PA Literature

Table 1.2 shows a summary of recent literature on linear CMOS power amplifiers at RF/mm-Wave frequencies. Since modulated signal performance is complex/expensive to measure, it is not uncommon for PA publications to use continuous wave (CW) signal performance metrics such as 1 dB gain compression point or peak amplitude-to-phase (AM-PM) modulation conversion as a proxy. The cited references in Table 1.2 are chosen either because they did report modulated data, or for being recent and therefore relevant. This review is further augmented by the back-off PAE trend collection and commentary in Chapter 2. It is important to point out that linear operation refers to inherent circuit-level linearity in this dissertation, i.e. without added measures such as digital pre-distortion (DPD). Note, for example, that earlier works in deeply scaled CMOS technology did not achieve comparable performance to that reported in later works including Chapters 2 and 3 of this work. This is the case even for works that use DPD, e.g. Cohen'09 in Table 1.2, [11]. Also note that [12] is a good example of the state-of-the-art in current fourth-generation (4G) long-term evolution (LTE) UE PAs; peak throughput supported is $3 \times 20$ MHz carrier aggregation. This is a testament to the immense potential for improvement using mm-Wave technologies in 5G systems.

| Ref.        | Tech./ $f_c$/ VDD (nm/ GHz/ V) | Psat/ PAEmax (dBm/%) | P1dB/ PAE1dB (dBm/%) | Linear Pout/ PAE (dBm/%) | Signal/ EVM (none/ dBc)               | Gss/ 3dB BW (dB/ GHz) |
|-------------|--------------------------------|----------------------|----------------------|--------------------------|----------------------------------------|-----------------------|
| Elmala'06   | 90/3.65/1.55                   | 28.9/39 (VDD=1.85)   | 25.4/32              | 19.4/ 12.5*              | 64-QAM 54Mbps/ -26.2                   | 27/ NA                |
| Chowd'09    | 90/2.4/3.3                     | 30.1/33              | 27.7/ ≈27            | 22.7/ 12.4               | 64-QAM 10MHz/ -25.3                    | 28/ 0.7               |
| Cohen'09    | 45/60/1.2                      | 6.1/16               | NA                   | -2/ 6.1**                | Hi-constellation OFDM 40MHz ***/ -28   | 13/ NA                |
| Zhao'13     | 40/60/1.0                      | 17/30.3              | 13.8/ 21.6           | NA/ NA                   | NA                                     | 17/ 5.5               |
| Thyag'14    | 28/60/2.1                      | 16.5/12.6            | 11.7/ 6.3            | NA/ NA                   | NA                                     | 24.4/ 11              |
| Kulkarni'14 | 40/60/1.8                      | 16.4/23              | 13.9/ 18.9           | 7.0/ ≈5                  | 64-QAM 3Gbps/ -25.2                    | 22.4/ NA              |
| Zhao'15     | 40/70-85/0.9                   | 20.9/22              | 17.8/ 12             | 11.9/ ≈3                 | 64-QAM 3Gbps/ -24.7                    | 18/ 15                |
| Wanxin'15   | 65/2-6/1.8                     | 22.4/28.4            | "17.8 to 20.7"/ NA   | "10.66 to 12.8"/ NA      | 64-QAM 40MHz/ -28                      | 23.6/ 5               |
| Larie'15    | 28SOI/60/1.0                   | 18.8/21              | 18.2/21              | 10/ 8                    | CW signal is used                      | 15.4/ 8               |
| Francois'15 | 180SOI/2.5/2.5                 | 28.1/47              | 27.4/                | 22.4/21.7                | 64-QAM 20MHz/ -28                      | 11/ NA                |

Table 1.2: Summary of recent literature on linear CMOS RF power amplifiers.

References: Elmala'06 [13], Chowd'09 [14], Cohen'09 [11], Zhao'13 [15], Thyag'14 [16], Kulkarni'14 [17], Zhao'15 [18], Wanxin'15 [19], Larie'15 [20], Francois'15 [12]

\*Driver amplifier consuming 150 mW not accounted for in reported PAE. \*\*Digital pre-distortion is used to achieve reported linearity. EVM degrades by  $\approx$  7 dB. \*\*\*Constellation order not given.

#### 1.2.6 mm-Wave CMOS PA Design Challenges

Besides the typical challenges of any high-frequency circuit design, limitations related to the silicon material system are a key challenge for RF/mm-Wave power generation in CMOS technologies, particularly in comparison to the conventionally used III-V compound semiconductor technologies, as explained below.

- In silicon, the critical (i.e. breakdown) electric field is 2.5×10<sup>5</sup> V/cm, which is lower than in III-V compounds. For example, it is 3×10<sup>5</sup> V/cm in GaAs and 30×10<sup>5</sup> V/cm in GaN [21].
- Driven by market economics, and fundamentally limited by the low silicon breakdown field, technology scaling forces the supply voltage  $V_{DD}$  to be lowered in CMOS. This reduces the attainable output power of a CMOS PA operating at any frequency.
- Lowering  $V_{DD}$  increases the *relative* size of the 'knee' voltage  $V_{d,sat}$ , which in turn degrades PAE.
- Threshold voltage $V_t$ is not scaled down in proportion to $V_{DD}$, forcing subthreshold operation. MOS device input capacitance is increasingly nonlinear below $V_t$ [9].
- A high substrate conductivity is used in CMOS to avoid latch-up problems. This causes large insertion losses in passive elements, especially inductors and transformers.

#### **1.3 Transmit-Receive Switch**

#### 1.3.1 MOS Switch Parasitics

- Series switch (ON-state): $R_{on}$ determines the low-frequency insertion loss (it forms a potential divider with the load). Reducing $R_{on}$ of a series switch requires a larger $W_{switch}$, but the resulting larger capacitance leads to signal feed-through (degrades isolation).
- Both series and shunt topologies (ON-state): junction capacitances couple the signal to the bulk resistance $R_B$ at high frequencies, further increasing insertion loss. For a given switch, there exists a value of $R_B$ for which the power coupled to the substrate is at a maximum. Hence, very small or very large $R_B$ are both viable options. Without triple-well devices, a high $R_B$ can lead to latch-up. One published solution is to use a parallel *LC*-tank to float the bulk at a desired RF frequency.
- Shunt switch on Tx-side (OFF-state): typically experiences a large voltage swing due to the PA output. A large gate resistor $R_G$ (for D.C. path isolation) effectively floats the gate node. Equal $C_{gd}$ and $C_{gs}$ (overlap components) force the gate voltage to approximately half the drain voltage, which leads to self-biasing ('boot-strapping'). The OFF-state resistance $R_{OFF}$ thus drops with increasing Tx signal strength, resulting in nonlinearity and gain compression.
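The first two parasitic effects listed above can be illustrated with a short numerical sketch. It is a simplified, textbook-level model and not the analysis used in this work: the series switch is treated as a pure resistance between a matched source and load, and substrate coupling is treated as a single junction capacitance in series with $R_B$ driven by a fixed voltage. The function names, the 50 Ω reference impedance, and the 20 fF capacitance are assumptions chosen only for illustration.

```python
import math


def series_switch_il_db(r_on, z0=50.0):
    """Low-frequency insertion loss (dB) of a series resistance r_on
    placed between a matched source and load of impedance z0."""
    return 20.0 * math.log10(1.0 + r_on / (2.0 * z0))


def substrate_power(r_b, c_j, freq, v_rms=1.0):
    """Power (W) dissipated in r_b when v_rms is applied across a
    series branch made of junction capacitance c_j and resistance r_b."""
    x_c = 1.0 / (2.0 * math.pi * freq * c_j)
    return v_rms ** 2 * r_b / (r_b ** 2 + x_c ** 2)


if __name__ == "__main__":
    # Insertion loss grows with on-resistance.
    for r_on in (2.0, 5.0, 10.0):
        print(f"R_on = {r_on:4.1f} ohm -> IL = {series_switch_il_db(r_on):.2f} dB")

    # Hypothetical junction capacitance and frequency (illustration only).
    c_j, freq = 20e-15, 28e9
    x_c = 1.0 / (2.0 * math.pi * freq * c_j)
    # Substrate loss peaks when R_B equals the capacitive reactance,
    # so both much smaller and much larger R_B reduce it.
    for scale in (0.1, 1.0, 10.0):
        p = substrate_power(scale * x_c, c_j, freq)
        print(f"R_B = {scale:4.1f} x Xc -> P_sub = {p * 1e3:.3f} mW")
```

In this simplified model the substrate loss is largest when $R_B$ equals the reactance of the junction capacitance, which is consistent with the statement that either a very small or a very large $R_B$ is preferable.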

#### 1.3.2 Review of CMOS TR Switch Literature

The purpose of the present review is to illustrate the *need* for innovation in the design of the antenna TR switch for this work, since containing insertion loss to within $IL \approx 1.5$ dB is highly desirable for PA efficiency and LNA noise figure. Achieving a reasonably high input 1 dB compression point is also important if the switch appears in cascade with the PA in transmit mode, as with conventional switch topologies.
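As a rough illustration of why switch insertion loss matters on both sides of the antenna, the sketch below applies two standard relations: a passive loss after the PA scales the delivered output power by $10^{-IL/10}$, and a matched passive loss ahead of the LNA (at the reference temperature) adds directly to the receiver noise figure in dB. The PA operating point used in the example (10 dBm output, 15 dB gain, 50 mW d.c. power) and the 3 dB LNA noise figure are hypothetical values, not measurements from this work.

```python
def pae_after_switch(p_out_dbm, gain_db, p_dc_mw, il_db):
    """PAE (%) referred to the antenna port when a switch with insertion
    loss il_db follows a PA with the given output power, gain, and d.c. power."""
    p_out_mw = 10.0 ** (p_out_dbm / 10.0)          # PA output power
    p_in_mw = p_out_mw / 10.0 ** (gain_db / 10.0)  # PA input power
    p_ant_mw = p_out_mw / 10.0 ** (il_db / 10.0)   # power reaching the antenna
    return 100.0 * (p_ant_mw - p_in_mw) / p_dc_mw


def rx_nf_with_switch_db(nf_lna_db, il_db):
    """Noise figure (dB) of a matched passive loss followed by an LNA,
    assuming the loss is at the reference temperature."""
    return nf_lna_db + il_db


if __name__ == "__main__":
    # Hypothetical PA operating point, for illustration only.
    for il in (0.0, 1.5, 3.0):
        pae = pae_after_switch(10.0, 15.0, 50.0, il)
        nf = rx_nf_with_switch_db(3.0, il)
        print(f"IL = {il:.1f} dB -> PAE = {pae:.1f} %, Rx NF = {nf:.1f} dB")
```

With these assumed numbers, 1.5 dB of switch loss lowers the antenna-referred PAE from roughly 19% to roughly 14%, while 3 dB lowers it to below 10%.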

Table 1.3 summarizes the literature on *conventional* series, shunt, series-shunt, and asymmetric transmit-receive (TR) antenna switch topologies. The data in Table 1.3 illustrate that conventional topologies yield more than 2 dB of insertion loss (*IL*) for any publication above 20 GHz. Also, to the best of the author's knowledge, the lowest reported insertion loss is 1.6 dB, achieved in [22] using a non-conventional switchable balun with a shunt-only TR switch topology. However, it suffers from poor linearity due to the shunt switch being 'boot-strapped'; the reported *IP*<sub>1 dB</sub> is +12 dBm. The same switchable balun concept was later re-published in [23], but using stacked devices to enhance transmit-mode linearity. The new design pushed $IP_{1 \text{ dB}}$ up to +28 dBm, but the insertion loss deteriorated to 3 dB, making it unattractive for this work. More recent, non-conventional design techniques for the TR switch from published high-performance RF/mm-Wave transceivers and front-ends are considered in more detail in Chapter 4.

| Ref. | Design<br>Technique                 | f<br>(GHz)  | Topology           | Node<br>(nm)  | IL<br>(dB)  | Isolation<br>(dB) | P<sub>-1dB</sub><br>(dBm) |
|------|-------------------------------------|-------------|--------------------|---------------|-------------|-------------------|----------------------------|
| [r1] | Stepped<br>Impedance                | 24          | Asymmetric         | 90<br>(T-W)*  | 3.5         | 10                | 28.7                       |
| [r2] | Minimized $\mathbf{R}_{\mathbf{B}}$ | 5.8         | Series-Shunt       | 180           | 0.8         | 20                | 26.5                       |
| [r3] | Maximized $R_B$                     | 35          | Series-Shunt       | 130           | 2.2         | 32                | 23                         |
| [r4] | Optimized<br>Dimensions             | 5.425       | Differential       | 180           | 1.8         | 15                | 15                         |
| [r5] | LC Body<br>Floating                 | 2.4/<br>5.2 | Asymmetric         | 180           | 1.5         | 30                | 28                         |
| [r6] | Stacked<br>Transistors              | 5           | Series-Shunt       | 180           | 1.44        | 22                | 21.5                       |
| [r7] | Travelling Wave                     | 0-20        | Series-Shunt       | 180<br>(T-W)* | 0.7-2.5     | 25-60             | 26.2                       |
| [r8] | Switched Body<br>Floating           | 16.6,<br>28 | Series, asymmetric | 130<br>(T-W)* | 1.9,<br>2.6 | 20,<br>23         | 26.5,<br>25.5              |
| [r9] | Impedance<br>Matching               | 15          | Series-Shunt       | 130           | 1.8         | 17.8              | 21.5                       |

Table 1.3: Review of conventional transmit-receive switch literature.

References: r1 [24], r2 [25], r3 [26], r4 [27], r5 [28], r6 [29], r7 [30], r8 [31], r9 [32]. \*Triple-well process.

### **1.4** Passive Phase Shifter Literature

As will be shown in Chapter 4, a passive and bidirectional phase shifter (PS) circuit is highly desirable for low-power UE phased array front-ends. Table 1.4 specifically summarizes the literature on passive PS circuits that use lumped-element cells to achieve linear-phase behavior over a finite bandwidth. The key point of this review is that the average insertion loss per bit of resolution is 2.5 dB for any design operating at or above 20 GHz. That is, a 3-bit resolution PS based on any of the publications in Table 1.4 is expected to show $\approx$7.5 dB of insertion loss. The techniques reported in [33] and used in this work will be shown to achieve a lower insertion loss, as explained in Chapter 4.

| Reference          | Tech.             | Freq<br>(GHz) | Resolution                              | Insertion<br>Loss<br>(dB)                      | Area<br>(mm²)         |
|--------------------|-------------------|---------------|-----------------------------------------|------------------------------------------------|-----------------------|
| [Campbell_2000]    | 0.25um<br>pHEMT   | 17-21         | 5-bit                                   | 5                                              | 1.693×<br>0.75*       |
| [Hancock_2005]     | Atmel<br>SiGe2-RF | 10-14         | 1-bit<br>inversion+<br>analog<br>Tuning | 6.8<br>(simulated)                             | 1.68×<br>0.66         |
| [Kang_2006]        | 180nm<br>CMOS     | 9-15          | 5-bit                                   | 14.5                                           | 3.1×1.4*              |
| [Min_2008]         | 0.12um<br>SiGe    | 28-40         | 4-bit                                   | 14 (S.E.)<br>11 (diff)                         | 0.53×0.22<br>0.7×0.25 |
| [Cohen_2010]       | 90nm<br>CMOS      | 60            | 2-bit                                   | 7                                              | n/a                   |
| [Gharibdoust_2012] | 180nm<br>CMOS     | 7.5-10        | 6-bit                                   | 0.7+1.9+1.6+0.5+2.3+<br>2.8=9.8<br>(simulated) | n/a                   |
| [Li_2013]          | 90nm<br>LP CMOS   | 60            | 5-bit                                   | 14.6                                           | 0.85×0.4              |

Table 1.4: Summary of literature on passive LC delay-cell-based phase shifters.

References: Campbell\_2000 [34], Hancock\_2005 [35], Kang\_2006 [36], Min\_2008 [37], Cohen\_2010 [38], Gharibdoust\_2012 [39], Li\_2013 [40].

#### **1.5 Dissertation Scope and Organization**

Chapter 2 presents the first major project in this work, which resulted in the first reported linear and efficient bulk CMOS PA targeting low-power 5G mobile user equipment (UE) integrated phased array transceivers [1, 7]. The chapter begins with a link budget analysis of UE phased array transmitter power consumption versus carrier frequency. This analysis considers detailed and practical circuit- and antenna-module-oriented limitations, and serves as the backbone for link-budget considerations throughout the dissertation. Then, an optimization methodology is proposed for the output stage of the PA, with the cost function being power added efficiency (PAE) at the desired error vector magnitude (EVM) and link range. Building on the optimization results, inductive source degeneration is employed to enable embedding of the optimized output stage into a two-stage transformer-coupled PA. It is shown in [1] that carefully designed inductive degeneration in the output stage benefits PA performance by broadening the inter-stage impedance matching bandwidth and by helping to reduce distortion. The prototype PA demonstrating these concepts was designed and fabricated in 1P7M 28nm bulk CMOS, uses a 1V supply, and achieves state-of-the-art performance.

The second project of this dissertation is presented in Chapter 3, and reported in [2, 41]. The project focuses on a PA design that addresses the extremely challenging RF signal bandwidth requirements of 5G, with the added challenge of integrating digital gain control for phased array functionality, e.g. magnitude tapering across elements for side-lobe level control, or array element gain mismatch compensation. This second, three-stage design overcomes the linearity limitation of the 28 nm PA in coping with wider signal bandwidths, which is investigated in Chapter 2, and simultaneously achieves higher power gain. To achieve wideband linearity and higher gain without compromising the excellent back-off PAE of the first design, loosely-coupled transformer matching networks with dual in-band resonances at optimized frequency spacing were employed. Furthermore, to decouple impedance matching and linearity performance from the digital gain setting, a current-steering cascode topology is used for the variable-gain first stage. This topology results in excellent broadband gain-step linearity. The measured throughput for this design shows more than a three-fold improvement over the highest throughput supportable by the state-of-the-art defined by the 28 nm PA of Chapter 2 [7], with concurrent improvements in output power, PAE, and power gain.

For the third project of this dissertation, Chapter 4 describes the transmit-receive module considerations in full detail, and shows how the PA design of [2] and the low-noise amplifier (LNA) and phase shifter (PS) designs in [33] are integrated to form a high-performance, wideband, and low-power transmit-receive front-end module for 5G phased array UE time-division duplex (TDD) radios employing the RF phase shifting architecture. Integrating this front-end module in a compact area without compromising the excellent wideband performance of the individual circuit components is a major signal integrity/electromagnetic design challenge that is tackled in this project. Area constraints dictate that conventional, wideband distributed-element networks have to be completely avoided, particularly in the TDD transmit-receive switch at the antenna port. A compact, low-loss lumped-element topology for the critical antenna interface is developed to address the UE requirements.

Finally, Chapter 5 provides concluding remarks, and presents some avenues for future research that may follow up on the results of this work.

# A HIGHLY EFFICIENT AND LINEAR POWER AMPLIFIER FOR 28-GHZ 5G PHASED ARRAY RADIOS IN 28-NM CMOS\*

#### 2.1 Introduction

The race to deploy fifth generation (5G) wireless services by 2020 is on-going, and mm-Wave technology will play a key role in meeting mounting demand for broadband data traffic [42, 43].

While the spectral band to be adopted is not yet determined, recent advances make the 28GHz band particularly interesting for 5G mobile standardization. Contrary to past perception of mm-Wave propagation as a *fundamental* limitation [44], non-line-of-sight (NLOS) 28GHz coverage was demonstrated in urban cells [45]. Also, to counter heavy propagation losses, directive antenna arrays were integrated into base station and user equipment (UE) form factors in commercial grade technologies [46, 47]. These advances motivate favorable spectrum regulation [48].

<sup>\*</sup> Section 2 is reprinted with permission from S. Shakib, H. C. Park, J. Dunworth, V. Aparin and K. Entesari, "A Highly Efficient and Linear Power Amplifier for 28-GHz 5G Phased Array Radios in 28-nm CMOS," in IEEE Journal of Solid-State Circuits, vol. 51, no. 12, pp. 3020–3036, Dec. 2016. ©2016 IEEE.

Besides wave propagation, battery power efficiency for low-cost UE devices is another critical 5G challenge, limited by integrated power amplifiers (PA). Maximizing data rate gains implies broadband, e.g.  $\geq 100$MHz RF bandwidth, *and* spectrally efficient signaling, e.g. high-order quadrature amplitude modulation (QAM), with orthogonal frequency division multiplexing (OFDM). The large peak-to-average power ratios (PAPR) of these signals and their sensitivity to distortion force the PA to operate in 8–10dB power back-off and drastically lower its efficiency. Also, low cost and a high level of integration for UE phased arrays make CMOS the technology of choice. Limitations from substrate conductivity and silicon breakdown field degrade power efficiency in CMOS relative to e.g. GaAs. CMOS devices also exhibit more gradual/softer gain compression, which further increases the needed back-off to meet linearity requirements. Furthermore, in a high-volume production setting, cost and complexity preclude the use of calibration, e.g. using digital pre-distortion (DPD), due to differing nonlinear behavior among PAs in an integrated array. Thus, 5G UE radios require efficient CMOS mm-Wave PAs having inherent circuit-level linearity.
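To give a feel for how strongly the 8–10dB power back-off mentioned above penalizes efficiency, the sketch below evaluates the textbook drain efficiency of *ideal* class-A and class-B amplifiers versus back-off from peak output power. These idealized expressions (50% peak efficiency scaling linearly with output power for class A, $\pi/4$ peak efficiency scaling with the square root of output power for class B) ignore knee voltage, passive losses, and driver power, so they are upper bounds for illustration and do not describe the PA presented in this chapter. The function name is hypothetical.

```python
import math


def ideal_drain_efficiency(backoff_db, pa_class="B"):
    """Drain efficiency (%) of an ideal class-A or class-B amplifier
    operated backoff_db below its peak output power."""
    ratio = 10.0 ** (-backoff_db / 10.0)  # P_out / P_max
    if pa_class == "A":
        return 50.0 * ratio
    if pa_class == "B":
        return 100.0 * (math.pi / 4.0) * math.sqrt(ratio)
    raise ValueError("pa_class must be 'A' or 'B'")


if __name__ == "__main__":
    for bo in (0.0, 6.0, 8.0, 9.6, 10.0):
        eta_a = ideal_drain_efficiency(bo, "A")
        eta_b = ideal_drain_efficiency(bo, "B")
        print(f"back-off = {bo:4.1f} dB -> class A: {eta_a:5.1f} %, class B: {eta_b:5.1f} %")
```

Even for an ideal class-B stage, the efficiency falls from about 78.5% at peak power to about 26% at 9.6dB back-off, which motivates the focus on back-off PAE throughout this chapter.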

Early effort to experimentally assess/reach the achievable limit on power-added efficiency (PAE) resulted in the first CMOS PA for 28GHz 5G that we reported in [7]. We proposed an optimization methodology for selecting output stage transistor size/biasing to maximize back-off PAE, given range and error vector magnitude (EVM) targets. We also demonstrated that inductive degeneration can be used to broaden inter-stage matching, and help to reduce distortion resulting from sub-threshold biasing. As a result of these techniques, the achieved performance matched or exceeded the state-of-the-art in linear silicon mm-Wave PAs, as represented by 60GHz works [15], [17], [20], and a SiGe PA for 28GHz 5G [49] (number of published 28GHz PAs is limited).

Expanding on the PAE trend observed in [7], this paper begins with a link budget analysis of phased array transmitter power consumption versus carrier frequency. The analysis considers more detailed/realistic circuit- and antenna-module-oriented limitations not considered by channel-propagation-oriented publications, e.g. [45]. From a 5G standardization viewpoint, this helps to make a more informed choice of carrier band while incorporating the impact of UE power consumption. Subsequently, PA circuit requirements based on this detailed analysis are derived. From this point, we turn to expanding on the optimization methodology using target specifications in [7]. Brief theoretical analysis of inductive degeneration in the output stage is used to further illustrate its benefits. The reported experimental results are augmented with new measurement data, and discussed in terms of the derived 5G requirements.

This paper is organized as follows. Section 2.2 shows 28GHz is favorable through a detailed analysis of phased array transmitter power consumption across a wide carrier frequency range encompassing candidate 5G bands, and provides PA circuit specifications. The design optimization methodology, and output stage inductive degeneration as its circuit-level enabler reported in [7], are expanded upon in Section 2.3 and Section 2.4, respectively. Section 2.5 provides implementation details. In Section 2.6, experimental data is reported and compared to the state-of-the-art, as well as to the derived 5G requirements from Section 2.2. The paper is concluded in Section 2.7.

### 2.2 System Considerations for 5G Phased Array Radios

The link budget for the envisioned phased array 5G broadband communication system is analyzed to compare a wide range of potential carrier frequencies, with UE transmitter battery power consumption  $P_{Tx,dc}$  as the figure of merit. The 28GHz band is shown to be favorable, and PA output power requirements are derived for future 28GHz 5G phased array PA developments.

### 2.2.1 Choice of Carrier Frequency for 5G Systems

The chosen use scenario for this analysis, and the signal loss mechanisms in the UE to access point (AP) direction, are illustrated in Fig. 2.1. Assuming a line-of-sight (LOS) channel simplifies this analysis without affecting the $f_c$ comparison. NLOS channel details can be found elsewhere [50]. Low-cost CMOS technology is assumed for the UE phased array RFIC (AP RFIC may be e.g. SOI), and flip-chip bonding to printed circuit board (PCB) antenna arrays is assumed [51]. Patch antennas are arranged at each carrier frequency $f_c$ into an $N_x \times N_y$ uniform rectangular array (URA) to fit on a UE/AP PCB of *physical* dimensions $d_x \times d_y$, fixed across $f_c$ values (26mm×15mm for UE, 47mm×47mm for AP), at a spacing of $0.5\lambda_0$, where $\lambda_0$ is free space wavelength.

Figure 2.1: Illustration of link budget analysis use scenario in comparison of potential carrier frequencies for 5G systems. Reprinted from [1].

The UE/AP URA is illustrated at arbitrary $f_c$ in Fig. 2.2(a). Element counts vs. $f_c$, and an example UE array at $f_c = 30$GHz, are shown in Fig. 2.2(b). Fixing $d_x \times d_y$ reflects practical size constraints, and helps compare $f_c$ values fairly, as array gain $G_{\text{array}}(f_c)$ significantly impacts the link budget. To find $P_{Tx,dc}(f_c)$, Friis' equation [52] is first used to express $P_{Tx,rf}(f_c)$:

$$P_{Tx,rf}(f_c) = \left[ 10 \log_{10} \left( k_B T \times 10^3 \times BW_{sig} \right) + NF_{R_x}(f_c) + SNR_{sig} \right] + \left[ L_{path}(f_c) + L_{misc} \right] + \underbrace{\left[ L_{\text{FE},T_x}(f_c) + L_{\text{FE},R_x}(f_c) \right]}_{\text{RF Front-end Losses}} - \underbrace{\left[ G_{\text{array},T_x}(f_c) + G_{\text{array},R_x}(f_c) \right]}_{\text{Tx/Rx Antenna Array Gains w.r.t. Isotropic Element}} \quad (2.1)$$

where $k_B$ is the Boltzmann constant, $T$ is absolute temperature, and $P_{Tx,rf}(f_c)$ is the total RF output power of the UE transmit array in dBm needed for reliable detection. Table 2.1 lists definitions of the remaining variables, and explanations for their chosen values, used in (2.1), Fig. 2.1, and Fig. 2.2. The general criterion is to represent the highest proven capabilities for system components from published literature across an $f_c$ range of 2.4–83GHz.
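As a numerical illustration of the leading terms in (2.1), the sketch below evaluates the thermal noise floor, the receiver noise figure fit $NF_{Rx}(f_c)$, and $SNR_{sig}$ using the values listed in Table 2.1 (100MHz bandwidth, 22dB SNR). The temperature of 290 K is an assumption, since the text does not state the value of $T$ used, and the function names are hypothetical.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant [J/K]


def noise_floor_dbm(bw_hz, temp_k=290.0):
    """Thermal noise power in dBm over bandwidth bw_hz.
    The 1e3 factor converts watts to milliwatts, as in (2.1)."""
    return 10.0 * math.log10(K_B * temp_k * 1e3 * bw_hz)


def nf_rx_db(fc_hz):
    """Empirical receiver noise figure fit from Table 2.1 [dB]."""
    return 1.0 + 0.5 * math.sqrt(fc_hz / 1e9)


def rx_sensitivity_dbm(fc_hz, bw_hz=100e6, snr_db=22.0, temp_k=290.0):
    """Sum of the noise floor, noise figure, and required SNR terms of (2.1)."""
    return noise_floor_dbm(bw_hz, temp_k) + nf_rx_db(fc_hz) + snr_db


if __name__ == "__main__":
    for fc in (5.5e9, 28e9, 60e9):
        print(f"f_c = {fc / 1e9:5.1f} GHz -> NF = {nf_rx_db(fc):.2f} dB, "
              f"sensitivity = {rx_sensitivity_dbm(fc):.2f} dBm")
```

Under these assumptions the required received power at 28GHz is roughly -68 dBm; the link losses and array gains in (2.1) then translate this into the required transmit power.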



Figure 2.2: Two-dimensional URA of UE or AP antennas for allowable physical size  $d_x \times d_y$  of 26mm × 15mm for UE array, 47mm × 47mm for AP array: (a) UE or AP at arbitrary  $f_c$ , and (b) number of antenna elements in UE (left axis) and AP (right axis) arrays; inset shows example for UE at  $f_c$ =30GHz. Reprinted from [1].

In [7], and using the best published CMOS back-off PAE data in [17, 18, 20, 49, 57–60], we previously approximated PAE at  $P_{out}$  that satisfies  $|\text{EVM}| = SNR_{sig} + 3\text{dB} = 25\text{dB}$ using 64-QAM OFDM by PAE at  $P_{sat} - 9.6\text{dB}$ , then fitted the data to the trendline:

$$\text{PAE}(f_c) \approx \frac{35\%}{\left[1 + 0.16\sqrt{(f_c/10^9)}\right]^2}. \quad (2.2)$$
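For reference, the trendline in (2.2) can be evaluated directly. The short sketch below does so at a few representative carrier frequencies; the function name is hypothetical and $f_c$ is taken in Hz, as in (2.2).

```python
import math


def pae_trend_percent(fc_hz):
    """Back-off PAE trendline of (2.2), in percent, for carrier frequency fc_hz in Hz."""
    return 35.0 / (1.0 + 0.16 * math.sqrt(fc_hz / 1e9)) ** 2


if __name__ == "__main__":
    for fc in (2.4e9, 5.5e9, 28e9, 60e9, 83e9):
        print(f"f_c = {fc / 1e9:5.1f} GHz -> PAE ~ {pae_trend_percent(fc):.1f} %")
```

The fit gives a back-off PAE of about 10% at 28GHz and about 7% at 60GHz.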

Also in [7], and consistently throughout this paper, we define  $P_{sat}$  as the  $P_{out}$  at 3dB of gain compression for the presented 28nm CMOS PA (see Section 2.6.1). Combining (2.1), (2.2), Fig. 2.1, Fig 2.2, and Table 2.1, the data scatter and corresponding trendline for  $P_{Tx,dc}(f_c) = P_{Tx,rf}(f_c) / PAE(f_c)$  based on the best published performances across a wide  $f_c$  range is shown in Fig. 2.3(a). For fixed link range, modulation format, RF

| Variable                                                 | Definition or Value [unit]                                                                                                                                                                                                                                                                                                                                           | Comment                                                                                                                             |  |  |
|----------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------|--|--|
| SNR <sub>sig</sub>                                       | SNR at digital Rx output for $10^{-3}$ BER = 22 [dB]                                                                                                                                                                                                                                                                                                                 | Anticipated most demanding format:<br>64-QAM OFDM                                                                                   |  |  |
| BW <sub>sig</sub>                                        | 64-QAM OFDM <i>RF</i> bandwidth=100 [MHz]                                                                                                                                                                                                                                                                                                                            | Low-GHz systems cannot<br>provide>100MHz                                                                                            |  |  |
| D                                                        | 30 [m]                                                                                                                                                                                                                                                                                                                                                               | Corresponds to relatively low d.c.<br>power levels in Fig. 3(a)                                                                     |  |  |
| $d_x \times d_y$                                         | UE: $26 \times 15$ , and AP: $47 \times 47$ [mm <sup>2</sup> ]                                                                                                                                                                                                                                                                                                       | Size constraint for 2 × 2 MIMO; 4<br>modules per UE to cover<br>elevation/azimuth                                                   |  |  |
| $\alpha_{\mathrm{atm}}\left(f_{c}\right)$                | Atmospheric absorption coefficient [dB/m]                                                                                                                                                                                                                                                                                                                            | See [42]                                                                                                                            |  |  |
| $\lambda_{eff}\left(f_{c} ight)$                         | $rac{\lambda_0}{\sqrt{arepsilon_e}}$ [m]                                                                                                                                                                                                                                                                                                                            | $\lambda_{eff} (\varepsilon_e)$ is effective wavelength<br>(permittivity) in antenna array<br>substrate [29]                        |  |  |
| $L_{path}\left(f_{c} ight)$                              | $10 \log_{10} \left  \left( \frac{4\pi D f_c}{c_0} \right)^2 \right  + \left[ \alpha_{\text{atm}} \left( f_c \right) \times D \right] [\text{dB}]$                                                                                                                                                                                                                   | Theoretical free space and<br>absorption losses in line-of-sight<br>channel                                                         |  |  |
| $N_{x,y}$                                                | floor { $\left[ d_{x,y} - 1.2\lambda_{eff} \left( f_c \right) \right] / \left[ 0.5\lambda_0 \left( f_c \right) \right]$ }                                                                                                                                                                                                                                            | 0 is replaced by 1, see Fig. 2                                                                                                      |  |  |
| $G_{ m array}\left(f_{c} ight)$                          | $10\log_{10}{(N_x 	imes N_y)}$ + $G_{ m elem}$ [dBi]                                                                                                                                                                                                                                                                                                                 | Patch $G_{\text{elem}} = 4 \text{dBi}$<br>$f_c$ dependence from $N_{x,y}$                                                           |  |  |
| $NF_{Rx}\left(f_{c} ight)$                               | $pprox 1+0.5\sqrt{(f_c/10^9)}~[	ext{dB}]$                                                                                                                                                                                                                                                                                                                            | Empirical fitting of best published data, e.g. see [43]                                                                             |  |  |
| $L_{sw}\left(f_{c}\right)$                               | $pprox 0.05 + 0.25 \sqrt{(f_c/10^9)} ~[	ext{dB}]$                                                                                                                                                                                                                                                                                                                    | Empirical fitting of published data,<br>e.g. see [44]                                                                               |  |  |
| $L_{feed}\left(f_{c} ight)$                              | $b \cdot \underbrace{\left(\frac{d_x + d_y}{2}\right)}_{lember {lemsth}} \cdot \left[\frac{\frac{\alpha_{\text{metal}}}{R_s\left(f_c\right)}}{\pi Z_0 W_f} + \underbrace{\left(\frac{\varepsilon_e - 1}{\varepsilon_r - 1} \cdot \frac{\varepsilon_r}{\varepsilon_e}\right) \frac{\tan \delta}{\lambda_{eff}\left(f_c\right)}}_{lember {lemsth}}\right] [\text{dB}]$ | (Worst-case feed<br>length)×(200 $\mu$ m-wide microstrip<br>line loss/ unit length),<br>$b \triangleq 20\pi \log_{10} (e^1)$ , [29] |  |  |
| $L_{bump}\left(f_{c}\right),\\L_{via}\left(f_{c}\right)$ | $pprox 0.01 	imes \left( f_c/10^9  ight)$ [dB]                                                                                                                                                                                                                                                                                                                       | Conservative EM simulation, see e.g. [45]                                                                                           |  |  |
| $L_{misc}$                                               | $L_{pol} + L_{DOA} + L_{body}$ + Implementation Margin<br>3dB + 3dB + 4dB + 5dB = 15 [dB]                                                                                                                                                                                                                                                                            | $L_{pol}$ : polarization mismatch<br>$L_{DOA}$ : $\Psi$ in Fig. 1<br>$L_{body}$ : loss due to proximity to<br>human body            |  |  |

Table 2.1: Expressions and chosen values for various variables used in (2.1), Fig. 2.1, and Fig. 2.2. Reprinted from [1].

bandwidth, and physical antenna area, Fig. 2.3(a) shows the 28GHz band provides *power* savings over 5–6GHz despite lower transmitter PAE and higher propagation loss. To a first order, this may be understood in the blue shaded region in Fig. 2.3(a) from the frequency dependence of PAE in (2.2), and of path loss $L_{path,A}$ and antenna array gain $G_{array,A}$ (see Fig. 2.2) expressed as absolute ratios:



Figure 2.3: (a) Scatter of best published data and fitted trendlines for PAE  $(f_c)$  and  $P_{Tx,dc}(f_c)$ , (b) required average  $P_{out}$  per element vs.  $f_c$  for 64-QAM at different  $BW_{sig}$ , and (c) required average  $P_{out}$  per element vs. LOS range at 30GHz for QPSK and 64-QAM at different  $BW_{sig}$ . Reprinted from [1].

$$\frac{P_{Tx,rf}\left(f_{c}\right)}{\text{PAE}\left(f_{c}\right)} \propto \left[\frac{L_{path,A}\left(f_{c}\right)}{G_{\text{array},\text{Tx,A}}\left(f_{c}\right)G_{\text{array},\text{Rx,A}}\left(f_{c}\right)}\right] \cdot \frac{1}{\text{PAE}\left(f_{c}\right)} \sim \left[\frac{f_{c}^{2}}{f_{c}^{2} \times f_{c}^{2}}\right] \cdot \frac{1}{\left(1/f_{c}\right)} \approx \frac{1}{f_{c}}.$$
(2.3)

Relation (2.3) highlights the impact at mm-Wave frequencies of using directive arrays in both UE and AP to avoid path loss incurred at lower $f_c$ values, where size constraints dictate fewer antenna elements. That is, the combined transmitter and receiver antenna array gains for 2.4–15GHz in Fig. 2.3(a) are insufficient to compensate for the negative effects of increasing $L_{path}$ and decreasing PAE on $P_{Tx,dc}$. Also, the red shaded region in Fig. 2.3(a) shows that $L_{FE}$ and $NF_{R_x}$ in (2.1) may limit this benefit of arrays at higher mm-Wave frequencies. Therefore, the above semi-empirical analysis shows the 28GHz band is a viable choice for wideband, low-power 5G systems.
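As a rough numerical illustration of relation (2.3), the sketch below evaluates the first-order scaling of $P_{Tx,dc}$ between two carrier frequencies. It assumes only the proportionalities stated in (2.3), i.e. path loss and each array gain scaling as $f_c^2$ for fixed physical antenna area and PAE scaling as $1/f_c$. The function name `relative_dc_power` and the 5.6GHz reference are hypothetical choices for illustration, and the front-end loss and noise figure effects of the red shaded region are deliberately ignored.

```python
import math


def relative_dc_power(fc_ghz, ref_ghz):
    """First-order P_Tx,dc at fc_ghz relative to ref_ghz, per relation (2.3).

    Assumes fixed physical antenna area at both ends of the link, so each
    array gain scales as fc^2, path loss scales as fc^2, and PAE as 1/fc.
    """
    r = fc_ghz / ref_ghz
    path_loss = r ** 2
    array_gain_tx = r ** 2
    array_gain_rx = r ** 2
    pae = 1.0 / r
    return path_loss / (array_gain_tx * array_gain_rx) / pae


# Hypothetical comparison of 28 GHz against a 5.6 GHz reference.
ratio = relative_dc_power(28.0, 5.6)
print(f"P_dc ratio: {ratio:.2f} ({10 * math.log10(ratio):.1f} dB)")
```

Under these idealized assumptions a fivefold increase in $f_c$ lowers the required d.c. power by about 7dB. The fitted trendlines in Fig. 2.3(a) include additional loss terms, so this figure should be read only as the $1/f_c$ trend of (2.3).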

### 2.2.2 Output Power Requirements

The most demanding anticipated uplink scenario uses a 250MHz-wide 64-QAM OFDM signal. As part of on-going 28GHz 5G developments, the early effort in [7] aimed to understand practical transmission range and $P_{Tx,dc}$ limits for this very challenging case. Considering realistic size limitations and signal losses as in the above analysis, Fig. 2.3(c) shows required average $P_{out}$ at 30GHz per UE element in the URA of Fig. 2.2(b) to achieve 10–150m LOS range. A 20–30m LOS range requires an average $P_{out}$ of 5–7.5dBm for a 250MHz-wide, 64-QAM signal. Alternatively, using a 16-element UE array as assumed in [45] translates in the above analysis to ≈50m LOS range at average $P_{out} \approx 7$dBm. Low-power 5G use cases requiring longer range can employ lower modulation orders down to e.g. QPSK as shown in Fig. 2.3(c).

A point worth noting is that EVM, i.e. in-channel signal quality, is the main linearity specification that needs to be met by PA design for 5G phased arrays. Adjacent channel power ratio (ACPR) is less important due to higher spatial selectivity at mm-Wave frequencies relative to the low-GHz range [21]. In a sub-6GHz system, adjacent channel leakage power of one user can block (degrade sensitivity to) the received signal from an adjacent-channel user because the AP antenna cannot spatially separate users. On the other hand, in the current context of mm-Wave phased arrays, users with large enough spatial separation are served by separate beams from an AP. Typically, there is enough side-lobe attenuation in the AP antenna array (at least 10 dB) to prevent one user from blocking another. If the coverage sector of two users coincides so they are both served by the same beam from an AP, the interference can be mitigated by reducing bandwidth allocated to each user and/or separating their carrier frequencies. Statistically, such a coincidence may be a rare event, but this should be confirmed at the system level using, e.g. simulations.

The remainder of this chapter expands on concepts and experimental results in [7] based on the PA specifications in Table 2.2.

| PA Circuit Specification                                         | Value [unit]                         | Comment                                                                           |
|------------------------------------------------------------------|--------------------------------------|-----------------------------------------------------------------------------------|
| Signal Format<br>RF Bandwidth $BW_{sig}$<br>PAPR                 | 64-QAM OFDM<br>250 [MHz]<br>9.6 [dB] | Most challenging 5G mm-Wave uplink use case                                       |
| Required Transmitter EVM, i.e. $\rm EVM_{req}$                   | -25 [dBc]                            | 3dB margin from $SNR_{sig} = 22$dB for 64-QAM detection                           |
| Required average $P_{out}$ @ $\rm EVM_{req}$, i.e. $P_{out,req}$ | 7.0 [dBm]                            | $\approx$100m estimate from relatively optimistic link budget based on e.g. [4]   |
| Small-signal Gain $G_{ss}$                                       | ≥15.0 [dB]                           | To avoid degrading transmit chain d.c. power consumption [9]                      |

Table 2.2: Summary of targeted PA circuit specifications from [7]. Reprinted from [1].

### 2.3 Output Stage Optimization Methodology

The output stage should dominate overall linearity and efficiency by design, so it is carefully optimized. Using a parameterized output stage, the goal of this section is to determine transistor size and bias point to maximize PAE for the  $P_{\rm out,req}$  and  $\rm EVM_{req}$  requirements in Table 2.2.

### 2.3.1 Parameterized Output Stage



Figure 2.4: Parameterized output stage circuit for optimization: (a) power cell layout, (b) parameterized output stage circuit. Reprinted from [1].

The power cell layout in [15] was adapted to the 1P7M 28nm CMOS process; unit cell size is $W_{\text{unit}}/L_{\text{unit}} = 32$ fingers $\times 1\mu\text{m}/28\text{nm}$, and its layout is illustrated in Fig. 2.4(a). Each scalable nMOS device in the neutralized push-pull stage of Fig. 2.4(b) is constructed as $m \times$ the RC-extracted unit cell, so $W_{\text{nMOS}} = m \times W_{\text{unit}}$. $C_n$ is chosen to maximize reverse isolation (i.e. minimize $s_{12}$) of the core stage for greatest stability as in e.g. [15]. Ideal baluns and single L-section LC-tuners present variable differential mode terminations at the fundamental frequency. Higher odd-order harmonic terminations were unconstrained during optimization for consistency of simulation with the limited degrees of freedom in subsequent transformer-based implementation. That is, low-order networks were acceptable in this mm-Wave design to minimize insertion loss at the fundamental. The common mode at input (output) is terminated in an ideal voltage source for gate-to-source (drain-to-source) d.c. bias $V_{GS}$ ($V_{DD}$), i.e. an a.c. short in simulation. For termination of the higher even-order harmonics in the implementation, large on- and off-chip bypass capacitors are used to approximate this simulated a.c. short at the supply node. Gate bias node termination is discussed further in Section 2.6.4. Matching network insertion losses are expected to vary with ($W_{nMOS}, V_{GS}$) due to the varying impedance transformation ratios needed. Correctly modeling matching network losses across the design space is challenging, since electromagnetic (EM) simulations are needed, while simpler models can be inaccurate or misleading. Therefore, to avoid ambiguous selection criteria for ($W_{nMOS}, V_{GS}$), *LC*-tuner loss is omitted. Finally, bulk and source terminals are grounded.

### 2.3.2 Optimization Procedure

With  $W_{nMOS}$  (i.e. m) and  $V_{GS}$  as the two independent variables, overlaid contours of average  $P_{out}$  and of PAE, *plotted at fixed EVM*, form the output stage transistor design chart. The chart is created using two steps at each ( $W_{nMOS}, V_{GS}$ ), corresponding to Fig. 2.5(a) and (b), respectively:

1. *Load-pull:* Optimal termination $\Gamma_{L,opt}(W_{nMOS}, V_{GS})$ is defined to maximize PAE at $P_{out,req}$. A continuous wave (CW) signal at $P_{in}(V_{GS}) = P_{out,req} - G_{max}(V_{GS})$ makes $P_{out} \approx P_{out,req}$, where $G_{max}(V_{GS})$ is the maximum available gain of the stage (independent of m). This approximate enforcement of the desired output power during load-pull simulation results in some degradation of PAE in the next step, which is accepted to simplify the procedure.
2. *EVM Simulation:* A behavioral amplitude-to-amplitude (AM-AM) and amplitude-to-phase (AM-PM) modulation conversion model for the terminated stage is extracted from CW signal input power sweep simulation [61]. A sweep of 64-QAM OFDM $P_{in}$ to this model generates PAE/EVM versus $P_{out}$ characteristics. Note that this modeling approach ignores potentially relevant circuit-level memory effects as explained in Section 2.6.

Figure 2.5: The two steps in the optimization procedure: (a) Step (1) - load-pull, (b) Step (2) - EVM simulation. Reprinted from [1].
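The EVM simulation of Step (2) can be illustrated with a minimal, memoryless sketch. The code below is not the extracted model of [61]: it substitutes a hypothetical Rapp-type AM-AM curve and a hypothetical saturating AM-PM curve (`pa_model`, with assumed parameters `p` and `pm_max_deg`) for the CW-extracted characteristics, uses a short 64-subcarrier OFDM symbol without oversampling, and removes the average complex gain with a one-tap least-squares fit before computing EVM.

```python
import cmath
import math
import random

A_SAT = 1.0  # normalized saturation amplitude (assumption)


def qam64_symbols(n, rng):
    """Draw n random 64-QAM points with unit average power."""
    levels = [-7, -5, -3, -1, 1, 3, 5, 7]
    scale = math.sqrt(42.0)  # mean energy of the 8x8 grid is 42
    return [complex(rng.choice(levels), rng.choice(levels)) / scale
            for _ in range(n)]


def dft(x, inverse=False):
    """Naive O(N^2) DFT, adequate for a short illustrative OFDM symbol."""
    n = len(x)
    sign = 1.0 if inverse else -1.0
    out = []
    for k in range(n):
        acc = sum(x[m] * cmath.exp(sign * 2j * math.pi * k * m / n)
                  for m in range(n))
        out.append(acc / n if inverse else acc)
    return out


def pa_model(v, p=2.0, pm_max_deg=20.0):
    """Hypothetical memoryless AM-AM (Rapp) and lagging AM-PM stand-in."""
    x = abs(v) / A_SAT
    gain = 1.0 / (1.0 + x ** (2.0 * p)) ** (1.0 / (2.0 * p))
    phase = math.radians(pm_max_deg) * x ** 2 / (1.0 + x ** 2)
    return v * gain * cmath.exp(-1j * phase)


def evm_db(backoff_db, n_sc=64, n_sym=10, seed=1):
    """EVM in dB for an average input power backoff_db below A_SAT**2."""
    rng = random.Random(seed)
    err = ref = 0.0
    for _ in range(n_sym):
        tx = qam64_symbols(n_sc, rng)
        t = dft(tx, inverse=True)
        rms = math.sqrt(sum(abs(s) ** 2 for s in t) / n_sc)
        k = A_SAT * 10.0 ** (-backoff_db / 20.0) / rms
        rx = dft([pa_model(k * s) for s in t])
        # One-tap least-squares estimate of the average complex gain.
        g = (sum(r * s.conjugate() for r, s in zip(rx, tx))
             / sum(abs(s) ** 2 for s in tx))
        err += sum(abs(r / g - s) ** 2 for r, s in zip(rx, tx))
        ref += sum(abs(s) ** 2 for s in tx)
    return 10.0 * math.log10(err / ref)


for backoff in (12.0, 9.0, 6.0):
    print(f"back-off {backoff:4.1f} dB -> EVM {evm_db(backoff):6.1f} dB")
```

Sweeping `backoff_db` plays the role of the 64-QAM OFDM $P_{in}$ sweep; pairing each EVM value with the corresponding output power and d.c. power of the modeled stage would give the PAE/EVM versus $P_{out}$ characteristics. As in the procedure above, memory effects are not captured by such a model.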

By interpolating PAE/EVM characteristics, the design chart is plotted in Fig. 2.6 for EVM = -27 dBc (2dB margin from $EVM_{req}$). The EVM versus $P_{out}$ slope is 2dB/1dB if third-order intermodulation $IM_3$ dominates, so a 2dB EVM margin maintains average $P_{out}$ at $EVM_{req}$ after adding $\approx 1$dB loss in realization of $\Gamma_{L,opt}$. Balun measurements confirmed 1dB loss is reasonable (Section 2.5.2).
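The margin can be checked with a short first-order calculation, assuming the 2dB/1dB slope holds locally around the design point and that the $\approx 1$dB matching loss is recovered by driving the transistor $\Delta P = 1$dB harder:

$$\text{EVM}\big|_{P_{out}+\Delta P} \approx \text{EVM}\big|_{P_{out}} + 2\,\Delta P = -27\,\text{dBc} + 2 \times 1\,\text{dB} = -25\,\text{dBc} = \text{EVM}_{req}.$$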



Figure 2.6: Output stage transistor design chart for  $V_{DD} = 1$ V: PAE (shaded) and average 64-QAM OFDM  $P_{out}$  (line) contours plotted at an EVM of -27dBc, i.e. at a 2dB margin from EVM<sub>req</sub>; design choice indicated by a circle, and inset shows correspondence between  $V_{GS} - V_t$  on x-axis and bias current density  $J_{PA}$ . Reprinted from [1].

### 2.3.3 Optimization Results

From Fig. 2.6, PAE (shaded) decreases with increasing  $V_{GS} - V_t$  as would be expected. For  $P_{out}$  contours (lines), many effects interact to produce the behavior in Fig. 2.6. With this complexity in mind, two limiting scenarios are identified, and their intuitive interpretations are offered below:

• For fixed $W_{nMOS}$, as $V_{GS} - V_t$ increases, $P_{out}$ contours approach being horizontal. Horizontal contours indicate $P_{out}$ is limited by current clipping at constant $W_{nMOS}$. Given a limited impedance transformation ratio in the output match realization, and without a guide like Fig. 2.6, one conventionally chooses minimum $W_{nMOS}$ for $P_{sat} \approx P_{out,req} + PAPR$ to avoid current clipping and *indirectly* satisfy $EVM_{req}$. PAE is suboptimal over $V_{GS} - V_t = 100-200$mV, where approximately horizontal $P_{out}$ contours justify the conventional choice.

• For fixed $V_{GS} - V_t$, as $W_{nMOS}$ increases, $P_{out}$ contours become increasingly vertical, indicating $P_{out}$ no longer increases. Larger $W_{nMOS}$ corresponds to larger (nonlinear) intrinsic gate capacitance, and results in greater AM-PM conversion [13, 62], i.e. lower $P_{out}$. Also, a small, class-C-like conduction angle limits $P_{out}$ through limiting $P_{sat}$ if $V_{GS} - V_t$ is small.

A reasonable compromise between $P_{out}$/PAE of 6.5dBm/25% is reached by setting m = 12 and $V_{GS} - V_t = -150$ mV. The region surrounding this design point is highlighted with a circle in Fig. 2.6. However, a device of $W_{\rm nMOS}=384\mu{\rm m}$ in sub-threshold has a large ($\approx 330$ fF) and strongly nonlinear gate-to-source capacitance $C_{gs}$ [9]. Furthermore, neutralization increases the unloaded quality factor of the device input impedance to $Q_u \approx 40$ [63], reducing attainable inter-stage matching bandwidth $BW_{int}$ as the Bode-Fano limit dictates [52]. Small $BW_{int}$ is undesirable for two reasons. First, it increases sensitivity to PVT and to mm-Wave modeling accuracy in the cascaded amplifier (Fig. 2.7(a)), e.g. lower gain through relative detuning among stages. Second, excessive AM-PM conversion and driver stage load modulation both occur with increasing $P_{out}$ as the center frequency of inter-stage matching shifts to lower values due to $C_{gs}$ nonlinearity (Fig. 2.7(b)). Proposed in [7], inductive degeneration in the output stage mitigates the mentioned issues that result from sub-threshold bias at a current density on the order of $10\mu A/\mu m$ using a relatively small source inductance. A seemingly similar use of the technique by [64] was different from this work, as the purpose there was to boost $P_{1dB}$ by countering AM-AM conversion for class-A-like biasing at a much higher $\sim 250 \mu A/\mu m$ current density.

### 2.4 Inter-stage Impedance Matching

In this section, inductive degeneration is shown to broaden $BW_{int}$ and thus enable embedding of the optimized output stage of Section 2.3 in a cascaded mm-Wave transformer-coupled PA.





Figure 2.7: Issues in using m = 12,  $V_{GS} - V_t = -150$ mV optimization result: (a) Cascaded amplifier frequency response overly sensitive to PVT and modeling accuracy, (b) AM-PM conversion and DA stage load modulation due to  $C_{gs}$  nonlinearity. Reprinted from [1].

### 2.4.1 Physical Circuit Operation

Figure 2.8: Single-ended model of differential-mode inter-stage matching: (a) circuit, (b) $1^{st}$ simplification, (c) simplified model. Reprinted from [1].

Single-tuned transformer-based inter-stage matching is analyzed in differential mode starting from the single-ended equivalent in Fig. 2.8(a), and simplified further in Fig. 2.8(b) and (c). The circuit in Fig. 2.8(c) is a *T*-model of two magnetically-coupled *LC*-resonators [65], i.e. $L_{1,2}$ with coupling coefficient $k_m$, and capacitors $C_{1,2}$, having two unloaded resonance frequencies [66]:

$$\omega_{1,2} = \frac{\sqrt{2}}{\sqrt{\left(L_1 C_1 + L_2 C_2\right) \pm \sqrt{\left(L_1 C_1 - L_2 C_2\right)^2 + 4C_1 C_2 L_m^2}}},$$
(2.4)

where $k_m = L_m/\sqrt{L_1L_2} = k$, $L_m = L_{mag}$, $L_1 = L_{leak} + L_{mag}$, $L_2 = L_{ser} + L_{mag}$, $C_1 = C_D$, and $C_2 = C_{gs}$. |k| is the ratio of coupled to stored magnetic energy in the resonators, and tighter coupling increases $\Delta \omega \triangleq \omega_2 - \omega_1$ [66]. For $k \gtrsim 0.75$, series $L_{leak}$ and $L_{ser}$ are relatively small, $\omega_2 \gg \omega_1$, so in-band behavior resembles parallel resonance at $\omega_0 \approx \omega_1$. As will be shown mathematically in Section 2.4.2, in this range of $k \gtrsim 0.75$, $BW_{int}$ can be increased by reducing k, or by increasing series resistive loading. Reducing k is avoided because it only weakly increases $BW_{int}$, and because it also increases transformer insertion loss [67]. Emulating resistive loading by using $L_{deg}$ is analyzed in the next section.
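To make the dependence of the two resonances on coupling concrete, the following sketch evaluates (2.4) directly. The function name `coupled_resonances` and the element values are hypothetical: the inductances and capacitances are chosen only so that the lower resonance falls near 28GHz for $k = 0.75$, and they are not the design values of this work.

```python
import math


def coupled_resonances(l1, l2, c1, c2, k):
    """Return (f_low, f_high) in Hz from (2.4), valid for 0 <= k < 1."""
    lm = k * math.sqrt(l1 * l2)              # mutual inductance L_m
    s = l1 * c1 + l2 * c2
    d = math.sqrt((l1 * c1 - l2 * c2) ** 2 + 4.0 * c1 * c2 * lm ** 2)
    w_low = math.sqrt(2.0 / (s + d))         # '+' sign in (2.4): omega_1
    w_high = math.sqrt(2.0 / (s - d))        # '-' sign in (2.4): omega_2
    return w_low / (2.0 * math.pi), w_high / (2.0 * math.pi)


# Hypothetical element values (not taken from the design).
L1 = L2 = 110e-12   # H
C1 = C2 = 165e-15   # F
for k in (0.5, 0.75, 0.9):
    f_low, f_high = coupled_resonances(L1, L2, C1, C2, k)
    print(f"k={k:.2f}: f_low={f_low / 1e9:.1f} GHz, "
          f"f_high={f_high / 1e9:.1f} GHz")
```

For this symmetric example the expression reduces to $\omega_{1,2} = 1/\sqrt{LC(1 \pm k)}$, so increasing $k$ pushes the upper resonance rapidly away from the band while the lower one moves only slightly, in line with the parallel-resonance picture for $k \gtrsim 0.75$.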

### 2.4.2 Driver Load Impedance

The input impedance to the right of a point X (X=A–E) in Fig. 2.8(c) is denoted  $Z_X$ , e.g. the input impedance at the gate of the transistor is:

$$Z_A \approx R_{ser} + (g_m L_{deg}/C_{gs}) + sL_{deg} + 1/sC_{gs}, \qquad (2.5)$$

where  $R_{ser}$  is the resistance in series with the gate. Inductive degeneration emulates a resistor  $(g_m L_{deg}/C_{gs})$  without dissipating power, where  $g_m$  and  $L_{deg}$  are transconductance and degeneration inductance, respectively [10]. The driver stage load  $Z_E$  can be expressed as a rational function, and its denominator as  $D(s) = [(s^2/\omega_1^2) + (s/\omega_1Q_1) + 1] \cdot$  $[(s^2/\omega_2^2) + (s/\omega_2Q_2) + 1]$  to reflect the dual-resonance nature of the circuit. The resonant frequencies  $\omega_{1,2}$  are defined in (2.4), and  $Q_{1,2}$  are their associated quality factors. For the relevant in-band resonance:

$$Q_{1} = \frac{(\omega_{1}^{2}/\omega_{2}^{2} - 1)}{[\omega_{1}^{2}C_{D}\left(L_{leak} + L_{mag}\right) - 1]} \cdot \left[\frac{1}{\omega_{1}\left(C_{gs}R_{ser} + g_{m}L_{deg}\right)}\right].$$
 (2.6)

With insight from Section 2.4.1, and since $Q_1$ in (2.6) decreases as $g_m L_{deg}$ increases, $L_{deg}$ broadens $BW_{int}$, i.e. lowers sensitivity of $Z_E$ to component tolerances and nonlinear $C_{gs}$ variation with signal power. As an example, starting from a differential load $C_{gs} = 165$ fF (i.e. 330/2) of $Q_u \approx 40$ so differential $R_{ser} \approx 0.9\Omega$, and targeting $Z_E = (100 + j0) \Omega$, the Smith chart matching trajectory at resonance (28GHz) is shown in Fig. 2.9(a) for $L_{deg} = 0$, and in Fig. 2.9(b) for $L_{deg} = 28$ pH (i.e. 14×2). In Fig. 2.9(a), greater insertion loss is expected (smaller $L_{mag}$), and larger silicon area (much larger $L_{ser}$), compared to Fig. 2.9(b). Using a complete dual-resonance expression for $Z_E$, independent ±10% Gaussian variations in $C_{gs}$ and $C_D$ result in the $Z_E$ scatter plots in Fig. 2.9(c) and (d), corresponding to Fig. 2.9(a) and (b), respectively. $Z_E$ is less sensitive for $L_{deg} = 28$ pH.
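The numbers quoted in this example can be reproduced with a few lines. The sketch below computes the series resistance implied by $Q_u \approx 40$ at 28GHz and the resistance emulated by degeneration in (2.5). The transconductance `GM_ASSUMED` is a hypothetical single-ended value, chosen only so that $g_m \omega_0 L_{deg}$ stays below 0.2; it is not a reported design value, and the estimate of the $Q_1$ reduction holds only if all other factors in (2.6) are kept fixed.

```python
import math

F0 = 28e9                    # centre frequency from the text, Hz
W0 = 2.0 * math.pi * F0


def series_resistance(c, q_u, w=W0):
    """Series loss resistance of a capacitance c with unloaded Q of q_u."""
    return 1.0 / (w * c * q_u)


def emulated_resistance(gm, l_deg, c_gs):
    """Real part g_m * L_deg / C_gs contributed by degeneration in (2.5)."""
    return gm * l_deg / c_gs


r_ser_diff = series_resistance(165e-15, 40.0)   # differential C_gs and Q_u
r_ser_se = series_resistance(330e-15, 40.0)     # single-ended equivalent

GM_ASSUMED = 0.05                               # S, hypothetical g_m
r_emu_se = emulated_resistance(GM_ASSUMED, 14e-12, 330e-15)

# Ratio of the last factor of (2.6) without and with L_deg.
q1_reduction = (r_ser_se + r_emu_se) / r_ser_se

print(f"R_ser (differential) = {r_ser_diff:.2f} ohm")
print(f"g_m*w0*L_deg = {GM_ASSUMED * W0 * 14e-12:.2f}")
print(f"Q1 reduced by about {q1_reduction:.1f}x for the assumed g_m")
```

The computed differential $R_{ser}$ of about $0.86\Omega$ is consistent with the $\approx 0.9\Omega$ used above. Even a modest assumed $g_m$ makes the emulated resistance several times larger than $R_{ser}$, which is why a small $L_{deg}$ is effective in lowering $Q_1$.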



Figure 2.9: Smith chart trajectories for inter-stage matching to present  $(100 + j0) \Omega$  differentially to DA: (a)  $L_{deg} = 0$ , (b)  $L_{deg}=28$ pH (i.e. 14pH single-ended), and scatter of  $Z_E$  due to independent Gaussian variations ( $\pm 3\sigma \equiv \pm 10\%$ ) in  $C_{gs}$  and  $C_D$ ; -13dB return loss region w.r.t  $(100 + j0) \Omega$  target indicated with circle: (c)  $L_{deg} = 0$ , and (d)  $L_{deg}=28$ pH. Reprinted from [1].

### 2.4.3 Effect of $L_{deg}$ on Power Capability

Figure 2.10: Micrographs of $12 \times 32 \times 1 \mu m/28$nm transistor test structures used in power capability verification: (a) $L_{deg} = 0$, (b) $L_{deg} = 14$pH single-ended. Reprinted from [1].

Table 2.3: Comparison of 29GHz CW load-pull measurement results for single-ended transistor test structures: (a) $L_{deg} = 0$, and (b) $L_{deg} = 14$pH single-ended. Reprinted from [1].

|                             | Fig. 2.10(a)             | Fig. 2.10(b)             |
|-----------------------------|--------------------------|--------------------------|
| $W_{\rm nMOS}(\mu {\rm m})$ | $12 \times 32 \times 1$  | $12 \times 32 \times 1$  |
| $L_{deg}$ (pH)              | 0                        | 14                       |
| $I_{dc}$ (mA)               | 8.1                      | 8.1                      |
| $P_{out}$ (dBm)             | +12                      | +12                      |
| Gain @ $P_{out}$ (dB)       | 10                       | 8                        |
| PAE @ $P_{out}$ (%)         | 48                       | 44                       |

To verify that inductive degeneration does not adversely affect transistor power capability, the two single-ended test transistors without and with $L_{deg}$ in the die photos of Fig. 2.10(a) and (b), respectively, were fabricated on the same test chip as the PA presented in this chapter. The source node of the single-ended device in Fig. 2.10(b) is connected to one terminal of the degeneration inductance ($7\mu$m-wide slab on the ultra-thick metal layer), and a wide, stacked metal mesh is connected to the other terminal of the inductor. The device in Fig. 2.10(a) does not have any source inductor, so its source node is directly connected to the ground mesh. Load-pull measurements using a 29GHz CW signal were carried out, and the results are summarized in Table 2.3 for 1dB compression. A slightly more inductive load impedance is used for the $L_{deg} = 0$ case. The chosen 14pH of inductive degeneration lowered device power gain at +12dBm $P_{out}$ from 10dB to 8dB, and reduced its PAE at the same +12dBm $P_{out}$ from 48% to 44%. The observed 2dB drop in power gain contributes to this PAE degradation. Extra care was exercised to minimize the ground path impedance for the test device of Fig. 2.10(b) by using a wide and stacked metal mesh surrounding the device for grounding. However, it is still possible that unwanted loss resistance in series with the 14pH $L_{deg}$ also contributes to the PAE degradation. The chosen $L_{deg}$ did not degrade power capability in any significant way.
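The contribution of the 2dB gain drop to the PAE degradation in Table 2.3 can be estimated from the identity $\text{PAE} = \eta_d\,(1 - 1/G)$, where $\eta_d$ is drain efficiency and $G$ is linear power gain. The sketch below assumes, as a simplification not stated in the measurements, that $\eta_d$ is identical for both test structures at +12dBm; the function names are illustrative.

```python
def drain_efficiency_from_pae(pae, gain_db):
    """Invert PAE = eta_d * (1 - 1/G) for the drain efficiency eta_d."""
    gain = 10.0 ** (gain_db / 10.0)
    return pae / (1.0 - 1.0 / gain)


def pae_from_drain_efficiency(eta_d, gain_db):
    """PAE = eta_d * (1 - 1/G), with G the linear power gain."""
    gain = 10.0 ** (gain_db / 10.0)
    return eta_d * (1.0 - 1.0 / gain)


# L_deg = 0 column of Table 2.3: PAE = 48% at 10 dB gain.
eta_d = drain_efficiency_from_pae(0.48, 10.0)
# Same (assumed) drain efficiency with the gain lowered to 8 dB.
pae_8db = pae_from_drain_efficiency(eta_d, 8.0)

print(f"eta_d = {100 * eta_d:.1f}%, "
      f"predicted PAE at 8 dB gain = {100 * pae_8db:.1f}%")
```

Under this assumption the gain drop alone lowers PAE from 48% to roughly 45%, i.e. it accounts for most of the measured change to 44%. The small remainder is compatible with, but does not prove, the additional series loss mentioned above.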

### 2.4.4 Effect of $L_{deg}$ on Gain and Distortion

First, the AM-PM due to the inter-stage matching by itself is briefly studied. Transformer voltage gain  $A_v(\omega) \triangleq V_A/V_E = |A(\omega)| e^{j\varphi(\omega)}$  contributes AM-PM conversion due to shifting  $\omega_0$  with signal level (see Fig. 2.7(b), and [13, 62]). Mathematically, this contribution is  $\propto (d\varphi/dV_A) \approx (d\varphi/d\omega|_{\omega_0}) \cdot (d\omega_0/dC_{gs}) \cdot (dC_{gs}/dV_A)$ , where the chain rule of derivatives was used. Resonator quality factor is *defined* as  $\frac{1}{2}\omega_0 (d\varphi/d\omega|_{\omega_0})$  [68], and therefore  $L_{deg}$  reduces AM-PM by lowering the in-band quality factor of  $A_v$ , which is identical to  $Q_1$  in (2.6).

We now turn to discuss the effects of  $L_{deg}$  on gain, and on the overall linearity of the output stage, including the inter-stage matching. It is well-known that negative feedback has the beneficial effect of reducing distortion, and the adverse effect of reducing gain, by factors that increase with the associated loop gain [10].  $L_{deg}$  is therefore expected to contribute to linearizing output stage AM-AM response. Additionally, Volterra series analysis of a differential class-AB bipolar stage [69] suggests  $L_{deg}$  can reduce AM-PM conversion in the effective *transconductance* of the stage. On the other hand, the inverse relation between power gain and size of source degeneration inductance observed in Table 2.3 has been analyzed in e.g. [21, 70]. Thus, the loop gain of the negative feedback by  $L_{deg}$  in this work cannot be chosen large to minimize distortion, since it is very important to minimize gain degradation, and hence minimize back-off PAE degradation in the output stage  $(g_m \times \omega_0 L_{deg} \lesssim 0.2$  for quiescent bias point).

To illustrate the improvement in linearity due to $L_{deg}$, the AM-AM/AM-PM response of the output stage is simulated for different $L_{deg}$ values in the set $\{0, 5, 15, 25, 50\}$pH to cover zero, small, moderate, and heavy degeneration. The differential version of the inter-stage matching network topology of Fig. 2.8 is used, with ideal/lossless components. No losses were included in the matching components to avoid influencing effective $R_{ser}$ and therefore $Q_1$ in (2.6). The input source has $120\Omega$ output impedance, and it models the driver stage signal at the same 28GHz frequency. Note that the AM-AM/AM-PM conversion characteristic is sensitive to any slight detuning between input CW frequency and the matching center frequency for lossless components, so the matching network is redesigned at each $L_{deg}$ to maintain ~30dB return loss relative to $120\Omega$ with a *fixed* 28GHz center frequency. Loss of the output match, and RC-extracted transistor layout are included in the simulation, and results are shown in Fig. 2.11.

It can be seen in Fig. 2.11 that even the smallest degeneration of 5pH contributes to improvement in AM-AM and AM-PM: the gain expansion is reduced by 0.7dB and the sharp increase in lagging AM-PM is slowed down by >10 degrees at the input power for 3dB of compression relative to small-signal gain. AM-AM and AM-PM conversion are also significantly reduced with larger $L_{deg}$, until the improvement saturates at the largest simulated value. At 50pH, the gain expansion is 0.2dB and the AM-PM is <1 degree up to 3dB compression. In this sweep of $L_{deg}$, the small-signal gain of the output stage drops from 17.5dB at $L_{deg} = 0$ to 5.8dB at $L_{deg} = 50$pH, while at 15pH the small-signal gain of the stage is 10.3dB. The design value is 14pH, where the gain is $\approx$11dB; additional gain degradation cannot be tolerated to further improve linearity.

### 2.5 Circuit Implementation

Using the developed concepts, a 28GHz two-stage transformer-coupled PA is designed in a 1P7M 28nm CMOS technology, having ultra-thick metal (copper, UTM) and redistribution (aluminum, RDL) layers. The circuit is shown in Fig. 2.12(a): both stages use the same topology, but different criteria for parameter/element values. Matching network design is based on single-tuned transformers. Table 2.4 gives a summary of the key design values, while concepts unique to this implementation relative to published considerations for high-efficiency mm-Wave PA layout [14, 15] are mentioned next.

Figure 2.11: Simulated AM-AM/AM-PM response of the output stage using re-designed lossless input matching for each $L_{deg} \in \{0, 5, 15, 25, 50\}$pH at $W_{nMOS} = 12 \times 32 \times 1 \mu m$ and $J_{PA} = 12 \mu A / \mu m$: (a) AM-AM response, and (b) AM-PM response. Reprinted from [1].

Figure 2.12: Schematic of PA circuit: (a) two-stage transformer-coupled topology, (b) push-pull stage with capacitive neutralization capacitor $C_n$ and single-ended source degeneration inductor $L_{deg}$. Reprinted from [1].

|                                                    | Output<br>Stage (PA) | Driver<br>Stage (DA) |
|----------------------------------------------------|----------------------|----------------------|
| $W_{\rm nMOS}/L_{\rm nMOS}[\mu {\rm m/nm}]$        | $12 \times 32/28$    | $6 \times 32/28$     |
| Single-ended $C_{input}$ [fF]                      | 330                  | 165                  |
| $C_n$ [fF]                                         | 67                   | 34                   |
| Single-ended $L_{deg}$ [pH]                        | 14                   | 31                   |
| Bias Current Density $J_{dc}$<br>[$\mu A/\mu m$]   | 12                   | 22                   |
| Differential $R_{\text{Load,diff}}$ [$\Omega$]     | 49                   | 120                  |
| Differential Load Xfmr $L_{\text{self,diff}}$ [pH] | 135                  | 90                   |
| Gain [dB]                                          | 8                    | 9                    |

Table 2.4: Summary of design values for two-stage power amplifier. Reprinted from [1].

### 2.5.1 Core Stages

Starting from the basic circuit of Fig. 2.4(b), with optimized loading, width, and bias from simulations in Section 2.3, introducing $L_{deg}$ as in Section 2.4 completes the PA stage. Prioritizing PAE, $L_{deg} = 14$pH is *just* enough to effectively widen $BW_{int}$ of inter-stage matching (Fig. 2.9) and to help reduce distortion. Adding $L_{deg}$ results in some minor shift of the optimum $W_{nMOS}$, $V_{GS}$, and $\Gamma_{L,opt}$ relative to their chosen values in Section 2.3, but a re-design was not attempted. To realize this small $L_{deg} = 14$pH with minimal series resistance, and to comply with current density rules, the structure is drawn as two $7\mu$m-wide slabs. The two UTM slabs form the yellow V-shape in the 3D model of Fig. 2.12(b). To account for added magnetic and capacitive coupling, electromagnetic (EM) simulations for $L_{deg}$ design include both gate and source routing in close proximity (gray traces and red 'forks' in Fig. 2.12(b), respectively). The two lowest metal layers M1 and M2 form a stacked ground plane. $L_{deg}$ conducts the differential mode currents, and has its center tap tied to M1–M2. Thus, the M1–M2 plane provides a predictable path/impedance for the common mode, minimizes parasitic d.c. voltage drop, and grounds transistor bulks.

Metal-oxide-metal (MOM) capacitors with a nominal value that maximizes core reverse isolation are used to implement $C_n$ in Fig. 2.12(a). The layout uses 1µm-long fingers tied to a tapered manifold, and reduced substrate doping density below the capacitor (native layer), to help reduce the series and parallel resistive losses, respectively. The measured capacitance and quality factor of $C_n$ in the PA stage are shown vs. frequency in Fig. 2.13(a). The nominal design value of $C_n$ is 67fF, while the measured value is ≈64fF from Fig. 2.13(a). Also, Fig. 2.13(b) shows that measured Q translates to >1.5k$\Omega$ of shunt-equivalent resistance up to 40GHz. Therefore, the measured capacitor loss has only a minor impact on $P_{out}$/PAE.
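The conversion between capacitor quality factor and shunt-equivalent resistance used for Fig. 2.13(b) can be sketched with the high-Q approximation $R_p \approx Q/(\omega C)$. The helper names below are illustrative, and the computed Q is only the value implied by the quoted 1.5k$\Omega$ bound at 40GHz for the measured 64fF, not a reading from the measurement.

```python
import math


def shunt_equivalent_resistance(q, c, f_hz):
    """High-Q approximation R_p ~= Q / (omega * C) for a lossy capacitor."""
    return q / (2.0 * math.pi * f_hz * c)


def q_for_shunt_resistance(r_p, c, f_hz):
    """Inverse relation: quality factor corresponding to a given R_p."""
    return r_p * 2.0 * math.pi * f_hz * c


q_min = q_for_shunt_resistance(1.5e3, 64e-15, 40e9)
print(f"Q implied by 1.5 kohm at 40 GHz with 64 fF: {q_min:.1f}")
```

A shunt resistance above 1.5k$\Omega$ is more than an order of magnitude larger than the $49\Omega$ differential load of the PA stage in Table 2.4, which is consistent with the small impact on $P_{out}$/PAE noted above.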

Driver amplifier (DA) design targets are sufficient power gain and minimal influence on cascaded amplifier linearity. Accordingly, DA transistor width is half that in the PA stage to avoid DA-limited saturation. Since the PA stage is biased in sub-threshold, class-A-like biasing must be avoided in the DA stage to maintain back-off PAE, which degrades DA gain. Further degradation of gain results from a relatively large $L_{deg}$ required for input matching to the 50$\Omega$ driving impedance dictated by on-wafer probe testing. On the other hand, two measures help to partially recover DA gain. First, although still in sub-threshold, a bias current density of $J_{DA} \approx 2 \times J_{PA}$ is chosen to increase gain, where $J_{PA}$ ($J_{DA}$) is the current density in the PA (DA) stage. Second, the targeted shunt load resistance at resonance is 120$\Omega$ differentially for the DA, i.e. $> 2 \times 49\Omega$, where 49$\Omega$ is the corresponding value for the PA stage. Some additional gain improvement is possible if the amplifier is integrated into a front-end. In this case, an on-chip circuit drives the DA, so a driving impedance >50$\Omega$ could be chosen. Finally, layout considerations described for the PA stage apply to the DA stage, but the larger $L_{deg}$ is realized by a differential two-turn spiral.



Figure 2.13: Measured MOM neutralization capacitor characteristics: (a) capacitance and quality factor, (b) shunt-equivalent loss resistance calculated from capacitance and quality factor. Reprinted from [1].

### 2.5.2 Transformers

Transformers are implemented as vertical/broadside-coupled concentric spirals [65], and a few design considerations are briefly mentioned here. First, a ground plane is included in both transformer EM models and corresponding implementations to consistently define common mode signal path, i.e. improve predictability. Close proximity to a ground plane lowers inductor self-resonance frequency, so a  $\approx 30\mu$ m radial clearance is allowed. EM models are generated up to  $4-6\times$  the center frequency of the amplifier so they can be reasonably used in linearity simulations. To facilitate choice of transformer radius based on target inductance, it helps to de-embed the effect of parasitic capacitance by using an equivalent lumped circuit extraction. De-embedding the parasitic capacitance is accomplished by fitting the broadband EM simulation data to a lumped  $2\pi$ -prototype like that in [14] via numerical optimization. Maximum s-parameter fitting error is typically 1-5% (magnitude) and  $3-5^{\circ}$  (phase).

Losses in the output match, as well as deviation from $\Gamma_{L,opt}$ in loading of the PA stage, have a critical effect on $P_{out}$/PAE and must be minimized. The output matching balun XF in Fig. 2.12(a) is fabricated as a separate test structure (without center taps) as shown by its micrograph in Fig. 2.14(a). Measured and EM simulated self inductances and quality factors of the primary/secondary windings are shown in Fig. 2.14(b), with <10% error in inductances. The measured and simulated maximum available gain (MAG) of XF in differential mode are also overlaid in Fig. 2.14(c), showing good estimation of its loss: 0.58dB simulated and 0.72dB measured minimum insertion loss at 30GHz.

### 2.6 Experimental Results

The PA is fabricated in 1P7M 28nm CMOS LP. The die micrograph is shown in Fig. 2.15, and core dimensions are  $0.62 \times 0.25 \text{mm}^2$ . A d.c. probe provides bias/supply voltages to each of the two stages using separate pads for diagnostics. RF performance is characterized using on-wafer probing at bias current densities  $\{J_{PA}, J_{DA}\} = \{12, 22\} \mu \text{A}/\mu \text{m}$  (unless otherwise stated).

## 2.6.1 Measured Data

Measured small-signal s-parameters are shown in Fig. 2.16(a), with peak  $s_{21}$  of 15.7dB, and a -3dB bandwidth of 3.85GHz (27.35–31.2GHz), centered around  $f_0 = 29.25$ GHz. Input return loss at  $f_0$  is better than 20dB and remains better than 10dB over 28–31.35GHz. The PA is well-tuned to the target band overall.

CW signal  $P_{in}$  sweep results are measured across 27–31GHz with 1GHz step. Peak CW signal performance is at 30GHz, and is shown in Fig. 2.16(b) with  $G_{ss}$ =15.7dB,  $P_{sat}$ =14dBm,  $P_{1dB}$ =13dBm, PAE<sub>max</sub>=35.5%, and PAE at  $P_{sat}$ – 9.6dB of 10%.  $P_{sat}$  is defined as  $P_{out}$  at 3dB compression. Key large-CW-signal power and PAE metrics at saturation and back-off are plotted vs. frequency in Fig. 2.17(a).  $P_{sat}$  is >13.5dBm over 29–31GHz, and PAE<sub>max</sub> is >32% over 28–31GHz. PAE at  $P_{sat}$ – 9.6dB is >8% over 27–31GHz. Saturated metrics at 30GHz are plotted vs.  $V_{DD}$  over 1.0–1.15V in Fig. 2.17(b); nominal  $V_{DD}$  for this 28nm process is 1.05V. A lower 1V supply is used because of reliability concerns, mainly hot carrier injection (HCI). Other potential nMOS transistor degradation mechanisms such as time-dependent dielectric breakdown (TDDB) are less of a concern in the implemented nMOS-only circuit [15,71].

The setup in Fig. 2.18 is used to measure average  $P_{out}$ , EVM, and PAE for a 64-QAM OFDM signal of 9.6dB PAPR, across an  $f_c$  range of 28–30GHz, for  $BW_{sig} = \{150, 250\}$ MHz, i.e.  $\{0.9, 1.5\}$ Gbps data rate. Careful manual tuning of I-Q channels at each  $f_c$  corrects for baseband digital swing imbalance/symbol clock errors (tuned in M8190), and for RF I-Q amplitude/phase errors (tuned in E8267D). Using a thru element (from impedance standard substrate) in place of the PA device under test (DUT), the measured EVM floor of the setup is  $\{-38, -36\}$ dBc for  $BW_{sig} = \{150, 250\}$ MHz, i.e. has  $\geq 11$ dB of margin from EVM<sub>req</sub> across 28–30GHz.
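The quoted data rates and EVM margin follow from simple arithmetic, written out here as a check. The rate estimate assumes 6 bits per 64-QAM symbol and one symbol per second per hertz of  $BW_{sig}$, ignoring coding, pilot, and cyclic-prefix overhead; the symbols  $R_b$  and  $\text{EVM}_{floor}$  are introduced here for illustration only.

$$
R_b = \log_2(64) \times BW_{sig} = 6 \times \{150, 250\}\,\text{MHz} = \{0.9, 1.5\}\,\text{Gbps}
$$

$$
\text{EVM}_{req} - \text{EVM}_{floor} = -25 - \{-38, -36\} = \{13, 11\}\,\text{dB} \geq 11\,\text{dB}
$$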

Best measured output spectrum, adjacent channel power ratio (ACPR), and constellation for 1V supply at  $BW_{sig}$  of 250MHz and  $f_c = 30$ GHz are shown in Fig. 2.19. Average  $P_{out}$ /PAE of 4.2dBm/9% are achieved at EVM<sub>req</sub> = -25dBc. Fixing all parameters, and lowering  $BW_{sig}$  to 150MHz, average  $P_{out}$ /PAE increase to 5.2dBm/11%. Summaries of measured 64-QAM OFDM performance vs.  $f_c$  for 1V supply, and vs.  $V_{DD}$  at 30GHz, are shown in Fig. 2.20(a) and in Fig. 2.20(b), respectively, all at constant EVM (-25dBc). Average  $P_{out}$  is  $\approx 1$ dB higher for  $BW_{sig}$  of 150MHz than for 250MHz, independent of  $f_c$  (Fig. 2.20(a)), and of  $V_{DD}$  (Fig. 2.20(b)).
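For reference, PAE relates the reported quantities through  $\text{PAE} = (P_{out} - P_{in})/P_{DC}$. The Python sketch below inverts this relation to estimate the d.c. power implied by the 4.2dBm/9% operating point. It assumes that the gain at this back-off level equals the small-signal gain of 15.7dB, an approximation introduced here rather than a measured value; the function names are illustrative.

```python
def dbm_to_mw(p_dbm):
    """Convert a power level from dBm to milliwatts."""
    return 10 ** (p_dbm / 10)

def pae(pout_dbm, pin_dbm, pdc_mw):
    """Power added efficiency as a fraction: (Pout - Pin) / Pdc."""
    return (dbm_to_mw(pout_dbm) - dbm_to_mw(pin_dbm)) / pdc_mw

def implied_dc_power_mw(pout_dbm, gain_db, pae_fraction):
    """D.c. power that yields the given PAE at the given output power and gain."""
    pin_dbm = pout_dbm - gain_db
    return (dbm_to_mw(pout_dbm) - dbm_to_mw(pin_dbm)) / pae_fraction

# 64-QAM OFDM operating point at 250MHz: 4.2dBm average Pout at 9% PAE.
# Assumption: gain at this level is taken as the small-signal 15.7dB.
pdc_mw = implied_dc_power_mw(pout_dbm=4.2, gain_db=15.7, pae_fraction=0.09)
```

Under these assumptions the estimate is roughly 28mW. Because the gain is about 16dB, the input power term changes the result by less than 3%.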

Measured AM-AM/AM-PM characteristics at 30GHz are shown in Fig. 2.21 for three example  $J_{PA}$  values. Two key features of AM-AM/AM-PM data are  $P_{1dB}$ , and maximum |AM - PM| relative to small-signal for  $P_{out} \leq P_{1dB}$ ; they are plotted vs.  $J_{PA}$  in Fig. 2.22(a) and (b), respectively. AM-AM is measured with -10dB/-20dB input/output directional couplers. Coupled ports feed two power sensors so  $P_{in}/P_{out}$  are concurrently measured by a two-channel power meter. Simultaneously, relative insertion phase is measured using a network analyzer connected to the PA via coupler thru ports. AM-PM is reported as network analyzer insertion phase reading vs. power meter reading. Instrument noise affects the data due to weak coupled port outputs at low signal power. Smoothing is used to reduce noise, and key data features are retained (somewhat conservatively for  $P_{1dB}$ ), as is visually evident in Fig. 2.21 and quantitatively evident in Fig. 2.22.
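The two features named above can be extracted from a power sweep with a few lines of Python. The sketch below is one possible procedure, not the processing used for Fig. 2.22: it takes the small-signal gain and phase as the average of the first few sweep points, locates 1dB gain compression by linear interpolation, and reports the largest phase deviation over the points below that level. The function name, the `n_small_signal` parameter, and the synthetic sweep are assumptions made for illustration, and no smoothing is applied.

```python
def extract_p1db(pin_dbm, pout_dbm, phase_deg, n_small_signal=5):
    """Return (P1dB in dBm, max |AM-PM| in degrees over points below P1dB)."""
    gain = [po - pi for pi, po in zip(pin_dbm, pout_dbm)]
    g_ss = sum(gain[:n_small_signal]) / n_small_signal         # small-signal gain
    ph_ss = sum(phase_deg[:n_small_signal]) / n_small_signal   # small-signal phase
    for i in range(1, len(gain)):
        c0, c1 = g_ss - gain[i - 1], g_ss - gain[i]            # compression in dB
        if c0 < 1.0 <= c1:
            frac = (1.0 - c0) / (c1 - c0)
            p1db = pout_dbm[i - 1] + frac * (pout_dbm[i] - pout_dbm[i - 1])
            max_pm = max(abs(ph - ph_ss) for ph in phase_deg[:i])
            return p1db, max_pm
    raise ValueError("sweep does not reach 1 dB compression")

# Synthetic sweep (illustrative only): 15dB gain with quadratic compression in dB.
pin = [-20 + 0.5 * k for k in range(41)]                  # -20 ... 0 dBm
comp = [0.1 * max(0.0, p + 5.0) ** 2 for p in pin]        # gain compression, dB
pout = [p + 15.0 - c for p, c in zip(pin, comp)]
phase = [0.5 * c for c in comp]                           # AM-PM, degrees

p1db, max_pm = extract_p1db(pin, pout, phase)
```

On measured data, the same function would be applied after smoothing, since noise at low power directly perturbs the small-signal references.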

#### 2.6.2 Comparison with State-of-the-art

Table 2.5 shows a comparison with state-of-the-art silicon mm-Wave PAs. We first compare to 60GHz bulk CMOS PAs [15, 17] in terms of well-reported CW benchmarks. Due to the lower 30GHz frequency, this work achieves better power gain per cascaded stage, despite a significantly lower d.c. bias current density. Also, comparable CW  $P_{sat}$  per combined PA path is achieved, which is attributed to similar supply voltages and 1:1 transformer output matching per path.

We now turn to compare measured 64-QAM signal performance with [17]. This work achieves significantly greater PAE for the same  $BW_{sig}/f_c$  and EVM, and at slightly higher average  $P_{out}$  per combined PA path. Note that [17] uses two-way combining and capacitance linearization.

CW PAE at  $P_{sat}$ – 9.6dB is used, as previously in [7], to fairly compare back-off PAE at -25dBc EVM for 64-QAM OFDM with publications that do not report this measurement. The SiGe BiCMOS PA of [49] incurs the d.c. current of only its single stage, and the SOI PA in [59] is a nonlinear class-E design with high PAE<sub>max</sub>. This work achieves comparable back-off PAE to them both. Also, despite lacking the additional bulk bias control per transistor segment inherent to the FD-SOI technology used in [20], this work achieves greater back-off PAE.

Overall, Table 2.5 shows that the implemented PA meets or exceeds the state-of-the-art.

## 2.6.3 Comparison with 5G Requirements

Average  $P_{out}$  at 150/250MHz  $BW_{sig}$  and  $EVM_{req}$  is 5.2/4.2dBm, supporting a range of 20–30m from Fig. 2.3(b). This short uplink range estimate is a result of the more practical link budget constraints in Section 2.2 than in [45]. Also, at the wider 250MHz  $BW_{sig}$ , average  $P_{out}$  is  $\approx$  2dB less than the 6.5dBm predicted in Section 2.3.3, assuming the DA minimally impacts linearity.

Although computationally efficient, extracting AM-AM/AM-PM characteristics from CW signal power sweep simulations to model PA nonlinearity in EVM estimates as in Section 2.3 ignores relevant circuit memory effects, e.g. short term effects of limited bandwidth about  $f_0$  [72], and long term effects of low-frequency (bias network) impedance termination [73]. These memory effects can manifest experimentally as dependence of  $P_{out}$  at fixed EVM on  $BW_{sig}$ .
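To make the behavioral approach concrete, the Python sketch below passes a 64-QAM OFDM signal through a memoryless AM-AM/AM-PM model and computes EVM after removing one common complex gain. It is a minimal illustration under stated assumptions, not the code used in Section 2.3: the PA is represented by a hypothetical Rapp-type AM-AM curve with a simple saturating AM-PM term, the signal uses 64 critically sampled subcarriers, and the parameter values are arbitrary. Being memoryless, the model has no dependence on  $BW_{sig}$, which is exactly the limitation discussed above.

```python
import cmath
import math
import random

N_SUB = 64  # subcarriers per OFDM symbol (small, so a direct DFT is fast enough)
TWIDDLE = [cmath.exp(2j * math.pi * m / N_SUB) for m in range(N_SUB)]
LEVELS = [-7, -5, -3, -1, 1, 3, 5, 7]  # 64-QAM levels per I/Q rail

def idft(freq):
    """Unitary inverse DFT: subcarrier symbols -> time samples."""
    return [sum(freq[k] * TWIDDLE[(k * t) % N_SUB] for k in range(N_SUB))
            / math.sqrt(N_SUB) for t in range(N_SUB)]

def dft(time):
    """Unitary forward DFT: time samples -> subcarrier symbols."""
    return [sum(time[t] * TWIDDLE[(-k * t) % N_SUB] for t in range(N_SUB))
            / math.sqrt(N_SUB) for k in range(N_SUB)]

def pa_model(v, a_sat=1.0, p=2.0, pm_deg=10.0):
    """Hypothetical memoryless PA: Rapp AM-AM plus a saturating AM-PM term."""
    u = (abs(v) / a_sat) ** 2
    gain = 1.0 / (1.0 + u ** p) ** (1.0 / (2 * p))
    phase = math.radians(pm_deg) * u / (1.0 + u)
    return v * gain * cmath.exp(1j * phase)

def evm_db(backoff_db, n_sym=20, seed=1):
    """EVM in dB for an rms input level `backoff_db` below the saturation amplitude."""
    rng = random.Random(seed)
    drive = 10 ** (-backoff_db / 20)
    tx, rx = [], []
    for _ in range(n_sym):
        symbols = [complex(rng.choice(LEVELS), rng.choice(LEVELS)) / math.sqrt(42)
                   for _ in range(N_SUB)]
        out = [pa_model(drive * s) for s in idft(symbols)]
        tx += symbols
        rx += dft(out)
    # One complex gain (least squares) removes average gain and phase rotation.
    g = sum(r * t.conjugate() for r, t in zip(rx, tx)) / sum(abs(t) ** 2 for t in tx)
    err = sum(abs(r - g * t) ** 2 for r, t in zip(rx, tx))
    ref = sum(abs(g * t) ** 2 for t in tx)
    return 10 * math.log10(err / ref)

evm_sweep = {bo: evm_db(bo) for bo in (20, 12, 6)}  # back-off in dB -> EVM in dB
```

Replacing `pa_model` with interpolation over tabulated AM-AM/AM-PM data would turn the same loop into a sweep of EVM vs. average power for a specific amplifier.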

## 2.6.4 Discussion

To evaluate the AM-AM/AM-PM modeling method for EVM estimation, any mismatch between measured/simulated amplifier characteristics should be de-embedded. Accordingly, the same code used for 64-QAM OFDM signal power sweeps through simulated AM-AM/AM-PM models in Section 2.3 is applied to the *measured* AM-AM/AM-PM data from Fig. 2.21 (after smoothing). EVM is plotted vs. average  $P_{out}$  from this simulation in Fig. 2.23 for a  $J_{PA}$  range encompassing  $12\mu A/\mu m$ . Direct measurements of EVM vs. average  $P_{out}$  using the setup of Fig. 2.18 are overlaid on the same axes for  $BW_{sig} = \{150, 250\}$ MHz at  $J_{PA} = 12\mu A/\mu m$ . Intuitively, validity of an AM-AM/AM-PM model extracted from CW power sweeps improves as the  $BW_{sig}$  to amplifier bandwidth ratio decreases. Therefore, Fig. 2.23 may be interpreted as follows: as  $BW_{sig}$  falls from 250MHz, the measured EVM vs.  $P_{out}$  curve shifts to the right and approaches its most optimistic estimate from using measured CW AM-AM/AM-PM data in behavioral simulation, i.e.  $BW_{sig} \rightarrow 0$ .

|                                   | This<br>Work   | ISSCC'14<br>[10] | JSSC'13<br>[9] | ISSCC'15<br>[11]       | SiRF'14<br>[12]         | TMTT'14<br>[18] |
|-----------------------------------|----------------|------------------|----------------|------------------------|-------------------------|-----------------|
| Technology                        | 28nm<br>CMOS   | 40nm<br>CMOS     | 40nm<br>CMOS   | 28nm<br>UTBB<br>FD-SOI | 120nm<br>SiGe<br>BiCMOS | 45nm SOI        |
| Frequency [GHz]                   | 30             | 63               | 61             | 60                     | 28                      | 47              |
| Supply Voltage [V]                | 1.0            | $2 \times 0.9$   | 1.0            | 1.0                    | 3.6                     | 2.4             |
| Number of Stages                  | 2              | 3                | 3              | 3                      | 1                       | 1               |
| Combined PA Paths                 | 1              | 2                | 2              | 4                      | 1                       | 1               |
| Gain [dB]                         | 15.7           | 22.4             | 17             | 15.4                   | 15.3                    | 13              |
| $P_{sat}$ [dBm]                   | 14             | 16.4             | 17             | 18.8                   | 18.6                    | 17.6            |
| $P_{1dB}$ [dBm]                   | 13.2           | 13.9             | 13.8           | 18.2                   | 15.5                    | 11.3            |
| PAE<sub>max</sub> [%]             | 35.5           | 23               | 30.3           | 21                     | 35.3                    | 34.6            |
| PAE<sub>1dB</sub> [%]             | 34.3           | 18.9             | 21.6           | 21                     | 31.5                    | 17              |
| PAE @ $P_{sat}$ – 9.6dB [%]       | 10             | 5*               | 6.5*           | 7*                     | 10.5*                   | 9*              |
| $FOM^{\dagger}$                   | 74.7           | 88.4             | 84.4           | 83.0                   | 78.3                    | 79.4            |
| Signal                            | 64-QAM<br>OFDM | 64-QAM           | –              | –                      | –                       | –               |
| PAPR [dB]                         | 9.6            | 8.1              | –              | –                      | –                       | –               |
| EVM [dBc]                         | -25            | -25              | –              | –                      | –                       | –               |
| RF BW [MHz]                       | 250            | 500              | –              | –                      | –                       | –               |
| $P_{out}$ @ EVM [dBm]             | 4.2            | 7                | –              | –                      | –                       | –               |
| PAE @ EVM [%]                     | 9              | 5                | –              | –                      | –                       | –               |
| D.C. Power [mW]                   | 17.5           | 88               | 75             | 74                     | 61.2                    | –               |
| Active Area [mm<sup>2</sup>]      | 0.16           | 0.081            | 0.074          | 0.16                   | 0.45**                  | 0.12            |

Table 2.5: Comparison with state-of-the-art silicon mm-Wave PAs. Reprinted from [1].

\*Graphically estimated. \*\*With pads.  $^{\dagger}FOM \triangleq P_{sat} [dBm] + Gain [dB] + 10 \log_{10} (PAE_{max} [\%]) + 20 \log_{10} (Freq. [GHz])$ 
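The figure of merit defined in the footnote of Table 2.5 is straightforward to evaluate. The short Python sketch below applies that definition to the tabulated  $P_{sat}$, gain, PAE<sub>max</sub>, and frequency entries of four columns as a consistency check; the function and dictionary names are illustrative, and the inputs are the rounded table values.

```python
import math

def fom(psat_dbm, gain_db, pae_max_percent, freq_ghz):
    """FOM = Psat [dBm] + Gain [dB] + 10log10(PAEmax [%]) + 20log10(Freq [GHz])."""
    return (psat_dbm + gain_db
            + 10 * math.log10(pae_max_percent)
            + 20 * math.log10(freq_ghz))

# (Psat [dBm], Gain [dB], PAEmax [%], Frequency [GHz]) as listed in Table 2.5.
table_rows = {
    "This work": (14.0, 15.7, 35.5, 30.0),
    "ISSCC'14": (16.4, 22.4, 23.0, 63.0),
    "SiRF'14": (18.6, 15.3, 35.3, 28.0),
    "TMTT'14": (17.6, 13.0, 34.6, 47.0),
}
fom_values = {name: round(fom(*row), 1) for name, row in table_rows.items()}
```

The  $20 \log_{10}$  frequency term contributes about 6.4dB more at 63GHz than at 30GHz, which accounts for part of the gap between this work and the 60GHz designs.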

Transient simulation of EVM at transistor-level can correctly capture memory effects [74], but it is prohibitively complex. Instead, simple two-tone simulations are used to investigate the effect of signal bandwidth on third-order distortion in the PA.  $IM_3$  for tone spacing  $\Delta f = \{20, 75, 125, 150, 250\}$ MHz is shown in Fig. 2.24(a) and (b) for lower and upper sidebands, respectively.  $IM_3$  degrades for wider tone spacing, confirming the trend in Fig. 2.23. One exception is improving  $IM_3$  for 150–250MHz. In [75], it is shown that second-order distortion causes a low-frequency 'beat' that modulates the power supply, and [75] concludes that large off-chip bypass capacitors are needed to lower the supply node impedances over the beat frequency range. Similarly, [73] shows that second-order distortion causes a low-frequency beat that modulates input/gate bias. For  $V_{DD}$  nodes in our design, three bypass capacitors are used: one on-chip (20pF), one at the d.c. probing needle tip (120pF), and one on the PCB of the d.c. probe (10nF). In the gate bias networks, the same bypass capacitor values are used, but large on-chip blocking resistors were also included. Therefore, the EVM degradation with  $BW_{sig}$  in Fig. 2.23, and the similar initial  $IM_3$  degradation between 20–150MHz in Fig. 2.24, may be the result of excessively high gate bias network impedance. The unexpected improvement in  $IM_3$  between 150–250MHz in Fig. 2.24 is potentially due to limitations of using a two-tone test. For a two-tone input, the sub-harmonic beat is a *single tone* that samples bias network impedance at only one point, i.e. its spectrum is two impulses at  $\pm \Delta f$ . On the other hand, for 64-QAM OFDM input as in Fig. 2.23, the signal modulating the bias nodes has a continuous spectrum extending over  $\pm BW_{sig}/2$ . Therefore, a more complicated multi-tone signal simulation that modulates the bias nodes with tones spread across  $\pm BW_{sig}/2$  may provide greater resolving power, and therefore better consistency with 64-QAM OFDM measurements at different  $BW_{sig}$  than a two-tone test.
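The origin of the  $IM_3$  products and of the low-frequency beat can be illustrated with a memoryless polynomial, as in the Python sketch below. This is a textbook model and not the transistor-level simulation behind Fig. 2.24: the coefficients `a1`, `a2`, `a3`, the tone amplitude, and the bin indices are arbitrary illustrative choices. Two equal tones at bins `K1` and `K2` produce third-order products at  $2f_1 - f_2$  and a second-order beat at  $f_2 - f_1$, whose amplitudes are read from single-bin DFTs.

```python
import cmath
import math

N = 1024            # samples in one coherent record
K1, K2 = 100, 104   # tone bins; the 4-bin spacing plays the role of the tone spacing

def tone_amplitude(samples, k):
    """Amplitude of the real sinusoid at DFT bin k (coherent sampling assumed)."""
    acc = sum(s * cmath.exp(-2j * math.pi * k * n / N) for n, s in enumerate(samples))
    return 2 * abs(acc) / N

def two_tone_response(a1, a2, a3, amp):
    """Drive y = a1*x + a2*x^2 + a3*x^3 with two equal tones of amplitude `amp`."""
    x = [amp * (math.cos(2 * math.pi * K1 * n / N) + math.cos(2 * math.pi * K2 * n / N))
         for n in range(N)]
    y = [a1 * v + a2 * v ** 2 + a3 * v ** 3 for v in x]
    fund = tone_amplitude(y, K1)             # fundamental at f1
    im3_low = tone_amplitude(y, 2 * K1 - K2) # lower IM3 at 2*f1 - f2
    beat = tone_amplitude(y, K2 - K1)        # second-order beat at f2 - f1
    return fund, im3_low, beat

fund, im3_low, beat = two_tone_response(a1=1.0, a2=0.1, a3=-0.05, amp=0.5)
im3_dbc = 20 * math.log10(im3_low / fund)
```

The computed amplitudes agree with the closed-form values  $a_1 A + \tfrac{9}{4} a_3 A^3$ ,  $\tfrac{3}{4}|a_3| A^3$ , and  $a_2 A^2$ . In this memoryless model  $IM_3$  is independent of tone spacing; a spacing dependence like that in Fig. 2.24 requires memory, for example a beat-frequency-dependent bias impedance that remixes the beat with the fundamentals.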

Another notable point from Fig. 2.23 is that simulations predict non-monotonic EVM behavior over a  $P_{out}$  range that depends on bias point, as seen for  $J_{PA} = 10\mu\text{A}/\mu\text{m}$ . This behavior is associated with inter-modulation nulls [8], but in the EVM context. Simulations for the higher  $J_{PA}$  values predict similar 'cancellation' at EVM levels below the range shown in Fig. 2.23. On the other hand, only a minor reduction in the slope  $\Delta \text{EVM}/\Delta P_{out}$  is reliably observed in the direct EVM vs.  $P_{out}$  measurements down to  $\approx$ 3dB above the EVM floor of the setup in Fig. 2.18. Less constrained by measurement floor, narrowband two-tone measurements at a tone spacing  $\Delta f = 20$ MHz, and the relatively high  $J_{PA} = 23.8\mu\text{A}/\mu\text{m}$  where AM-PM is minimal in Fig. 2.22(b), are shown in Fig. 2.25 across a 27–31GHz center frequency range. The two-tone results exhibit a reduction in slope vs. average  $P_{out}$  like the direct EVM measurements in Fig. 2.23, but no  $IM_3$  nulls down to  $\approx$ 10–12dB lower distortion levels over the same average  $P_{out}$  range. Therefore, the measured linearity performance of the implemented PA is not likely to be a result of sensitive inter-modulation distortion null effects.

#### 2.7 Conclusion

This paper showed that spectrum around 28GHz is a viable choice of carrier band for low-power, broadband 5G wireless UEs. Output power requirements that consider realistic size and RFIC- and integrated-antenna-module-related losses were derived for the most challenging anticipated 5G uplink use scenario. Subsequently, a PA output stage optimization methodology that tackles those demanding requirements was proposed. Introduced as a perturbation, a small source degeneration inductor enabled embedding of the optimized output stage into a two-stage PA, by broadening inter-stage bandwidth and helping to reduce distortion. Accordingly, a 28GHz band PA was fabricated in 28nm bulk CMOS and validated the presented concepts at state-of-the-art performance. For on-going 5G phased array PA developments, more broadband amplifier techniques, and efficient modeling of circuit memory effects in EVM estimation, e.g. [74], may help increase uplink transmission range, while aiming to maintain the demonstrated high power and spectral efficiencies.






Figure 2.14: Output balun characterization: (a) test structure micrograph, (b) inductances and quality factors, (c) differential mode maximum available gain (i.e.  $\equiv$  transformer efficiency). Reprinted from [1].



Figure 2.15: Die micrograph of fabricated two-stage PA. Reprinted from [1].



Figure 2.16: Small- and large-signal CW signal measurement results for  $J_{PA} = 12\mu \text{A}/\mu \text{m}$ ,  $J_{DA} = 22\mu \text{A}/\mu \text{m}$  and 1V supply: (a) s-parameter results, (b) best measured CW signal input power sweep at  $f_c = 30$ GHz. Reprinted from [1].



Figure 2.17: Swept large-CW-signal measurement results summary for  $J_{PA} = 12\mu A/\mu m$ ,  $J_{DA} = 22\mu A/\mu m$ : (a) key performance metrics over 27–31GHz for 1V supply, and (b) saturated performance metrics vs. supply voltage at 30GHz. Reprinted from [1].


Figure 2.18: EVM measurement setup for 64-QAM OFDM signal. Reprinted from [1].



Figure 2.19: Peak 64-QAM OFDM measured performance: output spectrum, ACPR, and constellation for  $J_{PA} = 12\mu A/\mu m$ ,  $J_{DA} = 22\mu A/\mu m$  and 1V supply at  $P_{out} = +4.2 dBm$ ,  $BW_{sig} = 250 MHz$  (1.5Gbps), achieving 9% PAE at EVM= -25dBc. Reprinted from [1].



Figure 2.20: Swept 64-QAM OFDM signal measurement results summary for  $J_{PA} = 12\mu A/\mu m$ ,  $J_{DA} = 22\mu A/\mu m$ : (a) average  $P_{out}$  and corresponding PAE vs.  $f_c$  for 1V supply, (b) average  $P_{out}$  and corresponding PAE vs. supply voltage at  $f_c = 30$ GHz. Reprinted from [1].



Figure 2.21: Measured AM-AM/AM-PM characteristics of two-stage PA at  $J_{DA} = 22\mu A/\mu m$  and corresponding Savitzky-Golay smoothed characteristics for three example  $J_{PA}$  values: (a)  $J_{PA} = 10.0\mu A/\mu m$ , (b)  $J_{PA} = 12.9\mu A/\mu m$ , and (c)  $J_{PA} = 16.5\mu A/\mu m$ . Reprinted from [1].



Figure 2.22: Key metrics of measured AM-AM/AM-PM characteristics at 30GHz for 1V supply and  $J_{DA} = 22\mu A/\mu m$  before and after Savitzky-Golay smoothing: (a)  $P_{1dB}$ , and (b) maximum AM-PM deviation w.r.t. small-signal for  $P_{out} \leq P_{1dB}$ . Reprinted from [1].



Figure 2.23: EVM vs. average 64-QAM OFDM  $P_{out}$  at 30GHz obtained using direct measurement (setup in Fig. 2.18) for  $BW_{sig} = \{150, 250\}$ MHz, and using behavioral simulation with measured AM-AM/AM-PM characteristics of the two-stage PA (i.e.  $BW_{sig} \rightarrow 0$ ) for different  $J_{PA}$  at  $J_{DA} = 22\mu$ A/ $\mu$ m; some example AM-AM/AM-PM characteristics are shown in Fig. 2.21. Reprinted from [1].



Figure 2.24: Simulated two-tone inter-modulation distortion at  $J_{DA} = 22\mu A/\mu m$ ,  $J_{PA} = 12\mu A/\mu m$  for  $\Delta f = \{20, 75, 125, 150, 250\}$ MHz at the amplifier center frequency: (a) lower  $IM_3$ , (b) upper  $IM_3$ . Reprinted from [1].



Figure 2.25: Measured two-tone inter-modulation distortion at  $J_{DA} = 21\mu A/\mu m$ ,  $J_{PA} = 23.8\mu A/\mu m$  for  $\Delta f = 20$ MHz across 27–31GHz center frequency: (a) lower  $IM_3$ , (b) upper  $IM_3$ , (c) lower  $IM_5$ , and (d) upper  $IM_5$ . Reprinted from [1].

# 3. A WIDEBAND LINEAR 28-GHZ POWER AMPLIFIER FOR POWER-EFFICIENT 5G PHASED ARRAYS IN 40-NM CMOS<sup>\*</sup>

## 3.1 Introduction

To meet rising demand, broadband cellular data providers are racing to deploy fifth generation (5G) mm-Wave, e.g. rollout of some 28GHz-band services is intended in 2017 in the USA, with 5/1Gbps downlink/uplink targets. Even with 64QAM signaling, this translates to RF bandwidth (RFBW) as large as 800MHz. With 100m cells and a dense network of 5G access points (AP), potential manufacturing volumes make low-cost CMOS technology attractive for both user equipment (UE) and AP devices. However, poor Pout and linearity of CMOS power amplifiers (PA) are a bottleneck, as 10dB back-off is typical to meet error vector magnitude (EVM) specifications. This limits range and power added efficiency (PAE), and wider RFBW accentuates these issues. On the other hand, sufficient element counts in the envisaged 5G phased array modules can overcome path loss despite low Pout per PA, e.g. by combining RFICs in the AP. CMOS PAs with wideband linearity/PAE can therefore enable economical UE/AP devices to deliver 5G data rates.

Silicon 28GHz-band PAs with state-of-the-art PAE were recently reported [1,49,76]. Despite these advances, linearity is not sufficiently broadband for 5G speeds, i.e. maximum RFBW of 250MHz at 28GHz [1]. Relevant state-of-the-art CMOS PAs for 802.11ad [15, 17, 20] are similar to their 28GHz counterparts in a normalized RFBW sense, i.e. 500MHz RFBW at 60GHz [17]. This paper reports a 28GHz CMOS PA supporting RFBW  $3\times$  the state-of-the-art without degrading Pout, PAE, or EVM. The three-stage PA uses dual-resonance transformer matching networks with bandwidths optimized for wideband linearity. Digital gain control (9dB range) is integrated for phased array operation, a needed functionality absent from existing high-performance mm-Wave PAs.

<sup>\*</sup> Section 3 is reprinted with permission from S. Shakib, M. Elkholy, J. Dunworth, V. Aparin and K. Entesari, "2.7 A wideband 28GHz power amplifier supporting 8×100MHz carrier aggregation for 5G in 40nm CMOS," 2017 IEEE International Solid-State Circuits Conference (ISSCC), San Francisco, CA, 2017, pp. 44-45. ©2017 IEEE.

#### 3.2 Circuit Design

Fig. 3.1 shows the PA schematic. The 1x power transistor layout is similar to [15] ( $W/L = 32 \times 1\mu\text{m}/40\text{nm}$ ). Capacitive neutralization is used in stages 2 and 3 for reverse isolation. Stage 1 is a current steering VGA with  $4 \times 2$dB gain steps determined by width ratios of a switched array of low-Vt cascode transistors. This topology has robust gain step accuracy, and small input/output impedance and insertion phase variations across digital states. An additional 1dB step is implemented in the biasing of stage 2. The stage scaling indicated in Fig. 3.1 helps to avoid compression in stages 1 and 2.
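As a rough illustration of how width ratios can set the gain steps, the Python sketch below computes the fraction of signal current that would have to reach the load in each state of a current steering VGA. It assumes ideal current division between the cascodes that feed the load and those that divert current away from it, in proportion to their widths; this idealization and the names used are introduced here and are not taken from the implemented design.

```python
def steering_fraction(attenuation_db):
    """Fraction of signal current that must reach the load for a given gain reduction."""
    return 10 ** (-attenuation_db / 20)

# Five states of stage 1: 0, 2, 4, 6, and 8dB below maximum gain (4 x 2dB steps).
fractions = [steering_fraction(2 * n) for n in range(5)]

# Stage 1 covers 8dB in 2dB steps; the extra 1dB step in stage 2 fills in the
# odd values, giving the 9dB overall control range.
total_range_db = 4 * 2 + 1
```

Under this assumption, the lowest-gain state of stage 1 steers about 40% of the signal current to the load, and the fractions in between follow a geometric sequence.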

Back-off PAE of stage 3 is first optimized using a similar approach to [1]. Wideband matching is additionally desired to improve linearity by avoiding memory effects, e.g. sharp RF gain slope. Broadband transformer matching networks are realized using loose magnetic coupling  $k$  to attain two in-band resonances separated by  $\Delta f$. Using ideal transformer models, and by simulating amplitude-to-amplitude/amplitude-to-phase modulation conversion (AM-AM/AM-PM) of the PA at 28GHz, Fig. 3.2 illustrates the effect of  $\Delta f_{in}$  on linearity/PAE in stage 3 for constant bias and terminations. Explicit input shunt resistance is used, ranging from 660 $\Omega$  to 100 $\Omega$  to increase  $\Delta f_{in}$  from 1 to 7GHz at the cost of power gain, which drops from 13 to 6dB. Fig. 3.2 shows AM-AM is insensitive to  $\Delta f_{in}$, while AM-PM decreases with increasing  $\Delta f_{in}$. Pout at constant EVM (for 64QAM OFDM) increases with  $\Delta f_{in}$, and approaches the artificial case of setting AM-PM to zero. A spacing of  $\Delta f_{in} = 3$GHz is chosen as a compromise between Pout and PAE/gain. To realize a desired  $\Delta f_{in}$, transformer windings are offset to control  $k$. The low Ropt,diff=45 $\Omega$  target from load-pull simulation enables broadband output matching ( $\Delta f_{out} \approx 7$GHz). Shunt input resistance and transformer self-inductances scale inversely to Cin of each stage such that gain is 7-8dB/stage with overall bandwidth limited by Cin of stage 3.
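The link between the coupling coefficient and the resonance spacing can be sketched with the textbook case of two identical, lossless LC tanks coupled magnetically, for which the two resonances are  $f_0/\sqrt{1+k}$  and  $f_0/\sqrt{1-k}$. The Python sketch below uses that idealization, which is an assumption made here: the actual networks have unequal tanks, loss, and an explicit shunt resistance. The placement of the two resonances at 26.5GHz and 29.5GHz (3GHz apart, centered on 28GHz) is likewise only an illustrative choice.

```python
import math

def coupling_for_resonances(f_low_hz, f_high_hz):
    """Coupling k that places the two resonances of identical coupled tanks."""
    r2 = (f_high_hz / f_low_hz) ** 2   # equals (1 + k) / (1 - k)
    return (r2 - 1) / (r2 + 1)

def resonances(f0_hz, k):
    """Resonance pair of two identical lossless tanks with uncoupled resonance f0."""
    return f0_hz / math.sqrt(1 + k), f0_hz / math.sqrt(1 - k)

k = coupling_for_resonances(26.5e9, 29.5e9)
f0 = 26.5e9 * math.sqrt(1 + k)         # uncoupled tank resonance
f_low, f_high = resonances(f0, k)      # round trip back to the two targets
```

In this idealized case a 3GHz spacing around 28GHz corresponds to  $k \approx 0.11$, i.e. loose coupling, which is consistent with offsetting the windings to reduce  $k$.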

## 3.3 Experimental Results

The PA die micrograph is shown in Fig. 3.3. The PA is fabricated in 1P6M 40nm CMOS LP with core dimensions of  $0.90 \times 0.25\text{mm}^2$, and uses a 1.1V nominal supply. S-parameter measurements across 20-40GHz and gain settings are shown in Fig. 3.4. Input return loss is >10dB over 24.3-36.6GHz and varies negligibly across settings. Peak gain is 22.4dB/13.3dB for maximum/minimum setting at 28GHz. Expected skin effect in transformers and transistor MAG roll-off, and unexpectedly small capacitance (w.r.t. simulation) cause the observed gain slope. Also, Fig. 3.4 shows peak nonlinearity error <0.5dB in gain step over 26-34.3GHz. Phase error is small (peak  $<9.3^{\circ}$ ), which mitigates complexity of phased array calibration.

Measured continuous wave (CW) Pin sweeps up to Pin,max=-3.5dBm are reported in Fig. 3.5 for highest gain setting and over 26-33GHz with 1GHz step (Pmax in Fig. 3.5 is Pout at Pin,max). The PA is driven to at least 1dB compression across 26-33GHz, and to 2-3dB compression only over 27-30GHz. Peak performance is at 27GHz, with Psat/PAEmax of 15.1dBm/33.7%, where Psat is Pout at 3dB compression. Also, P1dB/PAE1dB remain >13.4dBm/25%, while PAE at P1dB-5dB remains >13.2% across 26-33GHz.

Fig. 3.7 shows measurements using a 64QAM OFDM signal (2048-point FFT, 75kHz tone spacing, 9.7dB PAPR at 0.01% CCDF). To test with 5G data rates, 1, 4, and 8 component carrier (CC) aggregation scenarios are measured, for 90MHz-wide CCs and 10MHz guard bands. The test setup and its characterization are shown in Fig. 3.6. CCs are amplified concurrently with composite Pin divided evenly among them. PAE/EVM are plotted vs. Pout at 27GHz for 1, 4, and 8 CCs. Pout/PAE for -25dBc or better EVM on each CC are also summarized vs. center frequency. For 8CC, peak performance is at 27GHz: Pout=6.7dBm at 11% PAE; a snapshot of the corresponding measured output spectrum shows lower/upper adjacent channel leakage ratios (ACLR) of -34.4/-29.4dBc. Pout/PAE remain >6.5dBm/9.6% across 27-32GHz for 8CC.
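The waveform numbers quoted above can be tied together with a short calculation, sketched in Python below. It assumes each CC is generated with its own 2048-point FFT and that the raw data rate counts 6 bits per 64QAM symbol over the occupied 90MHz, ignoring cyclic prefix, pilot, and coding overhead; these assumptions and the variable names are introduced here for illustration.

```python
FFT_SIZE = 2048
TONE_SPACING_HZ = 75e3
CC_OCCUPIED_HZ = 90e6
GUARD_HZ = 10e6
BITS_PER_SYMBOL = 6   # 64QAM

sample_rate_hz = FFT_SIZE * TONE_SPACING_HZ              # baseband rate per CC
tones_per_cc = round(CC_OCCUPIED_HZ / TONE_SPACING_HZ)   # occupied subcarriers per CC

def aggregate(n_cc):
    """Return (aggregate RFBW in Hz, raw bit rate in bit/s) for n_cc carriers."""
    rfbw_hz = n_cc * (CC_OCCUPIED_HZ + GUARD_HZ)
    raw_rate_bps = n_cc * CC_OCCUPIED_HZ * BITS_PER_SYMBOL
    return rfbw_hz, raw_rate_bps

summary = {n: aggregate(n) for n in (1, 4, 8)}
```

For 8 CCs this gives 800MHz of aggregate RFBW and a raw rate of 4.32Gbps, matching the figure quoted in the caption of Fig. 3.7.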

Table 3.1 shows a comparison with the state-of-the-art. This work extends RFBW by  $3\times$  over that in [1] while achieving higher Pout/PAE at equal EVM for the same signal PAPR. Narrower RFBW and lower signal PAPR tested in [76] make comparison of linearity difficult. Relative to [17], this PA produces almost the same Pout at the same EVM for wider RFBW relative to center frequency and at  $2 \times$  higher PAE from a lower supply voltage. Normalizing to supply voltage and number of combined PA cores shows CW Psat of this work is on-par with the state-of-the-art. Back-off CW PAE of this work only seems lower than the single-/two-stage designs of [76]/[1], but this is a natural result of the 12dB/6dB higher gain achieved. For example, CW drain efficiency of stage 3 in this work at P1dB-5dB is 25.6%, i.e. very close to 26.3% [76] for 1.1V supply. This work simultaneously achieves higher back-off PAE and 7dB higher gain than [49]. In summary, the implemented wideband CMOS PA can handle challenging 5G data rates at low-cost without sacrificing range or efficiency.



Figure 3.1: Three-stage PA: cascode VGA 1st stage ( $4 \times 2dB$  digital gain steps), and capacitively-neutralized common-source 2nd and 3rd stages; power transistor size scaling indicated in units. Reprinted from [2].



Figure 3.2: Optimization of linearity and PAE in stage 3 using spacing  $\Delta f_{in}$  of the two resonance frequencies of inter-stage matching network (input of stage 3). Reprinted from [2].



Figure 3.3: Die micrograph of 40 nm CMOS PA. Reprinted from [2].



Figure 3.4: Measured s-parameters across digital gain states as well as the associated gain/phase errors vs. frequency. Reprinted from [2].



Figure 3.5: Measured large CW signal power sweep results over 27–30GHz ( $P_{in,max} = -3.5$ dBm at all frequencies); CW Pout/PAE summaries at key power levels over 26–33GHz. Reprinted from [2].



Figure 3.6: EVM measurement setup. (a) Block diagram. (b) Characterization of EVM floor over center frequency for the tested carrier aggregation waveforms. Worst-case EVM floor data measured by connecting SMW200A directly to FSW43 using only a cable ( $\approx$ 2-2.5dB loss at 30GHz); i.e. without CMOS DUT. At each center frequency,  $P_{out}$  of SMW200A is increased until EVM is no longer noise-limited; then highest/worst EVM floor across component carriers is recorded. Reprinted from [2].



Figure 3.7: EVM/PAE vs. Pout at 27GHz for 64-QAM OFDM with 1,4,8CC and Pout/PAE summaries vs. center frequency for -25dBc EVM; measured spectrum/ACLR for peak 8CC performance at 27GHz: 4.32Gbps, +6.7dBm, 11% PAE. Reprinted from [2].



Figure 3.8: Summary of QPSK OFDM carrier aggregation measurements versus carrier frequency for -16 dBc EVM on each CC. (a) Average  $P_{out}$ . (b) Average PAE. Reprinted from [2].



Figure 3.9: Summary of QPSK OFDM carrier aggregation measurements versus carrier frequency for -25 dBc EVM on each CC. (a) Average  $P_{out}$ . (b) Average PAE. Reprinted from [2].



Figure 3.10: Summary of 64-QAM OFDM measurements versus carrier frequency for a single CC having different contiguous RFBW values at -25 dBc EVM. (a) Summary of average  $P_{out}$ . (b) Summary of PAE. Reprinted from [2].

|                                | This Work   | ISSCC'16 [1] | ISSCC'16 [1] | IMS'16     | IMS'16     | ISSCC'14 [4] | ISSCC'15 [2]   | JSSC'13 [9]  | SiRF'14 [3]          |
|--------------------------------|-------------|--------------|--------------|------------|------------|--------------|----------------|--------------|----------------------|
| Technology                     | 40nm CMOS   | 28nm CMOS    | 28nm CMOS    | 28nm CMOS  | 28nm CMOS  | 40nm CMOS    | 28nm FD-SOI    | 40nm CMOS    | 120nm SiGe BiCMOS    |
| Frequency [GHz]                | 27          | 30           | 30           | 28         | 28         | 63           | 60             | 61           | 28                   |
| Pwr. Combining                 | 1           | 1            | 1            |            |            | 2-way        | 4-way          | 2-way        | 1                    |
| Supply [V]                     | 1.1         | 1.0          | 1.15         | 1.1        | 2.2        | 1.8          | 1.0            | 1.0          | 3.6                  |
| P<sub>sat</sub> [dBm]          | 15.1        | 14           | 15.3         | 14.8       | 19.8       | 16.4         | 18.8           | 17.0         | 18.6                 |
| P<sub>1dB</sub> [dBm]          | 13.7        | 13.2         | 14.3         | 14.0       | 18.6       | 13.9         | 18.2           | 13.8         | 15.5                 |
| PAE<sub>max</sub> [%]          | 33.7        | 35.5         | 36.6         | 36.5       | 43.3       | 23           | 21             | 30.3         | 35.3                 |
| PAE<sub>1dB</sub> [%]          | 31.1        | 34.3         | 35.8         | 35.2       | 41.4       | 18.9         | 21             | 21.6         | 31.5                 |
| PAE @ P1dB-5dB [%]             | 15.1        | 18*          | NA           | 23.7*      | 24*        | 8            | 11.9*          | 8.4          | 14*                  |
| Gain [dB]                      | 22.4        | 15.7         | 16.3         | 10         | 13.6       | 22.4         | 15.4           | 17           | 15.3                 |
| Number of Stages               | 3           | 2            | 2            |            |            | 3            | 3              | 3            | 1                    |
| D.C. Power [mW]                | 30.3        | 17.5         | 20.1         | NA         | NA         | 88           | 74             | 75           | 61.2                 |
| Gain Control                   | $9 \times 1$dB | None      | None         | None       | None       | None         | None           | None         | None                 |
| Active area [mm<sup>2</sup>]   | 0.23        | 0.16         | 0.16         | 0.28**     | 0.28**     | 0.081        | 0.16           | 0.074        | 0.45**               |
| Signal Format                  | 64QAM, OFDM | 64QAM, OFDM  | 64QAM, OFDM  | 64QAM      | 64QAM      | 64QAM        |                |              |                      |
| Comp. Carriers                 | 8-CC        | 1-CC         | 1-CC         | 1-CC       | 1-CC       | 1-CC         |                |              |                      |
| PAPR [dB]                      | 9.7         | 9.6          | 9.6          | 7.5        | 7.5        | 8.1          |                |              |                      |
| RFBW [MHz]                     | 800         | 250          | 250          | 20         | 20         | 500          |                |              |                      |
| EVM [dBc]                      | -25         | -25          | -25          | -27        | -25        | -25          |                |              |                      |
| Pout @ EVM [dBm]               | 6.7         | 4.2          | 5.3          | 9.5        | 14.2       | 7            |                |              |                      |
| PAE @ EVM [%]                  | 11          | 9            | 9.6          | 27         | 25         | 5*           |                |              |                      |

Table 3.1: Comparison with state-of-the-art linear mm-wave silicon PAs for data communication. Reprinted from [2].

\*Graphically estimated. \*\*With pads.

# 4. A 28-GHZ TRANSMIT-RECEIVE FRONT-END MODULE FOR 5G HANDSET PHASED ARRAYS IN 40-NM CMOS

## 4.1 Introduction

Accelerated development of millimeter wave (mm-Wave) systems for fifth-generation (5G) mobile is overtaking formal standardization and increasingly focusing on the 28 GHz band. Integrating phased arrays in the hand-held user equipment (UE) and the access points (AP) is widely regarded as the solution for path losses at mm-Wave frequencies [43]. However, supporting the requirements of even initial 5G pilot services, such as spectrally efficient waveforms like 64-QAM OFDM and radio frequency bandwidths (RFBW) as broad as  $\approx$ 800 MHz [77], means the UE phased array architecture and implementation technology should be chosen carefully.

The RF phase shifting (RFPS) architecture is most suited to battery-powered UE devices due to its low power consumption compared to local oscillator phase shifting (LOPS) and baseband phase shifting (BBPS) [78]. RFPS also relaxes receiver linearity requirements due to the spatial rejection of interferers it offers; since beamforming occurs before the RF mixer [38, 79]. However, phase-shifting resolution needs to be carefully chosen based on application requirements, as both insertion loss and physical size of RF phase shifter (PS) circuits typically grow with resolution [80, 81].

Starting almost a decade ago, *Ka*-band RFPS array developments in silicon focused on relatively narrowband satellite communication or radar *receiver* (Rx) applications [82–84], while typically assuming III-V HEMT LNAs drive the silicon Rx array chip inputs. The majority of those works used SiGe BiCMOS, and only a limited number tackled the challenges of integrating the transmitter (Tx) and the transmit-receive (TR) antenna switch [85–87]. More recently, while the paradigm of using silicon along with III-V compounds is still necessary for very high performance, e.g. see [79], impressive advances have been achieved by all-silicon arrays. For example, a 400 MHz 16-QAM link has been demonstrated over a 300 m range using 32-element Tx/Rx arrays, each built from eight un-calibrated  $2 \times 2$  SiGe chips [88]. Similarly, base station radio developments in SiGe have recently demonstrated multiple-beam capabilities with excellent precision in gain/phase control [89], and new highly-compact RFPS-based architectures [90].

Despite the inherent advantage of SiGe as a material, integration with high-performance digital circuits and low cost in mass production make CMOS more attractive than SiGe for UE devices. Recent CMOS RFPS array developments targeted 60 GHz 802.11ad, and demonstrated a high level of integration, e.g. [4, 38]. However, 5G cellular has an inherently more demanding link budget than 802.11ad, e.g. due to the required range. Also, physical size of hand-held UE devices is more constraining at 28 GHz than at 60 GHz and limits the number of antenna elements to 4–8 [1]. On the other hand, an access point (AP) form factor permits up to ~100's of elements, and AP manufacturing volume for the anticipated high-density 5G cell network may be larger than its counterpart for sub-6 GHz. A scalable design may enable an AP module to be built by bonding multiple RFICs to a single larger AP antenna array, e.g. like in [88, 89]. The UE CMOS RFIC design must however satisfy some of the even more stringent requirements of the AP, e.g. low-noise amplifier (LNA) noise figure (NF), and power amplifier (PA) average  $P_{out}$  and power added efficiency (PAE) at a desired error vector magnitude (EVM). Therefore, proving adequate NF and  $P_{out}/PAE$  in CMOS may enable it to compete for both UE and AP RFICs.

This Chapter presents the first high-performance TR FEM in bulk CMOS targeting 5G UE phased arrays, and is organized as follows. Section 4.2 discusses TR front-end module (FEM) considerations, including derivation of the circuit-level requirements for the PA [Chapter 3], LNA, and PS. Then, Section 4.4 provides a comparison of the strategies considered for integrating the circuit blocks at antenna and PS interfaces, followed by a detailed step-by-step explanation of the trade-offs in the presented design. The Tx-mode and Rx-mode experimental results of the complete FEM are provided in Section 4.5, and finally the chapter is concluded in Section 2.7.

| Block-level<br>Specification | Circuit<br>Block | Required Block-level<br>Performance | Achieved Block-level<br>Performance | Comment |
|------------------------------|------------------|-------------------------------------|-------------------------------------|---------|
| Gain [dB] | PA,<br>LNA | PA: 25<br>LNA: 25 | PA: 22.4<br>LNA: 27.1 | Required linearity more limiting for PA gain<br>(load-line impedances lower in PA stages) |
| Gain Control | PA,<br>LNA | 8×1 dB<br>(pk. error ≤0.5 dB) | PA: 9×1 dB<br>LNA: 8×1 dB | For e.g. array tapering or channel-to-channel gain mismatch correction. |
| Waveform<br>RFBW<br>EVM [dBc] | PA | 64-QAM OFDM<br>up to 8CC × 100 MHz<br>-25 | 64-QAM OFDM<br>up to 8CC × 100 MHz<br>-25 | Most challenging/highest throughput 5G scenario in e.g. [2]; $\approx \frac{1}{2}$ of total EVM budget given to PA |
| Avg. $P_{out}$ [dBm] | PA | 7 | Peak performance: 6.7<br>$\geq$ 6 @ 27–31 GHz | Spec. based on 8-element UE array and link<br>budget analysis in [18] |
| Avg. PAE @ Tx<br>$P_{out}$ [%] | PA | Maximum achievable<br>in technology | Peak performance: 11<br>$\geq$ 8.8 @ 27–31 GHz | For UE battery life and thermal dissipation @ cell-<br>edge scenario |
| NF [dB] | LNA | 3.7 | 3.3 @ max gain | From link budget analysis in [18]; assuming UE Rx<br>achieves similar LNA NF as AP Rx |
| $IIP_3$ [dBm] | LNA | -6.4 @ min gain | Across 26–33 GHz:<br>$\geq$ -12.6 @ max gain | $S_{in,max} \sim -25$ dBm for 150–300 m typical link range, see e.g. [14] |
| PS Insertion<br>Loss [dB] | PS | $\leq$ 6 for Rx NF<br>$\leq$ 8 for Tx EVM | 5.9 | Max tolerable loss limited by Rx NF<br>and Tx EVM [Figs. 2 and 3] |
| PS Resolution<br>[bits] | PS | ≥3 | 3 | Spec from [Fig. 1(a)] but limited to 3-bit total to meet<br>~6 dB IL; IL @ Ka-band is 2.5 dB/bit in [6], reduced to<br>2.0 dB/bit in this work [7] |
| Per-element<br>Random<br>Phase Error<br>[deg <sub>r.m.s.</sub> ] | PS | ≤10 | 5 | 10° < LSB/4 for 3-bit PS and for beam-pointing<br>accuracy (conservative, [Fig. 1(b)]) |

Table 4.1: Key specifications for circuit components of UE FEM and the corresponding measured performances achieved by stand-alone test circuits in this work.

## 4.2 Transmit-receive Module Considerations

This section explains the key specifications in Table 4.1 for FEM component circuits, while leveraging the uplink RF system budget investigation in [1]. Values of parameters from the analysis of [1] to be used in this paper are given, and the analysis is augmented where needed.

#### 4.2.1 Link Budget

#### 4.2.1.1 5G Uplink

The UE FEM is in Tx-mode. Anticipating a 5G-standardized single-channel RFBW of 100 MHz, the required  $P_{out}$  is 7 dBm for 64-quadrature amplitude modulation (64-QAM) at -25 dBc EVM and 40 m link range (higher throughput scenario), or 6.5 dBm for quadrature phase shift keying (QPSK) at -14 dBc EVM and 150 m link range (~cell-edge scenario) [1]. These  $P_{out}$  requirements include margin for UE-side front-end losses of  $L_{FE,Tx} = 3.9$  dB; 1.4 dB for TR switch, 0.6 dB for chip-to-package transition combined with via to antenna layer of the circuit board, and 1.9 dB for antenna feed-line. For longer UE battery life, the Tx back-off PAE should be the maximum permitted by the technology.

#### 4.2.1.2 5G Downlink

The UE FEM is in Rx-mode. In the uplink system budget analysis of [1], an *AP-side* Rx-element noise figure  $NF_{Rx}$  of  $1 + 0.5\sqrt{f_{GHz}} = 3.7$  dB at 30 GHz was assumed. Also, high LNA gain was assumed so the noise figure of the stand-alone LNA  $NF_{LNA}$  dominated  $NF_{Rx}$. Here, the same 3.7 dB value is targeted for the *UE-side* LNA. Considering a 1.4 dB TR switch loss in Rx-mode, this translates to a FEM noise figure  $NF_{FEM} = 5.1$  dB. Additionally, for a maximum received signal power of  $S_{in,max} = -25$  dBm, the effective input-referred 1 dB compression point of the FEM  $IP_{-1 \text{ dB},FEM}$  needs to be  $\approx -15$  dBm. That is, for Rx nonlinearity to result in a negligible EVM degradation,  $IP_{-1 \text{ dB},FEM}$  should be at least  $S_{in,max}$  + signal PAPR, where this PAPR  $\sim$ 10 dB. Therefore, the LNA requires  $IIP_3 = -6.4$  dBm once the 1.4 dB loss of the switch mentioned above is accounted for, assuming  $IIP_3 \approx IP_{-1 \text{ dB}} + 10$  dB holds [10].
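The downlink arithmetic above can be retraced with a few lines of Python. This is only a bookkeeping sketch of the numbers quoted in the text; the variable and function names are illustrative and are not taken from [1].

```python
import math

def ap_rx_noise_figure_db(f_ghz):
    """Rule of thumb quoted in the text: NF = 1 + 0.5*sqrt(f_GHz), in dB."""
    return 1.0 + 0.5 * math.sqrt(f_ghz)

# Values quoted in the text (names are illustrative).
TR_SWITCH_LOSS_DB = 1.4      # Rx-mode TR switch loss
S_IN_MAX_DBM = -25.0         # maximum received signal power
PAPR_DB = 10.0               # approximate signal PAPR
IIP3_MINUS_IP1DB_DB = 10.0   # IIP3 ~ IP1dB + 10 dB rule of thumb

nf_lna_db = round(ap_rx_noise_figure_db(30.0), 1)      # LNA target at 30 GHz
nf_fem_db = nf_lna_db + TR_SWITCH_LOSS_DB              # switch loss adds directly to NF
ip1db_fem_dbm = S_IN_MAX_DBM + PAPR_DB                 # FEM input compression point
ip1db_lna_dbm = ip1db_fem_dbm - TR_SWITCH_LOSS_DB      # referred to the LNA input
iip3_lna_dbm = ip1db_lna_dbm + IIP3_MINUS_IP1DB_DB     # LNA IIP3 requirement

print(f"NF_LNA = {nf_lna_db:.1f} dB, NF_FEM = {nf_fem_db:.1f} dB")
print(f"IP1dB_FEM = {ip1db_fem_dbm:.1f} dBm, IIP3_LNA = {iip3_lna_dbm:.1f} dBm")
```

The switch loss appears twice: it adds to the noise figure seen at the antenna, and it relaxes the compression point required at the LNA input by the same amount.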

#### 4.2.2 Beam Steering

Accommodating up to  $8 \times$  carrier aggregation, i.e. supporting even the downlink data rate targets as in [2], the *fractional* signal bandwidth anticipated for 5G standardization in the 28 GHz band remains <3%. Thus, a PS with a linear phase profile over only a limited bandwidth suffices to avoid significant distortion from array-induced intersymbol interference (ISI) [78]. That is, a broadband true-time delay (TTD) element is not necessary.

For the PS, digital control is desirable for robustness of the beam steering, but the associated quantization error results in beam misalignment that degrades the array gain. The theoretical peak EVM degradation from PS quantization was reported for a uniform linear array (ULA) [78]:

$$\max_{\theta_{inc}} \left\{ \frac{\text{EVM}}{\text{EVM}_0} \right\} = N \cdot \left[ \frac{\sin\left(\frac{2\pi}{4 \times 2^{N_{PS}}}\right)}{\sin\left(N \cdot \frac{2\pi}{4 \times 2^{N_{PS}}}\right)} \right],\tag{4.1}$$

where EVM<sub>0</sub> is the EVM for continuous phase tuning (i.e.  $N_{PS} \rightarrow \infty$ ), and the maximization is performed over the signal's spatial angle of incidence  $\theta_{inc}$  relative to the Rx ULA's broadside direction. Figure 4.1(a) is a plot of  $\max_{\theta_{inc}} \{\text{EVM/EVM}_0\}$  from (4.1) versus PS resolution  $N_{PS}$  in bits for different N. The EVM degradation increases with N because the beamwidth progressively gets narrower; thereby increasing the impact of a given misalignment error. Beam misalignment is allowed to degrade EVM by a margin of up to 3 dB in [1]. Also, from Fig. 4.1(a),  $N_{PS} \ge 3$  yields a peak EVM degradation of <4 dB in a ULA with N = 8. The adopted  $N = 4 \times 2$  URA case (not plotted; (4.1) only applies to ULAs) is expected to lie between the shown N = 4 and the N = 8 ULA cases. Therefore, for  $N_{PS} \ge 3$ , EVM degradation is within the budget in [1].
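As a numerical check of (4.1), the following sketch evaluates the worst-case ratio for the ULA sizes discussed above. It assumes the EVM ratio is an amplitude ratio, so the conversion to dB uses $20\log_{10}$; the function name is illustrative. The sweep starts at $N_{PS} = 3$ because the denominator of (4.1) vanishes for N = 8 at $N_{PS} = 2$.

```python
import math

def peak_evm_degradation_db(n_elements, n_ps_bits):
    """Worst-case EVM/EVM0 from (4.1) for an N-element ULA, in dB.

    Assumes EVM is an amplitude (r.m.s.) ratio, hence 20*log10.
    """
    phi = 2.0 * math.pi / (4.0 * 2 ** n_ps_bits)  # half of one phase LSB
    ratio = n_elements * math.sin(phi) / math.sin(n_elements * phi)
    return 20.0 * math.log10(ratio)

for n in (4, 8):
    for bits in (3, 4, 5):
        deg_db = peak_evm_degradation_db(n, bits)
        print(f"N = {n}, N_PS = {bits} bits: {deg_db:.2f} dB")
```

For N = 8 and $N_{PS} = 3$ this evaluates to about 3.9 dB, in line with the <4 dB figure read from Fig. 4.1(a), and the N = 4 case is well below it.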

Additionally, random phase errors/mismatches with a standard deviation  $\sigma_{PS}$  in each array element super-impose *nonlinearity errors* on the quantized phase steps. These subsequently map to nonlinearity errors in the quantized steps of the beam pointing angle, with a statistical variance  $\sigma_{beam}^2$  which degrades with wider scan angles  $\theta_{inc}$, but improves with number of elements N [78]:

$$\sigma_{beam}^2 = 12 \cdot \frac{\sigma_{PS}^2}{\pi^2 \cos^2(\theta_{inc}) \cdot N \cdot (N^2 - 1)}.\tag{4.2}$$

Figure 4.1(b) is a plot of  $\sigma_{beam}$  versus  $\sigma_{PS}$  for broadside incidence using (4.2) (i.e. 'normalized' by taking  $\theta_{inc} = 0^{\circ}$ ). Random phase error  $\sigma_{PS}$  should be controlled so that its associated beam-pointing angle nonlinearity error remains  $\leq \frac{1}{2} \times$  the smallest possible beam-pointing angle step size dictated by PS resolution:  $\approx \sin^{-1} \left[ \frac{(2\pi/2^{N_{PS}})}{\pi} \right] \approx 14.5^{\circ}$ , even at the maximum scan angle  $\theta_{inc,max}$ . Figure 4.1(b) shows that even for N = 4, and assuming a very large  $\theta_{inc,max} = 75^{\circ}$ ,  $\sigma_{PS} \leq 10^{\circ}$  results in  $\sigma_{beam} \leq 5.5^{\circ}$ , which is  $\leq \frac{1}{2} \times 14.5^{\circ} = 7.25^{\circ}$ . Considering that other limitations, such as nulls of element factor (e.g. patch antenna element), will certainly dominate the radiation pattern well before this extremely wide  $\theta_{inc,max} = 75^{\circ}$  is achieved,  $\sigma_{PS}$  of  $10^{\circ}$  r.m.s. seems highly conservative as far as accuracy of UE array beam pointing angle is concerned.
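The numbers in this paragraph follow directly from (4.2) and the beam-step expression. A minimal sketch, assuming $\sigma_{PS}$ and $\sigma_{beam}$ are both expressed in degrees (the formula is a ratio, so any common angular unit works); the function names are illustrative:

```python
import math

def sigma_beam(sigma_ps, n_elements, theta_inc_deg):
    """Beam-pointing r.m.s. error from (4.2), in the same unit as sigma_ps."""
    cos_t = math.cos(math.radians(theta_inc_deg))
    variance = 12.0 * sigma_ps ** 2 / (
        math.pi ** 2 * cos_t ** 2 * n_elements * (n_elements ** 2 - 1)
    )
    return math.sqrt(variance)

def min_beam_step_deg(n_ps_bits):
    """Smallest beam-pointing step set by PS resolution: asin((2*pi/2^N_PS)/pi)."""
    return math.degrees(math.asin((2.0 * math.pi / 2 ** n_ps_bits) / math.pi))

step = min_beam_step_deg(3)
worst = sigma_beam(sigma_ps=10.0, n_elements=4, theta_inc_deg=75.0)
print(f"minimum beam step = {step:.1f} deg, half step = {step / 2:.2f} deg")
print(f"sigma_beam (N = 4, 75 deg scan, sigma_PS = 10 deg) = {worst:.2f} deg")
```

With N = 4, a 75° scan angle, and $\sigma_{PS} = 10°$, the result is about 5.5°, below half of the ≈14.5° minimum step.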

Due to battery life limitations in UE devices, a *passive* PS is desirable. A passive PS is also leveraged here as it enables a *bidirectional* implementation, which helps to reduce silicon area. The key obstacle is insertion loss (*IL*), i.e. in terms of overall front-end power gain, linearity, and noise. The impact of PS *IL* is explained for the two modes of operation next.

Figure 4.1: Impact of quantization/random phase step errors on array performance. (a) Worst-case EVM degradation as a result of phase quantization error in digital PS. (b) Impact of random per-element phase errors on beam pointing angle accuracy.

#### 4.2.2.1 Transmit Mode

Besides lowering total gain, increasing PS *IL* forces the RF pre-driver of the Tx distribution network to generate an equally higher  $P_{out}$, given the per-element PA power gain is limited by achievable gain per stage and available silicon area. Figure 4.2(a) shows this scenario for N = 8 and 25 dB total power gain for the three-stage Tx-element PA. In addition to its inherent 3 dB splitting 'loss', 1 dB IL is budgeted per Wilkinson splitter stage. For simplicity, the pre-driver re-uses the same circuit design as the output stage of the PA, with 8 dB gain. Therefore, the two stages highlighted in yellow in Fig. 4.2(a) are modeled to have the same amplitude-to-amplitude/amplitude-to-phase conversion (AM-AM/AM-PM) behavior. All other blocks are modeled as perfectly linear. Figure 4.2(b) shows the simulated Tx EVM versus IL of the passive PS. Targeting -28 dBc, and to limit EVM degradation to <0.5-1 dB, this IL cannot exceed 8 dB.
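The level plan behind the Tx-mode scenario can be sketched as a simple dB budget. The chain order (pre-driver, binary Wilkinson splitter tree, PS, PA) and the 7 dBm PA output level (the Table 4.1 target) are assumptions made here for illustration only; they are not read from the simulation setup of Fig. 4.2.

```python
import math

def predriver_pout_dbm(pa_pout_dbm, pa_gain_db, ps_il_db, n_elements,
                       split_db=3.0, splitter_il_db=1.0):
    """Pre-driver output power needed to reach pa_pout_dbm at each element.

    Assumes pre-driver -> log2(N) Wilkinson stages -> PS -> PA (hypothetical order).
    """
    n_stages = round(math.log2(n_elements))            # binary splitter tree depth
    distribution_loss_db = n_stages * (split_db + splitter_il_db)
    pa_input_dbm = pa_pout_dbm - pa_gain_db            # power needed at each PA input
    return pa_input_dbm + ps_il_db + distribution_loss_db

for il_db in (4.0, 6.0, 8.0):
    p_dbm = predriver_pout_dbm(7.0, 25.0, il_db, 8)
    print(f"PS IL = {il_db:.0f} dB -> pre-driver Pout = {p_dbm:+.1f} dBm")
```

Under these assumptions every additional dB of PS loss raises the required pre-driver output by 1 dB, which pushes the pre-driver further into its own AM-AM/AM-PM nonlinearity.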

Figure 4.2: Effect of passive phase shifter insertion loss on overall transmitter EVM. (a) Simulation scenario illustration. (b) 8-element phased array transmitter EVM versus phase shifter insertion loss.

#### 4.2.2.2 Receive Mode

A higher PS IL here reduces the total front-end gain; hence the noise of the chain beyond the PS is increasingly larger when referred to the Rx input and the overall  $NF_{Rx}$  degrades. Figure 4.3(a) illustrates this scenario, with an LNA having a total gain of 25 dB, an  $IIP_3$  of -6 dBm, and a noise figure NF = 5 dB. The back-end (after the Wilkinson combining network) is assumed to have a noise figure of 16 dB and an  $IIP_3$  of 10 dBm. Figure 4.3(b) plots both  $NF_{Rx}$  and  $IIP_3$  versus PS IL. Overall Rx  $IIP_3$  is seen to improve due to the reduction in total gain of the LNA-PS composite. To limit overall  $NF_{Rx}$  degradation to 0.5 dB above the 5 dB of the LNA, the *IL* cannot exceed 6 dB.
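The NF trend in this Rx scenario can be approximated with the Friis cascade formula. The sketch below is a single-element estimate that treats the PS as a matched passive attenuator (noise factor equal to its loss) and ignores the Wilkinson combining network, so it is not expected to reproduce the simulated curve exactly; the function names are illustrative.

```python
import math

def db_to_lin(x_db):
    return 10.0 ** (x_db / 10.0)

def lin_to_db(x):
    return 10.0 * math.log10(x)

def cascade_nf_db(lna_nf_db, lna_gain_db, ps_il_db, backend_nf_db):
    """Friis noise figure of LNA -> passive PS -> back-end, in dB."""
    f_lna = db_to_lin(lna_nf_db)
    g_lna = db_to_lin(lna_gain_db)
    f_ps = db_to_lin(ps_il_db)       # matched passive loss: F equals the loss
    g_ps = 1.0 / f_ps
    f_be = db_to_lin(backend_nf_db)
    f_total = f_lna + (f_ps - 1.0) / g_lna + (f_be - 1.0) / (g_lna * g_ps)
    return lin_to_db(f_total)

for il_db in (0.0, 2.0, 4.0, 6.0, 8.0):
    nf = cascade_nf_db(5.0, 25.0, il_db, 16.0)
    print(f"PS IL = {il_db:.0f} dB -> NF_Rx = {nf:.2f} dB")
```

With the scenario values (25 dB LNA gain, 5 dB LNA NF, 16 dB back-end NF), this simplified estimate gives roughly 0.6 dB of degradation at 6 dB IL, of the same order as the 0.5 dB budget stated above.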



Figure 4.3: Effect of passive phase shifter insertion loss on overall receiver noise figure and linearity. (a) Simulation scenario illustration. (b) 8-element phased array receiver NF and  $IIP_3$  versus phase shifter insertion loss.

## 4.3 Circuit Blocks

To experimentally validate the presented strategy for integrating FEM components, the PA, LNA, and PS are fabricated as stand-alone test circuits; their die photos are shown in Figs. 4.4(a)–(c), respectively. Stand-alone measured performances are summarized in Table 4.1 as a reference for before-versus-after comparison with measurements reported in Section 4.5 after integration. For detailed design considerations/characterizations of the stand-alone blocks, see [2,81,91].



Figure 4.4: Die micrographs of stand-alone front-end module component test circuits. (a) Power amplifier. (b) Low-noise amplifier. (c) Phase shifter 3.

#### 4.3.1 Power Amplifier

Figure 4.5 shows the schematic, and Fig. 4.4(a) shows the die micrograph of the three-stage PA of [2]. It consists of a current-steering  $4 \times 2$  dB variable gain amplifier (VGA) input stage [stage 1, Fig. 4.5(b)], followed by two capacitively neutralized differential stages [stages 2 and 3, Fig. 4.5(c)]. A 1 dB least significant bit (LSB) is implemented in the bias mirror of stage 2.

The current steering VGA topology of stage 1 is chosen for the relative insensitivity to digital gain setting of its input impedance, output impedance, and insertion phase. The linear-in-dB steps of the stage therefore can have correspondingly small gain step nonlinearity errors over a wide frequency range. This is achieved by drawing each transistor in the switched array of low-Vt cascode devices as an integer multiple of 1  $\mu$ m/40 nm fingers. Layout details are in [2].

Size and bias point of stage 3 are chosen based on an optimization methodology similar to [1], beginning from a  $W/L=32\times1 \mu m/40$  nm unit power cell similar to the layout in [1, 15]. Overcoming the linearity limitation of the design in [1] for wide signal bandwidths  $\geq$ 250 MHz is a main driver for the wideband PA in [2]; i.e. for the integrated front-end to subsequently maintain the wideband transmit linearity achieved. Thus, bandwidth of interstage matching between stages 2 and 3 was optimally selected by controlling the spacing between the network's two in-band resonances as described in [2]. Finally, driving transistors in each stage are scaled as indicated in Fig. 4.5(a), and remaining interstage and input matching transformer inductances are inversely scaled with their respective driven stage sizing as given in detail in [2].

Figure 4.5: Schematic of three-stage power amplifier. (a) Top level block diagram and relative stage scaling. (b) Stage 1:  $4 \times 2$  dB cascode VGA. (c) Stages 2 and 3: common source stages with capacitive neutralization.

#### 4.3.2 Low-noise Amplifier

Figure 4.6 shows the schematic, and Fig. 4.4(b) shows the die micrograph of the three-stage LNA of [33]. It consists of a single-ended cascode input stage [stage 1, Fig. 4.6(b)], followed by two variable-gain, capacitively neutralized differential stages [stages 2 and 3, Figs. 4.6(c) and (d)].

Stage 1 uses a single-ended rather than a differential design to minimize NF given the limited d.c. power budget. Inductive source degeneration is used for 50  $\Omega$  input matching [10], and the source and gate inductors of the input network are magnetically coupled to increase the effective inductance without adding series resistance and thereby reduce their *direct* contribution to NF. Cascode device  $M_2$  improves reverse isolation, and a series inductor between the drain of  $M_1$  and source of  $M_2$  boosts gain and reduces the noise contribution of  $M_2$  by countering capacitive parasitics at the 'cascode node' [92]. A transformer balun converts the output of stage 1 to a differential signal at the gates of  $M_3$  and  $M_4$, and is designed with loose magnetic coupling  $k \approx 0.27$  for wideband impedance matching. Similarly, loose coupling is also chosen for transformers in the other interstage/output matching networks.

Doubling available voltage swing using differential designs for stages 2 and 3 improves output third order intercept point ( $OIP_3$ ). Capacitive neutralization enhances stability and gain. Differential operation also improves immunity to errors in modeling ground-path impedance and thereby makes stability more predictable than in single-ended mm-Wave amplifier stages. For gain control, a bank of resistors with series MOS switches sets the resistive load at the output of stage 2 as well as at both the input and output of stage 3. A configurable current mirror provides biasing for stage 3, and its digital switches are set concurrently with the setting of the resistor bank at the output of stage 2. This arrangement progressively saves d.c. power for lower gain settings without degrading linearity. Overall,  $8 \times 1$  dB gain control using this distribution of switched resistors/biasing aims to reduce the dependences of NF and  $OIP_3$  on gain setting.

Figure 4.6: Schematic of three-stage LNA. (a) Top level block diagram. (b) Stage 1. (c) Stage 2. (d) Stage 3.

#### 4.3.3 Phase Shifter

Figure 4.7 shows the schematic, and Fig. 4.4(c) shows the die micrograph of the passive, bidirectional, 3-bit, differential PS of [33]. The design is fundamentally based on lumped-element *approximation* of transmission lines (i.e. true time delay) [37, 93]. As shown in Fig. 4.7(a), the PS consists of a cascade of passive, switched delay cells. Each of the 45° and 90° cells introduces *relative* steady-state insertion phase delay between its two possible switch states, while the 180° cell inverts the polarity of the differential signal. The differential 80  $\Omega$  input and output are each matched to 50  $\Omega$  single-ended ports using on-chip baluns only to simplify on-die probing.

The 45° cell [Fig. 4.7(b)] can add significant delay if configured as a lowpass  $\pi$-network (LP-state:  $S_1$  off,  $S_2$  on), or instead a MOS switch can bypass the  $\pi$-network so the cell adds only minimal delay (BP-state:  $S_1$  on,  $S_2$  off). Unlike the design in [37], implementing  $S_1$  as a triple-well device and tying its gate to the LP-state common-mode node reduces insertion loss [33].

The  $90^{\circ}$  cell [Fig. 4.7(c)] can be configured to either add phase delay as a lowpass  $\pi$-network (LP-state), or add phase *advance* as a highpass  $\pi$-network (HP-state). The insertion phase difference between LP and HP states varies less with frequency than its counterpart in an LP-BP design [38]. This helps to reduce phase step nonlinearity errors across a broad frequency range in comparison to an identical PS having an LP-BP instead of an LP-HP 90° cell [33].



Figure 4.7: Schematic of three-bit phase shifter. (a) Block diagram. (b)  $45^{\circ}$  Cell. (c)  $90^{\circ}$  Cell. (d)  $180^{\circ}$  Cell.

## 4.4 Module Integration

This section explains topology choices and considerations for integrating the three major components of the FEM (PA, LNA, and PS of Section 4.3) with TR switches at antenna and PS interfaces.

#### 4.4.1 Antenna Interface



Figure 4.8: Candidate topologies for antenna matching and transmit-receive switch. (a) Concept of  $\lambda/4$  transformer topology used in [3, 4]. (b) Transformer-based multiplexer topology of [5]. (c) Transformer-based topology in [6]. (d) Proposed topology.

The candidate topologies of Figs. 4.8(a)–(d) are compared in Table 4.2 based on the following:

- The topology in Fig. 4.8(a) is reported for 60 GHz arrays, e.g. [3, 4]. Its distributed nature offers wideband matching but its size is accordingly large at 28 GHz; λ/4 ≈ 1.3 mm in SiO<sub>2</sub>. Electromagnetic (EM) simulations of on-chip 50 Ω shielded coplanar waveguide (CPW) show insertion loss (*IL*) ≈0.7 dB/mm, while the shunt switches must be *very* wide for their on-resistance to be ≤2.5–5 Ω, i.e. to limit *IL* contribution. Also, losses are too severe in high- and low-impedance λ/4 lines so additional matching networks are needed for unequal PA/LNA terminations; further increasing *IL*. Finally, the Tx-side switch is exposed to the full PA output swing and must be designed for reliability instead of Rx-mode *IL*.
- The topology in Fig. 4.8(b) showed wideband performance in [5], where broadside-coupled transformers enabled it to be relatively compact. However, it is similar to Fig. 4.8(a) on overall *IL* due to *cascading* of matching networks, and also on Tx-side shunt switch reliability.
- The circuit in Fig. 4.8(c) was reported for an 802.11ac transceiver [6]. It can achieve lower *IL* than Figs. 4.8(a)–(b) by avoiding cascaded networks. It also embeds the TR switch into *co-designed* Tx/Rx matching, making it very compact, while also avoiding a shunt switch at the PA output. However, the co-design requires the Tx balun to present optimal loading to the PA in Tx-mode and a high impedance to the antenna in Rx-mode. These two requirements are difficult to satisfy across a wide bandwidth centered on 28 GHz like [2] because *explicit*  $C_{ANT} \approx 50$–80 fF must be used for dual-resonance matching.  $C_{ANT}$  increases by  $C_1 C_2/(C_1 + C_2) \approx C_1$  in Tx-mode, which detunes the balun if not compensated.  $C_1 \ll C_2$  by design to reduce Tx-mode PA-to-LNA coupling  $\propto C_1/(C_2 + C_1)$, and hence reduce Tx *IL* and protect the LNA. However,  $C_1$  cannot be made arbitrarily small without degrading Rx-mode NF due to weak antenna-to-LNA a.c. coupling. One solution is to control  $C_{ANT}$  and  $C_1$  with more switches; degrading both Tx and Rx *IL*. Another is to design the Tx balun to have a single in-band resonance so explicit  $C_{ANT} = 0$; sacrificing Tx bandwidth.
- In this paper, the topology in Fig. 4.8(d) is proposed to benefit from the advantages of Fig. 4.8(c) while relaxing its bandwidth limitation mentioned above. Also, the PA gain devices *replace* the Tx-side shunt switch, eliminating its parasitics *and* reliability concerns. Connecting the PA transistors' gates to  $V_{DD}$  shorts the Tx port, while grounding the balun center tap avoids 'crowbar' current and biases the PA devices in deep triode to minimize  $R_{on,PA}$  [Fig. 4.9(a)].

| Topology    | Tx<br>Insertion<br>Loss | Rx<br>Insertion<br>Loss | Matching<br>Bandwidth | Silicon<br>Area |
|-------------|-------------------------|-------------------------|-----------------------|-----------------|
| Fig. 4.8(a) | High                    | High                    | Wideband              | Bulky           |
| Fig. 4.8(b) | High                    | High                    | Narrowband            | Moderate        |
| Fig. 4.8(c) | Low                     | Low                     | Narrowband            | Compact         |
| Fig. 4.8(d) | Low                     | Moderate                | Wideband              | Compact         |

Table 4.2: Comparison of candidate topologies in Fig. 4.8 for antenna matching and transmit-receive switch.

Note that the topology of Fig. 4.8(d) trades wider Tx-mode bandwidth for slightly higher Rx-mode *IL*. This may be understood using the Rx-mode  $\pi$ -equivalent model shown in Fig. 4.9(b):

- $C_{P1}$ : explicit  $C_{ANT}$  plus parasitic capacitance of Tx balun at antenna node A.
- $L_{P1}$ : models *leakage*, i.e. un-coupled inductance of balun on antenna side.
- $R_{P1}$ : models balun losses plus 'reflected'  $R_{on,PA}$  in series with antenna.
- $C_{P2}$ : balun parasitic capacitance + off-capacitance of switch  $C_{off,SW}$  at node B.

Wider Tx bandwidth implies greater separation between the Tx balun's two in-band resonances, in turn requiring tighter magnetic coupling k [1, 66], i.e. smaller  $L_{P1} \propto (1 - k^2)$  [65]. On the other hand, low Rx-mode NF requires larger  $L_{P1}$  to separate  $C_{P1}$ and  $C_{P2}$  and counter their step-down effect on  $\Re\{Z_{out,B}\}$ . Larger  $L_{P1}$  (i.e. lower k) therefore reduces the impedance transformation required in the LNA matching network  $r_{LNA} = \Re\{Z_{in,LNA}\}/\Re\{Z_{out,B}\}$ , thereby reducing Rx-mode IL [10, 67]. To analyze  $L_{P1}$ 's effect on  $r_{LNA}$  in Fig. 4.9(b) (neglecting  $R_{P1}$ ):



Figure 4.9: Illustration of trade-off between Tx-mode bandwidth and Rx-mode NF in design of PA balun for circuit of Fig. 4.8(d). (a) Configuring PA gain devices to replace shunt switch in Rx-mode;  $V_{DD}$  center-tap is pulled down to ground to minimize gain device's on-resistance  $R_{on,PA}$ . (b) Simplified  $\pi$ -equivalent circuit in Rx-mode. (c) Smith chart trajectory of output impedance 'looking back' at antenna from point B in Rx-mode; red arrow indicates effect of tighter magnetic coupling in Tx balun.

$$Z_{out,B}(s) = \frac{R_{ANT} + sL_{P1}\left(1 + sC_{P1}R_{ANT}\right)}{sC_{P2}R_{ANT} + \left(1 + sC_{P1}R_{ANT}\right)\left(1 + s^2L_{P1}C_{P2}\right)},\tag{4.3}$$

where  $R_{ANT}$  is the 50  $\Omega$  antenna resistance. Accordingly,  $R_B \triangleq \Re \{Z_{out,B}\}$  is approximately:

$$R_B \approx \frac{R_{ANT}}{\left[1 - 2\omega^2 L_{P1}C_{P2} - 2\omega^4 R_{ANT}^2 C_{P1}C_{P2}L_{P1} \left(C_{P1} + C_{P2}\right)\right] + \omega^2 \left(C_{P1} + C_{P2}\right)^2 R_{ANT}^2} \tag{4.4}$$

Equation (4.4) reduces to  $R_{ANT}/[1 + \omega^2 (C_{P1} + C_{P2})^2 R_{ANT}^2]$  at  $L_{P1} = 0$, corresponding to k = 1; i.e. perfect Tx balun coupling with  $L_{P1}$  being a short circuit, and therefore  $C_{P1}$  and  $C_{P2}$  sum and appear directly in parallel with  $R_{ANT}$  such that (4.3) becomes  $Z_{out,B}(s) \sim [1/s (C_{P1} + C_{P2})] \parallel R_{ANT}$. Increasing either  $C_{P1}$  or  $C_{P2}$  reduces  $R_B$  in this limiting case, and this logic may be extended to cases where  $L_{P1} \neq 0$  but remains small, i.e. wide Tx bandwidth ( $k \gtrsim 0.65$–0.8). The Smith chart trajectory of  $Z_{out,B}$  at 30 GHz in Fig. 4.9(c) shows that  $R_B$  drops as k increases.
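The behavior of  $R_B$  with  $L_{P1}$  can be checked numerically by evaluating (4.3) with complex arithmetic and comparing against the first-order expression (4.4). The component values below ( $C_{P1}$ = 80 fF,  $C_{P2}$ = 140 fF,  $L_{P1}$  up to 100 pH) are illustrative assumptions chosen only to be of the same order as values mentioned in this section; they are not the extracted design values.

```python
import math

def z_out_b(freq_hz, r_ant, l_p1, c_p1, c_p2):
    """Exact Z_out,B from (4.3), evaluated at s = j*2*pi*f."""
    s = 1j * 2.0 * math.pi * freq_hz
    num = r_ant + s * l_p1 * (1 + s * c_p1 * r_ant)
    den = s * c_p2 * r_ant + (1 + s * c_p1 * r_ant) * (1 + s ** 2 * l_p1 * c_p2)
    return num / den

def r_b_approx(freq_hz, r_ant, l_p1, c_p1, c_p2):
    """First-order (small L_P1) approximation of Re{Z_out,B} from (4.4)."""
    w = 2.0 * math.pi * freq_hz
    c_sum = c_p1 + c_p2
    den = (1 - 2 * w ** 2 * l_p1 * c_p2
           - 2 * w ** 4 * r_ant ** 2 * c_p1 * c_p2 * l_p1 * c_sum
           + w ** 2 * c_sum ** 2 * r_ant ** 2)
    return r_ant / den

# Assumed example values (hypothetical, for illustration only).
F_HZ, R_ANT, C_P1, C_P2 = 30e9, 50.0, 80e-15, 140e-15

for l_ph in (0, 25, 50, 75, 100):
    l_p1 = l_ph * 1e-12
    exact = z_out_b(F_HZ, R_ANT, l_p1, C_P1, C_P2).real
    approx = r_b_approx(F_HZ, R_ANT, l_p1, C_P1, C_P2)
    print(f"L_P1 = {l_ph:3d} pH: R_B exact = {exact:5.2f} ohm, (4.4) = {approx:5.2f} ohm")
```

For these assumed values  $R_B$  rises monotonically as  $L_{P1}$  grows from zero, which is the same direction as the trend described above: tighter coupling (smaller  $L_{P1}$ ) lowers  $R_B$. The two expressions coincide at  $L_{P1} = 0$  and diverge gradually as the neglected  $L_{P1}^2$  terms become significant.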

With the above insight, the circuit is designed for the  $Z_{PA}$  and  $Z_{LNA}$  termination targets annotated on the more detailed schematic in Fig. 4.10(a), with its physical layout shown in Fig. 4.10(b).

- Tx-mode signal path through the circuit of Figs. 4.10(a) and (b) is highlighted in Figs. 4.11(a) and (b), respectively. TR switch  $M_1$  and head-switch  $M_2$  are on, while pull-down switch  $M_3$  is off. Coupled inductors  $L_1$  and  $L_2$  form the wideband (dual-resonance) Tx balun.
- Rx-mode signal path is highlighted in Figs. 4.11(c) and (d). The PA gain devices (deep triode) short the Tx balun, with  $M_3$  grounding the center tap of  $L_1$ , while  $M_1$  and  $M_2$  are off. A.c. coupling capacitor  $C_c$  and inductors  $L_p$  and  $L_s$  together form the LNA matching network.



Figure 4.10: Antenna port matching and transmit-receive switch. (a) Schematic. (b) 3D illustration of physical layout.

Figure 4.11: Illustration of signal path in the circuit of Fig. 4.10 for the Tx/Rx modes. (a) Schematic in Tx-mode. (b) 3D structure in Tx-mode. (c) Schematic in Rx-mode. (d) 3D structure in Rx-mode.

The Tx balun and LNA matching network are re-designed relative to their stand-alone circuit counterparts in [2] and [91], respectively. The LNA matching network is brought close to the antenna port to help reduce Rx-mode IL. The Tx balun comprises a tightly-coupled core transformer (45°-rotated with $k_{core} \approx 0.75$) in addition to a *controlled* series leakage contribution to $L_2$, i.e. routing between the rotated core transformer and the antenna port. Including routing, an effective $k \approx 0.65$ is chosen based on the trade-off explained in Fig. 4.9 and (4.4).
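The relation between the core coupling and the effective coupling can be illustrated with a simple lumped model. The sketch below assumes that the routing adds an uncoupled series inductance $L_{leak}$ to the core secondary $L_2$ while leaving the mutual inductance unchanged, so that $k_{eff} = k_{core}/\sqrt{1 + L_{leak}/L_2}$. This model and the function names are illustrative assumptions; the actual design relies on EM simulation of the complete structure.

```python
import math


def effective_k(k_core, leak_ratio):
    """Effective coupling when an uncoupled inductance leak_ratio * L_2 is
    added in series with the secondary (mutual inductance unchanged)."""
    return k_core / math.sqrt(1.0 + leak_ratio)


def leak_ratio_for(k_core, k_target):
    """Series leakage, as a fraction of L_2, needed to reach k_target."""
    return (k_core / k_target) ** 2 - 1.0


if __name__ == "__main__":
    ratio = leak_ratio_for(0.75, 0.65)   # values quoted in the text
    print(f"L_leak / L_2 = {ratio:.3f}")
    print(f"k_eff        = {effective_k(0.75, ratio):.3f}")
```

Under this assumption, lowering the coupling from 0.75 to 0.65 corresponds to a routing inductance of roughly one third of the core secondary inductance.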

$M_1$ is sized $(W/L)_1 = 4 \times 28 \times 2.4 \ \mu \text{m/40}$ nm as a compromise between Tx-mode IL and Rx-mode NF (via its off-capacitance $C_{off,SW}$ [Fig. 4.9 and (4.4)]). A deep n-well (DNW) device is used, with a 5 k$\Omega$ resistor biasing its *local* bulk to float it at RF and hence reduce coupling through its junction capacitances to the otherwise low-impedance shared bulk [55]. Simulations show $M_1$ contributes $IL \leq 0.73$ dB in Tx-mode for 23–32 GHz, and has $C_{off,SW} = 140$ fF after RC-extraction. $M_2$ conducts PA supply current in Tx-mode, so it must be very wide to reduce its d.c. on-resistance, i.e. the voltage drop across it. Simulations show that $(W/L)_2 = 64 \times 16 \times 3.42 \ \mu\text{m}/40 \ \text{nm}$ ($\approx 300 \ \text{m}\Omega$) limits $P_{sat}$ degradation to $\leq 0.1$ dB. $M_3$ is drawn with $(W/L)_3 = 4 \times 16 \times 3.42 \ \mu\text{m}/40 \ \text{nm}$, and uses a thick oxide device to minimize drain-to-source leakage in Tx-mode. Finally, two bypass capacitors $C_B = 20 \ \text{pF}$ connect symmetrically to $L_1$'s $V_{DD}$ center tap.

Using EM simulation, the *entire* structure in Fig. 4.10(b) is optimized, i.e. core Tx balun radius with added output routing length, as well as capacitance  $C_c$  and dimensions of  $L_p$  and  $L_s$ . EM modeling of the complete structure is necessary to capture various current return paths that significantly impact parameters of interest, e.g. balance in impedances loading each of the two PA output stage transistors, effective k of Tx balun, and effective  $L_p$ . Correct ground-referencing of internal ports for  $C_B$ ,  $M_1$ ,  $C_c$ , and  $V_{b,LNA}$  is similarly important for tuning accuracy.

The EM simulation is experimentally verified using Tx- and Rx-mode equivalent passive test structures, whose die photos are shown in Figs. 4.12(a) and (b), respectively. Thick metal connections (open circuits) replace on-state (off-state) switches and bypass capacitors in the implemented test structures. Correspondingly, ideal shorts/opens are applied across the respective ports of the *same* EM model to simulate each mode. Figures 4.13(a) and (b) show the measured and simulated insertion and return losses in both modes, with good agreement between measurement and simulation except for an extra 0.5 dB vertical offset and visible ripples in the Rx-mode *IL* response. This discrepancy was expected: due to area limitations, only *differential* (GSGSG-pattern) open/short impedance standards could be included on the same CMOS chip as the passive antenna interface test circuits. Hence, the open-short de-embedding applied [94], which re-used the measured impedances of the differential standards, is only approximate for the *single-ended* LNA port. Note that the single-ended antenna port did not require de-embedding, since its I/O pad capacitance of $\approx$20 fF is included as part of the design (it contributes to the total $C_{ANT}$).
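For reference, the generic open-short procedure can be sketched as follows. This is the textbook form, in which the pad parasitics are modeled as a shunt admittance followed by a series impedance; it is not necessarily the exact flow of [94], and the `embed` helper and all numerical values are hypothetical and serve only to exercise the routine on synthetic data.

```python
import numpy as np


def open_short_deembed(y_meas, y_open, y_short):
    """Return DUT Y-parameters after open-short de-embedding.

    All arguments are (..., N, N) complex Y-parameter arrays taken at the
    same frequency points.
    """
    z_after_open = np.linalg.inv(y_meas - y_open)   # series parasitics + DUT
    z_series = np.linalg.inv(y_short - y_open)      # series parasitics only
    return np.linalg.inv(z_after_open - z_series)


def embed(y_dut, y_pad, z_series):
    """Hypothetical fixture: shunt pad admittance, then series impedance."""
    return y_pad + np.linalg.inv(z_series + np.linalg.inv(y_dut))


if __name__ == "__main__":
    w = 2.0 * np.pi * 30e9
    y_pad = np.diag([1j * w * 20e-15] * 2)           # assumed pad capacitance
    z_ser = np.diag([0.5 + 1j * w * 30e-12] * 2)     # assumed lead impedance
    y_dut = np.array([[0.02 + 0.01j, -0.015], [-0.015, 0.02 + 0.005j]])
    y_open = y_pad                                   # DUT removed
    y_short = y_pad + np.linalg.inv(z_ser)           # DUT replaced by shorts
    y_meas = embed(y_dut, y_pad, z_ser)
    print(np.allclose(open_short_deembed(y_meas, y_open, y_short), y_dut))
```

The recovery is exact only when the standards share the parasitics of the measured port, which is why re-using differential standards for a single-ended port gives an approximate result.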



Figure 4.12: Die micrographs of passive test structures for antenna interface. (a) Tx-mode: LNA port and TR switch shorted. (b) Rx-mode: PA port shorted, TR switch gate tied to ground.

### 4.4.2 Phase Shifter Interface

The schematic and 3D layout illustration of the PS interface are shown in Figs. 4.14(a) and (b), respectively. The PS-side TR switch uses a shunt-series topology [5, 55]. A DNW is used for series switch  $M_s$  to eliminate bulk losses [Section 4.4.1]. The shunt switch is split into bulk devices  $M_d$  and  $M_c$ , to short differential-mode and common-mode (including d.c.), respectively.

Figure 4.13: Simulated and measured IL and RL of antenna interface passive test structures. (a) Tx-mode. (b) Rx-mode.

Figure 4.14: Phase shifter port matching and transmit-receive switch. (a) Schematic. (b) 3D illustration of physical layout.

The PA scales up the impedance level moving backwards from output to input for enhanced back-off PAE [2]. Similarly, the LNA uses large load impedances at the transistor drains in each stage, limited by $IP_3$ [91]. Thus, the PS interface may match the 80 $\Omega$ input/output impedance of the PS to a relatively high impedance in *both* Tx and Rx paths; i.e. the design can be symmetrical. Transformers are preferred for compactness, but their parasitic capacitances for $\leq 1:2.5$ turn ratios limit the driving (load) parallel-equivalent resistance for the PA (LNA) path. A 1:2 ratio is therefore used (smaller winding on PS side), with equal Tx and Rx termination resistance $\approx 250 \Omega$. The concept of the Tx balun in Section 4.4.1 is re-used, i.e. a tightly-coupled core XF with intentional series leakage $L_{kt}$ to set effective coupling $\approx 0.55$; limited by the relatively high termination impedances. The switches terminate the routing as indicated in Fig. 4.14(b) to absorb their parasitics as tuning capacitances, and EM simulation of the complete structure is used to optimize dimensions as in Section 4.4.1.

## 4.5 Experimental Results

The FEM is fabricated in 1P6M 40 nm CMOS LP technology; its die micrograph is shown in Fig. 4.15, measuring 1.55 mm×0.7 mm. All mm-Wave characterization is performed using on-die probing, and the matching balun indicated in Fig. 4.15 enables simpler GSG probing at the RFIC I/O port as for the PS in Fig. 4.4(c). This balun contributes 1.2 dB of *IL*, which is not de-embedded in the results shown in this section (unless stated). Two multi-contact wedges land on the two rows of d.c. pads on either side of the FEM to supply bias currents, separate power connections for individual amplifier stages (e.g. 3  $V_{DD}$  pads for PA), and digital lines for gain/phase control.



Figure 4.15: Die micrograph of fully-integrated front-end module.

### 4.5.1 Transmit Mode

#### 4.5.1.1 Small-signal S-parameters

Figure 4.16(a) shows the measured small-signal S-parameters of the front-end versus frequency across the 10 available PA gain states, with the data for two PS phase states  $\{0, 4\}$  being overlaid on the same axes. The plot shows the return loss RL on the PS input port ( $s_{11}$ ) is better than 10 dB across 26.9–35.5 GHz, with a peak gain  $s_{21}$  of 11.2 dB, and reverse isolation  $s_{12} \approx 40$  dB up to 37 GHz. Recall that another  $\approx 1.2$  dB should be added to all reported gain values to compensate for the IL of the matching balun at the RFIC I/O port. Also, the gain roll-off with frequency is due to skin effect in transformers (e.g. see Fig. 4.13(a)), and to transistor maximum available gain (MAG) roll-off with frequency as explained in [2]. The data for the two overlaid PS states in Fig. 4.16(a) are practically identical at each PA gain state. This is a result of the input/output matching to the PS and antenna interfaces being insensitive to the digital gain setting, and to the fact that states 0 and 4 differ only by a signal inversion in the 180° cell of the PS [81]. Similarly, Figs. 4.16(b)–(d) plot the corresponding data for the remaining PS states  $\{1, 5; 2, 6; 3, 7\}$ , showing similar behavior.

Figures 4.17(a)–(d) show the measured errors in the 9×1 dB PA gain steps that correspond to the data in Fig. 4.16 for the PS states  $\{0, 1, 2, 3\}$ – the remaining 4 PS states exhibit similar broadband gain step accuracy (not shown due to space limitations). Across all 8 PS states, the *peak* gain step nonlinearity error remains  $< \frac{1}{2} \times 1$  LSB (i.e. <0.5 dB) across 23.2–37.4 GHz.

Figures 4.18(a)–(c) show the measured Tx-mode insertion phase versus frequency across the  $7 \times 45^{\circ}$  PS phase steps, and Figs. 4.19(a)–(c) show the corresponding phase step nonlinearity errors. An r.m.s. phase error <  $10^{\circ}$  is achieved across 21.3–33.8 GHz.
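The gain and phase step errors reported above can be post-processed from measured data along the following lines. This is a minimal sketch that assumes the step error is defined as the difference between adjacent states minus the ideal step (1 dB or 45°), with phase errors wrapped to ±180°; the exact definition used for the plotted curves is not restated here, and the sample data are made up.

```python
import numpy as np


def step_errors(values, ideal_step, wrap=None):
    """Error of each adjacent-state step relative to ideal_step.

    values: measured gain [dB] or insertion phase [deg], ordered by state.
    wrap:   period for wrapping the error (e.g. 360 for phase), or None.
    """
    err = np.diff(np.asarray(values, dtype=float)) - ideal_step
    if wrap is not None:
        err = (err + wrap / 2.0) % wrap - wrap / 2.0
    return err


def rms(x):
    """Root-mean-square value of an array."""
    return float(np.sqrt(np.mean(np.square(x))))


if __name__ == "__main__":
    phases_deg = [0, 47, 88, 135, 181, 224, 271, 316]   # made-up 8 PS states
    err = step_errors(phases_deg, 45.0, wrap=360.0)
    print("phase step errors [deg]:", err, "r.m.s.:", rms(err))
    gains_db = [2.0, 3.1, 4.0, 4.9]                     # made-up gain states
    print("gain step errors [dB]:", step_errors(gains_db, 1.0))
```

Repeating the calculation at each frequency point yields error-versus-frequency curves of the kind shown in Figs. 4.17 and 4.19.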



Figure 4.16: Measured s-parameters in Tx-mode across  $9 \times 1$ dB PA gain steps for different PS phase state pairs. (a) States  $\{0, 4\}$ . (b) States  $\{1, 5\}$ . (c) States  $\{2, 6\}$ . (d) States  $\{3, 7\}$ .



Figure 4.17: Measured gain step nonlinearity errors in Tx-mode across  $9 \times 1$ dB PA gain steps for different PS phase states; r.m.s. error indicated with thick black line in each case. (a) State 0. (b) State 1. (c) State 2. (d) State 3.



Figure 4.18: Measured s-parameters in Tx-mode across  $7 \times 45^{\circ}$  PS phase steps for different PA gain settings: (a) PA gain state 0 (min. gain). (b) PA gain state 4. (c) PA gain state 9 (max. gain).



Figure 4.19: Measured errors in  $7 \times 45^{\circ}$  PS phase steps in Tx-mode for different PA gain settings; r.m.s. error indicated with thick black line in each case. (a) PA gain state 0 (min. gain). (b) PA gain state 4. (c) PA gain state 9 (max. gain).

#### 4.5.1.2 Large CW Signal Performance

Figures 4.20(a)–(d) show the measured Tx-mode large continuous wave (CW) signal power sweep results at {27, 28, 29, 30} GHz. The sweeps are performed up to  $P_{in,max}$ =+8 dBm, at the highest PA gain setting. The FEM is driven to at least 1 dB compression across 26–33 GHz, and to 2–3 dB compression only over 27–30 GHz. Figure 4.21(a) shows a summary of measured CW signal  $P_{out}$  at key back-off levels across 26–33 GHz, while Fig. 4.21(b) shows the corresponding measured PAE. Note that  $P_{max}$  (PAE<sub>max</sub>) in Fig. 4.21(a) (Fig. 4.21(b)) is  $P_{out}$  (PAE) at  $P_{in,max}$ . The input balun's 1.2 dB *IL* is not de-embedded.
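The quantities summarized in Fig. 4.21 follow from the raw power sweep by standard definitions: PAE is $(P_{out} - P_{in})/P_{DC}$ with powers in watts, and the 1 dB compression point is where the gain has dropped 1 dB below its small-signal value. The sketch below illustrates both; taking the small-signal gain as the average of the first few sweep points and using linear interpolation are assumptions of this example, and the sweep data are made up.

```python
import numpy as np


def dbm_to_w(p_dbm):
    """Convert power from dBm to watts."""
    return 10.0 ** ((np.asarray(p_dbm, dtype=float) - 30.0) / 10.0)


def pae_percent(p_in_dbm, p_out_dbm, p_dc_w):
    """Power-added efficiency in percent."""
    return 100.0 * (dbm_to_w(p_out_dbm) - dbm_to_w(p_in_dbm)) / np.asarray(p_dc_w, dtype=float)


def output_p1db(p_in_dbm, p_out_dbm, n_small_signal=3):
    """Output power at 1 dB gain compression, or None if never reached."""
    p_in = np.asarray(p_in_dbm, dtype=float)
    p_out = np.asarray(p_out_dbm, dtype=float)
    gain = p_out - p_in
    compression = gain[:n_small_signal].mean() - gain
    idx = np.nonzero(compression >= 1.0)[0]
    if idx.size == 0:
        return None                      # sweep did not reach 1 dB compression
    i = int(idx[0])
    if i == 0:
        return float(p_out[0])
    c0, c1 = compression[i - 1], compression[i]
    t = (1.0 - c0) / (c1 - c0)           # linear interpolation between points
    return float(p_out[i - 1] + t * (p_out[i] - p_out[i - 1]))


if __name__ == "__main__":
    p_in = [-10.0, -5.0, 0.0, 5.0, 8.0]          # made-up sweep, dBm
    p_out = [2.0, 7.0, 12.0, 16.5, 18.5]
    print("OP1dB [dBm]:", output_p1db(p_in, p_out))
    print("PAE [%]:", pae_percent(0.0, 10.0, 0.1))
```

Returning `None` mirrors the situation noted above, where the available input power may not drive the module into compression at every frequency.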



Figure 4.20: Measured CW power sweep results at maximum PA gain setting and PS phase state 0 at different CW frequencies. (a) 27 GHz. (b) 28 GHz. (c) 29 GHz. (d) 30 GHz.

Figure 4.21: Summary of measured CW power sweep results at maximum PA gain setting and PS phase state 0 versus CW frequency at key power back-off levels. (a) $P_{out}$. (b) PAE.

#### 4.5.1.3 Modulated Signal Performance

This subsection demonstrates that the implemented UE front-end amplifies even the extremely broadband, high-PAPR signals anticipated for 5G downlinks with high fidelity. Measurements are performed for carrier aggregation scenarios across center frequencies 27–32 GHz in 1 GHz steps. Each component carrier (CC) is 90 MHz wide, and 10 MHz guard bands are used. Each CC carries OFDM having 2048 fast Fourier transform (FFT) points, a 75 kHz tone spacing, and 64-QAM modulation on each tone. The $P_{out}$ and PAE for EVM $\leq -25$ dBc are reported for a single CC in Figs. 4.22(a) and (c), respectively. Similarly, $P_{out}$ and PAE for EVM $\leq -25$ dBc on each and every CC for eight CCs are shown in Figs. 4.22(b) and (d). Peak performance of $P_{out}$/PAE = 6.5 dBm/8.8% is demonstrated at 27 GHz for the extremely broadband 8×100 MHz waveform. Note that the different traces in each of Figs. 4.22(a)–(d) correspond to 4 different digital PS phase states (the other 4 states have identical linearity as they correspond to an inversion in the symmetric 180°-cell [81]). These overlaid curves highlight that the excellent Tx linearity performance is practically independent of PS phase state (as desired).
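The waveform parameters quoted in this subsection imply a few derived numbers that are easy to verify. The sketch below computes them and also shows one common way of evaluating EVM from received and reference constellation points. It assumes that each 90 MHz CC is fully populated with tones, that one 10 MHz guard band is allotted per CC, and that EVM is normalized to the r.m.s. reference power; none of these details is specified in the text.

```python
import numpy as np


def cc_numerology(fft_size=2048, tone_spacing_hz=75e3,
                  occupied_bw_hz=90e6, guard_bw_hz=10e6, n_cc=8):
    """Derived quantities for the carrier-aggregated OFDM waveform."""
    return {
        "sample_rate_hz": fft_size * tone_spacing_hz,
        "occupied_tones": int(round(occupied_bw_hz / tone_spacing_hz)),
        "aggregate_bw_hz": n_cc * (occupied_bw_hz + guard_bw_hz),
    }


def evm_db(received, reference):
    """EVM in dB, normalized to the r.m.s. power of the reference symbols."""
    received = np.asarray(received, dtype=complex)
    reference = np.asarray(reference, dtype=complex)
    err_power = np.mean(np.abs(received - reference) ** 2)
    ref_power = np.mean(np.abs(reference) ** 2)
    return float(10.0 * np.log10(err_power / ref_power))


def evm_db_to_percent(value_db):
    """Convert an EVM figure in dB to percent."""
    return 100.0 * 10.0 ** (value_db / 20.0)


if __name__ == "__main__":
    print(cc_numerology())
    print("-25 dB EVM in percent:", evm_db_to_percent(-25.0))
```

Under these assumptions, each CC occupies 1200 of the 2048 FFT bins at a 153.6 MHz sample rate, eight CCs span 800 MHz, and the −25 dBc EVM target corresponds to about 5.6%.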

Figure 4.22: Summary of measured $P_{out}$ and PAE for carrier aggregation scenarios versus center frequency for EVM < -25dBc on each CC for different PS digital states. (a) $P_{out}$ for 1CC. (b) PAE for 1CC. (c) $P_{out}$ for 8CC. (d) PAE for 8CC.

### 4.5.2 Receive Mode

#### 4.5.2.1 Small-signal S-parameters and Noise Figure

The measured small-signal S-parameters of the Rx-mode versus frequency for the 9 gain states of the LNA are shown in Fig. 4.23(a). The input *RL* at the antenna port $(s_{11})$ is better than 9 dB across 23.5–37.9 GHz, while the peak gain $(s_{21})$ is 16.8 dB, and the reverse isolation is $\approx$40 dB as in the Tx-mode. Mirroring the plots for Tx-mode, the data for the two PS states {0, 4} are overlaid on the same axes in Fig. 4.23(a), while Figs. 4.23(b)–(d) show the measured S-parameters for the remaining PS states. Figures 4.24(a)–(d) show the nonlinearity errors in the $8 \times 1$ dB LNA gain steps for the four PS states $\{0, 1, 2, 3\}$. The r.m.s. gain step error is $< 0.53 \text{ dB}_{r.m.s.}$ across 22–38 GHz.

Figures 4.25(a)–(c) show the measured insertion phase through the front-end module in Rx-mode versus frequency, and across the $7 \times 45^{\circ}$ PS phase steps. The corresponding phase step nonlinearity errors are shown in Figs. 4.26(a)–(c), demonstrating that the r.m.s. phase error remains < $10^{\circ}$ across 21.3–33.2 GHz.

The Rx-mode noise figure of the front-end is measured across the 9 LNA gain states at PS state 0, and reported in Fig. 4.27 over the 20–40 GHz frequency range. Figure 4.27 shows that the minimum $NF_{Rx}$ achieved is 5.5 dB, and it remains below 6.5 dB across 26.4–32.0 GHz. Also, $NF_{Rx}$ is relatively insensitive to LNA gain setting; it increases by at most $\approx$0.5 dB at minimum gain setting relative to its value at maximum gain over all measured frequencies.

#### 4.5.2.2 Receive-mode Linearity Performance

CW signal input power sweeps are performed to extract the input-referred 1 dB gain compression point $IP_{1 \text{ dB}}$ for the minimum and maximum LNA gain states {0,8}. Similarly, two-tone input power sweeps for a tone spacing of $\Delta f = 100$ MHz are also performed at LNA gain states {0,1,7,8} to extract the $IIP_3$ of the Rx-mode front-end. Both sets of large-signal linearity measurements are made for PS phase state 0. Figure 4.28(a) plots a summary of Rx-mode $IP_{1 \text{ dB}}$ for CW frequencies 26–33 GHz, showing a worst-case $IP_{1 \text{ dB}}$ of -15.9 dBm (-22.7 dBm) at minimum (maximum) LNA gain. Figure 4.28(b) shows a summary of the $IIP_3$ across center frequencies of the two-tone signal over the same 26–33 GHz range, with a worst-case value of -8.5 dBm (-12.9 dBm) at minimum (maximum) LNA gain.
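The $IIP_3$ extraction from a two-tone sweep can be sketched using the standard intercept relation $IIP_3 = P_{in} + (P_{fund} - P_{IM3})/2$ (per-tone powers in dBm), which holds in the weakly nonlinear region where the IM3 products rise 3 dB per dB of input power. The function names and the sample powers below are illustrative assumptions, not measured data from this work.

```python
def im3_frequencies(f_center_hz, delta_f_hz):
    """Third-order intermodulation product frequencies for two tones
    placed at f_center +/- delta_f / 2."""
    f1 = f_center_hz - delta_f_hz / 2.0
    f2 = f_center_hz + delta_f_hz / 2.0
    return 2.0 * f1 - f2, 2.0 * f2 - f1


def iip3_dbm(p_in_dbm, p_fund_dbm, p_im3_dbm):
    """Input-referred third-order intercept from one two-tone data point.

    p_in_dbm:   input power per tone
    p_fund_dbm: output power per fundamental tone
    p_im3_dbm:  output power of one IM3 product
    """
    return p_in_dbm + 0.5 * (p_fund_dbm - p_im3_dbm)


if __name__ == "__main__":
    print(im3_frequencies(28e9, 100e6))    # tone spacing used in the text
    print(iip3_dbm(-30.0, -14.0, -58.0))   # made-up powers
```

In practice the data point should be taken well below compression, and it is prudent to confirm the 3:1 slope of the IM3 product before applying the formula.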



Figure 4.23: Measured s-parameters in Rx-mode across  $8 \times 1$ dB LNA gain steps for different PS phase state pairs. (a) States  $\{0, 4\}$ . (b) States  $\{1, 5\}$ . (c) States  $\{2, 6\}$ . (d) States  $\{3, 7\}$ .



Figure 4.24: Measured gain step nonlinearity errors in Rx-mode across  $8 \times 1$ dB LNA gain steps for different PS phase states; r.m.s. error indicated with thick black line in each case. (a) State 0. (b) State 1. (c) State 2. (d) State 3.



Figure 4.25: Measured s-parameters in Rx-mode across  $7 \times 45^{\circ}$  PS phase steps for different LNA gain settings. (a) LNA gain state 0 (min. gain). (b) LNA gain state 3. (c) LNA gain state 8 (max. gain).

Figure 4.26: Measured errors in $7 \times 45^{\circ}$ PS phase steps in Rx-mode for different LNA gain settings; r.m.s. error indicated with thick black line in each case. (a) LNA gain state 0 (min. gain). (b) LNA gain state 3. (c) LNA gain state 8 (max. gain).

Figure 4.27: Measured Rx-mode noise figure versus frequency across all LNA gain settings at PS phase state 0.

### 4.5.3 Performance Comparison

Table 4.3 shows a comparison with state-of-the-art front-ends for 5G in the 28 GHz band. We note that references [88, 89, 95] report a mixture of per-channel performance in a conducted environment on the one hand, and over-the-air performance of their complete packaged antenna arrays on the other hand. All other values in the table, including those of this work, were obtained using on-die probing. References [88, 89, 95] are included for their strong relevance, but for fair comparison, their self-reported per-channel conducted-test performances are used except if otherwise stated.

Except for the very high precision needed for multiple-beam-capable base station arrays as demonstrated in [96], the r.m.s. gain and phase step nonlinearity errors achieved in this work are comparable to those in all references in Table 4.3. However, this work deviates significantly from the overall trend for base station developments in that a lower PS resolution of 3 bits is implemented, which is tailored closely to UE requirements and limited by PS insertion loss [Section 4.2.2]. Other metrics specific to Tx-/Rx-mode are separately compared.



Figure 4.28: Summary of measured Rx-mode linearity performance versus frequency. (a) CW input  $P_{1dB}$  results at minimum and maximum LNA gain settings and PS phase state 0 versus CW frequency. (b) Two-tone  $IIP_3$  results at LNA gain settings  $\{0, 1, 7, 8\}$  and PS phase state 0 versus center frequency of two-tone signal.

|                                                                 | This<br>Work                                            | UCSD<br>RFIC'17—(1)                 | ADI/NCSU<br>RFIC'17—(2)                          | IBM/Ericsson<br>ISSCC'17           | Anokiwave<br>Product'16           | ADI/NCSU<br>JSSC'17  | UCSD<br>RFIC'16    | Samsung<br>IMS'16         |
|-----------------------------------------------------------------|---------------------------------------------------------|-------------------------------------|--------------------------------------------------|------------------------------------|-----------------------------------|----------------------|--------------------|---------------------------|
| Technology                                                      | 40nm Bulk<br>CMOS LP                                    | Jazz SBC18H<br>SiGe BiCMOS          | 130nm SiGe<br>BiCMOS                             | 130nm SiGe<br>BiCMOS               | SiGe<br>BiCMOS                    | 120nm SiGe<br>BiCMOS | 40nm<br>SOICMOS    | 0.15 µm GaAs<br>pHEMT     |
| Supply Voltage [V]                                              | 1.1                                                     | 2.2                                 | 3.3                                              | Not stated                         | 1.8                               | 2.5                  | 1.5                | 5.0                       |
| Integration                                                     | 1-Ch (LNA + PA<br>+ PS + TRSW)                          | 4-Ch/Chip<br>(LNA + PA + PS + TRSW) | 4-Ch (LNA + PA +<br>PS; separate<br>Tx/Rx ports) | 32-Ch (LNA +<br>PA + PS +<br>TRSW) | 4-Ch (LNA +<br>PA + PS +<br>TRSW) | 4-Ch (LNA +<br>PS)   | 1-Ch (LNA<br>+ PS) | 1-Ch (LNA +<br>PA + TRSW) |
| 5G Application                                                  | UE TRx                                                  | UE / BS TRx                         | BS TRx                                           | BS TRx                             | BS TRx                            | BS Rx                | UE / BS Rx         | UE TRx                    |
| Operating Freq. [GHz] | 27–30 | 28–32 | 24–28 | 27–29 | 27.5–30 | 28–32 | 26–28 | 27.5–28.35 |
| Tx Peak Gain [dB]                                               | 12.4†<br>12.6†                                          | 13 (PA-Only)                        | 14.3                                             | 32 @ 25°C                          | 24                                | 3                    | 4                  | 22                        |
| Rx Peak Gain [dB] | 10.81 | 20 | C.11 | 34 @ 20°C | 19 | 11 | 12.2 | 18 |
| Tx Gain Control<br>Rx Gain Control                              | 9× 1dB<br>8× 1dB                                        | 14× 1dB<br>14× 1dB                  | None                                             | 8×1dB<br>8×1dB                     | 31× 1dB<br>31× 1dB                | None                 | -16×0.4dB          | None                      |
| Tx OP <sub>max</sub> /Ch [dBm]                                  | 15.8 @ 27GHz                                            | 13 (PA-Only @ 28GHz)                | 12.5 to 17.5                                     | 16.4                               | Ι                                 |                      |                    | १                         |
| Tx PAE<sub>max</sub>/Ch [%] | ZIU @ ZIGHZ | 18 (PA-Only @ 28GHz) | I | 1.22 | | | | 32 |
| Tx OP <sub>1dB</sub> /Ch [dBm]<br>Tx PAE <sub>1dB</sub> /Ch [%] | ≥ 14.6@ 27—30GHz<br>≥ 21.9@ 27—30GHz                    | 10.5<br>13 (PA-Only @ 28GHz)        | 5.5 to 10.6<br>—                                 | <b>4</b>                           | 6                                 |                      |                    | - 24                      |
| Tx Waveform                                                     | 8CC-64QAM-OFDM                                          | 1CC-16QAM                           |                                                  |                                    |                                   |                      |                    |                           |
| Tx RFBW [MHz]<br>Tx EVM [dBc] | 800<br>−25 @ 27GHz | 400<br>22.5 @ 29GHz <sup>β</sup> | | | | | | |
| Tx Pout/Ch [dBm]                                                | 6.5<br>6.5                                              | 3.5 (@ OP <sub>1dB</sub> - 7dB)     |                                                  |                                    |                                   |                      |                    |                           |
| PAE @ Tx P<sub>out</sub> [%] | 8.0<br>F | 1 4 | 1 E to C D | 2 | 4 | | 4.0 to 4.7 | 0 0 |
| Rx NF [dB] | c.c | 4.0 | 4.0 10 0.9 | 0.0 | 0.C | WCALL OUL LO | (no TRSW) | 3.0 |
| Rx IP <sub>1dB</sub> [dBm]                                      | - 16<br>2 5                                             |                                     | − 25 to − 18.4                                   | - 22.5                             |                                   | — 16.4 to<br>— 12.9  | - 8<br>0 5         |                           |
|                                                                 | - 8.3                                                   |                                     | I                                                | I                                  |                                   | - 10.4 to $-$ 6.8    | C.U –              |                           |
| PS IL [dB] | 5.9† | | n/a | 9.3 <sup>α</sup> | | n/a | Ι | No PS |
| PS Resolution [bits]                                            | 3                                                       | 9                                   | 5                                                | 9                                  | 5                                 | 4                    | 5                  | No PS                     |
| TRX Phase<br>Error [deg. <sub>r.m.s.</sub>] | ≤ 10 @ 21.3–33.2GHz | 9 | 4.2 | 0.6" | 5 | 5.4 | 4 | No PS |
| TRX Gain Error<br>[dB <sub>rms.</sub> ]                         | ≤0.50dBpk @ 23—37GHz (Tx)<br>≤0.53dBrms @ 22—38GHz (Rx) | 0.8                                 | 0.5                                              | 0.7                                | 0.5                               | 9.0                  | 9.0                | n/a                       |
| Area / Ch [mm <sup>2</sup> ]                                    | 0.99††                                                  | 2.16*                               | 0.56*                                            | 3.06*                              | Not stated                        | 0.45                 | 1.24*              | 12.6**                    |
|                                                                 |                                                         |                                     |                                                  |                                    |                                   |                      |                    |                           |

Table 4.3: Comparison with state-of-the-art front-ends for 28 GHz 5G communications.

<sup>†</sup>After de-embedding 1.2 dB I/O balun loss, <sup>††</sup>without I/O balun, \*Estimated from die photo, \*\*With pads, $^{\alpha}$ from [100], $^{\beta}$ includes EVM contribution of driving up-/down-converter chip. References: RFIC'17-(1): [88], RFIC'17-(2): [90], ISSCC'17: [89], RFIC'16: [97], JSSC'17: [98], Anokiwave'16: [95], IMS'16: [99].

#### 4.5.3.1 Transmit Mode

Note that all of the silicon-based works that integrate the Tx in Table 4.3 use SiGe, and the majority target base stations. Despite the significantly lower 1.1 V supply voltage used by this work, the achieved per-channel $P_{1 dB} = 14.6$ dBm is comparable to [89] and higher than [88, 90, 95]. We note that the Tx in [90] benefits from separation of Tx output and Rx input to separate ports/pins (no TR switch), but also suffers significantly due to mis-tuning of on-chip matching. Also, [99] benefits from higher GaAs breakdown field to produce $P_{1 dB} > 20$ dBm, which may be needed only for high performance base stations [79]. Finally, the peak $8 \times 100$ MHz-wide 64-QAM OFDM signal $P_{out}$ demonstrated at 27 GHz in this work is 3 dB greater than the peak per-channel $1 \times 400$ MHz-wide 16-QAM $P_{out}$ reported in [88], which can be attributed to the significantly higher per-channel $P_{1 dB}$, and the wideband PA [Section 4.3] and antenna interface [Section 4.4.1] designs in this work.

#### 4.5.3.2 Receive Mode

We start with noise figure comparison. References [90, 97, 98] do not integrate the TR switch, so their noise figure performances should be compared to the 3.3 dB measured LNA-only noise figure in this work [Table 4.1]. Also, this work achieves comparable noise figure to [89, 95], i.e. 5–6 dB *including insertion loss of integrated TR switch*. With this 5–6 dB mean value in mind, reference [88] achieves an outstanding 4.6 dB. Since Rx $IP_{1 \text{ dB}}/IIP_{3}$ are not reported in [88], it is difficult to conclude if linearity is simultaneously upheld by the use of a 2.2 V supply for the back-end VGA, since the active vector modulator typically exhibits poor linearity. Finally, the GaAs LNA and TR antenna switch in [99] achieve a lower *NF* due to the fundamental advantages of higher substrate resistivity and lower switch losses in compound semiconductors. However, note that no gain or phase control is integrated into that solution, which implies that the d.c. power consumption in [99] is likely to increase if the gain and noise figure are maintained after adding gain/phase control functionality in the final solution.
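The reason the TR switch matters for this comparison is that any passive loss ahead of the LNA adds almost dB-for-dB to the cascaded noise figure, as the Friis formula shows. The sketch below illustrates this with a generic two-stage cascade; the 2 dB loss and 20 dB gain are placeholder values chosen for the example and are not a measured loss budget of this front-end.

```python
import math


def db_to_lin(x_db):
    """Convert a power ratio from dB to linear."""
    return 10.0 ** (x_db / 10.0)


def cascade_nf_db(stages):
    """Friis cascaded noise figure.

    stages: list of (gain_db, nf_db) tuples ordered from input to output.
    A matched passive loss of L dB is entered as (-L, L).
    """
    f_total = 0.0
    g_running = 1.0
    for i, (gain_db, nf_db) in enumerate(stages):
        f = db_to_lin(nf_db)
        f_total += f if i == 0 else (f - 1.0) / g_running
        g_running *= db_to_lin(gain_db)
    return 10.0 * math.log10(f_total)


if __name__ == "__main__":
    lna = (20.0, 3.3)            # assumed gain; NF is the LNA-only value quoted
    front_loss = (-2.0, 2.0)     # assumed placeholder loss ahead of the LNA
    print(cascade_nf_db([lna]))
    print(cascade_nf_db([front_loss, lna]))
```

In this example the cascaded noise figure is exactly the sum of the front loss and the LNA noise figure in dB, which is why front-ends with and without an integrated TR switch are compared separately above.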

As for linearity (i.e. $IP_{1 dB}$) comparison, performance of this work is similar to [98], and better than [89]. The significantly higher supply voltage in [98] enables the higher $IP_{1 dB}$ therein. Reference [97] significantly boosts $IP_{1 dB}$ by using a combination of a slightly higher 1.5 V supply voltage as well as alternating amplifier stages with phase-shifter cells. Note that alternating the phase shift cells with LNA stages means that, *unlike* e.g. [79], the PS is no longer bidirectional although its cells are passive. Therefore, larger area would be needed to integrate a second passive PS for the Tx-mode in a TRx built from the concept of [97].

Considering the above comparison with the most recent advances in SiGe RFIC design for base stations, one concludes that this work achieves state-of-the-art performance.

## 4.6 Conclusion

This chapter presented the first fully-integrated high-performance TR front-end in bulk CMOS targeting 5G handset phased arrays. Block-level specifications were first derived based on system design considerations. The PA, LNA, and PS circuits are integrated with TR switches in a compact area by using a wideband, all-lumped, and co-designed TR antenna interface topology that was verified experimentally using dedicated test structures. The presented TR front-end achieves state-of-the-art Tx $P_{out}$, linearity, and efficiency, as well as Rx noise/linearity performances in comparison to recent base station radio developments by academia/industry in SiGe BiCMOS.

# 5. CONCLUSION

In this dissertation, design methodologies and circuit techniques were demonstrated to address the integration of key phased array front-end circuits in scaled CMOS. For proof-of-concept, two PA prototypes were implemented in 28-nm and 40-nm nodes, and achieved state-of-the-art performance. A low-power fully-integrated TR front-end module was also implemented and maintained the excellent broadband performance of its constituent circuits through careful signal integrity analysis and design.

The 28 nm PA prototype in this dissertation is the first reported linear, bulk CMOS PA targeting low-power 5G mobile UE integrated phased array transceivers. The proposed optimization methodology, and its circuit-level enabler using inductive source degeneration, were demonstrated to be effective through theoretical as well as very detailed experimental verification using continuous wave, two-tone, and complex modulated signals. The prototype was designed and fabricated in 1P7M 28 nm bulk CMOS and achieved +4.2 dBm/9% measured $P_{out}$/PAE at -25 dBc EVM for a 250 MHz-wide, 64-QAM OFDM signal with 9.6 dB PAPR. At the time of its publication, the 28 nm PA set the state-of-the-art for high efficiency, linear, broadband 28 GHz-band PAs.

To drastically extend RFBW over that achieved in the first design, and to explore the use of CMOS technology for the even more challenging downlink data rates, the second PA was designed for wideband linearity and implemented in a *slower* 40 nm process. The 40 nm design extended the supportable RFBW by a factor of three over the state-of-the-art without degrading output power range, battery life, or amplifier fidelity. The implemented PA used double-tuned transformers that were optimized for linearity, and integrated  $9 \times 1$  dB digital gain control for the first time in any comparable (reported) high-performance CMOS PA. The prototype was fabricated in a 1P6M 40 nm CMOS LP technology and achieved  $P_{out}$ /PAE of +6.7 dBm/11% for an 8×100 MHz carrier aggregation 64-QAM OFDM signal with 9.7 dB PAPR, demonstrating the viability of CMOS technology to address even the very difficult 5G AP bandwidth requirements.

Finally, leveraging the developed PA design methodologies and circuits, a low-power transmit-receive phased array front-end module was fully integrated in 40 nm technology. In transmit-mode, the front-end maintained the excellent performance of the 40 nm PA, achieving +5.5 dBm/9% for the same  $8 \times 100$  MHz carrier aggregation signal above. In receive-mode, a 5.5 dB noise figure (*NF*) and a minimum third-order input intercept point (*IIP*<sub>3</sub>) of -13 dBm were achieved. The performance of the implemented CMOS front-end is comparable to state-of-the-art publications and commercial products that were very recently developed in silicon germanium (SiGe) technologies for 5G communication.

#### 5.1 Future Work

Future efforts to follow up on this research are recommended in two main areas:

• Investigating on-chip versus array-based transmit power combining techniques in the 28 GHz band. The purpose is to enhance the achievable range for high-throughput scenarios, such as those using 64-QAM OFDM, without needing to migrate to a more expensive technology such as SiGe or GaAs. Similar studies of power combining techniques have been performed in the 60 GHz band, but to the best of the author's knowledge, none have been reported in the 28 GHz band as of yet. The study may benefit greatly from the detailed link-budget analysis in Chapter 2 as a starting point; note, however, that some of the works cited in Chapter 2 already included two- or four-way on-chip power combining in the higher frequency part of the considered range. A key point in such a study is that the overall cost of implementation must be taken into consideration, since the silicon RFIC is expected to grow in size and therefore in cost. A cost and performance comparison with architectures that integrate multiple phased array RFICs with a single antenna module is also advised.

• Leveraging the developed PA design methodologies to implement more complex PA circuit topologies for further back-off PAE enhancement above the limits set by the simple common-source topology employed so far. For example, Doherty PAs typically suffer from AM-PM conversion and narrowband performance. Since the work in Chapter 3 tackles AM-PM conversion by using wideband matching techniques, a Doherty PA using carrier and peaking amplifier cells based on the techniques in Chapter 3 could result in higher PAE performance without the AM-PM and narrowband issues of existing Doherty PAs.
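The trade-off raised in the first recommendation can be made concrete with a first-order EIRP estimate. The sketch below is not taken from the dissertation. It assumes an ideal, uniformly excited array with no mutual coupling or feed loss, so that EIRP grows as 20 log10(N) with the number of elements N, while M-way on-chip combining adds at most 10 log10(M) minus the combiner loss. The function name, the 1 dB combiner loss, and the element counts are hypothetical placeholders; only the +5.5 dBm per-PA output power is taken from the text.

```python
import math


def eirp_dbm(p_pa_dbm: float, n_elements: int, m_combined: int = 1,
             combiner_loss_db: float = 0.0, elem_gain_dbi: float = 0.0) -> float:
    """First-order EIRP of an ideal phased array (illustrative assumption).

    p_pa_dbm         : output power of one PA cell
    n_elements       : number of antenna elements (spatial combining)
    m_combined       : PA cells combined on-chip per element
    combiner_loss_db : insertion loss of the on-chip combiner
    elem_gain_dbi    : gain of a single antenna element
    """
    if n_elements < 1 or m_combined < 1:
        raise ValueError("n_elements and m_combined must be >= 1")
    # Power delivered to each antenna element after on-chip combining.
    p_element_dbm = p_pa_dbm + 10.0 * math.log10(m_combined) - combiner_loss_db
    # Total radiated power scales as N, and array gain scales as N again.
    return p_element_dbm + elem_gain_dbi + 20.0 * math.log10(n_elements)


if __name__ == "__main__":
    p_pa = 5.5  # dBm, front-end output power quoted in the text
    # Same total of eight PA cells, arranged two different ways.
    array_only = eirp_dbm(p_pa, n_elements=8)
    on_chip = eirp_dbm(p_pa, n_elements=4, m_combined=2, combiner_loss_db=1.0)
    print(f"8 elements, 1 PA each   : {array_only:.1f} dBm EIRP")
    print(f"4 elements, 2 PAs each  : {on_chip:.1f} dBm EIRP")
```

In this idealized model, spreading a fixed number of PA cells across more antenna elements yields higher EIRP than combining them on-chip. It ignores antenna area, packaging, and RFIC cost, which are the factors the recommended study would need to weigh.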

## REFERENCES

- [1] S. Shakib, H. C. Park, J. Dunworth, V. Aparin, and K. Entesari, "A highly efficient and linear power amplifier for 28-ghz 5g phased array radios in 28-nm cmos," *IEEE Journal of Solid-State Circuits*, vol. 51, pp. 1–17, Dec 2016.
- [2] S. Shakib, M. Elkholy, J. Dunworth, V. Aparin, and K. Entesari, "2.7 a wide-band 28ghz power amplifier supporting 8×100mhz carrier aggregation for 5g in 40nm cmos," in 2017 IEEE International Solid-State Circuits Conference (ISSCC), pp. 44–45, Feb 2017.
- [3] M. Uzunkol and G. Rebeiz, "A low-loss 50–70 ghz spdt switch in 90 nm cmos," *IEEE Journal of Solid-State Circuits*, vol. 45, pp. 2003–2007, Oct 2010.
- [4] M. Boers, B. Afshar, I. Vassiliou, S. Sarkar, S. T. Nicolson, E. Adabi, B. G. Perumana, T. Chalvatzis, S. Kavvadias, P. Sen, W. L. Chan, A. H. T. Yu, A. Parsa, M. Nariman, S. Yoon, A. G. Besoli, C. A. Kyriazidou, G. Zochios, J. A. Castaneda, T. Sowlati, M. Rofougaran, and A. Rofougaran, "A 16tx/16rx 60 ghz 802.11ad chipset with single coaxial interface and polarization diversity," *IEEE Journal of Solid-State Circuits*, vol. 49, pp. 3031–3045, Dec 2014.
- [5] Y. Wang, H. Wang, C. Hull, and S. Ravid, "A transformer-based broadband front-end combo in standard cmos," *IEEE Journal of Solid-State Circuits*, vol. 47, pp. 1810–1819, Aug 2012.
- [6] T. M. Chen, W. C. Chan, C. C. Lin, J. L. Hsu, W. K. Li, P. A. Wu, Y. L. Huang, Y. C. Huang, T. Tsai, P. Y. Chang, C. L. Chen, C. H. Tsai, T. Y. Chang, I. C. Huang, W. H. Chiu, C. H. Liao, C. H. Wu, and G. Chien, "A 2×2 mimo 802.11 abgn/ac wlan soc with integrated t/r switch and on-chip pa delivering vht80 256qam 17.5dbm in 55nm cmos," in 2014 IEEE Radio Frequency Integrated Circuits Symposium, pp. 225–228, June 2014.

- [7] S. Shakib, H. C. Park, J. Dunworth, V. Aparin, and K. Entesari, "20.6 a 28ghz efficient linear power amplifier for 5g phased arrays in 28nm bulk cmos," in 2016 IEEE International Solid-State Circuits Conference (ISSCC), pp. 352–353, Jan 2016.
- [8] S. Cripps, *RF Power Amplifiers for Wireless Communications*. Artech House microwave library, Artech House, 2006.
- [9] Y. Tsividis and C. McAndrew, *Operation and Modeling of the MOS Transistor*. Oxford Series in Electrical and Computer Engineering, Oxford University Press, 2011.
- [10] B. Razavi, *RF Microelectronics*. Prentice Hall Communications Engineering and Emerging Technologies, Prentice Hall, 2012.
- [11] E. Cohen, S. Ravid, and D. Ritter, "60ghz 45nm pa for linear ofdm signal with predistortion correction achieving 6.1% pae and –28db evm," in 2009 IEEE Radio Frequency Integrated Circuits Symposium, pp. 35–38, June 2009.
- [12] B. François and P. Reynaert, "Highly linear fully integrated wideband rf pa for lte-advanced in 180-nm soi," *IEEE Transactions on Microwave Theory and Techniques*, vol. 63, pp. 649–658, Feb 2015.
- [13] M. Elmala, J. Paramesh, and K. Soumyanath, "A 90-nm cmos doherty power amplifier with minimum am-pm distortion," *IEEE Journal of Solid-State Circuits*, vol. 41, pp. 1323–1332, June 2006.
- [14] D. Chowdhury, P. Reynaert, and A. M. Niknejad, "Design considerations for 60 ghz transformer-coupled cmos power amplifiers," *IEEE Journal of Solid-State Circuits*, vol. 44, pp. 2733–2744, Oct 2009.

- [15] D. Zhao and P. Reynaert, "A 60-ghz dual-mode class ab power amplifier in 40-nm cmos," *IEEE Journal of Solid-State Circuits*, vol. 48, pp. 2323–2337, Oct 2013.
- [16] S. V. Thyagarajan, A. M. Niknejad, and C. D. Hull, "A 60 ghz drain-source neutralized wideband linear power amplifier in 28 nm cmos," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 61, pp. 2253–2262, Aug 2014.
- [17] S. Kulkarni and P. Reynaert, "14.3 a push-pull mm-wave power amplifier with < 0.8° am-pm distortion in 40nm cmos," in *Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2014 IEEE International*, pp. 252–253, Feb 2014.
- [18] D. Zhao and P. Reynaert, "An e-band power amplifier with broadband parallel-series power combiner in 40-nm cmos," *IEEE Transactions on Microwave Theory and Techniques*, vol. 63, pp. 683–690, Feb 2015.
- [19] W. Ye, K. Ma, and K. S. Yeo, "A 2-to-6ghz class-ab power amplifier with 28.4pct pae in 65nm cmos supporting 256qam," in 2015 IEEE International Solid-State Circuits Conference - (ISSCC) Digest of Technical Papers, pp. 1–3, Feb 2015.
- [20] A. Larie, E. Kerhervé, B. Martineau, L. Vogt, and D. Belot, "2.10 a 60ghz 28nm utbb fd-soi cmos reconfigurable power amplifier with 21% pae, 18.2dbm p1db and 74mw pdc," in *Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2015 IEEE International*, pp. 1–3, Feb 2015.
- [21] H. Hashemi and S. Raman, *mm-Wave Silicon Power Amplifiers and Transmitters*. The Cambridge RF and Microwave Engineering Series, Cambridge University Press, 2016.
- [22] B.-W. Min and G. M. Rebeiz, "5–6 ghz spdt switchable balun using cmos transistors," in 2008 IEEE Radio Frequency Integrated Circuits Symposium, pp. 321–324, June 2008.

- [23] H. S. Lee, K. Kim, and B. W. Min, "On-chip t/r switchable balun for 5- to 6-ghz wlan applications," *IEEE Transactions on Circuits and Systems II: Express Briefs*, vol. 62, pp. 6–10, Jan 2015.
- [24] P. Park, D. H. Shin, and C. P. Yue, "High-linearity cmos t/r switch design above 20 ghz using asymmetrical topology and ac-floating bias," *IEEE Transactions on Microwave Theory and Techniques*, vol. 57, pp. 948–956, April 2009.
- [25] Z. Li, H. Yoon, F.-J. Huang, and K. K. O, "5.8-ghz cmos t/r switches with high and low substrate resistances in a 0.18-μm cmos process," *IEEE Microwave and Wireless Components Letters*, vol. 13, pp. 1–3, Jan 2003.
- [26] B. W. Min and G. M. Rebeiz, "Ka-band low-loss and high-isolation switch design in 0.13-μm cmos," *IEEE Transactions on Microwave Theory and Techniques*, vol. 56, pp. 1364–1371, June 2008.
- [27] Y. P. Zhang, J. J. Wang, Q. Li, and X. J. Li, "Antenna-in-package and transmit-receive switch for single-chip radio transceivers of differential architecture," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 55, pp. 3564–3570, Dec 2008.
- [28] N. A. Talwalkar, C. P. Yue, H. Gan, and S. S. Wong, "Integrated cmos transmit-receive switch using lc-tuned substrate bias for 2.4-ghz and 5.2-ghz applications," *IEEE Journal of Solid-State Circuits*, vol. 39, pp. 863–870, June 2004.
- [29] T. Ohnakado, S. Yamakawa, T. Murakami, A. Furukawa, E. Taniguchi, H. Ueda, N. Suematsu, and T. Oomori, "21.5-dbm power-handling 5-ghz transmit/receive cmos switch realized by voltage division effect of stacked transistor configuration with depletion-layer-extended transistors (dets)," *IEEE Journal of Solid-State Circuits*, vol. 39, pp. 577–584, April 2004.

- [30] Y. Jin and C. Nguyen, "Ultra-compact high-linearity high-power fully integrated dc–20-ghz 0.18-μm cmos t/r switch," *IEEE Transactions on Microwave Theory and Techniques*, vol. 55, pp. 30–36, Jan 2007.
- [31] Q. Li, Y. P. Zhang, K. S. Yeo, and W. M. Lim, "16.6- and 28-ghz fully integrated cmos rf switches with improved body floating," *IEEE Transactions on Microwave Theory and Techniques*, vol. 56, pp. 339–345, Feb 2008.
- [32] Z. Li and K. K. O, "15-ghz fully integrated nmos switches in a 0.13-μm cmos process," *IEEE Journal of Solid-State Circuits*, vol. 40, pp. 2323–2328, Nov 2005.
- [33] M. Elkholy, S. Shakib, J. Dunworth, V. Aparin, and K. Entesari, "A 26–33ghz, 3.3db nf variable gain lna and 6db insertion loss passive 3-bit phase shifter for 5g in 40nm cmos," in *to be submitted to 2017 IEEE Radio Frequency Integrated Circuits Symposium*, pp. 1–4, June 2017.
- [34] C. F. Campbell and S. A. Brown, "A compact 5-bit phase-shifter mmic for k-band satellite communication systems," *IEEE Transactions on Microwave Theory and Techniques*, vol. 48, pp. 2652–2656, Dec 2000.
- [35] T. M. Hancock and G. M. Rebeiz, "A 12-ghz sige phase shifter with integrated lna," *IEEE Transactions on Microwave Theory and Techniques*, vol. 53, pp. 977–983, March 2005.
- [36] D.-W. Kang, H. D. Lee, C.-H. Kim, and S. Hong, "Ku-band mmic phase shifter using a parallel resonator with 0.18-μm cmos technology," *IEEE Transactions on Microwave Theory and Techniques*, vol. 54, pp. 294–301, Jan 2006.
- [37] B. W. Min and G. M. Rebeiz, "Single-ended and differential ka-band bicmos phased array front-ends," *IEEE Journal of Solid-State Circuits*, vol. 43, pp. 2239–2250, Oct 2008.

- [38] E. Cohen, C. Jakobson, S. Ravid, and D. Ritter, "A bidirectional tx/rx four-element phased array at 60 ghz with rf-if conversion block in 90-nm cmos process," *IEEE Transactions on Microwave Theory and Techniques*, vol. 58, pp. 1438–1446, May 2010.
- [39] K. Gharibdoust, N. Mousavi, M. Kalantari, M. Moezzi, and A. Medi, "A fully integrated 0.18-μm cmos transceiver chip for x-band phased-array systems," *IEEE Transactions on Microwave Theory and Techniques*, vol. 60, pp. 2192–2202, July 2012.
- [40] W. T. Li, Y. C. Chiang, J. H. Tsai, H. Y. Yang, J. H. Cheng, and T. W. Huang, "60-ghz 5-bit phase shifter with integrated vga phase-error compensation," *IEEE Transactions on Microwave Theory and Techniques*, vol. 61, pp. 1224–1235, March 2013.
- [41] S. Shakib, M. Elkholy, J. Dunworth, V. Aparin, and K. Entesari, "A wideband linear 28-ghz power amplifier for power-efficient 5g phased arrays in 40-nm cmos," *to be submitted to IEEE Transactions on Microwave Theory and Techniques*, pp. 1–12.
- [42] S. Onoe, "Evolution of 5g mobile technology toward 2020 and beyond," in 2016 IEEE International Solid-State Circuits Conference (ISSCC), pp. 23–28, Jan 2016.
- [43] Z. Pi and F. Khan, "An introduction to millimeter-wave mobile broadband systems," *IEEE Communications Magazine*, vol. 49, pp. 101–107, June 2011.
- [44] T. S. Rappaport, S. Sun, R. Mayzus, H. Zhao, Y. Azar, K. Wang, G. N. Wong, J. K. Schulz, M. Samimi, and F. Gutierrez, "Millimeter wave mobile communications for 5g cellular: It will work!," *IEEE Access*, vol. 1, pp. 335–349, 2013.
- [45] S. Rangan, T. S. Rappaport, and E. Erkip, "Millimeter-wave cellular wireless networks: Potentials and challenges," *Proceedings of the IEEE*, vol. 102, pp. 366–385, March 2014.

- [46] W. Roh, J. Y. Seol, J. Park, B. Lee, J. Lee, Y. Kim, J. Cho, K. Cheun, and F. Aryanfar, "Millimeter-wave beamforming as an enabling technology for 5g cellular communications: theoretical feasibility and prototype results," *IEEE Communications Magazine*, vol. 52, pp. 106–113, February 2014.
- [47] F. Aryanfar, J. Pi, H. Zhou, T. Henige, G. Xu, S. Abu-Surra, D. Psychoudakis, and F. Khan, "Millimeter-wave base station for mobile broadband communication," in 2015 IEEE MTT-S International Microwave Symposium, pp. 1–3, May 2015.
- [48] Federal Communications Commission, Notice of Proposed Rulemaking, "Use of spectrum bands above 24 ghz for mobile radio services," in GN Docket No. 14-177, October 2015.
- [49] A. Sarkar and B. Floyd, "A 28-ghz class-j power amplifier with 18-dbm output power and 35% peak pae in 120-nm sige bicmos," in *Silicon Monolithic Integrated Circuits in Rf Systems (SiRF), 2014 IEEE 14th Topical Meeting on*, pp. 71–73, Jan 2014.
- [50] M. R. Akdeniz, Y. Liu, M. K. Samimi, S. Sun, S. Rangan, T. S. Rappaport, and E. Erkip, "Millimeter wave channel modeling and cellular capacity evaluation," *IEEE Journal on Selected Areas in Communications*, vol. 32, pp. 1164–1179, June 2014.
- [51] W. Heinrich, "The flip-chip approach for millimeter wave packaging," *IEEE Microwave Magazine*, vol. 6, pp. 36–45, Sept 2005.
- [52] D. Pozar, *Microwave Engineering, 4th Edition*. Wiley, 2011.
- [53] T. Rappaport, R. Heath, R. Daniels, and J. Murdock, *Millimeter Wave Wireless Communications*. Prentice Hall Communications Engineering and Emerging Technologies Series from Ted Rappaport, Pearson Education, 2014.

- [54] O. Inac, M. Uzunkol, and G. M. Rebeiz, "45-nm cmos soi technology characterization for millimeter-wave applications," *IEEE Transactions on Microwave Theory and Techniques*, vol. 62, pp. 1301–1311, June 2014.
- [55] X. J. Li and Y. P. Zhang, "Flipping the cmos switch," *IEEE Microwave Magazine*, vol. 11, pp. 86–96, Feb 2010.
- [56] S. Monayakul, S. Sinha, C. T. Wang, N. Weimann, F. J. Schmückle, M. Hrobak, V. Krozer, W. John, L. Weixelbaum, P. Wolter, O. Krüger, and W. Heinrich, "Flip-chip interconnects for 250 ghz modules," *IEEE Microwave and Wireless Components Letters*, vol. 25, pp. 358–360, June 2015.
- [57] J. Kang, D. Yu, K. Min, and B. Kim, "A ultra-high pae doherty amplifier based on 0.13-μm cmos process," *IEEE Microwave and Wireless Components Letters*, vol. 16, pp. 505–507, Sept 2006.
- [58] S. H. L. Tu and S. C. H. Chen, "A 5.25-ghz cmos cascode class-ab power amplifier for wireless communication," in *Electron Devices and Solid-State Circuits, 2007. EDSSC 2007. IEEE Conference on*, pp. 421–424, Dec 2007.
- [59] A. Chakrabarti and H. Krishnaswamy, "High-power high-efficiency class-e-like stacked mmwave pas in soi and bulk cmos: Theory and implementation," *IEEE Transactions on Microwave Theory and Techniques*, vol. 62, pp. 1686–1704, Aug 2014.
- [60] E. Kaymaksut, D. Zhao, and P. Reynaert, "Transformer-based doherty power amplifiers for mm-wave applications in 40-nm cmos," *IEEE Transactions on Microwave Theory and Techniques*, vol. 63, pp. 1186–1192, April 2015.
- [61] S. W. Chen, W. Panton, and R. Gilmore, "Effects of nonlinear distortion on cdma communication systems," *IEEE Transactions on Microwave Theory and Techniques*, vol. 44, pp. 2743–2750, Dec 1996.

- [62] Y. Palaskas, S. S. Taylor, S. Pellerano, I. Rippke, R. Bishop, A. Ravi, H. Lakdawala, and K. Soumyanath, "A 5-ghz 20-dbm power amplifier with digitally assisted am-pm correction in a 90-nm cmos process," *IEEE Journal of Solid-State Circuits*, vol. 41, pp. 1757–1763, Aug 2006.
- [63] T. Heller, E. Cohen, and E. Socher, "Analysis of cross-coupled common-source cores for w-band lna design at 28nm cmos," in *Microwaves, Communications, Antennas and Electronics Systems (COMCAS), 2013 IEEE International Conference on*, pp. 1–5, Oct 2013.
- [64] Y. He, L. Li, and P. Reynaert, "60ghz power amplifier with distributed active transformer and local feedback," in *ESSCIRC, 2010 Proceedings of the*, pp. 314–317, Sept 2010.
- [65] J. R. Long, "Monolithic transformers for silicon rf ic design," *IEEE Journal of Solid-State Circuits*, vol. 35, pp. 1368–1382, Sept 2000.
- [66] J. Hong and M. Lancaster, *Microstrip Filters for RF/Microwave Applications*. Wiley Series in Microwave and Optical Engineering, Wiley, 2004.
- [67] I. Aoki, S. D. Kee, D. B. Rutledge, and A. Hajimiri, "Distributed active transformer-a new power-combining and impedance-transformation technique," *IEEE Transactions on Microwave Theory and Techniques*, vol. 50, pp. 316–331, Jan 2002.
- [68] B. Razavi, "A study of phase noise in cmos oscillators," *IEEE Journal of Solid-State Circuits*, vol. 31, pp. 331–343, Mar 1996.
- [69] K. L. Fong and R. G. Meyer, "High-frequency nonlinearity analysis of common-emitter and differential-pair transconductance stages," *IEEE Journal of Solid-State Circuits*, vol. 33, pp. 548–555, Apr 1998.

- [70] J. Deng, P. S. Gudem, L. E. Larson, and P. M. Asbeck, "A high average-efficiency sige hbt power amplifier for wcdma handset applications," *IEEE Transactions on Microwave Theory and Techniques*, vol. 53, pp. 529–537, Feb 2005.
- [71] D. Stephens, T. Vanhoucke, and J. J. T. M. Donkers, "Rf reliability of short channel nmos devices," in 2009 IEEE Radio Frequency Integrated Circuits Symposium, pp. 343–346, June 2009.
- [72] D. Schreurs, *RF Power Amplifier Behavioral Modeling*. The Cambridge RF and Microwave Engineering Series, Cambridge University Press, 2008.
- [73] V. Aparin and C. Persico, "Effect of out-of-band terminations on intermodulation distortion in common-emitter circuits," in *Microwave Symposium Digest, 1999 IEEE MTT-S International*, vol. 3, pp. 977–980, June 1999.
- [74] V. Aparin, "Analysis of cdma signal spectral regrowth and waveform quality," *IEEE Transactions on Microwave Theory and Techniques*, vol. 49, pp. 2306–2314, Dec 2001.
- [75] P. Haldi, D. Chowdhury, P. Reynaert, G. Liu, and A. M. Niknejad, "A 5.8 ghz 1 v linear power amplifier using a novel on-chip transformer power combiner in standard 90 nm cmos," *IEEE Journal of Solid-State Circuits*, vol. 43, pp. 1054–1063, May 2008.
- [76] B. Park, D. Jeong, J. Kim, Y. Cho, K. Moon, and B. Kim, "Highly linear cmos power amplifier for mm-wave applications," in 2016 IEEE MTT-S International Microwave Symposium (IMS), pp. 1–3, May 2016.
- [77] V. G. T. F. A. I. W. Group, "Verizon 5th generation radio access; test plan for air interface (release 1)." http://www.5gtf.org/5GTF\_Test\_Plan\_AI\_v1p1.pdf, 2016.
- [78] A. Niknejad and H. Hashemi, *mm-Wave Silicon Technology: 60 GHz and Beyond*. Integrated Circuits and Systems, Springer US, 2008.
- [79] U. Kodak and G. M. Rebeiz, "Bi-directional flip-chip 28 ghz phased-array core-chip in 45nm cmos soi for high-efficiency high-linearity 5g systems," in 2017 IEEE Radio Frequency Integrated Circuits Symposium (RFIC), pp. 61–64, 2017.
- [80] B. W. Min and G. M. Rebeiz, "Single-ended and differential ka-band bicmos phased array front-ends," *IEEE Journal of Solid-State Circuits*, vol. 43, pp. 2239–2250, Oct 2008.
- [81] M. Elkholy, S. Shakib, J. Dunworth, V. Aparin, and K. Entesari, "Low loss highly linear integrated passive phase shifters for 5g on bulk cmos," *submitted to IEEE Transactions on Microwave Theory and Techniques*, pp. 1–9.
- [82] K. J. Koh and G. M. Rebeiz, "A q-band four-element phased-array front-end receiver with integrated wilkinson power combiners in 0.18-μm sige bicmos technology," *IEEE Transactions on Microwave Theory and Techniques*, vol. 56, pp. 2046–2053, Sept 2008.
- [83] T. Yu and G. M. Rebeiz, "A 22–24 ghz 4-element cmos phased array with on-chip coupling characterization," *IEEE Journal of Solid-State Circuits*, vol. 43, pp. 2134–2143, Sept 2008.
- [84] S. Y. Kim and G. M. Rebeiz, "A low-power bicmos 4-element phased array receiver for 76–84 ghz radars and communication systems," *IEEE Journal of Solid-State Circuits*, vol. 47, pp. 359–367, Feb 2012.
- [85] B. W. Min, M. Chang, and G. M. Rebeiz, "Sige t/r modules for ka-band phased arrays," in 2007 IEEE Compound Semiconductor Integrated Circuits Symposium, pp. 1–4, Oct 2007.

- [86] D. W. Kang, J. G. Kim, B. W. Min, and G. M. Rebeiz, "Single and four-element ka-band transmit/receive phased-array silicon rfics with 5-bit amplitude and phase control," *IEEE Transactions on Microwave Theory and Techniques*, vol. 57, pp. 3534–3543, Dec 2009.
- [87] C. Y. Kim, D. W. Kang, and G. M. Rebeiz, "A 44–46-ghz 16-element sige bicmos high-linearity transmit/receive phased array," *IEEE Transactions on Microwave Theory and Techniques*, vol. 60, pp. 730–742, March 2012.
- [88] K. Kibaroglu, M. Sayginer, and G. M. Rebeiz, "An ultra low-cost 32-element 28 ghz phased-array transceiver with 41 dbm eirp and 1.0-1.6 gbps 16-qam link at 300 meters," in 2017 IEEE Radio Frequency Integrated Circuits Symposium (RFIC), pp. 73–77, 2017.
- [89] B. Sadhu, Y. Tousi, J. Hallin, S. Sahl, S. Reynolds, Ö. Renström, K. Sjögren, O. Haapalahti, N. Mazor, B. Bokinge, G. Weibull, H. Bengtsson, A. Carlinger, E. Westesson, J. E. Thillberg, L. Rexberg, M. Yeck, X. Gu, D. Friedman, and A. Valdes-Garcia, "7.2 a 28ghz 32-element phased-array transceiver ic with concurrent dual polarized beams and 1.4 degree beam-steering resolution for 5g communication," in 2017 IEEE International Solid-State Circuits Conference (ISSCC), pp. 128–129, Feb 2017.
- [90] Y. S. Yeh, B. Walker, E. Balboni, and B. A. Floyd, "A 28-ghz phased-array transceiver with series-fed dual-vector distributed beamforming," in 2017 IEEE Radio Frequency Integrated Circuits Symposium (RFIC), pp. 65–68, 2017.
- [91] M. Elkholy, S. Shakib, J. Dunworth, V. Aparin, and K. Entesari, "A highly linear wideband variable gain lna with 1db gain steps for 5g using 40nm cmos," *submitted to IEEE Microwave and Wireless Components Letters*, pp. 1–3, 2017.

- [92] T. Yao, M. Q. Gordon, K. K. W. Tang, K. H. K. Yau, M. T. Yang, P. Schvan, and S. P. Voinigescu, "Algorithmic design of cmos lnas and pas for 60-ghz radio," *IEEE Journal of Solid-State Circuits*, vol. 42, pp. 1044–1057, May 2007.
- [93] C. F. Campbell and S. A. Brown, "A compact 5-bit phase-shifter mmic for k-band satellite communication systems," *IEEE Transactions on Microwave Theory and Techniques*, vol. 48, pp. 2652–2656, Dec 2000.
- [94] H. Cho and D. E. Burk, "A three-step method for the de-embedding of high-frequency s-parameter measurements," *IEEE Transactions on Electron Devices*, vol. 38, pp. 1371–1375, Jun 1991.
- [95] Anokiwave, "Awmf-0108." http://www.anokiwave.com/specifications/AWMF-0108.pdf, 2016.
- [96] B. Sadhu, Y. Tousi, J. Hallin, S. Sahl, S. Reynolds, Ö. Renström, K. Sjögren, O. Haapalahti, N. Mazor, B. Bokinge, G. Weibull, H. Bengtsson, A. Carlinger, E. Westesson, J. E. Thillberg, L. Rexberg, M. Yeck, X. Gu, D. Friedman, and A. Valdes-Garcia, "7.2 a 28ghz 32-element phased-array transceiver ic with concurrent dual polarized beams and 1.4 degree beam-steering resolution for 5g communication," in 2017 IEEE International Solid-State Circuits Conference (ISSCC), pp. 128–129, Feb 2017.
- [97] U. Kodak and G. M. Rebeiz, "A 42mw 26–28 ghz phased-array receive channel with 12 db gain, 4 db nf and 0 dbm iip3 in 45nm cmos soi," in 2016 IEEE Radio Frequency Integrated Circuits Symposium (RFIC), pp. 348–351, May 2016.
- [98] Y. S. Yeh, B. Walker, E. Balboni, and B. Floyd, "A 28-ghz phased-array receiver front end with dual-vector distributed beamforming," *IEEE Journal of Solid-State Circuits*, vol. PP, no. 99, pp. 1–15, 2017.

- [99] J. Curtis, H. Zhou, and F. Aryanfar, "A fully integrated ka-band front end for 5g transceiver," in 2016 IEEE MTT-S International Microwave Symposium (IMS), pp. 1–3, May 2016.
- [100] Y. Tousi and A. Valdes-Garcia, "A ka-band digitally-controlled phase shifter with sub-degree phase precision," in 2016 IEEE Radio Frequency Integrated Circuits Symposium (RFIC), pp. 356–359, May 2016.