# A FULLY DIFFERENTIAL, WIDE BAND, 4-STACKED RF DRIVER FOR PHOTONIC MODULATORS

A Thesis<br>by<br>POUYA ESFAHANI

Submitted to the Graduate and Professional School of Texas A\&M University in partial fulfillment of the requirements for the degree of MASTER OF SCIENCE

| Chair of Committee, | Kamran Entesari |
| :--- | :--- |
| Committee Members, | Jose Silva-Martinez <br> Prasad Enjeti <br> Behbood Zoghi |
| Head of Department, | Miroslav Begovic |

May 2023
Major Subject: Electrical Engineering


#### Abstract

Advanced sub-micrometer technologies suffer from the low breakdown voltage of their devices, which makes it hard to realize high-power, high-frequency RF drivers. To overcome this limitation, power-combining techniques such as transistor stacking have been developed to combine the power of multiple devices. This work presents an ultra-wide-band, fully differential stacked RF driver. Compared with conventional RF drivers, the stacked driver allows a higher output voltage swing and larger output power. The proposed design, prototyped in GlobalFoundries 22 nm FDSOI CMOS technology, shows a $0.4-17.4 \mathrm{GHz}$ bandwidth with 20 dB gain. More than 16 dBm of output power is achieved at 10 GHz, with a saturated output power of 17.6 dBm. Overall, the differential driver consumes 288 mW from a 3.2 V supply and occupies $0.167 \mathrm{~mm}^{2}$.


## DEDICATION

To my Mother and Sister,
And in memory of my Father.

## ACKNOWLEDGEMENTS

First, I would like to express my gratitude to Dr. Kamran Entesari, my research advisor, for giving me the opportunity to work in the Analog and RF domain and guiding me throughout this time. I am grateful for his valuable guidance and encouragement.

Also, I would like to thank Dr. Jose Silva-Martinez, Dr. Prasad Enjeti, and Dr. Behbood Zoghi for serving on my committee and dedicating their valuable time to review my thesis.

I am thankful to have worked with Jeirui Fu, Mohammad Ghaedi Bardeh, and Ramy Rady who have guided me in various aspects of the layout and design tools.

And last, I would like to thank my family and friends for their support throughout this time, without whom, I wouldn't have been able to complete this work.

## CONTRIBUTORS AND FUNDING SOURCES

## Contributors

This work was supported by a thesis committee consisting of Dr. Kamran Entesari, Dr. Jose Silva-Martinez, Dr. Prasad Enjeti, and Dr. Behbood Zoghi.

All work conducted for the thesis was completed by the student, under the supervision of Dr. Kamran Entesari of Electrical Engineering.

## Funding Sources

This project, as part of the OEO project, is funded by the National Science Foundation (NSF).

## TABLE OF CONTENTS

ABSTRACT
DEDICATION
ACKNOWLEDGEMENTS
CONTRIBUTORS AND FUNDING SOURCES
LIST OF FIGURES
LIST OF TABLES
1. INTRODUCTION
1.1 System Overview
1.2 Research Objective
1.3 Thesis organization
2. RF DRIVER ARCHITECTURES REVIEW
2.1 Distributed Driver
2.2 Stacked Driver
3. PROPOSED RF DRIVER
3.1 22nm-FDSOI
3.2 Stacked RF Driver
3.3 Intermediate matching network
3.4 Fully differential Stacked RF Driver design FDSOI CMOS
3.5 Transistor sizing and biasing
3.6 Feedback and wideband matching
3.7 Choke Inductor
3.8 Layout
3.9 Wire bonding effect
4. SIMULATION RESULTS
4.1 Simulation Results
4.1.1 S-parameter analysis
4.1.2 Design stability
4.1.3 Harmonic balance analysis (1-dB compression point and Voltage gain)
4.1.4 Transient Analysis
4.2 Chip microphotograph
4.3 Measurement plan
4.3.1 S-parameter measurement
4.3.2 Transient measurement
4.3.3 Large Signal measurement
5. CONCLUSION AND FUTURE WORKS
REFERENCES

## LIST OF FIGURES

Figure 1 OEO Loop block diagram
Figure 2 A simplified block diagram of a CMOS chip with PD
Figure 3 Driving schemes of MZM transmitters. (a) Single-ended drive, dual-arm push-pull. (b) Differential drive, dual-arm push-pull. (c) Differential drive, dual-arm push-pull with shared bias. (d) Dual-differential drive, dual-arm push-pull
Figure 4 Driver linearity plot
Figure 5 Simplified Distributed power amplifier using TL
Figure 6 (a) Gate TL, (b) Drain TL
Figure 7 A two-port network terminated by the image impedance
Figure 8 8-stage Distributed Driver using 2-stacked gain unit cell
Figure 9 Low-pass filter T-section: (a) constant-k section and (b) m-derived section
Figure 10 folded pseudo-differential distributed driver schematic
Figure 11 distributed driver gain stage
Figure 12 multi-drive stacked topology
Figure 13 (a) Conventional Stacked and (b) multi-drive stacked topologies
Figure 14 Cascode configuration
Figure 15 Two-Stacked Transistor with Choke inductor
Figure 16 Effect of the number of stacked transistors on Psat
Figure 17 3-stacked driver with resistive biasing network
Figure 18 Source input impedance of kth transistor
Figure 19 different intermediate matching network solutions: (a) shunt inductive tuning, (b) shunt feedback Cds tuning, (c) series inductive tuning
Figure 20 (a) simplified 4-stacked Transformer-coupled input-feed technique, (b) 2-stage Transformer-coupled input-feed technique
Figure 21 high-frequency transistor model with output admittance
Figure 22 (a) 3-stack n-MOSFET, (b) 4-stack n-MOSFET, and (c) 3-stack CMOS
Figure 23 3-stacked nMOS, 4-stacked nMOS, and 3-stacked CMOS architecture
Figure 24 Stacked topology with feedback resistor Rf
Figure 25 Type II: Conventional biasing; Type IV: proposed intrinsic parasitic feedback biasing network
Figure 26 Bulk CMOS Vs. FDSOI CMOS technology
Figure 27 Comparison between flip-well (right) and conventional well (left)
Figure 28 BEOL visualization of 22 nm FDSOI
Figure 29 Proposed 4-stack NMOS basic structure
Figure 30 Class-A biasing point
Figure 31 high-frequency stacked-transistor model
Figure 32 Small signal gain and stability factors Vs. the feedback resistor (Rf)
Figure 33 simplified the small-signal model of stacked transistors
Figure 34 The comparison of different tuning methods on Psat
Figure 35 Proposed 4-stacked driver (single-ended) with input and output matching network
Figure 36 Intermediate drain voltages
Figure 37 (a) Input/output of first stack transistor, (b) back-gate voltage effect on the first stage drain voltage
Figure 38 (a) Threshold Voltage Vs. Back-gate voltage; (b) Transconductance (gm) Vs. Vgs for different Back-gate voltages
Figure 39 Effect of feedback resistor on (a) gain, bandwidth, and (b) stability
Figure 40 Input/Output matching Vs. Frequency plot from 1 GHz to 17 GHz
Figure 41 Choke inductor effect on the low cutoff frequency
Figure 42 Designed spiral inductor using Sonnet EM simulator
Figure 43 2 pF coupling capacitor, $\mathrm{C}_{6}$ on Figure 35, consists of four 500 fF unit capacitors in parallel with $8 \mu \mathrm{m} \times 10 \mu \mathrm{m}$ dimension each unit and $15 \mu \mathrm{m} \times 50 \mu \mathrm{m}$ overall size for the $\mathrm{C}_{6}$
Figure 44 2-pF MOM coupling capacitor Quality Factor
Figure 45 First stage transistor layout, consists of 8 arrays of SLVTnFET
Figure 46 Layout differential stacked PA, Active area layout
Figure 47 Overall Differential 4-stacked RF driver
Figure 48 Passive VDD connection and its equivalent circuit model
Figure 49 Effect of resonance frequency as the bond-wire inductance varies from 200 pH to 600 pH with fixed $80-\mathrm{fF}$ pad2 capacitance
Figure 50 driver transfer function with LBW of 400 pH and 550 pH
Figure 51 Proposed differential driver S-parameter analysis
Figure 52 (a) K-factor and B1f stability factors of the PA, (b) Mu and Mu-prime stability factors
Figure 53 (a) OP1db at 10 GHz is about 16 dBm, IP1db is about -2.9 dBm (single-ended, $50 \Omega$ termination)
Figure 54 Voltage gain Vs input power at (a) 10 GHz, (b) 16 GHz
Figure 55 (a) differential and (b) single-ended input/output signal at 10 GHz
Figure 56 Chip microphotograph
Figure 57 The location of the test point at the CMOS chip
Figure 58 Measurement PCB
Figure 59 S-Parameter measurement setup
Figure 60 Differential output to single-ended, using a Balun
Figure 61 Large-signal measurement setup
Figure 62 IP3 measurement setup

## LIST OF TABLES

Table 1 RF driver Class comparison
Table 2 Table of comparison: Distributed RF Driver
Table 3 Achieved outcomes in
Table 4 Table of comparison: Stacked RF Driver
Table 5 performance comparison table

## 1. INTRODUCTION

### 1.1 System Overview

Nowadays, ultra-wideband, ultra-low-phase-noise, and high-resolution mm-wave signal generators can significantly benefit applications such as instrumentation, wireless communications, radar, and space systems. Enabling simultaneous ultra-wideband, ultra-low-phase-noise, high-resolution, and continuously tunable signal generation for these applications requires transformative mm-wave signal generation architectures. The difficulty of meeting all of these specifications at once is a substantial obstacle to implementing such signal generators, and conventional electronic signal generators in the mm-wave range face considerable limitations in doing so for systems with small form factors. RF photonics technology has the potential to enable mm-wave signal generators that overcome these challenges. At the same time, nanometer CMOS technologies can operate in the mm-wave range over an ultra-wide bandwidth, owing to their high unity-gain frequency $\left(f_{t}\right)$ and low power consumption.


Figure 1 OEO Loop block diagram

Figure 1 shows the overall block diagram of the Optical/Electronic Oscillator (OEO). In this system, the optical power passes through the Mach-Zehnder modulator (MZM) and multiple passband/notch filters. Next, the optical power is absorbed by the photodetector (PD). The PD converts the optical power into an electrical current and passes it to the CMOS chip. Inside the CMOS chip, the input current is converted and amplified to the desired voltage swing at the output of the RF driver. Lastly, the voltage swing at the output of the driver drives the MZM and modulates the light, as discussed in this chapter.

For Phase 1 of this project, a 17 GHz bandwidth goal was set. From the system analysis, the start-of-oscillation condition requires 45 dB of gain from the CMOS chip to compensate for the overall loop loss, and a $2 \mathrm{~V}_{\mathrm{p}}$ (single-ended) voltage swing at the output of the RF driver. The gain is distributed across the CMOS blocks, with 20 dB for the RF driver and 25 dB for the TIA and VGA.
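As a rough numerical illustration of these targets, the sketch below adds the gain budget in dB and converts the $2 \mathrm{~V}_{\mathrm{p}}$ swing into an equivalent power and peak current. It assumes a sinusoidal signal and a $50 \Omega$ single-ended load; both are assumptions of this sketch, and the function names are hypothetical.

```python
import math

def peak_voltage_to_dbm(v_peak, r_load=50.0):
    """Average power in dBm of a sinusoid with peak voltage v_peak across r_load."""
    p_watts = v_peak ** 2 / (2.0 * r_load)
    return 10.0 * math.log10(p_watts / 1e-3)

def peak_current_ma(v_peak, r_load=50.0):
    """Peak load current in mA for the same sinusoid."""
    return 1e3 * v_peak / r_load

# Gain split quoted in the text; gains in dB add along the chain.
driver_gain_db = 20.0
tia_vga_gain_db = 25.0
total_gain_db = driver_gain_db + tia_vga_gain_db

# 2 Vp single-ended swing; the 50-ohm load is an assumption of this sketch.
swing_dbm = peak_voltage_to_dbm(2.0)
swing_ma = peak_current_ma(2.0)

print(f"total chip gain : {total_gain_db:.0f} dB")
print(f"2 Vp into 50 ohm: {swing_dbm:.2f} dBm, {swing_ma:.0f} mA peak")
```

Under these assumptions, a $2 \mathrm{~V}_{\mathrm{p}}$ swing corresponds to roughly 16 dBm and 40 mA of peak load current per side.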

Figure 2 shows a simplified block diagram of the TIA and main amplifier. As mentioned, the optical signal is absorbed by a photodetector (PD), which converts it into an electrical current. Next, the transimpedance amplifier (TIA) converts this current into a voltage with a certain gain. However, because the voltage swing at the output of the TIA is small, another amplifier, the main amplifier in Figure 2, is needed to provide sufficient voltage swing for the rest of the circuit. The main amplifier can be implemented as a limiting amplifier (LA) or a driver amplifier (DA), depending on the application. In this project, because the ultimate goal is to generate low-distortion CW signals with low phase noise, and the receiver drives a Mach-Zehnder modulator (MZM) in a feedback loop as shown in Figure 1, a DA is chosen as the driver to produce a high voltage swing at the output.


Figure 2 A simplified block diagram of a CMOS chip with PD (Reprinted from [1])

A Mach-Zehnder modulator is an optical modulator that changes the intensity or phase of light by applying an electric field across the optical path, as shown in Figure 3 (c). In this figure, the electric field is produced by the voltage difference between S-/S+ and VDD. This field induces a phase shift, and depending on that phase shift, the light from the two paths recombines constructively or destructively.

A common way to implement the MZM is the traveling-wave (TW) electrode architecture, in which the modulator electrodes are designed as a transmission line (TL), such as a CPW. This architecture has several advantages. First, the capacitive loading of the electrodes is absorbed into the TL, so the effective bandwidth of the system expands. Second, unlike discrete devices, a TW electrode can avoid reflections by using a matched-impedance TL. However, the TW topology also has disadvantages. First, due to the high resistivity of the on-chip metal wires, the TLs are lossy, so the signal attenuates along the line, which degrades modulation efficiency. For instance, in 180 nm CMOS technology, a 20 GHz signal attenuates from $7 \mathrm{~V}_{\mathrm{pp}}$ to $4.6 \mathrm{~V}_{\mathrm{pp}}$ over a 4 mm long, $6 \mu \mathrm{m}$ wide wire by the time it arrives at the $50 \Omega$ termination resistor [2]. To compensate for this attenuation, a larger modulation swing is required, which makes the driver circuit design more challenging. Moreover, since the characteristic impedance of the MZM is usually chosen to be $50 \Omega$, generating a large modulation voltage swing requires large currents, and handling such currents in advanced CMOS technologies is also challenging.
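The attenuation figures quoted from [2] can be put in dB terms with a short calculation. The sketch below is only an illustration: the per-millimetre figure assumes the loss is uniform along the 4 mm line, which the source does not state, and the helper names are hypothetical.

```python
import math

def voltage_ratio_db(v_out, v_in):
    """Voltage ratio in dB (negative for attenuation)."""
    return 20.0 * math.log10(v_out / v_in)

def required_launch_swing(v_target, loss_db):
    """Swing needed at the line input so v_target remains after loss_db (negative) of attenuation."""
    return v_target / (10.0 ** (loss_db / 20.0))

# Values quoted in the text: 7 Vpp at the input, 4.6 Vpp at the termination, 4 mm line.
line_loss_db = voltage_ratio_db(4.6, 7.0)
loss_per_mm_db = line_loss_db / 4.0  # assumes uniform loss along the line
launch_vpp = required_launch_swing(4.6, line_loss_db)

print(f"line loss: {line_loss_db:.2f} dB ({loss_per_mm_db:.2f} dB/mm)")
print(f"launch swing for 4.6 Vpp at the far end: {launch_vpp:.1f} Vpp")
```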


Figure 3 Driving schemes of MZM transmitters. (a) Single-ended drive, dual-arm push-pull. (b) Differential drive, dual-arm push-pull. (c) Differential drive, dual-arm push-pull with shared bias. (d) Dual-differential drive, dual-arm push-pull (Reprinted from [2])

Figure 3 compares different MZM driving schemes. Figure 3 (a) shows a single-ended push-pull driver that drives one anode and one cathode arm at the same time. Figure 3 (b) and (c) show a differential driver that drives the two anode arms while the cathodes are tied to the highest DC voltage, which ensures that the diodes are always reverse-biased. Lastly, Figure 3 (d) presents a dual-differential driver that drives both the anode and the cathode of each pair, doubling the swing across each anode/cathode pair for more efficient operation [2].

In this project, a differential-drive, dual-arm, push-pull, traveling-wave (TW) MZM was fabricated in SOI with a $2 \times 1.5 \mathrm{~mm}^{2}$ area, a 50 GHz 3-dB bandwidth, and a $50 \Omega$ input impedance, which is the load of the differential RF driver.

As mentioned, the RF driver has to drive an MZM in order to modulate the optical signal. RF drivers have traditionally been categorized into classes such as Class A, B, AB, and C, each with its own advantages and disadvantages. For instance, a Class A driver has good linearity and low distortion because of its bias point, but its power efficiency is low. A Class B driver, on the other hand, has more distortion but better power efficiency. Therefore, the proper DA topology must be selected for the application. The following table briefly compares the different classes of power amplifiers.

Table 1 RF driver Class comparison (Reprinted from [3])

| Classes | $\mathbf{A}$ | $\mathbf{A B}$ | $\mathbf{B}$ | $\mathbf{C}$ | $\mathbf{D}$ | $\mathbf{E}$ | $\mathbf{F}$ |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| Conduction angle (\%) | 100 | $50-100$ | 50 | $<50$ | 50 | 50 | 50 |
| Max. efficiency (\%) | 50 | $50-78.5$ | 78.5 | 100 | 100 | 100 | 100 |
| Linearity | Excellent | Good | Good | Poor | Poor | Poor | Poor |

Several key parameters of RF drivers need to be considered in design and analysis. The first is power efficiency, defined as the ratio of the average output power to the average DC power consumption over a specified period of time [4].

$$
\eta_{\mathrm{AV}}=\frac{\mathrm{P}_{\mathrm{O}(\mathrm{AV})}}{\mathrm{P}_{\mathrm{I}(\mathrm{AV})}}
$$

Where $\mathrm{P}_{\mathrm{O}(\mathrm{AV})}$ and $\mathrm{P}_{\mathrm{I}(\mathrm{AV})}$ are the average output power and average DC consumption, respectively. In theory, a Class A RF driver with an RF choke can achieve at most 50\% efficiency, whereas a Class B RF driver can reach 78.5\% [4]. Moreover, the maximum output power of the driver can be calculated using the following equation.

$$
\mathrm{P}_{\mathrm{O}(\max )}=\mathrm{c}_{\mathrm{p}} \mathrm{I}_{\mathrm{DM}} \mathrm{~V}_{\mathrm{DSM}} \tag{1.2}
$$

Where $\mathrm{V}_{\mathrm{DSM}}$ is the maximum value of the instantaneous drain-source voltage, $\mathrm{I}_{\mathrm{DM}}$ is the maximum value of the instantaneous drain current, and $c_{p}$ is the output power capability [4]. Power-added efficiency (PAE), which also accounts for the input drive power, is another way to measure and analyze efficiency.

$$
\mathrm{PAE}=\frac{\mathrm{P}_{\mathrm{out}}-\mathrm{P}_{\mathrm{in}}}{\mathrm{P}_{\mathrm{dc}}}
$$
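The two efficiency definitions above can be compared with a short calculation. The operating point in this sketch is hypothetical: the 16 dBm output and 20 dB gain echo figures quoted elsewhere in this thesis, but the 200 mW DC power is an arbitrary illustrative value and not a result of this work.

```python
import math

def dbm_to_mw(p_dbm):
    """Convert a power level in dBm to milliwatts."""
    return 10.0 ** (p_dbm / 10.0)

def drain_efficiency(p_out_mw, p_dc_mw):
    """Ratio of average output power to average DC power."""
    return p_out_mw / p_dc_mw

def power_added_efficiency(p_out_mw, p_in_mw, p_dc_mw):
    """PAE = (P_out - P_in) / P_dc."""
    return (p_out_mw - p_in_mw) / p_dc_mw

# Theoretical Class B limit (pi/4), for comparison with the 50% Class A limit.
CLASS_B_MAX_EFFICIENCY = math.pi / 4.0

# Hypothetical operating point (illustrative values only).
p_out_mw = dbm_to_mw(16.0)
p_in_mw = dbm_to_mw(16.0 - 20.0)   # 20 dB of gain
p_dc_mw = 200.0                    # arbitrary DC power for the example

eta = drain_efficiency(p_out_mw, p_dc_mw)
pae = power_added_efficiency(p_out_mw, p_in_mw, p_dc_mw)
print(f"efficiency = {100 * eta:.1f} %, PAE = {100 * pae:.1f} %")
```

With 20 dB of gain the input power is only 1\% of the output power, so the PAE is nearly equal to the efficiency; the two diverge noticeably only for low-gain stages.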

Linearity is another parameter that needs to be considered. A device is linear if its gain remains constant as the input power varies. In other words, in Figure 4, the fundamental output power should ideally follow the linear dashed line; however, as the input power $\left(\mathrm{P}_{\mathrm{in}}\right)$ grows, the output power compresses and departs from that line, which is unwanted. Two well-known linearity metrics are $\mathrm{P}_{1 \mathrm{~dB}}$ and IIP3. $\mathrm{P}_{1 \mathrm{~dB}}$ is the point where the actual output power is 1 dB below the expected linear value, and IIP3 (input third-order intercept point) is the input power at which the extrapolated fundamental and third-order output powers are equal.
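To make the relation between $\mathrm{P}_{1 \mathrm{~dB}}$ and IIP3 concrete, the sketch below uses the textbook memoryless cubic model $y=a_{1} x+a_{3} x^{3}$ with a compressive $a_{3}$. This model, the coefficient values, and the function names are assumptions of the sketch and are not taken from the driver in this work.

```python
import math

def compression_amplitude(a1, a3, drop_db=1.0):
    """Input amplitude at which the fundamental gain of y = a1*x + a3*x**3
    drops by drop_db (a3 must have the opposite sign to a1)."""
    shrink = 1.0 - 10.0 ** (-drop_db / 20.0)
    return math.sqrt(shrink * (4.0 / 3.0) * abs(a1 / a3))

def iip3_amplitude(a1, a3):
    """Input amplitude at which the extrapolated fundamental and
    third-order intermodulation outputs are equal."""
    return math.sqrt((4.0 / 3.0) * abs(a1 / a3))

def amplitude_to_dbm(v_peak, r_load=50.0):
    """Average power in dBm of a sinusoid of peak amplitude v_peak across r_load."""
    return 10.0 * math.log10(v_peak ** 2 / (2.0 * r_load) / 1e-3)

# Hypothetical coefficients: 20 dB voltage gain with mild compression.
a1, a3 = 10.0, -5.0

a_1db = compression_amplitude(a1, a3)
gain_drop_db = 20.0 * math.log10((a1 + 0.75 * a3 * a_1db ** 2) / a1)
gap_db = amplitude_to_dbm(iip3_amplitude(a1, a3)) - amplitude_to_dbm(a_1db)

print(f"gain change at the compression point: {gain_drop_db:.2f} dB")
print(f"IIP3 - IP1dB: {gap_db:.2f} dB")
```

For this idealized model, the input-referred intercept point sits about 9.6 dB above the input 1-dB compression point regardless of the coefficient values; real amplifiers with higher-order nonlinearity deviate from this figure.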


Figure 4 Driver linearity plot (Reprinted from [5])

The next critical parameter in designing an RF driver is its stability. A system is stable when it behaves as expected and amplifies the input signal with a certain gain. It is unstable when its output oscillates on its own, and this oscillation is generally not a controlled oscillation with a fixed amplitude. To analyze the stability of a system, its stability factor, called the "k-factor", is examined.

$$
\mathrm{k}=\frac{1-\left|\mathrm{S}_{11}\right|^{2}-\left|\mathrm{S}_{22}\right|^{2}+|\mathrm{D}|^{2}}{2\left|\mathrm{S}_{21}\right|\left|\mathrm{S}_{12}\right|}
$$

And

$$
D=S_{11} \cdot S_{22}-S_{12} \cdot S_{21}
$$

In practical terms, the system is "unconditionally stable" when $\mathrm{k}>1$ and $|\mathrm{D}|<1$. For $\mathrm{k}<1$, the system is not unconditionally stable: it can be "conditionally stable" (stable only for certain source and load terminations) or "unstable" [6].
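The stability test above translates directly into a few lines of code. The S-parameter values in this sketch are hypothetical single-frequency numbers chosen only to exercise the formula; the companion condition $|\mathrm{D}|<1$ is the standard one that accompanies $\mathrm{k}>1$ for unconditional stability.

```python
def rollett_k(s11, s12, s21, s22):
    """Return (k, |D|) for complex two-port S-parameters at one frequency."""
    d = s11 * s22 - s12 * s21
    k = (1.0 - abs(s11) ** 2 - abs(s22) ** 2 + abs(d) ** 2) / (
        2.0 * abs(s21) * abs(s12)
    )
    return k, abs(d)

def is_unconditionally_stable(s11, s12, s21, s22):
    """Unconditional stability requires k > 1 together with |D| < 1."""
    k, d_mag = rollett_k(s11, s12, s21, s22)
    return k > 1.0 and d_mag < 1.0

# Hypothetical amplifier: 20 dB gain, -40 dB reverse isolation, modest mismatch.
good = dict(s11=0.3 + 0j, s12=0.01 + 0j, s21=10.0 + 0j, s22=0.3 + 0j)
# Same amplifier with only -20 dB reverse isolation.
leaky = dict(s11=0.3 + 0j, s12=0.1 + 0j, s21=10.0 + 0j, s22=0.3 + 0j)

print(rollett_k(**good), is_unconditionally_stable(**good))
print(rollett_k(**leaky), is_unconditionally_stable(**leaky))
```

In practice the check is repeated at every frequency point of the S-parameter sweep, since a driver can be stable in band and still oscillate out of band.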

### 1.2 Research Objective

As mentioned, driving an MZM with a high-frequency CW signal requires a large voltage amplitude to produce the electric field needed to modulate the light. However, because of the low nominal device voltage and low supply voltage of the 22 nm FDSOI CMOS process chosen to realize this driver, achieving the required amplitude is challenging. Moreover, the high operating frequency of the overall system calls for an ultra-wide-bandwidth driver, which is practical in this technology thanks to its low transistor parasitics and high $\mathrm{f}_{\mathrm{t}}$. Before addressing these issues, it is important to analyze the different topologies and methods intuitively and to understand the pros and cons of each.

With this in mind, an ultra-wide-band, fully differential stacked RF driver is proposed in 22 nm FDSOI, which should be capable of driving the MZM with the required voltage amplitude over an ultra-wide bandwidth. The primary objective of this thesis is to design, implement, and test the functionality of the proposed wide-band stacked RF driver.

### 1.3 Thesis organization

Following this introduction, chapter 2 discusses different RF driver and MZM driver topologies and their advantages and disadvantages.

In chapter 3, the proposed stacked RF driver is discussed. This chapter explains the operation of the topology, followed by the design, layout, and implementation procedure of the system.

Chapter 4 provides measurement results of the fabricated wide-band driver, followed by a comparison between different state-of-the-art RF driver and MZM driver topologies.

Lastly, chapter 5 presents the conclusion of this thesis.

## 2. RF DRIVER ARCHITECTURES REVIEW

### 2.1 Distributed Driver

Distributed amplifiers are attractive circuit blocks for ultra-wideband and wireline optical links because of their ability to process ultra-narrow pulses and multi-gigabit-per-second signals. Although the bandwidth of these amplifiers can exceed 90 GHz, their output power and efficiency are relatively low. Moreover, distributed driver topologies are not suitable for some applications because of their large chip area [7].

The distributed driver consists of multiple gain stages and two transmission lines, one at the input and the other at the output of the gain stages, as shown in Figure 5 [8]. The concept is to combine the input and output parasitic capacitances of the active devices with inductors so that the input and output transmission lines (TLs) are formed. Figure 6 shows the sections of the input and output lines, which consist of inductors and the parasitics of the active devices [8]. Each section forms a lumped-element low-pass filter (T-section filter) [8]. To boost the gain, multiple filter sections are cascaded to form a distributed amplifier.


Figure 5 Simplified Distributed power amplifier using TL (Reprinted from [8])

As shown in Figure 5, the input and output TLs are terminated by $\mathrm{Z}_{\mathrm{g}}$ and $\mathrm{Z}_{\mathrm{d}}$, which are usually chosen to be $50 \Omega$ to avoid reflections from each end. However, since a portion of the output power is absorbed by these terminations, the efficiency of a distributed driver degrades [9]. Moreover, with a large DC current drawn from the supply, the voltage drop across the termination resistor rises, which degrades efficiency further. One way to fix this problem is to use a bias-T network, although bias-T networks are expensive and bulky because they need large, high-quality inductors.


Figure 6 (a) Gate TL, (b) Drain TL (Reprinted from [8])

Assuming the transistors are unilateral, which means the effect of $\mathrm{C}_{\mathrm{gd}}$ is neglected, the distributed driver can be simplified into two parts, as shown in Figure 6, where Figure 6(a) and (b) correspond to the gate and drain TLs, respectively. From Figure 6(b), the current delivered to the load is given by [8]:

$$
\mathrm{I}_{0}=\frac{1}{2} \mathrm{~g}_{\mathrm{m}} \mathrm{e}^{-\frac{\gamma_{\mathrm{d}}}{2}}\left[\sum_{\mathrm{k}=1}^{\mathrm{n}} \mathrm{~V}_{\mathrm{k}} \mathrm{e}^{-(\mathrm{n}-\mathrm{k}) \gamma_{\mathrm{d}}}\right] \tag{2.1}
$$

Where $\mathrm{V}_{\mathrm{k}}$ is the voltage across $\mathrm{C}_{\mathrm{gs}}, \gamma_{\mathrm{d}}$ is the propagation factor of the drain line, and " n " is the number of transistors in the amplifier. The following equation also expresses $\mathrm{V}_{\mathrm{k}}$ in terms of the gate voltage of the $\mathrm{k}^{\text {th }}$ transistor [8]:

$$
\mathrm{V}_{\mathrm{k}}=\frac{\mathrm{V}_{\mathrm{i}} \mathrm{e}^{-\frac{(2 \mathrm{k}-1) \gamma_{\mathrm{g}}}{2}-\mathrm{j} \tan ^{-1}\left(\frac{\omega}{\omega_{\mathrm{g}}}\right)}}{\left[1+\left(\frac{\omega}{\omega_{\mathrm{g}}}\right)^{2}\right]^{1 / 2}\left[1-\left(\frac{\omega}{\omega_{\mathrm{c}}}\right)^{2}\right]^{1 / 2}} \tag{2.2}
$$

Where $V_{i}$ is the input voltage of the amplifier, $\gamma_{\mathrm{g}}$ is the propagation factor of the gate line, $\omega_{\mathrm{g}}$, and $\omega_{\mathrm{c}}$ are as follows:

$$
\omega_{\mathrm{g}}=\frac{1}{\mathrm{R}_{\mathrm{gs}} \mathrm{C}_{\mathrm{gs}}} \quad \omega_{\mathrm{c}}=\frac{2}{\sqrt{\mathrm{L}_{\mathrm{g}} \mathrm{C}_{\mathrm{gs}}}} \tag{2.3}
$$
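The following sketch shows how these corner frequencies might be estimated when sizing an artificial gate line. It assumes the standard constant-k low-pass relations $\mathrm{Z}_{0}=\sqrt{\mathrm{L}_{\mathrm{g}} / \mathrm{C}_{\mathrm{gs}}}$ and $\omega_{\mathrm{c}}=2 / \sqrt{\mathrm{L}_{\mathrm{g}} \mathrm{C}_{\mathrm{gs}}}$; the component values and function names are hypothetical and do not come from any of the cited designs.

```python
import math

def section_inductance(z0, c_shunt):
    """Series inductance per section giving a low-frequency line impedance z0."""
    return z0 ** 2 * c_shunt

def line_cutoff_hz(l_series, c_shunt):
    """Cutoff of a constant-k low-pass line: f_c = w_c / (2*pi) = 1 / (pi*sqrt(L*C))."""
    return 1.0 / (math.pi * math.sqrt(l_series * c_shunt))

def gate_corner_hz(r_gs, c_gs):
    """Gate-circuit corner frequency: f_g = 1 / (2*pi*R_gs*C_gs)."""
    return 1.0 / (2.0 * math.pi * r_gs * c_gs)

# Hypothetical device and line values (illustrative only).
c_gs = 100e-15   # 100 fF gate-source capacitance
r_gs = 5.0       # 5 ohm gate resistance
z0 = 50.0        # target line impedance

l_g = section_inductance(z0, c_gs)
f_c = line_cutoff_hz(l_g, c_gs)
f_g = gate_corner_hz(r_gs, c_gs)

print(f"L_g = {l_g * 1e12:.0f} pH, f_c = {f_c / 1e9:.1f} GHz, f_g = {f_g / 1e9:.0f} GHz")
```

The example illustrates the usual trade-off: a larger device (larger $\mathrm{C}_{\mathrm{gs}}$) gives more transconductance per stage but lowers the line cutoff frequency for a fixed line impedance.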

Additionally, the input power and the power delivered to the load are given by

$$
\begin{gather*}
\mathrm{P}_{0}=\left|\mathrm{I}_{0}\right|^{2} \Re\left[\mathrm{Z}_{\mathrm{ID}}\right] \tag{2.4}\\
\mathrm{P}_{\mathrm{i}}=\frac{\left|\mathrm{V}_{\mathrm{i}}\right|^{2}}{2\left|\mathrm{Z}_{\mathrm{IG}}\right|^{2}} \Re\left[\mathrm{Z}_{\mathrm{IG}}\right] \tag{2.5}
\end{gather*}
$$

Where $\mathrm{Z}_{\mathrm{IG}}$ and $\mathrm{Z}_{\mathrm{ID}}$ are the image impedance of the gate and drain line, respectively.

It is also important to understand the concept of image impedance. Consider a cascade of two-port networks forming an artificial transmission line. For maximum power transfer over the specified bandwidth, each two-port in the cascade should operate with the proper impedance terminations. A two-port can be terminated with a pair of impedances, known as image impedances, that satisfy this requirement, as shown in Figure 7. If the network is symmetrical, $\mathrm{Z}_{\mathrm{i} 1}$ and $\mathrm{Z}_{\mathrm{i} 2}$ become identical and the characteristic impedance is denoted as $\mathrm{Z}_{0}$. The image impedance concept applies to the distributed driver topology, where each gain stage can be considered a two-port network.


Figure 7 A two-port network terminated by the image impedance (Reprinted from [8])

Here, $\mathrm{Z}_{\mathrm{i} 1}$ and $\mathrm{Z}_{\mathrm{i} 2}$ are the image impedances, which can be expressed in terms of the short-circuit and open-circuit input impedances at each port as

$$
\begin{aligned}
& Z_{i 1}=\sqrt{Z_{s c 1} Z_{o c 1}} \\
& Z_{i 2}=\sqrt{Z_{s c 2} Z_{o c 2}}
\end{aligned}
$$
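As a numerical check of this definition, the sketch below computes the image impedance of a symmetric low-pass T-section from its short-circuit and open-circuit input impedances and compares it with the closed form $\sqrt{\mathrm{L} / \mathrm{C}} \sqrt{1-\left(\omega / \omega_{\mathrm{c}}\right)^{2}}$ for a constant-k section. The component values are hypothetical, and the T-section (two series arms of $\mathrm{L} / 2$ with a shunt $\mathrm{C}$) is assumed lossless.

```python
import cmath
import math

def t_section_image_impedance(l_total, c_shunt, omega):
    """Image impedance sqrt(Z_sc * Z_oc) of a symmetric T-section
    with series arms L/2 and a shunt capacitor C."""
    z_half = 1j * omega * l_total / 2.0
    z_shunt = 1.0 / (1j * omega * c_shunt)
    z_oc = z_half + z_shunt                                   # far port open
    z_sc = z_half + (z_shunt * z_half) / (z_shunt + z_half)   # far port shorted
    return cmath.sqrt(z_sc * z_oc)

def constant_k_closed_form(l_total, c_shunt, omega):
    """Closed-form T-section image impedance below cutoff."""
    omega_c = 2.0 / math.sqrt(l_total * c_shunt)
    return math.sqrt(l_total / c_shunt) * math.sqrt(1.0 - (omega / omega_c) ** 2)

# Hypothetical section: 250 pH and 100 fF (50-ohm line), evaluated at 10 GHz.
l_total, c_shunt = 250e-12, 100e-15
omega = 2.0 * math.pi * 10e9

z_image = t_section_image_impedance(l_total, c_shunt, omega)
z_closed = constant_k_closed_form(l_total, c_shunt, omega)
print(f"Z_i from Z_sc*Z_oc: {z_image.real:.2f} ohm, closed form: {z_closed:.2f} ohm")
```

Because the section is symmetric, the same value applies at both ports, and it approaches $\sqrt{\mathrm{L} / \mathrm{C}}$ at low frequency.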

An 8-stage distributed driver using image-impedance interstage matching is proposed in [10], as shown in Figure 8. This design uses a 2-stacked transistor for each unit gain stage and low-pass artificial transmission lines built from T-sections for the input and output TLs. Moreover, an off-chip bias-T is used for biasing.

In this design, the m-derived TL is used instead of the conventional constant-k section TL. As shown in Figure 9, the m-derived structure introduces a coefficient, " $m$ ", which gives an additional degree of freedom to adjust the frequency response of the filter section. Equations (2.7) and (2.8) show the reciprocal propagation factor of the constant-k section and the $m$-derived section TL, respectively, where the effect of " $m$ " can be observed.

$$
\begin{gather*}
e^{\gamma}=\left|1-\frac{2 \omega^{2}}{\omega_{c}^{2}}\right|+\frac{2 \omega}{\omega_{c}} \sqrt{\frac{\omega^{2}}{\omega_{c}^{2}}-1} \tag{2.7}\\
e^{\gamma}=\left|\frac{1-\left(1+m^{2}\right)\left(\frac{\omega}{\omega_{c}}\right)^{2}}{1-\left(1-m^{2}\right)\left(\frac{\omega}{\omega_{c}}\right)^{2}}\right|+\frac{\sqrt{\left(\frac{2 m \omega}{\omega_{c}}\right)^{2}\left[\left(\frac{\omega}{\omega_{c}}\right)^{2}-1\right]}}{\left|1-\left(1-m^{2}\right)\left(\frac{\omega}{\omega_{c}}\right)^{2}\right|} \tag{2.8}
\end{gather*}
$$
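The effect of " $m$ " can be seen by evaluating the two expressions above the cutoff frequency. The sketch below treats each expression as the magnitude $e^{\alpha}$ of the per-section propagation factor for $\omega \geq \omega_{c}$ and converts it to attenuation in dB; this reading, the chosen value $m=0.6$, and the function names are assumptions of the sketch.

```python
import math

def exp_gamma_constant_k(x):
    """Constant-k expression, with x = w / w_c >= 1."""
    return abs(1.0 - 2.0 * x ** 2) + 2.0 * x * math.sqrt(x ** 2 - 1.0)

def exp_gamma_m_derived(x, m):
    """m-derived expression, with x = w / w_c >= 1 and 0 < m <= 1.
    The denominator vanishes at x = 1 / sqrt(1 - m**2), the attenuation pole."""
    den = abs(1.0 - (1.0 - m ** 2) * x ** 2)
    num = abs(1.0 - (1.0 + m ** 2) * x ** 2)
    return num / den + math.sqrt((2.0 * m * x) ** 2 * (x ** 2 - 1.0)) / den

def attenuation_db(exp_gamma):
    """Per-section attenuation in dB for a given e^alpha."""
    return 20.0 * math.log10(exp_gamma)

x = 1.2  # 20% above cutoff
att_k = attenuation_db(exp_gamma_constant_k(x))
att_m = attenuation_db(exp_gamma_m_derived(x, 0.6))
print(f"constant-k: {att_k:.1f} dB/section, m-derived (m = 0.6): {att_m:.1f} dB/section")
```

Setting $m=1$ reduces the m-derived expression to the constant-k one, while a smaller $m$ places an attenuation pole at $\omega_{c} / \sqrt{1-m^{2}}$ and sharpens the roll-off just above cutoff.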



Figure 8 8-stage Distributed Driver using 2-stacked gain unit cell. (Reprinted from [10])


Figure 9 Low-pass filter T-section: (a) constant-k section and (b) m-derived section (Reprinted from [10])

With this architecture [10], a gain of about 12 dB was achieved over a frequency range of $\mathrm{DC}-12 \mathrm{GHz}$. The $OP_{1dB}$ is reported as 12.44 dBm at 200 MHz with 12.15\% PAE. The overall power consumption is 666 mW from a 6 V supply. Lastly, the overall size of the design is $2.23 \mathrm{~mm}^{2}$ in 250 nm CMOS technology.

One way to solve the biasing problem of the distributed power amplifier (DPA) mentioned earlier is presented in [9], which proposes the folded pseudo-differential distributed driver shown in Figure 10. In this architecture, two NMOS and two PMOS transistors are stacked. The PMOS devices source the DC current from the supply, so the gain stage is biased without the need for a bias-T, as shown in Figure 11. Moreover, stacking the transistors improves the stability of the system by increasing the isolation between the input and output.


Figure 10 folded pseudo-differential distributed driver schematic (Reprinted from [9]).


Figure 11 distributed driver gain stage (Reprinted from [9])

With this architecture, a gain of 11.6 dB is achieved over a bandwidth of $0.4-31.6 \mathrm{GHz}$. The reported PAE is between 4.8\% and 8.3\%, while the $\mathrm{OP}_{1 \mathrm{~dB}}$ is about 11.5 dBm. However, the reported core area of this design is $0.5 \mathrm{~mm}^{2}$ in 22 nm FDSOI technology, which is relatively large [9].

Another architecture, proposed in [7] and called the multi-drive stacked topology, is shown in Figure 12. It consists of 8 stages, each using a 4-stacked gain unit, with input and output CPW TLs. This design uses elevated CPW TLs, in which the signal line is placed higher than the ground strips; doing so allows a higher $Z_{0}$ without the need to decrease the signal linewidth. As the operating frequency increases, the effect of the parasitics becomes more significant. As shown in Figure 13, as the frequency increases, the $\mathrm{C}_{\mathrm{gs}}$ of the stacked transistor becomes more dominant, so part of the current " $\mathrm{i}_{\mathrm{d1}}$ " flows through that $\mathrm{C}_{\mathrm{gs}}$; this part is called " $\mathrm{i}_{\mathrm{rc2}}$ ". Therefore, less current is delivered to the load, which means less output power. In this topology, the second stacked transistor $\left(\mathrm{M}_{2 \mathrm{CG}}\right)$ is also driven, so the " $\mathrm{i}_{\mathrm{rc2}}$ " current is compensated by " $\mathrm{i}_{\mathrm{x}}$ ", which is produced by $\mathrm{M}_{2 \mathrm{CG}}$, and the same current " $\mathrm{i}_{\mathrm{d1}}$ " can be delivered to the load. Since this effect is stronger at higher frequencies, a small transformer (T1) is used to drive the stacked transistors, so that its effect applies only at high frequencies. Moreover, these intra-stack coupled inductors compensate the input TL loss by increasing the effective transconductance $(\mathrm{Gm})$ at higher frequencies, which results in a flat gain over the bandwidth [7].


Figure 12 multi-drive stacked topology (Reprinted from [7])


Figure 13 (a) Conventional Stacked and (b) multi-drive stacked topologies (Reprinted from [7])

With this topology, a gain of 16 dB is achieved over a 120 GHz bandwidth (DC-120 GHz). The $\mathrm{OP}_{1 \mathrm{~dB}}$ and $\mathrm{P}_{\text {sat }}$ are reported as 21.3 dBm and 22.6 dBm at 20 GHz, respectively. Also, the PAE at $\mathrm{P}_{1 \mathrm{~dB}}$ is about 18\%, with a core area of $0.51 \mathrm{~mm}^{2}$ in 45 nm RFSOI CMOS technology [7].

Table 2 gives a brief overview of the distributed drivers discussed above. This topology allows ultra-wide-bandwidth operation with sufficient output power; however, distributed drivers suffer from large chip area and, in some cases, low efficiency. Therefore, given the limited chip area available in this project, a distributed RF driver is not the preferred architecture.

Table 2 Table of comparison: Distributed RF Driver

| Distributed Driver | [9] | [7] | [10] |
| :---: | :---: | :---: | :---: |
| Technology | 22 nm FDSOI | 45 nm RFSOI | 250 nm CMOS |
| Gain | 11.6 dB | 16 dB | $11.54 \pm 1.36$ dB |
| Bandwidth | $0.4-31.6 \mathrm{GHz}$ | DC-120 GHz | DC-12 GHz |
| Psat | 14.2 dBm | 22.6 dBm | -- |
| OP1dB | 11.5 dBm | 21.3 dBm | 16.44 dBm @ 0.2 GHz |
| Supply Voltage | 2.5 V | 4.2 V | -- |
| Area | $0.5 \mathrm{~mm}^{2}$ | $0.51 \mathrm{~mm}^{2}$ | $2.2327 \mathrm{~mm}^{2}$ |
| PAE | 4.8-8.3% | 18% | 12.15% |
| DC consumption | 238 mW | -- | 666 mW |

### 2.2 Stacked Driver

Advanced sub-micron CMOS technologies have many advantages, such as lower parasitics and a higher unity-gain frequency. However, as the devices become smaller, their breakdown voltage decreases, so a large voltage swing across the transistor's terminals may put too much stress on them. Also, the maximum supply voltage that can be used is limited by the device breakdown voltage. Several approaches have been proposed in the literature that combine the power of multiple devices. Among them, cascode configurations have been commonly used to improve the gain performance. As shown in Figure 14, common-source and common-gate transistors are placed in series to allow a larger voltage swing at the drain of the common-gate device [3]. This configuration relaxes the supply-voltage limitation: in Figure 14, for instance, VDD can be increased up to two times the device breakdown voltage. Yet the common-gate transistor still experiences more stress than the common-source device, especially across its drain and gate terminals, since the gate of this device is at AC ground.


Figure 14 Cascode configuration (Reprinted from [3])

To reduce the stress on the common-gate transistor, a small capacitor is added at the gate of the cascode device, which is then biased through a large resistor, as shown in Figure 15. This technique allows an RF swing at the gate of the cascode device that is in phase with its drain RF swing. Doing so reduces the gate-drain voltage difference, so the device experiences less stress. Additionally, the gate capacitor can control the real part of the source input impedance of the stage, which is discussed in more detail later in this chapter. To increase the overall gain of the system, multiple stacked transistors can be placed and biased so that all of the transistors experience the same stress, for both DC and AC signals, across all three terminals. For instance, a stack of K transistors can tolerate $K \cdot V_{\max}$, where $V_{\max}$ is a voltage close to the device's breakdown voltage. Moreover, as shown in Figure 15, an RF choke is placed after VDD. The RF choke blocks the high-frequency RF signals and passes the DC current. The RF choke also has an inductive-peaking effect at higher frequencies, which can increase the bandwidth of the system. Using the RF choke increases the overall efficiency of the system, since, unlike a resistive load, the choke ideally introduces no loss.


Figure 15 Two-Stacked Transistor with Choke inductor (Reprinted from [11])

To increase the output power of a stacked PA, multiple transistors can be stacked on top of each other. However, beyond a certain number of stacked devices, the growth in output power becomes insignificant while the power consumption keeps increasing. Therefore, there is an optimum number of stacked devices. As shown in Figure 16, for a constant current, each added device increases the saturation power appreciably up to about 4 stacked devices; beyond that, there is little change in $\mathrm{P}_{\text {sat }}$ [12]. Therefore, the optimum number of stacked devices is usually 3 or 4. For a fixed load impedance $R_{L}$, stacking K transistors increases both the output voltage swing and the current in each transistor by a factor of K, so the overall power ideally increases by $K^{2}$. However, because the parasitics grow as the devices become larger, the gain of the transistors decreases severely, as shown in Figure 16.
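The ideal $K^{2}$ scaling mentioned above can be put in numbers with a short Python sketch. It assumes a fixed load, lossless devices, and no parasitics, so it only gives the upper bound that the measured trend in Figure 16 falls short of; the function name is chosen here for illustration.

```python
import math


def ideal_stack_power_gain_db(k: int) -> float:
    """Ideal output-power increase (dB) of a K-stack over a single device.

    Assumes a fixed load R_L and no parasitic loss, so power scales as K**2.
    """
    return 10.0 * math.log10(k ** 2)


# Ideal (lossless) improvement for 1 to 6 stacked devices.
ideal_gain_db = {k: round(ideal_stack_power_gain_db(k), 2) for k in range(1, 7)}
```

Under these assumptions the step from 3 to 4 devices adds about 2.5 dB, while each further device adds less, which is consistent with the diminishing return described in the text.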


Figure 16 Effect of the number of stacked transistors on Psat (Reprinted from [12])

As previously mentioned, the stacked driver consists of one common-source and multiple common-gate amplifiers. Figure 17 shows an $n$-stacked driver $(n=3)$. Since the transistors are connected in series, the drain and source voltages of each stage are in phase, and the total voltage swing at the drain of the $n^{\text {th }}$ transistor is $n$ times the drain-source voltage swing of the first stage $\left(V_{m}\right)$. Meanwhile, the overall current swing $\left(I_{m}\right)$ does not change. Therefore, the load impedance seen by the first stage can be calculated as $Z_{L1}=\frac{V_{m}}{I_{m}}=Z_{opt}$. As stated before, to make sure each transistor experiences the same stress, the drain of the $n^{\text {th }}$ stage should swing by $n \cdot V_{m}$, thus

$$
\mathrm{Z}_{\mathrm{Ln}}=\mathrm{n} \cdot \frac{\mathrm{~V}_{\mathrm{m}}}{\mathrm{I}_{\mathrm{m}}}=\mathrm{n} \cdot \mathrm{Z}_{\mathrm{opt}}
$$

Additionally, the optimum load impedance of the $n$-stacked transistor, $Z_{Ln}$, can be synthesized close to the $50 \Omega$ load, which allows a simpler matching network and better matching over a wide range of frequencies [13]. By doing so, each transistor experiences at most a voltage swing of $V_{m}$ across its drain-source terminals, where $V_{m}$ is less than the device breakdown voltage.
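The impedance scaling above can be illustrated numerically. The short Python sketch below evaluates the load seen at each drain as $k \cdot Z_{opt}$. The swing `v_m` = 0.5 V and current `i_m` = 40 mA are assumed example values, chosen so that a 4-stack lands on $50 \Omega$; they are not taken from [13].

```python
def stage_load_impedances(v_m: float, i_m: float, n: int) -> list:
    """Return the load impedance (ohms) seen at the drain of stages 1..n.

    Each stage adds a swing of v_m while the current swing i_m is shared,
    so stage k sees k * Z_opt with Z_opt = v_m / i_m.
    """
    z_opt = v_m / i_m
    return [k * z_opt for k in range(1, n + 1)]


# Assumed example values: 0.5 V swing per device, 40 mA current swing.
loads = stage_load_impedances(v_m=0.5, i_m=0.04, n=4)
```

With these values the list runs from about 12.5 ohms at the first drain to about 50 ohms at the top of the stack, which is why a 4-stack can be connected to a $50 \Omega$ load with little or no impedance transformation.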


Figure 17 3-stacked driver with resistive biasing network (Reprinted from [13])

There are two degrees of freedom for controlling $Z_{Ln}$. Figure 18 shows the model of the $k^{\text {th }}$ transistor, in which $Z_{sk}$ is the source input impedance of that stage.


Figure 18 Source input impedance of kth transistor (Reprinted from [13])

As reported in [13], the source input impedance of the $K^{\text {th }}$ stage can be calculated as

$$
Z_{s k}=\frac{C_{g s}+C_{k}}{\left(g_{m}+j \omega C_{g s}\right) C_{k}} \approx \frac{\left(C_{g s}+C_{k}\right)}{g_{m} C_{k}} \quad f_{0} \ll F_{t}
$$

where $C_{gs}$ is the gate-source capacitance, $g_{m}$ is the transconductance, and $C_{k}$ is the shunt capacitance at the gate of the transistor. It is observed in (2.10) that the source input impedance decreases as the transconductance or the gate capacitance increases; that is, reducing $g_{m}$ or $C_{k}$ increases $Z_{sk}$. These two parameters can therefore be tuned to achieve the desired optimum impedance $\left(Z_{opt}\right)$.
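To make the dependence on $C_{k}$ concrete, the following Python sketch evaluates both the full expression for $Z_{sk}$ and its low-frequency approximation. The device values (`G_M` = 0.2 S, `C_GS` = 100 fF) and the two gate capacitors are hypothetical and serve only to show the trend.

```python
import math


def source_input_impedance(g_m, c_gs, c_k, freq_hz):
    """Full expression: Z_sk = (C_gs + C_k) / ((g_m + j*w*C_gs) * C_k)."""
    w = 2.0 * math.pi * freq_hz
    return (c_gs + c_k) / ((g_m + 1j * w * c_gs) * c_k)


def source_input_impedance_lf(g_m, c_gs, c_k):
    """Low-frequency approximation (f0 << Ft): (C_gs + C_k) / (g_m * C_k)."""
    return (c_gs + c_k) / (g_m * c_k)


# Hypothetical device values, for illustration only.
G_M = 0.2       # transconductance (S)
C_GS = 100e-15  # gate-source capacitance (F)

z_small_ck = source_input_impedance_lf(G_M, C_GS, 50e-15)   # smaller C_k
z_large_ck = source_input_impedance_lf(G_M, C_GS, 200e-15)  # larger C_k
z_10ghz = source_input_impedance(G_M, C_GS, 50e-15, 10e9)   # full form
```

With these assumed values the smaller gate capacitor gives 15 ohms and the larger one 7.5 ohms, while at 10 GHz the full expression stays within a fraction of an ohm of the approximation and picks up a small negative reactance.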

Another concern in stacked driver design is intermediate-node matching. As the operating frequency increases, the effect of device parasitics becomes significant and causes a mismatch between the stages. Because of the large voltage swing in a PA, the impedance matching between stages is critical. Three intermediate matching techniques are proposed in [12], as shown in Figure 19.


Figure 19 different intermediate matching network solutions: (a) shunt inductive tuning, (b) shunt feedback Cds tuning, (c) series inductive tuning (Reprinted from [12])

In Figure 19(a), the shunt inductive technique is used. Although the gate capacitor, $C_{k}$, sets $\operatorname{Re}\left\{Z_{opt}\right\}$ to the desired value, adding a shunt inductor ensures the optimal phase alignment [12]. The techniques in Figure 19(b) and (c) serve the same purpose as the shunt inductor and adjust the phase of each stage. As concluded in [12], the shunt inductive technique performs best among the three: it is less sensitive to inductor value error and achieves higher PAE and saturation power, followed by shunt feedback capacitor tuning and series inductive tuning. To align the phase between two stages and match the imaginary part of the impedance with the shunt inductance method, the following equation should be solved for $L_{k}$

$$
\operatorname{Im}\left\{Y_{s, k+1}\right\}+\frac{1}{s L_{k}}=\operatorname{Im}\left\{Y_{s, k}\right\} \quad \text { where } k=1,2, \ldots, K-1
$$

where

$$
\frac{1}{L_{k}}=\frac{\omega^{2}\left(C_{ds, k}-C_{ds, k-1}\right)}{k}+\frac{\omega^{2} C_{gs, k+1}}{k\, g_{m, k+1} R_{opt}}+\frac{\omega^{2}\left(C_{gs, k}+k C_{dsub, k}\right)}{k}
$$

Many modifications of the device-stacking technique have been studied in the literature in recent years. For instance, in [14], a transformer-coupled stacked-FET RF driver was reported with a $900-\mathrm{MHz}$ bandwidth, in which multiple on-chip transformers couple the input signal into all stacked transistors, as shown in Figure 20(a) and (b). Figure 20(a) shows the simplified 4-stacked design, whereas Figure 20(b) shows the detailed 2-stacked architecture with component values. However, the transformers make this design relatively bulky and limit its bandwidth.


Figure 20(a) simplified 4-stacked Transformer-coupled input-feed technique, (b) 2-stage Transformer-coupled input-feed technique (Reprinted from [14])

Three different stacked-transistor designs are compared in [15]. As shown in Figure 22, 3-stack n-MOSFET, 4-stack n-MOSFET, and 3-stack CMOS structures are used. These stacked transistors are biased through a resistive biasing network, which biases each transistor without the need for an external bias point. Moreover, each stacked transistor has a relatively small shunt capacitor connected to its gate, as discussed before. In (a) and (b), however, a varactor is connected to the gate of the top transistor in the stack. Doing so allows the output impedance of the system to be controlled, which eliminates the need for an extra matching network. Based on the model shown in Figure 21, the output admittance is

$$
Y_{\text {OUT }} \approx \frac{g_{0} C_{\mathrm{var}}}{C_{\mathrm{var}}+C_{\mathrm{gd}}}+j \omega\left(C_{\mathrm{gd}}+C_{\mathrm{db}}\right)
$$



Figure 21 high-frequency transistor model with output admittance (Reprinted from [15])


Figure 22 (a) 3-stack n-MOSFET, (b) 4-stack n-MOSFET, and (c) 3-stack CMOS (Reprinted from [15])

In the output admittance expression, $g_{0}=\frac{1}{r_{0}}$ is the intrinsic output conductance of the MOSFET, and $C_{\mathrm{var}}$ is the overall gate-to-ground capacitance of the last transistor in the stack. Investigating these three configurations led to the following outcomes. Table 3 and Figure 23 show that the "3-stack CMOS" design suffers from high power consumption, which makes it inefficient, and from relatively limited bandwidth. In contrast, the 3- and 4-stack n-MOS designs have lower power consumption, higher $\mathrm{OP}_{1 \mathrm{~dB}}$, and wider bandwidth.

Table 3 Achieved outcomes (Reprinted from [15])

| Output Stage | Psat/P1db | Pdc @ 28GHz |
| :---: | :---: | :---: |
| 3-stack CMOS | $17.5 / 10.4 \mathrm{dBm}$ | 230 mW |
| 3-stack n-MOSFET | $15.5 / 11.2 \mathrm{dBm}$ | 107 mW |
| 4-stack n-MOSFET | $16.4 / 11.6 \mathrm{dBm}$ | 140 mW |



Figure 23 3-stacked nMOS, 4-stacked nMOS, and 3-stacked CMOS architecture (Reprinted from [15])

A similar architecture was proposed in [13]. In this design, a feedback resistor is placed from the output of the driver to its input, as shown in Figure 24. By changing the value of $R_{f}$, the gain, bandwidth, and stability of the system can be controlled. For instance, in [13], as $R_{f}$ was reduced from $300 \Omega$ to $100 \Omega$, the gain decreased by half, but the bandwidth increased. The stability also improved as $R_{f}$ decreased, owing to the stronger negative feedback. To achieve the desired gain in [13], two 3-stacked PAs are connected in series, so that the output of the first stage is amplified by the second stage.


Figure 24 Stacked topology with feedback resistor Rf (Reprinted from [13])

An overall gain of 20 dB is achieved with a 3-dB bandwidth of $0.1-6.5 \mathrm{GHz}$. The maximum $\mathrm{P}_{1 \mathrm{~dB}}$ and PAE are reported as 19 dBm and 13-20%, respectively. The design in [13] is implemented in a 0.18-µm technology with an overall area of $0.64 \mathrm{~mm}^{2}$.

With the same stacked topology, an 8 dB gain is achieved in [16] in the 88-90 GHz band. The saturation power, $\mathrm{P}_{1 \mathrm{~dB}}$, and PAE are reported as 17 dBm, 11.5 dBm, and 9%, respectively. Moreover, the driver core area is $0.06 \mathrm{~mm}^{2}$ in 45 nm SOI CMOS technology [16].

Another stacked driver is proposed in [17] with an intrinsic parasitic feedback network biasing method. In this topology, instead of the conventional cascode configuration with a single capacitor at the gate of each transistor, a passive circuit consisting of $R_{gn}$, $L_{gn}$, and $C_{gn}$ is used to bias the transistors, as shown in Figure 25. Adding $R_{gn}$, $L_{gn}$, and $C_{gn}$ changes the source input impedance, which can be calculated by

$$
Z_{s, i} \approx \frac{\frac{1}{g_{m, i}}\left(1+s R_{gn}\left(C_{gs, i}+C_{gn}\right)+s^{2} L_{gn}\left(C_{gs, i}+C_{gn}\right)\right)}{1+s C_{gn} R_{gn}+s^{2} C_{gn} L_{gn}} \cdot\left(\frac{1}{1+\frac{s}{\omega_{T}}}\right)
$$

where the additional high-frequency zeros introduced by $R_{gn}$ and $L_{gn}$ extend the bandwidth. Additionally, an interstage inductor, $L_{m}$, is inserted to neutralize the effect of the $C_{gd}$ and $C_{gs}$ parasitic capacitors. Compared with the conventional biasing network, this method provides a 1.5X bandwidth extension at the cost of a 33% larger chip area [17]. Moreover, this technique improves the jitter and amplitude of the driver by about 10-12%.


Figure 25 Type II: Conventional biasing; Type IV: proposed intrinsic parasitic feedback biasing network (Reprinted from [17])

Table 4 shows a brief comparison of the stacked drivers discussed above. As discussed, this topology is more compact than distributed drivers. The gain and bandwidth of this topology are sufficient for the purpose of this project; however, the output power needs some improvement.

Table 4 Table of comparison: Stacked RF Driver

| Stacked Drivers | [15] (3-stack CMOS) | [15] (3-stack n-MOSFET) | [15] (4-stack n-MOSFET) | [16] | [18] |
| :---: | :---: | :---: | :---: | :---: | :---: |
| Technology | 22 nm CMOS | 22 nm CMOS | 22 nm FDSOI | 45 nm SOI | 45 nm SOI |
| Gain | $\sim 30 \mathrm{~dB}$ | $\sim 24 \mathrm{~dB}$ | $\sim 24 \mathrm{~dB}$ | 8 dB | 12.4 dB |
| Bandwidth | 0.5-39 $\mathrm{GHz}^{*}$ | DC-39 $\mathrm{GHz}^{*}$ | DC-39 $\mathrm{GHz}^{*}$ | 88-90 GHz | 81-96 GHz |
| Psat | 17.5 dBm <br> @ 28 GHz | 15.5 dBm <br> @ 28 GHz | 16.4 dBm <br> @ 28 GHz | 17 dBm | 19.2 dBm |
| OP1dB | 10.4 dBm <br> @ 28 GHz | 11.2 dBm <br> @ 28 GHz | 11.6 dBm <br> @ 28 GHz | 11.5 dBm | -- |
| Supply Voltage | 4.8 V | 2.4 V | 3.2 V | 4.2 V | 3.4 V |
| Area | -- | -- | -- | $0.06 \mathrm{~mm}^{2}$ | $0.228 \mathrm{~mm}^{2}$ |
| PAE | 14.2% | 33.4% | 28.3% | 9% | 14% |
| DC consumption | 230 mW | 107 mW | 140 mW | 382 mW | -- |

[^0]
## 3. PROPOSED RF DRIVER

### 3.1 22nm-FDSOI

The proposed architecture is designed in the GlobalFoundries 22nm FDSOI process. FDSOI stands for fully-depleted silicon on insulator. The main difference between this technology and bulk CMOS is that an ultra-thin layer of buried oxide (BOX) is placed as an insulator on the base silicon, and the channel is formed on top of the BOX. In contrast to bulk CMOS, where dopants must be added to the channel, FDSOI eliminates this necessity because of the BOX; hence the process is called fully depleted. Figure 26 compares the cross-sections of bulk and FDSOI CMOS [19].


Figure 26 Bulk CMOS Vs. FDSOI CMOS technology (Reprinted from [19])

The FDSOI technology has a shorter channel length than bulk CMOS, which allows a higher operating speed since the electrons travel a shorter distance. Lower gate leakage and smaller junction capacitance are further benefits of this technology. Also, since no channel doping is needed, the number of steps in the fabrication process is reduced.

Another feature of this technology is access to the bulk, called the "back-gate", for biasing the device. As shown in Figure 26, the drain/source of the device is separated from the bulk by a thin layer of insulator, which prevents the device from latching up at lower voltages. Moreover, the channel is separated from the bulk, so the bulk can be driven to a much higher voltage and act as a secondary gate. Because of this feature, the process allows a "flip well" technology, where the PMOS transistors are placed in a P-well and the NMOS transistors in an N-well, which is the opposite of a conventional well. One benefit of the flip-well technology is a lower threshold voltage ($\mathrm{V}_{\text {th }}$), which leads to faster switching and higher drive capability [19]. In this design, a super-low-$\mathrm{V}_{\mathrm{t}}$ (SLVT) device is used, whose threshold voltage varies by about $70 \mathrm{mV} / \mathrm{V}$ with the back-gate voltage. As reported in [9] and [20], the maximum $\mathrm{f}_{\mathrm{t}} / \mathrm{f}_{\text {max }}$ in this technology is $357 / 290 \mathrm{GHz}$ for NMOS and $260 / 250 \mathrm{GHz}$ for PMOS devices, which makes PMOS devices suitable for RF applications and higher operating frequencies as well.
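As a rough illustration of the back-gate sensitivity quoted above, the following Python sketch estimates the threshold shift for a given back-gate voltage. It assumes a purely linear dependence with the 70 mV/V slope and a forward bias that lowers the threshold of the flip-well NMOS device. The zero-bias threshold `VTH0` of 0.3 V is a hypothetical value, not one taken from the process documentation.

```python
# Linear back-gate model: an illustrative assumption, not a process model.
BODY_FACTOR_V_PER_V = 0.070  # ~70 mV/V sensitivity quoted in the text
VTH0 = 0.30                  # hypothetical zero-bias threshold voltage (V)


def threshold_voltage(v_back_gate: float, vth0: float = VTH0) -> float:
    """Estimate V_th (V) for a forward back-gate bias v_back_gate (V)."""
    return vth0 - BODY_FACTOR_V_PER_V * v_back_gate


# Threshold estimates for a few back-gate voltages.
vth_sweep = {vbg: round(threshold_voltage(vbg), 3) for vbg in (0.0, 1.0, 2.0)}
```

Under this linear assumption, each volt applied to the back-gate moves the threshold by 70 mV, which is the knob used later to keep the input device out of the triode region.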


Figure 27 Comparison between flip-well (right) and conventional well (left) (Reprinted from [19])

This technology also offers a wide range of MOM capacitors, as well as alternative polarity MOM (APMOM) capacitors with a good quality factor. APMOM5V and APMOM3.3V capacitors are used in this design between layer 1 and layer 7 to achieve relatively large capacitance with a good quality factor.

This process offers a range of back-end-of-line (BEOL) metals. Figure 28 shows the cross-section of stack-up metals.


Figure 28 BEOL visualization of 22nm FDSOI (Reprinted from [19])

### 3.2 Stacked RF Driver

As mentioned earlier, advanced sub-micrometer technologies suffer from the low breakdown voltage of the devices, which makes it hard to achieve high-power and high-frequency RF drivers. To overcome this limitation, power-combining techniques such as transistor stacking have been developed to combine the power over multiple stages. The stacked circuit is composed of a common-source input stage and three common-gate stages connected in series so that the output swings add in phase. This technique addresses two major problems in these technologies. First, it supports a higher supply voltage, so a higher RF voltage swing is achievable. For instance, assuming the breakdown voltage of a device is 0.8 V, the supply voltage of a 4-stack PA can be as high as 3.2 V, four times the breakdown voltage of the device, as shown in Figure 29. Second, due to the structure of this technique, a simpler and smaller matching network is required, which results in lower matching-network loss and a smaller area. However, this technique has some limitations at millimeter-wave bands, such as limited efficiency due to coupling losses and the parasitics of devices and wiring [18].


Figure 29 Proposed 4-stack NMOS basic structure (Reprinted from [4]).

The main concern in stacked driver design is an equal division of the RF and DC voltages. Since the supply voltage in the $n$-stacked driver is $n$ times the breakdown voltage of each transistor, it is critical that each transistor experiences the same drain-source DC voltage, equal to its breakdown voltage; otherwise, there will be a reliability issue for the device. In this design, class-A biasing is chosen for its high linearity, at the cost of higher power consumption. In class-A biasing, the bias point should be well above the threshold voltage of the device so that the full $\mathrm{V}_{\mathrm{GS}}$ swing is accommodated, as shown in Figure 30 [4]. The drain current of a class-A driver is

$$
i_{D}=K_{sat}\left(V_{GS}-V_{t}\right)+K_{sat} \cdot V_{gsm} \cos (\omega t)=I_{D}+I_{m} \cos (\omega t) \tag{3.1}
$$

where $I_{D}$ is the DC component of the drain current, $I_{m}$ is the amplitude of the AC component, and $\mathrm{K}_{\text {sat }}$ is a device parameter of the transistor. Furthermore, the maximum amplitude of the AC component of the drain current is limited to $I_{m, \max }=I_{D}$ to prevent distortion of the signal [4].
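The limit $I_{m, \max }=I_{D}$ can be checked with a small Python sketch that samples the class-A drain current over one RF period. The bias current of 45 mA is an assumed example value, and the ideal cosine model ignores any device nonlinearity.

```python
import math


def class_a_drain_current(i_dc, i_m, omega_t):
    """Drain current i_D = I_D + I_m * cos(wt) of a class-A stage (A)."""
    return i_dc + i_m * math.cos(omega_t)


def min_current_over_cycle(i_dc, i_m, points=360):
    """Smallest drain current over one RF period, sampled uniformly."""
    return min(
        class_a_drain_current(i_dc, i_m, 2.0 * math.pi * n / points)
        for n in range(points)
    )


I_DC = 0.045  # assumed bias current (A), for illustration only

# At the limit I_m = I_D the current just reaches zero.
i_min_at_limit = min_current_over_cycle(I_DC, i_m=I_DC)
# For I_m > I_D the ideal model goes negative, i.e. a real device cuts off.
i_min_overdriven = min_current_over_cycle(I_DC, i_m=1.2 * I_DC)
```

A negative minimum in the overdriven case has no physical meaning; it marks the point where the transistor would turn off and the output waveform would distort.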


Figure 30 Class-A biasing point (Reprinted from [4])

Figure 29 shows an $n$-stacked driver ($n=4$), in which one common-source and three common-gate gain stages are connected in series. Assuming the voltage swing across the common-source transistor is $V_{m}$ and the drain-source AC current is $I_{m}$, $\mathrm{Z}_{\mathrm{s} 2}$ can be calculated as

$$
\begin{equation*}
\mathrm{Z}_{\mathrm{s} 2}=\frac{\mathrm{V}_{\mathrm{m}}}{\mathrm{I}_{\mathrm{m}}}=\mathrm{Z}_{\mathrm{opt}} \tag{3.2}
\end{equation*}
$$

To make sure each transistor has an equal drain-source voltage swing, the next stage also needs a swing of $V_{m}$ across its drain and source terminals. Since the drain voltages of the stages are in phase, the overall swing at the drain of the second stage is $2 V_{m}$, so $Z_{s3}=\frac{2 V_{m}}{I_{m}}=2 Z_{opt}$, which can be generalized for the $n^{\text {th }}$ stage as

$$
Z_{s,(n+1)}=\frac{n \cdot V_{m}}{I_{m}}=n \cdot Z_{opt} \tag{3.3}
$$

As discussed before, adding $\mathrm{C}_{\mathrm{i}}$ allows the gate voltage swing to be controlled. In contrast with the cascode configuration, this structure reduces the drain-gate and drain-source swings, which results in more reliable transistor operation. However, there is a tradeoff between the gain of the driver and its saturation power and drain efficiency: adding the gate capacitance reduces the overall gain of the system but increases the saturation power and drain efficiency. To compensate for the small gain degradation, more stacked devices can be used.


Figure 31 High-frequency stacked-transistor model (Reprinted from [3])

Figure 31 shows the high-frequency stacked-transistor model, where $\mathrm{Z}_{\mathrm{L}}$ is the load impedance seen by the drain of the device. Assuming $r_{0}$ is large, a test voltage $v_{t}$ can be applied to the source of the device. Applying KCL and KVL gives

$$
\begin{gather*}
v_{t}=\frac{i_{f}-i_{g}}{s C_{i}}-\frac{i_{g}}{s C_{gs}} \tag{3.4}\\
-i_{t}=i_{g}+\frac{g_{m} i_{g}}{s C_{gs}} \tag{3.5}\\
-i_{f}=\frac{g_{m} i_{g}}{s C_{gs}}+\frac{v_{ds}+v_{t}}{Z_{L}} \tag{3.6}\\
v_{ds}=\frac{i_{f}}{s C_{gd}}+\frac{i_{g}}{s C_{gs}} \tag{3.7}\\
i_{t}+i_{g}-i_{f}=\frac{v_{t}+v_{ds}}{Z_{L}} \tag{3.8}
\end{gather*}
$$

Therefore, assuming $f_{0} \ll F_{t}$ and neglecting the feedback current through $C_{g d}$, the source input impedance, $\mathrm{Z}_{\mathrm{s}, \mathrm{i}}$, can be derived and simplified as [3]

$$
Z_{s, i}=\frac{C_{gs}+C_{i}+C_{gd, i}\left(1+g_{m, i} Z_{d,(i+1)}\right)}{\left(g_{m}+s C_{gs}\right)\left(C_{gd}+C_{i}\right)} \approx \frac{C_{gs}+C_{i}}{g_{m} C_{i}} \tag{3.9}
$$

To achieve the optimum load-line impedance and make sure each transistor experiences the same stress, the impedance $Z_{s, i}$ should be $(i-1) R_{opt}$. Therefore, the values of $C_{2}$ to $C_{4}$ should be set such that $Z_{s, 2}$, $Z_{s, 3}$, and $Z_{s, 4}$ are $R_{opt}$, $2 R_{opt}$, and $3 R_{opt}$, respectively, and the optimum load impedance is $4 R_{opt}$. This lets the absolute voltage swings with respect to ground grow from stage to stage while the drain-source, drain-gate, and gate-source voltage swings remain the same for each transistor, which results in more reliable operation. However, as the operating frequency increases and approaches $F_{t}$, the reactance of $\mathrm{Z}_{\mathrm{s}, \mathrm{i}}$ is no longer negligible and needs to be considered.

From the equation above, and assuming $\mathrm{Z}_{\mathrm{s}, \mathrm{i}}$ is primarily real and $\mathrm{C}_{\mathrm{gd}}$ is negligible, $\mathrm{C}_{\mathrm{i}}$ can be calculated and simplified as [12]

$$
\begin{gathered}
C_{i}=\frac{C_{gs}+C_{gd}\left(1+g_{m} R_{opt}\right)}{(i-1) g_{m} R_{opt}-1} \approx \frac{C_{gs}}{(i-1) g_{m} R_{opt}-1} \\
\text { where } i=2,3, \ldots, I
\end{gathered} \tag{3.10}
$$
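A minimal Python sketch of (3.10) is given below. It uses $R_{opt}=12.5 \Omega$, the value adopted for this design, while `G_M` = 0.2 S and `C_GS` = 100 fF are hypothetical device values and $C_{gd}$ is neglected. The sketch also substitutes each result back into the low-frequency form of $Z_{s,i}$ as a consistency check.

```python
def gate_capacitance(i, g_m, r_opt, c_gs, c_gd=0.0):
    """Gate capacitor C_i (F) of stack position i = 2, 3, ... per (3.10)."""
    denom = (i - 1) * g_m * r_opt - 1.0
    if denom <= 0.0:
        raise ValueError("(i - 1) * g_m * R_opt must exceed 1")
    return (c_gs + c_gd * (1.0 + g_m * r_opt)) / denom


def source_impedance_lf(g_m, c_gs, c_i):
    """Low-frequency source input impedance (C_gs + C_i) / (g_m * C_i)."""
    return (c_gs + c_i) / (g_m * c_i)


G_M = 0.2       # hypothetical transconductance (S)
C_GS = 100e-15  # hypothetical gate-source capacitance (F)
R_OPT = 12.5    # optimum per-device load (ohms), as used in this design

gate_caps = {i: gate_capacitance(i, G_M, R_OPT, C_GS) for i in (2, 3, 4)}
z_sources = {i: source_impedance_lf(G_M, C_GS, c) for i, c in gate_caps.items()}
```

With these assumed values the gate capacitors shrink from roughly 67 fF for the second device to roughly 15 fF for the fourth, and the recomputed source impedances come out at $R_{opt}$, $2R_{opt}$, and $3R_{opt}$ as required.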

The open-loop small-signal gain of the driver can be calculated as

$$
A_{v}=\frac{g_{m1} R_{L}}{\left(1+\frac{s C_{gs2}}{g_{m2}}\right) \cdot\left(1+\frac{s C_{gs3}}{g_{m3}}\right) \cdot\left(1+\frac{s C_{gs4}}{g_{m4}}\right)} \approx g_{m1} R_{L} \quad \text { for } f_{0} \ll F_{t} \tag{3.11}
$$
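The gain expression above can be evaluated over frequency with a few lines of Python. In this sketch every device value is hypothetical: `G_M1` and `R_LOAD` are picked so that $g_{m1} R_{L}=10$ (20 dB), and the three common-gate devices are given identical $C_{gs}$ and $g_{m}$ for simplicity.

```python
import math


def stacked_gain(freq_hz, g_m1, r_load, upper_stages):
    """Small-signal gain A_v with one pole per common-gate stage.

    upper_stages is a list of (C_gs, g_m) pairs for stages 2..n.
    """
    s = 1j * 2.0 * math.pi * freq_hz
    gain = complex(g_m1 * r_load)
    for c_gs, g_m in upper_stages:
        gain /= 1.0 + s * c_gs / g_m
    return gain


def gain_db(freq_hz, g_m1, r_load, upper_stages):
    """Magnitude of A_v in dB."""
    return 20.0 * math.log10(abs(stacked_gain(freq_hz, g_m1, r_load, upper_stages)))


# Hypothetical device values, chosen so that g_m1 * R_L = 10 (20 dB).
G_M1 = 0.2                        # S
R_LOAD = 50.0                     # ohms
UPPER = [(100e-15, 0.2)] * 3      # (C_gs in F, g_m in S) for stages 2 to 4

gain_dc_db = gain_db(0.0, G_M1, R_LOAD, UPPER)
gain_17ghz_db = gain_db(17e9, G_M1, R_LOAD, UPPER)
```

For these values the poles at $g_{m}/C_{gs}$ lie far above 17 GHz, so the gain drops by only a few hundredths of a dB across the band, which illustrates why the approximation $A_{v} \approx g_{m1} R_{L}$ holds for $f_{0} \ll F_{t}$.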

One drawback of the stacked topology is the stability of the system: with higher voltage swings, stability can become an issue. Figure 32 shows the small-signal gain and stability factors of the driver shown in Figure 24 as the feedback resistor changes. As expected, lowering the feedback resistor value reduces the gain of the system and increases the bandwidth. It also improves the stability of the system, owing to the effect of the negative feedback loop.


Figure 32 Small signal gain and stability factors Vs. the feedback resistor (Rf) (Reprinted from [13])

### 3.3 Intermediate matching network

As previously mentioned, at low frequencies the drain impedance of the $\mathrm{k}^{\text {th }}$ transistor, $\mathrm{Z}_{\mathrm{d}, \mathrm{k}}$, can be approximated as a resistance and can be tuned by the gate capacitance of the next stage, $\mathrm{C}_{\mathrm{k}+1}$. However, as the frequency increases and reaches the millimeter-wave range, the intermediate node impedance has a significant reactive part, which causes the voltage swings to be misaligned. From Figure 33, the high-frequency admittance can be rewritten as

$$
Y_{opt} \approx \frac{1}{k \cdot R_{opt}}-\frac{s}{k} C_{eqv, k} \quad k=1,2, \ldots, K
$$

Where

$$
\mathrm{C}_{\mathrm{eqv}, \mathrm{k}}=\mathrm{C}_{\mathrm{ds}, \mathrm{k}}+\mathrm{kC}_{\mathrm{dsub}, \mathrm{k}}+\mathrm{C}_{\mathrm{gd}, \mathrm{k}}
$$

Additionally, the source input impedance of the next stage can be written as

$$
Z_{s, k+1} \approx k \cdot R_{opt}-\frac{k R_{opt}}{g_{m, k+1}} s\left(C_{gs, k+1}-g_{m, k+1} R_{opt} C_{ds, k+1}\right) \quad k=1,2, \ldots, K-1
$$



Figure 33 Simplified small-signal model of stacked transistors (Reprinted from [12])

As discussed before, three methods, previously shown in Figure 19, are available to minimize the intermediate mismatch caused by the reactive part of the impedance. As shown in Figure 34, the shunt inductance method is the most effective and is less sensitive to the inductance value.

In this design, however, intermediate matching is not used, since the maximum operating frequency is below the mm-wave range and the effect of the parasitics within the desired bandwidth is not significant.


Figure 34 The comparison of different tuning methods on Psat (Reprinted from [12])

### 3.4 Fully differential Stacked RF Driver design in FDSOI CMOS

A fully-differential stacked linear RF driver was designed with 17 GHz bandwidth and 20 dB voltage gain, as discussed in chapter 1, using the 22nm FDSOI CMOS process, for which the drain-to-source breakdown voltage is 0.8 V [19]. The 4-stacked structure is shown in Figure 35 together with the feedback network and the on-chip input and output matching networks. Three gate capacitors, $\mathrm{C}_{2}-\mathrm{C}_{4}$, were chosen to enable a voltage swing on the gate of each transistor and minimize the stress on each stage, as discussed in the earlier section. With a $50 \Omega$ single-ended output termination (the input impedance of the MZM) and a 200 mV input signal, an output voltage swing of about 2 V was achieved, which is necessary to properly drive the MZM.

Four SLVT-nFET transistors were used because of their lower threshold voltage and faster operation. Because the first stage provides the highest gain, a larger transistor with a $192-\mu \mathrm{m}$ overall gate width was used for it. A smaller transistor, with a $160-\mu \mathrm{m}$ overall width, was used for each of the other three stages to minimize the device parasitics. The supply voltage was chosen to be 3.2 V, and the gates were biased at $0.3 \mathrm{~V}, 1.23 \mathrm{~V}, 2.1 \mathrm{~V}$, and 3 V, from the bottom to the top transistor, respectively, so that each stage experiences the same drain-to-source DC voltage. Additionally, class-A biasing was used to achieve highly linear operation, as shown in Figure 36, where the signal is a full-cycle sinusoid throughout the circuit. The input and output matching networks are implemented with high-quality-factor 650-pH on-chip inductors, designed using the Sonnet EM simulator, to achieve input/output matching better than -10 dB over the bandwidth. A 24-nH off-chip choke inductor was chosen to push the lower cutoff frequency down. Additionally, 2-pF coupling capacitors are placed at the input and the output to block the DC voltage from other parts of the system.

With a $50 \Omega$ termination and 4 stacked transistors, $\mathrm{R}_{\text {opt }}$ was chosen to be about $12.5 \Omega$. The overall output swing needed for the MZM was $2 \mathrm{~V}_{\mathrm{pk}}$ single-ended. Assuming the input signal is 200 mV, the voltage swings at the drains of the four stages should be $0.5 \mathrm{~V}, 1 \mathrm{~V}, 1.5 \mathrm{~V}$, and 2 V, respectively, so that each transistor experiences a $500-\mathrm{mV}$ drain-to-source voltage swing, which is below the device breakdown voltage. Given these drain voltages, the stages must have gains of 2.5, 2, 1.5, and 1.33 $\frac{\mathrm{V}}{\mathrm{V}}$, respectively, from the bottom to the top stage.
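The per-stage gains quoted above follow directly from the ratio of successive node swings, as the following Python sketch shows. It uses only the swing values given in the text; the variable names are chosen here for illustration.

```python
import math

V_IN = 0.2                          # input swing (V)
DRAIN_SWINGS = [0.5, 1.0, 1.5, 2.0]  # swing at each drain, bottom to top (V)

# Gain of each stage is the ratio of its drain swing to the swing below it.
nodes = [V_IN] + DRAIN_SWINGS
stage_gains = [nodes[k + 1] / nodes[k] for k in range(len(DRAIN_SWINGS))]

# Overall voltage gain from the input to the top drain, in V/V and dB.
total_gain = DRAIN_SWINGS[-1] / V_IN
total_gain_db = 20.0 * math.log10(total_gain)

# Drain-source swing of each device (the bottom source is grounded).
source_swings = [0.0] + DRAIN_SWINGS[:-1]
vds_swings = [d - s for d, s in zip(DRAIN_SWINGS, source_swings)]
```

The product of the four stage gains is 10 V/V, which corresponds to the 20 dB target, and every device sees the same 0.5 V drain-source swing.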


Figure 35 Proposed 4-stacked driver (single-ended) with input and output matching network


Figure 36 Intermediate drain voltages

### 3.5 Transistor sizing and biasing

Sizing and biasing the transistors is the main challenge in this design. As discussed earlier in (3.4), the stages must have gains of $2.5 \frac{\mathrm{~V}}{\mathrm{~V}}, 2 \frac{\mathrm{~V}}{\mathrm{~V}}, 1.5 \frac{\mathrm{~V}}{\mathrm{~V}}$, and $1.33 \frac{\mathrm{~V}}{\mathrm{~V}}$, respectively from the first to the last stage, to achieve the overall gain of 20 dB. Moreover, the DC voltages at the drain of each stage should be $0.8 \mathrm{~V}, 1.6 \mathrm{~V}, 2.4 \mathrm{~V}$, and 3.2 V, to make sure that $V_{D S, n}$ does not exceed the nominal voltage of the device. Furthermore, with a $50 \Omega$ load at the output of the driver, $R_{o p t}$ for the 4-stack becomes about $12.5 \Omega$. With this information, and with the input signal swing of $200 \mathrm{~mV}_{p}$, the transistors can be sized.

Since the first stack transistor has the highest gain, the design of this stage is slightly more challenging than that of the other stages. Given the input voltage and the gain of the first stage, the drain voltage swing of this stage can be calculated as 500 mV around an 800 mV DC level, as shown in Figure 37 (a). To ensure the signal does not clip at the bottom, the overdrive voltage, $V_{o v}$, should be less than the minimum of the output swing, which is 300 mV, with some margin. At the top of the swing, because a choke inductor is used for the VDD connection, the peak voltage can ideally go as high as $2 \times V D D$, so in this case there is more than 300 mV of margin. Thus, the transistor must be biased to have a small overdrive voltage while maintaining the required gain. On the other hand, due to the large input signal, which in this case is 200 mV, the threshold voltage also needs to be minimized to make sure the first stack transistor does not enter the triode region. Access to the back-gate in this technology, as discussed before, allows the threshold voltage to be reduced so that the drain voltage does not clip at its peak value, as shown in Figure 37 (b). However, from equation (3.15), reducing the threshold voltage increases the overdrive voltage; therefore, $V_{g s}$ also needs to be minimized to reduce the overdrive voltage.

$$
V_{o v}=V_{g s}-V_{t h}
$$

Figure 38 (a) shows the effect of the back-gate voltage on the threshold voltage $\left(V_{t h}\right)$ of a device. As shown, the threshold voltage of the transistor decreases by about $80 \frac{\mathrm{mV}}{\mathrm{V}}$, with a maximum of 250 mV and a minimum of 90 mV.

Additionally, Figure 38 (b) shows the effect of the back-gate voltage on the $g_{m}$ of the transistor. As shown, the peak transconductance moves to lower $V_{G S}$ as the back-gate voltage increases. In this work, a 2 V back-gate voltage is chosen with a $V_{g s}$ of about 0.3 V to reduce the overdrive voltage while achieving the required gain. Also, the $192-\mu \mathrm{m}$ overall width of the first stack transistor is divided into 20 fingers and 8 multipliers for power handling purposes. The same design procedure is used for the other stages. However, due to their lower required gain, smaller transistors are used to minimize the device parasitics.
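
The headroom argument for the first stage can be checked numerically. The sketch below is only illustrative: it assumes a linear threshold model in which $V_{th}$ starts at 250 mV with zero back-gate bias and falls by 80 mV per volt down to a 90 mV floor. The slope and the two limits are taken from the text, but the linear shape and the bias at which each limit occurs are assumptions, and the function names are hypothetical.

```python
def v_th(v_bg, v_th0=0.25, slope=0.08, v_th_min=0.09):
    """Assumed linear threshold model (V): V_th drops by `slope` volts per
    volt of back-gate bias, starting at v_th0 and floored at v_th_min."""
    return max(v_th0 - slope * v_bg, v_th_min)

def overdrive(v_gs, v_bg):
    """V_ov = V_gs - V_th, evaluated with the assumed V_th model."""
    return v_gs - v_th(v_bg)

v_drain_dc = 0.8                         # first-stage drain DC level, V
v_drain_pk = 0.5                         # first-stage drain swing, V
v_drain_min = v_drain_dc - v_drain_pk    # lowest instantaneous drain voltage, V

v_ov = overdrive(v_gs=0.3, v_bg=2.0)     # bias point quoted in the text
headroom = v_drain_min - v_ov            # margin before clipping at the bottom, V
```

With these assumed numbers the overdrive voltage is about 0.21 V, which stays below the 0.3 V minimum drain voltage and leaves roughly 90 mV of margin.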


Figure 37 (a) Input/output of first stack transistor, (b) back-gate voltage effect on the first stage drain voltage


Figure 38 (a) Threshold Voltage Vs. Back-gate voltage; (b) Transconductance (gm) Vs. Vgs for different Backgate voltages.

### 3.6 Feedback and wideband matching

In this study, the effect of resistive feedback on both the gain and the stability factor is observed. Figure 39 (a) shows that, as $R_{f}$ is reduced from $900 \Omega$ to $100 \Omega$, the gain of the system drops from 23 dB to 10 dB, but the bandwidth increases, as expected. Figure 39 (b) shows the stability factor "Mu" as $R_{f}$ changes from $100 \Omega$ to $900 \Omega$. Although $R_{f}=300 \Omega$ performs better at higher frequencies, it does not at low frequencies. With $R_{f}=500 \Omega$, however, better stability is observed at both low and high frequencies. In the end, a $550 \Omega$ feedback resistor was chosen to achieve the desired gain and stability.


Figure 39 Effect of feedback resistor on (a) gain, bandwidth, and (b) stability.

Another advantage of the feedback resistor is that it provides broadband matching, especially at the input of the driver. At the output, due to the properties of this topology, which have been discussed, the matching is already acceptable if the right value is chosen for $Z_{\text {opt }}$, which in this design is about $12.5 \Omega$. Nevertheless, an L-type matching network is used to achieve even better input/output matching. Figure 40 shows the input and output matching versus frequency on the Smith chart. As shown, both the input and the output traces are close to the center of the Smith chart.
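
To relate the -10 dB matching target to a position on the Smith chart, the reflection coefficient can be converted to return loss and VSWR. The sketch below uses the standard definitions; the example impedance is a hypothetical value chosen for illustration and is not taken from the design.

```python
import math

def reflection_coefficient(z_load, z0=50.0):
    """Reflection coefficient of z_load referenced to z0."""
    return (z_load - z0) / (z_load + z0)

def reflection_db(gamma):
    """Magnitude of S11 or S22 in dB for a given reflection coefficient."""
    return 20 * math.log10(abs(gamma))

def vswr(gamma):
    """Voltage standing wave ratio for a given reflection coefficient."""
    mag = abs(gamma)
    return (1 + mag) / (1 - mag)

gamma_limit = 10 ** (-10 / 20)        # |Gamma| that corresponds to -10 dB
reflected_fraction = gamma_limit ** 2  # fraction of incident power reflected

z_example = complex(40.0, 10.0)       # hypothetical port impedance, ohms
s_example_db = reflection_db(reflection_coefficient(z_example))
```

A -10 dB match corresponds to a reflection coefficient magnitude of about 0.32, so any trace inside that radius on the Smith chart reflects at most 10% of the incident power.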


Figure 40 Input/output matching vs. frequency from 1 GHz to 17 GHz

### 3.7 Choke Inductor

In this design, the choke inductor is placed between VDD and the circuit. Compared with a conventional resistive load, a choke inductor has several benefits. Since the impedance looking into the choke inductor is $j \omega L_{f}$, its impedance at high frequencies is very large; therefore, the choke inductor blocks the high-frequency RF signal and also minimizes the effect of the device parasitics at higher frequencies. This results in inductive peaking, which extends the bandwidth. Moreover, at DC the choke inductor acts as a short circuit, which means the DC current passes through with no voltage drop across the inductor. Therefore, the efficiency of the device also improves, with a maximum drain efficiency of $50 \%$, as mentioned earlier. However, there are some challenges in designing the choke inductor. Due to the limitations of sub-micron CMOS technology, achieving a large inductance value with a decent quality factor on chip is not possible. Also, to include low frequencies in the bandwidth, a relatively large inductor is needed; therefore, an off-chip inductor is a solution. As the operating frequency increases, however, the self-resonant frequency (SRF) of the inductor also needs to be considered, and it must be well above the maximum operating frequency. In this work, a 24-nH off-chip choke inductor is used, with an upper frequency limit of more than 40 GHz and a quality factor better than 20 at 10 MHz. Additionally, this inductor can handle up to 500 mA of current.

Figure 41 shows the effect of the choke inductor value on the low cutoff frequency. The choke inductance was swept from 1 nH to 30 nH. As the inductance increased, the cutoff frequency moved lower by about 1 GHz.

Moreover, as the inductance increases, the input/output matching of the driver improves at 1 GHz by about 4 dB . Additionally, a slightly higher gain was achieved with a higher inductance value, due to less loading effect at the output.
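
The frequency dependence of the choke can be illustrated by evaluating the magnitude of $j \omega L$ for the 24-nH inductor. This is only a sketch for an ideal inductor: it ignores the self-resonant frequency, the finite quality factor, and the rest of the network, all of which affect the actual cutoff frequency.

```python
import math

def inductor_reactance(freq_hz, l_henry):
    """Magnitude of the impedance j*omega*L of an ideal inductor, in ohms."""
    return 2 * math.pi * freq_hz * l_henry

L_CHOKE = 24e-9   # choke inductance from the text, H

x_400mhz = inductor_reactance(0.4e9, L_CHOKE)   # near the lower band edge
x_10ghz = inductor_reactance(10e9, L_CHOKE)     # mid-band

# Reactance at a few frequencies across the band, keyed by frequency in GHz.
reactance_ohm = {f: inductor_reactance(f * 1e9, L_CHOKE) for f in (0.4, 1, 10, 17.4)}
```

The ideal reactance is about 60 Ω at 0.4 GHz and about 1.5 kΩ at 10 GHz, which illustrates why the choke looks like an open circuit in mid-band but begins to load the output near the lower band edge.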


Figure 41 Choke inductor effect on the low cutoff frequency

### 3.8 Layout

Good layout techniques are essential at RF and mm-wave frequencies, since all the wiring, pads, and passive devices may introduce parasitics that change the design parameters. Moreover, large-value passive devices are not achievable, due to the quality factor degradation as the reactance increases. In this project, a 650-pH spiral inductor with a peak quality factor of 19 was designed using the Sonnet EM simulator for the input/output matching networks, as shown in Figure 42. Additionally, four MOM capacitors are connected in parallel to form a larger capacitor for input/output coupling purposes, as shown in Figure 43. However, due to the limited quality factor of the MOM capacitor, a severe voltage gain drop is observed at higher frequencies when the coupling capacitor is larger than 2 pF. Figure 44 shows the quality factor of the $2-\mathrm{pF}$ MOM capacitor.



Figure 42 Designed spiral inductor using Sonnet EM simulator


Figure 43 2-pF coupling capacitor ($\mathrm{C}_{6}$ in Figure 35), consisting of four 500 fF unit capacitors in parallel; each unit measures $8 \mu m \times 10 \mu m$ and the overall size of $\mathrm{C}_{6}$ is $15 \mu m \times 50 \mu m$.


Figure 44 2-pF MOM coupling capacitor Quality Factor


Figure 45 First stage transistor layout, consisting of 8 arrays of SLVT-nFET

Figure 45 shows the first stage transistor. It consists of 8 multipliers, where each transistor array consists of 20 fingers. Multiple arrays are chosen to improve the matching and power handling of the transistor. Moreover, to ensure that the transistor can handle the required current, all trace widths were chosen carefully and all vias were counted and checked based on [19]. In this arrangement, the transistor arrays are placed in two rows so that the sources face each other and can be connected easily. All the drains and sources are then connected to the top three metal layers to handle high current flow without burning out the metal. The same structure is used for the rest of the stages with different unit array sizing.

Figure 46 is a close look at the active part of the layout. Each stage corresponds to one stacked transistor, where the $1^{\text {st }}$ stage is the common-source transistor and the other three stages are the common-gate transistors cascoded on top of each other. In this layout, the top three metals are used for the RF signal path, because of their thickness and lower loss, to ensure that the signal path can handle the current. Overall, the main signal path can handle up to 74 mA RMS current, based on the PDK files [19], which is much higher than the overall RMS current in this design.


Figure 46 Layout of the differential stacked PA, active area

Figure 47 shows the overall layout of the fully differential 4-stacked RF driver in 22 nm FDSOI, with a core area of $0.167 \mathrm{~mm}^{2}$. As shown, two pairs of inductors are used for input and output matching. Also, bypass capacitors are placed at the bottom, close to the DC pads, to filter out the AC signal at those nodes and reduce the power supply noise before it enters the circuit.


Figure 47 Overall Differential 4-stacked RF driver

### 3.9 Wire bonding effect

In this design, a wire bonding connection was used for measurement purposes. Due to the high operating frequency, the parasitics associated with the bond pads and the inductance of the bond wires must be considered in the design. Typically, the effect of the bond wires and bond pads is more critical on the RF path, while on the DC path a bypass capacitor removes those effects. However, in this design, bypass capacitors cannot be placed on the VDD pads, since they would also bypass the off-chip choke inductors. Therefore, the effect of the bond wire and pads needs to be considered, as shown in Figure 48.


Figure 48 Passive VDD connection and its equivalent circuit model

With a sufficiently large choke inductor, the choke can be considered an open circuit at high frequencies. In this case, the pad-2 capacitance ( $\mathrm{C}_{\mathrm{p} 2}$ ) and the bond wire inductance (LBW) resonate at a frequency given by

$$
\mathrm{f}=\frac{1}{2 \pi \sqrt{(\mathrm{LBW})\left(\mathrm{C}_{\mathrm{p} 2}\right)}}
$$

where $\mathrm{C}_{\mathrm{p} 2}$ is the pad capacitance on the PCB, which is taken to be 80 fF. Figure 49 shows the effect of varying the bond-wire inductance (LBW) from 200 pH to 600 pH. As shown, there is a notch at the resonance frequency for each LBW value, and this notch moves to a higher frequency as LBW decreases. The same behavior is observed in the driver transfer function: as shown in Figure 50, the notches appear at the same frequencies. Therefore, it is critical to minimize LBW and the pad capacitance in order to push the notch out of the frequency range of interest.
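
The resonance expression can be evaluated directly for the bond-wire inductances considered here. The following is a minimal sketch assuming an ideal series LC with the 80 fF pad capacitance; the function name is illustrative.

```python
import math

def resonance_hz(l_bw, c_pad):
    """Resonance frequency f = 1 / (2*pi*sqrt(L*C)) of an ideal LC pair."""
    return 1.0 / (2 * math.pi * math.sqrt(l_bw * c_pad))

C_P2 = 80e-15   # PCB pad capacitance, F

# Notch frequency in GHz for each bond-wire inductance given in pH.
notch_ghz = {
    l_ph: resonance_hz(l_ph * 1e-12, C_P2) / 1e9
    for l_ph in (200, 400, 550, 600)
}
```

For an ideal LC pair this gives notches near 39.8 GHz, 28.1 GHz, 24.0 GHz, and 23.0 GHz for 200 pH, 400 pH, 550 pH, and 600 pH, respectively, so a shorter bond wire moves the notch further from the band of interest.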


Figure 49 Effect of resonance frequency as the bond-wire inductance varies from 200 pH to 600 pH with fixed 80-fF pad2 capacitance


Figure 50 driver transfer function with LBW of 400 pH and 550 pH

## 4. SIMULATION RESULTS

### 4.1 Post-Layout Simulation Results

### 4.1.1 S-parameter analysis

Figure 51 shows the S-parameter analysis of the proposed design. As shown, a gain of 20 dB is achieved with a 3-dB bandwidth of about 17 GHz. Input and output matching better than -10 dB is achieved from 1 GHz to 20 GHz. Below 1 GHz, however, the matching starts to degrade due to the effect of the coupling capacitors at the input and the output. The effect of wire bonding is noticeable at about 28 GHz, with 400 pH bond wire inductance and 80 fF PCB pad capacitance.


Figure 51 S-parameter analysis of the proposed differential driver

### 4.1.2 Design stability

Figure 52 (a) shows the primary stability factor of the design. As shown, the K-factor is greater than 1 over the entire bandwidth, which means the system is unconditionally stable. Moreover, Figure 52 (b) shows the Mu and Mu-prime stability factors, which are both greater than 1 and confirm the stability of the system.
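
For reference, the stability factors plotted in Figure 52 can be computed from the two-port S-parameters with the standard Rollett and Edwards-Sinsky definitions. The sketch below uses hypothetical S-parameter values at a single frequency, not values extracted from this design.

```python
def stability_factors(s11, s12, s21, s22):
    """Return (K, B1, mu, mu_prime) for one set of complex S-parameters."""
    delta = s11 * s22 - s12 * s21
    loop = abs(s12 * s21)
    k = (1 - abs(s11) ** 2 - abs(s22) ** 2 + abs(delta) ** 2) / (2 * loop)
    b1 = 1 + abs(s11) ** 2 - abs(s22) ** 2 - abs(delta) ** 2
    mu = (1 - abs(s11) ** 2) / (abs(s22 - delta * s11.conjugate()) + loop)
    mu_prime = (1 - abs(s22) ** 2) / (abs(s11 - delta * s22.conjugate()) + loop)
    return k, b1, mu, mu_prime

# Hypothetical example: 20 dB forward gain, -40 dB reverse isolation,
# and -20 dB reflection at both ports.
k, b1, mu, mu_prime = stability_factors(
    s11=complex(0.1, 0.0),
    s12=complex(0.01, 0.0),
    s21=complex(10.0, 0.0),
    s22=complex(0.1, 0.0),
)
unconditionally_stable = (k > 1 and b1 > 0) and mu > 1
```

A two-port is unconditionally stable when K > 1 together with B1 > 0, or equivalently when Mu > 1, so checking both gives a useful cross-check at each frequency point.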


Figure 52 (a) K-factor and B1f stability factors of the PA, (b) Mu and Mu-prime stability factors

### 4.1.3 Harmonic balance analysis (1-dB compression point and Voltage gain)

This section presents the large-signal behavior obtained from harmonic balance analysis. Figure 53 (a) and (b) show the output power versus input power of the driver at 10 and 16 GHz. As shown, the IP1dB at 10 and 16 GHz is -2.9 dBm and -3.2 dBm and the OP1dB is 16 dBm and 14.3 dBm, respectively, which means that the system behaves nonlinearly for input signals around IP1dB and beyond. Moreover, $P_{\text {Sat }}$ is about 17.5 dBm at both 10 and 16 GHz.
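
The compression-point figures can be related to the gain and to the voltage swing with a short calculation. The sketch below assumes a sinusoidal signal and a $50 \Omega$ reference at both the input and the output; the helper names are illustrative.

```python
import math

def dbm_to_watts(p_dbm):
    """Convert a power level in dBm to watts."""
    return 1e-3 * 10 ** (p_dbm / 10)

def peak_voltage(p_dbm, r_load=50.0):
    """Peak amplitude of a sinusoid that delivers p_dbm into r_load."""
    return math.sqrt(2 * dbm_to_watts(p_dbm) * r_load)

# Frequency in GHz -> (IP1dB, OP1dB) in dBm, as quoted in the text.
points = {10: (-2.9, 16.0), 16: (-3.2, 14.3)}

gain_at_p1db = {f: op - ip for f, (ip, op) in points.items()}
# By definition the gain at P1dB is 1 dB below the small-signal gain.
small_signal_gain = {f: g + 1.0 for f, g in gain_at_p1db.items()}

v_pk_10ghz = peak_voltage(16.0)   # swing at OP1dB, 10 GHz
```

At 10 GHz this gives a compressed gain of 18.9 dB, which implies a small-signal gain of about 19.9 dB, and 16 dBm into $50 \Omega$ corresponds to a peak amplitude of roughly 2 V, in line with the single-ended swing targeted for the MZM.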


Figure 53 (a) OP1dB at 10 GHz is about 16 dBm and IP1dB is about $-2.9$ dBm (single-ended, $50 \Omega$ termination); (b) OP1dB at 16 GHz is about 14.3 dBm and IP1dB is about $-3.2$ dBm (single-ended, $50 \Omega$ termination)

Figure 54 (a) and (b) show the voltage gain versus the input power at 10 and 16 GHz. Up to about -4 dBm, the gain is relatively constant and the driver is linear; beyond that point, the gain starts to degrade.


Figure 54 Voltage gain vs. input power at (a) 10 GHz, (b) 16 GHz.

### 4.1.4 Transient Analysis

Figure 55 (a) and (b) show the transient analysis of the driver with a 400 mV differential input signal at 10 GHz. As shown in Figure 55 (a), a 4 V output swing is achieved. Figure 55 (b) shows the single-ended input and output swings. As shown, there is a small mismatch between the positive and negative output swings due to layout mismatch.


Figure 55 (a) Differential and (b) single-ended input/output signals at 10 GHz

### 4.2 Chip microphotograph

Figure 56 shows the microphotograph of the TIA, VGA, and RF driver on a single chip, where the highlighted part is the RF driver. In this chip, the RF ports are all placed on the left side and the bias pads are placed at the top and the bottom of the chip.


Figure 56 Chip microphotograph

### 4.3 Measurement plan

In this work, the CMOS chip is measured using wire bonding instead of probing, for two reasons. First, since there are off-chip components connected to a PCB, which is shown in Figure 58, wire bonding is necessary. Second, because the goal of this project is to design the OEO loop, wire bonding is the only way to connect the CMOS chip to the SiP chip. For these reasons, wire bonding is used to connect the CMOS chip to a PCB for measurement purposes. To minimize the length of the bond wires and their inductance, a cavity was designed in the PCB so that the chip can be placed inside it, with the surfaces of the PCB and the chip at the same height, which allows shorter bond wires. Another challenge of this measurement is to characterize the driver alone, without the TIA and the VGA. To overcome this challenge, a test point is designed at the input of the driver, as shown in Figure 57, so that the driver can be characterized by subtracting the test point result from the output result.


Figure 57 The location of the test point at the CMOS chip

The first step in the measurement is to calibrate the equipment and cables so that their effect can later be de-embedded from the measurement results.


Figure 58 Measurement PCB

### 4.3.1 S-parameter measurement

Figure 59 shows the S-parameter measurement setup using a VNA. A VNA injects a precise sine wave and sweeps its frequency while the receiver tracks the response. The S-parameters can be calculated from equation (4.1), where $a_{1}$ and $a_{2}$ are the incident signals at the input and output, and $b_{1}$ and $b_{2}$ are the reflected signals at the input and output, respectively, as shown in Figure 59. However, in this design, since the CMOS chip includes the TIA, VGA, and driver, the $S_{11}$ of the driver alone cannot be measured. Moreover, since the output of the driver is differential, a balun is needed to measure $S_{22}$ and $S_{21}$, as shown in Figure 60. Also, since the gain at the output is the cumulative gain of the TIA, VGA, and driver, the gain of the driver alone is obtained by subtracting the gain at the test point from the overall output gain.

$$
\begin{array}{ll}
S_{11}=\frac{b_{1}}{a_{1}} & S_{12}=\frac{b_{1}}{a_{2}} \\
S_{21}=\frac{b_{2}}{a_{1}} & S_{22}=\frac{b_{2}}{a_{2}}
\end{array}
$$
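
The test-point subtraction described above amounts to a difference of gains expressed in dB. The sketch below assumes that both transmission measurements share the same input reference plane and that balun and cable losses have already been calibrated out; the numbers are placeholders, not measured data.

```python
import math

def mag_db(s):
    """Magnitude of a (possibly complex) S-parameter in dB."""
    return 20 * math.log10(abs(s))

def driver_gain_db(s21_output, s21_test_point):
    """Gain of the driver alone: chain gain minus gain up to the test point."""
    return mag_db(s21_output) - mag_db(s21_test_point)

# Placeholder values: 40 dB measured at the chip output and 20 dB measured
# at the test point (TIA + VGA only).
s21_total = complex(100.0, 0.0)
s21_tia_vga = complex(10.0, 0.0)

gain_driver_db = driver_gain_db(s21_total, s21_tia_vga)
```

Subtracting in dB is equivalent to dividing the linear transmission magnitudes, which is valid only as long as the test point does not significantly load the driver input.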



Figure 59 S-Parameter measurement setup (Reprinted from [23])


Figure 60 Differential output to single-ended, using a Balun

### 4.3.2 Transient measurement

To observe the transient response of the system, a signal generator and an oscilloscope are needed. The signal generator injects a signal with a given amplitude and frequency, and the oscilloscope captures either the single-ended output swing, using a balun, or the differential output swing of the driver.

### 4.3.3 Large Signal measurement

For the large-signal measurements, a spectrum analyzer and a swept signal generator are going to be used, as shown in Figure 61. The signal generator injects the signal into the device and sweeps its output power while the spectrum analyzer measures the single-ended output power of the DUT through a balun, as shown in Figure 60. The cables and probes are de-embedded using the calibration. The P1dB and IP3 are the two main parameters in a large-signal measurement. To measure the P1dB, the input power is swept while the gain is monitored. The point where the gain drops by 1 dB relative to the small-signal linear region is the 1 dB compression point. To measure the IP3, a two-tone signal is injected, as shown in Figure 62, while the power of both tones is swept at the same time. The point where the extrapolated output powers of the fundamental and the third-order intermodulation product are equal is the IP3 point.
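
The two extractions described above can be sketched in a few lines. The code below finds the 1 dB compression point by linear interpolation of swept data and extrapolates the output IP3 from a single two-tone point using the 3:1 slope rule. The sweep values are synthetic and serve only to exercise the functions.

```python
def find_p1db(p_in_dbm, p_out_dbm, n_linear=3):
    """Return (IP1dB, OP1dB) in dBm by interpolating swept data.

    The small-signal gain is taken as the mean gain of the first
    n_linear points, which are assumed to lie in the linear region.
    """
    gains = [po - pi for pi, po in zip(p_in_dbm, p_out_dbm)]
    target = sum(gains[:n_linear]) / n_linear - 1.0
    for i in range(1, len(gains)):
        if gains[i] <= target < gains[i - 1]:
            frac = (gains[i - 1] - target) / (gains[i - 1] - gains[i])
            ip = p_in_dbm[i - 1] + frac * (p_in_dbm[i] - p_in_dbm[i - 1])
            return ip, ip + target
    return None  # no compression found within the sweep

def oip3_dbm(p_fund_dbm, p_im3_dbm):
    """Extrapolated output IP3 from one two-tone point in the linear region."""
    return p_fund_dbm + (p_fund_dbm - p_im3_dbm) / 2.0

# Synthetic sweep: 20 dB gain that compresses at higher input power.
p_in = [-20.0, -15.0, -10.0, -5.0, 0.0]
p_out = [0.0, 5.0, 10.0, 14.5, 18.5]
ip1db, op1db = find_p1db(p_in, p_out)

oip3 = oip3_dbm(p_fund_dbm=0.0, p_im3_dbm=-50.0)
```

The IP3 extrapolation is meaningful only when the two-tone point is taken well below compression, where the fundamental rises at 1 dB/dB and the third-order product at 3 dB/dB.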


Figure 61 Large-signal measurement setup


Figure 62 IP3 measurement setup (Reprinted from [24])

## 5. CONCLUSION AND FUTURE WORKS

In this study, different RF driver topologies, such as distributed and stacked RF drivers, were analyzed and their advantages and disadvantages were identified. A fully differential, wide-band stacked RF driver was then presented. The proposed design, prototyped in 22 nm Global Foundries FDSOI technology, shows a $0.4-17.4 \mathrm{GHz}$ 3-dB bandwidth with 20 dB voltage gain and about 16 dBm output power at 10 GHz. This design consumes 288 mW of overall DC power (differential) from a 3.2 V supply and occupies a core area of $0.167 \mathrm{~mm}^{2}$. Table 5 shows a brief performance comparison between the post-layout simulation results of this work and other works.

In the future, the chip will be measured as soon as it is ready and wire bonded to the PCB. It will be measured both as a stand-alone CMOS chip and together with the SOI photonic chip. Moreover, the ultimate goal of this project is to extend the bandwidth of the RF driver and the whole OEO system to 40 GHz.

Table 5 Performance comparison

|  | This Work | [15] | [16] | [18] |
| :---: | :---: | :---: | :---: | :---: |
| Technology | $\underline{22-\mathrm{nm}}$ | $22-\mathrm{nm}$ | 45 nm SOI | $45-\mathrm{nm}$ |
| Gain | $\underline{20 \mathrm{~dB}}$ | $\sim 24 \mathrm{~dB}$ | 8 dB | 12.4 dB |
| Frequency | $\underline{0.4-17.4 \mathrm{GHz}}$ | $\mathrm{DC}-39 \mathrm{GHz}^{*}$ | $88-90 \mathrm{GHz}$ | $81-96 \mathrm{GHz}$ |
| OP1dB | $\underline{16 \mathrm{dBm} @ 10 \mathrm{GHz}}$ | 11.6 dBm | 11.5 dBm | --- |
| $\mathbf{P}_{\text {sat }}$ | $\underline{17.6 \mathrm{dBm}}$ | 16.4 dBm | 17 dBm | 19.2 dBm |
| PAE | $\underline{27.4 \%}$ | $28.3 \%$ | $9 \%$ | $14 \%$ |
| Supply | $\underline{3.2 \mathrm{~V}}$ | 3.2 V | 4.2 V | 3.4 V |
| Consumption | $\underline{144 \mathrm{~mW}}$ | 140 mW | 382 mW | 379 mW |

\* The reported value is not the 3-dB bandwidth.


## REFERENCES

[1] Abdelrahman, Diaaeldin, et al. "A Novel Inductorless Design Technique for Linear Equalization in Optical Receivers." Journal of Low Power Electronics and Applications, vol. 12, no. 2, Apr. 2022, p. 19. Crossref, https://doi.org/10.3390/jlpea12020019.
[2] Qi, Nan, et al. "A 25 Gb/s, 520 mW, 6.4 Vpp silicon-photonic Mach-Zehnder modulator with distributed driver in CMOS." Optical Fiber Communication Conference. Optical Society of America, 2015.
[3] Pornpromlikit, S. (2010) CMOS RF power amplifier design approaches for Wireless Communications. Dissertation. University of California, San Diego.
[4] Kazimierczuk, M.K. (2008) RF Power Amplifiers. Chichester, West Sussex, UK: John Wiley \& Sons.
[5] Borel, A.; Barzdėnas, V.; Vasjanov, A. Linearization as a Solution for Power Amplifier Imperfections: A Review of Methods. Electronics 2021, 10, 1073. https://doi.org/10.3390/electronics10091073
[6] Cripps, S.C. (2006) RF power amplifiers for Wireless Communications. Boston: Artech House.
[7] El-Aassar, Omar, and Gabriel M. Rebeiz. "A 120-GHz bandwidth CMOS distributed power amplifier with multi-drive intra-stack coupling." IEEE Microwave and Wireless Components Letters 30.8 (2020): 782-785.
[8] Kumar, N. and Grebennikov, A. (2015) Distributed power amplifiers for RF and Microwave Communications. Artech House.
[9] Çelik, Umut, and Patrick Reynaert. "Robust, efficient distributed power amplifier achieving 96 Gbit/s with 10 dBm average output power and 3.7% PAE in 22-nm FDSOI." IEEE Journal of Solid-State Circuits 56.2 (2020): 382-391.
[10] X. Zhu, Y. Qian, Z. Peng, Y. Liang and S. Diao, "Analysis and Design of a DC-12-GHz Distribution Power Amplifier for Quantum Key Distribution Application," in IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 30, no. 9, pp. 1306-1318, Sept. 2022, doi: 10.1109/TVLSI.2022.3180503.
[11] Alsuraisry, Hamed et al. "A 24-GHz transformer-based stacked-FET power amplifier in 90-nm CMOS technology." 2015 Asia-Pacific Microwave Conference (APMC) 3 (2015): 1-3.
[12] Dabag, Hayg-Taniel, et al. "Analysis and design of stacked-FET millimeter-wave power amplifiers." IEEE Transactions on Microwave Theory and Techniques 61.4 (2013): 1543-1556.
[13] Wu, Hai-Feng, et al. "Analysis and design of an ultrabroadband stacked power amplifier in CMOS technology." IEEE Transactions on Circuits and Systems II: Express Briefs 63.1 (2015): 49-53.
[14] McRory, John G., Gordon G. Rabjohn, and Ronald H. Johnston. "Transformer coupled stacked FET power amplifiers." IEEE Journal of Solid-State Circuits 34.2 (1999): 157-161.
[15] Dadash, M. Sadegh, David Harame, and Sorin P. Voinigescu. "Large-swing 22nm si/SiGe fdsoi stacked cascodes for 56GBaud drivers and 5g pas." 2018 IEEE BiCMOS and Compound Semiconductor Integrated Circuits and Technology Symposium (BCICTS). IEEE, 2018.
[16] Jayamon, Jefy, et al. "A W-band stacked FET power amplifier with 17 dBm Psat in 45-nm SOI MOS." 2013 IEEE Radio and Wireless Symposium. IEEE, 2013.
[17] Kao, Min-Sheng, et al. "20-Gb/s CMOS EA/MZ modulator driver with intrinsic parasitic feedback network." IEEE Transactions on Very Large Scale Integration (VLSI) Systems 22.3 (2013): 475-483.
[18] Agah, Amir, et al. "Multi-drive stacked-FET power amplifiers at 90 GHz in 45 nm SOI CMOS." IEEE Journal of Solid-State Circuits 49.5 (2014): 1148-1157.
[19] 22FDX Design Kit and Technology Training Book
[20] Carter, R., et al. "22nm FDSOI technology for emerging mobile, Internet-of-Things, and RF applications." 2016 IEEE International Electron Devices Meeting (IEDM). IEEE, 2016.
[21] Pornpromlikit, Sataporn, et al. "A watt-level stacked-FET linear power amplifier in silicon-on-insulator CMOS." IEEE Transactions on Microwave Theory and Techniques 58.1 (2009): 57-64.
[22] H. Portela, V. Subramanian and G. Boeck, "Fully integrated high efficiency K-band PA in $0.18 \mu \mathrm{~m}$ CMOS technology," 2009 SBMO/IEEE MTT-S International Microwave and Optoelectronics Conference (IMOC), Belem, Brazil, 2009, pp. 393-396, doi: 10.1109/IMOC.2009.5427557.
[23] Team, The Sierra. "S-Parameters Measurement via VNA." Sierra Circuits, 6 Feb. 2023, https://www.protoexpress.com/blog/s-parameters-measurement-vector-networkanalyzer/.
[24] Anritsu Co., Morgan Hill. "VNA Addresses IMD Measurements." Microwave Journal, Microwave Journal, 27 Apr. 2018, https://www.microwavejournal.com/articles/24172-vna-addresses-imd-measurements.



