# FAST ALGORITHMS FOR HIGH FREQUENCY INTERCONNECT MODELING IN VLSI CIRCUITS AND PACKAGES

A Dissertation

by

YANG YI

Submitted to the Office of Graduate Studies of Texas A&M University in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

December 2009

Major Subject: Electrical Engineering

## FAST ALGORITHMS FOR HIGH FREQUENCY INTERCONNECT MODELING IN VLSI CIRCUITS AND PACKAGES

A Dissertation

by

YANG YI

Submitted to the Office of Graduate Studies of Texas A&M University in partial fulfillment of the requirements for the degree of

#### DOCTOR OF PHILOSOPHY

### Approved by:

Chair of Committee,

Weiping Shi Costas Georghiades Committee Members,

Peng Li Vivek Sarin

Head of Department, Costas Georghiades

December 2009

Major Subject: Electrical Engineering

#### ABSTRACT

Fast Algorithms for High Frequency Interconnect Modeling in VLSI Circuits and Packages. (December 2009)

Yang Yi, B.S., Shanghai Jiaotong University, China; M.En., Shanghai Jiaotong University, China Chair of Advisory Committee: Dr. Weiping Shi

Interconnect modeling plays an important role in design and verification of VLSI circuits and packages. For low frequency circuits, great advances for parasitic resistance and capacitance extraction have been achieved and wide varieties of techniques are available. However, for high frequency circuits and packages, parasitic inductance and impedance extraction still poses a tremendous challenge. Existing algorithms, such as FastImp and FastHenry developed by MIT, are slow and inherently unable to handle multiple dielectrics and magnetic materials.

In this research, we solve three problems in interconnect modeling for high frequency circuits and packages.

- 1) Multiple dielectrics are common in integrated circuits and packages. We propose the first Boundary Element Method (BEM) algorithm for impedance extraction of interconnects with multiple dielectrics. The algorithm uses a novel equivalent-charge formulation to model the extraction problem with significantly fewer unknowns. Then fast matrix-vector multiplication and effective preconditioning techniques are applied to speed up the solution of linear systems. Experimental results show that the algorithm is significantly faster than existing methods with sufficient accuracy.
  - 2) Magnetic materials are widely used in MEMS, RFID and MRAM. We present

the first BEM algorithm to extract interconnect inductance with magnetic materials. The algorithm models magnetic characteristics by the Landau Lifshitz Gilbert equation and fictitious magnetic charges. The algorithm is accelerated by approximating magnetic charge effects and by modeling currents with solenoidal basis. The relative error of the algorithm with respect to the commercial tool is below 3%, while the speed is up to one magnitude faster.

3) Since traditional interconnect model includes mutual inductances between pairs of segments, the resulting circuit matrix is very dense. This has been the main bottleneck in the use of the interconnect model. Recently,  $K = L^{-1}$  is used. The RKC model is sparse and stable. We study the practical issues of the RKC model. We validate the RKC model and propose an efficient way to achieve high accuracy extraction by circuit simulations of practical examples.

To My Family

#### ACKNOWLEDGMENTS

The past few years have made up one of the most enjoyable periods in my life. I owe my gratitude to all those people who have made this possible. Foremost, I would like to thank my advisor Prof. Weiping Shi. I have been amazingly fortunate to have an advisor who gave me the freedom to explore on my own, and at the same time, the guidance to recover when my steps faltered. Prof. Shi has taught me a lot on how to question thoughts and express ideas. His brilliant insight, patient guidance, countless encouragements, and continuous support has helped me overcome many crisis situations and finish this dissertation. Working with him has been truly a pleasant, stimulating and rewarding experience. I would like to thank my committee member Prof. Vivek Sarin. He has always been there to listen and give advice. I am deeply grateful to him for the long discussions that inspired my research work. I have also benefited greatly from Prof. Sarin's courses and his expertise in the field of numerical algorithms. I would like to thank Prof. Peng Li. He served on my dissertation committee, and I have learned a lot from him on the subject of statistical timing analysis, sensitivity analysis, and model order reduction. I would also like to thank Prof. Costas Georghiades for serving on my committee and being supportive in different stages of my doctoral studies.

I am thankful to Prof. Jiang Hu, Prof. Gwan Choi, Prof. Sunil Khatri, and Prof. Donald K. Friesen. It has been an extremely rewarding experience taking their courses and discussing various research topics with them. I am grateful to my friends and colleagues, Dr. Bryon Krauter, Dr. Tingdong Zhou, Dr. Chenggang Xu, David Widiger, Alain Caron, Alice Lu, and Vivian Shen, during my internship at IBM; Dr. Rajendran Panda, Dr. Yuhong Fu, Dr. Yun Zhang, Dr. Robert Wenzel, William Downey, and Brian Xia, during my stay at Freescale Semiconductor, Inc. I

am also indebted to the members of the computer engineering group with whom I have interacted during the course of my graduate studies. Particularly, I would like to acknowledge CT Chang, Zhuo Feng, Shiyan Hu, Jerry Jiang, Yifang Liu, Zhuo Li, Guo Yu, Shu Yan, Xiaoji Ye, Wei Zhuang and Ying Zhou for their various contributions.

Finally, this acknowledgement will not be complete without expressing my profound gratitude to my family. I sincerely thank my mother Lifang, my father Zhengming and my husband Lingjia for their unconditional love. I am what they made me to be. This dissertation is dedicated to them.

## TABLE OF CONTENTS

| CHAPTER |                                               | Page                             |
|---------|-----------------------------------------------|----------------------------------|
| I       | INTRODUCTION                                  | 1<br>1<br>3<br>5                 |
| II      | CIRCUIT FORMULATION FOR IMPEDANCE EXTRACTION  | 7                                |
|         | A. Introduction                               | 7<br>8<br>12<br>12<br>14         |
| III     | PHIIMP ALGORITHM                              | 17                               |
|         | A. Introduction                               | 17<br>18<br>22<br>28<br>28<br>32 |
| IV      | MAGNETIC MATERIAL MODELING                    | 37                               |
|         | <ul> <li>A. Introduction</li></ul>            | 37<br>39<br>40<br>42             |
| V       | INDUCTANCE EXTRACTION WITH MAGNETIC MATERIALS | 49                               |
|         | A. Introduction                               | 49<br>50<br>52<br>56<br>61       |

| CHAPTER  |                                                   | Page |
|----------|---------------------------------------------------|------|
|          | <ol> <li>Accuracy and Efficiency</li></ol>        |      |
| VI       | COMPACT AND ACCURATE INTERCONNECT MODEL $$ . $$ . | 70   |
|          | A. Introduction                                   | 70   |
|          | B. Sparse $K$ Model                               | 71   |
|          | C. Delay Sensitivity Analysis                     | 75   |
|          | D. Experimental Results                           | 78   |
| VII      | CONCLUSIONS                                       | 85   |
| REFERENC | CES                                               | 87   |
| VITA     |                                                   | 9.4  |

## LIST OF TABLES

| TABLE |                                                                                                                               | Page |
|-------|-------------------------------------------------------------------------------------------------------------------------------|------|
| Ι     | Comparison of PHiImp and other existing algorithms                                                                            | 17   |
| II    | Comparison of Impedance (Ohm) for spiral inductor with uniform dielectric, error is with respect to FastImp (higher accuracy) | 35   |
| III   | Comparison of Impedance (Ohm) for a transmission line with multiple dielectrics, error is with respect to HFSS                | 36   |
| IV    | Comparison between MRAM and other memory technologies                                                                         | 37   |
| V     | Inductance (nH) computed by Maxwell 3D and the new algorithm, error is with respect to Maxwell 3D                             | 68   |
| VI    | Delay comparison between the full $L$ method and the sparse $K$ method, error is respect to the full $L$ method               | 79   |
| VII   | Relative error of delay due to extraction error of $R_i$                                                                      | 81   |
| VIII  | Relative error of delay due to extraction error of $C_i$                                                                      | 82   |
| IX    | Relative error of delay due to extraction error of $L_i/K_i$                                                                  | 83   |

## LIST OF FIGURES

| FIGURI | ${f E}$                                                                         | Page |
|--------|---------------------------------------------------------------------------------|------|
| 1      | Design flow                                                                     | 2    |
| 2      | Distributed interconnect $RLC$ model                                            | 9    |
| 3      | Partial element equivalent circuit model                                        | 11   |
| 4      | Panel charge approximation                                                      | 13   |
| 5      | Matrix transformation                                                           | 14   |
| 6      | Recursive refinement process                                                    | 21   |
| 7      | J and $J'$ matrices for a tree of height $3$                                    | 24   |
| 8      | Sparse transformation                                                           | 25   |
| 9      | Preconditioners construction                                                    | 26   |
| 10     | Number of GMRES iterations                                                      | 26   |
| 11     | Rate of convergence                                                             | 27   |
| 12     | Number of GMRES iterations                                                      | 27   |
| 13     | CPU time                                                                        | 28   |
| 14     | Geometry of a transmission line                                                 | 29   |
| 15     | Cross sectional view of a transmission line                                     | 29   |
| 16     | Impedance (Ohm) calculated by FastImp and PHiImp $\ \ldots \ \ldots \ \ldots$   | 31   |
| 17     | Rate of convergence                                                             | 31   |
| 18     | Spiral inductor's top view (a) and cross sectional view (b) $\ \ldots \ \ldots$ | 33   |
| 19     | Impedance (Ohm) calculated by HFSS, FastImp and PHiImp                          | 34   |

| FIGUR | E                                                                         | Page |
|-------|---------------------------------------------------------------------------|------|
| 20    | MRAM structure                                                            | 39   |
| 21    | Schematic view of a integrated MRAM cell                                  | 39   |
| 22    | Micrograph of a integrated MRAM cell                                      | 40   |
| 23    | Magnetic hysteresis loop                                                  | 41   |
| 24    | Gyromagnetic switching processes with damping                             | 42   |
| 25    | Effect of the magnetic field on the real part of $\mu_r$                  | 45   |
| 26    | Effect of the magnetic field on the imaginary part of $\mu_r$             | 46   |
| 27    | Effect of the magnetization on the real part of $\mu_r$                   | 47   |
| 28    | Effect of the magnetization on the imaginary part of $\mu_r$              | 48   |
| 29    | Magnetostatic field modeling                                              | 53   |
| 30    | Conductor array with magnetic blocks                                      | 61   |
| 31    | CPU time ratio of Maxwell 3D over our algorithm                           | 62   |
| 32    | Cross sectional view of $ \mathbf{B} $                                    | 63   |
| 33    | Effect of $M_0$ ( $M_0 = M_S cos(theta)$ ) on the inductance              | 64   |
| 34    | Effect of conductor current on the inductance                             | 65   |
| 35    | Magnitude of S-parameters for interconnect with non-magnetic blocks       | 66   |
| 36    | Magnitude of S-parameters for interconnect with magnetic blocks $$ . $$ . | 67   |
| 37    | Phase of S-parameters for interconnect with non-magnetic blocks           | 69   |
| 38    | Phase of S-parameters for interconnect with magnetic blocks               | 69   |
| 39    | One wire with ten segments                                                | 72   |
| 40    | Circuit model                                                             | 75   |
| 41    | Gate model                                                                | 76   |

| FIGURI | ${f E}$                                                  | Page |
|--------|----------------------------------------------------------|------|
| 42     | Voltage waveform at port 2 in transient analysis         | . 80 |
| 43     | Voltage waveform at port 2 in AC analysis                | . 81 |
| 44     | Relative error of delay due to extraction error of $R_i$ | . 82 |
| 45     | Relative error of delay due to extraction error of $C_i$ | . 83 |
| 46     | Relative error of delay due to extraction error of $L_i$ | . 84 |

#### CHAPTER I

#### INTRODUCTION

#### A. Background

Semiconductor process technology has been continually scaling down for the past four decades and the trend continues. In the early days of very large scale integration (VLSI) circuits, the speed bottleneck was at the circuit level, whereas interconnects were treated as ideal connections with the parasitic effects ignored. With shrinking process technologies, increasing die size and clock frequency, interconnect parasitic effects have begun to manifest themselves in signal delay and noise [1].

Nowadays, the circuit design becomes interconnect limited and the design flow is interconnect driven. Consequently, accurate interconnect modeling is critical to the analysis and design of VLSI circuits, electronic packages and micro-electromechanical devices (MEMS). Fig.1 shows the design flow. As layouts are completed, parasitic extraction tools are used to provide highly accurate interconnect models with passive components including resistance, capacitance, inductance and impedance. Interconnect modeling plays an important role in the circuit simulation and layout optimization.

Most existing interconnect modeling methods fall into two categories. One approach is to solve the electromagnetic field for the volume, such as Finite Difference Method (FDM) or Finite Element Method (FEM). It generates a mesh for the analyzed structures as well as the surrounding space, resulting a large but sparse linear system. Sparse linear solution methods, such as sparse factorization, conjugate-gradient, or multi-grid methods can be used to solve the system.

The journal model is *IEEE Transactions on Automatic Control*.



Fig. 1. Design flow

The second class of methods is Boundary Element Method (BEM) which requires a discretization of only the analyzed structures in the electromagnetic field. It results a small yet dense linear system. Our work is mainly based on BEM.

The dense linear system in BEM is often solved by iterative techniques such as Generalized Minimal Residual (GMRES) [2] and Conjugate Gradient (CG) [3] methods. Each iteration requires the product of the dense coefficient matrix of order n with a vector, which takes  $O(n^2)$  time and  $O(n^2)$  memory. Great efforts have been made to accelerate the computation of matrix-vector products. By exploiting the fast decaying nature of the Green's function, matrix-vector products can be computed efficiently by the fast multipole approximation [4] [5], hierarchical data structure [6] [7], precorrected Fast Fourier Transform (pFFT) method [8] [9] [10], or the singular value decomposition method [11]. These methods reduce the time for each matrix-vector product to O(n) or  $O(n\log n)$ .

For low frequency circuits, great advances for BEM parasitic resistance and capacitance extraction have been achieved and wide varieties of techniques are available, such as FastCap [4], PHiCap [7], HiCap [6] and ICCap [12].

However, for high frequency circuits and packages, BEM parasitic inductance and impedance extraction still poses a tremendous challenge. Existing algorithms, such as FastImp [9]and FastHenry [5] developed by MIT, are slow and inherently unable to handle multiple dielectrics and magnetic materials.

#### B. Overview of Our Work

In this dissertation, we solve three problems in interconnect modeling for high frequency circuits and packages.

We first tackle the problem of impedance extraction for interconnects with multiple dielectrics. Previous BEM impedance extraction algorithms include FastPep [13] and FastImp [9]. FastPep uses a conductor volume discretized integral formulation combined with model order reduction. FastImp applies a conductor surface integral formulation and precorrected-FFT [8] accelerated iterative method. However, both of these algorithms are kernel dependent assuming uniform dielectric.

For practical problems, dielectrics with different permittivity should be considered. Integrated circuit interconnects are separated by dielectrics with different permittivity varying from 3.0 to 8.0. In packages, conductors typically pass through plastic or ceramic materials with large relative permittivity. The dielectric should be accurately modeled, or it can easily cause up to 20% error [14]. No previous algorithm based on BEM can handle multiple dielectrics. One of the challenges for impedance extraction based on BEM is to find a method applicable to multiple dielectric problems.

We propose the first BEM impedance extraction algorithm for interconnects with multiple dielectrics. We call the algorithm PHiImp (a Preconditioned hierarchical algorithm for Impedance extraction). PHiImp algorithm introduces a circuit formulation which makes it possible to utilize either multilayer Green's function or equivalent charge method to extract impedance in multiple dielectrics. The novelty of the formulation is the reduction of unknowns and the application of hierarchical data structure. The hierarchical data structure permits efficient sparsification transformation and preconditioners to accelerate the linear equation solver.

Our second project focus on the problem of inductance extraction for structures in the presence of magnetic materials. The magnetic materials become common in circuits of MEMS, Radio Frequency Identification (RFID) and magnetoresistive random access memory (MRAM). The existence of magnetic materials could significantly increase the inductance for nearby interconnects and pose a challenge to inductance extraction. In our examples, the inductance increases up to 2X in the presence of nearby magnetic materials.

Most previous work, such as FastHenry [5], cannot deal with magnetic materials. Recently, FastMag [15] was proposed for inductance extraction in the presence of only linear magnetic materials. Ansoft's Maxwell 3D [16] can extract interconnect inductance in the presence of nonlinear magnetic materials. Maxwell 3D uses finite elements and automatic adaptive meshing techniques to compute the electrical and electromagnetic behavior. However, the speed of Maxwell 3D is slow. The increasing adaption of magnetic materials in large complex IC requires fast inductance extraction.

We propose a fast algorithm to extract inductance in the presence of magnetic materials. The new algorithm models the magnetic characteristics by the Landau-Lifshitz-Gilbert (LLG) equation and fictitious magnetic charges. To speed up the algorithm, we apply a number of innovative techniques including the approximation of magnetic charge effects and the modeling of currents with solenoidal basis.

The third problem we investigate is generating and simulating a compact and accurate interconnect circuit model. Since the interconnect circuit model includes mutual inductances between every pair of conductors, the resulting circuit matrix is very dense. As an example, large clock net topologies along with power grid can lead to the number of self inductances of the order of look and mutual inductance of the order of 10G. Hence the SPICE simulation is infeasible due to impractical time and memory requirements. This has been the main bottleneck in the use of the interconnect circuit model.

Recently, a new circuit element K, which is defined as the inverse of inductance, is introduced and is incorporated in a circuit simulation tool KSim [17]. We study the practical issues of the RKC model. We validate the RKC model by circuit simulations of practical examples. The RKC model is very sparse and stable, and accurately captures the inductance effect. Furthermore, we propose an efficient way to achieve high accuracy extraction based on the delay sensitivity analysis. Interconnect R and L/K close to driver should be extracted with high accuracy, while interconnect C close to receiver should be extracted with high accuracy.

#### C. Outline

The major contribution of the dissertation is presenting several novel algorithms that improve the existing parasitic parameter extraction methods in terms of accuracy and running time. The rest of the dissertation is organized as follows.

Chapter II and III give the detailed description of the PHiImp algorithm. In Chapter II, we introduce a novel circuit formulation for both uniform and multiple data structure and efficient preconditioners. Chapter IV and V demonstrate our fast algorithm to extract inductance in the presence of magnetic materials. In chapter IV, we present the magnetic material modeling by the LLG equation and the small signal approximation. Chapter IV introduces the magnetostatic modeling by the fictitious charge method. A compact system is established and accelerated by the solenoidal basis method. Chapter VI presents our study on the practical issues of the *RKC* model. Finally, we summarize our work in Chapter VII.

#### CHAPTER II

#### CIRCUIT FORMULATION FOR IMPEDANCE EXTRACTION

#### A. Introduction

In the area of interconnect structure modeling and analysis, the electromagnetic simulation is increasingly significant, but associated with many challenges. For instance, due to the skin and proximity effect at high frequencies, the impedance becomes frequency-dependent, which leads to problems in the simulation and design. Hence, the electromagnetic simulation tools require accurate and efficient impedance extraction for interconnects across the entire frequency range.

The impedance extraction for a set of conductors is to determine the relationship between potentials and currents at the terminals of conductors. The unknowns in a multi-conductor system are charges on the surfaces and currents in the interiors of conductors. Previous impedance extraction algorithms for 3D structures include FastPep [13] and FastImp [9]. FastPep uses a conductor volume discretized integral formulation combined with model order reduction. FastImp applies a conductor surface integral formulation and precorrected-FFT [8] accelerated iterative method. However, both of these algorithms are kernel dependent assuming uniform dielectric.

For practical problems, dielectrics with different permittivity should be considered. Integrated circuit interconnects are separated by dielectrics with different permittivity varying from 3.0 to 8.0. In packages, conductors typically pass through plastic or ceramic materials with large relative permittivity. The dielectric should be accurately modeled, or it can easily cause up to 20% error [14]. No previous algorithm based on BEM can handle multiple dielectrics. One of the challenges for impedance extraction based on BEM is to find a method applicable to multiple dielectric prob-

lems.

In this chapter, we propose novel circuit formulations for interconnect impedance extraction with both uniform dielectric and multiple dielectrics. Compared with the circuit formulations in existing BEM impedance extraction algorithms [13] [9], our circuit formulation greatly reduces the number of unknowns by eliminating panel currents and node potentials. Besides, since the only unknowns are filament currents, the new formulation makes it possible to use the hierarchical data structure [6].

#### B. Partial Element Equivalent Circuit Model

One well-known approach to generate accurate circuit model for 3-D structures is the partial element equivalent circuit (PEEC) method. PEEC is an efficient modeling methodology for handling complex challenges posed by VLSI circuits and packages through the use of resistors, inductors, and capacitors. In this section, we will briefly review the PEEC formulation [18] [19] which is derived from Maxwell's equations.

The sum of all sources of electric field at any point in a conductor is

$$\mathbf{E}(\mathbf{r}) = \frac{\mathbf{J}_c(\mathbf{r})}{\sigma} + \frac{\partial \mathbf{A}(\mathbf{r})}{\partial t} + \nabla \phi(\mathbf{r}), \tag{2.1}$$

where **E** is the applied electric field,  $\mathbf{J}_c$  is the conductor current density,  $\sigma$  is the conductivity,  $\mathbf{A}$  and  $\phi$  are the vector and scalar potential, respectively. The integral formulation for Maxwell's equation under Laplace transform can be used to compute the conductor current and charge distribution [18]

$$\frac{\mathbf{J}_c(\mathbf{r})}{\sigma} + s \int_V G(\mathbf{r}, \mathbf{r}') \mathbf{J}(\mathbf{r}') dV = -\nabla \phi(\mathbf{r}), \qquad (2.2)$$

$$\int_{S} G(\mathbf{r}, \mathbf{r}') q(\mathbf{r}') dS = \phi(\mathbf{r}), \qquad (2.3)$$

where G is the Green's functions, V and S are the union of conductor volumes and



Fig. 2. Distributed interconnect RLC model

surfaces, respectively, s is the Laplace frequency, q is the total surface charge density including conductor charges and dielectric-dielectric interface charges,  $\mathbf{J}$  is the total current density.

In addition, the current distribution  $\mathbf{J}(\mathbf{r})$  and surface charge distribution  $q(\mathbf{r})$  should obey the conservation equation

$$\nabla \cdot \mathbf{J}(\mathbf{r}) = 0, \tag{2.4}$$

$$\mathbf{n}(\mathbf{r}) \cdot \mathbf{J}(\mathbf{r}) = s \, q(\mathbf{r}), \tag{2.5}$$

where  $\mathbf{n}(\cdot)$  is the outward normal vector on the conductor surface.

The integral equation can be solved by PEEC discretization. To model the charge, the conductor surfaces are covered with panels, each of which holds a constant charge density. To model the current flow, the conductor volumes are divided into filaments, each of which carries a constant current density along the length. It will be assumed that the current density is uniform and constant within a filament but varying from filament to filament. An example for the discretization of a section of

layout is shown in Fig. 2. Based on the discretization, equation (2.2) becomes

$$(R+sL)I_c^f = \phi_n^f, (2.6)$$

where  $I_c^f$  is the vector of filament currents,  $\phi_n^f$  is the vector of potentials over filaments, R is the diagonal matrix of filament resistances and given by

$$R_{ii} = \frac{l_i}{\sigma a_i},\tag{2.7}$$

where  $a_i$ ,  $l_i$  are the cross section area and length of filament i, respectively.

Matrix L is the dense, symmetric and positive definite matrix of partial inductances. There are four popular inductance formulae for computing the diagonal entry of L matrix including Ruehli [20], Grover [21], Hoer [22], and FastHenry [5]. Along with PEEC, Ruehli [20] provides inductance formulae to facilitate PEEC development. Grover [21] uses geometric mean distance (GMS) method and provides greatly simplified formulae for inductance calculation. Hoer [22] provides exact formulae for calculating self-inductance. Inside the FastHenry [5] package, an inductance formula was used for multipole-accerated inductance extraction application. In our work, we use the FastHenry self-inductance formula [5]. The off diagonal entry of L matrix is defined as

$$L_{ij} = \frac{1}{a_i a_j} \int_{V_i} \int_{V_j} \mathbf{I_i} \cdot \mathbf{I_j} G(\mathbf{r}, \mathbf{r}') dV_j dV_i, \qquad (2.8)$$

where  $a_i$ ,  $a_j$  are the cross section area of filament i and j, respectively,  $I_i$  and  $I_j$  are unit currents along the length of filament i and j, respectively.

For the charge, the approximated charge density  $\rho_s(r)$  can be written as

$$\rho_s(\mathbf{r}) = \Sigma v_i(\mathbf{r}) q_i, \quad \mathbf{r} \in S, \tag{2.9}$$

where  $q_i$  is the charge density of panel i and  $v_i(\mathbf{r}) = 1$  if  $\mathbf{r}$  is on panel i and zero



Fig. 3. Partial element equivalent circuit model

otherwise, S is the surface. The filaments are branches in a network circuit graph and the junctions between filaments are the nodes of the network circuits. To enforce equation (2.3), the panels are added by the nodes of the network circuits. Therefore, the relationship between potential and charge can be deducted from equation (2.3). Approximating the average value over the face, by its value at the appropriate node point, the potential becomes

$$Pq_p = \phi_n^p, \tag{2.10}$$

where  $q_p$  is the vector of panel charges,  $\phi_n^p$  is the vector of potentials over panels, P is the potential coefficient matrix and given by

$$P_{ij} = \frac{1}{a_i a_j} \int_{S_i} \int_{S_j} G(\mathbf{r}, \mathbf{r}') dS_j dS_i.$$
 (2.11)

Fig. 3 shows an example of a circuit model describing one conductor divided into three filaments per segment and one panel per node.

#### C. Circuit Formulation

#### 1. Uniform Dielectric Problem

To enforce current conservation in the interior, we first replace panel charges with currents into panels,  $\frac{d}{dt}q_p = I_c^p$ , where  $I_c^p$  is the vector of panel currents, and then state that the sum of currents entering any node equals the sum of the currents leaving that node, equations (2.4-2.6, 2.10) can be represented as a matrix form

$$\begin{bmatrix} R + sL & 0 & -A_e^T \\ 0 & P/s & -A_q^T \\ A_e & A_q & 0 \end{bmatrix} \begin{bmatrix} I_c^f \\ I_c^p \\ \phi_n^e \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ I_t \end{bmatrix}, \tag{2.12}$$

where  $I_t$  denotes the terminal current,  $\phi_n^e$  is the vector of node voltage,  $A_e^T$  is the nodal incidence matrix providing the difference of node voltage and  $A_e$  enforce the boundary condition,  $A_q^T$  replicates the potential at a node to all its corresponding panels and  $A_q$  sums the charges at each node.

In order to apply the hierarchical data structure to the PEEC-based linear system, the original formula needs to be modified as follows

$$\begin{bmatrix} R + sL + A_e^T Y^{-1} A_e & 0 & 0 \\ 0 & P/s & -A_q^T \\ A_e & 0 & Y \end{bmatrix} \begin{bmatrix} I_c^f \\ I_c^p \\ \phi_n^e \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ I_t \end{bmatrix}.$$
 (2.13)

where  $Y = A_q s P^{-1} A_q^T$ .

The linear system (2.13) can be further transformed by eliminating  $I_c^p$ 

$$\begin{bmatrix} R + sL + A_e^T Y^{-1} A_e & 0 \\ A_e & Y \end{bmatrix} \begin{bmatrix} I_c^f \\ \phi_n^e \end{bmatrix} = \begin{bmatrix} A_e^T Y^{-1} I_t \\ I_t \end{bmatrix}.$$
 (2.14)

The part  $Y^{-1}$  in equation (2.14) introduces the inversion of P matrix. Since



Fig. 4. Panel charge approximation

matrix P is dense, calculating the inverse of matrix P will consume lots of CPU time and memory. To avoid calculating the inverse of matrix P, we use matrix  $\widetilde{Y}$  to approximate matrix Y

$$\widetilde{Y} = s\widetilde{P}^{-1},\tag{2.15}$$

$$\widetilde{P} = \frac{1}{n_p^2} A_q P A_q^T, \tag{2.16}$$

where  $n_p$  is the number of panels per node. From the definition of matrix  $A_q$ , we know that the transformation  $A_q P A_q^T / n_p^2$  approximates the effect of all the panels in a node by only one panel per node. Fig. 4 and 5 give an illustration of the transformation. Based on the transformation, equation (2.14) then becomes

$$\left(R + sL + \frac{1}{s}A_e^T \widetilde{P} A_e\right) I_c^f = \frac{1}{s}A_e^T \widetilde{P} I_t,$$
(2.17)

$$(R+sL)I_c^f = A_e^T \phi_n^e. (2.18)$$

The part  $A_e^T \widetilde{P} A_e$  in equation (2.17) maps the capacitive effect to the filaments. Finally, we get the linear system to extract impedance for uniform dielectric problem

$$\left(R + sL + \frac{1}{sn_p^2} A_e^T A_q P A_q^T A_e\right) I_c^f 
= \frac{1}{sn_p^2} A_e^T A_q P A_q^T I_t,$$
(2.19)



Fig. 5. Matrix transformation

$$(R+sL)I_c^f = A_e^T \phi_n^e. (2.20)$$

For a given interconnect structure, after specifying a set of terminal current  $I_t$  as inputs, we get the value of  $I_c^f$  from equation (2.19). The terminal potential can be obtained by calculating the potential drop in each filament from equation (2.20). Consequently, the overall system impedance for the structure can be extracted. From equation (2.19-2.20), it is obvious that when computing the unknown vector  $I_c^f$ , we only need to know the hierarchical data structure of filaments.

Compared to the existing circuit formulations [13] [9], the new formulation greatly reduces both the type and the number of unknowns by eliminating the vector of panel currents and node potentials. Also, it can take advantage of the hierarchical data structure because the only unknowns are filament currents.

## 2. Multiple Dielectric Problem

For multiple dielectric problems, we combine the equivalent charge method [23] with PEEC modeling to deal with equations (2.2-2.4) and the boundary condition of dielectric-dielectric interfaces. Equivalent charge method considers total charges including conductor charges and dielectric interfaces charges. This method has a num-

ber of advantages: there is no limit on the number of dielectric layers that can be dealt with, and it can solve the problems with arbitrarily shaped conductors.

Therefore, the circuit formulation is

$$\begin{bmatrix} R+sL & \mathbf{0} & \mathbf{0} & -A_e^T \\ \mathbf{0} & P_{cc}/s & P_{cd} & -A_q^T \\ \mathbf{0} & E_{dc}/s & E_{dd} & \mathbf{0} \\ A_e & A_q & \mathbf{0} & \mathbf{0} \end{bmatrix} \begin{bmatrix} I_c^f \\ I_c^p \\ q_d \\ \phi_n^e \end{bmatrix} = \begin{bmatrix} \mathbf{0} \\ \mathbf{0} \\ \mathbf{0} \\ I_t \end{bmatrix}.$$
 (2.21)

where  $q_d$  is the vector of dielectric-dielectric interface charge densities,  $P_{cc}$  and  $P_{cd}$  are the dense, symmetric, positive definite matrices of potential coefficient and are defined in equation (2.11),  $E_{dd}$  and  $E_{dc}$  are the dense matrices of electrical field coefficient. The diagonal entry of  $E_{dd}$  is

$$E_{ii} = \frac{(\epsilon_{r1} + \epsilon_{r2})}{2a_i \epsilon_0},\tag{2.22}$$

where  $\epsilon_0$ ,  $\epsilon_{r1}$  and  $\epsilon_{r2}$  are the permittivity of free space, dielectric regions, respectively. The off-diagonal entries of  $E_{dd}$  and the entry of  $E_{dc}$  is

$$E_{ij} = (\epsilon_{r1} - \epsilon_{r2}) \frac{\partial}{\partial n} \frac{1}{a_i a_j} \int_{S_i} \int_{S_j} G(\mathbf{r}, \mathbf{r}') dS_j dS_i.$$
 (2.23)

Note that the problem with uniform dielectric is a special case with the dielectricdielectric interfaces removed.

Let

$$Y = \begin{bmatrix} A_q & \mathbf{0} \end{bmatrix} H^{-1} \begin{bmatrix} A_q^T \\ \mathbf{0} \end{bmatrix}, \tag{2.24}$$

$$H = \begin{bmatrix} P_{cc}/s & P_{cd} \\ E_{dc}/s & E_{dd} \end{bmatrix}, \tag{2.25}$$

the linear system (2.21) can then be transformed by eliminating  $I_c^p$  and  $q_d$ 

$$\begin{bmatrix} R + sL + A_e^T Y^{-1} A_e & \mathbf{0} \\ A_e & Y \end{bmatrix} \begin{bmatrix} I_c^f \\ \phi_n^e \end{bmatrix} = \begin{bmatrix} A_e^T Y^{-1} I_t \\ I_t \end{bmatrix}. \tag{2.26}$$

Using block matrix decompositions [24],  $H^{-1}$  can be expressed as

$$H^{-1} = \begin{bmatrix} s(P_{cc} - P_{cd}E_{dd}^{-1}E_{dc})^{-1} & sP_{cc}^{-1}P_{cd}(P_t - E_{dd})^{-1} \\ (P_t - E_{dd})^{-1}E_{dc}P_{cc}^{-1} & (E_{dd} - P_t)^{-1} \end{bmatrix},$$
(2.27)

where  $P_t = E_{dc} P_{cc}^{-1} P_{cd}$ . Accordingly, matrix Y becomes

$$Y = sA_q(P_{cc} - P_{cd}E_{dd}^{-1}E_{dc})^{-1}A_q^T. (2.28)$$

Because  $E_{dd}$  is strongly diagonally dominant, we use the diagonal sparse approximation to avoid computing the inversion of matrix  $E_{dd}$ . Only the diagonal entries of matrix  $E_{dd}$  are kept and the approximation of  $E_{dd}^{-1}$  is denoted by  $E_t$ .

Hence, a new circuit formulation applied for impedance extraction with multiple dielectrics is proposed as

$$\left[R + sL + \frac{1}{sn_p^2} A_e^T A_q (P_{cc} - P_{cd} E_t E_{dc}) A_q^T A_e\right] I_c^f 
= \frac{1}{sn_p^2} A_e^T A_q (P_{cc} - P_{cd} E_t E_{dc}) A_q^T I_t,$$
(2.29)

$$(R+sL)I_c^f = A_e^T \phi_n^e. (2.30)$$

The proposed circuit formulation (2.29-2.30) is kernel independent because it treats Green's function as a black box, which is reflected in matrices  $P_{cc}$ ,  $P_{cd}$ ,  $E_{dc}$  and  $E_{dd}$ . For impedance extraction with multiple dielectrics, we can either use equations (2.19-2.20) and multilayer Green's function, or equations (2.29-2.30) and equivalent charge method. Either way is effective.

#### CHAPTER III

#### PHIIMP ALGORITHM

#### A. Introduction

Recently, we made an advance in BEM by proposing a kernel independent hierarchical data structure [6] and an orthogonal transformation [7] to convert the dense linear system for the capacitance extraction problem into a sparse linear system. The sparsity offers several benefits including fast matrix-vector multiplication and efficient preconditioning.

In this chapter, we successfully extend the techniques to impedance extraction and develop PHiImp, which can either use equations (2.19-2.20) and multilayer Green's function, or equations (2.29-2.30) and equivalent charge method. Either way is effective. The comparison of the existing BEM algorithms for parasitic extraction and PHiImp algorithm is shown in Table I.

Table I. Comparison of PHiImp and other existing algorithms

|                      | FastCap | FastHenry | FastPep | FastImp | PHiImp    |
|----------------------|---------|-----------|---------|---------|-----------|
|                      | PHiCap  |           |         |         |           |
| Capacitance          | Yes     | No        | Yes     | Yes     | Yes       |
| Inductance           | No      | Yes       | Yes     | Yes     | Yes       |
| Impedance            | No      | No        | Yes     | Yes     | Yes       |
| Multiple Dielectrics | Yes     | N/A       | No      | No      | Yes       |
| Speed                | -       | -         | Average | Fast    | Very Fast |

There are several key features of the new algorithm. First, the algorithm uses

a novel equivalent-charge formulation to model the extraction problem with significantly fewer unknowns. Second, we utilize the fast hierarchical method. It has been shown that the fast hierarchical algorithm outperforms the multipole accelerated algorithm FastCap [4] in capacitance extraction. Third, we successfully use the sparsification transformation and efficient preconditioners which are originally introduced in capacitance extraction [7] [12]. The proposed preconditioners greatly reduce the number of iterations in solving the linear system. All these contribute to the significant improvement of PHiImp algorithm.

Experimental results demonstrate that PHiImp algorithm is accurate and efficient. For uniform dielectric problems, our algorithm is more accurate than FastImp while its number of unknowns is ten times less than that of FastImp. For multiple dielectric problems, its relative error with respect to HFSS is below 3%.

#### B. Hierarchical Data Structure

Chapter II introduces the circuit formulation (2.19-2.20) and (2.29-2.30) for impedance extraction with both uniform dielectric and multiple dielectrics. To efficiently solve equations (2.19-2.20) and (2.29-2.30), we apply the fast hierarchical data structure [6] invented for capacitance extraction. However, extending it to impedance extraction is not an easy task. The key issue lies in how to successfully implement the hierarchical partition scheme of conductor surfaces and interiors separately and apply it to the multiple tree data structure. The hierarchical partition scheme of conductor surfaces into panels should be consistent with the partition scheme of the conductor interior into filaments which make it possible to record the interaction link.

As shown in equations (2.19-2.20) and (2.29-2.30), the filament current vector  $I_c^f$  is sufficient to determine the impedance. To speed up, we can first apply the hi-

erarchical partition scheme to divide the conductors into filaments. It constructs the hierarchical date structure during the discretization. The hierarchical data structure consists of two parts: the partition hierarchy and the interaction links between entities of increments. Most of direct calculations of interactions are avoided since the interactions can be obtained recursively from small-scale interactions according to the hierarchical record. The solved linear equations in the fast hierarchical algorithm can be iteratively obtained by using GMRES, in which the most computational intensive procedure, the matrix-vector product, is reduced from  $O(n^2)$  to O(n) by using the hierarchical data structure, where n is the total number of filaments.

We know that the mutual inductance between two filaments decreases when the distance between the two filaments increases. For a prescribed precision criterion, the mutual inductance can be neglected without any meaningful impact on the accuracy when two filaments are sufficiently far apart. Therefore, the hierarchical partition scheme can be described in pseudo code as follows

```
Refine(filament i, filament j) {
    R = max(Ri, Rj);
    If (R < base-metric)
        RecordIteractionLink(i, j);
    else if (R/r <= P_{eps})
        RecordIteractionLink(i, j);
        else if (Ri > Rj) {
            Subdivide(i);
            Refine(i.left, j);
            Refine(i.right, j);
        }
}
```

In the above pseudo code, Ri and Rj are the radius of the smallest sphere containing the cross section of filament i and j, respectively. r is the distance between filament i and j. Parameter base-metric is employed to terminate the refinement if both the cross sections of filaments i and j are small enough.  $P_{eps}$  is user defined error bound. The procedure Subdivide() divides a filament along the center axis into two sub-filaments, which are denoted by left and right, respectively. Procedure left RecordIteractionLink() estimates the mutual inductance and effective potential drop across filaments due to the panel capacitance. The hierarchical partition scheme of conductor surfaces into panels should be consistent with the partition scheme of the conductor interior into filaments.

The parameter  $P_{eps}$  is utilized to specify the termination criterion for refinement. The interaction link between filaments i and j will be recorded in the hierarchical data structure and the filaments will not be refined further when the ratio R/r equals to or less than the parameter  $P_{eps}$ . The termination criterion places an upper bound on the error [6]. With the decreasing of  $P_{eps}$ , the extracted impedance values will converge. By recursively invoking this refine subroutine, the original conductors will be divided into filaments in a hierarchical manner.

Fig. 6 shows the recursive refinement that subdivides two conductors into a hierarchical data structure of filaments. Starting with two conductors A and B,



Fig. 6. Recursive refinement process

suppose the estimated ratio R/r between A and B is larger than  $P_{eps}$ . Then A can be divided into  $C \cup D$  and B into  $E \cup F$ . Suppose the estimated ratio between CF, DE and DF are less than  $P_{eps}$ , but estimated ratio between CE is still greater than  $P_{eps}$ . We record the interaction CF, DE and DF at this level while further subdividing filaments C and E. This recursive refinement procedure that subdivides the filaments will continue until the estimated ratio between them is less than the user-setting error bound  $P_{eps}$ . PHiImp provides continuous tradeoff of time with precision by changing  $P_{eps}$ . In this hierarchy, the original conductors A and B are assigned a level of zero. At each subdivision, the level of a newly-born sub-filament increases by one from its parent filament. Hence, C, D, E and F are at level 1, and G, H, I and J are at level 2. The mergence procedure proceeds as follows:

- 1) Begin with the highest level, i.e., level 2: The entries corresponding to G, H, I and J are denoted by  $L_G$ ,  $L_H$ ,  $L_I$  and  $L_J$ .
- 2) At level 1, the entry corresponding C, D, E and F is denoted as  $L_C$ ,  $L_D$ ,  $L_E$  and  $L_F$ ; Since C is the parent node of G and H, E is the parent node of I and J,  $L_C$  and  $L_E$  can be calculated:  $L_C = L_G + L_H$  and  $L_E = L_I + L_J$ .
- 3) At level 0:  $L_A = L_C + L_D = (L_G + L_H) + L_D$ ,  $L_B = L_E + L_F = (L_I + L_J) + L_F$ .

In the above mergence procedure, L denotes the estimation the mutual induc-

tance and effective potential drop across filaments due to the panel capacitance. Hence, we combine the partition scheme of conductor surfaces and interiors together and store the hierarchical discretization in a multiple tree structure. The filaments are stored as nodes in the tree, and the block coefficient entries are stored as links between the nodes. Each tree belongs to a conductor. It is important to note that the leaf nodes are mutually disjoint and the union of them covers each conductor completely. The benefit of using multiple tree structure is that the number of interaction in the multiple tree structure is O(n).

#### C. Efficient Preconditioners

In this subsection, we use an orthogonal transformation to convert the dense linear system for impedance extraction to a sparse linear system.

The transformation matrix is based on the characteristic of multiple tree structure which represents dividing the original conductors into filaments in a hierarchical manner. The sparsity offers several benefits including very fast matrix-vector multiplication, and efficient preconditioning through incomplete decomposition. The proposed preconditioners help reducing the number of iterations in the GMRES method up to ten times less than that of the original method without preconditioners. The residual norm decreases rapidly for preconditioned algorithm. In contrast, the decrease of the un-preconditioned GMRES method is very slow.

Let

$$M = R + sL + \frac{1}{sn_p^2} A_e^T A_q (P_{cc} - P_{cd} E_t E_{dc}) A_q^T A_e,$$
$$v = \frac{1}{sn_p^2} A_e^T A_q (P_{cc} - P_{cd} E_t E_{dc}) A_q^T I_t,$$

the linear system arising from equation (2.29-2.30) has the form

$$MI_c^f = v. (3.1)$$

Construct the J matrix and J' matrix to represent the hierarchical refinement from all filaments to the leaf filaments. Each row of the structure matrix J corresponds to a filament, either leaf or non-leaf, and each column corresponds to a leaf filament. The entry (i,j) of J matrix is 1 if filament i contains the leaf filament j, and 0 otherwise [7]. In the column of the structure matrix J' that corresponds to a basis filament i, each entry (i,j) of J' matrix is 1 if filament j contains the right hand leaf filament i, -1 if the parent of filament j contains the right-hand side filament i, and 0 otherwise [12]. Fig. 7 shows an example of constructing the J and J' matrix for a tree of height 3. E is an elementary transformation matrix expressed by J' = JE. Based on the characteristics of multiple tree structure, matrix E can be easily constructed from the hierarchical refinement.

In the example shown in Fig. 6, considering the rows and columns representing the leaf nodes, such as "D, F, G, H, I, J" in Fig. 6, matrix E is

$$E = \begin{bmatrix} 1 & 0 & 0 & -1 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & -1 \\ 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & -1 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & -1 & 1 \end{bmatrix}.$$

$$(3.2)$$

Based on matrix E, we can construct a new system matrix  $\tilde{M}$  by applying orthogonal transformation  $\tilde{M} = E^T M E$ . Here matrix M is constructed by  $J^T H J$  where H is a sparse matrix with each nonzero entry represents a link between the corre-



Fig. 7. J and J' matrices for a tree of height 3

sponding filaments in the hierarchical data structure. The details of how to construct matrix H can be found in [7].

Substituting  $\tilde{M}=E^TME$  into  $MI_c^f=v$ , the dense linear system is transformed to a sparse system

$$\tilde{M}\tilde{I}_c^f = \tilde{v},\tag{3.3}$$

where  $\tilde{v} = E^T v$ . The value of  $I_c^f$  can be obtained through

$$I_c^f = E\tilde{I}_c^f. (3.4)$$

After orthogonal transformation of the linear system, we apply matrix reordering and incomplete LU factorization [24] to compute the preconditioners. Incomplete LU factorization requires no fill-ins hence no extra memory and CPU time in computing the LU decomposed preconditioners, and its preconditioning decreases the number of



Fig. 8. Sparse transformation

iterations dramatically. Finally, the preconditioned GMRES method is used to solve the system. The outline of the new algorithm is shown in Fig. 8 and Fig. 9.

Experimental results show that the proposed preconditioners work very efficient. Fig. 10 shows the number of GMRES iterations needed to extract the impedance of a transmission line at different frequency points. The preconditioners help reducing the number of iterations up to an order of magnitude less than that of the original method without preconditioners, and keep the number of its iterations almost constant at different frequency sample points. Fig. 11 demonstrates the effect of preconditioners on the rate of convergence. The residual norm decreases rapidly for the preconditioned algorithm. In contrast, the decrease of the un-preconditioned GMRES method is very slow. Fig. 12 shows the effect of preconditioner on the number of GMRES iterations with different number of unknowns. As illustrated in Fig. 12, the preconditioned algorithm is more than an order of magnitude less than the standard GMRES method in the number of iterations. Fig. 13 shows CPU time comparison. As illustrated in



Fig. 9. Preconditioners construction



Fig. 10. Number of GMRES iterations



Fig. 11. Rate of convergence



Fig. 12. Number of GMRES iterations



Fig. 13. CPU time

Fig. 13, the preconditioned algorithm is more than ten times faster than the standard GMRES method without preconditioners.

# D. Experimental Results

To demonstrate the accuracy and efficiency of PHiImp, several classes of test are implemented. All the experiments were performed on the same computer with 3.2 GHz CPU and 1 GB memory.

# 1. Uniform Dielectric Cases

In the first test, consider a transmission line with two parallel conductors, shorted at the far end which is shown in Fig. 14 and Fig. 15. The length of each conductor is 1 cm. The cross section of each conductor is  $30 \ \mu m \times 30 \ \mu m$  and the space between two conductors is  $50 \ \mu m$ . Fig. 16 shows its impedance at different frequency points. The theoretical resonance frequencies should be 7.5 GHz and 22.5 GHz. PHiImp matches



Fig. 14. Geometry of a transmission line



Fig. 15. Cross sectional view of a transmission line

better than FastImp while its number of unknowns is ten times less than that of FastImp. This is mainly because PHiImp eliminates the vector of panel currents and node potentials. The vector of filament currents is sufficient to determine the impedance in PHiImp. FastPep uses more variables, and thus is slower than FastImp. The number of unknowns in PHiImp is significantly less than FastImp and FastPep.

The most important advantage of PHiImp is its super fast running speed and much less memory usage. The average running time of our algorithm is only 2.98 seconds for each sampling frequency point, whereas FastImp takes 132.02 seconds to get the same accurate results. The memory usage of PHiImp is 1.9 while that of FastImp is 47.3. Its speed is more than 50 times faster and its memory usage is 25 times less than FastImp.

In the second test, we consider a rectangular spiral inductor with different number of full turns. The width, spacing and thickness of the rectangular spiral are 1  $\mu m$ , 1  $\mu m$ , respectively. The inner radius of the rectangular spiral is 5  $\mu m$ .

The frequency point is 1 GHz. Table II gives the comparison between FastImp and PHiImp. Let the impedance computed by FastImp (higher accuracy) be Z which is a complex number and the impedance computed by another program be Z'. The relative error is computed in the Frobenius norm ||Z - Z'|| / ||Z||. With respect to FastImp (higher accuracy) in which the number of unknowns is ten times more than that of PHiImp (Peps=0.01), the relative error of PHiImp is below 2%, while that of FastImp with the same number of unknowns is about 5%. The running time of PHiImp is about 30-100 times less than FastImp to get the same accuracy. Hence, PHiImp is computationally more efficient.

Table II also shows the performance comparison of our algorithm with different setting of Peps. The number of unknown and running time will increase as Peps decreases. However, by doing this, the results will be more accurate. Since Peps gives an asymptotic error bound, similar to the expansion order of FastCap and running time/variance of QuickCap in capacitance extraction tools, Peps does not translate directly to the accuracy of the computed impedance. The relationship between Peps and the error of impedance can only be measured for an actual implementation using a reference. For the current implementation and with FastImp as the reference, 0.01 is the default value.

Fig. 17 shows the number of GMRES iterations with different number of unknowns in FastImp and PHiImp. Even with the same number of unknowns, its number of iterations is about half of that in FastImp. This figure demonstrates that the preconditioners in PHiImp greatly reduce the number of iterations in solving the linear system. All these contribute to the significant improvement of PHiImp.



Fig. 16. Impedance (Ohm) calculated by FastImp and PHiImp



Fig. 17. Rate of convergence

# 2. Multiple Dielectric Cases

Since FastImp can not handle multiple dielectrics, we use High Frequency Structure Simulator [25] (HFSS) from AnSoft to verify the accuracy of PHiImp. HFSS uses FEM.

In the first test, consider a transmission line with the length of each conductor being 1 cm. The cross section of each conductor is 1  $\mu m \times 1 \mu m$  and the space between two conductor is 2  $\mu m$ . The dielectric surrounding the conductors has relative permittivity  $\epsilon_r$ . The size of the dielectric is scaled to 10 cm  $\times$  10 cm  $\times$  10 cm. The frequency is 10 GHz. Table III shows the comparison between HFSS and our algorithm. The maximum error of PHiImp compared to HFSS is 3%, which is acceptable in practice. The average number of its iterations is 23 and its running time is 3.62 seconds. The impedance for this case computed by FastImp is 51.1476 ohm since FastImp assumes uniform dielectric. Hence, there is huge difference between impedance of conductors embedded in uniform dielectric and that embedded in multiple dielectrics. This demonstrates the importance of considering the effect of multiple dielectrics for impedance extraction.

In the second test, we extract the impedance of on-chip spiral inductor at different frequency points. Fig. 18 shows the layout of on chip spiral inductor. The width, spacing and thickness of the rectangular spiral are 1  $\mu m$ , 1  $\mu m$ , 1  $\mu m$ , respectively. The medium surrounding the upper layer has relative permittivity 4.2 and the medium surrounding the lower layer has relative permittivity 3.6. The spiral inductor is in the interface between two layers. The impedance at different frequency points computed by HFSS and PHiImp are shown in Fig. 19. The average number of iterations and running time of PHiImp are 24 and 4.87 seconds, respectively. The relative error of PHiImp with respect to HFSS is below 3%. Note that the impedance in FastImp



Fig. 18. Spiral inductor's top view (a) and cross sectional view (b)

is computed in uniform dielectric for emphasizing the effect of multiple dielectrics. Fig. 19 demonstrates that PHiImp perform accurate impedance extraction of 3-D general structures across wide frequency range.



Fig. 19. Impedance (Ohm) calculated by HFSS, FastImp and PHiImp

Table II. Comparison of Impedance (Ohm) for spiral inductor with uniform dielectric, error is with respect to FastImp (higher accuracy)

|            | FastImp        | FastImp           | PHiImp                      | PHiImp                      |  |  |  |
|------------|----------------|-------------------|-----------------------------|-----------------------------|--|--|--|
|            |                | (higher accuracy) | $(P_{\texttt{eps}} = 0.02)$ | $(P_{\texttt{eps}} = 0.01)$ |  |  |  |
| two-turn   |                |                   |                             |                             |  |  |  |
| unknowns   | 1010           | 15554             | 280                         | 720                         |  |  |  |
| iterations | 40             | 92                | 9                           | 17                          |  |  |  |
| impedance  | 1.730+0.570j   | 1.755 + 0.448j    | 1.732+0.501j                | 1.732+0.469j                |  |  |  |
| error      | 6.9~%          | -                 | 3.2%                        | 1.7~%                       |  |  |  |
| CPU (sec)  | 5.92           | 157.07            | 0.96                        | 1.48                        |  |  |  |
| three-turn |                |                   |                             |                             |  |  |  |
| unknowns   | 1778           | 25154             | 396                         | 1080                        |  |  |  |
| iterations | 44             | 111               | 11                          | 21                          |  |  |  |
| impedance  | 3.075+1.195j   | 3.088 + 1.109j    | 3.079+1.181j                | 3.080+1.155j                |  |  |  |
| error      | 2.7~%          | -                 | 2.2~%                       | 1.1 %                       |  |  |  |
| CPU (sec)  | 8.41           | 184.28            | 1.02                        | 2.79                        |  |  |  |
| four-turn  |                |                   |                             |                             |  |  |  |
| unknowns   | 2706           | 36290             | 860                         | 2560                        |  |  |  |
| iterations | 46             | 118               | 18                          | 24                          |  |  |  |
| impedance  | 4.618+2.108j   | 4.790 + 1.799j    | 4.672+1.972j                | 4.695+1.814j                |  |  |  |
| error      | 6.9~%          | -                 | 4.1~%                       | 1.9 %                       |  |  |  |
| CPU (sec)  | 13.07          | 202.65            | 1.96                        | 4.40                        |  |  |  |
| five-turn  |                |                   |                             |                             |  |  |  |
| unknowns   | 5462           | 47426             | 1284                        | 5120                        |  |  |  |
| iterations | 50             | 123               | 21                          | 28                          |  |  |  |
| impedance  | 6.347 + 3.352j | 6.482 + 3.085 j   | 6.390+3.301j                | 6.421+3.175j                |  |  |  |
| error      | 4.2~%          | -                 | 3.2~%                       | 1.2~%                       |  |  |  |
| CPU (sec)  | 20.35          | 239.78            | 3.08                        | 7.28                        |  |  |  |
|            |                |                   |                             |                             |  |  |  |

Table III. Comparison of Impedance (Ohm) for a transmission line with multiple dielectrics, error is with respect to HFSS

| $\epsilon_r$ | HFSS    | PHiImp  | Relative Error |
|--------------|---------|---------|----------------|
| 3            | 28.1031 | 27.8476 | 0.9~%          |
| 5            | 22.5185 | 22.1815 | 1.2~%          |
| 7            | 19.2886 | 18.8431 | 2.3~%          |
|              |         |         |                |

# CHAPTER IV

# MAGNETIC MATERIAL MODELING

#### A. Introduction

Nowadays, magnetic materials become common in circuits of MEMS, RFID and MRAM. MRAM is a new technology that provides fast, non-volatile, low power and high density memory [26]. IBM recently published a 16Mb MRAM [27] that has fast reading-access/writing-cycle time and small cell area as static random access memory (SRAM), and significantly faster than flash RAM. Table IV compares expected MRAM features with other memory technologies. It is possible that MRAM will replace flash RAM as the dominating technology for non-volatile memory in the near future.

Table IV. Comparison between MRAM and other memory technologies

|                    | MRAM     | SRAM     | DRAM     | Flash   |
|--------------------|----------|----------|----------|---------|
| Read Speed         | Fast     | Fastest  | Medium   | Fast    |
| Write Speed        | Fast     | Fastest  | Medium   | Low     |
| Array Efficiency   | Med/High | High     | High     | Low/Med |
| Future Scalability | Good     | Good     | Limited  | Limited |
| Cell Density       | Med/High | Low      | High     | Medium  |
| Non-Volatility     | Yes      | No       | No       | Yes     |
| Endurance          | Infinite | Infinite | Infinite | Limited |
| Cell Leakage       | Low      | Low/High | High     | Low     |
| Low Voltage        | Yes      | Yes      | Limited  | Limited |
| Complexity         | Medium   | Low      | Medium   | Medium  |

The basic structure of MRAM is illustrated in Fig. 20, where an array of magnetic junction transistor (MJT) are used as storage devices. An MJT consists of a free layer of "soft" nonlinear magnetic materials that can be programmed to change its magnetic orientation, and a fixed layer of "hard" magnetic materials that can not change its magnetic orientation. The long axis of the free layer is oriented parallel to the uniaxial anisotropy magnetic orientation of the fixed layer, resulting in the magnetic orientation of the free layer in two stable states: in the same direction as the fixed layer (parallel) or in the opposite direction (anti-parallel). MJT is placed near the top metal layers of CMOS circuits.

A schematic cross-sectional view of a integrated MRAM cell for a 1T-1MTJ cell [28] is shown in Fig. 21. The MRAM process module is integrated between the last two layers of metal in an otherwise standard semiconductor process flow. The MRAM is termed a "back-end" module because it is inserted after all of the associated CMOS circuitry has been fabricated [28]. This integration scheme requires no alteration to the front-end CMOS process flow. The back-end approach separates the specialized magnetic materials processing from the standard CMOS process. Fig. 22 shows the micrograph of the MRAM cell [28]. MRAM module is inserted between metal layer 4 and metal layer 5. The latest racetrack technology developed by IBM [29] is different from the MJT technology, but the racetrack technology also uses nonlinear magnetic wires. Therefore, modeling the magnetic materials is equally important.

In this chapter, we present the magnetic material modeling. Starting from analyzing the hysteresis loop of magnetic field and magnetization, we model the magnetic characteristics by the Landau-Lifshitz-Gilbert (LLG) equation. The LLG equation describes the magnetic behavior including gyromagnetic switching processes with damping and is widely used in the field of micromagnetics. It can be solved by small signal approximation to obtain the effective relative permeability which presents the



Fig. 20. MRAM structure



Fig. 21. Schematic view of a integrated MRAM cell

relationship between magnetization and magnetic field.

# B. Relationship between Magnetic Field and Magnetization

In electromagnetism, magnetic characteristics of a material can be represented by the relative permeability  $\mu_r$ , which is defined as

$$\mathbf{B} = \mu_0 \mu_r \mathbf{H},\tag{4.1}$$



Fig. 22. Micrograph of a integrated MRAM cell

$$\mathbf{M} = (\mu_r - 1)\mathbf{H},\tag{4.2}$$

where  $\mu_0$  is the permeability of free space, **B** is the magnetic flux, **H** is the magnetic field and **M** is the magnetization of the material.

Fig. 23 shows the relationship between **H** and **M**. For linear magnetic materials, **H** and **M** relationship is a straight line with constant fixed slope implying that  $\mu_r$  is constant.

For nonlinear magnetic materials, **H** and **M** relationship is a hysteresis loop of varying slope implying that  $\mu_r$  is a function of magnetic field and dependent on the history of magnetic field.

# C. Landau Lifshitz Gilbert Equation

The loop is different if  $\mathbf{H}$  is at a different frequency. The whole picture between  $\mathbf{H}$  and  $\mathbf{M}$  is given by LLG equation. Therefore, we use the LLG equation to compute the relative permeability  $\mu_r$  of nonlinear magnetic materials. LLG equation describes



Fig. 23. Magnetic hysteresis loop

the magnetic behavior of a grain in magnetic field including gyromagnetic switching processes with damping

$$\frac{d\mathbf{M}}{dt} = \gamma \mathbf{H} \times \mathbf{M} - \frac{\alpha}{M_s} \mathbf{M} \times \frac{d\mathbf{M}}{dt}, \tag{4.3}$$

where  $\gamma$  is the gyromagnetic ratio,  $M_s$  is the saturation magnetization and  $\alpha$  is a dimensionless damping factor. Fig. 24 gives a better understanding of LLG equation. The first term on the right hand side of LLG equation corresponds to the torque produced by a field  $\mathbf{H} \times \mathbf{M}$  as "rotational force". The second term corresponds to the torque produced by a field  $\mathbf{M} \times \frac{d\mathbf{M}}{dt}$  as "centripetal force".

LLG equation is widely used to model the switching process of magnetization in both industry and academia [30] [31] [32]. Given a magnetic field in the desired



Fig. 24. Gyromagnetic switching processes with damping

final direction, the grain loses energy and the magnetization will align to the applied magnetic field after a number of precession field [33].

#### D. Frequency Dependent Relative Permeability

The magnetic field in equation (4.3) can be expressed as

$$\mathbf{H} = \mathbf{H_0} + \mathbf{h},\tag{4.4}$$

where  $\mathbf{H_0}$  is the static magnetic field,  $\mathbf{h}$  is the magnetic field perturbation. Similarly, the magnetization is

$$\mathbf{M} = \mathbf{M_0} + \mathbf{m},\tag{4.5}$$

where  $\mathbf{M_0}$  is the static magnetization,  $\mathbf{m}$  is the magnetization perturbation. As shown in Fig. 23, the first derivative of the  $\mathbf{H}\text{-}\mathbf{M}$  function is continuous. Assume  $\mathbf{H_0}$  and  $\mathbf{M_0}$  is along the z-axis, in component form the above equations are

$$\mathbf{H} = h_x \mathbf{x} + h_y \mathbf{y} + (H_0 + h_z) \mathbf{z},$$

$$\mathbf{M} = m_x \mathbf{x} + m_y \mathbf{y} + (M_0 + m_z) \mathbf{z},$$
(4.6)

where  $\mathbf{x}, \mathbf{y}, \mathbf{z}$  are unit vectors along x, y and z axes. Equation (4.3) can now be expanded to give

$$\frac{dm_x}{dt} = m_y \gamma H_0 + \alpha \frac{dm_y}{dt} + m_y \gamma h_z - h_y \gamma (M_0 + m_z),$$

$$\frac{dm_y}{dt} = -m_x \gamma H_0 - \alpha \frac{dm_x}{dt} - m_x \gamma h_z + h_x \gamma (M_0 + m_z),$$

$$\frac{dm_z}{dt} = m_x \gamma h_y - m_y \gamma h_x.$$
(4.7)

In the small signal approximation higher order terms of m and h are set equal to zero. The small signal approximation is therefore

$$\frac{dm_x}{dt} = m_y \gamma H_0 + \alpha \frac{dm_y}{dt} - h_y \gamma M_0, 
\frac{dm_y}{dt} = -m_x \gamma H_0 - \alpha \frac{dm_x}{dt} + h_x \gamma M_0, 
\frac{dm_z}{dt} \approx 0.$$
(4.8)

The second order differential of equation (4.8) is

$$\ddot{m}_x = \dot{m}_y \gamma H_0 + \alpha \ddot{m}_y - \dot{h}_y r M_0,$$

$$\ddot{m}_y = -\dot{m}_x \gamma H_0 - \alpha \ddot{m}_x + \dot{h}_x r M_0,$$

$$m_z \approx 0.$$
(4.9)

Rewriting equation (4.9) we have

$$(1 + \alpha^{2})\ddot{m}_{x} + 2\gamma H_{0}\alpha \dot{m}_{x} + (\gamma H_{0})^{2} m_{x}$$

$$= \gamma^{2} H_{0} M_{0} h_{x} + r M_{0}\alpha \dot{h}_{x} - \gamma M_{0} \dot{h}_{y},$$

$$(1 + \alpha^{2})\ddot{m}_{y} + 2\gamma H_{0}\alpha \dot{m}_{y} + (\gamma H_{0})^{2} m_{y}$$

$$= \gamma^{2} H_{0} M_{0} h_{y} + r M_{0}\alpha \dot{h}_{y} + \gamma M_{0} \dot{h}_{x},$$

$$m_{z} \approx 0.$$

$$(4.10)$$

If the time dependence of the **m** and **h** quantities is of the form  $exp(j\omega t)$ , the relative permeability  $\mu_r$  can be defined which relates the magnetization **m** to the magnetic

field **h**:  $\mathbf{m} = \mu_0(\mu_r - 1)\mathbf{h}$ . Therefore, the relative permeability can be obtained by solving the LLG equation with small signal approximation [34] [35]

$$\mu_r = 1 + \frac{\gamma M_0 (\gamma H_0 + j\alpha\omega)}{\mu_0 [(\gamma H_0 + j\alpha\omega)^2 - \omega^2]},$$
(4.11)

where  $H_0$  is the magnitude of  $\mathbf{H_0}$ ,  $M_0$  is the project magnitude of  $\mathbf{M_0}$  in the direction of  $\mathbf{H_0}$ ,  $\mu_0$  is permeability of vacuum,  $\omega = 2\pi f$  and f is the frequency. The detailed derivation can be found in [34].

Equation (4.11) shows the frequency dependent  $\mu_r$  is a function of magnetic field and magnetization. The real and imaginary part of relative permeability across wide frequency range for different values of  $H_0$  are shown in Fig. 25 and 26, respectively. The resonance frequency drifts towards higher frequency as magnetic field increases. Fig. 27 and 28 demonstrate the tendency of relative permeability according to the change of magnetization. Similar to the results shown in Fig. 25 and 26, relative permeability varies significantly at different frequency point. The relative permeability changes rapidly when the frequency approaches the resonance frequency. For this particular example, the ratio could be as large as 2. It is now easy to see that the traditional method [15] [36] modeling linear magnetic material with assuming a constant  $\mu_r$  does not work for nonlinear magnetic material analysis.



Fig. 25. Effect of the magnetic field on the real part of  $\mu_r$ 



Fig. 26. Effect of the magnetic field on the imaginary part of  $\mu_r$ 



Fig. 27. Effect of the magnetization on the real part of  $\mu_r$ 



Fig. 28. Effect of the magnetization on the imaginary part of  $\mu_r$ 

# CHAPTER V

# INDUCTANCE EXTRACTION WITH MAGNETIC MATERIALS

#### A. Introduction

Inductance extraction is the process of computing the complex frequency dependent impedance matrix of multi-conductors, under the magnetoquasistatic approximation assuming that there is no charge accumulation on the surface. Fast and accurate inductance extraction is important for timing verification and signal integrity analysis of VLSI circuits, packages, multi-chip modules, and printed circuit boards. As the need for modeling large and complex structures increases, developing efficient inductance extraction algorithms is of great practical importance for the emerging marketplace.

As described in Chapter IV, the magnetic materials become common in circuits of MEMS, RFID and MRAM. We found that the existence of magnetic materials could significantly increase the inductance for nearby interconnects and pose a challenge to interconnect inductance extraction. In our examples, the inductance increases up to 2X in the presence of nearby magnetic materials. Most previous work, such as FastHenry [5], cannot deal with magnetic materials. Recently, FastMag [15] was proposed for inductance extraction in the presence of only linear magnetic materials. Ansoft's Maxwell 3D [16] can extract interconnect inductance in the presence of nonlinear magnetic materials. Maxwell 3D uses finite elements and automatic adaptive meshing techniques to compute the electrical and electromagnetic behavior. However, the speed of Maxwell 3D is slow. The increasing adaption of magnetic materials in large complex IC requires fast inductance extraction.

This chapter presents a fast algorithm to extract inductance in the presence of magnetic materials. Base on the effective relative permeability introduced in Chapter IV, we model the nonhomogeneous magnetic characteristics by the fictitious magnetic charge method and the equivalent charge method.

The system is solved iteratively. We start with an initial permeability. In each iteration, we update the relative permeability based on the changes of conductor currents and magnetic charges. The iterative procedure will stop until the relative permeability difference between consecutive iterations becomes less then a user defined bound. To speed up the algorithm, we apply a number of innovative techniques. A reduced system is established based on the magnetic charge effect approximation, and then get accelerated by solenoidal basis method. All these contribute to the high efficiency of our algorithm.

Experimental results give the comparison between Maxwell 3D and the new algorithm for several test cases. The relative error of the new algorithm with respect to Maxwell 3D is below 3%, while its running time is 1.8X to 12.9X less than that of Maxwell 3D.

### B. Inductance Extraction Formulation

For a system with n terminal pairs, let  $Z(\omega) \in C^{n \times n}$  denote the impedance matrix at frequency  $\omega$ . Then,

$$Z(\omega)I_t(\omega) = V_t(\omega), \tag{5.1}$$

where  $I_t, V_t \in C^n$  are the vectors of terminal currents and voltages, respectively. Several integral equation based approaches have been used to derive the  $Z(\omega)$  associated with a given package or interconnect structures [37]. The integral formulations are derived by assuming sinusoidal steady state and then applying the magnetoquasistatic

assumption that the displacement current  $\epsilon \omega \mathbf{E}$  (where  $\epsilon$  is the permittivity and  $\mathbf{E}$  is the electric field) is negligible. Given this, the vector potential  $\mathbf{A}$ , can be related to the resistive current  $\mathbf{J}$ , by

$$\mathbf{A}(\mathbf{r}) = \frac{\mu}{4\pi} \int_{V'} \frac{\mathbf{J}(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|} dV', \tag{5.2}$$

where  $\mu$  is the permeability and V' is the volume of all conductors. The vector potential **A** also satisfies the constraint

$$\nabla \times \mathbf{A} = \mu \mathbf{H},\tag{5.3}$$

$$\nabla \cdot \mathbf{A} = 0. \tag{5.4}$$

From Faraday's law and the definition of A, the electric field E and vector potential A follow that

$$\mathbf{E} = -j\omega\mathbf{A} - \nabla\phi,\tag{5.5}$$

where  $\phi$  is referred to as the scalar potential. Assuming the ideal conductor constitutive relation  $\mathbf{J} = \sigma \mathbf{E}$  where  $\sigma$  is the electrical conductivity, and combining this relation with equation (5.2) and (5.5) results in

$$\frac{\mathbf{J}(\mathbf{r})}{\sigma} + \frac{j\omega\mu}{4\pi} \int_{V'} \frac{\mathbf{J}(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|} dV' = \nabla\phi(\mathbf{r}). \tag{5.6}$$

Then, by simultaneously solving (5.4) with the current conservation equation

$$\nabla \cdot \mathbf{J}(\mathbf{r}) = 0, \tag{5.7}$$

conductor current densities and the scalar potential can be computed.

Given the magnetoquasistatic assumption, the current within a long thin conductor can be assumed to flow parallel to its surface, as there is no charge accumulation on the surface. For a long thin structures such as pins of a package or connector, the conductor can be divided into filaments of rectangular cross section inside which the current is assumed to flow along the length of the filament. Equation (5.6) can be rewritten as

$$\left(\frac{l_i}{\sigma s_i}\right) I_i + j\omega \sum_{j=1}^b \left(\frac{\mu}{4\pi s_i s_j} \int_{V_i} \int_{V_j'} \frac{\mathbf{l}_i \cdot \mathbf{l}_j}{|\mathbf{r} - \mathbf{r}'|} dV_j' dV_i\right) I_j 
= \frac{1}{s_i} \int_{s_i} (\phi_a - \phi_b) dS,$$
(5.8)

where  $l_i$  is the length of filament i,  $s_i$  and  $s_j$  is the cross section area of filament i and j, respectively, b is the total number of filaments except filament i,  $I_i$  is the current inside filament i,  $\mathbf{l}_i$ ,  $\mathbf{l}_j$  is a unit vector along the length of the filament, respectively,  $V_i$ ,  $V'_j$  are the volumes of filaments i and j, respectively,  $\phi_a$  and  $\phi_b$  are the potentials on the filament end surfaces.

Note that the right hand side of equation (5.8) results from integrating  $\nabla \phi$  along the length of the filament, and is the average potential on face a minus the average potential on face b. Equation (5.7) and (5.8) are the basic formulations for inductance extraction.

#### C. Magnetostatic Field Modeling

For 3-D structures with conductors and magnetic materials, we can assume that the currents are distributed in conductor volumes and magnetic charges are distributed on magnetic material surfaces. See Fig. 29 for an illustration. The magnetic material surfaces is divided into panels, each of which has constant magnetic charge density. The conductor volumes is divided into filaments, each of which carries current with constant density along the length. The extraction problem can be represented as an equivalent free space problem with fictitious magnetic charges distributed on the magnetic material surfaces [38].



Fig. 29. Magnetostatic field modeling

The normal component of  ${\bf B}$  is continuous across the magnetic material interface. The boundary condition at a point  ${\bf r}$  on the magnetic material surfaces must satisfy

$$\mathbf{B_a(r)} \cdot \mathbf{n(r)} = \mathbf{B_b(r)} \cdot \mathbf{n(r)},\tag{5.9}$$

where  $\mathbf{B_a}$  and  $\mathbf{B_b}$  are the magnetic flux of the two adjacent regions a and b, respectively,  $\mathbf{n}$  is the unit vector normal to the magnetic material surface. From equation (4.1) and (4.2),  $\mathbf{B} = \mu_0(\mathbf{H} + \mathbf{M})$ . Therefore, equation (5.9) becomes

$$\mathbf{H_a}(\mathbf{r}) \cdot \mathbf{n}(\mathbf{r}) - \mathbf{H_b}(\mathbf{r}) \cdot \mathbf{n}(\mathbf{r}) = \mathbf{M_b}(\mathbf{r}) \cdot \mathbf{n}(\mathbf{r}) - \mathbf{M_a}(\mathbf{r}) \cdot \mathbf{n}(\mathbf{r}).$$
 (5.10)

Since  $\mu_a \neq \mu_b$  on the magnetic material interfaces,  $\mathbf{H_a}(\mathbf{r}) \cdot \mathbf{n}(\mathbf{r}) - \mathbf{H_b}(\mathbf{r}) \cdot \mathbf{n}(\mathbf{r})$  is not equal to 0. In other words,  $\mathbf{H}$  is discontinuous at the magnetic material inter-

faces. To avoid imposing such complex requirements, we introduce fictitious magnetic charge [38], which is similar to the way we introduce electric charge in capacitance extraction with multiple dielectrics. In order to correctly impose the conditions in equation (5.9) and (5.10), the fictitious magnetic charges must satisfy the boundary condition on the magnetic material interfaces

$$\frac{\rho_m(\mathbf{r})}{\lambda(\mathbf{r})} = \frac{1}{4\pi} \int_V \frac{\nabla \times \mathbf{J}(\mathbf{r}') \cdot \mathbf{n}(\mathbf{r})}{|\mathbf{r} - \mathbf{r}'|} dV - \frac{1}{4\pi} \int_{S_m} \frac{\nabla \rho_m(\mathbf{r}') \cdot \mathbf{n}(\mathbf{r})}{|\mathbf{r} - \mathbf{r}'|} dS_m, \tag{5.11}$$

$$\lambda(\mathbf{r}) = \frac{2(\mu_r - 1)}{\mu_r + 1},\tag{5.12}$$

where  $\rho_m$  is the fictitious surface charge density,  $\lambda(\mathbf{r})$  is the factor scaling real magnetic charges to fictitious magnetic charges, V is the union of conductor volumes,  $S_m$  is the surface of the magnetic materials, and  $\mathbf{J}$  is the conductor current density. Note that the first term on the right hand side of equation (5.11) represents the normal magnetic field due to conductor currents, and the second term on the right hand side represents the normal magnetic field due to fictitious magnetic charges.

The conductor current must satisfy the integral equation derived based on a modified vector potential including the effect of magnetic charge and conductor current distribution [15]

$$\frac{\mathbf{J}(\mathbf{r})}{\sigma} + \frac{j\omega\mu_0}{4\pi} \int_V \frac{\mathbf{J}(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|} dV - \frac{j\omega\mu_0}{4\pi} \int_{S_m} \frac{\nabla\rho_m(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|} dS_m = -\nabla\phi(\mathbf{r}), \quad (5.13)$$

where  $\sigma$  is the conductivity and  $\phi(\mathbf{r})$  is the scalar potential. Note that the second term as well as the third term on the left hand side of equation (5.13), divided by  $j\omega\mu_0$ , represent the vector potential due to conductor currents and fictitious magnetic charges, respectively.

Furthermore, the Kirchhoff's current law must be satisfied

$$\nabla \cdot \mathbf{J}(\mathbf{r}) = 0. \tag{5.14}$$

Based on equation (5.11-5.14), the circuit formulation can be expressed in the following matrix form

$$\begin{bmatrix} R + j\omega L & -j\omega L_p & -A_e^T \\ -H_J & H_p + I & 0 \\ A_e & 0 & 0 \end{bmatrix} \begin{bmatrix} I_c^f \\ q_m \\ \phi_n^e \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ I_t \end{bmatrix}.$$
 (5.15)

In the unknown vector of equation (5.15),  $I_c^f$  is the vector of conductor filament currents,  $q_m$  is the vector of magnetic charge density, I is the identity matrix and  $\phi_n^e$  is the vector of node voltages. The right hand side component  $I_t$  denotes vector of terminal currents. Matrix R is the diagonal matrix of resistance and given by

$$R[i,i] = \frac{l_i}{\sigma S_i},\tag{5.16}$$

where  $l_i$  and  $S_i$  are the length and cross section area of filament i, respectively. Matrix L is the dense, symmetric positive definite matrix of partial inductances and given by

$$L[i,j] = \frac{\mu_0}{4\pi S_i S_j} \int_{V_i} \int_{V_i} \frac{\mathbf{I}_i \cdot \mathbf{I}_j}{|\mathbf{r} - \mathbf{r}'|} dV_j dV_i,$$
 (5.17)

where  $\mathbf{I}_i$  and  $\mathbf{I}_j$  is the unit current vector along the length of filament i and j, respectively,  $V_i$  and  $V_j$  are the volume of filament i and j, respectively,  $S_i$  and  $S_j$  are the cross section area of filament i and j, respectively. Matrix  $A_e^T$  is the nodal incidence matrix providing the difference of node voltage and  $A_e$  enforces the boundary condition.

Matrix element  $L_p[i,j]$  corresponds to the impact of magnetic charge on magnetic

material panel j to the conductor filament i and is given by

$$L_p[i,j] = \frac{\mu_0}{4\pi S_i S_j} \int_{V_i} \int_{S_j} \nabla \frac{1}{|\mathbf{r} - \mathbf{r}'|} \cdot \mathbf{n}(\mathbf{r}) dS_j dV_i.$$
 (5.18)

Matrix element  $H_J[i,j]$  is the magnetic field from conductor filament current j to the magnetic panel i and given by

$$H_J[i,j] = \frac{\lambda(\mathbf{r})}{4\pi S_i S_j} \int_{S_i} \int_{V_j} \nabla \times \frac{\mathbf{I}_j}{|\mathbf{r} - \mathbf{r}'|} \cdot \mathbf{n}(\mathbf{r}) dV_j dS_i.$$
 (5.19)

Matrix element  $H_p[i,j]$  is the magnetic field from magnetic panel j to the magnetic panel i and given by

$$H_p[i,j] = \frac{\lambda(\mathbf{r})}{4\pi S_i S_j} \int_{S_i} \int_{S_i} \nabla \frac{1}{|\mathbf{r} - \mathbf{r}'|} \cdot \mathbf{n}(\mathbf{r}) dS_j dS_i.$$
 (5.20)

In equations (5.19-5.20), the elements of sub-matrices  $H_J$  and  $H_p$  are related to the nonlinear frequency dependent relative permeability  $\mu_r$ . The strongly nonlinear feature of  $\mu_r$  lies in its relationship with magnetization, magnetic field and frequency as shown in equation (4.11). The magnetization depends on the nature and status of the magnetic materials, while the magnetic field depends on the distribution of  $I_c^f$ and  $q_m$ .

#### D. Speed Up

To speed up the algorithm, we apply a number of innovative techniques. A reduced system is established based on the magnetic charge effect approximation, and then gets accelerated by the solenoidal basis method. First of all, the linear system (5.15)

is modified as

$$\begin{bmatrix} R + j\omega L + H_c & 0 & -A_e^T \\ -H_J & H_p + I & 0 \\ A_e & 0 & 0 \end{bmatrix} \begin{bmatrix} I_c^f \\ q_m \\ \phi_n^e \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ I_t \end{bmatrix},$$
 (5.21)

where  $H_c = -j\omega L_p(H_p + I)^{-1}H_J$ . Note the individual elements in the sub-matrix  $H_p$  decay at the speed of  $1/r^2$ . The interaction between magnetic panels in the same node is much stronger than that between magnetic panels of different nodes. Based on the observation, a magnetic charge effect approximation is used in our algorithm. The approximation transforms the effect of all magnetic panels in one node by only one magnetic panel per node. This transformation has shown to be accurate and efficient in impedance extraction with multiple dielectrics [39] [40].

We construct an incidence matrix  $A_q^T \in \mathbb{R}^{p*m}$  (p, m) is the total number of panels and nodes in magnetic material surfaces, respectively) to replicate node potential to its corresponding panels while  $A_q$  sums the charges at each node. Let  $q_n$  be the vector of magnetic charge density. Each entry of  $q_n$  is for one node which approximates the effects of all panels in that node. Vector  $q_n$  satisfies

$$A_q H_J I_c^f = \frac{1}{n_p} A_q (H_p + I) A_q^T q_n, \tag{5.22}$$

where  $n_p$  is the number of panels per node. We calculate the diagonal entries of the matrix  $\frac{1}{n_p}A_q(H_p+I)A_q^T$  and set the inverse of the resulting diagonal matrix as P. The effect of magnetic surface charges on the potential drop can be approximated as  $\frac{1}{n_p}j\omega L_pA_q^TPA_qH_J$ , whose physical meaning is to map the magnetic panels' effect to the conductor filaments. Therefore, equation (5.15) can be transformed by eliminating

 $q_m$ 

$$\begin{bmatrix} Z & -A_e^T \\ A_e & 0 \end{bmatrix} \begin{bmatrix} I_c^f \\ \phi_n^e \end{bmatrix} = \begin{bmatrix} 0 \\ I_t \end{bmatrix}, \tag{5.23}$$

where  $Z = R + j\omega L - \frac{1}{n_p}j\omega L_p A_q^T P A_q H_J$ . The detailed illustration of the matrix transformation can be found in [41] [42].

To further accelerate the algorithm, the solenoidal basis method [43] is used to solve equation (5.23). Let

$$I_c^f = I' + I_p, \tag{5.24}$$

where  $I_p$  is a current vector that satisfies the constraints

$$A_e I_p = I_t. (5.25)$$

The following linear system can be derived from equation (5.23)

$$\begin{bmatrix} Z & -A_e^T \\ A_e & 0 \end{bmatrix} \begin{bmatrix} I' \\ \phi_n^e \end{bmatrix} = \begin{bmatrix} -ZI_p \\ 0 \end{bmatrix}, \tag{5.26}$$

and solved for the unknown current I'. Note that in the original problem, the input current is given, while in the solenoidal basis formulation, the input potential is given instead. The current vector  $I_p$  can easily be obtained by a number of techniques. For instance, when the known branch current has unit magnitude, one can assign a unit current to filaments on an arbitrary path from node with input source current to the node with output source current. We can get  $A_eI'=0$ , from equation (5.26) which means that the null space of  $A_e$  represents a basis for current that obeys Kirchhoff's law.

Given a full-rank matrix

$$F \in R^{f*(f-n)}, \tag{5.27}$$

where f and n are the total number of filaments and nodes in conductor volumes, respectively, such that  $A_eF = 0$ , a current vector computed as follows

$$I' = Fx, \quad x \in R^{f-n}, \tag{5.28}$$

which satisfies the constraint  $A_eI'=0$  for all x. A purely algebraic approach such as QR factorization of  $A_e$  cannot be used to compute F due to the prohibitive cost of computation and storage. We define a unit current flow in a close loop as a local solenoidal function. Each such mesh current is represented as a vector and the set of these vectors forms the column of F matrix. The local nature of these mesh currents leads to efficient computation and storage schemes for F. Since the current vector I' = Fx automatically satisfies the constraint

$$A_e I' = 0, (5.29)$$

we only need to solve

$$ZFx - A_e^T \phi_n^e = -ZI_p. (5.30)$$

After eliminating the branch potential unknowns  $\phi_n^e$  by multiplying this equation with  $F^T$ , the reduced system

$$F^T Z F x = -F^T Z I_p, (5.31)$$

can be solved via a suitable iterative scheme such as the GMRES method. Once x is obtained, current is computed as

$$I' = Fx. (5.32)$$

At last, we determine the conductor potential difference by computing  $ZI' + ZI_p$ . Consequently, the overall inductance for the analyzed structure is obtained by dividing terminal potential drops by terminal currents.

The complete flow of our algorithm is described as follows. We solve the equation (5.31) iteratively. We start with an initial relative permeability. In each iteration, we update the relative permeability based on the changes of conductor currents and magnetic charges. The iterative procedure will stop until the relative permeability difference between consecutive iterations becomes less then a user defined bound. The overall flow is described in Algorithm 1.

# Algorithm 1

- 1: Set user defined error bound  $\varepsilon$  and initial value of  $H_0$  and  $M_0$ ;
- 2: Generate the solenoidal basis matrix F and the current vector  $I_p$ ;
- 3: for iteration  $i = 1, 2, \dots$  do
- 4: Update entries of matrix Z according to  $\mu_r(i-1)$ ;
- 5: Solve the system  $F^TZFx = -F^TZI_p$  to determine the current vector x;
- 6: Compute the filament current vector  $I_c^f = Fx + I_p$  and magnetic charge vector  $q_n = PA_qH_JI_c^f$ ;
- 7: Update  $\mu_r(i)$  based on the solution of vectors  $I_c^f$  and  $q_n$ ;
- 8: **if**  $|\mu_r(i) \mu_r(i-1)| < \varepsilon$  **then**
- 9: stop the inner loop, get solution from iteration i;
- 10: **end if**
- 11: end for
- 12: Determine the conductor potential difference by computing  $Z(I' + I_p)$ .

## E. Experimental Results

## 1. Accuracy and Efficiency

To demonstrate the accuracy and efficiency of the new algorithm, we consider a conductor array with a series of magnetic blocks which is the typical structure in MRAM. See Fig. 30 for an illustration. The length of conductor is 100um. The cross section of the conductor is  $2.5um \times 1um$ . The length and thickness of the magnetic blocks is 5um and 1um, respectively. The distance between the magnetic blocks (bottom edge) and conductor (top edge) is 0.1um.



Fig. 30. Conductor array with magnetic blocks

Table V shows the inductance computed by Maxwell 3D and the new algorithm. The blocks are set to be non-magnetic (vacuum), nonlinear magnetic (CoFe, Permalloy and CoZnNb) and linear magnetic, respectively. The inductance value is for the central line in the conductor array. The effective inductance value is the partial inductance which includes self inductance and mutual inductance from nearby conductors. Compared to Maxwell 3D, the relative error of the new algorithm is below 3%.

One of the most important advantages of the proposed algorithm is its fast speed. Fig. 31 gives the efficiency comparison between Maxwell 3D and the new algorithm for the experiential cases shown in table V with all frequency points. From Fig. 31, we find that the new algorithm is up to one magnitude faster than Maxwell 3D.



Fig. 31. CPU time ratio of Maxwell 3D over our algorithm

# 2. Magnetic Characteristics Analysis

As shown in table V, the existence of the magnetic materials significantly increases the inductance of the nearby interconnects, up to almost two times. Ignoring the effect of magnetic materials could create up to 100% error to the interconnect inductance value. Fig. 32 shows the cross sectional view of the magnetic flux density  $|\mathbf{B}|$  when Permalloy as magnetic material. The magnetic flux density  $|\mathbf{B}|$  is strongly effected by the magnetic blocks. From table V and Fig. 32, it is clear that we must consider the effect of nonlinear magnetic materials in the research and development of the technology involving nonlinear magnetic materials.

The analysis of nonlinear magnetic materials is more complicated than that of linear magnetic materials because it is state dependent and current dependent. We give the first algorithm capable of simulating the state dependent and current dependent behavior of the interconnect inductance. Fig. 33 shows the normalized inductance in the presence of magnetic materials in different states. Note that  $L_0$  denotes the inductance of conductor in the presence of non-magnetic materials ( $\mu_r=1$ ). As



Fig. 32. Cross sectional view of  $|\mathbf{B}|$ .

described in Chapter IV, one of the most important properties of the nonlinear materials is "state dependent". Our experiments show that different states could lead to 30% difference in inductance. Fig. 34 shows the normalized inductance at different frequency points with the effects of conductor currents. The conductor currents will significantly influence the magnetic field across the nearby magnetic blocks. As show in our experiment, the different currents could lead to about 10% difference in inductance. From Fig. 33 and 34, we claim that the nonlinear magnetic materials greatly influence the value of inductance. It is very important to consider the nonlinearity of magnetic materials on the interconnect inductance extraction.

At last, we show how to utilize the extracted frequency dependent inductance for circuit analysis. The inductance extracted from our algorithm is frequency dependent. Such an inductance cannot be directly incorporated into most SPICE simulators. However, most SPICE simulators accept multi-port scattering (S) parameter models. We can calculate the S-parameter matrix [S] from inductance [35]. A simple example of a wire modeled as an inductor with two ports is given to show the transformation. The S-parameter matrix is

$$[S] = \frac{1}{2 + Z_L/Z_0} \begin{bmatrix} Z_L/Z_0 & 2 \\ 2 & Z_L/Z_0 \end{bmatrix},$$



Fig. 33. Effect of  $M_0$  ( $M_0 = M_S cos(theta)$ ) on the inductance

where  $Z_L = j\omega L, Z_0$  is the output impedance.

In SPICE simulators, the S-parameters can be input files in touchstone or citi format, and incorporated into the multi-port (MPORT) model. The detailed syntax is shown as follows:

$$Mname \quad n_1^+ \quad n_1^- \quad n_2^+ \quad n_2^- \quad \dots \quad n_N^+ \quad n_N^- \quad Sname$$
 
$$.model \quad Sname \quad mport \quad param = s \quad file = filename$$
 
$$nport = val \quad [fileformat = citi|touchstone] \quad [Z_0 = val]$$

where Mname is the MPORT model,  $n_i^+$  and  $n_i^-$ ,  $1 \le i \le N$ , represent the positive and negative element nodes at each port, respectively, Sname is the name of the model in which the port description is given, mport identifies this as a multi-port model,



Fig. 34. Effect of conductor current on the inductance

param specifies the parameter type as S-parameters, file format is the format of file which can be either citi or touchstone and  $Z_0$  is the output impedance for every port with default value as 50Ohm.



Fig. 35. Magnitude of S-parameters for interconnect with non-magnetic blocks

Fig. 35 to 38 demonstrate the comparison of the S-parameters (magnitude&phase) between with non-magnetic blocks and with magnetic blocks. The parameters S11 (input port voltage reflection coefficient) and S12 (reverse voltage gain) are sensitive to magnetic materials. From Fig. 35 to 38, we find that the existence of magnetic materials has important effect on the circuit performance.



Fig. 36. Magnitude of S-parameters for interconnect with magnetic blocks

Table V. Inductance (nH) computed by Maxwell 3D and the new algorithm, error is with respect to Maxwell 3D  $\,$ 

| Freq (GHz) |               |          | 0.1   | 0.5   | 1     | 3     | 5     | 8     | 10    |
|------------|---------------|----------|-------|-------|-------|-------|-------|-------|-------|
|            |               | Max. 3D  | 1.021 | 1.016 | 1.007 | 0.989 | 0.986 | 0.983 | 0.983 |
| Vacuum     | $\mu_r = 1$   | New Alg. | 1.011 | 1.010 | 0.996 | 0.980 | 0.978 | 0.971 | 0.971 |
|            |               | Error(%) | 1.0   | 0.5   | 1.1   | 0.9   | 0.8   | 1.2   | 1.2   |
|            |               | Max. 3D  | 1.767 | 1.776 | 1.762 | 1.727 | 1.716 | 1.706 | 1.701 |
|            | CoFe          | New Alg. | 1.729 | 1.751 | 1.772 | 1.772 | 1.697 | 1.670 | 1.655 |
|            |               | Error(%) | 2.3   | 1.4   | 0.6   | 2.6   | 1.1   | 2.1   | 2.7   |
| Nonlinear  |               | Max. 3D  | 1.922 | 1.927 | 1.991 | 1.952 | 1.941 | 1.929 | 1.924 |
| Magnetic   | Permalloy     | New Alg. | 1.857 | 1.902 | 1.931 | 1.928 | 1.923 | 1.920 | 1.919 |
| Material   |               | Error(%) | 3.4   | 1.3   | 3.0   | 1.2   | 1.0   | 0.5   | 0.3   |
|            |               | Max. 3D  | 1.426 | 1.415 | 1.389 | 1.321 | 1.276 | 1.217 | 1.182 |
|            | CoZnNb        |          | 1.428 | 1.452 | 1.378 | 1.357 | 1.313 | 1.233 | 1.184 |
|            |               | Error(%) | 0.2   | 2.6   | 0.8   | 2.7   | 2.9   | 1.3   | 0.2   |
| Linear     |               | Max. 3D  | 1.378 | 1.369 | 1.354 | 1.338 | 1.331 | 1.327 | 1.324 |
| Magnetic   | $\mu_r = 1e3$ | New Alg. | 1.395 | 1.379 | 1.378 | 1.363 | 1.348 | 1.339 | 1.319 |
| Material   |               | Error(%) | 1.2   | 0.7   | 1.8   | 1.8   | 1.3   | 1.0   | 0.4   |



Fig. 37. Phase of S-parameters for interconnect with non-magnetic blocks



Fig. 38. Phase of S-parameters for interconnect with magnetic blocks

#### CHAPTER VI

## COMPACT AND ACCURATE INTERCONNECT MODEL

#### A. Introduction

Increasing clock speeds, die sizes, and power dissipations have driven VLSI manufacturers to abandon the simple scaling approach of interconnect wiring. Instead, they employ a hierarchy of metal wiring levels. Thinner wiring levels are used at the circuit level where density is required, and thicker layers at the top or global levels in order to route low-skew clock trees, low-loss power distribution buses, and the fastest signal interconnects. This trend, coupled with the recent introduction of copper wiring (because its resistivity is approximately half that of aluminum wiring) has made inductance modeling necessary to be included in the interconnect model.

Since the interconnect model includes mutual inductances between every pair of conductors, the resulting circuit matrix is very dense [44] [45] [46]. As an example, large clock net topologies along with power grid can lead to the number of self inductances of the order of 100K and mutual inductance of the order of 10G. Hence the SPICE simulation is infeasible due to impractical time and memory requirements. This has been the main bottleneck in the use of the interconnect *RLC* models. The following techniques can be used to sparsify the inductance matrix.

The simplest approach to sparsify the inductance matrix is to discard all mutual coupling terms falling below a certain threshold. This translates to removing entries from the inductance matrix, thus making it sparse and faster to process. However, the resulting matrix can become non-positive definite, and the sparsified system becomes active and generates energy (positive poles) [47]. Unlike capacitance matrices which can be truncated to represent only localized couplings, there is no guarantee on either

the degree of sparsity or stability for the inductance matrix truncation.

As an alternative to simple truncation, the shift-truncate method proposed in [48] can guarantee that the generated sparse inductance matrix is positive definite. However, the accuracy is not satisfactory [49] [17].

Recently, a new circuit element K, which is defined as the inverse of inductance, is introduced and is incorporated into a simulation tool KSim [17]. The locality of K is demonstrated. Thus, the K matrix can be easily sparsified by dropping small entries while keeping the stability.

In this chapter, we study the practical issues of the RKC model. We validate the RKC model by circuit simulations of practical examples. The RKC model is very sparse and stable, and accurately captures the inductance effect. Furthermore, we propose an efficient way to achieve high accuracy extraction based the delay sensitivity analysis. Interconnect R and L/K close to driver should be extracted with high accuracy, while interconnect C close to receiver should be extracted with high accuracy.

## B. Sparse K Model

We illustrate the locality of K matrix by using a simple example of one wire divided into ten segments. we consider a layout example with one wire divided into ten segments, as shown in Fig. 39. The length, width and thickness of each segment are  $10 \ \mu m$ ,  $0.2 \ \mu m$  and  $0.2 \ \mu m$ , respectively.



Fig. 39. One wire with ten segments

The L matrix and its inverse  $K=L^{-1}$  matrix are

$$L=\!10^{-12}\times$$

| 8.895 | 1.382 | 0.525 | 0.341 | 0.251 | 0.201 | 0.167 | 0.141 | 0.125 | 0.111 |
|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|
| 1.382 | 8.895 | 1.382 | 0.525 | 0.341 | 0.251 | 0.201 | 0.164 | 0.144 | 0.125 |
| 0.525 | 1.382 | 8.895 | 1.382 | 0.525 | 0.339 | 0.254 | 0.193 | 0.168 | 0.143 |
| 0.341 | 0.525 | 1.382 | 8.895 | 1.382 | 0.525 | 0.337 | 0.250 | 0.202 | 0.166 |
| 0.251 | 0.341 | 0.525 | 1.382 | 8.895 | 1.382 | 0.525 | 0.341 | 0.254 | 0.199 |
| 0.201 | 0.251 | 0.339 | 0.525 | 1.382 | 8.895 | 1.382 | 0.525 | 0.341 | 0.251 |
| 0.167 | 0.201 | 0.254 | 0.337 | 0.525 | 1.382 | 8.895 | 1.382 | 0.525 | 0.341 |
| 0.141 | 0.164 | 0.193 | 0.250 | 0.341 | 0.525 | 1.382 | 8.895 | 1.382 | 0.525 |
| 0.125 | 0.144 | 0.168 | 0.202 | 0.254 | 0.341 | 0.525 | 1.382 | 8.895 | 1.382 |
| 0.111 | 0.125 | 0.143 | 0.166 | 0.199 | 0.251 | 0.341 | 0.525 | 1.382 | 8.895 |

(6.1)

$$K = 10^9 \times$$

$$\begin{bmatrix} 11.55 & -1.71 & -0.35 & -0.24 & -0.16 & -0.13 & -0.10 & -0.08 & -0.07 & -0.06 \\ -1.71 & 11.80 & -1.65 & -0.32 & -0.21 & -0.14 & -0.11 & -0.09 & -0.08 & -0.08 \\ -0.35 & -1.65 & 11.81 & -1.64 & -0.31 & -0.21 & -0.15 & -0.10 & -0.09 & -0.09 \\ -0.24 & -0.32 & -1.64 & 11.81 & -1.64 & -0.31 & -0.20 & -0.14 & -0.11 & -0.10 \\ -0.16 & -0.21 & -0.31 & -1.64 & 11.82 & -1.64 & -0.31 & -0.21 & -0.15 & -0.12 \\ -0.13 & -0.14 & -0.21 & -0.31 & -1.64 & 11.82 & -1.64 & -0.31 & -0.21 & -0.16 \\ -0.10 & -0.11 & -0.15 & -0.20 & -0.31 & -1.64 & 11.81 & -1.64 & -0.32 & -0.24 \\ -0.08 & -0.09 & -0.10 & -0.14 & -0.21 & -0.31 & -1.64 & 11.81 & -1.65 & -0.35 \\ -0.07 & -0.08 & -0.09 & -0.11 & -0.15 & -0.21 & -0.32 & -1.65 & 11.80 & -1.70 \\ -0.06 & -0.08 & -0.09 & -0.10 & -0.12 & -0.16 & -0.24 & -0.35 & -1.70 & 11.55 \end{bmatrix}$$

$$(6.2)$$

It can be observed that the off-diagonal elements in K matrix decrease faster than that of the inductance matrix. The mutual inductance  $L_{81}$  is 1.6% of the self inductance  $L_{11}$ , while  $|K_{81}|$  is only 0.6% of the self term  $K_{11}$ . We call K matrix has locality. The physical explanation of the locality of K matrix can be found in [49] [50].

Since K matrix is a sparse diagonally dominant matrix, we only need to consider a small number of off-diagonal entries [49]. If we drop the small off diagonal terms in

K, we obtain a band matrix  $\tilde{K}_3$  with bandwidth is 3

The  $\tilde{K}_3$  matrix, which is a subset of the K matrix is capable of accurately capturing the inductance effect in the circuit simulation.

Since K has C-like locality, we only need to consider a small number of neighbors in the K method. The procedure of constructing the K matrix can be summarized as follows.

Fist of all, we calculate the inductance matrices of each segment with closest neighborhoods. Secondly, we get the small K matrices by inverting the corresponding inductance matrices. Finally, we compose the full K matrix by combining the small K matrices, which is similar to the general techniques in the capacitance extraction.

After getting the spatially distributed K model, we can produce the spatially distributed RKC circuit model by combining the resistance and capacitance model. The resistance and capacitance matrices can be very sparse via simple truncation due

to the locality of resistance and capacitance matrices. We validate the RKC model by circuit simulation of practical problems. Since the K method will generate a very sparse and stable system for the circuit simulation, it can save a great amount of CPU time and memory usage.

# C. Delay Sensitivity Analysis

It has become well accepted that interconnect delay dominates gate delay in current deep sub-micrometer VLSI circuits. With the continuous scaling of technology and increased die area, there has been great emphasis on the interconnect delay analysis. In order to properly design circuits, accurate interconnect models and signal propagation characterization are required.

The interconnect is usually modeled as a distributed RLC circuit (multiple T or Pi sections) for delay models. Fig. 40 and 41 show the circuit model and gate model, respectively. A well known method used to determine which nets require more accurate delay models is to compare the pull up/down driver resistance  $R_{ui}/R_{di}$  and the load capacitance  $C_{load}$  to the total resistance and capacitance of the interconnect  $R_{out}$  and  $C_{out}$ . Typically, the nets that require more accurate RC models are longer, more highly resistive nets.



Fig. 40. Circuit model



Fig. 41. Gate model

We measure the delay between port 1 and port 2 as shown in Fig. 40. The delay formulation for the interconnect RC distributed model is

$$T_{dui} = R_{ui}C_{pi} + R_{ui}\sum_{i=1}^{l+1} C_i + \sum_{i=1}^{l} R_i \sum_{j=i+1}^{l+1} C_j + \sum_{i=1}^{l} R_i C_{load},$$
(6.4)

$$T_{ddi} = R_{di}C_{pi} + R_{di}\sum_{i=1}^{l+1} C_i + \sum_{i=1}^{l} R_i \sum_{j=i+1}^{l+1} C_j + \sum_{i=1}^{l} R_i C_{load},$$
 (6.5)

where  $T_{dui}$  and  $T_{ddi}$  are the pull up and pull down delay, respectively,  $R_{ui}$  and  $R_{di}$  are the pull up and pull down resistance of the driver, respectively,  $C_{pi}$  is the parasitic capacitance of the driver,  $R_i$  and  $C_i$  are the interconnect resistance and capacitance at position i, respectively,  $C_{load}$  is the input capacitance of the receiver.

To quantify the error in the delay due to errors in the extracted resistance and capacitance at different positions, we consider an extraction tool that extracts the resistance or capacitance value with a maximum error E. The effective error to the

total resistance or capacitance would be E/l. In other words, the extracted resistance or capacitance by the extraction tool is in the range between  $R_i(1-E)$  and  $R_i(1+E)$  or between  $C_i(1-E)$  and  $C_i(1+E)$ , where  $R_i$  and  $C_i$  are the assumed accurate resistance and capacitance value at position i, respectively.

We try to find the worst case when  $R_i$  or  $C_i$  is overestimated by a maximum factor of 1 + E (or underestimated by a minimum factor of 1 - E). In that case, the relative error of the pull up delay  $E_{dui}(R_i)$  and the pull down delay  $E_{ddi}(R_i)$  due to the extraction error of  $R_i$  become

$$E_{dui}(R_i) = \left| \frac{T_{dui}(R_i) - T_{dui}}{T_{dui}} \right|$$

$$= \left| \frac{R_i E \sum_{j=i+1}^{l+1} C_j + R_i E C_{load}}{T_{dui}} \right|,$$
(6.6)

$$E_{ddi}(R_i) = \left| \frac{T_{ddi}(R_i) - T_{ddi}}{t_{ddi}} \right|$$

$$= \left| \frac{R_i E \sum_{j=i+1}^{l+1} C_j + R_i E C_{load}}{T_{ddi}} \right|.$$
(6.7)

The relative error of the pull up delay  $E_{dui}(C_i)$  and the pull down delay  $E_{ddi}(C_i)$  due to the extraction error of  $C_i$  become

$$E_{dui}(C_i) = \left| \frac{T_{dui}(C_i) - T_{dui}}{T_{dui}} \right|$$

$$= \left| \frac{C_i E \sum_{j=1}^i R_j + R_{ui} E C_i}{T_{dui}} \right|,$$
(6.8)

$$E_{ddi}(C_i) = \left| \frac{T_{ddi}(C_i) - T_{ddi}}{T_{dui}} \right|$$

$$= \left| \frac{C_i E \sum_{j=1}^{i} R_j + R_{di} E C_i}{T_{ddi}} \right|.$$
(6.9)

From equations (6.6-6.9), we find that  $R_i$  is more sensitive to delay when i decreases, while  $C_i$  is more sensitive to delay when i increases. That tells us that the interconnect R close to the driver should be extracted with high accuracy, while interconnect C close to the receiver should be extracted with high accuracy.

We now extend the delay analysis to inductance. The delay due to the interconnect inductance distributed model is [51]

$$T_{dui}^{l} = 1.047 \sqrt{\sum_{i=1}^{l} L_{i} \left(\sum_{j=i+1}^{l+1} C_{j} + C_{load}\right) \cdot e^{-\frac{\xi_{dui}}{0.85}}},$$
(6.10)

$$T_{ddi}^{l} = 1.047 \sqrt{\sum_{i=1}^{l} L_{i} \left(\sum_{j=i+1}^{l+1} C_{j} + C_{load}\right) \cdot e^{-\frac{\xi_{ddi}}{0.85}}},$$
(6.11)

where  $\xi_{dui}$  and  $\xi_{ddi}$  are the damping factor for the pull up and pull down delay, respectively,

$$\xi_{dui} = \frac{1}{2} \frac{R_{ui}C_{pi} + R_{ui} \sum_{i=1}^{l+1} C_i + \sum_{i=1}^{l} R_i \sum_{j=i+1}^{l+1} C_j + \sum_{i=1}^{l} R_i C_{load}}{\sqrt{\sum_{i=1}^{l} L_i (\sum_{j=i+1}^{l+1} C_j + C_{load})}}, \quad (6.12)$$

$$\xi_{ddi} = \frac{1}{2} \frac{R_{di}C_{pi} + R_{di}\sum_{i=1}^{l+1} C_i + \sum_{i=1}^{l} R_i \sum_{j=i+1}^{l+1} C_j + \sum_{i=1}^{l} R_i C_{load}}{\sqrt{\sum_{i=1}^{l} L_i (\sum_{j=i+1}^{l+1} C_j + C_{load})}}.$$
 (6.13)

The general expression for the delay of a CMOS gate driving a RLC interconnect is the combination of equations (6.4) and (6.10) (pull up) and equations (6.5) and (6.11) (pull down).

From equations (6.10-6.11), we find that interconnect  $L_i$  will be more sensitive to delay as i decreases, which means L close to the driver should be extracted with high accuracy. As described in Section B, the K model is constructed by combining the inverse of L matrix of each segment with closest neighborhoods. In the other words, the value of K element is significantly effected by that of L at the same position. Therefore, K close to the driver should also be extracted with high accuracy.

## D. Experimental Results

To validate the spatially distributed RKC model, we simulate a single wire divided into ten segments along its length. The circuit model is shown in Fig. 40. The

driver is a PWL voltage source with its rising time 50ps. The gate model is IBM 130nm technology. Since K simulator is not available, we use HSPICE to verify the correctness of the K model. However, HSPICE can not accept the K model. Instead, we inverse the K matrix and simulate the corresponding  $L_{new} = K^{-1}$  in HSPICE.

Table VI shows the delay comparison between the full L method and the sparse K method simulation. We use the full L matrix method as reference. The delay is measured between port 1 and port 2. From Table VI, we find that the relative error of the sparse K method with respect to the full L method is very small (less than 0.1%).

Table VI. Delay comparison between the full L method and the sparse K method, error is respect to the full L method

|          | Delay (ns) | Relative Error (%) |
|----------|------------|--------------------|
| accurate | 5.3568     | -                  |
| n=1      | 5.3573     | 0.09               |
| n=2      | 5.3578     | 0.18               |

The time domain voltage waveform at port 2 in the transient analysis is extracted and shown in Fig. 42. From Fig. 42, we can see a good agreement in terms of circuit simulation results between the full L method and the sparse K method. The frequency domain voltage waveform at port 2 in the AC Analysis is simulated and shown in Fig. 43. The RKC model shows high accuracy. Overall, Table VI, Fig. 42 and 43 demonstrate that the K matrix is capable of capturing the inductance effect, while still preserves locality. Since K matrix is a sparse diagonally dominant matrix, we only need to consider a small number of neighbors. In our examples, the voltage waveform match well with the full L method when the bandwidth of the K matrix is





Fig. 42. Voltage waveform at port 2 in transient analysis

We validate our idea about achieving high accuracy extraction. Fig. 44, 45 and 46 show the relative error of delay due to the extraction error of  $R_i$ ,  $C_i$  and  $L_i$ , respectively. We assume the total error is 0.05. Circles in the figures represent the relative error of delay with each element has E = 0.05. Table VII, VIII and IX demonstrate the worst case  $Error_{max}$  when  $R_i$ ,  $C_i$  and  $L_i$  are overestimated by a factor of 1.05, respectively.  $Error_{uni}$  is the relative error of delay with each element has extraction error 0.05. R and L/K closer to driver are more sensitive to delay, while interconnect C closer to receiver has more significant effect on delay. In other words, interconnect C close to receiver should be extracted with high accuracy, while interconnect C close to receiver should be extracted with high accuracy.



Fig. 43. Voltage waveform at port 2 in AC analysis

Table VII. Relative error of delay due to extraction error of  $R_i$ 

| R(ohm) | C(fF) | $T_d(\mathrm{ns})$ | $Error_{uni}(\%)$ | $Error_{max}(\%)$ |
|--------|-------|--------------------|-------------------|-------------------|
| 50     | 160   | 0.85               | 3.0               | 6.7               |
| 50     | 320   | 1.58               | 3.2               | 6.8               |
| 100    | 160   | 1.54               | 3.1               | 7.7               |
| 100    | 320   | 2.88               | 3.3               | 7.9               |



Fig. 44. Relative error of delay due to extraction error of  $\mathcal{R}_i$ 

Table VIII. Relative error of delay due to extraction error of  $C_i$ 

| R(ohm) | C(fF) | $T_d(ns)$ | $Error_{uni}(\%)$ | $Error_{max}(\%)$ |
|--------|-------|-----------|-------------------|-------------------|
| 50     | 160   | 0.85      | 3.3               | 7.3               |
| 50     | 320   | 1.58      | 3.5               | 7.9               |
| 100    | 160   | 1.54      | 3.4               | 7.8               |
| 100    | 320   | 2.88      | 3.6               | 8.6               |



Fig. 45. Relative error of delay due to extraction error of  $C_i$ 

Table IX. Relative error of delay due to extraction error of  $L_i/K_i$ 

| R(ohm) | C(fF) | L(nH) | $T_d(\mathrm{ns})$ | $Error_{uni}(\%)$ | $Error_{max}(\%)$ |
|--------|-------|-------|--------------------|-------------------|-------------------|
| 10     | 160   | 50    | 1.3                | 2.4               | 3.0               |
| 10     | 160   | 200   | 2.6                | 2.5               | 3.1               |
| 100    | 160   | 50    | 1.6                | 2.1               | 3.2               |
| 100    | 160   | 200   | 2.8                | 2.2               | 3.4               |



Fig. 46. Relative error of delay due to extraction error of  $\mathcal{L}_i$ 

## CHAPTER VII

## CONCLUSIONS

In this dissertation, we present several novel techniques based on BEM for parasitic extraction problems. Our algorithms are significantly faster than existing techniques with sufficient accuracy.

We first tackle the problem of impedance extraction for interconnect with multiple dielectrics. Multiple dielectrics are common in integrated circuits and packages. However, previous BEM algorithms, including FastImp and FastPep, assume uniform dielectric due to their limitation, thus causing considerable errors. Our algorithm introduces a circuit formulation which makes it possible to utilize either multilayer Green's function or equivalent charge method to extract impedance in multiple dielectrics. The novelty of the formulation is the reduction of the unknowns and the application of the hierarchical data structure. The hierarchical data structure permits efficient sparsification transformation and preconditioners to accelerate the linear equation solver. Experimental results demonstrate that the new algorithm is accurate and efficient. For uniform dielectric problems, our algorithm is more accurate than FastImp while its number of unknowns is ten times less than that of FastImp. For multiple dielectric problems, its relative error with respect to HFSS is below 3%.

The second problem we investigate is the inductance extraction for structures in the presence of magnetic materials. The existence of magnetic materials poses a challenge to the interconnect inductance extraction for circuits in MEMS, RFID and MRAM. We develop a fast algorithm to extract inductance in the presence of magnetic materials. The new algorithm models the magnetic characteristics by the Landau-Lifshitz-Gilbert equation and fictitious magnetic charges. To speed up the algorithm, we apply a number of innovative techniques including the approximation of magnetic

charge effect and the modeling of currents with solenoidal basis. Experimental results demonstrate the accuracy and efficiency of the new algorithm. Its relative error with respect to the commercial tool is below 3%, while its speed is up to one magnitude faster.

The third problem we focus on is generating and simulating a compact and accurate interconnect circuit model. Since the interconnect circuit model includes mutual inductances between every pair of conductors, the resulting circuit matrix is very dense. The circuit simulation is infeasible due to impractical time and memory requirements. This has been the main bottleneck in the use of interconnect circuit model. We study the practical issues of the RKC model, where K is defined as the inverse of inductance. We verified the RKC model by circuit simulation of practical examples. The RKC model is very sparse and stable, and accurately captures the inductance effect. Furthermore, we propose an efficient way to achieve high accuracy extraction based the delay sensitivity analysis. Interconnect R and L/K close to driver should be extracted with high accuracy, while interconnect C close to receiver should be extracted with high accuracy.

#### REFERENCES

- E. Chiprout, J. R. Phillips, and D. D. Ling, "Efficient full-wave electromagnetic analysis via model-order reduction of fast integral transforms," in *Proceedings of IEEE/ACM Design Automation Conference*, 1996, pp. 377-382.
- [2] Y. Saad and M. H. Schultz, "GMRES: a generalized minimal residual algorithm for solving nonsymmetric linear systems," SIAM Journal on Scientific and Statistical Computing, vol. 7, no. 3, pp. 856-869, 1986.
- [3] N. Rubin, "Data flow computing and the conjugate gradient method," in *Proceedings of Architectures and Compilation Techniques for Fine and Medium Grain Parallelism*, 1993, pp. 257-264.
- [4] K. Nabors and J. White, "FastCap: A multipole accelerated 3-d capacitance extraction program," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 10, no. 11, pp. 1447-1459, 1991.
- [5] M. Kamon, M. J. Tsuk, and J. K. White, "FASTHENRY: A multipole-accelerated 3-D inductance extraction program," *IEEE Transactions on Microwave Theory and Techniques*, vol. 42, no. 9, pp. 1750-1758, 1994.
- [6] W. Shi, J. Liu, N. Kakani, and T. Yu, "A fast hierarchical algorithm for 3-d capacitance extraction," *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, vol. 21, no. 3, pp. 330-336, 2002.
- [7] S. Yan, V. Sarin, and W. Shi, "Sparse transformations and preconditioners for capacitance extraction," *IEEE Transactions on Computer-Aided Design of Inte*grated Circuits and Systems, vol. 24, no. 9, pp. 1420-1426, 2005.

- [8] J. Phillips and J. White, "A precorrected FFT method for capacitance extraction of complicated 3D structures," *IEEE Transactions on Computer-Aided Design* of Integrated Circuits and Systems, vol. 16, no. 10, pp. 1059-1072, 1997.
- [9] Z. Zhu, B. Song, and J. White, "Algorithms in FastImp: A fast and wideband impedance extraction program for complicated 3D geometries," in *Proceedings* of IEEE/ACM Design Automation Conference, 2003, pp. 712-716.
- [10] H. Xin, L. Daniel, and J. White, "Partitioned conduction modes in surface integral equation-based impedance extractio" in *Proceedings of IEEE Electrical Performance of Electronic Packaging*, 2003, pp. 355-358.
- [11] S. Kapur and D. E. Long, "IES3: A fast integral equation solver for efficient 3dimensional extraction," in *Proceedings of IEEE/ACM Design Automation Con*ference, 1997, pp. 448-455.
- [12] R. Jiang, Y. Chang, and C. C.-P. Chen, "ICCAP: a linear time sparse transformation and reordering algorithm for 3D BEM capacitance extraction," in Proceedings of IEEE/ACM Design Automation Conference, 2005, pp. 163-167.
- [13] M. Kamon, N. Marques, and J. White, "FastPep: A fast parasitic extraction program for complex three-dimensional geometries," in *Proceedings of IEEE/ACM International Conference on Computer-Aided Design*, 1997, pp. 456-460.
- [14] Y. Yi, P. Li, V. Sarin, and W. Shi, "A preconditioned hierarchical algorithm for impedance extraction of three-dimensional structures with multiple dielectrics," *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Sys*tems, vol. 27, no. 11, pp. 1918-1927, 2008.

- [15] Y. Massoud and J. White, "FastMag: a 3-D magnetostatic inductance extraction program for structures with permeable materials," in Proceedings of IEEE/ACM International Conference on Computer-Aided Design, 2002, pp. 478-484.
- [16] L. Li, D. W. Lee, and R. Bubber, "Tensor nature of permeability and its effects in inductive magnetic devices," *IEEE Transactions on Magnetics*, vol. 43, no. 6, pp. 2373-2375, 2007.
- [17] H. Ji, A. Devgan, and W. Dai, "KSim: A stable and efficient RKC simulator for capturing on-chip inductance effect," in *Proceedings of Asia South Pacific Design Automation Conference*, 2001, pp. 379-384.
- [18] A. E. Ruehli, "Equivalent circuit models for three-dimensional multiconductor systems," *IEEE Transactions on Microwave Theory and Techniques*, vol. 22, no. 3, pp. 216-221, 1974.
- [19] A. E. Ruehli and H. Heeb, "Circuit models for three-dimensional geometries including dielectrics," *IEEE Transactions on Microwave Theory and Techniques*, vol. 40, no. 7, pp. 1507-1516, 1992.
- [20] A. E. Ruehli, "Inductance calculations in a complex integrated circuit environment," IBM Journal of Research and Development, vol. 16, no. 5, pp. 470-481, 1972.
- [21] F. Grover, Inductance Calculations: Working Formulas and Tables, New York, Dover, 1962.
- [22] C. Hoer and C. Love, "Exact inductance equations for rectangular conductors with applications to more complicated geometries," *Journal of Research of the*

- National Bureau of Standards, vol. 69, no. 2, pp. 127-137, 1965.
- [23] S. M. Rao, T. K. Sarkar, and R. F. Harrington, "The electrostatic field of conducting bodies in multiple dielectric media," *IEEE Transactions on Microwave Theory and Techniques*, vol. 32, no. 9, pp. 1441-1448, 1984.
- [24] R. A. Horn and C. R. Johnson, *Matrix Analysis*, Cambridge, Cambridge University Press, 1989.
- [25] N. Appannagarri, I. Bardi, and J. Hadden, "Modeling phased array antennas in Ansoft HFSS," in *Proceedings of IEEE International Conference on Phased Array Systems and Technology*, 2000, pp. 323-326.
- [26] T. M. Maffitt, J. K. DeBrosse, J. A. Gabric, E. T. Gow, M. C. Lamorey, J. S. Parenteau, D. R. Willmott, M. A. Wood, and W. J. Gallagher, "Design considerations for MRAM," *IBM Journal of Research and Development*, vol. 50, no. 1, pp. 25-40, 2006.
- [27] W. J. Gallagher and S. S. P. Parkin, "Development of the magnetic tunnel junction MRAM at IBM: From first junctions to a 16-Mb MRAM demonstrator chip," IBM Journal of Research and Development, vol. 50, no. 1, pp. 5-10, 2006.
- [28] J. J. Nahas, T. W. Andre, and B. Garni, "A 180 Kbit embeddable MRAM memory module," *IEEE Journal of Solid State Circuits*, vol. 43, no. 8, pp. 1826-1834, 2008.
- [29] S. P. Parkin, M. Hayashi, and L. Thomas, "Magnetic domain-wall racetrack memory," Science, vol. 320, no. 5873, pp. 190-194, 2008.
- [30] S. X. Wang and A. M. Taratorin, Magnetic Information Storage Technology, San Diego, Academic Press, 1999.

- [31] A. E. Ruehli, "Bit selection scheme and dipolar interactions in high density precessional MRAM," *IEE Proceedings Science*, Measurement and Technology, vol. 152, no. 4, pp. 196-200, 2005.
- [32] I. Cimrk, "A survey on the numerics and computations for the Landau-Lifshitz equation of micromagnetism," Journal of Archives of Computational Methods in Engineering, vol. 15, no. 3, pp. 277-309, 2008.
- [33] J. D. Jackson, Classical Electodynamics, New York, John Wiley Press, 1962.
- [34] J. Helszajn, *Principles of Microwave Ferrite Engineering*, New York, Wiley-interscience Press, 1969.
- [35] D. Pozar, Microwave Engineering, Chichester, John Wiley & Sons, 2004.
- [36] Y. Massoud and J. White, "Improving the generality of the fictitious magnetic charge approach to computing inductances in the presence of permeable materials," in Proceedings of IEEE/ACM International Conference on Computer-Aided Design, 2002, pp. 552-555.
- [37] A. E. Ruehli, "Survey of computed-aided electrical analysis of integrated circuit interconnections," IBM Journal of Research and Development, vol. 23, no. 2, pp. 626-631, 1979.
- [38] B. Krstaji, Z. Andeli, S. Milojkovit, and S. Babi, "Nonlinear 3D magnetostatic field calculation by the integral equation method with surface and volume magnetic charges," *Journal of Archives of Computational Methods in Engineering*, vol. 28, no. 2, pp. 1088-1091, 1992.
- [39] Y. Yi, P. Li, V. Sarin, and W. Shi, "Impedance extraction for 3-D structures with multiple dielectrics using preconditioned boundary element method," in *Proceed-*

- ings of IEEE/ACM International Conference on Computer-Aided Design, 2007, pp. 7-10.
- [40] Y. Yi, P. Li, V. Sarin, and W. Shi, "A preconditioned hierarchical algorithm for impedance extraction of three-dimensional structures with multiple dielectrics," in *Proceedings of IEEE Electrical Performance of Electronic Packaging*, 2008, pp. 99-102.
- [41] Y. Yi, V. Sarin, and W. Shi, "An efficient inductance extraction algorithm for 3-D interconnects with frequency dependent nonlinear magnetic materials," in Proceedings of IEEE Electrical Performance of Electronic Packaging, 2008, pp. 217-220.
- [42] Y. Yi, R. Wenzel, V. Sarin, and W. Shi, "Inductance extraction for interconnects in presence of magnetic materials," to appear, *IEEE Transactions on Computer-*Aided Design of Integrated Circuits and Systems, 2009.
- [43] H. Mahawar, V. Sarin, and W. Shi, "A solenoidal basis method for efficient inductance extraction," in *Proceedings of IEEE/ACM Design Automation Con*ference, 2002, pp. 751-756.
- [44] M. Ranjan, W. Verhaegen, A. Agarwal, H. Sampath, and R. Vemuri, "Fast layout-inclusive analog circuit synthesis using pre-compiked parasitic-aware symbolic performance models," in *Proceedings of Conference on Design, Automation*, and Test in Europe, 2004, pp. 604-609.
- [45] H. Chan and Z. Zilic, "Modeling layout effects for sensitivity-based analog circuit optimization," in *Proceedings of IEEE/ACM International Symposium Quality Electronic Design*, 2005, pp. 390-395.

- [46] E. Rosa, "The self and mutual inductance of linear conductors," Bulletin of the National Bureau of Standards, vol. 4, no. 2, 1908, pp. 301-344.
- [47] Z. He, M. Celik, and L. Pileggi, "SPIE: Sparse partial inductance extraction," in Proceedings of IEEE/ACM Design Automation Conference, 1997, pp. 137-140.
- [48] B. Krauter and L. Pileggi, "Generating sparse partial inductance matrixes with guaranteed stability," in *Proceedings of IEEE/ACM International Conference on Computer-Aided Design*, 1995, pp. 45-52.
- [49] A. Devgan, H. Ji, and W. Dai, "How to efficiently capture on-chip inductance effects: Introducing a new circuit element K," in *Proceedings of IEEE/ACM* International Conference on Computer-Aided Design, 2000, pp. 150-155.
- [50] G. Zhong, C. Koh, and K. Roy, "On-chip interconnect modeling by wire duplication," in *Proceedings of IEEE/ACM Design Automation Conference*, 2002, pp. 341-346.
- [51] Y. I. Ismail, "Equivalent elmore delay for RLC trees," in *Proceedings of IEEE/ACM Design Automation Conference*, 1999, pp. 715-720.

## VITA

Yang Yi received the B.S. and M.En. degrees in electrical engineering from Shanghai Jiaotong University, China in 2003 and 2005, respectively. She then graduated with her Ph.D. degree in the Department of Electrical and Computer Engineering, Texas A&M University, College Station, Texas in 2009. During her Ph.D. study, she worked as an intern at IBM, Austin, Texas, from May 2007 to September 2007, and as a senior R&D engineer at Freescale Semiconductor Inc., Austin, Texas, from May 2008 to 2009. Her research interests include interconnect parasitic extraction, power grid analysis, and chip and package co-simulation.

Yang Yi won the title of Outstanding Student of Shanghai Jiaotong University in 2001 and 2002. She was a recipient of the Huawei Scholarship in 2004 and the student travel grant of ACM/IEEE Design Automation Conference in 2008. She is a Member of the Phi Kappa Phi honor society. She can be reached at Department of Electrical and Computer Engineering, Texas A&M University, College Station, Texas 77843-3128.

The typist for this dissertation was Yang Yi.