PROCESS INVARIANT CIRCUIT TECHNIQUES FOR RELIABLE MIXED SIGNAL SYSTEMS
A Dissertation Presented to the Faculty of the Graduate School
of Cornell University In Partial Fulfillment of the Requirements for the Degree of
Doctor of Philosophy
by Mustansir Yunus Mukadam
August 2012

© 2012 Mustansir Yunus Mukadam

PROCESS INVARIANT CIRCUIT TECHNIQUES FOR RELIABLE MIXED SIGNAL SYSTEMS
Mustansir Yunus Mukadam, Ph.D.
Cornell University 2012
CMOS scaling has enabled circuit designers to develop a wide variety of fully integrated mixed signal systems by taking advantage of the high switching speeds and lower noise figures offered by scaled devices. Unfortunately, scaled CMOS devices increasingly suffer from large variations in expected performance due to defects in manufacturing and fluctuations in environmental conditions. This phenomenon is termed process variation, and it ultimately impacts the yield of mixed signal systems. Post-fabrication tuning to correct for these effects is expensive and, in some cases, infeasible.
This work proposes a variety of circuit techniques to combat variations in standard mixed signal blocks such as low noise amplifiers (LNA), voltage controlled oscillators (VCO), and digital to analog converters (DAC).
An on-chip statistical technique, designed in the TSMC 65nm CMOS process, tracks changes in threshold voltage due to variations in process, temperature, and supply voltage, and provides an error correction signal to the LNA. Silicon measurements show that our technique reduces the variation in voltage gain of LNAs by a factor of 3.6. We also demonstrate, using a common source amplifier, that this technique can be applied to other amplifiers designed in advanced CMOS processes.
A switched capacitor based feedback loop, designed in the IBM 90nm CMOS process, generates an error signal based on the drift in the center frequency of VCOs and provides an appropriate correction signal to compensate for the drift. Measured results show a 2.5x reduction in center frequency variation of the VCO.
We propose using redundancy in a DAC to tighten the error distribution of DAC elements and improve non-linearity. Measured results of an 8 bit thermometer current steering DAC designed in the TSMC 65nm CMOS process show 38% reduction in non-linearity. Another technique to reduce non-linearity is reordering of elements based on their error distribution. This reduces non-linearity by an additional 30%. Combining both schemes significantly reduces induced non-linearity errors with minimal area and power increase.

BIOGRAPHICAL SKETCH
Mustansir Yunus Mukadam was born in Mumbai, India in April 1984. He grew up entirely in the United Arab Emirates, in the Middle East, shuttling between Abu Dhabi, Sharjah, and finally, Dubai, where he graduated first in his high school class of 2001. From the extreme heat of Dubai, he moved to the frigid climate of Montreal where he studied Electrical Engineering at McGill University. After graduating from McGill in 2006, Mustansir moved to the boonies in Ithaca, NY in the position of a Graduate Research Assistant in Dr. Apsel’s lab at Cornell University. He started his research with high speed opto-electronic receivers but finally settled on developing process invariant circuit techniques to improve yield and performance of various mixed signal blocks used in wireline and wireless systems. In between all of this, he took a break from the rigors of a PhD to intern as a Mixed Signal designer at PMC-Sierra Inc. in Vancouver, BC, from January 2011 to May 2011. Mustansir finished his PhD in August 2012 and moved to the Bay Area where he will work with the XBOX team at Microsoft in Mountain View.

ACKNOWLEDGEMENTS
This thesis would not have been possible without the contributions, help, and support from the following people.
First and foremost, my advisor, Dr. Alyssa Apsel, for her constant support in my work, belief in my ideas, and encouraging inputs and critiques in realizing and polishing the works in this thesis. This work would not have been possible without her.
My dissertation committee members, Dr. Ehsan Afshari and Dr. K. Max Zhang, for their guidance and help in various stages of this work.
My fellow collaborator, Ishita Mukhopadhyay, was essential for accomplishing the concepts, circuit design, layout, and testing of the work on digital to analog converters, and an excellent lab mate to interact with overall. Without Xiao Wang’s superior testing skills and board designs, we would not have been able to test various aspects of the DAC circuit. Oscar Filho and Xuan Zhang were crucial towards conceptualizing the work done in process invariant low noise amplifier and voltage controlled oscillator designs. I’d also like to thank the various group members I have interacted with over the past six years at Cornell; Rajeev, Bo Sr., Bo Jr., Wacek, Carlos, Tony, Zhongtao, Nick, and Jerry.
The support from Cornell NanoSciences and the National Science Foundation was crucial towards completion of this thesis. Without fabrication runs provided by TSMC, we would not have any results to show.

My family has always been a pillar of strength and motivation. My mother has provided me with moral support and guidance through each of my struggles during this thesis. Not a day goes by when I don’t miss my dad and his practical advice and sound judgment on all matters of life. My aunt Anjum, uncle Ebrahim, and cousins Sarah and Taher form part of the best family anyone could ever have. Lastly, the time spent at Cornell would not have been enjoyable and memorable without the many friends I made. Krishna, Shantanu, Suresh, and Tanay are the best compadres for daily coffee sessions, random discussions on life, and criticisms of the Indian cricket team. My Monday night trivia team for the entertainment and relief from the PhD. Various members of the Cornell India Association with whom I had the pleasure of organizing too many events to count. A special shout out goes to a ton of other friends I made at Cornell who are way too many to name in this thesis.

TABLE OF CONTENTS
Biographical Sketch
Acknowledgements
List of Figures
List of Tables
1. Introduction
  1.1: The CMOS Industry
  1.2: Scaling in Analog CMOS
  1.3: Process Variation
  1.4: Goal of the Dissertation
  1.5: Organization of the Dissertation
2. Process Variation in CMOS Technology
  2.1: Scale of Variations
    2.1.1: Inter-die Variations
    2.1.2: Intra-die Variations
  2.2: Random Dopant Fluctuation
  2.3: Line Edge Roughness
  2.4: Impact of RDF and LER on Threshold Voltage
  2.5: Impact on Integrated Circuits
  2.6: Impact of Process Variation on the IC Industry
  2.7: Impact of Process Variation on Energy Usage in Industry
3. Process Compensation of Amplifiers
  3.1: Introduction
  3.2: Related Work
  3.3: Amplifier Variations
    3.3.1: Variations in the LNA
    3.3.2: Variations in the CSA
  3.4: Correction Scheme
    3.4.1: Proposed Solution
    3.4.2: Temperature Variation
    3.4.3: Supply Voltage Variation
  3.5: Compensation Circuit Design
    3.5.1: First and Second Stage
    3.5.2: Third Stage
    3.5.3: Fourth Stage
    3.5.4: Performance of the Bias Circuit
  3.6: Design Example I – 3.2 GHz Low Noise Amplifier
    3.6.1: Measured PVT Results of the LNA
    3.6.2: Impact on Input Matching, NF, and Linearity of the LNA
    3.6.3: Yield Analysis and Comparison to Other Work
  3.7: Design Example II – Common Source Amplifier
  3.8: Conclusion
  3.9: Appendix A
    3.9.1: Derivation of transconductance variation in the Low Noise Amplifier
  3.10: Appendix B
    3.10.1: Derivation of gm variation with supply voltage
4. Process Compensation of Oscillators
  4.1: Introduction
  4.2: System Design Concept
  4.3: System Design
  4.4: Frequency Correction Unit
    4.4.1: Initialization Stage
    4.4.2: Comparison Stage
    4.4.3: Correction Stage
  4.5: Loop Stability
  4.6: Accuracy Analysis
  4.7: Low Variation Current Source
  4.8: Voltage Controlled Oscillator
  4.9: System Measurement Results
5. Mismatch Compensation of Digital to Analog Converters
  5.1: Current Steering DACs
  5.2: Differential Non-Linearity
  5.3: Integral Non-Linearity
  5.4: Prior Work in Calibration of Current Steering DACs
  5.5: Proposed Solution
    5.5.1: Error Analysis of Differential Non-Linearity
    5.5.2: Error Analysis of Integral Non-Linearity
    5.5.3: Redundancy
    5.5.4: An 8-bit Redundant Thermometer DAC
    5.5.5: Practical Realization of a Redundant N-bit DAC
    5.5.6: Reordering
    5.5.7: Reordering in a 2-Dimensional DAC
    5.5.8: Combining both Redundancy and Reordering
  5.6: Circuit Implementation and Challenges
    5.6.1: First Generation DAC Design
    5.6.2: Second Generation DAC Cell Design
    5.6.3: Generating the Mean Current
    5.6.4: Median Generation
    5.6.5: High Resolution Current Comparator
    5.6.6: Eliminating Outliers
    5.6.7: Cost of Calibration on Overall DAC Design
  5.7: Conclusion
6. Conclusion
  6.1: Conclusion
  6.2: Future Work
References

LIST OF FIGURES
Figure 1.1: Edholm's law projecting required bandwidth for various communication systems
Figure 1.2: RF transceiver market share versus technology
Figure 1.3: Simulated cutoff frequency of NMOS devices of various nominal gate lengths
Figure 1.4: NFmin at 2.4 GHz versus gate length
Figure 2.1: Scale of variations in an Integrated Circuit
Figure 2.2: Oscillation frequency of identical devices with different layouts
Figure 2.3: Variation extractor at various levels of the IC fabrication process
Figure 2.4: Atomistic simulation of a 50 x 50nm MOSFET
Figure 2.5: Scaling trend of Vth variance due to RDF
Figure 2.6: Distribution of Vth as a function of the number of dopants for a (a) 35nm device and (b) 13nm device
Figure 2.7: Potential distribution at the Si/SiO2 interface of two microscopically different MOSFETs
Figure 2.8: LER in advanced lithography processes
Figure 2.9: Potential distribution of a 200nm x 30nm MOSFET in the presence of LER
Figure 2.10: Vth fluctuations associated with LER as a function of its amplitude
Figure 2.11: Vth deviation in the presence of RDF, LER, and both effects
Figure 2.12: Normalized Vth variation of a 65nm MOSFET
Figure 2.13: Normalized leakage current distribution of a 65nm MOSFET
Figure 2.14: Fault statistics of a 32 K SRAM in 45nm technology
Figure 2.15: Measured leakage power and frequency for 62 dies
Figure 2.16: Distribution of the performance parameters of a narrowband LNA at 2.4 GHz
Figure 2.17: Impact of parameter variations on RF performance
Figure 2.18: Measured receiver gain and noise figure over fast, typical, and slow process corners
Figure 2.19: Measured receiver gain and noise figure over operating temperature (left) and supply voltage (right)
Figure 2.20: System level illustration of a 16-QAM RF Receiver
Figure 2.21: Yield of ICs and its impact on profit
Figure 2.22: Iterative performance calibration in which knobs are tuned until an IC is healed
Figure 2.23: Relative cost to manufacture and test a transistor
Figure 2.24: Energy use per cm2 of wafer area
Figure 3.1: Circuit diagram of the inductive cascode LNA
Figure 3.2: Circuit diagram of the common source amplifier
Figure 3.3: Circuit diagram of the compensated LNA
Figure 3.4: Circuit diagram of the bias circuit
Figure 3.5: gm of input transistors M1 and M2 in the compensated amplifier
Figure 3.6(a): Histogram of voltage gain of the uncompensated LNA over two wafer runs
Figure 3.6(b): Histogram of voltage gain of the compensated LNA over two wafer runs
Figure 3.7: Gain of uncompensated and compensated LNA with supply voltage variations
Figure 3.8: Gain of uncompensated and compensated LNA with temperature variation
Figure 3.9: Die photo of compensated LNA
Figure 3.10(a): Measured NF of the LNA without compensation
Figure 3.10(b): Measured NF of the LNA with compensation
Figure 3.11(a): Histogram of voltage gain of the uncompensated CSA over two wafer runs
Figure 3.11(b): Histogram of voltage gain of the compensated LNA over two wafer runs
Figure 4.1: System Diagram of the General Compensation Loop
Figure 4.2: Switched Capacitor based VCO tuning circuitry
Figure 4.3: Discrete time switched capacitor integrator
Figure 4.4: Timing waveform controlling switches in the frequency sensor
Figure 4.5: Initialization Stage
Figure 4.6: Comparison Stage
Figure 4.7: Correction Stage
Figure 4.8: Low Variation Addition Based Current Source
Figure 4.9(a): Histogram of uncompensated VCO
Figure 4.9(b): Histogram of compensated VCO
Figure 4.10: Temperature drift of baseline oscillator and oscillator in Switched Capacitor loop
Figure 4.11: Convergence behavior of VCO compensation loop
Figure 4.12: Die photo of Switched Capacitor based VCO Compensation
Figure 5.1: Calibration DAC used in a Direct Conversion receiver architecture
Figure 5.2: Offset compensation in comparators using a calibration DAC
Figure 5.3: DNL error in DACs
Figure 5.4: INL error in DACs
Figure 5.5: Introducing redundancies in current sources to reduce errors
Figure 5.6: Error Variance of current sources in an 8 bit DAC design with and without redundancy
Figure 5.7: Worst case DNL value with and without redundancy
Figure 5.8: DNL Variance with and without redundancy
Figure 5.9: Illustration of a 2-Dimensional current steering DAC
Figure 5.10: Variance of the DNL
Figure 5.11: 5000 Monte Carlo runs plotting DNL
Figure 5.12: Reducing INL error by alternatively switching
Figure 5.13: Worst case INL for 1-dimensional DAC with element reordering
Figure 5.14: Variance of INL for each code of the DAC over 5000 Monte Carlo runs
Figure 5.15: Worst case INL
Figure 5.16: Variance of INL at each code
Figure 5.17: 3σ of INLDAC across different current source mismatch
Figure 5.18: 3σ of for methods presented in this work
Figure 5.19: Positional chart to determine redundant sources
Figure 5.20: DAC Cell with SRAM
Figure 5.21: 6T SRAM with a transistor to read its contents
Figure 5.22: Outlier access
Figure 5.23: Unit Current Cell of thermometer current cell - Generation 1
Figure 5.24: Chip micrograph of the DAC designed in the TSMC 65nm process
Figure 5.25: Measured results of DNL of the 8-bit thermometer current steering DAC
Figure 5.26: Measured results for INL of 8-bit thermometer current steering DAC
Figure 5.27: Unit cell designed for the second generation of the DAC
Figure 5.28(a): Scenario 1: Cell under consideration is the highest accessed cell
Figure 5.28(b): Scenario 2: Cell under consideration occurs (red) before highest accessed cell
Figure 5.29: DAC DNL for various sizes of current contributing PMOS
Figure 5.30: Plot of median confidence for various errors
Figure 5.31: Median Generation using a successive approximation approach
Figure 5.32: High Precision Current Comparator
Figure 5.33: Sampling (fast) and averaging (slow) clocks used in comparison
Figure 5.34: Time-to-Digital converter
Figure 5.35: DNL reduction taking area occupied by the calibration circuitry into account

LIST OF TABLES
Table 3.1: Design Parameters of the Bias Circuit
Table 3.2: Performance of the Bias Circuits under Process Corners
Table 3.3: Comparison with Other Work
Table 3.4: Summary of Measurement Results
Table 5.1: Truth Table for Priority Encoder Scheme

CHAPTER 1
INTRODUCTION
The CMOS Industry

CMOS transistors have transformed the world in which we live. From portable electronics such as cellphones and tablets, to control systems in automobiles and transport systems, and complex communication systems including satellites orbiting our earth, transistors form an integral part of our daily lives. The semiconductor industry, virtually dominated by CMOS, is a $300 billion revenue market, growing at a rate of 30% annually [1]. This astronomical growth is partly due to CMOS technology continuing to follow Moore’s law. Moore’s law, proposed by Gordon Moore in 1965, is an economic indicator whereby the number of transistors placed inexpensively in an integrated circuit doubles roughly every 18 months [2]. This has been made possible by continuing to shrink the size of the CMOS transistor during the same interval.

Scaling CMOS processes is also beneficial for designing high performance analog and RF systems. Over the past decade we have achieved cellular data rates that match and even exceed the bandwidth obtained from the highest Ethernet speeds of the mid-noughties. This exponential increase in data rates has been termed Edholm’s law of bandwidth, in honor of Phil Edholm, chief technology officer of the now defunct Nortel Networks [3], and it closely follows Moore’s law. As seen in Figure 1.1, the three telecommunication categories – wireline, nomadic, and wireless – follow similar trends, with data rates increasing on exponential curves and wireless applications following their wireline counterparts with a constant time lag. This is not too surprising, however, since all of these technologies rely on the same core technology of the radio, with the wireless devices requiring faster and more powerful radio transceivers with the introduction of newer communication protocols.
Figure 1.1: Edholm's law projecting required bandwidth for various communication systems [4]
Researchers over the last ten years have focused on the ultimate goal of integrating analog and RF CMOS devices with the digital baseband on the same chip [5] [6] [7] [8] to take advantage of the inexpensive yet powerful digital logic, fast switching, and higher capacitance density obtained from digital CMOS processes. The use of RF CMOS devices in the analog front end would ultimately replace the high performing, but more expensive, silicon germanium (SiGe) and BiCMOS technologies previously used in RF front ends, with the aim of full system-on-a-chip integration and reduced costs and overheads associated with off-chip integration. Indeed, the increasing market penetration of analog and RF CMOS in the cellphone market, shown in Figure 1.2, confirms a continuously growing trend.

Figure 1.2: RF transceiver market share versus technology [9]
Scaling in Analog CMOS
Based on the chosen transistor widths for scaled CMOS, the scaling rules in (1.1) apply for RF CMOS performance [10].

(1.1)

λ is the technology scaling factor. Equation (1.1) indicates that, as transistor feature sizes shrink, their cutoff frequency fT increases. Woerlee et al. present the fT of nominal gate length NMOS devices as a function of drain current in [11], reproduced here in Figure 1.3. It is evident that, for both low and high drain currents, fT increases with downscaling, confirming the high potential of CMOS for RF applications at gigahertz frequencies.

Figure 1.3: Simulated cutoff frequency of NMOS devices of various nominal gate lengths [56]
The expression for the minimum noise figure of an FET, determined by Fukui in [12] and given in (1.2), indicates that NFmin scales with λ. This is verified in Figure 1.4. The decrease in NFmin is mainly due to the increase in fT.
(1.2)
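For reference, Fukui's expression is commonly written in the form below. This is the usual textbook form and is offered only as a sketch; the fitting factor $K$, the transconductance $g_m$, and the gate and source resistances $R_g$ and $R_s$ are assumed notation and may differ from the symbols used in (1.2).

```latex
F_{min} = 1 + K \, \frac{f}{f_T} \, \sqrt{g_m \left( R_g + R_s \right)}
```

In this form, at a fixed operating frequency $f$ the excess noise term shrinks as $f_T$ rises, which is consistent with attributing the decrease in NFmin mainly to the increase in fT.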
Figure 1.4: NFmin at 2.4 GHz versus gate length. Solid dots are obtained from fabricated devices in a standard 0.18-µm process [13].

Process Variation
Although CMOS scaling is advantageous in terms of the increase in fT and decrease in NF, scaled devices are more susceptible to shifts in performance from their nominal specifications, termed process variations. Process variation is a naturally occurring variation in the transistor’s physical properties (length, oxide thickness, etc.) due to defects in manufacturing. It is a continuous theme in the history of semiconductor manufacturing but is becoming more prominent as devices scale and variation becomes a larger percentage of critical dimensions. Variation in the electrical properties of CMOS transistors ultimately translates to variation in circuit performance such as amplifier gain, signal delay, and oscillator center frequency. As a result, process variation is a detriment to achieving robust integrated circuit systems in sub-micron CMOS, with the International Technology Roadmap for Semiconductors highlighting variation as a key bottleneck in the design of systems with high yield [14].
Goal of the Dissertation
The goal of this dissertation is to develop tools to combat variations in performance of critical analog and RF blocks used in a wide variety of mixed signal applications. The techniques developed are on-chip circuit solutions that occupy a small area and consume little power while significantly reducing degradation in circuit performance, where degradation is measured either as deviation from the nominally designed specification or as the difference in behavior of identically laid out components.
In order to overcome process variation, a self-calibrating statistical feedback loop is designed for a low noise amplifier (LNA). It measures changes in threshold voltage due to variations in process, temperature, and supply voltage, and generates a control signal to correct for deviation in LNA performance. This technique is also extended to a common source amplifier to demonstrate adaptability to other types of amplifiers. To correct for variation in the center frequency of voltage controlled oscillators (VCO), a switched capacitor feedback loop is designed to track frequency error and generate control signals to mitigate the drift in center frequency.
To minimize degradation in performance due to component mismatch, a calibration technique using redundancy in identically laid out elements is employed on a thermometer current steering digital-to-analog converter. With a small increase in the number of elements, 40% improvement in linearity is experimentally demonstrated. Reordering of elements based on their distribution is used to reduce integral non-linearity errors with minimal hardware penalty. We also propose combining both redundancy and reordering to further improve DAC linearity with low area and power penalty.
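The intuition behind redundancy can be illustrated with a small Monte Carlo sketch. It is only an illustration under stated assumptions: unit current sources are modeled with Gaussian mismatch, and the 1% sigma, the 16 spare elements, and the rule of discarding the elements farthest from the median are hypothetical choices made here rather than parameters of the measured design. For a thermometer DAC, the relative deviation of each unit source from the array mean is used as a stand-in for the per-step DNL.

```python
import random
from statistics import mean, median

N_ELEMENTS = 255   # unit sources in an 8-bit thermometer DAC
N_SPARES = 16      # hypothetical number of redundant sources
SIGMA = 0.01       # hypothetical 1% relative current mismatch


def worst_case_error(currents):
    """Largest relative deviation of any unit source from the array mean."""
    avg = mean(currents)
    return max(abs(i - avg) / avg for i in currents)


def select_with_redundancy(currents, keep):
    """Keep the `keep` sources closest to the median and discard the rest."""
    mid = median(currents)
    return sorted(currents, key=lambda i: abs(i - mid))[:keep]


def run_trials(trials=200, seed=1):
    """Average worst-case error with and without spare elements."""
    rng = random.Random(seed)
    baseline, redundant = [], []
    for _ in range(trials):
        pool = [rng.gauss(1.0, SIGMA) for _ in range(N_ELEMENTS + N_SPARES)]
        # Baseline: a DAC built from exactly N_ELEMENTS sources, no selection.
        baseline.append(worst_case_error(pool[:N_ELEMENTS]))
        # Redundant: choose the best N_ELEMENTS out of the larger pool.
        chosen = select_with_redundancy(pool, N_ELEMENTS)
        redundant.append(worst_case_error(chosen))
    return mean(baseline), mean(redundant)


baseline_err, redundant_err = run_trials()
print(f"mean worst-case element error without redundancy: {baseline_err:.4%}")
print(f"mean worst-case element error with redundancy:    {redundant_err:.4%}")
```

Selecting around the median rather than the mean is a design choice in this sketch: the median is less sensitive to the very outliers that the selection is trying to remove.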
Organization of the Dissertation
This dissertation consists of six chapters. Chapter 2 will discuss the sources and scale of process variation and its overall impact on mixed signal circuit systems and the semiconductor industry. In Chapter 3, we will present a technique to overcome amplifier gain variations due to variations in process, supply voltage, and temperature in low noise amplifiers and common source amplifiers. In Chapter 4, we will discuss a scheme to reduce the spread in center frequency of voltage controlled oscillators. In Chapter 5, we will present an approach using redundancy in identically laid out current sources in a thermometer current steering DAC to reduce nonlinearity errors. Finally, in Chapter 6, we will summarize the conclusions of this research and propose some future research directions.

CHAPTER 2
PROCESS VARIATION IN CMOS TECHNOLOGY
Scale of Variations
Process variation can generally be divided into two categories – inter-die variations and intra-die variations [1] [15].
Figure 2.1: Scale of variations in an Integrated Circuit [16]
Inter-die Variations
Inter-die variations occur from one die to the next. This means that the same device has different electrical characteristics and performance among different dies of a wafer, from wafer to wafer, and from wafer lot to lot. Lot-to-lot and wafer-to-wafer variations are caused by parameters such as process temperature, equipment properties, wafer polishing and placement. They affect every device on the chip equally, and are generally deterministic, or systematic, in nature [17]. Within-wafer variations can be attributed to issues such as resist thickness across dies [18].
Intra-die Variations
Intra-die variations are variations in device features present within a single chip. This means that seemingly identical devices have varied characteristics based on their location on the same chip. Systematic variations in devices within the same die have a known quantitative relationship to a source and can be modeled. For example, lithography and etching errors can easily be quantified. These variations have a strong spatial correlation and can be characterized by placing test structures at several locations on chip [19]. Layout dependent errors, which refer to two devices having different characteristics due to differences in layout, can be easily modeled in the design.
Figure 2.2: Oscillation frequency of identical devices with different layouts
In general, process variation at any scale of IC design can be decomposed with an additive model, shown in Figure 2.3, where estimates of variation at each level that match empirical observations are termed systematic, with residuals from one estimator becoming the input to the next level. The sum of all the estimates forms the systematic sources of variation, which can be attributed to non-idealities at different levels in the fabrication process. Systematic sources indicate how far the performance of the CMOS system is from the nominally designed value, and they can be simulated with process corners at set standard deviations from the mean value of an electrical parameter [20].
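The idea of passing residuals from one estimator to the next can be sketched in a few lines. This is a simplified illustration and not the estimators of Figure 2.3: each level is estimated here with a plain mean, the data are synthetic, and the function and variable names are hypothetical.

```python
import random
from statistics import mean


def decompose(measurements):
    """Split measurements into grand mean, wafer, die, and residual terms.

    `measurements` maps (wafer, die) -> list of values measured on that die.
    Each stage subtracts its estimate and hands the residual to the next.
    """
    grand = mean([v for vals in measurements.values() for v in vals])
    stage1 = {k: [v - grand for v in vals] for k, vals in measurements.items()}

    # Wafer-level estimate: mean of the stage-1 residuals on each wafer.
    wafer_ids = sorted({w for w, _ in stage1})
    wafer_effect = {
        w: mean([r for (wk, _), vals in stage1.items() if wk == w for r in vals])
        for w in wafer_ids
    }
    stage2 = {(w, d): [r - wafer_effect[w] for r in vals]
              for (w, d), vals in stage1.items()}

    # Die-level estimate: mean of what remains on each die.
    die_effect = {k: mean(vals) for k, vals in stage2.items()}

    # Whatever is left is treated as the random component.
    residual = {k: [r - die_effect[k] for r in vals]
                for k, vals in stage2.items()}
    return grand, wafer_effect, die_effect, residual


# Synthetic data: 3 wafers, 4 dies per wafer, 5 measurements per die.
rng = random.Random(0)
data = {}
for wafer in range(3):
    wafer_shift = rng.gauss(0.0, 0.05)
    for die in range(4):
        die_shift = rng.gauss(0.0, 0.02)
        data[(wafer, die)] = [
            1.0 + wafer_shift + die_shift + rng.gauss(0.0, 0.005)
            for _ in range(5)
        ]

grand, wafer_effect, die_effect, residual = decompose(data)
print(f"grand mean: {grand:.4f}")
for w, effect in wafer_effect.items():
    print(f"wafer {w} systematic offset: {effect:+.4f}")
```

By construction the four terms add back up to each original measurement, and the residuals on every die average to zero, which is what makes the leftover term a candidate for the random component.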
Figure 2.3: Variation extractor at various levels of the IC fabrication process
The final box represents the portion of residuals left over, and those are termed random sources of variation. Random sources of variation are due to statistical uncertainty in process conditions as critical lengths of CMOS devices scale [21], and they indicate how different the performance of two identically designed devices is, termed mismatch between two identical CMOS devices. Effects such as the varying number of dopants (due to their discrete nature in scaled processes), gate oxide thickness, and channel length variation lead to differences between identical devices, which can only be estimated with empirical models. We discuss these effects in the following section.
Random Dopant Fluctuation
As feature sizes shrink, statistical variation in the number and placement of dopant atoms in the MOSFETs leads to significant random fluctuations in transistor performance such as deviations in threshold voltage (Vth), drive current mismatch, and so on. Even if fluctuations due to lithographic dimensions and layer thicknesses can be well controlled, random fluctuation of the small number of dopant atoms and their microscopic arrangement in the channel will still lead to significant variations in the transistor’s electrical parameters. This phenomenon is known as random dopant fluctuation (RDF), shown in Figure 2.4. It is considered one of the significant contributors to mismatch between identical devices and to overall transistor variation, and it increases with device scaling as the average number of dopant atoms decreases.

Figure 2.4: Atomistic simulation of a 50 x 50nm MOSFET. (a) Potential distribution with position of dopants (b) One equi-concentration contour [22]
For example, the dopant concentration in the 65nm process is 1018 atoms/cm3 [23] . For a channel of minimum size ( = 60 nm), and width of twice the channel length, the average number of dopant atoms is 100. The dopants typically follow a Poisson distribution [24] with a standard deviation of the square root of the mean number of dopants. In our example, this translates to a 10% variation in the number of dopant atoms, which is a large uncertainty in dopant atoms for sub-micron devices. Since the threshold voltage is a function of the number of dopant atoms, this translates to significant Vth fluctuation, which affects circuit operation. Empirically, it has been shown by Asenov, et. al. in [25] hat the standard deviation of the MOSFET, shown in (2.1) is

proportional to the doping concentration and inversely proportional to the transistor’s dimensions.
(2.1)
As devices scale, it is expected that σVth due to RDF will increase. This effect has been captured by Ye et al. in Figure 2.5. We can see that the variation in threshold voltage is exacerbated at smaller device dimensions, indicating increased device variation due to RDF for advanced CMOS processes.
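The Poisson estimate used in the dopant-count example of this section can be checked with a short sketch. The mean of 100 dopants is the value quoted in the text; the smaller means are hypothetical and only illustrate how the relative spread grows as the dopant count falls with scaling.

```python
import math


def dopant_count_stats(mean_dopants: float) -> tuple[float, float]:
    """Return (sigma, sigma/mean) for a Poisson-distributed dopant count."""
    sigma = math.sqrt(mean_dopants)  # Poisson: variance equals the mean
    return sigma, sigma / mean_dopants


# 100 is the mean quoted in the text for the 65nm example; 50 and 25 are
# hypothetical values chosen only to show the trend with scaling.
for mean in (100, 50, 25):
    sigma, rel = dopant_count_stats(mean)
    print(f"mean = {mean:4d}  sigma = {sigma:5.2f}  relative spread = {rel:.1%}")
```

Because the relative spread is 1/sqrt(N), halving the number of dopants raises the relative uncertainty by a factor of about 1.4.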
Figure 2.5: Scaling trend of Vth variance due to RDF [26]
Figure 2.6 also shows the distribution of Vth for two devices with different channel lengths as a function of the number of dopant atoms. We notice the increasing mean and standard deviation for the 13nm device, highlighting the adverse effects RDF will have on scaled devices.

Figure 2.6: Distribution of Vth as a function of the number of dopants for a (a) 35nm device and (b) 13nm device [27]
As mentioned earlier, both the number and the placement of dopants in the channel affect the transistor’s performance. As shown in Figure 2.7, for two MOSFETs with the same number of dopant atoms (170), device (a) has more atoms closer to the channel surface than (b), translating to a higher Vth for (a). This discrepancy in Vth between two seemingly identical devices is also due to RDF.

Figure 2.7: Potential distribution at the Si/SiO2 interface of two microscopically different MOSFETs, both with 170 dopant atoms. (a) has Vth = 0.78V. (b) has Vth = 0.56V [27]
Line Edge Roughness
Another effect that contributes to variations in threshold voltage is line-edge roughness (LER), which is the distortion of gate shape along the channel width. This variation is mainly due to the gate-etch process. LER is a big concern in short channel transistors since its variance does not scale with technology, therefore it plays a bigger role in Vth variation in scaled CMOS processes [28] .

Figure 2.8: LER in advanced lithography processes. The inset shows LER found in sub-100nm ebeam generated lines [29]
In Figure 2.8 we can see that LER remains on the order of 5nm, independent of the type of lithography and channel length. This translates to a variation in potential distribution in the scaled device, shown by Reid et al. in Figure 2.9.
Figure 2.9: Potential distribution of a 200nm x 30nm MOSFET in the presence of LER [27]

Similar to RDF, random LER introduces Vth variations in MOSFETs, and its effect is enhanced in advanced CMOS devices, as shown in Figure 2.10.
Figure 2.10: Vth fluctuations associated with LER as a function of its amplitude [29]
The standard deviation of Vth due to LER depends on the standard deviation of the RMS value of LER, as shown by Ye et al. in [26] and presented here in (2.2).
(2.2)
WC is the correlation length of LER, C2 is a technology dependent coefficient, and l’ is the length of the DIBL effect. Further work has been done by Asenov et al. in [30] to accurately model the process parameters which contribute to LER.
Impact of RDF and LER on Threshold Voltage
The discussion presented above confirms that continued scaling exacerbates both RDF and LER effects. With continued scaling, the number of dopant atoms in the channel falls, making RDF more significant. As gate lengths continue to shrink, they approach the 3σ value of LER, dramatically increasing device sensitivity to LER effects. These effects modeled together impact threshold voltage as follows:
(2.3)
Graphically, this is represented in Figure 2.11. It is evident that the combined effect of RDF and LER increases the standard deviation of threshold voltage exponentially as device dimensions scale.
Figure 2.11: Vth deviation in the presence of RDF, LER, and both effects [26]
The impact this total variation in Vth has on transistor performance is evident in Figure 2.12, where the Vth fluctuations for a MOSFET designed in the 65nm process exceed 15%. The combined effect of RDF and LER also impacts normalized leakage current, as shown in Figure 2.13.

Figure 2.12: Normalized Vth variation of a 65nm MOSFET [31]
Figure 2.13: Normalized leakage current distribution of a 65nm MOSFET [31]
Impact on integrated circuits
Process variations of CMOS devices can be represented by a continuous probability distribution, empirical data, or a combination of both, where the total variation, P, can be expressed as a function of its known distributions, as follows:
(2.4)

Agarwal et al. show in [32] how a 30mV deviation in threshold voltage results in a low yield of 33.4% in an SRAM array designed in a 45nm CMOS process.
Figure 2.14: Fault statistics of a 32 K SRAM in 45nm technology
Tschanz et al. demonstrate in [33] how both inter-die and within-die variation affect the normalized frequency and leakage power of 62 test chips in the 150nm CMOS process.
Figure 2.15: Measured leakage power and frequency for 62 dies
Variation in frequency leads to speed binning, in which working ICs are sorted by their frequency of operation. High-frequency ICs command higher price points than their lower-frequency counterparts [34]. Since variations widen the spread in frequency, they affect the yield and profit margins of IC manufacturers. Process variation is also detrimental to the performance of RF CMOS applications. Figure 2.16 depicts the distribution of an LNA’s performance metrics obtained by Nieuwoudt et al. in [35]. Both the input and output impedance exhibit a distribution skewed from the mean, and the gain and power consumption show a 3σ variation of 30%. Extreme variation in gain and other metrics of the LNA can reduce the yield of RF front-ends to as low as 11% [36].
Figure 2.16: Distribution of the performance parameters of a narrowband LNA at 2.4 GHz [35]
Figure 2.17 shows the impact of process variation on RF figures of merit such as fT, fmax, and Gmax for five different CMOS technologies. We notice that the impact of parameter variations on 70nm CMOS is almost double that on the 250nm technology node. fT suffers the most from process variation because it directly depends on the CMOS parameters most affected by process variations. fmax and Gmax also depend on parasitics, which is why their variation is lower. We still observe over 30% variation in Gmax, which directly translates to gain variations of RF circuits. This leads to considerable overdesign in RF CMOS circuits to increase their tolerance to parametric variations and maintain a higher system yield.
Figure 2.17: Impact of parameter variations on RF performance [37]
The Vth variations in sub-micron CMOS transistors mentioned previously significantly impact the transconductance of various amplifier blocks in the RF receiver chain, and hence the power gain and noise figure of the entire receiver. RF performance is not only impacted by variations in process parameters. An integrated circuit has to work under a wide variety of dynamic environmental conditions, which lead to prominent drifts in temperature across the chip and fluctuations in the supply voltage to various circuit blocks in the RF system. A transceiver designed in 65nm CMOS by Tomkins et al. in [38] demonstrates how measured receiver gain and noise figure vary significantly across these three effects.

Figure 2.18: Measured receiver gain and noise figure over fast, typical, and slow process corners [38]
Figure 2.19: Measured receiver gain and noise figure over operating temperature (left) and supply voltage (right) [38]
To further study the impact that variation in individual components of an RF transceiver chain has on overall system performance, we simulated a 16-QAM receiver, shown in Figure 2.20, using the measured performance of the low noise amplifier and voltage controlled oscillator designed in the TSMC 65nm CMOS process. Applying as little as 5% variation in the gain of the LNA and 5% variation in the center frequency of the VCO, we observed that the BER of the receiver degrades by a factor of 10. This confirms that variation in critical analog components severely affects the sensitivity and linearity performance of typical RF receivers, degrades yield, and increases overall manufacturing costs.

Figure 2.20: System level illustration of a 16-QAM RF Receiver (blocks shown: LNA, Mixer, VCO, Filter, IF Amp, 16-QAM Demod., BERT)
Impact of Process Variation on the IC Industry
Ultimately, loss of yield translates directly to loss of profits for IC manufacturers, as indicated in Figure 2.21.

Figure 2.21: Yield of ICs and its impact on profit
To increase the yield of integrated circuits, the “bad” ICs, i.e. the ones whose performance is adversely affected by variations in process, need to be extensively tested and fine-tuned using expensive analog and RF automated test equipment (ATE). This can be a very cost- and time-intensive process, with the effort increasing exponentially with the number of tunable knobs that must be tweaked to “heal” the failing IC.
Figure 2.22: Iterative performance calibration in which knobs are tuned until an IC is healed [39]
As a result, even though the cost of manufacturing a transistor is rapidly declining, as shown in Figure 2.23, increasing variation in semiconductor devices is causing the cost to test a transistor to increase steadily, affecting the overall cost of manufacturing an IC. Testing an IC accounts for 40-50% of the total cost to manufacture an IC, and this number is projected to rise by as much as 75% within the next few years [40].

Figure 2.23: Relative cost to manufacture and test a transistor
Impact of Process Variation on Energy Usage in Industry
With growing concern about minimizing the environmental impact of human activity, it is relevant to discuss the energy usage of the semiconductor industry and the impact yield has on overall energy utilization in the manufacture of good ICs [41]. The energy used in manufacturing an IC has remained constant over the last decade at approximately 1.5 kWh/cm2 [42] [43], despite more transistors being packed in the same area.
Figure 2.24: Energy use per cm2 of wafer area [42]

This is also due to the cost to manufacture a transistor dropping in every node, as shown in Figure 2.23. The energy density of “working ICs” can be expressed as a function of yield as follows:
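\[ E_{working} = \frac{E_{manufactured}}{Y} \tag{2.5} \]
where E_manufactured is the energy used to manufacture one IC and Y is the yield.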
With current first-pass die yields of DRAMs at 50% and RF transceivers at approximately 20%, we can estimate fairly high energy usage in the manufacture of working ICs at 6 kWh and 450 Wh respectively. With RF CMOS accounting for 40% market share of RF transceivers in cellular phones – currently sized at over 1.2 billion [44] – we realize that low initial yields translate to high effective energy costs.
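The yield dependence of energy per working IC can be illustrated with a minimal sketch. It assumes the simple inverse-yield relation (energy per manufactured IC divided by yield), which matches the figures quoted in this section. The 90 Wh base value for an RF transceiver is an assumption back-calculated from the 450 Wh at 20% yield quoted above, not a sourced number.

```python
def energy_per_working_ic(energy_per_ic_wh: float, yield_fraction: float) -> float:
    """Energy spent per *working* IC, assuming energy scales as 1/yield."""
    if not 0.0 < yield_fraction <= 1.0:
        raise ValueError("yield_fraction must be in (0, 1]")
    return energy_per_ic_wh / yield_fraction


# ASSUMPTION: 90 Wh per manufactured RF transceiver, inferred from the
# 450 Wh at 20% yield quoted in the text (450 Wh * 0.20 = 90 Wh).
RF_TRANSCEIVER_WH = 90.0

for y in (0.20, 0.30, 0.45):
    wh = energy_per_working_ic(RF_TRANSCEIVER_WH, y)
    print(f"yield = {y:.0%}  energy per working IC = {wh:.0f} Wh")
```

Under this assumption, raising the yield from 30% to 45% lowers the energy per working transceiver from 300 Wh to 200 Wh.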
For example, a low noise amplifier, which is part of an RF transceiver circuit, can cost up to $0.3/IC and contains, on average, 3 tunable knobs. The overall cost to test the LNA is calculated to be $0.09 [109] for 3^3 = 27 tests. Total energy consumed per LNA is 45 Wh. If we introduce even one additional knob to overcome process variation, as shown in Figure 2.22, the costs and energy usage will rise exponentially to $0.8/IC and 426 Wh, an increase of 9x! It is obvious that the total energy usage also increases dramatically as we introduce additional knobs to overcome increasing variability in advanced CMOS nodes. The 9x increase seems unreasonably high, and rightly so, because IC manufacturers would rely on statistical data from batch testing to keep testing costs low. Although this number could not be obtained because the information is proprietary, there will still be some increase in costs and energy usage due to additional requirements on testing.

If we designed on-chip circuit solutions, we could eliminate a tuning knob, potentially dropping energy usage and costs by a factor of 6. Realistically, however, energy consumption must be looked at by taking the entire system into account. By designing self-healing blocks on the IC itself, if the LNA’s yield goes up by 50%, the overall yield of the RF transceiver increases from 30% to close to 45%. This drops energy used in manufacturing a working RF transceiver to 200 Wh, which is a saving of 100 Wh. With continuous scaling to 22nm and even beyond that, we, as designers, will encounter largely varying device characteristics, making it more challenging to design robust, reliable circuits with high yield. Mitigating process variation is a continuous theme in the semiconductor industry and various circuit solutions have been incorporated on integrated circuits to increase yields. By continuing to design ingenious solutions, process variation will not be an insurmountable barrier to Moore’s law, but simply another challenge to be overcome.


CHAPTER 3

PROCESS COMPENSATION OF AMPLIFIERS

Introduction

Low noise amplifiers (LNAs) are the first active block in almost all wireless receiver chains. An LNA is placed immediately after, or very close to, the receiver antenna and is used to boost the incoming signal power while adding as little noise and distortion as possible. Process variation severely affects the performance and yield of LNAs designed in modern processes, especially their voltage gain. According to Friis’ formula for a system of cascaded stages in (3.1), an LNA gain that is lower than the specification it is designed for will not suppress the noise contributions of later stages enough to meet the receiver’s sensitivity requirements. On the other hand, an LNA gain larger than the target value will cause the receiver to fail to meet its intermodulation specifications. It therefore becomes critical to keep the voltage gain of LNAs stable against process variations in order to maximize the yield of a receiver chain.


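\[ F_{total} = F_1 + \frac{F_2 - 1}{G_1} + \frac{F_3 - 1}{G_1 G_2} + \cdots + \frac{F_n - 1}{G_1 G_2 \cdots G_{n-1}} \tag{3.1} \]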
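The effect of LNA gain on receiver noise described by Friis’ formula can be sketched numerically. The stage gains and noise figures below are hypothetical and are not taken from this work. Friis’ formula uses linear power gains and noise factors, so the dB values are converted first.

```python
import math


def db_to_linear(db: float) -> float:
    """Convert a power quantity in dB to a linear ratio."""
    return 10.0 ** (db / 10.0)


def cascaded_noise_figure_db(stages):
    """Friis' formula. `stages` is a list of (power_gain_dB, noise_figure_dB)
    tuples in signal-path order; returns the total noise figure in dB."""
    total_f = 0.0
    gain_product = 1.0  # product of the linear gains of all preceding stages
    for i, (gain_db, nf_db) in enumerate(stages):
        f = db_to_linear(nf_db)
        total_f += f if i == 0 else (f - 1.0) / gain_product
        gain_product *= db_to_linear(gain_db)
    return 10.0 * math.log10(total_f)


# Hypothetical receiver: LNA, mixer, IF amplifier (illustrative values only).
nominal = [(15.0, 2.0), (5.0, 10.0), (20.0, 8.0)]
low_gain = [(12.0, 2.0), (5.0, 10.0), (20.0, 8.0)]  # LNA gain 3 dB below target

print(f"NF, nominal LNA gain: {cascaded_noise_figure_db(nominal):.2f} dB")
print(f"NF, low LNA gain:     {cascaded_noise_figure_db(low_gain):.2f} dB")
```

With a lower LNA gain, the later stages are divided by a smaller gain product and the total noise figure rises.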
In this chapter, we determine that the variation in threshold voltage of the input transistor is the main contributor to gain variations of LNAs and other standard amplifier configurations where transconductance determines gain. With this in mind, we design and develop a compensation scheme that measures the changes in threshold voltage and generates a bias signal for amplifiers in order to minimize deviations in their voltage gain. We experimentally demonstrate the validity of our method on an inductively degenerated cascode LNA and, to show that our scheme can also be adapted to a variety of such amplifier topologies, we employ it on a common source amplifier, which is used as a standard gain cell in many mixed signal system applications. Both topologies have been designed in the TSMC 65nm CMOS process.
Our work is the first experimental demonstration of successful on-chip PVT compensation of sub-micron amplifiers. Measurement results show that our method is successfully able to lower the variation in voltage gain of the LNA – centered at 3.2 GHz for WiMAX requirements – to 2.2%. This is a 3.7x reduction in the standard deviation of S21 as compared to a baseline, uncompensated LNA, translating to yield improvement of 50%. Our scheme also reduces the variation in voltage gain due to supply voltage and temperature variations by 9.4x and 1.5x respectively. Applying the same technique to a common source amplifier (CSA) shows similar

reductions in voltage gain variation. Our scheme occupies a small footprint and consumes very little additional power, making it an attractive low cost solution.
Related Work
Traditional approaches to detecting and correcting for variations in the gain of amplifiers have relied either on built-in-self-test (BIST) devices, which map the peak output signal to a corresponding DC value, or on additional circuitry that adapts to variations in process. A survey of the state of the art of other LNA compensation schemes in the literature shows good examples of these approaches.
While BIST based methods can provide precise correction, they generally require very high power back-end calibration circuitry, can affect the performance of the amplifier, and are costly in area. Han et al. devise a calibration scheme in [45] which demonstrates significant reduction in variation of LNA gain, but the presence of a DSP and tuning control circuitry makes it very costly in power and area. Jayaraman et al. in [53] also use peak detectors to maximize S21 gain, but off-chip calibration makes their approach impractical for on-chip, low power solutions.
Sen et al. in [54] use a sensing transistor at the output to control the current in the LNA. However, the large transistor used in the design makes the scheme unsuitable for low supply voltage processes. In [55], Sivonen et al. identify that the variation in gain of an LNA is a function of its load impedance and, by replacing the load resistor with a parallel combination of different resistance ratios, they demonstrate simulated voltage gain stability over process corners. However, variation of passive elements is reported to be much smaller than that of active elements [56], so the major contributor is the variation of the transconductance of the system. Gomez et al. employ a biasing circuit in [57] to control the variation in the gain of LNAs, but optimally sizing the circuit trades off performance in the presence of both process and temperature variations. This causes the scheme to under-perform with PVT variations. The bias circuit also suffers from stability issues addressed in [58].
Despite the existence of various proposed schemes mentioned above, there has been no experimental demonstration as yet of a precise, low power scheme, which corrects for variations in gain of common amplifier topologies. Our method is based on statistical feedback, where we rely on local match between transistors to track changes in threshold voltage – occurring due to process and temperature variation – from its nominal value. We then generate a correction signal to feed back to the amplifier and correct for changes in gain, without affecting its operation under nominal conditions. Our method also detects and compensates for gain variations caused due to fluctuations in supply voltage. We show that our scheme can be applied to a wide variety of amplifiers, can easily be scaled for advanced CMOS processes, requiring minimal area and power overhead for its implementation.
Amplifier Variations
In this section, we introduce both amplifier topologies, the CSA and the LNA, which we have used as design examples. We derive the process dependent terms that cause voltage gain variations and the necessary correction that needs to be applied to eliminate gain variations. We then discuss what a compensation scheme must accomplish to overcome variations in such topologies.

Variations in the LNA
Figure 3.1: Circuit diagram of the inductive cascode LNA

The inductively degenerated cascode LNA, shown in Figure 3.1, is used as the first active block in a variety of wireless receiver systems because it provides a good balance between input match, noise figure, and gain. The cascode configuration also provides excellent isolation at the input port. We calculate the resonance frequency of the LNA as \( \omega_0 = 1/\sqrt{(L_s + L_g)\,C_{gs}} \), where LS is the source degeneration inductor, Lg is the gate inductor, and Cgs is the gate-source capacitance of the input transistor. The input tank is able to boost the transconductance of the LNA to [59]:

\[ G_m = g_m Q_{in} = \frac{g_m}{\omega_0 C_{gs}\,(R_s + \omega_T L_s)} \tag{3.2} \]


Here gm is the small signal transconductance of the input transistor, Rs is the impedance of the input source or the antenna, ωT ≈ gm/Cgs is the unity-gain frequency of the input transistor, and Qin is the quality factor of the input series RLC tank. Using these relations, we can rewrite (3.2) as:

\[ G_m(p) = \frac{g_m(p)\,\sqrt{(L_s + L_g)\,C_{gs}(p)}}{C_{gs}(p)\,R_s + g_m(p)\,L_s} \tag{3.3} \]

where ‘p’ is the state variable denoting the process conditions. To achieve zero variations in the voltage gain of the LNA, we must minimize the total variations seen in (3.3), i.e., we want ΔGm(p) = 0. We note that variation of spiral inductors in sub-micron processes has been shown to have an insignificant impact on the performance of LNAs [50]. Work done in [60] shows that, by setting partial derivatives with respect to ‘p’ to zero, a rule for compensation of the circuit can be derived. Since we have written Gm as a function of electrical parameters of the LNA topology which also suffer from process variations, we can use the above method in (3.4)

(3.4)


Variables with subscript ‘0’ represent values at the nominal process corner. Around the input match condition where RS = ωTLS, we write the total variation in Gm as:

ΔΔ

(3.5)

Equation (3.5) indicates that, in order to have no variation in the transconductance of the LNA, the variation in the input transistor’s transconductance must be zero. A detailed derivation of (3.5) is shown in Appendix A.
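To make the relations of this subsection concrete, the sketch below evaluates the resonance frequency, the input-tank quality factor, and the boosted transconductance for a set of hypothetical component values. It assumes the standard textbook expressions for an inductively degenerated stage (ωT ≈ gm/Cgs and Qin = 1/(ω0·Cgs·(Rs + ωT·Ls))). These are assumptions for illustration and may differ in form from the expressions derived in Appendix A.

```python
import math


def lna_input_stage(gm, cgs, ls, lg, rs):
    """Return (w0, wT, Q_in, G_m) for an inductively degenerated input stage.

    Textbook approximations, assumed here for illustration only.
    """
    w0 = 1.0 / math.sqrt((ls + lg) * cgs)     # series resonance of the input tank
    wt = gm / cgs                             # unity-gain frequency approximation
    q_in = 1.0 / (w0 * cgs * (rs + wt * ls))  # Q of the series RLC input network
    return w0, wt, q_in, gm * q_in


# Hypothetical values, chosen only so the stage resonates at 3.2 GHz and is
# matched to a 50-ohm source (Rs = wT * Ls).
GM, CGS, RS = 20e-3, 100e-15, 50.0
LS = RS * CGS / GM                            # enforces Rs = wT * Ls
W_TARGET = 2.0 * math.pi * 3.2e9
LG = 1.0 / (W_TARGET ** 2 * CGS) - LS         # sets the resonance frequency

w0, wt, q_in, g_m = lna_input_stage(GM, CGS, LS, LG, RS)
print(f"f0 = {w0 / (2 * math.pi) / 1e9:.2f} GHz, Q_in = {q_in:.2f}, "
      f"G_m = {g_m * 1e3:.1f} mS")

# A +10% shift in gm (with Cgs held fixed) changes the boosted transconductance.
_, _, _, g_m_fast = lna_input_stage(1.1 * GM, CGS, LS, LG, RS)
print(f"G_m change for +10% gm: {100 * (g_m_fast / g_m - 1):.1f}%")
```

At the matched condition the boosted transconductance reduces to ωT/(2·ω0·Rs), which the sketch reproduces.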
Variations in the CSA

Figure 3.2: Circuit diagram of the common source amplifier
The common source amplifier (CSA) is a basic amplifying cell used in a variety of mixed signal applications. Shown in Figure 3.2, its gain is a strong function of the gm of the input transistor, M1. By taking the partial derivatives with respect to process, as done above, we can derive the following relationship for the overall transconductance:

ΔΔ

(3.6)

In (3.6), we can see that, by eliminating variations due to process in gm, we can ensure that the variation in gain of CSAs can also be minimized.
From (3.5) and (3.6), we realize that we need to eliminate variations in gm of the input transistor to eliminate gain variations. The LNA and CSA are examples of amplifiers where transconductance determines gain, therefore we need to develop a general method to eliminate variations in the input transistor’s gm to compensate for gain variations of similar amplifiers.
Correction Scheme
Proposed Solution
In order to eliminate variations in transconductance due to process, we replace the Minput in Figure 3.1 and 3.2 with two input transistors in parallel. Figure 3.3 shows the modification made to the LNA as an example. The same change can be made to the input transistor of the CSA. The total input gm is now the sum of the individual gm of the transistors and our goal is for Δgmtotal = 0 in (3.7) for voltage gain variations of the amplifier to equal zero.


Figure 3.3: Circuit diagram of the compensated LNA. Transistors M1 and M2 are in parallel and form the input transistor of the LNA. A similar modification is made to the input transistor of the CSA
To accomplish this, we want Δgm1 and Δgm2 to move in opposite directions with process variations. From Figure 3.3,

\[ \Delta g_{m,total} = \Delta g_{m1} + \Delta g_{m2} \tag{3.7} \]

In the scheme, Vgs1 – the gate bias of M1 – is a set DC bias that does not vary. It can be generated from a bandgap reference or supplied externally. The nominal value of the DC bias Vgs2 of M2 is equal to Vgs1. We size M1 and M2 equally – both transistors are half the size of the input transistor in Figure 3.1 – and place them close to each other in layout to ensure that

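Vth1 ≈ Vth2 [61]. For sub-micron transistors, \( I_D = \kappa\,(V_{gs} - V_{th})^{\alpha} \), where κ is the current gain of the transistor (set by mobility, oxide capacitance, and device dimensions) and α represents the non-idealities due to short-channel effects [62]. With these conditions for the system:

\[ g_{m1} = \alpha\,\kappa\,(V_{gs1} - V_{th})^{\alpha-1}, \qquad g_{m2} = \alpha\,\kappa\,(V_{gs2} - V_{th})^{\alpha-1} \tag{3.8} \]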

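Since Vgs2, Vth, and κ are process dependent terms, the variations in gm1 and gm2 with respect to disturbances in process are:

\[ \Delta g_{m1}(p) = \alpha V_{OD,0}^{\alpha-2}\left[ V_{OD,0}\,\Delta\kappa(p) - (\alpha-1)\,\kappa_0\,\Delta V_{th}(p) \right] \]
\[ \Delta g_{m2}(p) = \alpha V_{OD,0}^{\alpha-2}\left[ V_{OD,0}\,\Delta\kappa(p) + (\alpha-1)\,\kappa_0\left( \Delta V_{gs2}(p) - \Delta V_{th}(p) \right) \right] \tag{3.9} \]

VOD,0 is the nominal gate overdrive voltage of M1 and M2 and κo is the nominal current gain. We express the total variation in gm of the transistors as:

\[ \Delta g_{m,total}(p) = \alpha V_{OD,0}^{\alpha-2}\left[ 2 V_{OD,0}\,\Delta\kappa(p) + (\alpha-1)\,\kappa_0\left( \Delta V_{gs2}(p) - 2\,\Delta V_{th}(p) \right) \right] \tag{3.10} \]

For Δgmtotal = 0, the condition on ΔVgs2 now becomes

\[ \Delta V_{gs2}(p) = 2\,\Delta V_{th}(p) - \frac{2 V_{OD,0}}{(\alpha-1)\,\kappa_0}\,\Delta\kappa(p) \tag{3.11} \]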

We can extend the analysis previously shown in [63] to include the dependence on Δκ(p), i.e. the variation in the transistor’s current gain. Under process variations, a positive Δκ, due to an increase in mobility and oxide capacitance, has the same impact as a decrease in threshold voltage: it increases the transistor’s drive current. Therefore, it is equivalent to say that Δκ in the second term of (3.11) can be replaced with some fraction of -ΔVth. Based on this, we can represent the required bias for Vgs2 as:
\[ V_{gs2}(p) = V_{gs1} + \Gamma\,\Delta V_{th}(p) \tag{3.12} \]
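The cancellation that the two-transistor scheme relies on can be checked with a small numerical sketch. It assumes an alpha-power-law device, gm = α·κ·(Vgs − Vth)^(α−1), and holds κ constant, so the compensating slope used here is 2 rather than the Γ = 2.8 found from Monte Carlo simulation, which also absorbs the Δκ contribution. The values of α and the nominal threshold voltage are hypothetical. The 0.5 V gate bias is the nominal value quoted later in this chapter.

```python
def gm_alpha_power(kappa, vgs, vth, alpha):
    """Transconductance of an alpha-power-law device: d/dVgs of kappa*(Vgs-Vth)^alpha."""
    return alpha * kappa * (vgs - vth) ** (alpha - 1.0)


def relative_gm_change(d_vth, slope, *, kappa=1.0, vgs1=0.5, vth0=0.35, alpha=1.3):
    """Relative change of gm1 + gm2 when Vth shifts by d_vth.

    M1 keeps the fixed bias vgs1; M2 is rebiased to vgs1 + slope * d_vth.
    slope = 0 models an uncompensated pair. vth0 and alpha are hypothetical.
    """
    nominal = 2.0 * gm_alpha_power(kappa, vgs1, vth0, alpha)
    gm1 = gm_alpha_power(kappa, vgs1, vth0 + d_vth, alpha)
    gm2 = gm_alpha_power(kappa, vgs1 + slope * d_vth, vth0 + d_vth, alpha)
    return (gm1 + gm2) / nominal - 1.0


D_VTH = 0.010  # +10 mV threshold shift (illustrative)
uncompensated = relative_gm_change(D_VTH, slope=0.0)
compensated = relative_gm_change(D_VTH, slope=2.0)
print(f"uncompensated: {uncompensated:+.3%}")
print(f"compensated:   {compensated:+.3%}")
```

With κ fixed, the first-order terms of the two transistors cancel and only a small second-order residue remains. In this sketch gm1 falls while gm2 rises, which is the opposite-direction behaviour the scheme is designed to produce.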
Performing Monte Carlo simulations on the modified design in SpectreRF for various values of Γ provides us with an optimum value of 2.8, which gives the lowest variation in voltage gain of amplifiers for the TSMC 65nm CMOS process. The dependence of (3.11) on α allows the method to be applied to more advanced technologies as well. Before we present a circuit implementing (3.12), a discussion of the robustness of the scheme against changes in supply voltage and temperature is important to ensure reliable operation in various environments.
Temperature Variation
Recent studies have shown the adverse effects temperature variation has on power consumption, leakage, voltage gain, and noise performance of amplifiers [64][65]. In (3.8), the parameters that are most affected by temperature are the carrier mobility and threshold voltage of the transistor. From [66]:
\[ \mu(T) = \mu(T_0)\left(\frac{T}{T_0}\right)^{-\sigma_\mu}, \qquad V_{th}(T) = V_{th}(T_0) - \sigma_v\,(T - T_0) \tag{3.13} \]

where T0 and T are the reference and operating temperatures respectively. σµ is the mobility exponent constant between 1 and 2, and σv is the threshold voltage temperature constant ranging from 0.5 mV/K to 3 mV/K.
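The two temperature dependences can be evaluated with a short sketch. It assumes the commonly used power-law mobility model and a linear threshold-voltage model. The constants (σµ = 1.5, σv = 1 mV/K) are hypothetical picks from inside the ranges stated above, and the functional forms are assumptions rather than a transcription of (3.13).

```python
def mobility_ratio(t_kelvin, t0_kelvin=300.0, sigma_mu=1.5):
    """mu(T)/mu(T0), assuming a power law with exponent -sigma_mu."""
    return (t_kelvin / t0_kelvin) ** (-sigma_mu)


def vth_shift(t_kelvin, t0_kelvin=300.0, sigma_v=1e-3):
    """Vth(T) - Vth(T0) in volts, assuming a linear drop of sigma_v per kelvin."""
    return -sigma_v * (t_kelvin - t0_kelvin)


for t in (273.0, 300.0, 323.0):
    print(f"T = {t:.0f} K  mobility ratio = {mobility_ratio(t):.3f}  "
          f"Vth shift = {vth_shift(t) * 1e3:+.1f} mV")
```

Mobility and threshold voltage both fall as temperature rises. These two effects pull the drive current in opposite directions, which is why the net temperature dependence of gm depends on the bias point.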
We can derive the temperature dependence of transconductance of the uncompensated amplifier – gm,uncomp – from (3.13) as

Δ

Δ ( ( ))

(3.14)

For a temperature range of 273K to 323K and moderate inversion of input transistors, . The temperature dependence of the transconductance of the compensated
amplifier is

Δ

Δ ( ( ))

(3.15)

Since the compensated circuit contains an extra

, our scheme is able to

minimize the temperature effect on gm due to the threshold voltage when compared to the uncompensated case.

Supply Voltage Variation

Increased transistor count due to transistor scaling, together with decreased supply voltages, causes large IR and di/dt events. This leads to supply voltage variations on chip, adversely affecting the IC’s performance [67]. Our scheme needs to be designed to minimize the impact of this variation on the voltage gain of amplifiers as well. A detailed derivation of the supply voltage dependence of gain is provided in Appendix B. Here we summarize the results.
For the uncompensated amplifier – biased with a constant dc bias – we obtain the variation in transconductance by taking the partial derivatives with respect to disturbances in VDD (represented by state variable ‘s’) as follows:

(3.16)

VOD,0 is the nominal overdrive voltage of the input transistor, RL is the output load impedance of the amplifier, and λ accounts for channel length modulation. From (3.16) we infer that gm,uncomp has a linear dependence of λ with respect to VDD. Hence the gain will also increase linearly with VDD.

We have biased M2 of the compensated amplifier with a circuit representation of (3.12) which, as we will show in the next section, is designed to have a dependence on VDD as well. M1 is once again biased with a constant dc source. It must also be noted that at nominal VDD, M1 and M2

have approximately the same gate bias. To ensure that

equals zero, we derive the

following condition on the generated bias voltage.

(3.17)

Based on the nominal bias conditions of the amplifier and process parameters for the 65nm process, the slope in (3.17) equals -0.25. Therefore, by designing ΔVgs2 to have a dependence on ΔVDD close to this value, we can eliminate all first order gain variations of the compensated amplifier with supply voltage. In the next section, we will discuss how we can engineer a bias circuit to exhibit this dependence on VDD while also generating (3.12).
Compensation Circuit Design
To generate a second compensating bias that will satisfy (3.12), we design the bias circuit shown in Figure 3.4 with the following properties: the output of the block must provide a DC bias which has a nominal value of Vgs1 and exhibits positive correlation with the threshold voltage with a slope of Γ. It must also have a dependence of approximately -0.25 on changes in supply voltage. All transistors in the four stage cascade configuration are biased in saturation and the output of the fourth stage provides us with (3.12). In Figure 3.4, β is a scaling factor generated from a resistive divider, and Aj2 is a width multiplier for transistors in stage j. Ratioing each stage gives us control over the bias circuit’s power consumption.

Figure 3.4: Circuit diagram of the bias circuit designed to compensate for process, temperature, and supply variations in the LNA
First and Second Stage
In order to analyze these stages, we apply KCL on the output nodes. At the output of the first stage,

(3.18)

By taking partial derivatives of Vo1 with respect to PVT variations, we get

ΔΔ Δ

(3.19)

Similarly, and using the result from (3.19), the dependence of the output of stage 2 with respect to PVT is:

√ ()

(√ (

))

(3.20)

Third Stage
The output dependences of the third stage give us more control on the coefficients for ΔVth and ΔVDD. Following a similar analysis we get:

√

((

))

((

√
))

(3.21)

By choosing our constants κj and Aj, we can design the bias circuit to accurately compensate for variations in both threshold and supply voltage.
Fourth Stage
The fourth stage is used primarily to adjust the nominal DC bias of the output. Following a KCL analysis on the output node, we can derive the dependencies of Vout in (3.22).

√ ((

√ ( )))


(3.22)

√ (

(

√ ))

By adjusting the value of β and carefully sizing the fourth stage, we design Vout to have a nominal value of 0.5 V, which is chosen as an optimal value for the targeted voltage gain, noise, linearity, and power consumption for the 65nm technology. Since each term in (3.22) is a combination of well-defined constants over which we have complete design control, the design parameters shown in Table 3.1 are optimized for lowest gain variations due to process, temperature, and supply voltage.

TABLE 3.1
DESIGN PARAMETERS OF THE BIAS CIRCUIT

Design Parameter      Value
A1                    2.17
A3                    2.25
β                     0.85
√(κ5/κ4)              1.73
√(κ2b/κ2a)            1.73
Γ                     2.8
∆VDD coefficient      -0.18

Extracted simulation results in Figure 3.5 show the percentage variation of gm for transistors M1 and M2 in the compensated amplifier over a ±100 mV supply voltage sweep. We observe that gm of M1 and M2 move in opposite directions to cancel total Δgm over VDD.


Figure 3.5: gm of input transistors M1 and M2 in the compensated amplifier
Performance of the Bias Circuit

We simulate only the inductively degenerated LNA, both with and without the bias circuit, to see how accurately our method is able to compensate for process variations at every manufacturing corner. We also note the bias voltage generated at every corner and compare it to the optimum dc bias voltage required for Vgs2 to keep the voltage gain constant across all corners. Results are in Table 3.2. The CSA shows similar performance improvements.

TABLE 3.2
PERFORMANCE OF THE BIAS CIRCUITS UNDER PROCESS CORNERS

Corner   % Variation of S21,   % Variation of S21,   Vgs2 generated by    Required Vgs2 for zero
         uncompensated LNA     compensated LNA       bias circuit (V)     S21 variation (V)
TT       0                     0                     0.48                 0.48
SS       29.83                 4.5                   0.64                 0.70
FF       19.3                  0.05                  0.30                 0.29
SF       19.02                 2.45                  0.58                 0.60
FS       12.2                  0.57                  0.37                 0.38
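The entries of Table 3.2 can be cross-checked with a few lines of code. The sketch below uses only the tabulated values and computes, per corner, the error between the generated and the required Vgs2, together with the worst-case gain variation before and after compensation. The dictionary layout and function name are illustrative choices.

```python
# (uncomp. S21 variation %, comp. S21 variation %, generated Vgs2 V, required Vgs2 V)
CORNERS = {
    "TT": (0.00, 0.00, 0.48, 0.48),
    "SS": (29.83, 4.50, 0.64, 0.70),
    "FF": (19.30, 0.05, 0.30, 0.29),
    "SF": (19.02, 2.45, 0.58, 0.60),
    "FS": (12.20, 0.57, 0.37, 0.38),
}


def bias_error_mv(row):
    """Generated minus required Vgs2, in millivolts."""
    return (row[2] - row[3]) * 1000.0


worst_corner = max(CORNERS, key=lambda c: abs(bias_error_mv(CORNERS[c])))
worst_uncomp = max(row[0] for row in CORNERS.values())
worst_comp = max(row[1] for row in CORNERS.values())

for name, row in CORNERS.items():
    print(f"{name}: bias error = {bias_error_mv(row):+.0f} mV")
print(f"largest bias error at {worst_corner}; worst-case S21 variation "
      f"{worst_uncomp}% uncompensated vs {worst_comp}% compensated")
```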


The bias circuit exhibits a maximum deviation of 60mV from the optimal value. The maximum gain variation from the TT corner is 4.50% as opposed to the base case of 29.83%, validating the scheme. The difference in sizes of the transistors and their relative distance causes some mismatch, which can affect the ability of the circuit to accurately track changes in threshold voltage. There is also some error from the mismatch of the two input transistors of the amplifier and variation of the resistive load. We have minimized these effects with common centroid layout techniques to eliminate gradient effects and dummy elements to mitigate LOD effects. Relative sizing of each stage allows us to keep the transistors small and limit the power consumption.
In the next two sections, we take the reader through two design examples, a 3.2 GHz Low Noise Amplifier and a Common Source Amplifier, to demonstrate the reduction in PVT variations with our compensation scheme. We present measured results for both topologies designed in the TSMC 65nm standard CMOS process fabricated over multiple wafer runs.
Design Example I – 3.2 GHz Low Noise Amplifier
Measured PVT Results of the LNA
Figure 3.6 (a) and (b) show the histograms for the measured voltage gain of the uncompensated and compensated LNA, from 100 chips over multiple wafer runs.

Figure 3.6(a): Histogram of voltage gain of the uncompensated LNA over two wafer runs

Figure 3.6 (b): Histogram of voltage gain of the compensated LNA over two wafer runs

The uncompensated LNA has a voltage gain variation of 8.07% over two wafer runs, while the compensated LNA has a much smaller spread and a narrower shift in mean voltage gain over two runs, with a standard deviation over mean gain of 2.19%. This is a reduction in variation of 3.7x. We also sweep the supply voltage for both the uncompensated and compensated LNA by ±10% to observe the effects of supply variation. Measurement results are shown in Figure 3.7. We measure the variation of the voltage gain of the uncompensated LNA due to supply voltage variations as 275 ppt/V. The variation in gain due to supply voltage variations is defined as
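\[ \frac{\Delta Gain(V_{DD})}{Gain(V_{DD,0}) \, \Delta V_{DD}} \]
where Gain(VDD,0) is the voltage gain at the nominal supply voltage, ΔGain(VDD) is the spread between Gain(VDD) and the nominal gain, and ΔVDD is the corresponding deviation of the supply voltage from its nominal value.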
In the compensated LNA, the voltage gain is almost constant over the entire supply voltage range; the variation is 29 ppt/V. It is noteworthy that we achieved an almost flat voltage gain over the supply voltage range without any post-fabrication calibration or process trimming. To measure the voltage gain across temperature, we use a probe station equipped with a vacuum chamber. Liquid hydrogen cooling allows us to measure a temperature range of 273K to 373K.

The temperature variation of the voltage gain is defined as

\[ \frac{\Delta Gain(T)}{Gain(T_0) \, \Delta T} \]

where Gain(T0) is the voltage gain of the amplifier at room temperature (300K), ΔGain(T) is the difference between Gain(T), the gain at temperature T, and Gain(T0), and ΔT = T − T0. The measurement results are shown in Figure 3.8. With no bias compensation, the gain of the LNA varies as much as 2310 ppm/°C. By applying compensation, we lower the voltage gain variation of the LNA to 1554 ppm/°C.
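The supply and temperature figures of merit can be computed from a measured sweep along the following lines. This is a minimal Python sketch that assumes the metric is the worst-case fractional gain change per unit change in the swept variable, which is consistent with the ppt/V and ppm/°C units but is an assumption about the exact definition. The function name and the sweep values are hypothetical, and the data points are synthetic, not measurements.

```python
def normalized_sensitivity(x_values, gains, x_nominal, scale):
    """Worst-case |dGain| / (Gain_nominal * |dx|) over a sweep.

    scale = 1e3 gives parts per thousand (ppt), 1e6 gives ppm.
    x_nominal must be one of the swept points.
    """
    gain_nominal = gains[x_values.index(x_nominal)]
    worst = 0.0
    for x, gain in zip(x_values, gains):
        if x == x_nominal:
            continue  # the nominal point has no deviation to normalise
        fractional_change = abs(gain - gain_nominal) / gain_nominal
        worst = max(worst, fractional_change / abs(x - x_nominal))
    return worst * scale


# Synthetic +/-10% supply sweep around an assumed 1.0 V nominal supply.
supply_v = [0.9, 1.0, 1.1]
gain_v_per_v = [2.9, 3.0, 3.1]
supply_sensitivity_ppt_per_v = normalized_sensitivity(
    supply_v, gain_v_per_v, x_nominal=1.0, scale=1e3
)
print(f"{supply_sensitivity_ppt_per_v:.1f} ppt/V")
```

The same function applies to a temperature sweep by passing temperatures as `x_values` and `scale=1e6`.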

Figure 3.7: Gain of uncompensated and compensated LNA with supply voltage variations
Figure 3.8: Gain of uncompensated and compensated LNA with temperature variation

The bias circuit occupies 0.0013 mm² and consumes 0.68 mW. The uncompensated LNA consumes 6.88 mW. The die photo of the compensated LNA is shown in Figure 3.9.

Figure 3.9: Die photo of compensated LNA

Impact on Input Matching, NF, and Linearity of the LNA

Apart from voltage gain, process variation affects the input matching characteristics, noise figure, and linearity of LNAs and the overall performance of wireless receiver systems, as indicated in (1). From [59], at the resonant frequency of the LNA, we design the real part of the input impedance, ωT·LS, to be as close as possible to the 50 Ω impedance of the antenna or input source in order to maximize the return loss, i.e. minimize S11 of the LNA. Process variations will affect S11, causing the LNA to have a lower effective power delivery to the receiver chain. Since our scheme minimizes variation in gm, we expect variability in S11 to be minimized. Indeed, 20 chips tested at random exhibit an S11 of less than -11.0 dB at 3.2 GHz when the compensation is applied, as opposed to a worst case S11 of -8.0 dB for the uncompensated LNA.

The Noise Factor of an inductive degenerated LNA is given in [70] as


()

(3.23)

where γ is the coefficient for channel thermal noise, ξ is the ratio of the device transconductance to the zero-bias drain conductance, Rg is the gate resistance of the input transistor, and gm,input is the nominal transconductance of the input device. In (3.23) we see a clear dependence of NF on gm,input and expect some reduction in its variation with compensation. Measured data for 9 chips, in Figure 3.10 (a) and (b), show the expected reduction in NF. The average NF remains below 2.9 dB around the 3.2 GHz operating frequency of the LNA. This is comparable to previously reported LNAs operating in a similar frequency range [68][69].

Figure 3.10(a): Measured NF of the LNA without compensation

Figure 3.10(b): Measured NF of the LNA with compensation

Third order intermodulation distortion can be expressed as [70]:

(3.24)


where gm,input is the input transconductance, gm3 is its second derivative, and RS is the input resistance. Monte Carlo simulations of the IIP3 in SpectreRF show a σ/μ spread of 7.8% for the compensated LNA as opposed to 12.1% for the uncompensated LNA.

Yield Analysis and Comparison to Other Work

The compensation scheme increases the number of working LNAs by 25 if we introduce a lower bound gain constraint of 10 dB, the gain we design for in our prototype. We can determine a constraint for the upper bound on the gain, based on the input compression point of the next stage in the receiver, which is related to the power budget, and range of the wireless system. As an example, if the application is able to support a ±5% tolerance in the gain of the LNA, our compensation scheme increases the yield of working LNAs by 50%. Comparison with other published works is shown in Table 3.3. Our proposed scheme experimentally demonstrates the lowest gain sensitivity to variations in process. We also experimentally demonstrate supply voltage and temperature compensation of the LNA’s gain. We show comparable reductions in gain variation to the technique in [45] but consume less power and area since that scheme relies on off-chip hardware for external calibration of the LNA. [57] and [55] both use variation adaptive circuitry to compensate for gain variations in simulations but we demonstrate better PVT control with measured results. Our method shows higher measured yield improvements with less power consumption compared to the simulated results presented in [54] and [52].
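To illustrate how a gain tolerance translates into yield, the sketch below estimates the fraction of chips that fall within ±5% of the target gain for the measured σ/μ values of 8.07% and 2.19%. It assumes the gain is normally distributed and centered on the target, which the measured histograms only approximate; the function name is illustrative.

```python
import math


def fraction_within_tolerance(sigma_over_mu, tolerance):
    """Fraction of a Gaussian population within +/- tolerance of its mean.

    Both arguments are relative to the mean gain (0.05 means 5%).
    """
    z = tolerance / sigma_over_mu
    return math.erf(z / math.sqrt(2.0))


yield_uncompensated = fraction_within_tolerance(0.0807, 0.05)
yield_compensated = fraction_within_tolerance(0.0219, 0.05)
yield_gain_points = 100.0 * (yield_compensated - yield_uncompensated)

print(f"uncompensated: {100 * yield_uncompensated:.1f}% within +/-5%")
print(f"compensated:   {100 * yield_compensated:.1f}% within +/-5%")
print(f"difference:    {yield_gain_points:.1f} percentage points")
```

Under these assumptions the difference comes out at roughly 50 percentage points, in line with the yield improvement quoted for a ±5% tolerance.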

TABLE 3.3
COMPARISON WITH OTHER WORK

Work         | Tech.        | Center Freq. (GHz) | Target Gain (V/V) | Process Varn. | Varn. Redn. | Yield inc. | No. of chips measured | Power                     | Area
Baseline LNA | 65nm CMOS    | 3.2                | 3                 | 8.07 %        | -           | -          | 100                   | 6.88 mW                   | 0.4 mm²
This work    | 65nm CMOS    | 3.2                | 3                 | 2.19 %        | 72%         | 50%        | 100                   | 7.69 mW                   | 0.4 mm²
[45]         | 0.25 µm CMOS | 1.9                | 3.67              | 3.62 %        | 28%         | N/R        | 75                    | 5.33 mW, excluding calib. | 1.7 mm², excluding calib.
[55]         | 0.13 µm CMOS | 2                  | 11.2              | 3.33 %        | 85%         | N/R        | Simulated             | N/R                       | N/R
[54]         | 0.25 µm CMOS | 1.8                | 5.62              | N/R           | -           | 18%        | Simulated             | N/R                       | N/R
[52]         | 0.18 µm CMOS | 3.1-10             | 3.54              | N/R           | -           | 27%        | Simulated             | 15 mW                     | N/R
[57]         | 0.18 µm CMOS | 2.4                | 1.73              | 13.48 %       | 47%         | N/R        | Simulated             | 410 µW                    | 1.1 mm²

Design Example II – Common Source Amplifier
We choose the Common Source Amplifier as another design example since it is one of the most efficient single transistor amplifiers that can be implemented in standard CMOS technologies. Measured results of over 88 samples of the CSA – shown in Figure 3.11(a) and (b) – indicate a reduction in variation of 3.8x. The performances of the uncompensated and compensated CSA in the presence of supply voltage variations are measured, similar to that of the LNA. The compensated CSA has a gain variation of 159 ppt/V with respect to varying supply voltage while the uncompensated CSA exhibits gain variation of 476 ppt/V.


Figure 3.11(a): Histogram of voltage gain of the uncompensated CSA over two wafer runs

Figure 3.11(b): Histogram of voltage gain of the compensated CSA over two wafer runs

Across temperature, with no bias compensation, the gain of the CSA varies up to 2885 ppm/°C. Applying compensation lowers the voltage gain variation of the CSA to 1669 ppm/°C. The results obtained are summarized in Table 3.4 along with those from the LNA.

TABLE 3.4
SUMMARY OF MEASUREMENT RESULTS

Chip Type         | Wafer Run | No. of Chips Meas'd | Gain µ (V/V) | Gain σ | Norm. Std. | Imp. over baseline | Temp. Varn. (ppm/°C) | Supply Varn. (ppt/V)
Uncompensated LNA | 1st       | 50                  | 3.29         | 0.179  | 5.44%      | -                  | 2310                 | 275
                  | 2nd       | 50                  | 2.99         | 0.183  | 6.12%      |                    |                      |
Compensated LNA   | 1st       | 50                  | 3.10         | 0.069  | 2.22%      | 3.7x               | 1554                 | 29
                  | 2nd       | 50                  | 3.05         | 0.057  | 1.96%      |                    |                      |
Uncompensated CSA | 1st       | 44                  | 2.85         | 0.167  | 5.85%      | -                  | 2885                 | 476
                  | 2nd       | 44                  | 2.59         | 0.176  | 6.79%      |                    |                      |
Compensated CSA   | 1st       | 44                  | 2.64         | 0.049  | 1.85%      | 3.8x               | 1669                 | 159
                  | 2nd       | 44                  | 2.67         | 0.048  | 1.79%      |                    |                      |
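The temperature and supply sensitivities in Table 3.4 can be turned into improvement ratios with a few lines of Python. This is an illustrative sketch only; the dictionary layout and names are not from the original work, and the values are copied from the table.

```python
# (uncompensated, compensated) sensitivities from Table 3.4.
sensitivities = {
    "LNA": {"temp_ppm_per_degC": (2310, 1554), "supply_ppt_per_V": (275, 29)},
    "CSA": {"temp_ppm_per_degC": (2885, 1669), "supply_ppt_per_V": (476, 159)},
}

# Ratio of uncompensated to compensated sensitivity for each amplifier.
improvement = {
    amp: {metric: before / after for metric, (before, after) in metrics.items()}
    for amp, metrics in sensitivities.items()
}

for amp, metrics in improvement.items():
    for metric, ratio in metrics.items():
        print(f"{amp} {metric}: {ratio:.2f}x lower with compensation")
```

The ratios show that the scheme is most effective against supply variation in the LNA, while the temperature sensitivity improves by less than 2x in both amplifiers.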


Conclusion
In this paper, we develop a general design methodology to compensate for voltage gain variations of common amplifier topologies where gain is a strong function of transconductance. Our work is the first experimental demonstration of PVT compensation of the gain of amplifiers designed in a sub-micron process. Using statistical feedback to track changes in Vth due to process and temperature, and by generating an appropriate bias signal to the amplifier, we experimentally demonstrate – without any post-fabrication trimming or calibration – 3.7x reductions in gain variation of low noise amplifiers and common source amplifiers designed in the TSMC 65nm CMOS process. We also show that our scheme can successfully reduce variations arising from fluctuations in supply voltage. Results obtained from our design examples confirm that our scheme can easily be adapted to other amplifier topologies where transconductance determines gain such as differential amplifiers, common gate amplifiers, and operational transconductance amplifiers. Our compensation method occupies a small footprint and has a low power overhead of 9%, making it attractive for a variety of robust, low power, mixed signal systems. By regulating the gain of amplifiers, our scheme increases overall yield of systems, reduces costs, and decreases turnaround time.
Appendix A
Derivation of transconductance variation in the Low Noise Amplifier
In this Appendix we present detailed calculation of the variation in Gm of the LNA with respect to process. The first order partial derivatives in (3.4) can be summed and simplified relative to the nominal Gm of the LNA as follows:

ΔΔ

Δ

ΔΔ

(A1)

Around the input match condition, RS = ωTLS, therefore (A1) can be simplified to:

Δ ΔΔΔΔ ΔΔ

(A2)

Appendix B
Derivation of gm variation with supply voltage
We present a detailed derivation of (3.16) and (3.17) to account for how disturbances in VDD affect the voltage gain of the LNA and CSA. To simplify the analysis for the LNA, we can ignore the cascode transistor in the LNA since its primary function is input isolation and it doesn’t affect the current in the LNA. Variables with subscript ‘0’ represent nominal values. The total current – as a function of supply voltage ‘s’ – flowing through the amplifier is:

()

(B1)

Vin,0 is the nominal dc bias of the input transistor and Vout is the dc voltage at the output node of the amplifier expressed as:


(B2)

RL is the output impedance. The variation of Itotal due to disturbances in supply voltage is given as

()

(B3)

Notice that Vin,0 has no dependence on VDD since the uncompensated amplifier is biased by a constant dc source. Combining (B2) and (B3), we can get:

( (

) )

(B4)

Taking the partial derivative of (B4) with respect to Vin,0 we get the following result for variation in gm,total:

(B5)
Vin,0 – Vth has been replaced by the nominal overdrive voltage VOD,0 of the input transistor. From (B5) we infer that gm,total, and hence the gain, has a linear dependence of λ with respect to VDD. In the case of the compensated amplifier, the dc input to transistor M2 – Vin2(s) – is generated by the bias circuit and therefore depends on VDD. M1’s bias – Vin1,0 – does not change with disturbances in VDD. The dc voltage at the output node is given as:

(B6)

where I1 and I2 are currents flowing through transistors M1 and M2 respectively and are expressed as:

()

(B7)

Taking the partial derivatives of (B7) with VDD, we get:

()

( )( )

(B8)

Combining (B6) and (B8) we can derive a dependence on Vout with respect to Vin2 as:
⁄
(B9)


At nominal VDD,

. We can now calculate the variations in

gm as follows by taking partial derivatives of the terms of (B8) with Vin1,0 and Vin2,0

respectively:

(B10)
[]

In order for the gain to be independent of variations in VDD, we require

From (B9) and (B10), the condition on

becomes:

to equal zero.

(B11)


CHAPTER 4
PROCESS COMPENSATION OF OSCILLATORS

Introduction

Voltage controlled oscillators (VCO) are widely used in high speed clock recovery systems and as a precise clock for digital systems. Although crystal oscillators are excellent references and are stable with variations in supply voltage, temperature, and process, integrating them with on-chip systems is difficult and expensive. Technology scaling beyond 90nm has made integrated circuits more vulnerable to die-to-die and within-die parameter fluctuations in the manufacturing process [71]. The challenge therefore lies in designing on-chip frequency references in CMOS that can tolerate worst case variations in process, temperature, and power supply.

Considerable recent circuit design effort has addressed this issue. Tschanz et al. [72] employ bidirectional adaptive body bias to maximize the number of dies that meet both the frequency and leakage constraints. This scheme, however, uses a reference crystal and post fabrication trimming to achieve its purpose, making it an expensive option. Sundaresan et al. [73] were able to achieve less than 3% variation in the frequency of a VCO by sensing the process corner in which the chip operates, but the scheme operates in the MHz range due to the assumptions of its analytical model. Chen et al. [74] use a phase locked loop to counter variation, but the external reference used makes it impractical for on-chip solutions.


System Design Concept

To design a compensation loop for a low variation VCO, we take inspiration from the PLL architecture, where frequency differences between the VCO and a reference signal are translated to a voltage building up on a capacitor.

Figure 4.1: System Diagram of the General Compensation Loop

Based on the idea of a control loop feedback system, we propose the compensation system illustrated in Figure 4.1. The signal from the VCO is fed to a digital unit which generates control signals for the frequency correction block. The correction unit generates a voltage VFS proportional to the period of the VCO. This voltage is then compared to VREF, a stable dc reference voltage. If VFS is higher than VREF, their difference will be positive and the VCO will be sped up. On the other hand, if VFS is lower than VREF, their difference will be negative and the VCO will be slowed down. The VCO's frequency is corrected for process variations when VFS matches VREF. In this case, VCTRL reaches a stable value and the VCO settles to a particular frequency. Using circuit components in the frequency correction unit which are robust to variations in process will give us a VCO with zero frequency variation. Unfortunately, circuit blocks on chip suffer from inherent variation due to effects discussed earlier. Nevertheless, novel designs of building blocks such as low variation current sources [76] assist in designing a stable, low variation, process compensated VCO without loading any critical high speed nodes.

System Design

The frequency sensing and correction block is implemented using a switched-capacitor technique, shown in Figure 4.2.

Figure 4.2: Switched Capacitor based VCO tuning circuitry

This section describes the implementation of various blocks used in the process compensation feedback system of the VCO.

Frequency Correction Unit

The frequency correction unit is the most important component of the system since the stable, low variation, oscillation frequency of the system depends on proper functionality of this block. The architecture is based on a discrete time switched capacitor integrator and is shown in Figure 4.3. It consists of a current source Iref, capacitors C1 and C2, a high gain operational amplifier, transmission gate switches, and external inputs VREF and RST.

Figure 4.3: Discrete time switched capacitor integrator

An external RST is applied at the beginning of operation to clear all digital counters and establish a DC operating point for the output of the operational amplifier. Once the RST signal is deasserted, the VCO oscillates with its free running frequency. The output of the VCO is passed through a series of dividers to shape it into a square wave with a 50-50 duty cycle. The timing signal generator produces signals φAB, φA, φB, and φC based on digital logic.

AB  CLKx16 A  CLKx4  CLKx8  CLKx16 B  CLKx4  CLKx8  CLKx16 C  CLKx4  CLKx8  CLKx16

(4.1)

Figure 4.4: Timing waveform controlling switches in the frequency sensor

where CLKx4 is the waveform generated by dividing the output of the VCO by 4, as shown in Figure 4.4. Conventional CMOS logic was used in generating the control signals. Based on when the signals are asserted, the operation of the frequency correction unit can be divided into three stages: Initialization stage, Comparison stage, and Correction stage.

Initialization Stage

When φAB and φA are asserted, one plate of capacitor C1 is charged to VREF and the other plate is held at ground. This state is used to set an initial condition on C1 and allows for a comparison to be made between VREF and the voltage proportional to the system's oscillation period. The charge contained in C1 at the end of the initialization stage is VREF·C1.

Figure 4.5: Initialization Stage

Comparison Stage

When φAB and φB are asserted, one plate of the capacitor C1 is charged up by current source IREF for a period N·Tosc, N being the divider ratio. The charge contained in C1 at the end of the comparison stage is VREF·C1 − N·IREF·Tosc. The comparison stage establishes a charge difference at C1 which is proportional to the difference between the system's current oscillation period and its nominal oscillation period.

Figure 4.6: Comparison Stage

Correction Stage

Once φAB is deasserted, capacitor C1 is floating and the charge on it is held. When φC is asserted, capacitor C1 is discharged by connecting one plate to ground and the other to the negative input of the operational amplifier. The high gain of the operational amplifier requires that its negative input also be a virtual ground as it tracks the positive input, which is set to ground. Since charge must be conserved, charge on the plate of C1 connected to the negative input of the operational amplifier is transferred to capacitor C2.

Figure 4.7: Correction Stage

The operational amplifier is designed as a conventional folded cascode to provide high gain so that both input nodes are able to track each other effectively. pFET transistors are used as input since the inputs to the operational amplifier are close to ground. The pFET input transistors are made large and square in layout to improve matching characteristics. Care is taken to ensure that the parasitic capacitance of the input transistors is much smaller than the capacitors used in the switched capacitor circuit. The op-amp is designed with a nominal gain of 35 dB.

Loop Stability
The voltage at the output of the operational amplifier, VControl, increases proportional to the amount of charge transferred. This voltage does not change until the next occurrence of φc and sets the frequency of the VCO. After n cycles, the voltage at the output of the operational amplifier is updated according to the difference equation:

\[ V_{ctrl}(n+1) = V_{ctrl}(n) + \frac{I_{REF} N T_{osc}(n) - V_{REF} C_1}{C_2} \tag{4.2} \]

where Vctrl(n) is the control voltage of the VCO and Tosc(n) is the oscillation period of the VCO in the nth step. The system will converge to a steady oscillation period when VREFC1=NIREFTosc. At this point, further values of VControl will equal their corresponding values in the previous cycle, indicating that the VCO has converged to its desired nominal oscillation period. Both capacitors C1 and C2 are on the order of pF so that they are much larger than the parasitic capacitances of the operational amplifier and the switches.
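The convergence of the ideal loop can be visualized with a short simulation. The Python sketch below iterates an update of the form Vctrl(n+1) = Vctrl(n) + (IREF·N·Tosc(n) − VREF·C1)/C2, which is our reading of (4.2), together with a linearized VCO whose period falls as the control voltage rises. The VCO model, the divider ratio N = 7, the starting voltage, and the free-running period are all assumptions chosen for illustration; VREF, IREF, C1, the 1:3 capacitor ratio, and K'VCO follow the values quoted in this chapter.

```python
def simulate_loop(t_free, steps=40, v_ref=0.7, c1=1e-12, c2=3e-12,
                  i_ref=300e-6, n_div=7, k_vco=1.3e-9, v_start=0.6):
    """Iterate the ideal (infinite op-amp gain) control-voltage update.

    Returns a list of (v_ctrl, t_osc) pairs, one per correction cycle.
    """
    v_ctrl = v_start
    history = []
    for _ in range(steps):
        # Assumed VCO model: the period shrinks linearly as v_ctrl rises.
        t_osc = t_free - k_vco * (v_ctrl - v_start)
        history.append((v_ctrl, t_osc))
        # Charge error left on C1 is transferred onto C2 each cycle.
        v_ctrl += (i_ref * n_div * t_osc - v_ref * c1) / c2
    return history


# Period at which the loop settles: V_REF * C1 = N * I_REF * T_osc.
target_period = 0.7 * 1e-12 / (7 * 300e-6)

# Hypothetical slow-corner VCO with a 0.40 ns free-running period.
history = simulate_loop(t_free=0.40e-9)
final_v, final_t = history[-1]
print(f"target period {target_period * 1e12:.1f} ps, "
      f"settled period {final_t * 1e12:.1f} ps at Vctrl = {final_v:.3f} V")
```

With these values the per-cycle error shrinks by the factor 1 − IREF·N·K'VCO/C2, so the loop settles geometrically; a larger C2 brings this factor closer to 1 and lengthens the compensation time.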
The above simplified analysis doesn’t take into account the finite gain and input offset voltage of the operational amplifier in the loop. In order to properly analyze the stability and convergence of the loop, we need to re-write (4.2) introducing parameters A and Voffset representing the gain and input offset of the amplifier respectively. The relation between Vcontrol and the voltage at the negative input of the amplifier (Vx) can now be expressed as

\[ V_{control} = -A (V_x - V_{offset}) \tag{4.3} \]

Maintaining conservation of charge on capacitors C1 and C2 before and after switch S3 is closed, we get the following expression for Vx as

\[ V_x(n+1) = V_x(n) \frac{C_2 (A+1)}{C_1 + C_2 (A+1)} + \frac{V_{REF} C_1 - I_{REF} N T_{osc}(n)}{C_1 + C_2 (A+1)} \tag{4.4} \]

and for Vcontrol as

\[ V_{control}(n+1) = V_{control}(n) \frac{C_2 (A+1)}{C_1 + C_2 (A+1)} + V_{offset} \frac{A C_1}{C_1 + C_2 (A+1)} + A \, \frac{I_{REF} N T_{osc}(n) - V_{REF} C_1}{C_1 + C_2 (A+1)} \tag{4.5} \]

where Vcontrol(n) is the control voltage applied to the VCO in the previous correction cycle and Vcontrol(n+1) is the control signal that will be applied at the end of the current correction cycle. From (4.5), it is evident that, even in the presence of a finite gain, the compensation loop is stable and will still converge based on a first-order negative feedback exhibited by the third term, regardless of the starting condition. The static error Voffset will cause some amount of ripple on Vcontrol when VREFC1 = IREFNTosc, but this can be minimized by increasing the ratio of C1 and C2 and ensuring the input transistors in the amplifier are well matched and large. Care must be taken not to make C2 too large, as this would make the incremental voltage buildup on Vcontrol smaller and hence the compensation time larger. Making C2 small would lead to a loss of precision on Vctrl, forcing it to periodically overshoot and undershoot the correct value. In our design a C1:C2 ratio of 1:3 was chosen.
Accuracy Analysis
In this section, we analyze the factors that may limit the accuracy with which the switched capacitor configuration compensates the VCO for process variation. When the loop converges, Vcontrol(n+1) = Vcontrol(n) = Vcontrol∞ and the oscillation period Tosc is represented as Tosc∞, where Tosc∞ = K'VCO·Vcontrol∞. Substituting these conditions into (4.5), we can determine how close Tosc∞ is to the ideal value of Tosc = VREFC1/NIREF:

\[ T_{osc\infty} = \frac{(V_{REF} - V_{offset}) C_1}{N I_{REF} - \dfrac{C_1}{A K'_{VCO}}} \tag{4.6} \]

The above expression shows that there is still some accuracy error present due to non-idealities in the compensation loop, similar to the error in the comparator based compensation loop. For most operational amplifiers, input offset error is less than ten millivolts [23] and can be further minimized by a number of proposed techniques [24]. This reduces the error in the numerator of (4.6) to less than 1.5% for VREF = 0.7 V. For IREF = 300 µA, C1 = 1 pF, and K'VCO = 1.3 ns/V, the error in the denominator of (4.6) is less than 5%.
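The two error bounds quoted for (4.6) can be reproduced numerically. The sketch below compares the offset voltage against VREF and the term C1/(A·K'VCO) against N·IREF. It assumes the worst-case 10 mV offset, the 35 dB nominal op-amp gain stated earlier, and a divider ratio N = 1 as the most pessimistic case; the function and parameter names are illustrative.

```python
def loop_error_terms(v_ref=0.7, v_offset=10e-3, i_ref=300e-6, c1=1e-12,
                     k_vco=1.3e-9, gain_db=35.0, n_div=1):
    """Relative size of the two non-ideal terms in the settled period."""
    gain = 10.0 ** (gain_db / 20.0)  # dB to V/V
    # Offset voltage relative to the reference voltage (numerator error).
    numerator_error = v_offset / v_ref
    # Finite-gain current term C1/(A*K'VCO) relative to N*I_REF (denominator error).
    denominator_error = (c1 / (gain * k_vco)) / (n_div * i_ref)
    return numerator_error, denominator_error


numerator_error, denominator_error = loop_error_terms()
print(f"numerator error:   {100 * numerator_error:.2f}%")
print(f"denominator error: {100 * denominator_error:.2f}%")
```

A divider ratio larger than 1 or a higher op-amp gain only shrinks the denominator term further.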
Given the fact that Voffset<<VREF and C1/AK’VCO<<IREF, we can approximate the relative accuracy of fosc=1/Tosc as a function of the tolerance of the design parameters:

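\[ \left(\frac{\sigma_f}{f^0_{osc}}\right)^2 = \left(\frac{\sigma_T}{T^0_{osc}}\right)^2 = \frac{\sigma_I^2 + \dfrac{C_1^2}{A^2 K'^2_{VCO}} \left[ \left(\dfrac{\sigma_{K'}}{K'_{VCO}}\right)^2 + \left(\dfrac{\sigma_C}{C_1^0}\right)^2 \right]}{\left(I^0_{REF}\right)^2} + \left(\frac{\sigma_C}{C_1^0}\right)^2 + \frac{\sigma_V^2 + \sigma_{off}^2}{\left(V^0_{REF}\right)^2} \tag{4.7} \]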

Based on the frequency accuracy analysis, the switched capacitor-based compensation loop will achieve similar process variation in its settling oscillation frequency as the comparator-based compensation loop, since VREF, IREF and vertical parallel plate capacitors, which are the dominant contributors to the frequency variation, are the same in both designs.
Low Variation Current Source

Figure 4.8: Low Variation Addition Based Current Source
The reference current source chosen is the addition based current source topology presented in [76]. The topology has an on-chip variation of 4.5% from its mean current value. This topology is preferred because it consumes low power, has good matching characteristics between transistors, and does not additionally load the frequency correction unit due to its small area. A PFET version of the current source is designed for this work. The topology is shown in Figure 4.8.

Voltage Controlled Oscillator

A three stage inverter based current starved ring oscillator topology is used as the voltage controlled oscillator for this system. An NFET transistor, which provides current to the inverter branch, has its gate connected to the control voltage generated by the frequency correction block. The equivalent PFET used in the current mirror branch is diode connected. The inverter based ring oscillator is chosen for its simple design, but the choice of VCO topology is not restricted for this system as long as there is a well-defined relationship between the control voltage and frequency.

System Measurement Results

The process compensated VCO system is designed and taped out in the IBM CMOS9SF process. The baseline case being compared is a stand-alone three stage current starved ring oscillator with the input NFET transistor biased by the same low variation addition based current source. The histograms for the center frequencies of the VCO with and without compensation are presented in Figure 4.9 (a) and (b). Without the compensation scheme, the frequency of the VCO has a standard deviation of 434.8 MHz about its mean value of 2.86 GHz. This translates to a variation of 15.2%. With compensation, the frequency of the VCO has a standard deviation of 180.4 MHz about its mean value of 2.91 GHz, translating to a variation of 6.2%. The improvement factor is 2.5x over the baseline case. Measurement results are taken from 46 chips.
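The quoted variation figures follow directly from the measured means and standard deviations, as the short Python check below shows. The helper name is illustrative.

```python
def relative_spread(sigma_hz, mean_hz):
    """Standard deviation normalised to the mean (sigma over mu)."""
    return sigma_hz / mean_hz


uncompensated = relative_spread(434.8e6, 2.86e9)
compensated = relative_spread(180.4e6, 2.91e9)
improvement_factor = uncompensated / compensated

print(f"uncompensated: {100 * uncompensated:.1f}%")
print(f"compensated:   {100 * compensated:.1f}%")
print(f"improvement:   {improvement_factor:.1f}x")
```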

Figure 4.9 (a): Histogram of uncompensated VCO
Figure 4.9 (b): Histogram of compensated VCO

If we can make the dominant variables in (4.6) temperature invariant, the compensation scheme can be used to lower the temperature drift of the VCO. The addition based current source used has less than 90 ppm/°C temperature sensitivity between 200K and 400K, and VREF, C1 and C2 are relatively constant over temperature. We measured the changes of oscillation frequency in the switched-capacitor based loop over a temperature range from 280K to 350K, and the results are plotted in Figure 4.10. The loop architecture exhibits less than 290 ppm/°C temperature sensitivity, compared to 965 ppm/°C in the baseline case.

Figure 4.10: Temperature drift of baseline oscillator and oscillator in Switched Capacitor loop

The added circuitry for the compensation loop consumes an additional 3.3 mW of power. The main consumer is the operational amplifier. The op-amp is required only during tracking and correction, which takes less than 40 calibration steps, or less than 60 ns. Once the VCO has converged to a particular frequency, the control voltage can be latched and stored. This would allow us to turn off the frequency correction unit and save overall system power.

Figure 4.11: Convergence behavior of VCO compensation loop

The system also has the benefit of being able to provide a measure for the amount of process variation the VCO suffers from. By observing VCTRL, we can tell how much that voltage deviates from its nominal value in order to correct for variations in frequency. This difference in voltage gives a precise measure of the impact process variation has had on the VCO. The area overhead associated with the compensation loop is 0.033 mm². The die photo is shown in Figure 4.12.

Figure 4.12: Die photo of Switched Capacitor based VCO Compensation

This cost in power and area is lower than that reported by other VCO compensation methods in [73] and [74] as there are no external references used. This scheme is suitable for high frequency, high precision applications where off-chip components are undesirable.

CHAPTER 5
MISMATCH COMPENSATION OF DIGITAL TO ANALOG CONVERTERS
Thermometer DACs

The increase in markets for high speed communication systems, image and video signal processing capabilities, and high resolution display systems has forced IC manufacturers to integrate digital and analog systems on the same chip [79] [80] [81]. As a result, there is an increasing requirement for calibration tools for seamless integration of such systems. Thermometer digital to analog converters (DACs) are widely used in this domain since they offer the best calibration accuracy due to smaller voltage glitches during code switching and guaranteed monotonicity. Current steering DACs offer a good architectural solution since they do not occupy large silicon area and can easily be implemented in CMOS processes [83] [84]. Since they can also directly drive resistive loads without the need for a voltage buffer, they are much faster than other DAC architectures. Therefore, thermometer current steering DACs are used in a variety of calibration applications, such as compensating for DC offsets in mixers in transceivers [85] [86] and correcting offsets that arise from mismatch in high speed comparators [87] [88].

Figure 5.1: Calibration DAC used in a Direct Conversion receiver architecture

Figure 5.2: Offset compensation in comparators using a calibration DAC [87]

Since an N bit thermometer current steering DAC uses 2^N − 1 identical unit current cells, its performance is highly correlated with the matching accuracy of these cells. Although reducing the size of the current cells decreases area and allows for faster operation and reduced timing skew, the accuracy of the current sources degrades since it is inversely proportional to the square root of the current sourcing transistor's area [90]. This ultimately affects the linearity of the DAC [82] [89] [91] [92].
In an N bit DAC, where the nominal current output of each unit current cell is Io, the output of the j-th cell is expressed as

\[ I_j = I_o (1 + \varepsilon_j) \tag{5.1} \]

where εj is the error in the current of the j-th cell from its nominal value as a result of mismatch between the current sources, with relative variance given as:

\[ \frac{\sigma^2(I)}{I^2} = \frac{4 \sigma^2(V_{T0})}{(V_{GS} - V_{T0})^2} + \frac{\sigma^2(\beta)}{\beta^2} \tag{5.2} \]

and which approximately follows a Normal distribution with zero mean and variance σε². The mismatch in unit current cells in the thermometer DAC leads to two types of non-linearities, or errors: Differential Non-Linearity, and Integral Non-Linearity.
Differential Non-Linearity
Differential Non-Linearity, or DNL, is the normalized deviation between two analog values of adjacent inputs. The DNL of the k-th code of the DAC is expressed as

(5.3)


where

∑

( ∑ ), n ( = 2N-1) is the total current cells in a DAC with N-bits

of resolution, and I(k) is the total current output of all current cells up to and including the k-th cell. Graphically, DNL is shown in Figure 5.3, where the deviation of the red curve from the ideal gain line (shown in black) at each code is the DNL error for that code.

Figure 5.3: DNL error in DACs

Integral Non-Linearity

Integral Non-Linearity, or INL, is the normalized deviation between the analog output at each code and the ideal DAC output. It is the accumulation of errors, or the summation of DNL up to each code. The INL of the k-th code of the DAC is expressed as:

(5.4)

INL error is illustrated in Figure 5.4 where error is shown as the deviation of the red curve from the ideal DAC characteristics (shown in blue) at each code.
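The effect of unit-cell mismatch on DNL and INL can be explored with a small Monte Carlo sketch. The Python code below draws unit cell currents following Ij = Io(1 + εj) with Gaussian εj, takes the LSB as the average cell current (an endpoint-fit normalization, which is an assumption here since other conventions exist), and accumulates DNL into INL. The function name, seed, and the 1% mismatch value are hypothetical.

```python
import random


def dac_nonlinearity(n_bits, sigma_eps, seed=0):
    """Return per-code (dnl, inl) lists, in LSB, for a thermometer DAC."""
    rng = random.Random(seed)
    n_cells = 2 ** n_bits - 1
    # Unit cell currents normalised to the nominal current Io.
    cells = [1.0 + rng.gauss(0.0, sigma_eps) for _ in range(n_cells)]
    # Assumed normalisation: one LSB equals the average cell current.
    lsb = sum(cells) / n_cells
    # DNL of code k: deviation of the k-th step from one LSB.
    dnl = [cell / lsb - 1.0 for cell in cells]
    # INL of code k: running sum of the DNL up to code k.
    inl, accumulated = [], 0.0
    for step_error in dnl:
        accumulated += step_error
        inl.append(accumulated)
    return dnl, inl


dnl, inl = dac_nonlinearity(n_bits=6, sigma_eps=0.01, seed=1)
print(f"max |DNL| = {max(abs(d) for d in dnl):.3f} LSB")
print(f"max |INL| = {max(abs(i) for i in inl):.3f} LSB")
```

Because the LSB is defined from the average cell, the INL returns to zero at the final code, and the largest INL tends to occur near mid-scale where the accumulated random errors are largest.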
Figure 5.4: INL error in DACs

Both DNL and INL reduce the accuracy of DACs, which in turn affects their ability to efficiently calibrate out errors in larger systems. To overcome errors due to mismatch in unit cells in DACs, various calibration techniques have been proposed.

Prior Work in Calibration of Current Steering DACs

Broadly speaking, there are two methods to calibrate current steering DACs. The first relies on foreground calibration to store current errors digitally or tune current cells to a reference using a calibration DAC (CALDAC). During normal operation, the output of the CALDAC is summed with the current in the main DAC to become the final output. The second method, switching-sequence-post-adjustment (SSPA), adjusts the switching sequence of the current sources in the DAC based on various statistical algorithms to average out current errors in the DAC cells due to mismatch. We next discuss prior work in each of these two areas.
Schofield et al. [94] use small calibrating DACs attached to every current cell to increase its current accuracy. The calibration units increase the complexity of the unit cell, and the extensive external control makes them difficult to integrate in mixed signal systems. Similar work was presented by Huang et al. in [95]. Bugeja et al. demonstrate a self-trimming circuit scheme in [96] that tunes individual current sources and corrects for a large set of errors, but its area- and power-hungry calibration components, along with its large headroom requirement, make the scheme impractical for sub-micron CMOS systems. A similar trimming procedure is described by Tiilikainen in [99]. A calibration scheme proposed by Radulov et al. in [93] demonstrates a fully integrated system incorporating a 1-bit ADC and an 8-stage finite state machine that calibrates small CALDACs attached to each thermometer current source. Static current mismatch errors are self-calibrated by comparing each MSB current cell to a reference current. This scheme significantly increases the active area of the DAC and does not calibrate out mismatches in the LSB portion of the DAC, thereby limiting the DAC’s overall linearity. Cong et al. present another foreground self-calibration scheme in [97] for very low-voltage environments. A 16-bit ADC along with an 8-bit CALDAC is used to determine the amount by which each of the 63 MSB current units deviates from its nominal value. This error is stored as an 8-bit word in a bank of SRAMs, which is read by the CALDAC during normal operation. Further calibration is done for the LSB portion of the DAC to ensure that the total LSB current is equal to 1/64 of the DAC full scale output. The scheme requires extensive area for the SRAMs and CALDACs, and a very high accuracy ADC that consumes a significant portion of the area and power budget of the DAC, making it prohibitively expensive for integrated systems. A similar idea is proposed in [98], where dynamic glitch errors and settling errors are stored in a look-up table for each of the 3-bit upper MSBs, again with large memory requirements and limited DAC calibration.
Switching schemes to remove static mismatch and gradient errors in current steering DACs are reported to substantially reduce non-linearity compared to conventional row-column addressing methods [100]. Various switching schemes have been proposed to decrease INL in unary DACs. The Q^2 random walk switching scheme by Geert et al. in [101] obtains full 14-bit accuracy without the need for trimming or tuning. This method has been extended by Lee et al. in [102] with a Q^N rotated walk switching scheme that uses multiple pointers to control element selection, further randomizing switching and averaging errors. Both methods, however, rely on an extensive interconnection network and pseudo-random sequence generators, and do not provide hard guarantees on the static linearity of DACs. The additional interconnect also increases parasitics, affecting settling rate and dynamic linearity. A novel SSPA method by Chen et al. in [103] measures the current of each current source and sorts the sources in ascending order. Neighboring currents are then summed and re-sequenced by a digital calibration controller. Although there is significant improvement in INL, a large memory is required to store the calibrated switching sequence on chip. Lee et al. present a dynamic element matching (DEM) method along with a thermometer coding scheme to randomize starting-element selection and consecutive-element selection in [104]. The addressing scheme requires a stochastic encoder to randomize the thermometer addresses for the MSBs of the DAC, which increases complexity and power. A new architecture for binary to thermometer decoders is presented in [105]. By replacing the conventional row-column decoder with a custom switching sequence, the authors demonstrate a yield improvement of 18%, but the restrictive nature of the scheme, with no analytical proof, limits its scope. Chen et al. devise a switching scheme in [106] where the deviations of the current sources from the ideal value are first measured, after which the current source with the maximum deviation is turned on, followed by several sources with small deviations in the opposite direction. Repeating this process has a greater impact on INL improvement than random walk schemes, with a significant comparative area reduction. However, the area of the RAMs that store the switching sequence increases exponentially with the number of bits, making this scheme, like other SSPA schemes, expensive in area.
Proposed Solution
The solution we propose in this dissertation involves two different techniques to reduce errors in DNL and INL in thermometer current steering DACs.
The first method uses redundancy in unit elements. By adding some extra, identically laid out elements to the DAC matrix, and then weeding out those elements whose current values have large deviations from the nominal, or mean, DAC current, we are able to reduce overall error in DNL and INL with a smaller area and power penalty than by simply making unit cells larger.
The second technique reorders element addressing based on their relative deviation from the nominal, or mean, DAC current. By alternating elements with values on either side of the mean current, we can reduce overall error accumulation and decrease overall INL of the DAC.
In the following sections, we first theoretically derive how both techniques provide lower nonlinearity errors. We then discuss how these methods can be implemented on-chip, with a thorough discussion of the circuit design challenges we encountered, followed by measured results of an 8-bit DAC in the TSMC 65nm CMOS process. The methods derived ultimately reduce element mismatch, so the same concept can be employed in a wide variety of DACs where element matching is critical. We first derive how the errors, ε, in unit cells directly translate to error in DNL and INL in thermometer current steering DACs.
Error Analysis of Differential Non Linearity
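Substituting (5.1) in (5.3) gives

\[ \mathrm{DNL}(k) = \frac{\varepsilon_k - \frac{1}{n}\sum_{j=1}^{n}\varepsilon_j}{1 + \frac{1}{n}\sum_{j=1}^{n}\varepsilon_j} \tag{5.5} \]

For large n, $\frac{1}{n}\sum_{j=1}^{n}\varepsilon_j \ll 1$. This allows us to simplify the above expression to

\[ \mathrm{DNL}(k) \approx \left(1 - \frac{1}{n}\right)\varepsilon_k - \frac{1}{n}\sum_{j=1,\, j \neq k}^{n}\varepsilon_j \tag{5.6} \]

The worst DNL occurs at code kworst and can be calculated as follows:

\[ |\mathrm{DNL}|_{max} = \max_k\left\{\left|\varepsilon_k - \frac{1}{n}\sum_{j=1}^{n}\varepsilon_j\right|\right\} \tag{5.7} \]

For large n, Expectation{ε}→0, and the worst DNL occurs at the worst |εkworst|.
From (5.6), we see that DNL is a linear combination of n independent random variables, allowing us to add the variances of each term in the expression.

\[ \sigma^2_{DNL} = \left[\left(1 - \frac{1}{n}\right)^2 + \frac{n-1}{n^2}\right]\sigma^2_{\varepsilon} = \left(1 - \frac{1}{n}\right)\sigma^2_{\varepsilon} \tag{5.8} \]

Therefore, the DNL of a thermometer DAC also has a normal distribution centered at zero with a variance of $\left(1 - \frac{1}{n}\right)\sigma^2_{\varepsilon}$.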

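Error Analysis of Integral Non-Linearity
Substituting (5.1) in (5.4) gives us

\[ \mathrm{INL}(k) = \frac{\sum_{j=1}^{k}\varepsilon_j - \frac{k}{n}\sum_{j=1}^{n}\varepsilon_j}{1 + \frac{1}{n}\sum_{j=1}^{n}\varepsilon_j} \tag{5.9} \]

For large n, $\frac{1}{n}\sum_{j=1}^{n}\varepsilon_j \ll 1$. This allows us to simplify the above expression to

\[ \mathrm{INL}(k) \approx \left(1 - \frac{k}{n}\right)\sum_{j=1}^{k}\varepsilon_j - \frac{k}{n}\sum_{j=k+1}^{n}\varepsilon_j \tag{5.10} \]

From (5.10), the worst-case INL, at code kworst, can be expressed as:

\[ |\mathrm{INL}|_{max} = \max_k\left\{\left|\sum_{j=1}^{k}\varepsilon_j - \frac{k}{n}\sum_{j=1}^{n}\varepsilon_j\right|\right\} \tag{5.11} \]

Once again, for large n, Expectation{ε}→0, and the worst INL is simply the worst case accumulation of errors up to kworst.
INL can also be expressed as a linear combination of n independent random variables, allowing us to add the variances of each term in the expression.

\[ \sigma^2_{INL}(k) = \left(1 - \frac{k}{n}\right)^2 k\,\sigma^2_{\varepsilon} + \left(\frac{k}{n}\right)^2 (n-k)\,\sigma^2_{\varepsilon} = k\left(1 - \frac{k}{n}\right)\sigma^2_{\varepsilon} \tag{5.12} \]

The INL also has a normal distribution with zero mean and a variance of $k\left(1 - \frac{k}{n}\right)\sigma^2_{\varepsilon}$.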
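The variance expressions can be checked numerically with a small Monte Carlo sketch. It assumes independent, zero-mean Gaussian unit errors with standard deviation sigma and uses the linearized forms DNL(k) ≈ ε_k - mean(ε) and INL(k) ≈ Σ_{j≤k} ε_j - k·mean(ε), for which adding the variances of the independent terms gives (1 - 1/n)σ² and k(1 - k/n)σ² respectively. The DAC size, sigma, run count and seed are illustrative assumptions.

```python
import random
import statistics

def simulate(n=63, sigma=0.05, runs=2000, seed=1):
    """Monte Carlo estimate of the DNL and INL variance at the mid-scale code."""
    rng = random.Random(seed)
    k = n // 2
    dnl_k, inl_k = [], []
    for _ in range(runs):
        eps = [rng.gauss(0.0, sigma) for _ in range(n)]
        mean_eps = sum(eps) / n
        dnl_k.append(eps[k - 1] - mean_eps)          # linearized DNL(k)
        inl_k.append(sum(eps[:k]) - k * mean_eps)    # linearized INL(k)
    return k, statistics.pvariance(dnl_k), statistics.pvariance(inl_k)

n, sigma = 63, 0.05
k, var_dnl, var_inl = simulate(n, sigma)
print(var_dnl, (1 - 1 / n) * sigma**2)        # simulated vs. closed form
print(var_inl, k * (1 - k / n) * sigma**2)    # simulated vs. closed form
```

The INL variance peaks near mid-scale (k ≈ n/2), which is why that code is used for the check.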

Redundancy
To reduce both the maximum value of DNL and INL and their variances, we must design current sources with reduced errors. A common technique is to make each current source larger. From Pelgrom’s model we know that mismatch among devices is inversely proportional to the square root of their area, so we can potentially reduce mismatch among current sources by a factor of two with a fourfold area increase per unit cell. This leads to very large DAC designs as N increases.
In this work, we propose introducing nadd additional current sources in the DAC and selecting the n best sources, whose values lie closest to the mean current value. By introducing redundancies in the normally distributed current sources and weeding out the outliers in the sampled set, we can eliminate the skirts of the current distribution, as shown in Figure 5.5.


Figure 5.5: Introducing redundancies in current sources to reduce errors
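When the sampled set of errors of size n+nadd approaches a normal distribution (for sufficiently large n), the set of errors with outliers removed resembles a truncated normal distribution with variance of the form:

\[ \sigma^2_{\varepsilon \mid \varepsilon_{min} \le \varepsilon \le \varepsilon_{max}} = \sigma^2_{\varepsilon}\left[1 + \frac{a\,\phi(a) - b\,\phi(b)}{\Phi(b) - \Phi(a)} - \left(\frac{\phi(a) - \phi(b)}{\Phi(b) - \Phi(a)}\right)^2\right] \tag{5.13} \]

where $a = \varepsilon_{min}/\sigma_{\varepsilon}$ and $b = \varepsilon_{max}/\sigma_{\varepsilon}$, and $\phi$ and $\Phi$ are the standard normal pdf and cdf. For a normal distribution, εmin → -∞ and εmax → +∞, and the bracketed term equals one. Since the truncated distribution has hard cutoffs at finite εmin and εmax, its variance is smaller than that of the original distribution. This allows the overall |ε| to reduce as well.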
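As an illustration, the standard variance formula for a zero-mean normal distribution truncated to [a·σ, b·σ] can be compared against an empirical outlier-removal experiment. The sketch below is an assumption-laden check rather than the simulation used in this work: the sample sizes, sigma, and seed are hypothetical, and the cutoff b is taken from the largest retained error.

```python
import math
import random
import statistics

def phi(x):
    """Standard normal pdf."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def Phi(x):
    """Standard normal cdf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def truncated_variance(sigma, a, b):
    """Variance of N(0, sigma^2) truncated to the interval [a*sigma, b*sigma]."""
    z = Phi(b) - Phi(a)
    return sigma**2 * (1.0 + (a * phi(a) - b * phi(b)) / z
                       - ((phi(a) - phi(b)) / z) ** 2)

# Empirical check: draw n + n_add errors and keep the n closest to zero.
rng = random.Random(2)
sigma, n, n_add = 0.20, 2550, 450        # hypothetical sizes (15% discarded)
eps = sorted((rng.gauss(0.0, sigma) for _ in range(n + n_add)), key=abs)[:n]
b = max(abs(e) for e in eps) / sigma     # symmetric cutoff implied by the kept set
print(statistics.pvariance(eps), truncated_variance(sigma, -b, b))
```

Discarding the 15% of elements farthest from the mean roughly halves the error variance in this setting, which matches the intuition that the skirts of the distribution dominate the variance.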

An 8-bit Redundant Thermometer DAC
As a proof of concept, we design an 8-bit DAC by sampling errors ε from a normal distribution and generating unit current cells based on (5.1). Simulation results of our design are presented in Figure 5.6.
The “baseline DAC” is a DAC design where no redundant elements have been added and all the current sources contribute to DNL and INL error. The blue mesh shows the mismatch among elements in the baseline DAC, which is unchanged since outliers are not discarded in this DAC. Using the truncated Gaussian distribution model, we eliminate equal areas on either side of the distribution for an increasing number of redundant elements. The colored plane shows how the error variance decreases as the number of redundant elements increases. It is also interesting to note that there is a steep decline in the error variance with only a few bits of redundancy, which suggests that the reduction in error outweighs the penalty in area. We can then define an “outlier-free DAC” as a DAC design in which only the n cells whose values lie closest to the mean current are chosen from the n+nadd current cells.
Figure 5.6: Error Variance of current sources in an 8 bit DAC design with and without redundancy.

Similarly, we can plot the DNL and INL error and variances with and without redundancy in Figure 5.7 and 5.8 for both the baseline (blue) and outlier-free (red) DACs. These plots are generated for an 8-bit thermometer DAC designed in the TSMC 65nm CMOS process where transistor mismatch is measured at 20%. In both cases, redundancy decreases both the worst case and variance of non-linearity error for the thermometer DAC.
Figure 5.7: Worst case DNL value with and without redundancy
Figure 5.8: DNL Variance with and without redundancy

Practical Realization of a Redundant N-bit DAC
In a circuit realization of an N-bit thermometer DAC, a one-dimensional DAC would incur large hardware costs for the binary-to-thermometer decoder, addressing, routing, and switching. Therefore, it makes more sense to split the DAC into a two-dimensional matrix with nrows rows, where all current units in one row are accessed before we move to units in the next row. In this implementation, we add nadr redundant elements to the ncols existing elements in each row, to simplify access and decoder circuits. The algorithm to determine outliers in each row now becomes:
• For every row i from 1 to nrows
  • For every element j from 1 to ncols+nadr
    • Compare each εi,j to the mean value of the error distribution and determine the nadr outliers per row
• Save this information to ensure they are not accessed during DAC operation
Figure 5.9: Illustration of a 2-dimensional current steering DAC. Cells shaded black are the outliers for that particular row
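The row-wise procedure above can be sketched in software as follows. This is a behavioral illustration only: the function name `mark_row_outliers`, the 16 x 16 matrix with two redundant cells per row, the 20% mismatch, and the use of the sample mean of all errors as the reference are assumptions made for the example.

```python
import random

def mark_row_outliers(eps_rows, n_adr):
    """For each row, return a validity list: False for the n_adr cells farthest from the mean."""
    all_eps = [e for row in eps_rows for e in row]
    mean = sum(all_eps) / len(all_eps)            # mean of the error distribution
    valid_rows = []
    for row in eps_rows:
        # Rank the cells of this row by absolute distance from the mean, largest first.
        order = sorted(range(len(row)), key=lambda j: abs(row[j] - mean), reverse=True)
        outliers = set(order[:n_adr])
        valid_rows.append([j not in outliers for j in range(len(row))])
    return valid_rows

rng = random.Random(3)
n_rows, n_cols, n_adr, sigma = 16, 16, 2, 0.20
eps = [[rng.gauss(0.0, sigma) for _ in range(n_cols + n_adr)] for _ in range(n_rows)]
valid = mark_row_outliers(eps, n_adr)
kept = [e for row, flags in zip(eps, valid) for e, ok in zip(row, flags) if ok]
print(len(kept), max(abs(e) for e in kept))
```

The validity flags play the role of the saved per-cell information: only cells marked True would be addressed during normal operation.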

Since ncols << n (= nrows x ncols), the sample size of the errors ε of unit currents in each row is too small to approximate it as a truncated normal distribution. As each εi,j is sampled from a normal distribution, the units in a particular row are part of a Student’s t distribution with ncols-1 degrees of freedom, and Si² is the sampled variance of each row, determined as

\[ S_i^2 = \frac{1}{n_{cols}-1}\sum_{j=1}^{n_{cols}}\left(\varepsilon_{i,j} - \bar{\varepsilon}_i\right)^2 \]

Since each Si² has been generated from sampled values of a Student’s t distribution, the set of sampled variances of the nrows rows are chi-squared random variables with nrows-1 degrees of freedom. We can now estimate the confidence with which our chosen model is appropriate for a 2-dimensional design. Since the long term average of Si² approaches $\sigma^2_{2D}$, the variance of a 2-dimensional outlier-free DAC, we can derive the upper and lower bounds of our confidence interval for $\sigma^2_{2D}$ as follows:

\[ \overline{S^2} - t_{\alpha/2,\,n_{rows}-1}\sqrt{\frac{\sigma^2_{S^2}}{n_{rows}}} \;\le\; \sigma^2_{2D} \;\le\; \overline{S^2} + t_{\alpha/2,\,n_{rows}-1}\sqrt{\frac{\sigma^2_{S^2}}{n_{rows}}} \tag{5.14} \]

The confidence interval is 100(1-α)%, $\overline{S^2}$ is the mean of the sampled variances, and $\sigma^2_{S^2}$ is the variance of the sampled variances of each row.

In Figure 5.10, the green error bars show the bounds of our estimate for $\sigma^2_{2D}$ against empirical results from a 2-dimensional DAC, plotted as a solid green curve, for various nadr per row. α is chosen as 0.05 to give us a 95% confidence interval in our estimations. For each case, our model predicts $\sigma^2_{2D}$ to within 1% error (inset), confirming that each row contains values sampled from a Student’s t distribution. The variances are with respect to DNL, which follows the variance of errors in current units since we are dealing with a large sample size. The DNL variance for the 1-dimensional DAC is also plotted in red. It correctly shows lower values than the 2-dimensional case, indicating that removing nadr x nrows outliers over the entire design space gives us a higher probability of removing units further away from the mean of the distribution than selecting smaller samples as outliers for each row.

[Plot: variance of DNL versus number of total outliers (16 to 112) for the Baseline DAC, Redundancy 2D, and Redundancy 1D, with an inset]

Figure 5.10: Variance of the DNL for a 2-Dimension DAC (red) and a 1-Dimension DAC (green).

For measured mismatch of 20% among current units, we can also simulate worst case DNL for 5000 Monte Carlo runs to see the reductions we obtain with redundancy both in a 1-dimensional and 2-dimensional DAC. We have chosen 2 outliers per row to illustrate the improvements. Once again we observe that, across all Monte Carlo runs, there is significant reduction in DNL when redundancy among current sources is introduced. The 1-dimensional DAC shows more improvement than the 2-dimensional DAC since we can eliminate outliers from a larger sample space, giving more control on error reduction. Nonetheless, a 2-dimensional DAC exhibits 40% less DNL error than the baseline case with the additional cost of 32 redundant elements.

Figure 5.11: 5000 Monte Carlo runs plotting DNL for the baseline DAC (blue), 1D DAC with redundancy (green), and 2D DAC with redundancy (blue)
Reordering
Another technique we propose in this dissertation is to reduce INL error in thermometer DACs by reordering elements. We can rewrite the expression for INL as:

\[ \mathrm{INL}(k) \approx \sum_{j=1}^{k}\varepsilon_j - \frac{k}{n}\sum_{j=1}^{n}\varepsilon_j \tag{5.15} \]

Since the errors are distributed evenly around zero, we observe that worst case INL is due to worst case error accumulation. As designers, we have control over the switching scheme. This allows us to envision a scheme where we alternate between switching on elements with positive and negative deviations of current value from IM so as to reduce overall error accumulation. By reordering the elements in the current steering DAC, we can reduce overall INL with minimal switching overhead.

Figure 5.12: Reducing INL error by alternately switching between elements with positive and negative deviations from IM

The errors, $\varepsilon_j$, can be ordered in increasing order of their values, i.e., $\varepsilon_{(1)} \le \varepsilon_{(2)} \le \ldots \le \varepsilon_{(n)}$. The brackets in the subscript denote that the random variable is sampled from the ordered set. Even though the initial errors, $\varepsilon_j$, are independent random variables, the random variables of the ordered set, $\{\varepsilon_{(j)} \mid j = 1, \ldots, n\}$, are no longer independent. To decrease the error accumulation, we further reorder the elements from this ordered set by alternating between the minimum and maximum value elements. The expression for INL, for any code k, using this scheme of access of current sources is given by

\[ \mathrm{INL}(k) \approx \sum_{j=1}^{k}\varepsilon_{(p_j)} - \frac{k}{n}\sum_{j=1}^{n}\varepsilon_{(j)}, \qquad p_j = \begin{cases} (j+1)/2, & j \text{ odd} \\ n + 1 - j/2, & j \text{ even} \end{cases} \tag{5.16} \]

By adding errors with opposite signs we can cancel the effect of the error introduced by the previous code. Since the switching sequence has been altered, we expect the variance of INL to reduce as well, based on the first term in the equation for INL variance (5.12). This is confirmed in simulation for both the worst case INL over 5000 Monte Carlo runs and the variance of INL for every code k in the DAC.
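A minimal sketch of the alternating min/max access order is given below, assuming Gaussian unit errors with 20% mismatch and the linearized INL expression (accumulated error minus the ideal ramp). The function names and the seed are illustrative.

```python
import random

def alternate_min_max(eps):
    """Switching order: smallest, largest, second smallest, second largest, ..."""
    s = sorted(eps)
    order, lo, hi = [], 0, len(s) - 1
    while lo <= hi:
        order.append(s[lo])
        lo += 1
        if lo <= hi:
            order.append(s[hi])
            hi -= 1
    return order

def worst_inl(eps):
    """Worst-case |INL(k)| over all codes, using the linearized INL expression."""
    n, total, acc, worst = len(eps), sum(eps), 0.0, 0.0
    for k, e in enumerate(eps, start=1):
        acc += e
        worst = max(worst, abs(acc - k * total / n))
    return worst

rng = random.Random(4)
eps = [rng.gauss(0.0, 0.20) for _ in range(255)]
print(worst_inl(eps), worst_inl(alternate_min_max(eps)))
```

In the reordered sequence each large negative error is followed by a large positive one, so the running sum stays close to the magnitude of a single extreme error rather than growing like a random walk.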

[Plot: worst case INL (LSB), 0 to 8, versus Monte Carlo runs, 0 to 5000, for the Baseline and Reorder cases]

Figure 5.13: Worst case INL for 1-dimensional DAC with element reordering. With reordering, the worst-case error (green) is significantly reduced as compared to the baseline DAC (blue)

[Plot: variance of INL(k), 0 to 3, versus input code k, 0 to 256, for the Baseline and Reorder cases]

Figure 5.14: Variance of INL for each code of the DAC over 5000 Monte Carlo runs. There is significant reduction with reordering of elements (green) when compared to the baseline DAC (blue)

Reordering in a 2-Dimensional DAC

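For the practical 2-dimensional realization of the DAC, we employ a row-wise reordering scheme to minimize complexity while achieving considerably lower INL for each code of the DAC. We still access current elements sequentially within a row. The sum of errors in each row is taken as the row-error random variable, $\varepsilon_{R,i} = \sum_{j=1}^{n_{cols}}\varepsilon_{i,j}$, and then the rows are accessed by reordering these random variables, i.e., the rows are picked from $\{\varepsilon_{R,(1)}, \varepsilon_{R,(n_{rows})}, \varepsilon_{R,(2)}, \varepsilon_{R,(n_{rows}-1)}, \ldots\}$, where $\varepsilon_{R,(i)}$ is the ordered statistics of $\{\varepsilon_{R,i} \mid i = 1, \ldots, n_{rows}\}$.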
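The row-wise variant can be sketched in the same way: rows are ranked by their summed error and accessed alternately from the two ends of the ranking, while cells inside a row keep their original order. The 16 x 16 matrix, 20% mismatch, and helper names are assumptions made for illustration.

```python
import random

def reorder_rows(eps_rows):
    """Access rows in the order: smallest row sum, largest, second smallest, ..."""
    ranked = sorted(eps_rows, key=sum)
    order, lo, hi = [], 0, len(ranked) - 1
    while lo <= hi:
        order.append(ranked[lo])
        lo += 1
        if lo <= hi:
            order.append(ranked[hi])
            hi -= 1
    return order

def flatten(rows):
    return [e for row in rows for e in row]

def worst_inl(eps):
    """Worst-case |INL(k)| over all codes, using the linearized INL expression."""
    n, total, acc, worst = len(eps), sum(eps), 0.0, 0.0
    for k, e in enumerate(eps, start=1):
        acc += e
        worst = max(worst, abs(acc - k * total / n))
    return worst

rng = random.Random(7)
rows = [[rng.gauss(0.0, 0.20) for _ in range(16)] for _ in range(16)]
print(worst_inl(flatten(rows)), worst_inl(flatten(reorder_rows(rows))))
```

Only one rank per row has to be stored, which is the motivation for reordering rows instead of individual cells; the price is that errors still accumulate freely within a row.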

[Plot: worst case INLDAC (LSB), 0 to 8, versus Monte Carlo runs, 0 to 5000, for Baseline, Reorder 1D, and Reorder 2D]
Figure 5.15: Worst case INL when all elements are reordered (green) and when rows are reordered (red)
[Plot: variance of INL(k), 0 to 3, versus input code k, 0 to 256, for Baseline, Reorder 1D, and Reorder 2D]
Figure 5.16: Variance of INL at each code for the 2-dimension reordered DAC (red) and the 1-dimension reordered DAC (green)
Since a 2-dimensional DAC reduces the sample space for error reduction, there is less worst-case INL reduction for each Monte Carlo run as compared to the 1-dimensional DAC. But there is still a significant error reduction when compared to the baseline DAC. The variance of INL at each code for the 2-dimensional DAC has a similar envelope to that of the 1-dimensional DAC but follows a parabolic trend between successive rows. This is because, within a row, elements are not reordered and, therefore, follow a trend similar to that of the baseline DAC. Even with the limitations imposed by row-wise reordering, we get a 41% reduction in INLDAC for various values of σε, as shown in Figure 5.17.

[Plot: 3σ of INLDAC, 0 to 10, versus current source mismatch (%), 5 to 35, for Baseline, Reorder 1D, and Reorder 2D]

Figure 5.17: 3σ of INLDAC across different current source mismatch

Combining both Redundancy and Reordering

While reordering reduces cumulative errors, we notice that the INL of the reordered DAC, expressed in (5.16), depends on the errors of the unit elements, which are not reduced by reordering. We can then envision a 2-dimensional DAC where we first use redundancy to reduce |ε| and σε of unit sources within rows to provide some reduction in INL, followed by row reordering to further improve the INL. The only hardware required to combine both methods is additional memory to save the row ranks and the outlier flag of each unit cell. In Figure 5.18 we have simulated the improvement in INL for the two error reduction schemes against current element mismatch and number of outliers per row. The top meshed plane is for the baseline DAC, where none of the methods are implemented. The intermediate plane is the DAC INL when redundancy in current elements is present. It is evident that reducing the absolute error and its variance has a direct impact on INL reduction. The bottom plane is the INL when both redundancy and reordering are in effect for the DAC. In this case, the combination of these methods reduces INL by over 55%.

Figure 5.18: 3σ of INLDAC for methods presented in this work

Circuit Implementation and Challenges

A circuit designer faces several implementation challenges when designing a 2-D current steering thermometer DAC with nadr bits of redundancy per row. The three major hurdles that need to be overcome are:

1. Determining which current elements are outliers

2. Disabling a current source once it has been marked as an outlier

3. Addressing current elements taking the position of outliers into account

In this section we discuss how each of these issues is addressed in the first generation design of the thermometer DAC, where two redundant elements were added to each row (nadr = 2).
Determining Outliers
In order to eliminate the two elements whose current values have the largest absolute deviations from the mean current value of the DAC cell, IM, we first sequester the two smallest (IA and IB) and two largest (IC and ID) current cells in each row by comparing each cell in the row to IM. With the aid of the positional chart shown in Figure 5.19, we can then eliminate two outliers per row (highlighted in red) based on their relative distance from IM. If IM lies between IB and IC, in the middle of the four sequestered elements, we mark IA and ID as outliers. If IM lies in the upper quadrant between IC and ID, we eliminate the two smallest elements, IA and IB. Finally, if IM lies in the lower quadrant between IA and IB, we eliminate the two largest elements, IC and ID.

[Positional chart: for Scenarios 1, 2, and 3, the sequestered currents IA, IB, IC, and ID are placed along an axis of increasing current value, with IM marked at its position among them]

Figure 5.19: Positional chart to determine redundant sources
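The three scenarios can be expressed as a small decision function. The sketch takes IA and IB as the two smallest and IC and ID as the two largest currents of a row, as in the scenario descriptions, and treats any IM below IB as the lower case and any IM above IC as the upper case; that boundary handling, the function name, and the example values are assumptions.

```python
def row_outliers(row_currents, i_m):
    """Return the indices of the two cells to disable in a row (three-scenario rule)."""
    order = sorted(range(len(row_currents)), key=lambda j: row_currents[j])
    a, b = order[0], order[1]         # two smallest currents: IA <= IB
    c, d = order[-2], order[-1]       # two largest currents:  IC <= ID
    if i_m < row_currents[b]:         # IM in the lower region: both large cells are farthest
        return {c, d}
    if i_m > row_currents[c]:         # IM in the upper region: both small cells are farthest
        return {a, b}
    return {a, d}                     # IM between IB and IC: drop one cell from each end

row = [0.8, 0.9, 1.0, 1.1, 1.2]       # hypothetical normalized currents
print(row_outliers(row, 1.0), row_outliers(row, 0.85), row_outliers(row, 1.15))
```

Only comparisons against IM and among the four sequestered cells are needed, so the rule does not require the absolute value of any current.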
In the first generation design of the DAC, a computer that runs MATLAB and Data Acquisition toolboxes coordinated the comparison and elimination. The second generation implements these functionalities on-chip and will be discussed later in this chapter.

Disabling Outliers
Once a current element has been determined as an outlier for that row, we require a method of storing that information for that element. This can be accomplished with the help of an SRAM cell attached to each element, as shown in Figure 5.20 with the topology of the 6T SRAM shown in Figure 5.21.

Figure 5.20: DAC Cell with SRAM

[Schematic: SRAM cell with signals WWL, WBL, RWL, and Valid]

Figure 5.21: 6T SRAM with a transistor to read its contents
If a cell needs to be disabled since it is flagged as an outlier for that row, we write a ‘0’ to the SRAM. If the cell is included during normal DAC operation, we write ‘1’ to the SRAM unit. This information is written to the SRAM of each cell once outliers of every row have been determined.

Element Addressing
If the two outliers in a row existed right at the beginning or at the very end, it would be trivial to skip these elements during normal DAC operation. Unfortunately, outliers can exist in any column within the row, making element addressing during digital-to-analog conversion challenging. For example, if the outliers of a row are highlighted in red, and we wish to access the eighth element in the row, the two situations in Figure 5.22 require different addressing schemes based on the position of the outliers.

Figure 5.22: In the top figure, the two outliers occur after the eighth element being accessed (indicated by the blue arrow). We don't need to take the position of the outliers into account in this scenario. In the bottom figure, the outliers occur before the element being accessed, and this situation needs to be dealt with using additional logic.

To overcome this, we design the addressing scheme as a priority encoder. The truth table is shown in Table 5.1 below.

TABLE 5.1

TRUTH TABLE FOR PRIORITY ENCODER SCHEME

Previous Sum   VB(i)   VB(i+1)   Next Sum
00             1       1         00
00             0       1         01
00             1       0         00
00             0       0         10
01             1       1         01
01             1       0         10
10             1       1         10
10             0       1         10
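One possible reading of the truth table, with each entry taken as (Previous Sum, VB(i), VB(i+1), Next Sum), is sketched below as a behavioral model. Here VB is the validity bit stored in a cell's SRAM, Sum counts how many cells must be skipped, and the physical column of the i-th logical element is i plus the running Sum. The function names and the example row are hypothetical, and the model assumes at most two outliers per row.

```python
def next_sum(prev_sum, vb_i, vb_i1):
    """One stage of the skip counter; sums are 0, 1, or 2 (binary 00, 01, 10)."""
    if prev_sum == 0:
        if vb_i:
            return 0                    # cell i is valid, nothing to skip
        return 1 if vb_i1 else 2        # skip one cell, or two if the next is also invalid
    if prev_sum == 1:
        return 1 if vb_i1 else 2        # already shifted by one: check the shifted cell
    return 2                            # both outliers already passed

def physical_index(valid, logical):
    """Map a logical cell position to its physical column in the row."""
    s = 0
    for i in range(logical + 1):
        s = next_sum(s, valid[i], valid[i + 1])
    return logical + s

# Hypothetical row: 6 usable cells plus 2 outliers (False) at columns 1 and 4.
valid = [True, False, True, True, False, True, True, True]
print([physical_index(valid, k) for k in range(6)])
```

For the example row the mapping skips columns 1 and 4, so the six logical elements land on physical columns 0, 2, 3, 5, 6, and 7.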

VB(i) and VB(i+1) indicate the validity of the i-th and (i+1)-th cells, stored in their SRAMs. The ith cell is the current cell being accessed and Sum keeps track of how many elements to skip in order to access the correct current cell. The first cell in the row has a sum of 00 since no outliers have been encountered yet. The priority encoder designed for our system is for two outliers per row. This logic can be easily extended to include more than two outliers per row.
First Generation DAC Design
The first generation of the 8 bit thermometer current steering DAC was designed and taped out in the TSMC 65nm CMOS process, and each DAC cell is as shown in Figure 5.23.
Figure 5.23: Unit current cell of the thermometer DAC - Generation 1
The chip micrograph is shown in Figure 5.24.

Figure 5.24: Chip micrograph of the DAC designed in the TSMC 65nm process
DNL Measurements
Measured DNL for the DAC is shown in Figure 5.25. Without redundancy, the baseline DAC has a maximum DNL of 0.52. With two bits of redundancy per row, the DNL drops to 0.33. This is a reduction in DNL error of 38%.

Figure 5.25: Measured results of DNL of the 8-bit thermometer current steering DAC. Shaded area is used to emphasize reduction in DNL error achieved with two bits of redundancy per row
INL Measurements
Measured results for INL are shown in Figure 5.26. As expected, two redundant elements per row reduce INL from 3.74 to 2.78, a reduction of 28%. After reordering the 16 rows, however, we observe an additional reduction of 30%, bringing the overall reduction in INL from the baseline case to 48%.

Figure 5.26: Measured results for INL of 8-bit thermometer current steering DAC.
Second Generation DAC Cell Design
The unit cells designed for use in the first generation of the DAC occupied a lot of area for logic and switches. For the next generation, the size of the DAC cell was drastically shrunk to show that we could achieve the same reduction in DNL error without any area penalty. The schematic of the new DAC cell is presented in Figure 5.27.

Figure 5.27: Unit cell designed for the second generation of the DAC
The top PMOS transistor determines the current contribution of the cell while the cascode PMOS provides additional isolation. The switch controlled by the signal Current Row is common to all cells in a particular row. The role of the rest of the switches is described below.
The DAC has 3 phases of operation: calibration, outlier enable/disable, DAC operation.
The operation of the cell in each phase is briefly described.
Calibration
During calibration, the current contribution of each current cell is read and recorded for comparison with the mean. In this phase, Column, Override, and Current Row are asserted to allow the current of the chosen cell to be read at the output.
Outlier Enable/Disable
Once the current of each cell has been compared to the mean current of the DAC matrix, and it has been ascertained whether the cell is an outlier, its SRAM must be written with the appropriate value. For verification of this phase, Column and Current Row are asserted. If a ‘1’ was written in the SRAM, the current contribution of that cell will be recorded at the output, indicating that the cell is not an outlier and is to be used during normal DAC operation. On the other hand, if a ‘0’ was written to the SRAM, there is no path for the current to flow from this cell to the output, indicating that this cell is an outlier and is to be disabled during normal DAC operation.
DAC Operation
During normal DAC operation, the switch controls depend on which of two scenarios applies.

Figure 5.28(a): Scenario 1: Cell under consideration is the highest accessed cell

Figure 5.28(b): Scenario 2: Cell under consideration (red) occurs before the highest accessed cell

In Scenario 1, the cell we are analyzing is also the highest cell we are attempting to access. In this case, Column and Current Row are asserted and its current will contribute to the output if its SRAM stores a ‘1’. In Scenario 2, the cell we are analyzing occurs before the highest cell we are attempting to access. In this case, Next Row is asserted and, since this is a thermometer DAC, all rows before the highest accessible row will also be asserted. Therefore, Current Row is also asserted and current will flow to the output if its SRAM stores a ‘1’.


Justification for size of unit cell
This DAC cell uses 8 extra transistors (7 in the SRAM, 1 for Override). In order to justify this area increase, we must prove that using extra transistors gives more reduction in error than simply increasing the size of the current contributing PMOS.
Figure 5.29: DAC DNL for various sizes of current contributing PMOS. The star indicates area increase with new DAC cell and two bits of redundancy per row
From Figure 5.29, we notice that, to achieve the same DNL of 0.32 as we get for a DAC with two bits of redundancy, we need to increase the area of the current contributing PMOS by a factor of 6. With two bits of redundancy per row and using the DAC cell from Figure 5.20, our effective area increases only by a factor of 3.2 for a DNL of 0.32. This implies that we can achieve the lower DNL using redundancy with 55% less area usage than by increasing the size of the PMOS transistor to reduce mismatch.

Generating the Mean Current
In order to determine outliers, we need to compare the current contribution of each cell to the mean current of the DAC matrix. Circuit approaches of generating and storing the mean current value of the DAC cell include charge storage on capacitors with an amplifier to maintain the precise charge for a long duration. Unfortunately, this requires a large capacitor and a high gain operational amplifier and the method consumes considerable amount of area and power. Further, the mean value needs to be constantly refreshed to negate charge leakage and accommodate for drifts in environmental conditions.
Median as an approximation for the mean
To overcome these issues involved in generating the mean, we investigate whether the median current value is a good representation of the mean of the DAC. We define ε as the error bound between the median, IM, and the mean, Iµ, of the DAC. We want IM to fall within the bounds specified by ε, as shown in the probability condition in (5.17):

\[ P\left[I_{\mu} - \varepsilon \le I_M \le I_{\mu} + \varepsilon\right] \tag{5.17} \]

(5.17) can be redefined using the probability density function of IM, as shown in (5.18):

\[ \int_{I_{\mu}-\varepsilon}^{I_{\mu}+\varepsilon} f_{I_M}(x)\,dx \tag{5.18} \]

Order Statistics
The distribution of IM is derived from the concept of order statistics. The order statistics of a random sample X1…Xn are the sample values placed in ascending order and denoted as X(1)…X(n). Since the mismatch profile of the current sources in the DAC assumes a Gaussian distribution, we conclude that the current cells in the DAC constitute a sample of size n from an infinite Gaussian population. The current sources satisfy the order statistics requirement when they are ranked such that I(1) < I(2) < … < I(n), and IM is the median current cell of this ordered sample. The pdf of the rth order statistic I(r) is given as:

\[ f_{(r)}(x) = \frac{n!}{(r-1)!\,(n-r)!}\left[\int_{-\infty}^{x} f(t)\,dt\right]^{r-1}\left[\int_{x}^{\infty} f(t)\,dt\right]^{n-r} f(x) \tag{5.19} \]

where f(x) is the pdf of a Gaussian distribution given as:

\[ f(x) = \frac{1}{\sigma\sqrt{2\pi}}\exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right) \tag{5.20} \]

Since the pdf of each element in the ranked sample can now be determined by (5.19), the pdf of the median cell is given in (5.21) as:

\[ f_{I_M}(x) = \frac{n!}{\left[\left(\frac{n-1}{2}\right)!\right]^2}\left[\int_{-\infty}^{x} f(t)\,dt\right]^{\frac{n-1}{2}}\left[\int_{x}^{\infty} f(t)\,dt\right]^{\frac{n-1}{2}} f(x) \tag{5.21} \]

(5.21) can be simplified for sufficiently large n to (5.22):

\[ I_M \sim \mathcal{N}\left(\mu,\ \left(\frac{1}{2\sqrt{n}\,f(\mu)}\right)^2\right) \tag{5.22} \]

where f(µ) is the Gaussian pdf of (5.20) evaluated at the mean.
Figure 5.30 graphically represents the solution in (5.18) for various values of ε. For our 8 bit DAC design with two bits of redundancy per row, the error between IM and Iµ is close to 4%. This error is tolerable since outliers can have an absolute deviation as large as 50% of Iµ.
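How closely a sample median tracks the mean can be explored with a short Monte Carlo sketch. It compares the empirical spread of the median of n Gaussian unit currents with the standard large-sample result for a sample median, a standard deviation of 1/(2·sqrt(n)·f(µ)). The sample size of 288 cells, the 20% mismatch, the run count, and the seed are assumptions chosen for illustration.

```python
import math
import random
import statistics

def median_spread(n, sigma, runs, seed=5):
    """Empirical standard deviation of the sample median of n Gaussian currents."""
    rng = random.Random(seed)
    medians = [statistics.median(rng.gauss(1.0, sigma) for _ in range(n))
               for _ in range(runs)]
    return statistics.pstdev(medians)

n, sigma = 288, 0.20                                  # assumed matrix size and mismatch
empirical = median_spread(n, sigma, runs=400)
f_mu = 1.0 / (sigma * math.sqrt(2.0 * math.pi))       # Gaussian pdf evaluated at the mean
asymptotic = 1.0 / (2.0 * math.sqrt(n) * f_mu)        # large-n standard deviation of the median
print(empirical, asymptotic)
```

Because the spread shrinks as 1/sqrt(n), a larger DAC matrix makes the median an increasingly good stand-in for the mean.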

Figure 5.30: Plot of median confidence for various errors

Median Generation

The median current is generated using a replica unit current cell in a selection sort algorithm to tune its value to lie exactly in the middle of the currents in the DAC matrix. We accomplish this with a 5-stage successive approximation method using a 5-bit binary DAC, as shown in Figure 5.31.


[Figure 5.31 block labels: DAC median cell (Vbias); 5b binary DAC, S<4:0>; 8 bit current steering DAC; counter and SAR logic; current comparator with inputs IinA and IinB and outputs IinA>IinB, IinB>IinA (mismatch detected)]
Figure 5.31: Median Generation using a successive approximation approach
In each cycle of approximation, the current of the replica median cell, IM, is compared with each of the 288 unit cells in the DAC matrix using a high-resolution 1-bit ADC current comparator, whose operation is discussed in the next section. Once all comparisons have been made, the binary DAC is updated based on whether IM lies in the upper or lower half of the current distribution. Over 1000 Monte Carlo runs, 5 cycles of approximation generate an IM that tracks Iµ with an accuracy of 93.3%.
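The successive approximation loop described above can be sketched in software. This is a behavioural illustration under stated assumptions, not the on-chip implementation: the replica current is modelled as `i_min + code * lsb`, the comparator is ideal, and the names `sar_median_code` and `make_cells`, the 5% mismatch, and the DAC range are all hypothetical.

```python
import random


def sar_median_code(cells, i_min, lsb, bits=5):
    """Return the binary DAC code reached after `bits` approximation cycles.

    Assumed model: replica current I_M = i_min + code * lsb, ideal comparator.
    """
    code = 0
    half = len(cells) // 2
    for bit in reversed(range(bits)):          # MSB first
        trial = code | (1 << bit)
        i_m = i_min + trial * lsb              # replica current for this trial
        # One comparator decision per unit cell: count the cells below I_M.
        below = sum(1 for c in cells if c < i_m)
        if below <= half:                      # I_M is not yet in the upper half
            code = trial                       # keep the bit, otherwise drop it
    return code


def make_cells(n=288, sigma_rel=0.05, seed=7):
    """Generate n normalised unit currents with Gaussian mismatch (assumed 5%)."""
    rng = random.Random(seed)
    return [rng.gauss(1.0, sigma_rel) for _ in range(n)]


if __name__ == "__main__":
    cells = make_cells()
    i_min, lsb = 0.8, 0.4 / 32                 # assumed range: nominal +/- 20%
    code = sar_median_code(cells, i_min, lsb)
    print(code, i_min + code * lsb)
```

With a monotonic replica current, the loop settles on the largest code whose current still sits at or below the middle of the ranked cells, so the residual error is bounded by one LSB of the binary DAC as long as the DAC range covers the median.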
High Resolution Current Comparator
In order to compare IM with each of the unit cells in the DAC matrix, we have designed a high resolution current comparator with nonlinear sensing, as shown in Figure 5.32. This topology combines the advantages of high resolution and fast amplification for low current levels and reduced voltage swings at VA for larger current levels.

Figure 5.32: High Precision Current Comparator
Figure 5.33: Sampling (fast) and averaging (slow) clocks used in comparison

For well-matched MA and MB, I1 equals I2. When two currents IinA and IinB are compared, IDiff then equals IinB – IinA. For positive IDiff, VA increases and VB decreases, causing MP to turn ON, creating a feedback loop. Similarly, for negative IDiff, MN turns ON. These voltage excursions are sampled by the two D flip flops to determine whether IinA > IinB or vice versa. Unfortunately, any mismatch between MA and MB will contribute to IDiff, causing incorrect mismatch detection. To minimize this, we use Dynamic Element Matching, where we switch IinA and IinB feeding into MA and MB with a fast clock while sampling IDiff with a slow clock. This allows both IinA and IinB to equally sample the offset between MA and MB, averaging out their mismatch over the course of comparison. By doing this, we are able to detect as little as 5% mismatch between IinA and IinB, measured over 1000 Monte Carlo runs.
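The averaging effect of Dynamic Element Matching can be illustrated with a simple numerical model. This sketch assumes the MA/MB mismatch acts as a fixed additive offset on the sensed difference and that the sign of the sensed value is restored whenever the inputs are swapped; both are simplifying assumptions, and the function names are hypothetical.

```python
def single_phase(i_a, i_b, offset):
    """Sensed difference for one fixed connection, including the mirror offset."""
    return (i_b - i_a) + offset


def chopped_average(i_a, i_b, offset, fast_cycles=8):
    """Average the sensed difference while swapping the inputs every fast cycle.

    Assumption: the offset is additive and does not move with the inputs.
    """
    total = 0.0
    for k in range(fast_cycles):
        if k % 2 == 0:
            sensed = (i_b - i_a) + offset      # IinA into MA, IinB into MB
        else:
            sensed = -((i_a - i_b) + offset)   # inputs swapped, sign restored
        total += sensed
    return total / fast_cycles                 # slow-clock average


if __name__ == "__main__":
    # A 2% difference masked by a larger offset of opposite sign.
    print(single_phase(1.00, 1.02, -0.05))     # negative: wrong decision
    print(chopped_average(1.00, 1.02, -0.05))  # positive: offset cancels
```

Over an even number of fast cycles the offset appears once with each sign and cancels, leaving only the true difference between the two input currents.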

Eliminating Outliers
Once IM has been generated, we can use the positional chart in Figure 5.22 to find IA, IB, IC, and ID for each row. Ranking unit cells in each row is accomplished with a time-to-digital converter using a current starved ring oscillator, which triggers an up-counter for a specified duration. The circuit topology is shown in Figure 5.34.
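The digitise-and-compare step can be sketched as follows, using the 93 nA resolution and 7-bit counter of the converter in Figure 5.34. Several things here are assumptions made for illustration: the count is taken to be linear in the cell current, the unit current of about 8 µA is invented, and the rule of dropping the two codes farthest from the IM code stands in for the positional chart of Figure 5.22, which is not reproduced here.

```python
LSB_NA = 93       # converter resolution in nA (from the text)
MAX_CODE = 127    # full scale of a 7-bit counter


def tdc_code(current_na):
    """Quantise a cell current in nA to a saturating 7-bit count.

    Assumption: the counter value is linear in the cell current.
    """
    return max(0, min(MAX_CODE, int(current_na // LSB_NA)))


def drop_outliers(row_na, i_m_na, n_redundant=2):
    """Return the indices of the cells kept in one row.

    Hypothetical policy: remove the n_redundant cells whose codes are
    farthest from the digital representation of I_M.
    """
    ref = tdc_code(i_m_na)
    codes = [tdc_code(i) for i in row_na]
    ranked = sorted(range(len(codes)), key=lambda k: abs(codes[k] - ref))
    keep = ranked[:len(codes) - n_redundant]
    return sorted(keep)


if __name__ == "__main__":
    # 16 nominal cells plus two deliberately bad ones (values are assumptions).
    row = [8000.0] * 16 + [6000.0, 10500.0]
    print(drop_outliers(row, 8000.0))
```

Working on the stored counter codes rather than on the analog currents mirrors the register-based comparison described in the text, at the cost of the converter's quantisation error.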
Figure 5.34: Time-to-Digital converter

The circuit has a resolution of 93nA and uses a 7-bit counter to digitize the current contribution from every unit cell. These values are stored in registers and compared to a digital representation of IM to determine outliers.

Cost of Calibration on Overall DAC Design
The additional calibration circuitry utilizes minimum sized logic and switches and increases the required area by an additional 11% on top of the DAC matrix. With two bits of redundancy per row and the designed calibration blocks, we still achieve a reduced DNL with 40% less area usage as compared to simply increasing the size of the current driving PMOS transistor, confirming that the benefits of calibration outweigh its cost in complexity and area.

Figure 5.35: DNL reduction taking area occupied by the calibration circuitry into account
Conclusion
In this chapter we propose two techniques to lower non-linearity errors in thermometer current steering digital-to-analog converters, which occur due to increasing mismatch among unit current cells in advanced CMOS processes. The first technique introduces additional current sources in the DAC and eliminates outliers to reduce the error distribution among units. The redundancy in unit sources costs little in area and power, yet it reduces mismatch and hence the DNL and INL errors in the DAC. Measured results of an 8-bit DAC designed in the TSMC 65nm CMOS technology confirm that redundancy leads to a 38% reduction in DNL and a 28% reduction in INL.
The second technique reorders rows of the 2-dimensional DAC to minimize accumulation of errors and therefore reduce INL error of the DAC. Measured results show a further 30% reduction in INL error of the DAC. The two techniques combined show a 48% reduction in INL error.
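The effect of row reordering on error accumulation can be illustrated with a small example. This is a sketch under assumptions rather than the ordering circuit itself: each row is reduced to a single signed error, rows of opposite sign are interleaved largest first, and the peak of the running sum is used as a rough stand-in for INL. The function names and the error values are hypothetical.

```python
from itertools import zip_longest


def alternate_rows(row_errors):
    """Interleave rows with positive and negative error, largest magnitude first."""
    pos = sorted((e for e in row_errors if e >= 0), reverse=True)
    neg = sorted(e for e in row_errors if e < 0)   # most negative first
    order = []
    for p, n in zip_longest(pos, neg):
        if p is not None:
            order.append(p)
        if n is not None:
            order.append(n)
    return order


def peak_accumulated_error(row_errors):
    """Largest absolute running sum of row errors (a rough INL proxy)."""
    total, peak = 0.0, 0.0
    for e in row_errors:
        total += e
        peak = max(peak, abs(total))
    return peak


if __name__ == "__main__":
    rows = [0.4, 0.3, 0.1, -0.2, -0.5, -0.1]       # illustrative row errors
    print(peak_accumulated_error(rows))
    print(peak_accumulated_error(alternate_rows(rows)))
```

In this example the rows as given let three positive errors accumulate before any negative row is switched in, whereas the alternating order cancels each positive row with a negative one almost immediately.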

We also discuss circuit solutions to implement both redundancy and reordering on-chip and demonstrate that the calibration circuitry improves both DNL and INL errors with 40% less area usage than by simply making unit current cells larger to limit their mismatch. This confirms that implementation of both redundancy within elements and reordering of rows of a thermometer DAC offers superior performance with a cheaper power overhead and area footprint, making it a viable solution to decrease errors in thermometer DACs required for precise calibration in a wide variety of wireless and wireline applications.

CHAPTER 6

CONCLUSION AND FUTURE WORK

Conclusion

The goal of this dissertation is to present on-chip circuit compensation techniques to reduce the adverse effects of process variation in advanced CMOS mixed signal circuit blocks.

Circuit performance is increasingly impacted by process variation and it becomes more expensive to perform post-fabrication techniques to bring performance of ICs closer to their nominally designed specification. In this dissertation, we have presented two techniques to overcome the adverse effects of process variation in low-noise amplifiers and voltage controlled oscillators. We designed a novel bias circuit using statistical feedback to measure changes in threshold voltage, which occur due to variations in process, supply voltage, and temperature. The error signal generated is fed back to one of the inputs of the LNA to compensate for variations in overall transconductance of its input transistor, and hence the voltage gain. By compensating for variations in transconductance, we are also able to reduce variations in noise figure and input match of the LNA. Measured results of 100 LNAs over two wafer runs in the TSMC 65nm CMOS process show a 3.6x reduction in voltage gain variations due to process variation. Our scheme also decreases gain variations due to temperature changes from 2310 ppm/°C to 1554 ppm/°C and due to supply voltage changes from 275 ppt/V to 29 ppt/V. Our technique is scalable with process and can be applied to other types of amplifiers where transconductance determines gain, and we demonstrate similar reductions in gain variation for a common source amplifier.


We also designed a switched-capacitor based feedback scheme that tracks drifts in the center frequency of a current starved ring oscillator (CSRO) that occur due to variations in process and temperature. The designed circuit generates an error signal to compensate for this change. Measured results for CSROs designed in the IBM 90nm CMOS process show that the scheme reduces the spread in center frequency from 15.2% to 6.2% and decreases the spread to less than 1% across temperature.
As processes scale, random mismatches among identically designed circuit blocks become increasingly severe. This affects the performance of various circuit blocks where accurate matching of circuit units is extremely important, such as resistors in a resistor ladder and DAC, differential input pairs in a comparator bank, and sensing transistors in imagers. In this dissertation, we study how mismatch of unit current cells in a thermometer current steering DAC affects its non-linearity performance. Instead of making the cells larger to reduce mismatch at the expense of increased area and power, we propose two new techniques – redundancy among current units, and reordering of unit cells – to improve both DNL and INL performance of such DACs. Redundancy among unit cells is used to remove outlier elements from the DAC to reduce the overall error and mismatch of unit cells. Using two redundant elements per row of a two dimensional DAC reduces DNL and INL error by 38% and 28% respectively in an 8-bit thermometer current steering DAC designed in the TSMC 65nm CMOS process. Reordering of unit elements reduces overall error accumulation in the DAC. In our two-dimensional DAC design, we alternate between rows with opposite error signs. Measured results show an additional 30% reduction in INL error.

Future Work
The techniques we have presented in this dissertation reduce the effects of variation on performance of various mixed signal blocks that can be used in a variety of wireline and wireless systems. A natural succession to the work presented is to focus on studying the effects of variation on larger systems and combining some of the techniques we have proposed to improve yield across process, temperature, and supply voltage. Preliminary work has been demonstrated by Gangasani et al. in [107].
Systems can also be optimally designed to improve an overall metric such as yield, power, or speed, instead of just reducing the variation of a specific electrical parameter. The work by Datta et al. in [108] proposes a sizing algorithm to improve overall profit of ICs rather than overall yield.
On-chip compensation circuit techniques are a great tool for designing robust mixed signal circuits and systems without large overheads in area, power, and cost. By targeting the metric whose variation we are controlling and narrowing down the electrical parameters in CMOS devices which contribute to its variation, we can derive inspiration from a wide variety of feedback techniques and statistical solutions to design self-healing circuit topologies which track and adapt to changes in process, temperature, and supply voltage. Occupying smaller footprints and consuming less power than traditional post-fabrication techniques, on-chip variation compensation circuit techniques are increasingly becoming the modus operandi for designing low cost, robust mixed signal systems.

REFERENCES
[1] http://www.theinquirer.net/inquirer/news/1635320/global-semiconductor-market-growscent
[2] Moore, G., “Cramming more components onto integrated circuits”, Electronics, vol. 38, No. 8, April 19, 1965
[3] Cherry, S.; , "Edholm's law of bandwidth," Spectrum, IEEE , vol.41, no.7, pp. 58- 60, July 2004
[4] Edholm, P., “Towards High Speed Wireless Ubiquity and Openness: Delivering Hyperconnectivity”, Nortel Enterprise Solutions.
[5] Abidi, A.A.; , "The Path to the Software-Defined Radio Receiver," Solid-State Circuits, IEEE Journal of , vol.42, no.5, pp.954-966, May 2007
[6] Brandolini, M.; Rossi, P.; Manstretta, D.; Svelto, F.; , "Toward multistandard mobile terminals - fully integrated receivers requirements and architectures," Microwave Theory and Techniques, IEEE Transactions on , vol.53, no.3, pp. 1026- 1038, March 2005
[7] Brandolini, M.; Rossi, P.; Manstretta, D.; Svelto, F.; , "Toward multistandard mobile terminals - fully integrated receivers requirements and architectures," Microwave Theory and Techniques, IEEE Transactions on , vol.53, no.3, pp. 1026- 1038, March 2005
[8] Steyaert, M.S.J.; De Muer, B.; Leroux, P.; Borremans, M.; Mertens, K.; , "Low-voltage low-power CMOS-RF transceiver design," Microwave Theory and Techniques, IEEE Transactions on , vol.50, no.1, pp.281-287, Jan 2002
[9] http://rfdesign.com/news/CMOS-transceiver-sockets/
[10] Kwyro Lee; Nam, I.; Ickjin Kwon; Gil, J.; Kwangseok Han; Park, S.; Bo-Ik Seo; , "The impact of semiconductor technology scaling on CMOS RF and digital circuits for wireless application," Electron Devices, IEEE Transactions on , vol.52, no.7, pp. 1415-1422, July 2005
[11] Woerlee, P.H.; Knitel, M.J.; van Langevelde, R.; Klaassen, D.B.M.; Tiemeijer, L.F.; Scholten, A.J.; Zegers-van Duijnhoven, A.T.A.; , "RF-CMOS performance trends," Electron Devices, IEEE
[12] Fukui, H.; , "Design of Microwave GaAs MESFET's for Broad-Band Low-Noise Amplifiers," Microwave Theory and Techniques, IEEE Transactions on , vol.27, no.7, pp. 643- 650, Jul 1979
[13] Kwyro Lee; Nam, I.; Ickjin Kwon; Gil, J.; Kwangseok Han; Park, S.; Bo-Ik Seo; , "The impact of semiconductor technology scaling on CMOS RF and digital circuits for wireless application," Electron Devices, IEEE Transactions on , vol.52, no.7, pp. 1415-1422, July 2005
[14] International Technology Roadmap for Semiconductors – Executive Summary, 2011
[15] Bowman, K.A.; Alameldeen, A.R.; Srinivasan, S.T.; Wilkerson, C.B.; , "Impact of Die-to-Die and Within-Die Parameter Variations on the Clock Frequency and Throughput of Multi-Core Processors," Very Large Scale Integration (VLSI) Systems, IEEE Transactions on , vol.17, no.12, pp.1679-1690, Dec. 2009
[16] Menezes, N., “The Good, the Bad, and the Statistical”, Intl. Symp. Phys. Design, March 18-21 2007
[17] Duvall, S.G.; , "Statistical circuit modeling and optimization," Statistical Metrology, 2000 5th International Workshop on , vol., no., pp.56-63, 2000

[18] Bowman, K.A.; Duvall, S.G.; Meindl, J.D.; , "Impact of die-to-die and within-die parameter fluctuations on the maximum clock frequency distribution for gigascale integration," Solid-State Circuits, IEEE Journal of , vol.37, no.2, pp.183-190, Feb 2002
[19] Agarwal, K.; Nassif, S.; , "Characterizing Process Variation in Nanometer CMOS," Design Automation Conference, 2007. DAC '07. 44th ACM/IEEE , vol., no., pp.396-399, 4-8 June 2007
[20] Asenov, P, Kamsani N., Reid, D., Millar, C., Roy, S., Asenov, A., “Combining process and statistical variability in the evaluation of the effectiveness of corners in digital circuit parametric yield analysis,” Solid-State Device Research Conference (ESSDERC), 2012 Proceedings of the European, 14-16 Sept. 2010
[21] Duvall, S.G.; , "Statistical circuit modeling and optimization," Statistical Metrology, 2000 5th International Workshop on , vol., no., pp.56-63, 2000
[22] Asenov, A.; Brown, A.R.; Davies, J.H.; Kaya, S.; Slavcheva, G.; , "Simulation of intrinsic parameter fluctuations in decananometer and nanometer-scale MOSFETs," Electron Devices, IEEE Transactions on , vol.50, no.9, pp. 1837- 1852, Sept. 2003
[23] Bazizi, E.M.; Pakfar, A.; Fazzini, P.F.; Cristiano, F.; Tavernier, C.; Claverie, A.; Burenkov, A.; Pichler, P.; , "Comparison between 65nm bulk and PD-SOI MOSFETs: Si/BOX interface effect on point defects and doping profiles," Solid State Device Research Conference, 2009. ESSDERC '09. Proceedings of the European , vol., no., pp.292-295, 14-18 Sept. 2009
[24] Asenov, A.; , "Random dopant induced threshold voltage lowering and fluctuations in sub-0.1 μm MOSFET's: A 3-D “atomistic” simulation study," Electron Devices, IEEE Transactions on , vol.45, no.12, pp.2505-2513, Dec 1998
[25] Asenov, A.; Saini, S.; , "Suppression of random dopant-induced threshold voltage fluctuations in sub-0.1-μm MOSFET's with epitaxial and δ-doped channels," Electron Devices, IEEE Transactions on , vol.46, no.8, pp.1718-1724, Aug 1999
[26] Yun Ye; Liu, F.; Min Chen; Nassif, S.; Yu Cao; , "Statistical Modeling and Simulation of Threshold Variation Under Random Dopant Fluctuations and Line-Edge Roughness," Very Large Scale Integration (VLSI) Systems, IEEE Transactions on , vol.19, no.6, pp.987-996, June 2011
[27] Reid, D.; Millar, C.; Roy, G.; Roy, S.; Asenov, A.; , "Analysis of Threshold Voltage Distribution Due to Random Dopants: A 100 000-Sample 3-D Simulation Study," Electron Devices, IEEE Transactions on , vol.56, no.10, pp.2255-2263, Oct. 2009
[28] Reid, D.; Millar, C.; Roy, G.; Roy, S.; Asenov, A.; , "Understanding LER-induced statistical variability: A 35,000 sample 3D simulation study," Solid State Device Research Conference, 2009. ESSDERC '09. Proceedings of the European , vol., no., pp.423-426, 14-18 Sept. 2009
[29] Asenov, A.; Brown, A.R.; Davies, J.H.; Kaya, S.; Slavcheva, G.; , "Simulation of intrinsic parameter fluctuations in decananometer and nanometer-scale MOSFETs," Electron Devices, IEEE Transactions on , vol.50, no.9, pp. 1837- 1852, Sept. 2003
[30] Asenov, A.; Kaya, S.; Brown, A.R.; , "Intrinsic parameter fluctuations in decananometer MOSFETs introduced by gate line edge roughness," Electron Devices, IEEE Transactions on , vol.50, no.5, pp. 1254- 1260, May 2003

[31] Nassif, S.; Bernstein, K.; Frank, D.J.; Gattiker, A.; Haensch, W.; Ji, B.L.; Nowak, E.; Pearson, D.; Rohrer, N.J.; , "High Performance CMOS Variability in the 65nm Regime and Beyond," Electron Devices Meeting, 2007. IEDM 2007. IEEE International , vol., no., pp.569-571, 10-12 Dec. 2007
[32] Agarwal, A.; Paul, B.C.; Mukhopadhyay, S.; Roy, K.; , "Process variation in embedded memories: failure analysis and variation aware architecture," Solid-State Circuits, IEEE Journal of , vol.40, no.9, pp. 1804- 1814, Sept. 2005
[33] Tschanz, J.W.; Kao, J.T.; Narendra, S.G.; Nair, R.; Antoniadis, D.A.; Chandrakasan, A.P.; De, V.; , "Adaptive body bias for reducing impacts of die-to-die and within-die parameter variations on microprocessor frequency and leakage," Solid-State Circuits, IEEE Journal of , vol.37, no.11, pp. 1396- 1402, Nov 2002
[34] Datta, A.; Bhunia, S.; Jung Hwan Choi; Mukhopadhyay, S.; Roy, K.; , "Profit Aware Circuit Design Under Process Variations Considering Speed Binning," Very Large Scale Integration (VLSI) Systems, IEEE Transactions on , vol.16, no.7, pp.806-815, July 2008
[35] Nieuwoudt, A.; Ragheb, T.; Massoud, Y.; , "Mitigating the Impact of Component Variations on Narrow-Band Low Noise Amplifiers for System-on-Chip Applications," Wireless and Microwave Technology Conference, 2006. WAMICON '06. IEEE Annual , vol., no., pp.1-5, 4-5 Dec. 2006
[36] Salinas, D., Pena, D., “Design of Reconfigurable RF circuits for self-compensation”, 2010 Barcelona Forum on Ph.D. Research in Communications, Electronics, and Signal Processing
[37] Hassan Hassan, Mohab Anis, Mohamed Elmasry, Impact of technology scaling and process variations on RF CMOS devices, Microelectronics Journal, Volume 37, Issue 4, April 2006, Pages 275-282
[38] Tomkins, A.; Aroca, R.A.; Yamamoto, T.; Nicolson, S.T.; Doi, Y.; Voinigescu, S.P.; , "A Zero-IF 60 GHz 65 nm CMOS Transceiver With Direct BPSK Modulation Demonstrating up to 6 Gb/s Data Rates Over a 2 m Wireless Link," Solid-State Circuits, IEEE Journal of , vol.44, no.8, pp.2085-2099, Aug. 2009
[39] Kupp, N.; He Huang; Makris, Y.; Drineas, P.; , "Improving Analog and RF Device Yield through Performance Calibration," Design & Test of Computers, IEEE , vol.28, no.3, pp.64-75, May-June 2011
[40] http://mct.com/News_tacklingrisingcost.aspx
[41] Williams, E., Ayres, R., Heller, M., “The 1.7 Kilogram Microchip: Energy and Material Use in the Production of Semiconductor Devices”, Environ. Sci. Technologies., vol. 36, pp. 5504 – 5510, 2002
[42] Liqiu Deng; Williams, E.; , "Measures and trends in energy use of semiconductor manufacturing," Electronics and the Environment, 2008. ISEE 2008. IEEE International Symposium on , vol., no., pp.1-6, 19-22 May 2008
[43] Branham, M., Gutowski, T., “Deconstructing Energy Use in Microelectronics Manufacturing: An Experimental Case Study of a MEMS Fabrication Facility”, Environ. Sci. Technologies., vol. 44, pp. 4295 – 4301, 2010
[44] http://techcrunch.com/2011/04/28/worldwide-mobile-phone-market-grew-20-in-q1fueled-by-smartphone-boom/
[45] D. Han, B.S. Kim, A. Chatterjee, “DSP-Driven Self-Tuning of RF Circuits for Process-Induced Performance Variability”, IEEE Trans. VLSI Syst., vol. 18, no. 2, pp. 30-39, February 2010

[46] K. Bowman, S. Duvall, J. Meindl, “Impact of Die-to-Die and Within-Die Parameter Fluctuations on the Maximum Clock Frequency Distribution for Gigascale Integration”, IEEE J. Solid-State Circuits., vol. 37, No. 2, pp. 183 – 190, February 2002
[47] W. Zhao, F. Liu, K. Agarwal, D. Acharyya, S. Nassif, K. Nowka, Y. Cao, “Rigorous Extraction of Process Variations for 65-nm CMOS Design”, IEEE Trans. Semicon. Manuf., vol. 22, No. 1, pp. 196 – 203, February 2009
[48] S. Nassif, “Modeling and Analysis of Manufacturing Variations”, IEEE Custom Integrated Circuits Conf. (CICC’01), pp.223 – 228, May 6-9 2001
[49] L. Counts, “Analog and mixed-signal innovation: The process-circuit-system-application interaction”, IEEE Int. Solid-State Circuits Conf., 2007 Dig. Tech. Papers., pp 24-30, 2007
[50] A. Tomkins, R.A. Aroca, T. Yamamoto, S.T. Nicolson, Y. Doi, S.P. Voinigescu, "A Zero-IF 60 GHz 65 nm CMOS Transceiver With Direct BPSK Modulation Demonstrating up to 6 Gb/s Data Rates Over a 2 m Wireless Link," IEEE J. Solid-State Circuits, vol.44, no.8, pp.2085-2099, Aug. 2009
[51] D. Balemarthy, R. Paily, “Process Variations and Noise Analysis on a Miller Capacitance Tuned 1.8 – 2.4 GHz Dual-Band Low Noise Amplifier”, Int. Conf. Advances in Comp., Control and Telecon. Tech., pp. 414 – 418, 28-29 Dec. 2009
[52] A. Neiuwoudt, T. Ragheb, H. Nejati, Y. Massoud, “Increasing Manufacturing Yield for Wideband RF CMOS LNAs in the presence of Process Variations”, Int. Symp. on Q.E.D., pp. 801 - 806, 26-28 March 2007
[53] K. Jayaraman, Q. Khan, B. Chi, W. Beattie, Z. Wang, P. Chiang, “A Self-Healing 2.4 GHz LNA with On-Chip S11/S21 Measurement/Calibration for In-Situ PVT Compensation”, IEEE Radio Freq. Integ. Circuits Symp. (RFIC’10), pp. 311 – 314, 23 – 25 May, 2010
[54] S. Sen, A. Chatterjee, “Design of Process Variation Tolerant Radio Frequency Low Noise Amplifier”, IEEE Int. Symp. Circuits and Systems (ISCAS 2008), pp. 392 - 395, 18-21 May 2008
[55] P. Sivonen, A. Vilander, A. Parssinen, “A Gain Stabilization Technique for Tuned RF Low-Noise Amplifiers”, IEEE Trans. Circuits Syst. I, vol. 51, no. 9, pp. 1702 – 1707, September 2004
[56] Y. Cheng, “The Influence and Modeling of Process Variation and Device Mismatch for Analog/RF Circuit Design (Invited)”, Fourth IEEE Int. Caracas Conf. on Devices, Circuits, Syst., pp. D046-1 – D046-8, April 17-19, 2002
[57] D. Gomez, M. Sroka, J. Jimenez, “Process and Temperature Compensation of RF Low-Noise Amplifiers and Mixers”, IEEE Trans. Circuits Syst. I, vol. 57, no. 6, pp. 1204 – 1211, June 2010
[58] S. Nicolson, K. Phang, “Improvements in Biasing and Compensation of CMOS OPAMPS”, IEEE Int. Symp. Circuits and Systems (ISCAS 2004), May 2004
[59] D. Shaeffer, T.H. Lee, “A 1.5V, 1.5 GHz, CMOS Low Noise Amplifier”, IEEE J. Solid-State Circuits, vol. 32, pp. 745 – 759, May 1997
[60] A.M. Pappu, X. Zhang, A.V. Harrison, A.B. Apsel, “Process-Invariant Current Source Design: Methodology and Examples”, IEEE J. Solid-State Circuits, vol. 42, pp. 2293 – 2302, October 2007
[61] M. Pelgrom, A. Duinmaijer, A. Welbers, “Matching Properties of MOS Transistors”, IEEE J. Solid-State Circuits, vol. 24, No. 5, pp. 1433 – 1439, October 1989

[62] T. Sakurai, R. Newton, “Alpha-Power Law MOSFET Model and its Applications to CMOS Inverter Delay and Other Formulas”, IEEE J. Solid-State Circuits, vol. 25, No. 2, pp. 584 – 594, April 1990
[63] M. Mukadam, O. Filho, X. Zhang, A. Apsel, “Process Variation Compensation of a 4.6 GHz LNA in 65nm CMOS", IEEE Int. Symp. Circuits and Systems (ISCAS 2010), pp. 2490 - 2493, 29-31 May 2010
[64] W. Chen, S. Chang, K. Chen, G. Huang, J. Chang, “Temperature Effect on Ku-Band Current-Reused Common-Gate LNA in 0.13 µm CMOS Technology”, IEEE Trans. On Micro. Theory, Tech., vol. 57, No. 9, pp. 2131 – 2138, September 2009
[65] Y. Wu, C. Shi, M. Ismail, H. Olsson, “Temperature Compensation Design for a 2.4 GHz Low Noise Amplifier”, IEEE Int. Symp. Circuits and Systems (ISCAS 2000), pp. 323 - 326, May 28-31 2000
[66] Y. Tsividis, Operation and Modeling of The MOS Transistor, New York, NY: McGraw Hill, 1999
[67] M. Budnik, K. Roy, “A Power Delivery and Decoupling Network Minimizing Ohmic Loss and Supply Voltage Variations in Silicon Nanoscale Technologies”, IEEE Trans. VLSI Syst., vol. 14, No. 12, pp. 1336 – 1346, December 2006
[68] J. Borremans, P. Wambacq, D. Linten, “An ESD-Protected DC-to-6GHz 9.7mW LNA in 90nm Digital CMOS”, IEEE Int. Solid-State Circuits Conf, pp. 422 – 613, February 11 – 15, 2007
[69] P. Mak, R. Martins, “A 0.46mm2 4dB-NF Unified Receiver Front-End for Full-Band Mobile TV in 65nm CMOS”, IEEE Int. Solid-State Circuits Conf, pp. 172 – 174, February 20 – 24, 2011
[70] T. H. Lee, The Design of CMOS Radio-Frequency Integrated Circuits. New York: Cambridge, 1998, ch. 12
[71] Bowman, K.A., Duvall, S.G., Meindl, J.D., “Impact of Die-to-Die and Within-Die Parameter Fluctuations on the Maximum Clock Frequency Distribution for Gigascale Integration”, IEEE J. Solid State Circuits, vol. 37, February 2002
[72] Tschanz, J.W., Kao, J.T., Narendra, S.G., Nair, R., Antoniadis, D.A., Chandrakasan, A.P., De, V., “Adaptive Body Bias for Reducing Impacts of Die-to-Die and Within-Die Parameter Variations on Microprocessor Frequency Tuning and Leakage”, IEEE J. Solid State Circuits, vol. 37, November 2002
[73] Sundaresan, K., Allen, P.E., Ayazi, F., “Process and Temperature Compensation in a 7-MHz CMOS Clock Oscillator”, IEEE J. Solid State Circuits, vol. 41, February 2006
[74] Chen, H., Lee, E., Geiger, R., “A 2GHz VCO with Process and Temperature Compensation”, Proc. IEEE Int. Symp. Circuits Syst. ISCAS, May 1999
[75] Zhang, X., Apsel, A.B., “A process compensated 3-GHz ring oscillator”, Int. Symp. Circuits Syst. ISCAS, May 2009
[76] Zhang, X., Pappu, A.M., Apsel, A.B., “Low variation current source for 90nm CMOS”, Int. Symp. Circuits Syst. ISCAS, May 2008
[77] Lim, Q., Kordesch, A., Keating, R., “Performance Comparison of MIM Capacitors and Metal Finger Capacitors for Analog and RF Applications”, RF and Microwave Conf. RFM 2004, October 2004
[78] Zhang, X., Apsel, A.B., “A Low Power, Process-and-Temperature-Compensated Ring Oscillator with Addition-Based Current Source”, submitted to TCAS-I, March 2010

[79] Vankka, J., Ketola, J., Sommarek, J., Vaananen, O., Kosunen, M., Halonen, K., “A GSM/EDGE/WCDMA Modulator With On-Chip D/A Converter for Base Stations”, IEEE Trans. on Circuits and Systems-II, vol. 49, No. 10, October 2002
[80] Yang, C., Stojanovic, V., Modjtahedi, S., Horowitz, M., Ellersick, W., “A Serial-Link Transceiver Based on 8-GSamples/s A/D and D/A Converters in 0.25-µm CMOS”, IEEE J. Solid State Circuits, vol. 36, No. 11, November 2001
[81] Schafferer, B., Adams, R., “A 3 V CMOS 400mW 14b 1.4 GS/s DAC for multi-carrier applications”, IEEE ISSCC, 2004
[82] Palmers, P., Steyaert, M., “A 10-Bit 1.6-GS/s 27-mW Current-Steering D/A Converter With 550-MHz 54-dB SFDR Bandwidth in 130-nm CMOS”, IEEE Trans. on Circuits and Systems-I, vol. 57, No. 11, November 2010
[83] Lin, C., van der Goes, F., Westra, J., Mulder, J., Lin, Y., Arslan, E., Ayranci, E., Liu, X., Bult, K., “A 12 bit 2.9 GS/s DAC With IM3 < -60 dBc Beyond 1 GHz in 65 nm CMOS”, IEEE J. Solid State Circuits, vol. 44, No. 12, December 2009
[84] Mercer, D., “Low-Power Approaches to High-Speed Current-Steering Digital-to-Analog Converters in 0.18-µm CMOS”, IEEE J. Solid State Circuits, vol. 42, No. 8, August 2007
[85] Pengfei Zhang; Nguyen, T.; Lam, C.; Gambetta, D.; Soorapanth, T.; Baohong Cheng; Hart, S.; Sever, I.; Bourdi, T.; Tham, A.; Razavi, B.; , "A 5-GHz direct-conversion CMOS transceiver," Solid-State Circuits, IEEE Journal of , vol.38, no.12, pp. 2232- 2238, Dec. 2003
[86] Cafaro, G.; Gradishar, T.; Heck, J.; Machan, S.; Nagaraj, G.; Olson, S.; Salvi, R.; Stengel, B.; Ziemer, B.; , "A 100 MHz 2.5 GHz Direct Conversion CMOS Transceiver for SDR Applications," Radio Frequency Integrated Circuits (RFIC) Symposium, 2007 IEEE , vol., no., pp.189-192, 3-5 June 2007
[87] Casper, B.; Martin, A.; Jaussi, J.E.; Kennedy, J.; Mooney, R.; , "An 8-Gb/s simultaneous bidirectional link with on-die waveform capture," Solid-State Circuits, IEEE Journal of , vol.38, no.12, pp. 2111- 2120, Dec. 2003
[88] Jaussi, J.E.; Balamurugan, G.; Johnson, D.R.; Casper, B.; Martin, A.; Kennedy, J.; Shanbhag, N.; Mooney, R.; , "8-Gb/s source-synchronous I/O link with adaptive receiver equalization, offset cancellation, and clock de-skew," Solid-State Circuits, IEEE Journal of , vol.40, no.1, pp. 80- 88, Jan. 2005
[89] Crippa, P., Turchetti, C., Conti, M., “A Statistical Methodology for the Design of High-Performance CMOS Current-Steering Digital-to-Analog Converters”, IEEE Trans. on Computer Aided Design of Integrated Circuits and Systems, vol. 21, No. 4, April 2002
[90] Pelgrom, M., Duinmaijer, A., Welbers, A., “Matching Properties of MOS Transistors”, IEEE J. Solid State Circuits, vol. 24, No. 5, October 1989
[91] Lin, W., Kuo, W., “A Compact Dynamic-Performance-Improved Current-Steering DAC With Random Rotation-Based Binary-Weighted Selection”, IEEE J. Solid State Circuits, vol. 47, No. 2, February 2012
[92] Bastos, J., Steyaert, M., Sansen, W., “A high-yield 12-bit 250-MS/s CMOS D/A converter”, IEEE Custom Integ. Circuits Conf. May 1996, pp 431-434
[93] Radulov, G., Quinn, P., Hegt, H., van Roermund, A., “An on-chip self-calibration method for current mismatch in D/A Converters”, ESSCIRC, France, 2005
[94] Schofield, W., “A 16b 400MS/s DAC with <-80dBc IMD to 300MHz and <-160dBm/Hz noise power spectral density”, IEEE ISSCC, 2003, pp. 126-482, vol. 1

[95] Huang, Q., Francese, P., Martelli, C., Nielson, J., “A 200 MS/s 14 b 97 mW DAC in 0.18 µm CMOS”, IEEE Intl. Solid-State Circuits Conf. (ISSCC) Dig. Tech., February 2004, pp. 364-532
[96] Bugeja, A., Song, B., “A Self-Trimming 14-b 100-MS/s CMOS DAC”, IEEE J. Solid State Circuits, vol. 35, No. 12, December 2000
[97] Cong, Y., Geiger, R., “A 1.5-V 14-Bit 100MS/s Self-Calibrated DAC”, IEEE J. Solid State Circuits, vol. 38, No. 12, December 2003
[98] Su, C., Geiger, R., “Dynamic Calibration of Current-Steering DAC”, IEEE ISCAS, 2006
[99] Tiilikainen, M., “A 14-bit 1.8-V 20-mW 1-mm2 CMOS DAC”, IEEE J. Solid State Circuits, vol. 36, No. 7, July 2001
[100] Cong, Y., Geiger, R., “Switching Sequence Optimization for Gradient Error Compensation in Thermometer-Decoded DAC Arrays”, IEEE Trans. on Circuits and Systems – II, vol. 47, No. 7, July 2000
[101] Geert, A., der Plas, V., Vandenbussche, J., Sansen, W., Steyaert, M., Gielen, G., “A 14-bit Intrinsic Accuracy Q2 Random Walk CMOS DAC”, IEEE J. Solid State Circuits, vol. 34, No. 12, December 1999
[102] Lee, D., Lin, Y., Kuo, T., “Nyquist-Rate Current-Steering Digital-to-Analog Converters With Random Multiple Data-Weighted Averaging Technique and QN Rotated Walk Switching Scheme”, IEEE Trans. Circuits and Systems-II: Express Briefs, vol. 53, No. 11, November 2006
[103] Chen, T., Gielen, G., “A 14-bit 200-MHz Current-Steering DAC With Switching-Sequence Post-Adjustment Calibration”, IEEE J. Solid State Circuits, vol. 42, No. 11, November 2007
[104] Lee, D., Kuo, T., Wen, K., “Low-Cost 14-Bit Current-Steering DAC With a Randomized Thermometer-Coding Method”, IEEE Trans. on Circuits and Systems-II: Express Briefs, vol. 56, No. 2, February 2009
[105] Radulov, G., Quinn, P., ven Beek, P., van Roermund, A., “A binary to thermometer Decoder with Built-in Redundancy to Improve Yield”, IEEE ISCAS, 2006
[106] Chen, T., Geens, P., Van der Plas, G., Dehaene, W., Gielen, G., “A 14-bit 130-MHz CMOS current-steering DAC with adjustable INL”, IEEE ESSCIRC, September 2004
[107] Gangasani, G.R.; Chun-Ming Hsu; Bulzacchelli, J.F.; Rylov, S.; Beukema, T.; Freitas, D.; Kelly, W.; Shannon, M.; Jieming Qi; Xu, H.H.; Natonio, J.; Rasmus, T.; Jong-Ru Guo; Wielgos, M.; Garlett, J.; Sorna, M.A.; Meghelli, M.; , "A 16-Gb/s backplane transceiver with 12-tap current integrating DFE and dynamic adaptation of voltage offset and timing drifts in 45-nm SOI CMOS technology," Custom Integrated Circuits Conference (CICC), 2011 IEEE , vol., no., pp.1-4, 19-21 Sept. 2011
[108] Datta, A.; Bhunia, S.; Jung Hwan Choi; Mukhopadhyay, S.; Roy, K.; , "Profit Aware Circuit Design Under Process Variations Considering Speed Binning," Very Large Scale Integration (VLSI) Systems, IEEE Transactions on , vol.16, no.7, pp.806-815, July 2008
[109] http://www.columbia.edu/~klt2127/martha/home/chip-cost-calculator.html