xml version 1.0 encoding UTF8 standalone no
record xmlns http:www.loc.govMARC21slim xmlns:xsi http:www.w3.org2001XMLSchemainstance xsi:schemaLocation http:www.loc.govstandardsmarcxmlschemaMARC21slim.xsd
leader nam 2200433Ka 4500
controlfield tag 001 002222726
005 20100804172635.0
007 cr cnuuuuuu
008 100804s2009 flua ob 000 0 eng d
datafield ind1 8 ind2 024
subfield code a E14SFE0002965
035
(OCoLC)652546496
040
FHM
c FHM
049
FHMM
090
TK7885 (Online)
1 100
Venkataraman, Mahalingam.
0 245
Techniques for VLSI circuit optimization considering process variations
h [electronic resource] /
by Mahalingam Venkataraman.
260
[Tampa, Fla.] :
b University of South Florida,
2009.
500
Title from PDF of title page.
Document formatted into pages; contains 96 pages.
Includes vita.
502
Dissertation (Ph.D.)--University of South Florida, 2009.
504
Includes bibliographical references.
516
Text (Electronic dissertation) in PDF format.
520
ABSTRACT: Technology scaling has increased the transistor's susceptibility to process variations in nanometer very large scale integrated (VLSI) circuits. The effects of such variations are having a huge impact on performance and hence the timing yield of the integrated circuits. The circuit optimization objectives namely power, area, and delay are highly correlated and conflicting in nature. The inception of variations in process parameters have made their relationship intricate and more difficult to optimize. Traditional deterministic methods ignoring variation effects negatively impacts timing yield. A pessimistic worst case consideration of variations, on the other hand, can lead to severe over design. In this context, there is a strong need for reinvention of circuit optimization methods with a statistical perspective. In this dissertation, we model and develop novel variation aware solutions for circuit optimization methods such as gate sizing, timing based placement and buffer insertion. The uncertainty due to process variations is modeled using interval valued fuzzy numbers and a fuzzy programming based optimization is proposed to improve circuit yield without significant over design. In addition to the statistical optimization methods, we have proposed a novel technique that dynamically detects and creates the slack needed to accommodate the delay due to variations. The variation aware gate sizing technique is formulated as a fuzzy linear program and the uncertainty in delay due to process variations is modeled using fuzzy membership functions. The timing based placement technique, on the other hand, due to its quadratic dependence on wire length is modeled as nonlinear programming problem. The variations in timing based placement are modeled as fuzzy numbers in the fuzzy formulation and as chance constraints in the stochastic formulation.
Further, we have proposed a piecewise linear formulation for the variation aware buffer insertion and driver sizing (BIDS) problem. The BIDS problem is solved at the logic level, with lookup table based approximation of net lengths for early variation awareness. In the context of dynamic variation compensation, a delay detection circuit is used to identify the uncertainty in critical path delay. The delay detection circuit controls the instance of data capture in critical path memory flops to avoid a timing failure in the presence of variations. In summary, the various formulation and solution techniques developed in this dissertation achieve significantly better optimization compared to related works in the literature. The proposed methods have been rigorously tested on medium and large sized benchmarks to establish the validity and efficacy of the solution techniques.
538
Mode of access: World Wide Web.
System requirements: World Wide Web browser and PDF reader.
590
Advisor: Nagarajan Ranganathan, Ph.D.
653
Variation awareness
Circuit design
Gate sizing
Incremental timing placement
Buffer insertion
Clock stretching
Fuzzy programming
Logic level
Layout level
690
Dissertations, Academic
z USF
x Computer Science and Engineering
Doctoral.
773
t USF Electronic Theses and Dissertations.
4 856
u http://digital.lib.usf.edu/?e14.2965
Techniques for VLSI Circuit Optimization Considering Process Variations

by

Mahalingam Venkataraman

A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy
Department of Computer Science and Engineering
College of Engineering
University of South Florida

Major Professor: Nagarajan Ranganathan, Ph.D.
Srinivas Katkoori, Ph.D.
Hao Zheng, Ph.D.
Justin E. Harlow III, M.S.
Sanjukta Bhanja, Ph.D.
Kandethody M. Ramachandran, Ph.D.

Date of Approval: March 23, 2009

Keywords: Variation Awareness, Circuit Design, Gate Sizing, Incremental Timing Placement, Buffer Insertion, Clock Stretching, Fuzzy Programming, Logic Level, Layout Level

Copyright 2009, Mahalingam Venkataraman
DEDICATION

To my parents, with all my love and respect.
ACKNOWLEDGEMENTS

I would like to thank Dr. Nagarajan Ranganathan for giving me this opportunity to work with him. I am grateful to him for his enthusiasm for research, moral support and steadfast friendship. I cannot imagine how he comes up with the right questions to ask, whenever we have a research discussion on a new topic. His technical mentoring has made me a better researcher and his mentorship in teaching moral values has made me a better person. In spite of his busy schedule, Dr. Ranga is always available as a friend to discuss personal issues as well. He is an ideal role model for aspiring students and I am lucky to have known and worked with him.

I would also like to thank my committee members, Dr. Srinivas Katkoori, Dr. Hao Zheng, Prof. Justin E. Harlow III, Dr. Sanjukta Bhanja and Dr. Kandethody Ramachandran for the valuable time they took to review this thesis and their helpful comments. My sincere acknowledgments to Semiconductor Research Corporation (SRC grant 2007HJ1596) and National Science Foundation (NSF Computing Research Infrastructure grant CNS0551621) for supporting this research in parts.

I am very grateful to my loving family without whose moral support this work would not have reached completion. I cannot thank them enough for their constant source of inspiration. They have inculcated in me an attitude of working hard and being perseverant. I am indebted to them for being a perpetual source of inspiration and motivation for me.

I cannot forget the help and support of all my friends and colleagues who have been with me in every step of this process. My friends inside and outside the research group have made the years spent in USF extremely pleasant. Several of their constructive criticisms and discussions have been useful in improving the quality of this research. Finally, thanks to Professor N. Venkateswaran (WARFT India) for identifying my research potential, getting me started and motivating me to pursue a Ph.D. degree.
TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
ABSTRACT

CHAPTER 1 INTRODUCTION
  1.1 Motivation
  1.2 Contributions of Dissertation Research
  1.3 Significance of Contributions
  1.4 Dissertation Outline

CHAPTER 2 BACKGROUND AND RELATED WORK
  2.1 Design for Manufacturing
    2.1.1 Optimization with Multiple Parameters
  2.2 Process Variations
    2.2.1 Delay Variations
    2.2.2 Statistical Optimization
    2.2.3 Dynamic Techniques for Variation Compensation
  2.3 Mathematical Programming
  2.4 Fuzzy Linear Programming Methodology
    2.4.1 Fuzzy Numbers
    2.4.2 Solution Technique: Fuzzy Linear Programming
  2.5 Stochastic Chance Constrained Programming

CHAPTER 3 VARIATION AWARE GATE SIZING
  3.1 Problem Definition
  3.2 Why Fuzzy Programming for Variation Aware Gate Sizing?
  3.3 Proposed Variation Aware Fuzzy Gate Sizing
    3.3.1 Power and Timing Models
    3.3.2 Variation Modeling with Spatial Correlation
    3.3.3 Variation Aware Fuzzy Gate Sizing
  3.4 Experimental Results
  3.5 Conclusion

CHAPTER 4 VARIATION AWARE TIMING BASED PLACEMENT
  4.1 Problem Definition
  4.2 Motivation
  4.3 Incremental Timing Based Placement
  4.4 Variation Aware Fuzzy Timing Based Placement
  4.5 Stochastic Timing Based Placement
  4.6 Experimental Results
  4.7 Conclusion

CHAPTER 5 VARIATION AWARE BUFFER INSERTION
  5.1 Introduction
  5.2 Modeling Delay Variations
  5.3 Problem Formulation
    5.3.1 Layout Level Modeling
    5.3.2 Logic Level Modeling
  5.4 Proposed Approach
    5.4.1 DeterministicBIDS
    5.4.2 FuzzyBIDS
  5.5 Simulation Methodology and Results
    5.5.1 Layout Level BIDS
    5.5.2 Logic Level BIDS
  5.6 Conclusion

CHAPTER 6 DYNAMIC CLOCK STRETCHING
  6.1 Introduction
  6.2 Proposed Methodology
  6.3 Experimental Evaluation
  6.4 Simulation Results
  6.5 Conclusion

CHAPTER 7 CONCLUSIONS AND FUTURE WORK

REFERENCES

ABOUT THE AUTHOR
LIST OF TABLES

Table 3.1 Variation Aware Gate Sizing Results on Benchmark Circuits
Table 4.1 Notations and Terminology
Table 4.2 Variation Aware Placement Results on Benchmark Circuits
Table 5.1 Layout Level Results on Benchmark Circuits
Table 5.2 Logic Level Results on Benchmark Circuits
Table 6.1 Description of Symbols in Simulation Snapshot
Table 6.2 Timing Yield Results on Benchmark Circuits
LIST OF FIGURES

Figure 1.1 Impact of Process Variations on Power and Performance [72]
Figure 1.2 Fuzzy Linear Programming Approach for Variation Aware Optimization
Figure 1.3 Major List of Contributions
Figure 2.1 Triangular Fuzzy Number Modeling for Varying Gate Length
Figure 3.1 Taxonomy Diagram of Optimization Methods for Gate Sizing
Figure 3.2 Spatial Correlation of Process Variations [Agarwal, 2003]
Figure 3.3 Fuzzy Gate Sizing: Simulation Flow
Figure 3.4 Fuzzy Gate Sizing: Execution Time
Figure 3.5 Improvement in Power Savings: FGS With Correlations
Figure 4.1 Taxonomy Diagram of Timing Based Placement
Figure 4.2 Pre-Processing for Incremental Timing Based Placement
Figure 4.3 Process Variation Aware Incremental Placement
Figure 4.4 Variation Aware Timing Based Placement: Simulation Flow
Figure 5.1 Fuzzy Programming Approach for Variation Aware Optimization
Figure 5.2 Simulation Flow Fuzzy BIDS
Figure 5.3 Fuzzy BIDS: Logic Level Versus Layout Level
Figure 6.1 Dynamic Clock Stretching for Variation Tolerance
Figure 6.2 Simulation Snapshot of Example Circuit: No Variations; No Clock Stretching
Figure 6.3 Simulation Snapshot of Example Circuit: With Variations; No Clock Stretching
Figure 6.4 Simulation Snapshot of Example Circuit: No Variations; With Clock Stretching
Figure 6.5 Simulation Snapshot of Example Circuit: With Variations and Clock Stretching
Figure 6.6 Simulation Flow for Timing Yield Estimation
Figure 6.7 Clock Stretch Range Versus Timing Yield
Techniques for VLSI Circuit Optimization Considering Process Variations

Mahalingam Venkataraman

ABSTRACT

Technology scaling has increased the transistor's susceptibility to process variations in nanometer very large scale integrated (VLSI) circuits. The effects of such variations are having a huge impact on performance and hence the timing yield of the integrated circuits. The circuit optimization objectives, namely power, area, and delay, are highly correlated and conflicting in nature. The inception of variations in process parameters has made their relationship intricate and more difficult to optimize. Traditional deterministic methods that ignore variation effects negatively impact timing yield. A pessimistic worst case consideration of variations, on the other hand, can lead to severe over design. In this context, there is a strong need for reinvention of circuit optimization methods with a statistical perspective.

In this dissertation, we model and develop novel variation aware solutions for circuit optimization methods such as gate sizing, timing based placement and buffer insertion. The uncertainty due to process variations is modeled using interval valued fuzzy numbers and a fuzzy programming based optimization is proposed to improve circuit yield without significant over design. In addition to the statistical optimization methods, we have proposed a novel technique that dynamically detects and creates the slack needed to accommodate the delay due to variations. The variation aware gate sizing technique is formulated as a fuzzy linear program and the uncertainty in delay due to process variations is modeled using fuzzy membership functions. The timing based placement technique, on the other hand, due to its quadratic dependence on wire length, is modeled as a nonlinear programming problem. The variations in timing based placement are modeled as fuzzy numbers in the fuzzy formulation and as chance constraints in the stochastic formulation. Further, we have proposed a piecewise linear formulation for the variation aware buffer insertion and driver sizing (BIDS) problem. The BIDS problem is solved at the logic level, with lookup table based approximation of net lengths for early variation awareness.

In the context of dynamic variation compensation, a delay detection circuit is used to identify the uncertainty in critical path delay. The delay detection circuit controls the instance of data capture in critical path memory flops to avoid a timing failure in the presence of variations. In summary, the various formulation and solution techniques developed in this dissertation achieve significantly better optimization compared to related works in the literature. The proposed methods have been rigorously tested on medium and large sized benchmarks to establish the validity and efficacy of the solution techniques.
CHAPTER 1
INTRODUCTION

The rapid progress in technology is having a profound impact on the advancement of engineering. Technology evolution is enabling the development of new and improved electronic products, which in turn facilitates progress in multiple engineering areas. The demand for high accuracy and performance in multifunctional phones, cameras and laptops is greater than ever before. In addition to the traditional performance upgrades, consumers are interested in battery life, reliability and green computing. The crucial objectives in developing these products are multifunctional features, high performance with low energy, portability and cost efficiency. A popular paradigm to achieve these objectives is to scale down the dimensions of the basic circuit elements. The transition to lower technology generations, for high performance and denser integration capabilities, is becoming complex due to the increase in leakage energy and reliability concerns. In addition, the downward scaling of technology is gradually reaching the limits of ballistic transport [52].

1.1 Motivation

The above mentioned issues are impacting all fields of engineering. In the context of digital circuit optimization, the following issues need to be addressed. First, the demand for low power devices has increased with the growth of mobile devices, driven by the need for longer battery life and green computing. Green computing, in this context, refers to the efficient use of computing resources during their lifetime and the promotion of recyclability and biodegradability [27]. Secondly, due to nanometer integration levels, wiring density and aspect ratios of metal lines have increased the coupling capacitance between neighboring interconnects. In the nanometer era, the coupling capacitances in adjacent nets are strong enough to cause timing failures in circuits. Hence, circuit optimization techniques, in addition to optimizing performance, have to consider the conflicting objectives of power and reliability as well. The optimization objectives are interrelated and conflicting, and have become more intricate in the nanometer regime.
Optimization of performance alone is not sufficient, as it can introduce a severe penalty in power and reliability, which in turn affects performance as well. Multi-metric optimization is no longer a recommended option, but a necessity.

Variability in nanometer very-large-scale-integrated (VLSI) circuits has continued to increase with the decrease in feature size of transistors. The variations in process parameters increase the challenges of simultaneously optimizing power, performance and reliability. The main causes of the variations are either environmental effects like changes in power supply voltage and temperature, or physical effects like changes in transistor width, channel length, oxide thickness and interconnect dimensions. The physical effects due to the imprecision in the fabrication process control lead to randomness in the number of dopant atoms in transistors [22]. The uncertainty due to these process parameter variations deeply impacts the timing and power characteristics of the circuits. Identically designed circuits can have a large difference in timing and power characteristics, leading to loss in parametric yield of circuits [72].

Figure 1.1 Impact of Process Variations on Power and Performance [72]

Figure 1.1 shows the deviation in leakage power and performance of Intel processors due to variations in process parameters. Hence, capturing the uncertainty due to these variations early in the analysis and optimization phase is crucial. Several circuit optimization methods over the years have successfully improved area, power and timing of microprocessors. These optimizations make most paths of the circuits equally critical to achieve an optimal balance between power consumption and the timing specification. However, with the increasing effects of process variations in the nanometer era, such optimization can worsen timing yield, as any of these critical paths can fail [95]. Here, timing yield is defined as the percentage of chips meeting the timing specification.

A guarded approach to eliminate the effects of variability is to perform deterministic optimization at the worst case values of the varying parameters. The deterministic worst case approach guarantees high yield, but leads to high overhead in terms of circuit area, power and loss in optimization. On the other hand, the average case values for these parameters have less overhead, but may result in unacceptable timing yield. Hence, at the nanometer technology level, new methodologies are needed, which can guarantee high yield without compromising in terms of area and power overheads. The increasing effect of process variations in the nanometer domain has transitioned the design and optimization problem from the deterministic domain to the probabilistic domain [70, 72]. The process variations do not scale proportionally and their impact is increasing with each new technology node. In addition to increasing the complexity of finding an optimal solution, the circuit optimization process with a statistical perspective is inherently slower than its deterministic equivalent.

In recent years, several research works have addressed variation awareness in the context of timing analysis and circuit optimization. Static timing analysis was replaced with statistical static timing analysis (SSTA) [11, 26, 74], where continuous distributions are propagated instead of deterministic values to estimate timing in the presence of variations. More recently, statistical design optimization techniques [42, 49, 53, 54, 69, 95] for improving power and area for an acceptable yield have also been attempted. The optimization approach in [95] uses a penalty function to improve the slacks of critical paths to improve yield.
An SSTA engine is used in the iterative optimization framework [49] to find the most critical gate to size in terms of power/delay sensitivity. In [53, 54], a stochastic programming approach with chance constraints is used to incorporate yield in the gate sizing problem formulation. However, the approaches using continuous distributions require a number of complicated operations to be performed iteratively at each node and hence incur a prohibitive runtime [20, 30]. The stochastic programming based statistical optimization technique [53, 54], on the other hand, is fast, but more conservative [40] in terms of yield and hence achieves smaller savings in terms of area and power consumption. Further, many of these methods are based on the assumption that the variation sources of the components follow specific distributions, such as Gaussian. However, manufacturing tests on fabricated circuits refute such assumptions and suggest against the use of assumptions on the distributions of variation parameters [44].
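The trade-off between worst case and average case design described in this section can be illustrated with a small Monte Carlo experiment. The sketch below is only an illustration and is not taken from the dissertation: the gate delays, the 10% spread and the uniform sampling of each delay are hypothetical choices, and `timing_yield` is a helper name introduced here. It estimates timing yield (the fraction of sampled chips meeting the timing specification) for a single path under an average case and a worst case timing target.

```python
import random


def timing_yield(nominal_delays, spread, t_spec, n_chips=20000, seed=1):
    """Fraction of sampled chips whose path delay meets t_spec.

    Each gate delay is drawn uniformly from nominal*(1 - spread) to
    nominal*(1 + spread). The uniform interval is an assumption made
    only for this illustration.
    """
    rng = random.Random(seed)
    passing = 0
    for _ in range(n_chips):
        # One sampled "chip": every gate delay is perturbed independently.
        sample = [d * rng.uniform(1.0 - spread, 1.0 + spread)
                  for d in nominal_delays]
        if sum(sample) <= t_spec:
            passing += 1
    return passing / n_chips


nominal = [1.0, 1.2, 0.8, 1.5, 0.5]   # hypothetical gate delays on one path (ns)
spread = 0.10                          # hypothetical +/-10% delay variation

t_average = sum(nominal)                              # average case target
t_worst = sum(d * (1.0 + spread) for d in nominal)    # worst case target

yield_average = timing_yield(nominal, spread, t_average)
yield_worst = timing_yield(nominal, spread, t_worst)
overdesign = (t_worst - t_average) / t_average        # extra timing margin

print(f"yield at average case target: {yield_average:.3f}")
print(f"yield at worst case target:   {yield_worst:.3f}")
print(f"worst case timing margin:     {overdesign:.1%}")
```

Designing to the average case target leaves roughly half of the sampled chips failing, while the worst case target passes essentially all of them at the cost of a fixed timing margin. That gap is the region a statistical or fuzzy formulation tries to exploit.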
1.2 Contributions of Dissertation Research

In this dissertation, we propose the use of fuzzy mathematical programming for circuit optimization considering the uncertainty due to process variations. Fuzzy set theory can deal with multiple types of variations and is especially popular in cases where we cannot even predict the average behavior of the uncertain parameters. In fuzzy terminology, the above uncertainty is referred to as imprecision. Probability theory usually models situations where average behavior is predictable (situations that obey the law of large numbers) and enough information is available to model the probability distribution functions. The theory of fuzzy sets and systems, on the other hand, has been proven to successfully model imprecision, and an interval value is sufficient to model variations. Fuzzy mathematical programming has successfully been applied in several areas of computer science and engineering like vision and robotics [64]. In VLSI circuit automation, fuzzy programming has also been used in VLSI testing and high level synthesis tasks like scheduling to model imprecise coefficients [30]. To the best of our knowledge, this is the first time the concepts of fuzzy sets and systems and fuzzy mathematical programming are being attempted at modeling uncertainty due to process variations in VLSI circuits.

[Figure 1.2, flowchart: optimization problem X with interval based fuzzy coefficients; solve problem X with coefficients set to the lower interval bound; solve problem X with coefficients set to the upper interval bound; combine the above solutions and introduce a variation parameter to create a crisp nonlinear problem; solve the crisp problem using a nonlinear optimization solver.]

Figure 1.2 Fuzzy Linear Programming Approach for Variation Aware Optimization

Figure 1.2 shows an outline of the fuzzy optimization approach. Initially, a deterministic optimization is performed assuming the worst and the average case values for the variation parameters, and the results are used to convert the fuzzy optimization problem into a crisp nonlinear problem using the symmetric relaxation method [28, 64]. The crisp problem is then solved using a nonlinear optimization solver. The crisp problem, in general, has been proved to provide the most satisfying solution in the presence of imprecision or variations in coefficients of the constraints or objective function in the optimization problem [43, 56].

Next, we summarize the major contributions of this dissertation. In this dissertation, we have proposed the use of fuzzy numbers for modeling uncertainty in delay due to process variations. Delays are modeled in an abstract fashion, considering the delay coefficients as an interval value with linear membership function. We have constructed fuzzy mathematical programming based formulations for gate sizing, timing based placement and buffer insertion to improve timing yield with minimal penalty in terms of design overheads. Fuzzy programming has been extensively used in civil, mechanical and computer science areas. However, this is the first time fuzzy techniques are employed for process variation aware VLSI circuit optimization.

Secondly, we have also proposed a novel methodology for timing based placement using stochastic chance constrained programming (CCP). The CCP is formulated to capture the dependence of the constraints and objectives of the optimization using mean and variance of the uncertain parameters. We have compared the efficiency of fuzzy and stochastic CCP techniques for variation aware gate sizing and timing based placement. The results have confirmed the prediction in [40], that the efficiency of fuzzy mathematical programming is better than or comparable to stochastic CCP based optimization.
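The fuzzy optimization flow outlined in Figure 1.2 (solve at each interval bound, then combine the two solutions through a variation parameter into a crisp problem) can be sketched on a one-variable toy problem. The code below is a minimal sketch under stated assumptions, not the dissertation's formulation: the linearized delay model `a - b*x`, the use of gate size `x` as a power proxy, the max-lambda crisp conversion and the bisection solver are all illustrative choices, and the exact symmetric relaxation method of [28, 64] is not reproduced.

```python
def min_size(a, b, t_spec):
    """Smallest gate size x meeting the linearized delay a - b*x <= t_spec."""
    return (a - t_spec) / b


def solve_fuzzy_sizing(a, b_nom, b_wc, t_spec, tol=1e-9):
    """Maximize lam in [0, 1] such that both relaxed goals hold.

    Power goal:  x <= z_wc - lam * (z_wc - z_nom)
    Delay goal:  a - b(lam) * x <= t_spec,
                 with b(lam) = b_nom - lam * (b_nom - b_wc)
    The product b(lam) * x is what makes the crisp problem nonlinear.
    """
    # Steps 1 and 2: crisp solutions at the two interval bounds.
    z_nom = min_size(a, b_nom, t_spec)   # average case: cheapest design
    z_wc = min_size(a, b_wc, t_spec)     # worst case: most expensive design

    def feasible(lam):
        b_lam = b_nom - lam * (b_nom - b_wc)
        return min_size(a, b_lam, t_spec) <= z_wc - lam * (z_wc - z_nom)

    # Step 3: the required size grows and the power budget shrinks as lam
    # grows, so feasibility is monotone and bisection finds the largest lam.
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if feasible(mid):
            lo = mid
        else:
            hi = mid
    lam = lo
    x = min_size(a, b_nom - lam * (b_nom - b_wc), t_spec)
    return lam, x, z_nom, z_wc


# Hypothetical coefficients: the delay sensitivity b lies in [1.0, 2.0].
lam, x, z_nom, z_wc = solve_fuzzy_sizing(a=10.0, b_nom=2.0, b_wc=1.0, t_spec=6.0)
print(f"lambda = {lam:.4f}, size = {x:.4f} (average {z_nom}, worst {z_wc})")
```

For these numbers the largest feasible variation parameter is 2 - sqrt(2), and the resulting size lies strictly between the average case and worst case designs, which is the kind of compromise between yield and over design that the text describes.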
In the context of buffer insertion and driver sizing, we are the first to propose logic and layout level methodologies for simultaneous optimization of variation resistance, delay and power. The buffer insertion methodology was formulated as a variation aware piecewise linear program at the circuit level. The circuit level methodology also overcomes the limitations of path based and net based approaches. A lookup table based technique is used for predicting interconnect length at the logic level. The prediction technique is based on layout level simulations on sample benchmarks. The logic level approximation is shown to be comparable with detailed layout level results.

Finally, we proposed a dynamic clock stretching technique for improving timing yield. Unlike the statistical design optimization methods, the clock stretching technique dynamically detects variations and incurs a performance penalty only when there is a possible timing failure. The important contributions are summarized in Figure 1.3.
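The chance constrained programming idea mentioned in this section, where a probabilistic timing constraint is expressed through the mean and variance of the uncertain parameters, can be sketched as follows. This is a textbook style illustration rather than the dissertation's formulation: it assumes independent Gaussian gate delays (an assumption the text itself cautions against for fabricated circuits), and the function name and numbers are hypothetical. Under that assumption, the constraint P(path delay <= T) >= alpha reduces to the deterministic test mu + z_alpha * sigma <= T.

```python
from math import sqrt
from statistics import NormalDist


def chance_constraint_margin(means, sigmas, t_spec, alpha):
    """Return t_spec - (mu + z_alpha * sigma) for a path of independent delays.

    A non-negative margin means P(path delay <= t_spec) >= alpha under the
    independent Gaussian assumption made for this illustration.
    """
    mu = sum(means)                             # mean of the path delay
    sigma = sqrt(sum(s * s for s in sigmas))    # std. dev. of the path delay
    z_alpha = NormalDist().inv_cdf(alpha)       # standard normal quantile
    return t_spec - (mu + z_alpha * sigma)


means = [1.0, 1.2, 0.8, 1.5, 0.5]             # hypothetical mean gate delays (ns)
sigmas = [0.05, 0.06, 0.04, 0.075, 0.025]     # hypothetical standard deviations

margin_95 = chance_constraint_margin(means, sigmas, t_spec=5.3, alpha=0.95)
margin_9999 = chance_constraint_margin(means, sigmas, t_spec=5.3, alpha=0.9999)

print(f"margin at 95% yield target:    {margin_95:+.4f} ns")
print(f"margin at 99.99% yield target: {margin_9999:+.4f} ns")
```

A positive margin means the yield target is met with the given timing specification. Raising the target from 95% to 99.99% turns the margin negative for the same design, which shows how the required yield level enters the constraint through the quantile term.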
[Figure 1.3, diagram of major contributions: methodology for modeling variations in process parameters using fuzzy membership functions; fuzzy linear programming formulation for gate sizing in the presence of delay uncertainty; fuzzy and stochastic nonlinear programming formulations for variation aware timing based placement; fuzzy piecewise linear programming formulation for logic and layout level buffer insertion; dynamic clock stretching technique for timing yield improvement in the presence of parametric variations.]

Figure 1.3 Major List of Contributions

1.3 Significance of Contributions

In this dissertation, we have proposed the use of fuzzy mathematical programming for uncertainty aware VLSI circuit optimization. The fuzzy optimization methodology is shown to abstractly model the variations in delay with delay coefficients as interval values. Thus, the methodology can be used to model variations in process parameters even at the logic level. Secondly, the fuzzy optimization methodology is shown to conveniently model variations in linear, nonlinear and piecewise linear mathematical programming formulations. This observation implies that the fuzzy methodology is an effective tool for several mathematical programming based VLSI circuit optimization problems. We have also shown that the fuzzy technique guarantees yield (evaluated using Monte Carlo simulations) and effectively improves the objective function compared to the worst casing and stochastic chance constrained programming approaches. Hence, the fuzzy mathematical programming (i) modeling, (ii) formulation and (iii) solution technique is a significant addition to the VLSI tools in the context of variation aware design.

A lookup table based interconnect length prediction technique is proposed and is used for logic level variation aware buffer insertion. The results of the logic level technique for variation aware buffer insertion are comparable to the values at the layout level, in spite of the approximation.

The statistical design optimization using fuzzy programming is a design time technique to combat the effect of process variations on circuit performance. In this dissertation, we have also proposed a run time technique to dynamically detect delay due to variations and stretch the clock to avoid timing failures. The clock stretching methodology is an effective timing failure prevention technique and has less overhead compared to critical path isolation based clock stretching [76]. The dynamic technique does not require extra margin (over design) in the absence of variations. It is only activated in the presence of variations in delay on one of the top critical paths. Hence, the method stretches the clock only when necessary to avoid timing failure.

1.4 Dissertation Outline

The remainder of this dissertation is organized as follows. In Chapter 2, we describe the background that forms the basis of this research, followed by relevant research contributions in areas related to the problems being addressed in this dissertation.
The background discussion starts with the basic concepts of mathematical programming and the importance of considering process variations in circuit optimization, and briefly explains the fuzzy mathematical programming (FMP) and stochastic chance constrained programming (CCP) techniques for variation aware circuit optimization. Increasing levels of variations in process parameters are affecting performance and hence timing yield of VLSI circuits. In this context, we propose statistical and dynamic optimization techniques for improving timing yield without significant over design.
The fuzzy gate sizing (FGS) methodology for process variation aware optimization of power, delay and noise is presented in Chapter 3. The FGS technique is a post layout gate sizing approach, formulated as a linear programming problem with variations modeled as linear membership based fuzzy numbers. In Chapter 4, we describe both fuzzy and stochastic CCP based approaches for timing based placement to optimize performance in the presence of process variations. The timing based placement problem is inherently nonlinear due to quadratic interconnect delay constraints and hence is modeled as a variation aware nonlinear programming problem. The process variations are modeled using fuzzy numbers with linear membership functions and chance constraints in the FMP and CCP contexts, respectively. We propose logic and layout level optimization techniques for variation aware buffer insertion and driver sizing in Chapter 5. The buffer insertion and driver sizing is formulated at the circuit level (instead of at the net or path level) and piecewise linear constraints are used for modeling change in circuit delay. The variations in delay coefficients are modeled as fuzzy numbers with linear membership functions. In Chapter 6, we propose a dynamic technique for compensating the uncertainty in delay due to process variations. In the presence of variations, the methodology uses a clock stretching logic circuit to increase the available slack and avoid a timing violation. The concluding remarks and future work in terms of extensions to the problems are addressed in Chapter 7 of this dissertation.
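The dynamic clock stretching idea summarized for Chapter 6 can be illustrated with a small behavioral model. The sketch below is an assumption laden illustration and not the circuit proposed in the dissertation: it assumes a fixed stretch amount applied whenever a late critical path arrival is detected, and the function name, clock period and delay values are hypothetical.

```python
def run_cycles(path_delays, clock_period, stretch=0.0):
    """Count timing failures and stretched cycles for per-cycle path delays.

    Behavioral model only: a cycle whose critical path delay exceeds the
    clock period is treated as "detected late", and data capture is then
    delayed by a fixed, hypothetical stretch amount.
    """
    failures = 0
    stretched = 0
    for delay in path_delays:
        if delay <= clock_period:
            continue                 # data arrives in time: no penalty
        if stretch > 0.0:
            stretched += 1           # late arrival detected: delay the capture
        if delay > clock_period + stretch:
            failures += 1            # even the stretched capture is too early
    return failures, stretched


# Hypothetical critical path delays over four cycles (ns), 1.0 ns clock.
delays = [0.90, 1.00, 1.05, 1.20]

no_stretch = run_cycles(delays, clock_period=1.0)
with_stretch = run_cycles(delays, clock_period=1.0, stretch=0.1)

print("without stretching (failures, stretched cycles):", no_stretch)
print("with stretching    (failures, stretched cycles):", with_stretch)
```

In this model the two on-time cycles incur no penalty, one late cycle is rescued by the stretch, and one still fails because its delay exceeds the stretch range. This mirrors the point in the text that the performance penalty is paid only when a timing failure is possible.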
CHAPTER 2
BACKGROUND AND RELATED WORK

In this chapter, we present a brief introduction of the various concepts that form the basis for the research described in this dissertation. Specifically, we discuss the preliminaries of linear and nonlinear programming, circuit optimization with multiple objectives, the fuzzy mathematical programming technique and stochastic chance constrained programming. The uncertainty based optimization techniques, namely the fuzzy and stochastic approaches, are used for solving gate sizing, placement and buffer insertion considering process variations.

2.1 Design for Manufacturing

The shrinking design nodes, expanding design complexity and density in the nanometer dimensions are resulting in huge yield losses due to electromigration, leakage and integrity issues. Further, variations in process parameters at smaller geometries are adding to the design issues, requiring additional rules to be administered before pattern generation. Physical fixups by processing mask data happen too late and cannot catch yield reducing problems. Hence, design engineers now need to address manufacturing issues throughout the design chain. Yield-savvy design tools and methodologies are required early in the design flow that can address performance, power, design functionality and manufacturing yield.

In the early nineties, chip designers considered testing as an afterthought. Even with an increased number of test patterns and huge testing time, design engineers did not consider testability as a part of their design process. The test engineers in the manufacturing division were responsible for making the chip testable and achieving good fault coverage. Variation aware design for manufacturing is still considered as an afterthought by some chip designers. Maximizing yield early in the design flow is treated as only a recommended strategy, and meeting physical and electrical design rules is considered sufficient.
Yield optimizations and tweaks are still handled in the manufacturing environment. However, design optimization without considering variations can worsen yield to irrecoverable levels at the manufacturing stage. In the 90 nm era, only slightly more than 40% of chip designs operate as expected, and the rest need a complete mask respin to achieve acceptable performance and yield. Yield optimized standard cells, recommended design rules on via placement, statistical timing analysis, and statistical optimization during sizing and buffer insertion are some popular design level techniques considering yield [57, 92]. Yield enhancing methodologies are highly interdependent with other metrics like power and performance, and hence need to consider multiple metrics simultaneously. Next, we introduce the various objectives and briefly give an overview of yield-aware statistical design techniques.

2.1.1 Optimization with Multiple Parameters

In the nanometer era, the high performance of a VLSI circuit is not a simple function of the delay or frequency of the circuit. Components such as power, crosstalk noise, and process variations also have a large effect on delay, and hence it is crucial to consider them as optimization metrics. A simultaneous optimization of these metrics (cost functions) is essential to design robust and reliable high performance circuits. Physical design steps like floor planning, standard cell placement, and routing typically concentrate on area, wire length, and delay. The computational complexity of these algorithms is huge for large benchmarks. Hence, the responsibility of circuit optimization techniques is not only to maximize performance, but also to minimize power, crosstalk noise, and the effects of variations in process parameters. The problem of circuit optimization primarily involves incremental changes or tuning of circuit components like gate sizes, threshold voltage, wire size, cell locations, and buffer insertion to maximize or minimize the overall objective function [21].
Since the focus of this dissertation is the simultaneous optimization of power, delay, crosstalk noise, and yield, the methods that are effective for the optimization of these metrics are discussed. However, the optimization framework developed in this dissertation is built from a generalized point of view, and hence any existing or new metrics can be added or removed with minimal effort.

The transistor sizing, gate sizing, and wire sizing problems are important in VLSI design because they enable us to explore the tradeoffs between the multiple objectives in the cost function. The TILOS methodology in [2] uses a convex programming based iterative transistor sizing technique based on critical delay sensitivity to improve performance. TILOS uses convex delay models for transistor sizing, as they have the advantage that a local optimum is a global one. The iterative methodology was further improved in [80]. However, the general technique is not efficient for problems with more than a few thousand sizeable components. The authors in [1] proposed a Lagrangian relaxation based gate sizing technique with simple constraints that can be solved efficiently. The methodology, however, requires a good initial solution and a subgradient optimization step to converge in practice. In a bid to improve computational complexity without significant impact on solution quality, the authors in [17] have proposed a robust linear programming framework for gate sizing.

Secondly, in the context of increasing wire delay, buffer insertion and wire sizing are two popular methods to improve circuit performance. The delay of a net is directly proportional to the product of the resistance and capacitance along the wire. Since both are functions of the length of the net, the delay exhibits a quadratic dependence on the wire length. The objective of the buffer insertion problem is to divide the wire into a number of segments, such that the sum of the delays of the wire segments becomes a linear function of the total wire length [68]. Van Ginneken in [47] presented a dynamic programming based optimal buffer insertion algorithm. The approach has quadratic complexity in the number of buffer locations and has been a foundation for several later works in the context of net-based buffer insertion.

The leakage current of a transistor can be controlled by fitting the circuit with a higher threshold voltage (Vth).
The exponential dependence of leakage power on threshold voltage reduces the power consumption, with a delay penalty. Hence, several works [46, 61] have increased the Vth of the noncritical transistors to improve leakage power. The threshold voltage assignment step is usually combined with the gate/wire sizing methodology to efficiently optimize dynamic and leakage power.

Timing driven placement is another important step in the physical design of integrated circuits. It can be formally defined as the process of finding the optimal locations of cells in a critical subcircuit such that the delay of the circuit is minimized. In timing based placement, the objective is to minimize the length of the critical interconnects by incrementally adjusting the locations of cells [83, 91]. The timing of a circuit in the context of timing based placement is usually measured in terms of the worst negative slack and the total negative slack. The term slack, in this context, is defined as the difference between the required time and the actual arrival time of the signal. The incremental placement process creates overlaps, which are usually removed by a legalization step. The legalization step can degrade timing. However, it was shown in [50] that with movement restrictions on cells over a localized neighborhood, this adverse impact on timing is negligible. In the next section, we discuss in detail the importance of process variations and various techniques that have been proposed to reduce the impact of variations on timing yield.

2.2 Process Variations

Process variations, in general, refer to the difference between the intended and obtained parameter dimensions before and after fabrication of the circuit. With the rapid scaling of technology into nanometer dimensions, the variations in semiconductor parameters like device length ($L_{eff}$), threshold voltage ($V_{th}$), and gate oxide thickness are becoming fatal to circuit yield. The variations in process parameters can be classified as inter-die and intra-die variations. As the name suggests, inter-die variations are constant within a die but vary from one die to another. The effect of inter-die variations can be detected during production test, and a voltage or frequency change can prevent the loss of yield. Intra-die variations, on the other hand, are variations that occur within a single die, meaning that a device/net parameter varies between different locations on the same die. Intra-die variations affecting doping concentration mainly result from fabrication equipment limitations. Intra-die variation also exhibits spatial correlation: devices that are close together in the layout have a higher probability of being alike in characteristics than devices placed far apart. CMP (chemical-mechanical polishing) effects and optical proximity effects also increase the magnitude of intra-die variation in nanometer technology [14, 89]. Hence, the amount of randomness between gates/nets within a die due to intra-die variations is large. Since intra-die variations are random across the die, adjusting voltage or frequency to meet the worst case setting of all the parameters is often pessimistic.
Hence, to meet the conflicting objectives of high performance, low power, and high yield, it is necessary to bring variation awareness into the design flow.

2.2.1 Delay Variations

The impact of intra-die process variations on delay is large, as many factors can affect it. The delay for a gate and interconnect can be represented as

$$gate\_del_i = a_i - b_i\, s_i + c_i\, C_{load_j} \qquad (2.1)$$
Equation 2.1 models gate delay as a linear function of gate size and load capacitance [17]. The gate delay can also be modeled using more complex functions. However, in the context of gate sizing, the linear model offers a good tradeoff between solution quality and execution time. The wire delay (Equation 2.2) can be represented as a quadratic function of interconnect length [83]:

$$net\_del_i = R_0\, len_i \left(0.5\, C_0\, len_i + C_{pin}\right) \qquad (2.2)$$

The complete notations for these equations are described in later sections. Changes in effective length, oxide thickness, and related process parameters can have a significant impact on the coefficients of the above equations. It has been predicted in [12, 77] that a delay variation of around 25% is possible in nanometer technology generations. In this work, we abstractly model the effects of variations on delay using the coefficients $b_i$, $c_i$, $R_0$, and $C_0$. Note that in addition to the process parameter variations, environmental factors such as power supply and temperature can also affect delay. However, we only consider the impact of physical process parameter variations, as it is the dominating factor affecting the delays [77]. Moreover, the above methodology can incorporate the impact of voltage and temperature variations by using more sophisticated models.

2.2.2 Statistical Optimization

Considering the sensitivity of circuit delay to process variations, several previous attempts focused on timing analysis. New approaches for both deterministic static timing analysis (STA) and statistical STA (SSTA) have been proposed. Previously, in deterministic STA [2, 63], process variations have been modeled using case analysis. The best case, nominal, and worst case parameter sets are constructed, and timing analysis is repeated for each corner. Deterministic STA has a run time complexity that is linear in the circuit size.
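The corner-based case analysis described above can be sketched in a few lines of Python. This is a minimal illustration, not the timing engine used in this work: it evaluates only the wire delay model of Equation 2.2, and the function names, the numeric values, and the choice to scale $R_0$ and $C_0$ by the same factor at each corner are assumptions made for the example.

```python
def net_delay(r0, c0, length, c_pin):
    """Wire delay following Equation 2.2: R0 * len * (0.5 * C0 * len + Cpin)."""
    return r0 * length * (0.5 * c0 * length + c_pin)


def corner_delays(r0, c0, lengths, c_pin, spread=0.25):
    """Total wire delay of a path at the best, nominal, and worst corners.

    Assumption: each corner scales R0 and C0 by the same factor (1 -/+ spread).
    """
    factors = {"best": 1.0 - spread, "nominal": 1.0, "worst": 1.0 + spread}
    return {
        name: sum(net_delay(r0 * k, c0 * k, length, c_pin) for length in lengths)
        for name, k in factors.items()
    }


# Hypothetical per-unit resistance and capacitance, net lengths, and pin load.
delays = corner_delays(r0=0.1, c0=0.2, lengths=[100.0, 50.0], c_pin=1.0)
for corner, value in delays.items():
    print(corner, round(value, 3))
```

Each corner costs one pass over the nets, which is why the analysis stays linear in circuit size; the price is that the worst corner assumes every net is simultaneously at its extreme.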
However, in the nanometer era, with multiple variation sources, using the worst case parameters for intra-die variations leads to a pessimistic estimate. Hence, numerous works have proposed to replace STA with statistical STA (SSTA) [11, 67, 74]. The SSTA approaches, however, have high run time complexity due to reconverging paths in the circuit and complex computations. Recently, several researchers have attempted to optimize power, delay, and noise in the presence of process variations [10, 26, 49, 53, 54, 84, 87, 95]. The works in [10, 49] mainly focused on circuit optimization schemes with a statistical perspective. In other words, a statistical delay model or SSTA
is used to guide timing analysis. The authors in [53, 54] presented a statistical optimization approach that takes into account the randomness in gate delays by formulating an efficient mathematical program. The major part of the work in this dissertation focuses on statistical optimization of yield, delay, and power, formulated as a variation aware mathematical program.

2.2.3 Dynamic Techniques for Variation Compensation

The increasing impact of variations in the nanometer era is necessitating the use of dynamic, runtime techniques to improve yield. Conventional techniques like scaling up the supply voltage or upsizing logic gates overconsume resources regardless of whether variations are present. The objective of the works on dynamic techniques has been to reduce the power and other overheads compared to worst case design while maintaining the timing yield. On the other end of the spectrum, design techniques have been proposed based on adaptive body biasing for post-silicon process compensation [71]. Due to the quadratic dependence of dynamic power on supply voltage, researchers have proposed adaptive voltage scaling techniques that are robust with respect to process variations and have limited overdesign. One such technique, called Razor [24], uses dynamic detection of delay due to variations and corrects timing failures by adjusting the voltage of the processor. The Razor technique eliminates the need for voltage margins and hence can achieve significant savings in overheads compared to worst-casing and statistical optimization. Secondly, the authors in [76] proposed a novel design paradigm that achieves robustness with respect to timing failure by using the concept of critical path isolation. The methodology isolates critical paths by making them predictable and rare under parametric variations.
The top critical paths, which can fail in single cycle operation, are predicted ahead of time and are avoided by providing two-cycle operations. The approach is extremely useful for certain designs with rare critical paths (e.g., adders). The critical path approach, when tested on random designs, saves power with a small timing penalty.

Several circuit optimization techniques, like gate sizing, buffer insertion, and incremental placement, are inherently suited to be modeled as mathematical programs. In the next section, we briefly discuss the basics of mathematical programming and uncertainty aware optimization schemes.
2.3 Mathematical Programming

A mathematical program is either a maximization or a minimization program that optimizes an objective function subject to a set of constraints. In linear programming, the objective function and the constraints are always linear functions of the decision variables. A simple example of a linear programming problem can be shown as

$$\begin{aligned}
\text{maximize} \quad & c_1 x_1 + c_2 x_2 + \cdots + c_n x_n \qquad (2.3)\\
\text{subject to} \quad & a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n \le b_1 \\
& a_{21} x_1 + a_{22} x_2 + \cdots + a_{2n} x_n \le b_2 \\
& \qquad \vdots \\
& a_{m1} x_1 + a_{m2} x_2 + \cdots + a_{mn} x_n \le b_m \\
& x_1, x_2, \ldots, x_n \ge 0
\end{aligned}$$

Here, $x_i$ are the decision variables, $m$ is the number of constraints, and $n$ is the number of variables. The objective is to maximize the function $c_1 x_1 + c_2 x_2 + \cdots + c_n x_n$. A feasible solution in the linear programming context is one that satisfies all the constraints; an optimal solution is a feasible solution that also maximizes the objective function. The simplex method has been a simple, fast, and efficient approach to solve linear programming problems. It is an iterative process in which we start with a simple feasible solution and then improve it step by step. The iteration continues until no further improvement is possible.

The second class of mathematical optimization programs is broadly termed nonlinear programming (NLP). The main difference between linear and nonlinear problems is that the constraints and the objective function are allowed to be nonlinear functions of the decision variables. A major challenge in NLP problems is the existence of local optima. Local optima are spurious solutions that merely satisfy the requirements on the derivatives of the functions, without being the best solution over the whole feasible region. Nonlinear optimization algorithms that overcome this difficulty are called global optimization algorithms. Nonlinear problems are further classified as convex and nonconvex problems. Convex problems have the special property that any local optimum is also a global optimum. Least squares, linear programming, conic programs, geometric programs, quadratic programs, and semidefinite programs are all examples of convex optimization programs [73]. The theory of convex sets and convex optimization is not new to VLSI and has been used in transistor sizing [2, 35, 80]. The delay of a gate in these works is modeled as a posynomial function of its transistor sizes using the Elmore delay model.

A popular optimization package for solving convex programming problems is the MOSEK solver. It is designed to solve large scale linear, conic, quadratic, convex nonlinear, and mixed integer problems. In addition to the optimization solvers, the representation of the optimization problem is also a crucial task. AMPL (A Mathematical Programming Language) [65] and GAMS (General Algebraic Modeling System) are popular high-level modeling systems for mathematical programming based optimization. A primary advantage of AMPL is the similarity of its syntax to the mathematical notation of optimization problems, which makes the optimization problems more readable. An optimization problem represented in a mathematical programming language can be solved using commercial optimization solvers. Several of these commercial software packages are also available for free usage at the NEOS server for optimization. The optimization problems can be submitted via email or through its website [66].

The coefficients of a mathematical programming problem are generally assumed to be exactly known. However, this assumption does not hold for the majority of real world problems. Usually, the coefficients of the programming problem are either subject to errors of measurement or vary with technology or market conditions. Hence, it is crucial to introduce imprecision or uncertainty into the modeling process while solving real world problems.
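The sensitivity of a linear program to an inexact coefficient can be seen on a very small instance. The sketch below is illustrative only: it solves a two-variable problem of the form of Equation 2.3 by enumerating the vertices of the feasible region rather than by the simplex method, and the function name `solve_lp_2d` and all numeric values are assumptions chosen for the example. It also assumes the feasible region is bounded.

```python
from itertools import combinations


def solve_lp_2d(c, A, b):
    """Maximize c.x subject to A x <= b and x >= 0 for two variables.

    Every pair of constraint boundaries is intersected and the best feasible
    intersection point (vertex) is returned as (value, (x1, x2)).
    """
    # Express x1 >= 0 and x2 >= 0 as -x1 <= 0 and -x2 <= 0.
    rows = [tuple(r) for r in A] + [(-1.0, 0.0), (0.0, -1.0)]
    rhs = list(b) + [0.0, 0.0]
    best = None
    for i, j in combinations(range(len(rows)), 2):
        (a1, a2), (b1, b2) = rows[i], rows[j]
        det = a1 * b2 - a2 * b1
        if abs(det) < 1e-12:
            continue  # parallel boundaries never meet in a vertex
        # Cramer's rule for the 2x2 system formed by the two boundaries.
        x1 = (rhs[i] * b2 - a2 * rhs[j]) / det
        x2 = (a1 * rhs[j] - b1 * rhs[i]) / det
        feasible = all(r[0] * x1 + r[1] * x2 <= t + 1e-9 for r, t in zip(rows, rhs))
        if feasible:
            value = c[0] * x1 + c[1] * x2
            if best is None or value > best[0]:
                best = (value, (x1, x2))
    return best


# Nominal problem: maximize 3*x1 + 2*x2 with two resource constraints.
nominal = solve_lp_2d([3.0, 2.0], [[1.0, 1.0], [2.0, 1.0]], [4.0, 6.0])
# Same problem with one constraint coefficient 25% larger than assumed.
perturbed = solve_lp_2d([3.0, 2.0], [[1.0, 1.0], [2.5, 1.0]], [4.0, 6.0])
print(nominal, perturbed)
```

In this instance the nominal optimum no longer satisfies the perturbed constraint, so a solution computed from exact-looking coefficients can be infeasible once the coefficient drifts.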
A classical approach has been to use the concepts of probability theory. Probabilistic analysis is only appropriate for situations that are reproducible and have occurred a sufficient number of times. Therefore, the probabilistic techniques require some statistical data to provide information on the distribution of the random variables occurring in the mathematical model. The modeling of imprecision or uncertainty can also be done using fuzzy set theory and the concepts of fuzzy programming. The fuzzy programming techniques assume that the distributions of the random variables are not exactly known and only belong to nonvoid sets or intervals. In the next sections, we discuss the preliminaries of variation aware optimization using fuzzy and stochastic programming.
2.4 Fuzzy Linear Programming Methodology

In this section, we briefly discuss important concepts in uncertainty aware optimization using fuzzy programming and variation modeling using fuzzy numbers. The reader is referred to Sakawa 2002 [56], Klir and Yuan 1995 [43], and Zadeh 1970 [64] for a detailed treatment of fuzzy mathematical programming. Zadeh (1965) introduced the concept of fuzzy sets and systems, in which the membership of an element in a set need not be binary valued, but can take any value in the interval [0, 1]. The membership value of the element is decided by how much it belongs to the set: the higher the degree of belonging, the higher the membership value.

2.4.1 Fuzzy Numbers

Fuzzy set theory and fuzzy optimization techniques provide an efficient mechanism for modeling and optimizing systems that exhibit imprecision. The theory and methodology of fuzzy programming based optimization has been popular since the introduction of decision making in fuzzy environments by Bellman and Zadeh in 1970 [64]. Several models and approaches have been proposed for uncertainty management using fuzzy linear programming, fuzzy multiobjective programming, fuzzy dynamic programming, fuzzy integer programming, possibilistic programming, and fuzzy nonlinear programming. An extensive list of references can be found in [56] and [39]. A recent survey on fuzzy linear programming based optimization from a practical perspective has been provided by Inuiguchi and Ramik [51].
Hence, in order to solve an optimization problem with uncertain variables using fuzzy optimization, the principal requirement is an efficient fuzzy modeling of the uncertainty. A simple piece of information such as "the processing time for a task is around 23 minutes, or within the range of 21 and 25" can be expressed by means of the following membership function:

$$\mu_{23}(x) = \begin{cases} \dfrac{x - 21}{2} & \text{if } 21 \le x \le 23 \\[4pt] \dfrac{25 - x}{2} & \text{if } 23 < x \le 25 \\[4pt] 0 & \text{otherwise} \end{cases} \qquad (2.4)$$

The uncertainty due to process variations is usually modeled using normally distributed random variables with mean $E$ and standard deviation $\sigma$. In this work, instead of using the normal distribution, we model these variations as interval valued fuzzy numbers in the range $E - 3\sigma$ to $E + 3\sigma$. The $3\sigma$ value
is assumed to be the deterministic worst case variation value, meaning that all uncertain process parameters are set to $3\sigma$ for maximum timing yield in worst case deterministic optimization. Interval valued fuzzy numbers were first explained by Zadeh in [64] using possibilistic distributions. In the context of fuzzy mathematical programming, possibilistic distributions ($\pi$) are analogous to linear membership functions [30]. Figure 2.1 shows a symmetric triangular fuzzy number for modeling the variations in the channel length of nanometer level transistors. Triangular and trapezoidal memberships are commonly used possibilistic distributions in solving fuzzy mathematical programming problems. Fuzzy optimization with nonlinear membership functions has also been attempted to improve modeling accuracy [56].

Figure 2.1 Triangular Fuzzy Number Modeling for Varying Gate Length

The triangular fuzzy number in Figure 2.1 is commonly denoted by a triple $X = (x^m, x^l, x^u)$, where $x^m$ is the most possible value or the mean value, and $x^l$, $x^u$ are the lower and upper bounds, denoting the pessimistic and optimistic values of the number. Depending on the context, the value $x^l$ can be the pessimistic or the optimistic variation from the mean value $x^m$, and the same holds for the value $x^u$. In the context of VLSI circuit optimization, the triple $L_{eff} = (L^m_{eff}, L^l_{eff}, L^u_{eff})$ can be used to model variations in channel length. Since the general objective in circuit optimization is to minimize delay or power, the pessimistic value in this context for the effective channel length is $L^u_{eff}$, which is the sum
of $L^m_{eff}$ and $3\sigma$. Similarly, if we model the gate oxide thickness as a fuzzy triple, the pessimistic value is once again the upper bound value.

2.4.2 Solution Technique: Fuzzy Linear Programming

In this subsection, we explain the solution methodology of variation aware optimization using fuzzy linear programming. Fuzzy linear programming (FLP) is a special case of fuzzy mathematical programming, where the objective function and constraints of the optimization problem are linear. The uncertain coefficients are assumed to vary linearly in the specified interval. The FLP problem shown here is a maximization problem with uncertain coefficients in the constraints.

$$\begin{aligned}
\text{maximize} \quad & \sum_{i=1}^{n} a_i x_i \qquad (2.5)\\
\text{subject to} \quad & \sum_{i=1}^{n} \tilde{b}_{ji}\, x_i \le c_j, \quad 1 \le j \le m
\end{aligned}$$

where $m$ is the number of constraints, $n$ is the number of variables, and at least one $x_j > 0$. In the above optimization problem, the coefficient $\tilde{b}_{ji}$ is an interval valued fuzzy number with a mean value $b_{ji}$ and a maximum variation of $d_{ji}$. The upper bound is assumed to be the pessimistic variation for this fuzzy number. The fuzzy number $\tilde{b}_{ji}$ is also assumed to vary linearly with the value of the variable $x_i$. To defuzzify the FLP, we need to identify the lower and upper bounds of the optimal solution. The upper bound value for the fuzzy optimization problem can be estimated by fixing the fuzzy coefficients to the average case of the triangular fuzzy number, as shown in the following equation.

$$\begin{aligned}
Obj_1 = \text{maximize} \quad & \sum_{i=1}^{n} a_i x_i \qquad (2.6)\\
\text{subject to} \quad & \sum_{i=1}^{n} b_{ji}\, x_i \le c_j, \quad 1 \le j \le m
\end{aligned}$$

Similarly, the lower bound value is found by setting the fuzzy coefficients to the pessimistic (assumed to be the upper bound here) value,
$$\begin{aligned}
Obj_2 = \text{maximize} \quad & \sum_{i=1}^{n} a_i x_i \qquad (2.7)\\
\text{subject to} \quad & \sum_{i=1}^{n} (b_{ji} + d_{ji})\, x_i \le c_j, \quad 1 \le j \le m
\end{aligned}$$

Now, with these bound objective values and a new variation parameter, we can formulate a crisp problem, which will represent an optimal solution in the presence of variations. The objective function of the fuzzy programming problem takes values between the lower bound $Obj_l = \min(Obj_1, Obj_2)$ and the upper bound $Obj_u = \max(Obj_1, Obj_2)$. Using these bound values and the symmetric definition of fuzzy decision proposed by Bellman and Zadeh [64], the fuzzy problem can be defuzzified into a crisp nonlinear problem, as shown in the following equation.

$$\begin{aligned}
\text{maximize} \quad & \lambda \qquad (2.8)\\
\text{subject to} \quad & \lambda\,(Obj_l - Obj_u) - \sum_{i=1}^{n} a_i x_i + Obj_u \le 0 \\
& \sum_{i=1}^{n} (b_{ji} + \lambda\, d_{ji})\, x_i - c_j \le 0, \quad 1 \le j \le m \\
& x_i \ge 0, \quad 0 \le \lambda \le 1, \quad 1 \le i \le n
\end{aligned}$$

where $\lambda$ is the variation parameter introduced in the crisp problem, which is to be maximized for a fuzzy optimal solution between the lower and upper bound values. The solution of this nonlinear programming problem can be interpreted as representing an overall degree of satisfaction in the presence of varying parameters [56]. In the general formulation, the variation parameter $\lambda$ can take values between 0 and 1. In the next section, we explain the basics of our second technique, namely the stochastic chance constrained programming problem.

2.5 Stochastic Chance Constrained Programming

Programming under probabilistic constraints as a decision model under uncertainty was introduced by Charnes, Cooper, and Symonds [8]. These authors coined the term chance constrained programming for this model, its extensions, and its variants [7]. Similar to the fuzzy programming technique, chance constrained programming handles uncertainty in mathematical
programming based formulations. The chance constrained technique uses a relaxation step to convert the uncertain optimization problem into a crisp nonlinear programming problem. The focus of stochastic chance constrained programming is to increase the program's ability to meet its constraints in an uncertain environment. Next, we explain the solution technique of the chance constrained programming methodology. The discussion starts with the underlying linear programming problem, which can be shown as

$$\begin{aligned}
\text{minimize} \quad & c^{T} x \qquad (2.9)\\
\text{s.t.} \quad & A x \ge b, \quad x \ge 0
\end{aligned}$$

Here, $c$ and $x$ are $n$-vectors, $b$ is an $m$-vector, and $A$ is an $m \times n$ matrix. In several engineering problems, the estimation of the constraint matrix $A$ is difficult and has a certain amount of uncertainty associated with it. In such situations, the system is required to satisfy the corresponding constraint with a probability $p \in (0, 1)$. The stochastic equivalent of the linear programming problem can be shown as

$$\begin{aligned}
\text{minimize} \quad & c^{T} x \qquad (2.10)\\
\text{s.t.} \quad & P(A x \ge b) \ge p, \quad x \ge 0
\end{aligned}$$

The probability $p$, in the context of engineering problems, may reflect the reliability of the system. The probability $p$ ensures that the state of the system remains within a subset of all possible states. The subset mainly focuses on the functioning of the system without major failures. The choice of the probability $p$ is often arbitrary and accounts for the loss whenever constraints are violated. The formulation shown above can be efficiently transformed under the assumptions of node delay independence and Gaussian structured random values. The transformed probabilistic constraint can be expressed as follows:

$$\bar{A} x + \phi^{-1}(1 - p)\,\left(z^{T} C z\right)^{0.5} \ge b \qquad (2.11)$$

Here, the assumption is that the random variable has a joint normal distribution with expectation $\bar{A}$ and covariance matrix $C$, and $z$ is the standard deviation. The value of $\phi^{-1}(1 - p)$ for the constraints can
be used to control the subset size and the associated reliability of the system. A high value for the above coefficient increases the yield and reliability of the system. Similar to the fuzzy programming solution, the stochastic chance constrained program can also be solved using a nonlinear optimization solver. In this dissertation, we propose statistical design optimization solutions using both fuzzy and stochastic chance constrained programming for variation awareness in circuit optimization.
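The conversion of a single chance constraint into a crisp inequality can be sketched as follows. This is a minimal illustration under stated assumptions, not the formulation used in later chapters: it treats one row of the constraint matrix as a jointly Gaussian vector with a given mean and covariance, and it takes the quadratic form in the decision vector $x$, which is the usual textbook deterministic equivalent of $P(a^{T}x \ge b) \ge p$. The function names and the numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def deterministic_lhs(mean_row, cov, x, p):
    """Left-hand side of the crisp equivalent of P(a.x >= b) >= p.

    mean_row: expected coefficients of the constraint row
    cov:      covariance matrix of those coefficients (list of lists)
    x:        candidate decision vector
    p:        required probability, strictly between 0 and 1
    """
    n = len(x)
    mean_term = sum(m * xi for m, xi in zip(mean_row, x))
    variance = sum(x[i] * cov[i][j] * x[j] for i in range(n) for j in range(n))
    # The inverse standard normal CDF at (1 - p) is negative for p > 0.5,
    # so a higher required probability demands a larger margin.
    return mean_term + NormalDist().inv_cdf(1.0 - p) * sqrt(variance)


def satisfies_chance_constraint(mean_row, cov, x, b, p):
    """True if x meets the constraint a.x >= b with probability at least p."""
    return deterministic_lhs(mean_row, cov, x, p) >= b


# Hypothetical data: independent coefficients with small variances.
mean_row = [2.0, 1.0]
cov = [[0.04, 0.0], [0.0, 0.01]]
x = [1.0, 2.0]
print(satisfies_chance_constraint(mean_row, cov, x, b=3.5, p=0.5))
print(satisfies_chance_constraint(mean_row, cov, x, b=3.5, p=0.99))
```

With $p = 0.5$ the check reduces to the mean-value constraint, while $p = 0.99$ rejects the same point because the margin term lowers the left-hand side below $b$.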
CHAPTER 3
VARIATION AWARE GATE SIZING

3.1 Problem Definition

Gate sizing is a simple yet powerful technique to improve the power/delay ratio of VLSI circuits. Several modeling schemes and solution methodologies have been proposed over the years to optimize power and performance through gate sizing. The process of gate sizing can be defined as finding the optimal drive strengths of the individual gates of a circuit for a given objective function and constraints. For example, the objective function can be to minimize power or area for a specified timing target. A taxonomy of the various gate sizing approaches found in the literature, categorized as (i) deterministic [1, 2, 13, 18, 41, 60, 80, 90] and (ii) variation aware gate sizing [10, 15, 26, 35, 42, 49, 53–55, 58, 59, 69, 78, 87, 88, 95] approaches, is shown in Figure 3.1. The works listed in the figure have used iterative sizing, linear programming, geometric programming, game theory, and several other interesting formulations to identify the optimal gate sizes for minimizing the power and/or timing of circuits. The variation aware gate sizing works model the process, voltage, and temperature (PVT) variations using a statistical approach. The PVT variations in the nanometer era can be categorized as inter-die and intra-die variations. The inter-die variation occurs across different dies and affects all the transistors in a chip in a similar fashion. The intra-die variations, on the other hand, refer to variability within a single chip, resulting in the gate lengths of some transistors being larger and others smaller than the intended sizes. The characteristics of the intra-die variations are correlated with respect to the position of the transistor in the die.
The modeling of process variations was initially limited to statistical static timing analysis (SSTA) [11, 26, 74], where continuous distributions are propagated instead of deterministic values to find closed form expressions for performance in the presence of variations. More recently, statistical design optimization for improving power and area for an acceptable yield has been investigated in [42, 49, 53, 54, 69, 95]. In [95], the optimization uses a penalty function to improve the slacks of critical paths and thereby improve yield.
[Figure: taxonomy tree rooted at "Gate Sizing Methods", with a "Deterministic" branch and a "Variation Aware Gate Sizing" branch]

Figure 3.1 Taxonomy Diagram of Optimization Methods for Gate Sizing

An SSTA engine is used in the iterative optimization framework [49] to find the most critical gates to size in terms of power/delay sensitivity. A stochastic programming approach with chance (probabilistic) constraints is used in [53] and [54] to incorporate yield in the gate sizing problem formulation. However, the SSTA based approaches [42, 49, 95] use continuous distributions, which require a number of operations to be performed iteratively at each node and hence involve higher runtimes [20, 30]. The stochastic programming based statistical optimization technique, on the other hand, is reasonably fast, but it is claimed in [40] that it can produce less optimized solutions compared to fuzzy programming when tested with Monte Carlo simulations.
In this chapter, we propose a new variation aware gate sizing algorithm that considers the uncertainty due to process variations using the concept of fuzzy linear programming. In the context of fuzzy set theory, imprecision is defined as an uncertainty where it is difficult to even predict the average behavior of the outcome. Probability theory can be used to model situations in which the average behavior is predictable (situations that obey the law of large numbers) and enough information is available to model the probability distribution functions. The theory of fuzzy sets and systems, on the other hand, has been used to model imprecision in different applications such as vision and robotics [64]. In VLSI design automation, fuzzy logic has been applied to model imprecise coefficients in VLSI testing and in scheduling in high level synthesis [30]. To the best of our knowledge, this is the first time the concepts of fuzzy sets and systems and fuzzy mathematical programming are being used to model the uncertainty due to process variations in nanometer VLSI circuits. For simplicity, we use linear delay models [17] and linear membership functions [28]. However, more complex models, including nonlinear or posynomial models [56], can be easily incorporated into the fuzzy optimization flow.

The fuzzy optimization is a two step process. Initially, deterministic optimizations are performed assuming the worst case and the average case values of the variation parameters to identify the bounds of the uncertain problem. The solution bounds and a variation parameter $\lambda$ are then used to transform the uncertain fuzzy problem into a crisp nonlinear programming problem [28, 64]. The term crisp in the fuzzy mathematical programming context refers to a nonfuzzy, non-interval based real value. The variation parameter $\lambda$ in this crisp problem ranges over the interval (0, 1) and implicitly models the interval values (fuzzy variables) of the original problem.
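The way $\lambda$ stands in for an interval valued coefficient can be illustrated with a short sketch. It assumes, as in the constraints of Equation 2.8, that each fuzzy coefficient takes the crisp value $b + \lambda d$, where $b$ is the mean and $d$ the maximum (pessimistic) variation; the function names and numbers are hypothetical, and a nonnegative candidate solution is assumed.

```python
def coefficient_at(mean, max_variation, lam):
    """Crisp value b + lam * d of a fuzzy coefficient for a given lam in [0, 1]."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return mean + lam * max_variation


def largest_feasible_lambda(means, variations, x, c):
    """Largest lam in [0, 1] for which sum((b_i + lam * d_i) * x_i) <= c holds.

    Returns None when x violates the constraint even at the mean coefficients.
    Assumes x_i >= 0 and d_i >= 0, so the left-hand side grows with lam.
    """
    nominal = sum(b * xi for b, xi in zip(means, x))
    extra = sum(d * xi for d, xi in zip(variations, x))
    if nominal > c:
        return None
    if extra <= 0.0:
        return 1.0
    return min(1.0, (c - nominal) / extra)


# Hypothetical constraint: mean coefficients, their maximum variations, a
# candidate solution x, and the right-hand side c.
means, variations = [2.0, 1.0], [0.5, 0.25]
x, c = [2.0, 1.0], 6.0
lam = largest_feasible_lambda(means, variations, x, c)
print(lam, [coefficient_at(b, d, lam) for b, d in zip(means, variations)])
```

A candidate that tolerates a larger $\lambda$ remains feasible for coefficients closer to their pessimistic bounds, which is the sense in which a larger $\lambda$ corresponds to greater variation resistance.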
This transformation is referred to as symmetric relaxation [28]. The additional parameter $\lambda$ introduced during the transformation implicitly captures the uncertainty due to process variations, and maximizing this variable leads to high process variation resistance. Fuzzy numbers with nonlinear membership functions can be modeled by replacing this linear parameter $\lambda$ with a nonlinear function in terms of the variation parameter. The solution of this crisp nonlinear problem represents the optimal value in the presence of variations. The crisp problem, in general, has been proven to provide the most satisfying solution in the presence of variations in the coefficients of the constraints or objective function in the optimization [43, 56]. In the context of variation aware circuit optimization, the above crisp model, with delay and power as constraints, can be used to maximize the robustness, i.e., the variation resistance of the circuit, and thus the yield. The proposed approach has been tested on ITC'99 benchmark circuits
and the results indicate sizable savings in power compared to the worst case deterministic gate sizing approach. The proposed fuzzy programming based gate sizing also provides better results than the stochastic programming based gate sizing approach [54] in terms of power savings, with a comparable runtime. The results are validated using Monte Carlo simulations, which indicate a high timing yield for the circuits designed with the fuzzy gate sizing methodology.

The rest of the chapter is organized as follows. In Section 3.2, we discuss the motivation as to why fuzzy programming is suitable for variation aware gate sizing. The details of the proposed modeling and the methodology for the fuzzy gate sizing approach are given in Section 3.3, followed by experimental results and conclusions in Sections 3.4 and 3.5 respectively.

3.2 Why Fuzzy Programming for Variation Aware Gate Sizing?

In this section, we discuss why fuzzy mathematical programming is well suited for modeling the uncertainty due to process variations in VLSI circuits. The impact of process variations in the nanometer era is completely nondeterministic, and the degree of uncertainty is expected to be worse in future generations [95]. A common approach to handling the uncertainties due to process variations has been to use probabilistic models, in which the uncertain parameters are represented in terms of probability distributions. However, the probabilistic way of evaluating and optimizing the uncertainties is computationally expensive, either because of the complicated multiple integration techniques needed for continuous distributions [20, 30] or because of the large number of scenarios in the corresponding discrete representation. Variation aware gate sizing using statistical static timing analysis, proposed in [49], requires very high execution times.
Furthermore, certain probabilistic models require an exhaustive description of the uncertain parameters in order to build probability distributions from historic (empirical) data. When such a description is not available (for example, in a new technology or for a new variation parameter where extensive details of the uncertainty are not known), we do not have enough information to derive the probability distributions. Also, exhaustive Monte Carlo simulations are needed to generate probability distributions for all the varying parameters. An alternative treatment of uncertainty is needed in situations wherein an expert can predict or obtain only the mean and worst case values of an uncertain
parameter. Fuzzy mathematical programming and interval arithmetic can be used to make decisions under the above conditions. In addition to the above arguments, Buckley has also shown in [40] that fuzzy programming based optimization guarantees solutions that are better than or at least as good as their stochastic counterparts. The author provides a comparison of the stochastic and fuzzy programming methodologies using Monte Carlo simulations. The fuzzy optimization, in uncertain environments, finds the best solution (a supremum operation over all feasible solutions), as opposed to the averaging (integrals over all feasible solutions) performed in stochastic programming based optimization. Hence, fuzzy programming selects a solution which is better than or at least as good as the stochastic solution.

The above arguments provided us the motivation to investigate a fuzzy mathematical programming approach to model uncertainty due to process variations in VLSI design automation. The performance of the proposed algorithm is compared with that of stochastic programming based gate sizing in order to illustrate the efficiency of fuzzy programming for optimization in the presence of variations. It is shown in Section 3.4 that the proposed approach yields better power savings than stochastic gate sizing with comparable execution times, under the assumption of the same models, setup, parameters and objective function for both implementations.

In the context of fuzzy gate sizing, the uncertainty due to process variations can be modeled as normally distributed random variables with mean $E$ and standard deviation $\sigma$. We model these variations as interval valued fuzzy numbers in the range $E - 3\sigma$ to $E + 3\sigma$ instead of using a normal distribution. The $3\sigma$ value is assumed to be the deterministic worst case variation value, meaning that all uncertain process parameters are set to $3\sigma$ for maximum timing yield in worst case deterministic optimization.
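The interval model just described can be illustrated with a small sketch. It assumes a triangular (linear) membership function peaking at the mean and falling to zero at the two $3\sigma$ bounds; the class name, function name and the 90 nm / 3 nm figures are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TriangularFuzzyNumber:
    """Triple (mean, low, high) with a linear membership function."""
    mean: float
    low: float
    high: float

    def membership(self, x: float) -> float:
        """Degree in [0, 1] to which x belongs to the fuzzy number."""
        if x <= self.low or x >= self.high:
            return 0.0
        if x <= self.mean:
            return (x - self.low) / (self.mean - self.low)
        return (self.high - x) / (self.high - self.mean)


def from_normal(mean: float, sigma: float) -> TriangularFuzzyNumber:
    """Replace N(mean, sigma) by the interval [mean - 3*sigma, mean + 3*sigma]."""
    return TriangularFuzzyNumber(mean, mean - 3.0 * sigma, mean + 3.0 * sigma)


# Hypothetical effective channel length: mean 90 nm, sigma 3 nm.
l_eff = from_normal(90.0, 3.0)
print(l_eff.low, l_eff.mean, l_eff.high)
print(l_eff.membership(90.0), l_eff.membership(94.5))
```

Only the mean and the worst case bound are needed to build such a number, which is the situation the text describes when a full probability distribution is not available.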
In the context of fuzzy mathematical programming, possibilistic distributions $\pi$ are analogous to linear membership functions [30]. The triangular fuzzy number in Figure 2.1 is usually denoted by a triple $X = (x_m, x_l, x_u)$, where $x_m$ is the most possible value or the mean value and $x_l$, $x_u$ are the lower and upper bounds, denoting the pessimistic and optimistic values of the number. Depending on the context, the value $x_l$ can be a pessimistic or an optimistic variation from the mean value $x_m$, and the same holds for the value $x_u$. In the context of VLSI circuit optimization, the triple $L_{eff} = (L^m_{eff}, L^l_{eff}, L^u_{eff})$ can be used to model variations in channel length. Since the general objective in circuit optimization is to minimize delay or power, the pessimistic value in this context for effective channel length is $L^u_{eff}$, which is the sum of $L^m_{eff}$ and $3\sigma$. Similarly, if we model the gate oxide thickness as a fuzzy triple, the
pessimistic value is once again the upper bound value. In the next section, we explain the modeling and formulation of the variation aware gate sizing problem.

3.3 Proposed Variation Aware Fuzzy Gate Sizing

In this section, we describe our formulation of the gate sizing problem in the presence of uncertainty due to process variations. The problem of gate sizing can be defined as finding the optimal drive strengths such that the specified critical path timing is met and the overhead (power in this work) is minimized. We determine the size of the gates with the goal of minimizing dynamic power with delay as constraints. We use linear programming due to its simplicity of modeling, faster execution time and the availability of well developed fuzzy linear programming techniques for variation aware optimization [43]. However, it should be noted that fuzzy programming based optimization can also solve optimization problems in a nonlinear programming setup [56]. Next, we present the power and delay models used in this work.

3.3.1 Power and Timing Models

The dynamic power consumption of a gate $i$ is given as

\[
P_i = \frac{1}{2} f V_{dd}^2 E_i \left( C_i + C_{wire} \right) + P_{sc} \tag{3.1}
\]

where $P_i$ is the total dynamic power consumed by gate $i$, $f$ is the clock frequency, $V_{dd}$ is the supply voltage for the gate, $E_i$ is the average switching activity of the gate, $C_i$ is the intrinsic gate capacitance internal to the gate and $C_{wire}$ is the sum of the capacitances of all the interconnects that fan out from gate $i$. Thus, reducing the size ($s_i$) of the gate reduces the intrinsic gate capacitance of gate $i$, its power consumption and the fanin load capacitance of the gate. Secondly, in this work we model gate delay as a linear function of gate size.
The linear delay model proposed in [17] is given by

\[
d_i = a_i - b_i s_i + c_i \sum_{j \in fo(i)} s_j \tag{3.2}
\]

where $s_i$ refers to the size of gate $i$, $fo(i)$ is the set of gates that fan out from gate $i$, and the constant coefficients $a_i$, $b_i$, $c_i$ are empirically determined by extensive SPICE simulations for each gate in the library
for various sizes and fanout counts. A similar nominal delay model has also been used in a recent stochastic programming based statistical gate sizing approach [54].

3.3.2 Variation Modeling with Spatial Correlation

The increasing influence of process variations on circuit yield is making variation aware optimization a requirement early in the design flow. The uncertainty due to process variations has been modeled in most works using the following equation,

\[
D = d_i + \sum_{j=1}^{n} \delta_j X_j + \delta_r X_r \tag{3.3}
\]

where $d_i$ is the nominal delay and $X_j$ and $X_r$ are the random parameters representing correlated and independent variations respectively. The magnitudes of these variations are given by $\delta_j$ and $\delta_r$, which are determined from extensive simulations. The correlation differs between areas of the die and is usually high for gates close to each other. The correlations in this work were included in the delay model as proposed by the authors in [4]. The modeling of spatial correlations by dividing the chip into regions and levels is shown in Figure 3.2. The lowest level (Level 0) of the die is divided into sixteen regions, which are grouped into subblocks at the upper levels. The correlation of connected gates in the die is directly proportional to the number of common regions the gates share in levels 0 and 1. Gates placed closer to each other have similar variation characteristics and hence high correlation. The magnitude of variation in a gate is determined by the count of its fanout gates placed in the same Level 0 region. A gate with all its fanout gates in the same region is modeled to have a smaller variation value.

We capture these variations using the concept of fuzzy numbers. We assume the delay of the gate to be an interval with lower and upper bound values. In other words, each gate's delay is now a triangular value (average, low, high) instead of a single discrete value.
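As a rough illustration of how a single delay value becomes a triangular value, the sketch below evaluates the linear delay model at the mean and at the two extreme corners of its coefficients. It assumes the sign convention $d = a - b s + c \sum s_j$, and the names `g` and `h` (maximum deviations of `b` and `c`) and all numeric values are hypothetical.

```python
def gate_delay(a, b, c, size, fanout_sizes):
    """Linear delay model: d = a - b*size + c*sum(fanout sizes)."""
    return a - b * size + c * sum(fanout_sizes)


def fuzzy_gate_delay(a, b, c, g, h, size, fanout_sizes):
    """Return (average, low, high) delay when b varies by +/-g and c by +/-h."""
    average = gate_delay(a, b, c, size, fanout_sizes)
    # Larger b and smaller c both reduce delay: optimistic corner.
    low = gate_delay(a, b + g, c - h, size, fanout_sizes)
    # Smaller b and larger c both increase delay: pessimistic corner.
    high = gate_delay(a, b - g, c + h, size, fanout_sizes)
    return average, low, high


# Hypothetical coefficients for one gate of size 2 driving gates of size 1 and 2.
avg, low, high = fuzzy_gate_delay(a=10.0, b=2.0, c=1.0, g=0.16, h=0.10,
                                  size=2.0, fanout_sizes=[1.0, 2.0])
print(f"average={avg:.2f} low={low:.2f} high={high:.2f}")
```

Because the model is linear in `b` and `c`, the extreme delays always occur at the corners of the coefficient intervals, so three evaluations are enough to obtain the triple.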
We focus on intradie variations, meaning that each transistor can have a different amount of variation. In the pioneering work on process variations by Sani Nassif [79], it is pointed out that, in the absence of real statistical data on a process run, it is reasonable to assume a variation parameter value of 25% on the delay due to process variations. The uncertainty in the delay values is transferred to the coefficients $b_i$ and $c_i$ of the linear delay model shown in Equation 3.2. Following the above assumption, in the works reported
in [53, 54], it has been pointed out that the regression coefficients $b_i$ and $c_i$ closely approximate the variation effects of $L_{eff}$ and $t_{ox}$, based on simulation experiments with statistical data. They observed the regression coefficients $b_i$ and $c_i$ to have variation values of 8% and 10%, which in turn corresponded to the 25% variation effect in circuit delay. These coefficients are fuzzy numbers of the triangular form with a linearly varying membership function. Since these observations were an outcome of significant experiments based on statistical data from real process runs, we followed the same assumptions in our work. This allows us to make a fair comparison with the work reported in [54]. Next, we explain the proposed fuzzy gate sizing approach for optimization in the presence of process variations.

Figure 3.2 Spatial Correlation of Process Variations [Agarwal, 2003]

3.3.3 Variation Aware Fuzzy Gate Sizing

In this section, we use delay constrained dynamic power minimization for the gate sizing problem. If minimizing power were our sole interest, all the gates could be set to minimum size. However, the problem objective is to achieve minimum power for a specified timing target. Hence, the cost function of the deterministic optimization formulation must include both delay and power. The deterministic formulation of the sizing problem is given by
\[
\begin{aligned}
\min \quad & \sum_i P_i \\
\text{s.t.} \quad & D_p \le T_{spec}, \quad \forall p \in P \\
\text{and} \quad & D_p = \sum_{i \in p} \Big( a_i - b_i s_i + c_i \sum_{j \in fo(i)} s_j \Big)
\end{aligned}
\tag{3.4}
\]

where $T_{spec}$ is the specified timing target of the circuit, $p$ denotes a particular path in the circuit, which belongs to the set of all paths $P$, and $D_p$ is the sum of the delays of all gates in path $p$. The summation of the dynamic power of all the gates is used as the objective function. The dynamic power model in Equation 3.1 is substituted here. The fuzzy version of the above deterministic optimization problem with uncertain parameters is given by

\[
\begin{aligned}
\min \quad & \sum_i P_i \\
\text{s.t.} \quad & D_p \le T_{spec}, \quad \forall p \in P \\
\text{and} \quad & D_p = \sum_{i \in p} \Big( a_i - \tilde{b}_i s_i + \tilde{c}_i \sum_{j \in fo(i)} s_j \Big)
\end{aligned}
\tag{3.5}
\]

where $s_i$ is bounded by the minimum and maximum gate sizes, and the coefficients $\tilde{b}_i$ and $\tilde{c}_i$ are the uncertain parameters. The uncertain parameters are modeled as fuzzy number triples of the form $(b_i, b_i - g_i, b_i + g_i)$ and $(c_i, c_i - h_i, c_i + h_i)$, where $g_i$ and $h_i$ are the maximum variations for the coefficients $b_i$ and $c_i$ respectively. The coefficients $b_i$ and $c_i$ closely approximate the variation in effective channel length ($L_{eff}$) and oxide thickness ($t_{ox}$). The authors in [54] also follow a similar modeling for gate sizing in the presence of uncertainty using chance constrained programming.

The fuzzy gate sizing problem is then transformed into a crisp nonlinear problem using the following steps. A deterministic optimization is performed initially with the varying coefficients set to the worst and average case values of the fuzzy number. In the worst case optimization, the fuzzy gate delay equation in the fuzzy problem is replaced with the following equation.

\[
d_i = a_i - (b_i - g_i)\, s_i + (c_i + h_i) \sum_{j \in fo(i)} s_j \tag{3.6}
\]
The gate delay in the above equation is the most pessimistic estimate, resulting in the worst possible delay for the gate. It can also be seen that the worst case estimate corresponds to the lower bound in coefficient $b_i$, since $b_i$ is inversely proportional to the effective channel length, and to the upper bound in coefficient $c_i$, as it is directly proportional to the gate oxide thickness. Similarly, the typical or nominal case of the gate delay is the case where the fuzzy numbers are fixed to their average case values. In the nominal case optimization, the fuzzy delay equation in the fuzzy problem is replaced with the following equation.

\[
d_i = a_i - b_i s_i + c_i \sum_{j \in fo(i)} s_j \tag{3.7}
\]

The deterministic optimization problem (Equation 3.4) is solved with the delay equations set to the worst case and nominal case equations. The KNITRO optimization solver, available through the NEOS optimization server, is used to solve the linear programming problems. The results of these optimizations correspond to the worst case gate sizing ($wc_{sizing}$) and nominal case gate sizing ($nc_{sizing}$) values. These values and a new variation parameter $\lambda$ are used to transform the fuzzy optimization problem into a crisp nonlinear programming problem using the symmetric relaxation method [64]. The crisp nonlinear problem for gate sizing in the presence of process variations is given by

\[
\begin{aligned}
\text{maximize} \quad & \lambda \\
\text{s.t.} \quad & \lambda \,( wc_{sizing} - nc_{sizing} ) + \sum_i P_i - wc_{sizing} \le 0 \\
& D_p \le T_{spec}, \quad \forall p \in P \\
\text{and} \quad & D_p = \sum_{i \in p} \Big( a_i - (b_i - g_i \lambda)\, s_i + (c_i + h_i \lambda) \sum_{j \in fo(i)} s_j \Big)
\end{aligned}
\tag{3.8}
\]

where the parameter $\lambda$ is bounded by 0 and 1. Even though the parameter $\lambda$ can take any value between 0 and 1, for the gate sizing problem it can easily be restricted to a smaller range. In this work, we bound the $\lambda$ value to be between 0.5 and 0.75. We estimated that such a smaller bound is sufficient due to the dual requirement of high yield and low overhead for the gate sizing optimization in the presence of variations.
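To make the two step procedure concrete, the following sketch applies it to a deliberately tiny case: one gate driving a fixed load, with power taken as proportional to size. These simplifications, the function names and all numbers are assumptions for illustration; the power constraint is read here as power <= wc - lambda * (wc - nc). The real problem is solved with a nonlinear solver, whereas this toy case can be solved by bisection on lambda.

```python
def min_size(lam, a, b, c, g, h, load, t_spec, s_min, s_max):
    """Smallest size meeting a - (b - g*lam)*s + (c + h*lam)*load <= t_spec."""
    slope = b - g * lam
    if slope <= 0:
        raise ValueError("model needs b - g*lam > 0")
    needed = (a + (c + h * lam) * load - t_spec) / slope
    size = max(s_min, needed)
    if size > s_max:
        raise ValueError("timing target unreachable within the size bounds")
    return size


def fuzzy_gate_size(a, b, c, g, h, load, t_spec, k,
                    s_min=1.0, s_max=4.0, tol=1e-10):
    """Maximise lam subject to the delay bound and the power bound."""
    def power(lam):
        # Hypothetical power model: proportional to gate size.
        return k * min_size(lam, a, b, c, g, h, load, t_spec, s_min, s_max)

    nc_power = power(0.0)  # step 1a: nominal case deterministic optimum
    wc_power = power(1.0)  # step 1b: worst case deterministic optimum
    lo, hi = 0.0, 1.0      # step 2: largest lam whose cheapest sizing fits
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if power(mid) <= wc_power - mid * (wc_power - nc_power):
            lo = mid
        else:
            hi = mid
    return lo, power(lo), nc_power, wc_power


lam, p, nc, wc = fuzzy_gate_size(a=10.0, b=2.0, c=1.0, g=0.4, h=0.2,
                                 load=3.0, t_spec=9.5, k=1.0)
print(f"lambda={lam:.4f} power={p:.4f} (nominal {nc:.4f}, worst {wc:.4f})")
```

Bisection works in this toy setting because the minimum power needed grows with lambda while the allowed power shrinks with it, so the feasible values of lambda form an interval starting at 0.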
The smaller bound speeds up the fuzzy gate sizing procedure by 2 to 3 times, without affecting the final solution. The crisp optimization problem has three variables in the above formulation: power ($P_i$), delay ($D_p$) and process variations ($\lambda$). The parameter $\lambda$ is the variation resistance (robustness) property of the circuit, meaning the ability to meet the timing constraint even in the presence of variations. The problem tries to maximize variation resistance, constrains the delay, even with variations, to be less than the specified timing, and bounds the power value to lie between the $wc_{sizing}$ and $nc_{sizing}$ values. One can favor a power value close to the $nc_{sizing}$ value by maximizing the variation resistance value. Hence, the crisp optimization problem tries to satisfy all three requirements to the maximum degree. It has been shown for problems in other application domains that the above formulation provides the most satisfying optimization solution in the presence of uncertainty [43, 56].

Another issue in the above optimization formulation is that the number of paths in the circuit grows exponentially in the number of gates. Hence, the path based formulation is converted to a node based optimization problem [53, 54]. The node based formulation is a widely used technique [1, 13, 42, 53, 54, 95] to improve the computational efficiency of optimizing large circuits. The gate sizing problem with the node based formulation can be written as

\[
\begin{aligned}
\min \quad & \sum_i P_i \\
\text{s.t.} \quad & a_j \le T_{spec}, \quad \forall j \in input(PO) \\
\text{s.t.} \quad & a_j + D_i \le a_i, \quad \forall i \ \text{and} \ \forall j \in input(i) \\
\text{and} \quad & D_i = a_i - b_i s_i + c_i \sum_{j \in fo(i)} s_j
\end{aligned}
\tag{3.9}
\]

Here $a_j$ in the constraints is the arrival time at node $j$ (distinct from the delay coefficient $a_i$ in the expression for $D_i$) and $T_{spec}$ is the timing specification. In the node based approach, the path based constraints are broken up by using arrival time variables at each node. The number of such constraints is linear and is proportional to the number of interconnects in the circuit. The node based formulation introduces some suboptimality (a decrease in power optimization in this work). However, the decrease is negligible for circuits with less than 20 levels of logic [54].
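The difference between the two formulations can be seen on a small example. The sketch below computes the tightest arrival times satisfying the node based constraints by visiting gates in topological order, and separately counts the paths a path based formulation would have to constrain. The netlist, the gate names and the fixed delays are hypothetical; in the actual optimization the delays depend on the gate sizes.

```python
from graphlib import TopologicalSorter


def arrival_times(fanin, delay):
    """Tightest arrival times with a_j + D_i <= a_i for every input j of gate i.

    fanin maps a gate to the list of gates driving it; delay maps gate -> D_i.
    """
    arrival = {}
    for gate in TopologicalSorter(fanin).static_order():
        inputs = fanin.get(gate, [])
        latest_input = max((arrival[j] for j in inputs), default=0.0)
        arrival[gate] = latest_input + delay[gate]
    return arrival


def count_paths(fanin, outputs):
    """Number of input-to-output paths, i.e. path based timing constraints."""
    memo = {}

    def paths_to(gate):
        if gate not in memo:
            inputs = fanin.get(gate, [])
            memo[gate] = sum(paths_to(j) for j in inputs) if inputs else 1
        return memo[gate]

    return sum(paths_to(out) for out in outputs)


# Hypothetical netlist: g1, g2 feed g3 and g4, which both feed the output g5.
fanin = {"g3": ["g1", "g2"], "g4": ["g1", "g2"], "g5": ["g3", "g4"]}
delay = {"g1": 1.0, "g2": 2.0, "g3": 1.5, "g4": 1.0, "g5": 0.5}

arrival = arrival_times(fanin, delay)
edge_constraints = sum(len(inputs) for inputs in fanin.values())
print("arrival at output:", arrival["g5"])
print("node based constraints:", edge_constraints + 1)  # edges plus one output bound
print("path based constraints:", count_paths(fanin, ["g5"]))
```

Each additional level of reconvergent logic multiplies the path count, while the node based count grows only with the number of connections.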
Since the trend in the nanometer era is towards higher clock speeds and fewer levels of logic, we believe that the overall impact is very small and that the computational benefit in terms of the running time of the optimization justifies such a modification. In the next section, we present the simulation steps and the experimental results of the fuzzy gate sizing approach tested on ITC'99 benchmark circuits.
3.4 Experimental Results

The proposed fuzzy linear programming optimization for gate sizing was tested on ITC'99 benchmark circuits. The complete simulation flow is shown in Figure 3.3. First, the RTL level VHDL netlists are converted to structural level Verilog netlists using the Synopsys Design Compiler tool. The gate level netlist is completely flattened to the basic gates in the standard cell library. The output Verilog file from Design Compiler is then placed and routed using the Cadence Design Encounter tool. The benchmark circuits are synthesized using the TSMC 90nm db, lef and tlf libraries. The placed and routed netlist (DEF file), the library of cell delay information and the Verilog file are given as input to a C script (DEF2AMPL), which converts the netlist into an AMPL based mathematical program format for power minimization using fuzzy gate sizing. AMPL is a widely used modeling language for large scale mathematical programming problems.

The average switching activity on each line was calculated by simulating each of the benchmark circuits with 100,000 random vectors. The equation coefficients for the power and delay models of the standard cell library cells in the TSMC 90nm libraries are characterized for various gate sizes and fanouts using HSPICE simulations. The fit is justified, as the RMS error is less than 7% for a restricted range of gate sizes (1x to 4x). The DEF2AMPL script uses these delay equations to generate the linear programming models for the benchmarks, with delay coefficients set to the mean and to the maximum possible variation (worst case). The maximum variation in gate delay is assumed to be 25% from its mean value [79]. This is translated into appropriate values for the coefficients $b_i$ and $c_i$ in the delay model. The linear optimization problems are solved using the KNITRO solver available through the NEOS server for optimization.
The DEF2AMPL script uses the results of these optimizations and generates a fuzzy nonlinear AMPL model. The fuzzy nonlinear optimization problem is also solved using the KNITRO solver to find the optimal gate sizes in the presence of variations in gate delay. The proposed fuzzy sizing approach is compared with the stochastic programming based gate sizing under uncertainty [54]. The latter method was also implemented with the same setup, parameters and objective functions for fairness in comparison. The power reduction achieved by the fuzzy sizing approach compared to worst case deterministic sizing and the stochastic programming approach is documented in Table 3.1. The worst case sizing results correspond to the delay coefficients set to their maximum variation case. $T_{spec}$ corresponds to the minimum required delay for the circuit, obtained by unconstrained delay optimization. Optimized
power values for worst case gate sizing, stochastic programming based gate sizing and fuzzy programming based gate sizing are shown in columns 4, 5 and 6, and the percentage reduction in power of the fuzzy approach compared to the worst case and stochastic approaches is given in columns 9 and 10 respectively.

Figure 3.3 Fuzzy Gate Sizing: Simulation Flow
[Flow: RTL level VHDL netlist, Synopsys Design Compiler (gate level netlist), Cadence Design Encounter with TSMC 90 nm libraries (DEF, Verilog), DEF2AMPL script with delay coefficients for gates and interconnects, AMPL models with variation parameters set to nominal and worst case values solved with the KNITRO solver, variation aware fuzzy AMPL model built from these bounds and solved with the KNITRO solver, timing yield analysis using Monte Carlo simulations.]

The percentage savings in power compared to deterministic worst case sizing is calculated using the
following equation,

\[
PR_1 = \frac{Power_{DWCGS} - Power_{FGS}}{Power_{DWCGS}} \times 100 \tag{3.10}
\]

Similarly, the percentage improvement of fuzzy sizing compared to stochastic sizing is calculated as

\[
PR_2 = \frac{Power_{SGS} - Power_{FGS}}{Power_{SGS}} \times 100 \tag{3.11}
\]

It can be seen that there are sizable savings in power from using the fuzzy sizing approach as compared to deterministic worst case gate sizing and the stochastic programming approach. The execution time of the fuzzy optimization approach is also shown in Table 3.1.

Table 3.1 Variation Aware Gate Sizing Results on Benchmark Circuits

ITC'99    Number     T_spec   Gate sizing power         CPU time (sec)      % Reduction of FGS over
circuit   of gates   (ns)     DWCGS    SGS      FGS     SGS       FGS       DWCGS     SGS
b11          385     0.25       288     254     232       1.54      1.62    19%        9%
b12          834     0.39       465     397     357       7.12      7.84    23%       10%
b14         4232     2.62      1826    1695    1524      29.18     34.45    16%        9%
b15         4585     2.98      1774    1521    1397      31.2      38.4     21%        8%
b20         8900     2.68      3797    3423    3120      68.14     72.13    18%        9%
b17        21191     3.48      8423    7812    7044     133.5     136.3     16%       10%
b18        43151     4.11     14176   13045   11753     269.55    282.36    17%       10%
Average savings (percent)                                                   18.57%     9.2%

T_spec: Timing Specification; DWCGS: Deterministic Worst Case Gate Sizing; SGS: Stochastic Gate Sizing [54]; FGS: Fuzzy Gate Sizing [This work]

Figure 3.4 Fuzzy Gate Sizing: Execution Time

We also studied the impact of our algorithm
on leakage power using the leakage models proposed in [15]. The fuzzy approach and the stochastic approach, on average, improve leakage power by close to 17% and 8% respectively compared to the deterministic worst case (DWC) approach during variation aware gate sizing. It should be noted that the above result only indicates the impact of our proposed algorithm formulation on leakage power. A multiobjective optimization framework would need to be formulated to efficiently optimize both dynamic and leakage power.

The runtime of the fuzzy logic based optimization is comparable to that of the stochastic programming approach, as seen from Table 3.1. Figure 3.4 also illustrates that the runtime complexity of fuzzy linear programming for gate sizing is close to linear in the number of gates in the circuit. Secondly, the effects of spatial correlation are considered as mentioned in Section 3.3. The variation magnitudes $g_i$ and $h_i$ of the gate delay coefficients $b_i$ and $c_i$ are discretized, and the contribution of each fanout gate is weighted inversely with respect to the number of regions shared between the gates. Figure 3.5 shows the percentage savings of the correlation aware variation modeling when compared to the base fuzzy gate sizing approach. It can be clearly seen that the spatial correlation model eliminates further pessimism in the variation modeling and achieves an average power reduction of 3.4% compared to the base FGS approach.

Figure 3.5 Improvement in Power Savings: FGS With Correlations

Finally, to verify the result of the fuzzy sizing approach, we generated 10000 samples of the ITC'99 benchmarks. The circuits were fixed with the gate size outputs from the fuzzy sizing method, and the gate coefficient values $b_i$ and $c_i$ were assumed to have random variation values in the ranges $b_i$ to $b_i - g_i$ and $c_i$ to $c_i + h_i$ respectively. The variation value was generated from a uniform
distribution between these ranges. We then performed Monte Carlo simulation with these random samples to determine the frequency of timing violations. The fuzzy logic approach had a timing yield of around 99 to 100% for all the benchmark circuits. This confirms that the fuzzy gate sizing approach provides high resistance to process variations without compromising on the power overheads.

3.5 Conclusion

In this chapter, we proposed a new approach for gate sizing considering process variations using fuzzy linear programming. The variations in channel length and oxide thickness are modeled as fuzzy numbers with linear membership functions. The proposed fuzzy gate sizing approach maximizes the variation resistance (robustness) of the circuit, with delay and power as constraints in the formulation. Experimental results on ITC'99 benchmark circuits indicate sizable savings in power and a runtime comparable with that of the stochastic sizing approach. The results, validated using Monte Carlo simulations, confirm the high variation resistance of the circuits sized using the fuzzy programming approach.
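The Monte Carlo validation described in Section 3.4 can be sketched as follows. This is a simplified illustration on a single path rather than a full benchmark circuit: the gate sizes are fixed, the coefficients are drawn uniformly from the ranges $b_i - g_i$ to $b_i$ and $c_i$ to $c_i + h_i$, and the yield is the fraction of samples that meet the timing target. The dictionary layout, the function names and all numbers are hypothetical.

```python
import random


def path_delay(path, coeffs):
    """Sum of a - b*size + c*load over the gates of one path."""
    return sum(gate["a"] - b * gate["size"] + c * gate["load"]
               for gate, (b, c) in zip(path, coeffs))


def timing_yield(path, t_spec, samples=10000, seed=0):
    """Fraction of random samples whose path delay meets t_spec."""
    rng = random.Random(seed)
    passed = 0
    for _ in range(samples):
        coeffs = [(rng.uniform(gate["b"] - gate["g"], gate["b"]),
                   rng.uniform(gate["c"], gate["c"] + gate["h"]))
                  for gate in path]
        if path_delay(path, coeffs) <= t_spec:
            passed += 1
    return passed / samples


# Hypothetical two gate path with fixed sizes and fanout loads.
path = [
    {"a": 10.0, "b": 2.0, "c": 1.0, "g": 0.16, "h": 0.10, "size": 2.0, "load": 3.0},
    {"a": 8.0, "b": 1.5, "c": 0.8, "g": 0.12, "h": 0.08, "size": 3.0, "load": 2.0},
]
nominal = path_delay(path, [(gate["b"], gate["c"]) for gate in path])
worst = path_delay(path, [(gate["b"] - gate["g"], gate["c"] + gate["h"])
                          for gate in path])

for target in (nominal, 0.5 * (nominal + worst), worst):
    print(f"t_spec={target:.3f} yield={timing_yield(path, target):.3f}")
```

A timing target at the worst case corner is met by every sample, while targets closer to the nominal delay trade yield for a tighter specification.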
CHAPTER 4 VARIATION AWARE TIMING BASED PLACEMENT

4.1 Problem Definition

Circuit optimization techniques such as gate sizing, incremental placement and buffer insertion are commonly used to improve the performance of integrated circuits. Timing based incremental placement is crucial in nanometer circuits to meet the high performance requirement. The process finds the optimal locations of cells in a critical sub circuit such that the delay of the circuit is minimized. Circuit designers, over the years, have used corner case models to optimize and analyze designs. The idea is to meet the timing specification at the best, worst and typical case model values. However, with process variations, the above test results can be far from the actual values. A guarded approach in terms of yield, to eliminate the effects of variability, is to perform deterministic optimization at the worst case values of the varying parameters. The worst case approach guarantees high timing yield, but leads to suboptimal solutions in terms of performance. Timing yield in this context is defined as the percentage of chips meeting the timing specification. Optimization at typical case values, on the other hand, guarantees optimal solutions but can result in unacceptable timing yield. It is clear that new methodologies are needed which can guarantee a high timing yield and at the same time provide a solution with a high power/performance ratio.

Several researchers have investigated the effects of variations in timing analysis and statistical design optimization. Static timing analysis was replaced with statistical static timing analysis (SSTA) [11, 74], where continuous distributions are propagated instead of deterministic values to find closed form expressions for performance in the presence of variations. Recently, variation aware gate sizing for improving power and area for an acceptable yield has been investigated in [49, 54, 87].
Thus, the consideration of process variations is important in the design and the optimization of circuits. In this chapter, we propose the use of fuzzy mathematical programming (FMP) and stochastic chance constrained programming for the variation aware timing based incremental placement problem. The uncertainty due to process variations is modeled using fuzzy numbers in the FMP case and using probabilistic constraints in the chance constrained programming formulation. Recently, the authors in [5, 81, 84] have considered process variations while solving the placement problem. The authors in [81] have considered the effects of variations during placement in FPGAs. Variations due to lens aberrations have been considered in [5], and a fuzzy optimization flow for timing variations in [84, 86]. A taxonomy of related works on deterministic and variation aware timing based placement is shown in Figure 4.1.

Figure 4.1 Taxonomy Diagram of Timing Based Placement
[Taxonomy with columns Authors, Year, Methodology and Type (deterministic or variation aware), covering works from 2002 to 2008.]
The problem of timing based incremental placement is an important part of the timing convergence flow. It can be formally defined as the process of finding the optimal locations of cells in a critical sub circuit such that the delay of the circuit is minimized. In timing based placement, the length of interconnects in the critical paths needs to be minimized by changing the locations of certain cells [83, 91]. The timing of a circuit is usually measured in terms of the worst negative slack and the total negative slack. Slack in this context is defined as the difference between the required time and the actual arrival time of the signal.

Timing driven placement approaches can be categorized into net based [83, 97] and path based [6, 9, 19] approaches. The net based approach translates the timing requirements into sensitivity coefficients of timing critical nets and performs a weighted wire length minimization. Hence, modeling the effects of process variations in these net based approaches is not straightforward. On the other hand, the path based approaches hold an accurate timing view and minimize critical path delay more directly by involving path delay constraints in the optimization problem. A problem with the path based approach is its high computational complexity due to the exponential number of paths. However, path based delay constraints can be transformed into node based constraints [9, 87] to improve the feasibility of optimizing large circuits. The transformation only introduces a suboptimality of 1 to 2% [54].

The theory of fuzzy sets and systems, over the years, has been applied in VLSI design automation for high level synthesis [30] and for modeling variations in gate sizing [87]. The uncertainty due to variations can be modeled using fuzzy numbers with linear membership functions. The proposed timing based placement approach is formulated to minimize the worst negative slack of the circuit in the presence of process variations.
The fuzzy optimization approach starts with a deterministic optimization assuming the worst and the average case values for the variation parameters. The results of these deterministic optimizations are used to convert the fuzzy optimization problem into a crisp nonlinear problem using the symmetric relaxation method [64]. The crisp problem formulation has, in general, been shown to provide satisfactory solutions in the presence of imprecision or variations in the coefficients of the constraints or the objective function of the optimization problem [56]. We show that the fuzzy optimization approach improves the variation resistance of the circuit without compromising on the achievable performance.

The stochastic chance constrained programming (CCP) approach is also a well established technique for performing uncertainty aware optimization. It has previously been applied to model process variations during the gate sizing problem [53, 54]. Here, we also perform
variation aware nonlinear timing based placement using stochastic CCP. The stochastic CCP is cast as a robust mathematical program with varying parameters in the constraints of the formulation. The proposed approach uses probabilistic constraints to capture the uncertainty due to process variations. As a preprocessing step, the optimization converts these probabilistic constraints into an equivalent second-order conic program (SOCP) by explicitly using the mean, variance and inverse distribution of the varying parameters. Similar to the crisp fuzzy problem, the translated stochastic SOCP is solved using an interior point nonlinear optimization solver.

The rest of the chapter is organized as follows. In Section 4.2, we motivate why FMP and SOCP are well suited for variation aware optimization. The proposed fuzzy timing based placement and stochastic placement techniques are presented in Sections 4.4 and 4.5 respectively. The experimental results are presented in Section 4.6 and the conclusions in Section 4.7.

4.2 Motivation

Timing based incremental placement improves circuit delay by decreasing the length of the nets in the most critical paths, which are not identified in the global placement flow. Several researchers have investigated the timing based placement problem with deterministic models. However, with the increasing impact of variability in process parameters, there is a strong need to develop optimization approaches with nondeterministic models. Probability density function and cumulative distribution function based techniques have been commonly used to perform uncertainty aware optimization. However, the probabilistic way of propagating and optimizing these uncertainties is computationally expensive, because continuous distributions require complicated multiple integration techniques [20, 30].
The discrete probabilistic representation, on the other hand, can have a huge execution time due to the large number of scenarios that need to be considered. Secondly, the problem of timing based placement is inherently suited to a mathematical programming based optimization formulation. Hence, we investigate fuzzy mathematical programming and stochastic chance constrained programming based techniques for uncertainty aware optimization. Further, both stochastic and fuzzy programming based techniques have been widely used for optimization under uncertainty in several engineering areas. Modeling process variations using these techniques only requires the mean and variance values of the uncertain parameters. Further, Buckley [40] used
Monte Carlo simulation to show that fuzzy programming based optimization guarantees solutions that are better than or at least as good as their stochastic counterparts. In [87], it is shown that fuzzy programming based optimization can produce better solutions than the stochastic chance constrained programming technique. However, the above results were obtained using linear constraints and objective functions in the optimization formulation. In this work, we investigate a nonlinear formulation of both methods for the timing based placement problem.

4.3 Incremental Timing Based Placement

In this section, we first explain the deterministic path-based timing based placement formulation used as the basis of this work. Next, we discuss the fuzzy modeling of variations, the fuzzy mathematical programming formulation and the stochastic formulation for the timing based placement problem. The variation aware timing based placement formulation is explained in detail in the context of the fuzzy programming solution, and only the necessary differences are highlighted in the stochastic placement subsection.

In timing based placement (TBP), the objective is to improve the performance by changing the locations of the critical cells. We use a critical cell selection algorithm, similar to the one proposed in [83], to identify the set of movable cells. The algorithm for the critical cell selection used in this work is shown in Figure 4.2. The algorithm marks cells with different critical ids (crit_id) based on their adjacency to the most critical path. Each critical cell is also marked with a movable distance, which is proportional to its crit_id and the move length. The move length is estimated as a function of the number of gates placed in the neighborhood of the current cell. The problem of incremental TBP can be naturally modeled as a mathematical programming problem.
The crit_id and move distance values set during the preprocessing stages are used as constraints and constants in the timing critical programming formulation. The variables in the timing based placement are the movable cell locations $x_i$ and $y_i$ and the interconnect boundary variables $left_j$, $right_j$, $top_j$ and $bot_j$. Here, we start by describing the changes in interconnect length and load capacitance and then show how these changes affect the delay of the circuit. Assume $x_i$ and $y_i$ to be the new locations of cell $i$; the half perimeter bounding box model for net $j$ then introduces the following constraints.
Figure 4.2 Preprocessing for Incremental Timing Based Placement

\[
left_j \le x_i, \quad right_j \ge x_i, \quad bot_j \le y_i, \quad top_j \ge y_i, \qquad \forall\ \text{cell } i \text{ connected to interconnect } j \tag{4.1}
\]

where $i$ ranges over all the cells connected to net $j$. The wire length of net $j$ is then calculated as

\[
L_j = right_j - left_j + top_j - bot_j. \tag{4.2}
\]
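For fixed cell locations, the bounding box constraints of Equation 4.1 are tight at the extreme cell coordinates, so the wire length of Equation 4.2 reduces to a simple computation. The sketch below illustrates this; the function name and the coordinates are hypothetical.

```python
def hpwl(cell_locations):
    """Half perimeter wire length of one net.

    cell_locations: list of (x, y) positions of the cells on the net.
    """
    xs = [x for x, _ in cell_locations]
    ys = [y for _, y in cell_locations]
    left, right = min(xs), max(xs)   # tightest left_j / right_j
    bot, top = min(ys), max(ys)      # tightest bot_j / top_j
    return (right - left) + (top - bot)


# Hypothetical three-pin net.
net = [(0.0, 0.0), (3.0, 1.0), (1.0, 4.0)]
print(hpwl(net))  # 7.0
```

In the mathematical program itself the boundaries are variables rather than min/max expressions, which keeps the wire length linear in the cell locations.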
Table 4.1 Notations and Terminology

Symbol               Meaning
L_j                  Half perimeter wire length of line j
x_i, y_i             x and y coordinates of cell i
Cp_j                 Output capacitance of line j
c                    Unit capacitance value
r                    Unit resistance value
Cpin_j               Pin capacitance of line j
Snet_j               Transition delay on net j
arr_i, arr_gt_j      Arrival time on net i and gate j
η                    Timing yield of circuit
A0, A1, A2           Fitting coefficients, gate delay
K_D, K_S             Elmore delay constants
Dg_i                 Delay of gate i
Sg_i                 Transition delay of gate i
S_i                  Slew of net i
left_j, right_j      Left and right coordinates of net j
top_j, bot_j         Top and bottom coordinates of net j
Dnet_j               Delay of interconnect j
λ                    Variation resistance value
T_spec               Timing specification
σ                    Standard deviation
B0, B1, B2           Fitting coefficients, transition delay
VA1, VA2, VK_D       Variation coefficients for A1, A2, K_D

The output capacitance $Cp_j$ on net $j$ is given by the sum of the wire capacitance on $j$ and the pin capacitance $Cpin_j$,

\[
Cp_j = c\,L_j + Cpin_j, \tag{4.3}
\]

where $c$ is the constant denoting the unit capacitance value. The gate delay and transition functions are linear functions of the output load capacitance $Cp_i$ and the input slew $S_i$,

\[
Dg_i = A_0 + A_1 S_i + A_2 Cp_i, \tag{4.4}
\]
\[
Sg_i = B_0 + B_1 S_i + B_2 Cp_i, \tag{4.5}
\]

where the constants $A_0$, $A_1$, $A_2$, $B_0$, $B_1$ and $B_2$ are the fitting coefficients for the different gates in the library. These coefficients also differ for different inputs of a gate and depend on whether the transition is falling or rising. Finally, the interconnect delay and transition time on net $j$ are given by

\[
Dnet_j = K_D\, r L_j \left( \frac{c L_j}{2} + Cpin_j \right), \tag{4.6}
\]
\[
Snet_j = K_S\, r L_j \left( \frac{c L_j}{2} + Cpin_j \right), \tag{4.7}
\]

where $r$ is the unit resistance and $K_D$, $K_S$ are the Elmore delay constants 0.69 and 2.2 respectively. A complete list of the notations and terminology used in the equations of the timing based placement, fuzzy and stochastic programming context is given in Table 4.1. The above equations for interconnect and gate
delays denote the base (nominal) delay used in this work. In accordance with recent statistical design optimization works, the uncertain parameters can be modeled as

\[
D = d_i + \sum_{j=1}^{n} d_j X_j + d_r X_r, \tag{4.8}
\]

where $d_i$ is the nominal delay and $X_j$ and $X_r$ are the random parameters determining the correlated and independent variations respectively. The magnitude of these variations is given by the variables $d_j$ and $d_r$, which are determined from extensive simulations. In this work, we capture these variations using the concept of fuzzy numbers and probabilistic constraints. We treat the delay of a gate or an interconnect as a value with upper and lower bounds. In other words, each gate's delay is now a triangular value of the form (average, low, high). The spatial correlations differ between different areas of the chip and are captured by dividing the circuit area into $n$ regions, as in [35]. Further, we model the uncertainty in the gate delay equation due to the variations in the net length using the fitting coefficients $A_1$ and $A_2$. These coefficients are modeled as triangular values with upper and lower bounds representing the deviation from the nominal value. Secondly, the uncertainty in the interconnect delay is modeled by abstracting the Elmore delay constant ($K_D$) as an interval value. These coefficients are of triangular form with a linearly varying membership function. Next, we explain the proposed fuzzy timing based placement approach for optimization in the presence of process variations.

4.4 Variation Aware Fuzzy Timing Based Placement

The concept of fuzzy sets was introduced by Zadeh in 1965, where an element's membership in a set can be any value within the range [0, 1]. Along with fuzzy sets, Bellman and Zadeh [64] also introduced fuzzy programming based optimization.
Since then, several models and approaches have been proposed in the literature for uncertainty management based on fuzzy linear programming, fuzzy multiobjective programming, fuzzy dynamic programming, fuzzy integer programming, possibilistic programming and fuzzy nonlinear programming [56]. The reader is referred to Sakawa [56] and Zadeh [64] for a detailed treatment of the basics of fuzzy mathematical programming (FMP).

An important step in fuzzy programming is the modeling of uncertainty. In this work, we model the uncertainty due to process variations using fuzzy numbers with linear membership functions. In recent statistical optimization works [49, 74], these variations are modeled as normally distributed random variables with zero mean and standard deviation $\sigma$. In this work, we extract the values of the mean and $3\sigma$ to model these variations as interval valued fuzzy numbers. The $3\sigma$ value is assumed to be the deterministic worst case variation value, meaning that all the random parameters are set to $3\sigma$ for maximum timing yield in the worst case bound deterministic optimization.

Fuzzy numbers are defined using possibilistic distributions. In the context of FMP, these are analogous to linear membership functions [30]. Triangular, trapezoidal and other nonlinear membership functions can be used for modeling uncertainty while solving FMP problems. The nonlinear membership functions can be used for more accurate modeling [56]. For simplicity, we model the uncertainty due to process variations using triangular membership functions. The variations are represented using a triple $X = (x_m, x_l, x_u)$, where $x_m$ is the most possible value or the mean value and $x_l$, $x_u$ are the lower and upper bounds. Depending on the context, the value $x_l$ can be a pessimistic or optimistic variation from the mean value $x_m$, and the same holds for $x_u$. In the context of VLSI circuit optimization, the triple $L_{eff} = (L^m_{eff}, L^l_{eff}, L^u_{eff})$ can be used to model the variations in channel length. The pessimistic value in this case is the upper bound $L^u_{eff}$, which equals $L^m_{eff} + 3\sigma$. Since the effective length is inversely proportional to our objective performance, the value $L^m_{eff} + 3\sigma$ is the upper bound or worst case variation value.
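A minimal sketch of how a nominal delay from Equations 4.3, 4.4 and 4.6 could be wrapped as a triangular fuzzy number $(x_m, x_l, x_u)$ with a linear membership function is given below. The function names, the symmetric spread and all numeric values are assumptions made for illustration; only the delay expressions and the constant 0.69 come from the text.

```python
K_D = 0.69  # Elmore delay constant quoted in the text


def load_capacitance(c, length, c_pin):
    """Equation 4.3: wire capacitance plus pin capacitance."""
    return c * length + c_pin


def gate_delay(a0, a1, a2, slew, c_load):
    """Equation 4.4: linear in input slew and output load."""
    return a0 + a1 * slew + a2 * c_load


def net_delay(r, c, length, c_pin, k_d=K_D):
    """Equation 4.6: Elmore delay of an interconnect."""
    return k_d * r * length * (c * length / 2 + c_pin)


def triangular(mean, max_variation):
    """Triple (x_m, x_l, x_u); assumes a symmetric, nonzero spread."""
    return (mean, mean - max_variation, mean + max_variation)


def membership(x, triple):
    """Linear (triangular) membership: 1 at x_m, 0 at and beyond the bounds."""
    x_m, x_l, x_u = triple
    if x <= x_l or x >= x_u:
        return 0.0
    if x <= x_m:
        return (x - x_l) / (x_m - x_l)
    return (x_u - x) / (x_u - x_m)


# Hypothetical coefficients and load.
nominal = gate_delay(1.0, 2.0, 3.0, slew=0.5, c_load=2.0)
fuzzy_delay = triangular(nominal, max_variation=2.0)
print(fuzzy_delay, membership(7.0, fuzzy_delay))
```

A trapezoidal or nonlinear membership function would only change `membership`; the triple representation stays the same.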
The fuzzy programming solution methodology can be explained with a simple linear programming problem of the form

\[
\begin{aligned}
\text{maximize} \quad & \sum_{i=1}^{n} a_i x_i \\
\text{subject to} \quad & \sum_{i=1}^{n} b_{ji}\, x_i \le c_j, \quad 1 \le j \le m,
\end{aligned} \tag{4.9}
\]

where $m$ is the number of constraints, $n$ is the number of variables, $c_j$ is the limit of constraint $j$, $b_{ji}$ and $a_i$ are constant regression coefficients, $x_i$ is a variable in the optimization formulation, and at least one $x_i > 0$. In the above optimization problem, the coefficient $b_{ji}$ is an interval valued fuzzy number with a mean value $b_{ji}$ and a maximum variation of $d_{ji}$. The upper bound is assumed to be the pessimistic variation for this fuzzy number. The fuzzy number $b_{ji}$ is also assumed to vary linearly with the value of the variable $x_i$. As a preprocessing step, the fuzzy programming problem is solved with
the variation parameters set to their worst and typical case values. The results of these deterministic optimizations and the symmetric relaxation theorem are used to convert the fuzzy program into a crisp program (Equation 4.10), which represents an optimal solution in the presence of variations. The conversion of the fuzzy program into a crisp program is also referred to as the defuzzification step.

\[
\begin{aligned}
\text{maximize} \quad & \lambda \\
\text{subject to} \quad & \lambda\,(Obj_l - Obj_u) - \sum_{i=1}^{n} a_i x_i + Obj_u \le 0, \\
& \sum_{i=1}^{n} (b_{ji} + \lambda\, d_{ji})\, x_i - c_j \le 0, \quad 1 \le j \le m, \\
& x_i \ge 0, \quad 0 \le \lambda \le 1, \quad 1 \le i \le n.
\end{aligned} \tag{4.10}
\]

Here, $Obj_l$ and $Obj_u$ are the objective values obtained from the two deterministic optimizations, and $\lambda$ is referred to as the variation resistance parameter. The crisp problem is formulated to maximize the variation resistance parameter while keeping the original optimization objective within the deterministic optimization bounds. The solution of the problem can be interpreted as representing an overall degree of satisfaction in the presence of varying parameters [56]. In the following, we explain the fuzzy programming formulation for the timing based incremental placement problem in the presence of uncertainty due to process variations.

In this work, we perform timing minimization by changing the locations of the critical cells in the presence of variations. The problem minimizes the worst negative slack with the location constraints given in Equation 4.1 and the delay constraints given in Equations 4.4, 4.5, 4.6 and 4.7. The deterministic formulation of the incremental TBP problem can be shown as

\[
\begin{aligned}
\text{minimize} \quad & T_{spec} \\
\text{subject to} \quad & arr_{net} \le T_{spec}, \quad \forall\ net \in EP, \\
& arr_j \ge arr_{gt_i} + Dg_{gt_i}, \quad \text{where } j \in out(gt_i), \\
& arr_{gt_i} \ge arr_{gt_{ik}} + Dnet_{gt_{ik}}, \quad \forall\ k \in ip(gt_i),
\end{aligned} \tag{4.11}
\]

where $T_{spec}$ is the worst negative slack or the critical path delay of the circuit and $EP$ is the set of end points of the circuit, which are the primary outputs in the case of a combinational circuit and the inputs to flip-flops in the case of a sequential circuit. The $arr_{net_{id}}$ and $arr_{gt_{id}}$ denote the arrival time variables
for each interconnect and gate, which are replicated for the whole circuit along with the gate and interconnect delay equations. The delay terms $Dg_i$ and $Dnet_j$ are expanded as shown in Equations 4.4 and 4.6. In addition to the above constraints, the location constraints (Equation 4.1) are also part of the optimization formulation; they are not shown here, since they are not affected by the variation aware formulation. The fuzzy version of the above deterministic optimization formulation with uncertain parameters is given as follows,

\[
\begin{aligned}
\text{minimize} \quad & T_{spec} \\
\text{subject to} \quad & arr_{net} \le T_{spec}, \quad \forall\ net \in EP, \\
& arr_j \ge arr_{gt_i} + \widetilde{Dg}_{gt_i}, \quad \text{where } j \in out(gt_i), \\
& arr_{gt_i} \ge arr_{gt_{ik}} + \widetilde{Dnet}_{gt_{ik}}, \quad \forall\ k \in ip(gt_i),
\end{aligned} \tag{4.12}
\]

where the coefficients $\widetilde{Dg}_i$ and $\widetilde{Dnet}_j$ are the uncertain parameters. The uncertain parameters are modeled as fuzzy number triples of the form $(Dg_i,\ Dg_i - \mu_i,\ Dg_i + \mu_i)$ and $(Dnet_i,\ Dnet_i - \nu_i,\ Dnet_i + \nu_i)$, where $\mu_i$ and $\nu_i$ are the maximum variations of the nominal gate delay $Dg_i$ and interconnect delay $Dnet_i$ respectively. The fuzzy problem is then transformed into a crisp nonlinear problem using the following steps. The deterministic optimization is performed initially with the varying coefficients set to the worst and average case values of the fuzzy number. In the worst case optimization, the delay equations in the fuzzy optimization problem are replaced with the following equations.

\[
\begin{aligned}
Dg_i &= A_0 + (A_1 + VA_1)\, S_i + (A_2 + VA_2)\, Cp_i, \\
Dnet_j &= (K_D + VK_D)\, r L_j \left( \frac{c L_j}{2} + Cpin_j \right),
\end{aligned} \tag{4.13}
\]

where $VA_1$, $VA_2$ and $VK_D$ represent the variation values applied to these coefficients to represent the worst case variation in gate and interconnect delay. The gate and interconnect delays in the above equations are the pessimistic estimates, resulting in high delay values. Similarly, the typical or nominal value of the gate delay is the case where the fuzzy numbers are fixed to their average values. In the nominal case optimization, the fuzzy delay equations in the fuzzy problem are replaced with the
following equations.

\[
\begin{aligned}
Dg_i &= A_0 + A_1 S_i + A_2 Cp_i, \\
Dnet_j &= K_D\, r L_j \left( \frac{c L_j}{2} + Cpin_j \right).
\end{aligned} \tag{4.14}
\]

The deterministic optimization problem is solved with the delay equations set to these worst case and nominal case equations. The KNITRO optimization solver, available through the NEOS optimization server, is used to solve these nonlinear programming problems. The results of these deterministic optimizations correspond to the worst negative slack of the worst case timing setting ($wc_{tbp}$) and of the nominal case timing setting ($nc_{tbp}$). Using these bounds and a new variation parameter $\lambda$, the fuzzy optimization problem is transformed into a crisp nonlinear programming problem using the symmetric relaxation method [64]. The incremental TBP problem in the presence of process variations is converted to its corresponding crisp formulation as shown in Equation 4.15.

\[
\begin{aligned}
\text{maximize} \quad & \lambda \\
\text{subject to} \quad & \lambda\,(wc_{tbp} - nc_{tbp}) + T_{spec} - wc_{tbp} \le 0, \\
& arr_{net} \le T_{spec}, \quad \forall\ net \in EP, \\
& arr_j \ge arr_{gt_i} + Dg_{gt_i}, \quad \text{where } j \in out(gt_i), \\
& arr_{gt_i} \ge arr_{gt_{ik}} + Dnet_{gt_{ik}}, \quad \forall\ k \in ip(gt_i), \\
& Dg_i = A_0 + (A_1 + VA_1 \lambda)\, S_i + (A_2 + VA_2 \lambda)\, Cp_i, \\
& Dnet_j = (K_D + VK_D \lambda)\, r L_j \left( \frac{c L_j}{2} + Cpin_j \right),
\end{aligned} \tag{4.15}
\]

where the parameter $\lambda$ is bounded by 0 and 1. The first constraint bounds the delay as $T_{spec} \le wc_{tbp} - \lambda\,(wc_{tbp} - nc_{tbp})$, so that a larger $\lambda$ moves the delay bound from $wc_{tbp}$ toward $nc_{tbp}$. The spatial correlations can be incorporated by making the coefficients $VA_1$, $VA_2$ and $VK_D$ in the delay equations a function of the locations of the current gate and its fanout gates in the chip. The chip area can be partitioned into $n$ areas as in [35], such that the gates within the same block have high correlation. Even though the parameter $\lambda$ can take any value between 0 and 1, it can easily be bounded to a smaller range in the TBP. Here, we bound the $\lambda$ value to be between 0.3 and 0.7. We estimated that such a smaller bound is sufficient due to the dual requirement of high
yield and high performance for the timing based placement optimization in the presence of variations. The smaller bound improves the performance of the fuzzy optimization problem by a factor of 23x, without any effect on the solution optimality. The crisp optimization problem has two variables in the cost function, namely the delay and the variation parameter $\lambda$. The parameter $\lambda$ is the variation resistance (robustness) property of the circuit, meaning the ability to meet the timing constraint even in the presence of variations. The problem tries to maximize the variation resistance and bounds the delay value to lie between the $wc_{tbp}$ and $nc_{tbp}$ values, favoring a delay value close to $nc_{tbp}$ as the variation resistance of the circuit is maximized. It has been proved for problems in other domains that the above formulation provides the most satisfying solution for optimization in the presence of variations [56]. Finally, a timing driven legalization is performed to remove cell overlaps in the circuit. In the next section, we explain the stochastic placement formulation, with only the necessary changes in the variation aware modeling of the placement problem.

4.5 Stochastic Timing Based Placement

In this section, we describe our formulation of the stochastic timing based placement optimization technique. The formulation is cast as a robust mathematical program, which is then reformulated into an equivalent second order conic program (SOCP). The SOCP analytically captures the dependence of the constraints and objectives of the optimization using the mean and variance of the uncertain parameters. The advantage of the stochastic formulation is the ability to consider the uncertainty of the constraints in an explicit fashion. The stochastic chance constrained programming technique models the uncertainty using probabilistic constraints.
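As a brief numerical aside on the fuzzy formulation of Section 4.4, the trade-off controlled by the variation resistance parameter can be illustrated on a toy instance. The sketch assumes that the crisp delay bound is read as T_spec <= wc_tbp - lambda (wc_tbp - nc_tbp) and that a single fixed path has a delay growing linearly with lambda; the function name, the bisection search and the numbers are hypothetical and stand in for the nonlinear solver used in this work.

```python
def max_variation_resistance(nominal_delay, max_variation,
                             lo=0.0, hi=1.0, tol=1e-9):
    """Largest lambda in [lo, hi] for which a single-path toy problem is feasible."""
    nc_tbp = nominal_delay                   # nominal case delay bound
    wc_tbp = nominal_delay + max_variation   # worst case delay bound

    def feasible(lam):
        # The path delay grows with lambda (more variation is absorbed) ...
        path_delay = nominal_delay + lam * max_variation
        # ... while the allowed delay shrinks from wc_tbp toward nc_tbp.
        t_spec_limit = wc_tbp - lam * (wc_tbp - nc_tbp)
        return path_delay <= t_spec_limit

    if not feasible(lo):
        return None
    if feasible(hi):
        return hi
    while hi - lo > tol:          # bisection on the feasibility boundary
        mid = (lo + hi) / 2
        if feasible(mid):
            lo = mid
        else:
            hi = mid
    return lo


lam = max_variation_resistance(nominal_delay=1.0, max_variation=0.5)
print(round(lam, 6))
```

For this linear toy case the two effects balance at the midpoint of the range; in the actual placement problem the cell locations are also free variables, so the balance point depends on how much wire length the solver can recover.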
In the stochastic formulation, the probabilistic constraints are converted to an equivalent SOCP with mean $\mu$ and standard deviation $\sigma$. The mean corresponds to the nominal delay value, as explained in the fuzzy modeling section, and the standard deviation is chosen in accordance with the values in [79]. The SOCP formulation also has an inverse cumulative distribution function (cdf) term, which controls the yield of the optimization problem. The estimated timing yield of the optimized circuit increases with the value of the inverse cdf term. We use an inverse cdf value of 3 (corresponding to the parameter $3\sigma$ in the fuzzy formulation) to achieve a 99.7% timing yield. The main difference of the stochastic placement formulation compared to the fuzzy placement technique
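described in the previous section is the arrival time constraints with gate delay ($Dg$) and interconnect delay ($Dnet$), which are assumed to vary due to process variations. The deterministic version of the timing based placement's arrival time constraints can be shown as

\[
\begin{aligned}
& arr_j \ge arr_{gt_i} + Dg_{gt_i}, \quad \text{where } j \in out(gt_i), \\
& arr_{gt_i} \ge arr_{gt_{ik}} + Dnet_{gt_{ik}}, \quad \forall\ k \in ip(gt_i).
\end{aligned} \tag{4.16}
\]

Here, the incremental interconnect arrival time constraint bounds the arrival time of an interconnect output ($j$) to be greater than or equal to the sum of the arrival time of the gate ($gt_i$) and the gate delay ($Dg_{gt_i}$). The second constraint bounds the arrival time of a gate ($gt_i$) to be greater than or equal to the sum of the arrival time of its inputs $arr_{gt_{ik}}$ and the interconnect delay $Dnet_{gt_{ik}}$. A separate constraint is added to the formulation for each input $k$ of gate ($gt_i$). The above two constraints are added for all the gates and interconnects of the circuit. The stochastic formulation of the above deterministic arrival time constraints can be shown as

\[
\begin{aligned}
& P\left( arr_j \ge arr_{gt_i} + Dg_{gt_i} \right) \ge \eta, \quad \text{where } j \in out(gt_i), \\
& P\left( arr_{gt_i} \ge arr_{gt_{ik}} + Dnet_{gt_{ik}} \right) \ge \eta, \quad \forall\ k \in ip(gt_i).
\end{aligned} \tag{4.17}
\]

The formulation is based on chance constrained programming, with roots in stochastic programming, in which the constraint has to be met with a probability of $\eta$. In the context of timing based placement, the parameter $\eta$ corresponds to the timing yield of the circuit. An equivalent formulation of the above probabilistic constraints using the mean or nominal value $\bar{D}$, the cumulative distribution function (cdf, $\phi$) and the standard deviation ($\sigma$) can be shown as

\[
\begin{aligned}
& arr_j \ge arr_{gt_i} + \overline{Dg}_{gt_i} + \phi^{-1}(\eta)\, \sigma_{Dg_{gt_i}}, \quad \text{where } j \in out(gt_i), \\
& arr_{gt_i} \ge arr_{gt_{ik}} + \overline{Dnet}_{gt_{ik}} + \phi^{-1}(\eta)\, \sigma_{Dnet_{gt_{ik}}}, \quad \forall\ k \in ip(gt_i).
\end{aligned} \tag{4.18}
\]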
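The replacement of a probabilistic delay term by its mean plus a multiple of its standard deviation can be sketched numerically as follows. The sketch assumes a Gaussian delay distribution, which the text does not state explicitly, and uses hypothetical delay values; `NormalDist.inv_cdf` from the Python standard library plays the role of the inverse cdf term.

```python
from statistics import NormalDist


def inverse_cdf_multiplier(eta):
    """phi^{-1}(eta) for a standard normal delay model (assumption)."""
    return NormalDist().inv_cdf(eta)


def deterministic_delay(mean_delay, sigma_delay, eta):
    """Delay value that replaces the random delay in the arrival time constraint."""
    return mean_delay + inverse_cdf_multiplier(eta) * sigma_delay


# Hypothetical gate: 1.0 ns mean delay, 0.1 ns standard deviation.
for eta in (0.5, 0.9, 0.99865):
    print(eta, round(deterministic_delay(1.0, 0.1, eta), 4))
```

Under the Gaussian assumption a multiplier of 3 corresponds to a one-sided probability of about 99.87%, while the 99.7% figure is the familiar two-sided three-sigma coverage; either way the multiplier acts as a guard band that grows with the target yield.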
In Equation 4.18, the probabilistic constraint for the gate delay is replaced with the sum of the mean gate delay $\overline{Dg}_{gt_i}$ and the product of its standard deviation $\sigma_{Dg_{gt_i}}$ and an inverse cdf value $\phi^{-1}(\eta)$ for high timing yield. Similarly, the constraint for the interconnect delay is also replaced with its equivalent mean, cdf and standard deviation values. The above constraint is qualitatively different from the deterministic one, since it considers both the mean and variance of the delay values as the decision variables of the optimization problem. The complete stochastic CCP based timing based placement formulation can be shown as

\[
\begin{aligned}
\text{minimize} \quad & T_{spec} \\
\text{subject to} \quad & arr_{net} \le T_{spec}, \quad \forall\ net \in EP, \\
& arr_j \ge arr_{gt_i} + \overline{Dg}_{gt_i} + \phi^{-1}(\eta)\, \sigma_{Dg_{gt_i}}, \quad \text{where } j \in out(gt_i), \\
& arr_{gt_i} \ge arr_{gt_{ik}} + \overline{Dnet}_{gt_{ik}} + \phi^{-1}(\eta)\, \sigma_{Dnet_{gt_{ik}}}, \quad \forall\ k \in ip(gt_i).
\end{aligned} \tag{4.19}
\]

Figure 4.3 Process Variation Aware Incremental Placement

As mentioned in the previous section, in addition to the timing constraints shown above, the location constraints (Equation 4.1) are also part of the stochastic optimization formulation. The SOCP formulation with mean and variance values can also be efficiently solved using an interior point optimization solver. It was estimated from simulations that the inverse cdf value $\phi^{-1}(\eta)$ for high timing
yield can be substituted with a value of 3 for a predicted timing yield of 99.7%. A simple outline of the steps involved in the fuzzy and stochastic approaches is shown in Figure 4.3.

4.6 Experimental Results

In this section, we present the simulation flow and experimental results of the proposed fuzzy programming based timing placement and compare it with the stochastic and worst case process variation approaches. The variation aware placement approaches were tested on ITC'99 benchmark circuits. The complete simulation flow of the proposed fuzzy and stochastic approaches is shown in Figure 4.4. First, the RTL level VHDL netlists are converted to structural level Verilog netlists using the Synopsys Design Compiler tool. The output Verilog file from the Design Compiler is then placed and routed using the Cadence Design Encounter tool. Design Encounter is also used to perform clock synthesis and timing analysis of the input netlist. The placed and routed netlist (DEF file), the timing analysis report (TARPT) and the Verilog file are given as inputs to a C script (DEF2AMPL), which converts the netlist into an AMPL based mathematical program format for timing based placement optimization. AMPL is a widely used modeling language for large scale mathematical programming problems. Depending on the options given, the DEF2AMPL script generates the worst case deterministic, typical case deterministic, stochastic or fuzzy version of the timing based placement problem. As a preprocessing step, the DEF2AMPL script also generates the coefficients $A_0$, $A_1$, $A_2$, $B_0$, $B_1$ and $B_2$ using interpolation. The script selects the maximum allowable displacement for cells depending on the circuit area and the gate's criticality. The list of values for the interpolation is generated from the Design Encounter tool.
The maximum variation in gate and interconnect delay due to the varying process parameters is assumed to be 25% from the mean value, which is in accordance with the results in [12, 79]. This is translated to the appropriate values of the variation parameters for $A_1$, $A_2$ and $K_D$, as mentioned in the previous section. The mathematical programming problems are solved using the KNITRO nonlinear optimization solver available through the NEOS server for optimization. The results of the deterministic nominal and worst case optimizations are also fed to the DEF2AMPL script to generate the bound constraints in the fuzzy nonlinear AMPL model. The fuzzy and the stochastic optimization problems find the optimal
[Figure 4.4 flow: input RTL level VHDL netlist; Synopsys Design Compiler (gate level Verilog netlist); Cadence Design Encounter with TSMC LEF/TLF/DB libraries (placed and routed netlist; DEF, TARPT, SPEF files); DEF2AMPL script for mathematical program generation for timing based placement (gates/nets, delay coefficients); nominal case, worst case, stochastic case and fuzzy AMPL models, the fuzzy model with bounds from the nominal/worst case optimizations; each model solved with the KNITRO solver; output: cell locations.]

Figure 4.4 Variation Aware Timing Based Placement: Simulation Flow
Table 4.2 Variation Aware Placement Results on Benchmark Circuits

ITC'99    Total    Movable   Worst Negative Slack (ns)     % Improvement
Circuit   gates    cells     DWCTBP    STBP     FTBP       Fuzzy vs WC   Stoc vs WC
b13       309      42        0.121     0.106    0.104      14.1%         12.4%
b11       385      79        0.143     0.127    0.125      12.5%         11.1%
b12       834      98        0.210     0.189    0.185      11.9%         10.0%
b14       3651     1099      3.87      3.43     3.36       13.1%         11.3%
b15       6452     665       2.72      2.47     2.41       11.4%         9.19%
b20       8900     665       5.65      4.99     4.92       12.9%         11.6%
b22       12128    891       7.22      6.68     6.33       12.3%         7.47%
b17       21191    1280      8.61      7.64     7.52       12.6%         11.2%
Average percent timing improvement                         12.6%         10.53%

Legend: WNS: Worst Negative Slack; FTBP: Fuzzy Timing Based Placement; DWC: Deterministic Worst Case; STBP: Stochastic Timing Based Placement.

cell locations in the presence of variations. Finally, timing aware cell legalization is performed to remove overlaps.

The timing improvement achieved by the fuzzy placement approach compared to the worst case deterministic placement is documented in Table 4.2. The worst case placement results correspond to the delay coefficients set to their maximum variation case. The percentage improvement of the fuzzy approach compared to the deterministic worst case approach is calculated as

\[
PR_1 = \frac{WCTBP - FTBP}{WCTBP} \times 100. \tag{4.20}
\]

It can be seen that there is a savings of around 12% in worst negative slack by using the fuzzy placement approach as compared to the deterministic worst case optimization. Secondly, we also present the results of the stochastic optimization framework in Table 4.2. The percentage improvement of the stochastic placement approach compared to the worst case setting is calculated as

\[
PR_1 = \frac{WCTBP - STBP}{WCTBP} \times 100. \tag{4.21}
\]

It can be seen that the stochastic placement approach improves the timing by around 10% as compared to the deterministic worst case optimization at the 99.7% timing yield level.
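The percentage improvement of Equations 4.20 and 4.21 can be recomputed from the slack columns of Table 4.2 with a few lines of code. The function name is hypothetical and only two rows of the table are used for illustration.

```python
def percent_improvement(wc_tbp, tbp):
    """Equations 4.20 / 4.21: relative reduction with respect to the worst case result."""
    return (wc_tbp - tbp) / wc_tbp * 100.0


# (circuit, DWCTBP, STBP, FTBP) in ns, copied from Table 4.2.
rows = [
    ("b13", 0.121, 0.106, 0.104),
    ("b22", 7.22, 6.68, 6.33),
]

for name, wc, stbp, ftbp in rows:
    fuzzy = percent_improvement(wc, ftbp)
    stoch = percent_improvement(wc, stbp)
    print(f"{name}: fuzzy {fuzzy:.2f}%, stochastic {stoch:.2f}%")
```

Because the slack values in the table are rounded, the recomputed percentages can differ from the tabulated ones in the last digit.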
As predicted by Buckley in [40], the fuzzy optimization approach outperforms the stochastic programming technique even with nonlinear constraints in the formulation. Finally, to verify the timing yield of the fuzzy based placement approach, we generate multiple samples of the ITC benchmark circuits. The sample
instances of the benchmark circuits are fixed with the placement locations output by the fuzzy method, and the coefficients of delay are assumed to have random variation values. The variation value is generated from a uniform distribution between the minimum and maximum variation values used in the optimization. We then performed Monte Carlo simulation of these random instances to determine the frequency of timing violations, i.e., the number of times the delay of the random circuit is greater than the specified timing ($T_{spec}$). The fuzzy logic approach had a timing yield of around 99-100% for all the benchmark circuits. This confirms that FMP is an efficient approach for designing circuits with high yield without sacrificing much performance.

4.7 Conclusion

In this chapter, we described a formulation of the variation aware timing based placement problem using fuzzy and stochastic approaches. The uncertainties due to process variations in these formulations are modeled as fuzzy numbers and probabilistic constraints respectively. The coefficients in the gate and interconnect delay terms of the arrival time constraints are assumed to vary in the optimization formulation. The proposed variation aware timing based placement maximizes the variation resistance (robustness) of the circuit, with the timing information represented as constraints. Experimental results on ITC'99 benchmark circuits indicate an average savings of around 12% for fuzzy programming and 10% for stochastic programming compared to the worst case deterministic approach. Monte Carlo simulations used to validate the results also confirm a high timing yield for circuits designed with the variation aware techniques.
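The Monte Carlo yield check described in Section 4.6 can be sketched on a toy scale as follows. The sketch assumes a single path whose stage delays each receive an independent uniform variation; the function name, the sample count and the delay values are hypothetical, and a real check would re-time the whole placed circuit for every sample.

```python
import random


def timing_yield(nominal_delays, max_fraction, t_spec, samples=10000, seed=0):
    """Fraction of random instances whose path delay meets t_spec."""
    rng = random.Random(seed)  # fixed seed so the experiment is repeatable
    met = 0
    for _ in range(samples):
        # Each stage delay varies uniformly within +/- max_fraction of nominal.
        path_delay = sum(
            d * (1.0 + rng.uniform(-max_fraction, max_fraction))
            for d in nominal_delays
        )
        if path_delay <= t_spec:
            met += 1
    return met / samples


stages = [1.0, 1.0, 1.0]  # hypothetical nominal stage delays in ns
for t_spec in (3.0, 3.4, 3.75):
    print(t_spec, timing_yield(stages, 0.25, t_spec))
```

A specification equal to the nominal path delay yields roughly half of the instances, while a specification at the worst case corner is met by all of them; the variation aware optimizations aim for a point between these two that still gives a high yield.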
CHAPTER 5
VARIATION AWARE BUFFER INSERTION

5.1 Introduction

In the nanometer era, it is crucial to consider area, power and process variation metrics in the optimization formulation. Further, interconnects have become longer and net delay has become more dominant than logic delay. The interconnect delay is, to first order, proportional to the square of the length of the wire. This has increased the importance of interconnect driven performance optimization techniques such as buffer insertion/sizing, wire sizing/spacing and driver sizing. Of these techniques, buffer insertion has effectively been able to divide the wires into smaller segments and bring the wire delay to almost linear in terms of its length. Further, it has also been estimated in [62] that 35% of the total standard logic cells in a circuit will be buffers at the 65nm technology level. Therefore, it is highly important to find the optimal number of buffers for low overhead timing optimization.

Several researchers have proposed buffer insertion techniques, and they can be mainly classified as net-based [16, 93], path-based [23, 29] and network-based [31, 85, 96] techniques. In the net-based approach, buffers are inserted in nets to create positive slack at the source. Even with criticality based net-ordering mechanisms, it may lead to suboptimal over-buffering due to the lack of a global view. The path-based buffer insertion algorithms abstract a path as a routing tree and insert buffers on it to minimize the critical path delay [23]. The approach achieves a larger reduction in buffer cost compared to net-based approaches, but still suffers from the lack of a global view, as it considers each path independently. Because of their greedy approach, earlier processed nets/paths can over-consume buffers, resulting in a nonoptimal solution [29].
Circuit wise b uf fer insertion techniques, on the other hand tak es a whole circuit as an input instead of an indi vidual net or path. The rst such approach in, [31], uses lar grangian relaxation techniques b ut suf fers from unrealistic assumptions of at least one b uf fer in each interconnect. The netw orkbased approach in [96] uses a piecewise linear programming formulation to model the nonlinear delay impro v ements of the b uf fer insertion problem. Ho we v er 58
PAGE 69
the above approaches do not consider variability in their formulation and hence are not suitable for optimizing designs in the nanometer regime. Variation aware techniques for buffer insertion have also been proposed in [36, 45]. However, the approaches in [36, 45] were based on traditional net-based techniques propagating continuous distributions and hence can produce over-buffered nonoptimal solutions [23].

Here, we propose a new fuzzy optimization approach for variation aware simultaneous buffer insertion and driver sizing (BIDS) using a network-based algorithm. The deterministic version of the proposed BIDS algorithm is formulated to minimize the cumulative sum of buffers inserted and gate sizes. Delay constraints are modeled using required time variables at each node, using a node based formulation as in [87]. The delay constraints for buffer insertion and driver sizing are modeled as a piecewise linear function of the buffer/driver types in the library. The optimization engine inserts zero or more buffers and increases driver sizes in each interconnect, considering the impact on the circuit's critical path delay and the resource (buffer, driver/gate) cost efficiency.

Secondly, the previous works in buffer insertion are all performed after the placement stage, where only incremental changes are possible. With increasing circuit complexity, it is becoming necessary to perform variation aware optimization early in the design phase. In this chapter, we propose the use of a process variation aware circuit-wise buffer insertion and driver sizing formulation at the logic level. Most importantly, buffer insertion at the logic level requires careful abstraction of wire length, which is only available at the post placement stage. Here, we adopted an accurate and fast interconnect length prediction technique at the logic level, taking into account the number of cells/interconnects and the fanout of each cell.
The technique is a lookup table based wire length prediction and is similar to the one proposed in [38]. Further, solutions obtained from optimizations at the logic level can also be used as an estimate for planning during the layout level optimizations.

The fuzzy optimization approach, as a preprocessing step, initially performs deterministic optimization with the variation parameters set to the worst case and average case values. The change in delay due to buffer insertion/driver sizing is assumed to vary between the average and worst case values. The interval based delay variation is modeled as a triangular fuzzy number with a linear membership function. The results of these preprocessing deterministic optimizations are used to convert the uncertain fuzzy problem into a crisp nonlinear problem using the symmetric relaxation method [28, 56]. In the context of BIDS,
PAGE 70
the crisp problem aims to maximize the variation resistance ($\lambda$), or yield, with circuit delay and power (buffer and driver cost) as constraints. The proposed approach was tested on ITC'99 benchmark circuits, and the results indicate sizable savings in buffer cost and driver sizes compared to the deterministic worst case approach. Finally, we also present a comparison of our logic level buffer insertion technique with a more accurate post layout version of the buffer insertion problem. The comparison highlights the efficacy of the wire length prediction mechanism. The difference in results on ITC benchmarks indicates that the logic level solutions are within 10% of the post layout level buffer insertion.

The rest of the chapter is organized as follows. Section 5.2 describes the modeling of delay variations and Section 5.3 the problem formulation. The proposed fuzzy-BIDS framework is given in Section 5.4. In Section 5.5, we present the experimental results, followed by some conclusions in Section 5.6.

5.2 Modeling Delay Variations

In this work, we model the change in delay due to buffer insertion or gate sizing as an uncertain variable. The uncertain change in delay is modeled as a fuzzy triangular triple of the form $Delay = (Delay_m, Delay_l, Delay_u)$. Here $Delay_m$ is the most possible value, or the average case value, and $Delay_l$ and $Delay_u$ correspond to the lower and upper bounds, denoting the best and worst case changes of circuit delay. In accordance with previous works on process variations [36, 45, 87], we have also used the $3\sigma$ variation value for the worst case setting. The fuzzy programming problem, similar to the stochastic chance constrained programming framework, involves a relaxation step to convert the uncertain (fuzzy) constraints or objectives into a crisp (deterministic) framework. The relaxation in fuzzy programming starts with a set of deterministic optimizations in which the interval valued coefficients are set to the worst and the average case settings.
The results of these optimizations ($Result_{average}$ and $Result_{worst}$) and a variation resistance parameter ($\lambda$) are used to convert the fuzzy problem into a crisp problem. A brief outline of the fuzzy methodology for uncertainty-aware optimization is shown in Figure 5.1.

Traditionally, the uncertainties due to process variations are handled using probability distributions. However, the probabilistic way of evaluating and optimizing the uncertainties is computationally expensive due to the need for complicated integration or a large number of scenarios. Secondly, Buckley, in [40], has shown that fuzzy programming based optimization guarantees solutions that are better than or at least as good as their stochastic counterparts. The authors compared the
PAGE 71
stochastic and fuzzy programming methodologies using Monte Carlo simulations. The main difference between the techniques is that the fuzzy optimization, in uncertain environments, finds the best solution (a supremum operation over all feasible solutions), as opposed to averaging (integrals over all feasible solutions) in stochastic programming based optimization. Hence, fuzzy programming selects a solution which is better than or at least as good as the stochastic solution. The above arguments led us to investigate the fuzzy programming approach to model uncertainty due to process variations in the post layout and logic level buffer insertion and driver sizing problem.

[Figure 5.1 Fuzzy Programming Approach for Variation Aware Optimization. Flow: linear programming optimization problem with interval based fuzzy coefficients to model uncertainty due to process variations; preprocessing solutions of problem X with coefficients set to the nominal case setting and to the worst case setting; crisp nonlinear problem created from these preprocessing solutions and a variation parameter using symmetric relaxation, where the variation parameter ranges from (0, 1) and models the interval valued fuzzy variables in the original problem; crisp problem solved using a nonlinear solver, representing the optimal solution in the presence of uncertainty.]
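The triangular fuzzy representation of delay used in this section can be illustrated with a short Python sketch. It assumes the standard triangular membership function with linear sides, peaking at the most possible value $Delay_m$ and falling to zero at the bounds $Delay_l$ and $Delay_u$. The function name and the numeric values are hypothetical.

```python
def triangular_membership(x, low, mode, high):
    """Degree of membership of x in the triangular fuzzy number (mode, low, high)."""
    if not (low <= mode <= high):
        raise ValueError("expected low <= mode <= high")
    if x < low or x > high:
        return 0.0          # outside the interval: not possible at all
    if x == mode:
        return 1.0          # the most possible (average case) value
    if x < mode:
        return (x - low) / (mode - low)      # rising linear side
    return (high - x) / (high - mode)        # falling linear side

# Hypothetical change in delay: best case 2, average case 5, worst case 8.
for value in (2.0, 3.5, 5.0, 6.5, 8.0):
    print(value, triangular_membership(value, 2.0, 5.0, 8.0))
```

Values near the average case receive a membership close to 1, while values at the best and worst case bounds receive 0.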
PAGE 72
5.3 Problem Formulation

In a placed and routed combinational circuit, after timing analysis, certain paths may violate the timing constraint. At this level, buffer insertion and driver sizing (BIDS) techniques have been able to successfully improve performance with a good power and noise tradeoff. Without loss of generality, the list of standard cells (combinational gates/drivers) and interconnects in between register stages are considered for optimization. The BIDS technique can also be handled at the logic level with proper approximation of interconnect length. In the following sections, we explain the layout and logic level formulations of the buffer insertion and driver sizing problem.

5.3.1 Layout Level Modeling

Initially, drivers are fixed to minimum sizes, and a fixed number of size increments is assumed to be available for each driver. The layout of the circuit is divided into $n$ regions, and the density (number of devices to white space) of each region is calculated. The maximum size for each gate is decided based on the density of the region in which it is placed. This restriction is modeled as a bound on the variable in the optimization formulation. Similarly, for a set of candidate buffer locations, constraints are formulated for different buffer types with their associated output resistance, input capacitance and intrinsic delay. The layout level optimization for identifying candidate buffer locations was based on the concepts in [16]. The candidate buffer locations, in this context, refer to the possible channels in critical interconnects where layout level optimization can insert buffers. The routed wires were divided into channels, and channels in sparse regions were preferred as candidate buffer locations over denser ones. In addition to density, channels which equally divide the critical connection are also preferred.
The change in delay coefficients can be used to model the candidate buffer locations for each interconnect. The BIDS optimization problem aims to minimize resource cost, namely the number of buffers inserted and the total increase in driver sizes, such that the required arrival time at each primary output is less than the specified timing constraint. The delay of a driver can be modeled as a linear function of its size and the load from its fanout gates:

$$ d_{g_i} = a_i - b_i s_i + c_i C_{load_j} \tag{5.1} $$
PAGE 73
where $s_i$ refers to the size of driver $i$ and $C_{load_j}$ is the load seen from the driver, which is a function of the sizes of the fanout gates. The constant coefficients $a_i$, $b_i$, $c_i$ are empirically determined by extensive SPICE simulations for each gate in the library for various sizes and fanout counts. Sizing gate $i$ improves the delay of the current gate (as $s_i$ increases) and increases the $C_{load_j}$ seen by its fanin gates. The interconnect delay, on the other hand, has a quadratic proportionality to the length of the wire and is given by

$$ d_{int_i} = R_0 \, len_i \, (0.5 \, C_0 \, len_i + C_{pin}) \tag{5.2} $$

where $R_0$ and $C_0$ refer to unit resistance and capacitance, $C_{pin}$ is the pin capacitance and $len_i$ is the interconnect length. Buffer insertion on an interconnect impacts both the source gate delay (as a function of $C_{load_j}$) and the interconnect delay (as a function of $len_i$). Since the interconnect delay is a quadratic function of its length, which can change during buffer insertion, we model the optimization problem using required arrival times and piecewise arrival time changes during buffer insertion and driver sizing. The linear programming formulation is explained comprehensively in Section 5.4.

In the nanometer regime, the wiring density has increased considerably, leading to high aspect ratios in metal lines. This results in increased coupling between nets and can affect the timing and functionality of the circuits. Hence, in addition to considering process variations, it is also necessary to minimize the effects of interconnect coupling to reduce losses due to timing yield failures. Noise on a net can be easily controlled during driver sizing. Interconnect coupling noise depends on the size of the driving gate (victim) and adjacently placed aggressor gates. Increasing the size of a gate increases the signal strength on the driven net and thereby the coupling noise on its victims.
Hence, the upsizing of a gate can increase the noise on the coupled nets, and downsizing a gate can reduce the same effect. We therefore add a noise constraint to maintain the sizes of the victim and the aggressor gates as in [25]. Secondly, the uncertainty due to process variations can be modeled as

$$ D = d_i + \sum_{j=1}^{n} d_j X_j + d_r X_r \tag{5.3} $$

where $d_i$ is the nominal delay and $X_j$ and $X_r$ are the random parameters representing correlated and independent variations, respectively. The magnitude of these variations is given by the variables $d_j$ and $d_r$, which are determined from extensive simulations. We capture these variations using the concept of
PAGE 74
fuzzy numbers. The gate's delay is now a triangular value (average, low, high) instead of a single discrete value. Next, we explain the proposed fuzzy gate sizing approach for optimization in the presence of process variations.

5.3.2 Logic Level Modeling

In the context of logic level modeling, the delay of a driver is again a linear function of the driver size and the sum of the sizes of its fanout gates. Hence, there is no significant difference between modeling the gate delay at the logic and the layout level. The interconnect delay, on the other hand, has a quadratic proportionality to the length of the wire and is given by

$$ d_{int_i} = R_0 \, len_i \, (0.5 \, C_0 \, len_i + C_{pin}) \tag{5.4} $$

where $R_0$ and $C_0$ refer to unit resistance and capacitance, $C_{pin}$ is the pin capacitance and $len_i$ is the interconnect length. Accurate modeling of the interconnect length at the logic level is crucial to the effectiveness of the methodology. In this work, the wire length is obtained using a fast and accurate lookup table based estimation. Several researchers have worked on the problem of a priori length estimation. The authors in [94] have used Rent's rule to derive upper bounds for interconnection lengths of linear and square interconnection components. However, Rent's rule does not hold true at all levels of the partition hierarchy in the nanometer era [38]. In this work, we use a lookup table based methodology taking into account the number of cells/interconnects and the fanout count of each cell. The estimation starts with the layout synthesis of a set of benchmark circuits.
The benchmarks were selected from the MCNC benchmark suite, and their complexity in terms of the number of gates ranges from approximately 500 to 10000. The layouts have to be generated for the target technology library. For each net, a report is generated displaying its length and the fanout count of its driver. Nets with the same fanout count are grouped, and the average net length for each fanout count is calculated. The table is then grouped based on benchmark circuit size and averaged again. Hence, the array based table lookup requires the circuit size (number of gates/nets) and the fanout count to extract the length of an interconnect at the logic level. The table is created with a maximum fanout size of 20, and all interconnects with more than 20 fanout gates are modularized to 20 before accessing the table. Buffer insertion on an
PAGE 75
interconnect impacts both the source gate delay (as a function of $C_{load_j}$) and the interconnect delay (as a function of $len_i$). The candidate buffer locations are selected with the objective of dividing the critical interconnect connections into equal halves. Since the interconnect delay is a quadratic function of its length, which can change during buffer insertion, we model the optimization problem using required arrival times and piecewise arrival time changes during buffer insertion and driver sizing. The linear programming formulation, in the layout level context, is explained comprehensively in Section 5.4. The logic level formulation does not have the layout level constraints due to routing issues, and the interconnect length is approximated using values from a lookup table.

5.4 Proposed Approach

In this section, we explain our modeling and solution methodology for the fuzzy buffer insertion and driver sizing (BIDS) problem in the presence of uncertainty due to process variations. Several formulations have appeared for the gate sizing and buffer insertion problems at various levels in the design flow. In this section, we describe a continuous linear programming approach to minimize resource cost with delay and noise constraints in the presence of variations.

5.4.1 Deterministic-BIDS

In this formulation, we start by explaining the modeling of the delay constraints of gates and interconnects in the linear program. The interconnect delay is a quadratic function of the interconnect length and is significantly affected during buffer insertion. We formulate the delay constraints of the linear program using the required arrival time and the change in delay due to BIDS for each node from primary input to primary output. The required arrival time and improvement in delay based formulation enables the use of a piecewise linear formulation for the nonlinear BIDS problem.
The required time value of each sink node ($req_j$) must be greater than the sum of the required time of its source node ($req_i$) plus the gate ($d_{g_i}$) and interconnect ($d_{int_i}$) delay in between them. The slack in between the nodes can be improved by adding $nbuf_i$ buffers and sizing gates by a factor of $ngat_i$. Hence, the required time constraints can
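To see why the delay improvement from buffering is nonlinear and diminishing, consider the following Python sketch. It evaluates the interconnect delay expression of Equation 5.4 for a wire split into equal segments. The simplifications are assumptions made only for illustration: buffers are ideal repeaters contributing a fixed intrinsic delay `t_buf`, every segment is loaded by the same pin capacitance, and the parameter values are hypothetical rather than taken from a technology library.

```python
def wire_delay(length, r0, c0, c_pin):
    """Interconnect delay R0 * len * (0.5 * C0 * len + Cpin), as in Equation 5.4."""
    return r0 * length * (0.5 * c0 * length + c_pin)

def buffered_delay(length, n_buf, r0, c0, c_pin, t_buf):
    """Delay of a wire split into n_buf + 1 equal segments by n_buf buffers."""
    segments = n_buf + 1
    per_segment = wire_delay(length / segments, r0, c0, c_pin)
    return segments * per_segment + n_buf * t_buf

# Hypothetical unit values chosen only to make the trend visible.
R0, C0, C_PIN, T_BUF, LEN = 1.0, 1.0, 0.1, 0.5, 10.0
delays = [buffered_delay(LEN, n, R0, C0, C_PIN, T_BUF) for n in range(4)]
gains = [delays[n] - delays[n + 1] for n in range(3)]
print(delays)
print(gains)  # delay improvement from the 1st, 2nd and 3rd buffer
```

Under these assumptions the first buffer yields the largest improvement and each further buffer yields less, which is the behavior the piecewise linear formulation approximates with one coefficient for the first buffer and another for subsequent buffers.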
PAGE 76
be formulated as shown below:

$$ req_i + d_{g_i} + d_{int_i} - Cb1_i \, nbuf_i - Cg1_i \, ngat_i + Cg2_i \, nfgat_i \le req_j \tag{5.5} $$

$$ req_i - Cb2_i \, (nbuf_i - 1) - Cg1_i \, ngat_i + d_{g_i} + d_{int_i} - Cb1_i + Cg2_i \, nfgat_i \le req_j \tag{5.6} $$

where $Cb1_i$ is the change in delay due to the insertion of the first buffer and $Cb2_i$ is the change in delay due to the insertion of the subsequent buffers. The piecewise required time constraints are inserted for all $(i, j)$ driver-receiver pairs. The magnitudes of the coefficients are also adjusted in accordance with the routing constraints and the candidate buffer locations. Inserting a second or third buffer tends to affect the delay less than the insertion of the first buffer. Hence, we use the piecewise required time formulation to model the change in delay due to buffer insertion. The $Cg1_i$ term is the change in delay due to the gate sizing increments $ngat_i$ and is proportional to the coefficient $b_i$ in the gate delay (Equation 5.1). The term $nfgat_i$ is the change in the sizes of the fanout gates of node $i$, and its coefficient $Cg2_i$ is proportional to the coefficient $c_i$ in Equation 5.1. The $Cb1_i$ and $Cb2_i$ coefficients also depend on the candidate buffer locations of each interconnect $i$. The node based required arrival time formulation also avoids the exponential complexity of the path based formulation. The required time constraints are then related to the final timing objective as $req_i \le T_{spec}$, $i \in PO$.

In addition to the delay constraints, the impact of coupling capacitance on the timing of the circuit is also modeled. A preprocessing step identifies the set of aggressors, and a noise constraint relating $ngat_{victimsize}$ and $ngat_{aggressorsize}$ to the bound 1 is added for each victim gate. The deterministic version of the buffer insertion and driver sizing
PAGE 77
problem can be shown as

$$
\begin{aligned}
\min \quad & \sum_i (nbuf_i + ngat_i) \\
\text{s.t.} \quad & req_i \le T_{spec}, \quad i \in PO; \\
\text{s.t.} \quad & req_i + d_{g_i} + d_{int_i} - Cb1_i \, nbuf_i - Cg1_i \, ngat_i + Cg2_i \, nfgat_i \le req_j; \\
\text{s.t.} \quad & req_i - Cb2_i \, (nbuf_i - 1) - Cg1_i \, ngat_i + d_{g_i} + d_{int_i} - Cb1_i + Cg2_i \, nfgat_i \le req_j; \\
\text{s.t.} \quad & \text{noise constraint on } ngat_{victimsize} \text{ and } ngat_{aggressorsize} \text{ (bound 1)};
\end{aligned}
\tag{5.7}
$$

The coefficients $Cg1$, $Cb1$ and $Cb2$ in this formulation, which are assumed to vary between worst case and best case bounds, are modeled using fuzzy numbers with a linear membership function. The fuzzy modeling and optimization methodology is explained next.

5.4.2 Fuzzy-BIDS

Fuzzy optimization techniques provide an efficient mechanism for modeling and optimizing systems that exhibit imprecision and variations. The fuzzy mechanism starts with a set of preprocessing optimizations with the varying parameters set to worst case and nominal case values. In this work, we model uncertainty due to process variations as an imprecision in the delay improvement due to BIDS. The coefficients $Cg1$, $Cb1$ and $Cb2$, which control the improvement in delay due to buffer insertion and driver sizing, are modeled as triangular fuzzy numbers. The worst case values of these coefficients are assumed to be $Cg1 - Vg1$, $Cb2 - Vb2$ and $Cb1 - Vb1$, where the values $Vg1$, $Vb2$ and $Vb1$ are selected to create a worst case ($3\sigma$) delay variation, in accordance with recent variation aware optimization frameworks [35, 36, 87]. The optimization problem in Equation 5.7 is solved with the worst case coefficient setting $Cg1 - Vg1$, $Cb2 - Vb2$, $Cb1 - Vb1$, and the result of this optimization is referred to as $Obj_{wc}$. Similarly, the optimization problem in Equation 5.7 is solved with the nominal values, and the result is referred to as $Obj_{nc}$. The results $Obj_{wc}$ and $Obj_{nc}$ and a new variation parameter $\lambda$ are used to transform the uncertain fuzzy optimization problem into a crisp nonlinear programming problem using the symmetric
PAGE 78
relaxation method [28, 56]. The crisp nonlinear problem for BIDS in the presence of process variations is given by the following equation.

$$
\begin{aligned}
\text{maximize} \quad & \lambda \\
\text{s.t.} \quad & \lambda \, (Obj_{nc} - Obj_{wc}) - \sum_i (nbuf_i + ngat_i) + Obj_{wc} \ge 0; \\
\text{s.t.} \quad & req_{po} \le T_{spec}, \quad po \in PO; \\
\text{s.t.} \quad & req_i + d_{g_i} + d_{int_i} - (Cb1_i - Vb1_i \lambda) \, nbuf_i - (Cg1_i - Vg1_i \lambda) \, ngat_i + Cg2_i \, nfgat_i \le req_j; \\
\text{s.t.} \quad & req_i - (Cb2_i - Vb2_i \lambda)(nbuf_i - 1) - (Cg1_i - Vg1_i \lambda) \, ngat_i + d_{g_i} + d_{int_i} - (Cb1_i - Vb1_i \lambda) + Cg2_i \, nfgat_i \le req_j; \\
\text{s.t.} \quad & \text{noise constraint on } ngat_{victimsize} \text{ and } ngat_{aggressorsize} \text{ (bound 1)};
\end{aligned}
\tag{5.8}
$$

where the parameter $\lambda$ is bounded by 0 and 1 and can take any value between 0 and 1 for the BIDS problem. The crisp optimization problem has four variables in the above formulation: delay, noise, resource cost and process variations ($\lambda$). The parameter $\lambda$, as seen from the formulation, maximizes variation resistance and also controls the delay and resource cost values in the optimization. Hence, a maximization of the variation resistance parameter simultaneously guarantees high yield, low cost and low delay. It has been shown for problems in other application domains that the above formulation provides the most satisfying optimization solution in the presence of uncertainty [56].

5.5 Simulation Methodology and Results

The proposed fuzzy linear programming optimization for buffer insertion and driver sizing was tested on ITC'99 benchmark circuits. First, the RTL level VHDL netlists are converted to a flattened gate level Verilog netlist using the Synopsys Design Compiler. The Verilog file from the Design Compiler is then placed and routed using Cadence Encounter. The TSMC db, lef and tlf libraries are used to synthesize the benchmark circuits. The placed and routed netlist (DEF file), library of cell delay
PAGE 79
information and the Verilog file are given as an input to a C script (DEF2AMPL), which converts the netlist into an AMPL based mathematical program format. The DEF2AMPL script is then used to generate the linear programming models for the benchmarks with the delay coefficients set to the mean and the maximum possible variation (worst case). The change in delay coefficients due to buffer insertion and driver sizing is chosen in accordance with the gate and interconnect delay variation values in [12, 79]. The DEF2AMPL script uses the results of these optimizations and generates a crisp nonlinear AMPL model. The crisp nonlinear optimization problem is solved using the KNITRO solver to find the optimal gate sizes in the presence of variations in gate delay.

The logic level simulations are performed without using the place and route tools. Depending on the target technology library, a lookup table is created for predicting wire length, with the number of gates of the circuit and the fanout count as the parameters. The lookup table is built by extracting wire length values from previously placed and routed benchmarks. The predicted wire length values, the library of cell delay information and the logic level Verilog netlist are given as an input to the same C script (DEF2AMPL), which converts the netlist into an AMPL based mathematical program format. The DEF2AMPL script first creates a worst case model with timing as the only objective, to identify the best performance at which the circuit can operate. The timing value identified in this step is used as a constraint in the following simulations to optimize the cost (buffers and driver sizes) with nominal case, worst case and fuzzy modeling of variations. The DEF2AMPL script is then used to generate the piecewise linear programming models for the benchmarks with the delay coefficients set to the mean and the maximum possible variation (worst case).
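The lookup table construction and its use for wire length prediction can be sketched as follows. The sketch is illustrative only: the net reports, the circuit size bins and the function names are hypothetical, and fanouts above 20 are clamped to 20, which is one reading of how high fanout nets are treated before the table is accessed.

```python
from collections import defaultdict

MAX_FANOUT = 20  # the table is built up to a fanout count of 20

def size_bin(num_gates, size_bins):
    """Index of the first bin whose upper bound covers num_gates."""
    for index, upper in enumerate(size_bins):
        if num_gates <= upper:
            return index
    return len(size_bins) - 1  # larger circuits fall into the last bin

def build_table(benchmarks, size_bins):
    """benchmarks: list of (num_gates, [(fanout, net_length), ...]) from placed layouts.

    Returns {(bin_index, fanout): average net length}.
    """
    per_bin = defaultdict(list)
    for num_gates, nets in benchmarks:
        by_fanout = defaultdict(list)
        for fanout, length in nets:
            by_fanout[min(fanout, MAX_FANOUT)].append(length)
        row = size_bin(num_gates, size_bins)
        # First average: nets with the same fanout count within one benchmark.
        for fanout, lengths in by_fanout.items():
            per_bin[(row, fanout)].append(sum(lengths) / len(lengths))
    # Second average: across benchmarks of similar circuit size.
    return {key: sum(vals) / len(vals) for key, vals in per_bin.items()}

def predict_length(table, num_gates, fanout, size_bins):
    """Predicted interconnect length; raises KeyError for an unseen table entry."""
    return table[(size_bin(num_gates, size_bins), min(fanout, MAX_FANOUT))]

# Hypothetical reports from two small placed and routed benchmarks.
bins = [1000, 10000]
reports = [
    (600, [(1, 10.0), (1, 14.0), (2, 20.0)]),
    (900, [(1, 16.0), (2, 30.0), (25, 90.0)]),
]
table = build_table(reports, bins)
print(predict_length(table, 800, 1, bins))
```

At the logic level only the gate count of the circuit and the fanout count of the driver are needed to query the table, so no place and route run is required for the circuit being optimized.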
The complete simulation flow for the logic level experiment is shown in Figure 5.2. The layout level simulation approach bypasses the wire length prediction step, as actual interconnect length values are available after place and route. The Cadence Encounter place and route tools are used on the gate level netlist to generate the initial layout.

5.5.1 Layout Level BIDS

The buffer and gate resource cost reduction achieved by the fuzzy sizing approach compared to worst case deterministic sizing is documented in Table 5.1. The table also provides information on the circuit characteristics and the complexity (constraints) of the fuzzy optimization formulation. The worst case BIDS results in column 5 correspond to the delay coefficients set to their maximum variation case for high yield.
PAGE 80
[Figure 5.2 Simulation Flow Fuzzy BIDS. Flow: RTL level VHDL netlist; Design Compiler with TSMC libraries; gate level netlist; number of gates and fanout count of each interconnect; net lengths assigned for each interconnect from the lookup table; delay coefficients for gates and wires; DEF2AMPL script for AMPL mathematical program generation for buffer insertion and gate sizing; preprocessing nominal case and worst case AMPL models solved using the KNITRO solver; fuzzy AMPL model for variation aware BIDS with bounds from the preprocessing KNITRO optimizations; fuzzy model solved using the KNITRO solver.]

It can be seen that there is a sizable savings in buffer and gate cost from using the fuzzy sizing approach as compared to the deterministic worst case gate sizing approach. The execution time of the fuzzy optimization approach is also shown in Table 5.1. The continuous solutions from the proposed approach were rounded to get discrete buffer numbers and gate sizes, with very little effect on timing and resource cost.
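The flow of Figure 5.2, two preprocessing optimizations followed by the crisp problem, can be traced on a deliberately tiny instance. The Python sketch below is an illustration under strong assumptions, not the DEF2AMPL/KNITRO flow: it uses a single interconnect with one buffer coefficient `c`, a worst case reduction `v`, an unbuffered delay `d0` and a timing target `t_spec` (all hypothetical values), writes the cost constraint in the rearranged form cost <= Obj_wc - lambda * (Obj_wc - Obj_nc), and finds the largest feasible lambda by bisection instead of a nonlinear solver.

```python
def min_buffers(d0, t_spec, coeff):
    """Deterministic preprocessing: fewest (continuous) buffers meeting timing.

    Toy timing constraint: d0 - coeff * nbuf <= t_spec.
    """
    return max(0.0, (d0 - t_spec) / coeff)

def solve_crisp(d0, t_spec, c, v, tol=1e-9):
    """Maximize lam in [0, 1] subject to the toy crisp constraints."""
    obj_nc = min_buffers(d0, t_spec, c)      # nominal case coefficient
    obj_wc = min_buffers(d0, t_spec, c - v)  # worst case coefficient

    def feasible(lam):
        need = min_buffers(d0, t_spec, c - v * lam)  # timing with fuzzy coefficient
        budget = obj_wc - lam * (obj_wc - obj_nc)    # relaxed resource cost bound
        return need <= budget

    lo, hi = 0.0, 1.0
    if feasible(hi):
        lo = hi
    # 'need' grows and 'budget' shrinks with lam, so bisection applies.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if feasible(mid):
            lo = mid
        else:
            hi = mid
    lam = lo
    cost = obj_wc - lam * (obj_wc - obj_nc)
    return lam, cost, obj_nc, obj_wc

lam, cost, obj_nc, obj_wc = solve_crisp(d0=10.0, t_spec=6.0, c=1.0, v=0.5)
print(round(lam, 4), round(cost, 4), obj_nc, obj_wc)
```

For these values the nominal and worst case preprocessing costs are 4 and 8 buffers, and the crisp problem settles at lambda = 2 - sqrt(2), roughly 0.586, with a cost of about 5.66. The result lies between the two deterministic solutions, which is the kind of saving over the worst case design that the tables report.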
PAGE 81
Table 5.1 Layout Level Results on Benchmark Circuits

ITC'99      No. of   No. of   No. of        Buffer & Gate Cost   Objective            Percent   Runtime
Benchmark   Gates    Nets     Constraints   DWC       Fuzzy      Value ($\lambda$)    Change    (secs)
b13           249      269      1027          19        14       0.55                 26.4%        0.97
b11           385      322      1516          70        46       0.57                 34.2%        1.85
b12           834      847      3810         100        80       0.58                 20%         70.2
b14          4232     4544     30437         830       330       0.69                 60.21%     200
b15          4585     4716     31951         591       260       0.74                 56.00%    1185
b20          8900     9538     63855         991       408       0.68                 58.8%     3800
b22         12128    13093     87108         534       237       0.66                 55.6%     5791
Average Percent                                                                       44.5%

Legend: DWC: Deterministic Worst Case BIDS Approach; Fuzzy: Fuzzy BIDS Formulation; Constraints: No. of Constraints in FLP program; Percent Improvement: Fuzzy vs. DWC.

5.5.2 Logic Level BIDS

The buffer and gate resource cost reduction achieved by the fuzzy sizing approach compared to worst case deterministic sizing is documented in Table 5.2. The table also provides information on the circuit characteristics and the complexity (constraints) of the fuzzy optimization formulation. The worst case BIDS results in column 5 correspond to the delay coefficients set to their maximum variation case for high yield. It can be seen that there is a sizable savings (35% on average) in buffer and gate cost from using the fuzzy sizing approach as compared to the deterministic worst case gate sizing approach. The execution time of the fuzzy optimization approach is also shown in Table 5.2. The continuous solutions from the proposed approach were rounded to get discrete buffer numbers and gate sizes, with very little effect on timing and resource cost. Also, without loss of generality, the proposed methodology is assumed to optimize gates/interconnects in between two register stages. With increasing performance targets, the number of levels of logic in between register stages is decreasing. Hence, the number of gates in between two register stages is reducing.
Thus, we feel that optimizing and reporting results on up to 12000 gates is a good indicator for large industrial benchmarks. Figure 5.3 compares the results in number of buffers and gate sizes reported by the logic level and the layout level simulations. It can be clearly seen that the results are very close between the logic and layout level formulations for all the benchmarks, with b20 being an exception. However, even with b20 included, the logic level optimization on average is within 10% of the placed and routed buffer insertion and driver sizing results, indicating the efficacy of the wire length prediction and the logic level buffer insertion mechanism.
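The percent change column of Table 5.1 can be checked from the reported DWC and fuzzy costs. The snippet below recomputes (DWC - Fuzzy) / DWC for each benchmark. Small differences from the tabulated percentages are to be expected if the table was derived from unrounded continuous costs, which is an assumption here.

```python
# (benchmark, DWC cost, fuzzy cost) as listed in Table 5.1
layout_results = [
    ("b13", 19, 14), ("b11", 70, 46), ("b12", 100, 80), ("b14", 830, 330),
    ("b15", 591, 260), ("b20", 991, 408), ("b22", 534, 237),
]

def percent_savings(dwc, fuzzy):
    """Resource cost saved by the fuzzy approach relative to the worst case."""
    return 100.0 * (dwc - fuzzy) / dwc

savings = {name: percent_savings(dwc, fuzzy) for name, dwc, fuzzy in layout_results}
average = sum(savings.values()) / len(savings)
for name, value in savings.items():
    print(f"{name}: {value:.1f}%")
print(f"average: {average:.1f}%")
```

The recomputed average agrees with the 44.5% reported in the table to within rounding.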
PAGE 82
Table 5.2 Logic Level Results on Benchmark Circuits

ITC'99      No. of   No. of   No. of        Buffer & Gate Cost   Percent        Runtime
Benchmark   Gates    Nets     Constraints   DWC       Fuzzy      Improvement    (secs)
b13           249      269      1027          33        20       39.3%             4
b11           385      322      1516          67        47       29.8%             2.5
b12           834      847      3810          86        64       25.5%            65
b14          4232     4544     30437         425       309       27.4%           190
b15          4585     4716     31951         555       332       40.3%          1025
b20          8900     9538     63855        1014       585       41.1%          3500
b22         12128    13093     87108         550       298       45.6%          5650
Average Percent                                                  35%

Legend: DWC: Deterministic Worst Case BIDS Approach; Fuzzy: Fuzzy BIDS Formulation; Constraints: No. of Constraints in FLP program; Percent Improvement: Fuzzy vs. DWC.

Figure 5.3 Fuzzy BIDS: Logic Level Versus Layout Level

5.6 Conclusion

In this chapter, we proposed a new approach for simultaneous buffer insertion and driver sizing at the logic level considering process variations, using a fuzzy linear programming technique.
PAGE 83
An accurate and fast interconnect length prediction technique was developed at the logic level, taking into account the number of cells/interconnects and the fanout of each cell. The proposed fuzzy BIDS approach maximizes the variation resistance (robustness) of the circuit, with delay as a constraint and cost bounded between the worst and nominal case values. Experimental results on ITC'99 benchmark circuits indicate sizable savings in resource cost compared to the deterministic worst case approach. The logic level buffer optimization technique was also compared with the post layout buffer insertion technique to assess the accuracy of the wire length prediction, and the difference in results was found to be less than 10%.
PAGE 84
CHAPTER 6 D YN AMIC CLOCK STRETCHING 6.1 Intr oduction The po wer performance tradeof f in the nanometer era, has only e xacerbated with the inception of parameter v ariations in nanometer technology P arameter v ariations comprise process de viation due to doping concentration, temperature uctuations, po wer supply v oltage v ariations and noise due to coupling. V ariations can cause frequenc y and po wer dissipated to v ary from the specied tar get and hence can result in parametric yield loss. P arametric yield, in this conte xt, is dened as a design' s sensiti vity to v ariations, and is e xpected to cause 6070% of all yield losses in the impending technology generations [72]. T o ensure correct operation under all possible v ariations (process, v oltage and temperature), circuits are often designed with a conserv ati v e mar gin. The mar gins are added to the v oltage and/or de vice structures to account for the uncertainty due to w orst case combination of v ariations. Ho we v er such a w orst case combination is v ery rare or e v en impossible in most situations making this design strate gy o v erly conserv ati v e. In this conte xt, se v eral researchers ha v e proposed the use of statistical timing analysis and statistical optimization mechanism to meet timing in the presence of v ariations without signicant o v er design [11, 49, 74, 84, 87]. The v ariation a w are optimization methodologies use stochastic or fuzzy methodology to minimize the impact of uncertainty due to process v ariations on performance, po wer and other design o v erheads. Statistical timing analysis (SST A) w as in v estigated in, [11, 74], where continuous distrib utions are propagated instead of deterministic v alues to nd closed form e xpressions for performance in presence of v ariations. V ariation a w are solutions ha v e also been de v eloped for circuit optimization problems lik e gate sizing, b uf fer insertion and incremental placement [49, 84, 87]. 
The main objective of these works has been to improve yield without compromising on performance, power and area. The variation aware optimization techniques have been shown to reduce design overheads without loss in parametric yield. However, the statistical optimization methods still over consume resources irrespective of whether the circuit is affected by variations or not. Hence, to facilitate a more aggressive improvement in the power, performance and yield tradeoff, dynamic schemes to detect and correct the uncertainty due to process variations are becoming necessary.

Further, the authors in [76] proposed a novel design paradigm which achieves robustness with respect to timing failure by using the concept of critical path isolation. The methodology isolates critical paths by making them predictable and rare under parametric variations. The top critical paths, which can fail in single cycle operation, are predicted ahead of time and are avoided by providing two cycle operations. The methodology works well for special circuits with rare critical paths, but incurs a severe timing penalty on benchmark designs.

One of the popular methods to dynamically combat the impact of process variations on a design has been adaptive voltage scaling (AVS) [24, 48, 75]. A voltage scaling system tracks the actual silicon behavior with an on-chip detection circuit and scales the voltage in small increments to meet performance without high overheads in the presence of process variations. In [3, 82], the critical path of the system was duplicated to form a ring oscillator; the actual performance requirement of the circuit is correlated to the speed of the oscillator and appropriate voltage scaling is performed. However, in the nanometer era it is not feasible to use a single reference for a critical path, since the spread in variations can make the close to critical delay paths critical in the actual implementation. Recently, the authors in [48] proposed an AVS system which can emulate critical paths with different characteristics. With the increasing amount of on-chip variation and spatial correlation, the methodology can suffer severe discrepancies.
In a bid to reduce such margins and remove the dependency of the feedback mechanism on a single path, a novel on-chip timing checker was proposed in [24] to test a set of potential critical paths. The method uses a shadow latch with a delayed clock to capture data in all potential critical paths. An error signal is generated if the values in the original and shadow latches differ due to a timing violation caused by process variations. The methodology, however, aims at correcting (not preventing) errors caused by aggressive dynamic voltage scaling.

To guarantee high timing yield and low overheads in the presence of variations, the ultimate solution is to dynamically alter the clock signal frequency. The authors in [33, 34] proposed a technique to control and adjust the clock phase dynamically in the presence of variations. The methodology focused on the design of a dynamic delay buffer cell that senses voltage and temperature variations and alters the clock phase proportionately. However, it is not generic to all types of variations and does not include spatial correlation between the delay buffer and the gates in the critical path. In addition, the methodology is not input data dependent and hence changes the
clock capture trigger in more instances than required.

In this chapter, we propose a new approach for dynamic clock stretching by dynamically detecting delay due to process variations. Here, instead of modulating the clock duty cycle, we delay the capture clock edge to critical memory cells to accommodate the increased signal propagation delay due to variations. The methodology captures the signal transition halfway in the critical path in a positive level triggered latch. If the signal transition on the critical path, which is expected to occur before time T/2 (during the positive level of the clock), is delayed due to process variations, the latch in the detection circuit holds the opposite value compared to the signal line and a delay flag is set. Here, T is the clock cycle time. The delay flag dynamically stretches the clock at the destination memory flop to accommodate the extra signal propagation delay due to variations. Thus, the clock stretching methodology avoids a mismatch in the data being captured and hence prevents a timing error. The detection circuitry needs to be added to the top "n" critical paths, and an error signal from any of these paths can stretch the clock in the appropriate destination memory cell. The clock is stretched (the capture edge trigger is delayed) considering both the spatial correlations between closely spaced critical path gates and an average variation range, as reported in [12, 79]. The proposed methodology is demonstrated on example critical paths to explain its functioning. Experimental results based on Monte Carlo simulations on ITC'99 benchmark circuits indicate an efficient improvement in timing yield with negligible area overhead.

The rest of the chapter is organized as follows.
In Section 6.2, we explain the construction of the delay detection circuit for efficient clock stretching. The experimental evaluation and results for the example circuit configuration and the benchmark circuits are presented in Sections 6.3 and 6.4, respectively. The chapter is concluded in Section 6.5.

6.2 Proposed Methodology

In this section, we explain the process variation aware delay detection circuitry and the dynamic clock stretching methodology. The delay detection and clock stretching logic (CSL) is added only on critical and near critical path cells to accommodate the increased data path delay due to process variations. In the presence of delay due to variations in a path, the CSL circuit of that critical path delays the instant of the active clock edge trigger on the destination memory flops. Popular dynamic schemes like frequency scaling require some activation time (PLL delay) to adjust (increase/decrease) the clock
frequency. The proposed methodology can provide immediate activation and enable prevention of timing failures. The proposed CSL mechanism detects the delay and stretches the clock in the same clock period. Further, since the detection circuit monitors data transitions on the critical net, the methodology is independent of the type of variation. Hence, it is suitable for tolerance against delay due to process, temperature and voltage variations, and even coupling noise.

[Figure 6.1 Dynamic Clock Stretching for Variation Tolerance. Labeled elements: source flop, example critical path, critical mid interconnect, positive level triggered latch, XOR gate, 2x1 mux, clock, destination flop.]

The proposed dynamic detection and clock stretching technique is shown in Figure 6.1. The figure shows a critical path between flip-flop stages, the transition capture latch, the XOR gate that sets the delay flag, the clock stretching buffer and a multiplexer. The critical path operates normally in the absence of variations, where the data path propagation delay is within the clock cycle time (T). Any variation in the critical path, due to process, voltage, temperature or coupling, forces propagation delays to increase, which can lead to synchronization errors in the destination flip-flop. The only solution, other than resorting to a conservative timing specification, is to dynamically stretch or delay the capture clock trigger in the destination flops. Hence, the clock stretching technique enables the capture of correct data in the presence of variations without significant over design. The delayed transition detection and clock stretching circuit has to be provided for all of the top "n" most delay critical paths, since any of these paths can become the most critical path and violate
the timing specification due to process variations. The paths with a delay value within 15% of the most critical path are selected as the candidates for dynamic stretching [12, 24].

Another important preprocessing step is the identification of critical locations (interconnects), halfway in the critical path, where delay due to variations can be detected. Under normal conditions, the transition on the critical interconnect should settle before the negative edge of the clock. In the presence of variations, the transition occurs after the negative edge of the clock due to the increased delay. We can use timing analysis results to find a critical interconnect in each of the top "n" critical paths of the circuit. If a suitable critical interconnect does not already exist in any of the top "n" critical paths, we can perform circuit sizing, buffer insertion or other incremental changes to create a net with the specified characteristics.

Another important aspect of the proposed technique is checking the feasibility of capturing the delayed transition with reference to the negative edge of the clock. The key issue is to make sure the critical transition is not missed due to a setup/hold timing violation of the positive level triggered latch. To confirm this, we constructed an example critical path with 20 levels of logic at the 65 nm technology node. We used real delay information from a 65 nm technology library for this experiment. The simulations were performed on critical paths with multiple logic gate configurations and a 15% range for delay changes due to process variations.
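The candidate selection rule described above (paths whose delay lies within 15% of the most critical path) can be illustrated with a minimal Python sketch. The function name, the dictionary layout and the example delays are hypothetical and only serve the illustration; in the flow described in this chapter the delays would come from a timing analysis report.

```python
def select_near_critical_paths(path_delays, window=0.15):
    """Return path names whose delay is within `window` of the longest delay.

    `path_delays` maps a (hypothetical) path name to its nominal delay.
    The 15% default follows the selection rule stated in the text.
    """
    if not path_delays:
        return []
    worst = max(path_delays.values())
    threshold = (1.0 - window) * worst
    selected = [name for name, delay in path_delays.items() if delay >= threshold]
    # Most critical path first.
    return sorted(selected, key=lambda name: path_delays[name], reverse=True)


# Illustrative delays in nanoseconds (made-up values).
delays_ns = {"p0": 10.0, "p1": 9.1, "p2": 8.6, "p3": 8.4, "p4": 6.0}
near_critical = select_near_critical_paths(delays_ns)
```

Each selected path would then receive its own detection latch, XOR gate and multiplexer, which is why the number of near critical paths drives the area overhead.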
The setup clearly confirmed the feasibility of the proposed transition capture methodology.

The positive level triggered latch, shown in Figure 6.1, captures the value on the critical interconnect during the positive level of the clock. Under normal conditions, the transition on this interconnect needs to be completed before time T/2. If the transition has been delayed due to process variations, then the inputs to the XOR gate will be different. Hence, in the presence of delay due to variations the XOR gate will output a 1. The multiplexer shown in Figure 6.1 selects either the normal (undelayed) or the delayed clock for the destination flip-flop. The XOR gate output (referred to as the delay flag in earlier sections) is used as the select line of the multiplexer. Thus, the proposed methodology dynamically selects the delayed clock in case the signal propagation is delayed in the data path due to variations.

An important assumption in the proposed methodology is the property of spatial correlation. The spatial correlation property states that the closer a set of devices is placed, the higher the probability that they have a similar variation range [37, 87]. In addition, in the nanometer era, due to the increased timing requirement, gates in the top "n" critical paths
are usually placed close to each other. Hence, we can safely assume that, irrespective of a gate's level in a critical path, if one gate is affected by variations, there is a high probability that other gates in the critical path are affected in a similar fashion. For example, if there is a variation in the second half of the path, then due to the property of spatial correlation some gates in the first half of the path will also be affected by variations, and vice versa. The magnitude of these variations is hard to predict and can differ based on their location in the chip area. However, the presence or absence of variations can be safely assumed with the property of spatial correlation.

Table 6.1 Description of Symbols in Simulation Snapshot

Clock signals
  CK          Circuit clock
  delclk      Delayed clock
  muxoutclk   Output clock from multiplexer
Data signals
  in11        Input data value
  a5          Critical mid interconnect value
  out1        Output data value
Clock stretch signals
  a5test      Latch output
  muxsel      Select line of multiplexer

Further, the use of a delayed clock edge trigger for the destination flops can result in timing inconsistency. The main issues in this context are short paths and consecutive critical paths. A short path raises the possibility of data corruption (failures) in a destination memory cell. However, in nanometer designs short paths are usually rare due to the multiple objectives of power, performance and yield. In addition, with a small margin of clock stretching (10 to 15%), it is easy to perform sizing to eliminate short path failures. Secondly, in pipeline circuits, if a critical path is followed by another critical path in the following pipeline stage, the CSL methodology can cause timing failures. This is because the delayed clock circuitry reduces the data capture time available in the subsequent pipeline stage.
Hence, in pipeline circuits with consecutive critical paths, the delay flag has to be propagated to subsequent stages to create the necessary slack by automatic clock stretching. In the next section, we validate the proposed CSL technique on a simple example and on benchmark circuits.
[Figure 6.2 Simulation Snapshot of Example Circuit: No Variations; No Clock Stretching]

[Figure 6.3 Simulation Snapshot of Example Circuit: With Variations; No Clock Stretching]

6.3 Experimental Evaluation

To evaluate the proposed methodology, we simulated an example circuit using the Cadence NC-Verilog simulator. The purpose of this simulation is to elucidate the functionality of the methodology in the presence of variations in circuit elements. The efficiency of the methodology, however, is calculated with Monte Carlo based timing yield simulations. A chain of inverters between two flip-flop stages is chosen as the example circuit. In this circuit, all interconnects in the path make a transition. Hence, the net halfway in the path becomes the necessary critical interconnect. The clock cycle time is selected in reference to the critical path delay at the nominal corner. In other words, the clock cycle time is chosen without adding a margin for uncertainty in delay due to process variations. In addition to the input, output and clock signals, a critical interconnect that transitions from 0→1 (or 1→0) just
before the negative edge of the clock is also displayed in the simulation snapshots (Figures 6.2 and 6.3).

[Figure 6.4 Simulation Snapshot of Example Circuit: No Variations; With Clock Stretching]

[Figure 6.5 Simulation Snapshot of Example Circuit: With Variations and Clock Stretching]

A simulation snapshot of the example circuit with no variations is shown in Figure 6.2. The signals shown in the snapshot are classified as clock (CK) and data (in11, a5, out1) signals. The signal
in11 is the primary input (the output from the source flip-flop), a5 is the critical interconnect halfway in the path and out1 is the output signal connected to the destination flip-flop. It can be seen from Figure 6.2 that a 0→1 transition on the input signal (in11) initiates a transition on a5 and is captured on the next clock cycle in the destination flip-flop (out1). The simulation snapshot for the same circuit in the presence of uncertainty in delay due to process variations is shown in Figure 6.3. In the presence of variations, the corresponding 1→0 transition on the critical net a5 happens after the negative edge of the clock. The delay due to process variations also causes a timing violation at the output flip-flop (out1), as the 0→1 transition is captured only on the subsequent clock cycle.

Hence, the transition on the critical interconnect, which is an indicator of a timing violation, is sent to the clock stretching logic (CSL) unit. The latch in the CSL unit captures the transition if it happens during the positive level of the clock. If the transition is delayed due to process variations, the inputs to the XOR gate (signal line and latch output) differ in value, causing the XOR gate to output a 1. The output of the XOR gate, which is connected to the multiplexer, selects the delayed clock. The delayed clock sent to the destination flip-flop avoids a timing violation. The simulation snapshots for the example circuit with the clock stretching logic are shown in Figures 6.4 and 6.5. The delayed clock (delclk) and the mux output to the destination flop (muxoutclk) are added to the list of clock signals. In addition to the input and clock signals, the CSL snapshots also show the clock stretch signals a5test and muxsel. The muxsel signal is the output of the XOR gate and a5test is the output of the positive level triggered latch.
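The detection behavior described above can be approximated by a small behavioral model in Python. This is only a sketch under simplifying assumptions that are not part of the original simulations: a single transition per cycle, ideal gates, no setup or hold times, and a stretch of 10% of the clock period. The field names a5test and muxsel mirror the signal names in Table 6.1; the function and class names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CslResult:
    a5test: int          # value held by the latch after the negative clock edge
    muxsel: int          # XOR output (delay flag), select line of the multiplexer
    capture_time: float  # time of the capture edge seen by the destination flop
    captured_ok: bool    # True if the data arrives before the capture edge


def simulate_csl(t_mid, t_path, period, stretch_fraction=0.10, old=0, new=1):
    """Behavioral model of one clock cycle of the clock stretching logic.

    t_mid  : arrival time of the transition on the critical mid interconnect
    t_path : arrival time of the data at the destination flop
    The latch is transparent during the positive clock level [0, T/2) and
    holds its value afterwards.
    """
    half = period / 2.0
    # The latch only sees the new value if the transition arrives in time.
    a5test = new if t_mid < half else old
    # After the transition the signal line carries `new`; the XOR compares
    # it with the latched value.
    muxsel = a5test ^ new
    # The multiplexer forwards the delayed clock when the flag is set.
    capture_time = period * (1.0 + stretch_fraction) if muxsel else period
    return CslResult(a5test, muxsel, capture_time, t_path <= capture_time)


# No variations: the mid transition arrives before T/2, normal clock is used.
nominal = simulate_csl(t_mid=4.6, t_path=9.6, period=10.0)
# With variations: the mid transition is late, so the clock is stretched.
delayed = simulate_csl(t_mid=5.3, t_path=10.6, period=10.0)
```

In the second case the data arrives after the nominal period but before the stretched capture edge, which is the situation the CSL unit is meant to handle.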
It can be clearly seen in Figure 6.5 that the delayed clock is sent to muxoutclk whenever the transition on the critical interconnect happens after the negative edge of the clock. Further, in Figure 6.4, we show that in the absence of variations the circuit operates normally and the clock (CK) is used at the source and the destination flip-flops. A brief description of the symbols used in the simulation snapshots is given in Table 6.1. Similar to the simulation snapshots, the signals in the symbol table are grouped as clock, data and clock stretch signals.

6.4 Simulation Results

The proposed dynamic clock stretching logic was tested on ITC'99 benchmark circuits. The improvement in timing yield for the circuits was estimated using Monte Carlo simulations. The simulation flow for the timing yield estimation is shown in Figure 6.6. The gate and net delay of the
circuit elements were assumed to have a variation range of around 20% from the nominal value. In the absence of real statistical data, it has been pointed out in [79] that it is reasonable to assume a variation of around 20 to 25% in delay due to process variations.

[Figure 6.6 Simulation Flow for Timing Yield Estimation. Flowchart steps: identify the most critical path using a timing analysis and select a suitable clock period T (no variations); in addition to the most critical path, extract paths which are likely to fail in the presence of process variations; for each of the top n extracted critical paths, identify a mid interconnect whose delay is within 10% of the negative edge of the clock; if there is no mid interconnect, create one using incremental changes to the critical path; perform Monte Carlo simulations by randomly varying process parameters; if the mid interconnect arrival time is after the negative edge of the clock (yes), stretch the clock and calculate timing yield; otherwise (no), calculate path delay and estimate timing yield.]

The RTL level VHDL netlists of the benchmark circuits were converted to a flattened gate level Verilog netlist using Synopsys Design Compiler. The output Verilog file from Design Compiler is then placed and routed using the Cadence Encounter tool. A timing analysis report (TARPT) file is then generated to identify the critical paths whose delay value is within 15% of the most critical path. A Monte Carlo simulation framework is created in a C program environment for the ITC benchmarks, with the placed
and routed design (DEF), the parasitics file (SPEF), the timing analysis report (TARPT) and the standard cell delay libraries as input. The Monte Carlo simulation creates 20,000 instances of each benchmark, with delays varied between the nominal value and the maximum variation range, to estimate the timing yield. Timing yield, in this context, is defined as the percentage of circuit instances meeting the timing specification. The circuits are tested for timing yield in two configurations, namely (i) the original circuit with variations and (ii) the CSL added circuit with variations. The timing yield for both the original and the CSL added circuit is assumed to be 100% in the absence of variations.

[Figure 6.7 Clock Stretch Range Versus Timing Yield]

The improvement in timing yield achieved by the CSL methodology compared to the original configuration is shown in Table 6.2. It can be seen that the average timing yield of the original circuit with the nominal timing specification is approximately 60%, which is not acceptable. Hence, circuit designers (with or without statistical optimization) close timing with an extra margin. The extra timing margin increases the overhead and/or decreases the performance at which the circuit can operate. The proposed methodology dynamically detects the delay due to variations and adds the extra timing margin only when required. The proposed CSL methodology has increased the average timing yield to around 99.9%. The clock was stretched to create an extra timing slack of 10% only if delay due to process variations was activated in the worst case critical paths. In the context of timing failures due to short paths, it is crucial to keep the clock stretching range as short as possible. Hence, we performed
a simple analysis on selected benchmark circuits to see the impact of the clock stretch range on timing yield (Figure 6.7). A smaller value for the clock stretch range, for example 5%, is shown to reduce the timing yield significantly. Hence, with the dual objective of near perfect timing yield and zero short path failures, we have selected the clock stretch range to be 10% of the clock period.

Table 6.2 Timing Yield Results on Benchmark Circuits

ITC'99      No. of   No. of   Near Critical   CSL        Timing Yield   Timing Yield
Benchmark   Gates    Nets     Paths           overhead   without CSL    with CSL
b11            385      322      9            9%         96.5%          99.64%
b12            834      847     16            7.6%       82%            99.65%
b14           4232     4544     65            6.1%       66.2%          99.97%
b15           4585     4716     80            6.9%       48.6%          99.99%
b20           8900     9538    110            4.9%       62.1%          99.95%
b22          12128    13093    118            3.8%       37.0%          99.92%
b17          15524    15911    150            3.8%       56.2%          99.97%
b18          42435    44554    152            1.5%       61.2%          99.99%
Average                                       5.4%       63.7%          99.9%

Legend: CSL: Clock Stretching Logic
Legend: CSL overhead: Percentage of CSL logic area compared to total circuit area
Legend: Near Critical Paths: Paths that can violate timing with variations

In addition to the timing yield improvement results, Table 6.2 also lists the benchmark characteristics (number of gates and interconnects), the number of near critical paths and the area overhead due to the CSL logic. The proposed CSL methodology incurs an average area overhead of approximately 5%. The area overhead can be further reduced if we resort to isolating critical paths, similar to the previous works on dynamic clock stretching [34, 76].

6.5 Conclusion

In this chapter, we have proposed a dynamic clock stretching technique to improve the timing yield of circuits in the presence of uncertainty due to process variations. Statistical optimization based techniques, due to their over design property, consume extra resources (performance and/or power) even in the absence of variations.
The proposed methodology, on the other hand, adds timing slack/margin (clock stretching) only in the presence of variations. Further, even in the presence of variations, the proposed methodology activates the clock stretching logic only on input patterns that enable the worst critical paths. The dynamic delay detection circuitry improves yield by controlling the instant of
data capture in critical path memory flops. Experimental results based on Monte Carlo simulation indicate a sizeable improvement in average timing yield with a negligible area overhead.
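The timing yield estimation used in this chapter can be pictured with a toy Monte Carlo sketch in Python. It is not the C framework described in Section 6.4: the path list, the symmetric uniform delay spread and the single scale factor shared by all paths (a crude stand-in for spatial correlation) are assumptions made only for this illustration, so the resulting numbers are not expected to reproduce Table 6.2.

```python
import random


def timing_yield(paths, period=1.0, max_variation=0.20, stretch=0.10,
                 trials=20000, use_csl=True, seed=0):
    """Estimate the fraction of circuit instances that meet timing.

    paths : list of (nominal mid interconnect arrival, nominal path delay),
            both expressed in the same time unit as `period`.
    Assumption of this sketch: every instance draws one delay scale factor,
    uniform in [1 - max_variation, 1 + max_variation], applied to all paths.
    """
    rng = random.Random(seed)
    passed = 0
    for _ in range(trials):
        scale = 1.0 + rng.uniform(-max_variation, max_variation)
        ok = True
        for mid_nominal, delay_nominal in paths:
            # Delay flag: mid interconnect transition after the negative edge.
            flag = use_csl and (mid_nominal * scale > period / 2.0)
            capture = period * (1.0 + stretch) if flag else period
            if delay_nominal * scale > capture:
                ok = False
                break
        passed += ok
    return passed / trials


# Hypothetical near critical paths, in units of the clock period.
example_paths = [(0.495, 0.99), (0.48, 0.95), (0.46, 0.90)]
yield_without = timing_yield(example_paths, use_csl=False)
yield_with = timing_yield(example_paths, use_csl=True)
```

Even in this simplified setting, stretching the capture edge only for flagged instances raises the estimated yield without changing the nominal clock period, which is the qualitative effect reported above.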
CHAPTER 7
CONCLUSIONS AND FUTURE WORK

Device and interconnect scaling in CMOS circuits, with the objective of following Moore's curve, has brought out numerous issues for design, test and manufacturing engineers. The level of miniaturization and the integration of a billion transistors on a single chip [32] give a clear picture of nanometer circuit complexity. The increasing integration levels are introducing new issues, which are making multimetric circuit optimization more complex. The downward scaling of technology is also gradually reaching the limits of ballistic transport. Hence, it is crucial to develop circuit optimization techniques in the nanometer era that can achieve high performance, low power dissipation and high reliability. The optimization objectives are highly correlated and conflicting in nature. Further, with increasing levels of variation in process parameters, performance is greatly affected, leading to yield loss. It is a challenging task to address all these issues in a single framework. The focus of this dissertation is to address these concerns by proposing new techniques for modeling and optimization of nanometer VLSI circuits considering process variations.

Several researchers have proposed timing analysis based iterative techniques to solve the variation aware circuit optimization problem. These methods, however, have a prohibitive runtime. Secondly, the probability distribution based techniques in this area need detailed statistical information on the randomness of the variations. Device level manufacturing tests on fabricated circuits suggest that the uncertainty due to variations in process parameters does not follow any specific distribution. Hence, in this research we propose the use of a fuzzy mathematical programming based optimization technique for variation aware circuit optimization.
The uncertainty in delay due to process variations is modeled as interval valued fuzzy numbers with linear membership functions. Specifically, we have proposed solutions for the following problems:

I: A layout level gate sizing framework for simultaneous optimization of delay, power, noise and timing yield using a fuzzy linear programming approach
II: A post layout timing based placement technique to optimize delay and timing yield using the concepts of fuzzy nonlinear programming

III: A stochastic chance constrained programming based technique for timing based placement to optimize delay and timing yield at the layout level

IV: A circuit-wise buffer insertion and driver sizing technique at the layout level using the concepts of fuzzy piecewise linear programming to optimize power, delay, noise and timing yield

V: A logic level buffer insertion and driver sizing technique using lookup table based net length prediction and fuzzy piecewise linear programming to optimize power, delay and timing yield

VI: A dynamic delay detection and variation compensation technique using clock stretching to avoid timing failures due to process variations.

The variation aware gate sizing technique is formulated as a fuzzy linear program and the uncertainty in delay due to process variations is modeled using fuzzy numbers. The fuzzy numbers in all the problems use linear membership functions for simplicity. The process variation aware incremental timing based placement problem is modeled as a nonlinear programming problem due to the quadratic dependence of delay on interconnect length. The variation aware timing based placement problem is solved using nonlinear FMP and stochastic CCP formulations. The stochastic and fuzzy formulations provide comparable solutions for the timing based placement problem. In the mathematical programming based variation resistance improvement, we have also proposed a piecewise linear solution to the buffer insertion and driver sizing (BIDS) problem.
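As a small illustration of the linear membership functions mentioned above, the following Python sketch maps a delay value lying between a nominal and a worst case bound to a membership degree between 0 and 1. The orientation chosen here (degree 1 at the nominal value, falling linearly to 0 at the worst case value) and the function name are assumptions made for this example; the exact membership definitions used in the formulations of the earlier chapters may differ.

```python
def linear_membership(value, nominal, worst_case):
    """Linear membership degree of `value` in the fuzzy set 'close to nominal'.

    Returns 1.0 at or below `nominal`, 0.0 at or above `worst_case`,
    and interpolates linearly in between.
    """
    if worst_case <= nominal:
        raise ValueError("worst_case must be larger than nominal")
    degree = (worst_case - value) / (worst_case - nominal)
    # Clip to the valid membership range [0, 1].
    return max(0.0, min(1.0, degree))


# Illustrative gate delay bounds in picoseconds (made-up values).
mu_mid = linear_membership(11.0, nominal=10.0, worst_case=12.0)
```

A single linear segment keeps the resulting optimization problem linear (or piecewise linear), which is consistent with the stated choice of linear membership functions for simplicity.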
The BIDS problem is also solved at the logic level, with lookup table based approximation of net lengths for variation awareness early in the design flow. The logic and layout level solutions are comparable, confirming the efficiency of the proposed methodology. Finally, we have proposed a new technique that can dynamically enable variation compensation. A dynamic delay detection circuit is used to identify the uncertainty in delay due to variations. The delay detection circuit controls the instant of data capture in critical path memory flops to avoid a timing failure only in the presence of variations. The proposed methodology improves the timing yield of the circuit, applying compensation only when required.

In summary, the various formulations and solution techniques developed in this dissertation achieve significantly better optimization results and run times compared to other related methods. The techniques make no assumptions
on the variation parameters and hence can be used to model variations early in the design flow. The proposed methods have been rigorously tested on medium and large sized benchmarks to establish the validity and efficacy of the solution techniques.

Based on the results presented in this dissertation, future work could integrate the proposed solutions into a single framework and develop new fuzzy mathematical programming based techniques for variation aware multimetric optimization.

I: The variation aware techniques for gate sizing, buffer insertion and placement can be integrated into a single framework to simultaneously consider the best possible option in each step of the optimization.

II: The multiple threshold voltage assignment and clock skew minimization problems are inherently suited to mathematical programming based formulations. Further, with increasing variations in parameters, it is crucial to model the uncertainty in threshold voltage. Hence, the problems can be formulated as fuzzy mathematical programs at the layout level, with variations modeled as fuzzy membership functions.

III: In this dissertation, we have modeled the variation parameters as fuzzy numbers with linear membership functions. The uncertainty in delay due to variations in circuit optimization problems can also be modeled with nonlinear fuzzy membership functions to compare the efficiency of fuzzy techniques in the presence of multiple variation distributions. The variation resistance coefficient λ will have to be changed to represent the nonlinear membership function.

IV: The dynamic delay detection technique proposed in Chapter 6, in addition to avoiding timing failures through clock stretching, can also be used for adaptive voltage scaling. The error signal generated from the delay detection latch can be forwarded to the voltage controller to adjust the voltage.
If the error signal indicates that multiple critical paths have been affected and clock stretching occurs too many times, then adaptive voltage scaling can be a better option. The problem needs further investigation.
REFERENCES

[1] H. Tennakoon and C. Sechen. Gate Sizing using Lagrangian Relaxation Combined with a Fast Gradient based Preprocessing Step. In International Conference on Computer Aided Design, pages 395–402, 2002.
[2] J.P. Fishburn and A.E. Dunlop. TILOS: A Posynomial Programming Approach to Transistor Sizing. In International Conference on Computer Aided Design, pages 326–336, 1985.
[3] T. Kuroda, K. Suzuki, S. Mita, T. Fujita, F. Yamane, F. Sano, A. Chiba, Y. Watanabe, K. Matsuda, T. Maeda, T. Sakurai and T. Furuyama. Variable Supply Voltage Scheme for Low-power High-speed CMOS Digital Design. In IEEE Journal of Solid-State Circuits, 33(3), pages 454–462, 1993.
[4] A. Agarwal, D. Blaauw, V. Zolotov, S. Sundareswaran, M. Zhao, K. Gala and R. Panda. Statistical Delay Computation Considering Spatial Correlations. In Asia South Pacific Design Automation Conference, pages 271–276, 2003.
[5] A. B. Kahng, C. Park, P. Sharma and Q. Wang. Lens Aberration Aware Timing-driven Placement. In Design Automation and Test in Europe, pages 890–895, 2006.
[6] A. B. Kahng, S. Mantik and I. L. Markov. Min-max Placement for Large Scale Timing Optimization. In International Symposium on Physical Design, pages 143–148, 2002.
[7] A. Charnes and W.W. Cooper. Deterministic Equivalents for Optimizing and Satisfying under Chance Constraint. In Operations Research, pages 18–39, 1963.
[8] A. Charnes, W.W. Cooper and G.H. Symonds. Cost Horizons and Certainty Equivalents: An Approach to Stochastic Programming of Heating Oil. In Management Science, pages 235–263, 1958.
[9] A. Chowdhary, K. Rajagopal, S. Venkatesan, T. Cao, V. Tiourin, Y. Parasuram and B. Halpin. How Accurately can We Model Timing in a Placement Engine? In Design Automation Conference, pages 801–806, 2005.
[10] A. Davoodi and A. Srivastava. Variability Driven Gate Sizing for Binning Yield Optimization. In Design Automation Conference, pages 294–299, 2005.
[11] A. Devgan and C. Kashyap.
Blockbased Static T iming Analysis with Uncertainty In International Confer ence on Computer Aided Design pages 607Â–614, 2003. [12] A. De vgan, and S. Nassif. Po wer V ariability and its Impact on Design. In International Confer ence on VLSI Design pages 679Â–682, 2005. [13] A. K. Muruga v el and N. Ranganathan. Gate Sizing and Buf fer Insertion using Economic models for Po wer Optimization. In International Confer ence on VLSI Design pages 195Â–200, 2004. 90
PAGE 101
[14] A. Kahng and Y P ati. Subw a v elength Optical Lithography: Challenges and Impacts on Physical Design. In International Symposium on Physical Design pages 112Â–119, 1999. [15] A. Sri v asta v a, D. Sylv ester and D. Blaauw. Statistical Optimziation of Leakage Po wer Considering Process V ariations using DualVth and Sizing. In Design A utomation Confer ence pages 773Â–778, 2004. [16] C. J. Alpert, M. Hrkic, and S.T Quay A f ast algorithm for identifying good b uf fer insertion candidate locations. In International Symposium on Physical design pages 47Â–52, 2004. [17] M Berk elaar and J. Jess. Gate Sizing in MOS Digital Circuits with Linear Programming. In Eur opean Design A utomation Confer ence pages 217Â–221, 1990. [18] C. Chen and M. Sarrafzadeh Simultaneous V oltage Scaling and Gate Sizing for Lo wPo wer Design. In IEEE T r ansactions on Cir cuits and Systems pages 400Â–408, 2002. [19] C. Hw ang and M. Pedram. T imingDri v en Placement Based on Monotone Cell Ordering Constraints. In Asia South P acic Design A utomation Confer ence 2006. [20] C. Schmidt and I.E. Grossman. The Exact Ov erall T ime Distrib ution of Uncertain T ask Durations. In Eur opean J ournal of Oper ational Resear c h 2000. [21] C. V iswesw ariah. Optimization T echniques for HighPerformance Digital Circuits. In International Confer ence on Computer Aided Design pages 198Â–207, 1997. [22] C. V iswesw ariah. Death T ax es and F ailing Chips. In Design A utomation Confer ence pages 343Â–347, 2003. [23] C.N. Sze, C.J. Alpert, J. Hu and W Shi. P ath based b uf fer insertion. In Design A utomation Confer ence pages 509Â–514, 2005. [24] D. Ernst, N. Kim, S. Das, S. P ant, R. Rao, T Pham, C. Ziesler D. Blaauw T Austin, K. Flautner and T Mudge. A Lo w Po wer Pipeline Based on Circuit Le v el T iming Speculation. In IEEE/A CM International Symposium on Micr oar c hitectur e pages 7Â–18, 2003. [25] D. Sinha and H. 
Zhou Gate Sizing for Crosstalk Reduction under T iming Constraints by Lagrangian Relaxation. In International Confer ence on Computer Aided Design pages 14Â–19, 2004. [26] D. Sinha, N.V Sheno y and H. Zhou. Statistical T iming Y ield Optimization by Gate Sizing. In IEEE T r ansactions on VLSI Systems pages 1140Â–1146, 2006. [27] D. W ang. Meeting Green Computing Challenges. In Internationaal Symposium on High Density P ac ka ging and Micr osystem Inte gr ation pages 1Â–4, 2007. [28] G. N. Raf ail and K. Y enilmez. Fuzzy Linear Programming Problems with Fuzzy Membership Functions. In Mathematical Subject Classication pages 375Â–396, 2000. [29] H.R. Kheirabadib, M.S. Zamani and M. Saeedi. An ef cient analytical approach to pathbased b uf fer insertion. In IEEE Computer Society Annual Symposium on VLSI pages 219Â–224, 2007. [30] I. Kark o wski. Architectural Synthesis with Possibilistic Programming. In International Confer ence on System Sciences pages 14Â–22, 1995. 91
PAGE 102
[31] I. Liu and A. Aziz. Meeting Delay Constraints in DSM by Minimal Repeater Insertion. In Design, A utomation and T est in Eur ope pages 436Â–440, 2000. [32] Intel Corporation. W orlds First T w o Billion T ransistor Microprocessor In http://www .intel.com/tec hnolo gy /a r c hit ect ur esil icin /2 bil lio n.ht m 2008. [33] J. Semiao, J.J. RodriguezAndina, F V ar gas, M. Santos, I. T eix eira and P T eix eira. Impro ving the T olerance of Pipeline Based Circuits to Po wer Supply or T emperature V ariations. In Inter national Symposium on Defect and F aultT oler ance in VLSI Systems pages 303Â–311, 2007. [34] J. Semiao, J.J. RodriguezAndina, F V ar gas, M. Santos, I. T eix eira and P T eix eira. Process T olerant Design Using Thermal and Po wer Supply T olerance in Pipeline Based Circuits. In IEEE workshop on Design and Dia gnostics of Electr onic Cir cuits and Systems pages 1Â–4, 2008. [35] J. Singh, V Nookala, Z. Luo and S. Sapatnekar. Rob ust Gate Sizing by Geometric Programming. In Design A utomation Confer ence pages 315Â–320, 2005. [36] J. Xiong and L. He. F ast Buf fer Insertion Considering Process V ariations. In International Symposium on Physical Design 2006. [37] J. Xiong, V Zoloto v and L. He. Rob ust Extraction of Spatial Correlations. In IEEE T r ansactions on Computer Aided Design pages 619Â–631, 26(4) 2007. [38] J.B. Martins, F Moraes and R. Reis. Interconnection Length Estimation at Logicle v el. In Inte gr ated Cir cuits and Systems Design pages 98Â–102, 2001. [39] J.F T ang, D. W W ang, Y .K. Fung, and K.L. Y ung. Understanding of Fuzzy Optimization: Theories and Methods. In J ournal of System Science and Comple xity pages 117Â–136, 2004. [40] J.J. Buckle y. Stochastic v ersus Possibilistic Programming. In International J ournal on Fuzzy Sets and Systems pages 173Â–177, 34(2), 1990. [41] K. Bhattacharya and N. Ranganathan. Reliabilitycentric Gate Sizing with Simultaneous Optimization of Soft Error Rate, Delay and Po wer. 
In International Symposium on Low P ower Electr onic Design pages 99Â–104, 2008. [42] K. Chopra, S. Shah, A. Sri v asta v a, D. Blaauw and D. Sylv ester. P arametric Y ield Maximization using Gate Sizing Based on Ef cient Statistical Po wer and Delay Gradient Computation. In International Confer ence on Computer Aided Design pages 1023Â–1028, 2005. [43] K. J. Geor ge and B. Y uan. Fuzzy Sets and Fuzzy Logic, by Prentice Hall, 1995. [44] L. Cheng, J. Xiong and L. He. NonLinear Statistical Static T iming Analysis for NonGaussian V ariations Sources. In Design A utomation Confer ence pages 250Â–255, 2007. [45] L. Deng and M. D. F W ong. Buf fer Insertion Under Process V ariations for Delay Minimization. In IEEE/A CM International Confer ence on Computer aided Design pages 317Â–321, 2005. [46] L. W ei, K. Ro y and C. K oh. Mix edvth CMOS Circuit Design Methodology for Lo w Po wer Applications. In Custom Inte gr ated Cir cuits Confer ence pages 413Â–416, 2000. [47] L.P .P .P v an Ginnek en. Buf fer placement in distrib uted rctree netw orks for minimal elmore delay In International Symposium on Cir cuits and Systems pages 190Â–197, 1990. 92
PAGE 103
[48] M. Elgebaly and M. Sachde v. V ariationA w are Adapti v e V oltage Scaling System. In IEEE T r ansactions on V ery Lar g e Scale Inte gr ation (TVLSI) pages 560Â–571, 15(5), 2007. [49] M. Hashimoto and H. Onodera. A Performance Optimization Method by Gate Sizing using Statistical Static T iming Analysis. In International Symposium on Physical Design pages 111Â– 116, 2000. [50] M. Hrkic, J. Lillis and G. Beraudo. An Approach to Placement Coupled Logic Replication. In Design A utomation Confer ence pages 711Â–716, 2004. [51] M. Inuiguchi and J. Ramik. Possibility Linear Programming: A Brief Re vie w of Fuzzy Mathematical Programming and Comparison with Stochastic Programming in Portfolio Selection Problem. In Fuzzy Sets and Systems pages 3Â–28, 2000. [52] M. Lundstrom. Elementary Scattering Theory of the Si MOSFET. In IEEE Electr on De vice Letter s pages 361Â–363, 18(7), 1997. [53] M. Mani, A. De vgan, and M. Orshansk y. An Ef cient Algorithm for Statistical Minimization of T otal Po wer under T iming Y ield Constraints. In Design A utomation Confer ence pages 309Â–314, 2005. [54] M. Mani and M. Orshansk y. A Ne w Statistical Optimization Algorithm for Gate Sizing. In International Confer ence on Computer Design pages 272Â–277, 2004. [55] M. R. Guthas, N. V enkatesw aran, C. V iswesw ariah and V Zoloto v Gate Sizing using Incremental P arameterized Statistical T iming Analysis. In International Confer ence on Computer Aided Design pages 1029Â–1036, 2005. [56] M. Saka w a. Genetic Algorithms and Fuzzy Multiobjecti v e optimization, by Kluwer Academic Publishers, 2002. [57] M.E. Le vitt. Design for Manuf acturing? Design for Y ield! In International Symposium on Quality Electr onic Design pages 19Â–19, 2004. [58] N. Hanchate and N. Ranganathan. Statistical Gate Sizing for Y ield Enhancement at Post Layout Le v el. In International Symposium on V ery Lar g e Scale Inte gr ation pages 225Â–232, 2007. [59] N. Ranganathan, U. Gupta and V Mahalingam. 
Simultaneous Optimization of T otal Po wer Crosstalk Noise and Delay Under Uncertainty. In Gr eat Lak es Symposium on VLSI pages 171Â– 176, 2008. [60] O. Coudert Gate Sizing for Constrained Delay/Po wer/Area Optimization. In IEEE T r ansactions on VLSI Systems pages 465Â–472, 1997. [61] P P ant, V De and A. Chatterjee. De vicecircuit Optimization for Minimal Ener gy and Po wer Consumption in CMOS Random Logic Netw orks. In IEEE T r ansactions on VLSI pages 390Â– 394, 9(2), 2001. [62] P Sax ena, N. Menezes, P Cocchini, D.A. Kirkpatrick. Repeater scaling and its impact on cad. In IEEE T r ansactions on Computer Aided Design of Inte gr ated Cir cuits and Systems pages 451Â– 463, 23(4), 2004. [63] R. B. Hitchcock. T iming V erication and the T iming Analysis Program. In Design A utomation Confer ence pages 594Â–604, 1982. 93
PAGE 104
[64] R. E. Bellman and L. A. Zadeh. Decision Making in Fuzzy En vironment. In J ournal of Mana g ement Science pages 141Â–164, 17(4), 1970. [65] R. F ourer and D. M. Gay and B. W K ernighan. AMPL: A Modeling Language for Mathematical Programming, by Duxb ury Press, 2002. [66] R. H. Byrd, J. Nocedal, and R. A. W altz. KNITR O: An Inte grated P ackage for Nonlinear Optimization. In Lar g eScale Nonlinear Optimization (http://wwwneos.mcs.anl.go v/neos/so lve r s/ nc o:KNITR O/AMPL.html) pages 35Â–59, 2006. [67] R.B. Lin and M.C. W u. A Ne w Statistical Approach to T iming Analysis of VLSI Circuits. In International Confer ence on VLSI Design pages 507Â–513, 1998. [68] R.H.J.M Otten and R.K. Brayton. Planning for Performance. In Design A utomation Confer ence pages 122Â–127, 1998. [69] S. Bhardw aj, Y Cao and S.B.K. Vrudhula. Statistical Leakage Minimization Through Joint Selection of Gate Sizes Gate Lengths and Threshold V oltage. In Asia South P acic Design A utomation Confer ence pages 953Â–958, 2006. [70] S. Borkar. Design Challenges of T echnology Scaling. In IEEE MICR O pages 23Â–29, 1999. [71] S. Borkar T Karnik and V De. Design and Reliability Challenges in Nanometer T echnologies. In Design A utomation Confer ence pages 75Â–75, 2004. [72] S. Borkar T Karnik, S. Narendra, J. Tschanz, A. K esha v arzi and V De. P arameter V ariations and Impact on Circuit and Microarchitecture. In Design A utomation Confer ence pages 338Â–342, 2003. [73] S. Bo yd and L. V andenber ghe. Con v e x Optimization, by Cambridge Uni v ersity Press, 2008. [74] S. De v adas, H. F Jyu, K. K eutzer and S. Malik. Statistical T iming Analysis of Combinational Circuits. In International Confer ence on Computer Design pages 38Â–43, 1992. [75] S. Dhar D. Maksirno vi and B. Kranzen. ClosedLoop Adapti v e V oltage Scaling Controller for StandardCell ASICs. In IEEE symposium on Low P ower Electr onic Design pages 103Â–107, 2002. [76] S. Ghosh, S. Bhunia and K. Ro y. 
CRIST A: A Ne w P aradigm for Lo wPo wer V ariationT olerant, and Adapti v e Circuit Synthesis Using Critical P ath Isolation. In IEEE T r ansactions on Computer Aided Design of Inte gr ated Cir cuits and Systems pages 1947Â–1956, 26(11), 2007. [77] S. Nassif. Delay V ariability: Sources, Impacts and T rends. In International Solid States Cir cuits Confer ence pages 368Â–369, 2000. [78] S. Neiroukh and X. Song. Impro ving the ProcessV ariation T olerance of Digital Circuits using Gate Sizing and Statistical T echniques. In Design A utomation and T est in Eur ope pages 959Â– 964, 2006. [79] S. R. Nassif. The Impact of V ariability on Po wer. In International Symposium on Low P ower Electr onic Design pages 350Â–350, 2004. 94
PAGE 105
[80] S. Sapatnekar V Rao, P V aidya and S. Kang. An Exact Solution to the T ransistor Sizing Problem for CMOS Circuits Using Con v e x Optimization. In IEEE T r ansactions on Computer Aided Design pages 1621Â–1634, 12(11), 1993. [81] S. Srini v asan and V Narayanan. V ariation A w are Placement for FPGAs. In International Symposium on VLSI pages 422Â–425, 2006. [82] T Burd, T Pering, A. Stratak os and R. Brodersen. A Dynamic V oltage Scaled Microprocessor System. In IEEE SolidState Cir cuits Confer ence (ISSCC) pages 294Â–295, 2000. [83] T Luo, D. Ne wmark and D. Z. P an. A Ne w LP based Incremental T iming Dri v en Placement for High Performance Designs. In Design A utomation Confer ence pages 1115Â–1120, 2006. [84] V Mahalingam and N. Ranganathan. V ariation A w are T iming Based Placement Using Fuzzy Programming. In International Symposium on Quality Electr onic Design pages 327Â–332, 2007. [85] V Mahalingam and N. Ranganathan. A Fuzzy Optimization Approach for Process V ariation A w are Buf fer Insertion and Dri v er Sizing. In IEEE International Symposium on VLSI pages 329Â–334, 2008. [86] V Mahalingam and N. Ranganathan. T iming Based Placement Considering Uncertainty due to Process V ariations. In Accepted for Publication (Sep 2008) IEEE T r ansactions on VLSI Systems page 2008. [87] V Mahalingam, N. Ranganathan and J. E. Harlo w. A No v el Approach for V ariation A w are Po wer Minimization during Gate sizing. In International Symposium on Low P ower Electr onic Design pages 174Â–179, 2006. [88] V Mahalingam, N. Ranganathan and J. E. Harlo w. Fuzzy Optimization Approach for Gate Sizing in the Presence of Process V ariations. In IEEE T r ansactions on VLSI Systems pages 975Â–984, 16(8), 2008. [89] V Mehrotra, S. Sam, D. Boning, A. Chandrakasan, R. V al1ishayee, and S. Nassif. A Methodology for Modeling the Ef fects of Systematic W ithinDie Interconnect and De vice V ariation on Circuit Performance. In Design A utomation Confer ence pages 172Â–175, 2000. 
[90] W Chen, C.T Hseih and M. Pedram Simultaneous Gate Sizing and Placement. In IEEE T r ansactions on Computer Aided Design pages 206Â–214, 2000. [91] W Cho w and K. Bazar gan. Incremental Placement for T iming Optimization. In International Confer ence on Computer Aided Design pages 463Â–466, 2003. [92] W Maly. Quality of Design from an IC Manuf acturing Perspecti v e. In International Symposium on Quality Electr onic Design pages 235Â–236, 2001. [93] W Shi and Z. Li. An o(nlogn) time algorithm for optimal b uf fer insertion. In Design A utomation Confer ence pages 580Â–585, 2005. [94] W .E Donath. Placement and A v erage Interconnection lengths of Computer Logic. In IEEE T r ansactions on Cir cuits and Systems pages 272Â–277, 26(4), 1979. [95] X. Bai, C. V iswesw ariah, S. N. Philip and H. J. Da vid. UncertaintyA w are Circuit Optimization. In Design A utomation Confer ence pages 58Â–63, 2002. 95
PAGE 106
[96] Z. Jiang and W Shi. Circuitwise b uf fer insertion and gate sizing with scalability In Design A utomation Confer ence pages 708Â–713, 2008. [97] Z. Ren, D. Z. P an and D. S. K ung. Sensiti vity Guided Net W eighting for Placement Dri v en Synthesis. In International Symposium on Physical Design pages 124Â–131, 2004. 96
ABOUT THE AUTHOR

Mahalingam Venkataraman (V. Mahalingam) (S'03) received the B.E. degree in Computer Science from Sri Venkateswara College of Engineering (SVCE), University of Madras, India, in 2003 and the Master's degree in Computer Engineering from the University of South Florida, Tampa, in 2005, where he is currently working toward the Ph.D. degree in Computer Science and Engineering. In 2002 and 2003, he worked as a part-time research assistant at Waran Research Foundation (WARFT), India. In 2008, he worked as a student intern in the ASIC Design for Test (DFT) group at Texas Instruments, Dallas. His research interests include design automation, circuit design, VLSI testing and Design for Manufacturability. Mahalingam is a student member of the IEEE and IEEE Computer Society and has received the IEEE Richard E. Merwin scholarship award in 2006 for excellence in USF-IEEE-CS leadership. He has coauthored over 15 papers in refereed journals and conferences and has actively been involved as a reviewer in TVLSI, TCAD, TODAES, DAC, VLSID, ISLPED, ISQED and ISVLSI.
