TECHNISCHE UNIVERSITÄT MÜNCHEN

Chair of Technical Electronics (Lehrstuhl für Technische Elektronik)

A Fully-Integrated, Digitally-Enhanced Low-Dropout Voltage Regulator for Energy-Constrained Microcontroller Systems

Michael Lüders

Complete reprint of the dissertation approved by the Faculty of Electrical Engineering and Information Technology of the Technische Universität München for the award of the academic degree of Doktor-Ingenieur.

Chair: Prof. Dr.-Ing. Ulf Schlichtmann

Examiners of the dissertation:

1. Prof. Dr. rer. nat. Doris Schmitt-Landsiedel

2. Prof. Dr.-Ing. Walter Stechele

The dissertation was submitted to the Technische Universität München on 18.05.2016 and accepted by the Faculty of Electrical Engineering and Information Technology on 22.08.2016.

Preface

Ultra-low-power microcontroller units (MCUs) serve as the brain of most of today's small-scale battery-powered and energy-harvesting applications, such as smart meters, alarm systems, and monitoring devices. The requirements for such systems differ markedly from those of performance- and cost-driven mainstream applications. Instead, the ultimate goal is to minimize the system energy consumption, and in turn to maximize the battery lifetime, while maintaining the processing power needed for the application. This book intends to provide a systematic introduction to architectural and circuit design techniques for power management of ultra-low-power MCU systems. A holistic energy saving concept for ultra-low-power MCU systems is elaborated step by step, involving application requirements, system architecture, and circuit design techniques. This holistic concept outperforms traditional circuit design based on analog/digital partitioning and leaves much more flexibility for system energy optimization. The key to this concept is a digitally-enhanced low-dropout voltage regulator (LDO) with multiple current drive levels, controlled depending on the power demand of the MCU digital core. By combining high LDO current efficiency under all load conditions with a small integrated capacitance at the LDO output, this regulator enables highly flexible and energy-efficient system operation. In this way, the presented power management solution combines ultra-low power consumption during sleep with an energy-efficient wake-up, enabling system energy savings of up to a factor of 4.6. It is highly flexible, thereby maximizing the battery lifetime for a broad range of application scenarios while achieving lowest system cost.

This book consists of eight chapters, addressing different system and circuit aspects of power management for ultra-low-power MCUs. Chapter 2 sheds light on ultra-low-power MCU systems and typical application scenarios and outlines the resulting power management requirements and challenges. Going into further detail, chapter 3 first provides a systematic power analysis of an MCU system and typical application scenarios. In this way, the power management requirements and trade-offs are identified, providing valuable initial directions for architectural decisions. Supplying the MCU digital core by a fully-integrated LDO with only a small load capacitance turns out to be highly beneficial for a broad range of typical ultra-low-power application scenarios. In detail, this solution enables ultra-low power consumption during sleep as well as energy-efficient wake-up while achieving lowest system costs. Chapter 4 covers theoretical and practical issues that follow from this system architecture decision and have to be considered for the design of a fully-integrated LDO to supply CMOS digital circuits. The previously presented fully-integrated LDO topologies are reviewed, revealing that the vast majority of them are not suited to supply CMOS digital circuits. An alternative LDO topology, proposed by Ivanov in 2008, is introduced in chapter 5. A detailed circuit analysis identifies and quantifies the fundamental design trends and trade-offs. Based on these findings, digital-enhancement techniques are proposed in chapter 6. By efficiently making use of known system power information, the fundamental LDO trade-offs are relaxed and the power management overhead in both active mode and sleep mode is drastically reduced. Chapter 7 ultimately demonstrates the benefits of these digital-enhancement techniques for system energy consumption, always keeping typical ultra-low-power application scenarios in mind. This work concludes with a discussion of the benefits and limitations of the presented holistic energy saving concept as well as an outlook on future research work. A brief summary at the end of each chapter provides a quick reference for readers who want an overview without going through all the details.

Contents

List of Figures

Nomenclature

1 Introduction

2 Ultra-Low-Power Microcontrollers and Applications
  2.1 Ultra-Low-Power Microcontrollers
  2.2 Typical Application Scenarios
  2.3 Energy-Aware System Design
    2.3.1 Energy Saving Requirements and Challenges
    2.3.2 Technology Scaling for Ultra-Low-Power Microcontrollers
  2.4 Summary

3 System Level Energy Saving Concepts
  3.1 Analytical Investigation of System Energy Consumption
    3.1.1 System Sub-Blocks and their Contribution
    3.1.2 System Operating Modes and their Contribution
  3.2 Voltage Regulator Topology for MCU Digital Core
  3.3 Power Saving Techniques during Sleep
  3.4 Efficient Switching between Power Modes
  3.5 Summary

4 LDO Voltage Regulators for Supplying Digital Circuits
  4.1 LDO Fundamentals and Control Theory
    4.1.1 Negative Feedback Theory
    4.1.2 Regulation Performance
    4.1.3 Loop Stability
  4.2 LDO Topologies and Design Considerations
    4.2.1 Externally Compensated LDO Topology
    4.2.2 Internally Compensated LDO Topology
  4.3 LDO Load Environment
    4.3.1 Power Supply Requirements of the MCU Digital Core
    4.3.2 On-Chip Power Distribution Network
    4.3.3 External Load Capacitance
  4.4 Summary

5 Any-Load Stable LDO Topology
  5.1 Multiple-Loop LDO Topology
    5.1.1 Loop Stability in a Multiple-Loop System
    5.1.2 Circuit Implementation
  5.2 Slow Stage: Folded-Cascode Amplifier
  5.3 Fast Stage: Cascoded Flipped Voltage Follower
    5.3.1 Basic Voltage Follower
    5.3.2 Flipped Voltage Follower
    5.3.3 Cascoded Flipped Voltage Follower
  5.4 Analysis of Cascoded Flipped Voltage Follower
    5.4.1 Pole Location
    5.4.2 Voltage Gain
    5.4.3 Loop Stability
    5.4.4 Transient Response
    5.4.5 Equivalent Circuit Model
  5.5 Analysis and Mitigation of Parameter Variations
    5.5.1 Supply Voltage Variations
    5.5.2 Temperature Variations
    5.5.3 Process Variations
  5.6 Pass-Transistor Design
    5.6.1 Conventional Pass-Transistor Topology
    5.6.2 Cascode Pass-Transistor Topology
  5.7 Current Sink Capability
  5.8 LDO Scaling Laws and Trade-Offs
    5.8.1 Scaling Laws for LDO Performance Parameters
    5.8.2 CMOS Technology Scaling
    5.8.3 Universal LDO Design and Adaption Strategy
  5.9 Summary

6 Digitally-Enhanced LDO Voltage Regulators
  6.1 Introduction to Digital-Enhancement Techniques
  6.2 Demonstrator System and Design-for-Test
    6.2.1 Demonstrator System
    6.2.2 Measurement and Test Options
  6.3 Principle of Discrete Load Adaption
    6.3.1 Drive Capability Determination
    6.3.2 Dynamic Drive Level Adaption
    6.3.3 LDO Circuit Implementation
  6.4 Discrete Load Adaptive LDO Steady-State Operation
    6.4.1 Drive Capability Adaption Strategy
    6.4.2 Maintaining LDO Loop Stability
    6.4.3 Adapting LDO Quiescent Current
    6.4.4 Experimental Results on Static Regulation Performance
    6.4.5 Experimental Results on Transient Regulation Performance
    6.4.6 Dynamic Range Limitations
  6.5 Discrete Load Adaptive LDO Dynamic Drive Level Adaption
    6.5.1 Dynamic Drive Level Adaption Strategy
    6.5.2 Experimental Results on Dynamic Drive Level Adaption
  6.6 LDO Operation under Applicative Conditions
  6.7 Summary

7 Conclusion on System Energy Consumption
  7.1 Demonstrator System and Operating Modes
  7.2 Experimental Results on System Energy Consumption
    7.2.1 System Power Consumption during Sleep
    7.2.2 Efficient Switching between Power Modes
    7.2.3 Energy/Performance Scalability in Active Mode
  7.3 Summary

8 Summary and Outlook

Selected Publications

References

Acknowledgements

List of Figures

1.1 Measured power consumption profile of smoke and glass break detector

2.1 Block diagram and chip micrograph of an ultra-low-power MCU system
2.2 Schematic power profile of typical ultra-low-power application scenarios
2.3 Optimal CMOS technology selection for ultra-low-power MCU systems

3.1 Block diagram of a basic ultra-low-power MCU system
3.2 Power consumption of digital CMOS circuits
3.3 Predicted system energy/power consumption in active and sleep mode
3.4 Predicted system power consumption for a periodic sensor data analysis
3.5 Integrated voltage regulator to supply the MCU digital core
3.6 Switching-off the MCU digital core by power gating or voltage regulator disable
3.7 Predicted power consumption for a periodic sensor data analysis

4.1 Circuit diagram of the basic LDO topology
4.2 Small-signal equivalent circuit model of the basic LDO topology
4.3 LDO behavior in response to load transient step and line transient step
4.4 Fundamental design challenges and trade-offs for LDO circuit design
4.5 Externally compensated LDO topology and its schematic frequency response
4.6 Internally compensated LDO topology and its schematic frequency response
4.7 Performance comparison of selected internally compensated LDO topologies
4.8 Power supply system of a highly-integrated ultra-low-power MCU system
4.9 Transient load current profile of the MCU digital core and resulting core voltage
4.10 On-chip power distribution network and its equivalent circuit model
4.11 External LDO load capacitance and its equivalent circuit model

5.1 Small-signal block diagram of the any-load stable LDO
5.2 Bode plot of the any-load stable LDO at different load conditions
5.3 Phase margin of the any-load stable LDO at different load conditions
5.4 Circuit diagram of the any-load stable LDO
5.5 Circuit diagram of the slow stage - folded-cascode amplifier
5.6 Circuit diagram of the basic voltage follower and flipped voltage follower
5.7 Circuit diagram of the fast stage - cascoded FVF
5.8 Small-signal equivalent circuit diagram of the cascoded FVF
5.9 Pole frequency of the cascoded FVF as a function of load current
5.10 Small-signal voltage gain of the cascoded FVF as a function of load current
5.11 Phase margin of the cascoded FVF as a function of load current
5.12 Transient voltage undershoot of the cascoded FVF as a function of load current
5.13 Simplified equivalent circuit model of the cascoded FVF
5.14 Cascoded FVF under the presence of PVT parameter variations
5.15 Impact of supply voltage and temperature variations on the cascoded FVF
5.16 Impact of process corners variations on the cascoded FVF
5.17 Conventional pass-transistor topology and required device dimensions
5.18 Basic principle and circuit diagram of the cascode pass-transistor topology
5.19 Extension of the cascoded FVF to improve its current sink capability
5.20 Simulated transient behavior for universal LDO design and adaption strategy

6.1 Overview of digital-enhancement techniques to relax LDO design trade-offs
6.2 Simplified block diagram of the power management unit
6.3 Wide bandwidth sensing scheme to characterize the LDO transient behavior
6.4 On-chip programmable load to characterize the LDO transient behavior
6.5 Load-adaptive LDO adapting its quiescent current depending on the load current
6.6 Timing diagram for dynamic adaption of the LDO current drive capability
6.7 Circuit diagram of the discrete load adaptive LDO
6.8 Chip micrograph of the discrete load adaptive LDO
6.9 Drive capability adaption strategies for the cascoded FVF
6.10 Cascoded FVF small-signal characteristics for each current drive level
6.11 Folded-cascode amplifier small-signal characteristics for each current drive level
6.12 Simulated Bode plot for each current drive level
6.13 LDO quiescent current as a function of current drive capability
6.14 Static line regulation performance measured for each current drive level
6.15 Static load regulation performance measured for each current drive level
6.16 Offset error measured for each current drive level
6.17 Load transient response measured under various operating conditions
6.18 Circuit diagram of the pass-transistor with dynamic drive level adaption
6.19 Circuit diagram of the cascoded FVF with dynamic drive level adaption
6.20 Implementation details of string resistor ladder and programmable current mirror
6.21 Circuit diagram of the folded-cascode amplifier with dynamic drive level adaption
6.22 Transient voltage error in response to a dynamic LDO drive level adaption
6.23 Settling time in response to a dynamic LDO drive level adaption
6.24 Steady-state operation measured under applicative operating conditions
6.25 Load transient response measured under applicative operating conditions
6.26 Dynamic drive level adaption measured under applicative operating conditions
6.27 Key performance parameters of the discrete load adaptive LDO

7.1 Demonstrator systems for evaluation of the energy-aware power management unit
7.2 Overview of MCU operating modes and associated functions
7.3 Measured power consumption in sleep mode as function of temperature
7.4 Measured energy consumption required for wake-up
7.5 Measured power consumption for a periodic sensor data analysis
7.6 Measured energy consumption in active mode as function of clock frequency

Nomenclature

List of Abbreviations

AC        alternating current
APM       active power mode
BG        bandgap reference
BIA       buffer impedance attenuation
Bias      bias current generator
BOR       brownout reset
BSIM      Berkeley metal-oxide transistor model
CASFVF    cascoded flipped voltage follower
CDL       current drive level
CMOS      complementary metal-oxide semiconductor
CPU       central processing unit
CS        clock generation unit
DAC       digital-to-analog converter
DC        direct current
DCO       digitally controlled oscillator
DFC       damping factor control
EA        error amplifier
ESL       equivalent series inductance
ESR       equivalent series resistance
FeRAM     ferroelectric random access memory
FLL       frequency-locked loop
FO4       process-independent delay metric
FOM       figure-of-merit
FVF       flipped voltage follower
IoT       internet of things
IR        resistive voltage drop
ITRS      international technology roadmap for semiconductors
KCL       Kirchhoff's current law
LDO       low-dropout voltage regulator
LHP       left half plane
LPM       low power mode
LSTP      low standby power
MCU       micro-controller unit
MIAC      model identification adaptive control
MIM       metal-insulator-metal capacitance
MIPS      mega instructions per second
MOS       metal-oxide semiconductor
MRAC      model reference adaptive control
NMC       nested Miller compensation
NMOS      n-channel metal-oxide transistor
OTA       operational transconductance amplifier
PFM       pulse frequency modulation
PLL       phase-locked loop
PMM       power management unit
PMOS      p-channel metal-oxide transistor
PSRR      power supply rejection ratio
PTAT      proportional to absolute temperature
PVT       process, voltage and temperature variations
PWM       pulse width modulation
RF        radio frequency
RHP       right half plane
RNMC      reverse-nested Miller compensation
RTC       real-time clock
SMC       single Miller compensation
SoC       system-on-chip
SRAM      static random access memory
SVS       supply voltage supervisor
VDD_Fail  supply voltage failure detection
VREF      reference voltage generator

List of Latin Symbols

A_v,i      small-signal voltage gain i
C_i        capacitance i
C_GATE,i   total gate capacitance of a MOS transistor
C_GD,i     gate-drain capacitance of a MOS transistor
C_GS,i     gate-source capacitance of a MOS transistor
C_LOAD     load capacitance
C_M        Miller compensation capacitance
C_OUT,i    output capacitance
C_OX       oxide capacitance
D          duty cycle of system active mode
E_total    total system energy consumption
E_wake-up  energy required for wake-up
EA_i       error amplifier i
EA_AUX     auxiliary error amplifier
EA_FAST    fast error amplifier
EA_SLOW    slow error amplifier
f_CLK      switching clock frequency
G_CL(s)    closed-loop supply-to-output transfer-function
G_OL(s)    open-loop supply-to-output transfer-function
g_ds,i     small-signal output conductance
g_m,i      small-signal transconductance
g_mb,i     small-signal backgate transconductance
H_CL(s)    closed-loop error-to-output transfer-function
H_OL(s)    open-loop error-to-output transfer-function
I_i        branch current i
I_BIAS     bias current
I_DS,i     drain-source current of a MOS transistor
I_DYN      dynamic current consumption
I_leak     leakage current
I_LOAD     load current
I_REF      reference current
I_SD,i     source-drain current of a MOS transistor
I_STAT     static current consumption
I_q        quiescent current
i_i(s)     branch current i in frequency domain
k          Boltzmann constant
L          channel length of a MOS transistor
L_i        inductance i
L_D        lateral diffusion overlap
M_i        MOS transistor i
M_CASC     cascode pass-transistor
M_CG,i     common-gate transistor
M_CM,i     current-mirror transistor
M_CS,i     common-source transistor
M_PASS     pass-transistor
M_SINK     sink transistor
m          mirror ratio
N          number of digital logic gates
n          weak inversion slope factor
P_active   system power consumption in active mode
P_DYN      dynamic power consumption
P_LOSS     power loss
P_sleep    system power consumption in sleep mode
P_STAT     static power consumption
P_S        active-sleep power saving factor
p_i        small-signal pole i
q          elementary charge
R_i        resistance i
r_ds,i     small-signal drain-source resistance
r_in,i     small-signal input resistance
r_out,i    small-signal output resistance
T          absolute temperature
T_0        absolute reference temperature
t_G        gate delay
t_active   time in active mode
t_cycle    cycle time
t_ox       oxide thickness
t_sleep    time in sleep mode
t_wake-up  time required for wake-up
V_i        node voltage i
V_AMP      folded-cascode output voltage
V_BIAS     bias voltage
V_CASC     cascode voltage
V_CGATE    cascode pass-transistor gate voltage
V_CORE     core supply voltage
V_DD       positive supply voltage
V_DG,i     drain-gate voltage of a MOS transistor
V_DROP     pass-transistor voltage drop
V_DS,i     drain-source voltage of a MOS transistor
V_FOLD     folding node voltage
V_GATE     pass-transistor gate voltage
V_GD,i     gate-drain voltage of a MOS transistor
V_GS,i     gate-source voltage of a MOS transistor
V_IN       input voltage
V_OS       offset voltage error
V_OUT      output voltage
V_REF      reference voltage
V_SD,i     source-drain voltage of a MOS transistor
V_SG,i     source-gate voltage of a MOS transistor
V_T        thermal voltage
V_th,n     threshold voltage of an NMOS transistor
V_th,p     threshold voltage of a PMOS transistor
v_i(s)     node voltage in frequency domain
W          channel width of a MOS transistor
W/L        aspect ratio of a MOS transistor
Z_CL(s)    closed-loop output impedance
Z_OL(s)    open-loop output impedance
z_i        small-signal zero i
z_out,i    small-signal output impedance

List of Greek Symbols

α_i     activity rate of the digital logic gate i
α_µ     temperature dependence of the carrier mobility
α_Vth   temperature dependence of the threshold voltage
β_FB    feedback divider ratio
η_C     current efficiency
η_P     power efficiency
λ_si    channel length modulation in strong inversion
λ_wi    channel length modulation in weak inversion
µ_n     carrier mobility of an NMOS transistor
µ_p     carrier mobility of a PMOS transistor
ω_GBW   angular frequency gain-bandwidth
ω_n     angular resonance frequency
ω_p,i   angular frequency of the small-signal pole
ω_z,i   angular frequency of the small-signal zero
φ_M     phase margin
ζ       damping factor

1 Introduction

Ultra-low-power micro-controller units (MCUs) serve as the brain of most of today's small-scale battery-powered and energy-harvesting applications, such as smart meters, alarm systems, or sensor and monitoring devices. The requirements for such systems differ markedly from those of performance- and cost-driven mainstream applications. Instead, the ultimate goal here is to minimize the system energy consumption, and in turn to maximize the battery lifetime, while maintaining the processing power needed for the application. For this reason, a sophisticated energy saving concept is required, which needs to be addressed in the context of the application.

Typical ultra-low-power applications periodically wake up to acquire and process sensor data. During active mode, the MCU system needs adequate processing performance to react to its environment. For the vast majority of the time, when it is idle, the MCU system is set into a deep sleep mode consuming only a few microwatts. This periodic operation is also reflected in the measured power consumption profiles shown in Fig. 1.1. Here, two typical examples, a smoke detector (Mitchell, 2006) and a glass break detector (Kammel and Venkat, 2007), represent both ends of the broad range of application scenarios. The overall energy consumption of the MCU system can be defined as the sum of the following three components:

$$E_{\text{total}} = P_{\text{active}} \cdot t_{\text{active}} + P_{\text{sleep}} \cdot (t_{\text{cycle}} - t_{\text{active}}) + E_{\text{wake-up}} \tag{1.1}$$
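As a short worked step, Eq. (1.1) can be divided by the cycle time to obtain the average system power per cycle. The symbol $\bar{P}$ and the definition $D = t_{\text{active}} / t_{\text{cycle}}$ are assumptions introduced here for illustration; the nomenclature only describes $D$ as the duty cycle of the system active mode.

$$\bar{P} = \frac{E_{\text{total}}}{t_{\text{cycle}}} = D \cdot P_{\text{active}} + (1 - D) \cdot P_{\text{sleep}} + \frac{E_{\text{wake-up}}}{t_{\text{cycle}}}, \qquad D = \frac{t_{\text{active}}}{t_{\text{cycle}}}$$

For $D \to 0$ the active term vanishes and the average power approaches $P_{\text{sleep}} + E_{\text{wake-up}} / t_{\text{cycle}}$, so only the sleep power and the wake-up energy remain relevant.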

The major design challenge is to achieve minimum energy consumption within each mode as well as for the transitions between modes. Due to fundamental design trade-offs, these requirements cannot be optimized concurrently. Instead, a “sweet spot” for system energy consumption needs to be found, which strongly depends on the application scenario. For example, some applications like the smoke detector spend most of their time sleeping and thus require ultra-low power consumption during sleep. Other applications such as the glass break detector wake up very frequently, making an energy-efficient wake-up critical. The energy saving concept should therefore be highly flexible in order to adapt to the needs of different applications.
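The dependence of the “sweet spot” on the application can be illustrated with a minimal sketch that evaluates the three terms of Eq. (1.1) for two cycle profiles. All numeric values, the class `CycleProfile`, and the profile names are hypothetical and chosen only for illustration; they are not the measured data of Fig. 1.1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CycleProfile:
    """One periodic wake/sleep cycle (hypothetical container, not from the thesis)."""
    p_active_w: float  # P_active: power in active mode [W]
    p_sleep_w: float   # P_sleep: power in sleep mode [W]
    t_active_s: float  # t_active: time in active mode per cycle [s]
    t_cycle_s: float   # t_cycle: cycle time [s]
    e_wakeup_j: float  # E_wake-up: energy required for one wake-up [J]


def energy_terms(p: CycleProfile) -> dict[str, float]:
    """Return the three summands of Eq. (1.1) in joules."""
    return {
        "active": p.p_active_w * p.t_active_s,
        "sleep": p.p_sleep_w * (p.t_cycle_s - p.t_active_s),
        "wake-up": p.e_wakeup_j,
    }


def total_energy(p: CycleProfile) -> float:
    """E_total per cycle according to Eq. (1.1)."""
    return sum(energy_terms(p).values())


def energy_shares(p: CycleProfile) -> dict[str, float]:
    """Fraction of E_total contributed by each term."""
    terms = energy_terms(p)
    total = sum(terms.values())
    return {name: value / total for name, value in terms.items()}


def dominant_term(p: CycleProfile) -> str:
    """Name of the term that contributes most to E_total."""
    terms = energy_terms(p)
    return max(terms, key=terms.get)


# Hypothetical example values (assumptions, not measurements):
# a long cycle with a short active burst ...
mostly_sleeping = CycleProfile(1e-3, 2e-6, 1e-3, 10.0, 1e-6)
# ... and a short cycle in which the system wakes up very often.
frequent_wakeup = CycleProfile(1e-3, 2e-6, 50e-6, 2e-3, 8e-8)

if __name__ == "__main__":
    for name, profile in [("mostly sleeping", mostly_sleeping),
                          ("frequent wake-up", frequent_wakeup)]:
        shares = energy_shares(profile)
        summary = ", ".join(f"{k}: {v:.1%}" for k, v in shares.items())
        print(f"{name}: E_total = {total_energy(profile):.3e} J ({summary})")
```

With these assumed numbers, the sleep term dominates the long-cycle profile and the wake-up term dominates the short-cycle profile, which mirrors the qualitative argument above without reproducing the measured percentages.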

For ultra-low-power MCU systems fabricated in modern deep submicron CMOS technologies, the battery lifetime becomes more and more limited by the increasing leakage current. This limitation is exacerbated by the excessive idle times and the accordingly very low average digital activity of typical ultra-low-power applications. In order to reclaim the benefits of duty cycling, aggressive power saving techniques are required during sleep. However, power management techniques known from mobile application processors, which most commonly combine a switching regulator with power gating for leakage reduction (Gammie et al., 2010), are not best suited here. It will be examined why they target a different class of applications with a different “sweet spot” for system energy consumption.

[Figure 1.1: power profiles PSYSTEM (0.5 mW/div), each with an energy breakdown into Sleep Energy, Active Energy, and Wake-Up Energy. (a) Smoke Detector: tcycle = 8 s, tactive = 0.7 ms, twake-up = 1.0 ms; breakdown values 97%, 1%, 2%. (b) Glass Break Detector: tcycle = 2 ms, tactive = 35 µs, twake-up = 50 µs; breakdown values 48%, 32%, 20%.]

Fig. 1.1. Measured power consumption profile of (a) a smoke detector (Mitchell, 2006), and (b) a glass break detector application (Kammel and Venkat, 2007) at VDD = 3.0 V and Temp. = 25 °C. The results of these measurement scenarios will be discussed in more detail at the end of this work (see Chap. 7).

In this work, a holistic energy saving concept for ultra-low-power MCU systems fabricated in deep submicron CMOS technologies is presented, involving application requirements, system architecture and novel circuit design techniques. A mixed-signal approach is presented to overcome both the system-level and circuit-level challenges and in this way to minimize the energy consumption in both the analog and digital domain. Supplying the MCU digital core by a fully-integrated low-dropout voltage regulator (LDO) with only a small load capacitance turns out to be highly beneficial for a broad range of typical ultra-low-power application scenarios. In detail, this concept enables ultra-low power consumption during sleep and energy-efficient wake-up while achieving lowest system costs. At the same time, the absence of a large external capacitance presents severe design challenges for stability and transient behavior of an LDO. By efficiently making use of known system power information, the fundamental LDO trade-offs are relaxed and the power management overhead in active mode is drastically reduced.

2 Ultra-Low-Power Microcontrollers and Applications

Deeply embedded into the physical world, ultra-low-power micro-controller units (MCU) are nowadays an inconspicuous companion in everyday life (Saha and Mukherjee, 2003). They evaluate environmental information in real-time, by acquiring, processing, analyzing and eventually disseminating the sensor data. For ubiquitous use, these systems should be cost effective, have a small form factor and operate autonomously over several years (Chee et al., 2006, p. 3ff.). The emergence of these small-scale systems is essentially the latest trend of Moore’s Law toward the miniaturization and ubiquity of computing devices, heralding the era of the internet-of-things (IoT). To achieve autonomous operation over several years, these systems are severely energy-constrained. While frequent battery replacement is usually prohibitive in cost and effort, both the small form factor and cost considerations limit the amount of energy storage available. To overcome the battery lifetime limitations, harvesting energy from the environment has recently gained tremendous attention both in industry and academia (Strba, 2009; Vullers et al., 2010). Nevertheless, the power available is again very limited and, above all, highly variable (Brunelli and Benini, 2009). In either case, successful operation over a long period of time demands ultra-low power consumption under all operating conditions.

This chapter provides an introductory overview of ultra-low-power MCU systems and typical application scenarios. Based on this, the requirements and challenges are highlighted for the design of ultra-low-power MCU systems in general, and their energy saving concept in particular. In detail, this work is motivated by examining two fundamental questions: (1) What are the required characteristics of ultra-low-power MCU systems and their energy saving concept in order to achieve the minimum power consumption under all system operating conditions? (2) What are the technology scaling trends for ultra-low-power MCU systems and the resulting requirements and challenges for the energy saving concept in the deep submicron regime? Based on this discussion, a holistic energy saving concept for ultra-low-power MCU systems is envisioned - including system, circuit and technology aspects.

2.1 Ultra-Low-Power Microcontrollers

An ultra-low-power micro-controller unit (MCU) forms a highly-integrated mixed-signal system-on-chip (SoC), which is particularly optimized for the needs of small-scale sensor systems (Hempstead et al., 2006). It typically integrates the complete analog and digital signal processing chain required for sensor data processing. For this purpose, an ultra-low-power MCU system combines not only a CPU with both volatile (e.g. SRAM) and non-volatile memory (e.g. Flash, FeRAM) on a single chip, but also various, multifaceted analog, digital, and mixed-signal peripherals. This, for example, includes data converters and complete sensor front-ends for acquiring the sensor data as well as display drivers and various communication interfaces for disseminating the sensor data. At the same time, the MCU system offers programmability and wide flexibility to adapt to a broad range of application scenarios and their needs. The block diagram as well as the chip micrograph of a commercially available ultra-low-power MCU system is depicted in Fig. 2.1 (Zwerg et al., 2011).

[Figure 2.1: block diagram showing the blocks CS, PMM (BG, REF, SVS, LDO), XTAL, DCO, CPU (MSP430), JTAG, MPU, FeRAM 16k, RAM 1k, ROM 4k, GPIO, Timers, USCI (I2C, SPI, UART), RTC, COMP, and 10b-ADC, connected by the 16-bit MAB/MDB; chip micrograph with dimension labels 2.0 mm and 2.2 mm.]

Fig. 2.1. (a) Block diagram, and (b) chip micrograph of an ultra-low-power MCU system fabricated in a 0.13 µm standard CMOS technology (Zwerg et al., 2011). It comprises a 16-bit MSP430 CPU, an integrated power management (PMM) and clock generation unit (CS), analog and digital peripherals and a non-volatile FeRAM memory for fast write capability.

The major challenge for the design of ultra-low-power MCU systems is achieving unprecedented energy efficiency while at the same time meeting the functional and performance requirements. For this purpose, ultra-low-power MCU systems are optimized for the event-driven nature of smart sensor systems by providing various low-power operating modes, in which the system is gradually disabled. These low-power modes are entered by firmware routines and can be exited in response to internal and/or external events. In this context, both the power management and the clock generation unit play an essential role. While they do not add any application functionality, their design is the key for dynamically scaling the system performance according to the application needs while keeping both digital and analog power consumption always minimal. In detail, the power management unit provides power to the MCU digital core, which combines all low-voltage digital CMOS circuits. It manages the system operating modes as well as the transition between them. The system clock is generated by a highly flexible clock generation unit, which provides several clock sources the application can choose from. This typically includes an on-chip digitally controlled oscillator as well as a crystal oscillator (Zwerg et al., 2011).

The external supply voltage of ultra-low-power MCU systems is determined by the battery voltage over its lifetime. The characteristics of various battery chemistries suggest a typical supply voltage range from 3.6 V down to 1.9 V. The cell voltage of most battery types remains relatively constant over the lifetime and drops quickly when approaching the end of life (Lahiri et al., 2002).


2.2 Typical Application Scenarios

Typical ultra-low-power applications are characterized by a periodic operating profile, which is schematically depicted in Fig. 2.2. In response to an interrupt event (e.g. a timer expiration), the MCU system wakes up in order to acquire and process sensor data. During active mode, the MCU system needs adequate processing performance to be reactive to its environment. For the vast majority of the time, while idle, the MCU system is set into an ultra-low-power sleep mode consuming only a few microwatts. While this operating profile is common to most ultra-low-power applications, its parameters differ significantly between them.

First and foremost, the cycle time ($t_{\mathrm{cycle}}$) is defined by the rate of sensor data acquisition. Strongly depending on the application requirements, or more precisely, on the measured physical property and its characteristics, the cycle time can range from some hundred microseconds up to tens of seconds or even minutes (Lee et al., 2012). The processing performance required in active mode ($t_{\mathrm{active}}$) also strongly depends on the application requirements. The sensor data analysis can be as simple as a level supervision and detection or, in the other extreme case, can be as complex as advanced signal-processing algorithms like a Fast Fourier Transform. It seems obvious that faster code execution should help to save energy. A higher clock frequency reduces the time spent in active mode and, consequently, extends the time spent in sleep mode. However, in many ultra-low-power applications, the speed of processing is gated by external events. These can be, for instance, analog settling times required during sensor data acquisition. To avoid running idle, the MCU system often does not operate at the maximum operating speed, but instead adapts its speed. In any case, the power consumption during active mode ($P_{\mathrm{active}}$) is often increased by more than three decades compared to sleep mode. Nevertheless, as typical ultra-low-power applications wait the vast majority of time for the next trigger event ($t_{\mathrm{sleep}}$), the power consumption during sleep ($P_{\mathrm{sleep}}$) significantly contributes to the total energy consumption. To minimize the power consumption, the MCU system enters a sleep mode, which still meets the functionality requirements during sleep. For example, a real-time clock (RTC) might remain active in sleep mode, providing a time base to periodically trigger a wake-up event. Recovering from sleep mode into active mode can impose a significant delay ($t_{\text{wake-up}}$), which is primarily determined by settling of analog components within the power management and the clock generation unit. Associated with this is an energy overhead ($E_{\text{wake-up}}$) required to bring the system from sleep mode into active mode. From an application perspective, the wake-up time is limited by the maximum tolerated interrupt latency. While this is uncritical for many ultra-low-power application scenarios (e.g. measuring a temperature some hundred microseconds earlier or later), it becomes an essential parameter for applications with hard real-time requirements (e.g. handling of data protocols).

[Figure 2.2: schematic plot of log(Power) over time for one cycle $t_{\mathrm{cycle}}$, annotated with $t_{\mathrm{sleep}}$, $t_{\mathrm{active}}$, $t_{\text{wake-up}}$, $P_{\mathrm{active}}$, $P_{\mathrm{sleep}}$, and $E_{\text{wake-up}}$.]

Fig. 2.2. Schematic power profile of typical ultra-low-power application scenarios. Very low duty-cycles of typically one percent and below lead to the need for ultra-low power consumption in sleep mode as well as an energy-efficient wake-up.

In summary, the total system energy consumption of the MCU system can be expressed as:

$$E_{\mathrm{total}} = P_{\mathrm{active}} \cdot t_{\mathrm{active}} + P_{\mathrm{sleep}} \cdot t_{\mathrm{sleep}} + E_{\text{wake-up}} \qquad (2.1)$$

By normalizing both time and power consumption, the ultra-low-power application scenarios can be further abstracted:

$$E_{\mathrm{total}} = P_{\mathrm{active}} \cdot t_{\mathrm{cycle}} \cdot \bigl(D + (1 - D) \cdot PS\bigr) + E_{\text{wake-up}} \qquad (2.2)$$

where $D = t_{\mathrm{active}}/t_{\mathrm{cycle}}$ defines the duty-cycle and $PS = P_{\mathrm{sleep}}/P_{\mathrm{active}}$ defines the active-sleep power saving factor. Strongly depending on the application scenario, these components contribute a different amount to the total energy consumption. The energy saving concept thus clearly needs to be addressed in the context of the application. A flexible adaptation to the application needs is therefore greatly advantageous.
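The relation between Eq. (2.1) and Eq. (2.2) can be illustrated with a minimal Python sketch. It assumes that the sleep time is the remainder of the cycle, t_sleep = t_cycle − t_active, as in Eq. (1.1). The function names and all numerical values are hypothetical and chosen for illustration only; they are not measurement results from this work.

```python
def energy_per_cycle(p_active, p_sleep, t_active, t_cycle, e_wakeup):
    """Energy per cycle following Eq. (2.1).

    Assumption: t_sleep = t_cycle - t_active, as in Eq. (1.1).
    Units: watts, seconds, joules.
    """
    t_sleep = t_cycle - t_active
    return p_active * t_active + p_sleep * t_sleep + e_wakeup


def energy_per_cycle_normalized(p_active, t_cycle, duty, ps, e_wakeup):
    """Energy per cycle following Eq. (2.2), with duty-cycle D and power saving factor PS."""
    return p_active * t_cycle * (duty + (1.0 - duty) * ps) + e_wakeup


# Hypothetical operating point (illustrative values only).
P_ACTIVE = 1.0e-3   # 1 mW in active mode
P_SLEEP = 1.0e-6    # 1 uW in sleep mode
T_ACTIVE = 1.0e-3   # 1 ms active time per cycle
T_CYCLE = 1.0       # 1 s cycle time
E_WAKEUP = 0.5e-6   # 0.5 uJ wake-up overhead

duty = T_ACTIVE / T_CYCLE   # D
ps = P_SLEEP / P_ACTIVE     # PS

e_21 = energy_per_cycle(P_ACTIVE, P_SLEEP, T_ACTIVE, T_CYCLE, E_WAKEUP)
e_22 = energy_per_cycle_normalized(P_ACTIVE, T_CYCLE, duty, ps, E_WAKEUP)

print(f"Eq. (2.1): {e_21:.4e} J")
print(f"Eq. (2.2): {e_22:.4e} J")
```

Under the stated assumption, both forms give the same energy per cycle; the normalized form makes it easier to compare application scenarios that differ only in D and PS.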

2.3 Energy-Aware System Design

To give a precise picture of the design space and boundary conditions for ultra-low-power MCU systems and their energy saving concept, two vital questions need to be examined: (1) What are the required characteristics of ultra-low-power MCU systems and their energy saving concept in order to achieve ultra-low power consumption under all system operating conditions? (2) What are the technology scaling trends for ultra-low-power MCU systems and the resulting requirements and challenges for the energy saving concept in the deep submicron regime?

2.3.1 Energy Saving Requirements and Challenges

For any kind of microprocessor, a fundamental trade-off exists between maximum operating speed and energy consumption (Chandrakasan and Brodersen, 1995; Markovic et al., 2004). Either a certain minimum amount of energy is required to achieve a specified operating speed, or the operating speed is limited to a certain value to meet a specified energy budget. Most designs that claim to provide both low power and high performance at the same time are in fact optimized during design time to reach the Pareto-optimal limit of this trade-off (Markovic et al., 2004). At the lower end of the performance spectrum, various ultra-low-power concept studies have been presented relying on sub-threshold operation (Zhai et al., 2006; Kwong et al., 2008; Sridhara et al., 2011). By operating at the minimum energy point, these systems are uncompromisingly optimized for ultra-low energy consumption, while suffering from a very limited peak processing performance. This limits their usage in a broad range of applications, which require a higher performance operating mode. Sub-threshold operation is for this reason not within the scope of this work - the interested reader is instead referred to Wang et al. (2006). At the other end of the performance spectrum, mobile application processors satisfy the ever increasing performance demands, with the processing frequency reaching 1 GHz and above (Luftner et al., 2006; Gammie et al., 2008). The higher processing performance in combination with an increasing number of features continuously drives the system energy consumption higher, while the battery technology is not keeping pace. For this reason, complex power and performance management techniques are employed to extend the battery lifetime. Examples for this are numerous and range from power gating (Henzler et al., 2005) over the division into multiple voltage domains (Lackey et al., 2002) to dynamic voltage and frequency scaling (Pering et al., 1998; Ernst et al., 2003). Nevertheless, the system energy consumption of these mobile application processors is clearly dominated by power consumption in active mode, even though technology scaling results in drastically increased leakage current.

Ultra-low-power MCU systems, in contrast, need to be both: they must provide moderate processing performance at relatively rare times (active mode), while their operating environment allows much longer periods of relaxed performance requirements with ultra-low power consumption (sleep mode). Meeting this challenge requires operating at different points along the Pareto-optimal curve for each specific processing performance requirement (Calhoun et al., 2010). For this purpose, ultra-low-power MCU systems take advantage of the characteristic periodic nature of typical ultra-low-power applications. In this way, they aim to achieve high performance in active mode, lowest power consumption in sleep mode as well as energy-efficient switching between power modes. The power consumption during the time when the MCU system is doing nothing, just waiting for the next trigger event, is therefore as important as the power consumption due to the actual application processing. Common performance-power metrics like MIPS/mA are thus of little significance in this case. To provide dynamic energy/performance scalability, the power management architecture must provide the flexibility to operate in multiple operating modes. This particularly includes energy-efficient operation at any clock frequency, such that the system operating speed can be determined by application requirements only, without sacrificing the efficiency of operation. At the same time, the power consumption during sleep - determined by the leakage current of the MCU digital core as well as the quiescent current of the power management and clock generation unit - must also be considered carefully. The challenge of power management optimization becomes even more complex as the weighting of the various energy contributors varies significantly depending on the application scenario.

Much research has been devoted either to the power optimization of the CMOS digital circuits (Gammie et al., 2008; Chandrakasan and Brodersen, 1995) or to that of the analog power management circuits (Hazucha et al., 2004). But clearly, in a highly-integrated system-on-chip, one cannot be optimized without considering the other. The requirements and trade-offs instead have to be carefully balanced during system definition in order to minimize the overall system energy consumption. For this reason, a mixed-signal approach is presented in this work to overcome both the system-level and circuit-level challenges and so minimize the energy consumption in both the analog and digital domain. The considerations required for maximizing the battery lifetime are twofold: (1) indicating which components contribute to the system energy consumption (“where” power is consumed); (2) indicating which operating modes contribute to the system energy consumption (“when” power is consumed). This work consequently addresses the interaction between the power management architecture on the one side and the MCU digital core on the other side. A holistic enhancement approach is applied to fulfill the stringent requirements on system energy consumption, always keeping typical ultra-low-power application scenarios in mind.


2.3.2 Technology Scaling for Ultra-Low-Power Microcontrollers

Starting with Intel’s 4004 - the first microprocessor - in 1971 (Faggin et al., 1996), and Texas Instruments’ TMS1000 - the first single-chip microprocessor with integrated program and data memory - in 1974 (Gercekci and Krueger, 1985), MCU design has experienced a dramatic evolution. The major driving force behind this development is a reduction of the overall system costs (Moore, 1998; Rabaey et al., 2006), which enables additional application scenarios and in this way new business potentials. The system costs are not restricted to the actual chip costs, but are in fact strongly determined by the amount of system integration. A highly-integrated mixed-signal MCU system allows the overall system costs to be reduced substantially. By integrating more and more functionality on the same chip, the need for additional, costly circuits is avoided. This particularly includes analog and mixed-signal functionality, as for instance high accuracy data converters and wireless communication interfaces. At the same time, the processing performance demand increases only moderately for ultra-low-power MCU systems. This demand mainly arises from a trend towards more complex data processing algorithms, increased user functionality as well as more comfortable user interfaces. Associated with this, the needs for integrated memory, both volatile and non-volatile, also increase. In addition, every ultra-low-power MCU system has to meet the stringent energy constraints, which form a fundamental boundary condition for the circuit design. This is particularly important since battery technology development cannot keep pace with the development in microelectronics (Lahiri et al., 2002).

Compared to high-performance, purely digital microprocessors, the priorities for technology scaling are different for ultra-low-power MCU systems, both in terms of area and power consumption. For a highly-integrated mixed-signal MCU system, the CMOS technology characteristics are not solely defined by the requirements of the digital circuits, but also by the integrated analog and mixed-signal functionality (Rabaey et al., 2006). Modern deep submicron technologies therefore offer various transistor types optimized for the different purposes. This typically includes at least a thin-oxide transistor with minimum channel length as well as a thick-oxide transistor with higher voltage reliability. However, each transistor type requires additional masks causing additional costs. As is evident from the chip micrograph depicted in Fig. 2.1(b), a significant amount of area in modern ultra-low-power MCU systems is occupied by the I/O ports and analog peripherals. These circuits, however, are predominantly realized by thick-oxide transistors. Their area therefore does not scale at the same rate as that of the digital and memory circuits, which are realized by thin-oxide transistors (Annema et al., 2005). In this context, Rabaey et al. (2006) have introduced the term optimal technology. For a highly-integrated ultra-low-power MCU system with a significant amount of analog circuits, the preferred choice is typically not a leading edge technology, but one some technology nodes behind.

The term optimal technology also gains great significance in the context of system energy consumption. While the dynamic power consumption reduces significantly with technology scaling (Kim et al., 2003; Roy et al., 2003), this comes at the expense of increased leakage current in the deep submicron regime. The first leakage component is sub-threshold leakage - a weak inversion current across the transistor, which increases exponentially due to the ongoing reduction of threshold voltage. The second basic component is gate leakage - a tunneling current through the gate oxide, which increases due to the ongoing reduction of gate oxide thickness. The trend towards increasing leakage current is reinforced since the functionality and consequently both the digital gate count and the memory size continuously increase in ultra-low-power MCU systems. For performance-driven microprocessors, technology scaling continues to be a win-win strategy in the deep submicron regime (see bottom-right corner of Fig. 2.3; Lee et al., 2012). In detail, lower energy consumption and smaller area can be achieved for given performance and functionality requirements. However, this trend does not hold true in the field of ultra-low-power MCU systems. Although these low duty-cycle systems have a lower leakage current compared to performance-driven microprocessors, they spend significantly less time and energy in active mode. Consequently, the total system energy consumption becomes dominated by leakage current during sleep, and the expected benefits of duty-cycling degrade significantly for deep submicron CMOS technologies. For this reason, and as depicted in Fig. 2.3 (see top-left corner), larger process geometries are preferred for typical ultra-low-power MCU systems with low performance requirements and low duty-cycles. To nevertheless enable CMOS technology scaling for these systems, an appropriate solution must be twofold: (1) optimizing the CMOS technology being used for low leakage current, and (2) employing aggressive power saving techniques to minimize power consumption during sleep. Ideally, definition of CMOS technology and development of system energy saving concepts go hand in hand, complementing each other.

[Figure 2.3: map of the preferred technology node (65 nm, 90 nm, 130 nm, 180 nm, 250 nm) over performance (40 FO4; 50 kHz to 500 MHz) and duty-cycle (1 to 0.001), with the corner regions labelled “sparsely active, high performance”, “sparsely active, low performance”, “always active, high performance”, and “always active, low performance”.]

Fig. 2.3. Optimal CMOS technology selection for ultra-low-power MCU systems (adapted from Lee et al., 2012). While small process geometries with low gate capacitance are preferred for performance-driven microprocessors, larger process geometries with low leakage current are preferred for ultra-low-power MCU systems with low performance requirements and low duty-cycles.
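The dependence of the preferred technology node on the duty-cycle can be sketched with a simple two-node comparison. The Python sketch below is an illustration under stated assumptions, not the model behind Fig. 2.3: it neglects the wake-up energy, treats active and sleep power as constants per node, and uses hypothetical power values for a “large” node (higher active power, lower leakage) and a “small” node (lower active power, higher leakage). The function names are likewise hypothetical.

```python
def average_power(duty, p_active, p_sleep):
    """Average power over one cycle: D * P_active + (1 - D) * P_sleep.

    Assumption: wake-up energy is neglected.
    """
    return duty * p_active + (1.0 - duty) * p_sleep


def crossover_duty_cycle(p_active_large, p_sleep_large, p_active_small, p_sleep_small):
    """Duty-cycle at which both nodes have the same average power.

    Below this duty-cycle the larger (low-leakage) node wins,
    above it the smaller (low-active-power) node wins.
    """
    active_penalty = p_active_large - p_active_small  # cost of the larger node when active
    sleep_penalty = p_sleep_small - p_sleep_large     # cost of the smaller node when asleep
    if active_penalty <= 0 or sleep_penalty <= 0:
        raise ValueError("expected a trade-off between active power and leakage")
    return sleep_penalty / (active_penalty + sleep_penalty)


# Hypothetical node parameters in watts (illustrative values only).
LARGE_NODE = {"p_active": 2.0e-3, "p_sleep": 1.0e-6}
SMALL_NODE = {"p_active": 1.0e-3, "p_sleep": 5.0e-6}

d_star = crossover_duty_cycle(
    LARGE_NODE["p_active"], LARGE_NODE["p_sleep"],
    SMALL_NODE["p_active"], SMALL_NODE["p_sleep"],
)
print(f"crossover duty-cycle: {d_star:.4%}")

for duty in (0.001, 0.1):
    p_large = average_power(duty, **LARGE_NODE)
    p_small = average_power(duty, **SMALL_NODE)
    better = "large node" if p_large < p_small else "small node"
    print(f"D = {duty}: large {p_large:.3e} W, small {p_small:.3e} W -> {better}")
```

With these assumed values, the larger node is preferable at a duty-cycle of 0.1 %, while the smaller node is preferable at 10 %, which mirrors the qualitative trend described above.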

From a CMOS technology point-of-view, this dilemma can be counteracted by optimizing the CMOS technology for low standby power (LSTP). For this reason, the ITRS roadmap defines three scaling scenarios, namely for high performance, for low operating power as well as for low standby power (Kim et al., 2003). For the LSTP scaling scenario, the transistor design and its scaling aim to minimize the sub-threshold leakage, with a trade-off of reduced maximum operating speed. For this purpose, the transistor threshold voltage remains approximately constant, which in turn forces the supply voltage scaling to slow down to meet the performance requirements despite the high threshold voltage. In this way, the LSTP scaling scenario is optimized for the needs of most ultra-low-power application scenarios. For further details on CMOS technologies optimized for low standby power, the interested reader is referred to Skotnicki et al. (2005).

From an MCU system architecture point-of-view, an appropriate way around this dilemma is to apply mixed-signal expertise to implement an advanced energy saving concept, particularly to minimize leakage current during sleep. In this way, ultra-low power consumption in sleep mode can ideally be achieved regardless of the underlying CMOS technology, reclaiming the benefits of duty cycling. Since the battery cell voltage does not scale at the same rate as the supply voltage of the MCU digital core, an integrated voltage regulator must be introduced to the system in order to adapt the voltage levels and to help save active switching energy. At the same time, the power management architecture requires a tremendous effort and overhead, both in terms of area and power consumption. As a result, the area benefits of technology scaling are partially offset and a significant power offset is introduced to the system, which needs to be considered carefully.
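The saving in active switching energy can be made explicit with a first-order estimate. The following is a sketch under stated assumptions rather than a result of this work: $C$ is a hypothetical lumped switched capacitance, the regulator is assumed to be linear (so the battery delivers the same charge that the core draws), and the regulator quiescent current is neglected. The direct-supply case is a purely hypothetical reference, since it ignores the voltage limits of the digital core.

```latex
% Energy drawn from the battery per charge/discharge cycle of C
E_{\mathrm{direct}} = C \cdot V_{\mathrm{bat}}^{2},
\qquad
E_{\mathrm{LDO}} = C \cdot V_{\mathrm{core}} \cdot V_{\mathrm{bat}},
\qquad
\frac{E_{\mathrm{LDO}}}{E_{\mathrm{direct}}} = \frac{V_{\mathrm{core}}}{V_{\mathrm{bat}}}

% Example with V_core = 1.52 V and V_bat = 3.0 V
\frac{E_{\mathrm{LDO}}}{E_{\mathrm{direct}}} = \frac{1.52\,\mathrm{V}}{3.0\,\mathrm{V}} \approx 0.51
```

Under these assumptions, the energy drawn from the battery for switching scales linearly with the regulated core voltage, so lowering the core voltage reduces the active switching energy even though the regulator itself dissipates the voltage difference.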

In conclusion, modern ultra-low-power MCU systems fabricated in deep submicron CMOS technologies do not achieve a power consumption during sleep as low as that of classic MCU systems, even despite major technology and architectural efforts. However, they offer substantial cost advantages, thereby enabling new business potentials. Cost considerations are consequently the predominant (or maybe the only...) driving force behind technology scaling in the field of ultra-low-power MCU systems.

2.4 Summary

Ultra-low-power micro-controller units (MCU) form a highly-integrated mixed-signal system-on-chip (SoC). By offering various operating modes, these MCU systems are optimized for the periodic operation of typical ultra-low-power applications. They typically spend the vast majority of time in sleep mode, consuming only a few microwatts, and wake up periodically to perform a certain task. When activated, their power consumption abruptly increases by more than three decades. Due to the very low duty-cycles, the power consumption during sleep can be as important as the energy consumed for actual operation. Assuming a rather typical duty-cycle of 0.1 %, each mode consumes about half of the available energy, and both need to be optimized concurrently to maximize the battery lifetime.
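The duty-cycle figure quoted above can be checked against Eq. (2.2). As a sketch, assume a power saving factor of exactly three decades, $PS = 10^{-3}$ (the text only states “more than three decades”), and neglect $E_{\text{wake-up}}$:

```latex
% Ratio of sleep energy to active energy per cycle, from Eq. (2.2)
\frac{E_{\mathrm{sleep}}}{E_{\mathrm{active}}}
  = \frac{(1 - D) \cdot PS}{D}
  = \frac{(1 - 10^{-3}) \cdot 10^{-3}}{10^{-3}}
  \approx 1
\qquad \text{for } D = 0.1\,\% .

% Break-even duty-cycle at which both modes consume the same energy
D^{*} = \frac{PS}{1 + PS} \approx PS \qquad \text{for } PS \ll 1 .
```

Under these assumptions, the break-even duty-cycle is approximately equal to the power saving factor; for duty-cycles below it, the sleep energy exceeds the active energy.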

The trend towards lower chip cost and higher system integration pushes towards deep submicron CMOS technologies. However, as leakage drastically increases for these technologies, the battery lifetime of typical ultra-low-power applications becomes more and more limited by leakage current during sleep. Its impact is exacerbated for ultra-low-power application scenarios with excessive idle times and accordingly very low average switching activity. As a result, the leakage can clearly dominate the overall system energy consumption. To enable migration, modern ultra-low-power MCU systems fabricated in deep submicron CMOS technologies require aggressive power saving techniques to overcome the process imperfections and minimize the power consumption during sleep. Optimizing the energy consumption of an ultra-low-power MCU system requires addressing all aspects of the design process, including (1) process technology, (2) circuit design, (3) system architecture, and (4) application requirements. While energy optimization for each of these disciplines is usually well understood when considered independently, this work pursues a holistic energy saving concept to solve the power management challenges of ultra-low-power MCU systems fabricated in the deep submicron regime. By dissolving the strict boundaries between the individual disciplines, it leaves much more flexibility for system energy optimization - in this way enabling an additional dimension of energy optimization.

3 System Level Energy Saving Concepts

Ultra-low-power MCU systems in general, and their energy saving concept in particular, must be tailored to the needs of typical ultra-low-power application scenarios characterized by a periodic operating profile. As outlined in the preceding chapter, this particularly includes minimal power consumption in both active mode and sleep mode as well as fast and energy-efficient transition between these modes. The ultimate aim is to minimize the system energy consumption while maintaining the processing performance required for any ultra-low-power application scenario. Though this may appear obvious, it involves complex considerations for the power management architecture. Since the individual contributors to the system energy consumption are closely interrelated and are subject to fundamental trade-offs, a “sweet spot” needs to be identified with regard to typical ultra-low-power application scenarios.

Classic ultra-low-power MCU systems achieve very low power consumption in sleep mode while exhibiting comparatively high power consumption in active mode. These characteristics make them perfectly tailored to the needs of a broad range of ultra-low-power application scenarios. They therefore serve as a reference during the following investigation of energy saving concepts in the deep submicron regime. However, the classic ultra-low-power MCU systems suffer from substantial disadvantages with respect to manufacturing costs and integration capabilities. To enable cost-efficient solutions, the trend clearly goes towards smaller process geometries, even though the scaling trends and priorities for highly-integrated mixed-signal MCU systems turn out to be different compared to mainstream digital processors (see also Chap. 2.3.2). Since typical ultra-low-power applications wait the vast majority of time for the next trigger event, the power consumption during sleep significantly contributes to the overall energy consumption. The drastically increasing leakage current of deep submicron technologies limits the battery lifetime of typical ultra-low-power application scenarios characterized by excessive idle times. Gating the system clock, thereby eliminating all active power consumption contributors, is consequently no longer a sufficient solution. Instead, modern ultra-low-power MCU systems must provide aggressive power saving techniques in sleep mode. However, the power saved during sleep is partially offset by additional energy required for entering and exiting the sleep mode. Moreover, the power management and clock generation also contribute to the system energy consumption and thus should be included in the design considerations as well. This particularly includes the integrated voltage regulator required to supply the digital core in modern ultra-low-power MCU systems.

This chapter explores alternative architectural choices and their trade-offs for power management of ultra-low-power MCU systems fabricated in deep submicron CMOS technologies. An analytical investigation identifies the contributors to the system energy consumption with regard to both system sub-blocks (“where power is consumed”) and system operating modes (“when power is consumed”). Based on this investigation, the requirements and challenges for an energy-efficient power management architecture are defined. Various commonly used power management techniques are explored, evaluating their benefits and limitations. This particularly includes the aspects of voltage regulator selection for best energy/performance scalability, power saving techniques during sleep as well as efficient switching between power modes. In this way, this chapter presents a systematic way to derive an energy-efficient power management architecture suitable for ultra-low-power MCU systems, addressed in the context of typical application scenarios.

3.1 Analytical Investigation of System Energy Consumption

To systematically find the “sweet spot” for system energy consumption, an analytical investigation is required. This investigation is based on a basic subset of an ultra-low-power MCU system as depicted in Fig. 3.1 - comprising a digital core as well as a power management and a clock generation unit. The analog/mixed-signal peripherals can be optimized independently and are therefore excluded from the following considerations. The aim of this analytical investigation is to identify the contributors to the system energy consumption with regard to both system sub-blocks (“where power is consumed”) and system operating modes (“when power is consumed”). In this way, the fundamental power management requirements and trade-offs for typical ultra-low-power application scenarios are identified, quantified and evaluated.

The analytical investigation of system energy consumption is based on a commercially available ultra-low-power MCU system fabricated in a 0.13 µm standard CMOS technology (Texas-Instruments, 2011). For this technology, the supply voltage of the digital core is 1.52 V, while the maximum system clock frequency is 16 MHz. The ultra-low-power MCU system comprises a non-volatile Flash memory as well as a volatile static RAM (SRAM) memory. Both can be used for data storage, but also to execute program code.

Fig. 3.1. Block diagram of a basic ultra-low-power MCU system comprising a digital core as well as a power management and clock generation unit. [Figure labels: Power Management, Voltage Supervisor, Voltage Reference, Bias Generator, Voltage Regulator, Clock Generation, Internal Oscillators, Clock, Digital Core, VCORE, VDD (1.9 V-3.6 V).]


3.1.1 System Sub-Blocks and their Contribution

This section introduces the individual sub-blocks of the basic ultra-low-power MCU system. The digital core is thereby solely considered from a load point-of-view. To guarantee its proper operation, as well as to minimize its power consumption, the power management and the clock generation unit are key system blocks in every ultra-low-power MCU system. While they do not add any application functionality, they are crucial to scale the system performance dynamically according to the application needs. At the same time, they add their overhead to the overall system energy consumption and need to be considered carefully.

MCU Digital Core

The digital core combines all low-voltage digital CMOS circuits of the ultra-low-power MCU system. Much research work has been devoted in the past to analyze and optimize its power consumption (Chandrakasan and Brodersen, 1995), which is, however, not within the focus of this work. Instead, the MCU digital core is in the following solely considered from a load point-of-view. For any digital CMOS circuit, the total power consumption can be broadly classified into a static component, determined by leakage currents, and a dynamic component, determined by switching of the digital logic gates (Liao et al., 2005; Chandrakasan and Brodersen, 1995) - both of which are illustrated in Fig. 3.2 and are briefly recapitulated in the following.

In the deep submicron regime, the two predominant components of static power consumption are (1) sub-threshold leakage, which is a weak inversion current across the transistor, as well as (2) gate leakage, which is a tunneling current through the gate oxide insulation. The static power consumption of the MCU digital core can be most generally expressed as:

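$$P_{STAT} = V_{CORE} \cdot (I_{leak,sub} + I_{leak,gate}) \propto V_{CORE} \cdot N \cdot \left( e^{-q \cdot \frac{V_{th}}{k \cdot T}} + e^{-F \cdot \frac{t_{ox}}{V_{CORE}}} \right) \tag{3.1}$$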

where VCORE is the supply voltage and N the effective number of digital logic gates. The sub-threshold leakage current depends on the transistor threshold voltage Vth, while the gate leakage current depends on the electric field across the gate oxide and thus on the oxide thickness tox.

Fig. 3.2. For any digital CMOS circuit, the total power consumption can be classified into (a) a static component, as well as (b) a dynamic component. While static power consumption is caused by transistor imperfections and affects all digital logic gates, dynamic power consumption is caused by charging (and discharging) the various load capacitances whenever they are switched. [Figure labels: (a) Sub-threshold Leakage, Gate Leakage, VCORE; (b) VIN, VOUT, Ci, C(i+1), VCORE.]


Both the threshold voltage and the oxide thickness are technology parameters, which are optimized for lowest leakage current at the definition of the CMOS technology with a trade-off of reduced maximum operating speed (also see Chap. 2.3.2). The sub-threshold leakage current increases exponentially with temperature, while the gate leakage current is in contrast insensitive to temperature.
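The exponential temperature dependence of the sub-threshold term in Eq. (3.1) can be illustrated with a short Python sketch. It evaluates only the factor exp(-q·Vth/(k·T)) exactly as written in the proportionality; the threshold voltage `V_TH` is a hypothetical value, and real devices add further effects (for instance a sub-threshold slope factor and a temperature-dependent threshold voltage), so the resulting ratio is indicative only.

```python
import math

Q = 1.602176634e-19   # elementary charge in C
K_B = 1.380649e-23    # Boltzmann constant in J/K
V_TH = 0.5            # hypothetical threshold voltage in V (not from the text)

def subthreshold_factor(temp_celsius, v_th=V_TH):
    """Exponential term exp(-q*Vth/(k*T)) of Eq. (3.1), with T in kelvin."""
    temp_kelvin = temp_celsius + 273.15
    return math.exp(-Q * v_th / (K_B * temp_kelvin))

# Ratio of the sub-threshold term between 85 degC and 25 degC.
ratio = subthreshold_factor(85.0) / subthreshold_factor(25.0)
print(f"Sub-threshold term grows by a factor of about {ratio:.0f} from 25 degC to 85 degC")
```

The gate leakage term of Eq. (3.1) contains no temperature, which is why it is left out of this sketch.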

The dynamic power consumption of the MCU digital core is caused by charging (and discharging) the various parasitic capacitances of the digital logic gates whenever they are switched synchronously to the system clock. The dynamic power consumption can therefore be expressed as:

$$P_{DYN} = V_{CORE} \cdot I_{DYN} = \sum_{i=1}^{N} (\alpha_i \cdot C_i) \cdot f_{CLK} \cdot V_{CORE}^{2} \tag{3.2}$$

where αi is the activity factor and Ci is the load capacitance of a single digital logic gate, fCLK is the system clock frequency and VCORE is the core voltage. Evidently, the active power consumption strongly depends on the activity factor of each digital logic gate, and consequently on the exact code being executed as well as the data being processed.
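As a minimal sketch of Eq. (3.2), the following Python function sums the switched capacitance over all gates and scales it with clock frequency and the square of the core voltage. The gate population (activity values and capacitances) is hypothetical and only serves to show the linear dependence on fCLK; the core voltage of 1.52 V and the clock frequencies are taken from the text.

```python
def dynamic_power(gates, f_clk_hz, v_core):
    """Eq. (3.2): P_DYN = sum(alpha_i * C_i) * f_CLK * V_CORE^2."""
    switched_capacitance = sum(alpha * c for alpha, c in gates)
    return switched_capacitance * f_clk_hz * v_core ** 2

# Hypothetical gate population: (activity, load capacitance in farads).
gates = [(0.10, 5e-15)] * 10_000 + [(0.50, 5e-15)] * 2_000

p_16mhz = dynamic_power(gates, 16e6, 1.52)
p_8mhz = dynamic_power(gates, 8e6, 1.52)
print(f"P_DYN at 16 MHz: {p_16mhz * 1e6:.1f} uW")
print(f"P_DYN at  8 MHz: {p_8mhz * 1e6:.1f} uW")
```

Halving the clock frequency halves the dynamic power, but the energy per executed clock cycle stays the same, which matters for the energy considerations in Chap. 3.1.2.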

Power Management Unit

The power management unit provides a lower sub-regulated supply voltage to the MCU digital core and guarantees its reliable operation by supervising both the external and internal supply voltage. The sub-regulated supply voltage is generated by an integrated voltage regulator, which in case of the basic ultra-low-power MCU system is implemented as linear low-dropout voltage regulator (LDO). The power management unit comprises further circuit blocks for generation of the reference voltage and bias currents. In total, the quiescent current demand is 57 µA in active mode, which adds a constant overhead to the system power consumption. Since the power demand of the MCU digital core is highly relaxed in sleep mode, the power management unit can be set into a power-saving mode with a reduced quiescent current of 2.1 µA. This particularly includes the usage of a dedicated voltage regulator optimized for ultra-low quiescent current (Kristjansson, 2006; Baumann et al., 2013).
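To put the two quiescent currents into perspective, the following Python sketch converts them into power drawn from the supply. It assumes that the quiescent current is drawn from VDD and uses VDD = 3.0 V, the value at which the later figures of this chapter are evaluated; both are assumptions of this sketch rather than statements from the paragraph above.

```python
I_Q_ACTIVE = 57e-6   # quiescent current in active mode in A (from the text)
I_Q_SLEEP = 2.1e-6   # quiescent current in power-saving mode in A (from the text)
V_DD = 3.0           # assumed supply voltage in V

def overhead_power(i_q, v_dd):
    """Power overhead of the power management unit, assuming i_q flows from v_dd."""
    return i_q * v_dd

p_active = overhead_power(I_Q_ACTIVE, V_DD)
p_sleep = overhead_power(I_Q_SLEEP, V_DD)
print(f"Active-mode overhead: {p_active * 1e6:.1f} uW")
print(f"Sleep-mode overhead:  {p_sleep * 1e6:.1f} uW")
print(f"Reduction factor:     {p_active / p_sleep:.1f}")
```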

Clock Generation Unit

The clock generation unit provides the system clock to the MCU digital core with several clock sources the application can choose from. Among others, this includes an on-chip digitally controlled oscillator, which can be completely disabled during sleep. However, the clock generation unit is essentially independent of the power management architecture and the related trade-offs. It is therefore excluded from the following considerations for system energy optimization.

3.1.2 System Operating Modes and their Contribution

This section illustrates the power consumption of the basic ultra-low-power MCU system during active mode and sleep mode as well as the energy required for transition between the operating modes. In order to determine the significance of the power consumption in the individual operating modes for typical ultra-low-power application scenarios, their contribution is identified for a periodic sensor data analysis as a function of the cycle time.

Fig. 3.3. Basic ultra-low-power MCU system: (a) predicted system energy consumption in active mode as a function of clock frequency, and (b) predicted system power consumption in sleep mode as a function of temperature at VDD = 3.0 V. [Axes: (a) system energy [nWs] versus clock frequency [MHz]; (b) system power [µW] versus temperature [°C]. Annotated contributions: 47 %, 54 %, 83 %.]

Fig. 3.3(a) shows the system energy consumption in active mode as a function of the clock frequency when performing a sensor data analysis. This analysis requires 152 clock cycles for calculating the average of four measurement values and comparing the result against a fixed threshold. At high clock frequencies, the power consumption is dominated by the dynamic power consumption of the MCU digital core due to charging (and discharging) of the various parasitic capacitances. The overhead caused by leakage current as well as power management quiescent current can be neglected. This however changes when operating at low clock frequencies - the leakage current and the power management overhead contribute 47 % when operating at 1 MHz. It seems obvious that the system energy consumption is reduced at higher clock frequency, when the code is executed faster and the ultra-low-power MCU system can return to sleep mode earlier. In many ultra-low-power applications, however, the speed of processing is gated by external events (e.g. analog settling times). To avoid wasting energy by running idle, the best the system can do is to adapt to the speed of such events.
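The frequency dependence described here follows from a simple split of the task energy: the dynamic energy of a fixed number of clock cycles does not depend on the clock frequency, whereas any constant power (leakage and power management overhead) is integrated over an execution time that shrinks with rising frequency. The Python sketch below illustrates this split. The 152 clock cycles and the core voltage are taken from the text; `C_EFF` and `P_CONST` are hypothetical values chosen only to give plausible orders of magnitude, not the parameters behind Fig. 3.3(a).

```python
N_CYCLES = 152        # clock cycles of the sensor data analysis (from the text)
V_CORE = 1.52         # core voltage in V (from the text)
C_EFF = 100e-12       # hypothetical switched capacitance per clock cycle in F
P_CONST = 200e-6      # hypothetical leakage plus power management overhead in W

def task_energy(f_clk_hz):
    """Return (dynamic energy, overhead energy) in joules for one task run."""
    t_active = N_CYCLES / f_clk_hz
    e_dynamic = N_CYCLES * C_EFF * V_CORE ** 2   # independent of f_clk
    e_overhead = P_CONST * t_active              # shrinks as f_clk rises
    return e_dynamic, e_overhead

for f_mhz in (0.5, 1.0, 2.0, 4.0, 8.0, 16.0):
    e_dyn, e_ovh = task_energy(f_mhz * 1e6)
    share = e_ovh / (e_dyn + e_ovh)
    print(f"{f_mhz:5.1f} MHz: total = {(e_dyn + e_ovh) * 1e9:6.1f} nWs, "
          f"overhead share = {share:5.1%}")
```

Under these assumptions the total energy falls monotonically with clock frequency and approaches the dynamic energy alone, which is the behavior the paragraph above describes qualitatively.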

To minimize the power consumption during sleep, the operation of the MCU digital core is stopped by gating the system clock, the clock generation unit is disabled and the power management unit is set into a power-saving mode. Nevertheless, the MCU digital core remains powered and its leakage current continues to contribute to the system power consumption during sleep. Since the leakage current increases dramatically in deep submicron CMOS technologies, the system power savings become more and more limited, particularly at high temperature. As depicted in Fig. 3.3(b), the leakage current contributes 54 % of the system power consumption at room temperature, while the percentage significantly increases to 83 % at high temperature. This in turn significantly limits the battery lifetime of typical ultra-low-power application scenarios characterized by excessive idle times. To wake up from sleep mode, the power management and clock generation unit need to be reactivated, requiring additional time and energy. They are both optimized for a fast wake-up, resulting in an overall wake-up time of 10 µs with an energy overhead of 43 nWs.

Fig. 3.4. (a) Predicted system power consumption for periodically performing a sensor data analysis (taking 152 clock cycles at a clock frequency of 8 MHz) versus cycle time at VDD = 3.0 V, and (b) the corresponding contribution of the system power consumption in active mode and sleep mode. [Axes: (a) system power [µW] versus cycle time [ms] at 25°C and 85°C; (b) power contribution [%] of active mode and sleep mode versus cycle time [ms].]

Complementing the above considerations on system power consumption in active and sleep mode, Fig. 3.4(a) shows the power consumption of the basic ultra-low-power MCU system for a periodic sensor data analysis as a function of cycle time both at room temperature and high temperature. Clearly, the energy required for wake-up is very low and is thus neglected here. Fig. 3.4(b) shows the corresponding power distribution between active mode and sleep mode. For very short cycle times (< 0.1 ms), the power consumption is dominated by the active mode. In contrast, it becomes dominated by power consumption during sleep for excessive cycle times (> 10 s). For these applications, lowest power consumption during sleep (including both leakage current and power management overhead) is essential. The break-even point, at which both active and sleep mode contribute the same amount to the system power consumption, is at 5.0 ms for room temperature. Since the leakage current increases exponentially with temperature, the break-even point shifts to shorter cycle times with increasing temperature and is at 0.8 ms for high temperature.
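The break-even point can be reasoned about with a simple duty-cycle model: within one cycle, the system spends a fixed energy in active mode and a constant sleep power for the remaining time, and break-even is reached when both contributions are equal. The Python sketch below implements this model with wake-up energy neglected, as in the text. The active time follows from the 152 clock cycles at 8 MHz; the active energy and the two sleep power values are hypothetical placeholders, not values read from Fig. 3.4.

```python
def break_even_cycle_time(e_active, t_active, p_sleep):
    """Cycle time at which active and sleep mode contribute equally."""
    return t_active + e_active / p_sleep

def active_share(cycle_time, e_active, t_active, p_sleep):
    """Fraction of the average system power that is spent in active mode."""
    e_sleep = p_sleep * (cycle_time - t_active)
    return e_active / (e_active + e_sleep)

T_ACTIVE = 152 / 8e6     # 152 cycles at 8 MHz (from the text), i.e. 19 us
E_ACTIVE = 50e-9         # hypothetical energy per analysis in J
P_SLEEP_ROOM = 10e-6     # hypothetical sleep power at room temperature in W
P_SLEEP_HOT = 60e-6      # hypothetical sleep power at high temperature in W

for label, p_sleep in (("room temperature", P_SLEEP_ROOM),
                       ("high temperature", P_SLEEP_HOT)):
    t_be = break_even_cycle_time(E_ACTIVE, T_ACTIVE, p_sleep)
    print(f"{label}: break-even at {t_be * 1e3:.2f} ms")
```

Because the break-even time scales with the inverse of the sleep power, any increase in leakage moves it towards shorter cycle times, in line with the shift reported above.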

Ultra-low-power MCU systems fabricated in deep submicron CMOS technologies benefit from greatly reduced power consumption in active mode. However, since the leakage current is drastically increased, the battery lifetime of typical ultra-low-power application scenarios with excessive idle times becomes more and more limited. Consequently, gating the system clock and thereby eliminating all active power contributors is not a sufficient solution anymore. Instead, aggressive power saving techniques are required to cope with the leakage challenge of deep submicron CMOS technologies. While most power saving techniques focus on the MCU digital core only, the power management unit also contributes to the power consumption during sleep and thus must be included in the considerations for system energy optimization. This particularly includes the integrated voltage regulator supplying the MCU digital core. At the same time, every power saving technique introduces an overhead in terms of area and design complexity, as well as an overhead in terms of energy and time required to recover back into active mode. Consequently, there exists a penalty for the implementation of a power saving technique, as well as a penalty for activating it.

Throughout the following sections, some well-known power management techniques widely applied to mobile application processors are presented and evaluated for their efficacy in ultra-low-power MCU systems. Since ultra-low-power MCU systems target a different class of applications compared to mobile application processors, the “sweet spot” for system energy optimization is however different.


3.2 Voltage Regulator Topology for MCU Digital Core

To provide a lower sub-regulated supply voltage to the MCU digital core, ultra-low-power MCU systems fabricated in deep submicron CMOS technologies require an integrated voltage regulator (also see Fig. 3.1). The voltage regulator must be designed to provide the maximum current consumption of the MCU digital core at peak performance levels. The voltage conversion ratio defined by the external supply voltage VDD as well as the core voltage VCORE is commonly not a free design parameter. These voltages are determined by the battery technology and the CMOS technology, respectively. With these boundary conditions being defined, the voltage conversion efficiency is not a free design parameter either. At the same time, the current consumption of the MCU digital core greatly varies and spans a range of more than three decades, which in turn demands a high conversion efficiency across a wide load range. With regard to this, any voltage regulator adds a constant overhead to the power consumption of the ultra-low-power MCU system, which needs to be considered carefully.

The following section provides a systematic analysis of the power management system related trade-offs for supplying the low-voltage MCU digital core, particularly focusing on the selection of the voltage regulator topology. The integrated voltage regulator can be implemented either as linear low-dropout voltage regulator (LDO), or as inductive switch-mode voltage regulator (see Fig. 3.5). Both voltage regulator schemes are introduced in the following, with a particular focus on the conversion efficiency under different system operating conditions as well as the cost of integration. Switched-capacitor voltage regulators are intentionally excluded from the following discussion due to their limited flexibility (El-Damak et al., 2013; van Breussegem and Steyaert, 2013). Their conversion efficiency can be optimized only for a fixed voltage conversion ratio, requiring a tremendous implementation effort in case of more finely graduated voltage conversion ratios.

Linear Low-Dropout Voltage Regulator

Fig. 3.5(a) shows a schematic diagram of a linear low-dropout voltage regulator (LDO), in which a pass-transistor is connected in series with the MCU digital core. By applying a negative feedback loop, this pass-transistor is modulated in a continuous way in order to cause a voltage drop and generate a constant output voltage (Widlar, 1971; Rincon-Mora, 2009). The power efficiency of an LDO is determined by the quiescent current, the load current as well as the pass-transistor voltage drop. The LDO quiescent current Iq is defined as the total ground current of the LDO circuit. Moreover, the ratio of the output (load) power to the input (supply) power yields the power efficiency ηP.

$$\eta_{P,LDO} = \frac{V_{CORE} \cdot I_{LOAD}}{V_{DD} \cdot (I_{LOAD} + I_q)} \tag{3.3}$$

Two load conditions can be distinguished. While the power efficiency is limited by the LDO quiescent current at low load conditions (ILOAD ≪ Iq), the power efficiency at high load conditions (ILOAD ≫ Iq) is determined by the pass-transistor voltage drop and approaches its maximum.

$$\eta_{P,LDO}(I_{LOAD} = \max) \approx \frac{V_{CORE}}{V_{DD}} \tag{3.4}$$

Fig. 3.5. The supply voltage of the MCU digital core is provided by an integrated voltage regulator, which can be implemented either as (a) linear low-dropout voltage regulator (LDO), or as (b) inductive switch-mode voltage regulator. The bottom figures show the respective conversion efficiency of state-of-the-art stand-alone voltage regulators (Rincon-Mora and Allen, 1998a; Texas-Instruments, 2012). [Axes: power efficiency [%] versus load current [mA] for VCORE = 1.9 V at VDD = 2.4 V, 3.0 V and 3.6 V.]

The LDO power efficiency at high load conditions is however not a free design parameter. While the supply voltage is determined by the battery voltage over lifetime, the output voltage is determined by the needs of the MCU digital core. For this reason, the current efficiency ηC is often a more convenient metric for comparison of the LDO efficiency.

$$\eta_{C,LDO} = \frac{I_{LOAD}}{I_{LOAD} + I_q} \tag{3.5}$$

In conclusion, and as illustrated in Fig. 3.5(a), the LDO power efficiency strongly depends on the voltage conversion ratio. While it offers a very high power efficiency at high voltage conversion ratios close to one (almost 80 % at VDD = 2.4 V and VCORE = 1.9 V), the power efficiency significantly drops at small voltage conversion ratios. Due to its low quiescent current, an LDO is however able to maintain a high power efficiency also at very low load conditions.
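The LDO efficiency expressions of Eqs. (3.3) to (3.5) translate directly into code. The following Python sketch evaluates them over several decades of load current for the voltage pair quoted above (VDD = 2.4 V, VCORE = 1.9 V). The quiescent current `I_Q` of 1 µA is a hypothetical value used for illustration and is not taken from the cited regulators.

```python
def ldo_power_efficiency(v_core, v_dd, i_load, i_q):
    """Eq. (3.3): output power divided by input power of an LDO."""
    return (v_core * i_load) / (v_dd * (i_load + i_q))

def ldo_current_efficiency(i_load, i_q):
    """Eq. (3.5): load current divided by total input current."""
    return i_load / (i_load + i_q)

V_DD = 2.4      # supply voltage in V (from the text)
V_CORE = 1.9    # output voltage in V (from the text)
I_Q = 1e-6      # hypothetical LDO quiescent current in A

for i_load in (1e-7, 1e-6, 1e-5, 1e-4, 1e-3, 1e-2):
    eta_p = ldo_power_efficiency(V_CORE, V_DD, i_load, I_Q)
    eta_c = ldo_current_efficiency(i_load, I_Q)
    print(f"I_LOAD = {i_load:8.1e} A: eta_P = {eta_p:6.1%}, eta_C = {eta_c:6.1%}")
```

For load currents far above the quiescent current, the power efficiency approaches VCORE/VDD as stated in Eq. (3.4), while the current efficiency approaches one regardless of the voltage conversion ratio.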

Due to the continuous control scheme, the LDO gain-bandwidth is in principle only limited by the load capacitance. The control scheme thereby does not necessarily rely on a large (external) load capacitance. Instead, the load capacitance can be considered as a degree of freedom for system energy consumption, as will be revealed in Chap. 3.4. This particularly also includes the complete integration of the load capacitance, omitting the need for any external components. In this way, an LDO can provide a very cost efficient solution with regard to silicon area as well as number and cost of external components.

Inductive Switch-Mode Voltage Regulator

Fig. 3.5(b) shows a schematic diagram of an inductive switch-mode voltage regulator (also widely referred to as buck converter). It generates an output voltage lower than the supply voltage by transferring energy from the input to the output with the help of the power switch, the inductor as well as the diode (Arbetter and Maksimovic, 1998). In contrast to an LDO, the energy is transferred in a quasi-lossless scheme by alternately energizing the inductor from the supply and de-energizing it into the output. To control the output voltage, the circuit generates an analog error signal by applying negative feedback, and converts it into a pulse-width-modulated (PWM) digital signal to control the power switch. By adapting the switching duty-cycle, the output voltage is controlled with respect to the supply voltage.

The power efficiency of an inductive switch-mode voltage regulator is determined by three major loss mechanisms, namely conduction loss, switching loss and quiescent current loss. It can be accordingly expressed as:

$$\eta_{P,BUCK} = \frac{V_{CORE} \cdot I_{LOAD}}{V_{DD} \cdot I_{LOAD} + P_{LOSS}} \tag{3.6}$$

At high load conditions, the power efficiency is primarily limited by conduction loss, caused by resistive voltage drops of each element within the energy-transfer path. In order to achieve a high power efficiency in the 90 % range, the rectifier diode is typically replaced by a synchronous switch to minimize the resistive voltage drops. With decreasing load current, the switching loss component becomes dominant. It is caused by hard switching of the power switches with finite turn-on and turn-off times, as well as by periodic charging of their gate capacitances. The switching loss is proportional to the switching frequency. In addition, the control circuit requires quiescent current, which results in a constant power overhead and becomes dominant at low load conditions. To keep the conversion efficiency nevertheless high at low load conditions, an inductive switch-mode voltage regulator might enter a discontinuous operation mode with pulse-frequency modulation (PFM). In this mode, the inductor current is discontinuous and only single pulses of energy are provided to the output. The output is thereby controlled by varying the frequency of pulses. Compared to the continuous conduction mode, the switching loss (both switching transition loss and gate charge) can be minimized in this way. Fig. 3.5(b) shows the power efficiency of a state-of-the-art stand-alone inductive switch-mode voltage regulator optimized for ultra-low-power applications (Texas-Instruments, 2012). It generates a fixed output voltage of VCORE = 1.9 V from a supply voltage range from VDD = 2.0 V to 3.9 V with a maximum output current of 100 mA. Particularly noteworthy, and in contrast to an LDO, the power efficiency is almost independent of the supply voltage. While it has particular advantages over an LDO at high load conditions and large differences between supply voltage and output voltage, an inductive switch-mode voltage regulator typically shows disadvantages at very low load conditions due to the large regulation overhead of the time-discontinuous control scheme.

In contrast to an LDO, the potential for system integration is limited in case of an inductive switch-mode voltage regulator. Due to the switching operation with a discontinuous energy transfer, a large (external) load capacitance is an intrinsic part of every switch-mode voltage regulator. This becomes particularly evident for voltage regulators designed for low power levels, such as required for ultra-low-power MCU systems. Typically, at least two external components are required, namely the inductor as well as the capacitor (e.g. 2.2 µH and 2.2 µF, respectively, in case of the aforementioned state-of-the-art voltage regulator). While increasing the switching frequency enables smaller passive components, the switching losses are directly proportional to the switching frequency and, as a result, the power efficiency is sacrificed. Although some implementations have been presented with an integrated inductor and/or capacitor (Mathuna et al., 2012; Bunel, 2010), the quality of the integrated components cannot yet compete with that of external components.


Conclusion on Voltage Regulator Topology

In order to refer the power consumption of the MCU digital core to the battery supply voltage, the power efficiency of the integrated voltage regulator must be taken into account. In case of an LDO, the voltage drop over the pass-transistor - equal to the difference between the supply voltage VDD and the core voltage VCORE - directly translates into power loss. However, the overall power consumption of the MCU digital core is still reduced linearly with the voltage reduction compared to a direct connection to the battery supply voltage. Assuming the static (leakage) power can be neglected compared to the dynamic power, the power consumption in active mode, referred to the battery supply voltage, can be expressed as:

$$P_{active} = (P_{DYN} + P_{STAT}) \cdot \frac{1}{\eta_{P,LDO}} \cong \sum_{i=1}^{k} (\alpha_i \cdot C_i) \cdot f_{CLK} \cdot V_{CORE} \cdot V_{DD} \tag{3.7}$$

Due to the quasi-lossless energy-transfer scheme, an inductive switch-mode voltage regulator in contrast offers potentially a quadratic power saving. With ongoing technology scaling, the supply voltage of the MCU digital core optimized for low standby power is predicted to fall in the range of 1.2 V, while the battery voltage is expected to remain constant. Due to the increasing voltage drop over the LDO pass-transistor, its power efficiency becomes ultimately limited in active mode. Against this background, an inductive switch-mode voltage regulator might become more and more attractive with further CMOS technology scaling.
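The difference between linear and quadratic saving can be made concrete by normalizing the battery power to a hypothetical reference in which the same logic, with the same switched capacitance and clock frequency, runs directly at VDD. With an LDO, the battery power scales with VCORE · VDD as in Eq. (3.7), so the ratio is VCORE/VDD. With an idealized lossless switch-mode regulator, the battery power equals the core power and scales with VCORE squared. The Python sketch below computes both ratios; the lossless converter and the neglected leakage and quiescent current are simplifying assumptions, and the supply voltage of 3.0 V is an assumed battery voltage.

```python
def battery_power_ratio_ldo(v_core, v_dd):
    """Battery power with an LDO, relative to running the logic at v_dd."""
    return (v_core * v_dd) / v_dd ** 2           # reduces to v_core / v_dd

def battery_power_ratio_buck(v_core, v_dd, efficiency=1.0):
    """Battery power with a switch-mode regulator of given power efficiency."""
    return (v_core ** 2 / efficiency) / v_dd ** 2

V_DD = 3.0   # assumed battery voltage in V
for v_core in (1.52, 1.2):   # core voltages mentioned in the text
    r_ldo = battery_power_ratio_ldo(v_core, V_DD)
    r_buck = battery_power_ratio_buck(v_core, V_DD)
    r_buck_90 = battery_power_ratio_buck(v_core, V_DD, efficiency=0.9)
    print(f"VCORE = {v_core:.2f} V: LDO {r_ldo:.2f}, "
          f"ideal buck {r_buck:.2f}, buck at 90 % {r_buck_90:.2f}")
```

The gap between the two ratios widens as VCORE falls while VDD stays constant, which is the scaling argument made in the paragraph above.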

The integrated voltage regulator introduces a constant overhead to the system power consumption, which particularly cannot be neglected at low digital activity and clock frequency. Strongly depending on the exact design and implementation, the power overhead is higher for an inductive switch-mode voltage regulator due to its time-discontinuous control scheme. In conclusion, an inductive switch-mode voltage regulator offers a higher power efficiency during active mode, while an LDO provides lower power overhead as well as enables a higher system integration at lower costs. The voltage regulator topology best suited for the presented ultra-low-power MCU system is selected based on a holistic concept for power management architecture, which particularly must also include the power saving techniques during sleep.

3.3 Power Saving Techniques during Sleep

To achieve lowest system power consumption during sleep, ultra-low-power MCU systems fabricated in deep submicron CMOS technologies require aggressive power saving techniques. These need to be applied to all parts of the system architecture: besides the leakage current of the MCU digital core, the quiescent current of the power management unit also significantly contributes to the system power consumption during sleep. Tremendous research work has been done in the past decade to cope with the leakage challenge of deep submicron CMOS technologies. For a detailed investigation of the various leakage mitigation techniques, the interested reader is at this point referred to Henzler (2006) as well as Roy et al. (2003). Among many other proposed techniques, the most effective approach to minimize the leakage current is to switch off the MCU digital core, either completely or partially. For this purpose, the conventional approach widely used today for mobile application processors is the insertion of large power-gating switches into the supply rail (Gammie et al., 2010; Henzler, 2006). An alternative approach to switch off the MCU digital core is to utilize the integrated voltage regulator. This approach does not only address the leakage challenge, but also the power management overhead. These two alternative techniques for power reduction during sleep are introduced and compared in the following.

Fig. 3.6. To reduce the power consumption during sleep, the MCU digital core can be switched off, either (a) by inserting large power-gating switches into the core supply rail, or (b) by utilizing the integrated voltage regulator. [Figure labels: (a) Power Gating: Module A (CPU), Module B (Peripheral) and Module C (RTC) on VCORE with switches MSW_A, MSW_B, MSW_C and control signals VCTRL_A, VCTRL_B, VCTRL_C; (b) Voltage Regulator Disable: Module A (CPU) and Module B (Peripheral) on VCORE, Module C (RTC) on VRTC; each panel shown for active mode and sleep mode with Voltage Regulator, Voltage Reference, CLOAD and VDD (1.9 V-3.6 V).]

Power Gating

A straightforward solution to reduce the leakage current is power gating: as evident from the block diagram of the basic ultra-low-power MCU system depicted in Fig. 3.6(a), the MCU digital core is divided into multiple voltage domains, which can be individually switched off. For this purpose, each voltage domain is separated from the core supply rail, either from the positive rail by a PMOS header device, or alternatively from the negative rail by an NMOS footer device. At the same time, the power management unit (including the integrated voltage regulator) must remain active during sleep to supply any non-power-gated voltage domain. This for instance might be a real-time clock (RTC) providing a time-base to periodically trigger a wake-up event, or the volatile memory (SRAM) retaining the application data during sleep. While conceptually simple, the physical implementation of power gating requires a systematic approach (Henzler, 2006, p. 72ff.). First and foremost, this includes the sizing and control of the power-gating switches, the granularity of the voltage domains as well as the reactivation scheme. Here, the potential benefits of power saving during sleep need to be carefully traded against the resulting area and complexity overhead.

The sizing of the power-gating switches must be chosen as trade-off between leakage reduction and performance/speed degradation (Henzler et al., 2005). This trade-off is caused by a limited ION/IOFF-ratio of the power-gating switch, determined by the CMOS technology and the transistor type being used. A smaller power-gating switch has a higher off-resistance, and thus offers a good leakage reduction in sleep mode. A larger power-gating switch in contrast has a lower on-resistance, and thus minimizes the voltage drop and consequently the performance/speed penalty in active mode. Since the performance requirements for ultra-low-power MCU systems are moderate, the trade-off is typically decided in favor of a high leakage reduction. To improve the ION/IOFF-ratio of the power-gating switch, various circuit techniques have been proposed - which include an adaptive body biasing of the power-gating switch as well as a negative gate-source voltage to drive the power-gating switch into super cut-off (Kawaguchi et al., 2000).
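The sizing trade-off can be sketched with a first-order model in which both the on-conductance and the off-state leakage of the switch scale linearly with its width, so that the ION/IOFF-ratio stays fixed. The Python example below uses this model; all device parameters (`R_ON_PER_UM`, `I_OFF_PER_UM`) and the active current are hypothetical values, not data from the cited works.

```python
# Hypothetical per-unit-width device parameters (not from the source).
R_ON_PER_UM = 2_000.0     # ohm * um: on-resistance of a 1 um wide switch
I_OFF_PER_UM = 10e-12     # A / um: off-state leakage of a 1 um wide switch
I_ACTIVE = 2e-3           # A: hypothetical active current of the gated domain

def switch_tradeoff(width_um):
    """Return (active-mode voltage drop in V, sleep-mode leakage in A)."""
    r_on = R_ON_PER_UM / width_um
    return I_ACTIVE * r_on, I_OFF_PER_UM * width_um

for width in (100.0, 400.0, 1600.0):
    v_drop, i_leak = switch_tradeoff(width)
    print(f"W = {width:6.0f} um: drop = {v_drop * 1e3:5.1f} mV, "
          f"leakage = {i_leak * 1e9:5.1f} nA")
```

In this model the product of voltage drop and leakage is independent of the width, which is one way to see why sizing alone cannot improve both quantities and why techniques that raise the ION/IOFF-ratio itself are of interest.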

The benefit of leakage reduction comes at the cost of an increased area and complexity overhead. Each power-gating switch requires a large area, which is typically in the range of 10 % to 20 % of the overall digital circuit area (Chinnery and Keutzer, 2005). Additional area overhead is caused by the level-shifters and the isolation cells required at the voltage domain boundaries, which is in the range of another 5 % to 10 %. Besides the area overhead, the design complexity increases with an increasing number of voltage domains and boundary crossings, and needs to be considered carefully. Against this background, the granularity of the voltage domains has to be decided as trade-off between increased area overhead on the one side and high system flexibility on the other side. In this context, it can be distinguished between coarse-grained as well as fine-grained power gating, which differ in terms of the basic aim of power gating in the system power diagram.

When reactivating the power-gating switches, the integrated voltage regulator is suddenly loaded by the charging current. To limit the resulting drop of the core voltage, various reactivation schemes have been proposed (Henzler et al., 2006a; Wu et al., 2011). All of these schemes aim to limit the charging current, either by turning on the power-gating switch slowly, or by enabling multiple small power-gating switches sequentially. While the voltage drop is effectively limited in this way, the time required for wake-up necessarily increases. However, since the overall wake-up time is dominated by the reactivation of the power management unit, the additional delay can typically be accepted for ultra-low-power MCU systems.

Voltage Regulator Disable

By considering the typical ultra-low-power application scenarios, two major system states can be identified, which can be leveraged for the definition of the power management architecture. In active mode, the ultra-low-power MCU system provides high processing performance, while leakage current is only of limited concern. In sleep mode, in contrast, the system power consumption needs to be minimized, while commonly only a real-time clock remains active. The fine granularity potentially offered by power gating is therefore only of limited benefit for the majority of typical ultra-low-power application scenarios. Instead, corresponding to the needs of these application scenarios, the MCU digital core is divided into (at least) two separate voltage domains, as depicted in Fig. 3.6(b). The high-performance domain (VCORE) comprises the CPU, the hardware accelerator and other digital peripheral modules, while the low-power domain (VRTC) comprises a real-time clock and may provide data retention. Each of these voltage domains is supplied by a dedicated integrated voltage regulator. The voltage regulator for the high-performance domain is implemented either as linear low-dropout voltage regulator (LDO) or as inductive switch-mode voltage regulator, as discussed in Chap. 3.2. It is designed to provide the maximum current consumption of the MCU digital core at peak performance levels. By disabling this voltage regulator, the high-performance domain is completely switched off during sleep. To guarantee a defined system state at wake-up, the regulator output is thereby actively discharged. As many ultra-low-power applications need a time-base to periodically trigger a wake-up event, the low-power domain in contrast (may) remain active in sleep mode. For this purpose, the voltage regulator for the low-power domain is optimized for ultra-low quiescent current with a very limited current drive capability.

In case the integrated voltage regulator is implemented as LDO, its pass-transistor is first and foremost designed for the needs of voltage regulation. As a high voltage drop is a desired feature of every LDO design, the pass-transistor can be sized significantly smaller compared to a power-gating switch and the trade-off between leakage reduction and performance/speed degradation is effectively avoided. In this way, the leakage reduction during sleep is highly improved compared to power gating. However, this does not apply in case the integrated voltage regulator is implemented as inductive switch-mode voltage regulator. Here, the power switch is typically sized larger in order to minimize the conduction loss and support the quasi-lossless energy-transfer scheme. At the same time, disabling the voltage regulator during sleep does not only reduce the leakage current of the MCU digital core, but also the power management overhead. The major portion of the power management unit - particularly including the integrated voltage regulator for the high-performance domain - is disabled. Only the voltage regulator for the low-power domain and its reference (may) remain active in sleep mode.

Compared to power gating, this approach does not only minimize the power consumption during sleep, but also simplifies the system partitioning. The number of supply boundary crossing signals requiring level-shifters and isolation cells is drastically reduced. The drawback of this reduced area overhead and design complexity is a limited flexibility of operation. As soon as a single sub-module is required to remain active during sleep, the complete high-performance domain cannot be disabled. While a finer granularity is in principle possible, each voltage domain requires a dedicated integrated voltage regulator, adding a significant overhead both in terms of area and power consumption during active mode. For this reason, the number of voltage domains is limited in practice.

Conclusion on Power Saving Techniques

The most efficient strategy for leakage avoidance is to switch off the MCU digital core, either completely or partially (Gammie et al., 2010). Since the leakage current increases exponentially with temperature, this becomes particularly important at high temperature. By applying power gating, the leakage current of the MCU digital core is typically reduced by a factor of 20 to 40 in a 0.13 µm standard CMOS technology (Kim et al., 2007). This leakage reduction ratio is however further limited by level-shifters, isolation cells as well as data retention flip-flops, which remain active during sleep. In addition, the power management unit must also remain active and thus continues to contribute to the system power consumption, even though it is set into a power-saving mode. Considering the basic ultra-low-power MCU system, and assuming a practical leakage reduction ratio of four, the power consumption during sleep is predicted to be 9.1 µW at 25 °C and 34.8 µW at 85 °C, which is dominated by the power management overhead. At the same time, power gating introduces a significant overhead, both in terms of area and design complexity.

When switching off the MCU digital core by disabling the integrated voltage regulator, in contrast, the leakage reduction ratio is further improved. The major portion of the power management unit (particularly including the integrated voltage regulator) is also disabled during sleep, and only a continuous supply voltage supervisor remains active. While the power consumption is truly minimized, the energy and time required for wake-up necessarily increase, as will be illustrated in the following section. In this way, this system configuration is tailored to the needs of typical ultra-low-power application scenarios with the two major system states (active mode and sleep mode). Compared to power gating, however, it suffers from a limited flexibility for system operation.

3.4 Efficient Switching between Power Modes

With the benefit of lower power consumption during sleep, another aspect of system energy optimization arises, which is neglected by the majority of publications as well as many commercially available MCU systems. Each wake-up from sleep mode takes time and needs additional energy, which partially offsets the energy saved during sleep. The energy required for wake-up is composed of three major components, namely (1) reactivation of the power management unit, (2) recharging of the power-switched voltage domains, and (3) recovering the system state. The previously presented power saving techniques during sleep offer different trade-offs between power consumption during sleep and the energy required for wake-up: The lower the power consumption during sleep, the higher the energy and time required for wake-up. This trade-off strongly depends on the cycle time defined by the typical ultra-low-power application scenarios, and must be carefully considered for the design of the power management architecture. The three major components of wake-up energy are discussed separately in the following, thereby evaluating the trade-off for the previously presented power saving techniques during sleep.

System Reactivation

The power management unit needs to be reactivated, depending on the power management architecture either partially or completely. This takes a few microseconds and consumes power, resulting in energy drawn from the battery. To minimize the wake-up energy, the power management unit must be particularly optimized for a fast start-up. While fast starting analog circuits typically require a large bias current, which increases their overhead during active mode, novel circuit design techniques enable a fast start-up in combination with nanoampere bias currents (Baumann et al., 2013). From an application perspective, the wake-up time is limited by the maximum tolerated interrupt latency, which may prohibit switching off the MCU digital core during sleep for some application scenarios.


Capacitance Recharging

When switching off the MCU digital core during sleep, either by power gating or by disabling the integrated voltage regulator, the voltage domain needs to be reactivated, and the associated capacitance needs to be recharged at wake-up. In case of power gating, the integrated voltage regulator is compensated by a large (external) load capacitance, while each voltage domain is in addition decoupled by small on-chip capacitances. While the voltage regulator remains active and, consequently, the main capacitance remains charged during sleep, the small on-chip decoupling capacitances are discharged in case the respective voltage domain is switched off. However, since these capacitances are small, the energy required for recharging is also small.

In contrast, the large (external) load capacitance is discharged when switching off the MCU digital core by disabling the integrated voltage regulator. Since the voltage regulator output is actively discharged during sleep, the conditions at wake-up are well defined, particularly including the energy required for capacitance recharging. The energy required can be expressed as:

$$E_{wake-up,cap} = \frac{1}{2} \cdot C_{LOAD} \cdot V_{CORE}^{2} \tag{3.8}$$

The load capacitance in this case becomes a vital parameter for system energy optimization. Evidently, the lowest possible capacitance at the voltage regulator output is preferred to achieve a fast and energy-efficient wake-up from sleep mode. This, however, is in strong contrast to the needs of the voltage regulator design, as will become more obvious in the further course of this work (particularly see Chap. 4.2). In case the integrated voltage regulator is implemented as LDO, the continuous control scheme does not necessarily rely on a large (external) load capacitance. The load capacitance can instead be considered as a degree of freedom for system energy optimization. While a small load capacitance enables an energy-efficient wake-up from sleep mode, the control bandwidth and thus also the quiescent current necessarily increase when the LDO is active. In contrast to an LDO, a large (external) load capacitance is an intrinsic part of any switch-mode voltage regulator due to its discontinuous control scheme (see also Chap. 3.2). A small load capacitance at the output demands a high switching frequency, which in turn causes high switching loss and thus results in a reduced power efficiency. At the same time, it also requires a large control bandwidth with a tremendous regulation overhead. A switch-mode voltage regulator is consequently not well suited to being periodically disabled in sleep mode and to recovering into active mode quickly and energy-efficiently.

The load capacitance determines not only the energy, but also the time required for wake-up. Due to the high impedance of the battery supply, the inrush current is limited and, consequently, the wake-up time is proportional to the capacitance to be recharged. Since the capacitance is small in case of power gating, it offers a fast wake-up time, particularly if the power management unit does not need to be reactivated. When switching off the MCU digital core by disabling the voltage regulator, the smallest possible capacitance is preferred to achieve an energy-efficient wake-up as well as a fast wake-up time.
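To give a feeling for the orders of magnitude, the following minimal sketch evaluates Eq. 3.8 and a simple recharge-time estimate for the two load capacitances used in Fig. 3.7 (470 nF and 4.7 nF). The core voltage of 1.2 V, the constant inrush current of 1 mA and the function names are assumptions made for illustration only; they are not taken from this work.

```python
def wakeup_cap_energy(c_load, v_core):
    """Recharge energy of the load capacitance according to Eq. 3.8 (in joules)."""
    return 0.5 * c_load * v_core ** 2


def wakeup_cap_time(c_load, v_core, i_inrush):
    """Recharge time for an assumed constant inrush current: t = C * V / I (in seconds)."""
    return c_load * v_core / i_inrush


V_CORE = 1.2      # assumed core voltage in volts (hypothetical)
I_INRUSH = 1e-3   # assumed constant inrush current in amperes (hypothetical)

for c_load in (470e-9, 4.7e-9):
    energy = wakeup_cap_energy(c_load, V_CORE)
    time = wakeup_cap_time(c_load, V_CORE, I_INRUSH)
    print(f"C_LOAD = {c_load * 1e9:6.1f} nF: "
          f"E = {energy * 1e9:8.3f} nJ, t = {time * 1e6:8.3f} us")
```

Under these assumptions, both the energy and the time scale linearly with the capacitance, so reducing the load capacitance by a factor of 100 reduces both by the same factor.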

State Recovery

When switching off the MCU digital core to minimize the power consumption during sleep, critical application data as well as the system state need to be retained. For this purpose, two basic strategies can be identified in literature: (1) dedicated state retention flip-flops, and (2) separate non-volatile memory (Henzler et al., 2006b). While a detailed investigation is beyond the scope of this work, the following section provides a brief overview of this design aspect, which is critical for the practicability of switching off the MCU digital core during sleep. Entering and exiting the sleep mode should be completely transparent from an application perspective.

The dedicated state retention flip-flops offer a fast recovery of the application data and the system state (Henzler et al., 2006b). The time required is negligible compared to the time required for reactivation of the power management unit. At the same time, the state retention flip-flops add a significant area overhead, and continue to cause leakage current during sleep. As an alternative to the state retention flip-flops, the data can be retained by copying to a separate non-volatile memory - either using a software-based approach with sequential register read and write, or using a scan-based approach. While this approach requires minimum area overhead and does not add any leakage current during sleep, significant energy and time are required for recovery. Usually, the energy required for saving and recovering the system state from non-volatile memory is rather high in flash based systems. By offering a fast and energy-efficient write capability, FeRAM memory enables an energy-efficient data and state retention (Zwerg et al., 2011; Baumann et al., 2013). In conclusion, there is also a fundamental trade-off here between power consumption during sleep and energy required for wake-up.

Conclusion on Efficient Power Mode Switching

The power management architecture for today’s mobile application processors most widely combines an inductive switch-mode voltage regulator for an energy-efficient operation during active mode with power gating to reduce the power consumption during sleep. While the power reduction during sleep is limited, this approach offers a fast and energy-efficient wake-up. The power management unit remains mostly active during sleep, the capacitance to be recharged is small, and dedicated state-retention flip-flops allow an energy-efficient state recovery. However, the “sweet-spot” for energy optimization of ultra-low-power MCU systems is different. Since these systems are tailored to the needs of applications characterized by excessive idle times and frequent wake-up, the lowest possible power consumption during sleep and an energy-efficient wake-up are mandatory.

Motivated by these considerations, this work proposes and pursues a different approach, in which the MCU digital core is switched off during sleep by disabling the integrated voltage regulator. In this way, this approach addresses both the leakage current of the digital core and the power management overhead. To enable an energy-efficient wake-up, a small capacitance at the regulator output is required. This is, however, in clear conflict with the needs of an inductive switch-mode voltage regulator. Instead, a linear low-dropout voltage regulator (LDO) is used, optimized for operation with a small capacitance. Compared to a switch-mode voltage regulator, its conversion efficiency is sacrificed (depending on the supply voltage), and the power consumption during active mode is increased. Nevertheless, since typical ultra-low-power MCU systems spend only a minor portion of their time and energy in active mode, this is accepted here. Since the ultra-low-power MCU system must adapt its speed of processing to the application requirements, the integrated voltage regulator should be able to adapt to the actual load requirements, thereby minimizing its overhead at low clock frequencies.

Fig. 3.7 shows the predicted power consumption of the basic ultra-low-power MCU system for a periodic sensor data analysis as a function of cycle time at both room temperature and high temperature. In the one case, the digital core remains powered during sleep, and consequently both leakage current and the power management unit continue to contribute to the power consumption. As a result, the total system power consumption becomes dominated by the sleep mode and saturates at long cycle times. In these cases, it is therefore beneficial to switch off the MCU digital core by disabling the integrated voltage regulator, in this way further reducing the system power consumption. Due to the long cycle times and the rare wake-up events, the additional energy required for wake-up is not significant. This picture changes at shorter cycle times, when the additional energy required for wake-up offsets the energy saved during sleep. In conclusion, a break-even time can be defined above which it is beneficial to enter a sleep mode with lower power consumption, but also higher wake-up overhead. Evidently, the energy required for wake-up is significantly reduced when minimizing the capacitance at the regulator output, and the break-even time shifts to shorter cycle times. While the power consumption during sleep increases at high temperature due to increasing leakage when the digital core remains powered, it is nearly independent of temperature variations when switching off the digital core. As a result, the break-even time shifts to shorter cycle times at high temperature.

[Figure 3.7: two panels, system energy [µWs] and system power [µW] versus cycle time [ms] (0.1 ms to 10000 ms, logarithmic axes); curves: MCU Powered at 25°C, MCU Powered at 85°C, LDO Disable with 470nF, LDO Disable with 4.7nF]

Fig. 3.7. Predicted system power consumption for a periodic sensor data analysis (taking 152 clock cycles at a clock frequency of 8 MHz) as a function of cycle time at VDD=3.0 V. Switching off the MCU digital core significantly reduces the power consumption during sleep, while minimizing the capacitance significantly reduces the energy required for wake-up. The energy required for state recovery is neglected for this analysis.
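The break-even time can be estimated with a minimal first-order sketch. It assumes that the active and wake-up phases are short compared to the cycle time, so that the average power is the sleep power plus the energy spent per cycle divided by the cycle time. All numerical values and function names are hypothetical and serve only to illustrate the trend; they are not the data behind Fig. 3.7.

```python
def average_power(t_cycle, p_sleep, e_active, e_wake):
    """Average power over one cycle, assuming the active phase is much shorter than t_cycle."""
    return p_sleep + (e_active + e_wake) / t_cycle


def break_even_cycle_time(p_sleep_powered, p_sleep_disabled, e_wake_extra):
    """Cycle time at which both sleep modes yield the same average power.

    e_wake_extra is the additional wake-up energy of the deeper sleep mode.
    Above the returned cycle time, the deeper sleep mode is beneficial.
    """
    return e_wake_extra / (p_sleep_powered - p_sleep_disabled)


# Hypothetical values for illustration only.
P_SLEEP_DISABLED = 1e-6   # sleep power with the regulator disabled (W)
E_WAKE = {"470 nF": 338.4e-9, "4.7 nF": 3.384e-9}   # extra wake-up energy (J)

for label, p_sleep_powered in (("room temperature", 10e-6), ("high temperature", 35e-6)):
    for cap, e_wake in E_WAKE.items():
        t_be = break_even_cycle_time(p_sleep_powered, P_SLEEP_DISABLED, e_wake)
        print(f"{label}, {cap}: break-even cycle time = {t_be * 1e3:.3f} ms")
```

In this model, a smaller recharge energy and a higher sleep power of the powered configuration both shorten the break-even time, which matches the qualitative behavior described above.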

3.5 Summary

By providing an analytical investigation of alternative energy saving concepts, this chapter reveals the fundamental trade-offs for system energy optimization and provides valuable initial directions for architectural decisions of ultra-low-power MCU systems. It is thereby shown that power management in all system operating conditions is the key for lowest energy systems. For this purpose, it is crucial to address the power management challenges and their trade-offs in the context of typical application scenarios.

Due to excessive idle times and accordingly very low average switching activity of typical ultra-low-power application scenarios, the system energy consumption is clearly dominated by leakage current during sleep. Consequently, stopping the system clock, thereby eliminating all active power contributors, is no longer a sufficient solution. To enable migration, modern ultra-low-power MCU systems fabricated in deep submicron CMOS technologies require efficient power saving techniques to overcome the process imperfections and minimize the power consumption during sleep. However, these power saving techniques partially offset the area and cost benefits due to technology scaling, and impose a significant delay and energy overhead when recovering back into active mode. There hence exists a penalty for the implementation of these techniques, as well as a penalty for activating them. The most efficient strategy for leakage avoidance is to switch off the MCU digital core, either completely or partially. While the conventional approach widely used today for mobile application processors is power gating, the MCU digital core can alternatively be switched off by utilizing the integrated voltage regulator. This approach does not only significantly reduce the leakage of the MCU digital core, but also the power management overhead. To achieve fast and energy-efficient wake-up, the lowest possible capacitance at the core supply is mandatory - which is in conflict with the needs of any switch-mode voltage regulator.

In conclusion, and as revealed by the systematic analysis of energy saving concepts provided in this chapter, supplying the MCU digital core by a fully-integrated low-dropout voltage regulator (LDO) with only a small load capacitance is highly beneficial for a broad range of typical ultra-low-power application scenarios. In detail, this solution enables ultra-low power consumption during sleep and energy-efficient wake-up while achieving lowest system costs. At the same time, however, a small load capacitance presents major challenges for the design of an LDO. The subsequent chapters therefore focus on the design and optimization of a fully-integrated LDO with only a small load capacitance - with the key targets of fast and energy-efficient start-up as well as lowest quiescent current under all operating conditions.

4 LDO Voltage Regulators for Supplying Digital Circuits

In the proposed ultra-low-power MCU system, the digital core is supplied by a linear low-dropout voltage regulator (LDO). Conventionally, such an LDO is stabilized by a large external capacitance at its output, which is at least in the range of some 100 nF. As described in the preceding chapter, the lowest possible capacitance at the LDO output is preferred to achieve a fast and energy-efficient wake-up from sleep mode. Consequently, the large external capacitance at the LDO output is removed and only a much smaller on-chip capacitance remains. This greatly affects the LDO design, resulting in severe design challenges for loop stability and transient behavior.

This chapter covers theoretical and practical issues which have to be considered for the design of a fully-integrated LDO supplying CMOS digital circuits. After introducing the basic LDO topology, a small-signal model is derived with the complete open-loop and closed-loop transfer-functions of both the pass-transistor and the error amplifier. Based on this regulator model, an overview of the LDO specifications is provided - including DC, small-signal AC, and large-signal performance parameters. With respect to this, the fundamental LDO design challenges and trade-offs are identified. Ultimately, the previously presented, fully-integrated LDO topologies are reviewed - revealing that the vast majority of them are not best suited to supply CMOS digital circuits.

4.1 LDO Fundamentals and Control Theory

The purpose of a linear low-dropout voltage regulator (LDO) is to generate a constant output voltage independent of its environment and operating conditions, particularly including supply voltage, load current, temperature and process variations. For this purpose, and as depicted in Fig. 4.1, an LDO essentially consists of a two-stage amplifier connected in a negative feedback loop (Widlar, 1971; Rincon-Mora, 2009). The error amplifier EA1 senses the output voltage, compares it against a reference voltage VREF, and modulates the resistance of the PMOS pass-transistor MPASS. In this way, the pass-transistor provides a variable current to the load circuit. The load circuit is here represented simplistically by a load current ILOAD in parallel with a load capacitance CLOAD and the corresponding equivalent series resistance RESR. The constant, temperature independent reference voltage VREF is commonly provided by a bandgap reference, which is not discussed further here (Kuijk, 1973; Song and Gray, 1983; Ivanov et al., 2011).

In practice, an LDO is only able to generate a constant output voltage within certain operating boundaries, particularly including the supply voltage and the load current range. Furthermore, an LDO requires a minimum headroom for proper regulation, which is defined as the dropout voltage. In contrast to its non-low-dropout counterpart, which solely differs by the pass-transistor type, the PMOS pass-transistor in common-source configuration enables proper regulation also at very low dropout voltages. This makes an LDO suitable for low-voltage, integrated power management solutions. However, since the PMOS pass-transistor presents a high output impedance to the load, the reduction in dropout voltage is achieved at the cost of severe design challenges for loop stability (Simpson, 1997). The loop stability is closely tied to the load capacitance: A capacitance range can be specified, which is required by the LDO at its output to maintain loop stability for a given load current range. On this basis, a distinction can be made between an externally compensated LDO requiring a large (external) load capacitance, and an internally compensated LDO requiring a much smaller (internal) load capacitance.

[Figure 4.1: circuit schematic; labeled elements: Bandgap Reference, VREF, EA1, VGATE, MPASS, VDD, VOUT, R1, R2, Feedback Loop, ILOAD, CLOAD, RESR]

Fig. 4.1. Circuit diagram of the basic linear low-dropout voltage regulator (LDO) topology.

4.1.1 Negative Feedback Theory

By considering the basic LDO topology as a linear time-invariant system with negative feedback and a low-pass, second-order behavior, the well-known negative feedback theory can be applied (Doyle et al., 1991; Razavi, 2001, p. 172ff). For this purpose, the basic LDO topology is translated into a small-signal equivalent circuit model as depicted in Fig. 4.2. The error amplifier is modeled, independent of its exact implementation, by an operational transconductance amplifier (OTA), which is characterized by the transconductance gm,EA and the output resistance rout,EA. The pass-transistor forms a common-source stage with negative voltage gain. It is characterized by the transconductance gm,PASS and the channel resistance rds,PASS, as well as the gate-source capacitance CGS,PASS and the gate-drain capacitance CGD,PASS, respectively. The feedback factor βFB is defined by the feedback resistors R1 and R2. Notably, the feedback signal needs to be connected to the positive input of the error amplifier in order to establish a negative feedback loop.

Open-Loop Small-Signal Analysis

The aim of the open-loop small-signal analysis is to find mathematical expressions for the LDO output voltage vOUT(s) as function of the LDO input perturbation signals, which are the error voltage vIN(s), the supply voltage vDD(s), as well as the load current iLOAD(s). An open-loop analysis of the basic LDO topology thereby requires cutting its feedback loop as indicated in Fig. 4.2. By neglecting the feedback resistors (such that rds,PASS ≪ R1 + R2) and applying Kirchhoff’s Current Law (KCL) to the gate node vGATE and the output node vOUT, the following expressions can be obtained:

[Figure 4.2: small-signal schematic; labeled elements: EA1 (VIN, VFB, rin, rout, Av·VIN), VREF, VGATE, MPASS (CGS, CGD, gm·VSG, rds), VDD, VOUT, feedback network β (R1, R2), Load (ILOAD, CLOAD, RESR), annotation "Cutting the feedback loop"]

Fig. 4.2. Small-signal equivalent circuit model of the basic LDO topology including the error amplifier, the pass-transistor as well as the load circuit represented by load current and load capacitance. An open-loop analysis requires cutting the LDO feedback loop.

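$$0 = \frac{A_{v,EA} \cdot v_{IN} - v_{GATE}}{r_{out,EA}} + s \cdot \left( C_{GS,PASS} \cdot (v_{DD} - v_{GATE}) + C_{GD,PASS} \cdot (v_{OUT} - v_{GATE}) \right) \tag{4.1}$$

$$0 = g_{m,PASS} \cdot (v_{DD} - v_{GATE}) + \frac{v_{DD} - v_{OUT}}{r_{ds,PASS}} - i_{LOAD} + s \cdot \left( C_{GD,PASS} \cdot (v_{GATE} - v_{OUT}) - \frac{C_{LOAD}}{1 + s \cdot R_{ESR} \cdot C_{LOAD}} \cdot v_{OUT} \right) \tag{4.2}$$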

where Av,EA = gm,EA · rout,EA defines the small-signal voltage gain of the error amplifier. Based hereupon, individual transfer-functions for each of the three LDO input perturbation signals can be determined. The individual open-loop transfer-functions are derived concisely in the following, thereby omitting intermediate steps. In order to obtain clear and tractable expressions, the equivalent series resistance is assumed to be considerably smaller than the channel resistance of the pass-transistor (such that RESR ≪ rds,PASS), and the high frequency components associated with the equivalent series resistance are neglected throughout the subsequent analysis.

First, both the supply voltage vDD(s) and the load current iLOAD(s) are set to zero, representing the LDO in open-loop configuration with no line and load perturbation. By solving Eq. 4.2 for vGATE and substituting it in Eq. 4.1, the open-loop transfer-function HOL(s), describing the LDO output voltage vOUT(s) as function of the error voltage vIN(s), is obtained.


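$$H_{OL}(s) = \left. \frac{v_{OUT}(s)}{v_{IN}(s)} \right|_{v_{DD}=0,\ i_{LOAD}=0}$$

$$H_{OL}(s) = g_{m,EA} \cdot r_{out,EA} \cdot g_{m,PASS} \cdot r_{ds,PASS} \cdot \frac{\left(1 - s \cdot \frac{C_{GD,PASS}}{g_{m,PASS}}\right) \cdot \left(1 + s \cdot R_{ESR} \cdot C_{LOAD}\right)}{1 + s \cdot \left[ r_{out,EA} \cdot \left(C_{GS,PASS} + (1 + A_{v,PASS}) \cdot C_{GD,PASS}\right) + r_{ds,PASS} \cdot \left(C_{LOAD} + C_{GD,PASS}\right) \right] + s^{2} \cdot r_{out,EA} \cdot r_{ds,PASS} \cdot \left(C_{GS,PASS} \cdot C_{GD,PASS} + C_{GS,PASS} \cdot C_{LOAD} + C_{GD,PASS} \cdot C_{LOAD}\right)} \tag{4.3}$$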

Analyzing the open-loop transfer-function HOL(s) reveals several interesting characteristics of the basic LDO topology, particularly with regard to its regulation performance and loop stability, as will become obvious in the further course of this small-signal analysis. At low frequencies (s = 0), the open-loop transfer-function simplifies to HOL(0) = gm,EA · rout,EA · gm,PASS · rds,PASS, corresponding to the open-loop small-signal voltage gain of the basic LDO topology.
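As a minimal numerical sketch, Eq. 4.3 can be evaluated directly to obtain the DC gain and the two poles of the second-order denominator. The component values below are hypothetical and chosen only for illustration, the function names are not part of this work, and the pass-transistor gain is assumed to be Av,PASS = gm,PASS · rds,PASS, as suggested by the DC value of Eq. 4.3.

```python
import cmath


def denominator_coeffs(rout_ea, gm_pass, rds_pass, cgs, cgd, c_load):
    """Return (a1, a2) of the denominator 1 + a1*s + a2*s^2 of Eq. 4.3."""
    av_pass = gm_pass * rds_pass  # assumed definition of A_v,PASS
    a1 = rout_ea * (cgs + (1 + av_pass) * cgd) + rds_pass * (c_load + cgd)
    a2 = rout_ea * rds_pass * (cgs * cgd + cgs * c_load + cgd * c_load)
    return a1, a2


def open_loop_gain(s, gm_ea, rout_ea, gm_pass, rds_pass, cgs, cgd, c_load, r_esr):
    """Evaluate H_OL(s) as written in Eq. 4.3 for a complex frequency s."""
    a1, a2 = denominator_coeffs(rout_ea, gm_pass, rds_pass, cgs, cgd, c_load)
    dc_gain = gm_ea * rout_ea * gm_pass * rds_pass
    zeros = (1 - s * cgd / gm_pass) * (1 + s * r_esr * c_load)
    return dc_gain * zeros / (1 + a1 * s + a2 * s * s)


def open_loop_poles(rout_ea, gm_pass, rds_pass, cgs, cgd, c_load):
    """Roots of a2*s^2 + a1*s + 1 = 0, i.e. the poles of H_OL(s) in rad/s."""
    a1, a2 = denominator_coeffs(rout_ea, gm_pass, rds_pass, cgs, cgd, c_load)
    root = cmath.sqrt(a1 * a1 - 4 * a2)
    return (-a1 + root) / (2 * a2), (-a1 - root) / (2 * a2)


# Hypothetical small-signal parameters (illustration only).
PARAMS = dict(gm_ea=10e-6, rout_ea=10e6, gm_pass=10e-3, rds_pass=1e3,
              cgs=10e-12, cgd=2e-12, c_load=1e-9, r_esr=0.1)
POLE_KEYS = ("rout_ea", "gm_pass", "rds_pass", "cgs", "cgd", "c_load")
POLE_ARGS = {key: PARAMS[key] for key in POLE_KEYS}

print("DC gain:", open_loop_gain(0, **PARAMS))
for pole in open_loop_poles(**POLE_ARGS):
    print("pole [rad/s]:", pole)
```

Varying `c_load` in this sketch shows how the load capacitance moves the poles, which is the link between load capacitance and loop stability discussed in this chapter.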

In order to determine the perturbation of the LDO output voltage vOUT(s) as function of the supply voltage vDD(s), both the error voltage vIN(s) and the load current iLOAD(s) are set to zero. By again solving Eq. 4.2 for vGATE and substituting it in Eq. 4.1, the open-loop transfer-function GOL(s) is obtained.

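$$G_{OL}(s) = \left. \frac{v_{OUT}(s)}{v_{DD}(s)} \right|_{v_{IN}=0,\ i_{LOAD}=0}$$

$$G_{OL}(s) = \frac{(1 + A_{v,PASS}) \cdot \left(1 + s \cdot r_{out,EA} \cdot (C_{GS,PASS} + C_{GD,PASS})\right) + s \cdot r_{out,EA} \cdot A_{v,PASS} \cdot C_{GS,PASS}}{1 + s \cdot \left[ r_{out,EA} \cdot \left(C_{GS,PASS} + (1 + A_{v,PASS}) \cdot C_{GD,PASS}\right) + r_{ds,PASS} \cdot \left(C_{LOAD} + C_{GD,PASS}\right) \right] + s^{2} \cdot r_{out,EA} \cdot r_{ds,PASS} \cdot \left(C_{GS,PASS} \cdot C_{GD,PASS} + C_{GD,PASS} \cdot C_{LOAD}\right)} \tag{4.4}$$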

As a result of the PMOS pass-transistor in common-source configuration, and as evident from the open-loop transfer-function GOL(s), the small-signal voltage gain from the supply voltage vDD(s) to the output voltage vOUT(s) is relatively high. The basic LDO topology is therefore not able to suppress supply voltage variations when operated in open-loop configuration, but instead relies on the closed-loop feedback, as will become obvious in the further course of this small-signal analysis.

Correspondingly, the open-loop output impedance ZOL(s) describes the perturbation of the LDO output voltage vOUT(s) as a function of the load current iLOAD(s). By setting both the supply voltage vDD(s) and the error voltage vIN(s) to zero, and again solving Eq. 4.2 for vGATE and substituting it in Eq. 4.1, the open-loop output impedance ZOL(s) is obtained.

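$$Z_{OL}(s) = \left. \frac{v_{OUT}(s)}{i_{LOAD}(s)} \right|_{v_{IN}=0,\ v_{DD}=0}$$

$$Z_{OL}(s) = - \frac{r_{ds,PASS} + s \cdot r_{out,EA} \cdot r_{ds,PASS} \cdot (C_{GS,PASS} + C_{GD,PASS}) + s \cdot R_{ESR} \cdot C_{LOAD}}{1 + s \cdot \left[ r_{out,EA} \cdot \left(C_{GS,PASS} + (1 + A_{v,PASS}) \cdot C_{GD,PASS}\right) + r_{ds,PASS} \cdot \left(C_{LOAD} + C_{GD,PASS}\right) \right] + s^{2} \cdot r_{out,EA} \cdot r_{ds,PASS} \cdot \left(C_{GS,PASS} \cdot C_{GD,PASS} + C_{GD,PASS} \cdot C_{LOAD}\right)} \tag{4.5}$$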

At low frequencies (s = 0), the open-loop output impedance ZOL(s) corresponds to the pass-transistor channel resistance rds,PASS. Since the pass-transistor channel resistance is relatively high, the basic LDO topology is also not able to suppress load current variations when operated in open-loop configuration, but again relies on the closed-loop feedback.


Closed-Loop Small-Signal Analysis

Based on the results obtained from the open-loop small-signal analysis, a closed-loop small-signal model of the basic LDO topology can be derived (Souvignet et al., 2013). For this purpose, the individual open-loop transfer-functions for each of the three input perturbation signals are combined according to the superposition theorem. In closed-loop operation, the error voltage is defined by vIN(s) = βFB · vOUT(s) − vREF(s). As a result, the closed-loop small-signal model of the basic LDO topology - describing the output voltage vOUT(s) as function of the reference voltage vREF(s), the supply voltage vDD(s) and the load current iLOAD(s) - can be written as:

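$$
v_{OUT}(s) = H_{CL}(s) \cdot v_{REF}(s) + G_{CL}(s) \cdot v_{DD}(s) + Z_{CL}(s) \cdot i_{LOAD}(s)
$$

$$
v_{OUT}(s) = \frac{H_{OL}(s)}{1 + \beta_{FB} \cdot H_{OL}(s)} \cdot v_{REF}(s) + \frac{G_{OL}(s)}{1 + \beta_{FB} \cdot H_{OL}(s)} \cdot v_{DD}(s) + \frac{Z_{OL}(s)}{1 + \beta_{FB} \cdot H_{OL}(s)} \cdot i_{LOAD}(s) \tag{4.6}
$$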

Owing to the closed-loop negative feedback, the impact of each of the three input perturbation signals on the output voltage is attenuated by a factor of (1 + βFB · HOL (s)). In this way, the output voltage becomes an accurate replica of the reference voltage vREF (s), which can be best illustrated when evaluating the above closed-loop LDO transfer-function at low frequencies (s = 0).

$$
V_{OUT} = \frac{A_{v,EA} \cdot A_{v,PASS} \cdot V_{REF} + (1 + A_{v,PASS}) \cdot V_{DD} - r_{ds,PASS} \cdot I_{LOAD}}{1 + \beta_{FB} \cdot A_{v,EA} \cdot A_{v,PASS}} \tag{4.7}
$$

In case the error amplifier gain Av,EA is high, the output voltage approaches its ideal target value, which is solely defined by the reference voltage VREF and the feedback factor βFB, such that VOUT = VREF / βFB.
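The convergence of the static output voltage towards VREF / βFB can be checked numerically. The following Python sketch evaluates only the reference-voltage term of Eq. 4.7; the function name and all numeric values are illustrative assumptions, not design values from this work.

```python
def static_vout_from_ref(v_ref, beta_fb, av_ea, av_pass):
    """Reference-voltage term of Eq. 4.7 (VDD and ILOAD contributions ignored)."""
    forward_gain = av_ea * av_pass
    return forward_gain / (1.0 + beta_fb * forward_gain) * v_ref


# Assumed example: VREF = 0.6 V and beta_FB = 0.5, i.e. an ideal target of 1.2 V.
V_REF, BETA_FB, AV_PASS = 0.6, 0.5, 10.0
for av_ea in (10.0, 100.0, 1000.0):
    v_out = static_vout_from_ref(V_REF, BETA_FB, av_ea, AV_PASS)
    print(f"Av,EA = {av_ea:6.0f}: VOUT = {v_out:.4f} V")
```

With increasing error amplifier gain, the printed output voltage approaches the ideal value from below.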

The small-signal model and its mathematical representation are the starting point for the further analysis of the basic LDO topology as well as the discussion of the LDO circuit implementation aspects. In addition, however, each LDO topology exhibits significant non-linearities, which have been disregarded so far and need to be considered during LDO circuit analysis. First and foremost, the characteristics of the pass-transistor vary dramatically with supply voltage and load current, introducing substantial non-linearity. Another source of non-linearity is large-signal slewing during transient conditions, particularly occurring at the pass-transistor gate node. The small-signal analysis can thus provide only approximate results for the LDO characteristics, as will become more evident in the following section.

4.1.2 Regulation Performance

While an LDO ideally keeps its output voltage perfectly constant independent of supply voltage and load current variations, it suffers in practice from both static and transient regulation errors. The regulation performance of an LDO can be classified into three categories: (1) static regulation performance, (2) transient regulation performance, and (3) high-frequency regulation performance, which are examined in the following for the basic LDO topology.

Evaluating the above closed-loop LDO transfer-function (compare Eq. 4.6) at low frequencies (s = 0) reveals the static regulation errors of the basic LDO topology. A steady-state voltage variation at the LDO output resulting from changes in load current is defined as load regulation. A parasitic line resistance in series with the load might thereby further degrade the load regulation performance.


Fig. 4.3. Schematic illustration of the LDO behavior in response to (a) a load transient step, and (b) a line transient step. While the static regulation errors define the steady-state voltage variations, the transient regulation errors define the LDO behavior in response to dynamic variations in supply voltage and load current, respectively.

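$$
\frac{\Delta V_{OUT}}{\Delta I_{LOAD}} = -\frac{r_{ds,PASS}}{1 + \beta_{FB} \cdot A_{v,EA} \cdot A_{v,PASS}} \cong -\frac{1}{\beta_{FB} \cdot A_{v,EA} \cdot g_{m,PASS}} \tag{4.8}
$$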

Correspondingly, a steady-state voltage variation resulting from changes in supply voltage is defined as line regulation.

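$$
\frac{\Delta V_{OUT}}{\Delta V_{DD}} = \frac{1 + A_{v,PASS}}{1 + \beta_{FB} \cdot A_{v,EA} \cdot A_{v,PASS}} \cong \frac{1}{\beta_{FB} \cdot A_{v,EA}} \tag{4.9}
$$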

Both the line and load regulation error are directly related to the error amplifier gain Av,EA. Hence, an accurate LDO output voltage clearly demands a high error amplifier gain. Though disregarded by the majority of publications, a static offset of the error amplifier VOS,EA directly contributes to the LDO regulation error.

$$
\frac{\Delta V_{OUT}}{\Delta V_{OS,EA}} = \frac{1}{\beta_{FB}} \tag{4.10}
$$
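To get a feeling for the magnitudes involved, the static error contributions of Eqs. 4.8 to 4.10 can be evaluated for one operating point. The Python sketch below is a minimal illustration; the function names and the example values (pass-transistor gain of 20, error amplifier gain of 1000, feedback factor of 0.5, offset of 5 mV) are assumptions chosen for readability only.

```python
def load_regulation(r_ds_pass, beta_fb, av_ea, av_pass):
    """Eq. 4.8: dVOUT/dILOAD in V/A (negative, the output drops with load)."""
    return -r_ds_pass / (1.0 + beta_fb * av_ea * av_pass)


def line_regulation(beta_fb, av_ea, av_pass):
    """Eq. 4.9: dVOUT/dVDD in V/V."""
    return (1.0 + av_pass) / (1.0 + beta_fb * av_ea * av_pass)


def offset_error(v_os_ea, beta_fb):
    """Eq. 4.10: output error caused by the error amplifier offset, in V."""
    return v_os_ea / beta_fb


# Assumed operating point: gm,PASS = 20 mS and rds,PASS = 1 kOhm give Av,PASS = 20.
GM_PASS, R_DS_PASS = 20e-3, 1e3
AV_PASS = GM_PASS * R_DS_PASS
BETA_FB, AV_EA, V_OS_EA = 0.5, 1000.0, 5e-3

print("load regulation [V/A]:", load_regulation(R_DS_PASS, BETA_FB, AV_EA, AV_PASS))
print("line regulation [V/V]:", line_regulation(BETA_FB, AV_EA, AV_PASS))
print("offset error    [V]  :", offset_error(V_OS_EA, BETA_FB))
```

For these assumed values, the offset contribution is not attenuated by the loop gain at all, which is why it deserves attention alongside line and load regulation.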

While the static regulation errors define the steady-state voltage variations, the transient regulation errors define the LDO behavior in response to fast transient variations in supply voltage and load current, respectively. Due to its finite gain-bandwidth, the LDO feedback loop is not able to respond instantaneously to these transient variations. The output voltage is instead sustained by charging or discharging the load capacitance CLOAD. Moreover, the equivalent series resistance RESR causes an additional voltage drop, which is determined by the difference between the pass-transistor drain current ISD,PASS and the load current ILOAD and appears immediately. The LDO behavior in response to fast transient variations can be most generally expressed as:

$$
\Delta V_{OUT} = \frac{1}{C_{LOAD}} \int_{t_0}^{t_1} \left( I_{SD,PASS}(t) - I_{LOAD}(t) \right) dt + \left( I_{SD,PASS}(t) - I_{LOAD}(t) \right) \cdot R_{ESR} \tag{4.11}
$$
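Eq. 4.11 can be made tangible with a strongly simplified case. The Python sketch below assumes an ideal load current step and assumes that the pass-transistor current does not change at all during a fixed response time, so that the current difference in Eq. 4.11 is constant; the function name and the numeric values are hypothetical.

```python
def transient_droop(delta_i_load, t_response, c_load, r_esr):
    """Output voltage change from Eq. 4.11 for a constant current deficit.

    Assumption: ISD,PASS(t) - ILOAD(t) = -delta_i_load during t_response.
    """
    current_deficit = -delta_i_load
    capacitive_part = current_deficit * t_response / c_load
    resistive_part = current_deficit * r_esr
    return capacitive_part + resistive_part


# Assumed step of 10 mA with a loop response time of 100 ns.
external = transient_droop(10e-3, 100e-9, c_load=1e-6, r_esr=10e-3)
integrated = transient_droop(10e-3, 100e-9, c_load=1e-9, r_esr=0.0)
print(f"1 uF external capacitor : {external * 1e3:.2f} mV")
print(f"1 nF on-chip capacitance: {integrated * 1e3:.2f} mV")
```

Under these assumptions the droop scales inversely with the load capacitance, which illustrates why a small integrated capacitance requires a much faster loop response.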

The LDO response time is related to various design parameters and can be divided into a settling phase as well as a slewing phase (Kamath et al., 1974; Azzolini et al., 2006). While the settling phase is determined by small-signal parameters only, the slew-rate is a large-signal parameter that is highly dependent on the error amplifier design, its biasing current as well as the parasitic pass-transistor gate capacitance.

In response to a fast load transient step, the LDO behavior during the settling phase is defined by the closed-loop output impedance ZCL (s) = ZOL (s) / (1 + βFB · HOL (s)), i.e. the open-loop output impedance ZOL (s) divided by the loop-gain term 1 + βFB · HOL (s). The LDO response time is approximately inversely proportional to the closed-loop LDO gain-bandwidth. A large load capacitance CLOAD as well as a large closed-loop gain-bandwidth improves the transient regulation performance. In addition to the settling phase, the LDO response time might be further delayed by the limited slew-rate at the pass-transistor gate node. Corresponding to the load transient response, the LDO behavior during the settling phase in response to a fast line transient step is defined by the closed-loop transfer-function GCL (s) = GOL (s) / (1 + βFB · HOL (s)). While the supply voltage of an ultra-low-power MCU system is rather constant in a battery-powered application, the digital nature of the load places stringent requirements on the LDO load transient performance. The focus throughout this work is therefore on the LDO load transient response.

Besides the static and transient regulation errors defined above, the power supply rejection ratio (PSRR) as a high-frequency regulation parameter refers to the ability of the LDO to regulate its output against low- and high-frequency variations in the supply voltage. It is therefore defined as the inverse of the supply-to-output transfer-function GCL (s) in closed-loop operation. The PSRR degrades for frequencies beyond the LDO gain-bandwidth, where the main noise path arises from the PMOS pass-transistor in common-source configuration (El-Nozahi et al., 2010). While it is not of major concern when supplying digital CMOS circuits, a high PSRR also at high frequencies is a key requirement when supplying noise-sensitive analog and RF circuits. For a detailed analysis of the PSRR, the interested reader is referred to El-Nozahi et al. (2010) as well as Rincon-Mora (2009).

In summary, and as illustrated in Fig. 4.3, an output voltage tolerance window can be specified, which is guaranteed by the LDO and combines all of the above listed static and transient regulation errors. Recapitulating the above considerations on the LDO regulation performance, it is clearly evident that (1) a good static regulation performance demands a high voltage gain, while (2) a good transient regulation performance demands a large gain-bandwidth of the LDO feedback loop. In conflict with these requirements is the loop stability, to be guaranteed under all operating conditions, as well as a preferably small quiescent current demand.

4.1.3 Loop Stability

Due to the high channel resistance of the PMOS pass-transistor, loop stability is one of the major challenges in LDO design. The frequency response of the basic LDO topology is determined by two low-frequency poles, a left-half-plane (LHP) zero as well as a right-half-plane (RHP) zero. The LDO feedback loop is therefore potentially unstable (Rincon-Mora and Allen, 1998b; Chava and Silva Martinez, 2004). Before giving attention to the practical aspects of loop stability for the externally compensated LDO topology as well as the internally compensated LDO topology in the subsequent sections, theoretical expressions for the pole and zero locations are derived in the following.

One low-frequency pole is associated with the pass-transistor gate node, while the other is associated with the LDO output node. The pole locations are obtained by evaluating the denominator of the open-loop transfer-function HOL (s) (see Eq. 4.3). While the denominator appears rather complicated, it yields intuitive expressions when the two low-frequency poles are assumed to be widely spaced (such that |ωp1| ≪ |ωp2|) (also see Razavi, 2001, p. 174). By writing the denominator in the form of a general second-order polynomial, it can be simplified to:

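$$
D = 1 + s \cdot \left( \frac{1}{\omega_{p1}} + \frac{1}{\omega_{p2}} \right) + \frac{s^2}{\omega_{p1} \cdot \omega_{p2}} \cong 1 + \frac{s}{\omega_{p1}} + \frac{s^2}{\omega_{p1} \cdot \omega_{p2}} \tag{4.12}
$$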

Under this approximation, and by evaluating the open-loop transfer-function HOL (s), the location of the dominant pole p1 can be expressed as:

$$
\omega_{p1} = \frac{1}{r_{out,EA} \cdot (C_{GS,PASS} + (1 + A_{v,PASS}) \cdot C_{GD,PASS}) + r_{ds,PASS} \cdot (C_{GD,PASS} + C_{LOAD})} \tag{4.13}
$$

while the location of the non-dominant pole p2 is given as:

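$$
\omega_{p2} = \frac{r_{out,EA} \cdot (C_{GS,PASS} + (1 + A_{v,PASS}) \cdot C_{GD,PASS}) + r_{ds,PASS} \cdot (C_{GD,PASS} + C_{LOAD})}{r_{out,EA} \cdot r_{ds,PASS} \cdot (C_{GS,PASS} \cdot C_{GD,PASS} + C_{GS,PASS} \cdot C_{LOAD} + C_{GD,PASS} \cdot C_{LOAD})} \tag{4.14}
$$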

The numerator of the open-loop transfer-function HOL (s) (see Eq. 4.3) exhibits both the LHP-zero and the RHP-zero explicitly. The LHP-zero is caused by the series combination of the load capacitance and its equivalent series resistance (ESR). Its location is given by the following expression:

$$
\omega_{z,ESR} = \frac{1}{R_{ESR} \cdot C_{LOAD}} \tag{4.15}
$$

The RHP-zero arises due to the feed-forward path established by the pass-transistor gate-drain capacitance CGD,PASS. Evaluating the numerator of the open-loop transfer-function HOL (s) yields its location:

$$
\omega_{z,RHP} = \frac{g_{m,PASS}}{C_{GD,PASS}} \tag{4.16}
$$

A more detailed analysis of the LDO transfer-function HOL (s) reveals the presence of further poles and zeros. However, all of these poles and zeros reside at much higher frequencies, in particular clearly above the LDO gain-bandwidth, and can therefore be neglected for the further LDO circuit analysis.
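The quality of the widely-spaced-pole approximation behind Eqs. 4.13 and 4.14 can be checked against the exact roots of the second-order denominator. The Python sketch below is only an illustration; the function names and component values (an externally compensated case with a load capacitance of 1 µF) are assumptions and do not stem from this work.

```python
import math


def denominator_coefficients(r_out_ea, r_ds_pass, gm_pass, c_gs, c_gd, c_load):
    """Coefficients a1, a2 of D(s) = 1 + a1*s + a2*s^2 (see Eqs. 4.13 and 4.14)."""
    av_pass = gm_pass * r_ds_pass
    a1 = r_out_ea * (c_gs + (1.0 + av_pass) * c_gd) + r_ds_pass * (c_gd + c_load)
    a2 = r_out_ea * r_ds_pass * (c_gs * c_gd + c_gs * c_load + c_gd * c_load)
    return a1, a2


def approximate_poles(a1, a2):
    """Pole magnitudes from Eqs. 4.13 and 4.14 (valid for widely spaced poles)."""
    return 1.0 / a1, a1 / a2


def exact_poles(a1, a2):
    """Exact pole magnitudes of D(s); requires real poles, i.e. a1^2 > 4*a2."""
    root = math.sqrt(a1 * a1 - 4.0 * a2)
    return (a1 - root) / (2.0 * a2), (a1 + root) / (2.0 * a2)


def zero_locations(gm_pass, c_gd, c_load, r_esr):
    """LHP ESR zero (Eq. 4.15) and RHP zero (Eq. 4.16) in rad/s."""
    return 1.0 / (r_esr * c_load), gm_pass / c_gd


# Assumed values: rout,EA = 1 MOhm, rds,PASS = 1 kOhm, gm,PASS = 20 mS,
# CGS = 10 pF, CGD = 2 pF, CLOAD = 1 uF, RESR = 0.1 Ohm.
a1, a2 = denominator_coefficients(1e6, 1e3, 20e-3, 10e-12, 2e-12, 1e-6)
print("approximate poles [rad/s]:", approximate_poles(a1, a2))
print("exact poles       [rad/s]:", exact_poles(a1, a2))
print("zeros (ESR, RHP)  [rad/s]:", zero_locations(20e-3, 2e-12, 1e-6, 0.1))
```

For this assumed parameter set the two poles are separated by roughly two decades, and the approximate expressions deviate from the exact roots by about one percent.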

4.2 LDO Topologies and Design Considerations

Based on the theoretical LDO performance definitions presented in the preceding chapter, the fundamental LDO design considerations are identified in the following. Ideally, an LDO should have a high voltage gain for an accurate final value as well as a large gain-bandwidth for a fast transient response, while loop stability is guaranteed under all operating conditions. In practice, these contradicting requirements are difficult to combine. The LDO specifications are in fact closely interrelated, leading to fundamental design challenges and trade-offs as illustrated in Fig. 4.4. These LDO design challenges and trade-offs become more apparent when considering the common LDO design procedure. For a given LDO specification, including the operating boundaries and regulation performance requirements, an LDO is commonly designed from the pass-transistor forward. The ability to source high load current while achieving low dropout voltage requires the use of a large PMOS pass-transistor. This accordingly translates into a large parasitic gate capacitance of the pass-transistor, which needs to be driven rapidly by the error amplifier. Moreover, the pass-transistor exhibits tremendous variations of its operating point across the LDO operating conditions, namely across supply voltage and load current variations. The LDO load current commonly spans a range of more than four decades. While the pass-transistor is operated in weak inversion at low load conditions, it passes into strong inversion with increasing load current. At high load conditions and very low supply voltage, it might even enter the triode region (depending on the pass-transistor sizing approach). The resistance seen from the pass-transistor drain is large and, above all, inversely proportional to the load current. The output pole thus moves widely with load current. In addition, the pass-transistor voltage gain also varies across load conditions. Clearly, the high variability of the pass-transistor operating conditions contributes significantly to the LDO design challenges and trade-offs.

Fig. 4.4. Every LDO topology has to cope with the fundamental design challenges and trade-offs, namely between high voltage gain for an accurate final value, large gain-bandwidth for a fast transient response, and loop stability under all operating conditions.

The LDO performance vitally depends on the LDO error amplifier. From the above considerations, two major challenges for its design can be deduced. (1) For loop stability of any LDO topology, two low-frequency poles need to be taken into consideration. Besides the pole associated with the LDO output and determined by the load capacitance, another pole is associated with the error amplifier output. Due to the high output resistance of the error amplifier and the large parasitic gate capacitance of the pass-transistor, this pole is located at low frequencies, resulting in a potentially unstable system. To nevertheless achieve loop stability, it might seem attractive to lower the output resistance of the error amplifier. This, however, is in strong conflict with the high voltage gain required for a good static accuracy. (2) The LDO transient response is determined by the gain-bandwidth and the slew-rate of the LDO error amplifier. To avoid slewing effects, a high quiescent current is needed to rapidly drive the large parasitic capacitance at the pass-transistor gate node. In this context, the load transient response is of particular concern when supplying digital CMOS circuits. In conflict with these two design challenges for the LDO error amplifier is its quiescent current demand. While the lowest possible quiescent current is obviously preferred from a system power perspective, its lower bound is defined by the gain-bandwidth and the slew-rate requirements of the LDO error amplifier.


The LDO design challenges and trade-offs are further exacerbated when minimizing the load capacitance, i.e. by removing the large (external) load capacitance conventionally used for compensation. Since the output pole is inversely proportional to the load capacitance, its frequency in this case increases drastically. The most obvious solution to maintain loop stability is to push the pole associated with the error amplifier output towards higher frequencies at the same rate. Similar considerations also apply to the transient response. Since the load capacitance acts as a passive charge reservoir under fast transient conditions, an even higher slew-rate is required for the LDO error amplifier when minimizing it. As a result, both the gain-bandwidth and the slew-rate requirements of the LDO error amplifier demand a tremendous quiescent current when minimizing the load capacitance, unless alternative design strategies are applied, such as high slew-rate voltage buffers and advanced frequency compensation schemes.

In summary, every LDO topology has to cope with these fundamental design challenges and trade-offs. During the design procedure, these challenges should be tackled jointly by topology definition and not by tweaking of design parameters, according to the structural design approach presented by Ivanov and Filanovsky (2004). In this respect, the load capacitance plays an essential role in LDO design, both with respect to loop stability and transient response. Consequently, the externally compensated LDO topology and the internally compensated LDO topology pursue fundamentally different approaches to address these design challenges and trade-offs. While the externally compensated LDO topology is compensated by a large (external) capacitance with the dominant pole at the LDO output, the internally compensated LDO topology employs some form of Miller compensation to establish an internal, dominant pole.

For easy benchmarking of different LDO topologies and implementations, a figure-of-merit (FOM) has been introduced by Hazucha et al. (2005). It combines the maximum load current (ILOAD,max), the load capacitance (CLOAD), the transient voltage error (∆VOUT) in response to a full-scale load current step as well as the LDO quiescent current (Iq). A smaller figure-of-merit thereby indicates a superior LDO performance.

$$
FOM\,[\mathrm{s}] = \frac{C_{LOAD} \cdot \Delta V_{OUT} \cdot I_q}{I_{LOAD,max}^{2}} \tag{4.17}
$$
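As a quick plausibility check of Eq. 4.17, the figure-of-merit can be computed for a hypothetical regulator. The Python sketch below uses made-up example values (100 pF, 100 mV, 10 µA, 10 mA) that are assumptions for illustration, not results of this work.

```python
def ldo_figure_of_merit(c_load, delta_v_out, i_q, i_load_max):
    """Figure-of-merit of Eq. 4.17 in seconds (smaller is better)."""
    return c_load * delta_v_out * i_q / i_load_max ** 2


# Hypothetical regulator: CLOAD = 100 pF, dVOUT = 100 mV, Iq = 10 uA, ILOAD,max = 10 mA.
fom = ldo_figure_of_merit(100e-12, 100e-3, 10e-6, 10e-3)
print(f"FOM = {fom * 1e12:.2f} ps")
```

Because the maximum load current enters quadratically, doubling it at otherwise unchanged values reduces the figure-of-merit by a factor of four.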

4.2.1 Externally Compensated LDO Topology

The externally compensated LDO topology relies on the presence of a large (external) load capacitance. This capacitance serves as a passive charge reservoir, in this way playing an important role for both loop stability and transient response (Rincon-Mora and Allen, 1998a; Chava and Silva Martinez, 2004). These two major challenges in LDO design are examined in the following for the externally compensated LDO topology. Based on these results, the previously presented, externally compensated LDO topologies are briefly reviewed with a particular view to addressing these design challenges.

Loop Stability

Fig. 4.5 illustrates the schematic frequency response of the externally compensated LDO topology at different load conditions. The frequency response is dominated by the pole ωp,OUT, which is associated with the LDO output (Rincon-Mora, 2009, p. 197ff). Due to the large (external) load capacitance and the high channel resistance of the PMOS pass-transistor, this pole is typically located at low frequencies. Throughout the subsequent analysis, the external load capacitance is assumed to be considerably larger than all other parasitic capacitances, particularly including the pass-transistor gate capacitance (CLOAD ≫ CGS,PASS, CGD,PASS).

Fig. 4.5. Schematic frequency response of the externally compensated LDO topology at different load conditions. The frequency response is dominated by the pole associated with the LDO output pOUT. The critical condition for loop stability is at high load.

The location of the output pole pOUT is obtained by evaluating the expression for the dominant pole p1 (see Eq. 4.13). Assuming the pass-transistor is operated in strong inversion (which is valid for mid-range to high load conditions), the dominant pole frequency can be approximated as:

$$
\omega_{p,OUT} \cong \frac{1}{r_{ds,PASS} \cdot C_{LOAD}} \cong \frac{\lambda_{si} \cdot I_{LOAD}}{C_{LOAD}} \tag{4.18}
$$

Since the load current commonly spans a range of more than four decades, the output pole moves widely with load current due to the varying channel resistance of the pass-transistor. To maintain loop stability even at high load conditions, either a larger load capacitance is required, or alternatively, the first non-dominant pole pGATE needs to be pushed to higher frequencies. By evaluating the expression for the non-dominant pole p2 (see Eq. 4.14), its location can be expressed as:

$$
\omega_{p,GATE} = \frac{1}{r_{out,EA} \cdot (C_{GS,PASS} + (1 + A_{v,PASS}) \cdot C_{GD,PASS})}
$$

$$
\omega_{p,GATE} \cong \frac{1}{r_{out,EA} \cdot (C_{GS,PASS} + C_{GD,PASS})} \tag{4.19}
$$

This pole is, to a first approximation, independent of the load current. It can be pushed to higher frequencies by decreasing the error amplifier output resistance. This in turn either requires additional LDO quiescent current, or alternatively, results in reduced voltage gain of the error amplifier. Notably, the pass-transistor is not (considerably) affected by Miller multiplication for the externally compensated LDO topology. As the dominant pole pOUT is associated with the output here, the voltage gain of the pass-transistor drops well in advance of the non-dominant pole frequency, such that Av,PASS ≤ ωp,GATE / ωp,OUT. The RHP-zero zRHP can thus be neglected for the analysis of the externally compensated LDO topology.


Not only the location of the dominant pole varies significantly over load conditions, but also the pass-transistor voltage gain Av,PASS = gm,PASS · rds,PASS. Assuming again the pass-transistor is operated in strong inversion, the total voltage gain of the externally compensated LDO topology can be expressed as:

$$
A_{v,LDO} = A_{v,EA} \cdot A_{v,PASS} \cong A_{v,EA} \cdot \frac{1}{\lambda_{si}} \cdot \sqrt{\frac{2 \cdot \mu_p \cdot C_{OX} \cdot \left( \frac{W}{L} \right)_{PASS} \cdot (1 + \lambda_{si} \cdot V_{SD,PASS})}{I_{LOAD}}} \tag{4.20}
$$

The pass-transistor voltage gain drops with the reciprocal of the square root of the load current, in this way partially counteracting the frequency increase of the dominant pole pOUT. But still, the critical case for loop stability is clearly at maximum load condition. Assuming both the first non-dominant pole pGATE and the ESR zero zESR are located well beyond the unity-gain frequency, the gain-bandwidth of the externally compensated LDO topology can be expressed as:

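$$
\omega_{GBW} = A_{v,LDO} \cdot \omega_{p,OUT} \cong \frac{A_{v,EA}}{C_{LOAD}} \cdot \sqrt{2 \cdot \mu_p \cdot C_{OX} \cdot \left( \frac{W}{L} \right)_{PASS} \cdot I_{LOAD}} \tag{4.21}
$$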

For the externally compensated LDO topology, there is in conclusion a fundamental trade-off between high accuracy (or, more precisely, high LDO voltage gain) and loop stability. This trade-off can be resolved by pushing the first non-dominant pole to higher frequencies, which results, however, in a quiescent current penalty.
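The load dependence expressed by Eqs. 4.18, 4.20 and 4.21 can be tabulated with a few lines of Python. The sketch below is a simplified illustration for the strong-inversion case; the function name, the lumped parameter k_pass = µp · COX · (W/L)PASS and all numeric values are assumptions.

```python
import math


def ext_comp_loop(i_load, av_ea, c_load, k_pass, lambda_si, v_sd_pass):
    """Output pole, DC gain and gain-bandwidth of the externally compensated LDO.

    k_pass lumps mu_p * C_OX * (W/L)_PASS in A/V^2. The gain-bandwidth is formed
    as the product of Eq. 4.20 and Eq. 4.18 and therefore keeps the
    (1 + lambda_si * V_SD,PASS) factor that Eq. 4.21 neglects.
    """
    w_p_out = lambda_si * i_load / c_load                       # Eq. 4.18
    av_pass = math.sqrt(2.0 * k_pass * (1.0 + lambda_si * v_sd_pass) / i_load) / lambda_si
    av_ldo = av_ea * av_pass                                    # Eq. 4.20
    return w_p_out, av_ldo, av_ldo * w_p_out


# Assumed values: Av,EA = 100, CLOAD = 1 uF, k_pass = 0.1 A/V^2,
# lambda_si = 0.1 1/V, VSD,PASS = 0.2 V.
for i_load in (100e-6, 1e-3, 10e-3):
    w_p, gain, w_gbw = ext_comp_loop(i_load, 100.0, 1e-6, 0.1, 0.1, 0.2)
    print(f"ILOAD = {i_load:8.1e} A: wp,OUT = {w_p:9.3e} rad/s, "
          f"Av,LDO = {gain:9.3e}, wGBW = {w_gbw:9.3e} rad/s")
```

Across two decades of load current, the output pole rises by a factor of 100 while the gain falls by a factor of 10, so the gain-bandwidth grows by a factor of 10 and the stability margin is smallest at maximum load.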

Transient Response

For the externally compensated LDO topology, the large (external) load capacitance serves as a passive charge reservoir. Due to its limited gain-bandwidth, the error amplifier is not able to instantaneously change the pass-transistor gate voltage in response to fast transient variations of the supply voltage and/or the load current. While the pass-transistor is thus not able to instantaneously supply the load current demanded, the large load capacitance provides charge to the load, thus smoothing out the LDO transient under- and overshoot to a major extent (Milliken et al., 2007). This becomes particularly apparent when the transient step is much faster than the LDO gain-bandwidth, which is usually the case. The load transient response of the externally compensated LDO topology can be best illustrated by evaluating the closed-loop output impedance ZCL (s) = ZOL (s) / (1 + βFB · HOL (s)) (see Eq. 4.5), i.e. the open-loop output impedance ZOL (s) divided by the loop-gain term 1 + βFB · HOL (s). At low frequency, the open-loop voltage gain is high, and thus, due to negative feedback, the load regulation is good. At frequencies beyond the dominant pole pOUT, both the open-loop voltage gain and the open-loop output impedance start to drop. Since the closed-loop output impedance is approximately the ratio of both, it remains approximately constant. At frequencies beyond the LDO gain-bandwidth, the denominator of the closed-loop output impedance approaches unity, while the numerator continues to decrease, causing the closed-loop output impedance to drop further. Since the closed-loop output impedance is low over the entire frequency range, the externally compensated LDO topology in conclusion exhibits a very good transient performance. In particular, both loop stability and transient response can be consistently addressed during the design procedure, independent of the load capacitance. By increasing the quiescent current, the first non-dominant pole is pushed to higher frequencies, improving the loop stability, and the gain-bandwidth is increased, enhancing the transient response.


Review of Previous Work

The most basic approach for improving the loop stability of the externally compensated LDO topology is to utilize the equivalent series resistance (ESR) of the external load capacitance (Rincon-Mora and Allen, 1998a; Lee, 1999). In this way, a left-half-plane (LHP) zero zESR is introduced, which counteracts the negative phase shift of the first non-dominant pole pGATE. For proper compensation, the location of the zero zESR must be carefully chosen and well defined such that loop stability can be ensured under all load conditions. The series resistance of a capacitor is, however, not properly specified and varies significantly over temperature. This compensation scheme becomes even more problematic as a large series resistance causes undesirably large output voltage variations during transient conditions.

Numerous proposals have been made to overcome the dependence on the LHP-zero. The most widely proposed approach is to interpose a buffer stage between the error amplifier output and the pass-transistor gate node. In this way, the pass-transistor is effectively isolated from the error amplifier output. The buffer stage is most basically realized by a PMOS voltage follower, as for instance proposed by Rincon-Mora and Allen (1998a). In this way, the first non-dominant pole pGATE is split into two higher frequency poles. But still, due to the large pass-transistor gate capacitance, a tremendous quiescent current is required for the voltage follower to push these poles well beyond the unity-gain frequency of the LDO feedback loop. By applying shunt feedback techniques to the buffer stage, its output impedance can be further reduced without increasing the quiescent current demand. The first non-dominant pole, associated with the pass-transistor gate node, can in this way be pushed far beyond the LDO unity-gain frequency. This approach is for instance proposed by Al Shyoukh et al. (2007), here referred to as buffer impedance attenuation technique (BIA). To reduce LDO quiescent current particularly at low load conditions, the buffer stage is in this case biased depending on the actual load current.

Summary

For the externally compensated LDO topology, the two major design challenges, loop stability and transient response, can be consistently solved during its design procedure. By increasing the LDO quiescent current, the gain-bandwidth is increased and thus the transient voltage variations are reduced. At the same time, the first non-dominant pole is pushed to higher frequencies, thereby improving the loop stability. When the LDO load capacitance is reduced, the design challenges become more stringent, resulting in an LDO quiescent current penalty.

4.2.2 Internally Compensated LDO Topology

The current research on LDO design is focused on removing the large (external) load capacitance while maintaining loop stability and fast transient response at sufficiently low quiescent current levels. The characteristics of the externally compensated LDO topology suffer significantly when removing the external capacitance. For this reason, the internally compensated LDO topology pursues alternative approaches for error amplifier design and loop compensation. Though the internally compensated LDO topology is broadly referred to as capacitor-less LDO in literature (Chen et al., 2007; Or and Leung, 2010), this term is highly misleading since this LDO topology still possesses an internal, merely much smaller load capacitance.

In the following, the two major challenges in LDO design, loop stability as well as transient response, are discussed for the internally compensated LDO topology. Based on these results, the previously presented, internally compensated LDO topologies are reviewed with a particular view to addressing these design challenges.

Fig. 4.6. Schematic frequency response of the internally compensated LDO topology at different load conditions. The frequency response is dominated by the pole associated with the pass-transistor gate (pGATE). The critical condition for loop stability is at low load.

Loop Stability

Since the load capacitance is small for the internally compensated LDO topology, the dominant pole is no longer associated with the output node as for the externally compensated LDO topology. Instead, the internally compensated LDO topology employs some form of Miller compensation technique to achieve loop stability (Milliken et al., 2007; Rincon-Mora, 2009, p. 200ff.). The basic principle of this compensation scheme is illustrated in the following using the single Miller compensation (SMC), while in practice more advanced Miller compensation schemes such as Nested Miller compensation (NMC) or Reverse-Nested Miller compensation (RNMC) are prevalent. Fig. 4.6 illustrates the typical frequency response of the internally compensated LDO topology at different load conditions. By introducing a single compensation capacitance, in the following referred to as CM, across the gate and drain node of the pass-transistor, an internal dominant pole pGATE is established, which determines the LDO frequency response. Due to the pole-splitting effect, the dominant pole is pushed to lower frequencies by the Miller multiplication, while the first non-dominant pole is pushed to higher frequencies due to shunt feedback. In this way, the dependency on the load-dependent output pole pOUT is eliminated and a rather constant gain-bandwidth is established. Throughout the subsequent analysis, the Miller compensation capacitance CM is assumed to be considerably larger than all other parasitic capacitances, particularly including the pass-transistor gate capacitance (CM ≫ CGS,PASS, CGD,PASS) as well as the load capacitance (CM ≫ CLOAD).

By evaluating the expression for the dominant pole p1 (see Eq. 4.13), the location of the dominant pole pGATE can be expressed as:

$$
\omega_{p,GATE} = \frac{1}{r_{out,EA} \cdot (C_{GS,PASS} + (1 + A_{v,PASS}) \cdot (C_{GD,PASS} + C_M))}
$$

$$
\omega_{p,GATE} \cong \frac{1}{g_{m,PASS} \cdot r_{ds,PASS} \cdot r_{out,EA} \cdot C_M} \tag{4.22}
$$


While the error amplifier output resistance rout,EA as well as the Miller compensation capacitance CM are essentially independent of the load condition, the pass-transistor voltage gain decreases with the reciprocal of the square root of the load current (assuming the pass-transistor is operated in strong inversion). As a result, the dominant pole pGATE moves towards higher frequencies with increasing load current.

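$$
\omega_{p,GATE} \cong \frac{\lambda_{si} \cdot \sqrt{I_{LOAD}}}{r_{out,EA} \cdot C_M \cdot \sqrt{2 \cdot \mu_p \cdot C_{OX} \cdot \left( \frac{W}{L} \right)_{PASS} \cdot (1 + \lambda_{si} \cdot V_{SD,PASS})}} \tag{4.23}
$$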

For the internally compensated LDO, the first non-dominant pole pOUT is associated with the LDO output node. Evaluating the expression for the non-dominant pole p2 (see Eq. 4.14) yields its location:

$$
\omega_{p,OUT} = \frac{g_{m,PASS} \cdot C_M}{C_{GS,PASS} \cdot C_{LOAD} + C_{GS,PASS} \cdot C_M + C_{LOAD} \cdot C_M} \cong \frac{g_{m,PASS}}{C_{LOAD}} \tag{4.24}
$$

Assuming the pass-transistor is operated in strong inversion (which is valid for mid-range to high load conditions), the first non-dominant pole pOUT moves to higher frequency with the square root of the load current.

$$
\omega_{p,OUT} \cong \frac{\sqrt{2 \cdot \mu_p \cdot C_{OX} \cdot \left( \frac{W}{L} \right)_{PASS} \cdot I_{LOAD}}}{C_{LOAD}} \tag{4.25}
$$

The internally compensated LDO topology is therefore usually stable at high load conditions. At low load conditions, in contrast, the pass-transistor transconductance gm,PASS decreases significantly and the output pole is pushed to lower frequencies in close proximity to the error amplifier pole. To maintain loop stability, the output pole thus must remain at high frequencies, which limits both the load capacitance to rather small values and the minimum load current to rather high values.

In addition to pole splitting, the Miller compensation capacitance CM forms a parasitic feed-forward path, resulting in a right-half-plane (RHP) zero. Its location is given by:

$$
\omega_{z,RHP} = \frac{g_{m,PASS}}{C_M} \cong \frac{\sqrt{2 \cdot \mu_p \cdot C_{OX} \cdot \left( \frac{W}{L} \right)_{PASS} \cdot I_{LOAD}}}{C_M} \tag{4.26}
$$

At high load conditions, the pass-transistor transconductance is large. The small-signal output current is therefore significantly larger than the feed-forward current, and the effect of the RHP-zero appears only at very high frequencies. At low load conditions, in contrast, a small pass-transistor transconductance not only reduces the effectiveness of Miller pole-splitting, but also causes the RHP-zero to move to lower frequencies (Leung and Mok, 2001). As a result, the stability of the LDO feedback loop is significantly degraded at low load conditions.

Not only the pole locations vary significantly over load conditions, but also the pass-transistor voltage gain Av,PASS = gm,PASS · rds,PASS. Assuming again the pass-transistor is operated in strong inversion, the total voltage gain of the internally compensated LDO topology can be expressed as:

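$$A_{v,LDO} = A_{v,EA} \cdot A_{v,PASS} \cong A_{v,EA} \cdot \frac{1}{\lambda_{si}} \cdot \sqrt{2 \cdot \mu_p \cdot C_{OX} \cdot \left(\frac{W}{L}\right)_{PASS} \cdot \frac{1 + \lambda_{si} \cdot V_{SD,PASS}}{I_{LOAD}}} \tag{4.27}$$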


The pass-transistor voltage gain drops with the reciprocal of the square root of the load current, and in this way compensates for the frequency increase of the dominant pole pGATE. Assuming both the first non-dominant pole pOUT and the RHP-zero zRHP are located well above the unity-gain frequency, the LDO gain-bandwidth can be expressed as:

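$$\omega_{GBW} = A_{v,LDO} \cdot \omega_{p,GATE} \cong \frac{g_{m,EA} \cdot r_{out,EA} \cdot g_{m,PASS} \cdot r_{ds,PASS}}{g_{m,PASS} \cdot r_{ds,PASS} \cdot r_{out,EA} \cdot C_M} = \frac{g_{m,EA}}{C_M} \tag{4.28}$$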

The gain-bandwidth of the internally compensated LDO topology is evidently independent of the load current, which is in contrast to the externally compensated LDO topology.

For an internally compensated LDO topology, in conclusion, the load capacitance is limited to comparatively small values in order to achieve loop stability. Depending on the exact LDO topology and implementation, the maximum load capacitance is restricted to about 100 pF, while a minimum load current of about 100 µA to 5 mA is required to guarantee loop stability of the LDO. The required minimum load current thereby increases for larger load capacitances, and vice versa (Guo and Leung, 2010). Numerous compensation techniques have been proposed to overcome the stability limitations at low load conditions. A detailed discussion of these techniques is deferred to the following review of previously presented, internally compensated LDO topologies.

Transient Response

While the load capacitance is limited to comparatively small values in order to achieve loop stability, this is in strong conflict with the transient performance of the internally compensated LDO topology. Due to the lack of a large (passive) charge reservoir, large transient under- and overshoots occur in response to high slew-rate load current steps. This effect is further aggravated by the low gain-bandwidth of the Miller compensation, resulting in a slow response time. The load transient response of the internally compensated LDO topology can be best illustrated by evaluating the closed-loop output impedance ZCL(s) = ZOL(s) / (1 + βFB · HOL(s)) (see Eq. 4.5), which is determined by the open-loop output impedance ZOL(s) divided by one plus the loop gain βFB · HOL(s), where HOL(s) is the open-loop transfer-function. At low frequency, the open-loop voltage gain is high, and thus, due to negative feedback, the load regulation is good. At frequencies beyond the dominant pole pGATE, however, the open-loop voltage gain starts to drop, while the open-loop output impedance remains constant, which is different compared to the externally compensated LDO topology. As a result, the closed-loop output impedance begins to rise, and approaches the open-loop value at the LDO gain-bandwidth. At frequencies beyond the LDO gain-bandwidth, the loop action is suppressed and the closed-loop output impedance is identical to the open-loop output impedance. Ultimately, the open-loop output impedance also starts to drop at frequencies beyond the first non-dominant pole pOUT. In conclusion, there is a frequency region, spanning three to five decades, in which the closed-loop output impedance is high, resulting in a poor transient performance. In addition, the LDO transient response is further limited by slewing effects at the pass-transistor gate node.
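The frequency behavior of the closed-loop output impedance described above can be sketched numerically. The following is a minimal sketch under simplifying assumptions that are not part of this work: the open-loop output impedance is modeled as rds,PASS in parallel with CLOAD, the loop has a single dominant pole at the gate node, βFB = 1, and all component values are hypothetical.

```python
import math

def z_closed_loop(f_hz, r_ds, c_load, a0, f_p_gate, beta_fb=1.0):
    """Z_CL(s) = Z_OL(s) / (1 + beta_FB * H_OL(s)) for a single-pole loop model."""
    s = 2j * math.pi * f_hz
    z_ol = r_ds / (1.0 + s * r_ds * c_load)             # rds,PASS || C_LOAD
    h_ol = a0 / (1.0 + s / (2.0 * math.pi * f_p_gate))  # dominant pole p_GATE
    return z_ol / (1.0 + beta_fb * h_ol)

# Hypothetical values: 60 dB DC gain, p_GATE at 1 kHz (gain-bandwidth of 1 MHz),
# rds = 100 Ohm and C_LOAD = 100 pF (output pole near 16 MHz).
R_DS, C_LOAD, A0, F_P_GATE = 100.0, 100e-12, 1000.0, 1e3

for f in (1.0, 1e3, 1e5, 1e6, 5e6, 1e8, 1e9):
    z = abs(z_closed_loop(f, R_DS, C_LOAD, A0, F_P_GATE))
    print(f"{f:12.0f} Hz   |Z_CL| = {z:8.3f} Ohm")
```

With these numbers the impedance starts at roughly rds,PASS / (1 + A0), rises towards rds,PASS around the gain-bandwidth, and falls again only beyond the output pole.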

In conclusion, the requirements of loop stability and transient response cannot be satisfied jointly for the internally compensated LDO topology, but instead result in contradictory trade-offs. From a loop stability perspective, the pass-transistor gate node needs to be slowed down in order to push the associated (dominant) pole to lower frequencies. However, from a transient response perspective, this node needs to be charged and discharged rapidly. As an application-level way around this dilemma, Guo and Leung (2010) introduce the notion of edge time, thereby limiting the slope of the load current step. However, the slope is defined by application requirements and is usually not a free design parameter. Alternatively, dedicated transient enhancement circuits have been proposed to improve the LDO transient response, either in the form of momentary current boosting (for instance actuated by capacitive coupling) or in the form of a second, transient feedback loop. Further details will become obvious when reviewing the previously presented, internally compensated LDO topologies in the following section.

Review of Previous Work

For the single Miller compensation (SMC) scheme as discussed in the loop stability section, the Miller compensation capacitance required to push the LDO output pole beyond the unity-gain frequency becomes prohibitively large. In order to limit the on-chip capacitance to an acceptable value, an additional gain stage can be interposed between the error amplifier and the pass-transistor. These multi-stage LDO topologies are prevalent and represent the state of the art in the design of internally compensated LDOs. They inherently need to be compensated by advanced Miller compensation schemes such as Nested Miller compensation (NMC) or Reverse-Nested Miller compensation (RNMC) in order to achieve loop stability and high gain-bandwidth with a reasonable on-chip compensation capacitance and quiescent current. This approach is well known in the design of operational transconductance amplifiers, and requires particular care of the compensation network (Leung and Mok, 2001; Fan et al., 2005). These Miller compensation schemes particularly create a set of non-dominant complex conjugate poles, potentially causing loop stability issues. In contrast to transconductance amplifiers, however, an LDO experiences a large load capacitance in combination with extreme variations of the pass-transistor output impedance. As a result, the multi-stage LDO topologies suffer from potential loop stability issues at low load conditions due to gain peaking, limiting both the minimum load current and the maximum load capacitance.

To reduce the quality factor of the non-dominant complex conjugate poles and in this way achieve loop stability also at low load conditions, Leung and Mok (2003) propose a damping-factor-control (DFC) compensation scheme. An internal dominant pole is established by Miller compensation pole splitting. At the same time, the phase shift of the output pole is canceled by an internally generated left-half plane (LHP) zero. Since the LHP-zero is fixed and consequently does not track the load-dependent output pole, the loop stability issues at low load conditions are however still not solved. The presented internally compensated LDO with DFC compensation is unstable for load currents below 5 mA at a maximum on-chip load capacitance of 100 pF. The minimum load current required for maintaining loop stability can be reduced to 100 µA by using a Q-reduction technique as proposed by Lau et al. (2007). By placing an additional compensation capacitance across the first and second stage of the error amplifier, a feed-forward path is established, in this way reducing the quality factor of the non-dominant complex conjugate poles while only slightly reducing the LDO gain-bandwidth. An alternative approach to mitigate the stability issues at low load conditions is proposed by Kwok and Mok (2002). By applying a dynamic pole-zero tracking and cancellation technique, an internal LHP-zero is generated, which is used to cancel the phase shift of the output pole. To track the output pole, its position is adapted depending on the load current. However, since the tuning range of the LHP-zero is limited, this LDO topology is also not stable over the entire load current range. Both approaches for avoiding loop stability issues at low load conditions and large load capacitances, the Q-reduction technique as well as adaptive zero compensation, are essentially combined by Yang et al. (2008). In this way, the minimum load limitation is reduced to load currents below 50 µA.

Due to the large Miller compensation capacitance, all of the internally compensated LDO topologies reviewed above suffer from a low gain-bandwidth. In order to nevertheless achieve a fast transient response, dedicated transient boosting techniques have been proposed (Or and Leung, 2010; Guo and Leung, 2010). The key idea behind these techniques is to momentarily increase the bias current of the LDO error amplifier when voltage spikes appear at the LDO output in order to overcome the slew-rate limitations due to the large parasitic gate capacitance of the pass-transistor. The transient boosting techniques are essentially based on capacitive coupling between the LDO output node and an internal bias voltage node. However, since these LDO topologies resort to advanced Miller compensation schemes to achieve loop stability, they still suffer from stability issues at low load conditions.

An alternative approach for improving the LDO transient behavior is the usage of multiple (active) feedback loops. Besides the main feedback loop including the error amplifier, these topologies comprise a fast feedback path to achieve the very fast transient response needed for small load capacitances. To enhance the slew-rate at the pass-transistor gate node, Chen et al. (2007) for instance propose the introduction of an additional current feedback loop. With the help of a sense transistor, a current proportional to the load current is generated. This sense current is combined by a transimpedance stage with the error amplifier output and fed back to the pass-transistor gate node. In this way, the slew-rate of the error amplifier is boosted by this additional feedback current. The secondary current feedback loop thereby offers a significantly faster response compared to the main voltage feedback loop. Conceptually similar, Milliken et al. (2007) propose to use a differentiator composed of a Miller compensation capacitance connected between the gate and drain node of the pass-transistor. To avoid the feed-forward path and to introduce an LHP-zero, a unity-gain current buffer is placed in series with the Miller compensation capacitance. At the same time, the Miller compensation capacitance introduces an auxiliary fast transient path. A change in the output voltage is sensed by the capacitance, and the resulting current signal is injected into the pass-transistor gate node. Hazucha et al. (2005) propose to employ two active feedback loops, in which a fast stage provides high bandwidth, while a slow stage provides high gain as part of a replica-based output. While the dominant pole is associated with the gate node of the pass-transistor just as for any other internally compensated LDO topology, the output impedance is significantly reduced by utilizing local series-shunt feedback, in this way avoiding stability issues at low load conditions. Achieving loop stability over the full load current range, however, results in a significant quiescent current penalty of 6 mA. A more detailed discussion of this LDO topology is deferred to the analysis of the cascoded flipped voltage follower (see Chap. 5.3.3).

To conclude the review of prior art, Fig. 4.7 provides a visual performance comparison of some selected recently-published internally compensated LDO topologies. By plotting the undershoot in response to a full-scale load current step, the load capacitance, the quiescent current as well as the maximum load current, the area of the chart essentially corresponds to the previously defined LDO figure-of-merit (FOM): the larger the area, the smaller the FOM, and the better the LDO performance. For the selected internally compensated LDO topologies, the FOM ranges from 32.4 ps for the LDO topology presented by Hazucha et al. (2005), down to 0.0046 ps for the LDO topology presented by Guo and Leung (2010). This FOM, however, disregards the stability issues of most of these topologies at low load conditions, and hence does not allow a final statement about their suitability to supply the digital core as part of a fully-integrated MCU system. This work does not necessarily strive for the LDO topology with the best FOM, but for the one best suited for the given application.


| Parameter | Leung and Mok, 2003 | Hazucha et al., 2005 | Milliken et al., 2007 | Guo and Leung, 2010 |
|---|---|---|---|---|
| Technology [nm] | 600 | 90 | 350 | 90 |
| Chip Area [mm²] | 0.307 | 0.008 | 0.12 | 0.019 |
| Supply Voltage [V] | 1.5 - 4.5 | 1.2 | 3.0 - 4.2 | 0.75 - 1.2 |
| Output Voltage [V] | 1.3 | 0.9 | 2.8 | 0.5 - 1.0 |
| Load Capacitance [nF] | 0.1 | 0.6 | 0.1 | 0.05 |
| Max. Load Current [mA] | 100 | 100 | 50 | 100 |
| Load Trans. Undershoot [mV] | ~120 | 90 | 23 | 114 |
| Quiescent Current [µA] | 38 | 6000 | 65 | 8 |
| Peak Current Efficiency [%] | 99.96 | 94.32 | 99.93 | 99.99 |
| FOM* [ps] | 0.0456 | 32.4 | 0.0598 | 0.0046 |

*) $\mathrm{FOM} = \dfrac{C_{LOAD} \cdot \Delta V_{OUT}}{I_{LOAD}} \cdot \dfrac{I_q}{I_{LOAD}}$

[Figure 4.7: spider diagram with the axes Load Capacitance [nF], Quiescent Current [µA], Max. Load Current [mA] and Load Tran. Undershoot [mV]; one trace each for Leung and Mok, 2003, Hazucha et al., 2005, Milliken et al., 2007, and Guo and Leung, 2010.]

Fig. 4.7. Spider diagram providing a visual performance comparison of selected recently-published internally compensated LDO topologies. The area of the chart essentially corresponds to the previously defined LDO figure-of-merit (FOM): the larger the area for an LDO topology, the better its FOM.
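The FOM values of the comparison can be cross-checked numerically. The sketch below assumes that the FOM is defined as (CLOAD · ΔVOUT / ILOAD) · (Iq / ILOAD), with ILOAD taken as the maximum load current; the function name `ldo_fom_ps` and the dictionary layout are illustrative choices, while the input numbers are the ones listed for the four topologies.

```python
def ldo_fom_ps(c_load_nf, dv_out_mv, i_q_ua, i_load_max_ma):
    """FOM = (C_LOAD * dV_OUT / I_LOAD) * (I_q / I_LOAD), returned in picoseconds."""
    c_load = c_load_nf * 1e-9      # F
    dv_out = dv_out_mv * 1e-3      # V
    i_q = i_q_ua * 1e-6            # A
    i_load = i_load_max_ma * 1e-3  # A
    return (c_load * dv_out / i_load) * (i_q / i_load) * 1e12

# (load capacitance [nF], undershoot [mV], quiescent current [uA], max. load current [mA])
designs = {
    "Leung and Mok, 2003": (0.1, 120, 38, 100),
    "Hazucha et al., 2005": (0.6, 90, 6000, 100),
    "Milliken et al., 2007": (0.1, 23, 65, 50),
    "Guo and Leung, 2010": (0.05, 114, 8, 100),
}

for name, params in designs.items():
    print(f"{name:24s} FOM = {ldo_fom_ps(*params):8.4f} ps")
```

Under this assumed definition, the computed values agree with the tabulated FOM of each topology to within rounding.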

Summary

For the previously presented, internally compensated LDO topologies, the requirements for loop stability and transient response cannot be satisfied jointly, but instead result in contradictory trade-offs. While the LDO load capacitance is limited to comparatively small values with respect to the LDO loop stability, this is in strong contrast to the requirements for the transient response. Due to the lack of a large passive charge reservoir, large transient under- and overshoots occur in response to high slew-rate load current steps. The vast majority of the suggested internally compensated LDO topologies are therefore not suited to supply digital CMOS circuits. Further details will become more evident when reviewing the power supply needs of the digital CMOS circuits in the subsequent section.


4.3 LDO Load Environment

Proper design of an LDO demands an accurate knowledge of its load environment. This becomes particularly essential for an internally compensated LDO with only a small load capacitance. So far, the load circuit was represented simplistically by the load current as well as the load capacitance with its equivalent series resistance. In order to find a more accurate model of the load circuit, the MCU digital core and the total power supply system are subsequently investigated from an LDO load point of view.

In a highly-integrated ultra-low-power MCU system, both the LDO and the digital core are integrated on the same chip, resulting in a well-defined power supply system and LDO load environment. As evident from the schematic circuit diagram shown in Fig. 4.8, the physical structure of the power supply system is hierarchical. A variable supply voltage, determined by the battery voltage over lifetime, is provided externally. Before crossing the chip boundary, it is buffered by an external capacitance. The integrated LDO generates a lower, constant output voltage VOUT, which is provided to the MCU digital core by using wide metal lines, here schematically represented by their resistance. Ultimately, the on-chip power distribution network as part of the MCU digital core provides the voltage generated by the LDO to every single digital logic gate. To locally buffer the supply voltage, and thus maintain low impedance also at high frequencies (determined by the switching speed of the digital logic gates), decoupling capacitances are allocated and distributed on-chip. In this context, it is also particularly important to consider the effect of the bond wire inductance. The inductance of the on-chip supply lines, in contrast, is much smaller compared to their off-chip counterparts, and is thus neglected (Eireiner, 2009). Though Kelvin sensing is used for the LDO feedback signal, the feedback point needs to be carefully chosen. Since the MCU digital core forms a widely distributed load, the resistance of the power distribution network causes considerable resistive voltage drops, which need to be taken into account. Even though the LDO regulates the core voltage perfectly at a single point, it may differ significantly at another point.

[Figure 4.8: schematic showing VDD and VSS supplied through off-chip traces, off-chip capacitances and bondwires; the LDO with output VOUT and feedback input VFB; an optional off-chip capacitance at the LDO output; and the MCU digital core (incl. CPU, peripherals) supplied by VCORE.]

Fig. 4.8. Schematic circuit diagram of the power supply system of a highly-integrated ultra-low-power MCU system, particularly showing the LDO load environment. While the external load capacitance is mandatory for an externally compensated LDO, it is removed for an internally compensated LDO.

Also with respect to the LDO load environment, a distinction must be made between an externally compensated and an internally compensated LDO. For an externally compensated LDO, a large external load capacitance is mandatory for proper operation. This in detail also includes the bond wire connection as depicted in Fig. 4.8. Besides the external load capacitance, the MCU digital core comprises and requires an on-chip decoupling capacitance. For an internally compensated LDO, in contrast, the large external load capacitance is removed and only the much smaller on-chip decoupling capacitance remains. The on-chip capacitance as an intrinsic part of the MCU digital core serves at the same time also as LDO load capacitance. In this way, the need for any additional capacitance, either off-chip or on-chip, is avoided, resulting in substantial advantages with respect to system integration and overall system costs.

The problem in designing an LDO for supplying the MCU digital core is that there are many unknowns until the very end of the design process. This particularly includes the maximum load current that the LDO must be able to provide as well as the load capacitance to which the LDO is exposed. The design procedure is thus twofold. At an early design stage, both the maximum current consumption and the on-chip decoupling capacitance of the MCU digital core are conservatively estimated. After designing both the LDO and the MCU digital core based on these estimations, the cross-functional operation of the overall system (including both analog and digital parts) needs to be ultimately verified. However, due to the high complexity and different design flows, simulation-based verification is limited either in significance or in feasibility (Rabaey et al., 2006). The cross-functional verification thus requires the definition of a proper abstraction level, including worst-case load profiles and an extracted load capacitance. The major challenge arising from this top-down design procedure is to minimize the margin for the LDO design with regard to the load current and the load capacitance, while being neither too conservative nor too aggressive. This is generally much more crucial in case of an internally compensated LDO, since it shows a strong interaction with the MCU digital core due to its small load capacitance, which at the same time is an intrinsic part of the MCU digital core. In case of an externally compensated LDO, in contrast, the design challenges are generally more relaxed and can thus be largely considered independently of each other.

The subsequent sections discuss the fundamental design considerations for an LDO supplying the digital core of a highly-integrated ultra-low-power MCU system (Lueders et al., 2011a). This particularly includes the cross-functional requirements regarding load current and load capacitance in case of both an internally and an externally compensated LDO.

4.3.1 Power Supply Requirements of the MCU Digital Core

To guarantee a fault-free operation of the MCU digital core, its supply voltage must remain within a certain tolerance window (Chandrakasan and Brodersen, 1995). This window is determined by reliability constraints on the upper boundary side and speed requirements on the lower boundary side. To minimize the power consumption, the core voltage is typically set to the lower boundary with some safety margin. This margin is determined by various tolerances of the power supply system, which accumulate. Besides the LDO tolerance window determined by the static and transient regulation errors, this also includes the supply noise within the on-chip power distribution network. Large average currents result in increased resistive voltage drops (also widely referred to as IR-drop), while fast current transients result in increased di/dt-noise (Mezhiba and Friedman, 2004).


[Figure 4.9: (a) clock and load current over time (scale bar 1 µs; average load current 208 µA); (b) clock and core voltage over time (scale bars 1 µs and 10 mV).]

Fig. 4.9. (a) Simulated transient load current profile of the MCU digital core when operating at 1 MHz, and (b) the resulting core voltage when the load current profile is presented to an LDO compensated by an on-chip capacitance of 3 nF.

These tolerances are budgeted during the system specification such that the core voltage always remains within the specified tolerance window. Conversely, these accuracy considerations drive the LDO requirements with respect to the regulation performance. Beyond this, due to the inherently high noise margin of the digital logic gates, the MCU digital core does not place great demands on its supply voltage (e.g. with respect to the power supply rejection).

While an LDO is commonly specified, designed and verified for DC load currents, the MCU digital core presents a pulsed load current to the LDO. The digital core operates synchronously with a system clock; it draws current by charging the various gate capacitances whenever they are switched. This results in large current spikes created at each clock edge, as illustrated in Fig. 4.9(a). By neglecting the leakage current, the average current consumption of the MCU digital core can be expressed as:

$$I_{LOAD} = \sum_{i=1}^{k} \left(\alpha_i \cdot C_i\right) \cdot f_{CLK} \cdot V_{CORE} \tag{4.29}$$

where αi is the activity rate and Ci is the capacitance of a single digital logic gate, fCLK is the system clock frequency and VCORE is the core voltage. The current consumption of the MCU digital core can instantly change from zero to maximum (and back) when gating the system clock. An LDO supplying the MCU digital core must therefore feature either a large gain-bandwidth to react as fast as the clock frequency demands, or a large load capacitance to suppress large transient voltage undershoots.
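As a brief illustration of Eq. 4.29, the following sketch sums the activity-weighted gate capacitances and scales the result by clock frequency and core voltage. The gate count, activity rate, gate capacitance and core voltage used here are hypothetical and serve only to show the order of magnitude; they do not describe the MCU digital core of this work.

```python
def average_load_current(gates, f_clk, v_core):
    """Eq. 4.29: I_LOAD = sum(alpha_i * C_i) * f_CLK * V_CORE (leakage neglected).

    gates is an iterable of (activity_rate, capacitance_in_farad) tuples.
    """
    c_switched = sum(alpha * c for alpha, c in gates)  # effective switching capacitance
    return c_switched * f_clk * v_core

# Hypothetical core: 20,000 identical gates, 10 % activity, 5 fF each.
gates = [(0.1, 5e-15)] * 20_000
i_load = average_load_current(gates, f_clk=1e6, v_core=1.5)
print(f"average load current: {i_load * 1e6:.1f} uA")
```

Because the expression is linear in fCLK, gating the clock removes this current entirely, which corresponds to the zero-to-maximum load steps described above.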


[Figure 4.10: (a) mesh of grid wire resistances with distributed decoupling capacitances and time-varying current sources connected to VCORE; (b) simplified equivalent circuit connected to VCORE.]

Fig. 4.10. (a) Lumped circuit model of the on-chip power distribution network (adopted from Wang and Marek-Sadowska, 2005), and (b) the resulting equivalent circuit model for LDO simulation.

4.3.2 On-Chip Power Distribution Network

The fundamental task of the on-chip power distribution network is to provide the core voltage generated by the LDO to every single digital logic gate within the MCU digital core. A detailed analysis of the on-chip power distribution network is beyond the scope of this work (the interested reader is instead referred to Eireiner (2009) and Mezhiba and Friedman (2004)); the aim in the following is to find an equivalent circuit model from an LDO load point of view.

Fig. 4.10(a) shows a lumped equivalent circuit model of the on-chip power distribution network, which essentially forms a distributed resistor-capacitor network. It comprises the actual grid wires forming a mesh structure, which is very similar for both the supply and the ground side. For an ultra-low-power MCU system with its comparably low clock frequency and current consumption, the grid wire can be modeled as purely resistive. Its inductance can be neglected, as it is much smaller than the LDO output impedance for the relevant frequency range (Mezhiba and Friedman, 2004). The envelope-switching current profiles of the digital logic gates are modeled by time-varying current sources, which are connected in a distributed manner across the supply and ground distribution networks. Each current spike is thereby modeled by a triangle shape. To maintain a low impedance of the on-chip power distribution network also at high frequencies, on-chip decoupling capacitances are placed across the supply and ground distribution networks (Mezhiba and Friedman, 2004). Here, a distinction can be made between intrinsic capacitance and intentional capacitance. The intrinsic capacitance includes all parasitic capacitances, namely the interconnection, the device and the p-n junction capacitances of the diffusion wells. The intentional capacitance is in contrast explicitly added to the power distribution network to increase the overall on-chip decoupling capacitance. It is realized using the gate capacitance of a standard MOS transistor (Aminzadeh et al., 2008). To keep the equivalent series resistance of the on-chip decoupling capacitance low, the MOS capacitance is divided into multiple small devices, which are distributed and placed close to the digital logic gates.

To ensure the power integrity of the MCU digital core, both the grid wire resistance and the on-chip decoupling capacitance are determined during the digital backend design (place-and-route stage), corresponding to the last step of the common top-down digital design flow. However, both parameters need to be carefully considered also for the LDO design, and must satisfy the needs of both the digital design and the LDO design. This becomes particularly important for an internally compensated LDO, since the load capacitance is small and an intrinsic part of the MCU digital core. The following section hence particularly focuses on resolving this conflict and harmonizing the different viewpoints of design.

Grid Wire Resistance

The grid wire resistance directly translates into resistive voltage drops (also widely referred to as IR-drop). The maximum tolerated grid wire resistance is determined during the digital backend design such that the supply voltage is kept within the specified tolerance window. At the same time, from an LDO load point of view, the grid resistance corresponds to an equivalent series resistance of its load capacitance (see also Fig. 4.1). It causes a left-half-plane (LHP) zero ωz,ESR in the LDO transfer-function, which needs to be considered for loop stability. Depending on its location, two boundary conditions can be identified: (1) In case the LHP-zero resides at low frequencies (corresponding to a high equivalent series resistance), it may cause the LDO gain-bandwidth to extend into the parasitic pole region. (2) In case the LHP-zero resides at high frequencies well beyond the LDO gain-bandwidth, it has no impact, leaving a potentially unstable two-pole system formed by the error amplifier pole as well as the output pole. Since the requirements towards IR-drop limit the grid wire resistance to rather small values in the range of ten to a hundred milliohms, the worst-case condition for loop stability clearly occurs for very small resistance values, i.e. when the LHP-zero ωz,ESR resides more than one decade above the LDO gain-bandwidth ωGBW.

$$\omega_{z,ESR} \gg \omega_{GBW} \tag{4.30}$$

For an externally compensated LDO with the dominant pole associated with the output node, the condition for the maximum tolerated equivalent series resistance can thus be expressed as:

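$$\frac{1}{R_{ESR} \cdot C_{LOAD}} \gg \frac{A_{v,EA} \cdot A_{v,PASS}}{r_{ds,PASS} \cdot C_{LOAD}} \quad \Rightarrow \quad R_{ESR} \ll \frac{r_{ds,PASS}}{A_{v,EA} \cdot A_{v,PASS}} \tag{4.31}$$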

In conclusion, and as depicted in Fig. 4.10(b), the on-chip power distribution network can be modeled in simplified form by a lumped current source connected in parallel to a lumped decoupling capacitance, representing worst-case conditions for LDO loop stability. For larger equivalent series resistances (such that ωz,ESR < 10 · ωGBW) on the other hand, the LHP-zero counteracts the phase shift and thus improves the LDO loop stability. Analogous considerations for the maximum grid wire resistance can be obtained for an internally compensated LDO topology.
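The bound of Eq. 4.31 can be evaluated with a few lines of code. In this sketch the "much smaller than" relation is interpreted as a factor of ten, following the one-decade criterion stated above; the function name `max_esr_for_worst_case` and the example values for rds,PASS and the two gains are assumptions made purely for illustration.

```python
def max_esr_for_worst_case(r_ds_pass, av_ea, av_pass, decades_margin=10.0):
    """Largest R_ESR for which the ESR zero still lies a decade above the
    gain-bandwidth, i.e. R_ESR <= r_ds,PASS / (A_v,EA * A_v,PASS * margin)."""
    return r_ds_pass / (av_ea * av_pass * decades_margin)

# Hypothetical externally compensated LDO: rds = 1 kOhm, A_v,EA = 100, A_v,PASS = 10.
r_esr_limit = max_esr_for_worst_case(1e3, 100.0, 10.0)
print(f"ESR zero is ineffective below about {r_esr_limit * 1e3:.0f} mOhm")

# A grid wire resistance of 50 mOhm would then fall into the worst-case regime.
print("worst case applies:", 50e-3 <= r_esr_limit)
```

For these example values the limit is 100 mΩ, so grid wire resistances in the range of ten to a hundred milliohms would leave the loop without any help from the LHP-zero.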

Minimum On-Chip Capacitance

In case of an internally compensated LDO, the on-chip decoupling capacitance serves at the same time as LDO load capacitance. The need for additional area-consuming and thus costly on-chip capacitance should be avoided. However, an LDO designed for low quiescent current is too slow to react to single current spikes caused by switching of the digital logic gates. Its closed-loop output impedance shows an inductive behavior for frequencies above the LDO gain-bandwidth. For this reason, each current spike results in a voltage ripple ∆VCORE, as illustrated in Fig. 4.9(b). Here, the transient load current profile of the MCU digital core is applied to an LDO compensated by an on-chip capacitance of 3 nF. An upper limit of the voltage ripple can be determined by setting the LDO into open-loop configuration. By applying the charge conservation law, the voltage ripple ∆VCORE caused by switching of the digital logic gates can be expressed as:

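$$V_{CORE} \cdot C_{LOAD} = \left(V_{CORE} + \Delta V_{CORE}\right) \cdot \left(C_{LOAD} + \sum_{i=1}^{k} \left(\alpha_i \cdot C_i\right)\right)$$

$$\Delta V_{CORE} = -\frac{\sum_{i=1}^{k} \left(\alpha_i \cdot C_i\right)}{C_{LOAD} + \sum_{i=1}^{k} \left(\alpha_i \cdot C_i\right)} \cdot V_{CORE} \tag{4.32}$$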

where αi is the activity rate, and Ci is the gate capacitance of a single digital logic gate. By inserting Eq. 4.29 into Eq. 4.32, and assuming the load capacitance is sized much larger than the effective switching capacitance, the voltage ripple can be approximated as:

$$\Delta V_{CORE} \cong -\frac{I_{LOAD}}{f_{CLK} \cdot C_{LOAD}} \tag{4.33}$$

Thereby, the MCU digital core is, for simplicity, assumed to have the same effective switching capacitance at each clock edge. The resulting voltage ripple ∆VCORE can be considered as di/dt-noise and thus adds to the overall LDO tolerance window. To limit this voltage ripple to an acceptable level of some tens of millivolts, a minimum on-chip capacitance is required. This capacitance is proportional to the maximum load current and is typically in the range of a few nanofarads for an ultra-low-power MCU system.
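Rearranging Eq. 4.33 for the load capacitance gives a quick sizing estimate. The sketch below does this for an illustrative operating point (1 mA at 8 MHz with a 30 mV ripple budget); these numbers and the function names are assumptions for demonstration and are not specification values of this work.

```python
def open_loop_ripple(i_load, f_clk, c_load):
    """Magnitude of Eq. 4.33: |dV_CORE| ~= I_LOAD / (f_CLK * C_LOAD)."""
    return i_load / (f_clk * c_load)

def min_on_chip_capacitance(i_load_max, f_clk, dv_max):
    """Eq. 4.33 solved for C_LOAD at a given ripple budget dv_max."""
    return i_load_max / (f_clk * dv_max)

# Hypothetical operating point: 1 mA average current at 8 MHz, 30 mV ripple budget.
c_min = min_on_chip_capacitance(i_load_max=1e-3, f_clk=8e6, dv_max=30e-3)
print(f"minimum on-chip capacitance: {c_min * 1e9:.2f} nF")
print(f"ripple at that capacitance:  {open_loop_ripple(1e-3, 8e6, c_min) * 1e3:.1f} mV")
```

The estimate scales linearly with the charge drawn per clock cycle (ILOAD / fCLK), so halving the ripple budget doubles the required capacitance.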

The required capacitance estimated by the above approximation is thereby in good accordance with the one determined at the digital backend design. Independent of the LDO compensation scheme, the on-chip decoupling capacitance serves as first line of defense to suppress the voltage ripple. At the digital backend design, a certain lead inductance is assumed, which prevents the charge from being provided immediately by the external decoupling capacitance. This, however, is analogous to the impact caused by the LDO output impedance, which shows an inductive behavior for frequencies above its gain-bandwidth. Abstractly speaking, both the LDO output impedance and the package inductance enforce very similar conclusions on the required minimum on-chip capacitance.

4.3.3 External Load Capacitance

While an internally compensated LDO solely relies on the on-chip decoupling capacitance, an externally compensated LDO requires an additional, large (external) load capacitance. For this purpose, a multilayer ceramic capacitor is most commonly used with a capacitance in the range of a few hundred nanofarads and a small equivalent series resistance (Chava and Silva Martinez, 2004; Rincon-Mora, 2009, p. 21). An equivalent circuit model of the external load capacitance as part of the MCU power supply system is depicted in Fig. 4.11(a). Besides the load capacitance, this equivalent circuit exhibits a parasitic equivalent series resistance RESR as well as a parasitic equivalent series inductance LESL. The equivalent series resistance combines the on-chip resistance, the bond wire resistance, the package lead frame resistance as well as the equivalent series resistance of the capacitor itself. The same applies to the equivalent series inductance. In accordance with the considerations in the preceding section, the on-chip power distribution network is modeled by a lumped current source as well as a lumped decoupling capacitance, both of which are connected in parallel to the external load capacitance.

[Figure 4.11: (a) equivalent circuit between VDD and VOUT with the pass-transistor (gm·VSG, rds), the external capacitance (CLOAD, RESR, LESL) and the on-chip power distribution network (CLOAD, ILOAD); (b) output impedance [Ω] versus frequency (Hz) for CLOAD,int = 5 nF, CLOAD,ext = 500 nF, RESR = 100 mΩ and LESL = 10 nH, with regions marked for the external and the internal load capacitance.]

Fig. 4.11. (a) Equivalent circuit model of the external load capacitance as part of the MCU power supply system, and (b) its simulated output impedance zout,PASS. The parasitic inductance LESL forms in combination with the external load capacitance CLOAD,ext a series resonance circuit, causing gain peaking at frequencies beyond the gain-bandwidth.

By evaluating the equivalent circuit model depicted in Fig. 4.11(a), assuming that the equivalent series resistance is much smaller than the channel resistance of the pass-transistor (RESR ≪ rds,PASS) and neglecting the high-frequency components, the output impedance zout,PASS can be expressed as:

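\[
z_{out,PASS} = r_{ds,PASS} \,\Big\|\, \frac{1 + s \cdot R_{ESR} \cdot C_{LOAD,ext} - s^2 \cdot L_{ESL} \cdot C_{LOAD,ext}}{s \cdot C_{LOAD,ext}} \,\Big\|\, \frac{1}{s \cdot C_{LOAD,int}}
\]

\[
z_{out,PASS} \cong \frac{r_{ds,PASS} \cdot \left(1 + s \cdot R_{ESR} \cdot C_{LOAD,ext} - s^2 \cdot L_{ESL} \cdot C_{LOAD,ext}\right)}{1 + s \cdot r_{ds,PASS} \cdot \left(C_{LOAD,ext} + C_{LOAD,int}\right) + s^2 \cdot \left(r_{ds,PASS} \cdot R_{ESR} \cdot C_{LOAD,ext} \cdot C_{LOAD,int} - L_{ESL} \cdot C_{LOAD,ext}\right)} \tag{4.34}
\]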

Fig. 4.11(b) illustrates the output impedance zout,PASS for a typical LDO configuration with external load capacitance, designed to supply the MCU digital core of an ultra-low-power MCU system. Besides the previously considered LHP-zero ωz,ESR, the output impedance also exhibits a right-half-plane (RHP) zero, which arises due to the parasitic inductance LESL. Its location can be approximated as:

\[
\omega_{z,ESL} \cong \frac{R_{ESR}}{L_{ESL}} \tag{4.35}
\]

Both the LHP-zero and the RHP-zero fall into a similar frequency range as the resonance frequency, which is determined by the external load capacitance CLOAD,ext connected in series with the parasitic inductance LESL. Beyond this resonance frequency, the output impedance starts to increase again with frequency, before it ultimately decreases due to the internal load capacitance CLOAD,int, which forms a parallel resonance circuit in combination with the parasitic inductance LESL.
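To get a feeling for the frequency ranges involved, the following minimal sketch evaluates the zero and resonance frequencies for the component values annotated in Fig. 4.11(b). It is an illustration under stated assumptions, not part of the original analysis: the LHP-zero is assumed to be located at 1/(RESR · CLOAD,ext), and the parallel resonance is approximated with CLOAD,int alone, which presumes CLOAD,ext ≫ CLOAD,int. All function and variable names are hypothetical.

```python
import math

# Component values as annotated in Fig. 4.11(b).
C_LOAD_INT = 5e-9     # on-chip load capacitance in F
C_LOAD_EXT = 500e-9   # external load capacitance in F
R_ESR = 100e-3        # equivalent series resistance in ohm
L_ESL = 10e-9         # equivalent series inductance in H


def to_hz(omega):
    """Convert an angular frequency in rad/s to a frequency in Hz."""
    return omega / (2.0 * math.pi)


def characteristic_frequencies():
    """Return the zero and resonance frequencies of the load network in Hz."""
    # Assumed definition of the LHP-zero formed by RESR and CLOAD,ext.
    w_z_esr = 1.0 / (R_ESR * C_LOAD_EXT)
    # RHP-zero according to Eq. (4.35).
    w_z_esl = R_ESR / L_ESL
    # Series resonance of LESL and CLOAD,ext.
    w_series = 1.0 / math.sqrt(L_ESL * C_LOAD_EXT)
    # Parallel resonance of LESL and CLOAD,int (assumes CLOAD,ext >> CLOAD,int).
    w_parallel = 1.0 / math.sqrt(L_ESL * C_LOAD_INT)
    return {
        "f_z_esr": to_hz(w_z_esr),
        "f_z_esl": to_hz(w_z_esl),
        "f_series": to_hz(w_series),
        "f_parallel": to_hz(w_parallel),
    }


if __name__ == "__main__":
    for name, value in characteristic_frequencies().items():
        print(f"{name}: {value / 1e6:.2f} MHz")
```

Under these assumptions, both zeros and the series resonance lie within a factor of about two of each other in the low megahertz range, while the parallel resonance lies one decade higher, since the two capacitances differ by a factor of 100.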

In conclusion, it is crucial to take the parasitic effects of the bond wires into account for design and simulation of the externally compensated LDO topology. The bond wire inductance significantly affects the output impedance “seen” by the LDO. In particular, the gain peaking between the two resonance frequencies needs to be considered carefully when compensating the externally compensated LDO topology. Due to the direct connection of the on-chip integrated load capacitance, these parasitic effects are eliminated for the internally compensated LDO topology and can thus be ignored in this case.

4.4 Summary

By employing negative feedback, a low-dropout voltage regulator (LDO) generates a constant output voltage lower than its supply voltage. Ideally, an LDO should have a high voltage gain for an accurate final value as well as a large gain-bandwidth for a fast transient response, while loop stability is guaranteed under all operating conditions. In practice, the effectiveness of the LDO is limited to certain operating conditions, which have to cover the requirements of the target application. In this work, the LDO is intended to supply the digital core of an ultra-low-power MCU system without requiring any additional capacitance, either off-chip or on-chip. Instead, the already available on-chip decoupling capacitance, which is an intrinsic part of the MCU digital core and is determined during the digital backend design, serves at the same time as the LDO load capacitance.

The design of internally compensated LDO topologies has recently received a lot of attention in the literature. While conventional LDO topologies are compensated by a large (external) capacitance with the dominant pole being associated with the output node, most of the presented, internally compensated LDO topologies use some form of Miller compensation to establish an internal dominant pole. These LDO topologies inherently suffer from stability issues at low load conditions. Moreover, the minimum load current required to maintain loop stability increases with the capacitance at the LDO output. For example, the LDO topology presented by Guo and Leung (2010) requires a minimum load current of 3 mA, while the capacitance is limited to 50 pF in order to maintain loop stability.

Commonly, an LDO is specified, designed and verified for DC load currents. In contrast, the MCU digital core creates large current spikes when switching synchronously to the system clock. As an LDO designed for low quiescent current is too slow to react to fast current spikes, each current spike results in a voltage ripple. To limit this voltage ripple to an acceptable level of a few tens of millivolts, a minimum on-chip decoupling capacitance is required, which depends on the maximum load current and is in the range of a few nanofarads for an ultra-low-power MCU system. This, however, is in strong contrast to the needs of the previously presented, internally compensated LDO topologies. Consequently, these LDO topologies are not well suited to supply CMOS digital circuits.

5 Any-Load Stable LDO Topology

An alternative LDO topology, in the following referred to as any-load stable LDO, has been proposed by Ivanov (2008). As shown in the following, this LDO topology achieves both high accuracy and fast transient response by combining two stages in two feedback loops, thereby forming a multiple-loop LDO topology. In this way, the fundamental trade-off between high voltage gain, large gain-bandwidth and loop stability under all operating conditions is resolved. As a result, these parameters can be optimized concurrently during design. While the any-load stable LDO follows the external LDO compensation scheme, it can be easily adapted to operate with a load capacitance ranging from a few nanofarads in case of an on-chip integration to a few microfarads in case of an off-chip component.

This chapter provides a detailed circuit analysis of the any-load stable LDO. By deriving an equivalent circuit model, the fundamental design trends and trade-offs are identified and quantified. A sensitivity analysis is performed with regard to process, voltage and temperature variations. The LDO circuit analysis is concluded by deriving concise LDO scaling laws from the detailed results previously obtained. Complementing the LDO scaling laws, this chapter examines the impact and benefits of CMOS technology migration for LDOs. This particularly includes the presentation of an alternative pass-transistor topology, which enables technology scaling for LDOs by combining a low-voltage thin-oxide pass-transistor with a high-voltage thick-oxide protection device.

The circuit analysis throughout this chapter is based on the basic Shichman-Hodges transistor model (Shichman and Hodges, 1968) in order to find analytical solutions predicting the circuit behavior to first order. For modern deep submicron CMOS technologies, however, the basic transistor equations can only provide approximate results. Where appropriate, the analytical results are thus compared with those obtained from circuit simulation for confirmation.

5.1 Multiple-Loop LDO Topology

The basic principle of a multiple-loop LDO topology in general, and of the any-load stable LDO in particular, is best revealed by examining the control tasks of an LDO more closely. Ideally, an LDO should generate a constant output voltage independent of its environment and its operating conditions. This particularly includes supply voltage, load current, temperature and process variations. These variations, however, exhibit fundamentally different time constants. While temperature and process parameters vary only very slowly over time, supply voltage and load current are subject to abrupt transient variations. The control tasks can thus be separated into two problems. (1) For any steady-state condition of supply voltage, load current, temperature, and process parameters, the LDO output voltage should be accurately equal to the reference voltage. (2) During fast transient variations, the LDO output voltage must remain within the specified LDO tolerance window. In order to solve the first problem, the LDO feedback loop should have a high voltage gain, while the gain-bandwidth is uncritical. The second problem requires a much smaller voltage gain, while a very high gain-bandwidth is essential. Due to their different requirements, the two problems should be addressed separately by decomposing the control tasks into constituent feedback loops (Hazucha et al., 2005; Ivanov, 2008).

[Figure 5.1: slow stage EASLOW with high gain (Av,SLOW, gm,SLOW, rout,SLOW; pole p0), fast stage EAFAST with low gain (Av,FAST, gm,FAST, rout,FAST; pole p2) and pass-transistor MPASS (Av,PASS, gm,PASS, rout,PASS; pole p1), connected via the nodes VAMP, VGATE and VCORE in a slow loop and a fast loop; both feedback loops are cut for the open-loop analysis.]

Fig. 5.1. Small-signal block diagram of the any-load stable LDO topology illustrating the active feed-forward compensation scheme.

For this purpose, the any-load stable LDO proposed by Ivanov (2008) is based on a recursive structure of two nested feedback loops with a single input (feedback source) and a single output (pass-transistor). The gain-bandwidth increases from the outer feedback loop to the inner feedback loop, while the voltage gain decreases. Fig. 5.1 shows the small-signal block diagram of the any-load stable LDO. In detail, this topology is composed of a slow error amplifier stage EASLOW with high voltage gain, a fast error amplifier stage EAFAST with low voltage gain, and the pass-transistor MPASS. The small-signal voltage gains of these stages are denoted by Av,SLOW, Av,FAST and Av,PASS, respectively. The frequency response of the slow feedback loop is dominated by the pole p0, which is associated with the output of the slow error amplifier stage. In contrast, the frequency response of the fast feedback loop is dominated by the pole p1, which is determined by the large load capacitance at the LDO output (Cout,PASS ≅ CLOAD). Another pole p2 is associated with the output of the fast stage.

In summary, the inner, fast feedback loop provides a large gain-bandwidth, but only a limited voltage gain. It is thus responsible for the fast LDO transient response. The outer, slow feedback loop in contrast provides a high voltage gain, thus greatly improving the static LDO accuracy. In this way, the fundamental trade-off between a high voltage gain for an accurate final value and a large gain-bandwidth for a fast transient response is resolved.


5.1.1 Loop Stability in a Multiple-Loop System

The any-load stable LDO achieves loop stability at any load condition by using an active feed-forward compensation scheme (Thandri and Silva-Martinez, 2003). This compensation scheme is best illustrated by again considering the small-signal block diagram depicted in Fig. 5.1. The overall LDO open-loop transfer-function HOL(s) results from the superposition of the transfer-function of the slow feedback loop and that of the fast feedback loop. An open-loop analysis therefore requires cutting both feedback loops.

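\[
H_{OL}(s) = -\frac{\left(A_{v,SLOW} \cdot A_{v,FAST} + A_{v,FAST} \cdot \left(1 + \frac{s}{\omega_{p0}}\right)\right) \cdot A_{v,PASS}}{\left(1 + \frac{s}{\omega_{p0}}\right)\left(1 + \frac{s}{\omega_{p1}}\right)\left(1 + \frac{s}{\omega_{p2}}\right)} \tag{5.1}
\]

\[
H_{OL}(s) = -\frac{\left(A_{v,SLOW} + 1\right) \cdot A_{v,FAST} \cdot A_{v,PASS} \cdot \left(1 + \frac{s}{\left(A_{v,SLOW} + 1\right) \cdot \omega_{p0}}\right)}{\left(1 + \frac{s}{\omega_{p0}}\right)\left(1 + \frac{s}{\omega_{p1}}\right)\left(1 + \frac{s}{\omega_{p2}}\right)} \tag{5.2}
\]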

Evaluating the above transfer-function at low frequencies (s = 0) yields the open-loop voltage gain of the any-load stable LDO. It is dominated by the voltage gain of the slow stage.

\[
\left|H_{OL}(s = 0)\right| = \left(A_{v,SLOW} + 1\right) \cdot A_{v,FAST} \cdot A_{v,PASS} \tag{5.3}
\]

The overall open-loop transfer-function HOL(s) possesses three poles and one left-half-plane (LHP) zero, which arises due to the summation of the two feedback loops. The first dominant pole p0 is associated with the output of the slow stage. By introducing a capacitance at this node, the pole is designed to reside at very low frequencies. The negative phase shift of this first dominant pole is compensated by the positive phase shift of the LHP-zero z0. Evaluating the numerator of the open-loop transfer-function HOL(s) yields the frequency of the zero:

\[
\omega_{z0} = \left(A_{v,SLOW} + 1\right) \cdot \omega_{p0} \cong A_{v,SLOW} \cdot \omega_{p0} \tag{5.4}
\]

which is equal to the gain-bandwidth of the slow stage. A second dominant pole p1 is associated with the LDO output. This pole moves over a wide frequency range with load current due to the varying channel resistance of the pass-transistor MPASS. It ranges between p0 at no load and p2 at maximum load. The pole p2 is the first non-dominant pole and is associated with the output of the fast stage.
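The step from Eq. (5.1) to Eq. (5.2) and the resulting zero location of Eq. (5.4) can be checked numerically. The following is a minimal sketch with arbitrarily chosen example values; the function names and all numbers are assumptions for illustration and are not taken from the design described in this work.

```python
def h_ol_sum(s, a_slow, a_fast, a_pass, wp0, wp1, wp2):
    """Open-loop transfer function in the summed form of Eq. (5.1)."""
    num = (a_slow * a_fast + a_fast * (1 + s / wp0)) * a_pass
    den = (1 + s / wp0) * (1 + s / wp1) * (1 + s / wp2)
    return -num / den


def h_ol_factored(s, a_slow, a_fast, a_pass, wp0, wp1, wp2):
    """Open-loop transfer function in the factored form of Eq. (5.2)."""
    wz0 = (a_slow + 1) * wp0  # zero frequency according to Eq. (5.4)
    num = (a_slow + 1) * a_fast * a_pass * (1 + s / wz0)
    den = (1 + s / wp0) * (1 + s / wp1) * (1 + s / wp2)
    return -num / den


# Example values (assumed): gains as ratios, pole frequencies in rad/s.
PARAMS = dict(a_slow=1000.0, a_fast=10.0, a_pass=10.0,
              wp0=10.0, wp1=1.0e4, wp2=1.0e7)

if __name__ == "__main__":
    for omega in (0.0, 1.0, 1.0e3, 1.0e6):
        s = 1j * omega
        a = h_ol_sum(s, **PARAMS)
        b = h_ol_factored(s, **PARAMS)
        print(f"omega = {omega:g} rad/s: |difference| = {abs(a - b):.3e}")
    # DC gain magnitude according to Eq. (5.3).
    print("DC gain:", abs(h_ol_factored(0.0, **PARAMS)))
```

Both forms agree at every frequency because the numerator of Eq. (5.1) is simply Av,FAST · Av,PASS · (Av,SLOW + 1 + s/ωp0), from which the factor (Av,SLOW + 1) is pulled out.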

The overall open-loop transfer-function HOL(s) is depicted in Fig. 5.2 at different load conditions. To identify the constraints for loop stability, a case-by-case analysis is required.

1. At low load conditions, the pole frequency ωp1 shifts below the zero z0, resulting in a pole-pole-zero configuration. The phase shift of the first dominant pole p0 is thus not fully compensated and the loop stability is degraded. The phase margin in this case is given by:

\[
\varphi_m = 180^\circ - \arctan\left(\frac{\omega_{GBW}}{\omega_{p0}}\right) + \arctan\left(\frac{\omega_{GBW}}{A_{v,SLOW} \cdot \omega_{p0}}\right) - \arctan\left(\frac{\omega_{GBW}}{\omega_{p1}}\right) \tag{5.5}
\]

2. With increasing load current, the pole frequency ωp1 increases, resulting in a pole-zero-pole configuration at mid-range load conditions. The phase shift of the first dominant pole p0 is fully compensated by the zero z0, while the first non-dominant pole p2 is still located well beyond the LDO gain-bandwidth. The multiple-loop topology is hence stable.


[Figure 5.2: voltage gain [dB] and phase [deg] versus frequency (Hz) at minimum and maximum load, with the poles p0, p1 and p2 and the zero z0 marked; p1 moves to higher frequencies with increasing load current and to lower frequencies with decreasing load current.]

Fig. 5.2. Schematic Bode plot of the any-load stable LDO at different load conditions. For clarity of the compensation scheme, the pass-transistor voltage gain is here assumed to be independent of load current.

3. At high load conditions, the LDO loop stability is limited by the first non-dominant pole p2. As the phase shift of the first dominant pole p0 is fully compensated by the zero z0, the phase margin corresponds to that of a two-pole system and is thus given by:

\[
\varphi_m = 180^\circ - \arctan\left(\frac{\omega_{GBW}}{\omega_{p1}}\right) - \arctan\left(\frac{\omega_{GBW}}{\omega_{p2}}\right) \tag{5.6}
\]

In conclusion, and as illustrated in Fig. 5.3, two constraints can be identified for LDO loop stability at any load condition: (1) At no load, the second dominant pole p1 must not move to very low frequencies. (2) At maximum load, the first non-dominant pole p2 must still be located well above the LDO gain-bandwidth. Both requirements can be easily ensured by circuit design techniques, as will be demonstrated in Chap. 5.3. The design of the any-load stable LDO with external compensation is predominantly bounded by the first constraint, since the second dominant pole p1 is in this case located at comparably low frequencies. In contrast, the pole p1 moves to a higher frequency range when the large (external) load capacitance is removed. In case of a small on-chip integrated load capacitance, the LDO design therefore becomes bounded by the second constraint.
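The three cases and the two resulting constraints can be reproduced with a short numerical experiment based on Eq. (5.2). The sketch below searches for the unity-gain frequency by bisection and sums the phase contributions of the zero and the three poles. All gain and pole values are assumed for illustration only; frequencies are given in Hz, which is sufficient here because only frequency ratios enter the calculation.

```python
import math


def loop_gain_magnitude(f, a_slow, a_fast_pass, f_p0, f_p1, f_p2):
    """Magnitude of Eq. (5.2) at frequency f (same unit as the pole frequencies)."""
    f_z0 = (a_slow + 1.0) * f_p0  # zero according to Eq. (5.4)
    num = (a_slow + 1.0) * a_fast_pass * math.hypot(1.0, f / f_z0)
    den = (math.hypot(1.0, f / f_p0) * math.hypot(1.0, f / f_p1)
           * math.hypot(1.0, f / f_p2))
    return num / den


def phase_margin_deg(a_slow, a_fast_pass, f_p0, f_p1, f_p2):
    """Phase margin in degrees at the unity-gain frequency of Eq. (5.2)."""
    # The magnitude falls monotonically because p0 lies below z0,
    # so a bisection on log10(f) finds the single unity-gain crossing.
    lo, hi = -6.0, 12.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if loop_gain_magnitude(10.0 ** mid, a_slow, a_fast_pass,
                               f_p0, f_p1, f_p2) > 1.0:
            lo = mid
        else:
            hi = mid
    f_u = 10.0 ** (0.5 * (lo + hi))
    f_z0 = (a_slow + 1.0) * f_p0
    phase = (math.atan(f_u / f_z0) - math.atan(f_u / f_p0)
             - math.atan(f_u / f_p1) - math.atan(f_u / f_p2))
    return 180.0 + math.degrees(phase)


# Assumed example values: 60 dB slow-stage gain, 40 dB fast-stage and
# pass-transistor gain, p0 at 1 Hz and p2 at 10 MHz.
A_SLOW, A_FAST_PASS = 1000.0, 100.0
F_P0, F_P2 = 1.0, 10.0e6

if __name__ == "__main__":
    for f_p1 in (0.01, 10.0e3, 1.0e6):
        pm = phase_margin_deg(A_SLOW, A_FAST_PASS, F_P0, f_p1, F_P2)
        print(f"f_p1 = {f_p1:g} Hz -> phase margin = {pm:.1f} deg")
```

With these assumed values, the phase margin is small when p1 lies below p0 (first constraint) and when p1 approaches p2 (second constraint), while it is comfortable for p1 located between z0 and p2.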

The impact of the multiple-loop LDO topology, composed of the slow, high-gain stage and the fast, low-gain stage, is also clearly evident in the transient response. The imperfect pole-zero cancellation causes slow-settling components to appear in the LDO transient response (Kamath et al., 1974). However, an LDO designed for supplying digital circuits does not aim for very fast settling, but instead aims to keep the output voltage within a specified tolerance window (as discussed in Chap. 4.3.1). Slow settling at the LDO output can thus be tolerated for this application.

[Figure 5.3: phase margin [deg] versus pole frequency ωp1 (Hz), with the locations of p0, z0 and p2 and the stable range marked.]

Fig. 5.3. Phase margin of the any-load stable LDO at different load conditions. For clarity of the compensation scheme, the pass-transistor voltage gain is here assumed to be independent of load current.

5.1.2 Circuit Implementation

A circuit implementation of the any-load stable LDO is depicted in Fig. 5.4. The slow stage is realized by a folded-cascode amplifier loaded by the capacitance C1. The fast stage, including the pass-transistor, is realized by a cascoded flipped voltage follower (FVF) (Ramirez-Angulo et al., 2005). The cascoded FVF aims to keep the LDO output voltage equal to VCORE = VAMP + VSG,CG1, while the folded-cascode amplifier in turn provides the set-point to the cascoded FVF by accurately controlling the node VAMP. The two feedback loops are combined by a single transistor, MCG1. For the slow feedback loop (assuming the node voltage VCORE is fixed), this transistor acts as a common-source stage with the node VAMP as non-inverting input. For the fast feedback loop, in contrast (assuming the node voltage VAMP is fixed), it acts as a common-gate stage with the node VCORE as inverting input. Consequently, the transistor MCG1 provides the same gain to both feedback loops.

Besides the current source capability determined by the pass-transistor MPASS, the any-load stable LDO also provides some current sink capability, which is realized by the dedicated sink transistor MSINK. In this way, the LDO transient response for negative load current steps and positive supply voltage steps, respectively, is improved, thus preventing conditional loop instability. The LDO current sink capability will be addressed in more detail in Chap. 5.7.

In conclusion, the any-load stable LDO “can sink and source current to the load, it has exceptional bandwidth for any given process and quiescent current, and it is stable with any load capacitance” (Ivanov, 2008, p. 349). Both stages, the folded-cascode amplifier and the cascoded flipped voltage follower, are investigated and analyzed separately in the subsequent sections. For this purpose, the any-load stable LDO is split at the folded-cascode amplifier output and divided into its two stages.


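[Figure 5.4: folded-cascode amplifier EASLOW with the inputs VREF and VCORE and the capacitance C1 at the node VAMP, followed by the cascoded flipped voltage follower with MCG1, MCG2, R1, MPASS and MSINK, the bias currents IBIAS1, IBIAS2 and IBIAS3, and CLOAD and ILOAD at the output VCORE; the slow loop, the fast loop and the poles p0, p1 and p2 are marked.]

Fig. 5.4. Circuit diagram of the any-load stable LDO composed of a folded-cascode amplifier and a cascoded flipped voltage follower.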

5.2 Slow Stage: Folded-Cascode Amplifier

Owing to its high voltage gain, the slow stage determines the static accuracy of the any-load stable LDO, including line regulation, load regulation and offset error. The implementation of the slow stage is rather uncritical. In principle, any differential amplifier topology with a single-pole response can be used. Here, a folded-cascode amplifier with an NMOS differential input pair is chosen, whose circuit diagram is shown in Fig. 5.5. The folded-cascode amplifier features a wide common-mode input range, which is independent of the output voltage level. In this way, unity-gain feedback (βFB = 1) is enabled, thereby omitting the resistive feedback divider commonly required in the LDO feedback path. This saves silicon area and, above all, the current flowing through the resistive feedback divider.

Small-Signal Analysis

The folded-cascode amplifier provides a high voltage gain, resulting in an excellent line and load regulation of the any-load stable LDO. The small-signal voltage gain can be expressed as:

\[
A_{v,EA1} \cong g_{m,1} \cdot r_{out,EA1} \tag{5.7}
\]

\[
r_{out,EA1} \cong \left(g_{m,6} \cdot r_{ds,6} \cdot \left(r_{ds,2} \,\|\, r_{ds,4}\right)\right) \,\|\, \left(g_{m,8} \cdot r_{ds,8} \cdot r_{ds,10}\right) \tag{5.8}
\]

where gm,1 is the transconductance of the differential input transistors, and rout,EA1 is the output resistance of the folded-cascode amplifier. The frequency response of the folded-cascode amplifier is primarily determined by the pole at its output, which is defined as:

\[
\omega_{p0} \cong -\frac{1}{r_{out,EA1} \cdot C_1} \tag{5.9}
\]

[Figure 5.5: folded-cascode amplifier with the NMOS input pair M1 and M2 (inputs VREF and VCORE), the transistors M3 to M12, the bias voltages VCASC,P, VCASC,N and VBIAS,N, and the capacitance C1 at the output VAMP = VCORE - VSG,CG1; annotated voltage headroom: VGS,1 + VDS,11 + VDS,12, VSD,3 + Vth,n, VDS,8 + VDS,10 and VSD,4 + VSD,6.]

Fig. 5.5. Circuit diagram of the folded-cascode amplifier. Due to its wide common-mode input range, unity-gain feedback is enabled, thereby omitting the resistive feedback divider.

The total capacitance at the amplifier output is dominated by the compensation capacitance C1. The gain-bandwidth of the folded-cascode amplifier corresponds to the LDO left-half-plane zero z0, and can be expressed as:

\[
\omega_{z0} = A_{v,EA1} \cdot \omega_{p0} \cong -\frac{g_{m,1}}{C_1} \tag{5.10}
\]
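As a quick plausibility check, Eqs. (5.7) to (5.10) can be evaluated for a set of example device parameters. The following minimal sketch uses assumed, round numbers for all transconductances, channel resistances and the capacitance C1; they are hypothetical and do not reflect the actual design. Frequencies are returned as magnitudes in Hz.

```python
import math


def parallel(*resistances):
    """Equivalent resistance of resistors connected in parallel."""
    return 1.0 / sum(1.0 / r for r in resistances)


def folded_cascode_small_signal(gm1, gm6, gm8, rds2, rds4, rds6, rds8, rds10, c1):
    """Evaluate Eqs. (5.7) to (5.10) for the folded-cascode amplifier."""
    # Output resistance according to Eq. (5.8).
    r_out = parallel(gm6 * rds6 * parallel(rds2, rds4), gm8 * rds8 * rds10)
    gain = gm1 * r_out                            # Eq. (5.7)
    f_p0 = 1.0 / (2.0 * math.pi * r_out * c1)     # magnitude of Eq. (5.9) in Hz
    f_z0 = gm1 / (2.0 * math.pi * c1)             # magnitude of Eq. (5.10) in Hz
    return {"r_out": r_out, "gain": gain, "f_p0": f_p0, "f_z0": f_z0}


# Assumed example values: 1 uS transconductance, 100 Mohm channel
# resistance for every device, and C1 = 10 pF.
EXAMPLE = folded_cascode_small_signal(
    gm1=1e-6, gm6=1e-6, gm8=1e-6,
    rds2=100e6, rds4=100e6, rds6=100e6, rds8=100e6, rds10=100e6,
    c1=10e-12,
)

if __name__ == "__main__":
    print(f"r_out = {EXAMPLE['r_out'] / 1e9:.2f} Gohm")
    print(f"gain  = {20.0 * math.log10(EXAMPLE['gain']):.1f} dB")
    print(f"f_p0  = {EXAMPLE['f_p0']:.2f} Hz")
    print(f"f_z0  = {EXAMPLE['f_z0'] / 1e3:.2f} kHz")
```

The ratio of the zero frequency to the pole frequency equals the voltage gain, in line with Eq. (5.10).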

Design Guidelines

Though disregarded by the majority of publications, a static offset at the LDO output can significantly contribute to the overall LDO tolerance window. For the any-load stable LDO, this offset is determined by random mismatch of the folded-cascode amplifier, primarily of its differential input transistors, M1 and M2. Good matching of these transistors can be achieved by well-known design and layout techniques, which however require additional silicon area (Bastos et al., 1996).

Besides the differential input transistors, the area of the folded-cascode amplifier is dominated by the capacitance C1. Particularly in case of an external LDO compensation, the first dominant pole p0 needs to be pushed to very low frequencies in order to maintain loop stability at no load, and this capacitance becomes rather large. An area-efficient implementation is enabled by using a MOS capacitor operated in the accumulation region. Due to its thin gate oxide, a MOS capacitor offers a substantially larger capacitance per unit area than other capacitor types (e.g. MIM capacitors). On the other hand, the folded-cascode amplifier is loaded by the gate leakage of the capacitance C1. This limits the voltage gain, particularly at high temperature, and might ultimately introduce a systematic offset to the folded-cascode amplifier.


Based on these considerations, an interesting aspect of circuit sizing can be revealed. Here, both the differential input transistors, M1 and M2, and the cascode transistors, M5 through M8, are assumed to operate in deep weak inversion. Moreover, the transistors M3 and M4 as well as M9 and M10 are assumed to operate in strong inversion. When the bias current of the folded-cascode amplifier is decreased under these assumptions, its output resistance increases in inverse proportion, while its small-signal voltage gain is maintained. To also maintain the pole frequency ωp0, the capacitance C1 needs to be decreased proportionally. In this way, both silicon area and quiescent current can be reduced concurrently. This still holds true when taking into account the loading due to the gate leakage of the capacitance (since the gate leakage is proportional to the capacitance). Scaling down the bias current of the folded-cascode amplifier is ultimately limited by leakage currents as well as matching requirements. An additional limit for bias current scaling arises from the capacitance C1. To avoid capacitive coupling, this capacitance must in any case be much larger than the parasitic gate capacitance of the subsequent common-gate transistor MCG1.
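The scaling argument can be illustrated with a strongly simplified first-order model. In the sketch below, the transconductance is assumed to follow the weak-inversion relation gm = I/(n · UT), every channel resistance is assumed to be 1/(λ · I) with one common λ, and all devices are assumed to carry the same current. These are simplifications made for illustration only, and the parameter values are hypothetical.

```python
import math

U_T = 0.026      # thermal voltage in V at room temperature
N_SLOPE = 1.3    # weak-inversion slope factor (assumed)
LAMBDA = 0.05    # channel-length modulation in 1/V (assumed, same for all devices)


def slow_stage(i_bias, c1):
    """Return (voltage gain, pole frequency in Hz) of the simplified slow stage."""
    gm = i_bias / (N_SLOPE * U_T)     # weak inversion: gm proportional to current
    rds = 1.0 / (LAMBDA * i_bias)     # channel resistance inversely proportional to current
    # Eq. (5.8) with identical devices: (gm*rds*(rds || rds)) || (gm*rds*rds).
    upper = gm * rds * (rds / 2.0)
    lower = gm * rds * rds
    r_out = 1.0 / (1.0 / upper + 1.0 / lower)
    gain = gm * r_out                             # Eq. (5.7)
    f_p0 = 1.0 / (2.0 * math.pi * r_out * c1)     # magnitude of Eq. (5.9) in Hz
    return gain, f_p0


if __name__ == "__main__":
    # Reduce the bias current and the capacitance C1 by the same factor of 10.
    for i_bias, c1 in ((100e-9, 10e-12), (10e-9, 1e-12)):
        gain, f_p0 = slow_stage(i_bias, c1)
        print(f"I = {i_bias * 1e9:.0f} nA, C1 = {c1 * 1e12:.0f} pF: "
              f"gain = {gain:.3e}, f_p0 = {f_p0:.3f} Hz")
```

Within this model, reducing the bias current and C1 by the same factor leaves both the voltage gain and the pole frequency unchanged, which is the behavior described above.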

5.3 Fast Stage: Cascoded Flipped Voltage Follower

Owing to its large gain-bandwidth, the fast stage determines the transient behavior of the any-load stable LDO. It must be able to provide large load currents to a highly capacitive load. For this reason, a voltage buffer commonly referred to as cascoded flipped voltage follower is used here. Since this topology has so far received only limited attention in the literature (Ramirez-Angulo et al., 2005; Hazucha et al., 2005), it is derived stepwise from the basic voltage follower.

5.3.1 Basic Voltage Follower

The most basic voltage follower is realized as a single transistor in common-drain configuration, as depicted in Fig. 5.6(a). It forms the basis for any non-low-dropout linear voltage regulator topology with NMOS pass-transistor. Due to its low output impedance (rout ≅ 1/gm,PASS) and large gain-bandwidth (ωGBW ≅ gm,PASS/CLOAD), the basic voltage follower offers substantial advantages for loop stability and transient response. However, unless an additional charge pump is used to boost the gate node of the NMOS pass-transistor above the positive supply voltage (as demonstrated, for example, by Kruiskamp and Beumer, 2008; Camacho et al., 2009), the basic voltage follower is not able to support low dropout voltages.

5.3.2 Flipped Voltage Follower

An alternative voltage follower, commonly referred to as flipped voltage follower (FVF), has recently received a lot of attention in the literature (see, for example, Carvajal et al., 2005; Bargagli-Stoffi, 2006). As depicted in Fig. 5.6(b), the FVF is essentially composed of a local series-shunt feedback loop. The common-gate transistor MCG1 senses the output, while the PMOS pass-transistor MPASS provides the load current to the output. In this way, the FVF aims to keep its output voltage equal to VCORE = VAMP + VSG,CG1. By using a PMOS pass-transistor, the FVF overcomes the dropout voltage limitations of the basic voltage follower. On the other hand, at a large supply voltage, the source-drain voltage of the common-gate transistor is “strangled” by the source-gate voltage of the pass-transistor. To keep the common-gate transistor in the saturation region, the voltage drop over the pass-transistor is therefore restricted to:

\[
V_{SD,PASS} = V_{DD} - V_{CORE} < V_{SG,PASS} - V_{SD,CG1(sat)} \tag{5.11}
\]

which is in the range of a few hundred millivolts for modern deep submicron CMOS technologies.

[Figure 5.6: (a) NMOS pass-transistor MPASS in common-drain configuration between VDD and VCORE, driven by VAMP, with CLOAD and ILOAD at the output; (b) flipped voltage follower with the PMOS pass-transistor MPASS, the common-gate transistor MCG1 (input VAMP), the current-mirror transistor MCM1 (VBIAS), the node VGATE, and CLOAD and ILOAD at the output VCORE.]

Fig. 5.6. Circuit diagram of (a) the basic voltage follower, and (b) the flipped voltage follower to provide large load currents to a highly capacitive load.

By utilizing local series-shunt feedback, the FVF achieves an improved control performance compared to the basic voltage follower. Cutting the feedback loop at the node VGATE, and assuming for simplicity an ideal current source (rds,CM1 → ∞), the small-signal voltage gain can be concisely expressed as:

\[
A_{v,FVF} \cong -g_{m,PASS} \cdot r_{out,GATE} \tag{5.12}
\]

\[
r_{out,GATE} \cong g_{m,CG1} \cdot r_{ds,CG1} \cdot r_{ds,PASS} \tag{5.13}
\]

where rout,GATE denotes the resistance at the node VGATE. The open-loop transfer-function of the FVF exhibits two low-frequency poles, which makes it potentially unstable when driving large capacitive loads. One pole, p1, is associated with the LDO output node, while the other pole, p2, is associated with the gate node of the pass-transistor. Of particular importance, and in contrast to the basic LDO topology introduced in Chap. 4.1.3, the output pole p1 is not solely defined by the channel resistance of the pass-transistor MPASS, but also by the transconductance of the common-gate transistor MCG1.

\[
\omega_{p1} \cong \frac{1}{\left(r_{ds,PASS} \,\|\, \frac{1}{g_{m,CG1}}\right) \cdot C_{LOAD}} \tag{5.14}
\]

The pole p2 is determined by the large gate capacitance of the pass-transistor MPASS in combination with the channel resistance of both the common-gate transistor MCG1 and the current-mirror transistor MCM1.

\[
\omega_{p2} \cong \frac{1}{\left(r_{ds,CG1} \,\|\, r_{ds,CM1}\right) \cdot \left(C_{GS,PASS} + A_{v,PASS} \cdot C_{GD,PASS}\right)} \tag{5.15}
\]

Notably, the FVF forms the basis of an LDO with single-transistor control proposed by Man et al. (2008). In the absence of a large (external) load capacitance, the dominant pole of this LDO topology is associated with the gate node of the pass-transistor, just as for any other internally compensated LDO topology. However, since the transconductance of the common-gate transistor MCG1 becomes dominant at low load conditions, the pole p1 is prevented from moving to very low frequencies. In this way, loop stability can also be maintained at low load conditions. Nevertheless, the maximum load capacitance is limited to a few hundred picofarads at reasonable transconductance and bias current levels. This, however, is in strong conflict with the minimum load capacitance requirements identified in Chap. 4.3.2. Furthermore, the single-transistor control LDO topology suffers from two additional drawbacks: (1) Due to its comparably low voltage gain, the output voltage is not very accurate when it is used as the sole gain stage. (2) Its operation is limited to low dropout voltages. In conclusion, the FVF is not suited for implementing the fast stage of the any-load stable LDO topology. However, it forms the basis of the cascoded flipped voltage follower, which is introduced subsequently.
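The clamping effect of the common-gate transconductance on the output pole can be made concrete by evaluating Eq. (5.14) for a decreasing load. The following minimal sketch uses assumed values for gm,CG1, CLOAD and the channel resistance of the pass-transistor; the function name and all numbers are hypothetical.

```python
import math


def output_pole_hz(rds_pass, gm_cg1, c_load):
    """Output pole frequency in Hz according to Eq. (5.14)."""
    # The parallel connection of rds,PASS and 1/gm,CG1 is evaluated as a sum
    # of conductances, which also covers rds,PASS -> infinity at no load.
    conductance = 1.0 / rds_pass + gm_cg1
    return conductance / (2.0 * math.pi * c_load)


GM_CG1 = 10e-6   # transconductance of MCG1 in S (assumed)
C_LOAD = 5e-9    # load capacitance in F (assumed)

if __name__ == "__main__":
    # Channel resistance of the pass-transistor from heavy load to no load.
    for rds_pass in (1e3, 1e5, 1e7, math.inf):
        f_p1 = output_pole_hz(rds_pass, GM_CG1, C_LOAD)
        print(f"rds,PASS = {rds_pass:g} ohm -> f_p1 = {f_p1:.1f} Hz")
```

For a vanishing load current, the pole frequency converges to gm,CG1/(2π · CLOAD) instead of dropping towards zero, which illustrates why a larger transconductance (and hence a larger bias current) is needed to tolerate a larger load capacitance.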

5.3.3 Cascoded Flipped Voltage Follower

To extend the output voltage range of an amplifier, the folded-cascode concept is well known and widely applied (Razavi, 2001, p. 301). As demonstrated by Ramirez-Angulo et al. (2005), this technique can be applied to the flipped voltage follower (FVF) as well. The circuit diagram of the resulting cascoded FVF is depicted in Fig. 5.7(a). An NMOS cascode transistor MCG2 is introduced between the gate node of the pass-transistor MPASS and the drain node of the common-gate transistor MCG1. In this way, the voltage headroom of MCG1 becomes independent of the pass-transistor voltage drop. It is instead defined by the bias voltage VBIAS2.

\[
V_{SD,CG1} = V_{CORE} - \left(V_{BIAS2} - V_{GS,CG2}\right) \tag{5.16}
\]

The bias voltage is chosen such that both the current-mirror transistor MCM1 and the common-gate transistor MCG1 always operate in the saturation region. It can, for instance, be easily generated by two stacked transistors in diode configuration. Due to the additional voltage gain provided by MCG2, the voltage at the folding node VFOLD exhibits only small variations.

[Figure 5.7: cascoded flipped voltage follower with the pass-transistor MPASS, the common-gate transistor MCG1 (input VAMP), the cascode transistor MCG2 (VBIAS2), the current-mirror transistor MCM1 (VBIAS1), the nodes VGATE and VFOLD, and CLOAD and ILOAD at the output VCORE; the gate node VGATE is connected to VDD (a) via the current source MCM2 (VBIAS3) and (b) via the resistor R1.]

Fig. 5.7. Circuit diagram of the cascoded flipped voltage follower (FVF) (a) with current source (MCM2), and (b) with resistor (R1).

By utilizing series-shunt feedback, the cascoded FVF aims to keep the output voltage equal to VCORE = VAMP + VSG,CG1. Compared to the flipped voltage follower, however, the transistor MCG2 provides additional voltage gain to the feedback loop. By again cutting the feedback loop at the node VGATE, and assuming the current sources to be ideal (rds,CM1, rds,CM2 → ∞), the small-signal voltage gain can be concisely expressed as:

\[
A_{v,CASFVF} \cong -g_{m,PASS} \cdot r_{out,GATE} \tag{5.17}
\]

\[
r_{out,GATE} \cong g_{m,CG1} \cdot g_{m,CG2} \cdot r_{ds,CG1} \cdot r_{ds,CG2} \cdot r_{ds,PASS} \tag{5.18}
\]

where rout,GATE denotes the resistance at the node VGATE. Like the FVF, the cascoded FVF exhibits two low-frequency poles: one pole, p1, is associated with the LDO output node, while the other pole, p2, is associated with the gate node of the pass-transistor. Another pole arises at the folding node VFOLD, which is however located well above the unity-gain frequency and can thus be neglected.
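The additional voltage gain contributed by the cascode transistor follows directly from comparing Eq. (5.13) with Eq. (5.18). The short sketch below evaluates both expressions for assumed example values; the function names and all device parameters are hypothetical and chosen only to make the ratio visible.

```python
def gate_resistance_fvf(gm_cg1, rds_cg1, rds_pass):
    """Resistance at the node VGATE of the FVF, Eq. (5.13)."""
    return gm_cg1 * rds_cg1 * rds_pass


def gate_resistance_cascoded_fvf(gm_cg1, gm_cg2, rds_cg1, rds_cg2, rds_pass):
    """Resistance at the node VGATE of the cascoded FVF, Eq. (5.18)."""
    return gm_cg1 * gm_cg2 * rds_cg1 * rds_cg2 * rds_pass


def voltage_gain(gm_pass, r_out_gate):
    """Small-signal voltage gain according to Eqs. (5.12) and (5.17)."""
    return -gm_pass * r_out_gate


# Assumed example values.
GM_PASS, RDS_PASS = 10e-3, 1e3   # pass-transistor: 10 mS, 1 kohm
GM_CG1, RDS_CG1 = 10e-6, 10e6    # common-gate transistor: intrinsic gain 100
GM_CG2, RDS_CG2 = 10e-6, 10e6    # cascode transistor: intrinsic gain 100

A_FVF = voltage_gain(GM_PASS, gate_resistance_fvf(GM_CG1, RDS_CG1, RDS_PASS))
A_CASFVF = voltage_gain(
    GM_PASS,
    gate_resistance_cascoded_fvf(GM_CG1, GM_CG2, RDS_CG1, RDS_CG2, RDS_PASS),
)

if __name__ == "__main__":
    print(f"FVF gain:          {A_FVF:.3e}")
    print(f"Cascoded FVF gain: {A_CASFVF:.3e}")
    print(f"Ratio:             {A_CASFVF / A_FVF:.1f}")
```

Under the ideal-current-source assumption, the gain of the cascoded FVF exceeds that of the FVF by the intrinsic gain gm,CG2 · rds,CG2 of the cascode transistor.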

Notably, the cascoded FVF forms the basis of several previously presented LDO topologies (Hazucha et al., 2005; Or and Leung, 2010; Guo and Leung, 2010). Each of these LDO topologies uses the cascoded FVF as a single gain stage. The set-point of the cascoded FVF at the node VAMP is provided by a simple reference generation circuit utilizing some form of open-loop tracking technique. However, due to the absence of any high-gain stage, these LDO topologies suffer from a poor static accuracy. In the absence of a large external load capacitance, the dominant pole of these LDO topologies is located at the gate node of the pass-transistor, just as for any other internally compensated LDO topology (see Chap. 4.2.2). In contrast to those topologies, however, the output pole is not solely defined by the channel resistance of the pass-transistor MPASS, but also by the transconductance of the common-gate transistor MCG1. At low load conditions, the output pole frequency is dominated by the transconductance, which prevents the pole from moving down to very low frequencies. To effectively avoid stability issues, a large transconductance is required, which in turn results in a quiescent current penalty. The LDO presented by Hazucha et al. (2005), for example, achieves loop stability over the entire load current range at a load capacitance of 600 pF by using a considerable LDO quiescent current of 6 mA. For the any-load stable LDO, in contrast, the cascoded FVF is embedded into the multiple-loop topology (see also Chap. 5.1). The set-point of the cascoded FVF at the node VAMP is accordingly controlled by the folded-cascode amplifier with its high voltage gain, resulting in a greatly improved static accuracy of the LDO output voltage.

Pole-Placement Strategy

Embedding the cascoded flipped voltage follower (FVF) into the multiple-loop topology demands an alternative strategy for pole placement. In accordance with the constraints identified in Chap. 5.1.1, the pole p1 associated with the output node is made dominant. The low-frequency pole p2 associated with the gate node of the pass-transistor is accordingly the first non-dominant pole here. Its location is determined by the large gate capacitance of the pass-transistor as well as the channel resistance of both the current-mirror transistor MCM2 and the folding transistor MCG2. To ensure loop stability also at high load conditions, the first non-dominant pole p2 needs to be pushed beyond the unity-gain frequency of the cascoded FVF. For this reason, and as depicted in Fig. 5.7(b), the current source MCM2 is replaced by a resistor R1. In this way, loop stability of the cascoded FVF becomes a free design parameter, which can be easily adjusted during design. This, however, comes at the expense of either a reduced voltage gain or an increased LDO quiescent current. At low load conditions, in contrast, the pole p1 moves to lower frequencies and potentially collides with the pole p0, which is associated with the folded-cascode output. The transconductance of MCG1, however, prevents the pole from moving to very low frequencies, such that loop stability can be achieved also at no load condition. Here, the gain-bandwidth of the folded-cascode amplifier (essentially defined by the compensation capacitance C1) can be traded off against the transconductance of MCG1 (essentially defined by the bias current IDS,CM1). The pole-placement strategy and the involved trade-offs are further elaborated in the subsequent detailed circuit analysis in Chap. 5.4.

From a large-signal perspective, the cascoded FVF with resistor can be considered as a current switch. The predefined bias current IDS,CM1 is divided: one portion I1 flows through MCG1, while the other portion I2 flows through MCG2 and thereby controls the pass-transistor MPASS via the voltage drop across the resistor R1.

$$I_{DS,CM1} = I_1 + I_2 \tag{5.19}$$

$$I_2 = I_{DS,CG2} = \frac{V_{SG,PASS}}{R_1} \tag{5.20}$$

$$I_1 = I_{SD,CG1} = I_{DS,CM1} - \frac{V_{SG,PASS}}{R_1} \tag{5.21}$$

At low load conditions, the pass-transistor is operated in weak inversion with only a small source-gate voltage. The major portion of the bias current IDS,CM1 hence flows through MCG1. At high load conditions, in contrast, the pass-transistor is operated in strong inversion with a correspondingly higher source-gate voltage. The current flowing through MCG1 is thus reduced. The total drain current of the pass-transistor is furthermore given by:

$$I_{SD,PASS} = I_1 + I_{LOAD} \tag{5.22}$$
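The current-switch behavior described by Eqs. 5.19 to 5.22 can be illustrated with a short numerical sketch. The function names and all component values below are hypothetical and chosen for illustration only; they are not taken from the implementation discussed in this work.

```python
def split_bias_current(i_bias, v_sg_pass, r1):
    """Split the bias current I_DS,CM1 into I1 (through MCG1) and I2 (through MCG2).

    Implements Eqs. 5.19 to 5.21: I2 = V_SG,PASS / R1 and I1 = I_DS,CM1 - I2.
    """
    i2 = v_sg_pass / r1
    i1 = i_bias - i2
    if i1 <= 0:
        raise ValueError("bias current too small: no current left for MCG1")
    return i1, i2


def pass_drain_current(i1, i_load):
    """Total pass-transistor drain current according to Eq. 5.22."""
    return i1 + i_load


# Hypothetical example values (assumptions, not taken from the thesis).
I_BIAS = 20e-6  # bias current I_DS,CM1 in A
R1 = 100e3      # resistor R1 in ohm

# Small source-gate voltage (low load) versus large source-gate voltage (high load).
i1_low, i2_low = split_bias_current(I_BIAS, v_sg_pass=0.4, r1=R1)
i1_high, i2_high = split_bias_current(I_BIAS, v_sg_pass=1.5, r1=R1)

print(f"low load : I1 = {i1_low * 1e6:.1f} uA, I2 = {i2_low * 1e6:.1f} uA")
print(f"high load: I1 = {i1_high * 1e6:.1f} uA, I2 = {i2_high * 1e6:.1f} uA")
```

With these assumed values, most of the bias current flows through MCG1 at low load, while the share through MCG2 grows with the pass-transistor source-gate voltage.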

Design Guidelines

The design and sizing of the cascoded flipped voltage follower (FVF) is closely related to the LDO specification. To begin with, a general design guideline and procedure for the cascoded FVF is provided in the following. For a given LDO specification, including a simplified model of the load environment, the design of the cascoded FVF starts with the pass-transistor MPASS, proceeds to the resistor R1 and the determination of the required bias current, and ends with the common-gate transistor MCG1.

The ability to source a high load current while achieving a low dropout voltage requires the use of a large PMOS pass-transistor. In the most basic case, the pass-transistor MPASS is realized by a single thick-oxide device with minimum channel length. As the pass-transistor has a great impact on the LDO performance, its design is addressed separately in Chap. 5.6, including the proposal of an alternative pass-transistor topology. The pass-transistor size, or more precisely its gate capacitance, in combination with the resistance R1 directly determines the location of the first non-dominant pole p2. By sizing the resistor, this pole can be pushed to sufficiently high frequencies such that loop stability is maintained also at maximum load conditions. The resistor preferably provides a high sheet resistance while exhibiting only a small parasitic substrate capacitance. The bias current of the cascoded FVF is determined such that the pass-transistor is fully turned on (I2 = VSG,PASS(max)/R1) at maximum load current and lowest dropout voltage, while sufficient bias current still flows through the transistor MCG1 (ISD,CG1 = IDS,CM1 − I2) to achieve a certain voltage gain. The bias current is defined by the current-mirror transistor MCM1. As its drain-source voltage exhibits only small variations, a simple current mirror is adequate here; no cascode current mirror structure is required. To maximize the voltage gain of the cascoded FVF for a given bias current, the common-gate transistor MCG1 is operated in deep weak inversion. To keep its parasitic capacitances small, the transistor MCG1 is realized by a thin-oxide device with minimum channel length. At the same time, the transconductance gm,CG1 prevents the pole frequency ωp1 from colliding with the first dominant pole p0 at no load condition. In this way, the transconductance (and thus the LDO bias current) can be traded off against the size of the capacitance C1 at the folded-cascode amplifier output. The sizing of the common-gate transistor MCG2 is, in contrast, rather uncritical. As it might see large voltages across its source-drain nodes, the transistor is required to be a thick-oxide device. For proper operation, it must always remain in the saturation region; the limiting corner is at minimum supply voltage and maximum load current (with the pass-transistor source-gate voltage being at its maximum). To avoid additional parasitic capacitance at the pass-transistor gate node, minimum transistor dimensions are preferred.
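The sizing order described above (gate capacitance, then R1, then bias current) can be sketched numerically. The sketch assumes a first-order RC estimate ωp2 ≈ 1/(R1 · CGATE) for the gate-node pole; the function name, the target pole frequency, the gate capacitance and the minimum current through MCG1 are hypothetical values for illustration only.

```python
import math


def size_r1_and_bias(c_gate, f_p2_target, v_sg_max, i1_min):
    """Return (R1, I2_max, I_bias) for a target gate-node pole frequency.

    Assumption: the gate-node pole is a simple RC pole, omega_p2 ~ 1 / (R1 * C_gate).
    I2_max = V_SG,PASS(max) / R1 fully turns on the pass-transistor, and
    I_bias = I2_max + I1_min leaves a minimum current I1_min for MCG1.
    """
    r1 = 1.0 / (2.0 * math.pi * f_p2_target * c_gate)
    i2_max = v_sg_max / r1
    i_bias = i2_max + i1_min
    return r1, i2_max, i_bias


# Hypothetical inputs (assumptions, not taken from the thesis).
r1, i2_max, i_bias = size_r1_and_bias(
    c_gate=1e-12,      # total pass-transistor gate capacitance in F
    f_p2_target=10e6,  # desired frequency of pole p2 in Hz
    v_sg_max=1.5,      # maximum pass-transistor source-gate voltage in V
    i1_min=10e-6,      # minimum current kept in MCG1 in A
)
print(f"R1 = {r1 / 1e3:.1f} kOhm, I2(max) = {i2_max * 1e6:.1f} uA, "
      f"I_bias = {i_bias * 1e6:.1f} uA")
```

The sketch makes the trade-off visible: a higher target pole frequency requires a smaller resistor and therefore a larger bias current.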

The performance of the cascoded FVF is essentially insensitive to relative process variations (also known as device matching). In contrast, it shows a much higher sensitivity to absolute process variations (also known as process corner variations), particularly concerning the resistor R1. The availability of a high-sheet resistor with high absolute accuracy can thus greatly improve the design of the cascoded FVF. A detailed sensitivity analysis including design centering aspects and mitigation techniques is presented in Chap. 5.5. Depending strongly on the LDO specification, the silicon area of the cascoded FVF is dominated either by the resistor R1 or by the pass-transistor MPASS. In the one extreme case of a large external load capacitance and a low current drive capability, the pass-transistor becomes comparably small while the resistor becomes large. The situation is reversed in the other extreme case of a small on-chip load capacitance and a high current drive capability.

5.4 Analysis of Cascoded Flipped Voltage Follower

In order to identify and quantify the fundamental design trends and trade-offs of the any-load stable LDO, a simplified equivalent circuit model is derived in the following. This model focuses on the cascoded flipped voltage follower (FVF), which acts as the fast stage of the multiple-loop LDO topology. For this purpose, the any-load stable LDO is split into its two stages, while the set-point at the node VAMP is fixed to a constant voltage level. This approach is valid without limitations at medium to high load conditions. As demonstrated in Chap. 5.1.1, the phase shift of the first dominant pole p0 is under these conditions fully compensated by the left-half-plane zero z0 (ωp1 > 10 · ωz0 with ωz0 = Av,SLOW · ωp0). In this way, the transfer function of the any-load stable LDO simplifies to a second-order two-pole system. While the LDO behavior is dominated by the cascoded FVF at medium to high load conditions, the impact of the folded-cascode amplifier cannot be neglected at low load conditions. The LDO behavior at low load conditions, corresponding to the second critical constraint for LDO loop stability, is therefore addressed separately in Chap. 5.7. This also includes a detailed analysis of the current sink capability (realized by the transistor MSINK), which is neglected here for simplicity.

The starting point for the following analysis of the cascoded FVF is its small-signal equivalent circuit diagram as depicted in Fig. 5.8. The cascoded FVF is considered as a minimum-phase, linear, second-order negative feedback system. Its small-signal analysis is enabled by cutting the feedback loop at the feedback node. To take the loading effects due to the sensing transistor MCG1 into account, the dummy transistor MCG1(dummy) is added to the output node, emulating the same DC conditions (Man et al., 2008; Razavi, 2001, p. 270ff). Its dimensions as well as its biasing conditions, defined by MCM1(dummy), are identical to those of MCG1. As will become obvious in the further course of this circuit analysis, the impact of the cascode transistor MCG2 and the resistor R1 can be neglected for modeling the loading effects. To enable an analytical solution of the cascoded FVF, the recursive dependency of the total pass-transistor drain current on the bias current I1 is neglected in the following. The bias current I1 decreases with an increasing pass-transistor drain current, while the drain current in turn depends on the bias current I1. This recursive dependency can be neglected by assuming that the pass-transistor drain current is dominated by the LDO load current, which is in particular much greater than the bias current (ILOAD ≫ I1).

[Figure 5.8: small-signal equivalent circuit with the pass-transistor MPASS, the common-gate transistors MCG1 and MCG2, the dummy transistor MCG1(dummy), the resistor R1, and the load (CLOAD, ILOAD) at the node VCORE; each transistor is modeled by CGS, CGD, gm·VSG and rds; the feedback loop is cut at the node VFB.]

Fig. 5.8. Small-signal equivalent circuit diagram of the cascoded flipped voltage follower (FVF). To take the loading effects into account after breaking up the closed loop, a dummy circuit is added.

The following detailed small-signal analysis of the cascoded FVF is derived for all load conditions. It includes analytical expressions for the pole locations as well as the voltage gain. Based on these results, the loop stability and the transient behavior of the cascoded FVF are determined. In order to find analytical solutions predicting the circuit behavior to first order, the circuit analysis relies on the basic Shichman-Hodges transistor model (Shichman and Hodges, 1968). The analytical expressions are verified by comparing them with the results obtained from circuit simulation using Spectre and BSIM4 models. For this purpose, the following analysis is based on an exemplary implementation of the cascoded FVF, realized in a 0.13 µm standard CMOS technology. The supply voltage ranges between 1.8 V and 3.6 V, while the output voltage is specified as 1.5 V ± 0.1 V. The maximum load current is 10 mA, and the load capacitance is 5 nF. Any parasitic series impedance of the load capacitance is neglected throughout the following analysis. While this is an appropriate assumption in the case of an on-chip integrated load capacitance (also see Chap. 4.3.2), it is not valid for an off-chip component. Finally, the detailed results of the small-signal analyses are combined into a simplified equivalent circuit model of the cascoded FVF.

5.4.1 Pole Location

The frequency behavior of the cascoded FVF is modeled with reasonable accuracy by a second-order two-pole system. The dominant pole p1 is associated with the LDO output, while the first non-dominant pole p2 is associated with the gate node of the pass-transistor. Another non-dominant pole, associated with the node VFOLD, is located at much higher frequencies. It is thus neglected for the analysis of the cascoded FVF.

Dominant Pole

The frequency of the dominant pole p1 is determined by the channel resistance of the pass-transistor MPASS as well as the load capacitance CLOAD. Moreover, the output is loaded by the transconductance of the sensing transistor MCG1, which needs to be taken into account particularly at low load conditions. By evaluating the small-signal equivalent circuit diagram depicted in Fig. 5.8, the pole frequency ωp1 can be expressed as:

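$$\omega_{p1} = \frac{1}{\left(r_{ds,PASS} \,\|\, r_{in,CG1}\right) \cdot C_{LOAD}} \tag{5.23}$$

$$\begin{aligned}
r_{in,CG1} &= \frac{\left(r_{ds,CM1} \,\|\, r_{in,CG2}\right) + r_{ds,CG1}}{1 + g_{m,CG1} \cdot r_{ds,CG1}} \\
&= \frac{\left(r_{ds,CM1} \,\Big\|\, \dfrac{R_1 + r_{ds,CG2}}{1 + (g_{m,CG2} + g_{mb,CG2}) \cdot r_{ds,CG2}}\right) + r_{ds,CG1}}{1 + g_{m,CG1} \cdot r_{ds,CG1}}
\end{aligned} \tag{5.24}$$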

where rin,CG1 and rin,CG2 are the equivalent resistances when looking into the source nodes of MCG1 and MCG2, respectively. Since rin,CG2 ≪ rds,CM1 and rin,CG2 ≪ rds,CG1, the expression for the equivalent resistance rin,CG1, and consequently also for the pole frequency ωp1, can be simplified to:

$$r_{in,CG1} \cong \frac{r_{ds,CG1}}{1 + g_{m,CG1} \cdot r_{ds,CG1}} \tag{5.25}$$

$$\omega_{p1} \cong \frac{1}{\left(r_{ds,PASS} \,\Big\|\, \dfrac{r_{ds,CG1}}{1 + g_{m,CG1} \cdot r_{ds,CG1}}\right) \cdot C_{LOAD}} \tag{5.26}$$

Under the approximation that the intrinsic transistor gain gm,CG1 · rds,CG1 is much larger than one, further approximation yields:

$$\omega_{p1} \cong \frac{1}{\left(r_{ds,PASS} \,\Big\|\, \dfrac{1}{g_{m,CG1}}\right) \cdot C_{LOAD}} \tag{5.27}$$
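How much the step from Eq. 5.26 to Eq. 5.27 costs in accuracy can be checked numerically. The following sketch evaluates both expressions; the function names and the small-signal values are hypothetical and serve only as an illustration (apart from the 5 nF load capacitance of the exemplary specification).

```python
def parallel(r_a, r_b):
    """Parallel connection of two resistances."""
    return r_a * r_b / (r_a + r_b)


def omega_p1_eq_5_26(r_ds_pass, g_m_cg1, r_ds_cg1, c_load):
    """Dominant pole frequency in rad/s according to Eq. 5.26."""
    r_in_cg1 = r_ds_cg1 / (1.0 + g_m_cg1 * r_ds_cg1)  # Eq. 5.25
    return 1.0 / (parallel(r_ds_pass, r_in_cg1) * c_load)


def omega_p1_eq_5_27(r_ds_pass, g_m_cg1, c_load):
    """Dominant pole frequency in rad/s according to Eq. 5.27 (g_m * r_ds >> 1)."""
    return 1.0 / (parallel(r_ds_pass, 1.0 / g_m_cg1) * c_load)


# Hypothetical small-signal values (assumptions, not taken from the thesis).
R_DS_PASS = 1e3   # ohm
G_M_CG1 = 100e-6  # S
R_DS_CG1 = 1e6    # ohm, gives an intrinsic gain of 100
C_LOAD = 5e-9     # F, load capacitance of the exemplary specification

w_exact = omega_p1_eq_5_26(R_DS_PASS, G_M_CG1, R_DS_CG1, C_LOAD)
w_approx = omega_p1_eq_5_27(R_DS_PASS, G_M_CG1, C_LOAD)
print(f"Eq. 5.26: {w_exact:.4e} rad/s, Eq. 5.27: {w_approx:.4e} rad/s")
```

For an intrinsic gain of 100, as assumed here, the two results differ by well below one percent.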


The pass-transistor MPASS is operated in weak inversion for low load currents and enters strong inversion with increasing load current. Depending on the pass-transistor sizing approach, the pass-transistor might enter the triode region in case of small dropout voltages and high load conditions. Operation in the triode region, however, is neglected at first and instead discussed separately in Chap. 5.5.1. With the pass-transistor operated in the saturation region and the common-gate transistor MCG1 operated in weak inversion, the pole frequency ωp1 can be expressed as:

$$\omega_{p1} \cong \begin{cases}
\dfrac{1}{\left(\dfrac{1}{\lambda_{wi} \cdot (I_{LOAD} + I_1)} \,\Big\|\, \dfrac{n \cdot V_T}{I_1}\right) \cdot C_{LOAD}} & \text{at low load conditions} \\[3ex]
\dfrac{\lambda_{si} \cdot I_{LOAD}}{C_{LOAD}} & \text{at high load conditions}
\end{cases} \tag{5.28}$$
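The two branches of Eq. 5.28 can be evaluated directly, for example to estimate the lower bound of the pole frequency at zero load. The sketch below uses hypothetical device parameters (channel-length modulation factors, slope factor n, bias current I1); only the 5 nF load capacitance and the 10 mA maximum load current follow the exemplary specification.

```python
import math


def omega_p1_low_load(i_load, i1, lambda_wi, n, v_t, c_load):
    """Low-load branch of Eq. 5.28 in rad/s (pass-transistor in weak inversion)."""
    r_ds_pass = 1.0 / (lambda_wi * (i_load + i1))
    r_in_cg1 = n * v_t / i1  # 1 / g_m,CG1 in weak inversion
    r_par = r_ds_pass * r_in_cg1 / (r_ds_pass + r_in_cg1)
    return 1.0 / (r_par * c_load)


def omega_p1_high_load(i_load, lambda_si, c_load):
    """High-load branch of Eq. 5.28 in rad/s (pass-transistor in strong inversion)."""
    return lambda_si * i_load / c_load


# Hypothetical device parameters (assumptions, not taken from the thesis).
I1 = 10e-6       # bias current through MCG1 in A
LAMBDA_WI = 0.1  # channel-length modulation in weak inversion in 1/V
LAMBDA_SI = 0.1  # channel-length modulation in strong inversion in 1/V
N = 1.5          # subthreshold slope factor
V_T = 0.02585    # thermal voltage in V
C_LOAD = 5e-9    # load capacitance in F

w_zero_load = omega_p1_low_load(0.0, I1, LAMBDA_WI, N, V_T, C_LOAD)
w_full_load = omega_p1_high_load(10e-3, LAMBDA_SI, C_LOAD)
print(f"zero load: {w_zero_load / (2 * math.pi) / 1e3:.1f} kHz")
print(f"10 mA    : {w_full_load / (2 * math.pi) / 1e3:.1f} kHz")
```

At zero load the result is set almost entirely by the term I1/(n · VT · CLOAD), that is, by the transconductance of MCG1.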

Fig. 5.9(a) shows the pole frequency ωp1 as a function of load current. Evidently, the pole frequency ωp1 varies widely with the load current ILOAD due to the varying channel resistance of the pass-transistor. At high load conditions, the pole frequency is dominated by the channel resistance rds,PASS, while the transconductance gm,CG1 can be neglected. This picture changes at low load conditions. As the channel resistance rds,PASS increases with decreasing load current, the transconductance gm,CG1 can no longer be neglected and becomes dominant at zero load condition. In this way, the pole frequency ωp1 never reduces to zero, but always remains above some kilohertz. This makes it possible to place the first dominant pole p0 at even lower frequencies without compromising loop stability at no load conditions.

As depicted in Fig. 5.9(a), the pole frequency obtained from circuit analysis (i.e. by evaluating Eq. 5.28) is in good agreement with the simulation results. Here, the transition between low load and high load conditions is linearly interpolated. Moreover, and as discussed earlier, the recursive dependency of the bias current I1 on the total pass-transistor drain current is neglected by assuming that the load current is much greater than the bias current at high load conditions (ILOAD,max ≫ I1), and vice versa at low load conditions (ILOAD,min ≪ I1).

Non-Dominant Pole

The frequency of the first non-dominant pole p2 is basically determined by the resistor R1 as well as the total pass-transistor gate capacitance. By evaluating the small-signal equivalent circuit diagram depicted in Fig. 5.8, it can be expressed as:

$$\omega_{p2} = \frac{C_{LOAD} + \left(1 + g_{m,PASS} \cdot (r_{out,CG2} \,\|\, R_1)\right) \cdot C_{GD,PASS}}{(r_{out,CG2} \,\|\, R_1) \cdot \left(C_{LOAD} \cdot (C_{GS,PASS} + C_{GD,PASS}) + C_{GS,PASS} \cdot C_{GD,PASS}\right)} \tag{5.29}$$

$$r_{out,CG2} = \left(1 + (g_{m,CG2} + g_{mb,CG2}) \cdot r_{ds,CG2}\right) \cdot (r_{ds,CG1} \,\|\, r_{ds,CM1}) + r_{ds,CG2} \tag{5.30}$$

where rout,CG2 is the equivalent resistance when looking into the drain node of MCG2 (Razavi, 2001, p. 175). With the load capacitance being much larger than the pass-transistor gate capacitance, this expression simplifies to:

$$\omega_{p2} \cong \frac{1}{(r_{out,CG2} \,\|\, R_1) \cdot (C_{GS,PASS} + C_{GD,PASS})} \tag{5.31}$$

[Figure 5.9: panel (a) pole p1, pole frequency in kHz versus load current in mA; panel (b) pole p2, pole frequency in MHz versus load current in mA; curves: circuit simulation and theory.]

Fig. 5.9. Pole frequency of the cascoded FVF as a function of load current at nominal conditions (VDD = 3.0 V, Temp. = 25 °C, nom. process), thereby comparing analytical results with circuit simulation.

More abstractly speaking, with the dominant pole p1 associated with the LDO output, the pass-transistor voltage gain already rolls off before the non-dominant pole frequency ωp2 is reached (Av,PASS ≤ ωp2/ωp1). As a result, the pole frequency ωp2 is not considerably affected by the Miller effect. The above expression can be further simplified by assuming that the common-gate transistor MCG2 is operated in the saturation region. In this case, its small-signal output resistance rout,CG2 is much larger than the resistance R1 and can thus be neglected.

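$$r_{out,CG2} \geq r_{ds,CG2} = \frac{1}{\lambda_{si} \cdot I_{DS,CG2}}, \qquad r_{out,CG2} \geq \frac{R_1}{\lambda_{si} \cdot V_{SG,PASS}} \tag{5.32}$$

Further approximation thus yields:

$$\omega_{p2} \cong \frac{1}{R_1 \cdot (C_{GS,PASS} + C_{GD,PASS})} \tag{5.33}$$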
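The chain of simplifications from Eq. 5.29 to Eq. 5.33 can be checked with a short numerical comparison. All values in the following sketch are hypothetical (except the 5 nF load capacitance of the exemplary specification), and the function names are introduced for illustration only.

```python
def omega_p2_eq_5_29(r_out_cg2, r1, g_m_pass, c_gs, c_gd, c_load):
    """Non-dominant pole frequency in rad/s according to Eq. 5.29."""
    r_gate = r_out_cg2 * r1 / (r_out_cg2 + r1)  # r_out,CG2 || R1
    numerator = c_load + (1.0 + g_m_pass * r_gate) * c_gd
    denominator = r_gate * (c_load * (c_gs + c_gd) + c_gs * c_gd)
    return numerator / denominator


def omega_p2_eq_5_33(r1, c_gs, c_gd):
    """Non-dominant pole frequency in rad/s according to Eq. 5.33."""
    return 1.0 / (r1 * (c_gs + c_gd))


# Hypothetical values (assumptions, not taken from the thesis).
R_OUT_CG2 = 10e6  # ohm
R1 = 15e3         # ohm
G_M_PASS = 10e-3  # S
C_GS = 0.8e-12    # F
C_GD = 0.2e-12    # F
C_LOAD = 5e-9     # F

w_full = omega_p2_eq_5_29(R_OUT_CG2, R1, G_M_PASS, C_GS, C_GD, C_LOAD)
w_simple = omega_p2_eq_5_33(R1, C_GS, C_GD)
print(f"Eq. 5.29: {w_full:.4e} rad/s, Eq. 5.33: {w_simple:.4e} rad/s")
```

With a load capacitance several orders of magnitude larger than the gate capacitance, as assumed here, the Miller term in the numerator of Eq. 5.29 contributes less than one percent.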

With increasing load current, the pass-transistor passes through the various operating regions: starting in weak inversion at low load conditions, via strong inversion at high load conditions and, depending on the design guideline, to the triode region at very high load conditions and low dropout voltages. The total gate capacitance of the pass-transistor is the sum of the gate-source (CGS) and the gate-drain (CGD) capacitances. As a first-order approximation, the total gate capacitance is considered in the following to be constant across the transistor operating regions (Allen and Holberg, 2002, p. 85).

$$C_{GATE,PASS} = C_{GS,PASS} + C_{GD,PASS} \approx \text{const.} \tag{5.34}$$


It should be noted that the gate-drain capacitance of the transistor MCG2 and the parasitic substrate capacitance of the resistor R1 also contribute to the total capacitance at the pass-transistor gate node. While this contribution can be neglected for most LDO specifications, the approximation does not apply to LDO designs with very low current drive capability. Such specifications require a very small pass-transistor and consequently a very large resistor. The issue becomes more severe if the CMOS technology does not offer any high-sheet resistor with low parasitic substrate capacitance.

Fig. 5.9(b) shows the pole frequency ωp2 as a function of load current, thereby comparing the simulation results with the theoretical results obtained by evaluating Eq. 5.33. While the theoretical considerations indicate a load-independent pole frequency, the simulation results reveal an increasing pole frequency at low load conditions. This effect is mostly caused by the total pass-transistor gate capacitance, which in fact decreases in weak inversion operation and consequently at low load conditions.

Summary

While the first non-dominant pole p2 does not vary significantly with load current, the dominant pole p1 varies widely with load current due to the changing channel resistance of the pass-transistor. To achieve loop stability, the first non-dominant pole p2 can be shifted by sizing the resistor R1. In this way, LDO loop stability at high load conditions becomes a free design parameter, which can be easily adjusted during design. This, however, is associated with an LDO quiescent current penalty, as will be further investigated and quantified in Chap. 5.8.1.

5.4.2 Voltage Gain

The small-signal voltage gain of the cascoded FVF determines the transient error at the LDO output in response to fast line and load transient conditions. The higher the gain, the smaller the transient error becomes. The overall open-loop voltage gain is determined by the two common-gate transistors as well as the pass-transistor, such that Av,CASFVF = Av,CG1 · Av,CG2 · Av,PASS. For each of these gain stages, analytical expressions are derived separately in the following.

Common-Gate Transistor Voltage Gain

The small-signal voltage gain of the common-gate transistor MCG1 is denoted in the following by Av,CG1. By evaluating the small-signal equivalent circuit diagram depicted in Fig. 5.8, and with the back-gate of MCG1 shorted to its source (and hence gmb,CG1 = 0), it can be expressed as:

$$A_{v,CG1} = \frac{g_{m,CG1} \cdot r_{ds,CG1} + 1}{r_{ds,CG1} + (r_{ds,CM1} \,\|\, r_{in,CG2})} \cdot (r_{ds,CM1} \,\|\, r_{in,CG2}) \tag{5.35}$$

$$r_{in,CG2} = \frac{R_1 + r_{ds,CG2}}{1 + (g_{m,CG2} + g_{mb,CG2}) \cdot r_{ds,CG2}} \tag{5.36}$$

where rin,CG2 is the small-signal resistance when looking into the source node of MCG2. As rin,CG2 ≪ rds,CM1 and rin,CG2 ≪ rds,CG1, the above expression can be simplified to:

$$A_{v,CG1} \cong g_{m,CG1} \cdot \left(\frac{R_1 + r_{ds,CG2}}{1 + (g_{m,CG2} + g_{mb,CG2}) \cdot r_{ds,CG2}}\right) \tag{5.37}$$


Assuming the common-gate transistor MCG2 is operated in the saturation region and therefore rds,CG2 ≫ R1 (for the analytical derivation compare Eq. 5.32), further approximation yields:

$$A_{v,CG1} \cong g_{m,CG1} \cdot \left(\frac{r_{ds,CG2}}{1 + (g_{m,CG2} + g_{mb,CG2}) \cdot r_{ds,CG2}}\right) \tag{5.38}$$

Besides the small-signal voltage gain Av,CG1, the common-gate transistor MCG2 also contributes to the overall voltage gain. By again evaluating the small-signal equivalent circuit diagram depicted in Fig. 5.8, its small-signal voltage gain Av,CG2 can be expressed as:

$$A_{v,CG2} = (g_{m,CG2} + g_{mb,CG2}) \cdot (R_1 \,\|\, r_{out,CG2}) \tag{5.39}$$

$$r_{out,CG2} = \left(1 + (g_{m,CG2} + g_{mb,CG2}) \cdot r_{ds,CG2}\right) \cdot (r_{ds,CG1} \,\|\, r_{ds,CM1}) + r_{ds,CG2} \tag{5.40}$$

where rout,CG2 is the small-signal resistance when looking into the drain node of MCG2. However, as derived in Chap. 5.4.1 (particularly compare Eq. 5.32), the resistance rout,CG2 is much larger than R1. It can thus be neglected, such that the above expression simplifies to:

$$A_{v,CG2} \cong (g_{m,CG2} + g_{mb,CG2}) \cdot R_1 \tag{5.41}$$

To increase the transconductance (and to obtain a more compact layout), the back-gate of the common-gate transistor MCG2 is here connected to ground (such that gmb,CG2 > 0). However, the transconductance gm,CG2 has only little impact on the overall small-signal voltage gain of the voltage follower, as revealed in the following.

After deriving the small-signal voltage gain for both common-gate transistors MCG1 and MCG2 individually, the combined small-signal voltage gain is determined in the following by multiplying Av,CG1 and Av,CG2.

$$A_{v,CG12} \cong g_{m,CG1} \cdot \frac{(g_{m,CG2} + g_{mb,CG2}) \cdot r_{ds,CG2}}{1 + (g_{m,CG2} + g_{mb,CG2}) \cdot r_{ds,CG2}} \cdot R_1 \tag{5.42}$$

Under the approximation that the intrinsic transistor gain (gm,CG2 + gmb,CG2) · rds,CG2 is much larger than one, a handy design equation can be obtained.

$$A_{v,CG12} \cong g_{m,CG1} \cdot R_1 \tag{5.43}$$

With the common-gate transistor MCG1 operated in weak inversion, the overall small-signal voltage gain Av,CG12 can furthermore be expressed as:

$$A_{v,CG12} \cong \frac{I_{SD,CG1}}{n \cdot V_T} \cdot R_1 \cong \frac{I_{DS,CM1} \cdot R_1 - V_{SG,PASS}}{n \cdot V_T} \tag{5.44}$$
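Eq. 5.44 lends itself to a quick estimate of how the combined common-gate gain depends on the pass-transistor source-gate voltage. The following sketch uses hypothetical values for the bias current, the resistor, the slope factor n and the thermal voltage; the function names are likewise introduced for illustration only.

```python
import math


def gain_cg12(i_bias, r1, v_sg_pass, n, v_t):
    """Combined common-gate voltage gain A_v,CG12 (linear) according to Eq. 5.44."""
    return (i_bias * r1 - v_sg_pass) / (n * v_t)


def to_db(gain):
    """Convert a linear voltage gain to decibels."""
    return 20.0 * math.log10(gain)


# Hypothetical values (assumptions, not taken from the thesis).
I_BIAS = 120e-6  # bias current I_DS,CM1 in A
R1 = 15e3        # resistor R1 in ohm
N = 1.5          # subthreshold slope factor
V_T = 0.02585    # thermal voltage in V

a_low_vsg = gain_cg12(I_BIAS, R1, v_sg_pass=0.6, n=N, v_t=V_T)
a_high_vsg = gain_cg12(I_BIAS, R1, v_sg_pass=1.5, n=N, v_t=V_T)
print(f"V_SG = 0.6 V: {to_db(a_low_vsg):.1f} dB, V_SG = 1.5 V: {to_db(a_high_vsg):.1f} dB")
```

The gain only remains positive as long as the voltage drop IDS,CM1 · R1 exceeds VSG,PASS, which restates the bias-current condition of the design guidelines.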

As depicted in Fig. 5.10(a), the small-signal voltage gain Av,CG12 is to a first-order approximation constant over load current. The slight dependency of the voltage gain on the load current is caused by a varying transconductance gm,CG1. With increasing load current, the pass-transistor source-gate voltage VSG,PASS increases. In consequence, the current through the common-gate transistor MCG1, and with it its transconductance, decreases. The pass-transistor source-gate voltage VSG,PASS is not only a function of the load current, but also of the current through the common-gate transistor MCG1, which is neglected here for simplicity (f : VSG,PASS → (ILOAD + ISD,CG1) with ISD,CG1 := IDS,CM1 − Vth,p/R1 = const). Compared to the results obtained from circuit simulation, the analytical expression found in Eq. 5.44 slightly overestimates the small-signal voltage gain Av,CG12. This discrepancy is mainly due to the approximation of the intrinsic transistor gain (gm,CG2 + gmb,CG2) · rds,CG2, which actually is not much larger than one.

[Figure 5.10: panel (a) Av,CG12, voltage gain in dB versus load current in mA; panel (b) Av,PASS, voltage gain in dB versus load current in mA; curves: circuit simulation and theory.]

Fig. 5.10. Small-signal voltage gain of the cascoded FVF as a function of load current at nominal conditions (VDD = 3.0 V, Temp. = 25 °C, nom. process), thereby comparing analytical results with circuit simulation.

Pass-Transistor Voltage Gain

The small-signal voltage gain of the pass-transistor MPASS is determined by its transconductance as well as its channel resistance. Moreover, the LDO output is loaded by the transconductance of the sensing transistor MCG1, which needs to be taken into account particularly at low load conditions. By evaluating the small-signal equivalent circuit diagram depicted in Fig. 5.8, the small-signal voltage gain Av,PASS can be expressed as:

$$A_{v,PASS} = g_{m,PASS} \cdot (r_{ds,PASS} \,\|\, r_{in,CG1}) \cong g_{m,PASS} \cdot \left(r_{ds,PASS} \,\Big\|\, \frac{1}{g_{m,CG1}}\right) \tag{5.45}$$

The equivalent resistance rin,CG1 denotes the resistance when looking into the source node of MCG1, and is analytically derived in Chap. 5.4.1 (particularly compare Eq. 5.25). With the pass-transistor operated in the saturation region and the common-gate transistor MCG1 operated in weak inversion, the small-signal voltage gain Av,PASS can be expressed depending on the load conditions:

$$A_{v,PASS} \cong \begin{cases}
\dfrac{I_{LOAD} + I_1}{n \cdot V_T} \cdot \left(\dfrac{1}{\lambda_{wi} \cdot (I_{LOAD} + I_1)} \,\Big\|\, \dfrac{n \cdot V_T}{I_1}\right) & \text{at low load conditions} \\[3ex]
\dfrac{1}{\lambda_{si}} \cdot \sqrt{\dfrac{2 \cdot \mu_p \cdot C_{OX} \cdot \left(\frac{W}{L}\right)_{PASS}}{I_{LOAD}}} & \text{at high load conditions}
\end{cases} \tag{5.46}$$

At high load conditions, the pass-transistor is operated in strong inversion, and the loading effect due to the transconductance of the sensing transistor MCG1 can be neglected. The small-signal voltage gain consequently decreases with the square root of the load current. This picture changes at low load conditions. In this case, the pass-transistor is operated in weak inversion, and the transconductance gm,CG1 can no longer be neglected. The small-signal voltage gain Av,PASS therefore reaches its maximum at low load conditions.

Fig. 5.10(b) shows the small-signal voltage gain Av,PASS as a function of load current. Evidently, the small-signal voltage gain from circuit analysis (i.e. by evaluating Eq. 5.46) is in good agreement with the simulation results. Here, the transition between low load and high load conditions is linearly interpolated. Moreover, the recursive dependency of the biasing current I1 on the total pass-transistor drain current is again neglected.

Summary

The overall small-signal voltage gain of the cascoded FVF can be obtained by multiplication of Av,CG1, Av,CG2, and Av,PASS. The small-signal voltage gain of the common-gate transistors Av,CG12 remains basically constant over load current, while that of the pass-transistor Av,PASS decreases with the square root of the load current (assuming operation in strong inversion). In combination with the low-frequency poles p1 and p2, the small-signal voltage gain Av,CASFVF determines the loop stability, as will be demonstrated in the following section.

5.4.3 Loop Stability

The loop stability of a minimum-phase, linear, second-order negative feedback system, such as the cascoded FVF, can be directly determined from its transfer function. It can be most generally expressed as:

$$H_{CL}(s) = A_{v,CASFVF} \cdot \frac{\omega_n^2}{s^2 + 2 \cdot \zeta \cdot \omega_n \cdot s + \omega_n^2} \tag{5.47}$$

where ωn denotes the resonance frequency and ζ denotes the damping factor. According to the Barkhausen criterion, the system may oscillate at the resonance frequency ωn if (1) the phase shift around the loop at this frequency is so large that it results in positive feedback, and (2) the voltage gain is still larger than unity to allow signal buildup. The damping factor ζ can be used as a relative measure of loop stability. As derived by Allen and Holberg (2002, p. 768ff.), it is defined by the combination of the two low-frequency poles and the small-signal voltage gain. Applied to the cascoded FVF, the damping factor ζ can be expressed as:

$$\zeta = \frac{1}{2} \cdot \frac{\omega_{p1} + \omega_{p2}}{\omega_n} \tag{5.48}$$

$$\omega_n = \sqrt{\omega_{p1} \cdot \omega_{p2} \cdot (1 + A_{v,CASFVF})} \tag{5.49}$$
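Eqs. 5.48 and 5.49 translate directly into a small helper for exploring how pole spacing and gain affect the damping. The pole frequencies and the gain in the sketch below are hypothetical and only loosely oriented on the orders of magnitude discussed above; the function names are assumptions.

```python
import math


def resonance_frequency(omega_p1, omega_p2, a_v):
    """Resonance frequency omega_n in rad/s according to Eq. 5.49."""
    return math.sqrt(omega_p1 * omega_p2 * (1.0 + a_v))


def damping_factor(omega_p1, omega_p2, a_v):
    """Damping factor zeta according to Eq. 5.48."""
    return 0.5 * (omega_p1 + omega_p2) / resonance_frequency(omega_p1, omega_p2, a_v)


# Hypothetical operating point (assumptions, not taken from the thesis).
OMEGA_P1 = 2.0 * math.pi * 50e3  # dominant pole in rad/s
OMEGA_P2 = 2.0 * math.pi * 12e6  # first non-dominant pole in rad/s
A_V = 1000.0                     # overall voltage gain (60 dB)

zeta = damping_factor(OMEGA_P1, OMEGA_P2, A_V)
omega_n = resonance_frequency(OMEGA_P1, OMEGA_P2, A_V)
print(f"zeta = {zeta:.3f}, f_n = {omega_n / (2.0 * math.pi) / 1e6:.2f} MHz")
```

Increasing the spacing between the two poles or reducing the gain raises the damping factor, which is the lever that the resistor R1 provides.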


[Figure 5.11: phase margin in degrees versus load current in mA; curves: circuit simulation and theory.]

Fig. 5.11. Phase margin of the cascoded FVF as a function of load current at nominal conditions (VDD = 3.0 V, Temp. = 25 °C, nom. process), thereby comparing analytical results with circuit simulation.

By inserting the expressions for the two low-frequency poles p1 (see also Eq. 5.28) and p2 (see also Eq. 5.33) as well as the small-signal voltage gain Av,CASFVF (see also Eqs. 5.43 and 5.46), the damping factor of the cascoded FVF can be expressed as:

$$\zeta = \frac{1}{2} \cdot \frac{R_1 \cdot (C_{GS,PASS} + C_{GD,PASS}) + r_{ds,PASS} \cdot C_{LOAD}}{r_{ds,PASS} \cdot R_1 \cdot \sqrt{g_{m,CG1} \cdot g_{m,PASS} \cdot C_{LOAD} \cdot (C_{GS,PASS} + C_{GD,PASS})}} \tag{5.50}$$

Assuming the two low-frequency poles are widely spaced such that ωp2 ≫ ωp1, the expression can be simplified and a handy design equation is obtained:

$$\zeta = \frac{1}{2} \cdot \sqrt{\frac{C_{LOAD}}{g_{m,CG1} \cdot g_{m,PASS} \cdot R_1^2 \cdot (C_{GS,PASS} + C_{GD,PASS})}} \tag{5.51}$$
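The validity of the design equation can be checked by comparing Eq. 5.50 with Eq. 5.51 for a case with widely spaced poles. The small-signal values in the following sketch are hypothetical (only the 5 nF load capacitance follows the exemplary specification), and the function names are introduced for illustration.

```python
import math


def zeta_eq_5_50(r1, c_gate, r_ds_pass, c_load, g_m_cg1, g_m_pass):
    """Damping factor according to Eq. 5.50 (c_gate = C_GS,PASS + C_GD,PASS)."""
    numerator = r1 * c_gate + r_ds_pass * c_load
    denominator = r_ds_pass * r1 * math.sqrt(g_m_cg1 * g_m_pass * c_load * c_gate)
    return 0.5 * numerator / denominator


def zeta_eq_5_51(r1, c_gate, c_load, g_m_cg1, g_m_pass):
    """Damping factor according to the simplified Eq. 5.51 (omega_p2 >> omega_p1)."""
    return 0.5 * math.sqrt(c_load / (g_m_cg1 * g_m_pass * r1 ** 2 * c_gate))


# Hypothetical values (assumptions, not taken from the thesis).
R1 = 15e3         # ohm
C_GATE = 1e-12    # F
R_DS_PASS = 1e3   # ohm
C_LOAD = 5e-9     # F
G_M_CG1 = 2e-3    # S
G_M_PASS = 20e-3  # S

zeta_full = zeta_eq_5_50(R1, C_GATE, R_DS_PASS, C_LOAD, G_M_CG1, G_M_PASS)
zeta_simple = zeta_eq_5_51(R1, C_GATE, C_LOAD, G_M_CG1, G_M_PASS)
print(f"Eq. 5.50: {zeta_full:.4f}, Eq. 5.51: {zeta_simple:.4f}")
```

In this example the time constant rds,PASS · CLOAD exceeds R1 · CGATE by more than two orders of magnitude, so both expressions agree to within one percent. Note that Eq. 5.51 no longer contains rds,PASS, and that ζ scales inversely with R1.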

By means of the damping factor, an overdamped system (ζ > 1), a critically damped system (ζ = 1), and an underdamped system (ζ < 1) can be distinguished. While the damping factor can also be extracted from the large-signal transient response, the phase margin is often a more convenient and common metric to determine the closed-loop stability from the transfer function. The damping factor has a direct relationship to the phase margin φM.

$$\varphi_M = \arctan\left(\frac{2 \cdot \zeta}{\sqrt{\sqrt{4 \cdot \zeta^4 + 1} - 2 \cdot \zeta^2}}\right) \tag{5.52}$$

Interestingly, the function is approximately a straight line up to a phase margin of about φM = 60°, which allows for a simple “rule-of-thumb” estimation.
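The relationship of Eq. 5.52 and its near-linear region can be tabulated with a few lines of code. The linear estimate φM ≈ 100° · ζ used for comparison is a commonly quoted textbook approximation and is an assumption here, since the slope of the straight line is not stated above.

```python
import math


def phase_margin_deg(zeta):
    """Phase margin in degrees as a function of the damping factor (Eq. 5.52)."""
    inner = math.sqrt(4.0 * zeta ** 4 + 1.0) - 2.0 * zeta ** 2
    return math.degrees(math.atan(2.0 * zeta / math.sqrt(inner)))


def phase_margin_rule_of_thumb_deg(zeta):
    """Assumed linear estimate (about 100 degrees per unit of zeta)."""
    return 100.0 * zeta


for zeta in (0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.8, 1.0):
    exact = phase_margin_deg(zeta)
    estimate = phase_margin_rule_of_thumb_deg(zeta)
    print(f"zeta = {zeta:.1f}: Eq. 5.52 -> {exact:5.1f} deg, linear -> {estimate:5.1f} deg")
```

The linear estimate tracks Eq. 5.52 closely for small damping factors and increasingly overestimates the phase margin beyond roughly 60°.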

Fig. 5.11 shows the phase margin φM of the cascoded FVF as a function of load current. Evidently, the worst-case phase margin occurs at high load conditions. This behavior can be easily illustrated by considering the variation of the two low-frequency poles p1 and p2 as well as of the small-signal voltage gain Av,CASFVF over load current. While the pole frequency ωp2 remains approximately constant, the pole frequency ωp1 increases linearly with load current (assuming the pass-transistor is operated in strong inversion). At the same time, the small-signal voltage gain Av,CASFVF decreases with the square root of the load current (again assuming the pass-transistor is operated in strong inversion). By sizing the resistor R1, the phase margin of the cascoded FVF becomes a free design parameter, which can be easily adjusted during design. This, however, comes at the expense of either a reduced voltage gain or an increased quiescent current demand. The analytical results depicted in Fig. 5.11 are obtained based on Eq. 5.51 and match well with the simulation results.

The phase margin of the any-load stable LDO is identical to that of the cascoded FVF at medium to high load conditions. Under these conditions, the phase shift of the first dominant pole p0 is fully compensated by the left-half-plane zero z0, and the any-load stable LDO simplifies to a second-order two-pole system. Furthermore, the phase margin estimates the small-signal stability only and is not able to predict any large-signal, conditional instability effects. Both aspects, LDO loop stability at low load conditions as well as conditional instability effects, are addressed separately in Chap. 5.7.

5.4.4 Transient Response

The transient behavior of the cascoded FVF in response to fast line and load transient conditions is dominated by the settling phase. Large-signal slewing effects at the pass-transistor gate node can, in contrast, mostly be neglected, since the first non-dominant pole p2 associated with the pass-transistor gate node must be located at significantly higher frequencies than the dominant pole p1 associated with the LDO output node to guarantee loop stability. As a result, the transient response of the cascoded FVF is not limited by the slew rate at the pass-transistor gate node, but is solely determined by small-signal parameters, particularly the resonance frequency ωn and the damping factor ζ. The general transient response of a minimum-phase, linear, second-order negative feedback system, such as the cascoded FVF, can be obtained by determining the inverse Laplace transform of the closed-loop transfer function HCL(s) (compare Eq. 5.47; Allen and Holberg, 2002, p. 768ff.).

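$$v_{CORE}(t) = A_{v,CASFVF} \cdot \left[1 - \frac{e^{-\zeta \cdot \omega_n \cdot t}}{\sqrt{1 - \zeta^2}} \cdot \sin\left(\omega_n \cdot t \cdot \sqrt{1 - \zeta^2} + \varphi\right)\right] \tag{5.53}$$

where $\varphi = \arctan\left(\dfrac{\sqrt{1 - \zeta^2}}{\zeta}\right)$.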
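For an underdamped system (0 < ζ < 1), the response of Eq. 5.53 can be evaluated numerically, for instance to read off the peak deviation. The sketch below normalizes the gain factor to one by default; the closed-form peak overshoot exp(−π · ζ/√(1 − ζ²)) is the standard result for a second-order system and is not derived above, and the function names and example values are assumptions.

```python
import math


def step_response(t, zeta, omega_n, gain=1.0):
    """Underdamped second-order step response according to Eq. 5.53.

    Valid for 0 < zeta < 1 only; 'gain' stands for the prefactor of Eq. 5.53.
    """
    if not 0.0 < zeta < 1.0:
        raise ValueError("Eq. 5.53 in this form requires 0 < zeta < 1")
    root = math.sqrt(1.0 - zeta ** 2)
    phi = math.atan(root / zeta)
    envelope = math.exp(-zeta * omega_n * t) / root
    return gain * (1.0 - envelope * math.sin(omega_n * t * root + phi))


def peak_overshoot_fraction(zeta):
    """Relative peak overshoot of the normalized response (standard result)."""
    return math.exp(-math.pi * zeta / math.sqrt(1.0 - zeta ** 2))


# Hypothetical example (assumptions): zeta = 0.5, resonance frequency of 1 MHz.
ZETA = 0.5
OMEGA_N = 2.0 * math.pi * 1e6
t_peak = math.pi / (OMEGA_N * math.sqrt(1.0 - ZETA ** 2))  # time of the first peak

print(f"response at t = 0    : {step_response(0.0, ZETA, OMEGA_N):.3f}")
print(f"response at the peak : {step_response(t_peak, ZETA, OMEGA_N):.3f}")
print(f"peak overshoot       : {100.0 * peak_overshoot_fraction(ZETA):.1f} %")
```

The first peak occurs at t = π/(ωn · √(1 − ζ²)), so a higher resonance frequency shortens the transient while the damping factor alone sets the relative overshoot.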

While the supply voltage of an ultra-low-power MCU system is rather constant in a battery-powered application, the digital nature of the load places stringent requirements on the LDO load transient performance. The following discussion therefore focuses on the LDO load transient response. Of particular interest in this context is the maximum voltage under- and overshoot, which defines the tolerance window to guarantee fault-free operation of the MCU digital core. To determine and quantify the maximum voltage under- and overshoot of the cascoded FVF in response to fast load transient conditions, a distinction is made in the following between an overdamped or critically damped system (corresponding to a damping factor ζ ≥ 1) and an underdamped system (corresponding to a damping factor ζ < 1).

[Figure: undershoot in mV versus load current in mA (0 to 10 mA), comparing circuit simulation and theory for (a) an overdamped and (b) an underdamped system; the insets show the output voltage in V versus time in µs.]

Fig. 5.12. Transient voltage undershoot of the cascoded FVF in response to a load current step as a function of the load step height for (a) an overdamped or critically damped system, as well as (b) an underdamped system, comparing analytical results with circuit simulation. The initial condition is in each case zero load current. The operating conditions are otherwise nominal (VDD = 3.0 V, Temp. = 25 °C, nom. process).

In case the cascoded FVF is overdamped or critically damped, the output voltage settles without any dynamic oscillation. The output voltage deviation is solely determined by the small-signal voltage gain of the cascoded FVF, which is however rather limited. To enable an analytical expression for the output voltage deviation, the biasing current of the sensing transistor (I1), and thus also the small-signal voltage gain Av,CG12, are in the following assumed to be constant over load current. With the pass-transistor operated in saturation region, thereby neglecting its channel length modulation, the output voltage deviation in response to a load current step can be expressed as:

$$ \frac{\Delta V_{CORE}}{\Delta I_{LOAD}} \cong -\frac{1}{A_{v,CG12} \cdot g_{m,PASS}} \tag{5.54} $$

$$ \Delta V_{CORE} \cong -\frac{1}{A_{v,CG12}} \cdot \sqrt{\frac{\Delta I_{LOAD}}{2 \cdot \mu_p \cdot C_{OX} \cdot \left(\frac{W}{L}\right)_{PASS}}} \tag{5.55} $$
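The step from Eq. 5.54 to Eq. 5.55 can be checked with a short sketch. It assumes the square-law transconductance gm,PASS = √(2·µp·COX·(W/L)PASS·ILOAD) in strong inversion and a load step starting from zero load current. All numerical values are hypothetical and serve only to exercise the formulas.

```python
import math

def gm_pass(i_load, mu_cox, w_over_l):
    """Square-law transconductance of the pass-transistor in saturation."""
    return math.sqrt(2.0 * mu_cox * w_over_l * i_load)

def undershoot(delta_i, av_cg12, mu_cox, w_over_l):
    """Eq. 5.55: output voltage deviation for a load step starting at zero load."""
    return -math.sqrt(delta_i / (2.0 * mu_cox * w_over_l)) / av_cg12

# Hypothetical parameters (not taken from the design)
mu_cox = 50e-6      # A/V^2, product of carrier mobility and oxide capacitance
w_over_l = 5000.0   # pass-transistor aspect ratio
av_cg12 = 10.0      # first-stage voltage gain, assumed constant over load

for i_ma in (1.0, 5.0, 10.0):
    dv = undershoot(i_ma * 1e-3, av_cg12, mu_cox, w_over_l)
    print(f"load step {i_ma:4.1f} mA -> deviation {dv * 1e3:6.2f} mV")
```

Because gm,PASS grows only with the square root of the load current, the deviation also grows with the square root of the step height rather than linearly.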

Fig. 5.12(a) shows the transient voltage error as a function of the load current in case the cascoded FVF is overdamped or critically damped. Evidently, the transient voltage error obtained from circuit analysis (i.e. by evaluating Eq. 5.55) is in good accordance with the simulation results. While the worst-case transient voltage error obviously occurs in response to a full-scale load transient step, the theoretical analysis also allows the transient under- and overshoot in response to an arbitrary load transient step to be predicted. The theoretical analysis becomes inaccurate at low load conditions below 1 mA, at which the pass-transistor enters weak inversion and the biasing current I1 can no longer be neglected compared to the load current.

To achieve an overdamped or critically damped LDO response, the first non-dominant pole p2 needs to be pushed to higher frequencies, which results in a quiescent current penalty. For this reason, designing the cascoded FVF for an underdamped response with a worst-case damping factor ζ < 1 at high load conditions is more prevalent in practice. An underdamped response not only relaxes the quiescent current penalty, but also offers a faster response time compared to the overdamped case. In this case, a decaying oscillation is superimposed on the LDO transient response. As revealed by Esteves et al. (2013), the maximum transient voltage error occurs at the time $t = \pi / (\omega_n \cdot \sqrt{1-\zeta^2})$ and is solely determined by the damping factor ζ. Fig. 5.12(b) shows the transient voltage error as a function of the load current in case the cascoded FVF is underdamped. Due to reduced voltage gain and phase margin, it increases with increasing load current.
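The dependence of the peak error on the damping factor can be illustrated with the textbook result for a second-order system: at the peak time stated above, the sine term of Eq. 5.53 reaches its first extremum and the relative peak deviation equals exp(−π·ζ/√(1 − ζ²)). The sketch below uses this standard expression; it is not reproduced from Esteves et al. (2013), and the function names are chosen for illustration.

```python
import math

def peak_time(zeta, omega_n):
    """Time of the maximum transient error of an underdamped second-order system."""
    if not 0.0 < zeta < 1.0:
        raise ValueError("peak time is defined for 0 < zeta < 1 only")
    return math.pi / (omega_n * math.sqrt(1.0 - zeta ** 2))

def peak_overshoot_fraction(zeta):
    """Relative peak deviation; it depends on the damping factor only."""
    if not 0.0 < zeta < 1.0:
        raise ValueError("overshoot is defined for 0 < zeta < 1 only")
    return math.exp(-math.pi * zeta / math.sqrt(1.0 - zeta ** 2))

for zeta in (0.3, 0.5, 0.7, 0.9):
    percent = 100.0 * peak_overshoot_fraction(zeta)
    print(f"zeta = {zeta:.1f} -> relative peak deviation {percent:5.1f} %")
```

A lower damping factor, i.e. a lower phase margin at high load current, therefore translates directly into a larger relative peak deviation.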

While the transient under- and overshoot in response to a positive and a negative load current step are symmetric in case of an overdamped or critically damped system response, this changes in case of an underdamped system response. Since the dominant pole p1 associated with the LDO output node moves widely with load current, the phase margin is degraded at high load conditions. As a result, the cascoded FVF responds differently to a positive and a negative load current step: while the LDO feedback loop is highly overdamped at low load conditions, it may become underdamped at high load conditions.

[Figure: equivalent circuit with the elements MPASS, R1, gm,CG1, CLOAD and ILOAD, and the nodes VDD, VGATE, VCORE and VAMP.]

Fig. 5.13. Simplified equivalent circuit model of the cascoded flipped voltage follower (FVF). To enable first-order hand calculation, this model neglects the finite input resistance of the sensing transistor MCG1, and is thus valid at medium to high load conditions.

In conclusion, this transient response analysis focuses on the cascoded FVF only, although it is embedded into the multi-loop topology of the any-load stable LDO. Owing to its high bandwidth, the cascoded FVF allows the any-load stable LDO to react instantaneously to the load current step, but with limited gain only. The output voltage subsequently settles slowly back to its accurate nominal level, determined by the slow folded-cascode amplifier providing high gain to the feedback loop. These two steps are in fact superposed, resulting in a two-stage transient response.

5.4.5 Equivalent Circuit Model

The detailed results of the small-signal analyses are finally combined into a simplified equivalent circuit model of the cascoded flipped voltage follower (FVF), as depicted in Fig. 5.13. The primary aim of this concise model is to identify, quantify and illustrate the fundamental design trends and trade-offs of the cascoded FVF. To remain simple and clear, the biasing current of the sensing transistor (I1) is assumed to be independent of the LDO load conditions for the equivalent circuit model. In this way, the input resistance of the sensing transistor MCG1 becomes infinite (particularly such that rds,PASS ≪ 1/gm,CG1), and the loading effect of the sensing transistor can be neglected. While this approximation constrains the scope of the simplified equivalent circuit model to medium to high load conditions, it enables first-order hand calculation and design of the cascoded FVF.

As evident from the equivalent circuit model depicted in Fig. 5.13, the cascoded FVF is modeled as a second-order low-pass system. The first gain stage is formed by the sensing transistor MCG1 in combination with the resistor R1, while the second gain stage is formed by the pass-transistor MPASS. The small-signal voltage gain of the first gain stage can be expressed as:


$$ A_{v,CG12} \cong g_{m,CG1} \cdot R_1 = \frac{I_{DS,CM1} \cdot R_1 - V_{SG,PASS}}{n \cdot V_T} = \left( \frac{I_{DS,CM1} - \frac{V_{th,p}}{R_1}}{n \cdot V_T} \right) \cdot R_1 \tag{5.56} $$

While for practical implementations the small-signal voltage gain Av,CG12 decreases with increasing LDO load current due to the current switch mechanism of the cascoded FVF, the current I1 and in consequence also the transconductance gm,CG1 are assumed to be constant over load current for this equivalent circuit model (I1 := IDS,CM1 − Vth,p/R1 = const). For the second gain stage, a handy design equation can be obtained for its small-signal voltage gain by neglecting the loading effect due to the transconductance of the sensing transistor MCG1 (particularly such that rds,PASS ≪ 1/gm,CG1).

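$$ A_{v,PASS} \cong g_{m,PASS} \cdot r_{ds,PASS} = \frac{1}{\lambda_{si}} \cdot \sqrt{\frac{2 \cdot \mu_p \cdot C_{OX} \cdot \left(\frac{W}{L}\right)_{PASS}}{I_{LOAD}}} \tag{5.57} $$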

Each gain stage introduces a low-frequency pole associated with its respective output node. The first dominant pole p1 is associated with the LDO output, while the second non-dominant pole p2 is associated with the gate node of the pass-transistor. The location of the first dominant pole p1 can be expressed as:

$$ \omega_{p1} \cong \frac{1}{r_{ds,PASS} \cdot C_{LOAD}} = \frac{\lambda_{si} \cdot I_{LOAD}}{C_{LOAD}} \tag{5.58} $$

Analogous to the considerations on the small-signal voltage gain Av,PASS, the above expression neglects the finite input resistance of the sensing transistor MCG1 (such that rds,PASS ≪ 1/gm,CG1), and is hence valid at medium to high load conditions only. The location of the second non-dominant pole p2 is dominated by the total gate capacitance of the pass-transistor, and is in a first approximation independent of the LDO load conditions.

$$ \omega_{p2} \cong \frac{1}{R_1 \cdot (C_{GS,PASS} + C_{GD,PASS})} \tag{5.59} $$

The results of the above detailed small-signal circuit analysis of the cascoded FVF match those obtained from circuit simulation well. By restricting the analytical results to medium to high load conditions, the resulting simplified equivalent circuit model provides a good approximation and enables first-order hand calculation and design of the cascoded FVF. The model is subsequently employed to illustrate the sensitivity with regard to process, voltage and temperature variations, as well as to derive concise scaling laws for the most vital LDO performance parameters.
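The design equations 5.56 to 5.59 lend themselves to a small hand-calculation script. The sketch below assumes that the loop gain of the cascoded FVF is the product Av,CG12 · Av,PASS and that the loop is a pure two-pole system; both are simplifications made here for illustration. All component values are hypothetical placeholders, not values of the presented design.

```python
import math

def fvf_model(i_load, i1, r1, n_vt, lam, mu_cox, w_over_l, c_load, c_gate):
    """Gains and pole frequencies of the simplified model (Eqs. 5.56 to 5.59)."""
    av_cg12 = i1 * r1 / n_vt                                      # Eq. 5.56, I1 = const
    av_pass = math.sqrt(2.0 * mu_cox * w_over_l / i_load) / lam   # Eq. 5.57
    w_p1 = lam * i_load / c_load                                  # Eq. 5.58
    w_p2 = 1.0 / (r1 * c_gate)                                    # Eq. 5.59
    return av_cg12, av_pass, w_p1, w_p2

def phase_margin_deg(a0, w_p1, w_p2):
    """Phase margin of the loop gain a0 / ((1 + s/w_p1) * (1 + s/w_p2))."""
    if a0 <= 1.0:
        raise ValueError("no unity-gain crossing for a0 <= 1")
    # |L(jw)|^2 = 1 is a quadratic in x = w^2; use the cancellation-free root
    a = 1.0 / (w_p1 * w_p2) ** 2
    b = 1.0 / w_p1 ** 2 + 1.0 / w_p2 ** 2
    c = 1.0 - a0 ** 2
    x = -2.0 * c / (b + math.sqrt(b * b - 4.0 * a * c))
    w_u = math.sqrt(x)
    return 180.0 - math.degrees(math.atan(w_u / w_p1) + math.atan(w_u / w_p2))

# Hypothetical values for illustration only
params = dict(i1=10e-6, r1=20e3, n_vt=0.035, lam=0.1, mu_cox=50e-6,
              w_over_l=5000.0, c_load=1e-9, c_gate=1e-12)
for i_ma in (1.0, 3.0, 10.0):
    av1, av2, w1, w2 = fvf_model(i_ma * 1e-3, **params)
    pm = phase_margin_deg(av1 * av2, w1, w2)
    print(f"I_LOAD = {i_ma:4.1f} mA: gain = {20 * math.log10(av1 * av2):5.1f} dB, "
          f"phase margin = {pm:5.1f} deg")
```

With I1 held constant, the gain falls with the square root of the load current while p1 rises linearly, so the gain-bandwidth grows with load and the phase margin of this idealized model shrinks accordingly.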

5.5 Analysis and Mitigation of Parameter Variations

The behavior of the any-load stable LDO has so far been analyzed under nominal operating conditions, at which only load current variations have been taken into account. However, for robustness of the LDO operation and performance, random process fluctuations as well as variations of the operating conditions (such as supply voltage and temperature) must also be taken into account in its design (just as for any other analog circuitry). The process, supply voltage and temperature variations (in short PVT parameter variations) cause measurable and predictable variations in the LDO operation and performance. The ultimate goal of high-yield design is therefore circuit robustness against the inevitable PVT parameter variations. For this purpose, several variation-aware design methodologies and tools have recently been presented, which systematically explore and minimize the impact of process variations and interferences on analog circuits (Graeb et al., 2007; Gielen, 2006; Onabajo and Silva-Martinez, 2012). While a detailed review is beyond the scope of this work, the focus in the following is on a sensitivity analysis of the any-load stable LDO with respect to PVT parameter variations. The any-load stable LDO performance parameters - particularly including the transient accuracy (associated with the small-signal voltage gain of the cascoded FVF) as well as the loop stability (associated with the phase margin of the cascoded FVF) - should be insensitive to PVT parameter variations and, moreover, well centered within the specification limits. The focus is therefore on the cascoded FVF, while designing the folded-cascode amplifier is comparatively uncritical and generally well understood (Ivanov and Filanovsky, 2004, p. 149ff). The impact of PVT parameter variations can best be illustrated by making use of the simplified equivalent circuit model (see Fig. 5.13), as introduced and derived in Chap. 5.4. Based on the sensitivity analysis results, design mitigation techniques are identified and evaluated.

[Figure: schematic plot of phase margin in deg versus voltage gain in dB, marking the corners "weak process, 85 °C, 1.8 V", "strong process, 85 °C, 1.8 V", "weak process, −40 °C, 3.6 V" and "strong process, −40 °C, 3.6 V", together with the minimum phase margin and minimum voltage gain limits.]

Fig. 5.14. Schematic illustration of the small-signal voltage gain and phase margin of the cascoded flipped voltage follower (FVF) in the presence of PVT parameter variations. For robustness of the LDO operation and performance, an LDO design-tolerance window needs to be defined, guaranteeing that both the voltage gain and phase margin do not fall below certain limits.

Fig. 5.14 illustrates the small-signal voltage gain and phase margin of the cascoded FVF in the presence of PVT parameter variations, thereby revealing the basic challenge for design centering. For the one worst-case corner combination (weak process, highest temperature and lowest supply voltage), a degraded voltage gain and an excessive phase margin cause deep under- and overshoot in response to a transient load step, while the LDO is very stable. For the other worst-case corner combination (strong process, lowest temperature and highest supply voltage), the picture is reversed: an excessive voltage gain and a degraded phase margin lead to minimum under- and overshoot in response to a transient load step, while the LDO is, if at all, only marginally stable. Against this background, an LDO design-tolerance window needs to be defined. On the one hand, the LDO feedback loop must still be stable under worst-case conditions for phase margin, while on the other hand the transient error requirements must still be met under worst-case conditions for voltage gain. This in turn translates into the need to design the LDO for worst-case PVT variations. By making the LDO performance less sensitive to parameter variations (e.g. by design mitigation techniques), the design requirements can also be relaxed at typical operating conditions and nominal process. In this way, reduced variations of the LDO performance across PVT variations enable a lower quiescent current demand of the any-load stable LDO.

It is noteworthy at this point that the LDO performance parameters (in particular the small-signal voltage gain as well as the phase margin) become less sensitive to PVT parameter variations at high quiescent current levels (such that IDS,CM1·R1 ≫ VSG,PASS). In this case, the bias current through the sensing transistor MCG1, and in turn also its transconductance, are well defined.


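$$ A_{v,CG12} \cong \frac{I_{DS,CM1} \cdot R_1 - V_{SG,PASS}}{n \cdot V_T} \tag{5.60} $$

$$ V_{SG,PASS} \cong \sqrt{\frac{2 \cdot I_{LOAD}}{\mu_p \cdot C_{OX} \cdot \left(\frac{W}{L}\right)_{PASS}} \cdot \frac{1}{(1 + \lambda_{si} \cdot V_{SD,PASS})}} + V_{th,p} \tag{5.61} $$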

As a result, the small-signal voltage gain Av,CG12 becomes less sensitive to PVT parameter variations - particularly including variations of the pass-transistor threshold voltage Vth,p and carrier mobility µp as well as variations of the resistor R1. In the other extreme case, the small-signal voltage gain completely collapses if the sensing transistor MCG1 runs out of current (such that IDS,CM1·R1 ≤ VSG,PASS).
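The effect of the bias level on the sensitivity can be made concrete with a short sketch based on the gain expression of Eq. 5.56. It compares the relative gain change caused by the same shift of VSG,PASS for a conservative and an aggressive bias point. The function names and all numbers are hypothetical, and the clamp to zero is a simplification of the collapse described above.

```python
def av_cg12(i_ds_cm1, r1, v_sg_pass, n_vt):
    """First-stage gain; it collapses to zero once I_DS,CM1 * R1 <= V_SG,PASS."""
    return max(i_ds_cm1 * r1 - v_sg_pass, 0.0) / n_vt

def relative_gain_change(i_ds_cm1, r1, v_sg_pass, delta_v_sg, n_vt=0.035):
    """Relative change of the gain when V_SG,PASS shifts by delta_v_sg."""
    nominal = av_cg12(i_ds_cm1, r1, v_sg_pass, n_vt)
    if nominal == 0.0:
        raise ValueError("sensing transistor has no bias current at this point")
    shifted = av_cg12(i_ds_cm1, r1, v_sg_pass + delta_v_sg, n_vt)
    return (shifted - nominal) / nominal

# Hypothetical bias points with R1 = 100 kOhm and V_SG,PASS = 1.0 V
conservative = relative_gain_change(20e-6, 100e3, 1.0, 0.1)  # I*R1 = 2.0 V
aggressive = relative_gain_change(12e-6, 100e3, 1.0, 0.1)    # I*R1 = 1.2 V
print(f"conservative bias: {100 * conservative:6.1f} % gain change")
print(f"aggressive bias:   {100 * aggressive:6.1f} % gain change")
```

The same 100 mV shift changes the gain far less at the higher bias current, which is the quiescent-current-versus-robustness trade-off described in the text.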

5.5.1 Supply Voltage Variations

The any-load stable LDO should ideally be able to generate a constant output voltage independent of its supply voltage. The lower limit is thereby defined by the LDO dropout voltage. A variation of the supply voltage, however, directly translates into a varying drain-source voltage of the common-gate transistor MCG2 as well as of the pass-transistor MPASS. While this is rather uncritical with respect to the common-gate transistor (as long as it is operated in saturation region at low supply voltages), it is much more critical for the pass-transistor.

Fig. 5.15(a) shows the simulated small-signal voltage gain of the cascoded FVF as a function of supply voltage - it conspicuously increases with increasing supply voltage. This behavior is primarily caused by two mechanisms. (1) Due to channel length modulation effects, the pass-transistor source-gate voltage VSG,PASS required to provide a certain load current decreases with increasing supply voltage. This, in turn, translates into an increased small-signal voltage gain Av,CG12 caused by the current switch mechanism and determined by the robustness of the circuit design. (2) While the basic Shichman-Hodges transistor model indicates a constant channel resistance rds,PASS in saturation region, it in fact increases with increasing source-drain voltage (Tsividis, 2004, p. 370ff.). An increasing channel resistance rds,PASS in turn directly translates into an increasing small-signal voltage gain Av,PASS.

Due to the increasing channel resistance rds,PASS, the pole p1 also moves to lower frequencies with increasing supply voltage. At the same time, the pole p2 is not (significantly) affected by supply voltage variations. However, as the pole frequency ωp1 does not decrease as fast as the small-signal voltage gain Av,CASFVF increases, the LDO gain-bandwidth increases. In conclusion, and as depicted in Fig. 5.15(b), the phase margin degrades with increasing supply voltage.


Mitigation of Supply Voltage Variation

Based on the above considerations on supply voltage variations, the implications of operating the pass-transistor in triode region at small dropout voltages and high load conditions can be revealed. Operating the pass-transistor in triode region, as for instance proposed by Gutierrez et al. (2009), appears attractive at first sight, as the pass-transistor could be sized significantly smaller. However, drastic implications on the LDO characteristics are associated with this approach and have to be carefully traded against the (potential) benefits. The major source of concern is the resulting variability of the pass-transistor transconductance gm,PASS. As evident from Fig. 5.15(a), the small-signal voltage gain remains basically constant over supply voltage variations as long as the pass-transistor is operated in saturation region, while it decreases dramatically in triode region. As a result, the dynamic under- and overshoot in response to fast line and load transient steps increases at small dropout voltages and high load currents. Presuming that a certain error window for dynamic under- and overshoot must be guaranteed, the pass-transistor entering triode region at small dropout voltages becomes the limiting condition for defining the small-signal voltage gain of the cascoded FVF. It consequently needs to be increased, resulting in an LDO quiescent current penalty. Once the pass-transistor enters saturation region at higher dropout voltages and/or smaller load current, however, the excessive voltage gain potentially causes loop stability issues. To maintain loop stability under these conditions, the resistor R1 needs to be adjusted, again resulting in an LDO quiescent current penalty.

In conclusion, the pass-transistor entering triode region at small dropout voltages has a crucial impact on the LDO circuit design not only at small dropout voltages, but under all operating conditions. The drawbacks need to be carefully evaluated and traded off against the potential benefits, namely a smaller pass-transistor size and hence a smaller parasitic gate capacitance. It should be kept in mind that there is no abrupt transition between saturation and triode region, in contrast to what the basic Shichman-Hodges transistor model suggests. For optimizing the LDO design, it can hence be worthwhile to explore the limits when sizing the pass-transistor. To mitigate the impact of supply voltage variations, the pass-transistor should be operated in saturation region under all LDO operating conditions. The robustness of the any-load stable LDO to supply voltage variations can be further improved by using an alternative pass-transistor topology, introduced in Chap. 5.6.

5.5.2 Temperature Variations

The characteristics of both active and passive integrated devices vary significantly over temperature (Tsividis, 2004). The following discussion focuses on the temperature dependence of the transistors. The passive device, namely the resistor R1, is realized by a high-sheet resistor with only little temperature drift (e.g. by using a silicide-block polysilicon resistor). For the transistors, a varying temperature results first and foremost, among many other effects, in a varying threshold voltage as well as a varying carrier mobility (both decrease with increasing temperature).

$$ V_{th}(T) = V_{th}(T_0) - \alpha_{Vth} \cdot (T - T_0) \tag{5.62} $$

$$ \mu(T) = \mu(T_0) \cdot \left( \frac{T}{T_0} \right)^{-\alpha_\mu} \tag{5.63} $$

where $\alpha_\mu$ and $\alpha_{Vth}$ are process-dependent constants (Tsividis, 2004).
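The two temperature models of Eqs. 5.62 and 5.63 translate directly into code. In the sketch below, the reference temperature and the values of the constants are placeholders chosen for illustration; actual values are process dependent and are not given in the text.

```python
def vth_of_t(t_kelvin, vth_t0, alpha_vth, t0=300.0):
    """Eq. 5.62: linear decrease of the threshold voltage with temperature."""
    return vth_t0 - alpha_vth * (t_kelvin - t0)

def mobility_of_t(t_kelvin, mu_t0, alpha_mu, t0=300.0):
    """Eq. 5.63: power-law decrease of the carrier mobility with temperature."""
    return mu_t0 * (t_kelvin / t0) ** (-alpha_mu)

# Placeholder constants (assumptions, not process data)
VTH_T0 = 0.7        # V at the reference temperature
ALPHA_VTH = 1e-3    # V/K
ALPHA_MU = 1.5      # dimensionless exponent

for t_c in (-40.0, 25.0, 85.0):
    t_k = t_c + 273.15
    vth = vth_of_t(t_k, VTH_T0, ALPHA_VTH)
    mu_rel = mobility_of_t(t_k, 1.0, ALPHA_MU)
    print(f"{t_c:6.1f} C: Vth = {vth:.3f} V, relative mobility = {mu_rel:.3f}")
```

Both quantities fall with rising temperature, but they act on VSG,PASS in opposite directions, which is why the net effect depends on the load condition as discussed below.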


[Figure: (a) voltage gain in dB (axis 30 to 70) and (b) phase margin in deg (axis 20 to 40) versus supply voltage VDD from 1.80 V to 3.60 V, each with curves for −40 °C, 25 °C and 85 °C.]

Fig. 5.15. Simulated (a) small-signal voltage gain and (b) phase margin of the cascoded FVF at maximum load current ILOAD = 10 mA as a function of supply voltage and temperature variations.

The temperature behavior of the cascoded FVF is determined in detail by various effects, which particularly affect the small-signal voltage gain Av,CG12. The small-signal voltage gain Av,PASS is, as a first-order approximation, constant over temperature. While the pass-transistor transconductance gm,PASS decreases with increasing temperature due to decreasing carrier mobility, its channel resistance rds,PASS increases. At high load conditions, the pass-transistor source-gate voltage required to provide a certain load current is dominated by the carrier mobility and consequently increases with temperature. As a result, the bias current of the sensing transistor MCG1 and in turn the small-signal voltage gain Av,CG12 decrease. The effect of decreasing small-signal voltage gain is further reinforced when assuming a temperature-independent biasing of the cascoded FVF. Fig. 5.15(a) shows the simulated small-signal voltage gain of the cascoded FVF at high load current as a function of temperature - it decreases with increasing temperature. The picture described above changes at low load conditions. Here, the pass-transistor source-gate voltage required to provide a certain load current is dominated by the threshold voltage and consequently decreases with increasing temperature. The two effects thus essentially cancel each other, resulting in a rather stable small-signal voltage gain Av,CG12 at low load conditions.

Due to the increasing channel resistance rds,PASS, the pole p1 also moves to lower frequencies with increasing temperature. At the same time, the pole p2 is not (significantly) affected by temperature variations, as neither the pass-transistor gate capacitance nor the resistance R1 shows a (noticeable) temperature dependency. In combination with the decreasing small-signal voltage gain Av,CASFVF, the phase margin therefore improves with increasing temperature as depicted in Fig. 5.15(b).


Mitigation of Temperature Variation

To compensate for the effect of temperature variation on the small-signal voltage gain Av,CG12, the cascoded FVF can be biased by a bias current proportional to absolute temperature (PTAT). The PTAT current is generated with the help of a bandgap reference, such that $I_{DS,CM1} = \frac{k \cdot T}{q} \cdot \frac{\ln(m)}{R_{IREF}}$, with m and RIREF being circuit design parameters (Razavi, 2001, p. 389).

$$ A_{v,CG12} \cong \frac{I_{DS,CM1} \cdot R_1 - V_{SG,PASS}}{n \cdot V_T} = \frac{\left( \frac{k \cdot T}{q} \cdot \frac{\ln(m)}{R_{IREF}} \right) \cdot R_1 - V_{SG,PASS}}{n \cdot \frac{k \cdot T}{q}} \tag{5.64} $$

Particularly at high bias current levels (such that IDS,CM1·R1 ≫ VSG,PASS), the impact of temperature variations on the LDO performance is effectively minimized in this way. However, the impact of temperature variations on the characteristics of the cascoded FVF is rather small compared to that of supply voltage and particularly process variations, as discussed in the following section.
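The compensating effect of a PTAT bias can be sketched numerically. In the limit IDS,CM1·R1 ≫ VSG,PASS, the thermal voltage cancels out of the gain expression, leaving approximately ln(m)·R1/(n·RIREF). The sketch below treats VSG,PASS as a fixed parameter, although it is itself temperature dependent, and uses hypothetical values throughout.

```python
import math

K_OVER_Q = 8.617333e-5  # Boltzmann constant over elementary charge in V/K

def thermal_voltage(t_kelvin):
    return K_OVER_Q * t_kelvin

def gain_ptat(t_kelvin, m, r_iref, r1, v_sg_pass, n):
    """First-stage gain with a PTAT bias current (k*T/q) * ln(m) / R_IREF."""
    v_t = thermal_voltage(t_kelvin)
    i_bias = v_t * math.log(m) / r_iref
    return (i_bias * r1 - v_sg_pass) / (n * v_t)

def gain_constant_bias(t_kelvin, i_bias, r1, v_sg_pass, n):
    """First-stage gain with a temperature-independent bias current."""
    return (i_bias * r1 - v_sg_pass) / (n * thermal_voltage(t_kelvin))

# Hypothetical values; V_SG,PASS is set to zero to show the limiting case
for t_c in (-40.0, 25.0, 85.0):
    t_k = t_c + 273.15
    g_ptat = gain_ptat(t_k, m=8, r_iref=50e3, r1=200e3, v_sg_pass=0.0, n=1.3)
    g_const = gain_constant_bias(t_k, i_bias=1e-6, r1=200e3, v_sg_pass=0.0, n=1.3)
    print(f"{t_c:6.1f} C: PTAT bias gain = {g_ptat:.3f}, constant bias gain = {g_const:.3f}")
```

With a constant bias current, the gain scales with 1/T through the thermal voltage, whereas the PTAT bias removes this dependence in the limiting case.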

5.5.3 Process Variations

The characteristics of integrated circuit devices suffer from substantial parameter variations (Tsividis, 2004; Eisele et al., 1995). A distinction can be made between absolute variations (also known as process corner variations) and relative variations (also known as device matching). While the device characteristics vary substantially from wafer to wafer and from lot to lot, a group of equally designed devices in close proximity have very similar characteristics (Pelgrom et al., 1989) - a fact widely exploited in the design of integrated analog circuits. The cascoded flipped voltage follower (FVF) is essentially insensitive to relative process variations. Disregarding the bias current and bias voltage generation, it does not rely on device matching. The following considerations therefore focus on the process corner variations. Of particular interest are the transistor and resistor corner variations, whose impact is investigated separately in the following. Based on the results of the sensitivity analysis, approaches for mitigation are identified.

Transistor Variations

The transistor characteristics vary first and foremost due to errors of the dimensions, the gate thickness as well as the channel doping (Drennan and McAndrew, 2003). While variations of NMOS devices (namely with regard to the cascode transistor MCG2) have no significant impact on the behavior of the cascoded FVF, this is different for the PMOS devices (namely with regard to the sensing transistor MCG1 and the pass-transistor MPASS). As depicted by columns three and five in Fig. 5.16, variations of the PMOS devices affect both the small-signal voltage gain and the phase margin of the cascoded FVF.

The small-signal voltage gain Av,CG12 increases in the strong process corner due to varying transconductance gm,CG1, which is in turn caused by a varying sub-threshold slope factor. This effect is further reinforced as the pass-transistor source-gate voltage VSG,PASS, required to provide a certain load current, is also reduced in the strong process corner. As a result, the bias current of the sensing transistor MCG1 and in turn the small-signal voltage gain Av,CG12 increase. Moreover, the small-signal voltage gain Av,PASS also increases in the strong process corner due to increasing transconductance gm,PASS. The increasing transconductance is thereby partially offset by a decreasing channel resistance rds,PASS.

[Figure: bar charts of (a) voltage gain in dB and (b) phase margin in deg for seven process-corner columns (1 to 7) at nominal supply voltage and temperature (VDD = 3.0 V, Temp. = 25 °C). Bar labels: weak PMOS, low R, compensated; weak PMOS, low R; weak PMOS, nom. R; nom. PMOS, nom. R; strong PMOS, nom. R; strong PMOS, high R; strong PMOS, high R, compensated.]

Fig. 5.16. Simulated (a) small-signal voltage gain and (b) phase margin of the cascoded FVF at maximum load current ILOAD = 10 mA as a function of process corners. The cascoded FVF shows a strong dependence on process corner variations, particularly on variations of the resistor R1.

The decreasing channel resistance rds,PASS causes the pole p1 to move to lower frequencies in the strong process corner. At the same time, the pole p2 is not (considerably) affected by transistor variations. However, as the pole frequency ωp1 does not decrease as fast as the small-signal voltage gain Av,CASFVF increases, the LDO gain-bandwidth increases. In conclusion, and as depicted in Fig. 5.16(b), the phase margin is degraded in the strong process corner and improved in the weak process corner.

Resistor Variations

Depending on the resistor types offered by the CMOS technology, integrated resistors are not very accurate and typically exhibit a large variation of ±20% to ±30% (Razavi, 2001, p. 616f.). The resistor variation is dominated by fluctuations of the sheet resistance as well as the resistor dimensions. The cascoded flipped voltage follower (FVF) is significantly more sensitive to resistor variations, namely with respect to the resistor R1, than to transistor variations.

The resistor R1 directly determines the small-signal voltage gain Av,CG12 of the cascoded FVF. While the relation is linear for a conservative design approach with IDS,CM1·R1 ≫ VSG,PASS, it becomes more attenuated for an aggressive design approach with IDS,CM1·R1 > VSG,PASS. Besides the small-signal voltage gain, the resistor variations also have a direct impact on the first non-dominant pole p2. As the pole frequency ωp1 is at the same time independent of the resistor variations, the loop stability is affected by the resistor variations in two ways. On the one hand, the small-signal voltage gain of the cascoded FVF is increased by a linear factor, which increases the gain-bandwidth. On the other hand, the first non-dominant pole also moves towards lower frequencies. In conclusion, and as evident from columns two and six in Fig. 5.16, the resistor variations result for the cascoded FVF in an increased voltage gain and thus degraded loop stability for the strong process corner, as well as in a reduced voltage gain and thus degraded transient regulation performance for the weak process corner.

Mitigation of Process Variation

The above sensitivity analysis for process variations clearly reveals a strong dependency of the LDO characteristics on the resistor variations, namely with respect to the resistor R1. To mitigate the process variations, it is hence indispensable to minimize the resistor variations, for which two alternative approaches are identified in the following.

The most basic, but also the most effective approach is to trim the resistor R1 during final production test (Ivanov and Filanovsky, 2004, p. 93). One option for trimming (as widely employed for highly-integrated MCU systems) is to partially short the resistor as part of a ladder structure and to store the respective digital value in the on-chip non-volatile memory. Depending on the trim step size, the resistor variation and consequently its impact on the LDO characteristics can be drastically reduced. However, the trimming approach tends to be costly with regard to both test time and circuit overhead.
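The trimming idea can be illustrated with a minimal sketch. It assumes a hypothetical ladder made of a fixed resistor plus a number of equal unit segments that can be left in series or shorted, with all segments sharing the same process scaling factor. The structure, names and values are illustrative assumptions and do not describe the implemented trim circuit.

```python
def trimmed_resistance(code, r_fixed, r_unit, scale):
    """Resistance with `code` unit segments left in series (the others are shorted)."""
    return scale * (r_fixed + code * r_unit)

def best_trim_code(target, r_fixed, r_unit, n_codes, scale):
    """Trim code whose resulting resistance is closest to the target value."""
    return min(range(n_codes),
               key=lambda c: abs(trimmed_resistance(c, r_fixed, r_unit, scale) - target))

# Hypothetical ladder: 72 kOhm fixed part plus up to 15 segments of 4 kOhm,
# trimmed towards 100 kOhm for process factors between -20 % and +20 %
TARGET, R_FIXED, R_UNIT, N_CODES = 100e3, 72e3, 4e3, 16
for scale in (0.8, 1.0, 1.2):
    code = best_trim_code(TARGET, R_FIXED, R_UNIT, N_CODES, scale)
    r = trimmed_resistance(code, R_FIXED, R_UNIT, scale)
    print(f"process factor {scale:.1f}: code {code:2d}, R1 = {r / 1e3:6.1f} kOhm, "
          f"error {100 * (r - TARGET) / TARGET:+.1f} %")
```

The residual error is bounded by roughly half a trim step, which illustrates why a finer trim step reduces the variation further at the cost of more trim bits and test time.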

An alternative approach for mitigating the impact of process variations on the LDO characteristics is based on process parameter tracking techniques, and consequently does not require any trimming. The basic idea is to generate a bias current which is inversely proportional to the resistor variations (IDS,CM1 ∝ 1/R1).

$$ A_{v,CG12} \cong \frac{\left( \frac{I_{DS,CM1}}{\Delta R_1} \right) \cdot (R_1 \cdot \Delta R_1) - V_{SG,PASS}}{n \cdot V_T} \tag{5.65} $$

As evident from comparing columns one and three for the weak process corner as well as columns five and seven for the strong process corner in Fig. 5.16(a), the variations of the small-signal voltage gain Av,CG12 can be drastically reduced with the help of this process parameter tracking technique. Due to the stabilization of the small-signal voltage gain and therefore also of the gain-bandwidth, the variations of the phase margin can be effectively minimized, although the variation of the first non-dominant pole frequency ωp2 is not compensated by this approach. The resulting phase margin for the worst-case process corner combinations is depicted in Fig. 5.16(b). In conclusion, the impact of the resistor variations is efficiently mitigated with respect to both the voltage gain and the phase margin. Since the cascoded FVF needs to be designed for worst-case parameter variations, this process parameter tracking technique enables a more aggressive design and thus a lower quiescent current demand of the any-load stable LDO.
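The benefit of the tracking technique can be checked with a few lines of code. The sketch evaluates the gain expression of Eq. 5.65 for a resistor scaling factor ΔR1 between 0.8 and 1.2, once with a fixed bias current and once with a bias current scaled by 1/ΔR1. All names and values are hypothetical, and VSG,PASS is assumed not to vary with the resistor corner.

```python
def gain(i_bias, r1, v_sg_pass, n_vt):
    """First-stage gain (I * R1 - V_SG,PASS) / (n * V_T)."""
    return (i_bias * r1 - v_sg_pass) / n_vt

def gain_spread(r_scales, i_nom, r_nom, v_sg_pass, n_vt, tracking):
    """Minimum and maximum gain over a set of resistor scaling factors."""
    gains = []
    for s in r_scales:
        i_bias = i_nom / s if tracking else i_nom  # tracking: I proportional to 1/R1
        gains.append(gain(i_bias, r_nom * s, v_sg_pass, n_vt))
    return min(gains), max(gains)

# Hypothetical design point: I * R1 = 1.0 V nominal, V_SG,PASS = 0.6 V
scales = (0.8, 0.9, 1.0, 1.1, 1.2)
for tracking in (False, True):
    lo, hi = gain_spread(scales, 10e-6, 100e3, 0.6, 0.035, tracking)
    label = "with tracking" if tracking else "fixed bias"
    print(f"{label:14s}: gain between {lo:5.2f} and {hi:5.2f}")
```

With tracking, the product of bias current and resistance stays constant, so the gain spread caused by R1 vanishes in this model; the shift of ωp2 with R1 remains, as noted in the text.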

5.6 Pass-Transistor Design

The pass-transistor forms the backbone of any LDO topology and therefore greatly affects its performance. The ability to source high load currents while achieving low dropout voltages demands the use of a large PMOS pass-transistor. Its gate capacitance presents a large capacitive load to the error amplifier, which needs to be driven rapidly. Consequently, the pass-transistor gate capacitance is the key limiting factor for achieving low LDO quiescent current. Based on the simplified equivalent circuit model introduced and derived in Chap. 5.4.5, it can in particular be shown that the quiescent current of the cascoded flipped voltage follower (FVF) is directly proportional to the pass-transistor gate capacitance. Presuming the loop stability is to be maintained, the resistor R1 needs to be scaled inversely proportional to the pass-transistor gate capacitance. To maintain the small-signal voltage gain Av,CG12 at the same time, the bias current IDS,CM1 in turn must be scaled inversely proportional to the resistor R1, and consequently proportional to the pass-transistor gate capacitance. The pass-transistor design is thereby strongly related to the CMOS technology parameters. To achieve a small gate capacitance, a high carrier mobility (µp), a small gate capacitance per area (COX), as well as a short minimum channel length (Lmin) are preferred. In the following, the conventional pass-transistor topology is presented, including a design methodology for determining the required pass-transistor dimensions. Based on these results, an alternative pass-transistor topology is proposed, which efficiently makes use of the different transistor types available in modern deep submicron CMOS technologies.
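The scaling argument above can be written down as a two-line rule. The sketch keeps ωp2 = 1/(R1·CG) and the product IDS,CM1·R1 constant when the gate capacitance changes; the function name and the numbers are assumptions for illustration.

```python
def scale_bias_for_gate_cap(r1, i_bias, c_gate_old, c_gate_new):
    """Rescale R1 and the bias current so that w_p2 and I * R1 stay constant."""
    k = c_gate_new / c_gate_old
    return r1 / k, i_bias * k

# Hypothetical starting point: R1 = 20 kOhm, 10 uA bias, 1 pF gate capacitance
r1_new, i_new = scale_bias_for_gate_cap(20e3, 10e-6, 1e-12, 2e-12)
print(f"R1 = {r1_new / 1e3:.1f} kOhm, bias current = {i_new * 1e6:.1f} uA")
```

Doubling the gate capacitance thus halves R1 and doubles the bias current, which is the direct proportionality between quiescent current and gate capacitance stated in the text.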

5.6.1 Conventional Pass-Transistor Topology

Fig. 5.17(a) shows the conventional pass-transistor topology composed of a single PMOS device in common-source configuration. This device can be exposed to source-drain voltages as high as the maximum supply voltage during shut-down and start-up, respectively. To avoid punch-through breakdown, the pass-transistor is consequently required to be a thick-oxide device with a comparatively long minimum channel length and a high voltage tolerance.

In the following, the basic Shichman-Hodges transistor model is used to find an analytical expression for the required pass-transistor dimensions (Gutierrez et al., 2009). The pass-transistor sizing is thereby determined by the following critical operating conditions: minimum supply voltage, maximum load current, weak process corner and, for most LDO specifications, high temperature (when both the carrier mobility and the threshold voltage are at their minimum). As revealed in Chap. 5.5.1, the pass-transistor must be operated in saturation region to mitigate the impact of supply voltage variations on the LDO characteristics. The required pass-transistor dimensions can thus be expressed as:

$$\left(\frac{W}{L}\right)_{PASS} = \frac{2}{\mu_p \cdot C_{OX}} \cdot \frac{I_{LOAD,max} + I_1}{\left(V_{SG,PASS} - V_{th,p}\right)^2} \tag{5.66}$$

As the LDO dropout voltage and thus the minimum source-drain voltage of the pass-transistor is commonly small, its channel length modulation is neglected here. Moreover, as the pass-transistor is a thick-oxide device with a comparatively long minimum channel length, it is not (considerably) affected by short-channel effects (Tsividis, 2004). The maximum source-gate voltage of the pass-transistor is determined by two operational constraints.

1. To keep the pass-transistor dimensions as small as possible, it is designed to operate at the edge to triode region at the above-specified worst-case operating conditions.

$$V_{DD,min} - V_{CORE} \geq V_{SG,PASS} - V_{th,p} \tag{5.67}$$

By rearranging the above equation, the maximum source-gate voltage can thus be defined as:

$$V_{SG,PASS} = V_{DD,min} - V_{CORE} + V_{th,p} \tag{5.68}$$


Fig. 5.17. (a) Conventional pass-transistor topology, and (b) the required device dimensions (W/L, logarithmic axis from 10 to 10000) over minimum supply voltage VDD,min and maximum (static) core voltage VCORE for a load current of 1 mA. Plot parameters: VGATE,min = 0.70 V, Vth,p = 0.56 V, µp·COX = 56 µA/V².

2. In addition to the first constraint, the pass-transistor source-gate voltage might be limited by the output voltage range of the error amplifier. Particularly at low LDO output voltages, the pass-transistor dimensions are therefore determined by the minimum voltage level at its gate node, in the following denoted by VGATE,min.

$$V_{DD,min} - V_{CORE} \geq V_{DD,min} - V_{GATE,min} - V_{th,p}$$
$$V_{CORE} \leq V_{GATE,min} + V_{th,p} \tag{5.69}$$

For the cascoded flipped voltage follower (FVF), the minimum voltage level is determined by the sum of the drain-source voltage of the current mirror MCM1 and that of the common-gate transistor MCG2, such that both devices always remain in saturation region. The drain-source voltage of the current mirror MCM1 can thereby be controlled by the bias voltage VBIAS. The maximum source-gate voltage of the pass-transistor is in this case simply defined as:

$$V_{SG,PASS} = V_{DD,min} - V_{GATE,min} \tag{5.70}$$

Substituting VSG,PASS in Eq. 5.66 by the two sizing constraints expressed by Eq. 5.68 and Eq. 5.70, and neglecting the LDO quiescent current (ILOAD,max ≫ I1), the required pass-transistor dimensions can be expressed as:

$$\left(\frac{W}{L}\right)_{PASS} =
\begin{cases}
\dfrac{2}{\mu_p \cdot C_{OX}} \cdot \dfrac{I_{LOAD,max}}{\left(V_{DD,min} - V_{CORE}\right)^2} & \text{for } V_{CORE} > V_{GATE,min} + V_{th,p} \\[2ex]
\dfrac{2}{\mu_p \cdot C_{OX}} \cdot \dfrac{I_{LOAD,max}}{\left(V_{DD,min} - V_{GATE,min} - V_{th,p}\right)^2} & \text{for } V_{CORE} \leq V_{GATE,min} + V_{th,p}
\end{cases} \tag{5.71}$$

The gate capacitance of the pass-transistor, which is to be driven by the error amplifier, is directly proportional to its aspect ratio. With the pass-transistor operated in saturation region, the total gate capacitance can be approximated as (Allen and Holberg, 2002, p. 85):


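$$\left(C_{GS,PASS} + C_{GD,PASS}\right) = C_{OX} \cdot \left(L_D + 0.67 \cdot L_{eff}\right) \cdot W_{eff} + C_{OX} \cdot L_D \cdot W_{eff}$$
$$= C_{OX} \cdot W_{eff} \cdot \left(2 \cdot L_D + 0.67 \cdot L_{eff}\right) \tag{5.72}$$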

where LD denotes the overlap due to lateral diffusion, which is neglected here for simplicity. To minimize the gate capacitance of the pass-transistor, its channel length is designed as short as the lithography and breakdown limits of the CMOS technology allow. Hence, with Leff = Lmin and Weff = (W/L)PASS · Lmin, the above equation yields:

$$\left(C_{GS,PASS} + C_{GD,PASS}\right) \cong 0.67 \cdot C_{OX} \cdot L_{min}^2 \cdot \left(\frac{W}{L}\right)_{PASS} \tag{5.73}$$

It should be noted that these equations only give an approximation for the required pass-transistor dimensions and the resulting gate capacitance. The transition between saturation and triode region is not very well covered by the basic transistor model. Particularly the transition point at VSD,PASS = VSG,PASS − Vth,p is in practice a poor estimation (Tsividis, 2004, p. 158ff.). This approximation in fact results in an underestimation of the required pass-transistor dimensions, which in case of the 0.13 µm standard CMOS technology ranges between −10 % at high dropout voltages and −20 % at very small dropout voltages.

Although the above considerations cannot replace fine-tuning of the pass-transistor dimensions by device simulation, they provide a good approximation for the required pass-transistor dimensions. Fig. 5.17(b) illustrates the required pass-transistor dimensions as a function of the minimum supply voltage and maximum (static) core voltage, normalized to a load current of 1 mA. In conclusion, the required pass-transistor dimensions and consequently the gate capacitance presented to the error amplifier increase with the inverse square of the dropout voltage (see Eq. 5.71) and linearly with the maximum load current.
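The sizing procedure of Eq. 5.71 and Eq. 5.73 can be illustrated with a short numerical sketch. The default values for VGATE,min, Vth,p and µp·COX are those given with Fig. 5.17(b); the function names and the chosen operating point are assumptions made purely for illustration, and the result inherits the 10 % to 20 % underestimation of the basic transistor model noted above.

```python
def pass_transistor_aspect_ratio(vdd_min, v_core, i_load_max,
                                 v_gate_min=0.70, v_th_p=0.56, mu_cox=56e-6):
    """Required (W/L) of the pass-transistor following Eq. 5.71.

    Voltages in V, current in A, mu_cox (= µp·COX) in A/V^2.
    """
    if v_core > v_gate_min + v_th_p:
        # Sizing limited by the dropout voltage (edge to triode region).
        v_overdrive = vdd_min - v_core
    else:
        # Sizing limited by the minimum gate voltage of the error amplifier.
        v_overdrive = vdd_min - v_gate_min - v_th_p
    if v_overdrive <= 0:
        raise ValueError("no overdrive voltage available at this operating point")
    return 2.0 / mu_cox * i_load_max / v_overdrive ** 2


def pass_gate_capacitance(aspect_ratio, c_ox, l_min):
    """Total gate capacitance following Eq. 5.73 (overlap neglected).

    c_ox in F/m^2, l_min in m.
    """
    return 0.67 * c_ox * l_min ** 2 * aspect_ratio


if __name__ == "__main__":
    # Hypothetical operating point: 1.8 V minimum supply, 1.5 V core, 1 mA load.
    wl = pass_transistor_aspect_ratio(vdd_min=1.8, v_core=1.5, i_load_max=1e-3)
    print(f"(W/L)_PASS = {wl:.1f}")
```

Because the overdrive voltage enters quadratically, halving the dropout voltage in the first branch quadruples both the aspect ratio and the gate capacitance.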

5.6.2 Cascode Pass-Transistor Topology

For the conventional pass-transistor topology, the designer's options to reduce the pass-transistor gate capacitance are very limited. Instead, the required pass-transistor dimensions and thus the gate capacitance to be driven by the error amplifier are directly determined by the LDO specification and the CMOS technology being used. In order to overcome these limitations and enable a significant reduction of the gate capacitance, a cascode pass-transistor topology is presented in the following. This alternative pass-transistor topology makes efficient use of the different transistor types available in modern deep submicron CMOS technologies. The thick-oxide pass-transistor is thereby replaced by a thin-oxide device. Compared to a thick-oxide device, such a device offers a higher carrier mobility (µp) as well as a shorter minimum channel length (Lmin). At the same time, it exhibits a larger gate capacitance per area (COX) and suffers from a limited voltage capability. To nevertheless ensure reliable operation, the thin-oxide device has to be protected such that the voltages across its gate-to-source, gate-to-drain, and gate-to-body terminals do not exceed the maximum voltage rating specified by the CMOS technology under any operating condition. For this reason, an additional thick-oxide cascode transistor is connected in series with the actual thin-oxide pass-transistor as depicted in Fig. 5.18(a). The considerations involved in the design of this alternative pass-transistor topology are examined in the following. This particularly also includes a comparison to the conventional pass-transistor topology with regard to LDO quiescent current demand as well as silicon area.


Fig. 5.18. (a) Basic principle, and (b) circuit diagram of the cascode pass-transistor topology composed of a thin-oxide pass-transistor and a thick-oxide cascode transistor for protection. By using this cascode pass-transistor topology, the gate capacitance of the pass-transistor and consequently also the LDO quiescent current can be drastically reduced for a given LDO specification.

To protect the thin-oxide pass-transistor from high voltage stress, the thick-oxide cascode transistor is controlled by an auxiliary amplifier forming a local negative feedback loop (Bult and Geelen, 1990; Chiu, 2013). By aiming to keep the source-drain voltage of the thin-oxide pass-transistor equal to a defined bias voltage VBIAS, the local negative feedback loop provides a gain-boosting function. The small-signal output impedance of the cascode pass-transistor configuration is in this way increased to:

$$r_{out,PASS} \cong r_{ds,PASS} + r_{ds,CASC} \cdot \left[1 + g_{m,CASC} \cdot r_{ds,PASS} \cdot \left(1 + A_{v,AUX}(s)\right)\right] \tag{5.74}$$

with Av,AUX(s) denoting the transfer function of the auxiliary amplifier. Due to the increased output impedance, the small-signal voltage gain Av,PASS is boosted, while the dominant pole p1 is moved to lower frequency by the same factor.

$$A_{v,PASS} \cong g_{m,PASS} \cdot r_{ds,PASS} \cdot \left[1 + g_{m,CASC} \cdot r_{ds,CASC} \cdot \left(1 + A_{v,AUX}(s)\right)\right] \tag{5.75}$$

$$\omega_{p1} \cong \frac{1}{r_{ds,PASS} \cdot \left[1 + g_{m,CASC} \cdot r_{ds,CASC} \cdot \left(1 + A_{v,AUX}(s)\right)\right] \cdot C_{LOAD}} \tag{5.76}$$

As a result, the LDO gain-bandwidth is not affected by the gain-boosting function, and the previous considerations on the placement of the first non-dominant pole p2 to guarantee loop stability remain valid without limitation also for the cascode pass-transistor topology (particularly see Chap. 5.4.3). Since the gain-boosting function relies on increasing the output impedance without affecting the pass-transistor transconductance gm,PASS, it also has no impact on the LDO transient response, neither on the line transient nor on the load transient (particularly see Chap. 5.4.4). In conclusion, the sole purpose of the cascode transistor and the auxiliary amplifier is to protect the thin-oxide pass-transistor from high voltage stress. The small-signal voltage gain of the auxiliary amplifier is thus uncritical, and, in particular, the cascode transistor can be operated in triode region without sacrificing the LDO transient response.

The frequency response of the auxiliary amplifier is dominated by a single low-frequency pole pAUX, which is associated with the gate node of the cascode transistor, and is determined by its total gate capacitance as well as the output impedance of the auxiliary amplifier. The low-pass characteristic of the transfer function Av,AUX(s) introduces a frequency dependency of the gain-boosting and thus of the LDO output impedance. To avoid interference with the main LDO feedback loop, the local negative feedback loop controlling the cascode transistor must have a larger gain-bandwidth than the main LDO feedback loop (Chiu, 2013), such that:

$$A_{v,AUX} \cdot \omega_{p,AUX} \geq A_{v,CASFVF} \cdot \omega_{p1} \tag{5.77}$$

$$\frac{g_{m,CS1}}{\left(C_{GS,CASC} + C_{GD,CASC}\right)} \geq \frac{A_{v,CG12} \cdot g_{m,PASS}}{C_{LOAD}} \tag{5.78}$$

For simplicity, the first non-dominant pole p2 is thereby assumed to be located well beyond the LDO unity-gain frequency. The critical condition for avoiding interference is at high load conditions, when the gain-bandwidth of the main LDO feedback loop is at its maximum. Fig. 5.18(b) shows a practical implementation of the auxiliary amplifier controlling the thick-oxide cascode transistor. The auxiliary amplifier is formed by a single-transistor common-source amplifier, which aims to keep the source-drain voltage of the thin-oxide pass-transistor equal to the source-gate voltage of the transistor MCS1. While the source-gate voltage varies significantly over process corner and temperature, it must always be greater than the minimum dropout voltage for which the thin-oxide pass-transistor is designed. To further enhance the driving capability of the auxiliary amplifier, an additional voltage follower stage might be inserted.
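The gain-bandwidth condition of Eq. 5.78 can be turned into a lower bound for the bias current of the common-source transistor MCS1. The following sketch assumes that MCS1 operates in weak inversion, so that gm,CS1 = ISD,CS1 / (n·VT); the slope factor, the thermal voltage and all numerical values in the example are hypothetical and only serve to show the orders of magnitude involved.

```python
def aux_amp_min_bias_current(c_gate_casc, av_cg12, gm_pass, c_load,
                             n=1.3, v_t=0.0259):
    """Minimum bias current of MCS1 that satisfies Eq. 5.78.

    Assumes weak inversion, i.e. gm_CS1 = I_SD,CS1 / (n * V_T).
    c_gate_casc: total gate capacitance of the cascode transistor in F
    av_cg12:     small-signal voltage gain Av,CG12
    gm_pass:     pass-transistor transconductance at high load in S
    c_load:      LDO load capacitance in F
    """
    # Eq. 5.78 rearranged for the minimum required transconductance.
    gm_cs1_min = c_gate_casc * av_cg12 * gm_pass / c_load
    return gm_cs1_min * n * v_t


if __name__ == "__main__":
    # Hypothetical values: 1 pF cascode gate, gain of 10, 10 mS, 1 nF load.
    i_min = aux_amp_min_bias_current(1e-12, 10.0, 10e-3, 1e-9)
    print(f"minimum I_SD,CS1 = {i_min * 1e6:.2f} uA")
```

The bound scales linearly with the cascode gate capacitance, which is why the sizing of the cascode transistor feeds back into the quiescent current budget.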

Analogous to the conventional pass-transistor topology, the critical operating condition for pass-transistor sizing is at minimum supply voltage, maximum load current, weak process corner and, for most pass-transistor operating conditions, high temperature (when both the carrier mobility and the threshold voltage are reduced). Since the total voltage headroom needs to be shared for the cascode pass-transistor topology, the minimum source-drain voltage is however reduced to VDD,min − VCASC, with VCASC denoting the intermediate voltage node. To mitigate the impact of supply voltage variations on the LDO characteristics, the thin-oxide pass-transistor must again be laid out to operate in saturation region under all operating conditions. The required pass-transistor dimensions for a given LDO performance specification, particularly including the maximum load current and the minimum dropout voltage, can thus be determined as:

$$\left(\frac{W}{L}\right)_{PASS} = \frac{2}{\mu_p \cdot C_{OX,thin}} \cdot \frac{I_{LOAD,max}}{\left(V_{DD,min} - V_{CASC}\right)^2} \tag{5.79}$$

In contrast to a thick-oxide pass-transistor with a minimum channel length of more than Lmin = 0.25 µm, the thin-oxide pass-transistor is thereby affected by short channel effects, making its behavior less predictable by the basic Shichman-Hodges transistor model (Tsividis, 2004). As shown in the preceding discussion of the conventional pass-transistor topology (Eq. 5.73), the total pass-transistor gate capacitance can be approximated as:

$$\left(C_{GS,PASS} + C_{GD,PASS}\right) \cong 0.67 \cdot C_{OX,thin} \cdot L_{min,thin}^2 \cdot \left(\frac{W}{L}\right)_{PASS} \cong 0.67 \cdot L_{min,thin}^2 \cdot \frac{2}{\mu_p} \cdot \frac{I_{LOAD,max}}{\left(V_{DD,min} - V_{CASC}\right)^2} \tag{5.80}$$

thereby neglecting the overlap capacitance as a first-order approximation. The 0.13 µm standard CMOS technology used for this design example offers a thick-oxide transistor with a maximum voltage rating of 3.6 V and a minimum (drawn) channel length of Lmin,thick = 0.5 µm, as well as a thin-oxide transistor with a maximum voltage rating of 1.5 V and a minimum (drawn) channel length of Lmin,thin = 0.15 µm. While the thick-oxide transistor has a lower gate capacitance per area compared to the thin-oxide transistor (approximately a factor of two), its minimum channel length is 3.3 times larger. Since the gate capacitance per area cancels out in Eq. 5.80, and since both the threshold voltage and the carrier mobility are very similar for both transistor types, the thin-oxide pass-transistor offers a lower gate capacitance by a factor of eleven (3.3²) for the same current drive capability compared to its thick-oxide counterpart.

While the thin-oxide pass-transistor MPASS must be operated in saturation region to mitigate the impact of supply voltage variations, the thick-oxide cascode transistor MCASC is operated in triode region with its source-drain voltage equal to VCASC − VCORE. The transistor dimensions for a given LDO performance specification, particularly including the maximum load current as well as the minimum dropout voltage, can thus be determined as:

$$\left(\frac{W}{L}\right)_{CASC} = \frac{1}{\mu_p \cdot C_{OX,thick}} \cdot \frac{I_{LOAD,max}}{\left(V_{CASC} - V_{th,p}\right) \cdot \left(V_{CASC} - V_{CORE}\right) - \left(\frac{V_{CASC} - V_{CORE}}{2}\right)^2} \tag{5.81}$$

With the thick-oxide cascode transistor operated in triode region, the total gate capacitance can be approximated as (Allen and Holberg, 2002, p. 85):

$$\left(C_{GS,CASC} + C_{GD,CASC}\right) = 2 \cdot C_{OX,thick} \cdot \left(L_D + 0.5 \cdot L_{eff}\right) \cdot W_{eff} \tag{5.82}$$

where LD denotes the overlap due to lateral diffusion, which is neglected here for simplicity. By minimizing the channel length of the thick-oxide cascode transistor, such that Leff = Lmin,thick and Weff = (W/L)CASC · Lmin,thick, the above equation can be expressed as:

$$\left(C_{GS,CASC} + C_{GD,CASC}\right) \cong C_{OX,thick} \cdot L_{min,thick}^2 \cdot \left(\frac{W}{L}\right)_{CASC}$$
$$\cong \frac{L_{min,thick}^2}{\mu_p} \cdot \frac{I_{LOAD,max}}{\left(V_{CASC} - V_{th,p}\right) \cdot \left(V_{CASC} - V_{CORE}\right) - \left(\frac{V_{CASC} - V_{CORE}}{2}\right)^2} \tag{5.83}$$

Since the total voltage headroom needs to be shared for the cascode pass-transistor topology, there is a sizing trade-off between the thin-oxide pass-transistor and the thick-oxide cascode transistor with regard to quiescent current demand as well as silicon area. Both aspects are examined in the following based on a design example realized in a 0.13 µm standard CMOS technology. The minimum supply voltage for this design example is 1.8 V, while the nominal output voltage is 1.5 V.

For the cascoded flipped voltage follower (FVF), the lower gate capacitance of the thin-oxide pass-transistor directly translates into a reduced quiescent current demand - potentially amounting to a factor of eleven for the given design example. However, because of the additional thick-oxide cascode transistor and the resulting shared voltage headroom, this theoretical quiescent current saving factor cannot be fully exploited. As Eq. 5.80 shows, the scaling of the pass-transistor gate capacitance, and thus of the quiescent current, is instead determined by the scaling of the minimum channel length (Lmin,thin) in proportion to that of the voltage headroom (VDD,min − VCASC). By sharing the total voltage headroom for instance in the ratio of three-to-one, the quiescent current is reduced by a factor of 6.25 for the given design example (compared to the theoretical factor of eleven). This quiescent current advantage is further offset by the auxiliary amplifier controlling the cascode transistor. The quiescent current of the auxiliary amplifier is determined by the gain-bandwidth requirements, as revealed in Eq. 5.78, in combination with the total gate capacitance of the cascode transistor, as revealed in Eq. 5.83. By operating the common-source transistor MCS1 in deep weak inversion such that gm,CS1 = ISD,CS1 / (n·VT), the quiescent current of the auxiliary amplifier becomes directly proportional to the total gate capacitance of the thick-oxide cascode transistor. Since the gain-bandwidth requirements of the auxiliary amplifier are low compared to the main LDO feedback loop, it adds only a small quiescent current offset. The quiescent current advantage is therefore ultimately reduced to a factor of five for the given design example. For lowest quiescent current demand, a good rule-of-thumb is thereby to distribute the voltage headroom such that the total gate capacitance of the thin-oxide pass-transistor is smaller than that of the thick-oxide cascode transistor by the same factor as the small-signal voltage gain Av,CG12.
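The factors of eleven and 6.25 quoted above follow directly from Eq. 5.73 and Eq. 5.80, since the gate capacitance scales with the square of the minimum channel length and with the inverse square of the available voltage headroom. The sketch below reproduces both numbers; it assumes equal carrier mobility for both transistor types and interprets the three-to-one headroom split as three quarters of the dropout voltage being assigned to the thin-oxide pass-transistor. The function name is chosen for illustration only.

```python
L_MIN_THICK = 0.5e-6   # minimum drawn channel length, thick-oxide device (m)
L_MIN_THIN = 0.15e-6   # minimum drawn channel length, thin-oxide device (m)


def gate_cap_advantage(headroom_fraction_pass=1.0):
    """Ratio of thick-oxide to thin-oxide pass-transistor gate capacitance.

    Based on C ~ L_min^2 / (mu_p * V_headroom^2) for equal load current and
    equal carrier mobility (Eq. 5.73 and Eq. 5.80). The thick-oxide reference
    uses the full dropout voltage, the thin-oxide device only the given
    fraction of it.
    """
    return (L_MIN_THICK / L_MIN_THIN) ** 2 * headroom_fraction_pass ** 2


if __name__ == "__main__":
    print(f"full headroom:    {gate_cap_advantage(1.0):.2f}")
    print(f"3:1 headroom split: {gate_cap_advantage(0.75):.2f}")
```

The remaining step from 6.25 to the final factor of five is due to the auxiliary amplifier current and is not captured by this simple ratio.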

The rule-of-thumb for lowest quiescent current demand is however in conflict with the requirement for smallest silicon area. The total silicon area required for both the thin-oxide pass-transistor and the thick-oxide cascode transistor can be approximated as:

$$Area \cong L_{min,thin}^2 \cdot \left(\frac{W}{L}\right)_{PASS} + L_{min,thick}^2 \cdot \left(\frac{W}{L}\right)_{CASC} \tag{5.84}$$

Though the overall voltage headroom needs to be shared, the cascode pass-transistor topology provides a significant area advantage compared to the conventional pass-transistor topology using a single thick-oxide transistor operated in strong inversion. The lower the dropout voltage, the more significant the area advantage becomes - amounting to a factor of approximately two for the given design example with a dropout voltage of 300 mV. For smallest silicon area, a good rule-of-thumb is thereby to distribute the total voltage headroom evenly between the thin-oxide pass-transistor and the thick-oxide cascode transistor. It should be additionally noted that both the area advantage and the best divider ratio strongly depend on the CMOS technology used, particularly on the minimum transistor length and the gate capacitance per area of both transistor types.

In conclusion, the cascode pass-transistor topology offers substantial benefits over the conventional pass-transistor topology, both with regard to quiescent current demand and silicon area. For a given LDO specification, particularly including the dropout voltage and the maximum load current, it helps to minimize the gate capacitance of the pass-transistor and consequently to reduce the LDO quiescent current. As a positive side effect of the cascode pass-transistor topology, the impact of supply voltage variations on the LDO characteristics is strongly mitigated. As revealed in Chap. 5.5.1, supply voltage variations directly translate into a varying source-drain voltage of the pass-transistor. For the cascode pass-transistor topology, the source-drain voltage however remains basically constant over supply voltage. In this way, both the small-signal voltage gain Av,PASS and the pole frequency ωp1 become significantly less sensitive to supply voltage variations, thus narrowing the LDO design-tolerance window.

5.7 Current Sink Capability

Similar to the basic voltage follower, the cascoded flipped voltage follower (FVF) has a large current source capability while its current sink capability is limited to the bias current IDS,CM1. In order to improve the LDO transient behavior in response to negative load current steps and positive supply voltage steps, respectively, the cascoded FVF needs to be extended to class-AB operation (Razavi, 2001). This class-AB extension in fact improves not only the transient response, but also the loop stability at low load conditions. A negative load current step (and similarly a positive line transient step) causes a transient overshoot of the core voltage. In order to bring the output voltage back to its nominal value, the LDO feedback loop shuts down the pass-transistor. The pole frequency ωp1 is thus solely determined by the transconductance of the sensing transistor MCG1 and thus drastically decreases. In case the pole frequency ωp1 is (temporarily) shifted below the left-half-plane (LHP) zero ωz0, the phase shift of the first dominant pole p0 is not fully compensated. As revealed in Chap. 5.1.1, the resulting pole-pole-zero configuration in turn causes a conditional instability.

$$\omega_{p1} < \omega_{z0}$$
$$\frac{g_{m,CG1}}{C_{LOAD}} < \frac{g_{m,EA}}{C_1} \tag{5.85}$$

Fig. 5.19. Circuit diagram of the cascoded flipped voltage follower (FVF) (a) with current sink, and (b) with gain-boosted current sink.

A simple approach to improve the current sink capability of the basic FVF is presented by Jimenez et al. (2006). Thereby, a single transistor, in the following referred to as MSINK, is added to the basic FVF. This approach can also be applied to the cascoded FVF without any modifications, as depicted in Fig. 5.19(a). With the transistor MSINK being of the same type as the sensing transistor MCG1, the current sink capability of the cascoded FVF is determined by the intrinsic transistor gain gm,SINK = ISD,SINK / (n·VT) as well as the sizing ratio of the transistor MSINK and the sensing transistor MCG1. It is consequently limited to rather small values, at which a higher current sink capability unavoidably results in a higher steady-state quiescent current.

The current sink capability of the cascoded FVF can be further improved by applying gain boosting techniques. As depicted in Fig. 5.19(b), the current sink is in this case realized by a single NMOS transistor, which forms in combination with the sensing transistor MCG1 a local feedback loop. In this way, the transconductance of the current sink transistor gm,SINK is boosted by the gain Av,CG1, which in turn can be expressed as (see also Chap. 5.4, particularly Eq. 5.38):

$$A_{v,CG1} \cong \frac{g_{m,CG1}}{\left(g_{m,CG2} + g_{mb,CG2}\right)} \tag{5.86}$$

As revealed in Chap. 5.3.3, the cascoded FVF acts as a current switch. The transconductance of both the sensing transistor MCG1 and the cascode transistor MCG2 is therefore a function of the pass-transistor source-gate voltage VSG,PASS, and thus of the load current.

$$g_{m,CG1} = \frac{I_{DS,CM1} - \frac{V_{SG,PASS}}{R_1}}{n \cdot V_T}$$
$$g_{m,CG2} = \frac{1}{\lambda_{si}} \cdot \sqrt{2 \cdot \mu_p \cdot C_{OX} \cdot \left(\frac{W}{L}\right)_{CG2} \cdot \frac{V_{SG,PASS}}{R_1}} \tag{5.87}$$

At low load conditions, the major portion of the bias current IDS,CM1 flows through the sensing transistor MCG1, while the current flowing through the cascode transistor MCG2 is correspondingly low - resulting in a high gain Av,CG1 and a correspondingly high sinking current. With increasing load current, the bias current through the sensing transistor MCG1 decreases, while that through the cascode transistor MCG2 increases. Since the sensing transistor MCG1 is operated in deep weak inversion, its transconductance changes at a much faster rate than that of the cascode transistor MCG2. The sinking current in this way settles to a steady-state level at medium load conditions, which is defined by the bias voltage VBIAS2. The bias voltage is for this purpose generated with the help of two stacked diodes, where the lower one matches the transistor MSINK and the upper one matches the transistor MCG2.

Owing to the gain-boosted current sink capability, the pole p1 is prevented from moving to very low frequencies at no load conditions - in this way avoiding conditional instability. Taking the transistor MSINK into account, the pole frequency ωp1 can be expressed as:

$$\omega_{p1} \cong \frac{1}{\left(r_{ds,PASS} \,\Vert\, \frac{1}{\left(g_{m,CG1} + g_{m,SINK} \cdot A_{v,CG1}\right)}\right) \cdot C_{LOAD}} \tag{5.88}$$

In conclusion, by adding the current sink capability to the cascoded FVF, the LDO transient behavior in response to negative load current steps and positive supply voltage steps, respectively, is improved and conditional instability is avoided. Owing to the gain boosting scheme, the sinking current in steady-state is minimized.

5.8 LDO Scaling Laws and Trade-Offs

The circuit analysis of the any-load stable LDO is in the following concluded by deriving concise scaling laws from the detailed results previously obtained. This discussion comprises two aspects. First, a set of scaling laws is derived, describing the behavior of fundamental LDO performance parameters as a function of the LDO quiescent current. Second, the effect of CMOS technology scaling on the LDO design, in particular on its quiescent current, is examined. In this way, the trade-offs between competing performance parameters are illustrated and the ultimate performance capabilities of the any-load stable LDO topology are explored, particularly also across different CMOS technologies. Complementing the discussion on LDO scaling laws, and demonstrating their practical application, a universal LDO adaption strategy is presented. This strategy allows an existing LDO design to be easily adapted to a wide variety of MCU digital core designs and needs. While these scaling laws and trade-offs are subsequently identified for the any-load stable LDO in particular, the obtained results can also be applied to any other, externally compensated LDO topology. As revealed in Chap. 4.2.1, these topologies are affected by the same fundamental design trade-offs.


5.8.1 Scaling Laws for LDO Performance Parameters

As revealed in Chap. 5.1, the fundamental trade-offs between high gain, fast transient response as well as loop stability under all operating conditions are resolved for the any-load stable LDO. This LDO topology can thus be adapted for a wide range of specification requirements. For this purpose, simple dimensionless-parameter scaling laws for all critical performance parameters are established in the following, with the LDO quiescent current as free parameter for design. This in detail includes the load capacitance, the maximum load current, the dropout voltage as well as the transient voltage error. In each case, the scaling of the LDO quiescent current is determined in dependence on one of these design parameters, while the remaining design parameters are kept constant. The following investigation thereby focuses on the cascoded flipped voltage follower (FVF). The folded-cascode amplifier as well as the biasing network add an offset to the overall quiescent current demand, which is basically independent of the LDO performance specification and can be neglected in most cases.

The LDO scaling laws are drawn based upon the equivalent circuit model as introduced and derived in Chap. 5.4 (particularly see Fig. 5.13), which will be referred to in the following sections.

Load Capacitance

The LDO load capacitance is an essential specification parameter for the design of the any-load stable LDO. Depending on the compensation scheme, it can vary by more than three decades. While the load capacitance is in the range of a few nanofarads in case of on-chip integration, it can easily exceed a few microfarads in case of an off-chip component. The load capacitance directly affects the dominant pole frequency ωp1 and in this way also the LDO gain-bandwidth.

$$\omega_{p1,high} = \frac{1}{r_{ds,PASS} \cdot C_{LOAD}} \tag{5.89}$$

To maintain loop stability of the any-load stable LDO both in low load and high load conditions, the pole frequencies of both p0 and p2 need to be adapted correspondingly to the load capacitance. The first dominant pole p0 can be easily shifted by adapting the capacitance C1, without impacting the LDO quiescent current demand. While shifting the first non-dominant pole p2 has a drastic impact for the vast majority of LDO topologies, this adaption is comparatively straightforward for the any-load stable LDO by varying the resistance R1 (also see Chap. 5.3.3). Assuming both the small-signal voltage gain as well as the loop stability (here expressed by the damping factor ζ) are to be maintained, the resistor R1 must be scaled at the same rate as the LDO load capacitance CLOAD. A handy expression can be obtained for this purpose by taking the expression for the damping factor (also see Chap. 5.1.1, Eq. 5.51), and assuming that the two poles are widely spaced, such that ωp2 ≫ ωp1.

$$\omega_{p2} \cong 4 \cdot \zeta^2 \cdot A_{v,CASFVF} \cdot \omega_{p1,high}$$
$$\frac{1}{R_1 \cdot \left(C_{GS,PASS} + C_{GD,PASS}\right)} \cong 4 \cdot \zeta^2 \cdot A_{v,CASFVF} \cdot \frac{1}{r_{ds,PASS} \cdot C_{LOAD}} \tag{5.90}$$

At the same time, the cascoded FVF bias current IDS,CM1 needs to be scaled inversely proportional to the load capacitance in order to maintain the small-signal voltage gain despite the scaling of the resistor R1.

$$A_{v,CASFVF} = A_{v,CG12} \cdot A_{v,PASS} \cong \left(\frac{I_{DS,CM1} \cdot R_1 - V_{SG,PASS}}{n \cdot V_T}\right) \cdot A_{v,PASS} \tag{5.91}$$


In conclusion, and hardly surprising, the LDO quiescent current increases for smaller load capacitances (Iq ∝ 1/CLOAD). Considered abstractly from a time-domain perspective, the passive charge reservoir is reduced in case of a smaller load capacitance, which in turn demands a higher gain-bandwidth of the active LDO feedback loop. To keep the operating point of the transistors constant at different biasing levels (namely including the common-gate transistors MCG1 and MCG2), their aspect ratio needs to be scaled accordingly. The effect of varying transistor dimensions, particularly with regard to the parasitic capacitance, is however neglected here for simplicity. In addition, it is worth pointing out that any parasitic series impedance of the load capacitance, in particular due to the bond wire inductance, is neglected for these considerations. While this is an appropriate assumption in case of an on-chip integrated load capacitance, it is not valid for an off-chip component. In this case, and as discussed in Chap. 4.3.2, the bond wire inductance significantly affects the output impedance seen by the LDO. In contrast to the scaling law stated above, the quiescent current of an any-load stable LDO design compensated by an on-chip integrated load capacitance of 5 nF thus does not necessarily increase by a factor of one hundred compared to a design utilizing an off-chip load capacitance of 500 nF.
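The chain of reasoning behind Iq ∝ 1/CLOAD can be traced numerically: Eq. 5.90 fixes R1 for a given load capacitance, and Eq. 5.91 then fixes the bias current IDS,CM1 needed to keep the gain Av,CG12. The sketch below follows these two steps; the function names and all numerical values (damping factor, gains, output resistance, capacitances, slope factor) are hypothetical and chosen only to show the trend, not taken from the design example.

```python
def r1_for_stability(zeta, av_cg12, av_pass, r_ds_pass, c_load, c_gate_pass):
    """Resistor R1 that places p2 according to Eq. 5.90."""
    av_casfvf = av_cg12 * av_pass
    return r_ds_pass * c_load / (4.0 * zeta ** 2 * av_casfvf * c_gate_pass)


def fvf_bias_current(av_cg12, r1, v_sg_pass, n=1.3, v_t=0.0259):
    """Bias current IDS,CM1 from Eq. 5.91, solved for a target gain Av,CG12."""
    return (av_cg12 * n * v_t + v_sg_pass) / r1


def bias_current_for_load(c_load):
    """Bias current for a given load capacitance (hypothetical design values)."""
    r1 = r1_for_stability(zeta=0.7, av_cg12=10.0, av_pass=5.0,
                          r_ds_pass=1e3, c_load=c_load, c_gate_pass=1e-12)
    return fvf_bias_current(av_cg12=10.0, r1=r1, v_sg_pass=0.8)


if __name__ == "__main__":
    for c_load in (10e-9, 5e-9, 2.5e-9):
        i_bias = bias_current_for_load(c_load)
        print(f"CLOAD = {c_load * 1e9:4.1f} nF -> IDS,CM1 = {i_bias * 1e6:.2f} uA")
```

Each halving of the load capacitance halves R1 and therefore doubles the bias current, as long as the simplifications named in the text (no bond wire inductance, constant operating points) hold.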

Maximum Load Current

The ability to source high load current while achieving low dropout voltage requires the use of a large-size PMOS pass-transistor. Its aspect ratio, and in turn also its gate capacitance, are directly proportional to the maximum load current (also see Chap. 5.6.1; Eq. 5.72).

$$\left(C_{GS,PASS} + C_{GD,PASS}\right) \propto \left(\frac{W}{L}\right)_{PASS} \tag{5.92}$$

$$\propto \left(\frac{2}{\mu_p \cdot C_{OX}} \cdot \frac{I_{LOAD,max}}{V_{DROP}^2}\right) \tag{5.93}$$

In addition, the pole frequency ωp1 is also proportional to the maximum load current (also see Chap. 5.4.1; Eq. 5.28).

$$\omega_{p1,high} = \frac{1}{r_{ds,PASS} \cdot C_{LOAD}} \cong \frac{\lambda_{si} \cdot I_{LOAD,max}}{C_{LOAD}} \tag{5.94}$$

Combining these two boundary conditions, and assuming both the small-signal voltage gain as well as the phase margin are to be maintained, the adaption of the resistor R1 can be determined as:

$$\omega_{p2} = 4 \cdot \zeta^2 \cdot A_{v,CASFVF} \cdot \omega_{p1,high}$$
$$\frac{1}{R_1 \cdot \left(C_{GS,PASS} + C_{GD,PASS}\right)} = 4 \cdot \zeta^2 \cdot A_{v,CASFVF} \cdot \frac{\lambda_{si} \cdot I_{LOAD,max}}{C_{LOAD}} \tag{5.95}$$

The resistor R1 needs to be adapted with the maximum load current to the inverse power of two.Noteworthy, the pass-transistor small-signal voltage gain Av,PASS does not change if the aspectratio is scaled at the same rate as the maximum load current. Again, to maintain the small-signalvoltage gain despite of scaling the resistor R1, the cascoded FVF bias current IDS,CM1 needs to bescaled accordingly.


$$A_{v,CASFVF} \cong \left(\frac{I_{DS,CM1} \cdot R_1 - V_{SG,PASS}}{n \cdot V_T}\right) \cdot A_{v,PASS} \tag{5.96}$$


In conclusion, the LDO quiescent current scales with the maximum load current to the second power (Iq ∝ ILOAD,max²). This relation holds true except for very low levels of maximum load current. In such cases, the fixed current offset of the folded-cascode amplifier and of the biasing network can no longer be neglected. It is worth pointing out that the LDO quiescent current scaling reduces to a linear relationship when the considerations on the minimum required load capacitance are applied: as indicated in Chap. 4.3.2, the load capacitance is proportional to the maximum load current.
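The two scaling laws stated so far can be combined into a short numerical illustration. The following is a minimal sketch, assuming that the laws Iq ∝ 1/CLOAD and Iq ∝ ILOAD,max² may simply be multiplied; the function name and the example ratios are hypothetical and serve illustration only.

```python
def quiescent_current_scale(i_load_ratio, c_load_ratio=1.0):
    """Relative change of the LDO quiescent current.

    Assumes the combined scaling law Iq ~ I_LOAD,max**2 / C_LOAD, i.e. the
    product of the two individual laws (an assumption of this sketch).
    Both arguments are ratios relative to a reference design.
    """
    return i_load_ratio ** 2 / c_load_ratio


# Fixed load capacitance: doubling the maximum load current
# quadruples the quiescent current (quadratic law).
fixed_cap = quiescent_current_scale(2.0)

# Load capacitance scaled together with the maximum load current:
# the quiescent current only doubles (linear law).
scaled_cap = quiescent_current_scale(2.0, c_load_ratio=2.0)

print(fixed_cap, scaled_cap)
```

The second call reflects the case in which the load capacitance grows proportionally with the maximum load current, which reduces the quadratic dependency to a linear one.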

Dropout Voltage

The pass-transistor size is determined not only by the maximum load current, but also by the LDO dropout voltage, which is the difference between the minimum supply voltage and the output voltage. To achieve a good power supply rejection, the pass-transistor is designed to operate in the saturation region under all operating conditions. Assuming furthermore that the LDO dimensions are not limited by the error amplifier output voltage range, the required pass-transistor dimensions can be expressed as (also see Chap. 5.6.1; Eq. 5.72):


$$\left(\frac{W}{L}\right)_{PASS} \cong \left(\frac{2}{\mu_p \cdot C_{OX}} \cdot \frac{I_{LOAD,max}}{V_{DROP}^{2}}\right) \tag{5.97}$$

As evident from this expression, the pass-transistor size and consequently its parasitic gate capacitance are inversely proportional to the square of the dropout voltage. The total gate capacitance (CGS,PASS + CGD,PASS), and in turn also the first non-dominant pole p2, are directly related to the pass-transistor aspect ratio.

$$\omega_{p2} = \frac{1}{R_1 \cdot (C_{GS,PASS} + C_{GD,PASS})} \tag{5.98}$$
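The sizing relation for the pass-transistor can be evaluated numerically as follows. This is a minimal sketch of the square-law expression (W/L) ≅ 2·ILOAD,max / (µp·COX·VDROP²); the process parameter µp·COX = 50 µA/V² and the operating points are assumed values chosen for illustration, not data of the implemented design.

```python
def pass_aspect_ratio(i_load_max, v_drop, mu_p_cox):
    """Required pass-transistor aspect ratio W/L in saturation.

    i_load_max : maximum load current in A
    v_drop     : dropout voltage in V
    mu_p_cox   : process transconductance parameter in A/V^2 (assumed value)
    """
    return 2.0 * i_load_max / (mu_p_cox * v_drop ** 2)


MU_P_COX = 50e-6  # A/V^2, hypothetical PMOS process parameter

wl_200mv = pass_aspect_ratio(10e-3, 0.2, MU_P_COX)
wl_100mv = pass_aspect_ratio(10e-3, 0.1, MU_P_COX)

# Halving the dropout voltage quadruples the required aspect ratio,
# and with it the gate capacitance seen by the error amplifier.
print(wl_200mv, wl_100mv, wl_100mv / wl_200mv)
```

For a fixed resistor R1, the quadrupled gate capacitance would lower the pole frequency ωp2 by the same factor, which is why R1 has to be adapted.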

Since the second dominant pole p1 is not (significantly) affected by the dropout voltage, the resistor R1 needs to be adapted such that the first non-dominant pole p2 remains constant in order to maintain loop stability. To also maintain the small-signal voltage gain, the quiescent current determined by the current mirror MCM1 needs to be adapted accordingly, analogous to the above sections.


$$A_{v,CASFVF} \cong \left(\frac{I_{DS,CM1} \cdot R_1 - V_{SG,PASS}}{n \cdot V_T}\right) \cdot A_{v,PASS} \tag{5.99}$$

In conclusion, the quiescent current of the LDO voltage follower scales with the dropout voltage to the second power (Iq ∝ VDROP²). At the same time, however, a larger pass-transistor aspect ratio results in a higher transconductance, and thus in turn provides a smaller voltage error in response to fast load transient steps.

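$$A_{v,CASFVF} = A_{v,CG12} \cdot A_{v,PASS}$$

$$A_{v,CASFVF} \cong A_{v,CG12} \cdot \frac{1}{\lambda_{si}} \cdot \sqrt{\frac{2 \cdot \mu_p \cdot C_{OX} \cdot \left(\frac{W}{L}\right)_{PASS}}{I_{LOAD,max}}} \tag{5.100}$$

$$A_{v,CASFVF} \cong A_{v,CG12} \cdot \frac{2}{\lambda_{si}} \cdot \sqrt{\frac{I_{LOAD,max}}{V_{DROP}^{2}} \cdot \frac{1}{I_{LOAD,max}}} \tag{5.101}$$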


As evident from the above equation, the small-signal voltage gain depends linearly on the LDO dropout voltage when neglecting the channel length modulation as a first-order approximation. To nevertheless maintain the small-signal voltage gain (and consequently the LDO transient behavior), the pass-transistor gain variation needs to be compensated accordingly by adapting the gain Av,CG12. In this way, the dependency of the quiescent current on the LDO dropout voltage is effectively attenuated to lower than quadratic. A detailed discussion of the transient voltage error and the related quiescent current scaling is deferred to the following section.

Transient Voltage Error

The transient voltage error of the any-load stable LDO in response to fast load transient conditions is determined by the small-signal voltage gain and bandwidth of the cascoded FVF: the higher its gain, the smaller the transient error. In case of an overdamped or critically damped system response (corresponding to a damping factor ζ ≥ 1), the transient voltage error in response to fast load transient steps can be most effectively reduced by increasing the small-signal voltage gain of the cascoded FVF. By increasing the bias current IDS,CM1, the transconductance gm,CG12 is increased, and consequently the transient voltage error is reduced. The relationship is directly linear at high bias current levels (i.e. assuming VSG,PASS ≪ IDS,CM1 · R1), while it attenuates at lower bias current levels.

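$$\Delta V_{CORE} \cong -\frac{1}{g_{m,CG12} \cdot R_1} \cdot \sqrt{\frac{\Delta I_{LOAD}}{2 \cdot \mu_p \cdot C_{OX} \cdot \left(\frac{W}{L}\right)_{PASS}}}$$

$$\Delta V_{CORE} \cong -\frac{n \cdot V_T}{I_{DS,CM1} \cdot R_1 - V_{SG,PASS}} \cdot \sqrt{\frac{\Delta I_{LOAD}}{2 \cdot \mu_p \cdot C_{OX} \cdot \left(\frac{W}{L}\right)_{PASS}}} \tag{5.102}$$

$$I_q \propto \frac{1}{\Delta V_{CORE}} \tag{5.103}$$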

With increasing transconductance gm,CG12, however, the gain-bandwidth also increases, and consequently loop stability is sacrificed. To maintain loop stability (here expressed by the damping factor ζ), the first non-dominant pole p2 needs to be pushed to higher frequencies by reducing the resistance R1 with the square root of the gain increase.

$$\zeta = \frac{1}{2} \cdot \sqrt{\frac{C_{LOAD}}{g_{m,CG1} \cdot g_{m,PASS} \cdot R_1^{2} \cdot (C_{GS,PASS} + C_{GD,PASS})}} \tag{5.104}$$
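The interplay between the transconductance and the resistor R1 in the damping factor can be checked numerically. The following is a minimal sketch of the damping factor expression ζ = ½·√(CLOAD / (gm,CG1 · gm,PASS · R1² · (CGS,PASS + CGD,PASS))); all component values are hypothetical and only chosen to give a plausible order of magnitude.

```python
import math


def damping_factor(c_load, gm_cg1, gm_pass, r1, c_gate):
    """Damping factor of the cascoded FVF loop.

    c_gate is the total pass-transistor gate capacitance
    (C_GS,PASS + C_GD,PASS). All values in SI units.
    """
    return 0.5 * math.sqrt(c_load / (gm_cg1 * gm_pass * r1 ** 2 * c_gate))


# Hypothetical reference design (assumed values, not from the source)
C_LOAD = 5e-9     # F
GM_CG1 = 10e-6    # S
GM_PASS = 50e-3   # S
R1 = 100e3        # Ohm
C_GATE = 2e-12    # F

zeta_ref = damping_factor(C_LOAD, GM_CG1, GM_PASS, R1, C_GATE)

# Doubling the transconductance alone reduces the damping factor ...
zeta_more_gm = damping_factor(C_LOAD, 2 * GM_CG1, GM_PASS, R1, C_GATE)

# ... whereas reducing R1 by sqrt(2) at the same time restores it.
zeta_scaled = damping_factor(C_LOAD, 2 * GM_CG1, GM_PASS, R1 / math.sqrt(2), C_GATE)

print(zeta_ref, zeta_more_gm, zeta_scaled)
```

Because R1 enters the expression quadratically, a gain increase by a factor k is compensated by reducing R1 by √k.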

In conclusion, the LDO quiescent current scales with the reciprocal of the square root of the transient voltage error (Iq ∝ 1/√∆VCORE) while loop stability is maintained. When trading off a higher transient voltage error for a lower quiescent current, it should be noted that decreasing the gain of the cascoded FVF makes its performance more susceptible to process, voltage and temperature variations, as revealed in Chap. 5.5. If the sensing transistor MCG1 runs out of current (such that IDS,CM1 · R1 ≤ VSG,PASS), the small-signal voltage gain Av,CG12 even collapses completely. An efficient way to reduce the small-signal voltage gain without decreasing the bias current of the sensing transistor MCG1 is to replace the current source MCM1 by a resistor R2, such that the bias current is defined as IDS,CM1 = (VBIAS2 − VGS,CG2) / R2. In this case, the small-signal voltage gain Av,CG12 can be approximated as:


$$A_{v,CG12} \cong g_{m,CG1} \cdot (R_1 \| R_2) \tag{5.105}$$

In case of an underdamped system response (corresponding to a damping factor ζ < 1), which is more prevalent in practice due to its faster response time and relaxed quiescent current penalty, the output voltage deviation is determined not only by the small-signal voltage gain, but also by the dynamic oscillation. This in turn opens up another degree of freedom for optimizing the transient voltage error in response to fast load transient steps.

Another interesting aspect arises from the self-biasing of the cascoded FVF and is not reflected by the simplified equivalent circuit model. The pass-transistor drain current is defined not only by the load current, but also by the bias current I1 of the sensing transistor MCG1. This in turn prevents the pass-transistor from entering deep weak inversion at zero load. Taking this effect into account, the worst-case transient voltage error in response to a full-scale load current step is proportional to $\left(\frac{I_{LOAD,max} + I_{1,min}}{I_{LOAD,min} + I_{1,max}}\right)$. In this way, the transient voltage error decreases at higher LDO quiescent current levels even though the small-signal voltage gain remains constant.

Summary

The simple dimensionless-parameter scaling laws for all essential LDO performance parameters enable an early-stage trade-off analysis for system partitioning and design. These parameters are the maximum load current ILOAD,max, the load capacitance CLOAD, the dropout voltage VDROP, and the transient voltage error ∆VCORE in response to a full-scale load current step. In each case, the LDO quiescent current serves as the degree of freedom for the design. The LDO scaling laws, and thus the LDO quiescent current demand, can be traced back to the fundamental design requirements of the error amplifier - particularly the gain and bandwidth required to drive the LDO pass-transistor and its parasitic gate capacitance.

In summary, the individual LDO scaling laws can be combined into a universal figure-of-merit (FOM), enabling easy benchmarking of different LDO topologies and designs. For this purpose, the inverse LDO current gain (Iq / ILOAD,max) is multiplied by the LDO response time (CLOAD · ∆VCORE / ILOAD,max) - basically describing the inverse of the gain-bandwidth product for a certain LDO topology and design. A smaller figure-of-merit indicates a superior LDO performance.

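$$FOM\,[\mathrm{s}] = \frac{C_{LOAD} \cdot \Delta V_{CORE}}{I_{LOAD,max}} \cdot \frac{I_q}{I_{LOAD,max}} = \frac{C_{LOAD} \cdot \Delta V_{CORE} \cdot I_q}{I_{LOAD,max}^{2}} \tag{5.106}$$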

This figure-of-merit takes into account all essential LDO performance parameters in such a way that normal design trade-offs are factored out. For instance, the current gain can be traded off for a faster response by resizing a given LDO topology without affecting the figure-of-merit. The same applies when adapting the LDO load capacitance, e.g. doubling the LDO load capacitance while halving the LDO quiescent current. On the other hand, if the figure-of-merit of a certain LDO implementation is far from the required figure-of-merit, there is little chance that the topology will meet the requirements after resizing. Notably, this figure-of-merit is consistent with the one proposed by Hazucha et al. (2005) and introduced in Chap. 4.2. To the best knowledge of the author, this is the first analytical derivation and validation of this generally accepted LDO figure-of-merit.
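The invariance of the figure-of-merit under resizing can be illustrated with a few lines of code. This is a minimal sketch evaluating FOM = CLOAD · ∆VCORE · Iq / ILOAD,max²; the numerical operating point (5 nF, 20 mV, 10 µA, 10 mA) is an assumed example and not a measured result.

```python
def ldo_fom(c_load, delta_v, i_q, i_load_max):
    """LDO figure-of-merit in seconds (smaller is better).

    c_load     : load capacitance in F
    delta_v    : transient voltage error in V
    i_q        : quiescent current in A
    i_load_max : maximum load current in A
    """
    return c_load * delta_v * i_q / i_load_max ** 2


# Hypothetical reference design
fom_ref = ldo_fom(c_load=5e-9, delta_v=20e-3, i_q=10e-6, i_load_max=10e-3)

# Resized design: load capacitance doubled, quiescent current halved.
fom_resized = ldo_fom(c_load=10e-9, delta_v=20e-3, i_q=5e-6, i_load_max=10e-3)

# Both designs yield the same figure-of-merit, i.e. the resizing
# trade-off is factored out.
print(fom_ref, fom_resized)
```

Two designs with clearly different figures-of-merit, in contrast, differ in topology efficiency rather than in sizing alone.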

Excluded from the discussion so far, but nevertheless noteworthy: the loop stability, expressed by the phase margin, can be considered as a further degree of freedom for LDO quiescent current optimization. While traditional circuit design typically targets a phase margin of 60° (Razavi, 2001, p. 354), the phase margin for the any-load stable LDO is here frequently reduced to about 30° to 40° in favor of an accordingly reduced quiescent current demand. Ultimately, this leads to the (almost philosophical) question of what minimum phase margin is required to guarantee loop stability under all operating conditions.

5.8.2 CMOS Technology Scaling

To enable easy benchmarking of LDO topologies and designs across different CMOS technologies, the LDO scaling laws derived above are extended to also include CMOS technology scaling trends. While the evolution in CMOS technology is motivated and driven by a decreasing price-performance ratio for the digital circuits, the needs of the integrated analog and mixed-signal circuits must likewise be considered in a highly-integrated ultra-low-power MCU system (also see Chap. 2.3.2). The subsequent considerations on CMOS technology scaling are based on constant field scaling theory, in which both the operating voltage and the minimum transistor dimensions are scaled at the same rate in order to keep the electric field constant (Tsividis, 2004).

In the context of LDO design, the most vital aspect for technology scaling is the pass-transistor. The required pass-transistor dimensions, and consequently the gate capacitance presented to the error amplifier, are directly determined by the performance specifications and the CMOS technology used. However, since the pass-transistor in a conventional pass-transistor topology is required to be a thick-oxide device with a comparably long minimum channel length and a high voltage tolerance, there are essentially no scaling benefits for the pass-transistor. This picture changes in case of the cascode pass-transistor topology. By replacing the thick-oxide pass-transistor with a thin-oxide device, it enables technology scaling for LDOs. As revealed in Chap. 5.6.2, the pass-transistor gate capacitance, and in turn the LDO quiescent current, decreases with the minimum channel length to the second power - corresponding to a factor of two with each technology node.

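$$(C_{GS,PASS} + C_{GD,PASS}) \cong 0.67 \cdot C_{OX} \cdot L_{min}^{2} \cdot \left(\frac{W}{L}\right)_{PASS}$$

$$(C_{GS,PASS} + C_{GD,PASS}) \cong 0.67 \cdot L_{min}^{2} \cdot \frac{2}{\mu_p} \cdot \frac{I_{LOAD,max}}{V_{DROP}^{2}} \;\Rightarrow\; I_q \propto L_{min}^{2} \tag{5.107}$$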

Based on the above estimation, the LDO quiescent current is predicted to decrease by a factor of eleven for the example implementation using a 0.13 µm CMOS technology, in which a thick-oxide device with a drawn channel length of 0.5 µm is replaced by a thin-oxide device with a drawn channel length of 0.15 µm. This quadratic relationship is still maintained when taking into account the sizing trade-off associated with the additional thick-oxide cascode transistor, i.e. when sharing the total voltage headroom between the thin-oxide pass-transistor and the thick-oxide cascode transistor. The LDO quiescent current saving is, however, partially offset by the local feedback loop required for controlling the cascode transistor. Since the cascode transistor needs to withstand the full supply voltage, and thus does not (significantly) benefit from CMOS technology scaling, the bias current for the auxiliary amplifier remains constant to a first approximation. The advantage in total LDO quiescent current thus starts to saturate for deeply scaled CMOS technologies.
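The predicted factor of eleven follows directly from the quadratic dependency on the channel length. The following is a minimal sketch of this arithmetic, assuming Iq ∝ Lmin² and ignoring the constant bias current of the auxiliary amplifier; the function name is hypothetical.

```python
def iq_reduction_factor(l_thick_oxide, l_thin_oxide):
    """Predicted quiescent current reduction when the pass-transistor
    channel length is reduced, assuming Iq ~ L_min**2."""
    return (l_thick_oxide / l_thin_oxide) ** 2


# Drawn channel lengths of the example implementation (0.13 um CMOS)
factor = iq_reduction_factor(0.5e-6, 0.15e-6)

print(round(factor, 1))  # approximately 11.1
```

The auxiliary amplifier for the cascode transistor adds a bias current that does not scale, so the total saving in practice remains below this idealized value.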

To take into account the technology scaling trends for LDOs, Hazucha et al. (2005) propose a dimensionless figure-of-merit (FOM), in which the figure-of-merit introduced above is divided by the process-specific gate delay tG. The gate delay tG, also referred to as the FO4 metric, is widely adopted in digital circuit design and describes the delay of an inverter that is driven by an inverter four times smaller than itself and drives an inverter four times larger than itself (Harris et al., 1997).

$$FOM_2 = \frac{FOM}{t_G} = \frac{C_{LOAD} \cdot \Delta V_{OUT} \cdot I_q}{t_G \cdot I_{LOAD,max}^{2}} \tag{5.108}$$

Assuming constant field scaling, such that the supply voltage is reduced at the same rate as the transistor dimensions, the process-specific gate delay tG scales linearly with the minimum channel length (Harris et al., 1997). This proposed figure-of-merit, when considered superficially, appears to be in conflict with the above conclusion for technology scaling of LDOs, which predicts a scaling with the minimum channel length to the second power. By taking the gate delay tG as the metric for technology scaling of LDOs, it is implicitly assumed that the pass-transistor operates in the triode region at minimum dropout voltage and maximum load current. However, as revealed in Chap. 5.5.1, the pass-transistor must be operated in the saturation region in order to mitigate the impact of supply voltage variations on the LDO characteristics.

Besides the pass-transistor, the error amplifier design also needs to be considered for CMOS technology scaling. As CMOS technology advances into the deep submicron regime, a number of challenges must be addressed for the design of analog circuits in general, and the LDO error amplifier in particular. The most prominent design challenge is the reduction of supply voltages - resulting in reduced voltage headroom and reduced signal swing (Annema et al., 2005). For the LDO, however, the supply voltage is defined by the battery technology and is thus independent of the CMOS technology used. The error amplifier is instead operated at a supply voltage significantly higher than the nominal voltage specified by the CMOS technology. A key parameter of the LDO error amplifier is its low-frequency gain, which depends on the intrinsic transistor gain (gm/gds). While technology scaling increases the transconductance gm, this is more than offset by a stronger increase of the output conductance gds - resulting in a degraded transistor gain (Annema et al., 2005). To nevertheless achieve a high voltage gain, multi-stage error amplifier topologies have been widely proposed (see Chap. 4.2.2). Since each gain stage introduces a pole into the overall frequency response, these topologies must employ advanced Miller compensation schemes, and are thus critical with respect to loop stability (Guo and Leung, 2010; also see Chap. 4.2.2). Owing to the high supply voltage of the LDO error amplifier, the degraded transistor gain can alternatively be recovered by utilizing gain-boosting techniques. The principle of these techniques is to add a local feedback loop to the cascode device, resulting in a higher transistor output impedance and thus higher amplifier gain without introducing additional poles to the overall frequency response. The effective output impedance attainable via gain-boosting techniques is ultimately limited by gate leakage.

An obvious advantage of migrating to more advanced CMOS technologies is the possibility of selectively applying thin-oxide and thick-oxide transistors depending on their specific characteristics. An illustrative example is the differential input transistors of the folded-cascode amplifier, which define the overall LDO offset error. Since the transistor matching per unit area improves with scaled-down gate-oxide thickness (Pelgrom et al., 1989; Brederlow et al., 2001), these transistors are preferably thin-oxide devices to reduce the LDO offset error. The differential input transistors are protected by thick-oxide devices in order to withstand a supply voltage significantly higher than the nominal voltage specified by the CMOS technology. As for most analog circuits with ultra-low bias currents, the cascoded FVF greatly benefits from the availability of high-sheet resistors with low parasitic substrate capacitance. At the same time, however, deeply-scaled CMOS technologies primarily target digital applications and frequently lack high-quality passive components such as resistors and capacitors. These components require extra mask layers, and thus add cost.

5.8.3 Universal LDO Design and Adaption Strategy

To enable a simple and quick reuse of the power management system, the ultimate aim is to develop a universal LDO design which can be easily adapted to a wide variety of MCU digital core designs and needs. The intrinsic on-chip decoupling capacitance, but also the maximum current demand and the maximum clock frequency, can vary significantly for different implementations of the MCU digital core.

In case of a large (external) load capacitance, a universal LDO design can be established without severe limitations. The LDO needs to be laid out for the maximum current demand ever to be expected from any MCU digital core. Although the maximum current drive capability is not exploited by a smaller MCU digital core (in terms of its functionality as well as its number of digital logic gates), the LDO quiescent current overhead is negligible since the design trade-offs in case of a large load capacitance are generally more relaxed. Since the external load capacitance clearly dominates the overall LDO load capacitance, the on-chip decoupling capacitance is only of minor importance for the LDO design.

In contrast, the LDO design shows a strong interaction with the MCU digital core in case of a small integrated load capacitance (as revealed in Chap. 4.3). In this case, the on-chip decoupling capacitance of the MCU digital core is the sole LDO load capacitance. Additional on-chip capacitance is extremely area-consuming and costly, and should thus be avoided. Designing the LDO simultaneously for the maximum current demand ever to be expected and for the smallest on-chip decoupling capacitance ever to be expected widens the LDO design window dramatically and cannot be afforded with respect to the LDO quiescent current demand. A simple but effective adaption strategy can instead be conceived by exploiting the correlation between the on-chip decoupling capacitance and the maximum load current demand. A small MCU digital core, for instance, offers only a small decoupling capacitance, but at the same time also shows a reduced load current demand. In accordance with the considerations for the minimum load capacitance required (see Chap. 4.3.2), a linear relationship is assumed for the following considerations.

To obtain a universal design, the LDO is initially laid out for the highest expected load current demand and the accordingly required on-chip decoupling capacitance. In case of a smaller MCU digital core with a lower load current demand, only the pass-transistor size needs to be adapted to the actual maximum load current demand. For this purpose, the pass-transistor gate node remains connected to the error amplifier output, but its drain and source nodes are (partially) shorted. By adapting the pass-transistor size, the overall small-signal voltage gain is maintained.

$$A_{v,CASFVF} = A_{v,CG12} \cdot \frac{1}{\lambda_{si}} \cdot \sqrt{\frac{2 \cdot \mu_p \cdot C_{OX} \cdot \left(\frac{W}{L}\right)_{PASS}}{I_{LOAD,max}}} \tag{5.109}$$

For this reason, the load capacitance can be reduced linearly with the maximum current drive capability without sacrificing loop stability at high load conditions. Loop stability at low load conditions is not of concern, since the minimum pole frequency increases when adapting the universal LDO design for a smaller MCU digital core (the transconductance gm,CG1 is maintained, while the load capacitance CLOAD is reduced).

$$\omega_{p1,high} = \frac{\lambda_{si} \cdot I_{LOAD,max}}{C_{LOAD}} \tag{5.110}$$


[Figure 5.20: simulated VCORE transient waveforms (VCORE: 20 mV/div; time: 2 µs/div). Panel (a): LDO design adapted to high current drive capability with full pass-transistor width activated and a load capacitance of 5 nF; load step from 5 mA to 10 mA and back. Panel (b): LDO design adapted to reduced current drive capability with half of the pass-transistor width activated and a load capacitance of 2.5 nF; load step from 2.5 mA to 5 mA and back.]

Fig. 5.20. Transient behavior of two variants of the universal LDO implementation in response to a load transient step. While (a) one variant is adapted for a maximum load current of 10 mA and a load capacitance of 5 nF, (b) the other variant is adapted for a maximum load current of 5 mA and a load capacitance of 2.5 nF.

With the pass-transistor gate node remaining connected to the error amplifier output, the total pass-transistor gate capacitance, and consequently also the frequency of the first non-dominant pole ωp2, are maintained.

$$\omega_{p2} = \frac{1}{R_1 \cdot (C_{GS,PASS} + C_{GD,PASS})} \tag{5.111}$$

Without adapting the pass-transistor size to the maximum current drive capability, its small-signal voltage gain increases and the gain-bandwidth does not decrease at the same rate as the dominant pole frequency ωp1. This ultimately prohibits a reduction of the on-chip decoupling capacitance by the same factor as the maximum load current without causing stability issues.

The effectiveness of the universal adaption strategy is demonstrated by the simulation results depicted in Fig. 5.20. Here, the transient behavior in response to a full-scale load transient step is compared for two versions of the universal LDO design. While one version is designed for a maximum load current of 10 mA and a load capacitance of 5 nF, the other version is designed for a maximum load current of 5 mA and a load capacitance of 2.5 nF. By exploiting the universal adaption strategy, the LDO performance, including the transient response as well as the loop stability, is evidently maintained for different combinations of maximum load current and load capacitance. In conclusion, this adaption strategy allows an existing LDO design to be adapted to various MCU digital core designs with minimum design effort. The LDO can be designed for a fixed load capacitance (instead of a wide load capacitance range), which tightens the LDO design window. It should not be concealed, however, that the LDO quiescent current of the universal LDO design is identical for all versions, while it could be reduced linearly with the maximum current drive capability in case of a customized design approach (as revealed in Chap. 5.8.1). Abstractly speaking, the LDO design can be either (1) quickly adapted by exploiting the universal LDO adaption strategy, resulting in a non-optimum LDO quiescent current, or (2) customized to the particular specification, requiring more design time and effort, but benefiting from a truly optimized LDO quiescent current.
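The reason why both versions behave alike can be illustrated with the pole frequency ωp1 = λsi · ILOAD,max / CLOAD. The following is a minimal sketch comparing the two design variants; the channel length modulation parameter λsi = 0.1 1/V is an assumed value, since it is not stated here.

```python
def pole_frequency_p1(lambda_si, i_load_max, c_load):
    """Pole frequency omega_p1 at maximum load current in rad/s."""
    return lambda_si * i_load_max / c_load


LAMBDA_SI = 0.1  # 1/V, hypothetical channel length modulation parameter

# Variant (a): 10 mA maximum load current, 5 nF load capacitance
omega_a = pole_frequency_p1(LAMBDA_SI, 10e-3, 5e-9)

# Variant (b): 5 mA maximum load current, 2.5 nF load capacitance
omega_b = pole_frequency_p1(LAMBDA_SI, 5e-3, 2.5e-9)

# Scaling load current and load capacitance by the same factor
# leaves the pole frequency unchanged.
print(omega_a, omega_b)
```

Since the gate capacitance, and hence ωp2, is also maintained by keeping the full gate connected, the pole spacing and thus the loop stability are the same for both variants.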

5.9 Summary

The any-load stable LDO addresses the static and the dynamic regulation separately by decomposing the control tasks into constituent feedback loops. A slow stage, realized by a folded-cascode amplifier, provides high voltage gain to the system, while a fast stage, realized by a cascoded flipped voltage follower, provides a fast feedback path to the system. In this way, the fundamental trade-offs between high voltage gain, a large gain-bandwidth, and loop stability under all operating conditions are resolved. The any-load stable LDO topology can be easily adapted to a wide range of application requirements, defined by the load capacitance, the maximum load current, the dropout voltage, and the transient voltage error. In each case, the quiescent current serves as the free parameter for the design. As confirmed and quantified by a detailed circuit analysis, the LDO design trade-offs can be summarized in a concise figure-of-merit - consistent with the one proposed by Hazucha et al. (2005) and introduced in Chap. 4.2. To the best knowledge of the author, this is the first analytical derivation and validation of this generally accepted figure-of-merit for LDOs.

$$FOM = \frac{C_{LOAD} \cdot \Delta V_{OUT} \cdot I_q}{I_{LOAD,max}^{2}} \tag{5.112}$$

The LDO figure-of-merit, and thus the LDO quiescent current demand, can be traced back to the fundamental design requirements of the error amplifier - particularly the gain and bandwidth required to drive the LDO pass-transistor and its parasitic gate capacitance. The pass-transistor forms the backbone of any LDO topology and thus greatly affects its performance. As the pass-transistor in a conventional pass-transistor topology is required to be a thick-oxide device with a comparably long minimum channel length and a high voltage tolerance, it does not (considerably) benefit from CMOS technology scaling. This picture changes for an alternative, cascoded pass-transistor topology. By combining a low-voltage thin-oxide pass-transistor with a high-voltage thick-oxide protection device, this topology enables technology scaling for LDOs. The pass-transistor gate capacitance, and in turn the LDO quiescent current, decreases with the minimum channel length to the second power - corresponding to a factor of two with each technology node. This (theoretical) LDO quiescent current saving is partially offset by the local feedback loop required for driving the cascode pass-transistor.

The majority of the previously presented fully-integrated LDO topologies, which use some form of Miller compensation to establish an internal dominant pole, suffer from stability issues at low load conditions (also see Chap. 4.2.2). In contrast, the any-load stable LDO can be easily adapted to operate with a small integrated load capacitance. As is evident from the above figure-of-merit, the absence of a large external capacitance presents several design challenges for stability and transient behavior of the LDO feedback loop, resulting in an increased quiescent current demand. While a small LDO load capacitance enables an energy-efficient wake-up from sleep mode with ultra-low-power consumption, the LDO quiescent current necessarily increases when the LDO is active. The actual “sweet spot” for system energy consumption strongly depends on the wake-up frequency and is thus application related. From an MCU system energy consumption perspective, there is thus a fundamental trade-off between minimizing the LDO load capacitance in order to achieve a fast and energy-efficient wake-up from sleep mode on the one hand, and the increased LDO quiescent current that results from minimizing the load capacitance on the other hand. Depending on the application requirements, i.e. particularly on whether the MCU system frequently wakes up from sleep mode or remains constantly in active mode operating at low clock frequencies, either the one or the other might be preferred. Motivated by this fundamental trade-off, digital-enhancement techniques for LDOs are introduced in the subsequent chapter.

6 Digitally-Enhanced LDO Voltage Regulators

The design of low-dropout voltage regulators (LDO) is subject to fundamental design trade-offs, as identified and quantified in the preceding chapter. The most significant trade-off exists between a high voltage gain for an accurate final value, a large gain-bandwidth for a fast transient response, and loop stability under all operating conditions. Achieving a given performance specification results in a certain, unavoidable LDO quiescent current demand. Any LDO design can only approach the theoretical limits expressed by the basic LDO figure-of-merit. Removing the large (external) capacitance clearly exacerbates the design trade-offs by demanding a higher gain-bandwidth, which in turn significantly increases the LDO quiescent current demand. From a system energy perspective, in contrast, the LDO should preferably combine a high current efficiency under all load conditions with a small integrated capacitance at the LDO output in order to enable a highly flexible and energy-efficient system operation (also see Chap. 3.5).

To reconcile these contradicting LDO design requirements, an alternative approach is proposed and analyzed in this chapter: the requirements for the analog LDO feedback loop can be relaxed by applying digital-enhancement techniques. The key idea is to exploit synergy effects on system level, based on exact system knowledge, in order to relax the fundamental trade-offs for the design of LDOs. In a first step, the digital-enhancement techniques are examined from a control theory perspective. A discrete load adaption scheme is proposed as the most promising approach to combine a high current efficiency under all load conditions with a small integrated capacitance at the LDO output. In this approach, the LDO current drive capability is digitally controlled depending on the power demand of the MCU digital core. After outlining the system requirements for the discrete load adaptive LDO, the circuit concept and implementation details are examined in further detail. The exemplary implementation is based on the any-load stable LDO topology. The effectiveness of the discrete load adaption scheme is ultimately confirmed by simulated and experimental results, both under DC load conditions and under application conditions (when supplying the MCU digital core).

6.1 Introduction to Digital-Enhancement Techniques

Implementing a basic feedback loop in an analog way represents the state of the art in LDO design and has been intensively studied in the preceding chapters. To relax the fundamental design trade-offs and in this way to save LDO quiescent current, this work proposes and pursues an alternative concept, which is referred to in the following as digital-enhancement techniques. This concept is based on synergy effects due to exact system knowledge, arising between the LDO on the one hand and the MCU digital core on the other hand. While for a stand-alone LDO the “actual” load is unknown, this is different for an LDO supplying a CMOS digital circuit as part of a fully-integrated MCU system. Here, the current consumption and its characteristics are rather well known. As introduced in Chap. 4.3.1, the current consumption of CMOS digital circuits can be broadly classified into a static component, determined by leakage currents, and a dynamic component, determined by switching of the digital logic gates. For ultra-low-power MCU systems, the leakage component adds only a small, slowly-varying offset to the overall current consumption, and is thus neglected for the following considerations. In this way, the average current consumption of the MCU digital core, which is to be provided by the LDO, can be most generally expressed as:

$$I_{LOAD} = I_{DYN} + I_{STAT} \cong \sum_{i=1}^{k} (\alpha_i \cdot C_i) \cdot f_{CLK} \cdot V_{CORE} \tag{6.1}$$

where αi is the activity rate and Ci is the capacitance of a single digital logic gate, fCLK is the clock frequency and VCORE is the core voltage. While some of these parameters are hard to predict during system operation, other parameters can be easily determined on system level. For a highly-integrated MCU system, both the system clock frequency and the core voltage are generated and controlled internally. They can hence be easily and precisely predicted during system operation, potentially also in advance. This is different for the effective switching capacitance, which strongly depends on the actual code and data being executed and processed. Although rough estimates can be made based on the actual system operating condition (e.g. the number of active sub-modules), it is hardly possible to obtain information during operation that is both general and precise.
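The load current expression can be turned into a short estimate to show which parameters are known on system level. The following is a minimal sketch of the dynamic load current ILOAD ≅ Σ(αi · Ci) · fCLK · VCORE with the leakage component neglected; the gate count, activity rate, gate capacitance, clock frequency and core voltage are hypothetical example values.

```python
def dynamic_load_current(gates, f_clk, v_core):
    """Average dynamic load current of a CMOS digital core in A.

    gates  : iterable of (activity_rate, capacitance_in_F) per logic gate
    f_clk  : clock frequency in Hz
    v_core : core voltage in V
    The static (leakage) component is neglected, as in the text.
    """
    switched_capacitance = sum(alpha * cap for alpha, cap in gates)
    return switched_capacitance * f_clk * v_core


# Hypothetical digital core: 100,000 gates, each with an activity rate
# of 0.1 and a capacitance of 2 fF (assumed values).
core = [(0.1, 2e-15)] * 100_000

i_load_16mhz = dynamic_load_current(core, f_clk=16e6, v_core=1.2)
i_load_8mhz = dynamic_load_current(core, f_clk=8e6, v_core=1.2)

# Clock frequency and core voltage are known on system level, so halving
# the clock frequency predictably halves the load current demand.
print(i_load_16mhz, i_load_8mhz)
```

The sum over the gates corresponds to the effective switching capacitance, which is the part of the expression that depends on the executed code and is therefore hard to predict.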

The ultimate aim of the digital-enhancement techniques is to make efficient use of this system power information to relax the requirements and fundamental trade-offs for the LDO feedback loop and in turn to minimize the LDO quiescent current. By drawing inspiration from control theory and combining it with the above considerations, two basic approaches can be identified to utilize the system power information in order to enhance the performance of the LDO feedback loop (Landau et al., 2011; Seborg et al., 2011). Both approaches rely on mathematical models of the control system, either based on simple analytical relationships, or based on linear empirical models obtained by system identification. The aim of the following sections is hence to provide a concise introduction to digital-enhancement techniques for LDOs with a particular focus on the control theory aspects.

Adaptive Feedback Control (Parameter Scheduling)

The properties of a conventional feedback control scheme are defined at design time and need to be designed for worst-case operating conditions. It can therefore only provide sub-optimal control performance: the transient response may be sluggish, errors may fail to stay within satisfactory limits, or designs must compensate for loose error tolerances in other ways. Adaptive feedback control, in contrast, promises superior control performance in the presence of large and unknown variations in operating conditions (Leith and Leithead, 1999; Landau et al., 2011, p. 9ff.). It dynamically adapts the control behavior depending on the actual system operating conditions, and is therefore able to dynamically adjust the trade-off between good static accuracy, fast transient response and loop stability. For this purpose, an adaptive feedback control system is based on an inner (conventional) feedback loop, whose control parameters are dynamically adjusted by an outer control


[Figure 6.1: block diagrams of (a) the adaptive feedback control scheme and (b) the feed-forward control scheme. Both show Reference, Error, PID Controller, Control, Process, Output, Feedback and Disturbance; (a) adds Observation and Scheduling, (b) adds Prediction and Feed Forward.]

Fig. 6.1. Digital-enhancement techniques aim to relax the fundamental LDO design trade-offs by making use of synergy effects due to exact system knowledge. Examples for such techniques are (a) an adaptive feedback control scheme, as well as (b) a predictive feed-forward control scheme.

loop. The control parameters are most widely adapted in a closed-loop scheme, either based on a static reference model, which defines the desired control behavior (model reference adaptive control, MRAC), or based on a dynamic reference model, which is determined by observing and identifying the actual control behavior (model identification adaptive control, MIAC). Alternatively, the control parameters of the inner feedback control loop can also be adjusted in an open-loop scheme if the behavior of the control system is well known at different operating conditions (Leith and Leithead, 1999). A basic block diagram of such a control scheme, widely referred to as parameter scheduling, is depicted in Fig. 6.1(a). Parameter scheduling assumes the existence of a rigid relationship between one or more observable variables (called the scheduling variables), the control parameters and the control behavior. The scheduling variables are used to determine the current operating conditions and, based on these, to dynamically adapt the control parameters. Since the impact of the control parameter modification on the control behavior is not verified, parameter scheduling may fail if the rigid relationship becomes invalid for any reason. Due to the open-loop configuration, the parameter scheduling does not need to be considered for the loop stability of the inner feedback loop. Instead, the control system is stable if it is stable for each possible combination of control parameters and operating conditions considered independently. As a drawback, the design procedure of an adaptive feedback control system with parameter scheduling therefore tends to be tedious and time-consuming.
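The open-loop character of parameter scheduling can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a scheme taken from the cited literature: the breakpoints, the PI gains, and the names `scheduled_gains` and `ScheduledPI` are all hypothetical placeholders that would be fixed at design time.

```python
from bisect import bisect_left

# Hypothetical schedule: upper bound of each operating region of the
# scheduling variable, and the (Kp, Ki) pair used inside that region.
SCHEDULE_BOUNDS = [1.0, 4.0, 16.0]
SCHEDULE_GAINS = [(0.5, 0.1), (1.0, 0.4), (2.0, 1.6)]


def scheduled_gains(scheduling_variable):
    """Open-loop look-up: pick the first region whose bound covers the variable."""
    index = bisect_left(SCHEDULE_BOUNDS, scheduling_variable)
    index = min(index, len(SCHEDULE_GAINS) - 1)  # saturate at the last region
    return SCHEDULE_GAINS[index]


class ScheduledPI:
    """Inner PI feedback loop whose gains are set by the outer scheduler."""

    def __init__(self):
        self.integral = 0.0

    def update(self, reference, measurement, scheduling_variable, dt):
        # The scheduler never observes the control result (open loop).
        kp, ki = scheduled_gains(scheduling_variable)
        error = reference - measurement
        self.integral += error * dt
        return kp * error + ki * self.integral
```

Note that nothing in this sketch checks whether the selected gains actually produce the intended behavior, which mirrors the limitation described above: each entry of the schedule has to be validated separately at design time.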

The concept of parameter scheduling can be applied to the basic LDO feedback loop by exploiting the known system power information of the MCU digital core to predict the current demand. Although the missing information about the effective switching capacitance prevents an absolute prediction of the current demand, restrictive predictions can nevertheless be made based solely on the available information about the clock frequency and the core voltage. This also includes a prediction of the worst-case load transient step, ranging from leakage current level up to the respective maximum load defined by the clock frequency and the core voltage. In this way, knowledge of the system operating parameters enables an adaptive setting of the LDO current drive capability, and in turn an independent performance optimization of the LDO feedback loop at different load conditions.

Predictive Feed-Forward Control

A conventional feedback control scheme is only able to take corrective actions in response to a deviation of the controlled output. Perfect control, in which the controlled output remains at the target level independent of any variations in the operating conditions, is therefore by definition impossible. The basic concept of feed-forward control, in contrast, is to measure significant variations and take corrective actions before they affect the controlled output (Faanes and Skogestad, 2004; Seborg et al., 2011, p. 273ff.). The control variable adjustment is accordingly not error-based, but instead relies on a precise mathematical model describing the controlled output as a function of both the operating conditions and the control variable. The quality of feed-forward control vitally depends on the accuracy of this mathematical model. In practical applications, feed-forward control is therefore almost always combined with a feedback control loop. While the feed-forward control is used to reduce the impact of measurable variations, particularly in response to fast transient variations, the feedback control compensates for inaccuracies in the mathematical model, measurement errors and unmeasured variations. A basic block diagram of such a control scheme, widely referred to as predictive feed-forward control, is depicted in Fig. 6.1(b). In this typical control configuration, the outputs of the feed-forward and the feedback control are combined, and the sum acts as the signal for the final control element. The feed-forward control does not affect the stability of the feedback control loop, such that both control systems can be designed mostly independently of each other (Seborg et al., 2011, p. 283f.). In comparison to a conventional feedback control scheme, predictive feed-forward control is able to significantly improve the transient control performance whenever there are major variations which can be measured before they affect the controlled output.
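The interplay of both paths can be illustrated with a small discrete-time simulation in Python. The sketch rests on several assumptions that are not taken from this work: a first-order process, a PI feedback controller with arbitrarily chosen gains, a unit disturbance step, and a feed-forward path whose gain `ff_gain` represents the accuracy of the disturbance model (1.0 for a perfect model, 0.0 for no feed-forward).

```python
def simulate(ff_gain, steps=2000, dt=1e-3, tau=0.05, kp=2.0, ki=20.0):
    """Return the peak output deviation after a unit disturbance step.

    Assumed process: tau * dy/dt = -y + u - d, with reference y_ref = 0.
    """
    y = 0.0
    integral = 0.0
    peak = 0.0
    for k in range(steps):
        d = 1.0 if k >= 100 else 0.0       # measurable disturbance step
        error = 0.0 - y                    # feedback only sees the deviation
        integral += error * dt
        u_fb = kp * error + ki * integral  # PI feedback path
        u_ff = ff_gain * d                 # model-based feed-forward path
        u = u_fb + u_ff                    # sum drives the final control element
        y += dt * (-y + u - d) / tau       # forward-Euler process update
        peak = max(peak, abs(y))
    return peak


for gain in (0.0, 0.9, 1.0):
    print(f"feed-forward gain {gain:.1f}: peak deviation {simulate(gain):.4f}")
```

With an imperfect model (for example `ff_gain = 0.9`) the feed-forward path removes most of the disturbance and the feedback loop only has to correct the remainder, which corresponds to the division of labor described above.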

The concept of predictive feed-forward control can be applied to improve the LDO behavior in response to the fast load transient variations when supplying the MCU digital core. While the basic LDO feedback loop can react only in response to an output voltage deviation, the predictive feed-forward control allows the effect of a load current variation on the output voltage to be anticipated. It thereby exploits the knowledge of the MCU digital core through the feed-forward input, while accounting for errors (uncertainties, disturbances) using voltage feedback. While it is complex to rapidly and precisely measure the transient disturbance (corresponding to the change of load current), the known system power information of the MCU digital core allows the relative change of load current to be predicted. By utilizing the available information about the clock frequency and the core voltage, relative changes in the current demand of the MCU digital core are anticipated by the predictive feed-forward control scheme. This is of particular help in reacting to the fast load transient variations when changing the clock frequency. In this way, the transient response requirements and in turn the fundamental design trade-offs for the LDO feedback loop are greatly relaxed.


Summary

In conclusion, the application of digital-enhancement techniques to LDOs can be considered a wide field of research. Although the digital-enhancement techniques might seem obvious, the underlying system power information can clearly be used to advantage in the design of LDOs. To minimize the LDO quiescent current for a given performance specification, the digital-enhancement techniques address the two major challenges in LDO design, namely load transient response and loop stability. Various control schemes can be conceived depending on (1) which system power information is exploited, and (2) how this information is applied to the LDO. All of these schemes are characterized by making use of synergy effects due to exact system knowledge in order to relax the fundamental trade-offs in the design of LDOs. To demonstrate the feasibility and benefits of digital-enhancement techniques, a discrete load adaption scheme is examined as an example in the remainder of this chapter. By exploiting the correlation between system clock frequency and load current demand, this scheme digitally adapts the maximum LDO current drive capability, and can thus be classified as parameter-scheduled adaptive control. In this way, a high LDO current efficiency is achieved over a wide range of load conditions.

6.2 Demonstrator System and Design-for-Test

To experimentally verify the digital-enhancement techniques for LDOs, a demonstrator system has been implemented in a 0.13 µm standard CMOS technology. This demonstrator system combines the digitally-enhanced LDO with a complete ultra-low-power MCU system. It serves two main purposes: (1) The LDO performance is verified separately under DC load conditions, thereby determining all common LDO performance parameters. (2) The cross-functional performance under application conditions (when supplying the MCU digital core) is verified. Particularly with regard to the latter case, simulation as part of the common analog design flow can only provide approximate results (also see Chap. 4.3). An experimental verification of the mixed-signal circuit concepts, implemented in modern deep submicron technologies, is therefore essential.

The following section gives an overview of the demonstrator system as well as the dedicated measurement and test options, which enable a quick and accurate validation of the LDO performance.

6.2.1 Demonstrator System

The digitally-enhanced LDO is implemented as part of a complete ultra-low-power MCU system fabricated in a 0.13 µm standard CMOS technology. The demonstrator system is based on a commercially available ultra-low-power MCU system, as introduced in Chap. 2.1 (Zwerg et al., 2011). This MCU system comprises a 16-bit MSP430 CPU, an integrated power management and clock generation unit, analog and digital peripherals as well as a non-volatile FeRAM memory for fast write capability. The maximum system clock frequency is 16 MHz. In accordance with the scope of this work, the focus here is on the power management unit, including the digitally-enhanced LDO. A simplified block diagram of the power management unit is depicted in Fig. 6.2.

According to the power management architecture established in Chap. 3, the MCU digital core is divided into a high performance domain (VCORE), which can be shut down in sleep mode, and an ultra-low-power domain (VRTC), which remains always active. Each domain is supplied by a


[Figure 6.2: block diagram showing the supply VDD with external components (battery, input capacitor); the reference blocks BG, Bias and VREF; the supervisors BOR, SVS_H and SVS_L with outputs bor_out, svsh_out and svsl_out; the regulators LDO_C (VCORE, high-performance domain), LDO_R (VRTC, ultra-low-power domain) and LDO_F (VFeRAM, FeRAM domain); VDD_Fail with output vdd_fail_out; the digital state-machine; and the MCU digital core. LDO_C is marked as the digitally-enhanced LDO.]

Fig. 6.2. Simplified block diagram of the power management unit (adapted from Texas Instruments internal design documentation). The high performance domain is supplied by a single digitally-enhanced LDO (here denoted by LDO_C), which does not need any external capacitance for compensation in order to allow a fast and energy-efficient wake-up from sleep.

dedicated LDO: (1) The high performance domain is supplied by a single digitally-enhanced LDO (LDO_C), which does not need any external capacitance for compensation in order to allow a fast and energy-efficient wake-up from sleep. The LDO output voltage is here 1.52 V (typical) and is determined by the requirements of the digital core as a trade-off between maximum operating speed and power consumption. To guarantee fail-safe operation, the LDO must keep its output voltage within a tolerance window of +30/−70 mV. To check the supply system integrity, the output voltage is monitored by the supply voltage supervisor SVS_L. If the voltage falls below 1.42 V (typical), the system operation is stopped and the device is reset. The maximum LDO load current is determined by the MCU digital core at its highest performance level and is here 2.56 mA. (2) The ultra-low-power domain is supplied by an LDO (LDO_R) optimized for ultra-low quiescent current, but with a limited current drive capability of 100 µA. For ultra-low-power operation with only limited speed requirements, the output voltage is here chosen as 1.2 V. In addition, the ultra-low-power MCU system comprises a non-volatile FeRAM memory, which is supplied by a dedicated supply system, including a dedicated LDO (LDO_F) and supply voltage failure detection (VDD_Fail), which is not discussed here.

A shared reference module comprises a bias current generator (Bias) for analog circuits, a bandgap voltage reference (BG) as well as a reference voltage generator (VREF). It provides the required reference voltages and biasing currents to all other sub-modules, including the digitally-enhanced LDO. The LDOs are configured in unity gain. Their reference voltages are trimmed independently during final production test in order to guarantee accurate output levels. The trimming options are combined in the reference voltage generator (VREF), and are hence not part of the respective LDO circuitry. The external supply voltage of the MCU system


ranges from 1.9 V to 3.6 V. To guarantee fail-safe operation, the supply voltage is checked in two ways. A brownout reset (BOR) circuit serves as the lowest-level supervisor of the input supply (VDD). It is responsible for generating the power-on reset as long as the supply voltage is too low to allow fail-safe system operation. During system operation, the supply voltage is additionally monitored by the supply voltage supervisor SVS_H. If the supply voltage falls below 1.8 V (typical), the system operation is stopped and the device is reset.

The power management unit is controlled by a digital finite state-machine. Depending on the system operating mode, it is configured in a fine-grained and flexible way to achieve highest performance in active mode as well as to minimize its power overhead in sleep mode. Further details on the system operating modes and the transitions between them are addressed in Chap. 7.1. The system operating temperature ranges from −40 °C to 85 °C, corresponding to the industrial temperature grade.

For the digitally-enhanced LDO, the on-chip decoupling capacitance acts at the same time as the LDO load capacitance, and thus plays an essential role in the LDO circuit design. The total on-chip decoupling capacitance of the demonstrator system amounts to about 3 nF. It is first and foremost determined by both the intrinsic and the intentional capacitance of the MCU digital core (also see Chap. 4.3.2). The total intrinsic capacitance is 0.9 nF, while the intentionally placed decoupling capacitance is 0.8 nF. In addition to the actual capacitance of the MCU digital core, a decoupling capacitance of 1.3 nF is placed on top-level (e.g. underneath the routing channels) without requiring any additional area. In this way, the overall on-chip decoupling capacitance is sufficient for the LDO operation, avoiding the need for any additional, area-consuming on-chip capacitance. For simulation purposes, the on-chip decoupling capacitance is modeled in accordance with the considerations in Chap. 4.3.2.

6.2.2 Measurement and Test Options

For an easy and accurate verification of the LDO performance, dedicated measurement and test options are added to the demonstrator system. This includes a fully-differential sensing scheme to characterize the dynamic LDO behavior, an on-chip programmable load to generate well-defined DC load current profiles as well as a stand-alone operating mode to evaluate the LDO independent of the MCU digital core operation. These measurement and test options are briefly presented in the following, and will be referred to in the context of the experimental LDO verification.

Differential Sensing

Characterizing the dynamic LDO behavior requires the measurement of full time-domain voltage waveforms during the entire course of system operation. Besides the LDO behavior in response to transient line and load conditions, this also includes the LDO operation under application conditions. Particularly the measurement of the dynamic power supply noise, as it appears when supplying the MCU digital core, demands a high-resolution, wide-bandwidth measurement setup. However, without taking special care, large parasitics easily limit the measurable signal bandwidth. For this reason, several on-chip voltage measurement schemes have recently been reported. Takamiya et al. (2002), for instance, propose a sub-sampling “on-chip oscilloscope”, in which the device-under-test is operated under defined conditions in a repetitive manner to measure the dynamic power supply noise. An alternative on-chip voltage measurement scheme is presented by Nagata et al. (2005).


[Figure 6.3: circuit diagram showing the LDO between VDD and VSS with its feedback node VFB, the VCORE domain represented by flip-flops, and the sense signals VCORE_SENSE and VSS_SENSE routed via bond wires and off-chip traces to the active probe.]

Fig. 6.3. Schematic circuit diagram of the wide bandwidth, fully-differential sensing scheme to characterize the LDO transient behavior, particularly including the power supply noise when supplying the MCU digital core.

Here, the supply voltage is buffered by a high speed on-chip amplifier to effectively isolate the off-chip parasitics.

As the clock frequency, and consequently also the switching speed of the digital logic gates, is limited to moderate levels in ultra-low-power MCU systems, the sophisticated on-chip measurement schemes are abandoned here in favor of a simple, but effective differential sensing scheme as depicted in Fig. 6.3. Here, the LDO output voltage (particularly its dynamic behavior) is measured by a 4-wire Kelvin connection. Dedicated sense signals for the core voltage and the ground are connected to the LDO feedback point and the LDO ground node, respectively. The LDO transient behavior is measured by an Agilent 1.5 GHz differential active probe in combination with an Agilent digital storage oscilloscope having an analog signal bandwidth of 1.0 GHz. The signal bandwidth of the differential sensing scheme is basically determined by the parasitic inductance and resistance caused by the bond wires and the off-chip traces in combination with the active probe. In particular, the parasitic inductance forms a series resonance circuit in combination with the differential input capacitance. By conservatively assuming a parasitic inductance of 5 nH for each sense line and a differential input capacitance of 1.0 pF for the active probe, the resulting signal bandwidth is about 1.5 GHz.
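The bandwidth estimate can be retraced with the standard series-resonance formula f = 1/(2π√(LC)). The Python sketch below uses the inductance and capacitance values given in the text; that the inductances of the two sense lines add in series for the differential signal is an assumption made here, not something stated explicitly in the text.

```python
import math


def series_resonance_frequency(inductance, capacitance):
    """Resonance frequency f = 1 / (2*pi*sqrt(L*C)) of a series LC circuit."""
    return 1.0 / (2.0 * math.pi * math.sqrt(inductance * capacitance))


L_SENSE_LINE = 5e-9   # parasitic inductance per sense line (from the text)
C_PROBE = 1.0e-12     # differential input capacitance of the probe (from the text)

# Assumption: both sense lines carry the differential signal, so their
# inductances add in series.
f_res = series_resonance_frequency(2 * L_SENSE_LINE, C_PROBE)
print(f"series resonance at {f_res / 1e9:.2f} GHz")
```

Under this assumption the resonance lies at roughly 1.6 GHz, which is in line with the quoted signal bandwidth of about 1.5 GHz.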

On-Chip Programmable Load

The LDO transient behavior is evaluated by applying well-defined DC load current profiles. Due to the series inductance of the external leads, it is difficult to apply transient load current profiles with fast rise and fall times externally (Hazucha et al., 2005). With load transient slopes slower than the LDO gain-bandwidth, the LDO is able to react and follow these changes instantaneously. However, as the MCU digital core presents very steep load transient conditions to the LDO, such measurement results would suggest an overly optimistic LDO transient response. To adequately emulate these operating conditions, it is therefore essential to have transient load current profiles with rise


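[Figure 6.4: circuit diagram showing binary-weighted current branches (1x, 2x, ..., 64x) connected to VCORE and controlled by the bits b0 to b6, the reference current IREF, the registers CURRENT GAIN 1 and CURRENT GAIN 2, and the signals SEL, CLK and DATA.]

Fig. 6.4. Schematic circuit diagram of the on-chip programmable load. For test purposes, arbitrary DC load current profiles with rise and fall times significantly faster than the LDO gain-bandwidth can be generated on-chip.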

and fall times significantly faster than the LDO gain-bandwidth. For this purpose, arbitrary DC load current profiles can be generated here by an on-chip programmable load, whose basic principle is depicted in Fig. 6.4. The load current is defined by an externally provided reference current IREF, which is multiplied by a gain factor. This gain factor is represented by a 7-bit number and is stored in two registers, which can be selected alternately by an external signal. Additionally, these registers can be accessed by a synchronous serial data interface consisting of a clock signal and a data signal. The register not currently defining the load current is selected for access by the data interface. In this way, an arbitrary transient load current profile with two load conditions defined by the two gain factors can be generated on-chip.
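The register scheme described above can be summarized by a small behavioral model in Python. This is a sketch under assumptions, not the implemented circuit: the class and method names are hypothetical, the bit order of the serial interface (MSB first) is assumed, and the reference current of 20 µA is an arbitrary example value.

```python
class ProgrammableLoad:
    """Behavioral model: I_LOAD = gain * I_REF with two 7-bit gain registers."""

    GAIN_BITS = 7

    def __init__(self, i_ref):
        self.i_ref = i_ref
        self.gain = [0, 0]   # CURRENT GAIN 1 and CURRENT GAIN 2
        self.sel = 0         # external select: index of the active register

    def write_inactive(self, bits):
        """Shift a 7-bit word (MSB first, assumed) into the register not in use."""
        if len(bits) != self.GAIN_BITS or any(b not in (0, 1) for b in bits):
            raise ValueError("expected exactly 7 bits")
        value = 0
        for bit in bits:     # one data bit per clock edge of the serial interface
            value = (value << 1) | bit
        self.gain[1 - self.sel] = value

    def select(self, sel):
        """Model the external select signal switching the active register."""
        self.sel = sel

    def load_current(self):
        return self.gain[self.sel] * self.i_ref


load = ProgrammableLoad(i_ref=20e-6)         # 20 uA is an assumed value
load.write_inactive([1, 0, 0, 0, 0, 0, 0])   # gain 64 into the inactive register
load.select(1)                               # load step from 0 to 64 * I_REF
print(f"I_LOAD = {load.load_current() * 1e3:.2f} mA")
```

Because the inactive register is written while the other one defines the load current, the speed of the load step is set only by the select signal and not by the serial interface.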

LDO Stand-Alone Operation

For test purposes, the digitally-enhanced LDO can be operated completely independently of the actual MCU system operation. During normal system operation, the digital control signals as well as the reference voltage and bias current required for LDO operation are provided by the system as discussed in Chap. 6.2.1. In stand-alone operation, however, they are provided externally, thus allowing full control of the LDO during test.

Although not required for operation of the digitally-enhanced LDO, its output is here connected to a separate pin and is therefore accessible externally. In this way, static measurement conditions such as static load currents and/or static voltages can be applied to the LDO output for experimental purposes.

6.3 Principle of Discrete Load Adaption

The discrete load adaption scheme is an example of the digital-enhancement techniques, specifically in the form of the adaptive feedback control introduced in Chap. 6.1, and aims to achieve a high LDO current efficiency over the full load current range. By exploiting the correlation between system clock frequency and load current demand, the discrete load adaption scheme digitally adapts the maximum current drive capability.


[Figure 6.5: circuit diagrams (a) and (b), each showing an LDO with error amplifier EA1, pass transistor MPASS, reference VREF, supply VDD, feedback divider R1/R2, gate node VGATE, bias current IBIAS, load capacitor CLOAD with RESR, output VOUT and load current ILOAD; (b) additionally contains a control unit.]

Fig. 6.5. To achieve a high LDO current efficiency over a wide range of load conditions, the LDO quiescent current can be adapted depending on the load current, either (a) with the help of a dedicated feedback loop sensing the load current and adapting the LDO characteristics, or (b) by an open-loop control of the LDO characteristics based on the principle of digital-enhancement.

For any state-of-the-art LDO with a basic feedback loop implemented in an analog way, the quiescent current demand is described by the basic LDO figure-of-merit introduced in Chap. 5.8. The LDO needs to be designed for the worst-case load demand at maximum clock frequency. In combination with the absence of a large (external) load capacitance, this defines a certain, unavoidable quiescent current demand. The current efficiency of a state-of-the-art LDO with load-independent quiescent current is thereby high at maximum load, but quickly drops with decreasing load current. The MCU system power consumption thus becomes dominated by the LDO quiescent current when operating at low clock frequencies, while the LDO current drive capability is not fully exploited. At the same time, the LDO output voltage must remain within a certain tolerance window (here +30/−70 mV) under all operating conditions in order to guarantee fault-free operation of the MCU digital core (see also Chap. 4.3.1). In this context, the rapid load current transients presented to the LDO when starting and stopping system operation add another level of complexity. The LDO thereby needs to be designed for a worst-case load current step from zero to full load. At the same time, however, the LDO feedback loop is unnecessarily fast at low load conditions, and the output voltage tolerance window is not fully utilized. The key idea, conceptually obvious, is to combine these two constraints. While the LDO needs to be fast at (potentially) high load currents to keep its output voltage within the specified tolerance window, it can be slowed down at low load conditions in order to save quiescent current.
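The drop in current efficiency at low load can be made concrete with the usual definition η = ILOAD / (ILOAD + IQ). In the following Python sketch, the maximum load current is taken from Chap. 6.2.1, whereas the quiescent current of 25 µA and the proportional scaling of the quiescent current with the load are purely hypothetical values chosen to illustrate the trend.

```python
def current_efficiency(i_load, i_q):
    """Current efficiency eta = I_LOAD / (I_LOAD + I_Q)."""
    return i_load / (i_load + i_q)


I_LOAD_MAX = 2.56e-3   # maximum load current of the demonstrator system
I_Q_FIXED = 25e-6      # hypothetical load-independent quiescent current

for fraction in (1.0, 1 / 8, 1 / 32):
    i_load = fraction * I_LOAD_MAX
    fixed = current_efficiency(i_load, I_Q_FIXED)
    # Idealized assumption: quiescent current scaled in proportion to the load.
    scaled = current_efficiency(i_load, fraction * I_Q_FIXED)
    print(f"{i_load * 1e6:7.1f} uA  fixed I_Q: {fixed:.3f}  scaled I_Q: {scaled:.3f}")
```

With a fixed quiescent current the efficiency falls noticeably once the load current approaches the order of the quiescent current, whereas it stays constant if the quiescent current can be reduced along with the load.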

Scaling the quiescent current by sensing the load current represents the state of the art in the field of load adaptive LDOs. For this purpose, and as depicted in Fig. 6.5(a), a secondary feedback loop is added to the LDO topology. The load current information is thereby derived from the pass-transistor source-gate voltage by using a replica of the PMOS pass-transistor. While the topology proposed by Thiele and Bayer (2005) represents a classic externally compensated LDO topology in which only the buffer stage is adaptively biased, the replica current is fed back into the bias network of the error amplifier in the topologies proposed by Lam and Ki (2008) and Zhan and Ki (2009). Though the exact implementation details are manifold, the basic principle is very similar for all of these load adaptive LDO topologies. The LDO control characteristics determining the regulation performance, particularly including the gain-bandwidth and the loop stability, are continuously adapted depending on the load current, resulting in a continuous LDO transfer function. However, it is difficult to precisely control the LDO control characteristics depending on the sensed load current such that both regulation performance and loop stability are guaranteed under all load conditions. The full potential of LDO quiescent current savings can therefore not be exploited by these LDO topologies. In addition, and above all, the LDO transient response is rather slow for a load current step from low load to high load. At low load, the LDO quiescent current is at its minimum, resulting in the lowest gain-bandwidth. The LDO feedback loop is thus not able to react to a load current step with its full gain-bandwidth; instead, the current sensing loop first needs to raise the LDO quiescent current to increase the gain-bandwidth. Due to the reaction time of the current sensing loop, either large transient undershoots have to be accepted or a large capacitance is required to suppress them. In conclusion, such state-of-the-art analog load adaption schemes are indeed able to achieve a high current efficiency over the full output current range. However, they are severely limited with respect to the required load transient performance, particularly in combination with the requirement for a small integrated load capacitance.

In order to overcome the limitations of the state-of-the-art analog load adaption schemes, the discrete load adaption scheme was first proposed by Lueders et al. (2011b) and Lueders et al. (2012). Here, the secondary feedback loop sensing the load current and adapting the LDO control characteristics is replaced by an open-loop control. The discrete load adaption scheme can therefore be classified as parameter-scheduled adaptive control, as introduced in Chap. 6.1. The knowledge of the system operating conditions allows a prediction of the maximum current demand and thus enables an adaptive setting of the current drive capability. Although the missing information about the effective switching capacitance prevents an absolute prediction of the current demand, restrictive predictions can nevertheless be made depending on the significance and accuracy of the system power information. For this purpose, and as depicted in Fig. 6.5(b), a control unit is added to the system. This control unit makes use of known system power information in order to digitally adapt the LDO characteristics in an open-loop configuration. This includes not only the maximum current drive capability, but also other critical parameters for the LDO regulation performance such as voltage gain, gain-bandwidth and loop stability. As revealed by the detailed analysis of the any-load stable LDO (particularly see Chap. 5.4), the maximum load current determines the dominant pole frequency and consequently the gain-bandwidth of the LDO feedback loop. With the maximum load current restricted by the load current prediction, the gain-bandwidth of the LDO feedback loop can be adapted, while still maintaining loop stability. At the same time, the worst-case load transient step is also limited by the load current prediction, ranging from leakage level up to the respective maximum load defined by the clock frequency and the core supply voltage. As again revealed by the detailed analysis of the any-load stable LDO (particularly see Chap. 5.4), the LDO feedback loop is allowed to be slower at lower load current levels while causing the same transient under- and overshoot at the same load capacitance. Hence, the gain-bandwidth of the LDO feedback loop can be adapted depending on the maximum load current, while still maintaining the regulation performance. The scaling of both regulation performance and loop stability thus needs to be, and can be, addressed hand-in-hand: both are guaranteed for each current drive level.

In conclusion, the discrete load adaption scheme enhances the analog LDO feedback loop by employing known system power information. The knowledge of the system operating parameters enables an adaptive setting of the LDO current drive capability and in this way leads to quiescent current savings at low load conditions. Since this information is known in advance, the LDO can be prepared for the new load conditions rather than having to react to them. In the subsequent sections, the necessary system requirements for the definition and implementation of the discrete load adaption scheme are identified.

6.3.1 Drive Capability Determination

In order to control the LDO current drive capability depending on the system operating conditions, a simple control unit is added to the MCU system. Although the prediction is not limited to it, the focus here is on the system clock frequency. On the one hand, it is rather easy to predict on system level (in contrast to the effective switching capacitance); on the other hand, it shows a strong correlation with the load current demand (in contrast to the core supply voltage). For practical implementation, a discrete number of current drive levels (CDL) is defined, in which the LDO can operate. The control unit maps the predicted clock frequency to an LDO current drive level using a look-up table based approach, thereby exploiting the linear correlation between both parameters. The look-up table is defined at design time and is hard-coded. The definition of the current drive levels is widely flexible and can thus be adapted to system requirements, both with respect to the dynamic range and the granularity. To achieve a high LDO current efficiency over a wide range of load current, a high dynamic range, defined as the ratio between the LDO current drive capability in the highest and lowest drive level, is preferred. For the demonstrator system, six discrete current drive levels are defined in a binary way, spanning a dynamic range of 2^0 to 2^5. Depending on the predicted clock frequency fCLK,pred, the required current drive level can be determined by the following expression:

$$CDL = \left\lceil \log_2\left( \frac{f_{CLK,pred}}{f_{CLK,max}} \cdot 2^{CDL_{max}} \right) + 1 \right\rceil \tag{6.2}$$

While the leakage current can be neglected at high current drive levels, this does not hold true at very low drive levels with accordingly low current drive capability. Here, the constant offset introduced by the leakage current needs to be considered carefully when selecting the current drive level.
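The mapping from predicted clock frequency to drive level can be illustrated with a minimal sketch. It is not the hard-coded look-up table of the demonstrator system. It assumes six binary-weighted levels, reads the exponent in Eq. 6.2 as the number of binary steps between the lowest and the highest level (5 for six levels), and ignores the leakage offset mentioned above. The function name `required_cdl` and the `tolerance` parameter are hypothetical.

```python
def required_cdl(f_pred_hz, f_max_hz, num_levels=6, tolerance=0.0):
    """Return the lowest drive level (1..num_levels) that covers f_pred_hz.

    Assumption: level k supports clock frequencies up to
    f_max_hz / 2**(num_levels - k), i.e. binary-weighted drive levels.
    """
    if f_pred_hz <= 0 or f_max_hz <= 0:
        raise ValueError("frequencies must be positive")
    # Worst-case frequency including the relative prediction tolerance.
    f_worst = f_pred_hz * (1.0 + tolerance)
    for level in range(1, num_levels + 1):
        f_supported = f_max_hz / 2 ** (num_levels - level)
        if f_worst <= f_supported:
            return level
    # Above the nominal maximum: clamp to the highest level.
    return num_levels


for f_mhz in (0.5, 1.0, 8.0, 16.0):
    print(f_mhz, "MHz ->", "CDL", required_cdl(f_mhz * 1e6, 16e6))
```

Passing a non-zero `tolerance` (for example 0.035 for a ±3.5% oscillator) pushes frequencies that sit exactly on a level boundary into the next higher level, which is one simple way to account for prediction tolerances.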

While the system clock frequency has so far been implicitly assumed to be known on system level, various approaches and techniques for clock frequency prediction in a highly integrated MCU system are identified in the following. In this context, prediction tolerances also need to be considered such that the LDO drive capability is guaranteed to provide the maximum load current required. For the demonstrator system, the system clock is generated by a highly flexible clock generation unit, which provides several clock sources the application can choose from: an on-chip digitally controlled oscillator, an external crystal oscillator, and an external square wave signal. Depending on the clock source, different techniques for clock frequency prediction can be applied.

For the on-chip digitally controlled oscillator (DCO), the system clock frequency is directly predicted from its control settings. For this purpose, both the DCO frequency setting and the pre-divider setting need to be taken into account. The prediction tolerance is in this case solely determined by the actual oscillator tolerance. The DCO frequency is trimmed at final production test, achieving an accuracy of ±3.5% over the full supply voltage and temperature range. Since the oscillator tolerance is offset-free, it needs to be taken into account equally for all drive levels, i.e. the prediction tolerance scales at the same rate as the absolute frequency and thus also as the required current drive capability. Although not available in this demonstrator system, a very similar approach for clock frequency prediction can also be applied to alternative on-chip oscillator topologies, for instance phase-locked loop (PLL) or frequency-locked loop (FLL) oscillators.

[Figure 6.6: system clock over time with low and high drive level phases; each frequency change runs through three numbered steps: (1) update clock generation configuration, (2) determine current drive level required, (3) apply updated current drive level to LDO.]

Fig. 6.6. Timing diagram illustrating the dynamic adaption of the LDO current drive capability in response to a changing system clock frequency. For highest system flexibility, arbitrary and instantaneous switching between any levels is allowed, independent of the LDO operating condition.

In contrast to the on-chip oscillators, the clock frequency is not directly known in case of an external crystal oscillator. However, since its frequency does not change during operation, the clock frequency can also be predicted in this case. For this purpose, a gate measurement is performed once after system start-up, using the on-chip DCO as frequency reference. In combination with the frequency divider setting, this provides the clock frequency information employed for the discrete load adaption scheme. Here, the accuracy of the gate measurement scheme also needs to be taken into account. It is determined by the error of the reference frequency (here the DCO frequency) as well as the actual measurement error, which decreases for longer gate measurement times.

Finally, another option for clock generation is to provide an external square wave signal. In this case, the clock frequency cannot be predicted reliably, since the external square wave signal may change at any time without prior notice. The discrete load adaptive LDO is therefore set to its highest current drive level, which effectively disables the discrete load adaption scheme and forfeits its benefits. Nevertheless, safe operation is maintained in this way even if no clock frequency prediction is possible.
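The three clock sources discussed above lead to three different prediction rules, which the following sketch summarizes. It is an illustration only, not the control logic of the demonstrator system. All function and parameter names are hypothetical, and the gate measurement error is modelled, as an assumption, by a quantization error of one count on top of the ±3.5% DCO tolerance.

```python
DCO_TOLERANCE = 0.035  # +/-3.5 % DCO accuracy quoted in the text


def dco_upper_bound(f_dco_hz, divider):
    """Worst-case system clock when running from the trimmed DCO."""
    return f_dco_hz * (1.0 + DCO_TOLERANCE) / divider


def crystal_upper_bound(edge_count, gate_cycles, f_dco_hz, divider):
    """Worst-case system clock derived from a one-time gate measurement.

    edge_count crystal periods were counted while gate_cycles DCO periods
    elapsed. The +1 is an assumed quantization error of one count.
    """
    f_crystal_max = (edge_count + 1) * f_dco_hz * (1.0 + DCO_TOLERANCE) / gate_cycles
    return f_crystal_max / divider


def predicted_upper_bound(source, **settings):
    """Return the worst-case clock in Hz, or None if no prediction is possible."""
    if source == "dco":
        return dco_upper_bound(settings["f_dco_hz"], settings["divider"])
    if source == "crystal":
        return crystal_upper_bound(settings["edge_count"], settings["gate_cycles"],
                                   settings["f_dco_hz"], settings["divider"])
    if source == "external":
        return None  # caller falls back to the highest drive level
    raise ValueError(f"unknown clock source: {source}")
```

A caller would treat a `None` result as "use the highest current drive level". In this model, a longer gate window (larger `gate_cycles` and `edge_count`) shrinks the relative contribution of the one-count quantization error, in line with the statement that the measurement error reduces for longer gate measurement times.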

6.3.2 Dynamic Drive Level Adaption

Knowledge of the system operating conditions, in this particular case of the system clock frequency, allows the maximum current demand to be predicted and consequently enables a dynamic adaption of the current drive capability. During MCU operation, the clock frequency may change upon application request. For this purpose, the clock generation unit is reconfigured, for instance by switching the clock source or setting the clock divider. Since the clock frequency is known in advance, the discrete load adaptive LDO can be prepared for the new load conditions rather than having to react to them.

A principle timing diagram illustrating the dynamic adaption of the clock frequency and the LDO current drive capability is depicted in Fig. 6.6. Determined by the application requirements, the clock frequency may change at any time. The sequence for adapting the clock frequency is initiated by updating the configuration of the clock generation unit synchronously to the rising clock edge. The new clock frequency comes into effect with a delay of one clock cycle, corresponding to roughly 60 ns (one 62.5 ns period) at the maximum clock frequency of 16 MHz. This is accordingly the worst-case time available for setting the discrete load adaptive LDO to a new drive level and thus preparing it for the new load current requirements. The newly required LDO current drive level is determined at the falling clock edge using the look-up table based approach introduced in the preceding section. The configurations of both the clock generation unit and the LDO drive level come into effect at the subsequent rising clock edge. The sequence is basically the same for increasing and decreasing the LDO current drive level. However, while the LDO current drive capability needs to be increased immediately when the clock frequency is increased, the timing constraints are more relaxed when the clock frequency is decreased. For best current efficiency, the LDO current drive capability is nevertheless preferably decreased promptly.

For highest system flexibility, arbitrary and instantaneous switching between any levels shall be allowed, independent of the LDO operating condition (with regard to the supply voltage, load current, etc.), while keeping the output voltage within the tolerance window necessary to guarantee fault-free operation of the MCU digital core (here +30/−70 mV). The major challenge from a circuit design point of view is to avoid dynamic effects such as charge injection and feed-through of the digital control signals. The aspects necessary for circuit implementation are addressed in Chap. 6.5.

6.3.3 LDO Circuit Implementation

An exemplary implementation of the discrete load adaption scheme is based on the any-load stable LDO, as introduced and analyzed in Chap. 5. As depicted by the circuit diagram in Fig. 6.7, this discrete load adaptive LDO consists of two stages in two feedback loops in a unity-gain configuration: a slow folded-cascode amplifier (EA1) with high gain and a pole p0 at its output is combined with a fast, low-gain cascoded flipped voltage follower (FVF). The cascoded FVF is formed by the common-gate transistors MCG1 and MCG2, the current mirror MCM1, the resistor R1 and the pass-transistor MPASS. The two feedback loops are combined at MCG1. The LDO topology is compensated by an active feed-forward compensation scheme introduced in Chap. 5.1 (Thandri and Silva-Martinez, 2003). The first dominant pole p0 is associated with the output of the folded-cascode amplifier. By introducing a capacitance at this node, the pole is designed to reside at very low frequencies. The fast feed-forward path introduces a left-half-plane zero into the LDO open-loop transfer function HOL(s). The resulting positive phase shift is used to shape the LDO frequency response, in particular to cancel the negative phase shift of the first dominant pole p0. A second dominant pole p1, associated with the LDO output, moves widely with load current due to the changing output impedance of the pass-transistor MPASS. It ranges between the pole p0 in no-load condition and the pole p2 in maximum load condition. The pole p2 is the first non-dominant pole and is defined by the resistor R1 and the gate capacitance of the pass-transistor. As illustrated in Chap. 5.1.1, this topology does not need further frequency compensation to ensure proper phase margin under all load conditions, owing to the pole-zero cancellation.

For the discrete load adaption scheme, both the cascoded FVF and the folded-cascode amplifier are biased depending on the current drive level (CDL) by adapting the mirror ratio of MCM1, MCM2 as well as MCM3. In this way, the gain-bandwidth of both LDO feedback loops is adapted. All relevant poles move simultaneously, ensuring loop stability for each current drive level. To maintain the operating points of the circuit, the active width of MCG1 and MPASS as well as the resistance of R1 are adapted. Particular attention is paid to the switch design to avoid dynamic effects such as charge injection and feed-through of the digital control signals. In this way, the LDO output voltage is kept within the specified tolerance window while switching the LDO current drive level.

[Figure 6.7: circuit diagram with the labels EA1, VREF, C1, MCG1, MCG2, MCM1, MCM2, MCM3, MSINK, R1, MPASS, VDD, VCORE, CLOAD, ILOAD and Digital Core, and the poles p0, p1 and p2; the elements controlled by the drive level are marked CDL.]

Fig. 6.7. Circuit diagram of the discrete load adaptive LDO. The current mirrors MCM1, MCM2 and MCM3, the common-gate transistor MCG1, the pass-transistor MPASS, as well as the resistor R1 are controlled depending on the current drive level (CDL).

The chip micrograph of the discrete load adaptive LDO is depicted in Fig. 6.8. The LDO circuit occupies an area of 0.016 mm², which is dominated by the pass-transistor MPASS and the resistor R1. The control logic required for selecting and switching the drive level, in contrast, adds only a small area overhead.

After outlining the system requirements for the discrete load adaption scheme, the circuit concept and implementation details are examined in the following. This discussion is divided into two subsequent sections. At first, the steady-state operation of the discrete load adaptive LDO within one current drive level (CDL) is investigated, drawing on the detailed circuit analysis of the any-load stable LDO presented in Chap. 5. In a second step, the implementation details required to enable a dynamic adaption of the current drive levels during operation are presented.

[Figure 6.8: chip micrograph, 138 µm × 116 µm, with the blocks MPASS, R1, EA1, MCG1, MCM1, MCM2, C1 and Control Logic marked.]

Fig. 6.8. Chip micrograph of the discrete load adaptive LDO. The LDO is implemented as part of a complete ultra-low-power MCU system in a 0.13 µm standard CMOS technology.

6.4 Discrete Load Adaptive LDO Steady-State Operation

By setting the current drive level (CDL), the current drive capability of the discrete load adaptive LDO can be dynamically adapted during operation. In the lowest drive level, the quiescent current is minimized, while in the highest drive level, the maximum load current can be provided at an accordingly higher quiescent current level. To achieve a high LDO current efficiency over a wide range of load current, the drive capability adaption strategy is of vital importance for the effectiveness of the discrete load adaption scheme. At the same time, both static and dynamic LDO accuracy (associated with the small-signal voltage gain and gain-bandwidth) as well as the LDO loop stability (associated with the phase margin) have to be maintained and guaranteed for each current drive level. Furthermore, the requirements for dynamic switching between current drive levels must also be considered and aligned with the drive capability adaption strategy.

The following section focuses on the steady-state operation of the discrete load adaptive LDO within one current drive level. Here, the drive capability adaption strategy for the discrete load adaptive LDO is elaborated step by step. Corresponding to the common LDO design procedure, the considerations start with the pass-transistor and proceed to the cascoded flipped voltage follower (FVF) and the folded-cascode amplifier. The effectiveness of the proposed adaption strategy is ultimately confirmed by simulation and experimental results. This particularly includes the loop stability, the static regulation performance, as well as the transient regulation performance for each current drive level.

6.4.1 Drive Capability Adaption Strategy

For the any-load stable LDO, just as for any other externally compensated LDO topology, the performance requirements of the error amplifier are in various respects determined by the maximum load current. As revealed by the detailed analysis of the any-load stable LDO (see particularly Chap. 5.4), this primarily includes two aspects: (1) The pass-transistor size and thus its gate capacitance are directly proportional to the maximum load current, thus determining the gate capacitance to be driven by the error amplifier. (2) The dominant pole p1 is directly proportional to the maximum load current, thus determining the required location of the first non-dominant pole p2 to maintain loop stability at maximum load current. By restricting the maximum load current, the discrete load adaption scheme allows the performance of the error amplifier, and thus its quiescent current, to be adapted depending on the drive level setting. Based on the above boundary conditions, two alternative strategies for the drive capability adaption are presented and compared in the following. (1) For the first adaption strategy, the location of the first non-dominant pole p2 is adapted, resulting in a linear scaling of the LDO quiescent current with the current drive capability. (2) By also adapting the pass-transistor gate capacitance at the same time, the second adaption strategy in contrast offers a scaling of the quiescent current with the current drive capability to the power of two. For both drive capability adaption strategies, the small-signal voltage gain and the phase margin are maintained, and only the gain-bandwidth is adapted depending on the drive level setting. Similarly to the investigation of the any-load stable LDO topology, the LDO is again split at the folded-cascode amplifier output and divided into its two stages. The main focus for the drive capability adaption strategy is on the fast LDO stage in form of the cascoded FVF, since it dominates the overall LDO quiescent current demand. The following considerations of the steady-state operation of the cascoded FVF are best illustrated by the simplified equivalent circuit model (see Fig. 5.13) introduced and derived in Chap. 5.4. The adaption scheme for the slow stage in form of the folded-cascode amplifier is in contrast rather uncritical and is addressed subsequently.

Cascoded Flipped Voltage Follower

The dominant pole p1, and consequently also the gain-bandwidth of the cascoded FVF, are directly determined by the load current. Assuming the pass-transistor is operated in strong inversion, which is valid at medium to high load conditions, the pole frequency ωp1 is directly proportional to the load current (see also Chap. 5.4.1, Eq. 5.28).

$$\omega_{p1,high} = \frac{1}{r_{ds,PASS} \cdot C_{LOAD}} \cong \frac{\lambda_{si} \cdot I_{LOAD,max}}{C_{LOAD}} \tag{6.3}$$

By restricting the maximum load current with the help of the discrete load adaption scheme, the first non-dominant pole p2 can be shifted towards lower frequencies without causing stability issues, for instance by adapting the resistance R1. At the same time, however, the pass-transistor voltage gain increases with the reciprocal of the square root of the load current, in this way counteracting the frequency decrease of the dominant pole p1. As a result, the LDO gain-bandwidth decreases only with the square root of the load current, and the full potential of quiescent current saving cannot be exploited. This is further aggravated by the high tolerances that need to be taken into account to guarantee loop stability also at low drive capability levels, particularly when the pass-transistor enters weak inversion.
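The square-root dependence can be made explicit with a short worked step. This is a sketch under the assumption that the gain-bandwidth of the fast loop is proportional to the product of the pass-transistor voltage gain and the dominant pole frequency, with a fixed pass-transistor aspect ratio and strong-inversion operation:

$$A_{v,PASS} \propto \frac{1}{\sqrt{I_{LOAD,max}}}, \qquad \omega_{p1} \propto I_{LOAD,max} \quad \Rightarrow \quad \omega_{GBW} \propto A_{v,PASS} \cdot \omega_{p1} \propto \sqrt{I_{LOAD,max}}$$

Reducing the maximum load current by a factor of 32 would thus lower the gain-bandwidth only by a factor of about 5.7 if the pass-transistor width were left unchanged.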

To exploit the full potential of the discrete load adaption scheme, the pass-transistor width is instead adapted according to the maximum current drive capability in the respective LDO drive level. In this way, the small-signal voltage gain of the pass-transistor MPASS becomes independent of the selected LDO drive level (see also Chap. 5.4.2, Eq. 5.46).

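$$A_{v,PASS} \cong g_{m,PASS} \cdot r_{ds,PASS} = \frac{1}{\lambda_{si}} \cdot \sqrt{\frac{2 \cdot \mu_p \cdot C_{OX} \cdot \left(\frac{W}{L}\right)_{PASS}}{I_{LOAD,max}}} \tag{6.4}$$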

Although the small-signal voltage gain is expressed here for strong inversion, this relation remains valid independent of the transistor operating region, particularly independent of the dropout voltage and load current. With the pass-transistor voltage gain independent of the selected LDO drive level, the LDO gain-bandwidth becomes directly proportional to the maximum current drive capability. For the discrete load adaption scheme, the pass-transistor width is therefore adapted by dividing it into multiple segments, which are dynamically controlled depending on the drive level setting. Two alternative drive capability adaption strategies for the cascoded FVF, differing in the transistor switch configuration, are presented and evaluated subsequently.
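As a brief worked step, the independence of the voltage gain from the drive level follows directly from Eq. 6.4 once the active aspect ratio is scaled with the maximum load current. The proportionality constant $\kappa$ is a hypothetical symbol introduced here for illustration only:

$$\left(\frac{W}{L}\right)_{PASS} = \kappa \cdot I_{LOAD,max} \quad \Rightarrow \quad A_{v,PASS} \cong \frac{1}{\lambda_{si}} \cdot \sqrt{2 \cdot \mu_p \cdot C_{OX} \cdot \kappa}$$

The maximum load current cancels, so the gain is set only by technology parameters and the chosen width per unit of current.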

[Figure 6.9: two simplified circuit diagrams of the cascoded FVF with the pass-transistor MPASS divided into segments [W/L]1, [W/L]2, ..., [W/L]X that are controlled by CDL1, CDL2, ..., CDLX; (a) switches MSWITCH at the drain nodes, (b) switches at the gate nodes. Further labels: VDD, VGATE, VAMP, gm,CG1, R1, VCORE, CLOAD, ILOAD.]

Fig. 6.9. Simplified circuit diagram of the two alternative drive capability adaption strategies for the cascoded flipped voltage follower (FVF). (a) For the first adaption strategy, the pass-transistor width is controlled by switches at the drain node, resulting in a linear scaling of the quiescent current with the current drive capability. (b) By shifting the switches to the gate node of the pass-transistor segments for the second adaption strategy, a quadratic scaling of the quiescent current can be achieved.

For the first drive capability adaption strategy, the pass-transistor segments are individually controlled by introducing switches at their drain nodes, as illustrated in Fig. 6.9(a). This adaption strategy may appear unattractive at first sight, since the additional switch transistors in the power path require additional voltage headroom. However, recalling the cascode pass-transistor topology introduced in Chap. 5.6.2, the thick-oxide cascode transistor is utilized to adapt the active pass-transistor width. In this way, no additional devices need to be introduced into the power path and, beyond that, the pass-transistor is replaced by a thin-oxide device, thus benefiting from CMOS technology scaling. The pass-transistor segments are individually turned off by pulling the gate node of the respective thick-oxide cascode transistor to the positive supply voltage. When activated, the respective thick-oxide cascode transistor is controlled by a common auxiliary amplifier, in this way keeping the source-drain voltage of the thin-oxide pass-transistor equal to a defined bias voltage. Since the gate nodes of all thin-oxide pass-transistor segments are permanently connected, the total pass-transistor gate capacitance presented to the cascoded FVF is independent of the selected LDO drive level. By combining these two boundary conditions, (1) the LDO gain-bandwidth is directly proportional to the maximum current drive capability and (2) the total pass-transistor gate capacitance is independent of the maximum current drive capability, and presuming that both the small-signal voltage gain Av,CASFVF and the phase margin (here expressed by the damping factor ζ) are to be maintained for each LDO drive level, the location of the first non-dominant pole p2 can be determined as:




(6.5)

The location of the first non-dominant pole p2 is adapted with the help of the resistance R1. As evident from the above expression, the resistance scales inversely with the maximum current drive capability and the activated pass-transistor width. For this purpose, the resistance R1 is implemented as a series resistor ladder, which is shorted stepwise depending on the drive level setting. To maintain the small-signal voltage gain independent of the resistance scaling, the bias current IDS,CM1 of the cascoded FVF needs to be adapted correspondingly.


n · VT

)·Av,PASS (6.6)

To adapt the bias current IDS,CM1 depending on the drive level setting, the current mirror MCM1 is implemented as a current-mode DAC. By adapting the bias current IDS,CM1, the bias current through the common-gate transistor MCG1 also varies depending on the LDO drive level. Since the transistor is operated in deep weak inversion, with its transconductance ideally being proportional to the bias current and independent of the transistor aspect ratio, the small-signal voltage gain Av,CG12 remains constant. However, due to the wide variation of its bias current from lowest to highest drive level, the operating region of the transistor unavoidably changes. To avoid second-order variations of the small-signal voltage gain Av,CG12, the active width of the common-gate transistor MCG1 is also adapted depending on the LDO drive level by dividing it into multiple segments. Since the common-gate transistor MCG2 merely provides a folding function to the cascoded FVF, a dynamic adaption depending on the drive level setting is not required. Nonetheless, it must be ensured that the transistor remains in the saturation region, and that the voltage level at the node VFOLD is sufficient such that both the common-gate transistor MCG1 and the current mirror MCM1 remain in the saturation region in both the highest and the lowest current drive level. At the same time, however, the current sink capability is also defined by the common-gate transistor MCG2, as discussed in Chap. 5.7. To adapt the current sink capability at the same rate as the current drive capability, the biasing of the common-gate transistor MCG2, defined by the current source MCM3 in Fig. 6.7, is adapted.

When the individual segments are controlled by switches at the drain nodes, the total pass-transistor gate capacitance presented to the cascoded FVF is independent of the drive level setting. As a result, the LDO quiescent current scales linearly with the maximum current drive capability (Iq ∝ ILOAD,max) for this first drive capability adaption strategy. An alternative approach to adapting the pass-transistor width is to introduce switches at the gate nodes of the individual transistor segments, as illustrated in Fig. 6.9(b). In this way, the pass-transistor segments are completely separated from the cascoded FVF when turned off. As a result, not only the LDO gain-bandwidth but also the pass-transistor gate capacitance is directly proportional to the maximum current drive capability. By again combining these two boundary conditions, and presuming that both the small-signal voltage gain Av,CASFVF and the phase margin (here expressed by the damping factor ζ) are to be maintained for each LDO drive level, the location of the first non-dominant pole p2 can be determined as:




(6.7)

As evident from the above expression, the resistance R1 in this case scales with the maximum current drive capability and the activated pass-transistor width to the inverse power of two. The basic principle for adapting the remaining devices and bias currents corresponds to that of the first drive capability adaption strategy: they are adapted in accordance with the resistance R1 in the respective drive level setting. In conclusion, the LDO quiescent current scales with the maximum current drive capability to the power of two ($I_q \propto I_{LOAD,max}^2$). The quiescent current scaling of this drive capability adaption strategy is thus consistent with the ideal scaling limits identified in Chap. 5.8.1. Disregarding non-ideal effects due to the additional switches at the pass-transistor gate node, the quiescent current demand thus corresponds to that of an LDO customized and optimized for the same current drive capability.
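The two scaling laws can be retraced with a short approximate derivation. It is a sketch and not necessarily the form of Eq. 6.5 or Eq. 6.7: it starts from the damping factor of Eq. 6.12, assumes $\omega_{p2} \gg \omega_{p1}$, and assumes that p2 is set by R1 and the connected pass-transistor gate capacitance, written here with the hypothetical symbol $C_G$:

$$\zeta \approx \frac{1}{2} \sqrt{\frac{\omega_{p2}}{\omega_{p1} \cdot (1 + A_{v,CASFVF})}} \quad \Rightarrow \quad \omega_{p2} \approx 4 \cdot \zeta^2 \cdot (1 + A_{v,CASFVF}) \cdot \omega_{p1} \propto I_{LOAD,max}$$

$$\omega_{p2} \approx \frac{1}{R_1 \cdot C_G}: \qquad C_G = \text{const.} \Rightarrow R_1 \propto \frac{1}{I_{LOAD,max}}, \qquad C_G \propto I_{LOAD,max} \Rightarrow R_1 \propto \frac{1}{I_{LOAD,max}^2}$$

If the bias current is adjusted inversely to R1 so that the stage gain stays constant, the quiescent current follows as $I_q \propto I_{LOAD,max}$ for drain-side switching and $I_q \propto I_{LOAD,max}^2$ for gate-side switching.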

Although this second drive level adaption strategy offers a superior quiescent current scaling compared to the first strategy, instantaneous switching between any drive levels cannot be achieved when the pass-transistor width is adapted by switching the gate node. By adding or removing substantial parts of the pass-transistor gate capacitance, high voltage spikes occur at the pass-transistor gate node when switching the LDO drive level. This results in voltage fluctuations at the LDO output that cannot be suppressed and that exceed the tolerance window required to guarantee fault-free operation of the MCU digital core. In contrast, the gate nodes of all pass-transistor segments are permanently connected to the cascoded FVF for the first drive capability adaption strategy. While the quiescent current scaling is reduced to a linear factor, this strategy enables highest system flexibility with arbitrary and instantaneous switching between any levels independent of the LDO operating condition. Only in this way can the high LDO current efficiency over the full load current range also be exploited under application conditions. For this reason, the first drive capability adaption strategy is chosen for implementation of the discrete load adaptive LDO. The implementation details required for the dynamic drive level adaption during operation, particularly with respect to the pass-transistor and the series resistor ladder, are further elaborated in Chap. 6.5.
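To give a feeling for what is traded away by choosing the first strategy, the following sketch tabulates the idealized relative quiescent current of the scalable part of the LDO for both strategies. It assumes six binary-weighted drive levels, normalizes to the highest level, and ignores any constant offset and non-ideal effects. The function name `relative_iq` is hypothetical.

```python
def relative_iq(level, num_levels=6, exponent=1):
    """Idealized quiescent current relative to the highest drive level.

    exponent=1 models the linear scaling of the first strategy,
    exponent=2 the quadratic scaling of the second strategy.
    """
    relative_drive = 2.0 ** (level - num_levels)  # binary-weighted levels
    return relative_drive ** exponent


for level in range(1, 7):
    linear = relative_iq(level, exponent=1)
    quadratic = relative_iq(level, exponent=2)
    print(f"CDL{level}: linear {linear:.5f}, quadratic {quadratic:.7f}")
```

In the lowest level the idealized quadratic scaling would need 32 times less current than the linear one, which is the saving given up in exchange for instantaneous switching between arbitrary levels.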

To prove the effectiveness of the proposed adaption strategy, Fig. 6.10 shows the simulated small-signal voltage gain Av,CASFVF as well as the relevant pole frequencies p1 and p2 of the cascoded FVF for each current drive level (CDL). Both the dominant pole p1 and the first non-dominant pole p2 depend on the selected current drive level. They scale linearly with the maximum LDO current drive capability and therefore span a dynamic range of 32. While the frequency of the dominant pole ranges between ωp1 = 2π · 136.1 kHz in the highest drive level (CDL6) and ωp1 = 2π · 4.3 kHz in the lowest drive level (CDL1), the frequency of the first non-dominant pole ranges between ωp2 = 2π · 3087 kHz and ωp2 = 2π · 102 kHz, respectively. In contrast, the small-signal voltage gain remains constant over the current drive levels, amounting to Av,CASFVF = 41 dB. The simulation results are in good accordance with the theoretical considerations obtained from the simplified equivalent circuit model derived in Chap. 5.4.5.


[Figure 6.10: three plots over the current drive level (CDL 1 to 6) for the corners weak/−40 °C, nominal/25 °C and strong/85 °C: (a) gain Av,CASFVF in dB, (b) frequency of p1 in kHz, (c) frequency of p2 in kHz, the latter two on logarithmic axes.]

Fig. 6.10. Cascoded flipped voltage follower (a) small-signal voltage gain Av,CASFVF, as well as (b) frequency of the dominant pole p1, and (c) the first non-dominant pole p2 for each current drive level (CDL) at the respective maximum load current and as a function of process and temperature variations at nominal supply voltage VDD = 3.0 V.

Folded-Cascode Amplifier

In contrast to the cascoded FVF, the adaption scheme for the folded-cascode amplifier is rather uncritical when adhering to the following two boundary conditions: (1) To maintain the static accuracy of the LDO independent of the current drive level, the small-signal voltage gain of the folded-cascode amplifier must remain constant. (2) To maintain stability of the LDO at low load conditions independent of the current drive level, the location of the left-half-plane (LHP) zero z0 must follow the movement of both low-frequency poles p1 and p2. The LHP zero is, as demonstrated in Chap. 5.1, determined by the gain-bandwidth of the folded-cascode amplifier. Consequently, the aim of the adaption scheme for the folded-cascode amplifier is to adapt its gain-bandwidth depending on the drive level setting while maintaining its small-signal voltage gain.

In combination with the capacitance C1, the folded-cascode amplifier forms a gm/C filter. Consequently, and as derived in Chap. 5.1.1, the frequency of the LHP zero can be expressed as:

$$\omega_{z0} = A_{v,EA1} \cdot \omega_{p0} \cong -\frac{g_{m,1}}{C_1} \tag{6.8}$$

With the differential input pair operated in deep weak inversion (and hence $g_{m,1} = I_{DS,1} / (n \cdot V_T)$), the gain-bandwidth of the folded-cascode amplifier is directly proportional to its bias current. At the same time, the low-frequency voltage gain is independent of the bias current.


[Figure 6.11: two plots over the current drive level (CDL 1 to 6) for weak, nominal and strong process corners at −40 °C, 25 °C and 85 °C: (a) voltage gain Av,EA1 in dB (axis 40 to 80 dB), (b) frequency of z0 in kHz (logarithmic axis, 1 to 128 kHz).]

Fig. 6.11. Folded-cascode amplifier (a) small-signal voltage gain Av,EA1, and (b) gain-bandwidth for each current drive level (CDL) as a function of process and temperature variations at nominal supply voltage VDD = 3.0 V. By neglecting the additional loading by the gate leakage of the MOS capacitance C1, the theoretical considerations overestimate the voltage gain at low current drive levels.

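$$A_{v,EA1} \cong g_{m,1} \cdot r_{out,EA1} \tag{6.9}$$

$$r_{out,EA1} \cong \left( g_{m,6} \cdot r_{ds,6} \cdot (r_{ds,2} \,\|\, r_{ds,4}) \right) \,\|\, \left( g_{m,8} \cdot r_{ds,8} \cdot r_{ds,10} \right) \tag{6.10}$$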

where gm,1 is the transconductance of the differential input transistors, and rout,EA1 is the output resistance of the folded-cascode amplifier. While the theoretical considerations indicate a constant voltage gain, they neglect the gate leakage of the MOS capacitance C1. The gate leakage places an additional load on the folded-cascode amplifier, which is independent of the LDO drive level. This additional loading effect becomes particularly apparent at low bias current levels and causes the small-signal voltage gain Av,EA1 to drop. These considerations are confirmed by the simulation results in Fig. 6.11, which shows the small-signal voltage gain Av,EA1 and the gain-bandwidth ωGBW of the folded-cascode amplifier for each current drive level. In the highest drive level (CDL6), the folded-cascode amplifier offers a small-signal voltage gain of Av,EA1 = 71 dB, which reduces to Av,EA1 = 56 dB in the lowest drive level (CDL1). At the same time, the gain-bandwidth scales (approximately) linearly with the maximum current drive capability, ranging between ωGBW = 2π · 56.0 kHz in the highest drive level (CDL6) and ωGBW = 2π · 2.1 kHz in the lowest drive level (CDL1). This in turn shows that the gain degradation in the low drive levels is caused by the output resistance rout,EA1, while the transconductance gm,1 scales linearly with the bias current as intended.
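The gain drop at low drive levels can be illustrated with a simple first-order model. It is a sketch that represents the gate leakage of C1, as an assumption, by a fixed resistance (hypothetical symbol $r_{leak}$) in parallel with the amplifier output, and that takes both $g_{m,1}$ and $1/r_{out,EA1}$ as proportional to the bias current:

$$A_{v,EA1} \cong g_{m,1} \cdot \left( r_{out,EA1} \,\|\, r_{leak} \right) = \frac{g_{m,1} \cdot r_{out,EA1}}{1 + r_{out,EA1} / r_{leak}}$$

At high bias currents $r_{out,EA1} \ll r_{leak}$ and the gain approaches the constant product $g_{m,1} \cdot r_{out,EA1}$. At low bias currents $r_{out,EA1}$ grows until $r_{leak}$ dominates, so that $A_{v,EA1} \approx g_{m,1} \cdot r_{leak}$ falls with the bias current, while the gain-bandwidth $g_{m,1}/C_1$ keeps scaling linearly.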

6.4.2 Maintaining LDO Loop Stability

To achieve loop stability under all load conditions, the discrete load adaptive LDO employs an active feed-forward compensation scheme. As introduced in Chap. 5.1, the overall transfer function HOL(s) is determined by three low-frequency poles as well as one left-half-plane (LHP) zero, arising from the summation of the two feedback loops. Depending on the drive level setting, all relevant (low-frequency) poles and thus the gain-bandwidth of the discrete load adaptive LDO are adapted. Fig. 6.12 shows the simulated Bode plot for each current drive level (CDL) at the respective maximum load current, representing worst-case conditions for loop stability. The LDO gain-bandwidth is adapted by a factor of two with each current drive level and spans a dynamic range from ωGBW = 2π · 6800 kHz in the highest drive level (CDL6) down to ωGBW = 2π · 210 kHz in the lowest drive level (CDL1). The simulated dynamic range thus closely corresponds to the ideal factor of 32, as determined in the preceding section on the drive capability adaption strategy. While the LDO gain-bandwidth is adapted depending on the current drive level, both the low-frequency voltage gain and the phase margin are maintained, as demonstrated in the following. The simulated small-signal voltage gain at low frequency is Av,LDO = 105 dB in the highest drive level (CDL6) and is dominated by the high voltage gain provided by the folded-cascode amplifier. While the small-signal voltage gain remains essentially constant in the upper four drive levels, it starts dropping when the LDO current drive capability is reduced further, reaching Av,LDO = 93 dB in the lowest drive level (CDL1). This effect is caused by the dropping voltage gain of the folded-cascode amplifier due to non-ideal scaling effects, as revealed in the preceding section.

[Figure 6.12: Bode plot with gain in dB (−40 to 120) and phase in degrees (0 to 200) over frequency from 100 mHz to 100 MHz, with curves for CDL1 to CDL6 and the poles p0, p1 and p2 marked.]

Fig. 6.12. Simulated Bode plot for each current drive level (CDL) at the respective maximum load current and nominal condition (VDD = 3.0 V, Temp. = 25 °C). The LDO gain-bandwidth is adapted depending on the LDO drive level setting, while both low-frequency voltage gain and phase margin are maintained.

Since all relevant (low-frequency) poles move simultaneously when adapting the LDO current drive capability, loop stability is ensured for each current drive level. The simulated phase margin ranges between φM = 26° in the highest drive level (CDL6) and φM = 25° in the lowest drive level (CDL1). The simulation results can be confirmed by the simplified equivalent circuit model of the cascoded FVF. By combining the location of the two low-frequency poles p1 and p2 with the small-signal voltage gain Av,CASFVF as derived in Chap. 5.4.3, the phase margin can be expressed as:


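\[
\varphi_M = \arctan\left( \frac{2\,\zeta}{\sqrt{\sqrt{4\,\zeta^4 + 1} - 2\,\zeta^2}} \right) \tag{6.11}
\]

\[
\zeta = \frac{1}{2} \cdot \frac{\omega_{p1} + \omega_{p2}}{\sqrt{\omega_{p1} \cdot \omega_{p2}\,\left(1 + A_{v,CASFVF}\right)}} \tag{6.12}
\]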

with ζ denoting the damping factor. Evaluating the above expression yields a phase margin of φM = 26° in the highest drive level (CDL6) and φM = 25° in the lowest drive level (CDL1), both in good accordance with the above simulation results. Notably, the negative phase shift of the first dominant pole p0 is fully canceled by the positive phase shift of the LHP-zero and thus has no impact on the loop stability at high load conditions. Through the sizing of the resistor R1, the loop stability of the discrete load adaptive LDO is a free design parameter, which gives the opportunity to trade off the phase margin against the quiescent current demand. As evident from Fig. 6.12, this trade-off is here balanced in favor of a low quiescent current, accepting a worst-case phase margin of 25° at maximum load conditions.
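Eqs. (6.11) and (6.12) can be evaluated numerically with a few lines of Python. The following is a minimal sketch; the function names and the sample damping factors are chosen here for illustration and are not taken from the design data.

```python
import math


def damping_factor(w_p1: float, w_p2: float, a_v: float) -> float:
    """Eq. (6.12): damping factor from the two pole frequencies (rad/s)
    and the small-signal voltage gain a_v (linear value, not dB)."""
    return 0.5 * (w_p1 + w_p2) / math.sqrt(w_p1 * w_p2 * (1.0 + a_v))


def phase_margin_deg(zeta: float) -> float:
    """Eq. (6.11): phase margin in degrees for a given damping factor."""
    inner = math.sqrt(4.0 * zeta**4 + 1.0) - 2.0 * zeta**2
    return math.degrees(math.atan(2.0 * zeta / math.sqrt(inner)))


if __name__ == "__main__":
    # Illustrative damping factors (assumed values, not from the thesis).
    for zeta in (0.22, 0.23, 0.30):
        print(f"zeta = {zeta:.2f} -> phase margin = {phase_margin_deg(zeta):.1f} deg")
```

Under this relation, damping factors of about 0.22 to 0.23 correspond to phase margins of roughly 25° to 26°, which is the range reported above.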

The LDO stability analysis in conclusion confirms the theoretic considerations of the drive capability adaption strategy. For the discrete load adaptive LDO, both the low-frequency voltage gain and the phase margin are maintained for each drive level. The gain-bandwidth is adapted at the same rate as the maximum current drive capability - which in turn is adapted at the same rate as the LDO quiescent current, as will become evident in the subsequent section.
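The ideal binary scaling of the gain-bandwidth can be checked against the simulated end points with a short calculation. This is a sketch that assumes the gain-bandwidth halves exactly with each step down from CDL6; the intermediate values it prints are ideal figures, not simulation results.

```python
def ideal_gbw_khz(level: int, gbw_top_khz: float = 6800.0, top_level: int = 6) -> float:
    """Ideal gain-bandwidth in kHz, assuming a factor of two per drive level
    below the highest level (CDL6, 6800 kHz as simulated)."""
    if not 1 <= level <= top_level:
        raise ValueError("drive level out of range")
    return gbw_top_khz / 2 ** (top_level - level)


if __name__ == "__main__":
    for level in range(6, 0, -1):
        print(f"CDL{level}: {ideal_gbw_khz(level):7.1f} kHz")
```

The ideal value for CDL1 is 212.5 kHz, close to the simulated 210 kHz.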

6.4.3 Adapting LDO Quiescent Current

For the discrete load adaption scheme, the LDO quiescent current depends on the current drive level and scales linearly with the maximum current drive capability (with a constant offset of 100 nA for the bias network). The LDO quiescent current is determined with a differential measurement approach, in which the LDO is first forced into the particular drive level, before it is disabled and the MCU digital core is supplied externally. As the ultra-low-power MCU system is set into sleep mode for this experiment, the MCU digital core draws no load current, except for a leakage current of 4 µA (at 25 °C). Fig. 6.13 shows the measured LDO quiescent current as a function of the maximum load current at nominal conditions (VDD = 3.0 V, Temp. = 25 °C). Clearly evident from the measurement results are the six discrete current drive levels of the demonstrator system, defined in a binary way and spanning a dynamic range of 2^0 to 2^5. In the lowest level (CDL1), a quiescent current of 650 nA is needed, while in the highest level (CDL6), the quiescent current is 17.7 µA and a load current of up to 2.56 mA can be provided. Since the LDO quiescent current is solely determined by the bias current, it is well defined and hence shows only a weak dependency on process, voltage and temperature variations. The variation of the LDO quiescent current is dominated by the accuracy of the reference bias current generation. The measurement results are consequently in good accordance with the simulation results.

By adapting the LDO quiescent current depending on the load condition, the LDO current efficiency ideally remains above 97 % over two decades of load current (also see Fig. 6.13). Depending on the significance and accuracy of the load current prediction, the actual current efficiency might be lower. In contrast to this, the dashed line in Fig. 6.13 indicates the LDO current efficiency of a conventional LDO topology with a load-independent quiescent current demand. It quickly drops at lower load conditions and reaches only 59 % at a load current of 25 µA, which is 38 percentage points lower than for the discrete load adaption scheme.
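The efficiency figures quoted above can be reproduced from the measured end points. The sketch below assumes a strictly linear quiescent current model (100 nA offset plus a term proportional to the drive capability, fitted to the 650 nA measured in CDL1) and an ideal load prediction that always selects the lowest sufficient drive level; both are simplifications introduced here.

```python
I_BIAS_OFFSET = 100e-9   # constant bias network offset from the text (A)
I_Q_CDL1 = 650e-9        # measured quiescent current in CDL1 (A)
I_MAX_CDL1 = 0.08e-3     # maximum load current in CDL1 (A)
NUM_LEVELS = 6


def max_load_current(level: int) -> float:
    """Maximum load current of a drive level (binary weighting)."""
    return I_MAX_CDL1 * 2 ** (level - 1)


def quiescent_current(level: int) -> float:
    """Assumed linear model: offset plus a part scaling with drive capability."""
    return I_BIAS_OFFSET + (I_Q_CDL1 - I_BIAS_OFFSET) * 2 ** (level - 1)


def current_efficiency(i_load: float, i_q: float) -> float:
    """Current efficiency ILOAD / (ILOAD + Iq)."""
    return i_load / (i_load + i_q)


def lowest_sufficient_level(i_load: float) -> int:
    """Idealized level selection: lowest level that can supply i_load."""
    for level in range(1, NUM_LEVELS + 1):
        if i_load <= max_load_current(level):
            return level
    raise ValueError("load current exceeds the highest drive level")


if __name__ == "__main__":
    i_q_fixed = quiescent_current(NUM_LEVELS)  # conventional LDO: always CDL6
    for i_load in (25e-6, 80e-6, 640e-6, 2.56e-3):
        level = lowest_sufficient_level(i_load)
        eta_adaptive = current_efficiency(i_load, quiescent_current(level))
        eta_fixed = current_efficiency(i_load, i_q_fixed)
        print(f"{i_load * 1e6:7.1f} uA: CDL{level}, "
              f"adaptive {eta_adaptive:.1%}, fixed {eta_fixed:.1%}")
```

With these assumptions the model returns 17.7 µA for CDL6, matching the measured value, and at a 25 µA load about 97.5 % for the adaptive scheme versus about 58.5 % for a fixed quiescent current.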


Fig. 6.13. Measured LDO quiescent current at VDD = 3.0 V, Temp. = 25 °C and calculated current efficiency ILOAD / (ILOAD + Iq) as a function of current drive capability. The dashed line indicates a conventional LDO with load-independent quiescent current. The current efficiency is highly improved in low load conditions and thus remains above 97 % over two decades of load current.

6.4.4 Experimental Results on Static Regulation Performance

The effectiveness of the adaption strategy presented above is examined in the following with respect to the static regulation performance, using simulation and experimental results. This includes the static line regulation, the static load regulation and the static offset error for each current drive level (CDL). The experimental results are obtained with the ultra-low-power MCU system set into sleep mode. Since the system clock is stopped in this mode, the MCU digital core draws no current, except for a leakage current of 4 µA (at 25 °C). The experimental setup conditions, including supply voltage, load current and external control signals, are (statically) applied externally as stated in each case.

Static Line Regulation

The static line regulation describes a steady-state voltage variation at the LDO output resulting from changes in supply voltage, as introduced in Chap. 4.1.2. It is determined by the low-frequency error amplifier voltage gain, which for the discrete load adaptive LDO is the product of the folded-cascode amplifier gain Av,EA1 and the cascoded FVF gain Av,CG12. Notably, the pass-transistor gain has essentially no impact on the static line regulation.

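\[
\frac{\Delta V_{CORE}}{\Delta V_{DD}} = \frac{1 + A_{v,PASS}}{1 + \left(A_{v,EA1} + 1\right) \cdot A_{v,CG12} \cdot A_{v,PASS}} \cong \frac{1}{A_{v,EA1} \cdot A_{v,CG12}} \tag{6.13}
\]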

Fig. 6.14(a) shows the output voltage variation over supply voltage for both the lowest drive level (CDL1) and the highest drive level (CDL6) at the respective maximum load current and room temperature. Owing to the high voltage gain provided by the folded-cascode amplifier, the discrete load adaptive LDO shows an excellent static line regulation, which is basically independent of the drive level setting. The LDO line regulation in the highest drive level (CDL6) is 0.2 mV/V across the complete supply voltage range from 1.9 V up to 3.6 V. In the lowest drive level (CDL1), the error amplifier voltage gain is reduced by 10 dB, resulting in a slightly degraded line regulation of 0.6 mV/V. At a supply voltage of VDD = 1.7 V, the LDO reaches its dropout limit. Independent of the drive level setting, the pass-transistor enters the triode region and the error amplifier voltage gain starts to drop. Ultimately, the LDO is not able to maintain regulation and the output voltage begins to follow the supply voltage.

Fig. 6.14. Simulated and measured LDO static line regulation performance. (a) Output voltage variation over supply voltage for both the lowest drive level (CDL1) and the highest drive level (CDL6), and (b) LDO static line regulation over the complete supply voltage range for each current drive level. The simulation and measurement results are obtained at the respective maximum load current and room temperature.

Complementing the previous results, Fig. 6.14(b) summarizes the LDO static line regulation for each current drive level over the complete supply voltage range from 1.9 V up to 3.6 V. The measurement results, collected from eight units from one wafer, match well with the range indicated by Monte-Carlo simulation (light gray bar, 100 runs, 3σ).
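The relation between error amplifier gain and line regulation in the approximation of Eq. (6.13) can be made concrete with a short conversion. This is a sketch only: it back-calculates an effective gain from the reported 0.2 mV/V in CDL6 and then applies the stated 10 dB gain reduction; the effective gain value is derived here and is not a figure from the thesis.

```python
import math


def line_regulation_mv_per_v(gain_db: float) -> float:
    """Approximate form of Eq. (6.13): 1 / (Av,EA1 * Av,CG12), in mV/V."""
    return 1e3 / 10 ** (gain_db / 20.0)


def gain_db_from_line_regulation(mv_per_v: float) -> float:
    """Inverse relation: effective gain in dB for a given line regulation."""
    return 20.0 * math.log10(1e3 / mv_per_v)


if __name__ == "__main__":
    gain_cdl6 = gain_db_from_line_regulation(0.2)   # from the reported 0.2 mV/V
    gain_cdl1 = gain_cdl6 - 10.0                    # gain reduced by 10 dB in CDL1
    print(f"effective gain CDL6: {gain_cdl6:.1f} dB")
    print(f"line regulation CDL1: {line_regulation_mv_per_v(gain_cdl1):.2f} mV/V")
```

A 10 dB lower gain degrades the line regulation by a factor of about 3.2, which agrees with the step from 0.2 mV/V to roughly 0.6 mV/V.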

Static Load Regulation

Analogous to the LDO static line regulation, the load regulation is defined as the steady-state voltage variation resulting from changes in load current, as introduced in Chap. 4.1.2. It is determined by the open-loop output impedance of the discrete load adaptive LDO divided by its overall low-frequency voltage gain.


Fig. 6.15. Simulated and measured LDO static load regulation performance. (a) Output voltage variation over load current for the lowest drive level (CDL1) and the highest drive level (CDL6), and (b) LDO static load regulation over the full load current range for each current drive level. The simulation and measurement results are obtained at a supply voltage of VDD = 3.0 V and room temperature.

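\[
\frac{\Delta V_{CORE}}{\Delta I_{LOAD}} = \frac{r_{ds,PASS}}{1 + \left(A_{v,EA1} + 1\right) \cdot A_{v,CG12} \cdot A_{v,PASS}} \cong \frac{r_{ds,PASS}}{A_{v,EA1} \cdot A_{v,CG12} \cdot A_{v,PASS}} \tag{6.14}
\]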

Fig. 6.15(a) shows the output voltage variation over load current for both the lowest drive level (CDL1) and the highest drive level (CDL6) at nominal conditions, i.e. at a supply voltage of VDD = 3.0 V and room temperature. To determine the load regulation of the LDO separately and cancel out any parasitic line resistance, the output voltage is here determined directly at the LDO feedback point with the help of the differential sensing scheme introduced in Chap. 6.2.2. Owing to the high voltage gain provided by the folded-cascode amplifier, the discrete load adaptive LDO shows an excellent static load regulation. In the highest drive level (CDL6), the LDO load regulation is below 0.01 mV/mA across the complete load current range from 0 mA up to 2.56 mA. While the error amplifier voltage gain is essentially independent of the drive level setting, the open-loop output impedance of the discrete load adaptive LDO scales at the same rate as the maximum current drive capability. As a result, the LDO load regulation degrades to 0.3 mV/mA in the lowest drive level (CDL1), which is however of limited importance for practical purposes. Since the maximum current drive capability is reduced to 0.08 mA in the lowest drive level, the LDO load regulation remains essentially constant when expressed in absolute terms. As evident from Fig. 6.15(a), the measurement results for static load regulation are overall in good accordance with the simulation results, though the parasitic line resistance is not perfectly canceled by the differential sensing scheme. Due to this measurement error, an additional systematic load regulation is introduced, which becomes particularly apparent at high current drive levels.

Complementing the previous results, Fig. 6.15(b) summarizes the LDO static load regulation for each current drive level over the respective full load current range. The measurement results, collected from eight units from one wafer, match well with the range indicated by Monte-Carlo simulation (light gray bar, 100 runs, 3σ).
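The statement that the load regulation stays essentially constant in absolute terms follows directly from the reported numbers. The sketch below multiplies the load regulation of each extreme drive level by its full-scale load current; the dictionary layout is chosen here for illustration, and the CDL6 figure is an upper bound since the text reports "below 0.01 mV/mA".

```python
# Reported load regulation and full-scale load current of the two extreme levels.
LEVELS = {
    "CDL1": {"load_reg_mv_per_ma": 0.3, "i_max_ma": 0.08},
    "CDL6": {"load_reg_mv_per_ma": 0.01, "i_max_ma": 2.56},  # upper bound
}


def full_scale_deviation_uv(load_reg_mv_per_ma: float, i_max_ma: float) -> float:
    """Absolute output voltage change over the full load range, in microvolts."""
    return load_reg_mv_per_ma * i_max_ma * 1e3


if __name__ == "__main__":
    for name, data in LEVELS.items():
        dv = full_scale_deviation_uv(data["load_reg_mv_per_ma"], data["i_max_ma"])
        print(f"{name}: {dv:.1f} uV over the full load range")
```

Both levels end up at roughly 24 µV to 26 µV over their respective full load range.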

Offset Error

A static offset of the error amplifier (VOS,EA1) directly contributes to the LDO regulation error, as introduced in Chap. 4.1.2. For the discrete load adaptive LDO, this offset is primarily determined by random mismatch of the differential input transistors.

\[
\frac{\Delta V_{CORE}}{\Delta V_{OS,EA1}} = 1 \tag{6.15}
\]

In order to determine the LDO offset error separately - excluding the error in the bandgap reference voltage - the LDO reference voltage is here provided externally. Fig. 6.16 shows the LDO offset error for each current drive level, determined at a supply voltage of VDD = 3.0 V, zero load current and room temperature. For the discrete load adaptive LDO, the offset increases at smaller current drive levels. This effect has two main causes. First, the gain of the folded-cascode amplifier is reduced, introducing a systematic offset. Second, the biasing of the differential input pair changes and the matching becomes worse at smaller bias current levels, i.e. when the differential input pair is operated in deep weak inversion.

The measurement results, collected from eight units from one wafer, match well with the range indicated by Monte-Carlo simulation (light gray bar, 100 runs, 3σ). It should however be noted that these results have to be treated with care. As the LDO offset error is determined by random device mismatch, ultimate verification would require measurement data from a larger number of units from different wafer lots.

Fig. 6.16. Simulated and measured LDO offset error for each current drive level (CDL) at a supply voltage of VDD = 3.0 V, zero load current and room temperature. The measurement results, obtained and summarized from eight units from one wafer, match well with the range indicated by Monte-Carlo simulation (light gray bar, 100 runs, 3σ).


Summary

Two conclusions can be drawn from the experimental results on the LDO static regulation performance. First, due to the high voltage gain of the folded-cascode amplifier, the discrete load adaptive LDO shows an excellent static accuracy (including static line regulation, static load regulation and offset error). This proves the effectiveness of the control strategy, which addresses the static and the transient regulation separately by decomposing the control tasks into constituent feedback loops. Second, the excellent static accuracy is not compromised when reducing the current drive level (and with that the LDO quiescent current). This in turn proves the effectiveness of the drive capability adaption strategy with regard to the static regulation performance.

6.4.5 Experimental Results on Transient Regulation Performance

After evaluating the static regulation performance of the discrete load adaptive LDO, the focus is directed in the following to its transient regulation performance. In this regard, the adaption strategy presented above is examined by simulated and experimental results, focusing particularly on the transient load regulation for each current drive level. While the supply voltage of an ultra-low-power MCU system is rather constant in a battery-powered application, the digital nature of the load places stringent requirements on the LDO load transient performance. To experimentally determine the transient load regulation, the ultra-low-power MCU system is again set into sleep mode with the system clock stopped. The load current profiles are generated by the on-chip programmable load, while the MCU digital core itself draws no load current, except for a leakage current of 4 µA at 25 °C, which decreases to 0.2 µA at −40 °C. The LDO transient response is measured by utilizing the differential sensing scheme introduced in Chap. 6.2.2.

Transient Load Regulation

The transient load regulation defines the LDO behavior in response to fast load current steps, as introduced in Chap. 4.1.2. To determine the tolerance window necessary to guarantee fault-free operation of the MCU digital core, the worst-case transient under- and overshoot are of particular importance. The cascoded flipped voltage follower (FVF) allows the discrete load adaptive LDO to react instantaneously to fast load transient conditions. The transient voltage error is determined by its small-signal voltage gain and gain-bandwidth, as identified and quantified in Chap. 5.4.4. The transient response of the cascoded FVF is superposed by the slow response of the folded-cascode amplifier, which brings the output voltage back to its accurate final value.

To begin with, and as illustrated in Fig. 6.17(a), a full-scale load current step from 0 mA to 2.56 mA is applied under worst-case conditions (VDD = 1.9 V, Temp. = −40 °C). The simulated and measured LDO behavior in response is depicted in Fig. 6.17(b), with the LDO operated in the highest current drive level (CDL6) at a quiescent current of Iq = 17.7 µA. The unmatched pole-zero cancellation resulting from the two feedback loops of the discrete load adaptive LDO is clearly evident from the LDO transient response. Owing to the cascoded FVF with its high gain-bandwidth, the discrete load adaptive LDO is able to react instantaneously to the load current step, but with limited voltage gain only. The resulting transient voltage errors of −27 mV and +15 mV are well within the tolerance window necessary to guarantee fault-free operation of the MCU digital core (here +30/−70 mV). Subsequently, the output level settles slowly back to the accurate nominal level determined by the slow folded-cascode amplifier, which provides high voltage gain to the feedback loop. As an LDO designed for supplying digital circuits aims to maintain the output voltage within a specified tolerance window, a slow settling at the LDO output can however be tolerated for this application. To allow a detailed observation of the settling behavior of the cascoded FVF, Fig. 6.17(c) shows an enlarged detail view of the LDO load transient response. The settling behavior corresponds to that of a minimum-phase, linear, second-order negative feedback system, and is as such determined by the damping factor ζ as well as the resonance frequency ωn, as identified and quantified in Chap. 5.4.4. The close correspondence between the experimental results and the simulation results is evident.

Fig. 6.17. Simulated and measured LDO transient behavior (a) in response to a load current step from 0 mA to 2.56 mA (b) when operating in the highest drive level (CDL6) with a quiescent current of Iq = 17.7 µA, and (c) showing an enlarged detail view, as well as (d) in response to a load current step from 0 mA to 0.08 mA (e) when operating in the highest drive level (CDL6) with a quiescent current of Iq = 17.7 µA, and (f) when operating in the lowest drive level (CDL1) with a quiescent current of Iq = 650 nA, and (g) showing an enlarged detail view. All simulation and measurement results are obtained under worst-case conditions (VDD = 1.9 V, Temp. = −40 °C).

For comparison, and as illustrated in Fig. 6.17(d), a small load current step from 0 mA to 0.08 mA is applied, while the LDO is again operated in the highest current drive level CDL6 with a quiescent current of Iq = 17.7 µA. As evident from the LDO transient response depicted in Fig. 6.17(e), the discrete load adaptive LDO is in this case unnecessarily fast and the tolerance window is not fully utilized. The discrete load adaptive LDO can thus be slowed down to the lowest current drive level CDL1, which requires a quiescent current of only Iq = 650 nA. The LDO behavior in response to a load current step from 0 mA to 0.08 mA in this drive level is depicted in Fig. 6.17(f). The resulting transient voltage errors of −27 mV and +18 mV are very similar to those in response to a full-scale load current step in the highest current drive level, and are again well within the specified tolerance window of +30/−70 mV. Fig. 6.17(g) shows an enlarged detail view of the LDO load transient response in the lowest current drive level. The close correspondence between the experimental results and the simulation results, both with respect to the damping factor ζ and the resonance frequency ωn, is again evident.
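The damping factor ζ and the resonance frequency ωn of a second-order response can be estimated from the ringing in a measured waveform. The sketch below uses the logarithmic decrement, a common textbook method that is assumed here and is not necessarily the extraction procedure used for the results in this chapter. Peak amplitudes are taken relative to the settled value, and the function names are illustrative.

```python
import math


def damping_from_peaks(peak_1: float, peak_2: float) -> float:
    """Damping factor from two successive peaks of the same polarity,
    using the logarithmic decrement delta = ln(peak_1 / peak_2)."""
    delta = math.log(peak_1 / peak_2)
    return delta / math.sqrt(4.0 * math.pi**2 + delta**2)


def natural_frequency(ring_period_s: float, zeta: float) -> float:
    """Undamped resonance frequency (rad/s) from the observed ringing period."""
    omega_d = 2.0 * math.pi / ring_period_s      # damped ringing frequency
    return omega_d / math.sqrt(1.0 - zeta**2)


if __name__ == "__main__":
    # Hypothetical readings from an oscilloscope trace (not measured data).
    zeta = damping_from_peaks(peak_1=20e-3, peak_2=4e-3)
    omega_n = natural_frequency(ring_period_s=5e-6, zeta=zeta)
    print(f"zeta = {zeta:.3f}, f_n = {omega_n / (2.0 * math.pi) / 1e3:.1f} kHz")
```

Because the fast response of the cascoded FVF is superposed by the slow settling of the folded-cascode amplifier, the baseline drifts during the ringing; a real extraction would have to account for that.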

When comparing the LDO load transient response in the lowest current drive level (CDL1) with that in the highest current drive level (CDL6), particular attention should be paid to the different scaling of the time axis in Fig. 6.17. In order to allow for a direct comparison, the time axes are scaled according to the LDO drive capability. The LDO gain-bandwidth can be determined from the load transient response by extracting the resonance frequency ωn. For the lowest current drive level (CDL1), the extracted gain-bandwidth of ωGBW = 2π · 200 kHz closely corresponds to the small-signal analysis results.¹ In contrast, the extracted gain-bandwidth of ωGBW = 2π · 11900 kHz for the highest current drive level (CDL6) is significantly higher than indicated by the small-signal analysis (ωGBW = 2π · 6800 kHz). The difference between the small-signal analysis and the large-signal transient response for the highest current drive level becomes even more evident when considering the loop stability. The discrete load adaptive LDO exhibits significantly less ringing in response to a full-scale load current step when operating in the highest drive level (CDL6) than when operating in the lowest drive level (CDL1). Determining the damping factor from the load transient response yields a phase margin of φM = 34° in the highest drive level (CDL6) and φM = 25° in the lowest drive level (CDL1). In contrast, the small-signal analysis performed in Chap. 6.4.2 indicates a very similar phase margin for both drive level settings of φM = 26° and φM = 25°, respectively.

¹ Please note that the results for the transient load regulation and the small-signal analysis are obtained under different operating conditions (including supply voltage and temperature). The results can therefore not be compared directly, but nevertheless provide a good indication.

For the small-signal analysis, the on-chip power distribution network is modeled as purely capacitive, in accordance with the considerations in Chap. 4.3.2. Since any grid wire resistance is neglected, the phase margin obtained represents worst-case conditions for loop stability. The grid wire resistance however causes a left-half-plane (LHP) zero ωz,ESR in the LDO transfer function, which potentially improves the loop stability if located within reach of the LDO gain-bandwidth (such that ωz,ESR < 10 · ωGBW). For the above load transient simulations, performed as the last verification step after finalizing the design of both the LDO and the MCU digital core, an equivalent series resistance of RESR = 3 Ω is included, which is determined based on the digital backend design (place-and-route design stage). As a result, the LHP-zero is located at around ωz,ESR = 2π · 18 MHz, and thus becomes effective only in the highest current drive level (CDL6) when the gain-bandwidth is at its maximum. By partially counteracting the phase shift of the first non-dominant pole p2, the LHP-zero extends the gain-bandwidth and improves the loop stability compared to the results indicated by the small-signal analysis. For the lowest current drive level (CDL1), in contrast, the gain-bandwidth is at its minimum, such that the LHP-zero is located well beyond it. The LHP-zero is thus not able to counteract the phase shift, and the loop stability determined by extracting the damping factor is again in good accordance with the worst case obtained from small-signal analysis. The experiments on transient load regulation in this way prove the LDO loop stability under worst-case operating conditions.
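The position of the ESR zero relative to the gain-bandwidth can be checked numerically for the two extreme drive levels. The sketch assumes a single lumped series RC model, for which the zero lies at 1 / (2π · RESR · C); the capacitance it prints is inferred from the quoted RESR and zero frequency under that assumption and is not a value stated in this section. The gain-bandwidth figures are the small-signal results from Chap. 6.4.2.

```python
import math

R_ESR_OHM = 3.0        # equivalent series resistance from the text
F_ZERO_HZ = 18e6       # quoted LHP-zero frequency
F_GBW_HZ = {"CDL1": 210e3, "CDL6": 6.8e6}   # small-signal gain-bandwidth


def implied_capacitance(r_esr: float, f_zero: float) -> float:
    """Capacitance of a lumped series RC whose zero sits at f_zero (assumed model)."""
    return 1.0 / (2.0 * math.pi * f_zero * r_esr)


def zero_within_reach(f_zero: float, f_gbw: float, factor: float = 10.0) -> bool:
    """Criterion from the text: the zero matters if f_zero < factor * f_gbw."""
    return f_zero < factor * f_gbw


if __name__ == "__main__":
    c = implied_capacitance(R_ESR_OHM, F_ZERO_HZ)
    print(f"implied lumped capacitance: {c * 1e9:.2f} nF")
    for name, f_gbw in F_GBW_HZ.items():
        print(f"{name}: zero within reach = {zero_within_reach(F_ZERO_HZ, f_gbw)}")
```

The criterion is met for CDL6 and not for CDL1, consistent with the observation that only the highest drive level benefits from the zero.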

Summary

Due to the large gain-bandwidth of the cascoded flipped voltage follower, the discrete load adaptive LDO is able to react instantaneously to fast load transient steps, as demonstrated by the above experimental results. This again proves the effectiveness of the control strategy, which addresses the static and the transient regulation separately by decomposing the control tasks into constituent feedback loops. Furthermore, in accordance with the drive capability adaption strategy, only the gain-bandwidth of the discrete load adaptive LDO is adapted depending on the current drive level (and thus on the LDO quiescent current), while both the small-signal voltage gain and the phase margin are maintained. This is demonstrated by comparing the measured load transient response in the different current drive levels: when the time axis is scaled according to the LDO drive capability, the resulting waveforms are very similar. In particular, the transient voltage under- and overshoot in response to a respective full-scale load current step are independent of the drive level setting. This in turn proves the effectiveness of the drive capability adaption strategy also with regard to the transient regulation performance.

6.4.6 Dynamic Range Limitations

For the discrete load adaption scheme, the definition of the current drive levels is widely flexible and can thus be adapted to system requirements. While the theoretic considerations on the drive capability adaption strategy in Chap. 6.4.1 indicate no limitations, various non-ideal effects need to be considered for the definition of the current drive levels - particularly with respect to the dynamic range as well as the granularity.

According to the definition of the current drive levels, the sensing transistor MCG1, the pass-transistor MPASS and the resistor R1, as well as the current mirror banks MCM1, MCM2 and MCM3 are subdivided into multiple segments. The higher the dynamic range, the larger the ratio between the sum of all segments and the smallest segment, which might become problematic in various respects. (1) First, each segment, even when deactivated, cannot be isolated completely and hence still contributes to the respective node capacitance, particularly also due to the switch transistors needed for each segment. As a result, the node capacitances and consequently also the pole locations scale at a slower rate than indicated by the theoretic drive capability adaption strategy, and the ideal adaption laws no longer hold. (2) Second, the silicon area is determined by the extreme LDO drive levels - the highest drive level determines the transistor sizes, while the lowest determines the resistor value. In particular, the resistor R1 - which is inversely proportional to the maximum LDO current drive capability - might become extremely large at very low LDO drive levels with a correspondingly low bias current.
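The first limitation can be illustrated with a toy model. The sketch assumes that a node is built from 32 equal unit segments, that an active segment contributes a capacitance c_on and a deactivated one a residual c_off; the 2 % residual used as the default is a purely hypothetical number chosen for illustration, not a value extracted from the design.

```python
def node_capacitance(active: int, total: int = 32,
                     c_on: float = 1.0, c_off: float = 0.02) -> float:
    """Node capacitance (normalized) with `active` of `total` unit segments enabled.
    c_off models the residual of a deactivated segment (hypothetical value)."""
    return active * c_on + (total - active) * c_off


def capacitance_scaling_range(total: int = 32,
                              c_on: float = 1.0, c_off: float = 0.02) -> float:
    """Ratio of node capacitance between the highest and the lowest drive level."""
    highest = node_capacitance(total, total, c_on, c_off)
    lowest = node_capacitance(1, total, c_on, c_off)
    return highest / lowest


if __name__ == "__main__":
    print(f"ideal isolation:  {capacitance_scaling_range(c_off=0.0):.1f}")
    print(f"2 % residual:     {capacitance_scaling_range(c_off=0.02):.1f}")
```

Even a small residual per deactivated segment noticeably reduces the achievable scaling range below the ideal factor of 32 in this model, and the reduction grows with the number of segments.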

For each current drive level, additional switches and control signals are required. However, since the devices are built up of equal segments, with the segment size defined by the lowest current drive level, the resulting area overhead is low. The granularity of the current drive levels is instead limited by the resulting complexity of circuit design, verification and validation. In this context, the system requirements should also be taken into consideration. By choosing a non-linear definition of the current drive levels, as for the demonstrator system, a high dynamic range can be combined with a fine granularity at low load current levels, i.e. when power management overhead is most critical. A coarse granularity is in return accepted at high load current levels.

6.5 Discrete Load Adaptive LDO Dynamic Drive Level Adaption

After examining the steady-state operation of the discrete load adaptive LDO within one current drive level (CDL), the focus is directed in the following towards the dynamic switching between the drive levels during operation. In order to exploit the high LDO current efficiency over the full load current range under real application conditions as well, the dynamic drive level adaption strategy is of vital importance. This particularly includes arbitrary and instantaneous switching between any levels, independent of the LDO operating conditions. At the same time, the output voltage must be kept within the tolerance window necessary to guarantee fault-free operation of the MCU digital core (here +30/−70 mV) when dynamically switching between the drive levels.

The following section elaborates the dynamic drive level adaption strategy, focusing particularly on the implementation aspects necessary to enable an instantaneous switching between the drive levels while keeping dynamic settling effects such as charge injection and feed-through low. The effectiveness of the dynamic drive level adaption strategy is ultimately confirmed by simulation and experimental results. This particularly includes the transient voltage error induced by the drive level adaption and caused by dynamic settling effects, as well as the time required to achieve the new current drive capability.

6.5.1 Dynamic Drive Level Adaption Strategy

By adapting the current drive capability of the discrete load adaptive LDO, the biasing of both LDO stages - the cascoded flipped voltage follower (FVF) and the folded-cascode amplifier - is adapted. As a result, and in accordance with the drive capability adaption strategy presented in Chap. 6.4.1, the gain-bandwidth is adapted depending on the drive level setting, while both the small-signal voltage gain and the phase margin are maintained. When dynamically adapting the drive level, the LDO operating points change abruptly from one steady state to another. Accordingly, two aspects require particular attention for the definition and implementation of the dynamic drive level adaption strategy: the transient voltage error induced by the drive level adaption, as well as the settling time required until the LDO reaches its new current drive capability.

When dynamically switching between the LDO drive levels, the pass-transistor width and consequently its drain current (provided to the LDO output) abruptly changes by the same factor as the LDO current drive capability, presuming the pass-transistor source-gate voltage remains constant during the transition. At the same time, however, adapting the gain-bandwidth of the cascoded FVF causes a fluctuation of the pass-transistor source-gate voltage. To minimize the dynamic settling effects, and thus the transient voltage error induced by the drive level adaption, the basic approach is to adapt only the bias currents, while the internal node voltages remain approximately constant. In this way, a fast settling time of the LDO current drive capability is also enabled. This time is solely determined by adapting the pass-transistor width and by the settling of the bias currents (particularly that of the cascoded FVF). In order to achieve instantaneous adaption, the settling time of the bias current must be shorter than the response time of the LDO feedback loop determined by its gain-bandwidth. This should not be confused with the time required until the output voltage settles back to its nominal level. In response to the small but unavoidable transient voltage error, the LDO feedback loop reacts by adjusting the pass-transistor drain current, such that the output voltage settles back to its nominal level. In addition to the settling effects of the pass-transistor and the cascoded FVF, those of the folded-cascode amplifier are also translated to the LDO output, and thus contribute to the transient voltage error. In conclusion, the transient voltage error induced by the drive level adaption results from the superposition of all three settling effects. These are elaborated separately in the following, with a particular focus on the circuit design techniques required to keep the internal node voltages constant during switching and to achieve instantaneous adaption of the drive capability. Similarly to the investigation of the any-load stable LDO, the LDO is for this purpose broken up at the folded-cascode amplifier output and divided into its two stages. The main focus for the dynamic drive level adaption is on the fast LDO stage in the form of the cascoded FVF, since it is much more susceptible to dynamic settling effects due to its high gain-bandwidth. The adaption scheme for the slow stage in the form of the folded-cascode amplifier is in contrast rather uncritical, and is addressed subsequently.

Pass-Transistor

When dynamically switching between the LDO drive levels, the pass-transistor width is adapted according to the new current drive capability (also see Chap. 6.4.1). For this purpose, both the thin-oxide pass-transistor and the thick-oxide cascode transistor are divided into multiple segments. The pass-transistor segments are individually turned off by pulling the gate node of the respective cascode transistor segment to the positive supply voltage. When activated, the thick-oxide cascode transistor segments are controlled by a common auxiliary amplifier, which keeps the source-drain voltage of the thin-oxide pass-transistor equal to a defined bias voltage. A circuit diagram showing the pass-transistor implementation is depicted in Fig. 6.18. Due to charge sharing with the already activated cascode transistor segments, however, the auxiliary amplifier output VCGATE bounces up when one or more segments are activated. To nevertheless enable a fast settling time, an additional voltage follower is inserted to improve the current sink capability of the auxiliary amplifier. As a result, the amplifier output settles within a few tens of nanoseconds. At the same time, the pass-transistor gate voltage VGATE undershoots when additional segments are activated, due to charge sharing via the gate-drain capacitance. However, since the source-drain voltage of the pass-transistor is limited by the auxiliary amplifier to a few hundred millivolts, this voltage undershoot is small.

While the dynamic switching effects of charge injection and feed-through can be effectively mitigated by simple circuit design techniques, another effect requires a more thorough examination. By adapting the pass-transistor width, its transconductance is also adapted. Presuming the pass-transistor gate voltage remains constant during the transition, its drain current (provided to the LDO output) therefore abruptly changes by the same factor as the LDO current drive capability.


[Figure 6.18: schematic of the segmented pass-transistor MPASS and cascode transistor MCASC (segments [W/L]1, [W/L]2, ... [W/L]X, enabled by CDL1 ... CDLX via CMOS transmission gates), with simulated waveforms of VCTRL (2.0 V/div), VGATE (200 mV/div) and VCGATE (200 mV/div) at 10 ns/div when switching between CDL1 and CDL6. Simulation results obtained at nominal conditions (VDD = 3.0 V, room temperature).]

Fig. 6.18. Circuit diagram of the pass-transistor with dynamic drive level adaption. By adapting the pass-transistor width, its drain current abruptly changes by the same factor as the LDO current drive capability.

When switching to a higher drive level setting, the pass-transistor drain current suddenly becomes too high, resulting in a dynamic overshoot at the LDO output. Abstractly speaking, this behavior can be compared to the response to a negative load current step. The reverse applies when switching to a lower drive level setting: the pass-transistor drain current suddenly becomes too low, and a dynamic undershoot can be observed at the LDO output. Switching the pass-transistor width in this way corresponds to an inverse load current step. Under applicative conditions, the LDO current drive capability is increased in response to an increasing load current demand of the MCU digital core. By exploiting the information on the system clock frequency, the LDO current drive capability and thus also the pass-transistor drain current are increased, thereby instantaneously providing higher current to the output. At the same time, the load current demand is expected to increase as well (by the same or a similar factor) due to the increased clock frequency. The discrete load adaptive LDO is therefore in principle able to anticipate the increasing load current demand in a feed-forward scheme, presuming the correct timing. Even though this scheme is not further elaborated as part of this work, it offers the potential of perfect control of the LDO output voltage in response to fast load transient variations. In case the timing is not correct, the LDO feedback loop needs to react to the change in pass-transistor drain current. While the dynamic range of the LDO current drive capability is limited to a factor of 32 for the demonstrator system, a “real” load current step under applicative conditions often spans more than three decades. Due to this limited dynamic range, the discrete load adaptive LDO is easily able to keep the output voltage within the specified tolerance window when dynamically switching the LDO drive level.


Cascoded Flipped Voltage Follower

While the pass-transistor width is adapted, its gate voltage must remain constant when dynamically switching the LDO drive level. Any dynamic voltage fluctuation at the gate node is translated by the pass-transistor transconductance into a varying output current, and can thus be directly observed at the LDO output. As revealed in Chap. 5.3.3, the cascoded FVF with resistor can be considered as a current switch from a large-signal perspective. The predefined bias current IDS,CM1 is subdivided as illustrated in Fig. 6.19: one portion, I1, flows through the sensing transistor MCG1, while the other portion, I2, flows through the cascode device MCG2, thereby controlling the pass-transistor MPASS via the voltage drop over the resistor R1.

$$I_{DS,CM1} = I_1 + I_2 \tag{6.16}$$

$$I_2 = I_{DS,CG2} = \frac{V_{SG,PASS}}{R_1} \tag{6.17}$$

$$I_1 = I_{SD,CG1} = I_{DS,CM1} - \frac{V_{SG,PASS}}{R_1} \tag{6.18}$$

For the discrete load adaptive LDO, the bias current ranges between IDS,CM1 = 0.4 µA in the lowest drive level (CDL1) and IDS,CM1 = 12.8 µA in the highest drive level (CDL6), while the resistance R1 ranges between R1 = 4480 kΩ and R1 = 140 kΩ. To keep the pass-transistor gate voltage constant at any point in time, adaption of the bias current IDS,CM1 must be carefully synchronized to that of the resistor R1. Both long settling times and high over- and undershoots in bias current must therefore be avoided. While the dynamic adaption of the current mirror MCM1 and the resistor R1 is most vital to achieve the new current drive capability instantaneously without causing large fluctuations of the pass-transistor gate voltage, the sensing transistor MCG1 and the folding transistor MCG2 are also adapted depending on the LDO current drive level. The following section provides a brief overview of the implementation aspects necessary to keep the dynamic switching effects during drive capability adaption low.

[Figure 6.19: schematic of the cascoded FVF (current mirror MCM1, sensing transistor MCG1, cascode device MCG2, resistor R1, pass-transistor MPASS; nodes VAMP, VBIAS2, VGATE, VCORE; branch currents I1 and I2), with simulated waveforms of VCTRL (2.0 V/div), IBIAS (4 µA/div; traces ISD,CG1, IDS,CG2 and IDS,CM1 = I1 + I2, 32x step) and VGATE (200 mV/div) at 10 ns/div when switching between CDL1 and CDL6.]

Fig. 6.19. Circuit diagram of the cascoded flipped voltage follower (FVF) with the dynamic drive level adaption. To keep the gate node of the pass-transistor constant at any point in time, adaption of the current mirror MCM1 needs to be carefully synchronized to that of the resistor R1.
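The synchronization requirement can be illustrated numerically. The following is a minimal sketch, not the author's design data: it takes the CDL1 values quoted in this section (0.4 µA and 4480 kΩ) and assumes, purely for illustration, that the six drive levels are spaced in binary steps (1, 2, 4, ... 32). Only the end points CDL1 and CDL6 are confirmed by the text. The sketch shows that when the bias current and R1 are scaled by reciprocal factors, the product IDS,CM1 · R1 stays constant; since I2 ≤ IDS,CM1 by Eq. 6.16, this product is an upper bound on the voltage drop I2 · R1 that drives the pass-transistor.

```python
import math

# Values quoted in the text for the lowest drive level (CDL1).
I_BIAS_CDL1 = 0.4e-6   # A, bias current I_DS,CM1 in CDL1
R1_CDL1 = 4480e3       # ohm, resistance R1 in CDL1
N_LEVELS = 6           # CDL1 ... CDL6


def drive_level_table(n_levels=N_LEVELS):
    """Return (level, factor, i_bias, r1, i_bias * r1) for each drive level."""
    rows = []
    for level in range(1, n_levels + 1):
        # ASSUMPTION: binary spacing of the drive levels (hypothetical);
        # the text only confirms the overall factor of 32 between CDL1 and CDL6.
        factor = 2 ** (level - 1)
        i_bias = I_BIAS_CDL1 * factor   # bias current scales up
        r1 = R1_CDL1 / factor           # resistance scales down by the same factor
        rows.append((level, factor, i_bias, r1, i_bias * r1))
    return rows


table = drive_level_table()
for level, factor, i_bias, r1, v_max in table:
    print(f"CDL{level}: x{factor:<2d}  I_DS,CM1 = {i_bias * 1e6:5.1f} uA  "
          f"R1 = {r1 / 1e3:6.0f} kOhm  I_DS,CM1*R1 = {v_max:.3f} V")

# The product is identical in every level when both are scaled synchronously.
assert all(math.isclose(row[4], table[0][4]) for row in table)
```

With these assumptions the CDL6 row reproduces the quoted 12.8 µA and 140 kΩ. If the two quantities were switched at different instants, the product would transiently deviate by the ratio of the old and new factors, which is the gate-voltage disturbance the synchronous adaption is meant to avoid.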

The resistor R1 is implemented as a string resistor ladder, which can be shorted stepwise depending on the selected LDO drive level. The switch transistors are sized such that their on-resistance can be neglected compared to the respective segment resistance of R1. At the same time, the switches cause charge injection and capacitive coupling, which affect the pass-transistor gate node during dynamic adaption of the LDO drive level. To minimize charge sharing effects, the resistors and switches are arranged symmetrically, as illustrated in Fig. 6.20(a). In this way, the parasitic capacitances are evenly charged and discharged, and the pass-transistor gate voltage remains approximately constant.

To adapt the bias current IDS,CM1 depending on the drive level setting, the current mirror MCM1 is implemented as a current-mode DAC with thermometer coding. As its drain-source voltage exhibits only small variations, a simple current mirror is adequate here; no cascoded current mirror structure is required. To enable a fast and precise adaption of the bias current, the individual current mirror segments are controlled by switch transistors at the drain node, as depicted in Fig. 6.20(b). Since the switch transistors are of minimum size, dummy transistors to cancel the charge sharing and feed-through of the control signals are omitted. The main error is instead caused by coupling via the gate-drain capacitance of the current mirror MCM1, which is rather large in order to achieve good matching. To avoid a variation of the current mirror bias voltage due to capacitive coupling, a buffer capacitance is placed at the current mirror bias node. In this way, a fast and precise settling behavior of the bias current IDS,CM1 is achieved, which is in particular much faster than the response time of the cascoded FVF determined by its gain-bandwidth.
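The thermometer-coded segment control described above can be sketched in a few lines. This is an illustrative model only: the segment weights below are hypothetical and chosen so that six levels span 0.4 µA to 12.8 µA in binary steps; the text states the end points and the thermometer coding, not the individual segment sizes.

```python
import math

UNIT_CURRENT = 0.4e-6  # A, bias current in CDL1 as quoted in the text

# ASSUMPTION (hypothetical sizing): segment k adds SEGMENT_WEIGHTS[k] unit
# currents, so the cumulative sums are 1, 2, 4, 8, 16, 32 units.
SEGMENT_WEIGHTS = [1, 1, 2, 4, 8, 16]


def thermometer_code(level, n_levels=len(SEGMENT_WEIGHTS)):
    """Return the enable bits CDL1..CDLn: segments 1..level on, the rest off."""
    if not 1 <= level <= n_levels:
        raise ValueError(f"drive level must be in 1..{n_levels}")
    return [1 if k < level else 0 for k in range(n_levels)]


def bias_current(level):
    """Sum the currents of all enabled mirror segments."""
    code = thermometer_code(level)
    return UNIT_CURRENT * sum(w * b for w, b in zip(SEGMENT_WEIGHTS, code))


for lvl in range(1, 7):
    print(f"CDL{lvl}: code = {thermometer_code(lvl)}  "
          f"I_DS,CM1 = {bias_current(lvl) * 1e6:.1f} uA")

# Thermometer property: raising the level never switches a segment off.
for lvl in range(1, 6):
    lower, upper = thermometer_code(lvl), thermometer_code(lvl + 1)
    assert all(u >= l for l, u in zip(lower, upper))
```

The property checked at the end is the reason thermometer coding suits this use: a level change only adds or only removes segments, so no segment is toggled in the opposite direction and the bias current moves monotonically towards its new value.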

Conceptually similar to the implementation of the current mirror MCM1, the sensing transistor MCG1 is also divided into multiple segments. The individual transistor segments are controlled by switches at the drain node, while the gate nodes of all segments are permanently connected to the folded-cascode amplifier output, which, due to the compensation capacitance C1, presents a low-impedance node (assuming C1 ≫ (CGS,CG1 + CGD,CG1)). While the settling time until the cascoded FVF achieves the new current drive capability is determined by the settling time of the bias current, the voltage at the folding node VFOLD must remain constant. Any voltage fluctuation at the folding node is translated by the common-gate transistor MCG2 into a current, and in turn into a voltage variation at the pass-transistor gate node. Notably, the charge sharing induced by adapting the current mirror MCM1 essentially cancels with that induced by adapting the sensing transistor MCG1. As illustrated in Fig. 6.20(b), the drain node of the sensing transistor MCG1 is charged to VCORE, while that of the current mirror MCM1 is charged to ground potential. By enabling both transistors simultaneously and assuming similar transistor sizes, the charge sharing effects essentially cancel each other out. As a result, the voltage at the folding node remains constant to a first approximation, and only very short voltage spikes can be observed when dynamically switching the LDO drive level. The voltage at the folding node, and consequently also the drain current of the common-gate transistor MCG2, settle within less than 50 ns under any operating conditions.

In accordance with the considerations in Chap. 6.4.1, the common-gate transistor MCG2 is adapted only indirectly. Since this transistor merely provides a folding function, it does not affect the dynamic drive level adaption of the cascoded FVF. However, to adapt not only the current drive capability but also the current sink capability, the biasing of the common-gate transistor MCG2 is adjusted. The bias voltage is generated by two stacked NMOS transistors in diode configuration. The current source MCM3, which defines the bias current provided to the diode stack as introduced in Fig. 6.7, is for this purpose derived from the folded-cascode amplifier and is adapted depending on the drive level setting. Since the biasing node is buffered by a small decoupling capacitance, the bias voltage settles slowly within a few hundred nanoseconds. While the LDO current source capability is adapted instantaneously, the LDO current sink capability is therefore adapted only slowly.

[Figure 6.20: (a) string resistor ladder R1 between VDD and VGATE with segments [R]1, [R/2]2 ... [R/2]X arranged symmetrically and shorted by CMOS transmission gates (CDL1, CDL2, ...); (b) segmented sensing transistor MCG1 (gate at VAMP) and current mirror MCM1 (gate at VBIAS1), each with segments [W/L]1, [W/L]2 ... [W/L]X and drain-side switches CDL1 ... CDLX, buffer capacitance CBUF, folding node VFOLD, and common-gate transistor MCG2 biased by VBIAS2.]

Fig. 6.20. Circuit diagram of (a) the string resistor ladder R1, and (b) the current mirror MCM1. Due to their instantaneous and synchronous adaption, the discrete load adaptive LDO achieves its new current drive capability instantaneously without causing large fluctuations of the pass-transistor gate voltage.

In conclusion, owing to the instantaneous and synchronous adaption of the current mirror MCM1 and the string resistor ladder R1, the discrete load adaptive LDO achieves its new current drive capability instantaneously without causing large fluctuations of the pass-transistor gate voltage. Since all internal node voltages remain constant to a first approximation, no voltage settling time is required; the time until the cascoded FVF achieves its new current drive capability is solely determined by the settling of the bias current. The above design techniques are complemented by careful layout of the cascoded FVF. To avoid large fluctuations of the pass-transistor gate voltage during drive level adaption, the parasitic capacitances, particularly at the pass-transistor gate node and the folding node, are minimized. The matching requirements are subordinated to this: layout techniques that improve device matching (such as interdigitating) but increase the parasitic trace capacitance are therefore applied only sparingly.


Folded-Cascode Amplifier

In contrast to the cascoded flipped voltage follower (FVF), the dynamic drive level adaption for the folded-cascode amplifier is rather uncritical. By adapting its bias current depending on the current drive level (CDL), the gain-bandwidth of the folded-cascode amplifier is adapted. For this purpose, the current mirror MCM2, which defines the bias current as introduced in Fig. 6.7, is implemented as a current-mode DAC with thermometer coding. While it is conceptually similar to the current mirror MCM1 providing the bias current to the cascoded FVF, a fast and precise settling behavior of the current mirror MCM2 is not required, making its implementation less critical.

The gain-bandwidth of the folded-cascode amplifier is determined by the transconductance gm,1 in combination with the compensation capacitance C1 at the amplifier output (also see Chap. 5.2 for further details). To maintain loop stability of the discrete load adaptive LDO also at no load condition, the capacitance is rather large, and the gain-bandwidth is low. The folded-cascode amplifier is therefore insensitive to dynamic switching effects such as charge sharing and feed-through of the control signals. The settling behavior in response to the dynamic drive level adaption is instead dominated by the first non-dominant pole p3 associated with the amplifier current mirror, as illustrated in Fig. 6.21. Although the settling of the bias current is rather slow (a few hundred nanoseconds), it is still much faster than that of the amplifier current mirror. As a result, the drain current of the transistor M6 is adapted with a delay, while that of the transistor M8 comes into effect immediately. The amplifier output VAMP therefore tends to undershoot when increasing the bias current (i.e. switching from a lower drive level to a higher drive level), and vice versa.

[Figure 6.21: schematic of the folded-cascode amplifier (transistors M1 to M14, inputs VREF and VCORE, output VAMP with compensation capacitance C1, cascode bias VCASCP and VCASCN, current mirror MCM2 controlled by CDL and biased by VBIASN, poles p0 and p3), with simulated waveforms of VCTRL (2.0 V/div), IBIAS (200 nA/div; traces IDS,8 sinking and ISD,6 sourcing, 32x step) and VAMP (5 mV/div) at 100 ns/div when switching between CDL1 and CDL6.]

Fig. 6.21. Circuit diagram of the folded-cascode amplifier with dynamic drive level adaption. The settling behavior of the folded-cascode amplifier is dominated by the first non-dominant pole p3 associated with the amplifier current mirror.


Particularly when decreasing the bias current (i.e. switching from a higher drive level to a lower drive level), the settling behavior is determined by the low bias current and becomes rather slow. The settling behavior of the folded-cascode amplifier is directly translated by the cascoded FVF to the LDO output. To improve the settling behavior, the first non-dominant pole p3 needs to be pushed to a higher frequency, significantly above the amplifier gain-bandwidth. This can be achieved either by minimizing the parasitic capacitance, which results in a matching penalty, or by increasing the bias current, which results in a quiescent current penalty.

6.5.2 Experimental Results on Dynamic Drive Level Adaption

The effectiveness of the strategy presented above for dynamic adaption of the LDO drive capability during operation is examined in the following by simulated and experimental results. For this purpose, two experiments are conducted to determine the transient voltage error induced by the drive level adaption, as well as the settling time required until the LDO reaches its new current drive capability. In both cases, the ultra-low-power MCU system is set into sleep mode with the system clock stopped. The transient current profiles are generated by the on-chip programmable load, while the MCU digital core itself draws no load current. The leakage current of 4 µA at 25 °C, which increases to 24 µA at 85 °C, is compensated by an external constant current source so as not to distort the following LDO performance measurements. To allow full control over the discrete load adaptive LDO, the dynamic LDO drive level adaption used during applicative system operation is disabled. The LDO drive level setting is instead controlled externally for these experiments. The LDO transient behavior is measured using the differential sensing scheme introduced in Chap. 6.2.2.

Transient Voltage Error

The transient voltage error in response to a dynamic adaption of the LDO drive level is caused by dynamic settling effects of both the cascoded flipped voltage follower (FVF) and the folded-cascode amplifier. During applicative system operation, the LDO drive level is adapted in anticipation of a changing system clock frequency, and thus a changing load current demand. For the following experiment, however, the load current remains constant in order to distinguish between the transient voltage error induced by the drive level adaption and that induced by the LDO load transient response.

Fig. 6.22(b) shows the simulated and measured LDO behavior when switching between the lowest and the highest current drive level (CDL1 and CDL6, respectively) at no load condition. The LDO quiescent current is in this way instantaneously adapted between Iq = 650 nA (CDL1) and Iq = 17.7 µA (CDL6). In contrast to the above experiments on the LDO transient regulation performance, worst-case conditions for this experiment are at maximum supply voltage and maximum temperature (VDD = 3.6 V, Temp. = +85 °C), i.e. when the pass-transistor transconductance is at its maximum. While the output voltage is ideally expected to remain constant, transient voltage errors of −6 mV and +14 mV can nevertheless be observed, caused by dynamic settling effects when switching the LDO drive level. The settling effects visible at the LDO output are dominated by the folded-cascode amplifier. Due to the first non-dominant pole associated with the amplifier current mirror, the error amplifier output tends to undershoot when increasing the bias current (i.e. switching from a lower drive level to a higher drive level), and vice versa. Particularly when decreasing the bias current (i.e. switching from a higher drive level to a lower drive level), the settling behavior of the folded-cascode amplifier is determined by the low bias current and becomes rather slow. The settling behavior of the folded-cascode amplifier is directly translated by the cascoded FVF to the LDO output. Since the load current is zero for this experiment, the pass-transistor transconductance is at its minimum. For this reason, the adaption of the pass-transistor width, as well as the settling effects at the pass-transistor gate node, have only little impact on the LDO output voltage in this case.

[Figure 6.22: measurement and circuit simulation; (a) VCTRL (2 V/div) switching between CDL1 and CDL6; (b) VOUT (20 mV/div) at ILOAD = 0 µA, annotated +14 mV and −6 mV; (c) VOUT (20 mV/div) at ILOAD = 80 µA, annotated 10 mV and 35 mV; time axis 10 µs/div.]

Fig. 6.22. Simulated and measured LDO transient behavior when (a) switching between the lowest and the highest current drive level (CDL1 and CDL6, respectively) at (b) no load condition, and (c) a load current of 0.08 mA, corresponding to the maximum load current in the lowest drive level. The simulation and measurement results are obtained under worst-case conditions for this experiment (VDD = 3.6 V, Temp. = +85 °C).

[Figure 6.23: measurement and circuit simulation; (a) ILOAD (2 mA/div) stepping between 0.00 mA (CDL1) and 2.56 mA (CDL6); (b) VOUT (20 mV/div), annotated 32 mV and 17 mV; time axis 10 µs/div.]

Fig. 6.23. Simulated and measured LDO transient behavior when switching between the lowest and the highest current drive level (CDL1 and CDL6, respectively), while simultaneously applying a worst-case load current step from 0 mA to 2.56 mA. The simulation and measurement results are obtained under worst-case conditions for this experiment (VDD = 3.6 V, Temp. = +85 °C).

Complementing the above experiment, Fig. 6.22(c) shows the simulated and measured LDO behavior when switching between the lowest and the highest current drive level (CDL1 and CDL6, respectively) at a load current of 0.08 mA, corresponding to the maximum load current in the lowest drive level. In contrast to the preceding experiment, the pass-transistor transconductance is therefore at its maximum. The pass-transistor drain current is thus most sensitive to any settling effects at the gate node, and particularly also to the adaption of the transistor width. When switching to the highest drive level (CDL6), the pass-transistor drain current abruptly changes by the same factor as the LDO current drive capability, resulting in a dynamic overshoot of +10 mV at the LDO output. Abstractly speaking, this behavior can be compared to the response to a negative load current step from 2.56 mA to 0.08 mA in the highest drive level. In response to the dynamic overshoot, the output voltage is corrected by the closed LDO feedback loop. The reverse applies when decreasing the LDO drive level: the pass-transistor drain current suddenly becomes too low, and a dynamic undershoot of −35 mV can be observed at the LDO output. Since a load current of 0.08 mA in the lowest LDO drive level represents worst-case conditions for loop stability, a pronounced oscillation can be observed. This behavior is similar to the response to a positive load current step to 0.08 mA in the lowest drive level (CDL1), as demonstrated in Chap. 6.4.5 (particularly also see Fig. 6.17). The settling behavior of the cascoded FVF is superposed by that of the folded-cascode amplifier. The settling behavior of the folded-cascode amplifier is independent of the load current, as becomes evident when comparing the slow-settling components in both cases, i.e. at no load condition and at a load current of 0.08 mA. In conclusion, the experimental results correspond closely to the simulation results in both cases.
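The comparison with an equivalent load current step can be made concrete with a short calculation. The sketch below is an idealized illustration under stated assumptions: the pass-transistor gate voltage is taken as constant during the switch (so the drain current scales exactly with the width factor), and the output capacitance is taken as the 3 nF integrated capacitance mentioned later in this chapter. The function name and the initial-slope estimate are not from the original work.

```python
import math

C_OUT = 3e-9  # F, ASSUMPTION: integrated output capacitance quoted later in the chapter


def equivalent_load_step(i_load, width_factor):
    """Idealized current mismatch right after the pass-transistor width changes.

    Returns (i_pass_after, i_excess), where i_excess > 0 means the pass-transistor
    delivers too much current (overshoot) and i_excess < 0 too little (undershoot).
    """
    i_pass_after = i_load * width_factor  # gate voltage assumed unchanged
    return i_pass_after, i_pass_after - i_load


I_LOAD = 0.08e-3  # A, load current used in Fig. 6.22(c)

# Switching up (CDL1 -> CDL6): width factor 32.
i_up, excess_up = equivalent_load_step(I_LOAD, 32)
# Switching down (CDL6 -> CDL1): width factor 1/32.
i_down, excess_down = equivalent_load_step(I_LOAD, 1 / 32)

for name, i_pass, excess in [("up", i_up, excess_up), ("down", i_down, excess_down)]:
    slope = excess / C_OUT  # V/s, initial slope before any loop reaction
    print(f"switch {name}: I_pass = {i_pass * 1e3:.4f} mA, "
          f"mismatch = {excess * 1e3:+.4f} mA, "
          f"initial dV/dt = {slope * 1e-6:+.3f} V/us")
```

Under these assumptions, switching up yields a pass-transistor current of 2.56 mA against a load of 0.08 mA, matching the negative load step from 2.56 mA to 0.08 mA named in the text. The initial slope is only an upper bound: in the actual circuit the fast FVF stage starts correcting the gate voltage well before the output has moved that far.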

Settling Time

After dynamically adapting the drive level, the discrete load adaptive LDO achieves its new current drive capability instantaneously, as demonstrated by the experiment shown in Fig. 6.23. While the discrete load adaptive LDO is again switched between the lowest and the highest current drive level (CDL1 and CDL6, respectively), a worst-case load current step from 0 mA to 2.56 mA is applied simultaneously. The time required to achieve the new current drive capability is determined by the cascoded flipped voltage follower (FVF). By adapting both the resistor ladder R1 and the current mirror MCM1 instantaneously and synchronously, the node voltages are basically maintained, and only the bias current and thus the gain-bandwidth are adapted. When switching from the lowest to the highest current drive level (CDL1 and CDL6, respectively), the bias current settles in less than 50 ns, which is in particular faster than the response time of the LDO feedback loop. Due to lower bias currents, the settling time increases when switching into the lowest drive level. However, since the LDO gain-bandwidth is also reduced, the settling time remains clearly shorter than the response time of the LDO feedback loop. When dynamically adapting the LDO drive level, the LDO feedback loop is moreover supported by the adaption of the pass-transistor width. Due to the increasing pass-transistor width, the LDO output current is instantaneously increased by a factor of 32 when switching from the lowest to the highest current drive level (CDL1 and CDL6, respectively), and vice versa. The LDO transient behavior is the superposition of the response to adapting the LDO current drive capability and the response to changing load current requirements. As a result, switching the current drive level and changing the load current requirements cause maximum transient voltage errors of −32 mV and +17 mV, which are well within the maximum LDO tolerance window (here +30/−70 mV). Though the discrete load adaptive LDO achieves its new current drive capability instantaneously, it takes some time until the output voltage settles back to its nominal level. This settling behavior is determined by the (slow) folded-cascode amplifier, as already demonstrated in the preceding section. The experimental results are essentially confirmed by transient circuit simulations, though the settling behavior cannot be precisely predicted without also taking the interconnect parasitics into account.
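The reported transient errors can be checked against the tolerance window in a few lines. The following sketch simply tabulates the figures quoted in this section (all values in millivolts) and computes the remaining margin on each side; the data structure and function names are illustrative, not part of the original work.

```python
# Tolerance window quoted in the text: +30 mV / -70 mV around the nominal level.
TOLERANCE_MV = (-70.0, 30.0)

# (undershoot, overshoot) in mV, as reported for each experiment.
MEASURED_MV = {
    "Fig. 6.22(b): CDL1<->CDL6, no load": (-6.0, 14.0),
    "Fig. 6.22(c): CDL1<->CDL6, 0.08 mA load": (-35.0, 10.0),
    "Fig. 6.23: CDL1<->CDL6 with 0 -> 2.56 mA step": (-32.0, 17.0),
}


def margins(errors_mv, window_mv=TOLERANCE_MV):
    """Return (lower_margin, upper_margin) in mV; negative means a violation."""
    undershoot, overshoot = errors_mv
    lower_limit, upper_limit = window_mv
    return undershoot - lower_limit, upper_limit - overshoot


def within_window(errors_mv, window_mv=TOLERANCE_MV):
    """True if both the undershoot and the overshoot stay inside the window."""
    return all(m >= 0 for m in margins(errors_mv, window_mv))


for name, errors in MEASURED_MV.items():
    low, high = margins(errors)
    print(f"{name}: margin {low:.0f} mV below, {high:.0f} mV above, "
          f"ok = {within_window(errors)}")
```

The tightest margin in this set is the 13 mV left above the +17 mV overshoot of Fig. 6.23, which reflects the asymmetric window: the upper limit of +30 mV is considerably closer to the nominal level than the lower limit of −70 mV.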

Summary

For highest system flexibility, the LDO current drive capability can be instantaneously adapted by a factor of up to 32. By switching between the lowest and the highest current drive level (CDL1 and CDL6, respectively), the above experiments represent worst-case operating conditions for dynamic drive level adaption, at which the maximum transient voltage error is observed. Since the dynamic switching effects such as charge injection and feed-through of the digital control signals are minimized, the LDO output voltage is kept within the specified tolerance window while switching the LDO drive level. The settling behavior is dominated by the (slow) settling of the folded-cascode amplifier and, particularly at high load conditions, also by the adaption of the pass-transistor width. For any other drive level adaption, the change in LDO operating points is smaller, resulting in a smaller voltage error during adaption. Since only the bias current is adapted, while the internal node voltages are basically maintained, the discrete load adaptive LDO achieves its new current drive capability instantaneously after dynamically adapting the drive level. The experimental results thus clearly prove the effectiveness of the dynamic drive level adaption strategy. The discrete load adaptive LDO allows for arbitrary and instantaneous switching between any drive levels, independent of the LDO operating conditions. As revealed in the following section, which focuses on the performance of the discrete load adaptive LDO under applicative conditions, the discrete load adaption scheme is thus completely transparent for the ultra-low-power MCU system operation. No restrictions are imposed on system operation; in particular, it is not required to introduce any wait states.

6.6 LDO Operation under Applicative Conditions

While the discrete load adaptive LDO has so far been verified for DC load conditions only, the MCU digital core presents a pulsed load current to the LDO (also see Chap. 4.3). The digital core operates synchronously to a system clock; it draws current by charging the various gate capacitances whenever they are switched. This results in large current spikes at each clock edge. The following experimental results demonstrate the LDO performance under applicative conditions, when supplying the MCU digital core. In detail, this includes (1) the LDO steady-state operation at various combinations of system clock frequency and LDO current drive level, (2) the LDO behavior in response to rapid load transients as commonly occurring when starting and stopping system operation, and (3) the dynamic LDO drive level adaption in combination with a change of the system clock frequency.

The experimental results are obtained with the ultra-low-power MCU system in active mode, executing in each case a simple code example as specified in the respective sections. In order to guarantee reproducible and well-defined experimental conditions, the MCU system including the LDO is fully controlled externally. For this purpose, both the system clock signal and the control signals determining the LDO current drive level are provided externally. The LDO output voltage waveforms are measured with the help of the differential sensing scheme introduced in Chap. 6.2.2. Each of the following experiments is performed at nominal conditions, i.e. at a supply voltage of VDD = 3.0 V and room temperature.

LDO Steady-State Operation

The MCU digital core creates large current spikes when switching synchronously to the system clock. However, designing the LDO to react to the individual current spikes cannot be afforded with a reasonable quiescent current demand. As demonstrated by the following experimental results, the discrete load adaptive LDO is therefore too slow to react to the current spikes, and instead provides the average load current to the MCU digital core. Each current spike consequently results in a voltage ripple ∆VCORE, which can be approximated by the expression derived in Chap. 4.3:

$$\Delta V_{CORE} \cong -\frac{I_{LOAD}}{f_{CLK} \cdot C_{LOAD}} \tag{6.19}$$

Here, the MCU digital core is, for simplicity, assumed to have the same effective switching capacitance at each clock edge.

Fig. 6.24 shows the measured LDO output waveform when the MCU digital core operates at various combinations of system clock frequency and LDO current drive level (CDL). The MCU digital core performs in each case a simple idle loop (also referred to as jmp-$ loop), which takes two clock cycles. At the rising edge of the first clock pulse, the instruction code is fetched into the CPU, while at the falling edge the instruction address is automatically increased in order to fetch the next instruction. However, since the jmp-$ instruction forces a discontinuity in the program sequence, the next instruction code is ignored. Instead, the CPU calculates the new instruction address at the rising edge of the second clock pulse, which is then used at the falling edge to fetch the next instruction code. Since this code is identical to the one executed at the first clock pulse, the CPU remains in the idle loop forever. This repetitive and constant pattern is also clearly noticeable in the LDO output waveforms.

When performing the simple idle loop at a clock frequency of 1 MHz, as depicted in Fig. 6.24(a), the average current consumption of the MCU digital core is ILOAD = 54 µA. The discrete load adaptive LDO is accordingly operated in the current drive level CDL2 with a quiescent current of 1.2 µA, as shown by the experiment in Fig. 6.24(b). The measured voltage ripple at each clock edge is 10 mV, which is in good agreement with the estimate of 9.3 mV obtained from the above expression. The voltage ripple caused by the digital activity is almost independent of the drive level setting, as revealed by a comparison with the experiment shown in Fig. 6.24(c). In this experiment, the LDO is forced to the highest drive level CDL6 with a quiescent current of 17.7 µA, while the MCU digital core again performs the idle loop at a clock frequency of 1 MHz. In this case, the discrete load adaptive LDO is sufficiently fast to respond to each individual current spike and bring the output voltage back to its nominal value. In contrast, when operating in drive level CDL2, the LDO is not able to bring the output voltage back to its nominal value, but instead provides the average load current to the MCU digital core. Independent of the LDO drive level, the discrete load adaptive LDO is not able to react instantaneously to the fast current spikes, and therefore cannot prevent the voltage ripple. In order to keep the output voltage perfectly constant, the LDO would require a significantly higher gain-bandwidth (determined by the switching speed of the digital logic gates). Since this is not affordable with regard to the quiescent current demand, a voltage ripple is accepted as long as it remains within the specified tolerance window. In this case, the charge needed during one clock cycle is mostly supplied from the integrated (on-chip) capacitance of 3 nF.

[Figure 6.24: oscilloscope plots. (a) VCLK at fCLK = 1 MHz (1 V/div, 500 ns/div); (b) VCORE in CDL2 and (c) VCORE in CDL6 at 1 MHz (5 mV/div, 500 ns/div); (d) VCLK at fCLK = 16 MHz (1 V/div, 31.25 ns/div); (e) VCORE in CDL6 at 16 MHz (5 mV/div, 31.25 ns/div). The annotated ripple is 10 mV in each VCORE plot.]

Fig. 6.24. Measured LDO transient behavior while the MCU digital core executes a simple idle loop under nominal operating conditions (VDD = 3.0 V, Temp. = 25 °C). The measurements are carried out at various combinations of system clock frequency and LDO current drive level.
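The statement that the on-chip capacitance supplies most of the charge per clock edge can be checked with a rough charge-balance estimate. The sketch below is not the thesis expression (Eq. 6.19); it assumes that the average load current is drawn in two equal charge packets per clock cycle (one per clock edge) and that the 3 nF capacitance alone supplies each packet. The function name and the two-edges assumption are illustrative.

```python
# Rough charge-balance estimate of the clock-edge ripple (illustrative only,
# not the expression used in the thesis).
I_LOAD = 54e-6       # average core current at 1 MHz [A], value from the text
F_CLK = 1e6          # system clock frequency [Hz]
C_LOAD = 3e-9        # integrated load capacitance [F]
EDGES_PER_CYCLE = 2  # assumption: equal charge at rising and falling edge


def ripple_per_edge(i_load, f_clk, c_load, edges=EDGES_PER_CYCLE):
    """Voltage drop if the capacitor alone supplies one charge packet."""
    charge_per_edge = i_load / (edges * f_clk)  # Q = I / (edges * f)
    return charge_per_edge / c_load             # dV = Q / C


ripple = ripple_per_edge(I_LOAD, F_CLK, C_LOAD)
print(f"estimated ripple per edge: {ripple * 1e3:.1f} mV")
```

Under these assumptions the estimate is 9.0 mV, which is of the same order as the 9.3 mV quoted above and the 10 mV measured.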

When varying the clock frequency, the rate of current spikes increases proportionally, resulting in an average load current of 840 µA at a clock frequency of 16 MHz, as depicted in Fig. 6.24(d). The discrete load adaptive LDO is accordingly operated in the current drive level CDL6, as shown by the experiment in Fig. 6.24(e). The voltage ripple caused by the digital activity is again very similar to the previous cases, which is in good agreement with the estimate based on the above expression (see Eq. 6.19). Because the time axis is scaled by the same factor of 16 as the clock frequency and the LDO current drive capability, the waveforms shown in sub-figures (b) and (e) of Fig. 6.24 appear very similar. At the higher clock frequency, even in the highest drive level CDL6 the LDO is not sufficiently fast to bring the output voltage back to its nominal value in response to the current spike occurring at each clock edge. Instead, it again provides the average load current to the MCU digital core.


LDO Load Transient Response

The MCU digital core presents rapid load current transients to the discrete load adaptive LDO when starting and stopping system operation, as frequently occurs when exiting and entering sleep mode. In such a case, the load current suddenly jumps from a very low level, determined by the static leakage current of the MCU digital core, to a significantly higher level, determined by the dynamic switching current of the MCU digital core. Fig. 6.25 shows the measured LDO transient behavior when the MCU digital core operates at an alternating clock frequency of 0 MHz and 16 MHz (black curve). Again, a simple idle loop is executed, resulting in an active current consumption of 840 µA under nominal operating conditions (VDD = 3.0 V, Temp. = 25 °C). The LDO is constantly operated in the highest drive level CDL6 with a maximum current drive capability of 2.56 mA and a quiescent current of 17.7 µA. Since the cascoded flipped voltage follower allows the LDO to react instantaneously to load changes, the maximum transient voltage errors of −24 mV and +12 mV (black curve) are kept well within the tolerance window of +30/−70 mV. The output level subsequently settles slowly back to the accurate nominal level determined by the slow folded-cascode amplifier (EA1). However, since an LDO designed for supplying digital circuits only aims to maintain the output voltage within a specified tolerance window, a slow settling at the LDO output can be tolerated for this application.

For comparison, an equivalent DC load current step from 4 µA to 840 µA is applied (gray curve). The average LDO output voltage and the maximum errors are the same in both cases. This experiment clearly demonstrates that the LDO is too slow to react to each single current spike caused by the MCU digital core. Instead, the LDO provides only the average load current to the MCU digital core. The voltage ripple due to the MCU digital core operation is consequently superimposed on the LDO transient response. This in turn proves that the experimental results shown in Chap. 6.4.5 represent worst-case conditions not only for DC load conditions, but also for applicative conditions when supplying the MCU digital core.

[Figure 6.25: oscilloscope plot of VOUT (10 mV/div) and VCLOCK (3 V/div) over time (1 µs/div) with the LDO in CDL6; annotations: 62 ns, 12 mV.]

Fig. 6.25. Measured LDO transient behavior while the MCU digital core operates at an alternating clock frequency of 0 MHz and 16 MHz (black curve). The gray curve shows the LDO behavior in response to an equivalent DC load current step. The average LDO output voltage and the maximum errors are the same in both cases.


LDO Dynamic Drive Level Adaption

For highest system flexibility, the discrete load adaptive LDO is designed to adapt its current drive capability arbitrarily and instantaneously. In the following, the LDO dynamic drive level adaption is verified under applicative operating conditions by executing a simple data analysis algorithm at an alternating clock frequency of 1 MHz and 16 MHz. For this purpose, two experiments are performed: in the first, the dynamic LDO drive level adaption is disabled for demonstration purposes; in the second, it is enabled, so that the current drive level is adapted between CDL2 and CDL6. Fig. 6.26(a) shows the system clock signal for both scenarios.

To begin with, the dynamic LDO drive level adaption is disabled for demonstration purposes. Instead, the LDO is constantly operated in the highest drive level CDL6 with a maximum current drive capability of 2.56 mA and a quiescent current of 17.7 µA. Fig. 6.26(b) shows the measured LDO transient behavior while the MCU digital core operates at an alternating clock frequency of 1 MHz and 16 MHz. Clearly, the LDO is unnecessarily fast when operating at a clock frequency of 1 MHz: the output voltage is able to fully settle back to its nominal value in response to the current spike occurring at each clock edge. When the clock frequency changes, the rate of current spikes increases proportionally to the clock frequency, resulting in a sudden step of the average load current from 72 µA to 1140 µA. Since the LDO needs to react to this load current step and settle to the new steady-state operating point, the increasing load current requirements cause a maximum transient voltage error of −18 mV. In order to exploit the LDO quiescent current benefits at low clock frequencies, the dynamic LDO drive level adaption is activated for the experiment shown in Fig. 6.26(c). The MCU digital core again operates at an alternating clock frequency of 1 MHz and 16 MHz, while the LDO is in the drive level CDL2 with a quiescent current of 1.2 µA and in the drive level CDL6 with a quiescent current of 17.7 µA, respectively. As discussed in Chap. 6.3.2, the information about a clock frequency change is known one clock cycle in advance, corresponding in this case to the rising edge of the 1 MHz clock pulse. Based on the information about the (new) system clock frequency, the LDO current drive level is predicted by using a look-up table approach. The current drive level is updated synchronously at the first rising clock edge of the 16 MHz operation. As evident from Fig. 6.26(d), which shows a zoom of the dynamic drive level adaption, the LDO instantly achieves its new load current drive capability. Switching the current drive level together with the increasing load current requirements causes a maximum transient voltage error of −24 mV. This transient voltage error compares to the −18 mV caused by the increasing load current requirements alone, and is well within the maximum LDO tolerance window of +30/−70 mV.

[Figure 6.26: oscilloscope plots. (a) VCLK (1 V/div, 250 ns/div) switching from fCLK = 1 MHz to fCLK = 16 MHz; (b) VCORE (10 mV/div, 250 ns/div) with CDL6 => CDL6, maximum error 18 mV; (c) VCORE (10 mV/div, 250 ns/div) with CDL2 => CDL6, maximum error 24 mV; (d) zoom of VCORE (10 mV/div, 31.25 ns/div) for CDL2 => CDL6.]

Fig. 6.26. Measured LDO transient behavior while the MCU digital core operates at an alternating clock frequency of 1 MHz and 16 MHz at VDD = 3.0 V. The discrete load adaption scheme is (b) disabled for demonstration purposes, and (c) enabled to exploit the LDO quiescent current benefits at low clock frequencies, with (d) showing a zoom of the dynamic drive level adaption.
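The look-up table approach for predicting the drive level from the clock frequency can be sketched as follows. Only two table entries are taken from the text (1 MHz maps to CDL2, 16 MHz maps to CDL6); the remaining entries assume a binary frequency scaling and are hypothetical, as are the names `FREQ_LIMITS_HZ` and `predict_drive_level`. The actual table of the demonstrator system is not given in this section.

```python
import bisect

# Hypothetical look-up table: highest clock frequency [Hz] served by each
# current drive level. Only 1 MHz -> CDL2 and 16 MHz -> CDL6 come from the
# text; the other entries assume binary scaling and are illustrative.
FREQ_LIMITS_HZ = [0.5e6, 1e6, 2e6, 4e6, 8e6, 16e6]
DRIVE_LEVELS = [1, 2, 3, 4, 5, 6]


def predict_drive_level(f_clk_hz):
    """Return the lowest drive level whose frequency limit covers f_clk_hz."""
    idx = bisect.bisect_left(FREQ_LIMITS_HZ, f_clk_hz)
    if idx == len(FREQ_LIMITS_HZ):
        raise ValueError("clock frequency exceeds the highest table entry")
    return DRIVE_LEVELS[idx]


# The new frequency is known one clock cycle ahead, so the level can be
# looked up before the first edge of the new clock arrives.
for f_new in (1e6, 16e6):
    print(f"{f_new / 1e6:g} MHz -> CDL{predict_drive_level(f_new)}")
```

Because the table is indexed by frequency alone, the prediction needs no knowledge of the code being executed, at the price of using worst-case entries.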

For the above simple data analysis algorithm, the effective switching capacitance of the MCU digital core and in turn also the average LDO load current are rather low. Based on the actual load current, the LDO could (theoretically) be set to the next lower current drive level CDL5 instead of CDL6, thereby halving both its current drive capability and its quiescent current. However, since it is hardly possible to obtain general yet precise information about the effective switching capacitance during operation, the load current prediction for the demonstrator system is based solely on the system clock frequency. The full (theoretical) potential of the discrete load adaption is thus not exploited in this case.
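The observation that CDL5 would suffice can be reproduced from the numbers given in this chapter. The sketch below assumes the usual dynamic-current relation I = C_eff · V_CORE · f_CLK (leakage neglected) and a strictly binary scaling of the drive capability from 0.08 mA (CDL1) to 2.56 mA (CDL6). The function names are illustrative.

```python
V_CORE = 1.52         # nominal core voltage [V]
I_MAX_CDL1 = 0.08e-3  # maximum load current in CDL1 [A]
NUM_LEVELS = 6        # CDL1 .. CDL6, binary scaled


def effective_capacitance(i_load, f_clk, v_core=V_CORE):
    """Effective switching capacitance from I = C_eff * V * f (no leakage)."""
    return i_load / (f_clk * v_core)


def max_current(level):
    """Drive capability of a level, assuming strict binary scaling."""
    return I_MAX_CDL1 * 2 ** (level - 1)


def min_level_for_current(i_load):
    """Lowest drive level whose capability covers the average load current."""
    for level in range(1, NUM_LEVELS + 1):
        if i_load <= max_current(level):
            return level
    raise ValueError("load current exceeds the CDL6 drive capability")


c_eff = effective_capacitance(1140e-6, 16e6)
print(f"C_eff of the data analysis algorithm: {c_eff * 1e12:.1f} pF")
print(f"lowest sufficient level at 16 MHz: CDL{min_level_for_current(1140e-6)}")
```

With 1140 µA at 16 MHz the loop returns CDL5 (1.28 mA capability), whereas a frequency-only look-up selects CDL6.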

Summary

The above experimental results under applicative conditions clearly show that the discrete load adaptive LDO is too slow to react to each single current spike caused by the MCU digital core. Instead, the LDO provides only the average load current to the MCU digital core. The voltage ripple due to the MCU digital core operation is consequently superimposed on the LDO transient response. This in turn proves that the experimental results shown in Chap. 6.4.5 represent worst-case conditions not only for DC load conditions, but also for applicative conditions when supplying the MCU digital core. In conclusion, and as proposed in Chap. 4.3, it is a valid design approach to consider the two effects of LDO operation (transient response) and MCU digital core operation (voltage ripple) as mostly independent of each other. After specifying both the maximum current consumption and the on-chip decoupling capacitance of the MCU digital core at an early design stage, the LDO circuit is designed and verified for DC load currents. Due to the superposition of the two effects, the LDO will also meet the specification when supplying the MCU digital core, with the voltage ripple primarily determined by the on-chip capacitance.

6.7 Summary

An LDO supplying the digital core of an ultra-low-power MCU system must combine a high current efficiency under all load conditions with a small integrated capacitance at the LDO output in order to enable a highly flexible and energy-efficient system operation. At the same time, the LDO output voltage must remain within a certain tolerance window under all operating conditions in order to guarantee a fault-free operation of the MCU digital core. In this context, particularly the rapid load current transients presented to the LDO when starting and stopping system operation add another level of complexity.

Parameters common to all drive levels:

    Technology               0.13 µm
    Active Area              0.016 mm²
    Temperature Range        −40 °C to 85 °C
    Supply Voltage Range     1.9 V to 3.6 V
    Nom. Output Voltage      1.52 V
    Load Capacitance         3 nF
    FOM*                     8 ps

Parameters per drive level:

    Parameter                CDL1          CDL6
    Maximum Load Current     0.08 mA       2.56 mA
    Quiescent Current        0.65 µA       17.70 µA
    DC Offset (sim. 3σ)      19 mV         10 mV
    Line Regulation          0.6 mV/V      0.2 mV/V
    Load Regulation          0.3 mV/mA     1.4 µV/mA
    Load Tran. Undershoot    27 mV         27 mV
    Current Efficiency       99.2 %        99.3 %

*) $\mathrm{FOM} = \dfrac{C_\mathrm{LOAD} \cdot \Delta V_\mathrm{OUT}}{I_\mathrm{LOAD}} \cdot \dfrac{I_\mathrm{q(min)}}{I_\mathrm{LOAD}}$

Fig. 6.27. Key performance parameters of the discrete load adaptive LDO. It combines a fast transient response and an excellent DC accuracy with a small load capacitance while keeping the LDO quiescent current always much lower than the load current.
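The figure-of-merit and the current efficiency listed in Fig. 6.27 can be recomputed from the other table entries. The sketch below assumes that ILOAD in the FOM expression is the maximum CDL6 load current and that the current efficiency is defined as ILOAD / (ILOAD + Iq); both assumptions reproduce the tabulated values, but the thesis definitions are given elsewhere. The function names are illustrative.

```python
C_LOAD = 3e-9        # load capacitance [F]
DV_OUT = 27e-3       # load transient undershoot [V]
I_LOAD_MAX = 2.56e-3 # maximum load current in CDL6 [A]
IQ_MIN = 0.65e-6     # minimum quiescent current (CDL1) [A]


def figure_of_merit(c_load, dv_out, i_load, iq_min):
    """FOM = (C_LOAD * dV_OUT / I_LOAD) * (Iq(min) / I_LOAD), in seconds."""
    return (c_load * dv_out / i_load) * (iq_min / i_load)


def current_efficiency(i_load, i_q):
    """Fraction of the supply current that reaches the load."""
    return i_load / (i_load + i_q)


fom = figure_of_merit(C_LOAD, DV_OUT, I_LOAD_MAX, IQ_MIN)
eff_cdl1 = current_efficiency(0.08e-3, 0.65e-6)
eff_cdl6 = current_efficiency(2.56e-3, 17.7e-6)

print(f"FOM = {fom * 1e12:.1f} ps")
print(f"efficiency CDL1 = {eff_cdl1 * 100:.1f} %, CDL6 = {eff_cdl6 * 100:.1f} %")
```

The result is about 8.0 ps and 99.2 % / 99.3 %, matching the table.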

To reconcile these contradicting LDO design requirements, a discrete load adaptive LDO scheme is presented in this chapter. By exploiting the correlation between system clock frequency and load current demand, this LDO digitally adapts its maximum current drive capability, which can be chosen from six discrete, binary scaled current drive levels. In the lowest level (CDL1), a quiescent current of 650 nA is needed, while in the highest level (CDL6) the quiescent current is 17.7 µA and a load current of up to 2.56 mA can be provided. In this way, the current efficiency remains above 97 % over two decades of load current. To achieve a fast and energy-efficient wake-up, this LDO does not need any external capacitance. It instead relies on the intrinsic capacitance of the MCU digital core, amounting to 3 nF overall. The key performance parameters of the discrete load adaptive LDO are summarized in Fig. 6.27: it combines a fast transient response and an excellent DC accuracy with a small load capacitance while keeping the LDO quiescent current always much lower than the load current. As proven by the simulation and experimental results, the static and dynamic LDO accuracy (related to the voltage gain) as well as the LDO loop stability (related to the phase margin) are maintained and guaranteed for each current drive level. By making use of known system power information, the discrete load adaptive LDO achieves a figure-of-merit (FOM) of 8 ps.

The concept of digital-enhancement techniques is based on synergy effects due to exact system knowledge, evolving between the LDO on the one side and the MCU digital core on the other side. By effectively exploiting this information, the fundamental LDO design trade-offs can be reconciled, thereby opening a new degree of freedom for design and optimization. While the concept of discrete load adaption has been investigated in this work using the any-load stable LDO as an example, the application of digital-enhancement techniques can be considered a wide field of research. Other LDO topologies can benefit from this concept as well, since the same fundamental design trade-offs between high accuracy, fast transient response and loop stability also apply here. Beyond this example, various schemes for digital enhancement can be conceived depending on (1) which system power information is exploited, and (2) how this information is applied to the LDO.

7 Conclusion on System Energy Consumption

To conclusively prove the benefits of the digital-enhancement techniques with regard to system energy consumption, two very similar ultra-low-power MCU systems have been implemented and experimentally verified. The first demonstrator system features a state-of-the-art power management architecture with the digital core supplied by an externally compensated low-dropout voltage regulator (LDO) requiring a large (external) load capacitance. The second demonstrator system, in contrast, is designed according to the holistic energy saving concept elaborated step-by-step in the preceding chapters, combining the energy-efficient power management architecture proposed in Chap. 3 with the discrete load adaptive LDO introduced in Chap. 6. According to the needs of typical ultra-low-power application scenarios, the key target is to enable lowest power consumption during sleep as well as an energy-efficient wake-up by minimizing the LDO load capacitance. At the same time, the power management overhead is expected to be drastically reduced in active mode owing to the discrete load adaption scheme.

This chapter evaluates the benefits and limitations of the holistic energy saving concept with regard to both system energy consumption and system costs for typical ultra-low-power application scenarios. After briefly introducing both demonstrator systems, the system power contributors and the related system trade-offs are evaluated and compared. This includes the power consumption during sleep, the energy required for switching between power modes, as well as the energy and performance scalability in active mode. By experimentally comparing the proposed power management architecture to the state-of-the-art, the benefits of the holistic design approach are identified. The experimental results are based on the work presented in Lueders et al. (2013). Closing the circle of this work, the impact of the discrete load adaptive LDO is finally analyzed for two applications, a smoke detector (Mitchell, 2006) and a glass break detector (Kammel and Venkat, 2007), which represent the two extremes of the broad range of application scenarios from frequent to rare wake-up events.

7.1 Demonstrator System and Operating Modes

The investigations presented in this chapter are based on two very similar ultra-low-power MCU systems designed for small-scale battery-powered and energy-harvesting applications. As introduced in Chap. 2 (see in particular Fig. 2.1), both MCU systems comprise a 16-bit MSP430 CPU, various multifaceted analog and digital peripherals as well as a non-volatile FeRAM memory for fast write capability. To react to environmental changes, an energy-aware power management and clock generation unit dynamically adapts the performance and power consumption of the MCU system. The first MCU system, depicted in Fig. 7.1(a), uses a state-of-the-art power management solution (Zwerg et al., 2011). Here, the MCU digital core is supplied by two separate LDOs for active and sleep mode, respectively. Both of them are compensated by an external load capacitance of 470 nF. For the second MCU system, depicted in Fig. 7.1(b), the MCU digital core is supplied by a single discrete load adaptive LDO as introduced in Chap. 6 (Lueders et al., 2011b). By exploiting the correlation between system clock frequency and system power demand, this LDO digitally adapts its maximum current drive capability. To achieve a fast and energy-efficient wake-up, this LDO does not need any external load capacitance. It instead relies on the intrinsic capacitance of the MCU digital core, amounting to 3 nF overall. Since no additional capacitance is required, neither on-chip nor off-chip, the system costs are minimized. For both MCU systems, the supply voltage VDD ranges from 1.9 V to 3.6 V, determined by the battery voltage over lifetime. The nominal core supply voltage VCORE is 1.52 V, which is chosen as a trade-off between speed/performance requirements on the one side and active power considerations on the other side. Both demonstrator systems are fabricated in a 0.13 µm standard CMOS technology.

[Figure 7.1: block diagrams. (a) Power management unit with bias generator, voltage reference, voltage supervisor, a sleep LDO and an active LDO with a 470 nF external capacitance; (b) power management unit with bias generator, voltage reference, voltage supervisor and a single LDO whose current drivability control unit receives the clock frequency. In both systems the unit is supplied from VDD (1.9 V to 3.6 V), generates VCORE (1.52 V), and is accompanied by a clock generation block with clock dividers.]

Fig. 7.1. Ultra-low-power MCU systems with (a) a state-of-the-art power management unit using two separate LDOs, and (b) an energy-aware power management unit using a single discrete load adaptive LDO.

LDO Voltage Regulator

The LDO in both demonstrator systems is based on the any-load stable LDO topology introduced in Chap. 5. In the one case, it is designed for a large (external) load capacitance of 470 nF, while in the other case it relies solely on a small on-chip load capacitance of 3 nF. While a small LDO load capacitance enables an energy-efficient wake-up from sleep mode with ultra-low-power consumption, it presents several design challenges for stability and transient behavior of the LDO. As demonstrated in Chap. 5.8.1, the required LDO gain-bandwidth and consequently also the quiescent current demand scale inversely with the LDO load capacitance. For the fully-integrated LDO with an on-chip load capacitance of 3 nF, the quiescent current demand is accordingly expected to increase by a factor of about 150 compared to the LDO with an external load capacitance of 470 nF, presuming an otherwise identical performance specification. In this particular example, however, the LDO with an external load capacitance is designed for a maximum load current of 25 mA (instead of 2.5 mA) as well as for an enhanced phase margin. In good agreement with the LDO scaling law introduced in Chap. 5.8.1, the quiescent current is therefore very similar for both LDOs at maximum performance levels. However, while the LDO quiescent current in the state-of-the-art MCU system remains independent of the actual performance requirements, the discrete load adaption scheme enables a dynamic adaption of the quiescent current depending on the actual load current requirements. The state-of-the-art MCU system instead makes use of a dedicated low-power LDO to improve the current efficiency during sleep (Kristjansson, 2006; Baumann et al., 2013). The use of multiple independent voltage regulators, however, requires complex and slow transitions between them. With the discrete load adaptive LDO, a dedicated low-power LDO becomes redundant, which saves chip area, enables faster wake-up times and reduces design complexity.

MCU System Operating Modes

The requirements on the ultra-low-power MCU system are manifold and differ significantly depending on the application scenario. This is also reflected in the power management system, which is highly flexible in order to adapt to the needs of different application scenarios. Fig. 7.2 provides an overview of the MCU operating modes with the respective enabled functions for power management and clock generation, and is discussed in the following.

During active power mode (APM), the ultra-low-power MCU system is fully active. The CPU fetches the code from the non-volatile memory and executes it synchronously to the clock signal provided by the clock generation unit. To support the stringent supply voltage requirements of the MCU digital core, the power management unit operates in a high-performance mode. In this way, highest precision of the voltage reference system and fastest reaction of the voltage supervisors are provided. The LDO drive capability is defined by the current needed by the MCU digital core: it is either constant, depending on the highest processing performance, or dynamically adapted depending on the system clock frequency.


[Figure 7.2: state diagram and table. The active mode (APM) and the sleep modes (LPM-Clock, LPM-Standby, LPM-Off) are connected by software-triggered sleep and interrupt-triggered wake-up transitions. Towards LPM-Off the power consumption during sleep is reduced; towards LPM-Clock the energy and time during wake-up are reduced.]

                     Power Management                        Clock Generation
    Operating Mode   Reference          LDO                  High-Freq. Oscillator   Low-Freq. Oscillator
    APM              high-performance   high-performance     active                  active
    LPM-Clock        high-performance   high-performance     active                  active
    LPM-Standby      low-power          low-power            disabled                active
    LPM-Off          disabled           disabled             disabled                (optional)

Fig. 7.2. Overview of MCU operating modes and the associated functions. To minimize power consumption during sleep, the presented MCU systems offer three fine-graded low-power modes, in which the system is gradually disabled.

To minimize the power consumption during sleep, each demonstrator system offers three fine-graded low-power modes, in which the system is gradually disabled. The "lower" the low-power mode, the lower the power consumption during sleep, which however comes at the cost of increased energy and time required for wake-up. In the most elementary low-power mode, in the following referred to as LPM-Clock, the CPU operation is stopped by gating the system clock. This mode offers a fast reaction time and is therefore preferred when the expected sleep time is short. Since the MCU digital core remains powered in this operating mode, the application data is retained. Both the power management unit and the clock generation unit remain fully activated, ensuring a fast wake-up into active mode within 1 µs (typical). To further reduce the power consumption during sleep, both the power management and the clock generation unit can alternatively be set into a standby state, in the following referred to as LPM-Standby. Due to the relaxed supply voltage requirements of the MCU digital core, either the main LDO is turned off and only the dedicated low-power LDO remains active, or the discrete load adaptive LDO is set to its lowest drive level. The voltage reference system is set into a low-power state with reduced precision, but also significantly reduced quiescent current. The high-frequency oscillator is completely disabled in this low-power mode, and only the low-frequency oscillator remains active to provide a time-base for periodic triggering of wake-up events. In this way, the power consumption during sleep is reduced, while the time required for wake-up necessarily increases to 10 µs (typical). However, due to the increasing leakage current in modern deep submicron CMOS technologies, the system power savings in the LPM-Standby mode become more and more limited, particularly at high temperature. A deep low-power mode, in the following referred to as LPM-Off, is therefore better realized by also gating the power supplied to the MCU digital core. For this purpose, the major portion of the power management unit (particularly including the LDO) is completely disabled, and only a continuous supply voltage supervisor remains active. Since the power is removed from the MCU digital core, the application data is not retained, and thus needs to be stored in the non-volatile FeRAM memory. The wake-up from this low-power mode is significantly slower (1 ms typical) and therefore only preferable for longer sleep times and when a fast reaction time is not needed.

Depending on the requirements for latency and the expected frequency of wake-up, one of the three low-power modes can be chosen, enabling considerable power savings while the MCU system waits for external and/or internal events. The operating modes of the ultra-low-power MCU system and their transitions are controlled by a finite state machine, as illustrated in Fig. 7.2. The transition between the system operating modes, including the particular case of system start-up, follows a sequential scheme. A simple handshake sequence, based on dedicated enable and power-good signals for each circuit block, is chosen as the best trade-off between reliable system operation and fast transition times.
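The choice between the three low-power modes can be illustrated with a small selection routine. The wake-up times (1 µs, 10 µs, 1 ms typical) and the data retention behavior are taken from the text; the data structure, the function `deepest_mode` and the selection policy (pick the deepest mode that still meets the latency budget) are illustrative assumptions and not part of the demonstrator firmware.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LowPowerMode:
    name: str
    wakeup_time_s: float  # typical wake-up time from the text
    retains_data: bool    # whether the core stays powered


# Ordered from the shallowest to the deepest low-power mode.
MODES = [
    LowPowerMode("LPM-Clock", 1e-6, True),
    LowPowerMode("LPM-Standby", 10e-6, True),
    LowPowerMode("LPM-Off", 1e-3, False),
]


def deepest_mode(max_latency_s, need_retention=False):
    """Return the deepest mode meeting the latency budget, or None."""
    candidates = [
        m for m in MODES
        if m.wakeup_time_s <= max_latency_s
        and (m.retains_data or not need_retention)
    ]
    return candidates[-1] if candidates else None


for budget in (5e-6, 50e-6, 2e-3):
    print(f"latency budget {budget:g} s -> {deepest_mode(budget).name}")
```

A real policy would also weigh the expected sleep duration against the wake-up energy, which this sketch leaves out.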

7.2 Experimental Results on System Energy Consumption

After this brief introduction of the MCU demonstrator systems, the system power contributors and the related system trade-offs are analyzed in the following. In detail, this includes the power consumption during sleep, the energy required for switching between power modes, as well as the energy and performance scalability in active mode. Based on this discussion of the system energy consumption and its individual contributors, it is shown how the discrete load adaptive LDO scheme can reconcile the challenges of both system architecture and circuit design.

7.2.1 System Power Consumption during Sleep

Since typical ultra-low-power applications spend the vast majority of time waiting for the next trigger event, the power consumption during sleep contributes significantly to the overall energy consumption. In order to reclaim the benefits of duty cycling, ultra-low-power MCU systems fabricated in deep submicron CMOS technologies must apply aggressive power saving strategies to achieve lowest system power consumption during sleep. For this purpose, the presented MCU systems offer three fine-graded low-power modes as introduced above, in which the system is gradually disabled. In accordance with the theoretical considerations in Chap. 3.3, these low-power modes primarily aim to reduce the leakage current of the MCU digital core as well as the quiescent current of the power management unit. Both of them contribute significantly to the system power consumption during sleep, while the leakage current in the high-voltage domain (including, among others, the I/O ports) can be neglected. The power consumption and power savings in the low-power modes, as depicted in Fig. 7.3, are identical for both demonstrator systems, and are discussed in the following.

In the most elementary low-power mode, in the following referred to as LPM-Clock, the CPU operation is stopped by gating the system clock, while both the power management unit and the clock generation unit remain fully activated. For this experiment, however, the clock signal is provided externally, and the clock generation unit including the on-chip oscillator is deactivated. The system power consumption is accordingly dominated by the quiescent current of the power management unit. While this quiescent current shows only a weak temperature dependency, the leakage current of the MCU digital core increases approximately by a factor of ten at 85 °C compared to room temperature. To minimize the power management overhead, the ultra-low-power MCU system can be set into the LPM-Standby mode, in which the power management unit is in a low-power state with reduced performance, but also significantly reduced quiescent current. As a result, the system power savings in this mode are limited by the leakage current of the MCU digital core, particularly at high temperatures. It contributes 50 % of the power consumption at room temperature, while its share increases significantly to 81 % at 85 °C.

[Figure 7.3: stacked bar chart of the system power (µW, axis from 0 to 300) in the low-power modes LPM-Clock, LPM-Standby and LPM-Off at 25 °C and 85 °C, split into MCU digital core, power management unit and high-voltage domain; annotations: x4.3 and x18.9.]

Fig. 7.3. Measured power consumption of the MCU systems during sleep at VDD = 3.0 V. By disabling the LDO in LPM-Off mode, the power consumption during sleep is reduced by a factor of 4.3 at room temperature and by a factor of 18.9 at high temperature compared to the LPM-Standby mode.

As demonstrated in Chap. 3.3, the most efficient strategy for leakage avoidance is to switch off the MCU digital core, either completely or partially (Gammie et al., 2010). While the conventional approach widely used today for mobile application processors is power gating, an alternative approach is to utilize the integrated LDO. By considering typical ultra-low-power application scenarios, two major system states can be identified, which can be leveraged for system architecture decisions. In active mode, the processor provides high processing performance while leakage current is only of limited concern. In sleep mode, in contrast, the system power consumption needs to be minimized, while commonly only a real-time clock remains active. The fine granularity potentially offered by power gating is therefore only of limited benefit for the majority of typical ultra-low-power application scenarios. Instead, the integrated LDO is completely disabled in the LPM-Off mode, thereby using its pass-transistor as power switch. As a high voltage drop is a desired feature of every LDO design, the pass-transistor can be sized significantly smaller than a power-gating switch, resulting in a highly improved leakage reduction. Moreover, this approach significantly reduces not only the leakage current of the MCU digital core, but also the quiescent current of the power management unit. In conclusion, disabling the LDO in the LPM-Off mode reduces the power consumption during sleep by a factor of 4.3 at room temperature and by a factor of 18.9 at high temperature in comparison to the LPM-Standby mode.

As many ultra-low-power applications need a time-base to periodically trigger a wake-up event, a separate sub-regulated voltage domain for real-time-clock operation including a low-frequency oscillator might remain active, adding a power consumption of only 1.8 µW (at VDD = 3.0 V and room temperature).


7.2.2 Efficient Switching between Power Modes

With the benefit of lower power consumption during sleep, another aspect of system energy opti-mization arises. As introduced theoretically in Chap. 3.4, each wake-up from low-power mode takestime and requires additional energy, which partially offsets the energy saved during sleep. The en-ergy required for wake-up is composed of three major components, which namely are (1) reactivationof the power management and clock generation unit, (2) recharging of the power switched voltagedomains, and (3) recovering the system state in case the MCU digital core is switched-off. Fig. 7.4provides an overview of the energy required for wake-up from the previously discussed low-powermodes, with the three major components listed separately.

The LPM-Clock mode, in which only the system clock signal is gated but the MCU system remains otherwise fully active, offers the lowest possible energy and time required for wake-up. When waking up from LPM-Standby mode, in contrast, both the power management and the clock generation unit need to be reactivated. To minimize the wake-up energy, these circuits are particularly optimized for a fast start-up. While fast-starting analog circuits typically require a large bias current, which increases their overhead during active mode, novel circuit design techniques enable a fast start-up in combination with nanoampere bias currents (Baumann et al., 2013).

In the LPM-Off mode, the LDO is disabled and the power is removed from the MCU digital core. Consequently, not only do the power management and clock generation unit need to be reactivated, but the LDO load capacitance also needs to be recharged at wake-up. To enable an energy-efficient wake-up, the lowest possible capacitance at the LDO output is preferred. For the MCU system with an external LDO load capacitance of 470 nF, the energy required for capacitance recharging clearly dominates the overall wake-up energy. As depicted in Fig. 7.4, the wake-up energy is reduced by a factor of 5.6 for a 3 nF in comparison to a 470 nF LDO capacitance configuration. While a small LDO load capacitance enables an energy-efficient wake-up from a sleep mode with ultra-low power consumption, it is in strong contrast to the voltage regulator needs. As demonstrated in Chap. 5.8.1, the absence of an external capacitance greatly affects the LDO design. The LDO has to be faster and thus needs a higher quiescent current when active.
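The dominance of capacitor recharging can be made plausible with a first-order estimate. The sketch below assumes that the load capacitance is fully discharged during sleep and that the LDO, as a linear regulator, passes the charge Q = C · VCORE directly from the supply, so that the supply delivers E = C · VCORE · VDD. The function name is hypothetical; VDD = 3.0 V matches the measurement conditions and VCORE = 1.52 V is the core supply voltage stated in Sect. 7.2.3.

```python
# Hypothetical helper, not taken from this work: first-order estimate of the
# energy drawn from the supply to recharge the LDO load capacitance.

def recharge_energy(c_load_f: float, v_core: float, v_dd: float) -> float:
    """Energy in Ws drawn from v_dd to charge c_load_f from 0 V to v_core.

    Assumes a linear regulator (charge C * V_CORE flows straight from the
    supply) and a fully discharged capacitance at the start of wake-up.
    """
    return c_load_f * v_core * v_dd


V_DD = 3.0     # supply voltage in V (measurement condition)
V_CORE = 1.52  # core supply voltage in V (stated in Sect. 7.2.3)

e_ext = recharge_energy(470e-9, V_CORE, V_DD)  # external 470 nF capacitor
e_int = recharge_energy(3e-9, V_CORE, V_DD)    # on-chip 3 nF capacitance

print(f"470 nF: {e_ext * 1e6:.2f} uWs")
print(f"3 nF:   {e_int * 1e9:.1f} nWs")
```

Under these assumptions the recharging component alone shrinks in proportion to the capacitance. The measured overall reduction is smaller (a factor of 5.6), because system reactivation and state recovery do not scale with the load capacitance.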

[Figure: wake-up energy [µWs] versus low-power mode (LPM-Clock, LPM-Standby, LPM-Off with 470 nF and with 3 nF), broken down into capacitor recharging, state recovery, and system reactivation; annotations: 2.6 µWs, x5.6.]

Fig. 7.4. Measured energy required for wake-up of the MCU systems from the low-power modes at VDD = 3.0 V and Temp. = 25 °C. By minimizing the LDO load capacitance, the energy required for wake-up is reduced by a factor of 5.6.


Since the power is removed from the MCU digital core, the ultra-low-power MCU system behaves similarly to a power-on reset condition when waking up from the LPM-Off mode. To retain critical application data and the system state during sleep, the respective registers are stored in the non-volatile FeRAM memory and recovered at wake-up using a software-based approach. The energy required for storing and recovering this information is rather high in an ultra-low-power MCU system with flash memory. By offering a 1000x faster write capability at a 100x lower energy, the FeRAM memory in contrast enables an energy-efficient data and state retention during sleep (Zwerg et al., 2011; Baumann et al., 2013). Nevertheless, it contributes 18 % of the overall wake-up energy for the MCU system with an external LDO load capacitance of 470 nF, and becomes clearly dominant when reducing the LDO load capacitance to 3 nF.

As identified in Chap. 3.4, there is a fundamental trade-off between saved energy during sleep and additional energy required for wake-up. A break-even time can be defined above which it is beneficial to enter a low-power mode with lower power consumption, but also higher wake-up overhead - e.g. disabling the LDO in the LPM-Off mode instead of setting the MCU system into a standby state in the LPM-Standby mode. This trade-off strongly depends on the cycle time and thus on the application scenario. To experimentally determine the break-even time, Fig. 7.5 shows the measured system power consumption of the MCU system for a periodic sensor data analysis as a function of cycle time at both room temperature and high temperature. In this experiment, the MCU system periodically wakes up to perform a sensor data analysis taking 152 clock cycles at an operating frequency of 8 MHz. When idle, the MCU system is set into a low-power mode, either into the LPM-Standby mode or into the LPM-Off mode with the LDO being disabled. By sweeping the cycle time, i.e. the time between the wake-up events, the break-even point can be determined. For very short cycle times (<1 ms), the energy consumption is dominated by power consumption in active mode. In contrast, it becomes dominated by the power consumption during sleep for extended cycle times (>10 s). For comparison, the dashed line indicates the system energy consumption for an “ideal” low-power mode with zero power consumption during sleep and zero energy required for wake-up. For the MCU system with an external LDO load capacitance of 470 nF, the break-even time is 239.3 ms at room temperature. By using a fully-integrated LDO with a small on-chip load capacitance of only 3 nF, this time is drastically reduced to 37.8 ms. At high temperature, the break-even time shifts to shorter periods due to increased leakage current, amounting to 30.8 ms and 4.7 ms, respectively.

[Figure: system power [µW] versus cycle time [ms] on logarithmic axes for LPM-Standby, LPM-Off with 470 nF, and LPM-Off with 3 nF; panel (a) at 25 °C, annotated x2.7; panel (b) at 85 °C, annotated x4.6.]

Fig. 7.5. Measured system power consumption for a periodic sensor data analysis (taking 152 clock cycles at a clock frequency of 8 MHz) as a function of cycle time at VDD = 3.0 V. The dashed area highlights the system energy savings for a 3 nF in comparison to a 470 nF LDO capacitance configuration at (a) room temperature, and at (b) high temperature.

In conclusion, the fully-integrated LDO with a small on-chip load capacitance provides significant benefits for a wide range of typical ultra-low-power application scenarios. On the one end of the range, applications such as the smoke detector spend most of their time sleeping and thus demand ultra-low power consumption during sleep. As a result of the long cycle time of tcycle = 8 s, it is highly beneficial to disable the LDO during sleep. Since the energy required for wake-up is only of minor importance, the system energy consumption is almost independent of the LDO load capacitance. On the other end of the range, applications such as the glass break detector wake up very frequently. With a cycle time of only tcycle = 2 ms, an energy-efficient wake-up is of vital importance, while the power consumption during sleep is negligible. As a result, removing power from the MCU digital core during sleep is not beneficial in this case. The energy saved during sleep by switching off the MCU digital core is canceled out by the higher energy required for wake-up - independent of the LDO load capacitance. For all other application scenarios with a cycle time in between these two extremes, it becomes beneficial to disable the LDO when its load capacitance is small, instead of keeping the MCU system in the standby state during sleep. The dashed area in Fig. 7.5 emphasizes the savings at different cycle times for a 3 nF in comparison to a 470 nF load capacitance configuration. System energy savings of up to a factor of 2.7 at room temperature and up to a factor of 4.6 at high temperature can be achieved, which directly translate into an extended battery lifetime.
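The break-even reasoning of this section can be sketched with a simple average-power model: per cycle, the system spends a fixed active energy, a mode-dependent wake-up energy, and a mode-dependent sleep power for the remainder of the cycle. Equating the average power of two modes yields the break-even cycle time. The function names and all power and energy values below are hypothetical placeholders chosen for illustration, not measured values from this work; only the active time (152 clock cycles at 8 MHz) and the two example cycle times (2 ms and 8 s) are taken from the text.

```python
# Illustrative sketch only: function names and all numeric inputs marked as
# placeholders are hypothetical, not measured values from this work.

def average_power(t_cycle, e_active, e_wakeup, p_sleep, t_active):
    """Average system power (W) for one wake-up per cycle of length t_cycle (s)."""
    return (e_active + e_wakeup + p_sleep * (t_cycle - t_active)) / t_cycle


def break_even_time(e_wu_deep, e_wu_shallow, p_deep, p_shallow, t_active):
    """Cycle time (s) at which both low-power modes give the same average power.

    Above this cycle time the deeper mode (lower sleep power, higher wake-up
    energy) is more efficient; below it the shallower mode is.
    """
    if p_shallow <= p_deep or e_wu_deep <= e_wu_shallow:
        raise ValueError("no trade-off between the two modes")
    return t_active + (e_wu_deep - e_wu_shallow) / (p_shallow - p_deep)


T_ACTIVE = 152 / 8e6   # 152 clock cycles at 8 MHz, as in the experiment
E_ACTIVE = 50e-9       # placeholder: active energy per analysis, in Ws
P_STANDBY = 10e-6      # placeholder: sleep power in LPM-Standby, in W
P_OFF = 2e-6           # placeholder: sleep power in LPM-Off, in W
E_WU_STANDBY = 0.1e-6  # placeholder: wake-up energy from LPM-Standby, in Ws
E_WU_OFF = 2.1e-6      # placeholder: wake-up energy from LPM-Off, in Ws

t_be = break_even_time(E_WU_OFF, E_WU_STANDBY, P_OFF, P_STANDBY, T_ACTIVE)
print(f"break-even cycle time: {t_be * 1e3:.1f} ms")

for t_cycle in (2e-3, 8.0):  # example cycle times named in the text
    p_standby = average_power(t_cycle, E_ACTIVE, E_WU_STANDBY, P_STANDBY, T_ACTIVE)
    p_off = average_power(t_cycle, E_ACTIVE, E_WU_OFF, P_OFF, T_ACTIVE)
    better = "LPM-Off" if p_off < p_standby else "LPM-Standby"
    print(f"t_cycle = {t_cycle:g} s: {better} has the lower average power")
```

In this model, lowering the wake-up energy of the deeper mode (for example by reducing the load capacitance) shortens the break-even time, and a larger gap in sleep power (for example at high temperature) shortens it as well, in line with the measured trends.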

7.2.3 Energy/Performance Scalability in Active Mode

In active mode, the system power consumption is increased by more than three decades compared to the low-power modes. It seems obvious that faster code execution should help to save energy. In many ultra-low-power applications, however, the speed of processing is gated by external events (e.g. analog settling times). To avoid wasting energy by running idle, the best the system can do is to adapt to the speed of such events. As revealed theoretically in Chap. 3.2, the power management overhead should therefore be low, which becomes particularly important in case of the fully-integrated LDO with only a small on-chip load capacitance.

[Figure: system energy [nWs] versus clock frequency [MHz] (0.5 to 16 MHz) for the MCU system with state-of-the-art power management unit, with discrete load adaptive LDO, and with discrete load adaptive LDO and optimized reference/bias generation; shaded area: MCU digital core (dynamic and static); annotation: -31 %.]

Fig. 7.6. Measured system energy consumption for performing a sensor data analysis (taking 152 clock cycles) as a function of system clock frequency at VDD = 3.0 V and Temp. = 25 °C. Owing to the discrete load adaption scheme, the energy required for executing a certain code is strongly reduced at low system clock frequencies.

Fig. 7.6 shows the measured system energy consumption of both demonstrator systems for performing a sensor data analysis as a function of clock frequency under nominal operating conditions (VDD = 3.0 V, Temp. = 25 °C). For this experiment, the code is executed from the volatile static RAM (SRAM) memory; the system clock is provided externally. For both demonstrator systems, no dynamic voltage frequency scaling techniques are employed. Instead, the core supply voltage VCORE is 1.52 V - chosen as a trade-off between maximum speed/performance requirements on the one side and active power considerations on the other side. When considering the energy consumption of the MCU digital core only, as indicated by the shaded area in Fig. 7.6, the energy per cycle is therefore essentially independent of the clock frequency. It is dominated by the dynamic power component, while the static power component can be neglected for the clock frequency range of interest. This is in contrast to the energy overhead introduced by the power management unit. A state-of-the-art power management unit, as introduced in Chap. 2, must be laid out for worst-case power requirements at maximum performance levels. Its power consumption does not scale with the clock frequency and thus becomes dominant when operating at low clock frequencies, i.e. when being active for a longer time period. It contributes 47 % of the total system power consumption when operating at a clock frequency of 1 MHz. In order to determine the system operating speed solely by application requirements without sacrificing the system energy consumption, the power management unit therefore needs to be optimized for all potential speed requirements. This becomes particularly important in case of a fully-integrated LDO with only a small load capacitance. While a small load capacitance enables an energy-efficient wake-up from a low-power mode with ultra-low power consumption, it also causes several design challenges for stability and transient behavior of an LDO. As demonstrated in Chap. 5.8.1, the required gain-bandwidth and consequently also the quiescent current demand scale inversely with the LDO load capacitance - resulting in a quiescent current demand that is higher by a factor of 150 for the fully-integrated LDO, presuming an otherwise identical performance specification.
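The frequency dependence of the power management overhead can be illustrated with a minimal model that uses only the two statements made above: the core energy per task is independent of the clock frequency, and the power of a state-of-the-art power management unit (PMU) is constant, so its energy per task grows with the execution time, i.e. with 1/f. The model is anchored at the quoted 47 % share at 1 MHz; the function name and the extrapolation to other frequencies are assumptions of this sketch, not measured results.

```python
# Illustrative model, not the measurement procedure of this work.

def pmu_share(f_hz, share_ref=0.47, f_ref_hz=1e6):
    """Fraction of the system energy spent in a PMU with frequency-independent power.

    share_ref is the PMU fraction at f_ref_hz (47 % at 1 MHz in the text).
    The core energy per task is assumed constant over frequency.
    """
    ratio_ref = share_ref / (1.0 - share_ref)  # PMU energy / core energy at f_ref
    ratio = ratio_ref * f_ref_hz / f_hz        # PMU energy scales with run time ~ 1/f
    return ratio / (1.0 + ratio)


for f_mhz in (0.5, 1.0, 2.0, 4.0, 8.0, 16.0):
    print(f"{f_mhz:5.1f} MHz: modeled PMU share {pmu_share(f_mhz * 1e6):.0%}")
```

The model reproduces the qualitative picture of Fig. 7.6: the overhead of a fixed-power PMU dominates at low clock frequencies and fades at high ones, which is why a PMU that adapts to the operating speed pays off mainly for slow, event-gated applications.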

By employing the discrete load adaptive LDO as introduced in Chap. 6, the energy required for executing a certain code can be strongly reduced at low clock frequencies (again see Fig. 7.6). The dashed line shows the estimated energy consumption when the power-optimized reference and bias generation circuitry is used not only during sleep, but also in active mode. In this way, the energy per cycle becomes almost independent of the clock frequency. The contribution of the LDO to the overall system power consumption remains below 3 % for clock frequencies down to 0.5 MHz. The system operating speed can therefore be solely determined by application requirements without sacrificing the system energy consumption. In conclusion, the discrete load adaption scheme is able to provide significant benefits for typical ultra-low-power application scenarios acquiring sensor data and evaluating them by simple signal processing algorithms, such as the smoke detector. For this class of applications, the speed of processing is typically gated by external events (e.g. analog settling times) and the clock frequency is therefore low. In this way, the discrete load adaption scheme enables, for instance, system energy savings of 31 % for the smoke detector operating at a clock frequency of 1 MHz. On the other end of the range, in contrast, computing-intensive applications such as the glass break detector require rather complex algorithms for signal processing. Since the clock frequency is accordingly high (e.g. 8 MHz / 12 MHz in case of the glass break detector), this class of applications essentially does not benefit from the discrete load adaption scheme.

7.3 Summary

By experimentally verifying and comparing the holistic energy saving concept elaborated step-by-step in this work with a state-of-the-art power management architecture, the benefits of the discrete load adaptive LDO with regard to system energy consumption are revealed in this chapter. The experiments thereby particularly focus on analyzing the system power contributors and the related trade-offs for system energy optimization of typical ultra-low-power application scenarios. By disabling the LDO during sleep, the power consumption can be reduced by a factor of 4.3 at room temperature and by a factor of 18.9 at high temperature compared to the basic low-power state. At the same time, however, the energy required for recovering back into normal operating mode increases - resulting in a fundamental trade-off between saved energy during sleep and additional energy required for wake-up. A break-even time can be defined above which it is beneficial to enter a low-power mode with lower power consumption, but also higher wake-up overhead. By drastically reducing the LDO load capacitance, the break-even time is minimized, and system energy savings of up to a factor of 2.7 at room temperature and up to a factor of 4.6 at high temperature can be achieved - translating directly into an extended battery lifetime. At the same time, however, a drastic reduction of the LDO load capacitance causes major challenges for the regulator design - resulting in an increased quiescent current demand. From an MCU system energy consumption perspective, there is hence another fundamental trade-off between minimizing the capacitance to achieve a fast and energy-efficient wake-up from low-power mode, and an increased power management overhead in active mode, which becomes particularly dominant when operating at low clock frequencies. In contrast to the state-of-the-art power management unit, the discrete load adaption scheme is able to adapt to the system operating speed. In this way, the power management unit is optimized for all potential speed requirements - enabling system energy savings of 31 % at 1 MHz even though the LDO load capacitance is drastically reduced.

The experimental results conclusively demonstrate that power management in all system conditions, addressed in the context of the application, is a key requirement for lowest-energy systems. The holistic energy saving concept - involving application requirements, system architecture, and circuit design techniques - thereby clearly outperforms the traditional circuit design based on analog/digital partitioning and leaves much more flexibility for system energy optimization. By applying the digital-enhancement technique, the fundamental design trade-offs for system energy optimization can be reconciled - in this way greatly relaxing the challenges of both system architecture and circuit design in an ultra-low-power MCU system.

8 Summary and Outlook

In this work, a holistic energy saving concept for ultra-low-power microcontroller units (MCU) is introduced - thereby addressing all aspects of the design process, namely (1) process technology, (2) circuit design, (3) system architecture, and (4) application requirements. While energy optimization for each of these disciplines is usually well understood when considered independently, this work pursues a holistic energy saving concept to solve the power management challenges of ultra-low-power MCU systems fabricated in the deep submicron regime. By dissolving the strict boundaries between the individual disciplines, it leaves much more flexibility for system energy optimization - in this way enabling an additional dimension for energy optimization.

Ultra-low-power MCU systems are optimized for lowest energy consumption while maintaining the processing performance needed for the application scenario. They hence require a careful consideration of all power contributors, making power management in all system conditions, addressed in the context of the application, a key requirement. This particularly includes minimal power consumption in both active mode and sleep mode as well as fast and energy-efficient transition between these modes. Due to fundamental design trade-offs, these requirements however cannot be optimized concurrently. Instead, a “sweet spot” for system energy consumption needs to be found, which strongly depends on the application scenario. The trend towards lower chip cost and higher system integration pushes towards deep submicron CMOS technologies. However, as leakage drastically increases for these technologies, the battery lifetime of typical ultra-low-power applications with excessive idle times and accordingly very low average switching activity becomes more and more limited. To enable migration, modern ultra-low-power MCU systems fabricated in deep submicron CMOS technologies thus require efficient power saving techniques to overcome the process imperfections and minimize the power consumption during sleep. However, these power saving techniques partially offset the area and cost benefits due to technology scaling, and impose a significant delay and energy overhead when recovering back into normal operating mode. There hence exists a penalty for the implementation of these techniques, as well as a penalty for activating them. The most efficient strategy for leakage avoidance is to switch off the MCU digital core, either completely or partially. While the conventional approach widely used today for mobile application processors is power gating, the MCU digital core can alternatively be switched off by utilizing the on-chip integrated voltage regulator. This approach significantly reduces not only the leakage of the MCU digital core, but also the power management overhead. To achieve fast and energy-efficient wake-up, lowest capacitance at the core supply is mandatory - which is in conflict with the needs of switching regulators. In conclusion, and as demonstrated by the systematic analysis of energy saving concepts provided in this work, supplying the MCU digital core by a fully-integrated low-dropout voltage regulator (LDO) with only a small load capacitance is highly beneficial for a broad range of typical ultra-low-power application scenarios. In detail, this solution enables ultra-low power consumption during sleep and energy-efficient wake-up while achieving lowest system costs. At the same time, however, a small load capacitance presents major challenges for the design of an LDO.

The design of fully-integrated LDOs has recently received a lot of attention in literature. While conventional LDO topologies are compensated by a large (external) capacitance with the dominant pole being at the LDO output, most of the presented fully-integrated LDO topologies use some form of Miller compensation to establish an internal dominant pole. While these LDO topologies suffer from stability issues at low load conditions and large load capacitance, a minimum on-chip capacitance is required to ensure the power integrity of the MCU digital core despite the fast switching operation. Consequently, these LDO topologies are not best suited to supply CMOS digital circuits. The any-load stable LDO, first presented by Ivanov (2008), can in contrast be easily adapted for capacitor-less application. By addressing the static and the transient regulation separately, decomposing the control tasks into constituent feedback loops, the fundamental trade-offs between high gain, fast transient response, and loop stability under all operating conditions are resolved. The any-load stable LDO can thus be easily adapted to a wide range of application requirements, namely defined by the load capacitance, the maximum load current, the dropout voltage as well as the transient voltage error. By providing a detailed circuit analysis of the any-load stable LDO, the fundamental design trends and trade-offs with respect to these fundamental LDO performance parameters are identified and quantified in this work. These LDO design trade-offs are summarized in a concise figure-of-merit - consistent with the one proposed by Hazucha et al. (2005). To the best knowledge of the author, this is the first analytical derivation and validation of this generally accepted figure-of-merit. Complementing the LDO figure-of-merit, this work examines the impact and benefits of CMOS technology migration for LDO voltage regulators. This particularly includes the presentation of an alternative pass-transistor topology, enabling technology scaling for LDOs by combining a low-voltage thin-oxide pass-transistor with a high-voltage thick-oxide protection device. The pass-transistor gate capacitance and in turn the LDO quiescent current decrease with the minimum channel length to the second power - corresponding to a factor of two with each technology node. This theoretical LDO quiescent current saving is partly offset by the local feedback loop required for driving the cascode pass-transistor.
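The "factor of two with each technology node" follows from the quadratic dependence if one assumes the customary linear shrink of roughly 0.7x per node. This shrink factor s is an assumption of the following sketch and is not stated in this chapter; the node index n is likewise only notation.

```latex
% Assumption: ideal linear shrink by s \approx 0.7 per technology node.
\begin{equation}
  I_Q \propto L_{min}^{2}
  \quad\Longrightarrow\quad
  \frac{I_{Q,\,n+1}}{I_{Q,\,n}}
  = \left(\frac{L_{min,\,n+1}}{L_{min,\,n}}\right)^{2}
  = s^{2} \approx 0.7^{2} = 0.49 \approx \frac{1}{2}
\end{equation}
```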

Clearly evident from the LDO figure-of-merit, the absence of a large external capacitance exacerbates the LDO design trade-offs by demanding a higher control bandwidth, which in turn translates into an increased quiescent current demand. While a small LDO load capacitance enables ultra-low power consumption during sleep and energy-efficient wake-up, the LDO quiescent current necessarily increases when the LDO is active. From an MCU system energy consumption perspective, either the one or the other might be preferred, depending on the wake-up frequency defined by the application. To reconcile these contradicting LDO design requirements, the application of digital-enhancement techniques to the basic LDO feedback loop is proposed in this work. A discrete load adaptive LDO scheme digitally adapts its maximum current drive capability by exploiting the correlation between system clock frequency and load current demand - in this way keeping its current efficiency above 97 % over two decades of load current. To achieve fast and energy-efficient wake-up, this LDO does not need any external capacitance, but instead relies on the intrinsic capacitance of the MCU digital core amounting to 3 nF overall. As proven by the detailed simulation and experimental results, the discrete load adaptive LDO combines a fast transient response with an excellent DC accuracy, both of which are maintained and guaranteed for each current drive level. By making use of known system power information, the discrete load adaptive LDO achieves in conclusion a figure-of-merit of 8 ps, which is improved by a factor of 32 compared to a state-of-the-art LDO without discrete load adaption. While the concept of discrete load adaption has been investigated by way of example for the any-load stable LDO in this work, the application of digital-enhancement techniques can be considered a wide field of research. The concept of digital-enhancement techniques is based on synergy effects due to exact system knowledge, evolving between the LDO voltage regulator on the one side and the MCU digital core on the other side. By efficiently making use of this information, the fundamental LDO design trade-offs can be reconciled, thereby opening a new degree of freedom for its design and optimization. Other LDO topologies can benefit from this concept as well, since the same fundamental design trade-offs between high accuracy, fast transient response and loop stability also apply here. Beyond this example, various schemes for digital-enhancement can be conceived depending on (1) which system power information is exploited, and (2) how this information is applied to the regulator.
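For orientation, the two regulator metrics quoted above can be written out. The figure-of-merit is given here in the form commonly attributed to Hazucha et al. (2005), a response time weighted by the ratio of quiescent to maximum load current; the symbol names are notational assumptions of this sketch and may differ from those used in the earlier chapters. The current efficiency is taken in its usual definition.

```latex
% Symbols (assumed notation): C_L load capacitance, \Delta V_{OUT} transient
% output voltage error, I_Q quiescent current, I_{MAX} maximum load current.
\begin{align}
  t_R &= \frac{C_L \,\Delta V_{OUT}}{I_{MAX}}, &
  \mathrm{FOM} &= t_R \,\frac{I_Q}{I_{MAX}}
               = \frac{C_L \,\Delta V_{OUT}\, I_Q}{I_{MAX}^{2}} \\
  \eta_I &= \frac{I_{LOAD}}{I_{LOAD} + I_Q} > 0.97
  &\Longrightarrow\quad
  \frac{I_Q}{I_{LOAD}} &< \frac{0.03}{0.97} \approx 3.1\,\%
\end{align}
```

Read this way, a current efficiency above 97 % over two decades of load current means that the quiescent current has to track the load current and stay below about 3.1 % of it at every drive level. The stated improvement by a factor of 32 implies a figure-of-merit of about 32 x 8 ps = 256 ps for the reference LDO without discrete load adaption.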

To conclusively prove the benefits of the digital-enhancement techniques with regard to system energy consumption, two very similar ultra-low-power MCU systems have been implemented and experimentally verified in this work. One demonstrator system features a state-of-the-art power management architecture with the digital core supplied by an externally compensated LDO, while the other is designed according to the holistic energy saving concept elaborated step-by-step in this work. Owing to the discrete load adaption scheme, the power management overhead is drastically reduced when operating at low clock frequencies, enabling system energy savings of 31 % at 1 MHz. At the same time, a drastic reduction of the LDO load capacitance enables ultra-low power consumption during sleep and energy-efficient wake-up, resulting in system energy savings of up to a factor of 4.6. The presented power management solution, particularly including the discrete load adaption scheme for LDOs, is highly flexible, thereby maximizing the battery lifetime for a broad range of application scenarios from frequent to rare wake-up events. It has been successfully integrated in a commercially available ultra-low-power MCU system (Borgeson, 2012). The power management solution presented in this work is optimized for typical ultra-low-power sensor applications with rather long sleep times and only very short periods of active processing. The systematic approach introduced in this work can however also provide valuable initial directions for future architectural decisions. Examples are other application scenarios and/or other CMOS technologies characterized by a different ratio of active to sleep power consumption - demanding a different “sweet spot” for system energy optimization.

Selected Publications

1. M. Lüders, B. Eversmann, J. Gerber, K. Huber, R. Kuhn, D. Schmitt-Landsiedel, and R. Brederlow, “A Fully-Integrated System Power Aware LDO for Energy Harvesting Applications,” IEEE Symposium on VLSI Circuits, pp. 244–245, 2011.

2. M. Lüders, B. Eversmann, D. Schmitt-Landsiedel, and R. Brederlow, “Fully-Integrated LDO Voltage Regulator for Digital Circuits,” Advances in Radio Science, vol. 9, pp. 263–267, 2011.

3. M. Lüders, D. Schmitt-Landsiedel, and R. Brederlow, “Enabling Technology Scaling for LDO Voltage Regulators by Using a Cascode Pass-Transistor Topology,” Workshop Analogschaltungen, 2011.

4. M. Lüders, R. Brederlow, and R. Kuhn, “Electronic Device and Method for Discrete Load Adaptive Voltage Regulation,” United States Patent - US 2012/0062197 A1, 2012.

5. M. Lüders, B. Eversmann, J. Gerber, K. Huber, R. Kuhn, M. Zwerg, D. Schmitt-Landsiedel, and R. Brederlow, “Architectural and Circuit Design Techniques for Power Management of Ultra-Low-Power MCU Systems,” IEEE Transactions on VLSI Systems, pp. 2287–2296, 2013.

References

M. Al Shyoukh, H. Lee, and R. Perez, “A Transient-Enhanced Low-Quiescent Current Low-Dropout Regulator With Buffer Impedance Attenuation,” IEEE Journal of Solid-State Circuits, vol. 42, no. 8, pp. 1732–1742, 2007.

P. E. Allen and D. R. Holberg, CMOS Analog Circuit Design. Oxford University Press, 2002.

H. Aminzadeh, R. Lotfi, and K. Mafinezhad, “Area-Efficient Low-Cost Low-Dropout Regulators using MOS Capacitors,” IEEE International Symposium on System-on-Chip, pp. 1–4, 2008.

A. J. Annema, B. Nauta, R. van Langevelde, and H. Tuinhout, “Analog Circuits in Ultra-Deep-Submicron CMOS,” IEEE Journal of Solid-State Circuits, vol. 40, pp. 132–143, 2005.

B. Arbetter and D. Maksimovic, “DC-DC Converter with Fast Transient Response and High Efficiency for Low-Voltage Microprocessor Loads,” Applied Power Electronics Conference and Exposition (APEC), vol. 1, pp. 156–162, 1998.

C. Azzolini, P. Milanesi, and A. Boni, “Accurate Transient Response Model for Automatic Synthesis of High-Speed Operational Amplifiers,” IEEE International Symposium on Circuits and Systems (ISCAS), pp. 5716–5719, 2006.

A. Bargagli-Stoffi, Ultra Low-Voltage, Low-Power Amplifiers in Deep Submicrometer CMOS. Shaker Verlag, 2006.

J. Bastos, M. Steyaert, B. Graindourze, and W. Sansen, “Matching of MOS Transistors with Different Layout Styles,” IEEE International Conference on Microelectronic Test Structures (ICMTS), pp. 17–18, 1996.

A. Baumann, M. Jung, K. Huber, M. Arnold, C. Sichert, S. Schauer, and R. Brederlow, “A MCU Platform with Embedded FRAM Achieving 350nA Current Consumption in Real-Time Clock Mode with Full State Retention and 6.5µs System Wakeup Time,” IEEE Symposium on VLSI Circuits, pp. 244–245, 2013.

J. Borgeson, “Ultra-Low-Power Pioneers: TI Slashes Total MCU Power by 50 Percent with New "Wolverine" MCU Platform,” White Paper, 2012, available: www.ti.com/lit/wp/slay019a/slay019a.pdf. [Accessed June 2012].

R. Brederlow, W. Weber, J. Sauerer, S. Donnay, P. Wambacq, and M. Vertregt, “A Mixed-Signal Design Roadmap,” IEEE Design & Test of Computers, vol. 18, no. 6, pp. 34–46, 2001.

D. Brunelli and L. Benini, “Designing and Managing Sub-Milliwatt Energy Harvesting Nodes: Opportunities and Challenges,” IEEE International Conference on Wireless Communication, Vehicular Technology, Information Theory and Aerospace and Electronic Systems Technology (VITAE), pp. 11–15, 2009.

K. Bult and G. J. G. M. Geelen, “A Fast-Settling CMOS Op Amp for SC Circuits with 90-dB DC Gain,” IEEE Journal of Solid-State Circuits, vol. 25, pp. 1379–1384, 1990.

C. Bunel, “Integrated Passive Devices Technology Breakthrough by IPDIA,” White Paper, 2010.

B. H. Calhoun, J. F. Ryan, S. Khanna, M. Putic, and J. Lach, “Flexible Circuits and Architectures for Ultralow Power,” Proceedings of the IEEE, vol. 98, no. 2, pp. 267–282, 2010.

D. Camacho, P. Gui, and P. Moreira, “An NMOS Low Dropout Voltage Regulator with Switched Floating Capacitor Gate Overdrive,” IEEE Midwest Symposium on Circuits and Systems, pp. 808–811, 2009.

R. G. Carvajal, J. Ramirez Angulo, A. J. Lopez Martin, A. Torralba, J. A. G. Galan, A. Carlosena, and F. M. Chavero, “The Flipped Voltage Follower: A Useful Cell for Low-Voltage Low-Power Circuit Design,” IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 52, no. 7, pp. 1276–1291, 2005.

A. P. Chandrakasan and R. W. Brodersen, “Minimizing Power Consumption in Digital CMOS Circuits,” Proceedings of the IEEE, vol. 83, no. 4, pp. 498–523, 1995.

C. K. Chava and J. Silva Martinez, “A Frequency Compensation Scheme for LDO Voltage Regulators,” IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 51, no. 6, pp. 1041–1050, 2004.

Y. H. Chee, J. M. Rabaey, and A. Niknejad, Ultra Low Power Transmitters for Wireless Sensor Networks, 2006, available: http://www.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-57.html [Accessed October 2013].

J. J. Chen, F. C. Yang, C. M. Kung, B. P. Lai, and Y. S. Hwang, “A Capacitor-Free Fast-Transient-Response LDO with Dual-Loop Controlled Paths,” IEEE Asian Solid-State Circuits Conference (ASSCC), pp. 364–367, 2007.

D. Chinnery and K. Keutzer, Closing the Power Gap between ASIC & Custom: Tools and Techniques for Low Power Design. Springer-Verlag, 2005.

Y. Chiu, “On the Operation of CMOS Active-Cascode Gain Stage,” Journal of Computer and Communications, vol. 1, pp. 18–24, 2013.

J. Doyle, B. Francis, and A. Tannenbaum, Feedback Control Theory. Prentice Hall, 1991.

P. G. Drennan and C. C. McAndrew, “Understanding MOSFET Mismatch for Analog Design,” IEEE Journal of Solid-State Circuits, vol. 38, pp. 450–456, 2003.

M. Eireiner, Power Supply Integrity in Low Power Designs. Shaker Verlag, 2009.

M. Eisele, J. Berthold, R. Thewes, E. Wohlrab, D. Schmitt-Landsiedel, and W. Weber, “Intra-Die Device Parameter Variations and their Impact on Digital CMOS Gates at Low Supply Voltages,” IEEE International Electron Devices Meeting (IEDM), pp. 67–70, 1995.

D. El-Damak, S. Bandyopadhyay, and A. Chandrakasan, “A 93% Efficiency Reconfigurable Switched-Capacitor DC-DC Converter Using On-Chip Ferroelectric Capacitors,” International Solid-State Circuits Conference (ISSCC), pp. 374–375, 2013.

M. El-Nozahi, A. Amer, J. Torres, K. Entesari, and E. Sanchez-Sinencio, “High PSR Low Drop-Out Regulator with Feed-Forward Ripple Cancellation Technique,” IEEE Journal of Solid-State Circuits, vol. 45, no. 3, pp. 565–577, 2010.

D. Ernst, N. S. Kim, S. Das, S. Pant, R. Rao, T. Pham, C. Ziesler, D. Blaauw, T. Austin, K. Flautner, and T. Mudge, “Razor: A Low-Power Pipeline based on Circuit-Level Timing Speculation,” IEEE International Symposium on Microarchitecture, pp. 7–18, 2003.

J. Esteves, J. Pereira, J. Paisana, and M. Santos, “Ultra Low Power Capless LDO with Dynamic Biasing of Derivative Feedback,” Microelectronics Journal, vol. 44, no. 2, pp. 94–102, 2013.

A. Faanes and S. Skogestad, “Feedforward Control under the Presence of Uncertainty,” European Journal of Control, vol. 10, no. 1, pp. 30–46, 2004.

F. Faggin, M. E. Hoff, S. Mazor, and M. Shima, “The History of the 4004,” IEEE Micro, vol. 16, no. 6, pp. 10–20, 1996.

X. Fan, C. Mishra, and E. Sanchez-Sinencio, “Single Miller Capacitor Frequency Compensation Technique for Low-Power Multistage Amplifiers,” IEEE Journal of Solid-State Circuits, vol. 40, no. 3, pp. 584–592, 2005.

G. Gammie, A. Wang, M. Chau, S. Gururajarao, R. Pitts, F. Jumel, S. Engel, P. Royannez, R. Lagerquist, H. Mair, J. Vaccani, G. Baldwin, K. Heragu, R. Mandal, M. Clinton, D. Arden, and U. Ko, “A 45nm 3.5G Baseband-and-Multimedia Application Processor using Adaptive Body-Bias and Ultra-Low-Power Techniques,” International Solid-State Circuits Conference (ISSCC), pp. 258–611, 2008.

G. Gammie, A. Wang, H. Mair, R. Lagerquist, M. Chau, P. Royannez, S. Gururajarao, and U. Ko, “SmartReflex Power and Performance Management Technologies for 90nm, 65nm, and 45nm Mobile Application Processors,” Proceedings of the IEEE, vol. 98, no. 2, pp. 144–159, 2010.

A. Gercekci and A. Krueger, “Trends in Microprocessors,” European Solid-State Circuits Conference (ESSCIRC), p. 233, 1985.

G. G. E. Gielen, “Design Methodologies and Tools for Circuit Design in CMOS Nanometer Technologies,” European Solid-State Circuits Conference (ESSCIRC), pp. 21–32, 2006.

H. Graeb, D. Mueller, and U. Schlichtmann, “Pareto Optimization of Analog Circuits Considering Variability,” European Conference on Circuit Theory and Design (ECCTD), pp. 28–31, 2007.

J. Guo and K. N. Leung, “A 6µW Chip-Area-Efficient Output-Capacitorless LDO in 90nm CMOS Technology,” IEEE Journal of Solid-State Circuits, vol. 45, pp. 1896–1905, 2010.

L. Gutierrez, E. Roa, and H. Hernandez, “A Current-Efficient, Low-Dropout Regulator with Improved Load Regulation,” IEEE Workshop Microelectronics and Electron Devices, pp. 1–4, 2009.

D. Harris, R. Ho, G. Wei, and M. Horowitz, “The Fanout-of-4 Inverter Delay Metric,” unpublished manuscript, 1997.

P. Hazucha, G. Schrom, J. H. Hahn, B. Bloechel, P. Hack, G. Dermer, S. Narendra, D. Gardner, T. Karnik, V. De, and S. Borkar, “A 233MHz, 80-87% Efficient, Integrated, 4-Phase DC-DC Converter in 90nm CMOS,” IEEE Symposium on VLSI Circuits, pp. 256–257, 2004.

P. Hazucha, T. Karnik, B. Bloechel, C. Parsons, D. Finan, and S. Borkar, “Area-Efficient Linear Regulator with Ultra-Fast Load Regulation,” IEEE Journal of Solid-State Circuits, vol. 40, no. 4, pp. 933–940, 2005.

M. Hempstead, G.-Y. Wei, and D. Brooks, “Architecture and Circuit Techniques for Low-Throughput, Energy-Constrained Systems across Technology Generations,” Proceedings of CASES, pp. 368–378, 2006.

S. Henzler, Power Management of Digital Circuits in Deep Sub-Micron CMOS Technologies. Springer-Verlag, 2006.

S. Henzler, T. Nirschl, S. Skiathitis, J. Berthold, J. Fischer, P. Teichmann, F. Bauer, G. Georgakos, and D. Schmitt-Landsiedel, “Sleep Transistor Circuits for Fine-Grained Power Switch-Off with Short Power-Down Times,” International Solid-State Circuits Conference (ISSCC), pp. 302–600, 2005.

S. Henzler, G. Georgakos, J. Berthold, and M. Eireiner, “Activation Technique for Sleep-Transistor Circuits for Reduced Power Supply Noise,” European Solid-State Circuits Conference (ESSCIRC), pp. 102–105, 2006.

S. Henzler, G. Georgakos, M. Eireiner, T. Nirschl, C. Pacha, J. Berthold, and D. Schmitt-Landsiedel, “Dynamic State-Retention Flip-Flop for Fine-Grained Power Gating with Small Design and Power Overhead,” IEEE Journal of Solid-State Circuits, vol. 41, no. 7, pp. 1654–1661, 2006.

V. Ivanov, “Design Methodology and Circuit Techniques of the Any-Load Stable LDOs with Instant Load Regulation and Low Noise,” Analog Circuit Design: High-speed Clock and Data Recovery, High-performance Amplifiers, Power Management, pp. 339–358, 2008.

V. Ivanov and I. Filanovsky, Operational Amplifier Speed and Accuracy Improvement: Analog Circuit Design with Structural Methodology. Kluwer Academic Publishers, 2004.

V. Ivanov, R. Brederlow, and J. Gerber, “An Ultra Low Power Bandgap Operational at Supply as Low as 0.75V,” European Solid-State Circuits Conference (ESSCIRC), pp. 515–518, 2011.

M. Jimenez, A. Torralba, R. G. Carvajal, and J. Ramirez Angulo, “A New Low-Voltage CMOS Unity-Gain Buffer,” IEEE International Symposium on Circuits and Systems (ISCAS), pp. 919–922, 2006.

B. Y. T. Kamath, R. G. Meyer, and P. R. Gray, “Relationship between Frequency Response and Settling Time of Operational Amplifiers,” IEEE Journal of Solid-State Circuits, vol. 9, no. 6, pp. 347–352, 1974.

R. Kammel and K. Venkat, “A Simple Glass Breakage Detector Using the MSP430,” Application Report, 2007, available: www.ti.com/litv/pdf/slaa351. [Accessed June 2011].

H. Kawaguchi, K. Nose, and T. Sakurai, “A Super Cut-Off CMOS (SCCMOS) Scheme for 0.5V Supply Voltage with Picoampere Stand-By Current,” IEEE Journal of Solid-State Circuits, vol. 35, no. 10, pp. 1498–1501, 2000.

N. S. Kim, T. Austin, D. Blaauw, T. Mudge, K. Flautner, J. S. Hu, M. J. Irwin, M. Kandemir, and V. Narayanan, “Leakage Current: Moore’s Law Meets Static Power,” IEEE Computer Society, vol. 36, no. 12, pp. 68–75, 2003.

S. Kim, S. V. Kosonocky, D. R. Knebel, K. Stawiasz, and M. C. Papaefthymiou, “A Multi-Mode Power Gating Structure for Low-Voltage Deep-Submicron CMOS ICs,” IEEE Transactions on Circuits and Systems II, vol. 54, no. 7, pp. 586–590, 2007.

E. Kristjansson, “Low Power ARM7 Design,” White Paper, 2006, available: www.microcontroller.com/ARM7 Low Power Design - White Paper.htm. [Accessed January 2012].

W. Kruiskamp and R. Beumer, “Low Drop-Out Voltage Regulator with Full On-Chip Capacitance for Slot-Based Operation,” European Solid-State Circuits Conference (ESSCIRC), pp. 346–349, 2008.

K. E. Kuijk, “A Precision Reference Voltage Source,” IEEE Journal of Solid-State Circuits, vol. 8, no. 3, pp. 222–226, 1973.

K. C. Kwok and P. K. T. Mok, “Pole-Zero Tracking Frequency Compensation for Low Dropout Regulator,” IEEE International Symposium on Circuits and Systems (ISCAS), pp. 735–738, 2002.

J. Kwong, Y. Ramadass, N. Verma, M. Koesler, K. Huber, H. Moormann, and A. Chandrakasan, “A 65nm Sub-Vt Microcontroller with Integrated SRAM and Switched-Capacitor DC-DC Converter,” International Solid-State Circuits Conference (ISSCC), pp. 318–616, 2008.

D. E. Lackey, P. S. Zuchowski, T. R. Bednar, D. W. Stout, S. W. Gould, and J. M. Cohn, “Managing Power and Performance for System-on-Chip Designs using Voltage Islands,” IEEE International Conference on Computer Aided Design (ICCAD), pp. 195–202, 2002.

K. Lahiri, A. Raghunathan, S. Dey, and D. Panigrahi, “Battery-Driven System Design: A New Frontier in Low Power Design,” IEEE International Conference on VLSI Design (VLSID), pp. 261–267, 2002.

Y. H. Lam and W. H. Ki, “A 0.9V 0.35µm Adaptively Biased CMOS LDO Regulator with Fast Transient Response,” International Solid-State Circuits Conference (ISSCC), pp. 442–626, 2008.

I. Landau, R. Lozano, M. M’Saad, and A. Karimi, Adaptive Control Algorithms, Analysis and Applications. Springer, 2011.

S. K. Lau, P. K. T. Mok, and K. N. Leung, “A Low-Dropout Regulator for SoC with Q-Reduction,” IEEE Journal of Solid-State Circuits, vol. 42, no. 3, pp. 658–664, 2007.

B. S. Lee, “Understanding the Stable Range of Equivalent Series Resistance of an LDO Regulator,” Texas Instruments Application Note, pp. 14–16, 1999.

Y. Lee, Y. Kim, D. Yoon, D. Blaauw, and D. Sylvester, “Circuit and System Design Guidelines for Ultra-Low Power Sensor Nodes,” IEEE Design Automation Conference (DAC), pp. 1037–1042, 2012.

D. Leith and W. Leithead, “Survey of Gain-Scheduling Analysis Design,” International Journal of Control, vol. 73, pp. 1001–1025, 1999.

K. N. Leung and P. K. T. Mok, “Analysis of Multistage Amplifier-Frequency Compensation,” IEEE Transactions on Circuits and Systems I: Fundamental Theory and Applications, vol. 48, no. 9, pp. 1041–1056, 2001.

——, “A Capacitor-Free CMOS Low-Dropout Regulator with Damping-Factor-Control Frequency Compensation,” IEEE Journal of Solid-State Circuits, vol. 38, no. 10, pp. 1691–1702, 2003.

W. Liao, L. He, and K. M. Lepak, “Temperature and Supply Voltage Aware Performance and Power Modeling at Microarchitecture Level,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), vol. 24, no. 7, pp. 1042–1053, 2005.

M. Lueders, B. Eversmann, J. Gerber, K. Huber, R. Kuhn, D. Schmitt-Landsiedel, and R. Brederlow, “A Fully-Integrated System Power Aware LDO for Energy Harvesting Applications,” IEEE Symposium on VLSI Circuits, pp. 244–245, 2011.

M. Lueders, B. Eversmann, D. Schmitt-Landsiedel, and R. Brederlow, “Fully-Integrated LDO Voltage Regulator for Digital Circuits,” Advances in Radio Science, vol. 9, pp. 263–267, 2011.

M. Lueders, R. Brederlow, and R. Kuhn, “Electronic Device and Method for Discrete Load Adaptive Voltage Regulation,” United States Patent - US 2012/0062197 A1, 2012.

M. Lueders, B. Eversmann, J. Gerber, K. Huber, R. Kuhn, M. Zwerg, D. Schmitt-Landsiedel, and R. Brederlow, “Architectural and Circuit Design Techniques for Power Management of Ultra-Low-Power MCU Systems,” IEEE Transactions on VLSI Systems, pp. 2287–2296, 2013.

T. Luftner, J. Berthold, C. Pacha, G. Georgakos, G. Sauzon, O. Homke, J. Beshenar, P. Mahrla, K. Just, P. Hober, S. Henzler, D. Schmitt-Landsiedel, A. Yakovleff, A. Klein, R. Knight, P. Acharya, H. Mabrouki, G. Juhoor, and M. Sauer, “A 90nm CMOS Low-Power GSM/EDGE Multimedia-Enhanced Baseband Processor with 380MHz ARM9 and Mixed-Signal Extensions,” International Solid-State Circuits Conference (ISSCC), pp. 952–961, 2006.

T. Y. Man, K. N. Leung, C. Y. Leung, P. K. T. Mok, and M. Chan, “Development of Single-Transistor-Control LDO Based on Flipped Voltage Follower for SoC,” IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 55, no. 5, pp. 1392–1401, 2008.

D. Markovic, V. Stojanovic, B. Nikolic, M. A. Horowitz, and R. W. Brodersen, “Methods for True Energy-Performance Optimization,” IEEE Journal of Solid-State Circuits, vol. 39, no. 8, pp. 1282–1293, 2004.

C. O. Mathuna, N. Wang, S. Kulkarni, and S. Roy, “Review of Integrated Magnetics for Power Supply on Chip (PwrSoC),” IEEE Transactions on Power Electronics, vol. 27, no. 11, pp. 4799–4816, 2012.

A. Mezhiba and E. G. Friedman, Power Distribution Networks in High Speed Integrated Circuits. Kluwer Academic Publishers, 2004.

R. J. Milliken, J. Silva-Martinez, and E. Sanchez-Sinencio, “Full On-Chip CMOS Low-Dropout Voltage Regulator,” IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 54, no. 9, pp. 1879–1890, 2007.

M. Mitchell, “Implementing a Smoke Detector With the MSP430F2012,” Application Report, 2006, available: www.ti.com/litv/pdf/slaa335. [Accessed June 2011].

G. E. Moore, “Cramming more Components onto Integrated Circuits,” Proceedings of the IEEE, vol. 86, no. 1, pp. 82–85, 1998.

M. Nagata, T. Okumoto, and K. Taki, “A Built-in Technique for Probing Power Supply and Ground Noise Distribution within Large-Scale Digital Integrated Circuits,” IEEE Journal of Solid-State Circuits, vol. 40, no. 4, pp. 813–819, 2005.

M. Onabajo and J. Silva-Martinez, Analog Circuit Design for Process Variation-Resilient Systems-on-a-Chip. Shaker Verlag, 2012.

P. Y. Or and K. N. Leung, “An Output-Capacitorless Low-Dropout Regulator with Direct Voltage-Spike Detection,” IEEE Journal of Solid-State Circuits, vol. 45, no. 2, pp. 458–466, 2010.

M. J. M. Pelgrom, A. C. J. Duinmaijer, and A. P. G. Welbers, “Matching Properties of MOS Transistors,” IEEE Journal of Solid-State Circuits, vol. 24, pp. 1433–1439, 1989.

T. Pering, T. Burd, and R. Brodersen, “The Simulation and Evaluation of Dynamic Voltage Scaling Algorithms,” IEEE International Symposium on Low Power Electronics and Design, pp. 76–81, 1998.

J. M. Rabaey, F. De Bernardinis, A. M. Niknejad, B. Nikolic, and A. Sangiovanni Vincentelli, “Embedding Mixed-Signal Design in Systems-on-Chip,” Proceedings of the IEEE, vol. 94, no. 6, pp. 1070–1088, 2006.

J. Ramirez-Angulo, S. Gupta, I. Padilla, R. G. Carvajal, A. Torralba, M. Jimenez, and F. Munoz, “Comparison of Conventional and New Flipped Voltage Structures with Increased Input/Output Signal Swing and Current Sourcing/Sinking Capabilities,” IEEE Midwest Symposium on Circuits and Systems, pp. 1151–1154, 2005.

B. Razavi, Design of Analog CMOS Integrated Circuits. McGraw-Hill, 2001.

G. A. Rincon-Mora, Analog IC Design with Low-Dropout Regulators. McGraw-Hill, 2009.

G. A. Rincon-Mora and P. E. Allen, “A Low-Voltage, Low Quiescent Current, Low Drop-Out Regulator,” IEEE Journal of Solid-State Circuits, vol. 33, no. 1, pp. 36–44, 1998.

——, “Optimized Frequency-Shaping Circuit Topologies for LDOs,” IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing, vol. 45, no. 6, pp. 703–708, 1998.

K. Roy, S. Mukhopadhyay, and H. Mahmoodi-Meimand, “Leakage Current Mechanisms and Leakage Reduction Techniques in Deep-Submicrometer CMOS Circuits,” Proceedings of the IEEE, vol. 91, pp. 305–327, 2003.

D. Saha and A. Mukherjee, “Pervasive Computing: A Paradigm for the 21st Century,” IEEE Computer Society, vol. 36, no. 3, pp. 25–31, 2003.

D. Seborg, T. F. Edgar, D. Mellichamp, and D. Francis, Process Dynamics and Control. John Wiley & Sons, 2011.

H. Shichman and D. A. Hodges, “Modeling and Simulation of Insulated-Gate Field-Effect Transistor Switching Circuits,” IEEE Journal of Solid-State Circuits, vol. 3, pp. 285–289, 1968.

C. Simpson, “A User’s Guide to Compensating Low-Dropout Regulators,” Wescon Conference Proceedings, pp. 270–275, 1997.

T. Skotnicki, J. A. Hutchby, T. J. King, H. S. P. Wong, and F. Boeuf, “The End of CMOS Scaling: Toward the Introduction of New Materials and Structural Changes to Improve MOSFET Performance,” IEEE Circuits and Devices Magazine, vol. 21, no. 1, pp. 16–26, 2005.

B. S. Song and P. R. Gray, “A Precision Curvature-Compensated CMOS Bandgap Reference,” IEEE Journal of Solid-State Circuits, vol. 18, no. 6, pp. 634–643, 1983.

T. Souvignet, T. Coulot, Y. David, S. Trochut, T. Di Gilio, and B. Allard, “Black Box Small-Signal Model of PMOS LDO Voltage Regulator,” IEEE Industrial Electronics Conference (IECON), pp. 495–500, 2013.

S. R. Sridhara, M. DiRenzo, S. Lingam, Seok-Jun Lee, R. Blazquez, J. Maxey, S. Ghanem, Yu-Hung Lee, R. Abdallah, P. Singh, and M. Goel, “Microwatt Embedded Processor Platform for Medical System-on-Chip Applications,” IEEE Journal of Solid-State Circuits, vol. 46, no. 4, pp. 721–730, 2011.

A. Strba, “Embedded Systems with Limited Power Resources,” White Paper, 2009.

M. Takamiya, M. Mizuno, and K. Nakamura, “An On-Chip 100 GHz-Sampling Rate 8-Channel Sampling Oscilloscope with Embedded Sampling Clock Generator,” International Solid-State Circuits Conference (ISSCC), pp. 182–458, 2002.

Texas-Instruments, MSP430FR573x Mixed-Signal Microcontrollers (Rev. A), 2011, available: http://www.ti.com/lit/ds/symlink/msp430fr5739.pdf [Accessed October 2011].

——, TPS62730 Step Down Converter with Bypass Mode for Ultra Low Power Wireless Applications, 2012, available: www.ti.com/lit/ds/symlink/tps62730.pdf. [Accessed January 2013].

B. K. Thandri and J. Silva-Martinez, “A Robust Feedforward Compensation Scheme for Multistage Operational Transconductance Amplifiers with no Miller Capacitors,” IEEE Journal of Solid-State Circuits, vol. 38, no. 2, pp. 237–243, 2003.

G. Thiele and E. Bayer, “Current-Mode LDO with Active Dropout Optimization,” IEEE Power Electronics Specialists Conference (PESC), pp. 1203–1208, 2005.

Y. Tsividis, Operation and Modeling of the MOS Transistor. Oxford University Press, 2004.

T. van Breussegem and M. Steyaert, CMOS Integrated Capacitive DC-DC Converters. Springer-Verlag, 2013.

R. Vullers, R. Schaijk, H. Visser, J. Penders, and C. Hoof, “Energy Harvesting for Autonomous Wireless Sensor Networks,” IEEE Solid-State Circuits Magazine, vol. 2, no. 2, pp. 29–38, 2010.

A. Wang, B. H. Calhoun, and A. P. Chandrakasan, Sub-Threshold Design for Ultra Low-Power Systems. Springer-Verlag, 2006.

K. Wang and M. Marek-Sadowska, “On-Chip Power-Supply Network Optimization using Multigrid-Based Technique,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 24, no. 3, pp. 407–417, 2005.

R. J. Widlar, “New Developments in IC Voltage Regulators,” IEEE Journal of Solid-State Circuits, vol. 6, pp. 2–7, 1971.

T. Y. Wu, J. S. Hu, and J. A. Abraham, “Robust Power Gating Reactivation by Dynamic Wakeup Sequence Throttling,” IEEE Asia and South Pacific Design Automation Conference (ASP-DAC), pp. 615–620, 2011.

H. C. Yang, H. H. Huang, C. L. Chen, M. H. Huang, and K. H. Chen, “Current Feedback Compensation (CFC) Technique for Adaptively Adjusting the Phase Margin in Capacitor-Free LDO Regulators,” IEEE Midwest Symposium on Circuits and Systems, pp. 5–8, 2008.

B. Zhai, L. Nazhandali, J. Olson, A. Reeves, M. Minuth, R. Helfand, P. Sanjay, D. Blaauw, and T. Austin, “A 2.60pJ/Inst Subthreshold Sensor Processor for Optimal Energy Efficiency,” IEEE Symposium on VLSI Circuits, pp. 154–155, 2006.

C. Zhan and W. H. Ki, “A High-Precision Low-Voltage Low Dropout Regulator for SoC with Adaptive Biasing,” IEEE International Symposium on Circuits and Systems (ISCAS), pp. 2521–2524, 2009.

M. Zwerg, A. Baumann, R. Kuhn, M. Arnold, R. Nerlich, M. Herzog, R. Ledwa, C. Sichert, V. Rzehak, P. Thanigai, and B. O. Eversmann, “An 82µA/MHz Microcontroller with Embedded FeRAM for Energy-Harvesting Applications,” International Solid-State Circuits Conference (ISSCC), pp. 334–336, 2011.

Acknowledgements

Having presented the results, I would also like to acknowledge the people who contributed to this work and made it possible. This book is based on my research at the Institute for Technical Electronics of the Technische Universität München, Germany, in cooperation with Texas Instruments, Germany. This work is therefore an attempt to combine the best of both worlds: the more research-oriented work environment at university and the more development-oriented work environment in industry. I must confess, however, that I still have much to learn. I thus hope my devotion to this thesis and the field at large ultimately wins enough of the reader’s favor to pardon any deficiencies, inconsistencies, and inaccuracies the reader may find.

First and foremost, I want to thank my advisers Doris Schmitt-Landsiedel and Ralf Brederlow. They have taught me, both consciously and unconsciously, the intuition, rigor and creativity required for good analog circuit design. I appreciate all their contributions of time, ideas, and funding to make my research work experience productive and stimulating. The joy and enthusiasm they have for the research were contagious and motivational for me, even during tough times in the pursuit of this research work.

The members of the MSP430 group at Texas Instruments have contributed immensely to my personal and professional time. The group has been a source of friendships as well as good advice and collaboration. I am especially grateful to Björn Eversmann, Johannes Gerber, Rüdiger Kuhn, Korbinian Huber and Michael Zwerg.

It has been a great pleasure to work at the Institute for Technical Electronics of the Technische Universität München, not only because of the talented colleagues but also because of the friendship. In particular, I would like to thank Christoph Friederich, Stephan Henzler, Philip Teichmann, Marcus Weis, Bernhard Wicht, and Martin Wirnshofer.

Finally, I want to thank Anna-Maria Fischer, who kept me personally grounded, always showed interest in my research and accepted my long office hours.
