Managing Data Center Uncertainty Part IV — Flexibility Is the New Capacity: Unlocking 100 GW Without New Generation
The U.S. grid has ~100 GW of hidden capacity that data centers could unlock with tiny, well-timed curtailments. With stronger price signals—geometric demand charges and TOU surcharges—flexibility becomes profitable, enabling rapid AI load growth without new generation.
This article is part of the AIxEnergy Series: Managing Data Center Energy Uncertainty, drawn from the author’s full research paper. The complete version is available directly from the author by request at michael.leifman1@gmail.com.
In Part I of this article series, we discussed how AI is driving unprecedented uncertainty in electricity demand, with projections of U.S. data-center use for 2028 ranging from 325 to 580 TWh. The issue isn't forecasting—it's governance. Without transparency, flexibility, and better incentives, utilities overbuild, costs rise, and emissions lock in.
In Part II of this article series, we reviewed how phantom data centers distort U.S. energy planning. Developers overfile interconnection requests, utilities profit from overbuilding, and regulators approve speculative capacity. Misaligned incentives create costly overbuild and fossil lock-in—requiring governance reform, transparency, and accountability.
Part III of this article series revealed that what looks like scarcity is, in many cases, inefficiency masked by opacity—a failure of synchronization between computation and electricity, compounded by a lack of data on workload types. The takeaway was clear: the cleanest power plant is the unused GPU. Unlocking that efficiency requires not more technology, but more transparency and governance that distinguish between workload types and create appropriate economic incentives for each.
In this installment, we show that the U.S. grid has nearly 100 GW of hidden capacity that could be unlocked through very small, well-timed data center curtailments—capacity that is technically feasible today but inaccessible under current economic conditions. The conclusion is that with the right incentives, data centers can deliver this flexibility at a massive scale, enabling rapid AI-driven load growth without building new electricity generation capacity and transforming a looming grid constraint into a structural advantage.
The 100 GW estimate cited here comes from recent Duke University research indicating that U.S. balancing authorities could integrate nearly 100 GW of new load—equivalent to 10% of national peak demand—if data centers curtailed just 0.5-1% of their annual consumption during the grid's most stressed hours, about 177 hours per year at the 0.5% rate. The average curtailment would last about 2.1 hours per event, with half the load continuing to run even during these brief reductions. This isn't theoretical: it builds on technically feasible flexibility in AI training workloads, which already experience multi-day runtimes due to data-delivery bottlenecks.
Yet this enormous potential remains untapped. Current rate structures make flexibility economically irrational. A data center paying 6-8¢/kWh faces electricity costs representing 15-25% of operating expenses—substantial in absolute terms, but insufficient to drive behavioral change when capital costs dominate decision-making and traditional demand charges run just $10-20/kW-month.
The solution isn't gentle nudges. It's making flexibility obviously profitable through geometric demand charges that create cliff effects—where crossing a threshold costs millions annually—combined with time-of-use surcharges during the grid's most stressed hours. A 150 MW facility investing $98 million in batteries, advanced cooling, and load management could see a 0.9-year payback period and $680+ million in net present value over 15 years. At that return, flexibility stops being a favor to the grid and becomes core business strategy.
This article details the pricing mechanisms that can unlock Duke's 100 GW opportunity, explains why different facility types will pursue different strategies based on their workload characteristics, and shows how pass-through payment structures can cascade price signals through the entire compute value chain.
The 100 GW Opportunity: What Duke Found
Duke University's Nicholas Institute analysis introduced "curtailment-enabled headroom"—a mechanism through which additional load can be absorbed using existing capacity with only modest, short-duration usage reductions. The findings quantify an enormous untapped resource: nearly 100 GW of new large loads could be integrated across the 22 largest U.S. balancing authorities—which together serve 95% of U.S. load—if data centers can curtail as little as 0.5-1% of their annual energy use during peak demand periods.
The numbers are striking. Duke Energy Progress alone could gain 1.3 GW of capacity for data centers if facilities curtailed just 0.5% of their annual consumption during peak system stress. At the 0.5% curtailment rate—about 177 hours per year—curtailment events would average about 2.1 hours. Critically, during 88% of curtailment hours, half the new load would continue running on average. These are partial reductions, not complete shutdowns.
The study assumes constant load additions at 100% utilization for modeling purposes. In the context of current demand growth, AI data centers represent the primary application of this flexibility potential, though the analysis applies to "large flexible loads" generally.
But Duke's analysis comes with important caveats. The researchers didn't consider transmission constraints, which could limit available headroom in specific regions. They based calculations on peak demand levels without factoring in reserve margin capacity, which could increase potential headroom. And most critically, the study quantifies theoretical grid capacity without examining whether the operational and economic barriers to flexibility can be overcome.
The Technical Barriers Are Real But Solvable
While Duke's analysis proves the grid has spare capacity, practical obstacles exist. Data centers already operate at 60-70% GPU utilization due to I/O bottlenecks—but as Part III of this series established, this reveals which workloads are flexible rather than constraining flexibility potential.
State persistence and coordination create challenges. Many workloads cannot simply pause. Database transactions must complete, active sessions require continuity, and model training involves thousands of synchronized GPUs. User-facing services—the most economically valuable workloads—are time-sensitive and difficult to interrupt.
Multi-tenant complexity looms large for colocation data centers serving hundreds of customers. Determining whose workloads get curtailed while maintaining service level agreements—typically 99.99% uptime, meaning just 52.6 minutes of acceptable downtime per year—creates enormous operational complexity.
Cooling and thermal management systems respond slowly and are optimized for steady-state operation. Rapid load changes create thermal transients and hot spots, or force inefficient part-load operation that can stress equipment.
Economic incentives present the deepest barrier. Data centers represent $10-15 million capital investment per megawatt. The business model assumes 60-80%+ utilization. Deliberately underutilizing capacity means accepting lower returns unless compensation mechanisms are very attractive—and current rate structures provide minimal rewards.
Duke's curtailment-enabled headroom analysis applies most readily to facilities with inherently flexible workloads—particularly those running I/O-constrained AI training jobs that are not latency-sensitive.
For such facilities, the 0.5-1% curtailment during peak periods (averaging 2.1 hours per event at the 0.5% rate) could be achieved through temporal load-shifting: deferring batch training jobs to off-peak hours with minimal operational disruption. The computational output completes slightly later, but the business impact is negligible for workloads already experiencing multi-day training times due to I/O bottlenecks.
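To make this concrete, here is a minimal sketch of what day-ahead temporal shifting could look like in scheduling software. It is purely illustrative: the Job structure, the shift_jobs logic, and the flagged stress hours are hypothetical simplifications, not any real scheduler's API.

```python
"""Hypothetical day-ahead load-shifting sketch (illustrative only)."""
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Job:
    name: str
    start_hour: int      # scheduled start, hour-of-day 0-23
    duration_hours: int
    deferrable: bool     # True for batch training, False for serving


def shift_jobs(jobs: list[Job], stress_hours: set[int]) -> list[Job]:
    """Delay each deferrable job until it no longer overlaps a stress hour.

    Latency-sensitive jobs are left untouched. Assumes a feasible
    off-peak window exists within the day.
    """
    shifted = []
    for job in jobs:
        start = job.start_hour
        if job.deferrable:
            # Slide the start forward until the whole run clears the window.
            while any((start + h) % 24 in stress_hours
                      for h in range(job.duration_hours)):
                start += 1
        shifted.append(replace(job, start_hour=start % 24))
    return shifted


# Day-ahead ISO signal flags 17:00-19:00 as stress hours (assumed input).
stress = {17, 18, 19}
jobs = [
    Job("checkpointed-training-batch", start_hour=16, duration_hours=4, deferrable=True),
    Job("inference-serving", start_hour=16, duration_hours=4, deferrable=False),
]
for before, after in zip(jobs, shift_jobs(jobs, stress)):
    print(f"{before.name}: {before.start_hour}:00 -> {after.start_hour}:00")
```

In this toy run the training batch slides from 16:00 to 20:00 and completes four hours later, while the serving workload never moves: exactly the asymmetry described above.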
In contrast, facilities serving real-time user requests face different constraints. For compute-optimized facilities running latency-sensitive workloads—inference, web services, database operations—the same grid flexibility must come from electrical demand management: battery storage that maintains computational continuity while reducing peak electrical draw, since direct computational curtailment would violate service requirements.
A fundamental data gap undermines precise quantification. No reliable public data exists on what proportion of projected AI data center growth represents training versus inference workloads. Without this information, we cannot determine what portion of Duke's 100 GW opportunity is achievable through temporal load-shifting versus requiring alternative mechanisms like battery storage.
Anthropic's forecast of roughly equal training and inference capacity by 2028 for frontier AI suggests perhaps 40-60 GW of Duke's potential could come from training workload flexibility, with the remainder requiring electrical rather than computational flexibility—but this remains speculative absent better disclosure. Accounting for the technical and economic barriers documented above, actual achievable flexibility will be lower than the 100 GW theoretical potential, but without workload composition data and further analysis, we cannot determine whether achievable flexibility falls in the range of 75-90 GW or as low as 10-25 GW.
This uncertainty itself demonstrates why mandatory disclosure is essential for effective policy design.
The Economic Logic: Why Current Pricing Fails
The technical potential for flexibility documented by Duke and MIT raises an obvious question: if flexibility is feasible and valuable, why haven't AI data centers embraced it through existing demand response programs?
The answer lies in opportunity costs. As noted above, data centers represent $10-15 million of capital investment per megawatt, and the business model assumes 60-80%+ utilization. Deliberately underutilizing that capacity means accepting lower returns unless demand response compensation offsets the foregone revenue.
Contractual service level agreements create legal constraints. Cloud providers guarantee 99.9-99.99% availability; at the 99.99% level, that is just 52.6 minutes of acceptable downtime per year. These guarantees are legally binding, with financial penalties for violations.
Workload economics matter. The most economically valuable workloads are user-facing and time-sensitive. Building a business around interruptible compute means serving less profitable market segments.
Stranded capacity risk is real. Committing to demand response requires maintaining spare capacity year-round with ongoing costs but no revenue during normal operations.
Multi-tenancy coordination is especially challenging. Colocation facilities face daunting obstacles deciding whose workloads get curtailed while maintaining diverse SLAs. Hyperscalers controlling their full stack have clear advantages.
Most fundamentally, inadequate price signals render flexibility irrational. Data centers in 2024-2025 pay 6-8¢/kWh for electricity, with electricity costs representing 15-25% of operating expenses. Yet even substantial absolute costs do not drive flexibility investments when rate structures lack appropriate price signals. Current electricity costs and traditional demand charges of $10-20/kW-month are too modest to change behavior where capital costs and revenue generation dominate decision-making.
The Solution: Geometric Demand Charges That Create Cliff Effects
The proposed mechanism implements a two-part pricing structure that makes flexibility investments economically compelling.
Part 1: Geometric Monthly Demand Charges based on the facility's highest 30-minute average demand during each month, with rates increasing geometrically:
- 0-100 MW: $15/kW-month
- 100-125 MW: $26.25/kW-month (1.75× multiplier)
- 125-150 MW: $45.94/kW-month (1.75× multiplier)
- 150+ MW: $80.39/kW-month (1.75× multiplier)
Part 2: Time-of-Use Energy Surcharges of $0.50/kWh (versus baseline $0.09/kWh) during the grid's top 100 stress hours annually, identified through day-ahead ISO price signals.
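For readers who want to audit the arithmetic, the sketch below implements the proposed two-part structure in a few lines of Python. The tier rates and the 100 stress hours come from the proposal above; the function names and the boundary handling (the worked examples later in this article place a 125 MW peak in the $26.25 tier and a 150 MW peak in the top tier) are illustrative assumptions.

```python
"""Sketch of the proposed two-part tariff arithmetic (illustrative only)."""

BASELINE_RATE = 0.09   # $/kWh baseline energy rate
STRESS_RATE = 0.50     # $/kWh during the grid's top 100 stress hours


def monthly_demand_charge(peak_mw: float) -> float:
    """Demand charge ($) for one month, given the highest 30-minute peak.

    All capacity is billed at the highest tier reached, not marginally.
    Boundary handling follows this article's worked examples: a 125 MW
    peak stays in the $26.25 tier; a 150 MW peak reaches the top tier.
    """
    if peak_mw <= 100:
        rate = 15.00       # $/kW-month
    elif peak_mw <= 125:
        rate = 26.25       # 1.75x
    elif peak_mw < 150:
        rate = 45.94       # 1.75x again
    else:
        rate = 80.39       # 1.75x again
    return peak_mw * 1_000 * rate


def annual_tou_surcharge(stress_draw_mw: float, stress_hours: int = 100) -> float:
    """Annual surcharge ($) above baseline for energy drawn in stress hours."""
    kwh = stress_draw_mw * 1_000 * stress_hours
    return kwh * (STRESS_RATE - BASELINE_RATE)


# The cliff effect discussed below: 2 MW across a threshold is expensive.
for peak in (124, 126, 150):
    print(f"{peak} MW peak -> ${12 * monthly_demand_charge(peak) / 1e6:.1f}M/yr in demand charges")
print(f"Full 150 MW draw in stress hours -> ${annual_tou_surcharge(150) / 1e6:.2f}M TOU surcharge")
```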
These mechanisms work together. Demand charges incentivize year-round peak reduction through efficiency and right-sizing. TOU surcharges reward load shifting and battery discharge specifically when the grid needs help most. Both make the same flexibility investments—battery storage, advanced cooling, load management—economically rational.
Consider a 150 MW data center with typical utilization patterns. Under current rate structures, the facility pays roughly $86 million annually in electricity costs: $64.4 million in energy charges at $0.09/kWh for 715,600 MWh consumed yearly (roughly 55% average utilization), plus approximately $21.6 million in standard demand charges at $12/kW-month.
Under the proposed structure without flexibility measures, costs rise dramatically. Monthly peaks at 150 MW trigger the highest geometric tier at $80.39/kW-month, generating $144.7 million in annual demand charges alone. Operating at full 150 MW capacity during the 100 grid-stressed hours adds $6.15 million in TOU surcharges. Total annual electricity costs reach $215.3 million—a 150% increase over current rates.
However, investing $98 million in flexibility infrastructure transforms the economics. The investment breakdown: a 30 MW, 4-hour battery system ($48M at approximately $1,600/kW installed cost), advanced cooling retrofits for 40% of racks ($40M for immersion or advanced liquid cooling systems), and AI-driven load management platforms ($10M for enterprise-scale DCIM systems with real-time optimization).
By keeping monthly peaks at or below 125 MW through battery discharge and load management, demand charges drop to $39.4 million annually. Reducing consumption 20% during the 100 grid-stressed hours through battery discharge and temporal load shifting cuts TOU surcharges to $4.92 million. Total electricity costs: $108.7 million, saving $106.6 million annually compared to the no-flexibility scenario.
Against the $98 million capital investment, this yields a 0.9-year payback period and a 15-year net present value exceeding $680 million at a 7% discount rate. Even at current baseline rates, the avoided costs justify the investment. The geometric structure simply makes flexibility investments obviously rational rather than marginally attractive.
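A short, self-contained sketch reproduces the three scenarios above. It is a back-of-envelope check, not a rate model: the naive 15-year NPV it prints ignores O&M, battery replacement, and degradation costs, which is presumably why the more conservative $680+ million figure cited above sits below the naive annuity value.

```python
"""Back-of-envelope check of the 150 MW facility scenarios (illustrative)."""

ENERGY_MWH = 715_600                 # annual consumption (~55% utilization)
BASELINE, STRESS_RATE = 0.09, 0.50   # $/kWh
STRESS_HOURS = 100

energy_cost = ENERGY_MWH * 1_000 * BASELINE                       # $64.4M

# Scenario 1: current rates, standard $12/kW-month demand charge.
current = energy_cost + 150_000 * 12 * 12                         # ~$86M

# Scenario 2: proposed tariff, no flexibility. A 150 MW peak reaches the
# top $80.39/kW-month tier; full draw continues through stress hours.
demand_noflex = 150_000 * 80.39 * 12                              # $144.7M
tou_noflex = 150 * STRESS_HOURS * 1_000 * (STRESS_RATE - BASELINE)  # $6.15M
proposed_noflex = energy_cost + demand_noflex + tou_noflex        # ~$215.3M

# Scenario 3: $98M flexibility investment caps peaks at 125 MW
# ($26.25/kW-month tier) and cuts stress-hour draw 20% (to 120 MW).
demand_flex = 125_000 * 26.25 * 12                                # $39.4M
tou_flex = 120 * STRESS_HOURS * 1_000 * (STRESS_RATE - BASELINE)    # $4.92M
proposed_flex = energy_cost + demand_flex + tou_flex              # ~$108.7M

savings = proposed_noflex - proposed_flex                         # ~$106.6M
capex = 98e6
payback_years = capex / savings                                   # ~0.9
# Naive 15-year NPV of savings at 7% (omits O&M, battery replacement,
# and degradation, which the $680M+ figure above conservatively nets out).
npv = sum(savings / 1.07**t for t in range(1, 16)) - capex

print(f"current ${current/1e6:.1f}M | no-flex ${proposed_noflex/1e6:.1f}M | "
      f"flex ${proposed_flex/1e6:.1f}M | payback {payback_years:.1f} yr | "
      f"naive NPV ${npv/1e6:.0f}M")
```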
The Power of Cliff Effects
The geometric progression creates powerful threshold effects. A facility peaking at 124 MW pays $26.25/kW-month (about $39.1 million annually), while one crossing to 126 MW pays $45.94/kW-month (about $69.5 million annually)—a roughly $30.4 million annual penalty for 2 MW.
These cliff effects make investing in flexibility to stay below thresholds economically compelling. The mechanism is technology-neutral: facilities choose whether to achieve reductions through temporal load-shifting, battery storage, improved cooling efficiency, or workload optimization based on their operational characteristics.
One critical detail: the geometric demand charge structure charges ALL capacity at the highest tier reached during the month, not marginal pricing. If a facility peaks at 126 MW even once during the month, the entire 126 MW is charged at the $45.94/kW-month rate. This creates even stronger incentives for peak management than marginal pricing would provide.
Of course, different facility types will respond differently based on their workload characteristics. Facilities running I/O-constrained training workloads may emphasize temporal shifting, deferring batch jobs to off-peak hours. The computational output completes slightly later, but the business impact is negligible for workloads already experiencing multi-day training times.
Compute-optimized facilities serving latency-sensitive inference workloads will invest more heavily in battery storage to maintain computational continuity while reducing electrical peaks. Direct computational curtailment would violate service requirements.
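A minimal sketch of that battery strategy, under stated assumptions: a 30 MW / 120 MWh system (mirroring the investment example above) discharges whenever facility load would push grid draw past the 125 MW tier threshold, so compute runs uninterrupted while the metered peak stays capped. The hourly load profile and the greedy, loss-free dispatch are illustrative simplifications.

```python
"""Illustrative peak-shaving dispatch: cap grid draw at a tier threshold."""

CAP_MW = 125.0        # stay in the $26.25/kW-month tier
POWER_MW = 30.0       # battery power rating
ENERGY_MWH = 120.0    # 4-hour battery


def shave(load_mw: list[float]) -> list[float]:
    """Return hourly grid draw after battery dispatch (greedy, lossless)."""
    soc = ENERGY_MWH  # state of charge, start full
    grid = []
    for load in load_mw:
        # Discharge just enough to hold grid draw at the cap.
        discharge = min(max(load - CAP_MW, 0.0), POWER_MW, soc)
        soc -= discharge
        draw = load - discharge
        # Recharge only below the cap, so charging never creates a new peak.
        charge = min(CAP_MW - draw, POWER_MW, ENERGY_MWH - soc) if draw < CAP_MW else 0.0
        soc += charge
        grid.append(draw + charge)
    return grid


# A stylized evening peak: compute load briefly exceeds 125 MW.
load = [110, 118, 140, 148, 150, 146, 130, 112]
print([round(g, 1) for g in shave(load)])  # metered draw never exceeds 125
```

In the stylized run, compute consumes its full profile while the metered peak holds at 125 MW, which is precisely the cliff-avoidance behavior the tariff rewards.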
Multi-tenant colocation facilities may develop curtailable compute markets with pass-through payment structures, where demand charge savings flow to compute customers who accept occasional curtailment. This cascades price signals through the value chain.
Hyperscalers controlling their full stack can implement flexibility more readily than facilities serving diverse external customers, but all face consistent price signals.
Pass-Through Payments: Cascading Price Signals
Pass-through payment structures deserve emphasis. Just as utilities offer demand response payments to customers who curtail electricity usage, data centers can compensate compute customers for accepting curtailment.
When a data center reduces peak demand and avoids geometric penalties, those savings flow through as payments to customers whose workloads were deferred. This cascades price signals through the value chain: utilities compensate data centers through reduced charges, data centers compensate compute customers for flexibility, and the entire system efficiently allocates curtailment capability.
Regulators should encourage transparent pass-through mechanisms, potentially requiring standardized curtailment product offerings that distinguish "guaranteed compute" from "curtailable compute with advance notice" at differentiated prices reflecting grid value.
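One way such a pass-through might be computed is sketched below: the facility's avoided demand charges form a savings pool, a fixed share of which is credited to curtailable-compute customers pro rata to the energy they actually deferred. The 50% pool share, tenant names, and dollar figures are hypothetical.

```python
"""Hypothetical pass-through credit allocation (illustrative only)."""

PASS_THROUGH_SHARE = 0.5  # assumed fraction of avoided charges returned


def allocate_credits(avoided_charges: float,
                     curtailed_mwh: dict[str, float]) -> dict[str, float]:
    """Split the pass-through pool pro rata to each customer's deferred MWh."""
    pool = avoided_charges * PASS_THROUGH_SHARE
    total = sum(curtailed_mwh.values())
    if total == 0:
        return {c: 0.0 for c in curtailed_mwh}
    return {c: pool * mwh / total for c, mwh in curtailed_mwh.items()}


# Example month: roughly the demand-charge difference between a 150 MW peak
# at the top tier and a 125 MW peak, enabled by three tenants deferring
# batch workloads during stress hours.
credits = allocate_credits(
    avoided_charges=8_777_250,
    curtailed_mwh={"tenant-a": 600, "tenant-b": 300, "tenant-c": 100},
)
for tenant, credit in credits.items():
    print(f"{tenant}: ${credit:,.0f}")
```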
Implementation must address operational constraints. Thermal management systems respond slowly. On-site battery storage allows instant response to price signals while gradually ramping compute loads, avoiding thermal stress.
Service level agreements matter. The proposal's 100-hour annual duration with day-ahead notice allows operators to maintain contractual commitments through temporal shifting, geographic load-balancing, or battery discharge rather than violating SLAs.
Multi-tenant coordination remains challenging, but technical assistance grants or automated curtailment platforms can address this barrier. The key is making the economic case overwhelming enough that facilities invest in solving operational challenges.
Academic Validation
Academic research validates the approach. Studies demonstrate data centers can reduce coincident peak by 15-30% through workload shifting with minimal disruption, confirming the technical feasibility of required reductions. Battery storage research shows data center energy storage provides valuable grid services including frequency regulation and voltage support while maintaining reliability.
Duke's analysis quantifying 100 GW of integration potential with 0.5-1% curtailment provides evidence that required flexibility magnitudes are achievable, though primarily for training rather than inference workloads.
The mechanism works alongside other policy tools. Technology incentives help overcome capital cost barriers for storage and advanced cooling. Disclosure requirements ensure utilities understand which facilities can provide flexibility. Performance-based regulation gives utilities incentives to pursue demand-side solutions. Speed-to-market rewards facilities combining flexibility with decarbonization. Together, these transform flexibility from theoretical possibility to economically rational business strategy.
MIT's comprehensive analysis of generative AI's climate impact identifies multiple efficiency opportunities that compound flexibility benefits.
Algorithmic improvements: efficiency gains from new model architectures that solve complex problems faster are doubling every eight or nine months. The same computational result therefore requires progressively less energy over time.

Hardware optimization: "turning down" GPUs so they consume about three-tenths the energy has minimal impact on AI model performance. This isn't theoretical—it's operational practice at facilities prioritizing efficiency.

Training efficiency: about half the electricity used to train an AI model is spent capturing the last 2-3 percentage points of accuracy. Early stopping can save substantial energy with minimal quality impact for many applications.

Temporal load shifting: splitting computing operations so that some run when more electricity comes from renewable sources can significantly reduce a data center's carbon footprint. This aligns naturally with the geometric demand charge structure.

Long-duration storage could be a "game-changer": data centers could draw on stored renewable energy during high-demand periods, converting intermittent renewables into dispatchable capacity while providing grid flexibility.
These efficiency improvements enable flexibility rather than competing with it. A facility that solves I/O bottlenecks achieves higher baseline utilization AND can provide more flexibility through electrical storage because the computational output per watt increases.
What's Next
Part IV has shown that the 100 GW flexibility opportunity Duke identified is real and achievable—if economic incentives align with technical potential. Geometric demand charges and time-of-use surcharges create that alignment, making sub-one-year payback periods possible for comprehensive flexibility investments.
Part V, the final installment of this series, will complete the policy framework. Beyond pricing flexibility, effective governance requires four additional mechanisms: mandatory granular disclosure that reveals workload composition and flexibility potential; performance-based regulation that rewards forecast accuracy over capital expansion; technology incentives that overcome upfront cost barriers; and speed-to-market incentives that accelerate decarbonization while preventing phantom projects.
Together, these five integrated mechanisms transform data center energy uncertainty from crisis to opportunity—preventing overbuild, controlling costs, and unlocking the flexibility that makes rapid AI growth compatible with grid reliability and climate commitments.
References
Norris, T.H., Patiño-Echeverri, D., & Dworkin, M. (2025). Rethinking Load Growth: Assessing the Potential for Integration of Large Flexible Loads in US Power Systems. Nicholas Institute for Environmental Policy Solutions, Duke University.
MIT News (2025). "Responding to the climate impact of generative AI." September 30, 2025.
Lawrence Berkeley National Laboratory (2024). United States Data Center Energy Usage Report. LBNL-2001637.
Turner & Townsend (2024). "Data Center Cost Index 2024." International Construction Market Survey.
Amazon Web Services (2024). "Amazon Compute Service Level Agreement." AWS Legal Documentation.
Liu, Z., Wierman, A., Chen, Y., Razon, B., & Chen, N. (2013). "Data center demand response: Avoiding the coincident peak via workload shifting and local generation." Performance Evaluation, 70(10), 770-791.
Ghatikar, G., Ganti, V., Matson, N., & Piette, M.A. (2012). Demand Response Opportunities and Enabling Technologies for Data Centers: Findings From Field Studies. Lawrence Berkeley National Laboratory. LBNL-5763E.
Michael Leifman (2025). Managing Data Center Energy Uncertainty: A Framework to Prevent Overbuild, Control Costs, and Unlock Grid Flexibility. Full research paper available from the author.