CDU (OCP Project Deschutes) Numbers

OCP CDU (Deschutes) Standard Overview

The provided visual summarizes the key performance metrics of the CDU (Coolant Distribution Unit) that adheres to the OCP (Open Compute Project) ‘Project Deschutes’ specification. This CDU is designed for high-performance computing environments, particularly for massive-scale liquid cooling of AI/ML workloads.


Key Performance Indicators

  • System Availability: The primary target for system availability is 99.999%. This represents an extremely high level of reliability, corresponding to roughly 5 minutes and 15 seconds of downtime per year.
  • Thermal Load Capacity: The CDU is designed to handle a thermal load of up to 2,000 kW, which is among the highest thermal capacities in the industry.
  • Power Usage: The CDU itself consumes 74 kW of power.
  • IT Flow Rate: It supplies coolant to the servers at a rate of 500 GPM (approximately 1,900 LPM).
  • Operating Pressure: The overall system operating pressure is within a range of 0-130 psig (approximately 0-900 kPa).
  • IT Differential Pressure: The pressure difference required on the server side is 80-90 psi (approximately 550-620 kPa).
  • Approach Temperature: The approach temperature, a key indicator of heat-exchange efficiency, is targeted at ≤3 °C. A lower value is better, as it signifies a more effective heat exchanger. (A quick numeric check of these figures follows this list.)
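
As a rough sanity check on these numbers, the short Python sketch below converts the 99.999% availability target into allowable downtime per year and estimates the coolant temperature rise implied by a 2,000 kW load at 500 GPM. It assumes a water-like coolant (density ≈ 1 kg/L, specific heat ≈ 4.18 kJ/kg·K); a glycol mixture would shift the result somewhat. This is an illustrative calculation, not part of the Deschutes specification.

```python
# Back-of-the-envelope checks for the Project Deschutes CDU figures.
# Assumes a water-like coolant; a glycol mix would change cp/density slightly.

MINUTES_PER_YEAR = 365.25 * 24 * 60

def allowed_downtime_minutes(availability: float) -> float:
    """Allowable downtime per year for a given availability target."""
    return (1.0 - availability) * MINUTES_PER_YEAR

def coolant_delta_t(load_kw: float, flow_gpm: float,
                    cp_kj_per_kg_k: float = 4.18,
                    density_kg_per_l: float = 1.0) -> float:
    """Temperature rise of the coolant loop from Q = m_dot * cp * dT."""
    flow_l_per_s = flow_gpm * 3.785 / 60.0      # GPM -> L/s
    m_dot = flow_l_per_s * density_kg_per_l     # kg/s
    return load_kw / (m_dot * cp_kj_per_kg_k)   # temperature rise in °C

print(f"99.999% availability -> {allowed_downtime_minutes(0.99999):.1f} min/year")
print(f"2,000 kW at 500 GPM  -> ~{coolant_delta_t(2000, 500):.1f} °C coolant rise")
```

Under these assumptions, absorbing the full 2,000 kW at the rated 500 GPM flow corresponds to roughly a 15 °C rise across the IT loop, and the availability target works out to about 5.3 minutes of downtime per year.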

Why Cooling is Crucial for GPU Performance

Cooling has a direct and significant impact on GPU performance and stability. GPUs are highly sensitive to heat: if they are not kept within their optimal temperature range, they automatically reduce their clock speeds through a process called thermal throttling to prevent damage.

The ‘Project Deschutes’ CDU is engineered to prevent this by handling a massive thermal load of 2,000 kW with a powerful 500 GPM flow rate and a low approach temperature of ≤3 °C. This robust cooling capability ensures that GPUs can operate at their maximum potential without being limited by heat, which is essential for maximizing performance in demanding AI workloads.
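
To show how thermal throttling can be observed in practice, the sketch below polls each GPU's temperature and thermal-slowdown flags through NVML (via the pynvml / nvidia-ml-py bindings). It is a minimal monitoring example, unrelated to the Deschutes specification itself, and the 85 °C alert threshold is an arbitrary placeholder, not a spec figure.

```python
# Minimal GPU thermal monitoring sketch using NVML (pip install nvidia-ml-py).
# The 85 °C threshold below is an arbitrary example value, not a spec figure.
import pynvml

pynvml.nvmlInit()
try:
    for i in range(pynvml.nvmlDeviceGetCount()):
        handle = pynvml.nvmlDeviceGetHandleByIndex(i)
        temp_c = pynvml.nvmlDeviceGetTemperature(handle, pynvml.NVML_TEMPERATURE_GPU)
        reasons = pynvml.nvmlDeviceGetCurrentClocksThrottleReasons(handle)
        thermal_throttle = bool(
            reasons & (pynvml.nvmlClocksThrottleReasonSwThermalSlowdown
                       | pynvml.nvmlClocksThrottleReasonHwThermalSlowdown))
        status = "THERMAL THROTTLING" if thermal_throttle else "ok"
        flag = " (above example 85 °C alert threshold)" if temp_c > 85 else ""
        print(f"GPU {i}: {temp_c} °C, {status}{flag}")
finally:
    pynvml.nvmlShutdown()
```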

With Gemini

AI DC Energy Optimization

Core Technologies for AI DC Power Optimization

This diagram systematically illustrates the core technologies for AI datacenter power optimization, showing power consumption breakdown by category and energy savings potential of emerging technologies.

Power Consumption Distribution:

  • Network: 5% – Data transmission and communication infrastructure
  • Computing: 50-60% – GPUs and server processing units (highest consumption sector)
  • Power: 10-15% – UPS, power conversion and distribution systems
  • Cooling: 20-30% – Server and equipment temperature management systems

Energy Savings from Emerging Technologies:

  1. Silicon Photonics: 1.5-2.5% – Optical communication technology improving network power efficiency
  2. Energy-Efficient GPUs & Workload Optimization: 12-18% (5-7%) – AI computation optimization
  3. High-Voltage DC (HVDC): 2-2.5% (1-3%) – Smart management, high-efficiency UPS, modular, renewable energy integration
  4. Liquid Cooling & Advanced Air Cooling: 4-12% – Cooling system efficiency improvements

This framework presents an integrated approach to maximizing power efficiency in AI datacenters, addressing all major power consumption areas through targeted technological solutions.
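
To make the savings figures concrete, the sketch below applies the main quoted range for each technology (ignoring the parenthetical figures) to a hypothetical 10 MW facility. Treating each percentage as an independent share of total facility power is a simplifying assumption for illustration only; in reality the gains interact, e.g. better cooling shrinks the baseline that GPU optimization acts on.

```python
# Illustrative combination of the quoted savings ranges for a hypothetical 10 MW AI DC.
# Assumption: each percentage applies independently to total facility power.
FACILITY_POWER_MW = 10.0  # hypothetical example facility

# (low %, high %) of total facility power, from the list above
savings_ranges = {
    "Silicon photonics":                      (1.5, 2.5),
    "Efficient GPUs & workload optimization": (12.0, 18.0),
    "High-voltage DC (HVDC)":                 (2.0, 2.5),
    "Liquid / advanced air cooling":          (4.0, 12.0),
}

total_low = sum(low for low, _ in savings_ranges.values())
total_high = sum(high for _, high in savings_ranges.values())

for name, (low, high) in savings_ranges.items():
    mid = (low + high) / 2
    print(f"{name:40s} ~{mid:4.1f}% -> {FACILITY_POWER_MW * mid / 100:.2f} MW")

print(f"Combined (simple sum): {total_low:.1f}-{total_high:.1f}% "
      f"-> {FACILITY_POWER_MW * total_low / 100:.1f}-{FACILITY_POWER_MW * total_high / 100:.1f} MW")
```

Even with this crude additive assumption, the quoted ranges sum to roughly 20-35% of facility power, which is why an integrated approach across all four areas matters more than any single technology.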

With Claude

Computing Changes with Power/Cooling

This chart compares power consumption and cooling requirements for server-grade computing hardware.

CPU Servers (Intel Xeon, AMD EPYC)

  • 1U-4U Rack: 0.2-1.2 kW power consumption
  • 208V power supply
  • Standard air cooling (CRAC, server fans) sufficient
  • PUE (Power Usage Effectiveness): 1.4-1.6

GPU Servers (DGX Series)

Power consumption and cooling complexity increase dramatically:

Low-Power Models (DGX-1, DGX-2)

  • 3.5-10 kW power consumption
  • Tesla V100 GPUs
  • High-performance air cooling required

Medium-Power Models (DGX A100, H100)

  • 6.5-10.2 kW power consumption
  • 400V high voltage required
  • Liquid cooling recommended or essential

Highest-Performance Models (DGX B200, GB200)

  • 14.3-120 kW extreme power consumption
  • Blackwell architecture GPUs
  • Full liquid cooling essential
  • PUE 1.1-1.2 with improved cooling efficiency

Key Trends Summary

The evolution from CPU to GPU computing represents a fundamental shift in data center infrastructure requirements. Power consumption scales dramatically from kilowatts to tens of kilowatts, driving the transition from traditional air cooling to sophisticated liquid cooling systems. Higher-performance systems paradoxically achieve better power efficiency through advanced cooling technologies, while requiring substantial infrastructure upgrades including high-voltage power delivery and comprehensive thermal management solutions.
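
A short worked example of what those PUE ranges mean in practice: PUE is total facility power divided by IT equipment power, so the same IT load carries much less overhead at PUE 1.1-1.2 than at 1.4-1.6. The sketch below uses illustrative values drawn from this section's ranges (a 1.2 kW CPU server at PUE 1.5 versus a 120 kW liquid-cooled GB200 rack at PUE 1.15); it is a comparison of example points, not measured data.

```python
# PUE (Power Usage Effectiveness) = total facility power / IT power.
# Overhead = facility power beyond what the IT equipment itself draws.

def facility_power_kw(it_load_kw: float, pue: float) -> float:
    """Total facility power needed to support a given IT load at a given PUE."""
    return it_load_kw * pue

# Illustrative points taken from the ranges in this section.
scenarios = [
    ("Air-cooled CPU server, 1.2 kW IT load, PUE 1.5",      1.2,   1.5),
    ("Liquid-cooled GB200 rack, 120 kW IT load, PUE 1.15",  120.0, 1.15),
]

for label, it_kw, pue in scenarios:
    total = facility_power_kw(it_kw, pue)
    overhead = total - it_kw
    print(f"{label}: total {total:.1f} kW, overhead {overhead:.1f} kW "
          f"({overhead / it_kw * 100:.0f}% of IT load)")
```

The overhead per kilowatt of IT load drops from about 50% to about 15% in this example, which is the sense in which the highest-power systems achieve better facility-level efficiency despite their far larger absolute draw.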

※ Disclaimer: All figures presented in this chart are approximate reference values and may vary significantly depending on actual environmental conditions, workloads, configurations, ambient temperature, and other operational factors.

With Claude

AI DC Key

From Claude with some prompting
This image titled “AI DC Key” illustrates the key components of an AI data center. Here’s an interpretation of the diagram:

  1. On the left, there’s an icon representing “Massive Data”.
  2. The center showcases four core elements of AI:
    • “Super Power”
    • “Super Computing” (utilizing GPUs)
    • “Super Cooling”
    • “Optimizing Operation”
  3. Below each core element, key considerations are listed:
    • Super Power: “Nature & Consistent”
    • Super Computing: “Super Parallel”
    • Super Cooling: “Liquid Cooling”
    • Optimizing Operation: “Data driven Auto & AI”
  4. On the right, an icon represents “Analyzed Data”.
  5. The overall flow illustrates the process of massive data being input, processed through the AI core elements, and resulting in analyzed data.

This diagram visualizes the essential components of a modern AI data center and their key considerations. It demonstrates how high-performance computing, efficient power management, advanced cooling technology, and optimized operations effectively process and analyze large-scale data, emphasizing the critical technologies or approaches for each element.