Data Center Shift with AI

This diagram illustrates how data centers are transforming as they enter the AI era.

📅 Timeline of Technological Evolution

The top section shows major technology revolutions and their timelines:

  • Internet ’95 (Internet era)
  • Mobile ’07 (Mobile era)
  • Cloud ’10 (Cloud era)
  • Blockchain
  • AI(LLM) ’22 (Large Language Model-based AI era)

🏢 Traditional Data Center Components

Conventional data centers consisted of the following core components:

  • Software
  • Server
  • Network
  • Power
  • Cooling

These were designed as relatively independent layers.

🚀 New Requirements in the AI Era

With the introduction of AI (especially LLMs), data centers require specialized infrastructure:

  1. LLM Model – Operating large language models
  2. GPU – High-performance graphics processing units (essential for AI computations)
  3. High B/W – High-bandwidth networks (for processing large volumes of data)
  4. SMR/HVDC – Switched-Mode Rectifier/High-Voltage Direct Current power systems
  5. Liquid/CDU – Liquid cooling/Cooling Distribution Units (for cooling high-heat GPUs)

🔗 Key Characteristic of AI Data Centers: Integrated Design

The circular connection in the center of the diagram represents the most critical feature of AI data centers:

Tight Interdependency between SW/Computing/Network ↔ Power/Cooling

Unlike traditional data centers, in AI data centers:

  • GPU-based computing consumes enormous power and generates significant heat
  • High B/W networks consume additional power during massive data transfers between GPUs
  • Power systems (SMR/HVDC) must stably supply high power density
  • Liquid cooling (Liquid/CDU) must handle high-density GPU heat in real-time

These elements must be closely integrated in design, and optimizing just one element cannot guarantee overall system performance.
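As a rough illustration of this coupling, the sketch below traces how a compute decision (GPUs per rack) propagates into power and cooling sizing. It relies on two physical facts: essentially all electrical power drawn by a rack ends up as heat, and the coolant flow needed follows from Q = ṁ · c · ΔT. The rack size, per-GPU wattage, overhead fraction, and coolant temperature rise are illustrative assumptions, not values from the diagram.

```python
# Sketch: how compute load couples to power and cooling sizing.
# All numeric inputs below are illustrative assumptions, not values from the diagram.

WATER_SPECIFIC_HEAT_J_PER_KG_K = 4186.0  # approximate specific heat of water


def rack_power_kw(gpus_per_rack: int, watts_per_gpu: float, overhead_fraction: float) -> float:
    """Electrical power drawn by one rack: GPUs plus a fractional overhead
    for CPUs, NICs, fans, and conversion losses (the overhead is an assumption)."""
    return gpus_per_rack * watts_per_gpu * (1.0 + overhead_fraction) / 1000.0


def coolant_flow_kg_per_s(heat_kw: float, delta_t_k: float) -> float:
    """Mass flow of water needed to carry away heat_kw with a coolant
    temperature rise of delta_t_k, from Q = m_dot * c * delta_T."""
    return heat_kw * 1000.0 / (WATER_SPECIFIC_HEAT_J_PER_KG_K * delta_t_k)


# Hypothetical rack: 32 GPUs at 700 W each, 20% overhead, 10 K coolant rise.
power_kw = rack_power_kw(32, 700.0, 0.20)
flow = coolant_flow_kg_per_s(power_kw, 10.0)
print(f"rack power: {power_kw:.1f} kW, coolant flow: {flow:.2f} kg/s")
```

Changing any one input (more GPUs, a tighter temperature rise) shifts both the power and the cooling requirement, which is the interdependency the diagram describes.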

💡 Key Message

AI workloads demand a move away from the layer-by-layer, independent design of conventional data centers: computing, network, power, and cooling must be designed as one integrated system. A holistic approach is therefore essential when building AI data centers.


📝 Summary

AI data centers fundamentally differ from traditional data centers through the tight integration of computing, networking, power, and cooling systems. GPU-based AI workloads create unprecedented power density and heat generation, requiring liquid cooling and HVDC power systems. Success in AI infrastructure demands holistic design where all components are co-optimized rather than independently engineered.

With Claude

DC Power(R)

Data Center DC Power System Comprehensive Overview

This diagram illustrates the complete DC (Direct Current) power supply system for a data center infrastructure.

1. Core Components

① Power Source

  • 15.4 kV High Voltage AC Power
  • Received from utility grid
  • Efficient long-distance transmission (Efficient Delivery)
  • High voltage warning indicator (High Warning)

② Primary Transformer

  • Voltage conversion: 15.4 kV → 6.6 kV
  • Function: Steps down high voltage to medium voltage
  • Transformation method: Voltage Step-down
  • Adjusts voltage for internal data center distribution

③ Backup Power #1 – Generator System (Long-Time Backup)

  • Configuration: Diesel generator + Fuel tank
  • Characteristic: Long-duration backup capability
  • Purpose: Continuous power supply during main power outage
  • Advantage: Unlimited operation as long as fuel is supplied

④ Secondary Transformer

  • Voltage conversion: 6.6 kV → 380 V
  • Function: Steps down medium voltage to low voltage
  • Transformation method: Voltage Step-down
  • Provides appropriate voltage for UPS and final loads

⑤ Backup Power #2 – UPS System (Short-Time Backup)

  • Configuration: UPS + Battery
  • Characteristic: Short-duration instantaneous backup
  • Purpose: Ensures uninterrupted power during main-to-generator transition
  • Role: Supplies power during generator startup time (10-30 seconds)

⑥ Final Load (Power Use)

  • Output voltage: 220 V AC or 48 V DC
  • Target: Servers, network equipment, storage systems
  • Feature: Stable IT infrastructure operation with DC power

2. Voltage Conversion Flow

15.4 kV (AC)  →  6.6 kV (AC)  →  380 V (AC)  →  48 V (DC) / 220 V (AC)
[Reception]      [Primary TX]    [Secondary TX] [Final Conversion]
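To see why each stage steps the voltage down only as far as needed, it helps to look at the line current the same load would draw at each level. The sketch below uses the standard balanced three-phase relation I = P / (√3 · V · pf). The three-phase assumption, the 1 MW load, and the unity power factor are illustrative choices; only the voltages come from the flow above.

```python
import math

# Voltage stages taken from the flow above (treated as line-to-line volts).
STAGES = [("Reception", 15_400.0), ("Primary TX", 6_600.0), ("Secondary TX", 380.0)]


def three_phase_line_current(power_w: float, line_voltage_v: float,
                             power_factor: float = 1.0) -> float:
    """Line current of a balanced three-phase load: I = P / (sqrt(3) * V * pf)."""
    return power_w / (math.sqrt(3) * line_voltage_v * power_factor)


LOAD_W = 1_000_000.0  # hypothetical 1 MW load (assumption, not from the diagram)

for name, volts in STAGES:
    amps = three_phase_line_current(LOAD_W, volts)
    print(f"{name:<13} {volts:>8.0f} V  ->  {amps:8.1f} A")
```

For a fixed load, current scales inversely with voltage, so the 380 V stage carries roughly forty times the current of the 15.4 kV stage. Keeping the voltage high for as long as possible limits conductor size and resistive losses.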

3. Redundant Backup Architecture

Two-Tier Backup System

Main Power (15.4 kV) ─────┐
                          ├──→ Transform ──→ Load
Generator (Long-term) ────┘
         ↓
    UPS/Battery (Short-term) ──→ Instantaneous uninterrupted guarantee

Backup Strategy:

  • Generator: Hours to days operation (fuel-dependent)
  • UPS: Minutes to tens of minutes operation (battery capacity-dependent)
  • Combined effect: UPS covers generator startup gap to achieve complete uninterrupted power

4. Operating Scenarios

Scenario 1: Normal Operation

Utility power (15.4 kV) → Primary transform (6.6 kV) → Secondary transform (380 V) → UPS → DC load (48 V)

Scenario 2: Momentary Power Outage

  1. Main power interruption detected (< 4ms)
  2. UPS battery immediately engaged
  3. Continuous power supply to load with zero interruption

Scenario 3: Extended Power Outage

  1. Main power interruption detected
  2. UPS battery immediately engaged (maintains uninterrupted power)
  3. Generator automatically starts (10-30 seconds required)
  4. Generator reaches rated capacity and replaces main power
  5. Generator power charges UPS + supplies load
  6. Long-term operation with continuous fuel supply

Scenario 4: Generator Failure

  • Limited-time operation within UPS battery capacity
  • Priority operation for critical systems or graceful shutdown
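The scenarios above can be summarized as a small decision function: which source feeds the load at a given time after the utility fails. This is a minimal sketch, not a model of real switchgear logic. The function name and the example figures (a 30-second generator start, the upper end of the range quoted above, and a 10-minute UPS runtime) are assumptions chosen for illustration.

```python
def power_source(t_s: float, generator_start_s: float, ups_runtime_s: float,
                 generator_ok: bool = True) -> str:
    """Return which source feeds the load t_s seconds after a utility outage.

    generator_start_s: time for the generator to reach rated output.
    ups_runtime_s:     how long the UPS battery can carry the load alone.
    generator_ok:      False models Scenario 4 (generator failure).
    """
    if generator_ok and t_s >= generator_start_s:
        return "generator"   # Scenario 3: generator has taken over
    if t_s < ups_runtime_s:
        return "ups"         # Scenario 2, or the bridge phase of Scenario 3
    return "none"            # battery exhausted with no generator available


# Hypothetical site: 30 s generator start, 600 s (10 min) of UPS battery.
for t in (0, 10, 30, 3600):
    print(t, power_source(t, generator_start_s=30, ups_runtime_s=600))

# Scenario 4: the generator never starts, so the load is lost once the battery is empty.
print(power_source(600, 30, 600, generator_ok=False))
```

The function also makes the sizing constraint visible: if `ups_runtime_s` is shorter than `generator_start_s`, there is a window in which it returns "none", which is exactly the gap the UPS exists to close.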

5. Additional Protection and Control Devices

Supplementary devices for system stability and safety:

Circuit Breaker Hierarchy

  • GCB (Gas Circuit Breaker): Primary protection at reception point
  • VCB (Vacuum Circuit Breaker): Vacuum interruption, medium voltage protection
  • ACB (Air Circuit Breaker): Low voltage distribution panel protection
  • MCCB (Molded Case Circuit Breaker): Individual load protection
  • Role: Circuit interruption during overload or short circuit to protect equipment and personnel

Switching Devices

  • STS (Static Transfer Switch): High-speed transfer between main power ↔ generator
  • ATS (Automatic Transfer Switch): Automatic transfer between power sources (UPS level)
  • ALTS (Automatic Load Transfer Switch): Automatic load transfer (for 22.9 kV class)
  • CCTS: Circuit breaker control and transfer system
  • Role: Automatic/immediate transfer to backup power during power failure

Switching Points (Red circle indicators)

  • Reception point, before/after transformers, backup power injection points
  • Critical points for power path changes and redundancy implementation

6. Key System Features

✅ Uninterruptible Power Supply: Three-stage protection with main power → generator → UPS
✅ Multi-stage Voltage Conversion: Ensures both transmission efficiency and usage safety
✅ Automated Backup Transfer: Automatic switching without human intervention
✅ Hierarchical Protection: Stage-by-stage circuit breakers prevent cascading failures
✅ Scalable Architecture: Modular configuration enables easy capacity expansion


Summary

This DC power system architecture keeps mission-critical data center infrastructure running without interruption by combining redundant power sources, automated failover, and multi-layered protection. Long-term generator backup and short-term UPS batteries together cover both momentary and extended grid outages, provided the generator starts and its fuel supply is maintained. The multi-stage voltage transformation (15.4 kV → 6.6 kV → 380 V → 48 V DC) balances transmission efficiency with end-use safety while accommodating diverse IT equipment requirements.


With Claude

‘tightly fused’

This illustration visualizes the evolution of data centers, contrasting the traditionally separated components with the modern AI data center where software, compute, network, and crucially, power and cooling systems are ‘tightly fused’ together. It emphasizes how power and advanced cooling are organically intertwined with GPU and memory, directly impacting AI performance and highlighting their inseparable role in meeting the demands of high-performance AI. This tight integration symbolizes a pivotal shift for the modern AI era.

Switching of the power

This diagram illustrates two main power switching methods used in electrical systems: ATS (Automatic Transfer Switch) and STS (Static Transfer Switch).

System Configuration

  • Power Sources: Utility grid and Generator
  • Protection: UPS systems
  • Load: Server infrastructure

ATS (Automatic Transfer Switch)

Location: Switchgear Area (Power Distribution Board)

Characteristics:

  • Mechanism: Mechanical breakers/contacts
  • Transfer Time: Several seconds (including generator start-up)
  • Advantages: Relatively simple, lower cost
  • Application: Standard power transfer systems

STS (Static Transfer Switch)

Location: Panelboard Area (Distribution Panel)

Characteristics:

  • Mechanism: Semiconductor devices (SCR, IGBT)
  • Transfer Time: A few milliseconds (near seamless)
  • Advantages: Ensures high-quality power supply
  • Disadvantages: Expensive

Key Differences

  1. Transfer Speed: STS is significantly faster (milliseconds vs seconds)
  2. Technology: ATS uses mechanical switching, STS uses electronic switching
  3. Cost: ATS is more economical
  4. Power Quality: STS provides more stable power delivery
  5. Complexity: STS requires more sophisticated semiconductor control

Applications

  • ATS: Suitable for applications that can tolerate brief power interruptions
  • STS: Critical for sensitive equipment like servers, data centers, and medical facilities requiring uninterrupted power

Summary: This diagram shows a redundant power system where ATS provides cost-effective backup power switching while STS offers near-instantaneous transfer for critical loads. Both systems work together with UPS backup to ensure continuous power supply to servers and sensitive equipment.
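The trade-off between the two switch types can be expressed as a simple selection rule: pick the cheapest switch whose transfer time fits within what the load can ride through. The sketch below is illustrative only. The transfer times (5 s for "several seconds", 4 ms for "a few milliseconds"), the relative cost ranks, and the ride-through values in the usage lines are assumptions, and `cheapest_adequate_switch` is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TransferSwitch:
    name: str
    mechanism: str
    transfer_time_s: float  # assumed representative value, not a datasheet figure
    relative_cost: int      # 1 = cheaper, 2 = more expensive (ordering only)


ATS = TransferSwitch("ATS", "mechanical breakers/contacts", 5.0, 1)
STS = TransferSwitch("STS", "semiconductor devices (SCR, IGBT)", 0.004, 2)


def cheapest_adequate_switch(ride_through_s: float) -> Optional[TransferSwitch]:
    """Return the lowest-cost switch whose transfer time does not exceed the
    interruption the load can tolerate, or None if neither is fast enough."""
    candidates = [s for s in (ATS, STS) if s.transfer_time_s <= ride_through_s]
    return min(candidates, key=lambda s: s.relative_cost, default=None)


# A load that tolerates 10 s of interruption can use the cheaper ATS.
print(cheapest_adequate_switch(10.0).name)
# A load with only 20 ms of ride-through (hypothetical figure) needs the STS.
print(cheapest_adequate_switch(0.020).name)
```

When the function returns `None`, no switch alone is fast enough and the load must sit behind a UPS, which matches the combined arrangement the diagram shows.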

With Claude

LLM Efficiency with Cooling

This image demonstrates the critical impact of cooling stability on both LLM performance and energy efficiency in GPU servers through benchmark results.

Cascading Effects of Unstable Cooling

Problems with Unstable Air Cooling:

  • GPU Temperature: 54-72°C (high and unstable)
  • Thermal throttling occurs: GPUs automatically reduce clock speeds to prevent overheating, which significantly degrades performance
  • Result: Double penalty of reduced performance + increased power consumption

Energy Efficiency Impact:

  • Power Consumption: 8.16kW (high)
  • Performance: 46 TFLOPS (degraded)
  • Energy Efficiency: 5.6 TFLOPS/kW (poor performance-to-power ratio)

Benefits of Stable Liquid Cooling

Temperature Stability Achievement:

  • GPU Temperature: 41-50°C (low and stable)
  • No thermal throttling → sustained optimal performance

Energy Efficiency Improvement:

  • Power Consumption: 6.99kW (14% reduction)
  • Performance: 54 TFLOPS (17% improvement)
  • Energy Efficiency: 7.7 TFLOPS/kW (38% improvement)
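The percentages above follow directly from the four benchmark figures. The short sketch below reproduces the arithmetic; the function names are illustrative, and only the power and TFLOPS values come from the benchmark.

```python
def efficiency(tflops: float, power_kw: float) -> float:
    """Performance-to-power ratio in TFLOPS per kW."""
    return tflops / power_kw


def pct_change(old: float, new: float) -> float:
    """Signed percentage change from old to new."""
    return (new - old) / old * 100.0


# Benchmark figures quoted in this section.
air = {"power_kw": 8.16, "tflops": 46.0}
liquid = {"power_kw": 6.99, "tflops": 54.0}

air_eff = efficiency(air["tflops"], air["power_kw"])
liquid_eff = efficiency(liquid["tflops"], liquid["power_kw"])

print(f"air:    {air_eff:.1f} TFLOPS/kW")
print(f"liquid: {liquid_eff:.1f} TFLOPS/kW")
print(f"power change:       {pct_change(air['power_kw'], liquid['power_kw']):+.1f}%")
print(f"performance change: {pct_change(air['tflops'], liquid['tflops']):+.1f}%")
print(f"efficiency change:  {pct_change(air_eff, liquid_eff):+.1f}%")
```

Computed from the unrounded ratios, the efficiency gain comes out at roughly 37%. The 38% figure corresponds to dividing the already rounded values, 7.7 by 5.6.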

Core Mechanisms: How Cooling Affects Energy Efficiency

  1. Thermal Throttling Prevention: Stable cooling allows GPUs to maintain peak performance continuously
  2. Power Efficiency Optimization: Eliminates inefficient power consumption caused by overheating
  3. Performance Consistency: Unstable cooling can cause GPUs to use 50% of power budget while delivering only 25% performance

Advanced cooling systems can achieve energy savings ranging from 17% to 23% compared to traditional methods. This benchmark shows that, perhaps counterintuitively, investing in proper cooling substantially improves overall energy efficiency.

Final Summary

Unstable cooling triggers thermal throttling that simultaneously degrades LLM performance while increasing power consumption, creating a dual efficiency loss. Stable liquid cooling achieves 17% performance gains and 14% power savings simultaneously, improving energy efficiency by 38%. In AI infrastructure, adequate cooling investment is essential for optimizing both performance and energy efficiency.

With Claude

Emergency Power System

This image shows a diagram of an Emergency Power System and the characteristics of each component.

Overall System Structure

At the top, the power grid is connected to servers/data centers, and three backup power options are presented in case of power supply interruption.

Three Backup Power Options

1. Generator

  • Long-term operation: Unlimited operation as long as fuel is available
  • Operation method: Engine rotation โ†’ Power generation
  • Type: Diesel engine generator
  • Disadvantages:
    • Start-up delay, so it cannot cover instantaneous power outages on its own
    • Noise and exhaust emissions
    • Periodic testing required
    • Requires integration with ATS (Automatic Transfer Switch)

2. Dynamic UPS

  • Features:
    • Uninterrupted operation (the flywheel carries the load until the diesel engine starts), followed by long-term operation on the engine
    • Flywheel kinetic energy storage
    • Combined generator and diesel engine
  • Advantages: Seamless power supply without STS (Static Transfer Switch)
  • Disadvantages: High initial cost, large footprint, noise

DR (Diesel Rotary) UPS: A special form of Dynamic UPS that provides uninterrupted power through flywheel energy storage technology.

3. Static UPS

  • Operation time: Instantaneous/Short-term (typically 5-15 minutes)
  • Power quality: Clean power supply
  • Configuration: Rectifier (AC → DC) → Battery (DC) → Inverter (DC → AC)
  • Features:
    • Millisecond-level instant transfer
    • Battery life 3-5 years, replacement costs, heat generation issues

Key Characteristics Summary

Generators can operate long-term with fuel supply but have start-up delays, while Static UPS provides immediate power but only for short durations. Dynamic UPS (including DR UPS) is a hybrid solution that provides uninterrupted power through flywheel technology while enabling long-term operation when combined with diesel engines. In actual operations, it’s common to use these systems in combination, considering the advantages and disadvantages of each system.

With Claude

Power Circuit Breaker

This image presents a Power Circuit Breaker classification diagram showing the types and characteristics of electrical circuit breakers used in power systems.

System Overview

Power Flow: The diagram illustrates the electrical power path from power plant → transmission lines → circuit breakers → distribution panels.

Circuit Breaker Classification

The breakers are categorized by voltage levels and arc extinguishing methods:

Voltage Classifications

  • Very High Voltage: 66~800kV
  • High Voltage: 3.3~38kV
  • Low (Utilization) Voltage: 380~690V, 110~600V, 110~440V

Breaker Types and Arc Extinguishing Methods

  1. GIS/GCB (Gas Insulated Switchgear/Gas Circuit Breaker)
    • 66~800kV
    • Uses SF6 gas for insulation and arc extinguishing
  2. VCB (Vacuum Circuit Breaker)
    • 3.3~38kV
    • Vacuum arc extinguishing method
  3. ACB (Air Circuit Breaker)
    • 380~690V
    • Air + arc chute method
  4. MCCB (Molded Case Circuit Breaker)
    • 110~600V
    • Air + arc chute method
  5. ELCB (Earth Leakage Circuit Breaker)
    • 110~440V
    • Ground fault protection, no arc extinguishing
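The classification above works as a lookup table keyed by voltage range. The sketch below encodes the ranges as listed and returns every breaker type whose range covers a given system voltage; the function name is illustrative. Because the low-voltage ranges overlap, more than one type can match, and gaps between ranges simply return no match.

```python
# (type, min volts, max volts, interruption method), with ranges as listed above.
BREAKERS = [
    ("GIS/GCB", 66_000, 800_000, "SF6 gas"),
    ("VCB", 3_300, 38_000, "vacuum"),
    ("ACB", 380, 690, "air + arc chute"),
    ("MCCB", 110, 600, "air + arc chute"),
    ("ELCB", 110, 440, "ground fault protection"),
]


def candidate_breakers(voltage_v: float) -> list:
    """Return the names of all breaker types whose listed range includes voltage_v."""
    return [name for name, low, high, _ in BREAKERS if low <= voltage_v <= high]


print(candidate_breakers(6_600))    # medium-voltage distribution level
print(candidate_breakers(380))      # low-voltage level, several types overlap
print(candidate_breakers(154_000))  # transmission level
```

The lookup only narrows the choice by voltage. Selecting an actual device also depends on rated current, breaking capacity, and whether leakage protection is required, none of which the diagram covers.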

Key Safety Message

The diagram emphasizes “The bigger (Arc) the more dangerous” – highlighting that higher voltages require more sophisticated and safer arc extinguishing technologies.

Summary: This technical diagram systematically categorizes power circuit breakers from ultra-high voltage (800kV) to low voltage (110V) applications, demonstrating how arc extinguishing complexity increases with voltage levels. The chart serves as an educational reference showing that higher voltage systems require more advanced safety mechanisms like SF6 gas insulation, while lower voltage applications can use simpler air-based arc interruption methods.

With Claude