Basic Power Operations

This image illustrates “Basic Power Operations,” showing the path and processes of electricity flowing from source to end-use.

The upper diagram includes the following key components from left to right:

  • Power Source/Intake – High voltage for efficient delivery, marked with a high-voltage warning
  • Transformer – Performs voltage step-down
  • Generator and Fuel Tank – Backup Power
  • Transformer #2 – Additional voltage step-down
  • UPS/Battery – 2nd Backup Power
  • PDU/TOB – Distributes power to the end servers

The diagram displays two backup power systems:

  • Backup Power (Full outage) – Covers complete power failures, with backup runtime supplied by the fuel-tank-fed generators
  • Backup Power (Partial outage) – Covers partial outages, with backup runtime supplied by the UPS batteries

The simplified diagram at the bottom summarizes the complex power system into these fundamental elements:

  1. Source – Origin point of power
  2. Step-down – Voltage conversion
  3. Backup – Emergency power supply
  4. Use – Final power consumption

Throughout all stages of this process, two critical functions occur continuously:

  • Transmit – The continuous transfer of power between and during all steps
  • Switching/Block – Control points distributed throughout the system that direct, regulate, or block power flow as needed

This demonstrates that seemingly complex power systems can be distilled into these essential concepts, with transmission and switching/blocking functioning as integral operations that connect and control all stages of the power delivery process.
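
The four simplified stages and the switching/blocking behavior can be sketched in code. This is a minimal illustration only; the function name, argument names, and the source-selection order are assumptions, not details taken from the diagram.

```python
# A minimal sketch of the simplified power path (Source -> Step-down -> Backup -> Use).
# All names and the selection order here are illustrative assumptions.

def deliver_power(utility_ok: bool, ups_charged: bool, generator_ready: bool) -> str:
    """Pick a power source the way the diagram's switching/blocking points do."""
    if utility_ok:
        return "utility"          # normal path: source -> step-down -> use
    if ups_charged:
        return "ups_battery"      # partial outage: battery rides through
    if generator_ready:
        return "generator"        # full outage: fuel-tank-fed generator takes over
    return "blocked"              # no source available: block power flow

print(deliver_power(utility_ok=False, ups_charged=True, generator_ready=True))
# → ups_battery
```

In practice the UPS bridges the gap until the generator spins up, which is why the battery is checked before the generator in this sketch.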

With Claude

Data in AI DC

This image illustrates a data monitoring system for an AI data center server room. Titled “Data in AI DC Server Room,” it depicts the relationships between key elements being monitored in the data center.

The system consists of four main components, each with detailed metrics:

  1. GPU Workload – Right center
    • Computing Load: GPU utilization rate (%) and type of computational tasks (training vs. inference)
    • Power Consumption: Real-time power consumption of each GPU (W) – Example: NVIDIA H100 GPU consumes up to 700W
    • Workload Pattern: Periodicity of workload (peak/off-peak times) and predictability
    • Memory Usage: GPU memory usage patterns (e.g., HBM3 memory bandwidth usage)
  2. Power Infrastructure – Left
    • Power Usage: Real-time power output and efficiency of UPS, PDU, and transformers
    • Power Quality: Voltage, frequency stability, and power loss rate
    • Power Capacity: Types and proportions of supplied energy, ensuring sufficient power availability for current workload operations
  3. Cooling System – Right
    • Cooling Device Status: Air-cooling fan speed (RPM), liquid cooling pump flow rate (LPM), and coolant temperature (°C)
    • Environmental Conditions: Data center internal temperature, humidity, air pressure, and hot/cold zone temperatures – critical for server operations
    • Cooling Efficiency: Power Usage Effectiveness (PUE) and proportion of power consumed by the cooling system
  4. Server/Rack – Top center
    • Rack Power Density: Power consumption per rack (kW) – Example: GPU server racks range from 30 to 120 kW
    • Temperature Profile: Temperature (°C) of GPUs, CPUs, memory modules, and heat distribution
    • Server Status: Operational state of servers (active/standby) and workload distribution status
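
Two of the metrics above, rack power density and PUE, are simple enough to compute directly. The sketch below uses illustrative numbers (e.g., 8 GPUs × 700 W per server, 2 kW of per-server overhead) that are assumptions, not figures from the diagram.

```python
# A small sketch of two metrics mentioned above: rack power density and PUE.
# The example numbers are illustrative assumptions.

def rack_power_kw(servers_per_rack: int, gpus_per_server: int,
                  gpu_watts: float, overhead_watts: float) -> float:
    """Total rack power in kW: GPU draw plus per-server overhead (CPU, fans, ...)."""
    per_server = gpus_per_server * gpu_watts + overhead_watts
    return servers_per_rack * per_server / 1000.0

def pue(total_facility_kw: float, it_equipment_kw: float) -> float:
    """Power Usage Effectiveness: total facility power / IT equipment power."""
    return total_facility_kw / it_equipment_kw

rack = rack_power_kw(servers_per_rack=8, gpus_per_server=8,
                     gpu_watts=700, overhead_watts=2000)
print(rack)                                              # → 60.8 (kW per rack)
print(pue(total_facility_kw=1300, it_equipment_kw=1000)) # → 1.3
```

The 60.8 kW result lands inside the 30–120 kW range the text gives for GPU server racks, and a PUE of 1.3 means cooling and other facility loads add 30% on top of IT power.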

The workflow sequence indicated at the bottom of the diagram represents:

  1. ① GPU WORK: Initial execution of AI workloads – GPU computational tasks begin, generating system load
  2. ② with POWER USE: Increased power supply for GPU operations – Power demand increases with GPU workload, and power infrastructure responds accordingly
  3. ③ COOLING WORK: Cooling processes activated in response to heat generation
    • Sensing: Temperature sensors detect server and rack thermal conditions, monitoring hot/cold zone temperature differentials
    • Analysis: Analysis of collected temperature data, determining cooling requirements
    • Action: Automatic adjustment of cooling equipment (fan speed, coolant flow rate, etc.)
  4. ④ SERVER OK: Maintenance of normal server operation through proper power supply and cooling – Temperature and power remain stable, allowing GPU workloads to continue running under optimal conditions
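
The Sensing → Analysis → Action loop in step ③ can be sketched as a simple proportional controller. The target temperature, thresholds, and fan/pump scaling below are hypothetical assumptions chosen only to make the loop concrete.

```python
# A toy sketch of the Sensing -> Analysis -> Action loop in step ③.
# Target temperature and the fan/pump scaling factors are hypothetical.

def cooling_action(inlet_temp_c: float, target_c: float = 27.0) -> dict:
    """Map a sensed temperature to a cooling adjustment (proportional control)."""
    error = inlet_temp_c - target_c                  # Analysis: how far above target?
    if error <= 0:
        return {"fan_rpm_pct": 30, "pump_lpm": 20}   # baseline cooling
    # Action: scale fan speed and coolant flow with the temperature excess
    fan = min(100, 30 + 10 * error)
    pump = min(60, 20 + 5 * error)
    return {"fan_rpm_pct": fan, "pump_lpm": pump}

print(cooling_action(31.0))  # 4 °C over target: faster fans, more coolant flow
```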

The arrows indicate data flow and interrelationships between systems, showing connections from power infrastructure to servers and from cooling systems to servers. This integrated system enables efficient and stable data center operation by detecting increased power demand and heat generation from GPU workloads, and adjusting cooling systems in real-time accordingly.

With Claude

Key Factors in DC

This image is a diagram showing the key components of a Data Center (DC).

The diagram visually represents the core elements that make up a data center:

  1. Building – Shown on the left with a building icon, representing the physical structure of the data center.
  2. Core infrastructure elements (in the central blue area):
    • Network – Data communication infrastructure
    • Computing – Servers and processing equipment
    • Power – Energy supply systems
    • Cooling – Temperature regulation systems
  3. The central orange circle represents server racks, which are connected to power supply units (transformers), cooling equipment, and network devices.
  4. Digital Service – Displayed on the right, representing the end services that all this infrastructure ultimately delivers.

This diagram illustrates how a data center extends from a physical building through core elements such as network, computing, power, and cooling to ultimately deliver digital services.

With Claude

Modular vs Rack Cluster DC

This image illustrates a comparison between two main data center architecture approaches: “Rack Cluster DC” and “Modular DC.”

On the left side, basic infrastructure elements are depicted: power supply components (transformers, generators), cooling systems, and network equipment. On the right side, two different data center configurations are presented.

Rack Cluster Data Center (Left)

  • Features: “Dense Computing, High Power and Cooling, Scaling Unit”
  • Organized at the rack level within a cluster
  • Shows structure connected by red solid and dotted lines
  • Multiple server racks arranged in a regular pattern

Modular Data Center (Right)

  • Features: “Modular Design, Flexible Scaling, Rapid Deployment”
  • Organized at the module level, including power, cooling, and racks as integrated units
  • Shows structure connected by blue solid and dotted lines
  • Functional elements (power, cooling, servers) integrated into single modules

Both approaches display expansion units labeled “NEW” at the bottom, demonstrating the scalability of each approach.

This diagram visually compares the structural differences, scalability, and component arrangements between the traditional rack cluster approach and the modular approach to data center design.

With Claude

Connected in AI DC

This diagram titled “Data is Connected in AI DC” illustrates the relationships starting from workload scheduling in an AI data center.

Key aspects of the diagram:

  1. The entire system’s interconnected relationships begin with workload scheduling.
  2. The diagram divides the process into two major phases:
    • Deterministic phase: Primarily concerned with power requirements that operate in a predictable, planned manner.
    • Statistical phase: Focused on cooling requirements, where predictions vary based on external environmental conditions.
  3. The “Prophet Commander” at the workload scheduling stage can predict/direct future requirements, allowing the system to prepare power (1.1 Power Ready!!) and cooling (1.2 Cooling Ready!!) in advance.
  4. Process flow:
    • Job allocation from workload scheduling to GPU cluster
    • GPUs request and receive power
    • Temperature rises due to operations
    • Cooling system detects temperature and activates cooling
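
The predictive flow above, where the scheduler forecasts a job's power and heat before dispatch so that power (1.1) and cooling (1.2) are ready in advance, can be sketched as follows. All figures (watts per GPU, cooling-to-IT ratio) are illustrative assumptions, not values from the diagram.

```python
# A minimal sketch of predictive resource preparation at scheduling time.
# 700 W per GPU and a 0.3 cooling-to-IT power ratio are illustrative assumptions.

def prepare_resources(gpus_requested: int, watts_per_gpu: float = 700.0,
                      cooling_kw_per_it_kw: float = 0.3) -> dict:
    """Deterministic power forecast plus a statistical cooling estimate for one job."""
    power_kw = gpus_requested * watts_per_gpu / 1000.0   # deterministic phase
    cooling_kw = power_kw * cooling_kw_per_it_kw         # statistical phase
    return {"power_ready_kw": power_kw, "cooling_ready_kw": cooling_kw}

print(prepare_resources(64))
# 64 GPUs at 700 W each: about 44.8 kW of power and ~13.4 kW of cooling headroom
```

The split mirrors the diagram: the power forecast follows directly from the GPU count (deterministic), while the cooling estimate is a ratio that would vary with ambient conditions (statistical).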

This diagram illustrates the interconnected workflow in AI data centers, beginning with workload scheduling that enables predictive resource management. The process flows from deterministic power requirements to statistical cooling needs, with the “Prophet Commander” enabling proactive preparation of power and cooling resources. This integrated approach demonstrates how workload prediction can drive efficient resource allocation throughout the entire AI data center ecosystem.

With Claude

Data Center

This image explains the fundamental concept and function of a data center:

  1. Left: “Data in a Building” – Illustrates a data center as a physical building that houses digital data (represented by binary code of 0s and 1s).
  2. Center: “Data Changes” – With the caption “By Energy,” showing how data is processed and transformed through the consumption of energy.
  3. Right: “Connect by Data” – Demonstrates how processed data from the data center connects to the outside world, particularly the internet, forming networks.

This diagram visualizes the essential definition of a data center – a physical building that stores data, consumes energy to process that data, and plays a crucial role in connecting this data to the external world through the internet.

With Claude

DC growth

Data centers have expanded rapidly from the early days of cloud computing to the explosive growth driven by AI and ML.
Initially, growth was steady as enterprises moved to the cloud. However, with the rise of AI and ML, demand for powerful GPU-based computing has surged.
The global data center market, which grew at a CAGR of around 10% during the cloud era, is now accelerating to an estimated CAGR of 15–20% fueled by AI workloads.
This shift is marked by massive parallel processing with GPUs, transforming data centers into AI factories.
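
The CAGR shift described above compounds quickly. The sketch below compares five years of growth at 10% (cloud era) versus 18% (a point inside the 15–20% AI-era range); the five-year horizon and the base index of 100 are illustrative assumptions.

```python
# A quick sketch of what the CAGR shift means over five years.
# The 10% and 18% rates come from the text's 10% and 15-20% figures.

def grow(base: float, cagr: float, years: int) -> float:
    """Compound a market-size index at a constant annual growth rate."""
    return base * (1 + cagr) ** years

cloud_era = grow(100, 0.10, 5)   # index 100 grows to about 161 at 10% CAGR
ai_era = grow(100, 0.18, 5)      # index 100 grows to about 229 at 18% CAGR
print(round(cloud_era), round(ai_era))
# → 161 229
```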

With ChatGPT