PIM (Processing-in-Memory)

This image illustrates the evolution of computing architectures, comparing three major computing paradigms:

1. General Computing (Von Neumann Architecture)

  • Traditional CPU-memory structure
  • CPU and memory are separate; the CPU processes complex instructions
  • Data and instructions move between memory and CPU

2. GPU Computing

  • Collaborative structure between CPU and GPU
  • GPU performs simple mathematical operations with massive parallelism
  • Provides high throughput
  • Uses new types of memory specialized for AI computing

3. PIM (Processing-in-Memory)

The core focus of the image, PIM features the following characteristics:

Core Concept:

  • “Simple Computing” approach that performs operations directly within new types of memory
  • Integrated structure of memory and processor

Key Advantages:

  • Data Movement Minimization: Reduces in-memory copy/reordering operations
  • Parallel Data Processing: Parallel processing of matrix/vector operations
  • Repetitive Simple Operations: Optimized for add/multiply/compare operations
  • “Simple Computing”: Efficient operations without complex control logic

PIM is gaining attention as a next-generation computing paradigm that can significantly improve energy efficiency and performance compared to existing architectures, particularly for tasks involving massive repetitive simple operations such as AI/machine learning and big data analytics.
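
The "simple computing" idea above can be illustrated with a toy sketch (not a model of any real PIM hardware): each memory bank applies a repetitive operation to the data it already holds, instead of shipping that data to a central processor. The `MemoryBank` class and the bank layout are illustrative assumptions.

```python
# Toy sketch only: each "bank" runs a simple op on its resident data,
# mimicking PIM's in-place compute; no real memory hardware is modeled.

class MemoryBank:
    def __init__(self, data):
        self.data = list(data)  # values resident in this bank

    def apply(self, op):
        # "In-memory" compute: the simple op runs where the data lives,
        # so nothing is copied out to a separate processor.
        self.data = [op(x) for x in self.data]

# Four banks hold slices of a vector; each scales its slice in place,
# standing in for PIM's parallel add/multiply over matrix/vector data.
vector = list(range(8))
banks = [MemoryBank(vector[i:i + 2]) for i in range(0, 8, 2)]
for bank in banks:
    bank.apply(lambda x: x * 2)   # repetitive simple multiply, per bank

result = [x for bank in banks for x in bank.data]
print(result)  # [0, 2, 4, 6, 8, 10, 12, 14]
```

The point of the sketch is the shape of the computation: many banks doing the same trivial operation concurrently, with no data movement between them and a central CPU.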

With Claude

Temperature Prediction in DC (II) – The Start and the Target

This image illustrates the purpose and outcomes of temperature prediction approaches in data centers, showing how each method serves different operational needs.

Purpose and Results Framework

CFD Approach – Validation and Design Purpose

Input:

  • Setup Data: Physical infrastructure definitions (100% RULES-based)
  • Pre-defined spatial, material, and boundary conditions

Process: Physics-based simulation through computational fluid dynamics

Results:

  • What-if (One Case) Simulation: Theoretical scenario testing
  • Checking a Limitation: Validates whether proposed configurations are “OK or not”
  • Used for design validation and capacity planning

ML Approach – Operational Monitoring Purpose

Input:

  • Relation (Extended) Data: Real-time operational data starting from workload metrics
  • Continuous data streams: Power, CPU, Temperature, LPM/RPM

Process: Data-driven pattern learning and prediction

Results:

  • Operating Data: Real-time operational insights
  • Anomaly Detection: Identifies unusual patterns or potential issues
  • Used for real-time monitoring and predictive maintenance
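
The anomaly-detection result described above can be sketched as a rolling z-score over a temperature stream; this is a minimal illustrative approach, and the window size, threshold, and readings are assumed values, not taken from the diagram.

```python
# Minimal sketch of stream-style anomaly detection on a temperature
# series via rolling mean/std z-score. Window and threshold are
# illustrative assumptions.
from statistics import mean, stdev

def detect_anomalies(temps, window=5, z_threshold=3.0):
    """Flag indices whose value deviates strongly from the recent window."""
    anomalies = []
    for i in range(window, len(temps)):
        recent = temps[i - window:i]
        mu, sigma = mean(recent), stdev(recent)
        if sigma > 0 and abs(temps[i] - mu) / sigma > z_threshold:
            anomalies.append(i)
    return anomalies

# Steady readings around 24 °C with one hot-spot spike injected.
stream = [24.0, 24.1, 23.9, 24.2, 24.0, 24.1, 29.5, 24.0, 24.1]
print(detect_anomalies(stream))  # -> [6], the index of the 29.5 spike
```

A production system would learn expected patterns from richer inputs (power, CPU, workload), but the principle is the same: flag readings that break the learned pattern.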

Key Distinction in Purpose

CFD: “Can we do this?” – Validates design feasibility and limits before implementation

  • Answers hypothetical scenarios
  • Provides go/no-go decisions for infrastructure changes
  • Design-time tool

ML: “What’s happening now?” – Monitors current operations and predicts immediate future

  • Provides real-time operational intelligence
  • Enables proactive issue detection
  • Runtime operational tool

The diagram shows these are complementary approaches: CFD for design validation and ML for operational excellence, each serving distinct phases of data center lifecycle management.

With Claude

Temperature Prediction in DC

Overall Structure

  • Top: CFD (Computational Fluid Dynamics) based approach
  • Bottom: ML (Machine Learning) based approach

CFD Approach (Top)

  • Basic Setup:
    • Spatial Definition & Material Properties: Physical space definition of the data center and material characteristics (servers, walls, air, etc.)
    • Boundary Conditions: Setting boundary conditions (inlet/outlet temperatures, airflow rates, heat sources, etc.)
  • Processing:
    • Configuration + Physical Rules: Application of physical laws (heat transfer equations, fluid dynamics equations, etc.)
    • Heat Flow: Heat flow calculations based on defined conditions
  • Output: Heat + Air Flow Simulation (physics-based heat and airflow simulation)
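
The physics-based pipeline above can be sketched in one dimension: an explicit finite-difference solve of the heat equation dT/dt = α · d²T/dx², where the grid, α, and fixed end temperatures stand in for the "spatial definition" and "boundary conditions" inputs. All numbers here are illustrative assumptions, far simpler than a real CFD solve.

```python
# Minimal 1-D sketch of the CFD idea: apply a physical law (heat
# diffusion) to a defined geometry with fixed boundary conditions.

def step(T, alpha=0.1):
    """One explicit time step; both ends stay fixed (Dirichlet boundaries)."""
    new = T[:]
    for i in range(1, len(T) - 1):
        new[i] = T[i] + alpha * (T[i - 1] - 2 * T[i] + T[i + 1])
    return new

# Hot inlet (40 °C) on the left, cool outlet (20 °C) on the right.
T = [40.0] + [20.0] * 9
for _ in range(200):
    T = step(T)

# After many steps the profile relaxes toward a linear gradient.
print([round(t, 1) for t in T])
```

Note the contrast with the ML path below: nothing here is learned from measurements; the temperatures follow entirely from the predefined conditions and the governing equation.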

ML Approach (Bottom)

  • Data Collection:
    • Real-time monitoring through Metrics/Data Sensing
    • Operational data: Power (kW), CPU (%), Workload, etc.
    • Actual temperature measurements through Temperature Sensing
  • Processing: Pattern learning through Machine Learning algorithms
  • Output: Heat (with Location) Prediction (location-specific heat prediction)
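
The data-driven path can be sketched as a per-location regression: fit temperature ≈ a·power + b from logged (power, temperature) pairs, then predict. The ordinary-least-squares fit and the synthetic readings are illustrative assumptions, not the actual model or data behind the diagram.

```python
# Minimal sketch of the ML path: learn a per-location linear model
# from operational logs, then predict temperature for a new load.

def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
            sum((x - mx) ** 2 for x in xs)
    return slope, my - slope * mx

# Synthetic log for one rack location: kW draw vs inlet temperature.
power_kw = [2.0, 3.0, 4.0, 5.0, 6.0]
temp_c   = [22.0, 23.1, 23.9, 25.2, 26.0]

a, b = fit_line(power_kw, temp_c)
predicted = a * 4.5 + b       # temperature estimate at 4.5 kW, ~24.5 °C
print(round(predicted, 2))
```

Repeating such a fit per sensor location yields the "Heat (with Location) Prediction" output: the model never sees physical laws, only measured correlations.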

Key Differences

CFD Method: Theoretical calculation through physical laws, using physical space definitions, material properties, and boundary conditions as inputs

ML Method: Data-driven approach that learns from actual operational data and sensor information for prediction

The key distinction is that CFD performs simulation from predefined physical conditions, while ML learns from actual operational data collected during runtime to make predictions.

With Claude

AI Workload

This image visualizes the three major AI workload types and their characteristics in a comprehensive graph.

Graph Structure Analysis

Visualization Framework:

  • Y-axis: AI workload intensity (requests per hour, FLOPS, CPU/GPU utilization, etc.)
  • X-axis: Time progression
  • Stacked Area Chart: Shows the proportion and changes of three workload types within the total AI system load

Three AI Workload Characteristics

1. Learning – Blue Area

Properties: Steady, Controllable, Planning

  • Located at the bottom with a stable, wide area
  • Represents model training processes with predictable and plannable resource usage
  • Maintains consistent load over extended periods

2. Reasoning – Yellow Area

Properties: Fluctuating, Unpredictable, Optimizing!!!

  • Middle layer showing dramatic fluctuations
  • Involves complex decision-making and logical reasoning processes
  • Most unpredictable workload requiring critical optimization
  • Load varies significantly based on external environmental changes

3. Inference – Green Area

Properties: On-device Side, Low Latency

  • Top layer with irregular patterns
  • Executes on edge devices or user terminals
  • Service workload requiring real-time responses
  • Low latency is the core requirement

Key Implications

Differentiated Resource Management Strategies Required:

  • Learning: Stable long-term planning and infrastructure investment
  • Reasoning: Dynamic scaling and optimization technology focus
  • Inference: Edge optimization and response time improvement

This graph provides crucial insights demonstrating that customized resource allocation strategies considering the unique characteristics of each workload type are essential for effective AI system operations.

This visualization emphasizes that AI workloads are not monolithic but consist of distinct components with varying demands, requiring sophisticated resource management approaches to handle their collective and individual requirements effectively.

With Claude

AI Platform eating all

This diagram illustrates the fundamental paradigm shift in service development across three platform evolution stages.

Platform Evolution:

  1. Cloud Platform
    • Server-Client separation with cloud infrastructure development
    • Developers directly build servers and databases to provide services
  2. SDK Platform
    • Client-side evolution based on specific OS/SDK ecosystems (iOS, Android, Windows)
    • Each platform provides development environments and tools
    • This stage generated “Vast and numerous internet services” – an explosive growth of diverse internet services
  3. AI Platform – “Eating ALL”
    • Fundamental paradigm shift: Instead of developers building individual services, the AI platform itself generates and provides services
    • “All Services by AI”: AI directly provides the diverse services that developers previously created
    • Multimodal capabilities: AI can understand and process all human senses and communication methods (language, vision, audio), enabling all functionalities through natural language conversation without specialized apps or services

Key Transformation:

  • Traditional: Developer → Platform → Service Development → User
  • AI Era: User → AI Platform → Instant Service Generation/Provision

This represents not just tool evolution, but a fundamental reorganization of the service ecosystem where countless specialized services converge into one unified AI platform due to AI’s universal cognitive abilities. The AI platform becomes a total service provider, essentially “eating” all existing service categories.

With Claude