PIM (Processing-in-Memory)

This image illustrates the evolution of computing architectures, comparing three major computing paradigms:

1. General Computing (Von Neumann Architecture)

  • Traditional CPU-memory structure
  • CPU and memory are separate; the CPU processes complex instructions
  • Data and instructions move between memory and CPU

2. GPU Computing

  • Collaborative structure between CPU and GPU
  • GPU performs simple mathematical operations with massive parallelism
  • Provides high throughput
  • Uses new types of memory specialized for AI computing

3. PIM (Processing-in-Memory)

The core focus of the image, PIM features the following characteristics:

Core Concept:

  • “Simple Computing” approach that performs operations directly within new types of memory
  • Integrated structure of memory and processor

Key Advantages:

  • Data Movement Minimization: Reduces in-memory copy/reordering operations
  • Parallel Data Processing: Parallel processing of matrix/vector operations
  • Repetitive Simple Operations: Optimized for add/multiply/compare operations
  • “Simple Computing”: Efficient operations without complex control logic

PIM is gaining attention as a next-generation computing paradigm that can significantly improve energy efficiency and performance compared to existing architectures, particularly for tasks involving massive repetitive simple operations such as AI/machine learning and big data analytics.
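
The "simple computing" idea above can be illustrated with a toy sketch (not a model of any real PIM hardware): each memory bank applies a repetitive operation to the data it already holds, instead of shipping that data to a central processor. The `MemoryBank` class and the bank layout are illustrative assumptions.

```python
# Toy sketch only: each "bank" runs a simple op on its resident data,
# mimicking PIM's in-place compute; no real memory hardware is modeled.

class MemoryBank:
    def __init__(self, data):
        self.data = list(data)  # values resident in this bank

    def apply(self, op):
        # "In-memory" compute: the simple op runs where the data lives,
        # so nothing is copied out to a separate processor.
        self.data = [op(x) for x in self.data]

# Four banks hold slices of a vector; each scales its slice in place,
# standing in for PIM's parallel add/multiply over matrix/vector data.
vector = list(range(8))
banks = [MemoryBank(vector[i:i + 2]) for i in range(0, 8, 2)]
for bank in banks:
    bank.apply(lambda x: x * 2)   # repetitive simple multiply, per bank

result = [x for bank in banks for x in bank.data]
print(result)  # [0, 2, 4, 6, 8, 10, 12, 14]
```

The point of the sketch is the shape of the computation: many banks doing the same trivial operation concurrently, with no data movement between them and a central CPU.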

With Claude

Temperature Prediction in DC (II) – The Start and the Target

This image illustrates the purpose and outcomes of temperature prediction approaches in data centers, showing how each method serves different operational needs.

Purpose and Results Framework

CFD Approach – Validation and Design Purpose

Input:

  • Setup Data: Physical infrastructure definitions (100% RULES-based)
  • Pre-defined spatial, material, and boundary conditions

Process: Physics-based simulation through computational fluid dynamics

Results:

  • What-if (One Case) Simulation: Theoretical scenario testing
  • Checking a Limitation: Validates whether proposed configurations are “OK or not”
  • Used for design validation and capacity planning

ML Approach – Operational Monitoring Purpose

Input:

  • Relation (Extended) Data: Real-time operational data starting from workload metrics
  • Continuous data streams: Power, CPU, Temperature, LPM/RPM

Process: Data-driven pattern learning and prediction

Results:

  • Operating Data: Real-time operational insights
  • Anomaly Detection: Identifies unusual patterns or potential issues
  • Used for real-time monitoring and predictive maintenance
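
The anomaly-detection result described above can be sketched as a rolling z-score over a temperature stream; this is a minimal illustrative approach, and the window size, threshold, and readings are assumed values, not taken from the diagram.

```python
# Minimal sketch of stream-style anomaly detection on a temperature
# series via rolling mean/std z-score. Window and threshold are
# illustrative assumptions.
from statistics import mean, stdev

def detect_anomalies(temps, window=5, z_threshold=3.0):
    """Flag indices whose value deviates strongly from the recent window."""
    anomalies = []
    for i in range(window, len(temps)):
        recent = temps[i - window:i]
        mu, sigma = mean(recent), stdev(recent)
        if sigma > 0 and abs(temps[i] - mu) / sigma > z_threshold:
            anomalies.append(i)
    return anomalies

# Steady readings around 24 °C with one hot-spot spike injected.
stream = [24.0, 24.1, 23.9, 24.2, 24.0, 24.1, 29.5, 24.0, 24.1]
print(detect_anomalies(stream))  # -> [6], the index of the 29.5 spike
```

A production system would learn expected patterns from richer inputs (power, CPU, workload), but the principle is the same: flag readings that break the learned pattern.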

Key Distinction in Purpose

CFD: “Can we do this?” – Validates design feasibility and limits before implementation

  • Answers hypothetical scenarios
  • Provides go/no-go decisions for infrastructure changes
  • Design-time tool

ML: “What’s happening now?” – Monitors current operations and predicts immediate future

  • Provides real-time operational intelligence
  • Enables proactive issue detection
  • Runtime operational tool

The diagram shows these are complementary approaches: CFD for design validation and ML for operational excellence, each serving distinct phases of data center lifecycle management.

With Claude

Temperature Prediction in DC

Overall Structure

  • Top: CFD (Computational Fluid Dynamics) based approach
  • Bottom: ML (Machine Learning) based approach

CFD Approach (Top)

  • Basic Setup:
    • Spatial Definition & Material Properties: Physical space definition of the data center and material characteristics (servers, walls, air, etc.)
    • Boundary Conditions: Setting boundary conditions (inlet/outlet temperatures, airflow rates, heat sources, etc.)
  • Processing:
    • Configuration + Physical Rules: Application of physical laws (heat transfer equations, fluid dynamics equations, etc.)
    • Heat Flow: Heat flow calculations based on defined conditions
  • Output: Heat + Air Flow Simulation (physics-based heat and airflow simulation)
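
The physics-based pipeline above can be sketched in one dimension: an explicit finite-difference solve of the heat equation dT/dt = α · d²T/dx², where the grid, α, and fixed end temperatures stand in for the "spatial definition" and "boundary conditions" inputs. All numbers here are illustrative assumptions, far simpler than a real CFD solve.

```python
# Minimal 1-D sketch of the CFD idea: apply a physical law (heat
# diffusion) to a defined geometry with fixed boundary conditions.

def step(T, alpha=0.1):
    """One explicit time step; both ends stay fixed (Dirichlet boundaries)."""
    new = T[:]
    for i in range(1, len(T) - 1):
        new[i] = T[i] + alpha * (T[i - 1] - 2 * T[i] + T[i + 1])
    return new

# Hot inlet (40 °C) on the left, cool outlet (20 °C) on the right.
T = [40.0] + [20.0] * 9
for _ in range(200):
    T = step(T)

# After many steps the profile relaxes toward a linear gradient.
print([round(t, 1) for t in T])
```

Note the contrast with the ML path below: nothing here is learned from measurements; the temperatures follow entirely from the predefined conditions and the governing equation.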

ML Approach (Bottom)

  • Data Collection:
    • Real-time monitoring through Metrics/Data Sensing
    • Operational data: Power (kW), CPU (%), Workload, etc.
    • Actual temperature measurements through Temperature Sensing
  • Processing: Pattern learning through Machine Learning algorithms
  • Output: Heat (with Location) Prediction (location-specific heat prediction)
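
The data-driven path can be sketched as a per-location regression: fit temperature ≈ a·power + b from logged (power, temperature) pairs, then predict. The ordinary-least-squares fit and the synthetic readings are illustrative assumptions, not the actual model or data behind the diagram.

```python
# Minimal sketch of the ML path: learn a per-location linear model
# from operational logs, then predict temperature for a new load.

def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
            sum((x - mx) ** 2 for x in xs)
    return slope, my - slope * mx

# Synthetic log for one rack location: kW draw vs inlet temperature.
power_kw = [2.0, 3.0, 4.0, 5.0, 6.0]
temp_c   = [22.0, 23.1, 23.9, 25.2, 26.0]

a, b = fit_line(power_kw, temp_c)
predicted = a * 4.5 + b       # temperature estimate at 4.5 kW, ~24.5 °C
print(round(predicted, 2))
```

Repeating such a fit per sensor location yields the "Heat (with Location) Prediction" output: the model never sees physical laws, only measured correlations.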

Key Differences

CFD Method: Theoretical calculation through physical laws, using physical space definitions, material properties, and boundary conditions as inputs

ML Method: Data-driven approach that learns from actual operational data and sensor information for prediction

The key distinction is that CFD performs simulation from predefined physical conditions, while ML learns from actual operational data collected during runtime to make predictions.

With Claude

AI Workload

This image visualizes the three major AI workload types and their characteristics in a comprehensive graph.

Graph Structure Analysis

Visualization Framework:

  • Y-axis: AI workload intensity (requests per hour, FLOPS, CPU/GPU utilization, etc.)
  • X-axis: Time progression
  • Stacked Area Chart: Shows the proportion and changes of three workload types within the total AI system load

Three AI Workload Characteristics

1. Learning – Blue Area

Properties: Steady, Controllable, Planning

  • Located at the bottom with a stable, wide area
  • Represents model training processes with predictable and plannable resource usage
  • Maintains consistent load over extended periods

2. Reasoning – Yellow Area

Properties: Fluctuating, Unpredictable, Optimizing!!!

  • Middle layer showing dramatic fluctuations
  • Involves complex decision-making and logical reasoning processes
  • Most unpredictable workload requiring critical optimization
  • Load varies significantly based on external environmental changes

3. Inference – Green Area

Properties: On-device Side, Low Latency

  • Top layer with irregular patterns
  • Executes on edge devices or user terminals
  • Service workload requiring real-time responses
  • Low latency is the core requirement

Key Implications

Differentiated Resource Management Strategies Required:

  • Learning: Stable long-term planning and infrastructure investment
  • Reasoning: Dynamic scaling and optimization technology focus
  • Inference: Edge optimization and response time improvement

This graph provides crucial insights demonstrating that customized resource allocation strategies considering the unique characteristics of each workload type are essential for effective AI system operations.

This visualization emphasizes that AI workloads are not monolithic but consist of distinct components with varying demands, requiring sophisticated resource management approaches to handle their collective and individual requirements effectively.

With Claude

AI Platform eating all

This diagram illustrates the fundamental paradigm shift in service development across three platform evolution stages.

Platform Evolution:

  1. Cloud Platform
    • Server-Client separation with cloud infrastructure development
    • Developers directly build servers and databases to provide services
  2. SDK Platform
    • Client-side evolution based on specific OS/SDK ecosystems (iOS, Android, Windows)
    • Each platform provides development environments and tools
    • This stage generated “Vast and numerous internet services” – an explosive growth of diverse internet services
  3. AI Platform – “Eating ALL”
    • Fundamental paradigm shift: Instead of developers building individual services, the AI platform itself generates and provides services
    • “All Services by AI”: AI directly provides the diverse services that developers previously created
    • Multimodal capabilities: AI can understand and process all human senses and communication methods (language, vision, audio), enabling all functionalities through natural language conversation without specialized apps or services

Key Transformation:

  • Traditional: Developer → Platform → Service Development → User
  • AI Era: User → AI Platform → Instant Service Generation/Provision

This represents not just tool evolution, but a fundamental reorganization of the service ecosystem where countless specialized services converge into one unified AI platform due to AI’s universal cognitive abilities. The AI platform becomes a total service provider, essentially “eating” all existing service categories.

With Claude