GPU vs NPU in Deep Learning

Posted on 2025-03-14 by lechuck park

This diagram illustrates the differences between GPU and NPU from a deep learning perspective:

GPU (Graphics Processing Unit):

Originally developed for 3D game rendering
In deep learning, it is used during training to run complex calculations over vast amounts of data in parallel
Characterized by “More Computing = Bigger Memory = More Power,” requiring high computing power
Processes big data and vectorizes information using the “Everything to Vector” approach
Stores learning results for future use (as trained model weights, or as embeddings in a vector database)

NPU (Neural Processing Unit):

Retrieves information from already trained Vector DBs or foundation models to generate answers to questions
This process is called “Inference”
While the training phase processes all data in parallel, the inference phase only searches/infers content related to specific questions to formulate answers
Performs parallel processing similar to how neurons function
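
The split between "vectorize and store" and "look up at question time" can be sketched in a few lines of Python. This is only a toy illustration under loud assumptions: `embed` is a hypothetical stand-in for a trained model (it simply counts letters), and the in-memory list `vector_db` stands in for a real vector database. Nothing here reflects how a GPU or NPU actually executes the work.

```python
import math

def embed(text):
    """Hypothetical stand-in for a trained embedding model: letter counts."""
    counts = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            counts[ord(ch) - ord("a")] += 1.0
    return counts

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

# "Training" side: vectorize everything up front and store the results.
documents = [
    "gpu training needs power",
    "npu runs inference",
    "cooling for data centers",
]
vector_db = [(doc, embed(doc)) for doc in documents]

def retrieve(question):
    """"Inference" side: look up only the entry closest to the question."""
    query = embed(question)
    best_doc, _ = max(vector_db, key=lambda item: cosine(query, item[1]))
    return best_doc

print(retrieve("npu runs inference"))
```

The expensive step (embedding every document) happens once, while each question only costs one embedding and a similarity search, which mirrors the training/inference asymmetry described above.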

In conclusion, GPUs are responsible for processing enormous amounts of data and storing learning results in vector form, while NPUs specialize in the inference process of generating actual answers to questions based on this stored information. This relationship can be summarized as “training creates and stores vast amounts of data, while inference utilizes this at the point of need.”

With Claude

AI in the data center

Posted on 2025-03-13 by lechuck park

This diagram, titled “AI in the Data Center,” illustrates two key changes that occur when AI technology is integrated into data centers:

1. Computing Infrastructure Changes

AI workloads powered by GPUs become central to operations
Transition from traditional server infrastructure to GPU-centric computing architecture
Fundamental changes in data center hardware configuration and network connectivity

2. Management Infrastructure Changes

Increased requirements for power (“More Power!!”) and cooling (“More Cooling!!”) to support GPU infrastructure
Implementation of data-driven management systems utilizing AI technology
AI-based analytics and management for maintaining stability and improving efficiency

These two changes are interconnected: the diagram shows that AI technology not only transforms the computing capabilities of data centers but also requires new management approaches to operate these systems effectively.

with Claude

DLSS

Posted on 2025-02-20 by lechuck park

DLSS (Deep Learning Super Sampling) is NVIDIA’s AI-based upscaling technology for real-time graphics. Its pipeline consists of several key steps:

Initial 3D Data

The process begins with 3D model/data input

Rendering Process

Uses GPU to render 3D data into 2D screen output
Higher-resolution rendering requires more computing power

Low Resolution Stage

Initially processes images at a lower resolution
This helps conserve computing resources

DLSS Processing

Utilizes AI models and specialized hardware (Tensor Cores on NVIDIA RTX GPUs)
Employs deep learning technology to enhance image quality
Combines lower computing requirements with AI processing

Final Output

Upscales the low resolution image to appear high resolution
Delivers high-quality visual output that looks like native high resolution

The key advantage of DLSS is its ability to produce high-quality graphics while using less computing power. This technology is particularly valuable in applications requiring real-time rendering, such as gaming, where it can maintain visual quality while improving performance.
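
A rough sense of the saving comes from counting pixels. The sketch below compares a 1920×1080 internal render with a 3840×2160 output, and uses nearest-neighbor upscaling as a deliberately naive stand-in for the AI step. DLSS itself relies on a trained neural network, which this code does not attempt to reproduce; the resolutions and function names are illustrative assumptions.

```python
def pixel_count(width, height):
    """Number of pixels the GPU has to shade for one frame."""
    return width * height

def upscale_nearest(image, factor):
    """Naive nearest-neighbor upscale of a 2D list of pixel values.

    Stand-in only: DLSS replaces this step with a neural network.
    """
    upscaled = []
    for row in image:
        wide_row = [pixel for pixel in row for _ in range(factor)]
        for _ in range(factor):
            upscaled.append(list(wide_row))
    return upscaled

low_res_pixels = pixel_count(1920, 1080)
high_res_pixels = pixel_count(3840, 2160)
ratio = high_res_pixels / low_res_pixels
print(f"Native 4K shades {ratio:.0f}x as many pixels as 1080p")

tiny_frame = [[1, 2],
              [3, 4]]
big_frame = upscale_nearest(tiny_frame, 2)
for row in big_frame:
    print(row)
```

The pixel ratio is where the performance headroom comes from; the quality of the upscaling step is what separates a learned model from the blocky result this naive version produces.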

This innovative approach effectively balances the trade-off between visual quality and computational resources, making high-quality graphics more accessible on a wider range of hardware.

With Claude

What is The Next?

Posted on 2025-01-27 by lechuck park

With Claude
A comprehensive interpretation of the image and its concept of “Rapid application evolution”:

The diagram illustrates the parallel evolution of hardware infrastructure and software platforms, which together have driven rapid growth in applications and user experiences:

Hardware Infrastructure Evolution:

PC/Desktop → Mobile Devices → GPU
Represents the progression of core computing power platforms
Each transition brought fundamental changes in how users interact with technology

Software Platform Evolution:

Windows OS → App Store → AI/LLM
Shows the evolution of application ecosystems
Each platform created new possibilities for user applications

The symbiotic relationship between these two axes:

PC Era: Integration of PC hardware with Windows OS
Mobile Era: Combination of mobile devices with app store ecosystems
AI Era: Marriage of GPU infrastructure with LLM/AI platforms

Each transition has led to exponential growth in application capabilities and user experiences, with hardware and software platforms developing in parallel and reinforcing each other.

Future Outlook:

“Who is the winner of new platform?”

Current competition among Google, Microsoft, Apple/Meta, and OpenAI
Platform leadership in the AI era remains undecided
Possibility for new players to emerge

“Quantum is Ready?”

Suggests quantum computing as the next potential hardware revolution
Implies the possibility of new software platforms emerging to leverage quantum capabilities
Continues the pattern of hardware-software co-evolution

This cyclical pattern of hardware-software evolution suggests that we’ll continue to see new infrastructure innovations driving platform development, and vice versa. Each cycle has dramatically expanded the possibilities for applications and user experiences, and this trend is likely to continue with future technological breakthroughs.

The key insight is that major technological leaps happen when both hardware infrastructure and software platforms evolve together, creating new opportunities for application development and user experiences that weren’t previously possible.

High Computing Room Requires

Posted on 2025-01-07 by lechuck park

With Claude’s help
Core Challenge:

High Variability in GPU/HPC Computing Room

Dramatic fluctuations in computing loads
Significant variations in power consumption
Changing cooling requirements

Solution Approach:

Establishing New Data Collection Systems

High Resolution Data: More granular, time-based data collection
New Types of Data Acquisition
Identification of previously overlooked data points
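
Why granularity matters can be shown with made-up numbers. In the sketch below, the per-second power readings are entirely hypothetical; averaging them into a single one-minute value hides a short spike that the high-resolution series makes obvious.

```python
def downsample_mean(samples, window):
    """Average consecutive samples in fixed-size windows."""
    averaged = []
    for start in range(0, len(samples), window):
        chunk = samples[start:start + window]
        averaged.append(sum(chunk) / len(chunk))
    return averaged

# Hypothetical rack power in kW: one reading per second for one minute,
# with a five-second burst at the end.
per_second_kw = [20.0] * 55 + [60.0] * 5

# The same minute as a traditional one-minute average.
per_minute_kw = downsample_mean(per_second_kw, 60)

peak_fine = max(per_second_kw)
peak_coarse = max(per_minute_kw)
print(f"Peak seen at 1 s resolution:  {peak_fine:.1f} kW")
print(f"Peak seen at 60 s resolution: {peak_coarse:.1f} kW")
```

A monitoring system that only sees the coarse value would report a calm minute, while the power and cooling equipment actually had to absorb a load several times higher for a few seconds.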

New Correlation Analysis

Understanding interactions between computing/power/cooling
Discovering hidden patterns among variables
Deriving predictable correlations
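
As a minimal sketch of this kind of analysis, the code below computes a Pearson correlation between GPU utilization and rack power. The sample values are invented for illustration, and a real analysis would likely also need time lags (cooling responds later than power) and many more variables.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)

# Hypothetical samples taken at the same timestamps.
gpu_util_pct = [10, 35, 60, 85, 95, 40]
rack_power_kw = [12, 20, 31, 42, 47, 23]

r_util_power = pearson(gpu_util_pct, rack_power_kw)
print(f"utilization vs power: r = {r_util_power:.3f}")
```

A coefficient close to 1 suggests power can be predicted from utilization; weaker or delayed correlations (for example with cooling) are the "hidden patterns" worth investigating.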

Objectives:

Managing variability through AI-based analysis
Enhancing system stability
Improving overall facility operational efficiency

In essence, the diagram argues that the high variability of GPU/HPC computing rooms is best addressed through data: collecting more precise and new types of data makes it possible to discover new correlations, which in turn leads to improved stability and efficiency.

Network for GPUs

Posted on 2024-12-23 by lechuck park

With Claude’s help
The network architecture comprises three tiers of connectivity technology:

NVLink (Single node Parallel processing)

Technology for directly connecting GPUs within a single node
Supports up to 256 GPU connections (a figure NVIDIA quotes for the NVLink Switch System, which extends NVLink beyond a single server)
Physical HBM (High Bandwidth Memory) sharing
Optimized for high-performance GPU parallel processing within individual servers

NVSwitch

Switching technology that extends NVLink beyond its direct point-to-point limits
Provides logical HBM sharing
Enables full-mesh (all-to-all) connectivity between GPU groups
Efficiently connects multiple GPU groups within a single server
Key component for large-scale AI model workloads

InfiniBand

Network technology for server clustering
Supports RDMA (Remote Direct Memory Access)
Typically deployed in a hierarchical network topology (for example, a fat tree)
Enables large-scale cluster configuration across multiple servers
Focuses on distributed computing and HPC (High Performance Computing) workloads

This 3-tier architecture provides scalability through:

GPU parallel processing within a single server (NVLink)
High-performance connectivity between GPU groups within a server (NVSwitch)
Cluster configuration between multiple servers (InfiniBand)
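
The three tiers can be read as a lookup: which link carries traffic between two GPUs depends on where they sit. The sketch below follows this post's simplified model only. The `GpuLocation` fields, server names, and grouping are hypothetical, and real topologies vary by server generation.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class GpuLocation:
    """Hypothetical placement of a GPU: server name, group within it, slot."""
    server: str
    group: int
    index: int

def interconnect(a, b):
    """Pick the tier that connects two GPUs in the simplified 3-tier model."""
    if a == b:
        return "local"        # same GPU, no interconnect needed
    if a.server != b.server:
        return "InfiniBand"   # different servers: cluster network
    if a.group != b.group:
        return "NVSwitch"     # same server, different GPU groups
    return "NVLink"           # same server, same GPU group

gpu_a = GpuLocation(server="node-1", group=0, index=0)
gpu_b = GpuLocation(server="node-1", group=0, index=1)
gpu_c = GpuLocation(server="node-1", group=1, index=0)
gpu_d = GpuLocation(server="node-2", group=0, index=0)

for other in (gpu_b, gpu_c, gpu_d):
    print(gpu_a.index, "->", other, ":", interconnect(gpu_a, other))
```

Ordering the checks from the widest boundary (server) to the narrowest (group) keeps the function short and makes the hierarchy explicit.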

The architecture enables efficient handling of various workload scales, from small GPU tasks to large-scale distributed computing. It’s particularly effective for maximizing GPU resource utilization in large-scale AI model training and HPC workloads.

Key Benefits:

Hierarchical scaling from single node to multi-server clusters
Efficient memory sharing through both physical and logical HBM
Flexible topology options for different computing needs
Optimized for both AI and high-performance computing workloads
Comprehensive solution for GPU-based distributed computing

This structure provides a complete solution from single-server GPU operations to complex distributed computing environments, making it suitable for a wide range of high-performance computing needs.

Evolutions

Posted on 2024-12-06 (2024-12-05) by lechuck park

From Claude with some prompting
A summary of the key points from the image:

Manual Control:
- This stage involves direct human control of the system.
- Human intervention and judgment are crucial at this stage.
Data Driven:
- This stage uses data analysis to control the system.
- Data collection and analysis are the core elements.
AI Control:
- This stage leverages artificial intelligence technologies to control the system.
- Technologies like machine learning and deep learning are utilized.
Virtual:
- This stage involves the implementation of systems in a virtual environment.
- Simulation and digital twin technologies are employed.
Massive Data:
- This stage emphasizes the importance of collecting, processing, and utilizing vast amounts of data.
- Technologies like big data and cloud computing are utilized.

Throughout this progression, there is a gradual shift towards automation and increased intelligence. The development of data and AI technologies plays a critical role, while the use of virtual environments and massive data further accelerates this technological evolution.