Key Features

1. Full CXL (Compute Express Link) Support

Open standard interconnect, built on the PCIe physical layer, that links CPUs, accelerators (GPU, FPGA), and memory expansion devices

Provides cache-coherent, low-latency access to device-attached and expansion memory

2. Enhanced HMM (Heterogeneous Memory Management)

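Lets a device share a process's virtual address space instead of managing a separate one

Gives device drivers helpers to mirror CPU page tables into GPU page tables and to migrate pages between system and device memory

Lets the GPU use the same pointers as the CPU, without explicit copies into pinned buffers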

3. Enhanced P2P DMA & GPUDirect Support

Enables direct data exchange between GPUs over PCIe

Direct transfers between GPU memory and network cards (GPUDirect RDMA) or NVMe storage (GPUDirect Storage)

Keeps the CPU and system memory out of the data path for improved performance; the CPU still sets up the transfers

4. DRM Scheduler & GPU Driver Improvements

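Improved job scheduling in the Direct Rendering Manager (DRM) GPU scheduler

Active upstream integration of the latest vendor drivers: AMD (amdgpu), Intel (i915/Xe, which cover Ponte Vecchio), and Intel Habana Gaudi (habanalabs)

NVIDIA's own driver stack is still shipped outside the mainline kernel, with proprietary user-space components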
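To see which kernel driver is bound to each GPU on a given machine, one option is to walk the DRM entries in sysfs. The sketch below assumes a Linux host that exposes /sys/class/drm; the function name list_drm_drivers is made up for illustration, and the result is simply empty when no DRM devices are visible (for example inside a container).

```python
import os


def list_drm_drivers(sysfs_root="/sys/class/drm"):
    """Map each DRM card node (card0, card1, ...) to its bound kernel driver."""
    drivers = {}
    if not os.path.isdir(sysfs_root):
        return drivers  # DRM subsystem not exposed on this host
    for entry in sorted(os.listdir(sysfs_root)):
        # Skip connector entries (card0-HDMI-A-1) and render nodes (renderD128).
        if not entry.startswith("card") or "-" in entry:
            continue
        link = os.path.join(sysfs_root, entry, "device", "driver")
        if os.path.islink(link):
            # The symlink target ends with the driver name, e.g. ".../drivers/amdgpu".
            drivers[entry] = os.path.basename(os.readlink(link))
    return drivers


if __name__ == "__main__":
    for card, driver in list_drm_drivers().items():
        print(f"{card}: {driver}")
```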

5. Advanced Async I/O via io_uring

Exchanges I/O requests and completions with the kernel through a pair of ring buffers (submission and completion queues) shared with user space

Reduces system call and copy overhead for asynchronous I/O
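The ring mechanism can be illustrated with a toy model. This is not the io_uring API (real programs use the io_uring_setup and io_uring_enter system calls, usually through liburing); it is a pure-Python sketch of two rings with free-running head and tail counters. The Ring class and the fake_kernel function are hypothetical names, and only the user_data and res fields mirror real completion entry fields.

```python
class Ring:
    """Fixed-size single-producer/single-consumer ring with head and tail counters."""

    def __init__(self, entries):
        if entries <= 0 or entries & (entries - 1):
            raise ValueError("entries must be a power of two")
        self.slots = [None] * entries
        self.mask = entries - 1  # index = counter & mask
        self.head = 0  # next slot the consumer reads
        self.tail = 0  # next slot the producer writes

    def is_full(self):
        return self.tail - self.head == len(self.slots)

    def push(self, item):
        if self.is_full():
            return False  # producer must wait for the consumer
        self.slots[self.tail & self.mask] = item
        self.tail += 1
        return True

    def pop(self):
        if self.head == self.tail:
            return None  # ring is empty
        item = self.slots[self.head & self.mask]
        self.head += 1
        return item


def fake_kernel(sq, cq):
    """Stand-in for the kernel: consume submissions, post one completion each."""
    completed = 0
    while not cq.is_full():
        sqe = sq.pop()
        if sqe is None:
            break
        # Pretend the "write" succeeded and report the byte count as the result.
        cq.push({"user_data": sqe["user_data"], "res": len(sqe["buf"])})
        completed += 1
    return completed


if __name__ == "__main__":
    sq, cq = Ring(4), Ring(8)
    for i in range(3):
        sq.push({"user_data": i, "buf": b"x" * (i + 1)})
    fake_kernel(sq, cq)
    while (cqe := cq.pop()) is not None:
        print(cqe)
```

The point of the design is that the application queues several requests and collects several completions without a system call per request; only the head and tail counters change hands.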

Summary

The Linux kernel increasingly lets GPUs reach memory (CXL, HMM), storage, and network resources (P2P DMA, GPUDirect) directly, with the CPU setting up transfers but staying out of the data path. Upstream drivers from AMD and Intel, together with an improved DRM scheduler, help manage GPU workloads. Collectively, these features reduce CPU bottlenecks and make the kernel more efficient for large-scale AI and HPC workloads.

#LinuxKernel #GPU #AI #HPC #CXL #HMM #GPUDirect #P2PDMA #AMDGPU #IntelGPU #MachineLearning #HighPerformanceComputing #DRM #io_uring #HeterogeneousComputing #DataCenter #CloudComputing

With Claude

OOM (Out-of-Memory) Mechanism Explained

This diagram illustrates how the Linux OOM (Out-of-Memory) Killer operates when the system runs out of memory.

Main Process Flow (Left Side)

Request
- An application requests memory from the system
VM Commit (Reserve)
- The system reserves virtual memory
- Overcommit policy (vm.overcommit_memory) may allow reservation beyond physical capacity
First Use (HW mapping) → Page Fault
- Hardware mapping occurs when memory is actually accessed
- Triggers a page fault for physical allocation
Reclaim/Compaction
- System attempts to free memory by dropping page cache, shrinking SLAB caches, writing back dirty pages, and compacting memory
- Allocating tasks can be throttled when their cgroup exceeds memory.high
Swap (if enabled)
- Moves anonymous pages to swap space if swap is available and enabled
OOM Killer
- As a last resort, terminates processes to free memory

Detailed Decision Points (Center & Right Columns)

Memory Request

App asks for memory
Requested through brk/sbrk, mmap/munmap, and mremap; capped by prlimit(RLIMIT_AS)
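A small experiment shows the address-space cap in action. The sketch below assumes Linux and a working subprocess; it applies RLIMIT_AS in a child interpreter so the limit does not affect the calling process, then attempts an anonymous mapping. The helper name try_map_under_limit and the sizes are arbitrary choices for illustration.

```python
import subprocess
import sys

# Runs in a child process: cap the address space, then try to map memory.
CHILD = r"""
import mmap, resource, sys
limit, request = int(sys.argv[1]), int(sys.argv[2])
soft, hard = resource.getrlimit(resource.RLIMIT_AS)
if hard != resource.RLIM_INFINITY:
    limit = min(limit, hard)  # the soft limit may not exceed the hard limit
resource.setrlimit(resource.RLIMIT_AS, (limit, hard))
try:
    mmap.mmap(-1, request)  # anonymous mapping; only reserves address space
    print("mapped")
except (OSError, MemoryError):
    print("denied")
"""


def try_map_under_limit(limit_bytes, request_bytes):
    """Return 'mapped' or 'denied' for a mapping attempted under RLIMIT_AS."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD, str(limit_bytes), str(request_bytes)],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    GiB = 1 << 30
    print("16 GiB under a 4 GiB cap:", try_map_under_limit(4 * GiB, 16 * GiB))
    print("1 MiB under a 4 GiB cap:", try_map_under_limit(4 * GiB, 1 << 20))
```

The request is refused at reservation time, before any page is touched, which is why RLIMIT_AS belongs to the request stage rather than to physical allocation.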

Virtual Address Allocation

Overcommit policy (vm.overcommit_memory) decides whether reservations may exceed physical limits
Typically made with mmap (e.g., MAP_PRIVATE), optionally with madvise(MADV_WILLNEED) as a prefetch hint
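For reference, vm.overcommit_memory has three modes, and in strict mode the kernel enforces a commit limit derived from RAM and swap. The sketch below encodes the documented formula with the default vm.overcommit_ratio of 50; it ignores vm.overcommit_kbytes, and the function names are illustrative rather than any kernel interface.

```python
OVERCOMMIT_MODES = {
    0: "heuristic: refuse only obviously excessive reservations",
    1: "always: never refuse a reservation",
    2: "strict: refuse reservations beyond CommitLimit",
}


def commit_limit_kb(mem_total_kb, swap_total_kb, overcommit_ratio=50, hugetlb_kb=0):
    """CommitLimit = (RAM - hugetlb pages) * overcommit_ratio / 100 + swap."""
    return (mem_total_kb - hugetlb_kb) * overcommit_ratio // 100 + swap_total_kb


def current_overcommit_mode(path="/proc/sys/vm/overcommit_memory"):
    """Return the active mode (0, 1, or 2), or None if it cannot be read."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    mode = current_overcommit_mode()
    print("mode:", mode, OVERCOMMIT_MODES.get(mode, "unknown"))
    # Example: 16 GiB of RAM and 4 GiB of swap, expressed in kB.
    print("commit limit (kB):", commit_limit_kb(16 * 1024 * 1024, 4 * 1024 * 1024))
```

The limit is only enforced in mode 2; in modes 0 and 1 it is reported in /proc/meminfo but reservations can exceed it.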

Physical Memory Allocation

Checks if zone watermarks are OK
If yes, maps a physical page; if no, attempts reclamation
Related calls: mlock/munlock (pin pages in RAM), mprotect (change protections), mincore (query residency)
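The gap between reserving and actually using memory can be observed from user space. The sketch below assumes Linux with /proc mounted: it maps an anonymous private region, then compares resident memory before and after touching one byte per page. The function names and the 32 MiB size are arbitrary, and exact numbers vary with kernel settings such as transparent huge pages.

```python
import mmap
import os

PAGE = os.sysconf("SC_PAGE_SIZE")


def resident_bytes():
    """Resident set size of this process, from the second field of statm."""
    with open("/proc/self/statm") as f:
        return int(f.read().split()[1]) * PAGE


def measure_first_touch(size=32 * 1024 * 1024):
    """Return (RSS growth after mmap, RSS growth after touching every page)."""
    before = resident_bytes()
    region = mmap.mmap(-1, size, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    after_map = resident_bytes()  # address space reserved, no pages yet
    for offset in range(0, size, PAGE):
        region[offset] = 1  # first write to each page triggers a page fault
    after_touch = resident_bytes()
    region.close()
    return after_map - before, after_touch - after_map


if __name__ == "__main__":
    reserved, touched = measure_first_touch()
    print(f"RSS growth after mmap:  {reserved / 2**20:.1f} MiB")
    print(f"RSS growth after touch: {touched / 2**20:.1f} MiB")
```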

Any Reclaimable Memory?

Attempts to free memory via cache/SLAB/writeback/compaction
May throttle on cgroup memory.high
App hint: madvise(MADV_DONTNEED) releases pages the app no longer needs
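The effect of MADV_DONTNEED on a private anonymous mapping is easy to check: the kernel may drop the pages immediately, and the next access gets fresh zero-filled pages. This is a minimal sketch assuming Linux and Python 3.8 or later (for mmap.madvise); the function name drop_and_reread is illustrative.

```python
import mmap


def drop_and_reread(size=1024 * 1024):
    """Fill a private anonymous region, discard it, and read the first byte again."""
    region = mmap.mmap(-1, size, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    region[:] = b"\xff" * size  # touch every page so it becomes resident
    before = region[0]
    region.madvise(mmap.MADV_DONTNEED)  # tell the kernel the contents are disposable
    after = region[0]  # faults in a new zero-filled page
    region.close()
    return before, after


if __name__ == "__main__":
    print(drop_and_reread())
```

Note that the mapping must be private: Python's default for anonymous maps is MAP_SHARED, where the data is backed by shared memory and survives the hint.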

Swap Space?

Checks if swap space is available to offload anonymous pages
System: swapon/swapoff; App: mlock* (to avoid swap)
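Whether swap can absorb anonymous pages at all can be read from /proc/meminfo. The sketch below parses the SwapTotal and SwapFree fields; the helper names are made up for illustration, and the parser works on any text in the meminfo layout so it can be tried without a live system.

```python
def parse_meminfo(text):
    """Parse 'Key:   value kB' lines into a dict of integers."""
    fields = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[key.strip()] = int(parts[0])
    return fields


def swap_status(fields):
    """Return (total_kb, free_kb, used_kb); all zero when swap is disabled."""
    total = fields.get("SwapTotal", 0)
    free = fields.get("SwapFree", 0)
    return total, free, total - free


if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as f:
            total, free, used = swap_status(parse_meminfo(f.read()))
        print(f"swap total={total} kB free={free} kB used={used} kB")
    except OSError:
        print("/proc/meminfo is not available on this system")
```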

OOM Killer

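Sends SIGKILL to the selected victim when reclaim cannot bring free memory back above the watermarks, or when a cgroup hits memory.max and cannot reclaim enough
Victim selection is based on the badness score, adjusted by oom_score_adj
Configurable via /proc/<pid>/oom_score_adj (-1000 to 1000) and vm.panic_on_oom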
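Both the computed score and the adjustment are visible in /proc. The sketch below only reads them, assuming Linux with /proc mounted; the function name oom_info is illustrative. Writing a value between -1000 and 1000 to oom_score_adj changes the adjustment, and lowering it requires the CAP_SYS_RESOURCE capability.

```python
def oom_info(pid="self"):
    """Return the OOM score and adjustment for a process, or None if unreadable."""
    base = f"/proc/{pid}"
    try:
        with open(f"{base}/oom_score") as f:
            score = int(f.read())
        with open(f"{base}/oom_score_adj") as f:
            adj = int(f.read())
    except (OSError, ValueError):
        return None  # process is gone, /proc is missing, or access was denied
    return {"oom_score": score, "oom_score_adj": adj}


if __name__ == "__main__":
    print(oom_info())
```

A higher oom_score makes a process a more likely victim; an oom_score_adj of -1000 exempts it from selection.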

Summary

When an app requests memory, Linux first reserves virtual address space (subject to the overcommit policy), then allocates physical memory on first use. If physical memory runs low, the system tries to reclaim pages from caches and to move anonymous pages to swap. When all else fails, the OOM Killer terminates a process chosen by its oom_score to free memory and keep the system running.

#Linux #OOM #MemoryManagement #KernelPanic #SystemAdministration #DevOps #OperatingSystem #Performance #MemoryOptimization #LinuxKernel

With Claude
