per Watt with AI

This image, titled “per Watt with AI,” is a diagram explaining the paradigm shift toward power efficiency in the AI era, particularly after the emergence of LLMs.

Overall Context

Core Structure of AI Development:

  • Machine Learning = Computing = Using Power
  • The equal signs (=) indicate that these three elements are essentially the same concept. In other words, AI machine learning inherently means large-scale computing, which inevitably involves power consumption.

Characteristics of LLMs: As AI, and LLMs in particular, have proven their effectiveness, tremendous progress has followed. However, their technical characteristics lead to the following structure:

  • Huge Computing: Massively parallel processing of simple tasks
  • Huge Power: Enormous power consumption due to this parallel processing
  • Huge Cost: Power costs and infrastructure expenses

Importance of Power Efficiency Metrics

With hardware advancements making this approach practically viable, power consumption has become a critical issue that affects even the global ecosystem. As a result, power is now used as a performance indicator for all operations.

Key Power Efficiency Metrics

Performance-related:

  • FLOPs/Watt: Floating-point operations per watt
  • Inferences/Watt: Number of inferences processed per watt
  • Training/Watt: Training performance per watt

Operations-related:

  • Workload/Watt: Workload processing capacity per watt
  • Data/Watt: Data processing capacity per watt
  • IT Work/Watt: IT work processing capacity per watt

Infrastructure-related:

  • Cooling/Watt: Cooling efficiency per watt
  • Water/Watt: Water usage efficiency per watt
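
To make the arithmetic behind these metrics concrete: dividing a throughput (work per second) by power draw (joules per second) is the same as dividing total work by total energy. The sketch below illustrates this; the numbers are made-up placeholders, not values from the diagram.

```python
# Minimal sketch: a throughput-per-watt metric equals total work divided by
# total energy, since (work/s) / (J/s) == work / J. All numbers are illustrative.

def per_watt(work_done: float, energy_joules: float) -> float:
    """Work-per-second per watt, which reduces to work per joule."""
    return work_done / energy_joules

# Hypothetical measurements: 4.2e15 floating-point operations using 7,000 J,
# and 1,200 inference requests served using 18,000 J.
print(f"FLOPs/Watt:      {per_watt(4.2e15, 7_000):.3e}")
print(f"Inferences/Watt: {per_watt(1_200, 18_000):.4f}")
```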

This diagram illustrates that in the AI era, power efficiency has become the core criterion for all performance evaluations, transcending simple technical metrics to encompass environmental, economic, and social perspectives.

With Claude

Learning, Reasoning, Inference

This image illustrates the three core processes of AI LLMs by drawing parallels to human learning and cognitive processes.

Learning

  • Depicted as a wise elderly scholar reading books in a library
  • Represents the lifelong process of absorbing knowledge and experiences accumulated by humanity over generations
  • The bottom icons show data accumulation and knowledge storage processes
  • Meaning: Just as AI learns human language and knowledge through vast text data, humans also build knowledge throughout their lives through continuous learning and experience

Reasoning

  • Shows a character deep in thought, surrounded by mathematical formulas
  • Represents the complex mental process of confronting a problem and searching for solutions through internal contemplation
  • The bottom icons symbolize problem analysis and processing stages
  • Meaning: The human cognitive process of using learned knowledge to engage in logical thinking and analysis to solve problems

Inference

  • Features a character confidently exclaiming “THE ANSWER IS CLEAR!”
  • Expresses the confidence and decisiveness when finally finding an answer after complex thought processes
  • The bottom checkmark signifies reaching a final conclusion
  • Meaning: The human act of ultimately speaking an answer or making a behavioral decision through thought and analysis

These three stages visually demonstrate how AI processes information in a manner similar to the natural human sequence of learning → thinking → conclusion, connecting AI’s technical processes to familiar human cognitive patterns.

With Claude

Digital Twin with LLM

This image illustrates the revolutionary applicability of a Digital Twin when enhanced by LLM integration.

Three Core Components of Digital Twin

Digital Twin consists of three essential elements:

  1. Modeling – Creating digital replicas of physical objects
  2. Data – Real-time sensor data and operational information collection
  3. Simulation – Predictive analysis and scenario testing

Traditional Limitations and LLM’s Revolutionary Solution

Previous Challenges: Modeling results were expressed only through abstract concepts like “Visual Effect” and “Easy to view of complex,” making practical interpretation difficult.

LLM as a Game Changer:

  • Multimodal Interpretation: Transforms complex 3D models, data patterns, and simulation results into intuitive natural language explanations
  • Retrieval Interpretation: Instantly extracts key insights from vast datasets and converts them into human-understandable formats
  • Replacing Human Interpretation Resources: The LLM provides expert-level analytical capability, enabling continuous 24/7 monitoring
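
As a rough sketch of this interpretation step, the snippet below assembles digital-twin telemetry into a natural-language prompt for an LLM. The sensor names, values, and the commented-out ask_llm() call are hypothetical placeholders, not elements of the diagram or of any real API.

```python
# Hypothetical sketch: turning digital-twin telemetry into a natural-language
# question for an LLM. Data, names, and ask_llm() are placeholders.

telemetry = {
    "pump_3_vibration_mm_s": 7.8,          # illustrative sensor reading
    "pump_3_bearing_temp_c": 81.2,         # illustrative sensor reading
    "simulated_failure_prob_30d": 0.23,    # illustrative simulation output
}

prompt = (
    "You are assisting with a digital twin of a pumping station.\n"
    "Current readings and simulation output:\n"
    + "\n".join(f"- {name}: {value}" for name, value in telemetry.items())
    + "\nExplain in plain language what these values suggest and what to inspect first."
)

# response = ask_llm(prompt)   # stands in for whichever LLM service is integrated
print(prompt)
```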

Future Value of Digital Twin

With LLM integration, Digital Twin evolves from a simple visualization tool into an intelligent decision-making partner. This becomes the core driver for maximizing operational efficiency and continuous innovation, accelerating digital transformation across industries.

Ultimately, this diagram emphasizes that LLM is the key technology that unlocks the true potential of Digital Twin, demonstrating its necessity and serving as the foundation for sustained operational improvement and future development.

With Claude

Personal (User/Expert) Data Service

System Overview

The Personal Data Service is an open expert RAG service platform based on MCP (Model Context Protocol). This system creates a bidirectional ecosystem where both users and experts can benefit mutually, enhancing accessibility to specialized knowledge and improving AI service quality.

Core Components

1. User Interface (Left Side)

  • LLM Model Selection: Users can choose their preferred language model or MoE (Mixture of Experts)
  • Expert Selection: Select domain-specific experts for customized responses
  • Prompt Input: Enter specific questions or requests

2. Open MCP Platform (Center)

  • Integrated Management Hub: Connects and coordinates all system components
  • Request Processing: Matches user requests with appropriate expert RAG systems
  • Service Orchestration: Manages and optimizes the entire workflow

3. LLM Service Layer (Right Side)

  • Multi-LLM Support: Integration with various AI model services
  • OAuth Authentication: Users directly select and authenticate with paid or free services
  • Vendor Neutrality: Open architecture independent of specific AI services

4. Expert RAG Ecosystem (Bottom)

  • Specialized Data Registration: Building expert-specific knowledge databases through RAG
  • Quality Management System: Ensuring reliability through evaluation and reputation management
  • Historical Logs: Continuous quality improvement through service usage records

Key Features

  1. Bidirectional Ecosystem: Users obtain expert answers while experts monetize their knowledge
  2. Open Architecture: Scalable platform based on MCP standards
  3. Quality Assurance: Expert and answer quality management through evaluation systems
  4. Flexible Integration: Compatibility with various LLM services
  5. Autonomous Operation: Direct data management and updates by experts
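
The request flow these features describe can be sketched roughly as follows. Everything here is a simplified stand-in: the retrieval and LLM calls are stub functions, logging is omitted, and none of this is the actual MCP SDK or a real service API.

```python
# Rough sketch of the user -> MCP platform -> expert RAG -> LLM flow.
# All function bodies are placeholder stubs for illustration only.
from dataclasses import dataclass

@dataclass
class UserRequest:
    prompt: str
    llm_model: str   # language model chosen by the user
    expert_id: str   # domain expert selected for retrieval

def retrieve_from_expert_rag(expert_id: str, query: str) -> str:
    # Stub: a real platform would query the expert's registered knowledge base.
    return f"[top documents from expert '{expert_id}' relevant to: {query}]"

def call_llm_service(model: str, prompt: str) -> str:
    # Stub: a real platform would call the user-selected LLM service via OAuth.
    return f"[{model} answer based on]\n{prompt}"

def handle_request(req: UserRequest) -> str:
    context = retrieve_from_expert_rag(req.expert_id, req.prompt)   # expert RAG lookup
    augmented = f"Context:\n{context}\n\nQuestion: {req.prompt}"    # prompt augmentation
    return call_llm_service(req.llm_model, augmented)               # vendor-neutral LLM call

print(handle_request(UserRequest("How should I size this pump?", "model-a", "mech-eng-expert")))
```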

With Claude

“Encoder/Decoder” in a Transformer

Transformer Encoder-Decoder Architecture Explanation

This image is a diagram that visually explains the encoder-decoder structure of the Transformer model.

Encoder Section (Top, Green)

Purpose: Process “questions” by converting input text into vectors

Processing Steps:

  1. Convert the input into token embeddings and apply positional encoding
  2. Capture relationships between tokens using multi-head attention
  3. Extract meaning through feed-forward neural networks
  4. Stabilize with layer normalization
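
These steps can be sketched with PyTorch's built-in encoder modules. The hyperparameters and the random tensor standing in for embedded, position-encoded input tokens are illustrative only.

```python
# Minimal encoder sketch using PyTorch's built-in Transformer modules.
# Hyperparameters and the random "embedded tokens" are illustrative only.
import torch
import torch.nn as nn

d_model, n_heads, n_layers = 64, 4, 2
batch, seq_len = 1, 10

encoder_layer = nn.TransformerEncoderLayer(
    d_model=d_model, nhead=n_heads, dim_feedforward=128, batch_first=True
)
encoder = nn.TransformerEncoder(encoder_layer, num_layers=n_layers)

# Stand-in for "embedded input tokens + positional encoding".
x = torch.randn(batch, seq_len, d_model)

memory = encoder(x)    # every position attends to every other position
print(memory.shape)    # torch.Size([1, 10, 64])
```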

Decoder Section (Bottom, Purple)

Purpose: Generate new text (a “story”) conditioned on the encoded input

Processing Steps:

  1. Apply positional encoding to output tokens
  2. Masked Multi-Head Self-Attention (Key Difference)
    • Mask future tokens using the “Only Next” approach
    • Constraint for sequential generation
  3. Reference input information through encoder-decoder attention
  4. Apply feed-forward neural networks and layer normalization
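
A matching decoder sketch, with the same illustrative sizes, shows the causal mask that implements the “Only Next” constraint: each position may attend only to itself and earlier positions.

```python
# Minimal decoder sketch (PyTorch) with a causal mask that hides future tokens.
import torch
import torch.nn as nn

d_model, n_heads = 64, 4
batch, tgt_len, src_len = 1, 6, 10

decoder_layer = nn.TransformerDecoderLayer(
    d_model=d_model, nhead=n_heads, dim_feedforward=128, batch_first=True
)
decoder = nn.TransformerDecoder(decoder_layer, num_layers=2)

tgt = torch.randn(batch, tgt_len, d_model)      # output tokens generated so far (stand-in)
memory = torch.randn(batch, src_len, d_model)   # encoder output (stand-in)

# Additive mask: -inf above the diagonal, so position i sees only positions <= i.
causal_mask = torch.triu(torch.full((tgt_len, tgt_len), float("-inf")), diagonal=1)

out = decoder(tgt, memory, tgt_mask=causal_mask)
print(out.shape)   # torch.Size([1, 6, 64])
```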

Key Features

  • Encoder: Processes entire input at once to understand context
  • Decoder: References only previous tokens to sequentially generate new tokens
  • Attention Mechanism: Focuses on highly relevant parts for information processing

This is the core architecture used in various natural language processing tasks such as machine translation, text summarization, and question answering.

With Claude

“Positional Encoding” in a Transformer

Positional Encoding in Transformer Models

The Problem: Loss of Sequential Information

Transformer models use an attention mechanism that enables each token to interact with all other tokens in parallel, regardless of their positions in the sequence. While this parallel processing offers computational advantages, it comes with a significant limitation: the model loses all information about the sequential order of tokens. This means that without additional mechanisms, a Transformer cannot distinguish between sequences like “I am right” and “Am I right?” despite their different meanings.

The Solution: Positional Encoding

To address this limitation, Transformers implement positional encoding:

  1. Definition: Positional encoding adds position-specific information to each token’s embedding, allowing the model to understand sequence order.
  2. Implementation: The standard approach uses sinusoidal functions (sine and cosine) with different frequencies to create unique position vectors:
    • For each position in the sequence, a unique vector is generated
    • These vectors are calculated using sin() and cos() functions
    • The position vectors are then added to the corresponding token embeddings
  3. Mathematical properties:
    • Each position has a unique encoding
    • The encodings have a consistent pattern that allows the model to generalize to sequence lengths not seen during training
    • The relative positions of tokens can be expressed as a linear function of their encodings
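
The sinusoidal scheme in item 2 can be sketched as follows; max_len and d_model are illustrative values, and the formula follows the standard sin/cos construction.

```python
# Minimal sketch of sinusoidal positional encoding (standard sin/cos scheme).
import math
import torch

def sinusoidal_positional_encoding(max_len: int, d_model: int) -> torch.Tensor:
    position = torch.arange(max_len).unsqueeze(1)   # (max_len, 1)
    div_term = torch.exp(torch.arange(0, d_model, 2) * (-math.log(10000.0) / d_model))
    pe = torch.zeros(max_len, d_model)
    pe[:, 0::2] = torch.sin(position * div_term)    # even dimensions use sine
    pe[:, 1::2] = torch.cos(position * div_term)    # odd dimensions use cosine
    return pe

pe = sinusoidal_positional_encoding(max_len=50, d_model=16)
print(pe.shape)   # torch.Size([50, 16])
# Each row is the unique vector added to the embedding of the token at that position.
```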

Integration with Attention Mechanism

The combination of positional encoding with the attention mechanism enables Transformers to process tokens in parallel while maintaining awareness of their sequential relationships:

  1. Context-aware processing: Each attention head learns to interpret the positional information within its specific context.
  2. Multi-head flexibility: Different attention heads (A style, B style, C style) can focus on different aspects of positional relationships.
  3. Adaptive ordering: The model learns to construct context-appropriate ordering of tokens, enabling it to handle different linguistic structures and semantics.

Practical Impact

This approach allows Transformers to:

  • Distinguish between sentences with identical words but different orders
  • Understand syntactic structures that depend on word positions
  • Process variable-length sequences effectively
  • Maintain the computational efficiency of parallel processing while preserving sequential information

Positional encoding is a fundamental component that enables Transformer models to achieve state-of-the-art performance across a wide range of natural language processing tasks.

With Claude

Attention in a Transformer

Attention Mechanism in Transformer Models

Overview

The attention mechanism in Transformer models is a revolutionary technology that has transformed the field of natural language processing. This technique allows each word (token) in a sentence to form direct relationships with all other words.

Working Principles

  1. Tokenization Stage: Input text is divided into individual tokens.
  2. Attention Application: Each token calculates its relevance to all other tokens.
  3. Mathematical Implementation:
    • Each token is converted into Query, Key, and Value vectors.
    • The relevance between a specific token (Query) and other tokens (Keys) is calculated.
    • Weights are applied to the Values based on the calculated relevance.
    • This is expressed as the ‘sum of Value * Weight’.
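
The Query/Key/Value computation above can be sketched for a single head as below. The softmax and the 1/√d_k scaling come from the standard scaled dot-product formulation; the tensor sizes are illustrative.

```python
# Single-head scaled dot-product attention; sizes are illustrative.
import torch

seq_len, d_k = 5, 16
Q = torch.randn(seq_len, d_k)   # one Query vector per token
K = torch.randn(seq_len, d_k)   # one Key vector per token
V = torch.randn(seq_len, d_k)   # one Value vector per token

scores = Q @ K.T / d_k ** 0.5             # relevance of each token to every other token
weights = torch.softmax(scores, dim=-1)   # each row sums to 1
output = weights @ V                      # "sum of Value * Weight" per token

print(weights.shape, output.shape)        # torch.Size([5, 5]) torch.Size([5, 16])
```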

Multi-Head Attention

  • Definition: A method that calculates multiple attention vectors for a single token in parallel.
  • Characteristics: Each head (styles A, B, C) captures token relationships from different perspectives.
  • Advantage: Can simultaneously extract various information such as grammatical relationships and semantic associations.
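
For the multi-head case, PyTorch's built-in nn.MultiheadAttention runs several such heads in parallel and combines their outputs; the sizes below (and the choice of 4 heads) are illustrative.

```python
# Minimal multi-head self-attention sketch using PyTorch's built-in module.
import torch
import torch.nn as nn

d_model, n_heads, batch, seq_len = 64, 4, 1, 5
mha = nn.MultiheadAttention(embed_dim=d_model, num_heads=n_heads, batch_first=True)

x = torch.randn(batch, seq_len, d_model)   # token embeddings (stand-in)
out, attn_weights = mha(x, x, x)           # self-attention: Query = Key = Value = x

print(out.shape)            # torch.Size([1, 5, 64])
print(attn_weights.shape)   # torch.Size([1, 5, 5]), averaged over the 4 heads
```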

Key Benefits

  1. Contextual Understanding: Enables understanding of word meanings based on context.
  2. Long-Distance Dependency Resolution: Can directly connect words that are far apart in a sentence.
  3. Parallel Processing: High computational efficiency due to simultaneous processing of all tokens.

Applications

Transformer-based models demonstrate exceptional performance in various natural language processing tasks including machine translation, text generation, and question answering. They form the foundation of modern AI models such as GPT and BERT.

With Claude