per Watt with AI

This image, titled “per Watt with AI,” is a diagram explaining the paradigm shift toward power efficiency in the AI era, particularly after the emergence of LLMs.

Overall Context

Core Structure of AI Development:

  • Machine Learning = Computing = Using Power
  • The equal signs (=) indicate that these three elements are essentially the same concept. In other words, AI machine learning inherently means large-scale computing, which inevitably involves power consumption.

Characteristics of LLMs: As AI, and LLMs in particular, have proven their effectiveness, tremendous progress has followed. However, their technical characteristics lead to the following structure:

  • Huge Computing: Massively parallel processing of simple tasks
  • Huge Power: Enormous power consumption due to this parallel processing
  • Huge Cost: Power costs and infrastructure expenses

Importance of Power Efficiency Metrics

With hardware advancements making this approach practically viable, power consumption has become a critical issue that affects even the global ecosystem. As a result, power is now used as a performance indicator for all operations.

Key Power Efficiency Metrics

Performance-related:

  • FLOPs/Watt: Floating-point operations per watt
  • Inferences/Watt: Number of inferences processed per watt
  • Training/Watt: Training performance per watt

Operations-related:

  • Workload/Watt: Workload processing capacity per watt
  • Data/Watt: Data processing capacity per watt
  • IT Work/Watt: IT work processing capacity per watt

Infrastructure-related:

  • Cooling/Watt: Cooling efficiency per watt
  • Water/Watt: Water usage efficiency per watt
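
To make the arithmetic behind these metrics concrete: dividing a throughput (work per second) by power draw (joules per second) is the same as dividing total work by total energy. The sketch below illustrates this; the numbers are made-up placeholders, not values from the diagram.

```python
# Minimal sketch: a throughput-per-watt metric equals total work divided by
# total energy, since (work/s) / (J/s) == work / J. All numbers are illustrative.

def per_watt(work_done: float, energy_joules: float) -> float:
    """Work-per-second per watt, which reduces to work per joule."""
    return work_done / energy_joules

# Hypothetical measurements: 4.2e15 floating-point operations using 7,000 J,
# and 1,200 inference requests served using 18,000 J.
print(f"FLOPs/Watt:      {per_watt(4.2e15, 7_000):.3e}")
print(f"Inferences/Watt: {per_watt(1_200, 18_000):.4f}")
```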

This diagram illustrates that in the AI era, power efficiency has become the core criterion for all performance evaluations, transcending simple technical metrics to encompass environmental, economic, and social perspectives.

With Claude

Learning, Reasoning, Inference

This image illustrates the three core processes of AI LLMs by drawing parallels to human learning and cognitive processes.

Learning

  • Depicted as a wise elderly scholar reading books in a library
  • Represents the lifelong process of absorbing knowledge and experiences accumulated by humanity over generations
  • The bottom icons show data accumulation and knowledge storage processes
  • Meaning: Just as AI learns human language and knowledge through vast text data, humans also build knowledge throughout their lives through continuous learning and experience

Reasoning

  • Shows a character deep in thought, surrounded by mathematical formulas
  • Represents the complex mental process of confronting a problem and searching for solutions through internal contemplation
  • The bottom icons symbolize problem analysis and processing stages
  • Meaning: The human cognitive process of using learned knowledge to engage in logical thinking and analysis to solve problems

Inference

  • Features a character confidently exclaiming “THE ANSWER IS CLEAR!”
  • Expresses the confidence and decisiveness when finally finding an answer after complex thought processes
  • The bottom checkmark signifies reaching a final conclusion
  • Meaning: The human act of ultimately speaking an answer or making a behavioral decision through thought and analysis

These three stages visually demonstrate how AI processes information in a manner similar to the natural human sequence of learning → thinking → conclusion, connecting AI’s technical processes to familiar human cognitive patterns.

With Claude

Digital Twin with LLM

This image illustrates the revolutionary applicability of a Digital Twin when enhanced by LLM integration.

Three Core Components of Digital Twin

Digital Twin consists of three essential elements:

  1. Modeling – Creating digital replicas of physical objects
  2. Data – Real-time sensor data and operational information collection
  3. Simulation – Predictive analysis and scenario testing

Traditional Limitations and LLM’s Revolutionary Solution

Previous Challenges: Modeling results were expressed only through abstract concepts like “Visual Effect” and “Easy to view of complex,” making practical interpretation difficult.

LLM as a Game Changer:

  • Multimodal Interpretation: Transforms complex 3D models, data patterns, and simulation results into intuitive natural language explanations
  • Retrieval Interpretation: Instantly extracts key insights from vast datasets and converts them into human-understandable formats
  • Replacing Human Interpretation Resources: The LLM provides expert-level analytical capability, enabling continuous 24/7 monitoring
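
As a rough sketch of this interpretation step, the snippet below assembles digital-twin telemetry into a natural-language prompt for an LLM. The sensor names, values, and the commented-out ask_llm() call are hypothetical placeholders, not elements of the diagram or of any real API.

```python
# Hypothetical sketch: turning digital-twin telemetry into a natural-language
# question for an LLM. Data, names, and ask_llm() are placeholders.

telemetry = {
    "pump_3_vibration_mm_s": 7.8,          # illustrative sensor reading
    "pump_3_bearing_temp_c": 81.2,         # illustrative sensor reading
    "simulated_failure_prob_30d": 0.23,    # illustrative simulation output
}

prompt = (
    "You are assisting with a digital twin of a pumping station.\n"
    "Current readings and simulation output:\n"
    + "\n".join(f"- {name}: {value}" for name, value in telemetry.items())
    + "\nExplain in plain language what these values suggest and what to inspect first."
)

# response = ask_llm(prompt)   # stands in for whichever LLM service is integrated
print(prompt)
```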

Future Value of Digital Twin

With LLM integration, Digital Twin evolves from a simple visualization tool into an intelligent decision-making partner. This becomes the core driver for maximizing operational efficiency and continuous innovation, accelerating digital transformation across industries.

Ultimately, this diagram emphasizes that LLM is the key technology that unlocks the true potential of Digital Twin, demonstrating its necessity and serving as the foundation for sustained operational improvement and future development.

With Claude

Personal (User/Expert) Data Service

System Overview

The Personal Data Service is an open expert RAG service platform based on MCP (Model Context Protocol). This system creates a bidirectional ecosystem where both users and experts can benefit mutually, enhancing accessibility to specialized knowledge and improving AI service quality.

Core Components

1. User Interface (Left Side)

  • LLM Model Selection: Users can choose their preferred language model or MoE (Mixture of Experts)
  • Expert Selection: Select domain-specific experts for customized responses
  • Prompt Input: Enter specific questions or requests

2. Open MCP Platform (Center)

  • Integrated Management Hub: Connects and coordinates all system components
  • Request Processing: Matches user requests with appropriate expert RAG systems
  • Service Orchestration: Manages and optimizes the entire workflow

3. LLM Service Layer (Right Side)

  • Multi-LLM Support: Integration with various AI model services
  • OAuth Authentication: Users directly select and authenticate with paid or free services
  • Vendor Neutrality: Open architecture independent of specific AI services

4. Expert RAG Ecosystem (Bottom)

  • Specialized Data Registration: Building expert-specific knowledge databases through RAG
  • Quality Management System: Ensuring reliability through evaluation and reputation management
  • Historical Logs: Continuous quality improvement through service usage records

Key Features

  1. Bidirectional Ecosystem: Users obtain expert answers while experts monetize their knowledge
  2. Open Architecture: Scalable platform based on MCP standards
  3. Quality Assurance: Expert and answer quality management through evaluation systems
  4. Flexible Integration: Compatibility with various LLM services
  5. Autonomous Operation: Direct data management and updates by experts
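
The request flow these features describe can be sketched roughly as follows. Everything here is a simplified stand-in: the retrieval and LLM calls are stub functions, logging is omitted, and none of this is the actual MCP SDK or a real service API.

```python
# Rough sketch of the user -> MCP platform -> expert RAG -> LLM flow.
# All function bodies are placeholder stubs for illustration only.
from dataclasses import dataclass

@dataclass
class UserRequest:
    prompt: str
    llm_model: str   # language model chosen by the user
    expert_id: str   # domain expert selected for retrieval

def retrieve_from_expert_rag(expert_id: str, query: str) -> str:
    # Stub: a real platform would query the expert's registered knowledge base.
    return f"[top documents from expert '{expert_id}' relevant to: {query}]"

def call_llm_service(model: str, prompt: str) -> str:
    # Stub: a real platform would call the user-selected LLM service via OAuth.
    return f"[{model} answer based on]\n{prompt}"

def handle_request(req: UserRequest) -> str:
    context = retrieve_from_expert_rag(req.expert_id, req.prompt)   # expert RAG lookup
    augmented = f"Context:\n{context}\n\nQuestion: {req.prompt}"    # prompt augmentation
    return call_llm_service(req.llm_model, augmented)               # vendor-neutral LLM call

print(handle_request(UserRequest("How should I size this pump?", "model-a", "mech-eng-expert")))
```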

With Claude

“Encoder/Decoder” in a Transformer

Transformer Encoder-Decoder Architecture Explanation

This image is a diagram that visually explains the encoder-decoder structure of the Transformer model.

Encoder Section (Top, Green)

Purpose: Process “questions” by converting input text into vectors

Processing Steps:

  1. Convert the input into token embeddings and apply positional encoding
  2. Capture relationships between tokens using multi-head attention
  3. Extract meaning through feed-forward neural networks
  4. Stabilize with layer normalization
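
These steps can be sketched with PyTorch's built-in encoder modules. The hyperparameters and the random tensor standing in for embedded, position-encoded input tokens are illustrative only.

```python
# Minimal encoder sketch using PyTorch's built-in Transformer modules.
# Hyperparameters and the random "embedded tokens" are illustrative only.
import torch
import torch.nn as nn

d_model, n_heads, n_layers = 64, 4, 2
batch, seq_len = 1, 10

encoder_layer = nn.TransformerEncoderLayer(
    d_model=d_model, nhead=n_heads, dim_feedforward=128, batch_first=True
)
encoder = nn.TransformerEncoder(encoder_layer, num_layers=n_layers)

# Stand-in for "embedded input tokens + positional encoding".
x = torch.randn(batch, seq_len, d_model)

memory = encoder(x)    # every position attends to every other position
print(memory.shape)    # torch.Size([1, 10, 64])
```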

Decoder Section (Bottom, Purple)

Purpose: Generate new text (a “story”) conditioned on the encoded input

Processing Steps:

  1. Apply positional encoding to output tokens
  2. Masked Multi-Head Self-Attention (Key Difference)
    • Mask future tokens using the “Only Next” approach
    • Constraint for sequential generation
  3. Reference input information through encoder-decoder attention
  4. Apply feed-forward neural networks and layer normalization
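
A matching decoder sketch, with the same illustrative sizes, shows the causal mask that implements the “Only Next” constraint: each position may attend only to itself and earlier positions.

```python
# Minimal decoder sketch (PyTorch) with a causal mask that hides future tokens.
import torch
import torch.nn as nn

d_model, n_heads = 64, 4
batch, tgt_len, src_len = 1, 6, 10

decoder_layer = nn.TransformerDecoderLayer(
    d_model=d_model, nhead=n_heads, dim_feedforward=128, batch_first=True
)
decoder = nn.TransformerDecoder(decoder_layer, num_layers=2)

tgt = torch.randn(batch, tgt_len, d_model)      # output tokens generated so far (stand-in)
memory = torch.randn(batch, src_len, d_model)   # encoder output (stand-in)

# Additive mask: -inf above the diagonal, so position i sees only positions <= i.
causal_mask = torch.triu(torch.full((tgt_len, tgt_len), float("-inf")), diagonal=1)

out = decoder(tgt, memory, tgt_mask=causal_mask)
print(out.shape)   # torch.Size([1, 6, 64])
```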

Key Features

  • Encoder: Processes entire input at once to understand context
  • Decoder: References only previous tokens to sequentially generate new tokens
  • Attention Mechanism: Focuses on highly relevant parts for information processing

This is the core architecture used in various natural language processing tasks such as machine translation, text summarization, and question answering.

With Claude

“Positional Encoding” in a Transformer

Positional Encoding in Transformer Models

The Problem: Loss of Sequential Information

Transformer models use an attention mechanism that enables each token to interact with all other tokens in parallel, regardless of their positions in the sequence. While this parallel processing offers computational advantages, it comes with a significant limitation: the model loses all information about the sequential order of tokens. This means that without additional mechanisms, a Transformer cannot distinguish between sequences like “I am right” and “Am I right?” despite their different meanings.

The Solution: Positional Encoding

To address this limitation, Transformers implement positional encoding:

  1. Definition: Positional encoding adds position-specific information to each token’s embedding, allowing the model to understand sequence order.
  2. Implementation: The standard approach uses sinusoidal functions (sine and cosine) with different frequencies to create unique position vectors:
    • For each position in the sequence, a unique vector is generated
    • These vectors are calculated using sin() and cos() functions
    • The position vectors are then added to the corresponding token embeddings
  3. Mathematical properties:
    • Each position has a unique encoding
    • The encodings have a consistent pattern that allows the model to generalize to sequence lengths not seen during training
    • The relative positions of tokens can be expressed as a linear function of their encodings
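
The sinusoidal scheme in item 2 can be sketched as follows; max_len and d_model are illustrative values, and the formula follows the standard sin/cos construction.

```python
# Minimal sketch of sinusoidal positional encoding (standard sin/cos scheme).
import math
import torch

def sinusoidal_positional_encoding(max_len: int, d_model: int) -> torch.Tensor:
    position = torch.arange(max_len).unsqueeze(1)   # (max_len, 1)
    div_term = torch.exp(torch.arange(0, d_model, 2) * (-math.log(10000.0) / d_model))
    pe = torch.zeros(max_len, d_model)
    pe[:, 0::2] = torch.sin(position * div_term)    # even dimensions use sine
    pe[:, 1::2] = torch.cos(position * div_term)    # odd dimensions use cosine
    return pe

pe = sinusoidal_positional_encoding(max_len=50, d_model=16)
print(pe.shape)   # torch.Size([50, 16])
# Each row is the unique vector added to the embedding of the token at that position.
```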

Integration with Attention Mechanism

The combination of positional encoding with the attention mechanism enables Transformers to process tokens in parallel while maintaining awareness of their sequential relationships:

  1. Context-aware processing: Each attention head learns to interpret the positional information within its specific context.
  2. Multi-head flexibility: Different attention heads (A style, B style, C style) can focus on different aspects of positional relationships.
  3. Adaptive ordering: The model learns to construct context-appropriate ordering of tokens, enabling it to handle different linguistic structures and semantics.

Practical Impact

This approach allows Transformers to:

  • Distinguish between sentences with identical words but different orders
  • Understand syntactic structures that depend on word positions
  • Process variable-length sequences effectively
  • Maintain the computational efficiency of parallel processing while preserving sequential information

Positional encoding is a fundamental component that enables Transformer models to achieve state-of-the-art performance across a wide range of natural language processing tasks.

With Claude

Attention in a Transformer

Attention Mechanism in Transformer Models

Overview

The attention mechanism in Transformer models is a revolutionary technology that has transformed the field of natural language processing. This technique allows each word (token) in a sentence to form direct relationships with all other words.

Working Principles

  1. Tokenization Stage: Input text is divided into individual tokens.
  2. Attention Application: Each token calculates its relevance to all other tokens.
  3. Mathematical Implementation:
    • Each token is converted into Query, Key, and Value vectors.
    • The relevance between a specific token (Query) and other tokens (Keys) is calculated.
    • Weights are applied to the Values based on the calculated relevance.
    • This is expressed as the ‘sum of Value * Weight’.
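
The Query/Key/Value computation above can be sketched for a single head as below. The softmax and the 1/√d_k scaling come from the standard scaled dot-product formulation; the tensor sizes are illustrative.

```python
# Single-head scaled dot-product attention; sizes are illustrative.
import torch

seq_len, d_k = 5, 16
Q = torch.randn(seq_len, d_k)   # one Query vector per token
K = torch.randn(seq_len, d_k)   # one Key vector per token
V = torch.randn(seq_len, d_k)   # one Value vector per token

scores = Q @ K.T / d_k ** 0.5             # relevance of each token to every other token
weights = torch.softmax(scores, dim=-1)   # each row sums to 1
output = weights @ V                      # "sum of Value * Weight" per token

print(weights.shape, output.shape)        # torch.Size([5, 5]) torch.Size([5, 16])
```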

Multi-Head Attention

  • Definition: A method that calculates multiple attention vectors for a single token in parallel.
  • Characteristics: Each head (styles A, B, C) captures token relationships from different perspectives.
  • Advantage: Can simultaneously extract various information such as grammatical relationships and semantic associations.
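
For the multi-head case, PyTorch's built-in nn.MultiheadAttention runs several such heads in parallel and combines their outputs; the sizes below (and the choice of 4 heads) are illustrative.

```python
# Minimal multi-head self-attention sketch using PyTorch's built-in module.
import torch
import torch.nn as nn

d_model, n_heads, batch, seq_len = 64, 4, 1, 5
mha = nn.MultiheadAttention(embed_dim=d_model, num_heads=n_heads, batch_first=True)

x = torch.randn(batch, seq_len, d_model)   # token embeddings (stand-in)
out, attn_weights = mha(x, x, x)           # self-attention: Query = Key = Value = x

print(out.shape)            # torch.Size([1, 5, 64])
print(attn_weights.shape)   # torch.Size([1, 5, 5]), averaged over the 4 heads
```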

Key Benefits

  1. Contextual Understanding: Enables understanding of word meanings based on context.
  2. Long-Distance Dependency Resolution: Can directly connect words that are far apart in a sentence.
  3. Parallel Processing: High computational efficiency due to simultaneous processing of all tokens.

Applications

Transformer-based models demonstrate exceptional performance in various natural language processing tasks including machine translation, text generation, and question answering. They form the foundation of modern AI models such as GPT and BERT.

With Claude