inference – Lechuck Park

AI Model 3 Works

Posted on 2026-01-15 by lechuck park

Analysis of AI Model 3 Works

The provided image illustrates the three core stages of how AI models operate: Learning, Inference, and Data Generation.

1. Learning

Goal: Knowledge acquisition and parameter updates. This is the stage where the AI “studies” data to find patterns.
Mechanism: Bidirectional (Feed-forward + Backpropagation). It processes data to get a result and then goes backward to correct errors by adjusting internal weights.
Key Metrics: Accuracy and Loss. The objective is to minimize loss, which in turn raises the model’s accuracy.
Resource Requirement: Very High. It requires high-performance server clusters equipped with powerful GPUs like the NVIDIA H100.
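The bidirectional mechanism described above can be sketched in a few lines. This is a minimal toy, assuming a single-weight model `y = w * x` and plain gradient descent; the function name `train`, the learning rate, and the data are all made up for illustration and do not come from the diagram.

```python
# Toy "learning" loop: fit y = w * x with gradient descent.
def train(xs, ys, lr=0.01, epochs=200):
    w = 0.0  # single trainable weight (hypothetical toy model)
    losses = []
    for _ in range(epochs):
        # Feed-forward: compute predictions and the mean squared error loss.
        preds = [w * x for x in xs]
        loss = sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(xs)
        losses.append(loss)
        # Backward: gradient of the loss with respect to w.
        grad = sum(2 * (p - y) * x for p, y, x in zip(preds, ys, xs)) / len(xs)
        # Update: move the weight against the gradient to reduce the loss.
        w -= lr * grad
    return w, losses


# The data follows y = 2x, so the learned weight should approach 2.
w, losses = train([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
```

Each epoch runs one forward pass and one backward pass, and the recorded loss shrinks as the weight settles near 2. Real training does the same thing over billions of weights, which is where the GPU clusters come in.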

2. Inference (Reasoning)

Goal: Result prediction, classification, and judgment. This stage uses a pre-trained model to answer specific questions (e.g., “What is in this picture?”).
Mechanism: Unidirectional (Feed-forward). Data simply flows forward through the model to produce an output.
Key Metrics: Latency and Efficiency. The focus is on how quickly and cheaply the model can provide an answer.
Resource Requirement: Moderate. It is efficient enough to be feasible on “Edge devices” like smartphones or local PCs.
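As a rough illustration of the feed-forward-only mechanism, here is a minimal sketch of inference with frozen parameters. The weights, bias, and the `predict` function are hypothetical stand-ins for a pre-trained model, not values from any real one.

```python
import math

# Hypothetical parameters, assumed to come from an earlier training run.
FROZEN_WEIGHTS = [0.8, -0.4]
FROZEN_BIAS = 0.1


def predict(weights, bias, features):
    # Feed-forward only: a weighted sum followed by a sigmoid.
    # Nothing is updated here, so there is no backward pass.
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))


score = predict(FROZEN_WEIGHTS, FROZEN_BIAS, [1.0, 2.0])
label = "positive" if score >= 0.5 else "negative"
```

Because the parameters never change, a single call is cheap and its cost is predictable, which is why latency rather than loss becomes the number to watch.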

3. Data Generation

Goal: New data synthesis. This involves creating entirely new content like text, images, or music (e.g., Generative AI like ChatGPT).
Mechanism: Iterative Unidirectional (Recurring Calculation). It generates results piece by piece (token by token), repeating the forward pass and feeding each new token back in as input.
Key Metrics: Quality, Diversity, and Consistency. The focus is on how natural and varied the generated output is.
Resource Requirement: High. Because it involves iterative calculations for every single token, it requires more power than simple inference.
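The token-by-token loop can be sketched as follows. This is only a toy: the lookup table `NEXT_TOKEN` is a made-up stand-in for a trained model, and a real generator would sample from a probability distribution rather than follow a fixed table.

```python
# Hypothetical next-token table standing in for a trained model.
NEXT_TOKEN = {
    "<start>": "the",
    "the": "model",
    "model": "generates",
    "generates": "text",
    "text": "<end>",
}


def next_token(context):
    # One forward pass: predict the next token from the latest token.
    return NEXT_TOKEN.get(context[-1], "<end>")


def generate(max_tokens=10):
    tokens = ["<start>"]
    for _ in range(max_tokens):
        tok = next_token(tokens)
        if tok == "<end>":
            break
        tokens.append(tok)  # feed the output back in as the next input
    return tokens[1:]


print(" ".join(generate()))
```

Running it prints `the model generates text`. The point is the shape of the loop: one forward pass per token, so the total cost grows with the length of the output.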

Summary

AI processes consist of Learning (studying data), Inference (applying knowledge), and Data Generation (creating new content).
Learning requires massive server power for bidirectional updates, while Inference is optimized for speed and can run on everyday devices.
Data Generation synthesizes new information through repetitive, iterative calculations, requiring high resources to maintain quality.

#AI #MachineLearning #GenerativeAI #DeepLearning #TechExplained #AIModel #Inference #DataScience #Learning #DataGeneration

With Gemini

Learning, Reasoning, Inference

Posted on 2025-07-15 by lechuck park

This image illustrates the three core processes of AI LLMs by drawing parallels to human learning and cognitive processes.

Learning

Depicted as a wise elderly scholar reading books in a library
Represents the lifelong process of absorbing knowledge and experiences accumulated by humanity over generations
The bottom icons show data accumulation and knowledge storage processes
Meaning: Just as AI learns human language and knowledge through vast text data, humans also build knowledge throughout their lives through continuous learning and experience

Reasoning

Shows a character deep in thought, surrounded by mathematical formulas
Represents the complex mental process of confronting a problem and searching for solutions through internal contemplation
The bottom icons symbolize problem analysis and processing stages
Meaning: The human cognitive process of using learned knowledge to engage in logical thinking and analysis to solve problems

Inference

Features a character confidently exclaiming “THE ANSWER IS CLEAR!”
Expresses the confidence and decisiveness when finally finding an answer after complex thought processes
The bottom checkmark signifies reaching a final conclusion
Meaning: The human act of ultimately speaking an answer or making a behavioral decision through thought and analysis

These three stages visually demonstrate how AI processes information in a manner similar to the natural human sequence of learning → thinking → conclusion, connecting AI’s technical processes to familiar human cognitive patterns.

With Claude

Personal with AI

Posted on 2025-04-29 by lechuck park

This diagram illustrates a “Personal Agent” system architecture that shows how everyday life is digitized to create an AI-based personal assistant:

Left side: The user’s daily activities (coffee, computer, exercise, sleep) are represented, which serve as the source for digitization.

Center-left: Various sensors (visual, auditory, tactile, olfactory, gustatory) capture the user’s daily activities and convert them through the “Digitization” process.

Center: The “Current State (Prompting)” component stores the digitized current state data, which is provided as prompting information to the AI agent.

Upper right (pink area): Two key processes take place:

“Learning”: Processing user data from an ML/LLM perspective
“Logging”: Continuously collecting data to update the vector database

This section runs on a “Personal Server or Cloud,” preferably on a personal GPU machine such as the NVIDIA DGX Spark, or alternatively in a cloud environment.

Lower right: In the “On-Device Works” area, the “Inference” process occurs. Based on current state data, the AI agent infers guidance needed for the user, and this process is handled directly on the user’s personal device.

Center bottom: The cute robot icon represents the AI agent, which provides personalized guidance to the user through the “Agent Guide” component.

Overall, this system has a cyclical structure that digitizes the user’s daily life, learns from that data to continuously update a personalized vector database, and uses the current state as a basis for the AI agent to provide customized guidance through an inference process that runs on-device.
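The logging and prompting parts of this cycle might look something like the sketch below. Everything in it is assumed for illustration: the `PersonalAgent` class, its method names, and the sample readings are hypothetical, and a plain list stands in for the vector database.

```python
from dataclasses import dataclass, field


@dataclass
class PersonalAgent:
    # Stand-in for the server-side store; a real system would use a vector database.
    history: list = field(default_factory=list)

    def log(self, reading):
        # "Logging": keep every digitized sensor reading.
        self.history.append(reading)

    def build_prompt(self, current_state, recent=2):
        # "Current State (Prompting)": combine the current state with recent history.
        context = "; ".join(self.history[-recent:])
        return (
            f"Recent activity: {context}. "
            f"Current state: {current_state}. Suggest guidance."
        )


agent = PersonalAgent()
for reading in ["coffee at 08:00", "desk work 09:00-12:00", "no exercise today"]:
    agent.log(reading)

prompt = agent.build_prompt("sitting for 3 hours")
```

The resulting `prompt` string is what would be handed to the on-device model for inference. The split mirrors the diagram: logging accumulates on the server side, while the prompt is assembled from the current state at the moment guidance is needed.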

with Claude

AI DC Changes

Posted on 2025-04-22 by lechuck park

The evolution of AI data centers has progressed through the following stages:

Legacy – The initial form of data centers, providing basic computing infrastructure.
Hyperscale – Evolved into a centralized (Centric) structure with these characteristics:
- Led by Big Tech companies (Google, Amazon, Microsoft, etc.)
- Focused on AI model training (Learning) with massive computing power
- Concentration of data and processing capabilities in central locations
Distributed – The current evolutionary direction with these features:
- Expansion of Edge/On-device computing
- Shift from AI training to inference-focused operations
- Moving from Big Tech centralization to enterprise and national data sovereignty
- Enabling personalization for customized user services

This evolution represents a democratization of AI technology, emphasizing data sovereignty, privacy protection, and the delivery of optimized services tailored to individual users.

AI data centers have evolved from legacy systems to hyperscale centralized structures dominated by Big Tech companies focused on AI training. The current shift toward distributed architecture emphasizes edge/on-device computing, inference capabilities, data sovereignty for enterprises and nations, and enhanced personalization for end users.

with Claude

GPU vs NPU on Deep learning

Posted on 2025-03-14 by lechuck park

This diagram illustrates the differences between GPU and NPU from a deep learning perspective:

GPU (Graphics Processing Unit):

Originally developed for 3D game rendering
In deep learning, it is used during training to run complex calculations over vast amounts of data in parallel
Characterized by “More Computing = Bigger Memory = More Power,” requiring high computing power
Processes big data and vectorizes information using the “Everything to Vector” approach
Stores learning results in Vector Databases for future use

NPU (Neural Processing Unit):

Retrieves information from already trained Vector DBs or foundation models to generate answers to questions
This process is called “Inference”
While the training phase processes all data in parallel, the inference phase only searches/infers content related to specific questions to formulate answers
Performs parallel processing similar to how neurons function

In conclusion, GPUs are responsible for processing enormous amounts of data and storing learning results in vector form, while NPUs specialize in the inference process of generating actual answers to questions based on this stored information. This relationship can be summarized as “training creates and stores vast amounts of data, while inference utilizes this at the point of need.”
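To make the “search only what is related to the question” idea concrete, here is a minimal sketch of vector lookup by cosine similarity. The three-dimensional vectors and the `VECTOR_DB` contents are invented for illustration; real embeddings have hundreds or thousands of dimensions and come from a trained model.

```python
import math


def cosine(a, b):
    # Cosine similarity: dot product divided by the product of the vector lengths.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Hypothetical store built ahead of time: text -> embedding vector.
VECTOR_DB = {
    "GPUs handle training": [0.9, 0.1, 0.0],
    "NPUs handle inference": [0.1, 0.9, 0.0],
    "Coffee is served hot": [0.0, 0.1, 0.9],
}


def retrieve(query_vec, k=1):
    # Rank stored texts by similarity to the query and keep the top k.
    ranked = sorted(
        VECTOR_DB,
        key=lambda text: cosine(VECTOR_DB[text], query_vec),
        reverse=True,
    )
    return ranked[:k]


top = retrieve([0.2, 0.8, 0.1])
```

With this query vector, `top` is `["NPUs handle inference"]`. Only the entries closest to the question are pulled out, which is far lighter work than the pass over all the data that produced the vectors in the first place.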

With Claude

Chain of thoughts

Posted on 2024-09-20 by lechuck park

From Claude with some prompting
This diagram titled “Chain of thoughts” illustrates an inferencing method implemented in AI language models like ChatGPT, inspired by human deductive reasoning processes and leveraging prompting techniques.

Key components:

Upper section:
- Shows a process from ‘Q’ (question) to ‘A’ (answer).
- Contains an “Experienced Knowledges” area with interconnected nodes A through H, representing the AI’s knowledge base.
Lower section:
- Compares “1x Prompting” with “Prompting Chains”.
- “1x Prompting” depicts a simple input-output process.
- “Prompting Chains” shows a multi-step reasoning process.
Overall process:
- Labeled “Inferencing by <Chain of thoughts>”, emphasizing the use of sequential thinking for complex reasoning.

This diagram visualizes how AI systems, particularly models like ChatGPT, go beyond simple input-output relationships. It mimics human deductive reasoning by using a multi-step thought process (Chain of thoughts) to answer complex questions. The AI utilizes its existing knowledge base and creates new connections to perform deeper reasoning.

This approach suggests that AI can process information and generate new insights in a manner similar to human cognition, rather than merely reproducing learned information. It demonstrates the AI’s capability to engage in more sophisticated problem-solving and analysis through a structured chain of thoughts.
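The contrast between “1x Prompting” and “Prompting Chains” can be sketched like this. The `fake_model` function is a hypothetical stand-in that only wraps its input, so the example shows how the chain is wired together and nothing about what a real model would answer.

```python
def fake_model(prompt):
    # Hypothetical stand-in for an LLM call; it only wraps the prompt in a canned reply.
    return f"[answer to: {prompt}]"


def single_prompt(question, model=fake_model):
    # "1x Prompting": one input, one output.
    return model(question)


def prompt_chain(question, steps, model=fake_model):
    # "Prompting Chains": each step receives the previous step's output.
    context = question
    trace = []
    for step in steps:
        context = model(f"{step}: {context}")
        trace.append(context)
    return trace


trace = prompt_chain(
    "Is 91 prime?",
    ["List known facts", "Reason step by step", "State the final answer"],
)
```

Here `trace` holds three entries, and each one contains the previous entry in full, which is the chaining the diagram describes: later steps build on intermediate results instead of jumping straight from Q to A.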

Foundation Model

Posted on 2024-04-25 by lechuck park

From Claude with some prompting
This image depicts a high-level overview of a foundation model architecture. It consists of various components including a knowledge base, weight database (parameters), vector database (relative data), tuning module for making answers, inference module for generating answers, prompt tools, and an evaluation component for benchmarking.

The knowledge base stores structured information, while the weight and vector databases hold learnable parameters and relative data representations, respectively. The tuning and inference modules utilize these components to generate responses or make predictions. Prompt tools assist in forming inputs, and the evaluation component assesses the model’s performance.

This architectural diagram illustrates the core building blocks and data flow of a foundation model system, likely used for language modeling, knowledge representation, or other AI applications that require integrating diverse data sources and capabilities.