Chapter 1 of 8 • 1 min read

LLM Anatomy

Components and basic structure of LLMs

Architecture Overview

Most LLMs use the Transformer architecture, which consists of:

Encoder-decoder or decoder-only structure
Self-attention mechanisms
Feed-forward neural networks
Layer normalization
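
To make the self-attention item above more concrete, here is a minimal sketch of causal (decoder-only) scaled dot-product attention in plain Python. It is an illustration under simplifying assumptions, not how a production model is written: it uses a single head, takes the query/key/value vectors as given instead of computing them with learned projection matrices, and the function names `softmax` and `causal_self_attention` are hypothetical.

```python
import math

def softmax(xs):
    # Subtract the max before exponentiating for numerical stability.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def causal_self_attention(q, k, v):
    """q, k, v: lists of per-token vectors. Assumption: single head,
    no learned projections (a real model computes q, k, v from the input)."""
    d = len(q[0])
    out = []
    for i, qi in enumerate(q):
        # Causal mask: token i only attends to positions 0..i.
        scores = [
            sum(a * b for a, b in zip(qi, k[j])) / math.sqrt(d)
            for j in range(i + 1)
        ]
        weights = softmax(scores)
        # Output is the attention-weighted average of the visible value vectors.
        out.append([
            sum(w * v[j][c] for j, w in enumerate(weights))
            for c in range(len(v[0]))
        ])
    return out

# Two toy tokens with 2-dimensional vectors.
x = [[1.0, 0.0], [0.0, 1.0]]
result = causal_self_attention(x, x, x)
print(result[0])  # the first token can only see itself, so it returns v[0]
```

In a full Transformer block, this step would be followed by the feed-forward network, with layer normalization and residual connections around both.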

Memory Footprint

Estimating the memory required to hold the model weights:

Model Size (Parameters) × Precision (bytes per parameter) = Memory Required

Example:
- 7B parameter model @ FP16: 7B × 2 bytes = 14GB
- 70B parameter model @ FP16: 70B × 2 bytes = 140GB
- 7B parameter model @ FP8:  7B × 1 byte = 7GB
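
The same arithmetic can be sketched as a small helper. This is a rough estimate under stated assumptions: it covers the weights only (not activations or the KV cache), uses decimal gigabytes (1 GB = 10^9 bytes), and the names `BYTES_PER_PARAM` and `weight_memory_gb` are hypothetical, chosen for illustration.

```python
# Bytes used to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "bf16": 2, "fp8": 1, "int8": 1}

def weight_memory_gb(num_params, precision):
    """Memory for model weights only, in decimal GB (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9

examples = [
    (7_000_000_000, "fp16"),
    (70_000_000_000, "fp16"),
    (7_000_000_000, "fp8"),
]
for params, precision in examples:
    gb = weight_memory_gb(params, precision)
    print(f"{params / 1e9:.0f}B @ {precision}: {gb:.0f} GB")
# prints:
# 7B @ fp16: 14 GB
# 70B @ fp16: 140 GB
# 7B @ fp8: 7 GB
```

Actual usage at serving time is typically higher than this figure, since the runtime also needs memory for activations and the KV cache.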

Key Components

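Embedding Layer - Converts token IDs to vectors
Transformer Blocks - Main computation layers (self-attention plus feed-forward)
LM Head - Projects the final hidden state to vocabulary logits, from which output tokens are chosen
KV Cache - Stores attention keys/values of earlier tokens so they are not recomputed during inference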
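
The Memory Footprint formula counts model parameters only; the KV cache takes additional memory that grows with sequence length and batch size. A common back-of-the-envelope estimate is sketched below. The function name `kv_cache_bytes` and the configuration values are assumptions for illustration (roughly the shape of a 7B-class model with standard multi-head attention), not figures from this chapter.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len,
                   batch_size=1, bytes_per_value=2):
    """Approximate KV cache size in bytes.

    The leading factor 2 accounts for storing both a key and a value
    vector per token, per KV head, per layer. bytes_per_value=2 assumes FP16.
    """
    return (2 * num_layers * num_kv_heads * head_dim
            * seq_len * batch_size * bytes_per_value)

# Hypothetical configuration: 32 layers, 32 KV heads of dimension 128,
# one sequence of 4096 tokens, FP16 cache.
size = kv_cache_bytes(num_layers=32, num_kv_heads=32, head_dim=128, seq_len=4096)
print(f"KV cache: {size / 1024**3:.1f} GiB")  # prints "KV cache: 2.0 GiB"
```

Models that share key/value heads across query heads (grouped-query attention) have a smaller `num_kv_heads`, which shrinks this figure proportionally.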

In the next chapter, we will look at serving frameworks that manage these components efficiently.