Chaos and Order

https://www.youngju.dev/blog
Slowly, and correctly. AI Researcher & DevOps Engineer Youngju's tech blog. GPU/CUDA, LLM, MLOps, Kubernetes AI workloads, distributed training, and data engineering.
ko fjvbn2003@gmail.com (Youngju Kim) Sat, 16 May 2026 00:00:00 GMT

Foundation Model Architectures 2026 - Beyond the Transformer / Mamba 2 / Hyena / RWKV / RetNet / Griffin / Jamba / xLSTM / TTT / DiT / MoE / Flash Attention 3 Deep Dive
https://www.youngju.dev/blog/culture/2026-05-16-foundation-model-architectures-beyond-transformer-2026-mamba-hyena-rwkv-retnet-griffin-jamba-xlstm-ttt-dit-moe-flash-attention-3-deep-dive.en
In 2026 the foundation-model world is no longer Transformer-only. Vaswani et al.'s 2017 "Attention is All You Need" remains the standard, but next to it stand state-space models (Mamba, Mamba 2), the linear-RNN renaissance (RWKV, RetNet, Griffin), hybrids (AI21 Jamba, Falcon Mamba), Sepp Hochreiter's xLSTM, Test-Time Training, the DiT family behind Sora, MoE giants (Mixtral, DeepSeek-V3 671B, Google Million Experts), Flash Attention 3 and Ring Attention, plus the 1M+ token context era of Gemini 2M and Magic LTM-2-mini 100M. We map which architecture solves which problem, who should pick which, and what the Korean and Japanese ecosystems are building.
Sat, 16 May 2026 00:00:00 GMT fjvbn2003@gmail.com (Youngju Kim)
foundation-models, transformer, attention-is-all-you-need, vaswani, mamba, state-space-models, ssm, albert-gu, tri-dao, mamba-2, hyena, stanford-h2o, linear-attention, schmidhuber, rwkv, bo-peng, retnet, microsoft-retentive, griffin, deepmind-griffin, s5, jamba, ai21, falcon-mamba, xlstm, sepp-hochreiter, test-time-training, ttt, sun-et-al, dit, diffusion-transformer, sora-dit, mixture-of-experts, moe, mixtral, deepseek-v3-moe, million-experts, google-mome, flash-attention-3, ring-attention, gemini-2m, magic-ltm-2-mini, sakana-ai-evolutionary, 2026, deep-dive, english

Foundation Model Architectures 2026 - What Comes After the Transformer / Mamba 2 / Hyena / RWKV / RetNet / Griffin / Jamba / xLSTM / TTT / DiT / MoE / Flash Attention 3 In-Depth Guide
https://www.youngju.dev/blog/culture/2026-05-16-foundation-model-architectures-beyond-transformer-2026-mamba-hyena-rwkv-retnet-griffin-jamba-xlstm-ttt-dit-moe-flash-attention-3-deep-dive.ja
In 2026 the foundation-model scene is no longer all Transformer. Vaswani et al.'s 2017 "Attention is All You Need" is still the standard, but alongside it now stand state-space models such as Mamba and Mamba 2, the linear-RNN revival of RWKV, RetNet, and Griffin, the AI21 Jamba and Falcon Mamba hybrids, Sepp Hochreiter's xLSTM, Test-Time Training, Sora's DiT, MoE models such as Mixtral, DeepSeek-V3 671B, and Google Million Experts, Flash Attention 3 and Ring Attention, and the ultra-long contexts of Gemini 2M and Magic LTM-2-mini 100M. A single-pass overview of which architecture suits which problem and what the Korean and Japanese players are building.
Sat, 16 May 2026 00:00:00 GMT fjvbn2003@gmail.com (Youngju Kim)
foundation-models, transformer, attention-is-all-you-need, vaswani, mamba, state-space-models, ssm, albert-gu, tri-dao, mamba-2, hyena, stanford-h2o, linear-attention, schmidhuber, rwkv, bo-peng, retnet, microsoft-retentive, griffin, deepmind-griffin, s5, jamba, ai21, falcon-mamba, xlstm, sepp-hochreiter, test-time-training, ttt, sun-et-al, dit, diffusion-transformer, sora-dit, mixture-of-experts, moe, mixtral, deepseek-v3-moe, million-experts, google-mome, flash-attention-3, ring-attention, gemini-2m, magic-ltm-2-mini, sakana-ai-evolutionary, 2026, deep-dive, 日本語

Foundation Model Architectures 2026 - After the Transformer / Mamba 2 / Hyena / RWKV / RetNet / Griffin / Jamba / xLSTM / TTT / DiT / MoE / Flash Attention 3 In-Depth Guide
https://www.youngju.dev/blog/culture/2026-05-16-foundation-model-architectures-beyond-transformer-2026-mamba-hyena-rwkv-retnet-griffin-jamba-xlstm-ttt-dit-moe-flash-attention-3-deep-dive
In 2026 the foundation-model world is no longer dominated by the Transformer alone. Vaswani et al.'s 2017 "Attention is All You Need" is still the standard, but beside it stand state-space models (SSMs) such as Mamba and Mamba 2, the linear-RNN rediscovery camp of RWKV, RetNet, and Griffin, hybrids such as AI21 Jamba and Falcon Mamba, Sepp Hochreiter's xLSTM, Test-Time Training, Sora's DiT, MoE models such as Mixtral, DeepSeek-V3 671B, and Google Million Experts, Flash Attention 3 and Ring Attention, and the ultra-long contexts of Gemini 2M and Magic LTM-2-mini 100M. This post summarizes in one place which architecture is strong on which problem and what the Korean and Japanese camps are building.
Sat, 16 May 2026 00:00:00 GMT fjvbn2003@gmail.com (Youngju Kim)
foundation-models, transformer, attention-is-all-you-need, vaswani, mamba, state-space-models, ssm, albert-gu, tri-dao, mamba-2, hyena, stanford-h2o, linear-attention, schmidhuber, rwkv, bo-peng, retnet, microsoft-retentive, griffin, deepmind-griffin, s5, jamba, ai21, falcon-mamba, xlstm, sepp-hochreiter, test-time-training, ttt, sun-et-al, dit, diffusion-transformer, sora-dit, mixture-of-experts, moe, mixtral, deepseek-v3-moe, million-experts, google-mome, flash-attention-3, ring-attention, gemini-2m, magic-ltm-2-mini, sakana-ai-evolutionary, 2026, deep-dive

Top LLM Papers 2024-2026 - Llama, DeepSeek, Qwen, Mistral, Phi, RLHF, DPO, CoT, RAG, FlashAttention, vLLM Reading List
https://www.youngju.dev/blog/culture/2026-05-16-llm-papers-llama-deepseek-qwen-mistral-phi-rlhf-cot-rag-flashattention-vllm-2026-deep-dive.en
A curated reading list of 30+ must-read LLM papers for engineers building with LLMs in 2024-2026. It covers foundation models (Llama 3/4, DeepSeek-V3/R1, Qwen3, Mistral, Phi-4, Gemma 3), training innovations (MoE, MLA, GQA), post-training (RLHF, DPO, ORPO, KTO), reasoning (CoT, ToT, GRPO), agents (ReAct, SWE-Agent), retrieval (RAG, GraphRAG, ColBERT), efficiency (FlashAttention 1/2/3, vLLM PagedAttention, SGLang), evaluation (MMLU, GSM8K, SWE-Bench, OSWorld), safety, and Korean and Japanese models. Each paper is paired with its arXiv ID and a one-paragraph explanation of why it matters.
Sat, 16 May 2026 00:00:00 GMT fjvbn2003@gmail.com (Youngju Kim)
llm, papers, llama, deepseek, qwen, mistral, phi, rlhf, dpo, chain-of-thought, rag, flashattention, vllm, foundation-models, mixture-of-experts

LLM Paper Curation 2024-2026 - Llama · DeepSeek · Qwen · Mistral · Phi · RLHF · DPO · CoT · RAG · FlashAttention · vLLM Detailed Guide
https://www.youngju.dev/blog/culture/2026-05-16-llm-papers-llama-deepseek-qwen-mistral-phi-rlhf-cot-rag-flashattention-vllm-2026-deep-dive.ja
A curated set of 30+ must-read papers from 2024-2026 for engineers who build and operate LLMs. It spans foundation models (Llama 3/4, DeepSeek-V3/R1, Qwen3, Mistral, Phi-4, Gemma 3), training innovations (MoE, MLA, GQA), post-training (RLHF, DPO, ORPO, KTO), reasoning (CoT, ToT, GRPO), agents (ReAct, SWE-Agent), retrieval (RAG, GraphRAG, ColBERT), efficiency (FlashAttention 1/2/3, vLLM PagedAttention, SGLang), evaluation (MMLU, GSM8K, SWE-Bench, OSWorld), safety, and Korean and Japanese models. Each paper's arXiv ID and "why it matters" are summarized in one paragraph.
Sat, 16 May 2026 00:00:00 GMT fjvbn2003@gmail.com (Youngju Kim)
llm, papers, llama, deepseek, qwen, mistral, phi, rlhf, dpo, chain-of-thought, rag, flashattention, vllm, foundation-models, mixture-of-experts

LLM Paper Curation 2024-2026 - Llama · DeepSeek · Qwen · Mistral · Phi · RLHF · DPO · CoT · RAG · FlashAttention · vLLM In-Depth Guide
https://www.youngju.dev/blog/culture/2026-05-16-llm-papers-llama-deepseek-qwen-mistral-phi-rlhf-cot-rag-flashattention-vllm-2026-deep-dive
A curated set of 30+ must-read papers from 2024-2026 for engineers who build and operate LLMs. It spans foundation models (Llama 3/4, DeepSeek-V3/R1, Qwen3, Mistral, Phi-4, Gemma 3), training innovations (MoE, MLA, GQA), post-training (RLHF, DPO, ORPO, KTO), reasoning (CoT, ToT, GRPO), agents (ReAct, SWE-Agent), retrieval (RAG, GraphRAG, ColBERT), efficiency (FlashAttention 1/2/3, vLLM PagedAttention, SGLang), evaluation (MMLU, GSM8K, SWE-Bench, OSWorld), safety, and Korean and Japanese models. Each paper's arXiv ID and "why it matters" are summarized in one paragraph.
Sat, 16 May 2026 00:00:00 GMT fjvbn2003@gmail.com (Youngju Kim)
llm, papers, llama, deepseek, qwen, mistral, phi, rlhf, dpo, chain-of-thought, rag, flashattention, vllm, foundation-models, mixture-of-experts

Modern Swift & Apple Development 2026 Deep Dive - Swift 6 Strict Concurrency, SwiftData, Foundation Models, SwiftUI 5, Vapor, TCA
https://www.youngju.dev/blog/culture/2026-05-16-modern-swift-2026-swift-6-concurrency-strict-swiftdata-foundation-models-swiftui-5-vapor-tca-deep-dive.en
Swift in 2026 is no longer just an iOS language. Swift 6.1 ships strict concurrency on by default, with region-based isolation catching data races at compile time. SwiftData, introduced with iOS 17, has nearly displaced Core Data thanks to iOS 18 additions like the History API and composite keys. The Foundation Models framework gives every app access to a ~3B parameter on-device LLM with tool calling and Guided Generation, and it sits at the heart of Apple Intelligence. On the server side, Vapor 4, Hummingbird 2, and AWS Lambda Swift are growing; for cross-platform work, Skip (Swift→Android) and Swift Wasm 6 are real options. Embedded Swift runs on the Raspberry Pi Pico and ARM Cortex-M without malloc. This article maps the entire 2026 Apple developer landscape, from Swift 6 concurrency through SwiftUI 5 animations, App Intents, Swift Testing, Swift Macros, SwiftPM 6, and TCA 1.x.
Sat, 16 May 2026 00:00:00 GMT fjvbn2003@gmail.com (Youngju Kim)
swift, swift-6, swiftdata, foundation-models, swiftui, vapor, tca, apple-intelligence, ios-development, english

Modern Swift & Apple Platform Development 2026 Complete Guide - Swift 6 Strict Concurrency · SwiftData · Foundation Models · SwiftUI 5 · Vapor · TCA In-Depth Explanation
https://www.youngju.dev/blog/culture/2026-05-16-modern-swift-2026-swift-6-concurrency-strict-swiftdata-foundation-models-swiftui-5-vapor-tca-deep-dive.ja
Swift in 2026 is no longer merely an iOS language. In Swift 6.1, strict concurrency is enabled by default and region-based isolation detects data races at compile time. After debuting in iOS 17, SwiftData is taking over Core Data's place thanks to the History API and composite key support in iOS 18. The Foundation Models framework layers tool calling and Guided Generation on a roughly 3-billion-parameter on-device LLM and has become the heart of Apple Intelligence. On the server side, Vapor 4, Hummingbird 2, and AWS Lambda Swift are growing, as are Skip (Swift→Android) and Swift Wasm 6 for multiplatform work. Embedded Swift runs without malloc on the Raspberry Pi Pico and ARM Cortex-M. This article covers in one pass everything an Apple developer should know in 2026, from Swift 6 concurrency to SwiftUI 5 animations, App Intents, Swift Testing, Swift Macros, SwiftPM 6, and TCA 1.x.
Sat, 16 May 2026 00:00:00 GMT fjvbn2003@gmail.com (Youngju Kim)
swift, swift-6, swiftdata, foundation-models, swiftui, vapor, tca, apple-intelligence, ios-development, 日本語

Modern Swift & Apple Development 2026 Complete Guide - Swift 6 Strict Concurrency · SwiftData · Foundation Models · SwiftUI 5 · Vapor · TCA In-Depth Analysis
https://www.youngju.dev/blog/culture/2026-05-16-modern-swift-2026-swift-6-concurrency-strict-swiftdata-foundation-models-swiftui-5-vapor-tca-deep-dive
Swift in 2026 is no longer simply an iOS language. Swift 6.1 turns strict concurrency on by default and catches data races at compile time with region-based isolation. Since its debut in iOS 17, SwiftData has nearly taken Core Data's place thanks to the History API and composite key support in iOS 18.
The Foundation Models framework layers tool calling and Guided Generation on a roughly 3-billion-parameter on-device LLM and has become the core of Apple Intelligence. Meanwhile, Vapor 4, Hummingbird 2, and AWS Lambda Swift are growing on the server, as are Skip (Swift→Android) and Swift Wasm 6 for multiplatform work. Embedded Swift runs without malloc on the Raspberry Pi Pico and ARM Cortex-M. This article summarizes in one place everything an Apple developer needs to know in 2026, from Swift 6 concurrency to SwiftUI 5 animations, App Intents, Swift Testing, Swift Macros, SwiftPM 6, and TCA 1.x.
Sat, 16 May 2026 00:00:00 GMT fjvbn2003@gmail.com (Youngju Kim)
swift, swift-6, swiftdata, foundation-models, swiftui, vapor, tca, apple-intelligence, ios-development

Vision-Language Models (VLMs) 2026 Deep Dive - CLIP, LLaVA, InternVL3, Qwen2.5-VL, GPT-4o, Gemini 2.5, Claude 4.7, DINOv2, SAM 2, and Florence-2
https://www.youngju.dev/blog/culture/2026-05-16-vision-language-models-clip-llava-internvl-qwen-vl-gpt4o-gemini-claude-vlm-2026-deep-dive.en
Everything you need to know about Vision-Language Models in May 2026, in one place. It covers in depth the CLIP family (SigLIP, EVA-CLIP), open VLMs (LLaVA-NeXT, InternVL3, Qwen2.5-VL, Pixtral, Molmo, Idefics3, MiniCPM-V), the closed frontier (GPT-4o, Claude 4.7, Gemini 2.5), vision foundations (DINOv2/v3, SAM 2, Florence-2), training recipes, evaluation (MMMU, MathVista, ChartQA, DocVQA), OCR-centric VLMs, video VLMs, vLLM/SGLang serving, and the VLM scenes in Korea and Japan.
Sat, 16 May 2026 00:00:00 GMT fjvbn2003@gmail.com (Youngju Kim)
vision-language-models, vlm, clip, llava, internvl, qwen-vl, gpt-4o, gemini, claude, dinov2, sam, florence, multimodal, foundation-models

Vision-Language Models (VLM) 2026 Complete Guide - CLIP · LLaVA · InternVL3 · Qwen2.5-VL · GPT-4o · Gemini 2.5 · Claude 4.7 · DINOv2 · SAM 2 · Florence-2 In-Depth Explanation
https://www.youngju.dev/blog/culture/2026-05-16-vision-language-models-clip-llava-internvl-qwen-vl-gpt4o-gemini-claude-vlm-2026-deep-dive.ja
A single-article summary of vision-language models (VLMs) as of May 2026. It digs into the CLIP family (SigLIP, EVA-CLIP), open VLMs (LLaVA-NeXT, InternVL3, Qwen2.5-VL, Pixtral, Molmo, Idefics3, MiniCPM-V), the closed frontier (GPT-4o, Claude 4.7, Gemini 2.5), vision foundations (DINOv2/v3, SAM 2, Florence-2), training recipes, evaluation (MMMU, MathVista, ChartQA, DocVQA), OCR-specialized VLMs, video VLMs, serving with vLLM/SGLang, and the VLM scenes in Korea and Japan.
Sat, 16 May 2026 00:00:00 GMT fjvbn2003@gmail.com (Youngju Kim)
vision-language-models, vlm, clip, llava, internvl, qwen-vl, gpt-4o, gemini, claude, dinov2, sam, florence, multimodal, foundation-models

Vision-Language Models (VLM) 2026 Complete Guide - CLIP · LLaVA · InternVL3 · Qwen2.5-VL · GPT-4o · Gemini 2.5 · Claude 4.7 · DINOv2 · SAM 2 · Florence-2 In-Depth Analysis
https://www.youngju.dev/blog/culture/2026-05-16-vision-language-models-clip-llava-internvl-qwen-vl-gpt4o-gemini-claude-vlm-2026-deep-dive
Everything about vision-language models (VLMs) as of May 2026, in a single article. It covers in depth the CLIP family (SigLIP, EVA-CLIP), open VLMs (LLaVA-NeXT, InternVL3, Qwen2.5-VL, Pixtral, Molmo, Idefics3, MiniCPM-V), closed models (GPT-4o, Claude 4.7, Gemini 2.5), vision foundations (DINOv2/v3, SAM 2, Florence-2), training recipes, evaluation (MMMU, MathVista, ChartQA, DocVQA), OCR-centric VLMs, video VLMs, vLLM/SGLang serving, and the Korean and Japanese VLM scenes.
Sat, 16 May 2026 00:00:00 GMT fjvbn2003@gmail.com (Youngju Kim)
vision-language-models, vlm, clip, llava, internvl, qwen-vl, gpt-4o, gemini, claude, dinov2, sam, florence, multimodal, foundation-models