Chaos and Order

https://www.youngju.dev/blog
천천히 올바르게 (Slowly and correctly). AI Researcher & DevOps Engineer Youngju's tech blog. GPU/CUDA, LLM, MLOps, Kubernetes AI workloads, distributed training, and data engineering.
Language: ko
fjvbn2003@gmail.com (Youngju Kim)
Sat, 16 May 2026 00:00:00 GMT

LLM Landmark Papers Roundup 2026 - Transformer / Scaling Laws / Flash Attention / Mamba / DeepSeek-R1 / Titans Deep Dive
https://www.youngju.dev/blog/culture/2026-05-16-llm-landmark-papers-roundup-2026-transformer-scaling-laws-flash-attention-mamba-deepseek-r1-titans-deep-dive.en
From the 2017 Attention Is All You Need paper to 2026 Titans and DeepSeek-R1, a thematic roundup of the 50+ landmark papers that built the LLM era. Transformer, BERT, GPT 1-3, Scaling Laws, Chinchilla, InstructGPT, PaLM, FlashAttention 1-3, LLaMA 1-4, GPT-4, Mistral and Mixtral, DPO/KTO/ORPO, RWKV, Mamba 1-2, DeepSeek-V3 and R1, o1, Titans, Test-Time Training, Era of Experience, Tulu 3, Sleeper Agents, Scaling Monosemanticity, Mixture of a Million Experts, RoPE, YaRN, Ring Attention, GPTQ, AWQ, BitNet b1.58, DDPM, DiT, LLaVA, MMR1, and more. Each paper gets a one-paragraph treatment with a real arXiv URL, plus reading-group references from PR12, Tunib, Connpass, and the PFN blog, making this the most compressed LLM paper roadmap as of May 2026.
Sat, 16 May 2026 00:00:00 GMT
fjvbn2003@gmail.com (Youngju Kim)
Tags: llm-papers, transformer, scaling-laws, flash-attention, mamba, deepseek-r1, titans, research, ai-papers, 2026, deep-dive, english

LLM Landmark Papers Roadmap 2026 - Transformer / Scaling Laws / Flash Attention / Mamba / DeepSeek-R1 / Titans In-Depth Explanation
https://www.youngju.dev/blog/culture/2026-05-16-llm-landmark-papers-roundup-2026-transformer-scaling-laws-flash-attention-mamba-deepseek-r1-titans-deep-dive.ja
From the 2017 Attention Is All You Need to 2026 Titans and DeepSeek-R1, this post organizes by theme the 50-plus landmark papers that built the LLM era. It covers Transformer, BERT, GPT 1-3, Scaling Laws, Chinchilla, InstructGPT, PaLM, FlashAttention 1-3, LLaMA 1-4, GPT-4, Mistral and Mixtral, DPO/KTO/ORPO, RWKV, Mamba 1-2, DeepSeek-V3 and R1, o1, Titans, Test-Time Training, Era of Experience, Tulu 3, Sleeper Agents, Scaling Monosemanticity, Mixture of a Million Experts, RoPE, YaRN, Ring Attention, GPTQ, AWQ, BitNet b1.58, DDPM, DiT, LLaVA, and MMR1, with a one-paragraph contribution note and a real arXiv URL for each. It also bundles Japanese and Korean reading-group material such as PR12, Tunib, the Connpass paper reading groups, and the PFN blog, making this the most condensed LLM paper roadmap as of May 2026.
Sat, 16 May 2026 00:00:00 GMT
fjvbn2003@gmail.com (Youngju Kim)
Tags: llm-papers, transformer, scaling-laws, flash-attention, mamba, deepseek-r1, titans, research, ai-papers, 2026, deep-dive, 日本語

LLM Landmark Papers 2026 Complete Guide - Transformer · Scaling Laws · Flash Attention · Mamba · DeepSeek-R1 · Titans In-Depth Analysis
https://www.youngju.dev/blog/culture/2026-05-16-llm-landmark-papers-roundup-2026-transformer-scaling-laws-flash-attention-mamba-deepseek-r1-titans-deep-dive
From the 2017 Attention Is All You Need to 2026 Titans and DeepSeek-R1, this post organizes by theme the 50 or so landmark papers that built the LLM era. It covers Transformer · BERT · the GPT series · Scaling Laws · Chinchilla · InstructGPT · PaLM · Flash Attention 1/2/3 · LLaMA 1/2/3/4 · GPT-4 · Mistral · Mixtral · DPO · KTO · ORPO · RWKV · Mamba 1/2 · DeepSeek-V3 · DeepSeek-R1 · o1 · Titans · TTT · Era of Experience · Tülu 3 · Sleeper Agents · Scaling Monosemanticity · Mixture of a Million Experts · RoPE · YaRN · Ring Attention · GPTQ · AWQ · BitNet b1.58 · DDPM · DiT · MMR1, summarizing each paper's contribution and impact in one paragraph and providing the actual arXiv URL. It also gathers Korean and Japanese reading-group material such as PR12, Tunib 잎차이, Japan's Connpass 論文読み会 (paper reading groups), and the PFN blog to build the most compressed LLM paper roadmap as of May 2026.
Sat, 16 May 2026 00:00:00 GMT
fjvbn2003@gmail.com (Youngju Kim)
Tags: llm-papers, transformer, scaling-laws, flash-attention, mamba, deepseek-r1, titans, research, ai-papers, 2026, deep-dive