
  <rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
    <channel>
      <title>Chaos and Order</title>
      <link>https://www.youngju.dev/blog</link>
      <description>Slowly and correctly. AI Researcher &amp; DevOps Engineer Youngju&#39;s tech blog. GPU/CUDA, LLM, MLOps, Kubernetes AI workloads, distributed training, and data engineering.</description>
      <language>ko</language>
      <managingEditor>fjvbn2003@gmail.com (Youngju Kim)</managingEditor>
      <webMaster>fjvbn2003@gmail.com (Youngju Kim)</webMaster>
      <lastBuildDate>Tue, 16 Jun 2026 00:00:00 GMT</lastBuildDate>
      <atom:link href="https://www.youngju.dev/tags/memory-wall/feed.xml" rel="self" type="application/rss+xml"/>
      
  <item>
    <guid>https://www.youngju.dev/blog/gpu-cuda/2026-06-16-cerebras-wafer-scale-deep-dive.en</guid>
    <title>Cerebras Wafer-Scale Deep Dive — A Whole Model on a Single Chip</title>
    <link>https://www.youngju.dev/blog/gpu-cuda/2026-06-16-cerebras-wafer-scale-deep-dive.en</link>
    <description>A close look at the design of the Cerebras WSE-3, a single chip carved from an entire wafer. We cover the on-chip SRAM-centric structure that routes around the memory wall, the fault-tolerant design, real-time inference performance, and the trade-offs versus GPU clusters.</description>
    <pubDate>Tue, 16 Jun 2026 00:00:00 GMT</pubDate>
    <author>fjvbn2003@gmail.com (Youngju Kim)</author>
    <category>cerebras</category><category>wafer-scale</category><category>ai-hardware</category><category>memory-wall</category><category>inference</category><category>wse-3</category>
  </item>

  <item>
    <guid>https://www.youngju.dev/blog/gpu-cuda/2026-06-16-cerebras-wafer-scale-deep-dive.ja</guid>
    <title>Cerebras Wafer-Scale Deep Dive: A Whole Model on a Single Chip</title>
    <link>https://www.youngju.dev/blog/gpu-cuda/2026-06-16-cerebras-wafer-scale-deep-dive.ja</link>
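    <description>A deep dive into the design of the Cerebras WSE-3, which turns an entire wafer into a single chip. We cover the on-chip SRAM-centric structure that sidesteps the memory wall, the defect-tolerant design, real-time inference performance, and the trade-offs versus GPU clusters.</description>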
    <pubDate>Tue, 16 Jun 2026 00:00:00 GMT</pubDate>
    <author>fjvbn2003@gmail.com (Youngju Kim)</author>
    <category>cerebras</category><category>wafer-scale</category><category>ai-hardware</category><category>memory-wall</category><category>inference</category><category>wse-3</category>
  </item>

  <item>
    <guid>https://www.youngju.dev/blog/gpu-cuda/2026-06-16-cerebras-wafer-scale-deep-dive</guid>
    <title>Cerebras Wafer-Scale Deep Dive: A Whole Model on a Single Chip</title>
    <link>https://www.youngju.dev/blog/gpu-cuda/2026-06-16-cerebras-wafer-scale-deep-dive</link>
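    <description>A deep dive into the design of the Cerebras WSE-3, which turns an entire wafer into a single chip. We cover the on-chip SRAM-centric structure that bypasses the memory wall, the fault-tolerant design, real-time inference performance, and the pros and cons compared with GPU clusters.</description>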
    <pubDate>Tue, 16 Jun 2026 00:00:00 GMT</pubDate>
    <author>fjvbn2003@gmail.com (Youngju Kim)</author>
    <category>cerebras</category><category>wafer-scale</category><category>ai-hardware</category><category>memory-wall</category><category>inference</category><category>wse-3</category>
  </item>

  <item>
    <guid>https://www.youngju.dev/blog/gpu-cuda/2026-06-16-in-memory-computing-principles.en</guid>
    <title>In-Memory Computing Principles — Computing Inside the Memory</title>
    <link>https://www.youngju.dev/blog/gpu-cuda/2026-06-16-in-memory-computing-principles.en</link>
    <description>A deep look at the principles of compute-in-memory (CIM): computing directly inside memory instead of moving data to a compute unit. We cover solving a matrix multiply in one shot with a crossbar array, the difference between analog and digital approaches, the precision-versus-noise trade-off, and 2026 research trends and commercialization challenges.</description>
    <pubDate>Tue, 16 Jun 2026 00:00:00 GMT</pubDate>
    <author>fjvbn2003@gmail.com (Youngju Kim)</author>
    <category>in-memory-computing</category><category>compute-in-memory</category><category>ai-hardware</category><category>crossbar</category><category>reram</category><category>memory-wall</category>
  </item>

  <item>
    <guid>https://www.youngju.dev/blog/gpu-cuda/2026-06-16-in-memory-computing-principles.ja</guid>
    <title>In-Memory Computing Principles: Computing Inside the Memory</title>
    <link>https://www.youngju.dev/blog/gpu-cuda/2026-06-16-in-memory-computing-principles.ja</link>
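    <description>A deep look at the principles of compute-in-memory (CIM), which computes directly inside memory instead of carrying data to a compute unit. We cover how a crossbar array solves a matrix multiply in one step, the differences between analog and digital approaches, the precision-versus-noise trade-off, and 2026 research trends and commercialization challenges.</description>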
    <pubDate>Tue, 16 Jun 2026 00:00:00 GMT</pubDate>
    <author>fjvbn2003@gmail.com (Youngju Kim)</author>
    <category>in-memory-computing</category><category>compute-in-memory</category><category>ai-hardware</category><category>crossbar</category><category>reram</category><category>memory-wall</category>
  </item>

  <item>
    <guid>https://www.youngju.dev/blog/gpu-cuda/2026-06-16-in-memory-computing-principles</guid>
    <title>In-Memory Computing Principles: Computing in Memory</title>
    <link>https://www.youngju.dev/blog/gpu-cuda/2026-06-16-in-memory-computing-principles</link>
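    <description>A deep look at the principles of compute-in-memory (CIM), which computes directly inside memory instead of moving data to a compute unit. We cover how a crossbar array solves a matrix multiply in one step, the differences between analog and digital approaches, the precision-versus-noise trade-off, and 2026 research trends and commercialization challenges.</description>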
    <pubDate>Tue, 16 Jun 2026 00:00:00 GMT</pubDate>
    <author>fjvbn2003@gmail.com (Youngju Kim)</author>
    <category>in-memory-computing</category><category>compute-in-memory</category><category>ai-hardware</category><category>crossbar</category><category>reram</category><category>memory-wall</category>
  </item>

  <item>
    <guid>https://www.youngju.dev/blog/gpu-cuda/2026-06-16-memory-wall-hbm-bandwidth.en</guid>
    <title>The Memory Wall and HBM — The Real Bottleneck That Divides AI Performance</title>
    <link>https://www.youngju.dev/blog/gpu-cuda/2026-06-16-memory-wall-hbm-bandwidth.en</link>
    <description>In an era where compute is cheap and data movement is expensive, the real bottleneck of AI performance is memory. This post covers the memory-wall concept, HBM generations, the roofline model and arithmetic intensity, the KV cache, and how quantization saves bandwidth, all from a developer&#39;s perspective.</description>
    <pubDate>Tue, 16 Jun 2026 00:00:00 GMT</pubDate>
    <author>fjvbn2003@gmail.com (Youngju Kim)</author>
    <category>memory-wall</category><category>hbm</category><category>bandwidth</category><category>roofline</category><category>inference</category><category>ai-hardware</category><category>quantization</category>
  </item>

  <item>
    <guid>https://www.youngju.dev/blog/gpu-cuda/2026-06-16-memory-wall-hbm-bandwidth.ja</guid>
    <title>The Memory Wall and HBM: The Real Bottleneck That Divides AI Performance</title>
    <link>https://www.youngju.dev/blog/gpu-cuda/2026-06-16-memory-wall-hbm-bandwidth.ja</link>
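    <description>Now that compute has become cheap and data movement expensive, the real bottleneck of AI performance is memory. This post covers the memory-wall concept, HBM generations, the roofline model and arithmetic intensity, the KV cache, and bandwidth savings through quantization, from a developer&#39;s perspective.</description>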
    <pubDate>Tue, 16 Jun 2026 00:00:00 GMT</pubDate>
    <author>fjvbn2003@gmail.com (Youngju Kim)</author>
    <category>memory-wall</category><category>hbm</category><category>bandwidth</category><category>roofline</category><category>inference</category><category>ai-hardware</category><category>quantization</category>
  </item>

  <item>
    <guid>https://www.youngju.dev/blog/gpu-cuda/2026-06-16-memory-wall-hbm-bandwidth</guid>
    <title>The Memory Wall and HBM: The Real Bottleneck That Divides AI Performance</title>
    <link>https://www.youngju.dev/blog/gpu-cuda/2026-06-16-memory-wall-hbm-bandwidth</link>
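    <description>Now that compute has become cheap and data movement expensive, the real bottleneck of AI performance is memory. This post covers the memory-wall concept, HBM generations, the roofline model and arithmetic intensity, the KV cache, and how to save bandwidth with quantization, from a developer&#39;s perspective.</description>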
    <pubDate>Tue, 16 Jun 2026 00:00:00 GMT</pubDate>
    <author>fjvbn2003@gmail.com (Youngju Kim)</author>
    <category>memory-wall</category><category>hbm</category><category>bandwidth</category><category>roofline</category><category>inference</category><category>ai-hardware</category><category>quantization</category>
  </item>

    </channel>
  </rss>
