
  <rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
    <channel>
      <title>Chaos and Order</title>
      <link>https://www.youngju.dev/blog</link>
      <description>Slowly and correctly. AI Researcher &amp; DevOps Engineer Youngju&#39;s tech blog. GPU/CUDA, LLM, MLOps, Kubernetes AI workloads, distributed training, and data engineering.</description>
      <language>ko</language>
      <managingEditor>fjvbn2003@gmail.com (Youngju Kim)</managingEditor>
      <webMaster>fjvbn2003@gmail.com (Youngju Kim)</webMaster>
      <lastBuildDate>Fri, 26 Jun 2026 00:00:00 GMT</lastBuildDate>
      <atom:link href="https://www.youngju.dev/tags/qwen2-vl/feed.xml" rel="self" type="application/rss+xml"/>
      
  <item>
    <guid>https://www.youngju.dev/blog/llm/2026-06-26-vision-language-model-architecture.en</guid>
    <title>Vision LLM Architecture — How an Image Becomes Language</title>
    <link>https://www.youngju.dev/blog/llm/2026-06-26-vision-language-model-architecture.en</link>
    <description>A vision-language model processes an image with a vision encoder, then passes it through a projector to produce tokens an LLM can read. From patch embedding to arbitrary-resolution handling, we trace the full path by which an image turns into language tokens.</description>
    <pubDate>Fri, 26 Jun 2026 00:00:00 GMT</pubDate>
    <author>fjvbn2003@gmail.com (Youngju Kim)</author>
    <category>llm</category><category>vision-language-model</category><category>multimodal</category><category>vit</category><category>qwen2-vl</category><category>architecture</category>
  </item>

  <item>
    <guid>https://www.youngju.dev/blog/llm/2026-06-26-vision-language-model-architecture.ja</guid>
    <title>Vision LLM Architecture: How an Image Becomes Language</title>
    <link>https://www.youngju.dev/blog/llm/2026-06-26-vision-language-model-architecture.ja</link>
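    <description>A vision-language model processes an image with a vision encoder and converts it, through a projector, into tokens an LLM can read. From patch embedding to arbitrary-resolution handling, we take a structural look at the entire process by which an image becomes language tokens.</description>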
    <pubDate>Fri, 26 Jun 2026 00:00:00 GMT</pubDate>
    <author>fjvbn2003@gmail.com (Youngju Kim)</author>
    <category>llm</category><category>vision-language-model</category><category>multimodal</category><category>vit</category><category>qwen2-vl</category><category>architecture</category>
  </item>

  <item>
    <guid>https://www.youngju.dev/blog/llm/2026-06-26-vision-language-model-architecture</guid>
    <title>Vision LLM Architecture: How an Image Becomes Language</title>
    <link>https://www.youngju.dev/blog/llm/2026-06-26-vision-language-model-architecture</link>
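    <description>A vision-language model processes an image with a vision encoder, then passes it through a projector to turn it into tokens an LLM can understand. From patch embedding to arbitrary-resolution handling, we examine the structure of the whole process by which an image becomes language tokens.</description>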
    <pubDate>Fri, 26 Jun 2026 00:00:00 GMT</pubDate>
    <author>fjvbn2003@gmail.com (Youngju Kim)</author>
    <category>llm</category><category>vision-language-model</category><category>multimodal</category><category>vit</category><category>qwen2-vl</category><category>architecture</category>
  </item>

    </channel>
  </rss>
