
  <rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
    <channel>
      <title>Chaos and Order</title>
      <link>https://www.youngju.dev/blog</link>
      <description>Slowly and correctly. AI Researcher &amp; DevOps Engineer Youngju&#39;s tech blog. GPU/CUDA, LLM, MLOps, Kubernetes AI workloads, distributed training, and data engineering.</description>
      <language>ko</language>
      <managingEditor>fjvbn2003@gmail.com (Youngju Kim)</managingEditor>
      <webMaster>fjvbn2003@gmail.com (Youngju Kim)</webMaster>
      <lastBuildDate>Sat, 16 May 2026 00:00:00 GMT</lastBuildDate>
      <atom:link href="https://www.youngju.dev/tags/florence/feed.xml" rel="self" type="application/rss+xml"/>
      
  <item>
    <guid>https://www.youngju.dev/blog/culture/2026-05-16-vision-language-models-clip-llava-internvl-qwen-vl-gpt4o-gemini-claude-vlm-2026-deep-dive.en</guid>
    <title>Vision-Language Models (VLMs) 2026 Deep Dive — CLIP, LLaVA, InternVL3, Qwen2.5-VL, GPT-4o, Gemini 2.5, Claude 4.7, DINOv2, SAM 2, and Florence-2</title>
    <link>https://www.youngju.dev/blog/culture/2026-05-16-vision-language-models-clip-llava-internvl-qwen-vl-gpt4o-gemini-claude-vlm-2026-deep-dive.en</link>
    <description>Everything you need to know about Vision-Language Models in May 2026 in one place. CLIP family (SigLIP, EVA-CLIP), open VLMs (LLaVA-NeXT, InternVL3, Qwen2.5-VL, Pixtral, Molmo, Idefics3, MiniCPM-V), closed frontier (GPT-4o, Claude 4.7, Gemini 2.5), vision foundations (DINOv2/v3, SAM 2, Florence-2), training recipes, evaluation (MMMU, MathVista, ChartQA, DocVQA), OCR-centric VLMs, video VLMs, vLLM/SGLang serving, and the VLM scenes in Korea and Japan — covered in depth.</description>
    <pubDate>Sat, 16 May 2026 00:00:00 GMT</pubDate>
    <author>fjvbn2003@gmail.com (Youngju Kim)</author>
    <category>vision-language-models</category><category>vlm</category><category>clip</category><category>llava</category><category>internvl</category><category>qwen-vl</category><category>gpt-4o</category><category>gemini</category><category>claude</category><category>dinov2</category><category>sam</category><category>florence</category><category>multimodal</category><category>foundation-models</category>
  </item>

  <item>
    <guid>https://www.youngju.dev/blog/culture/2026-05-16-vision-language-models-clip-llava-internvl-qwen-vl-gpt4o-gemini-claude-vlm-2026-deep-dive.ja</guid>
    <title>Vision-Language Models (VLM) 2026 Complete Guide: CLIP, LLaVA, InternVL3, Qwen2.5-VL, GPT-4o, Gemini 2.5, Claude 4.7, DINOv2, SAM 2, and Florence-2 Explained in Depth</title>
    <link>https://www.youngju.dev/blog/culture/2026-05-16-vision-language-models-clip-llava-internvl-qwen-vl-gpt4o-gemini-claude-vlm-2026-deep-dive.ja</link>
    <description>A single-article roundup of vision-language models (VLMs) as of May 2026. It covers in depth the CLIP family (SigLIP, EVA-CLIP), open VLMs (LLaVA-NeXT, InternVL3, Qwen2.5-VL, Pixtral, Molmo, Idefics3, MiniCPM-V), the closed frontier (GPT-4o, Claude 4.7, Gemini 2.5), vision foundations (DINOv2/v3, SAM 2, Florence-2), training recipes, evaluation (MMMU, MathVista, ChartQA, DocVQA), OCR-specialized VLMs, video VLMs, serving with vLLM/SGLang, and the VLM scenes in Korea and Japan.</description>
    <pubDate>Sat, 16 May 2026 00:00:00 GMT</pubDate>
    <author>fjvbn2003@gmail.com (Youngju Kim)</author>
    <category>vision-language-models</category><category>vlm</category><category>clip</category><category>llava</category><category>internvl</category><category>qwen-vl</category><category>gpt-4o</category><category>gemini</category><category>claude</category><category>dinov2</category><category>sam</category><category>florence</category><category>multimodal</category><category>foundation-models</category>
  </item>

  <item>
    <guid>https://www.youngju.dev/blog/culture/2026-05-16-vision-language-models-clip-llava-internvl-qwen-vl-gpt4o-gemini-claude-vlm-2026-deep-dive</guid>
    <title>Vision-Language Models (VLM) 2026 Complete Guide - CLIP, LLaVA, InternVL3, Qwen2.5-VL, GPT-4o, Gemini 2.5, Claude 4.7, DINOv2, SAM 2, and Florence-2 In-Depth Analysis</title>
    <link>https://www.youngju.dev/blog/culture/2026-05-16-vision-language-models-clip-llava-internvl-qwen-vl-gpt4o-gemini-claude-vlm-2026-deep-dive</link>
    <description>Everything about vision-language models (VLMs) as of May 2026, in a single post. It runs from the CLIP family (SigLIP, EVA-CLIP) through open VLMs (LLaVA-NeXT, InternVL3, Qwen2.5-VL, Pixtral, Molmo, Idefics3, MiniCPM-V), closed models (GPT-4o, Claude 4.7, Gemini 2.5), vision foundations (DINOv2/v3, SAM 2, Florence-2), training recipes, evaluation (MMMU, MathVista, ChartQA, DocVQA), OCR-centric VLMs, video VLMs, and vLLM/SGLang serving, to the VLM scenes in Korea and Japan, all organized in depth.</description>
    <pubDate>Sat, 16 May 2026 00:00:00 GMT</pubDate>
    <author>fjvbn2003@gmail.com (Youngju Kim)</author>
    <category>vision-language-models</category><category>vlm</category><category>clip</category><category>llava</category><category>internvl</category><category>qwen-vl</category><category>gpt-4o</category><category>gemini</category><category>claude</category><category>dinov2</category><category>sam</category><category>florence</category><category>multimodal</category><category>foundation-models</category>
  </item>

    </channel>
  </rss>
