
  <rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
    <channel>
      <title>Chaos and Order</title>
      <link>https://www.youngju.dev/blog</link>
      <description>Slowly and correctly. AI Researcher &amp; DevOps Engineer Youngju&#39;s tech blog. GPU/CUDA, LLM, MLOps, Kubernetes AI workloads, distributed training, and data engineering.</description>
      <language>ko</language>
      <managingEditor>fjvbn2003@gmail.com (Youngju Kim)</managingEditor>
      <webMaster>fjvbn2003@gmail.com (Youngju Kim)</webMaster>
      <lastBuildDate>Sat, 16 May 2026 00:00:00 GMT</lastBuildDate>
      <atom:link href="https://www.youngju.dev/tags/apache-parquet/feed.xml" rel="self" type="application/rss+xml"/>
      
  <item>
    <guid>https://www.youngju.dev/blog/culture/2026-05-16-data-lakehouse-modern-data-engineering-2026-iceberg-delta-hudi-paimon-tabular-databricks-trino-spark-4-deep-dive.en</guid>
    <title>Data Lakehouse &amp; Modern Data Engineering 2026 — Iceberg / Delta / Hudi / Paimon / Tabular (Databricks acquisition) / Trino / Spark 4 / Flink 2 / DataFusion Deep Dive</title>
    <link>https://www.youngju.dev/blog/culture/2026-05-16-data-lakehouse-modern-data-engineering-2026-iceberg-delta-hudi-paimon-tabular-databricks-trino-spark-4-deep-dive.en</link>
    <description>Data engineering in 2026 has moved past the &quot;data warehouse vs data lake&quot; debate. Apache Iceberg emerged as the winner of the 2024-25 table format war, with Netflix, Apple, LinkedIn, Stripe, and Airbnb all converging on it. Databricks acquired Tabular (founded by Iceberg co-creator Ryan Blue) for over $1B in June 2024, bringing Delta Lake and Iceberg under one roof. Apache Hudi takes its own path through Onehouse, while Apache Paimon (born from the Flink team) defines a new category called &quot;streaming lakehouse.&quot; For processing engines, Spark 4 (Spark Connect) and Flink 2 lead; for query engines, Trino, Presto, DuckDB, and ClickHouse share the field; for transformations, dbt is the standard; for catalogs, Unity Catalog, Polaris, and BigLake compete. This essay maps the 2026 lakehouse landscape in one sweep: table formats, engines, catalogs, commercial services, plus Korean and Japanese case studies.</description>
    <pubDate>Sat, 16 May 2026 00:00:00 GMT</pubDate>
    <author>fjvbn2003@gmail.com (Youngju Kim)</author>
    <category>data-engineering</category><category>data-lakehouse</category><category>apache-iceberg</category><category>delta-lake</category><category>apache-hudi</category><category>apache-paimon</category><category>tabular</category><category>databricks</category><category>snowflake</category><category>bigquery</category><category>dbt</category><category>trino</category><category>presto</category><category>spark</category><category>flink</category><category>clickhouse</category><category>duckdb</category><category>apache-arrow</category><category>apache-parquet</category><category>apache-orc</category><category>datafusion</category><category>onehouse</category><category>2026</category><category>deep-dive</category><category>english</category>
  </item>

  <item>
    <guid>https://www.youngju.dev/blog/culture/2026-05-16-data-lakehouse-modern-data-engineering-2026-iceberg-delta-hudi-paimon-tabular-databricks-trino-spark-4-deep-dive.ja</guid>
    <title>Data Lakehouse &amp; Modern Data Engineering 2026: Iceberg / Delta / Hudi / Paimon / Tabular (Databricks acquisition) / Trino / Spark 4 / Flink 2 / DataFusion Complete Guide</title>
    <link>https://www.youngju.dev/blog/culture/2026-05-16-data-lakehouse-modern-data-engineering-2026-iceberg-delta-hudi-paimon-tabular-databricks-trino-spark-4-deep-dive.ja</link>
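    <description>Data engineering in 2026 is no longer the era of &quot;data warehouse versus data lake.&quot; The 2024-25 table format war ended in victory for Apache Iceberg, and Netflix, Apple, LinkedIn, Stripe, and Airbnb all converged on it. In June 2024, Databricks acquired Tabular, the company of Iceberg co-creator Ryan Blue, for over $1 billion, bringing Delta Lake and Iceberg under one roof. Meanwhile, Apache Hudi follows its own path through the commercial company Onehouse, and Apache Paimon, which came out of the Flink team, defines a new category called &quot;streaming lakehouse.&quot; The processing engines are Spark 4 (Spark Connect) and Flink 2; the query engines are Trino, Presto, DuckDB, and ClickHouse; dbt is the standard for transformations; and catalogs are split among Unity Catalog, Polaris, and BigLake. This article lays out the full 2026 data lakehouse landscape in one pass: table formats, engines, catalogs, commercial services, and Japanese and Korean case studies.</description>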
    <pubDate>Sat, 16 May 2026 00:00:00 GMT</pubDate>
    <author>fjvbn2003@gmail.com (Youngju Kim)</author>
    <category>data-engineering</category><category>data-lakehouse</category><category>apache-iceberg</category><category>delta-lake</category><category>apache-hudi</category><category>apache-paimon</category><category>tabular</category><category>databricks</category><category>snowflake</category><category>bigquery</category><category>dbt</category><category>trino</category><category>presto</category><category>spark</category><category>flink</category><category>clickhouse</category><category>duckdb</category><category>apache-arrow</category><category>apache-parquet</category><category>apache-orc</category><category>datafusion</category><category>onehouse</category><category>2026</category><category>deep-dive</category><category>日本語</category>
  </item>

  <item>
    <guid>https://www.youngju.dev/blog/culture/2026-05-16-data-lakehouse-modern-data-engineering-2026-iceberg-delta-hudi-paimon-tabular-databricks-trino-spark-4-deep-dive</guid>
    <title>Data Lakehouse &amp; Modern Data Engineering 2026: Iceberg / Delta / Hudi / Paimon / Tabular (Databricks acquisition) / Trino / Spark 4 / Flink 2 / DataFusion In-Depth Guide</title>
    <link>https://www.youngju.dev/blog/culture/2026-05-16-data-lakehouse-modern-data-engineering-2026-iceberg-delta-hudi-paimon-tabular-databricks-trino-spark-4-deep-dive</link>
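    <description>Data engineering in 2026 is no longer the era of &quot;data warehouse vs data lake.&quot; As Apache Iceberg emerged as the winner of the 2024-25 table format war, Netflix, Apple, LinkedIn, Stripe, and Airbnb all converged on it, and in June 2024 Databricks acquired Tabular, the company of Iceberg co-creator Ryan Blue, for over $1B, bringing Delta Lake and Iceberg under one roof. Meanwhile, Apache Hudi goes its own way through Onehouse, its commercialization company, and Apache Paimon, built by the Flink team, defines a new category called the streaming lakehouse. Processing is split between Spark 4 (Spark Connect) and Flink 2, querying among Trino, Presto, DuckDB, and ClickHouse, transformation belongs to dbt, and catalogs are divided among Unity Catalog, Polaris, and BigLake. This post summarizes the 2026 data lakehouse landscape in one pass: table formats, engines, catalogs, commercial services, and Korean and Japanese case studies.</description>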
    <pubDate>Sat, 16 May 2026 00:00:00 GMT</pubDate>
    <author>fjvbn2003@gmail.com (Youngju Kim)</author>
    <category>data-engineering</category><category>data-lakehouse</category><category>apache-iceberg</category><category>delta-lake</category><category>apache-hudi</category><category>apache-paimon</category><category>tabular</category><category>databricks</category><category>snowflake</category><category>bigquery</category><category>dbt</category><category>trino</category><category>presto</category><category>spark</category><category>flink</category><category>clickhouse</category><category>duckdb</category><category>apache-arrow</category><category>apache-parquet</category><category>apache-orc</category><category>datafusion</category><category>onehouse</category><category>2026</category><category>deep-dive</category>
  </item>

    </channel>
  </rss>
