Core Concepts | Last updated: May 14, 2026

Iceberg Open Table Format vs. Delta Lake vs. Apache Hudi

Apache Iceberg, Delta Lake, and Apache Hudi are the three dominant open table formats competing to be the storage foundation of the data lakehouse, each with different governance, ecosystem support, and feature trade-offs.



The open table format landscape is dominated by three projects: Apache Iceberg, Delta Lake (primarily maintained by Databricks), and Apache Hudi (primarily maintained by Onehouse/Uber). Each emerged from different companies to solve the same core problems with raw data lake storage, but with different design choices, governance models, and trade-offs.

Feature Comparison

| Feature | Apache Iceberg | Delta Lake | Apache Hudi |
| --- | --- | --- | --- |
| Governance | Apache Software Foundation (ASF) | Linux Foundation | Apache Software Foundation (ASF) |
| Primary backer | Broad ecosystem (Netflix origin) | Databricks | Onehouse (Uber origin) |
| ACID transactions | Yes | Yes | Yes |
| Schema evolution | Yes | Yes | Partial |
| Time travel | Yes | Yes | Yes |
| Hidden partitioning | Yes | No | No |
| Partition evolution | Yes | No | No |
| Row-level deletes | Yes (spec v2) | Yes | Yes |
| Upserts (MERGE) | Yes | Yes | Yes (native) |
| Streaming write | Yes (Flink/Spark) | Yes (Spark) | Yes |
| Engine neutrality | Excellent | Good (Spark-centric) | Good |
| REST Catalog spec | Yes | No | No |
| Credential vending | Yes (via REST Catalog) | No | No |
| Branching/tagging | Yes (table-level) | No | No |
| Python client | PyIceberg | delta-rs | PyHudi |
| DuckDB support | Yes | Yes | Limited |
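Hidden partitioning, unique to Iceberg in the table above, means partition values are derived from column transforms rather than maintained as extra user-visible columns. As a minimal plain-Python sketch (not Iceberg code itself): the Iceberg spec defines the `day()` transform as whole days since the Unix epoch, so a writer can bucket rows and a reader can prune files from a filter on the source timestamp column alone.

```python
from datetime import datetime, timezone

# Iceberg's day() partition transform maps a timestamp to whole days
# since the Unix epoch (per the Iceberg table spec).
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def day_transform(ts: datetime) -> int:
    """Days since epoch, as Iceberg's day() transform computes it."""
    return (ts - EPOCH).days

rows = [
    {"event_ts": datetime(2026, 5, 14, 9, 30, tzinfo=timezone.utc), "user": "a"},
    {"event_ts": datetime(2026, 5, 14, 17, 0, tzinfo=timezone.utc), "user": "b"},
    {"event_ts": datetime(2026, 5, 15, 8, 0, tzinfo=timezone.utc), "user": "c"},
]

# Writers group rows by the derived value; readers filtering on event_ts
# get partition pruning without ever referencing a partition column.
partitions: dict[int, list[dict]] = {}
for row in rows:
    partitions.setdefault(day_transform(row["event_ts"]), []).append(row)

print(sorted(partitions))  # two distinct day partitions
```

Because the transform lives in table metadata, the table can later switch to, say, `hour()` (partition evolution) without rewriting old data or changing queries.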

Governance: The Critical Difference

The single most important strategic difference between these formats is governance. Apache Iceberg and Apache Hudi are governed by the vendor-neutral Apache Software Foundation, while Delta Lake, although hosted by the Linux Foundation, is in practice developed and roadmapped by Databricks.

For organizations concerned about vendor lock-in, Apache Iceberg and Apache Hudi’s ASF governance provides stronger neutrality guarantees than Delta Lake’s de-facto Databricks control.

Ecosystem and Engine Support

Apache Iceberg has the broadest multi-engine support: Spark, Flink, Trino, Dremio, Snowflake, DuckDB, Amazon Athena, and Google BigQuery all offer native Iceberg support.

Delta Lake has excellent Spark support and increasingly good support for other engines via the Delta Kernel and delta-rs, but Databricks remains the primary optimized runtime.

Apache Hudi has strong streaming upsert capabilities (native merge-on-read (MOR) streaming with Flink) and good Spark support, with growing support for other engines.

Why Apache Iceberg Won (and Is Winning) the Format Wars

Over 2024–2026, Apache Iceberg emerged as the clear ecosystem standard:

  1. Snowflake adopted Iceberg: Snowflake Open Catalog and native Iceberg table support.
  2. Apache Polaris: Co-created by Dremio and Snowflake, donated to Apache — the neutral reference catalog.
  3. AWS native support: Amazon Athena, S3 Tables, AWS Glue all have native Iceberg support.
  4. Google BigQuery Iceberg: BigQuery can read and write Iceberg tables.
  5. DuckDB + PyIceberg: The Python and data science ecosystem chose Iceberg.
  6. The REST Catalog standard: Iceberg’s catalog interoperability story has no equivalent in Delta Lake or Hudi.
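The REST Catalog's value is that it reduces catalog interoperability to a small, standard HTTP surface any engine can target. A sketch of the endpoint layout follows: the `/v1/config` and load-table paths come from the Iceberg REST Catalog specification, while the host, prefix, and table names here are hypothetical placeholders.

```python
from urllib.parse import quote

# Hypothetical catalog endpoint; only the path shapes below are standard.
BASE = "https://catalog.example.com/api"

def config_url() -> str:
    # First call any REST Catalog client makes: fetch catalog
    # capabilities and configuration overrides.
    return f"{BASE}/v1/config"

def load_table_url(prefix: str, namespace: str, table: str) -> str:
    # Load-table endpoint: returns table metadata and, on catalogs that
    # support it, vended storage credentials scoped to this table.
    ns = quote(namespace, safe="")
    return f"{BASE}/v1/{prefix}/namespaces/{ns}/tables/{quote(table, safe='')}"

print(config_url())
print(load_table_url("prod", "analytics", "events"))
```

Because every conforming catalog (Polaris, Snowflake Open Catalog, and others) exposes these same paths, an engine written against the spec works with all of them unchanged.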

While Delta Lake remains popular in Databricks-centric deployments and Hudi has a loyal following in streaming-first architectures, Apache Iceberg has become the format of choice for multi-engine, multi-cloud lakehouses.

Migration Between Formats

Migrating between table formats is possible but non-trivial. The two common paths are in-place metadata translation (e.g., Apache XTable, which rewrites only table metadata and leaves the underlying Parquet data files untouched) and a full rewrite of the data into the target format.
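The full-rewrite path can be sketched as a CTAS (CREATE TABLE AS SELECT) statement. The helper below only builds the SQL string; the catalog and table names are hypothetical, and running it assumes a Spark session with both the source format and the Iceberg runtime configured.

```python
def ctas_migration_sql(src_table: str, dst_iceberg_table: str) -> str:
    """Build a Spark SQL CTAS statement that rewrites src_table's data
    into a new Iceberg table. Simple, but rereads and rewrites all data."""
    return (
        f"CREATE TABLE {dst_iceberg_table} USING iceberg "
        f"AS SELECT * FROM {src_table}"
    )

# Hypothetical Delta-to-Iceberg rewrite:
print(ctas_migration_sql("delta_cat.sales.orders", "ice_cat.sales.orders"))
```

A full rewrite is the most portable option but doubles storage during the cutover and loses the source table's history, which is why metadata-translation tools are attractive for large tables.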

For new projects in 2025+, Apache Iceberg is the clear default choice for multi-engine, vendor-neutral lakehouses.

📚 Go Deeper on Apache Iceberg

Alex Merced has authored three hands-on books covering Apache Iceberg, the Agentic Lakehouse, and modern data architecture. Pick up a copy to master the full ecosystem.
