Engines & Integrations Last updated: May 14, 2026

Dremio and Apache Iceberg

Dremio is an Agentic Lakehouse platform that provides a fully integrated Iceberg experience through its Intelligent Query Engine, AI Semantic Layer, and Open Catalog powered by Apache Polaris, available as Dremio Cloud (fully managed) and Dremio Enterprise (self-managed).

Dremio is a lakehouse platform for AI and analytics, built around Apache Iceberg as its foundational table format. Dremio describes itself as “The Agentic Lakehouse”: a platform designed for the AI era that gives both AI agents and human analysts the context, access, and speed they need to work with data.

Dremio’s relationship with Apache Iceberg goes beyond simple support: Dremio co-created Apache Polaris (alongside Snowflake), an Apache-governed open source implementation of the Iceberg REST Catalog specification. Apache Iceberg and the open lakehouse ecosystem are central to Dremio’s product strategy and technical identity.

Dremio Products

Dremio offers three product tiers:

Dremio Cloud

The fully managed lakehouse platform for Agentic AI. Runs in AWS and Azure (North American and European regions). Dremio Cloud includes:

Dremio Enterprise

Self-managed software that runs on Kubernetes, on-premises, or in any cloud. Provides the same core capabilities as Dremio Cloud for organizations with strict data residency or regulatory requirements.

Dremio Community Edition

A free query engine for local machines or servers. Ideal for development, learning, and evaluation of Iceberg workloads without cloud infrastructure.

Core Capabilities for Apache Iceberg

Intelligent Query Engine

Dremio’s query engine is built for Iceberg performance. It natively reads Iceberg’s metadata hierarchy (manifest lists, manifest files, and column statistics) and uses it to prune partitions and skip data files that cannot match a query. Dremio’s Reflections feature (pre-computed materializations) can accelerate frequently run Iceberg queries, in some cases by orders of magnitude.
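
To make the data-skipping idea concrete, here is a minimal Python sketch of pruning files by their per-column min/max statistics. It is not Dremio’s implementation: the `DataFileStats` class, the `files_to_scan` helper, and the file names are hypothetical, and real Iceberg manifests carry bounds per column rather than for a single filter column.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataFileStats:
    """Hypothetical stand-in for the column bounds stored with a manifest entry."""
    path: str
    lower: int  # minimum value of the filter column in this file
    upper: int  # maximum value of the filter column in this file


def files_to_scan(files, lo, hi):
    """Keep only files whose [lower, upper] range overlaps the query range [lo, hi]."""
    return [f for f in files if f.upper >= lo and f.lower <= hi]


# Illustrative manifest contents (made-up paths and bounds).
manifest = [
    DataFileStats("orders-0001.parquet", lower=1, upper=1000),
    DataFileStats("orders-0002.parquet", lower=1001, upper=2000),
    DataFileStats("orders-0003.parquet", lower=2001, upper=3000),
]

# Similar in spirit to: WHERE order_id BETWEEN 1500 AND 1600
selected = files_to_scan(manifest, 1500, 1600)
print([f.path for f in selected])
```

Only the file whose bounds overlap the predicate is read. The other two are skipped using metadata alone, without opening any data file.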

Open Catalog (Powered by Apache Polaris)

Dremio’s Open Catalog capability implements the Iceberg REST Catalog specification, built on Apache Polaris. This means:

AI Semantic Layer

Dremio’s AI Semantic Layer translates Iceberg table data into AI-readable context:

This semantic layer is what enables AI agents and LLMs to generate correct, trustworthy SQL against Iceberg tables without human curation of every prompt.

AI Agent

Dremio’s built-in AI Agent uses the AI Semantic Layer to answer data questions autonomously: it converts natural-language questions into SQL, executes them against Iceberg tables, and returns the results. The AI Agent is a first-class consumer of the semantic layer.

Table Optimization (Compaction)

Dremio provides the OPTIMIZE TABLE command for Iceberg compaction:

-- Basic optimization
OPTIMIZE TABLE db.orders;

-- With explicit settings
OPTIMIZE TABLE db.orders
REWRITE DATA USING BIN_PACK
(TARGET_FILE_SIZE_MB = 256, MIN_FILE_SIZE_MB = 64);

Dremio Cloud supports automatic background optimization, keeping Iceberg tables performant without manual maintenance schedules.
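
As a rough illustration of what bin-pack compaction does, the Python sketch below groups small files into rewrite groups bounded by a target size. It is a generic first-fit-decreasing heuristic, not Dremio’s actual planner. The `plan_bin_pack` function and the file sizes are hypothetical; the defaults simply mirror the TARGET_FILE_SIZE_MB and MIN_FILE_SIZE_MB values in the SQL above.

```python
def plan_bin_pack(file_sizes_mb, target_mb=256, min_mb=64):
    """Group files smaller than min_mb into bins of at most target_mb each."""
    # Only files below the minimum size are candidates for rewriting.
    small = sorted((s for s in file_sizes_mb if s < min_mb), reverse=True)
    bins = []
    for size in small:
        # First-fit decreasing: place the file in the first bin with room.
        for b in bins:
            if sum(b) + size <= target_mb:
                b.append(size)
                break
        else:
            bins.append([size])
    # A bin holding a single file would be rewritten to itself, so skip it.
    return [b for b in bins if len(b) > 1]


# Illustrative file sizes in MB (made-up values).
sizes = [300, 60, 50, 45, 40, 30, 20, 10, 63, 62, 200]
plan = plan_bin_pack(sizes)
for group in plan:
    print(f"rewrite {len(group)} files -> one file of about {sum(group)} MB")
```

Files already at or above the minimum size (300 MB and 200 MB here) are left alone, while each group of small files would be rewritten into one larger file.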

Dremio’s Role in the Iceberg Ecosystem

Beyond its product capabilities, Dremio is a leading contributor to the Apache Iceberg ecosystem:

Getting Started

The fastest path to working with Apache Iceberg via Dremio:

  1. Sign up for Dremio Cloud (free tier, no credit card required).
  2. Add your object storage as a data source (S3, ADLS, GCS).
  3. Use the Open Catalog to create Iceberg namespaces and tables.
  4. Query immediately with the Intelligent Query Engine.
  5. Connect other engines (Spark, PyIceberg) via the REST Catalog API.
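
For step 5, a PyIceberg connection might look roughly like the sketch below. The endpoint URI, warehouse name, and token are placeholders, not real Dremio values, and the exact properties an Open Catalog deployment expects are an assumption; check the product documentation for the actual endpoint and authentication settings. The `rest_catalog_properties` and `connect` helpers are hypothetical names; `load_catalog` is PyIceberg’s own entry point.

```python
def rest_catalog_properties(uri, warehouse, token):
    """Build a property map for a PyIceberg REST catalog connection."""
    return {
        "type": "rest",          # selects PyIceberg's REST catalog implementation
        "uri": uri,              # base URL of the REST Catalog API
        "warehouse": warehouse,  # catalog or warehouse identifier
        "token": token,          # bearer token used for authentication
    }


def connect(name, properties):
    """Open the catalog. PyIceberg is imported lazily (pip install pyiceberg)."""
    from pyiceberg.catalog import load_catalog
    return load_catalog(name, **properties)


props = rest_catalog_properties(
    uri="https://catalog.example.com/api/catalog",  # placeholder endpoint
    warehouse="my_warehouse",                       # placeholder name
    token="<personal-access-token>",                # placeholder credential
)

# With real values filled in, usage would look like:
# catalog = connect("dremio", props)
# print(catalog.list_namespaces())
```

Keeping the property map separate from the connection call makes it easy to reuse the same settings for other engines that speak the REST Catalog protocol.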

📚 Go Deeper on Apache Iceberg

Alex Merced has authored three hands-on books covering Apache Iceberg, the Agentic Lakehouse, and modern data architecture. Pick up a copy to master the full ecosystem.
