Skip to content
Engines & Integrations Last updated: May 14, 2026

Trino and Apache Iceberg

Trino (formerly PrestoSQL) is a distributed SQL query engine with native Apache Iceberg support, optimized for interactive, sub-second analytical queries over Iceberg tables using the Iceberg REST Catalog, Hive Metastore, or Glue as its catalog.

trino icebergtrino apache icebergtrino iceberg connectorprestosql icebergtrino iceberg query

Trino and Apache Iceberg

Trino (formerly PrestoSQL) is a distributed, open-source SQL query engine designed for interactive analytics at scale. Originally developed at Facebook, Trino is now widely deployed in data lakehouses as a high-performance SQL engine for ad-hoc queries, business intelligence, and data exploration over Apache Iceberg tables.

Trino’s Iceberg connector is one of its most mature and actively developed connectors, supporting all major Iceberg features including time travel, hidden partitioning, row-level deletes, and multiple catalog backends.

Trino Iceberg Connector Features

FeatureSupport
Read Iceberg tablesFull
Write / INSERTFull
UPDATE (CoW)Full
DELETE (CoW)Full
MERGE INTOSupported
Time travelFull
Schema evolutionFull
Hidden partitioningFull
MoR read (apply deletes)Full
REST CatalogFull
Hive Metastore catalogFull
AWS Glue catalogFull
Nessie catalogFull

Configuration

trino-catalog Iceberg connector configuration (iceberg.properties)

connector.name=iceberg
iceberg.catalog.type=rest

# REST Catalog
iceberg.rest-catalog.uri=https://my-catalog.example.com
iceberg.rest-catalog.security=OAUTH2
iceberg.rest-catalog.oauth2.credential=client-id:client-secret

# OR Hive Metastore
iceberg.catalog.type=hive_metastore
hive.metastore.uri=thrift://metastore-host:9083

# OR AWS Glue
iceberg.catalog.type=glue

SQL: Querying Iceberg Tables with Trino

-- Standard query
SELECT customer_id, SUM(total) as revenue
FROM iceberg.db.orders
WHERE order_date >= DATE '2026-01-01'
GROUP BY customer_id
ORDER BY revenue DESC;

-- Time travel by timestamp
SELECT * FROM iceberg.db.orders
FOR TIMESTAMP AS OF TIMESTAMP '2026-01-15 00:00:00 UTC';

-- Time travel by snapshot ID
SELECT * FROM iceberg.db.orders
FOR VERSION AS OF 8027658604211071520;

Metadata Inspection

-- System schemas expose table metadata
SELECT * FROM iceberg.db."orders$snapshots";
SELECT * FROM iceberg.db."orders$history";
SELECT * FROM iceberg.db."orders$manifests";
SELECT * FROM iceberg.db."orders$files";
SELECT * FROM iceberg.db."orders$partitions";

Trino’s Strengths for Iceberg

Interactive Latency

Trino is optimized for interactive, low-latency SQL — typical queries complete in seconds or sub-second. This makes it particularly well-suited for:

Federation

Trino can query Iceberg tables alongside other data sources (PostgreSQL, MySQL, Elasticsearch, S3 files) in a single SQL statement. This federation capability is valuable for workloads that need to join lakehouse data with operational database data.

-- Join Iceberg with PostgreSQL in a single Trino query
SELECT i.order_id, p.customer_name, i.total
FROM iceberg.db.orders i
JOIN postgresql.crm.customers p ON i.customer_id = p.id
WHERE i.order_date >= DATE '2026-05-01';

Trino vs. Dremio vs. Spark

DimensionTrinoDremioSpark
Best forAd-hoc SQL, BIBI + AI analyticsBatch ETL
Query latencySub-second to secondsSub-secondSeconds+
Streaming writesNoNoYes
Maintenance opsPartialYesYes (full)
AI Semantic LayerNoYesNo
Open CatalogVia REST configBuilt-in (Polaris)Via REST config
Managed serviceStarburstDremio CloudDatabricks, EMR

📚 Go Deeper on Apache Iceberg

Alex Merced has authored three hands-on books covering Apache Iceberg, the Agentic Lakehouse, and modern data architecture. Pick up a copy to master the full ecosystem.

← Back to Iceberg Knowledge Base