IngestThis

Tag: dremio

2026-04-29 • Alex Merced

What Are Table Formats and Why Were They Needed?

Table formats like Apache Iceberg solved the ACID, schema, and performance problems that turned data lakes into data swa...

2026-04-29 • Alex Merced

The Metadata Structure of Modern Table Formats

Iceberg uses a metadata tree, Delta Lake uses a transaction log, Hudi uses a timeline. Here is exactly how each format o...

2026-04-29 • Alex Merced

Performance and Apache Iceberg's Metadata

Iceberg's three-layer metadata tree eliminates directory listing and enables multi-level data skipping. Here is how scan...

2026-04-29 • Alex Merced

Partition Evolution: Change Your Partitioning Without Rewriting Data

Iceberg lets you change partition schemes without rewriting data. Here is how partition evolution works internally and w...

2026-04-29 • Alex Merced

Hidden Partitioning: How Iceberg Eliminates Accidental Full Table Scans

Iceberg's hidden partitioning separates physical layout from user queries using transform functions. Here is how it work...

2026-04-29 • Alex Merced

Writing to an Apache Iceberg Table: How Commits and ACID Actually Work

Here is exactly how an engine writes to an Iceberg table, step by step, from data files through the atomic commit that m...

2026-04-29 • Alex Merced

What Are Lakehouse Catalogs? The Role of Catalogs in Apache Iceberg

Lakehouse catalogs store metadata pointers, manage namespaces, and enforce access control. Here is the complete catalog ...

2026-04-29 • Alex Merced

When Catalogs Are Embedded in Storage

S3 Tables and MinIO AIStor embed the Iceberg catalog directly in the storage layer. Here is when embedded catalogs make...

2026-04-29 • Alex Merced

How Data Lake Table Storage Degrades Over Time

Iceberg tables degrade through small files, orphan files, metadata bloat, sort order decay, and partition skew. Here is ...

2026-04-29 • Alex Merced

Maintaining Apache Iceberg Tables: Compaction, Expiry, and Cleanup

Keep Iceberg tables fast with compaction, snapshot expiry, orphan cleanup, and manifest rewriting. Here is when and how ...

2026-04-29 • Alex Merced

Apache Iceberg Metadata Tables: Querying the Internals

Iceberg metadata tables let you query snapshots, files, manifests, and partitions using SQL. Here is every metadata tabl...

2026-04-29 • Alex Merced

Using Apache Iceberg with Python and MPP Query Engines

Access Iceberg tables from Python with PyIceberg, DuckDB, and Polars, or through MPP engines like Dremio, Spark, and Tri...

2026-04-29 • Alex Merced

Approaches to Streaming Data into Apache Iceberg Tables

Stream data into Iceberg with Spark Structured Streaming, Flink, or Kafka Connect. Here is how each works and the trade-...

2026-04-29 • Alex Merced

Hands-On with Apache Iceberg Using Dremio Cloud

A practical walkthrough of creating, querying, and optimizing Iceberg tables on Dremio Cloud, from account setup to AI-p...

2026-04-29 • Alex Merced

Migrating to Apache Iceberg: Strategies for Every Source System

Migrate to Iceberg from Hive, data warehouses, or raw files using in-place migration, full rewrite, or the zero-downtime...

2026-03-05 • Alex Merced

How to Use Dremio with Amazon Kiro: Connect, Query, and Build Data Apps

Amazon Kiro is an agentic AI IDE from AWS that introduces spec-driven development to the coding workflow. Instead of jum...

2026-03-05 • Alex Merced

How to Use Dremio with Claude Code: Connect, Query, and Build Data Apps

Claude Code is Anthropic's terminal-based coding agent. It reads your files, writes code, runs commands, and maintains c...

2026-03-05 • Alex Merced

How to Use Dremio with Claude CoWork: Connect, Query, and Build Data Apps

Claude CoWork is Anthropic's desktop agentic assistant. Unlike Claude Code (a terminal coding agent), CoWork operates as...

2026-03-05 • Alex Merced

How to Use Dremio with Cursor: Connect, Query, and Build Data Apps

Cursor is an AI-native code editor built as a fork of VS Code. It integrates AI directly into the editing experience wit...

2026-03-05 • Alex Merced

How to Use Dremio with Gemini CLI: Connect, Query, and Build Data Apps

Gemini CLI is Google's open-source terminal-based AI agent. It runs directly in your terminal, powered by Gemini models ...


© 2026 Alex Merced — alexmercedcoder.dev