Lakehouse and Delta Lake tables

Microsoft Fabric Lakehouse uses Delta Lake as the default and preferred table format to provide reliable, high-performance data storage and processing. While other formats are supported, Delta Lake offers the best integration and performance across all Fabric services. This article explains what Delta Lake tables are, how they work in Fabric, and how to get the best performance from your data.

What are Delta Lake tables?

When you store data in a Microsoft Fabric Lakehouse, your data is automatically saved using a special format called Delta Lake. Think of Delta Lake as an enhanced version of regular data files that provides:

Better performance - Faster queries and data processing
Data reliability - Automatic error checking and data integrity
Flexibility - Works with both structured data (like database tables) and semi-structured data (like JSON files)

Why does this matter?

Delta Lake is the standard table format for all data in Fabric Lakehouse. This means:

Consistency: All your data uses the same format, making it easier to work with
Compatibility: Your data works seamlessly across all Fabric tools (Power BI, notebooks, data pipelines, etc.)
No extra work: When you load data into tables or use other data loading methods, Delta format is applied automatically

You don't need to worry about the technical details - Fabric handles the Delta Lake formatting behind the scenes. This article explains how it works and how to get the best performance from your data.

Apache Spark engine and data formats

Fabric Lakehouse is powered by Apache Spark Runtime, which is based on the same foundation as Azure Synapse Analytics Runtime for Apache Spark. However, Fabric includes optimizations and different default settings to provide better performance across all Fabric services.

Supported data formats:

Delta Lake - The preferred format (automatic optimization)
CSV files - Spreadsheet-like data files
JSON files - Web and application data
Parquet files - Compressed data files
Other formats - AVRO and legacy Hive table formats

Key benefits of Fabric's Apache Spark:

Optimized by default: Performance features are automatically enabled for better speed
Multiple formats supported: You can read from existing files in various formats
Automatic conversion: When you load data into tables, it's automatically optimized using Delta Lake format

Note

While you can work with different file formats, tables displayed in the Lakehouse explorer are optimized Delta Lake tables for the best performance and reliability.

Differences from Azure Synapse Analytics

If you're migrating from Azure Synapse Analytics, here are the key configuration differences in Fabric's Apache Spark runtime:

Apache Spark configuration	Microsoft Fabric value	Azure Synapse Analytics value	Notes
spark.sql.sources.default	delta	parquet	Default table format
spark.sql.parquet.vorder.default	true	N/A	V-Order writer
spark.sql.parquet.vorder.dictionaryPageSize	2 GB	N/A	Dictionary page size limit for V-Order
spark.databricks.delta.optimizeWrite.enabled	true	unset (false)	Optimize Write

These optimizations are designed to provide better performance out-of-the-box in Fabric. Advanced users can modify these configurations if needed for specific scenarios.

How Fabric finds your tables automatically

When you open your Lakehouse, Fabric automatically scans your data and displays any tables it finds in the Tables section of the explorer. This means:

No manual setup required - Fabric automatically discovers existing tables
Organized view - Tables appear in a tree structure for easy navigation
Works with shortcuts - Tables linked from other locations are also automatically discovered

This automatic discovery makes it easy to see all your available data at a glance.

Tables over shortcuts

Microsoft Fabric Lakehouse supports tables defined over OneLake shortcuts to provide utmost compatibility and no data movement. The following table contains the scenario best practices for each item type when using it over shortcuts.

Shortcut destination	Where to create the shortcut	Best practice
Delta Lake table	`Tables` section	If multiple tables are present in the destination, create one shortcut per table.
Folders with files	`Files` section	Use Apache Spark to use the destination directly using relative paths. Load the data into Lakehouse-native Delta tables for maximum performance.
Legacy Apache Hive tables	`Files` section	Use Apache Spark to use the destination directly using relative paths, or create a metadata catalog reference using `CREATE EXTERNAL TABLE` syntax. Load the data into Lakehouse-native Delta tables for maximum performance.

Load to Table

Microsoft Fabric Lakehouse provides a convenient and productive user interface to streamline loading data into Delta tables. The Load to Table feature allows a visual experience for loading common file formats to Delta to boost analytical productivity to all personas. To learn more about the Load to Table feature, read the Load to Delta Lake tables reference documentation.

Keeping your tables fast and efficient

Fabric automatically optimizes your Delta Lake tables for better performance, but sometimes you may want additional control:

What Fabric does automatically:

Combines small files into larger, more efficient files
Optimizes data layout for faster queries
Manages storage to reduce costs

When you might need manual optimization:

Very large datasets with specific performance requirements
Custom data organization needs
Advanced analytics scenarios

For detailed guidance on table optimization, see Delta Lake table optimization and V-Order.

Feedback

Was this page helpful?

Last updated on 2025-11-19