Note
This feature is in preview.
JDBC (Java Database Connectivity) is a widely adopted standard that enables client applications to connect to and work with data from databases and big data platforms.
The Microsoft JDBC Driver for Fabric Data Engineering lets you connect, query, and manage Spark workloads in Microsoft Fabric with the reliability and simplicity of the JDBC standard. Built on Microsoft Fabric's Livy APIs, the driver provides secure and flexible Spark SQL connectivity to your Java applications and BI tools. This integration allows you to submit and execute Spark code directly without needing to create separate Notebook or Spark Job Definition artifacts.
Key Features
- JDBC 4.2 Compliant: Full implementation of JDBC 4.2 specification
- Microsoft Entra ID Authentication: Multiple authentication flows including interactive, client credentials, and certificate-based authentication
- Enterprise Connection Pooling: Built-in connection pooling with health monitoring and automatic recovery
- Spark SQL Native Query Support: Direct execution of Spark SQL statements without translation
- Comprehensive Data Type Support: Support for all Spark SQL data types including complex types (ARRAY, MAP, STRUCT)
- Asynchronous Result Set Prefetching: Background data loading for improved performance
- Circuit Breaker Pattern: Protection against cascading failures with automatic retry
- Auto-Reconnection: Transparent session recovery on connection failures
- Proxy Support: HTTP and SOCKS proxy configuration for enterprise environments
Prerequisites
Before using the Microsoft JDBC Driver for Microsoft Fabric Data Engineering, ensure you have:
- Java Development Kit (JDK): Version 11 or higher (Java 21 recommended)
- Microsoft Fabric Access: Access to a Microsoft Fabric workspace
- Microsoft Entra ID Credentials: Appropriate credentials for authentication
- Workspace and Lakehouse IDs: GUID identifiers for your Fabric workspace and lakehouse
Download and Installation
Microsoft JDBC Driver for Microsoft Fabric Data Engineering version 1.0.0 is the public preview version and supports Java 11, 17 and 21. We're continually improving Java connectivity support and recommend that you work with the latest version of the Microsoft JDBC driver.
- Download Microsoft JDBC Driver for Microsoft Fabric Data Engineering (zip)
- Download Microsoft JDBC Driver for Microsoft Fabric Data Engineering (tar)
- Download either the zip or tar file from the links above.
- Extract the downloaded file to access the driver JAR files.
- Select the JAR file that matches your JRE version:
  - For Java 11: ms-sparksql-jdbc-1.0.0.jre11.jar
  - For Java 17: ms-sparksql-jdbc-1.0.0.jre17.jar
  - For Java 21: ms-sparksql-jdbc-1.0.0.jre21.jar
- Add the selected JAR file to your application's classpath.
- For JDBC clients, configure the JDBC driver class:
com.microsoft.spark.livy.jdbc.LivyDriver
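Most JDBC 4.2-compliant tools discover the driver automatically through the service loader once the JAR is on the classpath. If your application loads drivers explicitly, the following minimal sketch shows one way to do it; the class name comes from the step above, and the connection URL placeholders are illustrative:
import java.sql.Connection;
import java.sql.DriverManager;

public class DriverRegistrationExample {
    public static void main(String[] args) throws Exception {
        // Explicit registration; usually optional with JDBC 4.2 because
        // DriverManager discovers drivers on the classpath automatically.
        Class.forName("com.microsoft.spark.livy.jdbc.LivyDriver");

        // Placeholder URL; substitute your workspace and lakehouse IDs.
        String url = "jdbc:fabricspark://api.fabric.microsoft.com;" +
                "FabricWorkspaceID=<workspace-id>;" +
                "FabricLakehouseID=<lakehouse-id>;" +
                "AuthFlow=2";

        try (Connection conn = DriverManager.getConnection(url)) {
            System.out.println("Driver registered and connection opened.");
        }
    }
}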
Quick Start Example
This example demonstrates how to connect to Microsoft Fabric and execute a query using the Microsoft JDBC Driver for Microsoft Fabric Data Engineering. Before running this code, ensure you have completed the prerequisites and installed the driver.
import java.sql.*;
public class QuickStartExample {
public static void main(String[] args) {
// Connection string with required parameters
String url = "jdbc:fabricspark://api.fabric.microsoft.com;" +
"FabricWorkspaceID=<workspace-id>;" +
"FabricLakehouseID=<lakehouse-id>;" +
"AuthFlow=2;" + // Interactive browser authentication
"LogLevel=INFO";
try (Connection conn = DriverManager.getConnection(url)) {
// Execute a simple query
try (Statement stmt = conn.createStatement();
ResultSet rs = stmt.executeQuery("SELECT 'Hello from Fabric!' as message")) {
if (rs.next()) {
System.out.println(rs.getString("message"));
}
}
} catch (SQLException e) {
e.printStackTrace();
}
}
}
Connection String Format
Basic Connection String
The Microsoft JDBC Driver for Microsoft Fabric Data Engineering uses the following connection string format:
jdbc:fabricspark://<hostname>[:<port>][;<parameter1>=<value1>;<parameter2>=<value2>;...]
Connection String Components
| Component | Description | Example |
|---|---|---|
| Protocol | JDBC URL protocol identifier | jdbc:fabricspark:// |
| Hostname | Microsoft Fabric endpoint hostname | api.fabric.microsoft.com |
| Port | Optional port number (default: 443) | :443 |
| Parameters | Semicolon-separated key=value pairs | FabricWorkspaceID=<guid> |
Example Connection Strings
Basic Connection (Interactive Authentication)
jdbc:fabricspark://api.fabric.microsoft.com;FabricWorkspaceID=<workspace-id>;FabricLakehouseID=<lakehouse-id>;AuthFlow=2
With Spark Resource Configuration
jdbc:fabricspark://api.fabric.microsoft.com;FabricWorkspaceID=<workspace-id>;FabricLakehouseID=<lakehouse-id>;DriverCores=4;DriverMemory=4g;ExecutorCores=4;ExecutorMemory=8g;NumExecutors=2;AuthFlow=2
With Spark Session Properties
jdbc:fabricspark://api.fabric.microsoft.com;FabricWorkspaceID=<workspace-id>;FabricLakehouseID=<lakehouse-id>;spark.sql.adaptive.enabled=true;spark.sql.shuffle.partitions=200;AuthFlow=2
Authentication
The Microsoft JDBC Driver for Microsoft Fabric Data Engineering supports multiple authentication methods through Microsoft Entra ID (formerly Azure Active Directory). Authentication is configured using the AuthFlow parameter in the connection string.
Authentication Flows
| AuthFlow | Authentication Method | Use Case |
|---|---|---|
| 0 | Azure CLI Credential | Development using Azure CLI |
| 1 | Client Credentials (Service Principal) | Automated/service-to-service authentication |
| 2 | Interactive Browser | Interactive user authentication (default) |
| 3 | Device Code | Device code authentication for headless and remote environments |
| 4 | Certificate-Based | Certificate-based service principal authentication |
| 5 | Access Token | Pre-acquired access token |
Interactive Browser Authentication
Best for: Development and interactive applications
String url = "jdbc:fabricspark://api.fabric.microsoft.com;" +
"FabricWorkspaceID=<workspace-id>;" +
"FabricLakehouseID=<lakehouse-id>;" +
"AuthFlow=2;" + // Interactive browser authentication
"AuthTenantID=<tenant-id>;" + // Optional
"LogLevel=INFO";
Connection conn = DriverManager.getConnection(url);
Parameters:
- AuthFlow=2: Specifies interactive browser authentication
- AuthTenantID (optional): Azure tenant ID
- AuthClientID (optional): Application (client) ID
Behavior:
- Opens a browser window for user authentication
- Credentials are cached for subsequent connections until they expire
- Suitable for single-user applications
Client Credentials Authentication
Best for: Automated services and background jobs
String url = "jdbc:fabricspark://api.fabric.microsoft.com;" +
"FabricWorkspaceID=<workspace-id>;" +
"FabricLakehouseID=<lakehouse-id>;" +
"AuthFlow=1;" + // Client credentials authentication
"AuthClientID=<client-id>;" +
"AuthClientSecret=<client-secret>;" +
"AuthTenantID=<tenant-id>;" +
"LogLevel=INFO";
Connection conn = DriverManager.getConnection(url);
Required Parameters:
- AuthFlow=1: Specifies client credentials authentication
- AuthClientID: Application (client) ID from Microsoft Entra ID
- AuthClientSecret: Client secret from Microsoft Entra ID
- AuthTenantID: Azure tenant ID
Best Practices:
- Store secrets securely (Azure Key Vault, environment variables); see the sketch after this list
- Use managed identities when possible
- Rotate secrets regularly
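For example, the AuthClientSecret shown above can be read from an environment variable rather than hardcoded. A minimal sketch, assuming a FABRIC_CLIENT_SECRET variable defined by your deployment (the name is illustrative):
String clientSecret = System.getenv("FABRIC_CLIENT_SECRET"); // illustrative variable name
if (clientSecret == null || clientSecret.isEmpty()) {
    throw new IllegalStateException("FABRIC_CLIENT_SECRET is not set");
}

String url = "jdbc:fabricspark://api.fabric.microsoft.com;" +
        "FabricWorkspaceID=<workspace-id>;" +
        "FabricLakehouseID=<lakehouse-id>;" +
        "AuthFlow=1;" +
        "AuthClientID=<client-id>;" +
        "AuthClientSecret=" + clientSecret + ";" +
        "AuthTenantID=<tenant-id>";

Connection conn = DriverManager.getConnection(url);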
Certificate-Based Authentication
Best for: Enterprise applications requiring certificate-based authentication
String url = "jdbc:fabricspark://api.fabric.microsoft.com;" +
"FabricWorkspaceID=<workspace-id>;" +
"FabricLakehouseID=<lakehouse-id>;" +
"AuthFlow=4;" + // Certificate-based authentication
"AuthClientID=<client-id>;" +
"AuthCertificatePath=/path/to/certificate.pfx;" +
"AuthCertificatePassword=<certificate-password>;" +
"AuthTenantID=<tenant-id>;" +
"LogLevel=INFO";
Connection conn = DriverManager.getConnection(url);
Required Parameters:
- AuthFlow=4: Specifies certificate-based authentication
- AuthClientID: Application (client) ID
- AuthCertificatePath: Path to PFX/PKCS12 certificate file
- AuthCertificatePassword: Certificate password
- AuthTenantID: Azure tenant ID
Device Code Authentication
Best for: Headless environments and remote sessions
String url = "jdbc:fabricspark://api.fabric.microsoft.com;" +
"FabricWorkspaceID=<workspace-id>;" +
"FabricLakehouseID=<lakehouse-id>;" +
"AuthFlow=3;" + // Device code authentication
"AuthClientID=<client-id>;" +
"AuthTenantID=<tenant-id>;" +
"LogLevel=INFO";
Connection conn = DriverManager.getConnection(url);
Behavior:
- Displays a device code and URL in the console
- User visits the URL and enters the code
- Authentication completes after user verification
Access Token Authentication
Best for: Custom authentication scenarios
// Acquire token through custom mechanism
String accessToken = acquireTokenFromCustomSource();
String url = "jdbc:fabricspark://api.fabric.microsoft.com;" +
"FabricWorkspaceID=<workspace-id>;" +
"FabricLakehouseID=<lakehouse-id>;" +
"AuthFlow=5;" + // Access token authentication
"AuthAccessToken=" + accessToken + ";" +
"LogLevel=INFO";
Connection conn = DriverManager.getConnection(url);
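One way to acquire a token is with the Azure Identity client library for Java (com.azure:azure-identity). The sketch below uses DefaultAzureCredential and assumes https://api.fabric.microsoft.com/.default as the resource scope; verify the scope your tenant requires before relying on it:
import com.azure.core.credential.AccessToken;
import com.azure.core.credential.TokenRequestContext;
import com.azure.identity.DefaultAzureCredentialBuilder;
import java.sql.Connection;
import java.sql.DriverManager;

public class AccessTokenAuthExample {
    public static void main(String[] args) throws Exception {
        // Acquire a token via DefaultAzureCredential (Azure CLI, managed identity, and so on).
        // The scope is an assumption; confirm the Fabric scope for your environment.
        AccessToken token = new DefaultAzureCredentialBuilder().build()
                .getToken(new TokenRequestContext()
                        .addScopes("https://api.fabric.microsoft.com/.default"))
                .block();

        String url = "jdbc:fabricspark://api.fabric.microsoft.com;" +
                "FabricWorkspaceID=<workspace-id>;" +
                "FabricLakehouseID=<lakehouse-id>;" +
                "AuthFlow=5;" +
                "AuthAccessToken=" + token.getToken() + ";" +
                "LogLevel=INFO";

        try (Connection conn = DriverManager.getConnection(url)) {
            System.out.println("Connected with a pre-acquired access token.");
        }
    }
}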
Authentication Caching
The driver automatically caches authentication tokens to improve performance:
// Enable/disable caching (enabled by default)
String url = "jdbc:fabricspark://api.fabric.microsoft.com;" +
"FabricWorkspaceID=<workspace-id>;" +
"FabricLakehouseID=<lakehouse-id>;" +
"AuthFlow=2;" +
"AuthEnableCaching=true;" + // Enable token caching
"AuthCacheTTLMS=3600000"; // Cache TTL: 1 hour
Connection conn = DriverManager.getConnection(url);
Configuration Parameters
Required Parameters
These parameters must be present in every connection string:
| Parameter | Type | Description | Example |
|---|---|---|---|
| FabricWorkspaceID | UUID | Microsoft Fabric workspace identifier | <workspace-id> |
| FabricLakehouseID | UUID | Microsoft Fabric lakehouse identifier | <lakehouse-id> |
| AuthFlow | Integer | Authentication flow type (0-5) | 2 |
Optional Parameters
API Version Configuration
| Parameter | Type | Default | Description |
|---|---|---|---|
| FabricVersion | String | v1 | Microsoft Fabric API version |
| LivyApiVersion | String | 2023-12-01 | Livy API version |
Environment Configuration
| Parameter | Type | Default | Description |
|---|---|---|---|
| FabricEnvironmentID | UUID | None | Fabric environment identifier referencing an environment item for the Spark session |
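For example, to attach the session to a specific environment item, add FabricEnvironmentID to the connection string (placeholder values shown):
jdbc:fabricspark://api.fabric.microsoft.com;FabricWorkspaceID=<workspace-id>;FabricLakehouseID=<lakehouse-id>;FabricEnvironmentID=<environment-id>;AuthFlow=2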
Spark Configuration
Session Resource Configuration
Configure Spark session resources for optimal performance:
| Parameter | Type | Default | Description | Example |
|---|---|---|---|---|
| DriverCores | Integer | Spark default | Number of CPU cores for driver | 4 |
| DriverMemory | String | Spark default | Memory allocation for driver | 4g |
| ExecutorCores | Integer | Spark default | Number of CPU cores per executor | 4 |
| ExecutorMemory | String | Spark default | Memory allocation per executor | 8g |
| NumExecutors | Integer | Spark default | Number of executors | 2 |
Example:
DriverCores=4;DriverMemory=4g;ExecutorCores=4;ExecutorMemory=8g;NumExecutors=2
Custom Spark Session Properties
Any parameter with the prefix spark. is automatically applied to the Spark session:
Example Spark Configurations:
spark.sql.adaptive.enabled=true
spark.sql.adaptive.coalescePartitions.enabled=true
spark.sql.shuffle.partitions=200
spark.sql.autoBroadcastJoinThreshold=10485760
spark.dynamicAllocation.enabled=true
spark.dynamicAllocation.minExecutors=1
spark.dynamicAllocation.maxExecutors=10
spark.executor.memoryOverhead=1g
Native Execution Engine (NEE):
spark.nee.enabled=true
Complete Example:
jdbc:fabricspark://api.fabric.microsoft.com;FabricWorkspaceID=<guid>;FabricLakehouseID=<guid>;DriverMemory=4g;ExecutorMemory=8g;NumExecutors=2;spark.sql.adaptive.enabled=true;spark.nee.enabled=true;AuthFlow=2
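Because any spark.-prefixed key is forwarded to the session, these settings can also be assembled programmatically before opening the connection. A minimal sketch, assuming illustrative property values:
import java.sql.Connection;
import java.sql.DriverManager;
import java.util.LinkedHashMap;
import java.util.Map;

public class SparkPropertiesUrlExample {
    public static void main(String[] args) throws Exception {
        // Illustrative Spark session properties; any spark.* key is passed through to the session.
        Map<String, String> sparkProps = new LinkedHashMap<>();
        sparkProps.put("spark.sql.adaptive.enabled", "true");
        sparkProps.put("spark.sql.shuffle.partitions", "200");

        StringBuilder url = new StringBuilder("jdbc:fabricspark://api.fabric.microsoft.com;")
                .append("FabricWorkspaceID=<workspace-id>;")
                .append("FabricLakehouseID=<lakehouse-id>;")
                .append("AuthFlow=2");
        sparkProps.forEach((key, value) -> url.append(';').append(key).append('=').append(value));

        try (Connection conn = DriverManager.getConnection(url.toString())) {
            System.out.println("Session created with custom Spark properties.");
        }
    }
}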
HTTP Connection Pool Configuration
Configure HTTP connection pooling for optimal network performance:
| Parameter | Type | Default | Description |
|---|---|---|---|
| HttpMaxTotalConnections | Integer | 100 | Maximum total HTTP connections |
| HttpMaxConnectionsPerRoute | Integer | 50 | Maximum connections per route |
| HttpConnectionTimeoutInSeconds | Integer | 30 | Connection timeout |
| HttpSocketTimeoutInSeconds | Integer | 60 | Socket read timeout |
| HttpReadTimeoutInSeconds | Integer | 60 | HTTP read timeout |
| HttpConnectionRequestTimeoutSeconds | Integer | 30 | Connection request timeout from pool |
| HttpEnableKeepAlive | Boolean | true | Enable HTTP keep-alive |
| HttpKeepAliveTimeoutSeconds | Integer | 60 | Keep-alive timeout |
| HttpFollowRedirects | Boolean | true | Follow HTTP redirects |
| HttpUseAsyncIO | Boolean | false | Use asynchronous HTTP I/O |
Example:
HttpMaxTotalConnections=200;HttpMaxConnectionsPerRoute=100;HttpConnectionTimeoutInSeconds=60
Proxy Configuration
Configure HTTP and SOCKS proxy settings for enterprise environments:
| Parameter | Type | Default | Description |
|---|---|---|---|
| UseProxy | Boolean | false | Enable proxy |
| ProxyTransport | String | http | Proxy transport type (http/tcp) |
| ProxyHost | String | None | Proxy hostname |
| ProxyPort | Integer | None | Proxy port |
| ProxyAuthEnabled | Boolean | false | Enable proxy authentication |
| ProxyUsername | String | None | Proxy authentication username |
| ProxyPassword | String | None | Proxy authentication password |
| ProxyAuthScheme | String | basic | Auth scheme (basic/digest/ntlm) |
| ProxySocksVersion | Integer | 5 | SOCKS version (4/5) |
HTTP Proxy Example:
UseProxy=true;ProxyTransport=http;ProxyHost=proxy.company.com;ProxyPort=8080;ProxyAuthEnabled=true;ProxyUsername=user;ProxyPassword=pass
SOCKS Proxy Example:
UseProxy=true;ProxyTransport=tcp;ProxyHost=socks.company.com;ProxyPort=1080;ProxySocksVersion=5
Logging Configuration
| Parameter | Type | Default | Description |
|---|---|---|---|
| LogLevel | String | INFO | Logging level: TRACE, DEBUG, INFO, WARN, ERROR |
Example:
LogLevel=DEBUG
Default Log Location:
${user.home}/.microsoft/livy-jdbc-driver/driver.log
Custom Log Configuration:
Use a custom log4j2.xml or logback.xml file on your classpath.
Usage Examples
Basic Connection
import java.sql.*;
public class BasicConnectionExample {
public static void main(String[] args) {
String url = "jdbc:fabricspark://api.fabric.microsoft.com;" +
"FabricWorkspaceID=<workspace-id>;" +
"FabricLakehouseID=<lakehouse-id>;" +
"AuthFlow=2";
try (Connection conn = DriverManager.getConnection(url)) {
System.out.println("Connected successfully!");
System.out.println("Database: " + conn.getMetaData().getDatabaseProductName());
System.out.println("Driver: " + conn.getMetaData().getDriverName());
System.out.println("Driver Version: " + conn.getMetaData().getDriverVersion());
} catch (SQLException e) {
System.err.println("Connection failed: " + e.getMessage());
e.printStackTrace();
}
}
}
Executing Queries
Simple Query
public void executeSimpleQuery(Connection conn) throws SQLException {
String sql = "SELECT current_timestamp() as now";
try (Statement stmt = conn.createStatement();
ResultSet rs = stmt.executeQuery(sql)) {
if (rs.next()) {
Timestamp now = rs.getTimestamp("now");
System.out.println("Current timestamp: " + now);
}
}
}
Query with Filter
public void executeQueryWithFilter(Connection conn) throws SQLException {
String sql = "SELECT * FROM sales WHERE amount > 1000 ORDER BY amount DESC";
try (Statement stmt = conn.createStatement();
ResultSet rs = stmt.executeQuery(sql)) {
while (rs.next()) {
int id = rs.getInt("id");
double amount = rs.getDouble("amount");
Date date = rs.getDate("sale_date");
System.out.printf("ID: %d, Amount: %.2f, Date: %s%n",
id, amount, date);
}
}
}
Query with Limit
public void executeQueryWithLimit(Connection conn) throws SQLException {
String sql = "SELECT * FROM customers LIMIT 10";
try (Statement stmt = conn.createStatement();
ResultSet rs = stmt.executeQuery(sql)) {
ResultSetMetaData metaData = rs.getMetaData();
int columnCount = metaData.getColumnCount();
// Print column names
for (int i = 1; i <= columnCount; i++) {
System.out.print(metaData.getColumnName(i) + "\t");
}
System.out.println();
// Print rows
while (rs.next()) {
for (int i = 1; i <= columnCount; i++) {
System.out.print(rs.getString(i) + "\t");
}
System.out.println();
}
}
}
Working with Result Sets
Navigating Result Sets
public void navigateResultSet(Connection conn) throws SQLException {
String sql = "SELECT id, name, amount FROM orders";
try (Statement stmt = conn.createStatement(
ResultSet.TYPE_SCROLL_INSENSITIVE,
ResultSet.CONCUR_READ_ONLY);
ResultSet rs = stmt.executeQuery(sql)) {
// Move to first row
if (rs.first()) {
System.out.println("First row: " + rs.getString("name"));
}
// Move to last row
if (rs.last()) {
System.out.println("Last row: " + rs.getString("name"));
System.out.println("Total rows: " + rs.getRow());
}
// Move to specific row
if (rs.absolute(5)) {
System.out.println("Row 5: " + rs.getString("name"));
}
}
}
Processing Large Result Sets
public void processLargeResultSet(Connection conn) throws SQLException {
String sql = "SELECT * FROM large_table";
try (Statement stmt = conn.createStatement()) {
// Set fetch size for efficient memory usage
stmt.setFetchSize(1000);
try (ResultSet rs = stmt.executeQuery(sql)) {
int rowCount = 0;
while (rs.next()) {
// Process row
processRow(rs);
rowCount++;
if (rowCount % 10000 == 0) {
System.out.println("Processed " + rowCount + " rows");
}
}
System.out.println("Total rows processed: " + rowCount);
}
}
}
private void processRow(ResultSet rs) throws SQLException {
// Process individual row
}
Using Prepared Statements
public void usePreparedStatement(Connection conn) throws SQLException {
String sql = "SELECT * FROM products WHERE category = ? AND price > ?";
try (PreparedStatement pstmt = conn.prepareStatement(sql)) {
// Set parameters
pstmt.setString(1, "Electronics");
pstmt.setDouble(2, 100.0);
try (ResultSet rs = pstmt.executeQuery()) {
while (rs.next()) {
String name = rs.getString("name");
double price = rs.getDouble("price");
System.out.printf("Product: %s, Price: $%.2f%n", name, price);
}
}
}
}
Batch Operations
public void executeBatchInsert(Connection conn) throws SQLException {
String sql = "INSERT INTO logs (timestamp, level, message) VALUES (?, ?, ?)";
try (PreparedStatement pstmt = conn.prepareStatement(sql)) {
conn.setAutoCommit(false); // Disable auto-commit for batch
// Add multiple statements to batch
for (int i = 0; i < 1000; i++) {
pstmt.setTimestamp(1, new Timestamp(System.currentTimeMillis()));
pstmt.setString(2, "INFO");
pstmt.setString(3, "Log message " + i);
pstmt.addBatch();
// Execute the batch every 100 statements
if ((i + 1) % 100 == 0) {
pstmt.executeBatch();
pstmt.clearBatch();
}
}
// Execute remaining statements
pstmt.executeBatch();
conn.commit();
System.out.println("Batch insert completed successfully");
} catch (SQLException e) {
conn.rollback();
throw e;
} finally {
conn.setAutoCommit(true);
}
}
Data Type Mapping
The driver maps Spark SQL data types to JDBC SQL types and Java types:
| Spark SQL Type | JDBC SQL Type | Java Type | Notes |
|---|---|---|---|
| BOOLEAN | BOOLEAN | Boolean | |
| BYTE | TINYINT | Byte | |
| SHORT | SMALLINT | Short | |
| INT | INTEGER | Integer | |
| LONG | BIGINT | Long | |
| FLOAT | FLOAT | Float | |
| DOUBLE | DOUBLE | Double | |
| DECIMAL | DECIMAL | BigDecimal | Precision and scale preserved |
| STRING | VARCHAR | String | |
| VARCHAR(n) | VARCHAR | String | |
| CHAR(n) | CHAR | String | |
| BINARY | BINARY | byte[] | |
| DATE | DATE | java.sql.Date | |
| TIMESTAMP | TIMESTAMP | java.sql.Timestamp | |
| ARRAY | VARCHAR | String | Serialized as JSON |
| MAP | VARCHAR | String | Serialized as JSON |
| STRUCT | VARCHAR | String | Serialized as JSON |
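Since ARRAY, MAP, and STRUCT values arrive as JSON strings, you can read them with getString and parse them with any JSON library. A minimal sketch using Jackson; the products table, tags column, and ARRAY<STRING> element type are illustrative assumptions:
import com.fasterxml.jackson.core.type.TypeReference;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.List;

public class ComplexTypeExample {
    // Reads an ARRAY<STRING> column; the table and column names are illustrative.
    public static void printTags(Connection conn) throws Exception {
        ObjectMapper mapper = new ObjectMapper();
        String sql = "SELECT tags FROM products LIMIT 10";
        try (Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(sql)) {
            while (rs.next()) {
                // Complex types come back as JSON text, for example ["red","blue"].
                String json = rs.getString("tags");
                List<String> tags = mapper.readValue(json, new TypeReference<List<String>>() {});
                System.out.println(tags);
            }
        }
    }
}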