Important
This feature is in Beta and is available in the following regions: us-east-1 and us-west-2.
After you have created your declarative feature definitions, which are stored in Unity Catalog, you can produce feature data from your source table using the feature definitions. This process is called materializing your features. Azure Databricks creates and manages Lakeflow Spark Declarative Pipelines to populate tables in Unity Catalog for model training and batch scoring or online serving.
Requirements
- Features must be created with the declarative feature API and stored in Unity Catalog.
- For version requirements, see Requirements.
API data structures
OfflineStoreConfig
Configuration for the offline store where materialized features will be written. The materialization pipelines create new tables in this store.
OfflineStoreConfig(
catalog_name: str, # Catalog name for the offline table where materialized features will be stored
schema_name: str, # Schema name for the offline table
table_name_prefix: str # Table name prefix for the offline table. The pipeline may create multiple tables with this prefix, each updated at different cadences
)
from databricks.feature_engineering.entities import OfflineStoreConfig
offline_store = OfflineStoreConfig(
catalog_name="main",
schema_name="feature_store",
table_name_prefix="customer_features"
)
OnlineStoreConfig
Configuration for the online store, which stores features used by model serving. Materialization creates Delta tables named catalog.schema.table_name_prefix and streams them to Lakebase tables with the same name.
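The constructor for OnlineStoreConfig is not listed here. By analogy with OfflineStoreConfig and the fields used in the example below, it can be sketched as follows; the field descriptions are inferred and are assumptions, not the authoritative signature:

```
OnlineStoreConfig(
    catalog_name: str,       # Catalog name for the Delta tables backing the online store (assumed)
    schema_name: str,        # Schema name for those tables (assumed)
    table_name_prefix: str,  # Table name prefix; Lakebase tables reuse the same name (assumed)
    online_store_name: str   # Name of the Lakebase online store to stream into (assumed)
)
```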
from databricks.feature_engineering.entities import OnlineStoreConfig
online_store = OnlineStoreConfig(
catalog_name="main",
schema_name="feature_store",
table_name_prefix="customer_features_serving",
online_store_name="customer_features_store"
)
MaterializedFeature
Represents a declarative feature that has been materialized, meaning it has a precomputed representation available in Unity Catalog. A separate MaterializedFeature is created for the offline table and for the online table. Typically, users do not instantiate MaterializedFeature directly.
API function calls
materialize_features()
Materializes a list of declarative features into an offline Delta table, an Online Feature Store, or both.
FeatureEngineeringClient.materialize_features(
features: List[Feature], # List of declarative features to materialize
offline_config: OfflineStoreConfig, # Offline store config if materializing offline
online_config: Optional[OnlineStoreConfig] = None, # Online store config if materializing online
pipeline_state: Union[MaterializedFeaturePipelineScheduleState, str], # Materialization pipeline state - currently must be "ACTIVE"
cron_schedule: Optional[str] = None, # Materialization schedule, specified in quartz cron syntax. Currently must be provided.
) -> List[MaterializedFeature]:
The method returns a list of materialized features. Each contains metadata such as the cron schedule on which feature values are updated and information about the Unity Catalog tables where the features are materialized.
If both an OnlineStoreConfig and an OfflineStoreConfig are provided, then two materialized features are returned per feature provided, one for each type of store.
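As a concrete sketch of the call above, the following combines both store configs with a Quartz cron schedule. The client call itself requires a Databricks workspace and the databricks-feature-engineering package, so it is guarded here; the feature list, catalog, schema, and table names are placeholders to replace with your own.

```python
# Quartz cron has 6 or 7 fields (seconds, minutes, hours, day-of-month,
# month, day-of-week[, year]), unlike standard 5-field cron. This runs
# the materialization pipeline at 02:00 every day.
cron_schedule = "0 0 2 * * ?"
assert len(cron_schedule.split()) == 6

my_features = []  # placeholder: replace with the declarative features created earlier

try:
    from databricks.feature_engineering import FeatureEngineeringClient
    from databricks.feature_engineering.entities import (
        OfflineStoreConfig,
        OnlineStoreConfig,
    )

    fe = FeatureEngineeringClient()
    materialized = fe.materialize_features(
        features=my_features,
        offline_config=OfflineStoreConfig(
            catalog_name="main",
            schema_name="feature_store",
            table_name_prefix="customer_features",
        ),
        online_config=OnlineStoreConfig(  # optional; omit for offline-only materialization
            catalog_name="main",
            schema_name="feature_store",
            table_name_prefix="customer_features_serving",
            online_store_name="customer_features_store",
        ),
        pipeline_state="ACTIVE",  # currently the only supported state
        cron_schedule=cron_schedule,
    )
    # With both configs provided, expect two MaterializedFeature entries
    # per input feature: one for the offline store, one for the online store.
except ImportError:
    pass  # databricks-feature-engineering is not installed outside a workspace
```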
list_materialized_features()
Returns a list of all materialized features in the user's Unity Catalog metastore.
By default, a maximum of 100 features are returned. You can change this limit using the max_results parameter.
To filter the returned materialized features by a feature name, use the optional feature_name parameter.
FeatureEngineeringClient.list_materialized_features(
feature_name: Optional[str] = None, # Optional feature name to filter by
max_results: int = 100, # Maximum number of features to be returned
) -> List[MaterializedFeature]:
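A minimal usage sketch, guarded the same way so it can be read outside a workspace. The feature name is a hypothetical placeholder, and table_name is the MaterializedFeature attribute that the deletion steps below rely on:

```python
results = []
try:
    from databricks.feature_engineering import FeatureEngineeringClient

    fe = FeatureEngineeringClient()
    # Filter to a single feature by name and raise the default cap of 100 results.
    results = fe.list_materialized_features(
        feature_name="customer_total_spend",  # hypothetical feature name
        max_results=500,
    )
    for mf in results:
        # Unity Catalog table backing this materialized feature
        print(mf.table_name)
except ImportError:
    pass  # databricks-feature-engineering is not installed outside a workspace
```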
How to delete a materialized feature
To delete a materialized feature, use list_materialized_features() to find it, then check its table_name attribute, navigate to that table in Unity Catalog, and delete the table. Use the table's Lineage tab to identify any associated pipelines and delete them as well. For features materialized online, also delete the corresponding offline pipeline and table.
In beta, deletion APIs are not supported. If needed, you can manually delete feature pipelines and feature tables via the Databricks UI.
Use online features in real-time applications
To serve features to real-time applications and services, create a feature serving endpoint. See Feature Serving endpoints.
Models that are trained using features from Databricks automatically track lineage to the features they were trained on. When deployed as endpoints, these models use Unity Catalog to find appropriate features in online stores. For details, see Use features in online workflows.
Limitations
- Continuous features cannot be materialized.
- You can only work with materialized features in the workspace in which they were created.
- Deleting and pausing a feature must be manually managed at the pipeline level.