Microsoft Fabric's Cosmos DB database workload provides built-in sample data sets to help you explore, learn, and experiment with NoSQL database patterns. The sample data represents an e-commerce scenario with products and customer reviews, demonstrating how different entity types coexist in the same container.
Two sample data sets are available:
- Standard sample data: Core e-commerce data with products and reviews
- Vector sample data: Enhanced version that includes 1536-dimensional vector embeddings generated using OpenAI's text-embedding-ada-002 model for semantic search scenarios.
Data set overview
Both sample data sets contain the same e-commerce data with two document types.
- Product documents (docType: "product"): Individual products with name, description, inventory, current price, and an embedded array of the price history for that product.
- Review documents (docType: "review"): Customer reviews and ratings linked to products via productId.
The vector sample data set is based on the standard sample data set. Product documents in the vector data set include an additional vectors property containing 1536-dimensional embeddings for semantic search capabilities.
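Because both document types share one container, client code typically branches on docType. The following Python sketch illustrates the idea on a few hypothetical in-memory documents (the ids, names, and the group_by_doc_type helper are made up for illustration and are not part of the sample data):

```python
from collections import defaultdict

# Hypothetical stand-ins for documents read from the sample container.
documents = [
    {"id": "p1", "docType": "product", "productId": "p1", "name": "Sample product"},
    {"id": "r1", "docType": "review", "productId": "p1", "stars": 5},
    {"id": "r2", "docType": "review", "productId": "p1", "stars": 4},
]


def group_by_doc_type(docs):
    """Group documents by the value of their docType discriminator."""
    grouped = defaultdict(list)
    for doc in docs:
        grouped[doc["docType"]].append(doc)
    return dict(grouped)


grouped = group_by_doc_type(documents)
print({doc_type: len(items) for doc_type, items in grouped.items()})
```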
Note
You can find both data sets, along with an additional data set whose vectors were generated using the OpenAI text-embedding-3-large model with 512 dimensions, in the Sample Datasets folder of the Cosmos DB in Fabric - Samples Repository.
Document schemas
Product document schema
Product documents contain detailed information about individual items in the e-commerce catalog:
| Property | Type | Description |
|---|---|---|
| id | string | Unique identifier for the product in GUID format |
| docType | string | Document type identifier, always "product" |
| productId | string | Product identifier, same as id for product documents |
| name | string | Product display name |
| description | string | Detailed product description |
| categoryName | string | Product category (e.g., "Computers, Laptops", "Media", "Accessories") |
| inventory | number | Number of items currently in stock |
| firstAvailable | string | Date when product became available (ISO 8601 format) |
| currentPrice | number | Current selling price |
| priceHistory | array | Array of price change objects with date and price fields |
| priceHistory[].date | string | Date and time of the price change in ISO 8601 format |
| priceHistory[].price | number | Price at the specified date |
| vectors | array | Vector sample data only - 1536-dimensional vector embedding |
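As a rough illustration of working with the embedded priceHistory array, the following Python sketch computes the overall price change for one product. It reuses the priceHistory values from the standard product document example later in this article. The price_change function is a hypothetical helper, not part of any SDK.

```python
from datetime import datetime

# priceHistory values copied from the standard product document example.
price_history = [
    {"date": "2024-01-01T00:00:00", "price": 349.0},
    {"date": "2024-08-01T00:00:00", "price": 363.0},
    {"date": "2025-04-01T00:00:00", "price": 408.14},
    {"date": "2025-08-01T00:00:00", "price": 454.87},
]


def price_change(history):
    """Return (absolute, percent) change between the earliest and latest entry."""
    # Sort by the parsed ISO 8601 timestamp rather than relying on array order.
    ordered = sorted(history, key=lambda entry: datetime.fromisoformat(entry["date"]))
    first, last = ordered[0], ordered[-1]
    absolute = last["price"] - first["price"]
    percent = absolute / first["price"] * 100
    return absolute, percent


absolute, percent = price_change(price_history)
print(f"Price changed by {absolute:.2f} ({percent:.1f}%)")
```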
Review document schema
Review documents contain customer feedback and ratings for products:
| Property | Type | Description |
|---|---|---|
| id | string | Unique identifier for the review in GUID format |
| docType | string | Document type identifier, always "review" |
| productId | string | References the id of the product being reviewed |
| categoryName | string | Product category (inherited from the reviewed product) |
| customerName | string | Name of the customer who wrote the review |
| reviewDate | string | Date when the review was submitted (ISO 8601 format) |
| stars | number | Rating given by the customer (1-5 scale) |
| reviewText | string | Written review content from the customer |
Note
Cosmos DB automatically generates system properties (_rid, _self, _etag, _attachments, _ts) for all documents.
Note
For more information about the ISO 8601 format, see international date and time standard. For more information about the GUID format, see universally unique identifiers.
Example documents
The following examples show the structure of documents in both sample data sets.
Standard product document example
{
"id": "ae449848-3f15-4147-8eee-fe76cfcc6bb4",
"docType": "product",
"productId": "ae449848-3f15-4147-8eee-fe76cfcc6bb4",
"name": "EchoSphere Pro ANC-X900 Premium Headphones",
"description": "EchoSphere Pro ANC-X900 Premium Headphones deliver immersive sound with advanced 40mm drivers and Adaptive Hybrid Active Noise Cancellation. Bluetooth 5.3 ensures seamless connectivity.",
"categoryName": "Accessories, Premium Headphones",
"inventory": 772,
"firstAvailable": "2024-01-01T00:00:00",
"currentPrice": 454.87,
"priceHistory": [
{
"date": "2024-01-01T00:00:00",
"price": 349.0
},
{
"date": "2024-08-01T00:00:00",
"price": 363.0
},
{
"date": "2025-04-01T00:00:00",
"price": 408.14
},
{
"date": "2025-08-01T00:00:00",
"price": 454.87
}
]
}
Vectorized product document example
{
"id": "ae449848-3f15-4147-8eee-fe76cfcc6bb4",
"docType": "product",
"productId": "ae449848-3f15-4147-8eee-fe76cfcc6bb4",
"name": "EchoSphere Pro ANC-X900 Premium Headphones",
"description": "EchoSphere Pro ANC-X900 Premium Headphones deliver immersive sound with advanced 40mm drivers and Adaptive Hybrid Active Noise Cancellation. Bluetooth 5.3 ensures seamless connectivity.",
"categoryName": "Accessories, Premium Headphones",
"inventory": 772,
"firstAvailable": "2024-01-01T00:00:00",
"currentPrice": 454.87,
"priceHistory": [
{
"date": "2024-01-01T00:00:00",
"price": 349.0
},
{
"date": "2025-08-01T00:00:00",
"price": 454.87
}
],
"vectors": [
-0.02783808670938015,
0.011827611364424229,
-0.04711977392435074,
// ... (1536 dimensions total)
0.04251981899142265
]
}
Review document example
Review documents are identical in both sample data sets:
{
"id": "fa799013-1746-4a7f-bd0f-2a95b2b76481",
"docType": "review",
"productId": "e847e069-d0f9-4fec-b42a-d37cd5b2f536",
"categoryName": "Accessories, Premium Headphones",
"customerName": "Emily Rodriguez",
"reviewDate": "2025-03-02T00:00:00",
"stars": 5,
"reviewText": "Excellent sound quality! Premium build! This EchoSphere Pro ANC-X900 exceeded hopes."
}
How to use the sample data
Both sample data sets help you practice querying, filtering, and aggregating data in Cosmos DB. The mixed document types provide realistic scenarios for various use cases.
Standard sample data scenarios
- Joining related data: Link reviews to products using productId
- Category analysis: Query products and reviews by categoryName
- Review analysis: Examine customer feedback patterns and ratings
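The scenarios above can be sketched client-side in Python. This minimal example links reviews to products through productId and averages the star ratings per product. The documents and the average_stars_by_product helper are hypothetical, and the code operates on in-memory dictionaries rather than a live container.

```python
from collections import defaultdict

# Hypothetical documents shaped like the sample data (not actual sample records).
docs = [
    {"id": "p1", "docType": "product", "productId": "p1", "name": "Sample headphones"},
    {"id": "p2", "docType": "product", "productId": "p2", "name": "Sample laptop"},
    {"id": "r1", "docType": "review", "productId": "p1", "stars": 5},
    {"id": "r2", "docType": "review", "productId": "p1", "stars": 4},
    {"id": "r3", "docType": "review", "productId": "p3", "stars": 1},  # no matching product
]


def average_stars_by_product(documents):
    """Map product name -> average stars, joining reviews to products on productId."""
    totals = defaultdict(lambda: [0, 0])  # productId -> [sum of stars, review count]
    for doc in documents:
        if doc["docType"] == "review":
            entry = totals[doc["productId"]]
            entry[0] += doc["stars"]
            entry[1] += 1
    names = {d["productId"]: d["name"] for d in documents if d["docType"] == "product"}
    # Keep only reviews whose productId matches a known product.
    return {names[pid]: total / count for pid, (total, count) in totals.items() if pid in names}


averages = average_stars_by_product(docs)
print(averages)
```

Products without reviews and reviews without a matching product are both left out of the result in this sketch. A real analysis might want to report them instead.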
Common query patterns
Get all products in a category:
SELECT *
FROM c
WHERE
c.docType = "product" AND
c.categoryName = "Computers, Laptops"
Get reviews for a specific product:
SELECT *
FROM c
WHERE
c.docType = "review" AND
c.productId = "77be013f-4036-4311-9b5a-dab0c3d022be"
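The queries above embed literal values. When the filter value comes from user input, a parameterized form is usually preferable. The following Python sketch builds the query text and a parameter list using the name/value shape that Cosmos DB query parameters take. The build_query helper and the @docType and @filterValue parameter names are assumptions made for illustration.

```python
def build_query(doc_type, field, value):
    """Return (query_text, parameters) for a docType filter plus one equality filter."""
    # Property names cannot be passed as parameters, so restrict them to a known set.
    allowed_fields = {"categoryName", "productId"}
    if field not in allowed_fields:
        raise ValueError(f"Unsupported filter field: {field}")
    query = (
        "SELECT * FROM c "
        f"WHERE c.docType = @docType AND c.{field} = @filterValue"
    )
    parameters = [
        {"name": "@docType", "value": doc_type},
        {"name": "@filterValue", "value": value},
    ]
    return query, parameters


query, parameters = build_query("product", "categoryName", "Computers, Laptops")
print(query)
print(parameters)
```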
Vector sample data scenarios
- Semantic similarity search: Find products with similar features using vector embeddings
- Content-based recommendations: Generate product suggestions based on description similarity
- Hybrid queries: Combine traditional filters with vector similarity for enhanced results
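To make the hybrid scenario concrete, the following Python sketch filters products by category and then ranks them by cosine similarity to a query vector. It is a client-side illustration of the idea only, not a description of how the service executes vector search. The toy 3-dimensional vectors stand in for the 1536-dimensional embeddings, and the product names and the hybrid_search helper are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def hybrid_search(docs, query_vector, category, top_k=2):
    """Apply a traditional category filter first, then rank by vector similarity."""
    candidates = [
        d for d in docs
        if d["docType"] == "product" and d["categoryName"] == category
    ]
    ranked = sorted(
        candidates,
        key=lambda d: cosine_similarity(d["vectors"], query_vector),
        reverse=True,
    )
    return [d["name"] for d in ranked[:top_k]]


# Toy data: 3-dimensional vectors instead of real 1536-dimensional embeddings.
products = [
    {"docType": "product", "name": "A", "categoryName": "Accessories", "vectors": [1.0, 0.0, 0.0]},
    {"docType": "product", "name": "B", "categoryName": "Accessories", "vectors": [0.9, 0.1, 0.0]},
    {"docType": "product", "name": "C", "categoryName": "Accessories", "vectors": [0.0, 1.0, 0.0]},
    {"docType": "product", "name": "D", "categoryName": "Media", "vectors": [1.0, 0.0, 0.0]},
]

results = hybrid_search(products, [1.0, 0.0, 0.0], "Accessories")
print(results)
```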
JSON schemas
The following JSON schemas describe the structure of documents in both sample data sets. Use these schemas to validate or generate similar data for your own Cosmos DB workloads.
Standard product document schema
{
"type": "object",
"properties": {
"id": { "type": "string" },
"docType": { "type": "string" },
"productId": { "type": "string" },
"name": { "type": "string" },
"description": { "type": "string" },
"categoryName": { "type": "string" },
"inventory": { "type": "number" },
"firstAvailable": { "type": "string" },
"currentPrice": { "type": "number" },
"priceHistory": {
"type": "array",
"items": {
"type": "object",
"properties": {
"date": { "type": "string" },
"price": { "type": "number" }
},
"required": ["date", "price"]
}
}
},
"required": [
"id", "docType", "productId", "name", "description", "categoryName", "inventory", "firstAvailable", "currentPrice", "priceHistory"
]
}
Vector-enabled product document schema
{
"type": "object",
"properties": {
"id": { "type": "string" },
"docType": { "type": "string" },
"productId": { "type": "string" },
"name": { "type": "string" },
"description": { "type": "string" },
"categoryName": { "type": "string" },
"inventory": { "type": "number" },
"firstAvailable": { "type": "string" },
"currentPrice": { "type": "number" },
"priceHistory": {
"type": "array",
"items": {
"type": "object",
"properties": {
"date": { "type": "string" },
"price": { "type": "number" }
},
"required": ["date", "price"]
}
},
"vectors": {
"type": "array",
"items": { "type": "number" },
"minItems": 1536,
"maxItems": 1536
}
},
"required": [
"id", "docType", "productId", "name", "description", "categoryName", "inventory", "firstAvailable", "currentPrice", "priceHistory", "vectors"
]
}
Review document schema
{
"type": "object",
"properties": {
"id": { "type": "string" },
"docType": { "type": "string", "const": "review" },
"productId": { "type": "string" },
"categoryName": { "type": "string" },
"customerName": { "type": "string" },
"reviewDate": { "type": "string" },
"stars": { "type": "number" },
"reviewText": { "type": "string" }
},
"required": [
"id", "docType", "productId", "categoryName", "customerName",
"reviewDate", "stars"
]
}
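As a rough sketch of how these schemas might be used for validation, the following Python example checks a document against the review schema. It handles only the type, required, and const keywords that appear in the schemas above, so it is not a full JSON Schema implementation. The validate helper is hypothetical, and a dedicated JSON Schema library would be a better fit for real workloads.

```python
# Review schema copied from this article.
REVIEW_SCHEMA = {
    "type": "object",
    "properties": {
        "id": {"type": "string"},
        "docType": {"type": "string", "const": "review"},
        "productId": {"type": "string"},
        "categoryName": {"type": "string"},
        "customerName": {"type": "string"},
        "reviewDate": {"type": "string"},
        "stars": {"type": "number"},
        "reviewText": {"type": "string"},
    },
    "required": [
        "id", "docType", "productId", "categoryName", "customerName",
        "reviewDate", "stars",
    ],
}

# JSON Schema type names mapped to Python types (only the types used above).
TYPE_MAP = {"string": str, "number": (int, float), "array": list, "object": dict}


def validate(doc, schema):
    """Return a list of error messages; an empty list means the document passed."""
    errors = []
    for name in schema.get("required", []):
        if name not in doc:
            errors.append(f"missing required property: {name}")
    for name, rule in schema.get("properties", {}).items():
        if name not in doc:
            continue
        value = doc[name]
        # bool is a subclass of int in Python, but is not a JSON number or string.
        if isinstance(value, bool) or not isinstance(value, TYPE_MAP[rule["type"]]):
            errors.append(f"{name}: expected {rule['type']}")
        elif "const" in rule and value != rule["const"]:
            errors.append(f"{name}: expected constant {rule['const']!r}")
    return errors


review = {
    "id": "fa799013-1746-4a7f-bd0f-2a95b2b76481",
    "docType": "review",
    "productId": "e847e069-d0f9-4fec-b42a-d37cd5b2f536",
    "categoryName": "Accessories, Premium Headphones",
    "customerName": "Emily Rodriguez",
    "reviewDate": "2025-03-02T00:00:00",
    "stars": 5,
}
print(validate(review, REVIEW_SCHEMA))
```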