Unexpected Behavior with Field-Scoped Queries on Non-Searchable ID Field

Sharath Nataraj 0

Summary

I'm implementing a hybrid search solution where I calculate vector similarity scores separately (using a chunks index) and need to apply these as per-document boost values to my main search index. When I attempt to apply these boosts using field-scoped Lucene syntax on the id field (the key field), the document ranking changes unexpectedly in ways that don't align with the boost values I'm applying.

The core issue: My text search query returns documents in the correct order based on content relevance. However, when I add field-scoped ID boost clauses like (id:753100^2.2 OR id:752506^2.2 OR ...) to apply my externally-calculated vector boost scores, the document order flips unexpectedly.

Expected behavior: I expect each document to have a standardized base search score that I can then multiply by my boost values. If two documents both receive the same boost multiplier (e.g., ^2.2), their relative ranking should remain unchanged.

Actual behavior: Documents with identical boost values change their relative ranking, suggesting Azure Search is not applying boosts in a predictable, multiplicative manner.

Index Name

dev_sharath_nataraj_articles_v4

Index Configuration

Field: id
Type: Edm.String
Key: true
Searchable: false ❌
Filterable: true
Retrievable: true
Sortable: true

Other Searchable Fields: title, contentPlainText

Scoring Profile:
Name: timeDecay
Type: Freshness function on publishedDate field
Boost: 50
Interpolation: Logarithmic

Observed Behavior

Step 1: Text Search Works Correctly

This text search query returns documents in the expected order based on content relevance:

{
  "search": "((title:us^1.5 OR contentPlainText:us) AND (title:seafood~1^1.5 OR contentPlainText:seafood~1) AND (title:market~1^1.5 OR contentPlainText:market~1) AND (title:update~1^1.5 OR contentPlainText:update~1))^1",
  "select": "id,title,publishedAt",
  "filter": "id eq '752506' or id eq '753100'",
  "queryType": "full",
  "searchMode": "all",
  "scoringProfile": "timeDecay",
  "top": 15
}

Results:

{
  "value": [
    {
      "@search.score": 553.4935,
      "id": "753100",
      "publishedAt": "2025-11-20T22:11:17.277Z",
      "title": "US Seafood Market Update"
    },
    {
      "@search.score": 487.40683,
      "id": "752506",
      "publishedAt": "2025-11-18T22:21:49.127Z",
      "title": "US Seafood Market Update"
    }
  ]
}

✅ Document order: 753100 (553.49) scores higher than 752506 (487.41) - this is correct based on text matching

Step 2: Adding Vector Boosts Causes Unexpected Order Flip

Context: I calculate vector similarity scores separately using a chunks index. Based on these vector scores, I want to boost specific documents. In this example:

Document 753100 should receive a boost of ^2.2

Document 752506 should receive a boost of ^2.2

Approximately 200 other documents also receive various boost values

I add these boosts using field-scoped syntax on the id field:

{
  "search": "((title:us^1.5 OR contentPlainText:us) AND (title:seafood~1^1.5 OR contentPlainText:seafood~1) AND (title:market~1^1.5 OR contentPlainText:market~1) AND (title:update~1^1.5 OR contentPlainText:update~1))^1 OR (id:757511^2.2 OR id:755664^2.2 OR id:755376^2.2 OR id:753100^2.2 OR id:752506^2.2 OR id:751651^2.2 OR id:751318^2.2 OR ... approximately 200 IDs with their respective boost values ...)",
  "select": "id,title,publishedAt",
  "filter": "id eq '752506' or id eq '753100'",
  "queryType": "full",
  "searchMode": "all",
  "scoringProfile": "timeDecay",
  "top": 15
}

Results:

{
  "value": [
    {
      "@search.score": 1408.0555,
      "id": "752506",
      "publishedAt": "2025-11-18T22:21:49.127Z",
      "title": "US Seafood Market Update"
    },
    {
      "@search.score": 1406.3658,
      "id": "753100",
      "publishedAt": "2025-11-20T22:11:17.277Z",
      "title": "US Seafood Market Update"
    }
  ]
}

❌ Document order: 752506 (1408.06) now scores higher than 753100 (1406.37) - ORDER FLIPPED

Analysis of the Problem

What I expected:

Document 753100 had original score: 553.49

Document 752506 had original score: 487.41

Both receive identical boost: ^2.2

Expected: Relative order should be preserved: 753100 * 2.2 > 752506 * 2.2

What actually happened:

After applying boosts, 752506 scores 1408.06

After applying boosts, 753100 scores 1406.37

The document that scored LOWER originally now scores HIGHER

This suggests boosts are not being applied multiplicatively to a base score.

Key observations:

Both documents received identical boost values (^2.2) in my query

The original text search clearly preferred 753100 (553.49 vs 487.41)

After adding boosts, the order reversed despite identical boost multipliers

Scores increased dramatically (from ~500 to ~1400)

The difference between published dates (Nov 18 vs Nov 20) affects the freshness scoring profile, but this alone doesn't explain the ranking flip
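
As a quick check of the numbers above, the four posted scores can be compared against a pure "base × 2.2" model. This is only arithmetic on the scores from Step 1 and Step 2; the variable names are illustrative.

```python
# Scores copied from the Step 1 and Step 2 responses above.
base = {"753100": 553.4935, "752506": 487.40683}
boosted = {"753100": 1406.3658, "752506": 1408.0555}
BOOST = 2.2  # the ^2.2 applied to both ids in the query

# Multiplier actually observed for each document.
ratios = {doc_id: boosted[doc_id] / base[doc_id] for doc_id in base}

for doc_id in base:
    expected = base[doc_id] * BOOST  # what a pure multiplier would give
    print(
        f"{doc_id}: expected {expected:.2f}, "
        f"got {boosted[doc_id]:.2f}, ratio {ratios[doc_id]:.3f}"
    )
```

Neither document ends up at 2.2 times its original score, and the two observed ratios differ from each other (roughly 2.54 for 753100 and 2.89 for 752506), which is what lets the order flip.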

Questions

Question 1: Is using field-scoped queries on non-searchable fields supported?

The documentation states: "The field specified in fieldName:searchExpression must be a searchable field"

However:

My query with (id:753100^2.2 OR ...) does NOT throw an error (unlike boolean non-searchable fields which throw: "Illegal arguments in query request: [field] is not a searchable field")

The query executes and returns results, but with unpredictable scoring behavior

There is no documentation explaining what happens when this requirement is violated for string fields

Question: Is this supported or unsupported behavior? If unsupported, why doesn't it throw an error?

Question 2: What is Azure Search actually doing with field-scoped queries on non-searchable ID fields?

When I include (id:753100^2.2 OR id:752506^2.2 OR ...) in my query:

Is it searching for the literal ID values as text in my searchable fields (title, contentPlainText)?

Is it applying the boost values (^2.2)? If so, how and to what base score?

Why do documents with identical boost values change their relative ranking?

Why don't the boosts behave multiplicatively as expected?

Question 3: What provides the "base score" for boosting in Azure Search?

For term boosting to work predictably, there needs to be a consistent base score per document that can be multiplied by the boost factor.

When I use (id:753100^2.2), what base score is being boosted?

Is the base score from the text search portion of the query?

Is the base score from the scoring profile alone?

Why would two documents with identical boosts have their ranking order reverse?

Question 4: What is the recommended approach for applying externally-calculated per-document boost scores?

RAMAMURTHY MAKARAPU 1,125 Reputation points Microsoft External Staff Moderator

2025-12-05T19:11:48.9266667+00:00
Hi @Sharath Nataraj ,

Thank you for submitting your question on Microsoft Q&A.

Let me answer your questions

Q1) Is field-scoping on non-searchable fields supported?

No, it’s not supported. Azure AI Search only allows field-scoped queries (for example, id:123) on fields that are marked searchable in the index schema.

This is because searchable fields are placed into the full-text inverted index, which is what the search engine uses to match terms and compute relevance scores. If a field is not searchable, it isn’t included in that inverted index, so it cannot participate in normal full-text matching.

Why you don’t get an error even though it’s unsupported

Azure AI Search uses a full Lucene query parser, which means it can parse expressions like id:123 without complaining. But parsing successfully does not mean the query will behave as you expect.

Since id is not searchable, Azure AI Search simply ignores the clause for text-matching purposes.

Because it’s not part of the full-text index, the term doesn’t contribute to the scoring pipeline.

As a result, your boost expressions (e.g., id:123^10) won’t behave in the usual “increase importance of this match” way.

This leads to results that look confusing: scores may not increase the way you expect, because the boosted clause isn’t acting like a true searchable term match.
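
Since the service does not reject the clause, a client-side guard can catch it before the query is sent. Below is a minimal sketch, assuming you keep a copy of the index schema as a plain dict. The regex and the helper name non_searchable_scopes are illustrative and not part of any Azure SDK. The pattern is deliberately simple and would also match a colon inside a quoted phrase.

```python
import re

# Matches the field name in field-scoped terms such as title:us or id:753100.
FIELD_SCOPE = re.compile(r"\b([A-Za-z_][A-Za-z0-9_]*):")


def non_searchable_scopes(query, schema):
    """Return field names used as scopes in `query` that are not searchable.

    Fields missing from `schema` are reported as well.
    """
    used = dict.fromkeys(FIELD_SCOPE.findall(query))  # unique, in order seen
    return [
        name
        for name in used
        if not schema.get(name, {}).get("searchable", False)
    ]


# Local copy of the relevant schema attributes (from Index Configuration).
schema = {
    "id": {"searchable": False},
    "title": {"searchable": True},
    "contentPlainText": {"searchable": True},
}

query = (
    "((title:us^1.5 OR contentPlainText:us) AND "
    "(title:seafood~1^1.5 OR contentPlainText:seafood~1)) "
    "OR (id:753100^2.2 OR id:752506^2.2)"
)

offending = non_searchable_scopes(query, schema)
print(offending)
```

For the query in the question this flags id, the only scoped field that is not marked searchable.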

Q2) What is Azure Search actually doing with id:753100^2.2 on a non‑searchable field?

With a non‑searchable id:

The engine does not perform a lexical match against the id field (there’s no text index there).

The clause gets parsed, but the way it contributes to the query tree/score is not documented for non-searchable fields (hence “unsupported”). It isn’t searching for "753100" inside title/contentPlainText unless you remove the field scoping.

The boost (^2.2) increases the weight of that clause, but final rank is the sum of weighted clause scores plus scoring‑profile contributions—not a pure multiplier of “the base score of the document.” In large OR expressions, Boolean scoring and normalization can reshuffle relative order even when the two target docs get the same multiplier.

The jump from ~500 to ~1400 you observed is consistent with “extra boosted clause(s) added to the query tree → more total score,” not with a deterministic “base × 2.2”.
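
One way to read the posted scores against this explanation is to subtract each document's Step 1 score from its Step 2 score. This is only arithmetic on the numbers in the question, not a reconstruction of the service's scoring, and it cannot show how the scoring profile or any normalization enters.

```python
# Scores copied from the two responses in the question.
base = {"753100": 553.4935, "752506": 487.40683}
boosted = {"753100": 1406.3658, "752506": 1408.0555}

# Score each document gained once the OR group of id clauses was added.
added = {doc_id: boosted[doc_id] - base[doc_id] for doc_id in base}

gap_before = base["753100"] - base["752506"]
gap_after = boosted["753100"] - boosted["752506"]

for doc_id, gain in added.items():
    print(f"{doc_id}: gained {gain:.2f}")
print(f"gap before: {gap_before:.2f}, gap after: {gap_after:.2f}")
```

The two documents gain different amounts (about 852.87 versus 920.65), so the extra clauses did not contribute one fixed amount per document either. The difference in gain is slightly larger than the original gap of about 66.09, which is enough to reverse the order.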

Q3) What provides the “base score” that boosts act on?

For keyword (lexical) search, Azure AI Search uses Lucene’s scoring (BM25) over matched terms, then applies scoring profile functions (freshness, magnitude, distance, tags) as additive contributions. Term boosts change clause weights; they don’t multiply an already‑computed global base.

Two things to be aware of:

Local statistics vs global statistics. If you do not set scoringStatistics: "global", term statistics such as document frequency (and therefore IDF) are computed per shard rather than across the whole index, so the BM25 component for otherwise similar documents can differ enough to flip close rankings. Setting "global" makes lexical scores more consistent.

Scoring profile is additive, not multiplicative. Your timeDecay freshness function (boost 50, log interpolation) adds to the BM25 score. Two docs with the same term boosts can still swap order if freshness contributions differ slightly and BM25 normalizations move.

Q4) Recommended way to apply externally‑computed per‑document boosts

There are two supported, predictable patterns. Choose the one that fits your architecture:

Pattern A: Store the external score in a numeric field and boost via a scoring profile (recommended when you want full control)

Add a field to the index schema, e.g.:

{ "name": "vectorBoost", "type": "Edm.Double", "filterable": true, "sortable": false, "facetable": false, "searchable": false, "retrievable": true }

Normalize your external vector score offline (e.g., min‑max to [0,1] or map to [0,3] for a small, bounded influence).

Combine functions inside a single scoring profile (you can only select one profile per query). Keep your timeDecay freshness function and add a magnitude function for vectorBoost:

{
  "name": "timeDecay_plus_vector",
  "text": {
    "weights": { "title": 1.5, "contentPlainText": 1.0 }
  },
  "functions": [
    {
      "type": "freshness",
      "fieldName": "publishedDate",
      "boost": 50,
      "interpolation": "logarithmic",
      "freshness": { "boostingDuration": "P60D" }
    },
    {
      "type": "magnitude",
      "fieldName": "vectorBoost",
      "boost": 250,
      "interpolation": "linear",
      "magnitude": {
        "boostingRangeStart": 0.0,
        "boostingRangeEnd": 1.0,
        "constantBoostBeyondRange": true
      }
    }
  ],
  "functionAggregation": "sum"
}

The boost of 250 on vectorBoost is the “gain”; tune it for your data.

Why “sum”? Because profile functions are additive. To emulate “multiply the base by k” you choose a gain that moves scores proportionally within the typical BM25 range you observe. This keeps ordering stable when two docs share the same vectorBoost
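
The offline normalization step in Pattern A could be sketched as follows. The raw scores are made-up values, and min_max and merge_batch are hypothetical helper names. The output uses the "@search.action": "merge" shape of an index-documents request body, with the id and vectorBoost field names taken from this thread.

```python
import json


def min_max(scores):
    """Scale raw scores to [0, 1]; all-equal input maps to 0.0."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 0.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def merge_batch(normalized, key_field="id", boost_field="vectorBoost"):
    """Build a request body that merges the boost into existing documents."""
    return {
        "value": [
            {
                "@search.action": "merge",
                key_field: doc_id,
                boost_field: round(value, 4),
            }
            for doc_id, value in normalized.items()
        ]
    }


# Hypothetical raw vector similarity scores from the chunks index.
raw = {"753100": 0.83, "752506": 0.83, "751651": 0.41}

normalized = min_max(raw)
batch = merge_batch(normalized)
print(json.dumps(batch, indent=2))
```

Two documents with the same raw score receive the same vectorBoost value, so the magnitude function treats them identically.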

Query settings for stability (scoringStatistics is the important one):

"queryType": "full",
"searchMode": "all",
"scoringProfile": "timeDecay_plus_vector",
"scoringStatistics": "global"

By scoping the lexical match to your actual text fields and keeping ID selection in the filter, you prevent the OR-clause normalization effects you saw.
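
Put together, a request body for Pattern A might be assembled like this. It is a sketch that reuses the parameter names from the queries in the question and assumes the timeDecay_plus_vector profile exists in the index. build_query is a hypothetical helper.

```python
import json


def build_query(search_text, ids, profile="timeDecay_plus_vector"):
    """Build a search request body with no id clauses in the search text."""
    body = {
        "search": search_text,
        "queryType": "full",
        "searchMode": "all",
        "scoringProfile": profile,
        "scoringStatistics": "global",
        "select": "id,title,publishedAt",
        "top": 15,
    }
    if ids:
        # ID selection stays in the OData filter, as in the original queries.
        body["filter"] = " or ".join(f"id eq '{doc_id}'" for doc_id in ids)
    return body


text_query = (
    "(title:us^1.5 OR contentPlainText:us) AND "
    "(title:seafood~1^1.5 OR contentPlainText:seafood~1) AND "
    "(title:market~1^1.5 OR contentPlainText:market~1) AND "
    "(title:update~1^1.5 OR contentPlainText:update~1)"
)

body = build_query(text_query, ["752506", "753100"])
print(json.dumps(body, indent=2))
```

The per-document boost now comes only from the stored vectorBoost field, so the search text no longer needs the roughly 200 id clauses.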

Pattern B: Use built-in hybrid search (lexical + vector)

If you can store embeddings per document, let Azure AI Search do the merge/rerank for you:

Add a vector field (e.g., 768‑dim float array) to the index.

At query time, send both the text query and the vector query (the vectorQueries parameter) in the same request and let the service combine them consistently (hybrid search is generally available). This removes the need to handcraft ID-based boost clauses and produces stable ranking across updates.

This is especially effective when you’re already calculating chunk‑level vectors—aggregate per‑doc vectors (e.g., average or max‑pooled), or keep chunk vectors and use document mapping in your app.
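
A rough sketch of Pattern B, assuming a vector field named contentVector has been added to the index (the field name, the k value, and both helper names are hypothetical). It averages chunk vectors into one per-document vector and builds a hybrid request body that carries the text query and a vectorQueries entry together.

```python
def mean_pool(chunk_vectors):
    """Average equal-length chunk vectors into one per-document vector."""
    if not chunk_vectors:
        raise ValueError("need at least one chunk vector")
    dim = len(chunk_vectors[0])
    if any(len(v) != dim for v in chunk_vectors):
        raise ValueError("chunk vectors must share one dimension")
    count = len(chunk_vectors)
    return [sum(v[i] for v in chunk_vectors) / count for i in range(dim)]


def hybrid_body(search_text, query_vector, vector_field="contentVector", k=50):
    """Build a request body holding both a text query and a vector query."""
    return {
        "search": search_text,
        "queryType": "full",
        "searchMode": "all",
        "vectorQueries": [
            {
                "kind": "vector",
                "vector": query_vector,
                "fields": vector_field,
                "k": k,
            }
        ],
        "select": "id,title,publishedAt",
        "top": 15,
    }


# Toy two-dimensional vectors; real embeddings have hundreds of dimensions.
doc_vector = mean_pool([[1.0, 2.0], [3.0, 4.0]])
body = hybrid_body("title:seafood OR contentPlainText:seafood", [0.1, 0.2])
print(doc_vector, body["vectorQueries"][0]["fields"])
```

Mean pooling is only one option; as noted above, max pooling or keeping chunk-level vectors with document mapping in the application are alternatives.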

Reference:

https://learn.microsoft.com/en-us/azure/search/search-query-create?tabs=portal-text-query

https://learn.microsoft.com/en-us/java/api/com.azure.search.documents.models.searchoptions?view=azure-java-stable

https://learn.microsoft.com/en-us/azure/search/query-lucene-syntax

Kindly let us know if the above comment helps or you need further assistance on this issue.
