Azure OCR: How can I find the sentence a word belongs to?

Gökhan DUMLUPINAR 0 Reputation points
2025-11-22T13:41:00.7166667+00:00

I use Azure OCR. I can use it to find words, but when I click on a word, I want to see the whole sentence that it is part of. The extracted text is not well separated.

const extractedText =
  iaResult.readResult.blocks
    .map((block) => block.lines.map((line) => line.text).join('\n'))
    .join('\n\n') || '';

const boundingBoxes = [];

iaResult.readResult?.blocks?.forEach((block) => {
  block.lines?.forEach((line) => {
    // Instead of taking 'line.boundingPolygon', we go into the 'line.words' array.
    line.words?.forEach((word) => {
      // Each 'word' object has its own boundingPolygon.
      const flatBox = word.boundingPolygon?.flatMap((point) => [point.x, point.y]) || [];
      if (flatBox.length > 0) {
        boundingBoxes.push({
          text: word.text,
          box: flatBox,
        });
      }
    });
  });
});

Computer Vision
An Azure artificial intelligence service that analyzes content in images and video.

3 answers

  1. Q&A Assist
    2025-11-22T13:41:11.0433333+00:00
    AI generated content. This question contains an answer created with AI from Q&A Assist.

    To find the entire sentence that a specific word belongs to when using Azure OCR, you can modify your approach to extract the lines of text instead of just the words. Since each line of text is associated with its own bounding polygon, you can use this information to retrieve the full sentence when a word is clicked.

    Here’s a general approach you can follow:

    1. Extract Lines: Instead of focusing solely on words, extract the lines from the OCR results. Each line will contain the words that make up that sentence.
    2. Identify Sentence: When a word is clicked, check which line it belongs to by comparing the bounding box of the word with the bounding boxes of the lines.
    3. Display Sentence: Once you identify the line that contains the clicked word, you can display the entire line (or sentence) to the user.

    Here’s a simplified code snippet to illustrate this:

    const extractedText = iaResult.readResult.blocks
        .map((block) => block.lines.map((line) => line.text).join('\n'))
        .join('\n\n') || '';
    
    iaResult.readResult?.blocks?.forEach((block) => {
        block.lines.forEach((line) => {
            line.words.forEach((word) => {
                const flatBox = word.boundingPolygon?.flatMap((point) => [point.x, point.y]) || [];
                if (flatBox.length > 0) {
                    // Logic to handle word click
                    // Display the entire line text when a word is clicked
                    console.log('Sentence:', line.text);
                }
            });
        });
    });
    

    Note that this snippet logs the parent line of every word as it loops over the result. To react to a click, run the same lookup inside your click handler for the word that was clicked, and replace the console.log statement with your own logic to display the sentence in your application.
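
The "compare bounding boxes" step can be sketched as a hit test on the click position. This is a minimal sketch, not an official Azure API: `indexWords` and `lineAtPoint` are hypothetical helper names, the input is assumed to have the same `blocks → lines → words` shape as `iaResult.readResult` in the question, and the click coordinates are assumed to be already converted to image pixel coordinates.

```javascript
// Hypothetical helpers; the result shape mirrors iaResult.readResult from the question.

// Build a flat list of words, each remembering the text of its parent line.
function indexWords(readResult) {
  const entries = [];
  for (const block of readResult?.blocks ?? []) {
    for (const line of block.lines ?? []) {
      for (const word of line.words ?? []) {
        const polygon = word.boundingPolygon ?? [];
        if (polygon.length === 0) continue;
        const xs = polygon.map((p) => p.x);
        const ys = polygon.map((p) => p.y);
        entries.push({
          text: word.text,
          lineText: line.text,
          // Axis-aligned box around the polygon (only an approximation for rotated text).
          minX: Math.min(...xs), maxX: Math.max(...xs),
          minY: Math.min(...ys), maxY: Math.max(...ys),
        });
      }
    }
  }
  return entries;
}

// Return the parent line text of the first word whose box contains (x, y), or null.
function lineAtPoint(entries, x, y) {
  const hit = entries.find(
    (e) => x >= e.minX && x <= e.maxX && y >= e.minY && y <= e.maxY
  );
  return hit ? hit.lineText : null;
}

// Made-up sample data for illustration only.
const box = (x0, y0, x1, y1) => [
  { x: x0, y: y0 }, { x: x1, y: y0 }, { x: x1, y: y1 }, { x: x0, y: y1 },
];
const sample = {
  blocks: [{
    lines: [
      { text: 'Hello world', words: [
        { text: 'Hello', boundingPolygon: box(0, 0, 50, 20) },
        { text: 'world', boundingPolygon: box(60, 0, 110, 20) },
      ] },
      { text: 'Hello again', words: [
        { text: 'Hello', boundingPolygon: box(0, 30, 50, 50) },
        { text: 'again', boundingPolygon: box(60, 30, 110, 50) },
      ] },
    ],
  }],
};

const entries = indexWords(sample);
console.log(lineAtPoint(entries, 70, 10)); // Hello world
console.log(lineAtPoint(entries, 10, 40)); // Hello again
```

Because the lookup goes by position rather than by word text, the two "Hello" words resolve to different lines. If the image is displayed scaled, the click position would need to be divided by the display scale before calling `lineAtPoint`.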


  2. Adam Zachary 2,025 Reputation points
    2025-11-24T00:06:40.6166667+00:00

    I’ve worked with Azure OCR and ran into the same problem.

    The reason you can’t easily get the full sentence from a clicked word is that the OCR API doesn’t return “sentences.” It only returns lines, and each line is the closest thing to a complete sentence that you get.

    So, the easiest way is:

    When the user clicks a word, find the line object that contains that word, and return line.text. That will give you the full phrase Azure OCR produced from that region.

    Something like:

    const line = block.lines.find(l => 
      l.words.some(w => w.text === clickedWord)
    );
    
    const sentence = line?.text;
    

    Azure OCR can’t reconstruct grammar-based sentences; it only groups text visually into lines. So the only way is to map the clicked word back to its parent line.
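
One caveat with matching on `w.text`: if the same word appears in several lines, the first match wins. A possible way around that, sketched here under assumptions, is to give every word an id when the clickable boxes are built and to look the line up by that id. `buildWordIndex`, `sentenceForWordId`, and the `block-line-word` id format are hypothetical names chosen for this example, not part of the Azure response.

```javascript
// Hypothetical helpers: tag each word with a positional id and keep a reference
// to its parent line, so the lookup does not depend on the word's text.
function buildWordIndex(readResult) {
  const words = [];
  (readResult?.blocks ?? []).forEach((block, blockIndex) => {
    (block.lines ?? []).forEach((line, lineIndex) => {
      (line.words ?? []).forEach((word, wordIndex) => {
        words.push({
          id: `${blockIndex}-${lineIndex}-${wordIndex}`, // assumed id format
          text: word.text,
          line, // reference to the parent line object
        });
      });
    });
  });
  return words;
}

// Return the parent line text for a word id, or null if the id is unknown.
function sentenceForWordId(words, id) {
  return words.find((w) => w.id === id)?.line.text ?? null;
}

// Made-up sample: the word "the" occurs in both lines.
const sampleResult = {
  blocks: [{
    lines: [
      { text: 'the cat sat', words: [{ text: 'the' }, { text: 'cat' }, { text: 'sat' }] },
      { text: 'on the mat', words: [{ text: 'on' }, { text: 'the' }, { text: 'mat' }] },
    ],
  }],
};

const wordIndex = buildWordIndex(sampleResult);
console.log(sentenceForWordId(wordIndex, '0-0-0')); // the cat sat
console.log(sentenceForWordId(wordIndex, '0-1-1')); // on the mat
```

In a UI, the id could be stored on each clickable element (for example in a data attribute) and passed to `sentenceForWordId` from the click handler.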


  3. SRILAKSHMI C 11,040 Reputation points Microsoft External Staff Moderator
    2025-11-28T18:00:49.76+00:00

    Hello Gökhan DUMLUPINAR,

    Welcome to Microsoft Q&A, and thank you for reaching out.

    I understand that you are trying to get Azure OCR to return the full sentence or line whenever a specific word is clicked. Azure OCR does give you word-level bounding boxes, but the sentence/line structure isn't always clean because the API mainly organizes text into blocks → lines → words.

    To achieve what you want, here are the key things to adjust:

    1. Use line or block context when a word is selected

    Each word already belongs to a specific line, and each line belongs to a block. So instead of relying on extracted text (which might not be well separated), you should use the API’s structure directly:

    • When a word is clicked → find its parent line
    • Return line.text (which is the closest equivalent to a “sentence”)
    • If you need a bigger context, return block text
    2. Modify your logic to capture the entire line on word click

    Right now you are looping only through line.words. You should simply map the clicked word back to its parent line:

    iaResult.readResult.blocks.forEach((block) => {
        block.lines.forEach((line) => {
            line.words.forEach((word) => {
                if (word.text === targetWord) {
                    console.log(line.text); 
                    // This gives the complete sentence/line for the clicked word
                }
            });
        });
    });
    
    3. Improve extracted text structure (optional)

    If the extracted text looks jumbled when you concatenate it, make sure you group it by block and then by line:

    const extractedText =
      iaResult.readResult.blocks
        .map(block => block.lines.map(line => line.text).join('\n'))
        .join('\n\n');
    

    For better sentence/paragraph separation

    Azure OCR is line-based. If you need richer structure (paragraphs, tables, semantic grouping), consider using Azure Document Intelligence, which gives better layout and sentence continuity.

    You can find which sentence a clicked word belongs to by using the OCR hierarchy. Every word already has a parent line, so when you click a word, simply return line.text. This is the correct and most reliable way to show the full sentence or line for that word, since Azure OCR does not automatically separate full sentences. If you need more advanced text grouping, Azure Document Intelligence can provide better structure.
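
If a line is not enough because a sentence wraps across several lines, one option is to rebuild sentences yourself. The following is a rough sketch of that idea and not something Azure OCR provides: it joins all words of a block into one string, splits naively on `.`, `!` or `?` followed by whitespace, and returns the sentence that contains a given word. The function names are hypothetical, and the split will be wrong for abbreviations such as "e.g." and for words hyphenated across lines.

```javascript
// Hypothetical helper: join a block's words into one string and record
// the character offset at which each word starts.
function sentencesForBlock(block) {
  let text = '';
  const wordOffsets = [];
  (block.lines ?? []).forEach((line, lineIndex) => {
    (line.words ?? []).forEach((word, wordIndex) => {
      if (text.length > 0) text += ' ';
      wordOffsets.push({ lineIndex, wordIndex, offset: text.length });
      text += word.text;
    });
  });

  // Naive boundary rule: ., ! or ? followed by whitespace or the end of the text.
  const sentences = [];
  const boundary = /[.!?]+(?=\s|$)/g;
  let start = 0;
  let match;
  while ((match = boundary.exec(text)) !== null) {
    const end = match.index + match[0].length;
    sentences.push({ start, end, text: text.slice(start, end).trim() });
    start = end;
  }
  if (start < text.length) {
    // Trailing text without closing punctuation is kept as a final sentence.
    sentences.push({ start, end: text.length, text: text.slice(start).trim() });
  }
  return { sentences, wordOffsets };
}

// Return the reconstructed sentence containing the word at (lineIndex, wordIndex).
function sentenceForWord(block, lineIndex, wordIndex) {
  const { sentences, wordOffsets } = sentencesForBlock(block);
  const entry = wordOffsets.find(
    (w) => w.lineIndex === lineIndex && w.wordIndex === wordIndex
  );
  if (!entry) return null;
  const hit = sentences.find((s) => entry.offset >= s.start && entry.offset < s.end);
  return hit ? hit.text : null;
}

// Made-up sample block: two sentences wrapped over three lines.
const sampleBlock = {
  lines: [
    { text: 'The quick brown fox',
      words: [{ text: 'The' }, { text: 'quick' }, { text: 'brown' }, { text: 'fox' }] },
    { text: 'jumps. It lands softly',
      words: [{ text: 'jumps.' }, { text: 'It' }, { text: 'lands' }, { text: 'softly' }] },
    { text: 'on the grass.',
      words: [{ text: 'on' }, { text: 'the' }, { text: 'grass.' }] },
  ],
};

console.log(sentenceForWord(sampleBlock, 1, 0)); // The quick brown fox jumps.
console.log(sentenceForWord(sampleBlock, 2, 1)); // It lands softly on the grass.
```

Whether this works well depends on how the document is laid out; for multi-column pages or tables, the block grouping may not follow reading order.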

    I hope this helps. Do let me know if you have any further queries.

    Thank you!

