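Prompt a built-in language model with the Prompt API

The Prompt API is an experimental web API that allows you to prompt a small language model (SLM) that's built into Microsoft Edge, from your website or browser extension's JavaScript code. Use the Prompt API to generate and analyze text or create application logic based on user input, and discover innovative ways to integrate prompt engineering capabilities into your web application.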

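Detailed contents:

Availability of the Prompt API
Alternatives and benefits of the Prompt API
The Phi-4-mini model
Enable the Prompt API
See a working example
Use the Prompt API
Send feedback
See also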

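Availability of the Prompt API

The Prompt API is available as a developer preview in the Microsoft Edge Canary or Dev channels, starting with version 138.0.3309.2.

The Prompt API is intended to help discover use cases and understand the challenges of built-in SLMs. This API is expected to be succeeded by other experimental APIs for specific AI-powered tasks, such as writing assistance and text translation. To learn more about these other APIs, see:

Summarize, write, and rewrite text with the Writing Assistance APIs
The webmachinelearning/translation-api repo.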

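Alternatives and benefits of the Prompt API

To use AI capabilities in websites and browser extensions, you can also use the following methods:

Send network requests to cloud-based AI services, such as Azure AI solutions.
Run local AI models by using the Web Neural Network (WebNN) API or ONNX Runtime for Web.

The Prompt API uses an SLM that runs locally, on the same device where the inputs to and outputs of the model are used. Compared to cloud-based solutions, this has the following benefits:

Reduced cost: Because the model runs locally, there is no cost for using a cloud AI service.
Network independence: Beyond the initial model download, there is no network latency when prompting the model, and the model can also be used when the device is offline.
Improved privacy: The data that's input to the model never leaves the device and is not collected to train AI models.

The Prompt API uses a model that's provided by Microsoft Edge and built into the browser, which has additional benefits over custom local solutions based on WebGPU, WebNN, or WebAssembly:

Shared one-time cost: The browser-provided model is downloaded the first time the API is called and is shared across all websites that run in the browser, which reduces network costs for users and developers.
Simplified usage for web developers: The built-in model can be run by using straightforward web APIs, and doesn't require AI/ML expertise or third-party frameworks.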

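The Phi-4-mini model

The Prompt API allows you to prompt Phi-4-mini, a powerful small language model that excels at text-based tasks and is built into Microsoft Edge. To learn more about Phi-4-mini and its capabilities, see the model card at microsoft/Phi-4-mini-instruct.

Disclaimer

Like other language models, the Phi family of models can behave in ways that are unfair, unreliable, or offensive. To learn more about AI considerations for the model, see Responsible AI Considerations.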

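Hardware requirements

The Prompt API developer preview is intended to work on devices whose hardware capabilities produce SLM outputs with predictable quality and latency. The Prompt API is currently limited to:

Operating system: Windows 10 or 11, and macOS 13.3 or later.
Storage: At least 20 GB available on the volume that contains your Edge profile. If the available storage drops below 10 GB, the model is deleted to ensure that other browser features have enough space to function.
GPU: 5.5 GB of VRAM or more.
Network: Unlimited data plan or unmetered connection. The model isn't downloaded over a metered connection.

To check whether your device supports the Prompt API developer preview, see Enable the Prompt API below, and check your device performance class.

Because the Prompt API is experimental, you might observe issues on specific hardware configurations. If you do, please provide feedback by opening a new issue in the MSEdgeExplainers repository.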

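Model availability

An initial download of the model is required the first time a website calls a built-in AI API. You can monitor the model download by using the monitor option when creating a new Prompt API session. To learn more, see Monitor the progress of the model download, below.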

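Enable the Prompt API

To use the Prompt API in Microsoft Edge:

Make sure you're using the latest version of Microsoft Edge Canary or Dev (version 138.0.3309.2 or later). See Become a Microsoft Edge Insider.
In Microsoft Edge Canary or Dev, open a new tab or window and go to edge://flags/.
In the search box at the top of the page, enter Prompt API for Phi mini.

The page is filtered to show the matching flag.
Under Prompt API for Phi mini, select Enabled.
Optionally, to log information locally that can be useful for debugging issues, also enable the on-device AI model debug logs flag.
Restart Microsoft Edge Canary or Dev.
To check whether your device meets the hardware requirements for the Prompt API developer preview, open a new tab, go to edge://on-device-internals, and check the Device performance class value.

If your device performance class is High or greater, the Prompt API should be supported on your device. If you continue to notice issues, please create a new issue in the MSEdgeExplainers repository.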

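See a working example

To see the Prompt API in action and review existing code that uses the API:

Enable the Prompt API, as described above.
In Microsoft Edge Canary or Dev, open a tab or window and go to the Prompt API playground.

In the Built-in AI playgrounds navigation on the left, Prompt is selected.
In the information banner at the top, check the status. Initially, the banner reads "Model downloading, please wait".

After the model has downloaded, the information banner reads "API and model ready", indicating that the API and model can be used.

If the model download doesn't start, restart Microsoft Edge and try again.

The Prompt API is only supported on devices that meet certain hardware requirements. For more information, see Hardware requirements, above.
Optionally, change the prompt settings values, such as:
- User prompt
- System prompt
- Response constraint schema
- More settings > N-shot prompt instructions
- TopK
- Temperature
Click the Prompt button, at the bottom of the page.

The response is generated in the response section of the page.
To stop generating the response, click the Stop button at any time.

See also:

/built-in-ai/ - Source code and Readme for the Built-in AI playgrounds, including the Prompt API playground.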

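Use the Prompt API

Check if the API is enabled

Before using the API in your website or extension's code, check that the API is enabled by testing for the presence of the LanguageModel object: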

if (typeof LanguageModel === "undefined") {
  // The Prompt API is not available.
} else {
  // The Prompt API is available.
}
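
As a small sketch of the check above, feature detection can be wrapped in a helper that doesn't throw in browsers where LanguageModel isn't defined. The helper name hasPromptApi and its scope parameter are hypothetical and introduced only for illustration; passing the scope explicitly also makes the check easy to exercise outside the browser.

```javascript
// Hypothetical helper (not part of the Prompt API): reports whether a
// LanguageModel entry point is exposed on the given global scope.
function hasPromptApi(scope = globalThis) {
  // Reading a property of an object doesn't throw when the property is
  // missing, unlike referencing an undeclared identifier directly.
  const entry = scope.LanguageModel;
  return typeof entry === "function" ||
    (typeof entry === "object" && entry !== null);
}

if (hasPromptApi()) {
  console.log("The Prompt API is exposed in this context.");
} else {
  console.log("The Prompt API isn't exposed in this context.");
}
```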

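Check if the model can be used

The Prompt API can only be used if the device supports running the model, and once the language model and model runtime have been downloaded by Microsoft Edge.

To check if the API can be used, use the LanguageModel.availability() method: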

const availability = await LanguageModel.availability();

if (availability == "unavailable") {
  // The model is not available.
}

if (availability == "downloadable" || availability == "downloading") {
  // The model can be used, but it needs to be downloaded first.
}

if (availability == "available") {
  // The model is available and can be used.
}
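
One possible way to act on these states is to map each of them to a single next step for the calling code. The following is a minimal sketch; the helper name nextStepForAvailability and the action names it returns are assumptions made for this example, not part of the Prompt API.

```javascript
// Hypothetical helper: turns a LanguageModel.availability() result into a
// simple action name for the calling code. The action names are made up.
function nextStepForAvailability(availability) {
  switch (availability) {
    case "available":
      // The model is ready to use.
      return "prompt";
    case "downloadable":
    case "downloading":
      // The model can be used, but it must finish downloading first.
      return "wait-for-download";
    case "unavailable":
    default:
      // Treat unexpected values the same way as an unavailable model.
      return "fallback";
  }
}

console.log(nextStepForAvailability("downloadable"));
```

Treating unknown values as unavailable is a conservative choice, given that the API is experimental and its set of states might change.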

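Create a new session

Creating a session instructs the browser to load the language model in memory, so that it can be used. Before you can prompt the language model, create a new session by using the create() method: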

// Create a LanguageModel session.
const session = await LanguageModel.create();

To customize the model session, you can pass options to the create() method:

// Create a LanguageModel session with options.
const session = await LanguageModel.create(options);

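The available options are:

monitor, to follow the progress of the model download.
initialPrompts, to give the model context about the prompts that will be sent to it, and to establish a pattern of user/assistant interactions that the model should follow for future prompts.
topK and temperature, to adjust the coherence and determinism of the model's output.

These options are documented below.

Monitor the progress of the model download

You can follow the progress of the model download by using the monitor option. This is useful when the model has not yet been fully downloaded onto the device where it will be used, to inform users of your website that they should wait.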

// Create a LanguageModel session with the monitor option to monitor the model
// download.
const session = await LanguageModel.create({
  monitor: m => {
    // Use the monitor object argument to add a listener for the
    // downloadprogress event.
    m.addEventListener("downloadprogress", event => {
      // The event is an object with the loaded and total properties.
      if (event.loaded == event.total) {
        // The model is fully downloaded.
      } else {
        // The model is still downloading.
        const percentageComplete = (event.loaded / event.total) * 100;
      }
    });
  }
});
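
To show the progress to users, the loaded and total values of the event can be turned into a short status message. The following is a sketch only; the helper name formatDownloadProgress and the message wording are assumptions made for this example.

```javascript
// Hypothetical helper: formats the loaded and total values of a
// downloadprogress event as a status message with a whole-number percentage.
function formatDownloadProgress(loaded, total) {
  // Guard against a missing or zero total, which would make the ratio
  // meaningless, and against a non-numeric loaded value.
  if (!(total > 0) || !Number.isFinite(loaded)) {
    return "Download progress unknown";
  }
  // Clamp the ratio to the 0..1 range before converting it to a percentage.
  const ratio = Math.min(Math.max(loaded / total, 0), 1);
  return `Downloading model: ${Math.round(ratio * 100)}%`;
}

// Possible use inside the downloadprogress listener, where statusElement is
// a hypothetical DOM element on your page:
//   statusElement.textContent = formatDownloadProgress(event.loaded, event.total);
console.log(formatDownloadProgress(1, 4));
```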

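Provide the model with a system prompt

To define a system prompt, which is a way to give the model instructions to use when generating text in response to a prompt, use the initialPrompts option.

The system prompt that you provide when creating a new session is preserved for the entire existence of the session, even if the context window overflows because of too many prompts.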

// Create a LanguageModel session with a system prompt.
const session = await LanguageModel.create({
  initialPrompts: [{
    role: "system",
    content: "You are a helpful assistant."
  }]
});

Placing the { role: "system", content: "You are a helpful assistant." } prompt anywhere other than at position 0 in initialPrompts rejects with a TypeError.
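
If the initialPrompts array is assembled dynamically, it might help to check the ordering rule before calling create(). The following is a minimal sketch of such a check; the helper name assertSystemPromptFirst is hypothetical, and the check only mirrors the rule described above rather than reproducing the browser's own validation.

```javascript
// Hypothetical helper: throws a TypeError if a system prompt appears anywhere
// other than position 0, and otherwise returns the array unchanged.
function assertSystemPromptFirst(initialPrompts) {
  initialPrompts.forEach((prompt, index) => {
    if (prompt.role === "system" && index !== 0) {
      throw new TypeError(
        `System prompt found at position ${index}; it must be at position 0.`
      );
    }
  });
  return initialPrompts;
}

const checkedPrompts = assertSystemPromptFirst([
  { role: "system", content: "You are a helpful assistant." },
  { role: "user", content: "Hello" }
]);
console.log(checkedPrompts.length);
```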

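N-shot prompting with initialPrompts

The initialPrompts option also allows you to provide examples of user/assistant interactions that you want the model to continue to use when prompted.

This technique is also known as N-shot prompting, and is useful for making the responses that the model generates more deterministic.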

// Create a LanguageModel session with multiple initial prompts, for N-shot
// prompting.
const session = await LanguageModel.create({
  initialPrompts: [
    { role: "system", content: "Classify the following product reviews as either OK or Not OK." },
    { role: "user", content: "Great shoes! I was surprised at how comfortable these boots are for the price. They fit well and are very lightweight." },
    { role: "assistant", content: "OK" },
    { role: "user", content: "Terrible product. The manufacturer must be completely incompetent." },
    { role: "assistant", content: "Not OK" },
    { role: "user", content: "Could be better. Nice quality overall, but for the price I was expecting something more waterproof" },
    { role: "assistant", content: "OK" }
  ]
});
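
When the examples come from data rather than being written inline, the array can be generated. The following is a sketch under that assumption; the helper name buildNShotPrompts and the { input, output } shape of each example are hypothetical, chosen only for this illustration.

```javascript
// Hypothetical helper: builds an initialPrompts array from an instruction and
// a list of { input, output } example pairs. The system prompt is placed at
// position 0, followed by alternating user and assistant messages.
function buildNShotPrompts(instruction, examples) {
  const prompts = [{ role: "system", content: instruction }];
  for (const { input, output } of examples) {
    prompts.push({ role: "user", content: input });
    prompts.push({ role: "assistant", content: output });
  }
  return prompts;
}

const nShotPrompts = buildNShotPrompts(
  "Classify the following product reviews as either OK or Not OK.",
  [
    { input: "Great shoes! They fit well and are very lightweight.", output: "OK" },
    { input: "Terrible product. The manufacturer must be completely incompetent.", output: "Not OK" }
  ]
);

// The result can then be passed as the initialPrompts option of create().
console.log(nShotPrompts.length);
```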

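Set topK and temperature

topK and temperature are called sampling parameters, and are used by the model to influence the generation of text.

TopK sampling limits the number of words considered for each subsequent word in the generated text, which can speed up the generation process and lead to more coherent outputs, but also reduces diversity.
Temperature sampling controls the randomness of the output. A lower temperature results in less random outputs, favoring higher-probability words, and thus producing more deterministic text.

Set the topK and temperature options to configure the sampling parameters of the model: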

// Create a LanguageModel session and set the topK and temperature options.
const session = await LanguageModel.create({
  topK: 10,
  temperature: 0.7
});
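
To build intuition for how the two parameters interact, the following sketch computes the probabilities that a sampler would draw from after keeping only the topK highest-scoring candidates and applying a temperature. This is a generic illustration of top-K and temperature sampling, not the implementation used by Microsoft Edge or Phi-4-mini; the function name topKDistribution and the example scores are made up, and the sketch assumes a non-empty set of scores, a topK of at least 1, and a temperature greater than 0.

```javascript
// Illustration only: returns the probability of each token after top-K
// filtering and a softmax with temperature.
function topKDistribution(scores, topK, temperature) {
  // Keep the topK highest-scoring candidates.
  const kept = Object.entries(scores)
    .sort(([, a], [, b]) => b - a)
    .slice(0, topK);

  // Softmax with temperature: divide each score by the temperature before
  // exponentiating. Subtracting the highest score keeps the values stable.
  const maxScore = kept[0][1];
  const weights = kept.map(([token, score]) =>
    [token, Math.exp((score - maxScore) / temperature)]);
  const sum = weights.reduce((acc, [, weight]) => acc + weight, 0);
  return Object.fromEntries(weights.map(([token, weight]) => [token, weight / sum]));
}

const exampleScores = { cat: 2.0, dog: 1.5, car: 0.2, sky: -1.0 };
console.log(topKDistribution(exampleScores, 2, 0.7));
console.log(topKDistribution(exampleScores, 2, 0.2));
```

With the lower temperature, more of the probability goes to the highest-scoring token, which is why lower temperatures tend to produce more deterministic text.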

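Clone a session to restart the conversation with the same options

Clone an existing session to prompt the model without knowledge of the previous interactions, but with the same session options.

Cloning a session is useful when you want to use the options of a previous session, but without influencing the model with previous responses.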

// Create a first LanguageModel session.
const firstSession = await LanguageModel.create({
  initialPrompts: [{
    role: "system",
    content: "You are a helpful assistant."
  }],
  topK: 10,
  temperature: 0.7
});

// Later, create a new session by cloning the first session to start a new
// conversation with the model, but preserve the first session's settings.
const secondSession = await firstSession.clone();

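Prompt the model

To prompt the model, after you've created a model session, use the session.prompt() or session.promptStreaming() method.

Wait for the final response

The prompt method returns a promise that resolves once the model has finished generating text in response to your prompt: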

// Create a LanguageModel session.
const session = await LanguageModel.create();

// Prompt the model and wait for the response to be generated.
const result = await session.prompt(promptString);

// Use the generated text.
console.log(result);

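Display tokens as they are generated

The promptStreaming method returns a stream object immediately. Use the stream to display the response tokens as they are being generated: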

// Create a LanguageModel session.
const session = await LanguageModel.create();

// Prompt the model.
const stream = session.promptStreaming(myPromptString);

// Use the stream object to display tokens that are generated by the model, as
// they are being generated.
for await (const chunk of stream) {
  console.log(chunk);
}

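You can call the prompt and promptStreaming methods multiple times within the same session object to continue generating text based on previous interactions with the model within that session.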
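
When streaming, it is often useful to both update the page progressively and keep the complete text once generation ends. The following is a minimal sketch of that pattern. The helper name collectStream and its onUpdate callback are hypothetical, and the sketch assumes that each chunk contains only newly generated text rather than the whole response so far.

```javascript
// Hypothetical helper: consumes any async iterable of text chunks, reports
// the text received so far after each chunk, and resolves with the full text.
async function collectStream(stream, onUpdate = () => {}) {
  let text = "";
  for await (const chunk of stream) {
    // Assumption: chunks are incremental, so they are appended.
    text += chunk;
    onUpdate(text);
  }
  return text;
}

// Possible use with a session, where outputElement is a hypothetical DOM
// element on your page:
//   const fullText = await collectStream(
//     session.promptStreaming(myPromptString),
//     text => { outputElement.textContent = text; }
//   );
```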

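Constrain the model output by using a JSON schema or regular expression

To make the format of the model's responses more deterministic and easier to use programmatically, use the responseConstraint option when prompting the model.

The responseConstraint option accepts a JSON schema or a regular expression:

To make the model respond with a stringified JSON object that follows a given schema, set responseConstraint to the JSON schema that you want to use.
To make the model respond with a string that matches a regular expression, set responseConstraint to that regular expression.

The following example shows how to make the model respond to a prompt with a JSON object that follows a given schema: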

// Define a JSON schema for the Prompt API to constrain the generated response.
const schema = {
  "type": "object",
  "required": ["sentiment", "confidence"],
  "additionalProperties": false,
  "properties": {
    "sentiment": {
      "type": "string",
      "enum": ["positive", "negative", "neutral"],
      "description": "The sentiment classification of the input text."
    },
    "confidence": {
      "type": "number",
      "minimum": 0,
      "maximum": 1,
      "description": "A confidence score indicating certainty of the sentiment classification."
    }
  }
};

// Create a LanguageModel session. The system prompt is a session option, so
// it is provided here by using the initialPrompts option.
const session = await LanguageModel.create({
  initialPrompts: [
    {
      role: "system",
      content: "You are an AI model designed to analyze the sentiment of user-provided text. Your goal is to classify the sentiment into predefined categories and provide a confidence score. Follow these guidelines:\n\n- Identify whether the sentiment is positive, negative, or neutral.\n- Provide a confidence score (0-1) reflecting the certainty of the classification.\n- Ensure the sentiment classification is contextually accurate.\n- If the sentiment is unclear or highly ambiguous, default to neutral.\n\nYour responses should be structured and concise, adhering to the defined output schema."
    }
  ]
});

// Prompt the model, providing the JSON schema in the responseConstraint
// option.
const response = await session.prompt(
  "Ordered a Philly cheesesteak, and it was not edible. Their milkshake is just milk with cheap syrup. Horrible place!",
  { responseConstraint: schema }
);

Running the code above returns a response that contains a stringified JSON object, such as:

{"sentiment": "negative", "confidence": 0.95}

You can then use the response in your code logic by parsing it with the JSON.parse() function:

// Parse the JSON string generated by the model and extract the sentiment and
// confidence values.
const { sentiment, confidence } = JSON.parse(response);

// Use the values.
console.log(`Sentiment: ${sentiment}`);
console.log(`Confidence: ${confidence}`);
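
Because the API is experimental, it might be prudent to parse the response defensively instead of assuming it always matches the schema. The following sketch checks the two fields used in the sentiment example; the helper name parseSentimentResponse is hypothetical, and returning null on failure is just one possible design.

```javascript
// Hypothetical helper: parses the model's response and checks it against the
// fields of the sentiment schema used above. Returns null if anything is off.
function parseSentimentResponse(responseText) {
  let data;
  try {
    data = JSON.parse(responseText);
  } catch {
    // The response isn't valid JSON.
    return null;
  }
  if (data === null || typeof data !== "object") {
    return null;
  }
  const { sentiment, confidence } = data;
  const validSentiment = ["positive", "negative", "neutral"].includes(sentiment);
  const validConfidence =
    typeof confidence === "number" && confidence >= 0 && confidence <= 1;
  return validSentiment && validConfidence ? { sentiment, confidence } : null;
}

const parsed = parseSentimentResponse('{"sentiment": "negative", "confidence": 0.95}');
console.log(parsed);
```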

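To learn more, see Structured Output with JSON Schema or RegExp Constraints.

Send multiple messages per prompt

In addition to strings, the prompt and promptStreaming methods also accept an array of objects, which is used to send multiple messages with custom roles. The objects that you send should be of the form { role, content }, where role is either user or assistant, and content is the message.

For example, to provide multiple user messages and an assistant message in the same prompt: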

// Create a LanguageModel session.
const session = await LanguageModel.create();

// Prompt the model by sending multiple messages at once.
const result = await session.prompt([
  { role: "user", content: "First user message" },
  { role: "user", content: "Second user message" },
  { role: "assistant", content: "The assistant message" }
]);

Stop generating text

To abort a prompt before the promise returned by session.prompt() has resolved, or before the stream returned by session.promptStreaming() has ended, use an AbortController signal:

// Create a LanguageModel session.
const session = await LanguageModel.create();

// Create an AbortController object.
const abortController = new AbortController();

// Prompt the model by passing the AbortController object by using the signal
// option.
const stream = session.promptStreaming(myPromptString, {
  signal: abortController.signal
});

// Later, perhaps when the user presses a "Stop" button, call the abort()
// method on the AbortController object to stop generating text.
abortController.abort();

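Destroy a session

Destroy the session to let the browser know that you don't need the language model anymore, so that the model can be unloaded from memory.

You can destroy a session in two different ways:

By using the destroy() method.
By using an AbortController.

Destroy a session by using the destroy() method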

// Create a LanguageModel session.
const session = await LanguageModel.create();

// Later, destroy the session by using the destroy method.
session.destroy();

Destroy a session by using an AbortController

// Create an AbortController object.
const controller = new AbortController();

// Create a LanguageModel session and pass the AbortController object by using
// the signal option.
const session = await LanguageModel.create({ signal: controller.signal });

// Later, perhaps when the user interacts with the UI, destroy the session by
// calling the abort() function of the AbortController object.
controller.abort();

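Send feedback

The Prompt API developer preview is intended to help discover use cases for language models that are provided by browsers. We are very interested in learning about the range of scenarios in which you intend to use the Prompt API, any issues with the API or the language models, and whether new task-specific APIs, such as for proofreading or translation, would be useful.

To send feedback about your scenarios and the tasks that you want to achieve, add a comment to the Prompt API feedback issue.

If you notice any issues when using the API, report them on the repo.

You can also contribute to the discussion about the design of the Prompt API in the W3C Web Machine Learning Working Group repository.

See also

Explainer for the Prompt API, in the Web Machine Learning GitHub repository.
Write, rewrite, and summarize text with the Writing Assistance APIs
Translate text with the Translator API
/built-in-ai/ - Source code and Readme for the Built-in AI playgrounds, including the Prompt API playground.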


Last updated on 2025-11-27