你当前正在访问 Microsoft Azure Global Edition 技术文档网站。 如果需要访问由世纪互联运营的 Microsoft Azure 中国技术文档网站,请访问 https://docs.azure.cn

快速入门:使用 Azure OpenAI 耳语模型将语音转换为文本

在本快速入门中,你将使用 Azure OpenAI Whisper 模型将语音转换为文本。

Azure OpenAI Whisper 模型的文件大小限制为 25 MB。 如果需要听录大于 25 MB 的文件,可以使用 Azure AI 语音批量听录 API。

先决条件

注意

目前,必须提交应用程序才能访问 Azure OpenAI 服务。 若要申请访问权限,请填写此表单

设置

检索密钥和终结点

若要成功对 Azure OpenAI 发出调用,需要一个终结点和一个密钥

变量名称
AZURE_OPENAI_ENDPOINT 从 Azure 门户检查资源时,可在“密钥和终结点”部分中找到此值。 或者,可以在“Azure OpenAI Studio”>“操场”>“代码视图”中找到该值。 示例终结点为:https://aoai-docs.openai.azure.com/
AZURE_OPENAI_API_KEY 从 Azure 门户检查资源时,可在“密钥和终结点”部分中找到此值。 可以使用 KEY1KEY2

在 Azure 门户中转到你的资源。 可以在“资源管理”部分找到“终结点和密钥”。 复制终结点和访问密钥,因为在对 API 调用进行身份验证时需要这两项。 可以使用 KEY1KEY2。 始终准备好两个密钥可以安全地轮换和重新生成密钥,而不会导致服务中断。

Azure 门户中 Azure OpenAI 资源概述 UI 的屏幕截图,其中终结点和访问密钥的位置用红圈标示。

为密钥和终结点创建和分配持久环境变量。

环境变量

setx AZURE_OPENAI_API_KEY "REPLACE_WITH_YOUR_KEY_VALUE_HERE" 
setx AZURE_OPENAI_ENDPOINT "REPLACE_WITH_YOUR_ENDPOINT_HERE" 

REST API

在 bash shell 中运行以下命令: 需要将 YourDeploymentName 替换为你在部署耳语模型时选择的部署名称。 部署名称不一定与模型名称相同。 输入模型名称将会导致错误,除非所选部署名称与基础模型名称相同。

curl $AZURE_OPENAI_ENDPOINT/openai/deployments/YourDeploymentName/audio/transcriptions?api-version=2024-02-01 \
 -H "api-key: $AZURE_OPENAI_API_KEY" \
 -H "Content-Type: multipart/form-data" \
 -F file="@./wikipediaOcelot.wav"

该命令的第一行的格式及示例终结点如下所示 curl https://aoai-docs.openai.azure.com/openai/deployments/{YourDeploymentName}/audio/transcriptions?api-version=2024-02-01 \

可以从 GitHub 上的 Azure AI 语音 SDK 存储库获取示例音频文件。

重要

对于生产来说,请使用安全的方式存储和访问凭据,例如 Azure Key Vault。 有关凭据安全性的详细信息,请参阅 Azure AI 服务安全性一文。

输出

{"text":"The ocelot, Lepardus paradalis, is a small wild cat native to the southwestern United States, Mexico, and Central and South America. This medium-sized cat is characterized by solid black spots and streaks on its coat, round ears, and white neck and undersides. It weighs between 8 and 15.5 kilograms, 18 and 34 pounds, and reaches 40 to 50 centimeters 16 to 20 inches at the shoulders. It was first described by Carl Linnaeus in 1758. Two subspecies are recognized, L. p. paradalis and L. p. mitis. Typically active during twilight and at night, the ocelot tends to be solitary and territorial. It is efficient at climbing, leaping, and swimming. It preys on small terrestrial mammals such as armadillo, opossum, and lagomorphs."}

PowerShell

运行以下命令: 需要将 YourDeploymentName 替换为你在部署耳语模型时选择的部署名称。 部署名称不一定与模型名称相同。 输入模型名称将会导致错误,除非所选部署名称与基础模型名称相同。

# Azure OpenAI metadata variables
$openai = @{
    api_key     = $Env:AZURE_OPENAI_API_KEY
    api_base    = $Env:AZURE_OPENAI_ENDPOINT # your endpoint should look like the following https://YOUR_RESOURCE_NAME.openai.azure.com/
    api_version = '2024-02-01' # this may change in the future
    name        = 'YourDeploymentName' #This will correspond to the custom name you chose for your deployment when you deployed a model.
}

# Header for authentication
$headers = [ordered]@{
    'api-key' = $openai.api_key
}

$form = @{ file = get-item -path './wikipediaOcelot.wav' }

# Send a completion call to generate an answer
$url = "$($openai.api_base)/openai/deployments/$($openai.name)/audio/transcriptions?api-version=$($openai.api_version)"

$response = Invoke-RestMethod -Uri $url -Headers $headers -Form $form -Method Post -ContentType 'multipart/form-data'
return $response.text

可以从 GitHub 上的 Azure AI 语音 SDK 存储库获取示例音频文件。

重要

对于生产,请使用安全的方式存储和访问凭据,例如使用 Azure Key Vault 的 PowerShell Secret Management。 有关凭据安全性的详细信息,请参阅 Azure AI 服务安全性一文。

输出

The ocelot, Lepardus paradalis, is a small wild cat native to the southwestern United States, Mexico, and Central and South America. This medium-sized cat is characterized by solid black spots and streaks on its coat, round ears, and white neck and undersides. It weighs between 8 and 15.5 kilograms, 18 and 34 pounds, and reaches 40 to 50 centimeters 16 to 20 inches at the shoulders. It was first described by Carl Linnaeus in 1758. Two subspecies are recognized, L. p. paradalis and L. p. mitis. Typically active during twilight and at night, the ocelot tends to be solitary and territorial. It is efficient at climbing, leaping, and swimming. It preys on small terrestrial mammals such as armadillo, opossum, and lagomorphs.

Python

先决条件

设置

使用以下项安装 OpenAI Python 客户端库:

pip install openai
  1. 创建名为 quickstart.py 的新 Python 文件。 然后在你偏好的编辑器或 IDE 中打开该文件。

  2. 将 quickstart.py 的内容替换为以下代码。 修改代码以添加你的部署名称:

    import os
    from openai import AzureOpenAI
        
    client = AzureOpenAI(
        api_key=os.getenv("AZURE_OPENAI_API_KEY"),  
        api_version="2024-02-01",
        azure_endpoint = os.getenv("AZURE_OPENAI_ENDPOINT")
    )
    
    deployment_id = "YOUR-DEPLOYMENT-NAME-HERE" #This will correspond to the custom name you chose for your deployment when you deployed a model."
    audio_test_file = "./wikipediaOcelot.wav"
    
    result = client.audio.transcriptions.create(
        file=open(audio_test_file, "rb"),            
        model=deployment_id
    )
    
    print(result)

使用以下 python 命令对你的快速入门文件运行应用程序:

可以从 GitHub 上的 Azure AI 语音 SDK 存储库获取示例音频文件。

重要

对于生产来说,请使用安全的方式存储和访问凭据,例如 Azure Key Vault。 有关凭据安全性的详细信息,请参阅 Azure AI 服务安全性一文。

输出

{"text":"The ocelot, Lepardus paradalis, is a small wild cat native to the southwestern United States, Mexico, and Central and South America. This medium-sized cat is characterized by solid black spots and streaks on its coat, round ears, and white neck and undersides. It weighs between 8 and 15.5 kilograms, 18 and 34 pounds, and reaches 40 to 50 centimeters 16 to 20 inches at the shoulders. It was first described by Carl Linnaeus in 1758. Two subspecies are recognized, L. p. paradalis and L. p. mitis. Typically active during twilight and at night, the ocelot tends to be solitary and territorial. It is efficient at climbing, leaping, and swimming. It preys on small terrestrial mammals such as armadillo, opossum, and lagomorphs."}

清理资源

如果你想要清理和移除 Azure OpenAI 资源,可以删除该资源。 在删除资源之前,必须先删除所有已部署的模型。

后续步骤