Speaker Recognition

Identify individual speakers, or use speech as a means of verification, with Speaker Recognition.

Speaker Identification

Identify who is speaking. The API can be used to determine the identity of an unknown speaker. Input audio of the unknown speaker is compared against a group of selected speakers, and if a match is found, the speaker’s identity is returned.
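The matching step described above can be illustrated with a small sketch. This is not the service's API or its actual model: it assumes each voice has already been reduced to a numeric feature vector, and the names `identify_speaker`, `enrolled`, and `threshold`, along with the sample vectors, are hypothetical. It only shows the general idea of comparing an unknown speaker against a selected group and returning an identity when the best match is strong enough.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def identify_speaker(unknown, enrolled, threshold=0.8):
    """Compare an unknown voice vector against every enrolled profile.

    Returns the best-matching name, or None when no profile scores at
    least `threshold` (an assumed cutoff, chosen only for illustration).
    """
    best_name, best_score = None, -1.0
    for name, profile in enrolled.items():
        score = cosine_similarity(unknown, profile)
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= threshold else None


# Hypothetical enrolled profiles; real voice features would be far larger.
enrolled = {
    "speaker_a": [0.9, 0.1, 0.0],
    "speaker_b": [0.1, 0.9, 0.2],
}

print(identify_speaker([0.8, 0.2, 0.0], enrolled))
print(identify_speaker([0.0, 0.0, 1.0], enrolled))
```

The first call resembles one enrolled profile closely, so a name comes back. The second resembles neither, so the sketch reports no match rather than guessing, which mirrors the "if a match is found" behavior described above.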

We selected 5 US presidents and enrolled each of them in the service using one of their speeches. To see how the demo works, click one of the sample audio clips below, or upload one of your own, and test whether the service automatically identifies which president is speaking.

See it in action

President Barack Obama
President George W Bush
President William J Clinton
President George H W Bush
President Ronald Reagan
President Jimmy Carter

Explore the Cognitive Services APIs

Computer Vision

Distill actionable information from images

Face

Detect, identify, analyze, organize, and tag faces in photos

Ink Recognizer

An AI service that recognizes digital ink content, such as handwriting, shapes, and ink document layout

Video Indexer

Unlock video insights

Custom Vision

Easily customize your own state-of-the-art computer vision models for your unique use case

Form Recognizer

The AI-powered document extraction service that understands your forms

Text Analytics

Easily evaluate sentiment and topics to understand what users want

Translator Text

Easily conduct machine translation with a simple REST API call

QnA Maker

Distill information into conversational, easy-to-navigate answers

Language Understanding

Teach your apps to understand commands from your users

Immersive Reader

Empower users of all ages and abilities to read and comprehend text

Speech Services

Unified speech services for speech-to-text, text-to-speech and speech translation

Speaker Recognition

Use speech to identify and verify individual speakers

Speech Translation

Easily integrate real-time speech translation to your app

Speech to Text

Convert spoken audio to text for more natural interactions

Text to Speech

Convert text to speech to create more natural, accessible interfaces

Content Moderator

Automated image, text, and video moderation

Anomaly Detector

Easily add anomaly detection capabilities to your apps

Personalizer

An AI service that delivers a personalized user experience