Azure AI Vision with OCR and AI

Overview

Enhance your apps with Azure AI Vision

Azure AI Vision is a unified service that offers innovative computer vision capabilities. Give your apps the ability to analyze images, read text, and detect faces with prebuilt image tagging, text extraction with optical character recognition (OCR), and responsible facial recognition. Incorporate vision features into your projects with no machine learning experience required.

Try it on Vision Studio

Elevate your computer vision projects

Boost content discoverability with image analysis

Automatically caption images with natural language, use smart crop, and classify images (in preview).

Try image analysis in Vision Studio

Stream video in real time with spatial analysis

Track movement and analyze environments in real time using computer vision with image analysis and object detection.

Try spatial analysis in Vision Studio

Read text from images with optical character recognition (OCR)

Extract printed and handwritten text from images with mixed languages and writing styles using OCR technology.

Try OCR in Vision Studio

Verify identities with facial recognition

Create apps with facial recognition for a seamless and highly secure user experience.

Try facial recognition in Vision Studio

Train custom computer vision models

Customize image classification and object detection to fit your needs with just a handful of images and without compromising accuracy (in preview).

Try custom image classification and custom object detection in Vision Studio

Apply AI responsibly

Get clear guidance on how to use Azure AI Vision responsibly to meet your goals and achieve accurate results.

Review Microsoft responsible AI principles and documentation

Features

Analyze visual content in different ways with Azure AI Vision

Image analysis

Image analysis that pulls from more than 10,000 concepts and objects to detect, classify, caption, and generate insights.

Spatial analysis

Spatial analysis to understand people's presence and movements within physical areas in real time.
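
The FAQ on this page lists what spatial analysis emits for each detected person: bounding box coordinates, an event type, a pseudonymous identifier, and a detection confidence score. The sketch below shows one way an app might model and consume such events. The class name, field names, and event type strings are hypothetical and do not reflect the service's actual event schema, so treat this as an illustration of the idea only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PersonEvent:
    """Hypothetical shape for one spatial analysis event (not the real schema)."""
    tracking_id: str   # pseudonymous identifier that follows one bounding box
    event_type: str    # assumed values: "zone_entry", "zone_exit", "line_crossing"
    box: tuple         # (x, y, width, height) of the body bounding box
    confidence: float  # detection confidence score between 0 and 1


def zone_occupancy(events, min_confidence=0.5):
    """Count people currently inside a zone from an ordered stream of events.

    Events below min_confidence are ignored. No face or identity data is
    involved; only the pseudonymous tracking_id is used.
    """
    inside = set()
    for event in events:
        if event.confidence < min_confidence:
            continue
        if event.event_type == "zone_entry":
            inside.add(event.tracking_id)
        elif event.event_type == "zone_exit":
            inside.discard(event.tracking_id)
    return len(inside)
```

In a real deployment the events would arrive through your own Azure IoT Hub instance, and the field names would come from the service documentation rather than from this sketch.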

Optical character recognition (OCR)

Optical character recognition (OCR) to extract printed and handwritten text from images with varied languages and writing styles.

Facial recognition

Facial recognition to create intelligent applications that recognize and verify human identity.

Pricing

Azure AI Vision pricing

Pay only for what you use, with no upfront costs. Azure AI Vision uses a pay-as-you-go consumption model based on the number of transactions. Learn more about pricing for Azure AI Vision and Face API.

See Azure AI Vision pricing

See Face API pricing

Customer stories

Trusted across industries, by companies of all sizes

“Coaches look at these elements. They look at the compression of the body. They look at various dynamic factors. These machine learning models, by measuring angles between the joints of the body while performing surf maneuvers, can actually help coaches to provide feedback.”

Kevin Schulz: Aerial Phenom and Surfer, Team USA

KPMG helps banking customers identify financial risk.

With AI Vision, KPMG finds and analyzes images and videos and uses optical character recognition (OCR) APIs to identify risk.

“Give us a shoebox of tax documents, and we’ll use AI and machine learning to put the data in the right places.”

Sameer Agarwal: IT Director, H&R Block

"The newly created image captions make Reddit more accessible and give redditors more opportunities to explore our images, engage in conversations, and ultimately build connections and community."

Tiffany Ong: Product Manager of Guest Experience & SEO, Reddit

Resources

Documentation and resources

Azure AI Vision documentation

Learn how to analyze visual content in different ways with quickstarts, tutorials, and samples.

Explore the documentation

Microsoft Learn courses

Build your skills with step-by-step guidance.

Start learning

Quickstart: Image analysis

Get started with the Image Analysis REST API or client libraries to set up a basic image tagging script.

Get started
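
As a rough idea of what a basic image tagging script might look like, here is a minimal sketch that uses only the Python standard library. The REST path, the `api-version` value, the `Ocp-Apim-Subscription-Key` header, and the `tagsResult` response shape are assumptions based on the Image Analysis 4.0 API as commonly documented. Verify them against the current quickstart before relying on them. The function names and the confidence threshold are hypothetical.

```python
import json
import urllib.parse
import urllib.request

API_VERSION = "2023-10-01"  # assumed API version; check the current docs


def build_tag_request(endpoint, key, image_url):
    """Build (but do not send) a POST request asking for tags on one image URL."""
    query = urllib.parse.urlencode({"api-version": API_VERSION, "features": "tags"})
    url = f"{endpoint.rstrip('/')}/computervision/imageanalysis:analyze?{query}"
    body = json.dumps({"url": image_url}).encode("utf-8")
    headers = {
        "Ocp-Apim-Subscription-Key": key,  # assumed auth header name
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def extract_tags(response_json, min_confidence=0.5):
    """Return (name, confidence) pairs at or above the threshold.

    Assumes the response looks like {"tagsResult": {"values": [...]}}.
    """
    values = response_json.get("tagsResult", {}).get("values", [])
    return [(t["name"], t["confidence"]) for t in values
            if t["confidence"] >= min_confidence]


def tag_image(endpoint, key, image_url):
    """Send the request. Needs network access and a real Azure AI Vision resource."""
    request = build_tag_request(endpoint, key, image_url)
    with urllib.request.urlopen(request) as response:
        return extract_tags(json.load(response))
```

A call such as `tag_image(endpoint, key, "https://example.com/photo.jpg")` would return the tags above the threshold, where `endpoint` and `key` come from your own resource. The official client libraries wrap the same call and are usually the better choice for production code.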

Code samples

Explore what’s possible with Azure AI Vision.

Browse code samples

Transparency Note

Explore use cases for Azure AI Face service.

Learn more

Frequently asked questions

Where is Azure AI Vision available?

See availability by region

What is the SLA for Azure AI Vision?

Azure AI Vision and other Azure AI services guarantee 99.9% availability. No service level agreement (SLA) is provided for the free pricing tier.

See SLA details

Do you store my images or videos or use them for product improvements?

No. Microsoft automatically deletes your images and videos after processing and does not train on your data to enhance the underlying models. Video data does not leave your premises, and video data is not stored on the edge where the container runs.

Learn more about privacy and terms

Does spatial analysis detect faces or a person’s identity?

No. Spatial analysis detects and locates human presence in video footage and outputs a bounding box around each person detected. The AI models do not detect faces or determine individuals’ identities or demographics.

How does Azure AI Vision analyze people in a physical space?

The spatial analysis AI models detect and track movements in the video feed based on algorithms that identify the presence of one or more humans by a body bounding box. For each person and bounding box detected in a zone in the camera field of view, the AI models output event data including bounding box coordinates of a person’s body, event type (for example, zone entry or exit, or directional line crossing), pseudonymous identifiers to track the bounding box, and a detection confidence score. This event data is sent to your own instance of Azure IoT Hub.

Do I need to use my own data for training my custom model?

Yes. Because model customization is designed to be fine-tuned for your scenario, you need to provide labeled data to train your model.

How much data do I need?

The model customization feature of the service is optimized to quickly recognize major differences between images, so you can start prototyping your model with a small amount of data. You may start with as little as one image per label. If you have more labeled images, you may add more. Depending on the complexity of the problem and degree of accuracy required, you can continue adding images per label to improve your model.

Is the Vision Studio user interface a website or a service?

It’s both. You can use the site to access a graphical interface for managing datasets, training, and evaluation of models for a no-code experience. Or, as an alternative, you can use the AI Vision APIs.

How do I label the data for model customization?

You can label the images in Azure Machine Learning studio, which is integrated with Vision Studio for easy export of labeled data. You can also label the data in the Common Objects in Context (COCO) file format and import the COCO file directly into Vision Studio. Vision Studio is a set of UI-based tools that let you explore, build, and integrate features from Azure AI Vision.

See documentation for details

How is the model customization feature different from Custom Vision?

The model customization feature for Azure AI Vision is the next generation of Custom Vision, with improved accuracy and few-shot learning capabilities. You may continue to use Custom Vision, or you can migrate your training data to retrain your model with model customization from Azure AI Vision.

See documentation for details

What other services can I use with Azure AI Vision?

After using Azure AI Vision to extract insights and text from images and video, you can use text analytics to analyze sentiment, Azure AI Translator to translate text into your desired language, or Immersive Reader to read the text aloud, making it more accessible. Related services and capabilities include Azure AI Document Intelligence to extract key-value pairs and tables from documents, Azure AI Video Indexer for extracting advanced metadata from audio and video files, and Azure AI Content Safety to detect unwanted text or images.
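
For the labeling question in the FAQ, it may help to see roughly what a COCO file contains. The sketch below builds a minimal object detection label set in the standard COCO layout (`images`, `categories`, `annotations`, with boxes as `[x, y, width, height]` in pixels). The helper name and its argument layout are hypothetical, and Vision Studio may expect additional or differently scaled fields, so check the documentation before importing a file produced this way.

```python
import json


def build_coco(image_files, labels, boxes):
    """Assemble a minimal COCO-style dictionary.

    image_files: list of (file_name, width, height)
    labels:      list of category names
    boxes:       list of (image_index, label_index, x, y, w, h) in pixels
    IDs are 1-based and derived from list positions.
    """
    images = [
        {"id": i + 1, "file_name": name, "width": width, "height": height}
        for i, (name, width, height) in enumerate(image_files)
    ]
    categories = [{"id": i + 1, "name": name} for i, name in enumerate(labels)]
    annotations = [
        {
            "id": i + 1,
            "image_id": image_index + 1,
            "category_id": label_index + 1,
            "bbox": [x, y, w, h],
            "area": w * h,
            "iscrowd": 0,
        }
        for i, (image_index, label_index, x, y, w, h) in enumerate(boxes)
    ]
    return {"images": images, "categories": categories, "annotations": annotations}


if __name__ == "__main__":
    coco = build_coco(
        image_files=[("cat.jpg", 640, 480)],
        labels=["cat"],
        boxes=[(0, 0, 10, 20, 100, 50)],
    )
    print(json.dumps(coco, indent=2))
```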

Account signup

Get started with a free account

Get started with pay-as-you-go pricing
