Cognitive Services Pricing – Computer Vision API

Use intelligence APIs to enable vision, language and search capabilities.

This state-of-the-art, cloud-based API gives developers access to advanced algorithms that extract rich information from images in order to categorise and process visual data. Capabilities include image analytics, tagging, celebrity recognition, text extraction and smart thumbnail generation.

Pricing details

The pricing below for Brand takes effect on 29 January 2019.

Instance | Transactions per second (TPS)** | Features | Price
Free - Web/Container | 20 per minute | |
S1 - Web/Container | 10 TPS | Tag, Image type | 0-1M transactions: $- per 1,000 transactions
 | | | 1M-5M transactions: $- per 1,000 transactions
 | | | 5M-10M transactions: $- per 1,000 transactions
 | | | 10M-100M transactions: $- per 1,000 transactions
 | | | 100M+ transactions: $- per 1,000 transactions
 | | Detect objects | 0-1M transactions: $- per 1,000 transactions
 | | | 1M-5M transactions: $- per 1,000 transactions
 | | | 5M-10M transactions: $- per 1,000 transactions
 | | | 10M-100M transactions: $- per 1,000 transactions
 | | | 100M+ transactions: $- per 1,000 transactions
 | | Recognise text * | $- per 1,000 transactions

Customers are charged per transaction, not per API call. What counts as a transaction is explained below.

* Products in Preview

+ Non-English languages are in Preview

** TPS only applies to web endpoint
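The tier boundaries in the pricing table can be turned into a small calculation. The following is a minimal sketch, assuming that each tier's rate applies only to the transactions that fall inside that tier (which is how the worked example further down splits "first 1,000,000" from "remaining 2,000,000"). The function names and the rates are hypothetical: the actual prices are not shown on this page, so the values in `PLACEHOLDER_RATES` are placeholders only.

```python
from decimal import Decimal

# Tier boundaries from the pricing table, as cumulative monthly transaction
# counts. None marks the open-ended "100M+" tier.
TIER_UPPER_BOUNDS = [1_000_000, 5_000_000, 10_000_000, 100_000_000, None]


def split_into_tiers(transactions):
    """Return how many transactions fall into each tier, lowest tier first."""
    counts = []
    lower = 0
    for upper in TIER_UPPER_BOUNDS:
        if upper is None:
            in_tier = max(transactions - lower, 0)
        else:
            in_tier = max(min(transactions, upper) - lower, 0)
            lower = upper
        counts.append(in_tier)
    return counts


def tiered_cost(transactions, rates_per_1000):
    """Cost when each tier's transactions are billed at that tier's rate."""
    if len(rates_per_1000) != len(TIER_UPPER_BOUNDS):
        raise ValueError("one rate per tier is required")
    counts = split_into_tiers(transactions)
    return sum(Decimal(n) / 1000 * rate for n, rate in zip(counts, rates_per_1000))


# Placeholder rates only (assumption): the real per-tier prices are not
# listed in this document.
PLACEHOLDER_RATES = [
    Decimal("1.00"),
    Decimal("0.80"),
    Decimal("0.65"),
    Decimal("0.65"),
    Decimal("0.65"),
]

print(split_into_tiers(3_000_000))
print(tiered_cost(3_000_000, PLACEHOLDER_RATES))
```

With 3,000,000 transactions, `split_into_tiers` puts 1,000,000 in the first tier and 2,000,000 in the second. Swap in the published rates to get a real figure.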

Support and SLA

  • Billing and subscription management support is included free of charge.
  • We guarantee that Cognitive Services running in the standard tier will be available at least 99.9 per cent of the time. No SLA is provided for the free trial. Read the SLA


  • Please refer to the documentation for more detailed descriptions of these operations.

    • Tag: Computer Vision API returns tags based on more than 2,000 recognisable objects, living beings, scenery and actions. In cases where tags may be ambiguous or not commonly known, the API response provides “hints” to clarify the meaning of the tag.
    • Face: Detects human faces within a picture.
    • GetThumbnail: After an image is uploaded, GetThumbnail generates a high-quality thumbnail. The Computer Vision API algorithm analyses the objects within the image, then crops the image to fit the requirements of the region of interest (ROI).
    • Colour: The Computer Vision algorithm extracts colours from an image. The colours are analysed in three different contexts: foreground, background and whole. The colours are grouped into 12 dominant accent colours.
    • Image Type: Computer Vision API can set Boolean flags to indicate whether an image is black and white or colour, and whether it is a line drawing. Image Type also indicates whether an image is clip art and, if so, its quality.
    • OCR: Optical Character Recognition (OCR) technology detects text content in an image. The identified text is extracted into a machine-readable character stream for search and numerous other purposes, ranging from medical records to security and banking. OCR detects the language of the text automatically. It saves time and provides convenience for users by allowing them to simply take photos of text instead of transcribing it. Please refer to the documentation for supported languages.
    • Adult: Apply the adult/racy settings to enable automated restriction of adult content in images.
    • Celebrity: Azure’s celebrity recognition model recognises 200,000 celebrities from business, politics, sports and entertainment around the world.
    • Analyse: Call multiple operations at once. Specify which functions you want to run and the API will run all of these together. Each operation included in “Analyse” will be counted as a separate transaction.
  • For Recognise Text, each POST call counts as a transaction. All GET calls to see the results of the async service are counted as transactions but are free of charge. For all other operations, each feature call counts as a transaction, whether called independently or grouped through the Analyse call. Analyse calls are used to make calling the API easier, but each feature used counts as a transaction. For instance, an Analyse call containing Tag, Face and Adult would count as three transactions.

    Please refer to the documentation for the complete list and detailed descriptions of operations.

  • Each operation that you call (either individually or through “Analyse”) will be counted as a transaction. The total bill will be based on the number of transactions for each type of operation within a monthly billing period.

    As a specific example, let’s say you make the following calls in a certain monthly billing period:

    • 1,500,000 Analyse operations, each calling the Tag, Face and Describe operations
    • 500,000 OCR operations
    • 4,000,000 Recognise text operations

    Your total bill will be constructed as follows:

    Operations | Resource | Calculations | Subtotal
    1,500,000 Tag and 1,500,000 Face operations | S1 transactions | First 1,000,000 transactions: $-/1000 * 1,000,000 = $- |
     | | Remaining 2,000,000 transactions: $-/1000 * 2,000,000 = $- |
    500,000 OCR operations | S2 transactions | $-/1000 * 500,000 = $- | $-
    1,500,000 Describe and 4,000,000 Recognise Text operations | S3 transactions | $-/1000 * 5,500,000 = $- | $-
    Total | | $- | $-
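The counting rule above (every feature counts as one transaction, whether called on its own or through Analyse) can be sketched in a few lines. This is an illustrative sketch, not an official billing tool. The function names are hypothetical. The S1/S2/S3 grouping is taken only from the breakdown in the worked example, and the Analyse calls are assumed to include Tag, Face and Describe, which is what that breakdown counts.

```python
from collections import Counter

# Resource grouping taken from the worked example; operations the example
# does not mention are deliberately left out rather than guessed.
RESOURCE_OF = {
    "Tag": "S1",
    "Face": "S1",
    "OCR": "S2",
    "Describe": "S3",
    "Recognise Text": "S3",
}


def count_transactions(calls):
    """calls: iterable of (number_of_calls, [features used per call]).

    Every feature in a call counts as one transaction, whether it was
    called independently or grouped through Analyse.
    """
    per_feature = Counter()
    for number_of_calls, features in calls:
        for feature in features:
            per_feature[feature] += number_of_calls
    return per_feature


def group_by_resource(per_feature):
    """Sum per-feature transaction counts into their resource groups."""
    per_resource = Counter()
    for feature, transactions in per_feature.items():
        per_resource[RESOURCE_OF[feature]] += transactions
    return per_resource


# A single Analyse call with three features counts as three transactions.
single_analyse = count_transactions([(1, ["Tag", "Face", "Adult"])])
print(sum(single_analyse.values()))  # prints 3

# The monthly usage from the worked example.
monthly_calls = [
    (1_500_000, ["Tag", "Face", "Describe"]),  # Analyse calls (assumed features)
    (500_000, ["OCR"]),
    (4_000_000, ["Recognise Text"]),  # POST calls; GET result calls are free
]
per_resource = group_by_resource(count_transactions(monthly_calls))
print(dict(per_resource))
```

For the example month this gives 3,000,000 S1 transactions, 500,000 S2 transactions and 5,500,000 S3 transactions, matching the quantities in the breakdown.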


