At Build, Microsoft expands its Cognitive Services collection of intelligent APIs

Published May 10, 2017

This blog post was authored by the Microsoft Cognitive Services Team.

Microsoft Cognitive Services enables developers to augment the next generation of applications with the ability to see, hear, speak, understand, and interpret needs using natural methods of communication.

Today at the Build 2017 conference, we are excited to announce the next big wave of innovation for Microsoft Cognitive Services, significantly increasing the value for developers looking to embrace AI and build the next generation of applications.

  • Customizable: With the addition of Bing Custom Search, Custom Vision Service and Custom Decision Service on top of Custom Speech and Language Understanding Intelligent Service, we now have a broader set of custom AI APIs available, allowing customers to use their own data with algorithms that are customized for their specific needs.
  • Cutting-edge technologies: Today we are launching Microsoft’s Cognitive Services Labs, which allow any developer to take part in the broader research community’s quest to better understand the future of cognitive computing by experimenting with new services still in the early stages of development. One of the first AI services being made available via our Cognitive Services Labs is Project Prague, which lets you use gestures to control and interact with technologies to have more intuitive and natural experiences. This cutting-edge and easy-to-use SDK is in private preview.
  • High pace of innovation: We’re expanding our Cognitive Services portfolio to 29 intelligent APIs with the addition of Video Indexer, Custom Decision Service, Bing Custom Search, and Custom Vision Service, along with the new Cognitive Services Lab Project Prague, for gestures, and updates to our existing Cognitive Services, such as Bing Search, Microsoft Translator and Language Understanding Intelligent Service.

Today, 568,000+ developers from more than 60 countries are using Microsoft Cognitive Services that allow systems to see, hear, speak, understand and interpret our needs.

What are the capabilities of these new services?

  • Custom Vision Service, available today in free public preview, is an easy-to-use, customizable web service that learns to recognize specific content in imagery, powered by state-of-the-art machine learning neural networks that become smarter with training. You can train it to recognize whatever you choose, whether that be animals, objects, or abstract symbols. This technology could easily apply to retail environments for machine-assisted product identification, or to digital spaces to help sort pictures into categories automatically.
  • Video Indexer, available today in free public preview, is one of the industry’s most comprehensive video AI services. It helps you unlock insights from any video by indexing it and letting you search the spoken audio (transcribed and translated), sentiment, the faces that appear, and objects. With these insights, you can improve discoverability of videos in your applications or increase user engagement by embedding this capability in sites. All of these capabilities are available through a simple set of APIs, ready-to-use widgets and a management portal.
  • Custom Decision Service, available today in free public preview, is a service that helps you create intelligent systems with a cloud-based contextual decision-making API that adapts with experience. Custom Decision Service uses reinforcement learning in a new approach to personalizing content; it plugs into your application and helps make decisions in real time, automatically adapting to optimize your metrics over time.
  • Bing Custom Search, available today in free public preview, lets you create a highly customized web search experience, which delivers better and more relevant results from your targeted web space. Featuring a straightforward user interface, Bing Custom Search enables you to create your own web search service without a line of code. Specify the slices of the web that you want to draw from and explore site suggestions to intelligently expand the scope of your search domain. Bing Custom Search can empower businesses of any size, hobbyists and entrepreneurs to design and deploy web search applications for any possible scenario.
  • Microsoft’s Cognitive Services Labs allow any developer to experiment with new services still in the early stages of development. Among them, Project Prague is currently in private preview. This SDK is built from an extensive library of hand poses and creates more intuitive experiences by allowing users to control and interact with technologies through typical hand movements. A special camera records the gestures; the API then recognizes the formation of the hand and allows the developer to tie in-app actions to each gesture.
  • The next version of the Bing APIs, available in public preview, allows developers to bring the vast knowledge of the web to their users and benefit from improved performance, new sorting and filtering options, robust documentation, and easy Quick Start guides. This release includes the full suite of Bing Search APIs (Bing Web Search API Preview, Bing News Search API Preview, Bing Video Search API Preview, and Bing Image Search API Preview), Bing Autosuggest API Preview, and Bing Spell Check API Preview. Please find more information in the announcement blog.
  • Presentation Translator, a Microsoft Garage project, gives presenters the ability to add subtitles to their presentations, in the same language for accessibility scenarios or in another language for multi-language situations. Audience members get subtitles in their desired language on their own device through the Microsoft Translator app or in a browser, and the slides can (optionally) be translated while preserving their formatting. Click here to be notified when it’s available.
  • Language Understanding Intelligent Service (LUIS) improvements: LUIS helps developers integrate language models that understand users quickly and easily, using either prebuilt or customized models. Updates to LUIS include increased intents and entities, the introduction of powerful new developer tools for productivity, additional ways for the community to use and contribute, improved speech recognition with Microsoft Bot Framework, and more global availability.

Let’s take a closer look at what these new APIs and Services can do for you.

Bring custom vision to your app

Thanks to Custom Vision Service, it becomes pretty easy to create your own image recognition service. You can use the Custom Vision Service Portal to upload a series of images to train your classifier, plus a few images to test it once it is trained.

It’s also possible to code each step. Let’s say I need to quickly create an image classifier for a specific need: it could be for products my users upload to my website, retail merchandise, or even animal images taken in a forest.

  • To get started, I need the Custom Vision API, which can be found with this SDK. I need to create a console application and prepare the training key and the images needed for the example.

I can start with Visual Studio to create a new Console Application, and replace the contents of Program.cs with the following code. This code defines and calls two helper methods:

  • The method called GetTrainingKey prepares the training key.
  • The one called LoadImagesFromDisk loads two sets of images that this example uses to train the project, and one test image that the example loads to demonstrate the use of the default prediction endpoint.
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Threading;
using Microsoft.Cognitive.CustomVision;

namespace SmokeTester
{
    class Program
    {
        private static List<MemoryStream> hemlockImages;

        private static List<MemoryStream> japaneseCherryImages;

        private static MemoryStream testImage;

        static void Main(string[] args)
        {
            // You can either add your training key here, pass it on the command line, or type it in when the program runs
            string trainingKey = GetTrainingKey("<your key here>", args);

            // Create the Api, passing in a credentials object that contains the training key
            TrainingApiCredentials trainingCredentials = new TrainingApiCredentials(trainingKey);
            TrainingApi trainingApi = new TrainingApi(trainingCredentials);

            // Upload the images we need for training and the test image
            Console.WriteLine("\tUploading images");
            LoadImagesFromDisk();
        }

        private static string GetTrainingKey(string trainingKey, string[] args)
        {
            if (string.IsNullOrWhiteSpace(trainingKey) || trainingKey.Equals("<your key here>"))
            {
                if (args.Length >= 1)
                {
                    trainingKey = args[0];
                }

                while (string.IsNullOrWhiteSpace(trainingKey) || trainingKey.Length != 32)
                {
                    Console.Write("Enter your training key: ");
                    trainingKey = Console.ReadLine();
                }
                Console.WriteLine();
            }

            return trainingKey;
        }

        private static void LoadImagesFromDisk()
        {
            // this loads the images to be uploaded from disk into memory
            hemlockImages = Directory.GetFiles(@"..\..\..\..\..\SampleImages\Hemlock").Select(f => new MemoryStream(File.ReadAllBytes(f))).ToList();
            japaneseCherryImages = Directory.GetFiles(@"..\..\..\..\..\SampleImages\Japanese Cherry").Select(f => new MemoryStream(File.ReadAllBytes(f))).ToList();
            testImage = new MemoryStream(File.ReadAllBytes(@"..\..\..\..\..\SampleImages\Test\test_image.jpg"));

        }
    }
}
  • As the next step, I need to create a Custom Vision Service project by adding the following code to the Main() method after the call to LoadImagesFromDisk():
            // Create a new project
            Console.WriteLine("Creating new project:");
            var project = trainingApi.CreateProject("My New Project");
  • Next, I need to add tags to my project by inserting the following code after the call to CreateProject():
            // Make two tags in the new project
            var hemlockTag = trainingApi.CreateTag(project.Id, "Hemlock");
            var japaneseCherryTag = trainingApi.CreateTag(project.Id, "Japanese Cherry");
  • Then, I need to upload the in-memory images to the project by inserting the following code at the end of the Main() method:
            // Images can be uploaded one at a time
            foreach (var image in hemlockImages)
            {
                trainingApi.CreateImagesFromData(project.Id, image, new List<string>() { hemlockTag.Id.ToString() });
            }

            // Or uploaded in a single batch 
            trainingApi.CreateImagesFromData(project.Id, japaneseCherryImages, new List<Guid>() { japaneseCherryTag.Id });
  • Now that I've added tags and images to the project, I can train it. I would need to insert the following code at the end of Main(). This creates the first iteration in the project. I can then mark this iteration as the default iteration.
            // Now there are images with tags start training the project
            Console.WriteLine("\tTraining");
            var iteration = trainingApi.TrainProject(project.Id);

            // The returned iteration will be in progress, and can be queried periodically to see when it has completed
            while (iteration.Status == "Training")
            {
                Thread.Sleep(1000);

                // Re-query the iteration to get its updated status
                iteration = trainingApi.GetIteration(project.Id, iteration.Id);
            }

            // The iteration is now trained. Make it the default project endpoint
            iteration.IsDefault = true;
            trainingApi.UpdateIteration(project.Id, iteration.Id, iteration);
            Console.WriteLine("Done!\n");
  • As I’m now ready to use the model for prediction, I first obtain the endpoint associated with the default iteration; then I send a test image to the project using that endpoint. Insert the code below at the end of Main().
            // Now there is a trained endpoint, it can be used to make a prediction

            // Get the prediction key, which is used in place of the training key when making predictions
            var account = trainingApi.GetAccountInfo();
            var predictionKey = account.Keys.PredictionKeys.PrimaryKey;

            // Create a prediction endpoint, passing in a prediction credentials object that contains the obtained prediction key
            PredictionEndpointCredentials predictionEndpointCredentials = new PredictionEndpointCredentials(predictionKey);
            PredictionEndpoint endpoint = new PredictionEndpoint(predictionEndpointCredentials);

            // Make a prediction against the new project
            Console.WriteLine("Making a prediction:");
            var result = endpoint.PredictImage(project.Id, testImage);

            // Loop over each prediction and write out the results
            foreach (var c in result.Predictions)
            {
                Console.WriteLine($"\t{c.Tag}: {c.Probability:P1}");
            }

            Console.ReadKey();
  • Last step, let’s build and run the solution: the prediction results appear on the console.
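
Once the predictions come back, an application usually has to decide which tag, if any, to act on. The following is a minimal sketch of that step, written in JavaScript rather than C# and operating on plain objects that mirror the Tag and Probability fields printed by the sample. The function name pickBestTag, the 0.5 threshold and the sample probabilities are illustrative assumptions, not part of the Custom Vision SDK.

```javascript
// Hypothetical helper: choose the most likely tag from a list of predictions.
// Each prediction is assumed to look like { Tag: "Hemlock", Probability: 0.93 },
// mirroring the fields printed by the C# sample.
function pickBestTag(predictions, minProbability = 0.5) {
  let best = null;
  for (const p of predictions) {
    if (best === null || p.Probability > best.Probability) {
      best = p;
    }
  }
  // Return null when nothing is confident enough to act on.
  if (best === null || best.Probability < minProbability) {
    return null;
  }
  return best.Tag;
}

// Example with made-up probabilities.
const samplePredictions = [
  { Tag: "Hemlock", Probability: 0.93 },
  { Tag: "Japanese Cherry", Probability: 0.07 },
];
console.log(pickBestTag(samplePredictions));
```

Returning null below the threshold lets the caller fall back to a manual review step instead of acting on a weak guess.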

For more information about Custom Vision Service, please take a look at the following resources:

Personalization of your site with Custom Decision Service

With Custom Decision Service, you can personalize content on your website, so that users see the most engaging content for them.

Let’s say I own a news website, with a front page with links to several articles. As the page loads, I want to request Custom Decision Service to provide a ranking of articles to include on the page.

When one of my users clicks on an article, a second request is sent to Custom Decision Service to log the outcome of the decision. The easiest integration mode requires just an RSS feed for the content and a few lines of JavaScript added to the application. Let’s get started!

  • First, I need to register on the Decision Service Portal by clicking the My Portal menu item in the top ribbon; then I can register the application, choosing a unique identifier. I can also create a name for an action set feed and give it an endpoint, which currently must be an RSS or Atom feed.

  • The basic use of Custom Decision Service is fairly straightforward: the front page will use Custom Decision Service to specify the ordering of the article pages. I just need to insert the following code into the HTML head of the front page.

 

<!-- Define the callback function that renders the UI -->
<script> function callback(data) { … } </script>

<!-- Call the Ranking API -->
<script src="https://ds.microsoft.com/<domain>/rank/<actionSetId>" async></script>

 

The order matters as the callback function should be defined before the call to Ranking API. The data argument contains the ranking of URLs to be rendered. For more information, see the tutorial and API reference.
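
To make the callback less abstract, here is a minimal sketch of what it might do. The exact shape of the data argument is not shown in this post, so the sketch assumes a hypothetical data.ranking array whose entries carry the article URL in an id field; check the API reference for the real field names. The helper rankedUrls and the element id "articles" are also illustrative.

```javascript
// ASSUMPTION: data looks like { ranking: [{ id: "https://example.com/a" }, ...] }.
// The real response shape may differ; consult the API reference.
function rankedUrls(data) {
  if (!data || !Array.isArray(data.ranking)) {
    return [];
  }
  return data.ranking
    .map((entry) => entry.id)
    .filter((url) => typeof url === "string" && url.length > 0);
}

// The callback named in the Ranking API script tag. It renders one link per
// ranked URL into a list element with the (hypothetical) id "articles".
function callback(data) {
  const urls = rankedUrls(data);
  if (typeof document === "undefined") {
    return urls; // not running in a browser, nothing to render
  }
  const list = document.getElementById("articles");
  if (!list) {
    return urls;
  }
  list.textContent = "";
  for (const url of urls) {
    const item = document.createElement("li");
    const link = document.createElement("a");
    link.href = url;
    link.textContent = url;
    item.appendChild(link);
    list.appendChild(item);
  }
  return urls;
}
```

Keeping the URL extraction in its own function means the ordering logic can be checked outside a browser, while the DOM code stays a thin rendering layer.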

  • For each article page, I need to make sure the canonical URL is set and matches the URLs provided in my RSS feed, and insert the following code into the HTML head to call the Reward API:
<script src="https://ds.microsoft.com/DecisionService.js"></script>
<script> window.DecisionService.trackPageView(); </script>
  • Finally, I need to provide the Action Set API, which returns the list of articles (a.k.a., actions) to be considered by Custom Decision Service. I can implement this API as an RSS feed, as shown here:

 

<rss version="2.0">
<channel>
   <item>
      <title><![CDATA[title (possibly with url) ]]></title>
      <link>url</link>
      <pubDate>Thu, 27 Apr 2017 16:30:52 GMT</pubDate>
   </item>
   <item>
      ....
   </item>
</channel>
</rss>
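
If the articles already live in a database or CMS, the feed can be generated rather than written by hand. The sketch below builds the same RSS structure from an array of articles; the function names and the article fields (title, url, published) are assumptions made for illustration.

```javascript
// Escape the characters that are special in XML text content.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// A CDATA section cannot contain "]]>", so split it across two sections.
function cdata(text) {
  return "<![CDATA[" + String(text).replace(/\]\]>/g, "]]]]><![CDATA[>") + "]]>";
}

// Build an RSS 2.0 document with one <item> per article.
// ASSUMPTION: each article is { title, url, published } with published a Date.
function buildActionSetFeed(articles) {
  const items = articles.map((a) =>
    [
      "  <item>",
      "    <title>" + cdata(a.title) + "</title>",
      "    <link>" + escapeXml(a.url) + "</link>",
      "    <pubDate>" + a.published.toUTCString() + "</pubDate>",
      "  </item>",
    ].join("\n")
  );
  return ['<rss version="2.0">', "<channel>", ...items, "</channel>", "</rss>"].join("\n");
}

// Example article with placeholder values.
const feed = buildActionSetFeed([
  {
    title: "Bike touring & you",
    url: "https://example.com/articles/1?a=1&b=2",
    published: new Date(Date.UTC(2017, 3, 27, 16, 30, 52)),
  },
]);
console.log(feed);
```

Date.prototype.toUTCString() produces the same date format as the pubDate in the hand-written feed, so no date library is needed.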

 

For more information about Custom Decision Service, please take a look at the following resources:

Unlock video insights

With Video Indexer, it’s now possible to process and extract lots of insights from video files, such as:

  • Face detection and identification (finds, identifies, and tracks human faces within a video)
  • OCR (optical character recognition; extracts text content from videos and generates searchable digital text)
  • Transcript (converts audio to text based on the specified language)
  • Differentiation of speakers, one of my favorites (maps and understands each speaker and identifies when each speaker is present in the video)
  • Voice/sound detection (separates background noise and voice activity from silence)
  • Sentiment analysis (performs analysis based on multiple emotional attributes; currently, the Positive, Neutral and Negative options are supported)

From one video to multiple insights

Let’s say I’m a news agency with a video library that my users need to search against: I need to easily extract metadata on the videos to enhance the search experience with indexed spoken words and faces.

  • The easiest first step is simply to go to the Video Indexer Web Portal: I can sign in, upload a video and let Video Indexer start indexing and analyzing it. Once it’s done, I will receive a notification with a link to my video and a short description of what was found in it (people, topics, OCR text, and so on).
  • If I want to use the Video Indexer APIs, I also need to sign-in to the Video Indexer Web Portal, select production and subscribe. This sends the Video Indexer team a subscription request, which will be approved shortly. Once approved, I will be able to see my subscription and my keys.

The following C# code snippet demonstrates the usage of all the Video Indexer APIs together.

    // Requires: using System; using System.Net.Http; using System.Threading; using Newtonsoft.Json;
    var apiUrl = "https://videobreakdown.azure-api.net/Breakdowns/Api/Partner/Breakdowns";
    var client = new HttpClient();
    client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", "InsertYourKey");

    var content = new MultipartFormDataContent();

    Console.WriteLine("Uploading...");
    var videoUrl = "https:/...";
    // Escape the video URL so that any '&' or '?' it contains does not break the query string
    var result = client.PostAsync(apiUrl + "?name=some_name&description=some_description&privacy=private&partition=some_partition&videoUrl=" + Uri.EscapeDataString(videoUrl), content).Result;
    var json = result.Content.ReadAsStringAsync().Result;

    Console.WriteLine();
    Console.WriteLine("Uploaded:");
    Console.WriteLine(json);

    var id = JsonConvert.DeserializeObject<string>(json);

    while (true)
    {
        Thread.Sleep(10000);

        result = client.GetAsync(string.Format(apiUrl + "/{0}/State", id)).Result;
        json = result.Content.ReadAsStringAsync().Result;

        Console.WriteLine();
        Console.WriteLine("State:");
        Console.WriteLine(json);

        dynamic state = JsonConvert.DeserializeObject(json);
        if (state.state != "Uploaded" && state.state != "Processing")
        {
            break;
        }
    }

    result = client.GetAsync(string.Format(apiUrl + "/{0}", id)).Result;
    json = result.Content.ReadAsStringAsync().Result;
    Console.WriteLine();
    Console.WriteLine("Full JSON:");
    Console.WriteLine(json);

    result = client.GetAsync(string.Format(apiUrl + "/Search?id={0}", id)).Result;
    json = result.Content.ReadAsStringAsync().Result;
    Console.WriteLine();
    Console.WriteLine("Search:");
    Console.WriteLine(json);

    result = client.GetAsync(string.Format(apiUrl + "/{0}/InsightsWidgetUrl", id)).Result;
    json = result.Content.ReadAsStringAsync().Result;
    Console.WriteLine();
    Console.WriteLine("Insights Widget url:");
    Console.WriteLine(json);

    result = client.GetAsync(string.Format(apiUrl + "/{0}/PlayerWidgetUrl", id)).Result;
    json = result.Content.ReadAsStringAsync().Result;
    Console.WriteLine();
    Console.WriteLine("Player Widget url:");
    Console.WriteLine(json);
  • When I make an API call and the response status is OK, I get detailed JSON output describing the insights for the specified video, including keywords (topics), faces, and blocks. Each block includes time ranges, transcript lines, OCR lines, sentiments, faces, and block thumbnails.
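
The polling loop in the C# sample keeps going until the state is neither "Uploaded" nor "Processing", which means it never stops if indexing stalls. A bounded variant is sketched below in JavaScript. The getState function is a stand-in supplied by the caller (for example, a wrapper around a GET request to the /{id}/State endpoint); its name, the attempt limit and the delay are assumptions, not part of the Video Indexer API.

```javascript
// Poll until the video leaves the "Uploaded"/"Processing" states or we give up.
// getState: async function returning the current state string (caller-supplied).
async function waitForIndexing(getState, { maxAttempts = 30, delayMs = 10000 } = {}) {
  const pending = new Set(["Uploaded", "Processing"]);
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const state = await getState();
    if (!pending.has(state)) {
      return state; // no longer pending; the caller inspects the value
    }
    if (attempt < maxAttempts) {
      // Wait before asking again, as the C# sample does with Thread.Sleep
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  throw new Error("Video was still being indexed after " + maxAttempts + " attempts");
}
```

Passing the state lookup in as a function keeps the loop independent of any HTTP client and makes it easy to exercise with a fake sequence of states.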

For more information, please take a look at:

Create a highly targeted search for your users

With Bing Custom Search, I can create a highly-customized web search experience for my targeted web space: there are a lot of integration scenarios and end-user entry points for a custom search solution.

For example, Amicus is building an app that changes the way global aid is funded and delivered by providing donors with full transparency. Amicus needed to help donors learn about, find and fund projects specifically related to global aid that were of interest and relevant to them. With Bing Custom Search, Amicus has been able to identify its own set of relevant web pages in advance: when users have a single concept of interest (like ‘water’, ‘education’ or ‘India’), Bing Custom Search is able to deliver highly relevant results in the context of global aid.

For more information about Bing Custom Search, don’t hesitate to look at the Bing Custom Search Blog announcement.

Let’s imagine that I need to build a customized search for my public website on ‘bike touring’, a very important activity in the Seattle area.

  • I can get started by signing up on the Bing Custom Search Portal and getting my free trial key.
  • Once logged in, I can start creating a custom search instance, which contains all the settings required to define a custom search tailored to a scenario of my choice. Here, I want a search that finds bike-touring-related content, so I create a custom search instance called ‘BikeTouring’.
  • Then, I need to define the slices of the web that I want to search over for my scenario and add them to my search instance. The custom slices can include domains, subdomains, or web pages.
  • I can now adjust the default order of the results based on my needs. For example, for a specific query I can pin a specific web page to the top. I can also boost or demote sites or web pages so that they show up higher or lower, respectively, in the set of results that my custom search service returns.
  • After this, I can track my ranking adjustments in the tabs ‘Active’, ‘Blocked’, and ‘Pinned’. Also, I can revisit my adjustments at any time.

  • Then, I publish my settings. Before calling Bing Web Search API directly and programmatically, I can try out my custom search service in the UI directly. For that, I specify a query and click ‘Test API’. I can then see the algorithmic results from my custom search service on the right-hand side.
  • To retrieve the results of my custom search service programmatically, I call the Bing Web Search API, augmenting the standard call with a custom configuration parameter called customconfig. Below is the API request URL with the customconfig parameter:
https://api.cognitive.microsoft.com/bingcustomsearch/v5.0/search[?q][&customconfig][&count][&offset][&mkt][&safesearch]
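
Assembling that URL by hand is error-prone once queries contain spaces or special characters. The sketch below builds it with the standard URL and URLSearchParams classes, using only the parameters listed in the template above. The function name buildCustomSearchUrl and the sample customconfig value are placeholders.

```javascript
const ENDPOINT = "https://api.cognitive.microsoft.com/bingcustomsearch/v5.0/search";

// Build the request URL. The sketch always sets q and appends the other
// parameters from the template only when a value is provided.
function buildCustomSearchUrl(query, options = {}) {
  const url = new URL(ENDPOINT);
  url.searchParams.set("q", query);
  for (const name of ["customconfig", "count", "offset", "mkt", "safesearch"]) {
    if (options[name] !== undefined) {
      url.searchParams.set(name, String(options[name]));
    }
  }
  return url.toString();
}

// "12345" is a placeholder for the customconfig value of a search instance.
const requestUrl = buildCustomSearchUrl("bike touring seattle", {
  customconfig: "12345",
  count: 10,
  mkt: "en-US",
});
console.log(requestUrl);
```

Because URLSearchParams handles the encoding, a query such as "bike & hike" cannot accidentally introduce an extra parameter.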

Below is a JSON response of a Bing Web Search API call with a customconfig parameter.

 

{
    "_type" : "SearchResponse",
    "queryContext" : {...},
    "webPages" : {...},
    "spellSuggestion" : {...},
    "rankingResponse" : {...}
}

 

For more information, please take a look at the dedicated blog announcement as well as the following resources:

  • Bing Custom Search portal
  • The full list of resources in the Get started guide

New AI MVP Award Category

Thank you for reading this far! As a reward, we’re pleased to tell you about our new AI MVP Program!

As you know, the world of data and AI is evolving at an unprecedented pace, and so is its community of experts. The Microsoft MVP Award Program is pleased to announce the launch of the new AI Award Category for recognizing outstanding community leadership among AI experts. Potential “AI MVPs” include developers creating intelligent apps and bots, modeling human interaction (voice, text, speech, ...), writing AI algorithms, training data sets and sharing this expertise with their technical communities.

An AI MVP will be awarded based on contributions in the following technology contribution areas:

(Image: AI MVP technology contribution areas)

The AI Award Category will be a new addition to the current award categories. If you or someone you know may qualify, submit a nomination!

Thank you again and happy coding!