New machine-assisted text classification on Content Moderator now in public preview

Content Moderator’s new machine-assisted text classification feature (preview) augments human review by detecting potentially undesired content that may be deemed inappropriate depending on context.

This blog post is co-authored by Ashish Jhanwar, Data Scientist, Microsoft

Content Moderator is part of Microsoft Cognitive Services. It lets businesses use machine-assisted moderation of text, images, and videos to augment human review.

The text moderation capability now includes a new machine-learning-based text classification feature, which uses a trained model to identify potentially abusive, derogatory, or discriminatory language, including slang, abbreviated words, offensive words, and intentionally misspelled words, and flags it for review.

In contrast to the existing text moderation service that flags profanity terms, the text classification feature helps detect potentially undesired content that may be deemed inappropriate depending on context. In addition to conveying the likelihood of each category, it may recommend a human review of the content.

The text classification feature is in preview and supports the English language.

How to use

Content Moderator consists of a set of REST APIs. The text moderation API now accepts an additional request parameter, classify=True. If you set the parameter to true and the auto-detected language of your input text is English, the API outputs the additional classification insights shown in the following sections.

If you specify the language as English for non-English text, the API treats the text as English and outputs the additional insights, but they may not be relevant or useful.

The following code sample shows how to invoke the new feature by using the text moderation API.

using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Threading.Tasks;

namespace TextClassifier
{
    class Program
    {
        // Content Moderator key, API endpoint, and the new parameter
        public const string CONTENTMODERATOR_APIKEY = "YOUR API KEY";
        public const string APIURI = "https://[REGIONNAME].api.cognitive.microsoft.com/contentmoderator/moderate/v1.0/ProcessText/Screen";
        public const string CLASSIFYPARAMETER = "classify=True";

        static void Main(string[] args)
        {
            string ResponseJSON;
            string Message = "This is crap!";

            HttpClient client = new HttpClient();
            client.BaseAddress = new Uri(APIURI);

            // Append the classify parameter to the endpoint as a query string.
            string FullUri = APIURI + "?" + CLASSIFYPARAMETER;

            // Add an Accept header for JSON format.
            client.DefaultRequestHeaders.Accept.Add(
                new MediaTypeWithQualityHeaderValue("application/json"));

            // Authenticate the request with the subscription key.
            client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", CONTENTMODERATOR_APIKEY);

            // Post the message as plain text. Blocking on .Result keeps this console sample short.
            HttpResponseMessage response = client.PostAsync(FullUri, new StringContent(
                                   Message, System.Text.Encoding.UTF8, "text/plain")).Result;

            if (response.StatusCode == System.Net.HttpStatusCode.OK)
            {
                Console.WriteLine("Message insights:");
                ResponseJSON = response.Content.ReadAsStringAsync().Result;
                Console.WriteLine(ResponseJSON);
            }
            else
            {
                Console.WriteLine("Request failed with status code: " + response.StatusCode);
            }
        }
    }
}
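The query string can also be assembled by a small helper, which makes it easier to add an explicit language alongside classify. The following is a minimal sketch, not part of the original sample: the ScreenUriBuilder class and its Build method are hypothetical names, and the "language" parameter name and the "eng" code used in the comment are assumptions that should be checked against the API reference.

```csharp
using System;

public static class ScreenUriBuilder
{
    // Builds the request URI for the text moderation endpoint.
    // "classify" comes from this post; the "language" parameter name is an assumption.
    public static string Build(string endpoint, bool classify, string language = "")
    {
        string uri = endpoint + "?classify=" + (classify ? "True" : "False");

        // Append the language only when the caller supplies one (for example "eng").
        if (!string.IsNullOrEmpty(language))
        {
            uri += "&language=" + Uri.EscapeDataString(language);
        }

        return uri;
    }
}
```

Leaving the language empty lets the service auto-detect it, which matches the behavior described earlier in this section.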

Sample response

If you run the preceding sample console application, the resulting output shows the following classification insights. The ReviewRecommended value is set to true because the score for a classification was greater than the internal thresholds. Customers can either use the ReviewRecommended flag to determine when content is flagged for human review or apply custom thresholds based on their content policies. Scores range from 0 to 1.

 "Classification": {
    "ReviewRecommended": true,
    "Category1": { "Score": 0.0746903046965599 },
    "Category2": { "Score": 0.23644307255744934 },
    "Category3": { "Score": 0.98799997568130493 }
 }
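To act on these insights in code, the Classification block can be read with System.Text.Json. The sketch below is one possible approach under stated assumptions: ClassificationParser and its Result type are hypothetical names, and the code assumes the full response is a JSON object that contains a top-level "Classification" property shaped like the fragment above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

public static class ClassificationParser
{
    // Hypothetical container for the values read from the response.
    public sealed class Result
    {
        public bool ReviewRecommended { get; set; }
        public Dictionary<string, double> Scores { get; } = new Dictionary<string, double>();
    }

    // Assumes responseJson is a JSON object with a top-level "Classification" property.
    public static Result Parse(string responseJson)
    {
        using JsonDocument doc = JsonDocument.Parse(responseJson);
        var result = new Result();

        if (!doc.RootElement.TryGetProperty("Classification", out JsonElement classification))
        {
            return result; // No classification block, for example when classify was not requested.
        }

        foreach (JsonProperty property in classification.EnumerateObject())
        {
            if (property.Name == "ReviewRecommended")
            {
                result.ReviewRecommended = property.Value.GetBoolean();
            }
            else if (property.Value.ValueKind == JsonValueKind.Object &&
                     property.Value.TryGetProperty("Score", out JsonElement score))
            {
                // Each category object carries a single Score between 0 and 1.
                result.Scores[property.Name] = score.GetDouble();
            }
        }

        return result;
    }
}
```

Iterating over the properties instead of naming Category1 through Category3 keeps the parser working if the set of categories changes.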

Explanation of the response

  • Category1: Represents the potential presence of language that may be considered sexually explicit or adult in certain situations.
  • Category2: Represents the potential presence of language that may be considered sexually suggestive or mature in certain situations.
  • Category3: Represents the potential presence of language that may be considered offensive in certain situations.
  • Score: The score ranges from 0 to 1. The higher the score, the more strongly the model predicts that the category may be applicable. This preview relies on a statistical model rather than manually coded outcomes. We recommend testing with your own content to determine how each category aligns to your requirements.
  • ReviewRecommended: ReviewRecommended is either true or false depending on the internal score thresholds. Customers should assess whether to use this value or decide on custom thresholds based on their content policies.
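For teams that prefer custom thresholds over the ReviewRecommended flag, the decision can be reduced to a per-category comparison. This is a minimal sketch: ReviewPolicy and NeedsReview are hypothetical names, and any threshold values passed in are illustrative, not recommendations from the service.

```csharp
using System;
using System.Collections.Generic;

public static class ReviewPolicy
{
    // Returns true when any category score meets or exceeds its custom threshold.
    // Categories without a configured threshold are ignored.
    public static bool NeedsReview(
        IReadOnlyDictionary<string, double> scores,
        IReadOnlyDictionary<string, double> thresholds)
    {
        foreach (KeyValuePair<string, double> entry in scores)
        {
            if (thresholds.TryGetValue(entry.Key, out double limit) && entry.Value >= limit)
            {
                return true;
            }
        }

        return false;
    }
}
```

With the sample scores above and a Category3 threshold of 0.9, the method returns true. Raising every threshold to 0.99 makes it return false, which shows how a stricter or looser content policy changes the outcome.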

Benefits of machine-assisted text moderation

The text classification feature is powered by a blend of advanced machine learning and Natural Language Processing (NLP) techniques. It is designed to work across different text domains, such as chats, comments, and paragraphs.

Businesses use the text moderation service to block, approve, or review content based on their policies and thresholds. The service can augment human moderation in environments where partners, employees, and consumers generate text content. These include chat rooms, discussion boards, chatbots, eCommerce catalogs, documents, and more.

Next steps

Sign up for Content Moderator by using either the Azure portal or the Content Moderator human review tool. Get the API key and your region as explained in the Credentials article.

Use the text moderation API console to test-drive the capability online. Get started on your integration by using either the REST API samples or the .NET SDK samples.