Today, we are happy to announce the public preview of Named Entity Recognition as part of the Text Analytics Cognitive Service. Named Entity Recognition (NER) is the ability to take free-form text and identify the occurrences of entities such as people, locations, organizations, and more. With a single API call, NER in Text Analytics uses robust machine learning models to find and categorize more than twenty types of named entities in any text document.
Many organizations have messy piles of unstructured text in the form of customer feedback, enterprise documents, social media feeds, and more. However, it is challenging to understand what information these ever-growing stacks of documents contain. Text Analytics has long been helping customers make sense of these troves of text with capabilities such as Key Phrase Extraction, Sentiment Analysis, and Language Detection. Today's announcement adds to this suite of powerful, easy-to-use natural language processing solutions.
Named Entity Recognition and Entity Linking
Building upon the Entity Linking feature that was announced at Build earlier this year, the new Entities API processes text using both NER and Entity Linking capabilities. This makes it an extremely powerful solution for extracting as much structured information as possible from unstructured text.
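As a rough illustration of what a call to the Entities API might involve, the sketch below builds (but does not send) a request in Python. The endpoint path, the API version string, the `Ocp-Apim-Subscription-Key` header, and the document fields `id`, `language`, and `text` are assumptions that this post does not state; the helper name `build_entities_request` is hypothetical. Check the API reference for the exact values.

```python
import json
import urllib.request

# Assumed endpoint shape: the region and the version segment are placeholders
# and are not confirmed by this post.
ENDPOINT_TEMPLATE = (
    "https://{region}.api.cognitive.microsoft.com"
    "/text/analytics/v2.1-preview/entities"
)


def build_entities_request(texts, region, subscription_key, language="en"):
    """Build an HTTP POST request for a batch of documents without sending it."""
    # Each document gets a sequential string id so results can be matched back.
    documents = [
        {"id": str(index), "language": language, "text": text}
        for index, text in enumerate(texts, start=1)
    ]
    body = json.dumps({"documents": documents}).encode("utf-8")
    return urllib.request.Request(
        ENDPOINT_TEMPLATE.format(region=region),
        data=body,
        headers={
            "Content-Type": "application/json",
            # Assumed name of the authentication header.
            "Ocp-Apim-Subscription-Key": subscription_key,
        },
        method="POST",
    )
```

With a valid key, the request could then be sent with `urllib.request.urlopen(request)` and the JSON body of the response decoded with `json.loads`.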
Entity Linking is the ability to identify and disambiguate the well-known identity of an entity found in the text, for example, determining whether the word "Mars" is being used as the planet or as the Roman god of war. This process requires a knowledge base to which recognized entities are linked. Text Analytics uses knowledge bases from Bing and Wikipedia. When the Text Analytics Entities API recognizes an entity using Entity Linking, it will provide links to more information about the entity on the web.
Named Entity Recognition, in contrast, can identify the entities in unstructured text regardless of whether the entities are well-known or exist in a knowledge base. When Text Analytics identifies an entity using NER, it will provide the type of the entity (for example, person, location, or organization) in the API response. In some cases, it will also provide a subtype.
In cases where an entity is recognized using both Entity Linking and Named Entity Recognition, the API will return the entity's type as well as web links to more information about the entity.
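The three cases described above (Entity Linking only, NER only, or both) can be told apart by checking which fields an entity carries. The following is a minimal sketch, assuming each entity is returned as a JSON object with `name`, an optional `type`, and an optional `wikipediaUrl`; these field names and the sample entities are assumptions made for illustration, not output from the service.

```python
def describe_entity(entity):
    """Report which capability appears to have contributed to an entity."""
    has_type = bool(entity.get("type"))          # assumed to be set by NER
    has_link = bool(entity.get("wikipediaUrl"))  # assumed to be set by Entity Linking
    if has_type and has_link:
        source = "NER + Entity Linking"
    elif has_link:
        source = "Entity Linking"
    elif has_type:
        source = "NER"
    else:
        source = "unknown"
    return {
        "name": entity.get("name"),
        "source": source,
        "type": entity.get("type"),
        "link": entity.get("wikipediaUrl"),
    }


# Hypothetical entities, made up for illustration only.
sample_entities = [
    {"name": "Mars", "type": "Location",
     "wikipediaUrl": "https://en.wikipedia.org/wiki/Mars"},
    {"name": "Ashish Makadia", "type": "Person"},
    {"name": "Text Analytics",
     "wikipediaUrl": "https://en.wikipedia.org/wiki/Text_mining"},
]

for item in sample_entities:
    summary = describe_entity(item)
    print(f"{summary['name']}: {summary['source']}")
```

Using `dict.get` rather than direct indexing keeps the sketch tolerant of entities that lack a type or a link.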
Supported entity types
|Type||SubType||Example|
|Person||N/A*||"Jeff", "Ashish Makadia"|
|Location||N/A*||"Redmond, Washington", "Paris"|
|Quantity||Percentage||"50%", "fifty percent"|
|Quantity||NumberRange||"4 to 8"|
|Quantity||Age||"90 days old", "30 years old"|
|Quantity||Dimension||"10 miles", "40 cm"|
|DateTime||N/A*||"6:30PM February 4, 2012"|
|DateTime||Date||"May 2nd, 2017", "05/02/2017"|
|DateTime||DateRange||"May 2nd to May 5th"|
|DateTime||TimeRange||"6pm to 7pm"|
|DateTime||Duration||"1 minute and 45 seconds"|
* Depending on the input and the extracted entities, certain entities may omit the SubType.
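Because the SubType can be absent, client code should treat it as optional. The sketch below groups recognized text by type and subtype, using `None` when no subtype is present. The flat entity shape with `type`, `subType`, and `text` keys is an assumption for illustration; the sample values are taken from the examples in the table above.

```python
from collections import defaultdict


def group_by_type(entities):
    """Group entity text by (type, subtype); the subtype is None when omitted."""
    groups = defaultdict(list)
    for entity in entities:
        # .get returns None when the assumed "subType" field is missing.
        key = (entity["type"], entity.get("subType"))
        groups[key].append(entity["text"])
    return dict(groups)


# Sample values reuse the examples from the table of supported entity types.
sample = [
    {"type": "Quantity", "subType": "Percentage", "text": "50%"},
    {"type": "Quantity", "subType": "Percentage", "text": "fifty percent"},
    {"type": "DateTime", "subType": "Date", "text": "May 2nd, 2017"},
    {"type": "DateTime", "text": "6:30PM February 4, 2012"},
    {"type": "Person", "text": "Jeff"},
]

for (entity_type, sub_type), texts in group_by_type(sample).items():
    label = entity_type if sub_type is None else f"{entity_type}/{sub_type}"
    print(label, texts)
```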