Solution architecture: Information discovery with deep learning and natural language processing

Social sites, forums and other text-heavy Q&A services rely heavily on tagging, which enables indexing and user search. Without appropriate tagging, these sites are far less effective. Often, however, tagging is left to the users’ discretion. Because users have neither lists of commonly searched terms nor a deep understanding of the site’s categorisation or information architecture, posts are frequently mislabelled. This makes that content difficult or impossible to find when it’s needed later.

By combining deep learning and natural language processing (NLP) with data on site-specific search terms, this solution helps improve tagging accuracy on your site. As a user types a post, it offers highly used search terms as suggested tags, making it easier for others to find the information the post provides.
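The suggestion step described above can be illustrated with a deliberately simple stand-in. The sketch below does not use deep learning. It only matches a post against a table of popular search terms and ranks the matches by how often each term is searched. The term counts, the function name `suggest_tags` and the `max_tags` parameter are all assumptions made for illustration, not part of the solution itself.

```python
import re
from collections import Counter

# Hypothetical search-term usage counts; a real site would derive these
# from its own search logs.
SEARCH_TERM_COUNTS = Counter({
    "python": 950,
    "sql server": 720,
    "docker": 610,
    "kubernetes": 580,
    "jupyter": 300,
})


def suggest_tags(post_text, term_counts, max_tags=3):
    """Return the most-searched terms that occur in the post text."""
    # Lowercase the text and reduce it to words separated by single spaces,
    # so multi-word terms such as "sql server" match on word boundaries.
    normalized = " ".join(re.findall(r"[a-z0-9]+", post_text.lower()))
    padded = f" {normalized} "
    matches = [term for term in term_counts if f" {term} " in padded]
    # Rank the matching terms by search popularity, most searched first.
    matches.sort(key=lambda term: term_counts[term], reverse=True)
    return matches[:max_tags]


if __name__ == "__main__":
    draft = "How do I connect to SQL Server from a Docker container using Python?"
    print(suggest_tags(draft, SEARCH_TERM_COUNTS))
```

A trained model would replace the substring match with a learned relevance score, but the ranking against site-specific search data would play a similar role.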

Implementation guidance

Products

Microsoft SQL Server

Data is stored, structured and indexed using Microsoft SQL Server.

GPU-based Azure Data Science Virtual Machine

The core development environment is a GPU-based Data Science Virtual Machine (DSVM) running Microsoft Windows Server 2016 on an NC24 instance.

Azure Machine Learning Workbench

The Workbench is used for data cleaning and transformation, and it serves as the primary interface to the Experimentation and Model Management services.

Azure Machine Learning Experimentation Service

The Experimentation Service is used for model training, including hyperparameter tuning.
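As a rough illustration of what hyperparameter tuning involves, the sketch below runs an exhaustive grid search in plain Python. It does not call the Experimentation Service. The helper `grid_search`, the stand-in objective `toy_score` and the parameter names `learning_rate` and `hidden_units` are assumptions chosen for the example.

```python
from itertools import product


def grid_search(train_and_score, grid):
    """Try every combination in `grid`; return (best_params, best_score)."""
    names = list(grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        # In practice this call would train a model and return a
        # validation metric where higher is better.
        score = train_and_score(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_score(learning_rate, hidden_units):
    # Hypothetical stand-in for a training run: the score is highest
    # at learning_rate=0.01 and hidden_units=128.
    return -abs(learning_rate - 0.01) * 100 - abs(hidden_units - 128) / 128


if __name__ == "__main__":
    grid = {
        "learning_rate": [0.001, 0.01, 0.1],
        "hidden_units": [64, 128, 256],
    }
    print(grid_search(toy_score, grid))
```

Grid search is only one option. Random search or a smarter sampling strategy may be preferable when each training run is expensive.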

Azure Machine Learning Model Management Service

The Model Management Service is used for deployment of the final model, including scaling out to a Kubernetes-managed Azure cluster.

Jupyter Notebooks on Azure Data Science VM

Jupyter Notebooks serve as the base IDE for the model, which was developed in Python.

Azure Container Registry

The Model Management Service creates and packages real-time web services as Docker containers. These containers are uploaded and registered via Azure Container Registry.
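A real-time scoring service of this kind is often built around two entry points: one that loads the model once at start-up, and one that scores each request. The sketch below shows that shape in plain Python, assuming a JSON request of the form {"text": "..."}. The function names `init` and `run`, the request format and the keyword table standing in for a trained model are all assumptions for illustration, not the service's documented interface.

```python
import json

_model = None  # populated by init()


def init():
    """Load the model once when the service starts."""
    global _model
    # Hypothetical stand-in for deserialising a trained model from disk:
    # a table mapping keywords to tags.
    _model = {
        "container": "docker",
        "cluster": "kubernetes",
        "notebook": "jupyter",
    }


def run(raw_request):
    """Score one JSON request and return the result as a JSON string."""
    try:
        text = json.loads(raw_request)["text"].lower()
    except (ValueError, KeyError, TypeError, AttributeError) as exc:
        # Malformed JSON, a missing "text" field or a non-string value.
        return json.dumps({"error": f"bad request: {exc}"})
    tags = sorted({tag for keyword, tag in _model.items() if keyword in text})
    return json.dumps({"tags": tags})


if __name__ == "__main__":
    init()
    print(run('{"text": "Deploy a container to a cluster"}'))
```

Keeping the load step separate from the scoring step means the cost of loading the model is paid once per container rather than once per request. `init` must be called before `run`.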

Azure Container Service Cluster

Deployment for this solution uses Azure Container Service running a Kubernetes-managed cluster. The containers are deployed from images stored in Azure Container Registry.