Loan ChargeOff Prediction with SQL Servercha

This solution demonstrates how to build and deploy a machine learning model with SQL Server 2016 with R Services to predict whether a bank loan will need to be charged off within the next three months.


Note: If you have already deployed this solution, click here to view your deployment.

Estimated provisioning time: 20 minutes

STOP before you proceed You need to accept the Terms of Use of the Data Science Virtual Machine on your Azure Subscription before you deploy this VM for the first time by clicking here.


There are multiple benefits to lending institutions equipping themselves with loan charge-off prediction data. Charging off a loan is a bank’s last resort for a loan that is severely in arrears. With prediction data at hand, a loan officer can offer personalised incentives like a lower interest rate or longer repayment period to help the customer keep making loan payments and thus prevent the loan from getting charged off. To obtain this type of prediction data, credit unions or banks often manually handcraft the data based on a customer’s past payment history and perform a simple statistical regression analysis. This method is highly liable to generate data compilation errors and not statistically sound.

This solution template offers an end-to-end solution for running predictive analytics on loan data and producing a score of charge-off probability. A PowerBI report will also walk you through the analysis and trend of credit loans and prediction of charge-off probability.

Business Manager Perspective

This loan charge-off prediction uses simulated loan history data to predict the probability of loan charge-off in the immediate future (next three months). The higher the score, the higher the probability of the loan getting charged off in the future.

With the analytics data, the loan manager is also presented with the trends and analytics of charge-off loans by branch locations. Characteristics of high charge-off risk loans will help loan managers to make a business plan for loan offering in that specific geographical area.

SQL Server R Services brings the compute to the data by allowing R to run on the same computer as the database. It includes a database service that runs outside the SQL Server process and communicates securely with the R runtime.

This solution template walks through how to create and clean up a set of simulated data, use various algorithms to train the R models, select the best performing model, perform charge-off predictions and save the prediction results back to SQL Server. A PowerBI report connects to the prediction results table and shows interactive reports with the user on the predictive analytics.

Data Scientist Perspective

SQL Server R Services brings the compute to the data by running R on the computer that hosts the database. It includes a database service that runs outside the SQL Server process and communicates securely with the R runtime.

This solution walks through the steps to create and refine data, train R models and perform scoring on the SQL Server machine. The final prediction results will be stored in SQL Server. This data is then visualised in PowerBI, which also contains a summary of the loan charge-off analysis and a charge-off prediction for the next three months. (Simulated data is shown in this template to illustrate the feature.)

Data scientists who are testing and developing solutions can work from the convenience of their R IDE on their client machine, while pushing the compute to the SQL Server machine. The completed solutions are deployed to SQL Server 2016 by embedding calls to R in stored procedures. These solutions can then be further automated with SQL Server Integration Services and SQL Server agent.

Click on the Deploy button to test the automation and the entire solution will be made available in your Azure subscription.


Your Azure subscription used for the deployment will incur consumption charges on the services used in this solution, approximately $1.15/hour for the default VM.

Please ensure that you stop your VM instance when not actively using the solution. Running the VM will incur higher costs.

Please delete the solution if you are not using it.


©2017 Microsoft Corporation. All rights reserved. This information is provided “as is” and may change without notice. Microsoft makes no warranties, express or implied, with respect to the information provided here. Third-party data was used to generate the solution. You are responsible for respecting the rights of others, including procuring and complying with relevant licences in order to create similar datasets.

Related solution architectures

Loan Credit Risk with SQL Server

By using SQL Server 2016 with R Services, a lending institution can make use of predictive analytics to reduce the number of loans they offer to those borrowers most likely to default, thereby increasing the profitability of their loan portfolio.