This project aims to help data scientists become familiar with the Microsoft Academic Graph through analytics and visualization samples using Data Lake Analytics (USQL) and Power BI.
Samples
The project contains 9 samples:
- Field of Study Top Authors
- Conference Top Authors By Static Rank
- Conference Paper Statistics
- Conference Top Papers
- Conference Top Authors
- Conference Top Institutions
- Conference Memory of References
- Conference Top Referenced Venues
- Conference Top Citing Venues
Getting Started
Prerequisites
- An Azure Data Lake Store with a copy of Microsoft Academic Graph
  - Azure Data Lake Store
  - Contact Academic API to get Microsoft Academic Graph access on your data lake store
- An Azure Data Lake Analytics account
- Power BI Desktop
- Visual Studio with Data Lake Tools
  - Included in Visual Studio 2017
  - Plug-in for Visual Studio 2015
Quickstart
Overview
- Download or clone the repository.
- Open the solution /src/AcademicAnalytics.sln
- Take a look at the Microsoft Academic Graph data schema, then run CreateDatabase.usql from common scripts.
- Each tutorial should contain a USQL script (.usql), a Power BI report (.pbix), a Power BI template (.pbit), and a README explaining the tutorial.
- Although each tutorial is different, running the USQL script as is and filling out the Power BI template with the same USQL parameters should give you a Power BI report whose visualizations match the example report included in the tutorial. Because the Microsoft Academic Graph is constantly improving, different graph versions may give you slightly different results.
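As a quick sanity check before starting a tutorial, the file list above can be verified with a small script. This is only a sketch: it assumes each tutorial lives in its own folder (the repository layout is not described here), and the helper name `missing_tutorial_files` is hypothetical.

```python
from pathlib import Path

# File types the overview says each tutorial should ship with.
REQUIRED_SUFFIXES = (".usql", ".pbix", ".pbit")


def missing_tutorial_files(tutorial_dir):
    """Return the expected items that are absent from one tutorial folder.

    Assumption: all of a tutorial's files sit directly in `tutorial_dir`.
    """
    files = [p for p in Path(tutorial_dir).iterdir() if p.is_file()]
    suffixes = {p.suffix.lower() for p in files}
    missing = [s for s in REQUIRED_SUFFIXES if s not in suffixes]
    # Accept README, README.md, readme.txt, and similar names.
    if not any(p.stem.lower() == "readme" for p in files):
        missing.append("README")
    return missing
```

An empty list means the folder has everything the overview describes; otherwise the list names what to look for.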
Working with USQL scripts
How to run
- Make sure you have selected your ADLA account
- Build the script first to validate syntax
- Submit your script to your ADLA account
How to view the results
- You can view the results in the Azure portal
Using Power BI
- Make sure the USQL script finished successfully
- Open the corresponding Power BI template (.pbit) from File Explorer (Visual Studio doesn't recognize Power BI files)
- Enter your ADL information and the parameters corresponding to your script
- Make sure the letter case of each parameter matches your script, then click to load
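Case mismatches between the script and the template are easy to miss by eye. The sketch below pulls simple `DECLARE @name type = value;` statements out of a USQL script and flags values typed into Power BI that differ only by letter case. It assumes the parameters are declared in that simple one-line form and that you key the entered values by the script's variable names; the function names and the sample parameter names are hypothetical, not taken from the tutorials.

```python
import re

# Matches simple one-line U-SQL declarations such as:
#   DECLARE @name string = "value";
DECLARE_PATTERN = re.compile(
    r'^\s*DECLARE\s+@(\w+)\s+[\w\.\?\[\]<>]+\s*=\s*(.+?);\s*$',
    re.MULTILINE,
)


def script_parameters(usql_text):
    """Return {parameter_name: value} with surrounding double quotes removed."""
    return {
        name: value.strip().strip('"')
        for name, value in DECLARE_PATTERN.findall(usql_text)
    }


def case_mismatches(script_params, entered_values):
    """Names whose entered value equals the script value except for letter case."""
    return [
        name
        for name, value in entered_values.items()
        if name in script_params
        and value != script_params[name]
        and value.lower() == script_params[name].lower()
    ]


if __name__ == "__main__":
    # Hypothetical script text and Power BI input, for illustration only.
    sample = 'DECLARE @conferenceShortName string = "WWW";\n'
    params = script_parameters(sample)
    print(case_mismatches(params, {"conferenceShortName": "www"}))
```

Any name the helper returns is a parameter to retype in the template dialog before loading.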
Resources
- Get started with Azure Data Lake Analytics using Azure portal
- Develop USQL scripts by using Data Lake Tools for Visual Studio
- Get started with USQL
- Deep Dive into Query Parameters and Power BI Templates
- Manage Azure Data Lake Store resources by using Storage Explorer
- Scalable Data Science with Azure Data Lake: An end-to-end Walkthrough
- Microsoft Academic Website