
Cloud Scale Analytics meets Office 365 data – empowered by Azure Data Factory

Office 365 holds a wealth of information about how people work and how they interact and collaborate with each other, and this valuable asset enables intelligent applications to derive value and to optimize organizational productivity.

Today, application developers use the Microsoft Graph API to access Office 365 data in a transactional way. This approach, however, is not efficient when you need to analyze a large number of Office artifacts across a long time horizon. Further, Office 365 data is isolated from other business data and systems, which creates data silos and leaves opportunities for additional insight untapped.


Azure offers a rich set of hyperscale analytics services that come with enterprise-grade security and are available in data centers worldwide. By bringing Office 365 data into Azure, developers can harness the full power of Azure to build highly scalable and secure applications against the combination of Office 365 data and other business data.


This week at Ignite we announced the Public Preview of Microsoft Graph data connect, which enables secured, governed, and scalable access to Office 365 data in Azure. With this offering, for the very first time, all your data – organizational, customer, transactional, external – can come together for analytics and insights in a way that was not possible before.

Integrate Office 365 data at scale in Azure using Azure Data Factory

Azure Data Factory (ADF) is a managed data integration service that allows you to bring together diverse data sources in Azure and build operationalized ETL flows. With this added ability to bulk ingest Office 365 data, ADF’s collection of 70+ on-prem and cloud data connectors just got richer and more comprehensive.

Using visual tools in ADF, you can easily author a pipeline that does a one-time or scheduled ingestion of the Office 365 dataset of interest. As of today, the available datasets include email messages, calendar events, personal contacts, mail folders, and mailbox settings. More types of Office 365 data will be added over time. For each dataset, you can choose which columns to ingest, choose which groups of users to include, and set a time-based row filter.
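The three per-dataset choices described above (columns, user groups, and a time-based row filter) can also be pictured as a pipeline definition. The following is a minimal sketch in Python that only builds the JSON documents; it does not call any Azure API. The property names (`Office365Table`, `Office365Source`, `outputColumns`, `allowedGroups`, `dateFilterColumn`, `startTime`, `endTime`) and the table name `BasicDataSet_v0.Message_v0` are assumptions based on the Office 365 connector documentation and should be checked against the current reference. The dataset names, the group ID, and the `BlobSink` sink type are hypothetical placeholders.

```python
import json
from datetime import datetime, timezone

ISO_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def office365_dataset(name, table_name, linked_service):
    """Build a dataset definition that points at one Office 365 table."""
    return {
        "name": name,
        "properties": {
            "type": "Office365Table",  # assumed connector dataset type
            "linkedServiceName": {
                "referenceName": linked_service,
                "type": "LinkedServiceReference",
            },
            "typeProperties": {"tableName": table_name},
        },
    }


def office365_copy_activity(source_dataset, sink_dataset, columns,
                            group_ids, date_column, start, end):
    """Build a Copy activity with column, user-group, and time filters."""
    if start >= end:
        raise ValueError("start must be earlier than end")
    source = {
        "type": "Office365Source",  # assumed connector source type
        # Which columns to ingest.
        "outputColumns": [{"name": column} for column in columns],
        # Which groups of users to include.
        "allowedGroups": list(group_ids),
        # Time-based row filter, expressed in UTC.
        "dateFilterColumn": date_column,
        "startTime": start.astimezone(timezone.utc).strftime(ISO_FORMAT),
        "endTime": end.astimezone(timezone.utc).strftime(ISO_FORMAT),
    }
    return {
        "name": "CopyFromOffice365",
        "type": "Copy",
        "inputs": [{"referenceName": source_dataset, "type": "DatasetReference"}],
        "outputs": [{"referenceName": sink_dataset, "type": "DatasetReference"}],
        "typeProperties": {
            "source": source,
            "sink": {"type": "BlobSink"},  # hypothetical sink, adjust to your store
        },
    }


if __name__ == "__main__":
    activity = office365_copy_activity(
        "Office365Messages", "BlobMessages",
        columns=["Subject", "ReceivedDateTime"],
        group_ids=["<security-group-object-id>"],
        date_column="ReceivedDateTime",
        start=datetime(2018, 9, 1, tzinfo=timezone.utc),
        end=datetime(2018, 10, 1, tzinfo=timezone.utc),
    )
    print(json.dumps(activity, indent=2))
```

Keeping the filters in one place makes it easy to see exactly what a pipeline requests, which is the same scope an Office 365 administrator is asked to approve.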

Office 365 data access through ADF offers many additional benefits including:

  • Once data has landed in Azure, you can use Azure Databricks to prepare, transform, and further enrich the data with machine learning. ADF provides integration with Azure Databricks to execute Databricks Notebooks, Jars, and Python scripts.
  • You can bring in additional data sources to integrate with the Office 365 dataset. For example, you can combine customer sales and marketing data from Salesforce, SAP, and Dynamics 365 with email exchanges and meeting events in Office 365 to keep track of your sales pipeline and predict customer propensity.
  • You can publish curated and analyzed results into Azure Cosmos DB for consumption in geo-distributed applications. ADF provides efficient data loading into Azure Cosmos DB through the bulk executor library.
  • ADF uses service principals for secure service-to-service authentication. You can store the service principal key in Azure Key Vault (AKV), and ADF integrates with AKV to retrieve the key at runtime.
  • ADF provides iterative development and debugging, including the ability to export to and import from ARM templates.
  • ADF enables CI/CD through integration with VSTS Git and GitHub.
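The service principal and Key Vault point in the list above can be sketched as a linked service definition whose key is a reference to a Key Vault secret rather than an inline value. This is a minimal illustration in Python that only builds the JSON document. The `AzureKeyVaultSecret` reference shape and the `Office365` linked service properties are assumptions based on the ADF documentation, and every name and ID below is a hypothetical placeholder.

```python
import json


def key_vault_secret(vault_linked_service, secret_name):
    """Build a reference to a secret held in Azure Key Vault."""
    return {
        "type": "AzureKeyVaultSecret",
        "store": {
            "referenceName": vault_linked_service,
            "type": "LinkedServiceReference",
        },
        "secretName": secret_name,
    }


def office365_linked_service(name, office365_tenant_id,
                             sp_tenant_id, sp_id, sp_key_reference):
    """Build an Office 365 linked service that authenticates with a service principal."""
    # Refuse plain strings so a key is never embedded in the definition.
    if not isinstance(sp_key_reference, dict):
        raise ValueError("pass a Key Vault reference, not the key itself")
    return {
        "name": name,
        "properties": {
            "type": "Office365",  # assumed linked service type
            "typeProperties": {
                "office365TenantId": office365_tenant_id,
                "servicePrincipalTenantId": sp_tenant_id,
                "servicePrincipalId": sp_id,
                "servicePrincipalKey": sp_key_reference,
            },
        },
    }


if __name__ == "__main__":
    linked_service = office365_linked_service(
        "Office365LinkedService",
        office365_tenant_id="<office-365-tenant-id>",
        sp_tenant_id="<service-principal-tenant-id>",
        sp_id="<application-id>",
        sp_key_reference=key_vault_secret("KeyVaultLinkedService", "o365-sp-key"),
    )
    print(json.dumps(linked_service, indent=2))
```

Because the definition holds only a pointer to the secret, it can be exported to an ARM template or committed to Git without exposing the key.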

View and approve data access requests using Privileged Access Management

Office 365 privileged access management goes beyond traditional access control by enabling granular access for specific data pipelines. It gives the developer just enough access for their use, scoped down to the specific properties in the dataset they need. The customer’s Office 365 administrators can view all data users along with what access they have. Administrators may also exclude certain sensitive users.

If you develop with Azure managed applications, the Office 365 administrators will also be able to see which policies your application complies with.

Package and deploy your application using Azure managed applications

Azure managed applications enable you to offer cloud solutions that are easy for customers to deploy and operate and are easy for you to manage. You can publish to your organization’s service catalog for internal deployment or to the Azure Marketplace to be sold to external customers. The customer will not have access to the code or operations of your application, leaving this to be managed by you. An option coming soon will also let you choose not to have standing access to the application, as a security assurance for customers with sensitive data.

Finally, you may opt in to certain policies being enforced by Azure managed applications over your application. These policies are continuously verified and represented to the customer’s Office 365 administrator. If a policy is violated, the Data Factory pipeline fails.

ISV testimonial

For the last few months, we have been working with ISVs who find the value propositions offered by Microsoft Graph data connect really appealing. A great example of these early adopters is Harmon.ie.

Harmon.ie is a long-standing Microsoft partner that focuses on delivering people-centric applications for Office 365 and SharePoint. One of their solutions, Harmon.ie 10, is a next-generation application that allows information workers to view key topics in a graph view and easily navigate documents and calendar events by topic. Below is a quote from Harmon.ie on how they have benefited from Microsoft Graph data connect:

“We have been working on harmon.ie 10 for a couple of years with a public cloud provider and the two main issues were:

  • How to retrieve all of an organization’s email – it’s a massive amount of data and the traditional API method didn’t scale, and
  • How to protect our customers’ data privacy – machine learning on all the organization’s emails in the public cloud would be unsavory.

Migrating our code into an Azure Data Factory pipeline allowed us to simplify our source code and to provide our solution as a Managed App that runs in the customer Azure subscription without pulling the data away from the customer trusted data boundaries.

For us, this was a game changer.” – Yehonathan Sharvit, Vice President Research and Development, Harmon.ie

Get started today!

Read more about the Office 365 connector in ADF and follow this tutorial to build pipelines against Office 365 data using Azure Data Factory.

We cannot wait to see what you build! If you have any feature requests or want to provide feedback, please visit the Azure Data Factory forum.