I’ve been committed to open source software for over a decade because it fosters a deep collaboration across the developer community, resulting in ground-breaking innovation. At the heart of open source is the freedom to learn from each other and share ideas, empowering the brightest minds to work together on the cutting edge of software development.
Over the last decade, Microsoft has become one of the largest open source contributors in the world, adding to Hadoop, Linux, Kubernetes, Python, and more. Not only have we released our own technologies like Visual Studio Code as open source, we have also collaborated on and contributed to existing open source projects. One of our proudest moments was when we became the release masters for YARN in late 2018, having open sourced over 150,000 lines of code, which enabled YARN to run on clusters 10x larger than before. We're actively growing our community of open source committers within Microsoft.
We’re constantly exploring new ways to better serve our customers in their open source journey. Our commitment is to combine the innovation open source has to offer with the global reach and scale of Azure. Today, we're excited to share a few important updates to accelerate our customers’ open source innovation.
Microsoft supported distribution of Apache Hadoop
Microsoft has been an early supporter of the Hadoop ecosystem since the launch of HDInsight in 2013. With HDInsight, we have been focused on delivering seamless integration of key Azure services like Azure Data Factory and Azure Data Lake Storage, with the power of the most popular open source frameworks to enable comprehensive analytics pipelines. To accelerate this momentum, we're pleased to share a Microsoft supported distribution of Apache Hadoop and Spark for our new and existing HDInsight customers. This distribution of Apache Hadoop is 100 percent open source and compatible with the latest version of Hadoop. Users can now provision a new HDInsight cluster based on Apache code that is built and wholly supported by Microsoft.
By providing a Microsoft supported distribution of Apache Hadoop and Spark, our customers will benefit from enterprise-grade security features like encryption, and native integration with key Azure stores and services like Azure Synapse Analytics and Azure Cosmos DB. Best of all, given that Microsoft directly supports this distribution, we can quickly provide support and upgrades to our customers and deliver the latest innovation from the Hadoop ecosystem. All of this will enable customers to innovate faster, without being restricted to proprietary technology just to use our support and features. Additionally, Azure will continue to develop a vibrant marketplace of open source vendors.
“We at Cloudera welcome the commitment from Microsoft to Apache Hadoop and Spark. Open-source is key to our mutual customers’ success. Microsoft’s initiative represents a strong endorsement of open-source for the enterprise and we are excited to continue our partnership with Cloudera Data Platform for Microsoft Azure,” said Mick Hollison, Chief Marketing Officer at Cloudera.
This is part of our strong commitment to Hadoop, open source analytics, and the HDInsight service. In addition to our deeper engagement in supporting open source Hadoop and Spark, in the coming months, we’ll enable the most requested features on HDInsight that lower costs and accelerate time to value. These include an improved provisioning and management experience, reserved instance pricing, low-priority virtual machines, and auto-scale.
We have always sought to meet customers where they are, from our decision four years ago to support HDInsight solely on Linux, to our recent move to bring the cluster distribution in-house. Customers don't need to take any specific actions to benefit from these changes. These upcoming improvements to HDInsight will be seamless and automatic, with no business interruption or pricing changes.
Welcome new PostgreSQL committers
Since the Citus Data acquisition, we have doubled down on our PostgreSQL investment based on the tremendous customer demand and developer enthusiasm for one of the most versatile databases in the world. Today, Azure Database for PostgreSQL Hyperscale is generally available, and it’s one of our first Azure Arc-enabled services.
The innovation and ingenuity of PostgreSQL continue to inspire us, and it would not be possible without the contribution and passion of a dedicated community. We will continue to contribute to PostgreSQL. Recently, we contributed pg_auto_failover to the community to share what we have learned from operating PostgreSQL at cloud scale.
To build on our investment in PostgreSQL, we're excited to welcome Andres Freund, Thomas Munro, and Jeff Davis to the team. Together, they bring a decade of collective experience and a leading track record as core committers to PostgreSQL. They, like the rest of the team, are engaging with and listening to the global Postgres community, as we work to deliver the best of cloud scale, security, and manageability to open source innovation.
We're committed to actively engaging the open source community and providing our customers with choice and flexibility. The true open source spirit is about collaboration, and we’re excited to combine the best of open source software with the breadth of Azure. Most importantly, we are bringing together the best minds and talented visionaries, both at Microsoft and in the broader open source community, to constantly improve our open source products and deliver the newest features to our customers. Here’s to open source!
Additional resources
- HDInsight Documentation: Your one-stop shop for learning all about this analytics platform.
- PostgreSQL Committers Blog: Learn more about the three new committers who have joined the team.