Accelerate big data analytics with the Spark 3.0 compatible connector for SQL Server—now in preview.
We are announcing the preview release of the Apache Spark 3.0 compatible Apache Spark Connector for SQL Server and Azure SQL, available through Maven.
Open sourced in June 2020, the Apache Spark Connector for SQL Server is a high-performance connector that enables you to use transactional data in big data analytics and persist results for ad-hoc queries or reporting. It allows you to use SQL Server or Azure SQL as input data sources or output data sinks for Spark jobs. Until now, the connector supported Spark 2.4 workloads; with this preview, you can also use it with Spark 3.0 and take advantage of its many benefits.
Why use the Apache Spark Connector for SQL Server and Azure SQL
The Apache Spark Connector for SQL Server and Azure SQL is based on the Apache Spark DataSourceV1 API and SQL Server Bulk API and uses the same interface as the built-in Java Database Connectivity (JDBC) Spark-SQL connector. This allows you to easily integrate the connector and migrate your existing Spark jobs by simply updating the format parameter.
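As a rough illustration of what "updating the format parameter" can look like, here is a minimal PySpark-style sketch. The format name com.microsoft.sqlserver.jdbc.spark is the one used in the connector's GitHub documentation; the helper names (connection_options, read_table, write_table) and all connection values are hypothetical placeholders, not part of the connector.

```python
# Format names: Spark's built-in JDBC source vs. the SQL Server connector.
JDBC_FORMAT = "jdbc"
MSSQL_FORMAT = "com.microsoft.sqlserver.jdbc.spark"


def connection_options(server, database, table, user, password):
    """Build the option map used by both formats (all values are placeholders)."""
    return {
        "url": f"jdbc:sqlserver://{server};databaseName={database};",
        "dbtable": table,
        "user": user,
        "password": password,
    }


def read_table(spark, options, fmt=MSSQL_FORMAT):
    """Read a table into a DataFrame; only `fmt` differs from a plain JDBC read."""
    return spark.read.format(fmt).options(**options).load()


def write_table(df, options, fmt=MSSQL_FORMAT, mode="append"):
    """Write a DataFrame to a table through the chosen format."""
    return df.write.format(fmt).mode(mode).options(**options).save()
```

With an existing SparkSession named spark, an existing job that called read_table(spark, opts, fmt=JDBC_FORMAT) would, under these assumptions, migrate by passing MSSQL_FORMAT instead and leaving the options unchanged.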
Notable features and benefits of the connector:
• Compatible with Apache Spark 3.0.
• Support for all Apache Spark bindings (Scala, Python, R).
• Basic authentication, Active Directory (AD) Key Tab, and Azure Active Directory support.
To learn more about the connector and how to use it, visit the GitHub page. The 3.0 compatible connector is available through the spark-mssql-connector_2.12_3.0:1.0.0-alpha Maven coordinate.
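One way to pull the preview artifact into a PySpark session is through Spark's spark.jars.packages setting. The sketch below assumes the Maven group id com.microsoft.azure, which this post does not state, and with_connector is a hypothetical helper name.

```python
# Assumption: the group id "com.microsoft.azure" is not given in this post.
CONNECTOR_COORDINATE = (
    "com.microsoft.azure:spark-mssql-connector_2.12_3.0:1.0.0-alpha"
)


def with_connector(builder, coordinate=CONNECTOR_COORDINATE):
    """Return the session builder with the connector added as a Maven package."""
    return builder.config("spark.jars.packages", coordinate)


# Usage, assuming pyspark is installed:
#   from pyspark.sql import SparkSession
#   spark = with_connector(SparkSession.builder.appName("demo")).getOrCreate()
```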
The Apache Spark Connector for SQL Server and Azure SQL streamlines the interaction between SQL Server and Apache Spark. The connector has a growing and engaged community and has been installed thousands of times. We are continuously evolving and improving the connector, and we look forward to your feedback and contributions.
Want to contribute or have feedback or questions? Check out the project on GitHub and follow us on Twitter.
Note: The connector is community supported and does not include Microsoft SLA support. Please file an issue on GitHub to engage the community for help.