
Apache Spark Connector for SQL Server and Azure SQL is now open source

Accelerating big data analytics with the Spark connector for SQL Server 

We’re happy to announce that we have open-sourced the Apache Spark Connector for SQL Server and Azure SQL on GitHub. Born out of Microsoft’s SQL Server Big Data Clusters investments, the Apache Spark Connector for SQL Server and Azure SQL is a high-performance connector that enables you to use transactional data in big data analytics and persist results for ad-hoc queries or reporting. The connector allows you to use any SQL database, on-premises or in the cloud, as an input data source or output data sink for Spark jobs.

Why use the Apache Spark Connector for SQL Server and Azure SQL 

The Apache Spark Connector for SQL Server and Azure SQL is based on the Spark DataSourceV1 API and SQL Server Bulk API and uses the same interface as the built-in JDBC Spark-SQL connector. This allows you to easily integrate the connector and migrate your existing Spark jobs by simply updating the format parameter! 
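
As a rough illustration of that migration, the PySpark sketch below writes a DataFrame through a small helper whose only connector-specific input is the format string. The format name `com.microsoft.sqlserver.jdbc.spark` is taken from the connector's GitHub README rather than from this post, so treat it as an assumption to verify against the version you install. The helper `write_table` and its URL, table, and credential arguments are hypothetical placeholders.

```python
# Data source names: Spark's built-in JDBC source and the open-source connector.
# The connector name is an assumption taken from the project's README.
JDBC_FORMAT = "jdbc"
CONNECTOR_FORMAT = "com.microsoft.sqlserver.jdbc.spark"


def write_table(df, fmt, url, table, user, password, mode="append"):
    """Write a Spark DataFrame to a SQL table using the given data source format.

    `df` is expected to be a pyspark.sql.DataFrame. All other arguments are
    illustrative placeholders supplied by the caller.
    """
    (
        df.write.format(fmt)          # the only line that changes when migrating
        .mode(mode)                   # e.g. "append" or "overwrite"
        .option("url", url)           # JDBC URL, e.g. jdbc:sqlserver://host:1433;databaseName=db
        .option("dbtable", table)     # target table, e.g. dbo.sales
        .option("user", user)
        .option("password", password)
        .save()
    )
```

An existing job that calls `write_table(df, JDBC_FORMAT, ...)` would switch to `write_table(df, CONNECTOR_FORMAT, ...)`. The connector's JAR also has to be available on the cluster for the new format name to resolve.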

Notable features and benefits of the connector: 

  • Support for all Spark bindings (Scala, Python, R). 
  • Basic authentication and Active Directory (AD) keytab support. 
  • Reordered DataFrame write support. 
  • Reliable connector support for a single SQL Server instance. 

Depending on your scenario, the Apache Spark Connector for SQL Server and Azure SQL is up to 15X faster than the default connector. The connector takes advantage of Spark’s distributed architecture to move data in parallel, efficiently using all cluster resources.
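
On the read side, one way to spread work across a cluster is Spark's standard JDBC partitioning options. The sketch below assumes the connector accepts the same `partitionColumn`, `lowerBound`, `upperBound`, and `numPartitions` options as the built-in JDBC source, which the shared interface suggests but this post does not state. It is not a description of how the connector reaches the quoted speedup, only an illustration of parallel data movement. The helper name, format string (from the project's README), and all argument values are hypothetical.

```python
# Connector data source name; an assumption taken from the project's README.
CONNECTOR_FORMAT = "com.microsoft.sqlserver.jdbc.spark"


def read_table_partitioned(spark, url, table, user, password,
                           column, lower, upper, partitions):
    """Read a SQL table into a DataFrame using several parallel range queries.

    `spark` is expected to be a pyspark.sql.SparkSession. `column` must be a
    numeric, date, or timestamp column; Spark splits [lower, upper] into
    `partitions` ranges and issues one query per range.
    """
    return (
        spark.read.format(CONNECTOR_FORMAT)
        .option("url", url)
        .option("dbtable", table)
        .option("user", user)
        .option("password", password)
        .option("partitionColumn", column)   # standard Spark JDBC option
        .option("lowerBound", lower)         # bounds set the partition stride only;
        .option("upperBound", upper)         # they do not filter rows
        .option("numPartitions", partitions)
        .load()
    )
```

A call such as `read_table_partitioned(spark, url, "dbo.orders", user, password, "order_id", 1, 1_000_000, 8)` would request eight partitions over a hypothetical `order_id` column.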

Visit the GitHub page for the connector to download the project and get started! 

Get involved 

The release of the Apache Spark Connector for SQL Server and Azure SQL makes the interaction between SQL Server and Spark even more seamless. We are continuously evolving and improving the connector, and we look forward to your feedback and contributions!

Want to contribute or have feedback or questions? Check out the project on GitHub and follow us on Twitter at @SQLServer. 
