The rise of big data and real-time analytics has changed the value proposition of SQL Server. As noted by Datanami, SQL-on-Hadoop offers significant benefits and is driving market competition, while TechTarget highlights that SQL pros are often left out of the loop as Java plays a more important role. What’s the future of SQL?
SQL Server 2016 was well received by users. Although many still prefer the 2014, 2012 or even 2008 R2 versions, features of interest, such as Microsoft’s new “Nano” SQL Server deployments, which require significantly less space to deliver a high-quality database experience, have encouraged many companies to switch. Thankfully, that success hasn’t stopped the Redmond giant from pushing SQL forward and finding new ways to broaden the appeal of its relational database technology.
According to TechCrunch, for example, Microsoft has now opened the Linux beta version of SQL Server to the public. And thanks to support for Docker containers, it’s possible to run the Linux iteration on macOS, creating a crossover many users thought would never occur. In fact, this is a significant departure for Microsoft, which has historically looked for ways to compete rather than collaborate with other tech giants, especially Apple. Microsoft’s server products such as SQL Server have long been Windows-only offerings, until now. In large measure, this change stems from a recognition that cloud and other distributed technology services are the way forward for IT, and that companies will no longer accept the notion that a single provider can handle all of their service and security needs. Given the massive success of Azure, it appears that Microsoft is taking this lesson to heart and now providing improved opportunities for companies and users who love SQL but prefer a non-Windows OS.
Along with the public release of SQL Server on Linux, the company also announced changes to programmability features: a large number of SQL Server users, including those running the free “Express” variant, now have access to features previously reserved for the Enterprise edition.
The Hadoop Happening
One of the biggest forward pushes for SQL in the marketplace at large is the development of SQL-on-Hadoop. According to the Datanami piece, efforts by multiple companies to build better Hadoop/SQL combinations have led to significant performance gains in mainstream products such as Hive, Impala, Spark SQL and Presto, with queries returning results two to four times faster than just a few months ago. Impala and Spark especially are delivering great large-join performance thanks to a feature called “runtime filtering,” which reduces the total volume of data that needs to be scanned.
Better still? Taking advantage of these gains doesn’t require companies to buy extra hardware or change query structure. It’s also worth noting that while open-source alternatives can’t quite match the speed of proprietary solutions yet, their development is progressing faster than that of similarly equipped single-company engines. In other words? Proprietary engines have the advantage for the moment, but not for long.
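To make the “runtime filtering” idea mentioned above more concrete, here is a minimal Python sketch. It is not how Impala or Spark implement the feature; the function name, the sample `stores` and `sales` rows, and the use of a plain set are all assumptions made for illustration (production engines generally rely on compact structures such as Bloom filters and apply them inside the storage scan). The point is only that keys collected from the small side of a join let the engine discard most rows from the large side before the join ever sees them.

```python
from collections import defaultdict


def join_with_runtime_filter(dim_rows, fact_rows, key):
    """Join two lists of dict rows on `key`, dropping non-matching
    fact rows at scan time. Hypothetical helper for illustration."""
    # Build phase: index the smaller (dimension) side by join key.
    build_index = defaultdict(list)
    for row in dim_rows:
        build_index[row[key]].append(row)

    # The "runtime filter": every key value that can possibly match.
    # A plain set stands in here for a Bloom filter or min/max range.
    runtime_filter = set(build_index)

    joined = []
    rows_skipped = 0
    for fact in fact_rows:
        if fact[key] not in runtime_filter:
            rows_skipped += 1  # filtered out before reaching the join
            continue
        for dim in build_index[fact[key]]:
            joined.append({**dim, **fact})
    return joined, rows_skipped


# Made-up sample data: two stores of interest, 100 sales across ten stores.
stores = [{"store_id": 1, "region": "West"}, {"store_id": 2, "region": "West"}]
sales = [{"store_id": i % 10, "amount": i} for i in range(100)]

joined, skipped = join_with_runtime_filter(stores, sales, "store_id")
print(len(joined), "rows joined;", skipped, "rows skipped by the filter")
```

In this toy data set only two of the ten store IDs survive the filter, so most of the large table is discarded without any change to the query itself, which is the same reason the article notes that no query rewrite is needed to benefit.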
No discussion of Hadoop would be complete without talking about language. As discussed in the TechTarget piece, what many data pros don’t talk about when it comes to SQL-on-Hadoop is that Java plays a larger role than T-SQL, meaning some database programming veterans are at risk of getting left behind. And while Microsoft could simply chalk this up to the inevitable shift of databases away from single-language design, the company is looking for a way to empower T-SQL users and get them back on board. The answer? U-SQL. Although it’s a dialect of T-SQL, there are several differences between the new offering and the standard SQL language. For example, it can handle disparate data; supports C# extensions and .NET libraries; automatically deploys code to run in parallel; and supports queries on all kinds of data, not just the relational data found in SQL. Currently part of Microsoft’s Azure Data Lake Analytics public preview, the new language is fundamentally designed to give data professionals “the access to a big data platform without requiring as much learning,” which may be a critical feature as IT pros are bombarded with new tools to master and new techniques to help streamline the performance of technology departments.
As noted by DZone, the first computerized database models emerged in the 1960s. By the 1970s, dedicated tools such as SQL had been developed. In the early part of the next decade, SQL became the de facto industry standard. The 1990s brought a significant shift, however, as single-server databases struggled to contend with massive data volumes and resource requirements. By the turn of the century, alternatives such as NoSQL and Hadoop emerged, even as data velocity, variety and volume continued to skyrocket. Today, companies are turning to scale-out SQL solutions rather than trying to match the pace of data by scaling up internally, especially as real-time analytics becomes a critical component of long-term corporate strategy.
Yet where does SQL go from here? One possibility is all-out replacement, where proprietary SQL databases are replaced by open-source alternatives or flexible databases from other companies. A more likely scenario, however, is the development of distributed SQL environments that provide access to scale on demand, support the integration of other toolsets, and enable the addition of real-time analytics tools. Microsoft’s current trajectory supports this aim: the move to develop a Linux SQL offering, support for ongoing Hadoop integration, and a focus on cooperating with, rather than competing against, open-source alternatives. What’s more, the company seems focused on lowering the bar for entry with initiatives such as U-SQL, which will help data professionals get on board more quickly and make better use of SQL resources.
Bottom line? The relational database isn’t dead; it is simply undergoing a cloud-enabled transformation. While it’s unlikely that SQL will retain its position as the de facto standard for organizations, Microsoft’s current efforts and future plans point to database development that should support SQL as an integral part of the new, cloud-connected, scale-out database environment.
Updated: January 2019