Consolidation looms for pack of upstack data tools
Industry activity suggests consolidation may lie ahead for a noisy ecosystem of upstack data tools that ride top data platforms. …
In recent years, a new breed of cloud data platforms has arisen right in the backyard of hyperscale mainstays such as AWS and Microsoft. Today, Snowflake, Databricks and a handful of others are successfully driving enterprise data efforts, enabling global giants to connect, store and generate insights from information flowing from different sources.
The solutions provide companies with tremendous power and capabilities. But their dominance has also triggered a “gold rush” of sorts. Case in point: a massive surge in the number of upstack tools for the data infrastructure.
A crowded ecosystem of tools has arisen in the wake of Snowflake’s and Databricks’ successes. The tool vendors seek to unlock the potential of modern data platforms. Yet as their ranks are growing, they may also see consolidation. Signs of that were seen earlier this week in analytics engineering house dbt Labs’ agreement to acquire Transform, which has sought to create a semantic data layer to better integrate the modern data stack.
While players like Snowflake and Databricks provide a platform to host the data and build applications, they can’t do it all. There are plenty of areas in the data lifecycle that these solutions do not fully serve, like data ingestion, transformation, orchestration, management and observability. Modern-day upstack tools, provided by third-party vendors, fill these gaps.
“A large number of companies are vying to provide different products and services to companies [that] are trying to build on top of the Snowflake and Databricks ecosystems,” according to Sean Knapp, founder and CEO of Ascend.io, which automates data and analytics engineering workloads. Knapp told VentureBeat that the problem of crowding in this space has been compounded by overfunding, resulting in many potential features thriving as separate companies.
Evolution of data monoliths
When data platforms rose to the fore, the earliest adopters looked to address their immediate pain points by building the required software solutions on their own. This was the first wave in the evolution of upstack data tools, when there was no pattern or widespread adoption to justify the existence of enterprise solutions.
Gradually, as needs emerged from the early adopter era, the second wave of point solutions arose. This is where most enterprises are right now. They take whatever specialized data tools they can find to solve small pieces of the puzzle and achieve significant gains in short timeframes.
Today, Snowflake and Databricks support partner tools in the dozens. Some popular ones come from dbt Labs, Matillion and Prophecy (for data prep and transformation); Hightouch, Hevo and Fivetran (for data ingestion); and Anomalo and Lightup (for data quality).
Meanwhile, business intelligence stalwarts like Alteryx, PowerBI and Tableau tailor analytics and visualization tooling now widely used in Snowflake and Databricks implementations.
There is much overlap in what the vendors provide, and many solutions also cover aspects like data science and observability.
Most available upstack tools do the job well, but when there are too many solutions for different capabilities on the same infrastructure, teams may end up architecting extremely complex data ecosystems. They have to assemble, integrate and manage all their disparate tools at the same time, which means paying not only for the technology in use but for engineering time and opportunity cost. This directly impacts ROI.
Further, when data bounces among multiple tools, it becomes very difficult to tune and optimize its movement and processing.
“Moving from a simple monolithic model to a complex model with hundreds or even thousands of interdependencies can lead to a data ecosystem that is difficult to understand and maintain, requires many costly licenses, and forces a steep learning curve for user training and onboarding,” Ben Haynes, co-founder and CEO of Directus, told VentureBeat. Directus fields a data platform which includes a “back-end-as-service engine” for developers along with no-code tooling for non-technical users.
The different component services within stacks are constantly moving objects.
“If one of the services advances and another stagnates or is no longer supported, the integrations and dependencies between them may break,” Directus’ Haynes added. “One dependency breaking can have a domino effect, bringing operations to a halt. Because microservices often don’t perfectly bookend to each other, there can also be gaps in capabilities that need to be filled with custom code and logic.”
Are new waves of consolidation ahead?
As teams tire of managing dozens of tools, and standard patterns emerge of what’s needed in the long run, the third wave, “rapid consolidation,” is expected to rise. Here teams will look to implement a single platform that unifies most, if not all, of the capabilities they use. Such capabilities often include ingestion, transformation and observability. Teams will look to reduce complexity and better focus on core product requirements.
“What our data does, how we’re doing it, or how we’re applying the information may be different, but there are many common patterns. As we see these patterns emerge, there’s tremendous value in creating a single platform that unifies a lot more of these capabilities,” Knapp explained.
“With consolidation, our teams don’t have to spend the majority of their time just cobbling together and integrating tools, which is non-value add,” he added. “The more unified system makes them more efficient and paves the way for new advancements. You can, for instance, apply really advanced layers of intelligence to data lifecycle because you have more unified metadata and can build automated systems.”
For his part, Directus leader Haynes sees a balanced “hub-and-spoke” model emerging, where the hub serves as a baseline of common or critical functionality, doing 80% of the job, but still provides the option to easily connect other business-critical hyper-specialized tools such as those from Stripe, Hubspot or Salesforce.
Broadly, the consolidation of upstack tools is expected to come through mergers and acquisitions, both those backed by private equity and, especially, those led by the dominant data platforms.
Snowflake, for instance, recently announced the decision to acquire Myst for time-series forecasting as well as SnowConvert to aid cloud migration. Similarly, last month, Thoma Bravo-owned Qlik announced its intent to join efforts with Talend, another Thoma Bravo-owned entity.
“It makes a ton of sense for the Snowflakes and the Databricks of the world to be very acquisitive. Whether we see really big acquisitions right now or whether they come towards the latter half of this year or the next year is a point of question. I’d probably bet more on the latter half of this year and early part of next year,” Knapp said. For Snowflake and Databricks, he added, there will be some level of caution around acquiring entities that could create competitive dynamics inside of their ecosystems.