Quentin Clark is a general manager at General Catalyst, leading corporate investments in next-generation products and solutions.
Databases are a fundamental part of the modern world: everything we do today is powered by data, from online shopping to banking to streaming our favorite shows. Data is also critical to businesses as it helps them make better decisions based on facts, trends and statistics.
While databases and the surrounding data platforms have evolved over the decades, the way the technology is used is stuck in a linear process: data flows step by step through a single “stack.” Under this model, end users can only extract insights after data has passed through a long series of steps, by which point it is often already outdated.
Businesses today prioritize nonlinearity and real-time insights to maximize the value of their data. At the same time, companies are adopting more products and services, resulting in complex tech stacks that require better coordination between tools.
To get the most out of their data, enterprises need to rethink their data management frameworks and facilitate the arrival of a non-linear future.
A brief history of databases: the journey to multi-cloud
Before exploring the future of databases, it’s important to understand how we got where we are today. Automated databases emerged in the 1960s, when Charles W. Bachman designed the Integrated Data Store (IDS), widely regarded as the first DBMS. Soon after, IBM created its own Information Management System (IMS) database. Both are considered forerunners of navigational databases.
As computers became more commercially available, so too did general-purpose database systems that ran on early computer hardware (i.e. mainframes) and focused entirely on transactions. These databases were an instant hit due to their ability to scale with businesses of all sizes.
In the late 1980s and early 1990s, data warehouses emerged to move data from operational systems into decision-support systems. These had to be built as a separate architecture, creating new data workflows, governance and controls. The emergence of the World Wide Web also fueled the exponential growth of the database industry as companies themselves globalized. Then, in the mid-2000s, the rise of cloud computing, led by Amazon Web Services (AWS) among others, made it possible to host databases on virtual machines, giving companies access to data storage and analysis from anywhere.
While these systems have become faster, cheaper and better implemented, their fundamental architecture has not changed. Even as we take advantage of the cloud’s storage and compute scale, we are essentially deploying the same sequential data workloads as we did in the client-server era: transactional databases, ETL, data warehouses and analytics systems.
This outdated way of thinking is the result of iterative changes in needs and the industry successively benefiting from one technical innovation at a time. Modern business processes, app building and analytics are starting to feel the burden.
The linear process and the need for real-time data
Throughout the evolution of business data, the prevailing technology system has been composed as a “stack” – a linear process through which data flows. For example, once data is created, it is copied to the data warehousing system, where it is converted into more usable information; from there, the data is brought into an application or analytics tool, where it is applied. This all happens step by step.
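The steps above can be sketched as a strictly sequential pipeline. This is a minimal illustration, not any vendor’s actual architecture; the function and field names are invented for the example.

```python
# Minimal sketch of the linear "stack": each step runs only after the
# previous one completes, so insights lag behind the data's creation.
# All names and records here are illustrative.

def extract(source):
    """Step 1: pull raw records from an operational system."""
    return [{"order_id": 1, "amount": 120.0}, {"order_id": 2, "amount": 80.0}]

def transform(rows):
    """Step 2: reshape raw records into warehouse-friendly facts."""
    return [{"id": r["order_id"], "amount_usd": r["amount"]} for r in rows]

def load(facts):
    """Step 3: copy into the warehouse; only now can analytics begin."""
    return list(facts)

def analyze(warehouse):
    """Step 4: the dashboard finally sees an aggregate, much later."""
    return sum(f["amount_usd"] for f in warehouse)

# The pipeline is strictly sequential: a delay at any stage delays them all.
total = analyze(load(transform(extract("orders_db"))))
```

Because each stage waits on the one before it, the end user’s view of `total` is only as fresh as the slowest step in the chain.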
The problem: This linear, “stacked” process comes at the expense of a company’s ability to act on its data insights in real time. Information flows through a system that aggregates, averages and summarizes data across many sequential steps, creating longer delays before the data ever reaches an end user. Once the data hits a dashboard, there is no way to tell where it came from, and the system cannot detect that the failure of a single source of information will skew key metrics over time. As a result, business leaders do not learn about critical data events (e.g. a glitch or an outlier) until hours, days or even weeks later.
The same goes for batch-oriented, “back-end” data processing systems that rely on reports – these take hours or even days to generate analysis, leaving business leaders unaware of what is actually happening in their data at the moment.
I believe the solution is to break this traditional linearity and move toward something more event-driven. Business operations and customer experiences unfold in the present, so why can’t data do the same? Real-time analytics enable immediate action and increase productivity by allowing businesses to prevent problems before they arise. Non-linear data processing is a great way to get there, but this new wave of real-time insights requires a new generation of tools and innovation.
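To make the contrast with batch processing concrete, here is a minimal event-driven sketch: each incoming event updates a running metric and triggers an outlier check the moment it arrives, rather than waiting for a nightly report. The bus, topics and thresholds are all invented for illustration.

```python
# A minimal event-driven sketch, in contrast to batch reporting:
# handlers react the moment an event is published. All names and
# thresholds are illustrative assumptions, not a real product's API.

from collections import defaultdict

class EventBus:
    def __init__(self):
        self.handlers = defaultdict(list)

    def subscribe(self, topic, handler):
        self.handlers[topic].append(handler)

    def publish(self, topic, event):
        # Every subscriber reacts immediately, in publication order.
        for handler in self.handlers[topic]:
            handler(event)

bus = EventBus()
running_total = {"amount": 0.0}
alerts = []

def update_total(event):
    """Keep a live metric instead of recomputing it in a nightly batch."""
    running_total["amount"] += event["amount"]

def check_outlier(event):
    """Flag anomalies as they happen, not hours later in a report."""
    if event["amount"] > 1000:
        alerts.append(event)

bus.subscribe("order", update_total)
bus.subscribe("order", check_outlier)

bus.publish("order", {"order_id": 1, "amount": 120.0})
bus.publish("order", {"order_id": 2, "amount": 5000.0})
```

The second order trips the outlier check instantly, which is exactly the kind of “critical data event” a batch pipeline would surface only after the fact.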
Modern data orchestration
The modern data stack is not a single, linear entity: each enterprise uses multiple products and arranges them in its own, often confusing, ways. Orchestration allows for more than point-to-point connectivity, letting the data stack evolve into a data system (also known as a fabric, mesh or graph).
The greater challenge arises as data and compute patterns become increasingly complex and non-linear, making data movement, transformation and business flows harder to manage than ever before. Cloud computing has also spread data across multiple cloud vendors, leaving IT leaders without a single control plane and with fragmented data management.
A new generation of data-aware orchestration tools is emerging to solve this problem. These tools can unify, transform and enrich data from any source system into any target application. Historically, users who wanted to work with data pulled it from sources such as databases, Excel spreadsheets or CSV files; once validated, the data was converted to an acceptable format and loaded into a target destination (for example, a BI dashboard). Revitalized data orchestration frees businesses from these time-consuming and error-prone data workflows, which grow more complex every year.
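One way to picture orchestration as a “system” rather than a stack is as a dependency graph: tasks declare what they depend on, and an orchestrator runs them in any order that respects those dependencies. This sketch uses Python’s standard-library topological sorter; the task names are hypothetical and do not describe any specific orchestration product.

```python
# Sketch of orchestration as a dependency graph rather than a linear
# stack. Two sources feed one enriched dataset, which in turn feeds
# both a BI dashboard and a feature store: a graph, not a line.
# Task names are illustrative assumptions.

from graphlib import TopologicalSorter

dag = {
    "extract_db": set(),                           # no dependencies
    "extract_csv": set(),                          # can run in parallel
    "validate": {"extract_db", "extract_csv"},     # fan-in from both sources
    "enrich": {"validate"},
    "load_dashboard": {"enrich"},                  # fan-out to two targets
    "load_feature_store": {"enrich"},
}

# static_order() yields one valid execution order respecting the graph.
order = list(TopologicalSorter(dag).static_order())
```

The fan-in and fan-out here are exactly what a single linear stack cannot express: independent extracts can run concurrently, and one enrichment step can serve multiple downstream consumers.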
Modern orchestration software, along with a multi-source strategy that lends itself to multiple clouds, can enable enterprises to extract the most value from their data. We’ve seen data stacks move to multi-cloud, and now new data systems need to support the future.