Dynatrace Pushes The Boat Out With Grail Data Lakehouse

Spiralling data is often said to be coming at us like a firehose. The ‘data firehose’ is not a formalized IT industry term for what is happening inside the cloud-native, poly-platform, continuously containerized world of modern software application development – but still – the notion of information now being produced at a massive rate fits nicely with the image of a firehose spewing forth a stream of water.

Today, we know that organizations across every business vertical are facing the data and information firehose, usually head-on.

Firms report scenarios where their data ingestion burden (the ‘feed’ of information they experience from applications, machines, edge devices, API connections and much more) is doubling every year. The core problem here, according to software intelligence company Dynatrace, is that most of the firms experiencing that doubling effect are not balancing it with a doubling of their ability to analyze that data.

Because companies need to analyze their data ingestion stream for security, operational robustness and business decision-making, that’s an unleashed firehose likely to cause some flooding damage.

Humans overloaded

According to Dynatrace’s 2022 Global CIO report, nearly three-quarters (71%) of chief information officers (CIOs) believe the explosion of data produced by cloud-native technology stacks is beyond human ability to manage. Why then hasn’t the promise of data analytics (remember big data anyone?) worked to empower us through this maelstrom and mire?

Because, say the new-age data analytics purists, traditional analytics solutions lack the flexibility, speed and scale required for real-time, precise analytics in multi-cloud, cloud-native ecosystems. But again, why isn’t traditional analytics able to work at this level?

According to Bernd Greifeneder, founder and chief technical officer at Dynatrace, it is because these traditional systems typically rely on sampled and/or incomplete data – a reality that means they cannot represent the dynamic relationships between the millions to billions of distinct components in a modern cloud architecture.

Interwoven multi-cloud topologies

What do millions-to-billions of distinct cloud components look like in reality? They look like a complex graph of multi-cloud topologies interwoven with application-to-application dependencies (and service-to-service, dataset-to-dataset, API-to-API dependencies and so on), all of which must be observed if the system is to function effectively. In addition, traditional analytics lacks capabilities that allow business and development teams to collaborate on data analysis projects.

The Dynatrace team says that its platform, with new graph analytics capabilities, addresses these complexities. The company is now extending its Grail data lakehouse technology beyond logs and business events to support metrics, distributed traces and multi-cloud topology and dependencies.

Before we remind ourselves what a data lakehouse is, why does the expansion of Dynatrace Grail in this context matter? Greifeneder says it is a question of being able to analyze not just some data, but ‘all’ data – and this is kind of where the company’s special sauce lies with causal AI – so here this means being able to execute analytics on all sorts of different data in stores that now run to petabyte scale and beyond.

“As companies collect data from more data sources, data itself is becoming more heterogeneous, so working out what all that information means and how to analyze it becomes much more difficult,” said Greifeneder. “We’re now at a point where observability and security have converged. Organizations need extensive automation and Everything-as-Code so that they are able to compose and create in new cloud-native deployment environments.”

With Grail, Dynatrace says it has combined the cost-effectiveness of a data lake with instantly queryable data, without having to define a database schema to describe the structure of the information set being targeted. Because Grail is schemaless, businesses can ask all sorts of queries without needing to define the schema (the structure of a database) upfront – Greifeneder and team call this approach schema-on-read.
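As an illustration only – this is a minimal Python sketch of the general schema-on-read idea, not Dynatrace’s actual Grail implementation – records can be stored exactly as they arrive, with structure imposed only at the moment a query runs:

```python
import json

# Raw log lines ingested as-is; no schema was declared at write time.
raw_logs = [
    '{"timestamp": "2023-02-01T10:00:00Z", "service": "checkout", "status": 500}',
    '{"timestamp": "2023-02-01T10:00:01Z", "service": "cart", "status": 200, "region": "eu"}',
]

def query(records, predicate):
    """Apply structure at read time: parse each record, skip malformed ones."""
    for line in records:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate records that do not parse
        if predicate(record):
            yield record

# Ask a question the writer never anticipated: which services returned errors?
errors = list(query(raw_logs, lambda r: r.get("status", 0) >= 500))
```

The point of the pattern is that new questions (here, filtering on a `status` field, or later on a `region` field only some records carry) need no upfront schema migration.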


What is a data lakehouse?

Part of the whole new universe of virtual data geographies, topologies and morphologies, a data lakehouse combines two worlds: the expansive, unstructured raw data reserves of the data lake (that place we ‘pour’ data into, often before we know what to do with it) and the more structured, ordered world of the data warehouse (that place we assemble data into – not always ready for live production environments, but at least labelled so we know what we have put on each box or crate). The more types of information we can put into the data lakehouse (as we are seeing happen here), the more ‘furnished’ it becomes.

In terms of how the Dynatrace data lakehouse is becoming a more sophisticated place, Grail’s ability to store, process and analyze the enormous volume and variety of data from modern cloud ecosystems is being expanded, while retaining that data’s context – and all of this happens without having to structure or rehydrate it.

In line with the Grail development, Dynatrace has unveiled a new user experience for its software intelligence platform, featuring dashboarding capabilities and a visual interface to help drive collaboration between development and business teams. This UX powers Dynatrace Notebooks, a new document capability that allows IT, development, security and business users to collaborate using code, text and rich media to build, evaluate and share insights from exploratory, causal-AI analytics projects.


The new capabilities add AI-powered graph analytics for custom queries to the analytics that already come out of the box with the Dynatrace platform. This delivers answers for an array of BizDevSecOps use cases.

Examples here include protecting customers and brands by conducting application security forensics to identify, mitigate or prevent data breaches; improving customer satisfaction and recovering revenue from e-commerce customers who were unable to finalize their online checkouts due to a service outage; and running more efficient multi-cloud operations by predicting cloud performance and utilization over time to optimize resource allocation based on user needs.

“The ability to conduct exploratory, causal-AI-based analytics on petabytes of unified observability, security and business event data multiplies the value of this data for our customers,” said Greifeneder. Organizations can now perform custom queries that leverage directed graphs that reflect a continuously updated ecosystem topology and dependencies to derive answers with causation.

“These answers are far more precise than results from correlated data analytics and powerful for proactive and reactive analytics for expansive BizDevSecOps use cases and automation. The new Dynatrace user experience with collaborative dashboards and notebooks is optimized for cross-team collaboration and interpreting and visualizing data in context. This allows more people from across organizations to make data-backed decisions and transforms the massive data from modern clouds into a goldmine for precise answers and intelligent automation,” added Greifeneder.
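The ‘directed graphs reflecting ecosystem topology’ that Greifeneder describes can be pictured with a small sketch. The following Python is purely illustrative (a hypothetical service topology, not Dynatrace’s model or API): edges point from a service to the services it depends on, and walking the graph answers a causal question such as “if the database degrades, which user-facing services could be affected?”

```python
from collections import deque

# Hypothetical service topology: edges point from a service to its dependencies.
topology = {
    "frontend": ["checkout", "search"],
    "checkout": ["payments", "inventory"],
    "search": ["inventory"],
    "payments": [],
    "inventory": ["database"],
    "database": [],
}

def downstream(graph, start):
    """Breadth-first walk: everything 'start' transitively relies on."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for dep in graph.get(node, []):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return seen

def upstream(graph, target):
    """All services whose dependency chain reaches 'target'."""
    return {svc for svc in graph if target in downstream(graph, svc)}

# If 'database' is degraded, which services could be affected?
affected = upstream(topology, "database")
```

Traversing an explicit dependency graph like this is what distinguishes causation-oriented analysis from simple correlation: the answer follows the actual call paths rather than co-occurring signals.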

Extending cloud observability

What’s happening here at a macro tech trend level is a process of sophistication designed to extend our ability to gain insight into cloud systems.

We used to talk about Application Performance Management (APM) as a process designed to help ‘look after’ (and look into, to observe) applications in every aspect, from their ability to perform, to their security, to the requirements to connect inside the network structures in which they reside. In truth, we do still talk about APM, but the discussion is widening.

Cloud computing has moved us on somewhat from ‘basic’ application management and throughout much of the last decade we have preferred to talk about observability instead. With so many computing and data resources now abstracted into the virtualized world of cloud, it seems to make more sense to think of orchestration and observability as a natural evolutionary progression, especially given the proliferation of cloud container technologies (which inherently require more orchestration) today.

We don’t really have a term yet for the next phase of application and service control, which is probably because firms like Dynatrace and others now refer to themselves as software intelligence specialists.

In full, Greifeneder now says Dynatrace is an ‘analytics and automation specialist for observability and security’. Here, the use of the word ‘security’ implies a focus at the business operations level – the ability to lock down the robustness of an organization’s supply chain, to ensure data privacy and to work with massively heterogeneous data sources throughout its application and services stack – rather than security in the cyber-defense sense of the word.

In terms of where all this intelligence goes next… let’s just observe.

