Architecting the Agentic Data Cloud: From System of Record to System of Action
Overview
For thirty years the enterprise data platform has been a place you go to look things up. The Agentic Data Cloud reframes it as a place where work actually gets done — autonomously, in real time, by software agents acting on your behalf.
In April 2026, Andi Gutmans (VP/GM, Data Cloud at Google Cloud) published a piece arguing that the arrival of capable AI agents breaks a foundational assumption of every data warehouse, lake, and lakehouse built to date: that a human sits at the end of the query, reads the answer, and decides what to do. This post is an architectural read of that thesis — what the Agentic Data Cloud is, why legacy stacks structurally fail at agent scale, and how the four-layer design closes the gap between thinking and doing.
The framing matters because it is not a feature announcement. It is a claim that the shape of the platform has to change. We will treat it the way we treat any architecture on this blog: diagram the moving parts, interrogate the design choices, and separate the durable ideas from the marketing.
Objectives
- Name the shift: understand why the industry is moving from System of Record → System of Intelligence → System of Action, and what each transition costs the architecture beneath it.
- Diagnose the failure modes: the four reasons a conventional "modern data stack" breaks when agents, not people, become the primary consumers.
- Walk the architecture: the four layers of an Agentic Data Cloud, from the autonomous workforce at the top to the active, multimodal engines at the base.
- Trace the solutions: how vertical integration, a cross-cloud lakehouse, and a knowledge flywheel answer the cost, openness, and trust problems respectively.
- Check it against production: the metrics and customer deployments Google offers as evidence the model is real, not aspirational.
The Shift: Record → Intelligence → Action
Enterprise data has moved through three eras. The System of Record captured what happened and stored it reliably. The System of Intelligence — the analytics era we still live in — turned that record into dashboards and forecasts, telling us what happened yesterday and what is likely to happen next. Both share a defining trait: they are reactive. They wait for a human to ask a question, then report the news rather than make it.
The System of Action inverts this. An agent does not wait to be asked; it monitors continuously, reasons over live data, and executes. The prediction that used to sit idle on a dashboard now triggers a transaction. The architecture's job is no longer to inform a decision — it is to be the decision loop.
Three concrete shifts sit underneath this reframe, and each one stresses a different part of the stack:
- Human scale → agent scale: Practitioners stop being the ones who run queries and become orchestrators of fleets of agents. The platform must absorb orders of magnitude more workload, generated by software that never sleeps and operates at digital speed.
- Reactive intelligence → proactive action: Forecasting becomes execution. An agent that can see a window to act but cannot reach the operational system in time has failed, regardless of how good its model is.
- Data → knowledge: Discovering and querying tables is no longer enough. An agent needs to know what "revenue" means versus "projected revenue," how assets relate, and how data is actually used. It also needs the roughly 90% of "dark data" trapped in contracts, specs, emails, images, and video: unstructured content a traditional warehouse never indexed.
Why Legacy Architectures Break at Agent Scale
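The "modern data stack" works for human-scale analytics. The argument is that it fails structurally, not incrementally, once agents are the primary consumers. The failure has four faces: the walled garden (data an agent cannot reach safely), the trust gap (context it cannot trust), the time factor (latency it cannot afford), and the cost spiral (economics that punish it for running continuously).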
The common root, the argument goes, is fragmentation. The modern stack is a patchwork in which no single provider owns the outcome: hyperscalers own infrastructure and rent out models, AI labs own models and rent infrastructure, and data vendors stitch borrowed components together. That division of labor is fine when a human absorbs the latency and reconciles the seams. It is fatal when an autonomous agent has to do it thousands of times an hour.
What an Agentic Data Cloud Actually Is
An Agentic Data Cloud is a System of Action that evolves the data platform from a static repository into a dynamic data reasoning engine. It merges analytical insight with transactional power in a single closed loop — moving past dashboards toward self-healing data systems. To do that without reintroducing the four failures, it has to satisfy three non-negotiable requirements.
- It must be AI-native. Efficient AI is infused into every layer, from hardware to software, rather than bolted onto a legacy database. That is the only way to process multimodal data, reason in real time, and reach agent scale without the cost spiralling.
- It must be flexible. Tear down the walled garden. Agents activate data across open formats, multiple clouds, and native engines, without first funding a multi-year modernization project to relocate the estate.
- It must be trusted. Governance is the bridge between intelligence and action. Every automated action must be permissioned, explainable, compliant, and safe, with the agent operating inside strict boundaries and full business context.
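The "trusted" requirement is easier to reason about with a concrete shape in front of you. The Python below is a minimal sketch, not anything from the source article or a Google Cloud API: `ActionRequest`, `Policy`, and `authorize` are hypothetical names. It assumes the simplest possible boundaries (an allow-list per agent and a spend limit) and records a reason for every decision, which is one way to make an automated action both permissioned and explainable.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ActionRequest:
    """A proposed action (hypothetical shape, for illustration only)."""
    agent_id: str
    action: str       # e.g. "issue_refund"
    amount: float     # monetary impact of the action
    rationale: str    # the agent's stated reason, kept for audit


@dataclass
class Policy:
    """Boundaries an agent must stay within (assumed, not a real API)."""
    allowed_actions: dict[str, set[str]]   # agent_id -> permitted actions
    max_amount: float
    audit_log: list[dict] = field(default_factory=list)

    def authorize(self, req: ActionRequest) -> bool:
        """Return True only if the action is permitted; log every decision."""
        permitted = req.action in self.allowed_actions.get(req.agent_id, set())
        within_limit = req.amount <= self.max_amount
        if not permitted:
            reason = "action not permitted for this agent"
        elif not within_limit:
            reason = "amount exceeds limit"
        else:
            reason = "ok"
        approved = permitted and within_limit
        # Approvals and denials are both logged so each outcome can be explained later.
        self.audit_log.append({
            "agent": req.agent_id,
            "action": req.action,
            "approved": approved,
            "reason": reason,
            "rationale": req.rationale,
        })
        return approved


policy = Policy(allowed_actions={"billing-agent": {"issue_refund"}}, max_amount=500.0)
ok = policy.authorize(
    ActionRequest("billing-agent", "issue_refund", 120.0, "duplicate charge"))
too_big = policy.authorize(
    ActionRequest("billing-agent", "issue_refund", 9000.0, "duplicate charge"))
```

The design choice worth noting is that denial is a logged outcome rather than an exception: an agent that is refused still leaves a trace an auditor can read.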
The Architecture: Four Layers
Google's instantiation of the pattern is a four-layer stack. Read it top-down as a request — an agent receives a goal, draws on context and memory, reaches for tools through a common protocol, and executes against active engines — then bottom-up as the result and any triggered action flowing back. Trust and governance run vertically across all four layers; an AI-native, vertically integrated foundation sits beneath them.
Layer 1 — The Agentic Workforce
The top layer is the workforce itself: a Data Science Agent, a Data Engineering Agent, and a Database Observability Agent, alongside a markedly more capable Conversational Analytics. Practitioners extend this through the Data Agent Kit, which exposes Model Context Protocol (MCP) tools, skills, and extensions with native understanding of the platform — agents that build and operate the platform, not just sit on top of it.
Layer 2 — Context and Memory
Agents are only as good as what they know. The Knowledge Catalog evolves from a technical inventory into a master of business semantics and enterprise context. An agent Memory Bank gives stateful recall for long-term learning across sessions, and AgentOps provides real-time telemetry into the agent's reasoning — the observability you need before you let software act on its own.
Layer 3 — Unified Orchestration and Tooling
This layer standardizes how agents reach data and how they act. MCP serves as the open, universal toolbox, with specific tools across Spanner, AlloyDB, Cloud SQL, Looker, and BigQuery. Its most consequential piece is the Unified Action Plane: the mechanism that lets an agent trigger a transaction and update operational systems the instant an analytical insight is reached. This is where "thinking" and "doing" are stitched back together.
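To illustrate what "one toolbox for reading and acting" might look like, here is a small sketch in plain Python. It deliberately does not use the MCP SDK or any Google Cloud client; the registry, the tool names (`analytics.forecast`, `inventory.on_hand`, `inventory.reorder`), and the in-memory stores are all assumptions made up for this example. The point is only the pattern: analytical reads and operational writes go through the same dispatch interface, so an insight can turn into a transaction in one step.

```python
from typing import Any, Callable

# Hypothetical in-memory stand-ins for an analytical store and an operational store.
ANALYTICS = {"sku-42": {"forecast_demand": 130}}
INVENTORY = {"sku-42": {"on_hand": 100, "reorder_qty": 0}}

TOOLS: dict[str, Callable[..., Any]] = {}


def tool(name: str):
    """Register a function under a tool name so agents can call it uniformly."""
    def register(fn: Callable[..., Any]) -> Callable[..., Any]:
        TOOLS[name] = fn
        return fn
    return register


@tool("analytics.forecast")
def forecast(sku: str) -> int:
    """Analytical read: forecast demand for a SKU."""
    return ANALYTICS[sku]["forecast_demand"]


@tool("inventory.on_hand")
def on_hand(sku: str) -> int:
    """Operational read: current stock for a SKU."""
    return INVENTORY[sku]["on_hand"]


@tool("inventory.reorder")
def reorder(sku: str, qty: int) -> dict:
    """Operational write: place a reorder against the operational store."""
    INVENTORY[sku]["reorder_qty"] += qty
    return {"sku": sku, "reordered": qty}


def call_tool(name: str, **kwargs: Any) -> Any:
    """Single entry point for every tool, whether it reads or writes."""
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**kwargs)


def restock_if_short(sku: str):
    """Insight (forecast exceeds stock) leads directly to an action (reorder)."""
    shortfall = call_tool("analytics.forecast", sku=sku) - call_tool("inventory.on_hand", sku=sku)
    if shortfall > 0:
        return call_tool("inventory.reorder", sku=sku, qty=shortfall)
    return None


result = restock_if_short("sku-42")
```

With a forecast of 130 and 100 units on hand, the sketch reorders the 30-unit shortfall. In a real deployment the write path would sit behind governance checks rather than being called directly.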
Layer 4 — Active, Multimodal Engines
At the base, the core engines are rebuilt as an operations centre rather than a passive store. Graph processing, vector search, multimodal handling, and integrated reasoning are native capabilities. Real-time AI functions process data as it arrives, and autonomous triggers fire actions the moment the business changes — closing the loop without waiting for a human to refresh a dashboard.
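The idea of an autonomous trigger can be sketched without any particular engine. The Python below is an assumption-laden illustration, not how BigQuery or Spanner implement triggers: `run_trigger` is a hypothetical function that evaluates a rolling-mean rule on each event as it arrives and fires a callback when the rule first becomes true, then re-arms once the condition clears.

```python
from collections import deque
from typing import Callable, Iterable


def run_trigger(events: Iterable[float], window: int, threshold: float,
                on_fire: Callable[[int, float], None]) -> None:
    """Evaluate a rolling-mean rule on each event as it arrives.

    Calls on_fire(index, mean) when the rolling mean first rises above the
    threshold, and re-arms once the mean falls back to or below it.
    """
    recent = deque(maxlen=window)   # keeps only the last `window` values
    armed = True
    for i, value in enumerate(events):
        recent.append(value)
        if len(recent) < window:
            continue                # not enough data for a full window yet
        mean = sum(recent) / window
        if armed and mean > threshold:
            on_fire(i, mean)
            armed = False           # avoid firing repeatedly for one incident
        elif mean <= threshold:
            armed = True


fired = []
latencies_ms = [100, 110, 105, 400, 420, 410, 120, 100, 90]
run_trigger(latencies_ms, window=3, threshold=300.0,
            on_fire=lambda i, m: fired.append((i, m)))
```

On this sample stream the rule fires once, at index 4, when the three-value mean first exceeds 300. The arm/re-arm flag is the design choice that matters: without it, a single incident would trigger the same action on every subsequent event.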
Solving the Three Legacy Problems
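The four layers are the structure; the design choices that make them viable are the more interesting part. Three of the four legacy failures (cost, the walled garden, and the trust gap) each get a direct architectural answer here; the fourth, the time factor, is what the Unified Action Plane in Layer 3 addresses.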
Cost → Vertical Integration
Rather than bolting AI onto a legacy database, the full stack is harmonized: efficient AI infrastructure beneath differentiated data systems. Integrating the models with the data infrastructure removes extra network hops and the operational tax of stitching rented components together — the difference between an AI program that scales and one that becomes an unpredictable line item.
Openness → The Cross-Cloud Lakehouse
The walled garden is answered with Lakehouse for Apache Iceberg, an open foundation that exposes multiple engines over the same data — BigQuery and Spanner, the Lightning Engine for Apache Spark, and AlloyDB for PostgreSQL. Omni technology unchains those engines to run across clouds, on-premises, and at the edge, producing a "borderless" lakehouse where an agent can use data sitting in AWS as if it were local — removing cross-cloud latency and the egress fees that usually punish that pattern.
Trust → The Knowledge Flywheel
The trust gap is the hardest of the three, because it cannot be solved by faster hardware. It is answered with universal context, organized as a flywheel that turns the catalog from a passive inventory into an active layer that gets more accurate the more it is used.
At the center sits the Knowledge Catalog, continuously aggregating data and enriching its meaning. Semantics are generated automatically — the LookML Agent produces them, BigQuery Measures define business metrics, and zero-copy federation pulls context from SAP, Salesforce Data360, and other applications without duplicating it. Smart Storage paired with Gemini extracts meaning from unstructured files, finally activating the dark data, while retrieval techniques borrowed from Google Search make sure the agent gets the right context at the moment it asks. Each rotation of the flywheel makes the next agent action more accurate.
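A toy model helps show what "more accurate the more it is used" could mean mechanically. The sketch below is hypothetical throughout: `Metric`, `KnowledgeCatalog`, `resolve`, and `record_use` are invented names, the expressions are illustrative strings rather than a real schema, and ranking by usage count is just one simple feedback signal chosen for the example, not the mechanism the article describes.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    name: str
    definition: str   # human-readable business meaning
    expression: str   # illustrative SQL-like expression, not a real schema


class KnowledgeCatalog:
    """Toy semantic catalog: resolves business terms and learns from usage."""

    def __init__(self, metrics: list[Metric]) -> None:
        self._metrics = {m.name: m for m in metrics}
        self._usage = Counter()

    def resolve(self, term: str) -> list[Metric]:
        """Return metrics whose name contains the term, most-used first."""
        term = term.lower()
        matches = [m for m in self._metrics.values() if term in m.name]
        # Ties in usage are broken in favour of the shorter, more exact name.
        return sorted(matches, key=lambda m: (-self._usage[m.name], len(m.name)))

    def record_use(self, name: str) -> None:
        """Feed back which definition an agent or analyst actually used."""
        if name not in self._metrics:
            raise KeyError(name)
        self._usage[name] += 1


catalog = KnowledgeCatalog([
    Metric("revenue", "Recognized revenue for closed periods",
           "SUM(amount) WHERE status = 'recognized'"),
    Metric("projected revenue", "Forecast revenue for open periods",
           "SUM(forecast_amount)"),
])
before = [m.name for m in catalog.resolve("revenue")]
for _ in range(3):
    catalog.record_use("projected revenue")
after = [m.name for m in catalog.resolve("revenue")]
```

Before any feedback, the exact match `revenue` ranks first; after three recorded uses of `projected revenue`, the ranking flips. A production catalog would need far richer signals than a counter, but the loop (resolve, use, record, resolve better) is the flywheel in miniature.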
The Closed Loop in Practice
Put the layers in motion and the payoff is a single loop that legacy stacks cannot form: live business events flow into active engines that reason over analytical history and operational reality together; an agent decides within its guardrails; the Unified Action Plane commits a transaction back to the operational system — fast enough that the window to act is still open.
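One pass through that loop can be sketched as follows. Everything here is assumed for illustration: the `Event` and `Decision` types, the pricing rule in `decide`, and the injected `commit` and `now` callables are hypothetical and stand in for the engines and action plane described above. The sketch combines analytical history with a live event, clamps the decision to a guardrail, and commits only if the window to act is still open.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Event:
    sku: str
    live_price: float
    act_before: float   # deadline, in seconds on the injected clock


@dataclass(frozen=True)
class Decision:
    sku: str
    new_price: float


def decide(event: Event, avg_price_history: float,
           max_change: float) -> Optional[Decision]:
    """Move the price toward its historical average, clamped to max_change."""
    delta = avg_price_history - event.live_price
    if delta == 0:
        return None
    delta = max(-max_change, min(max_change, delta))   # guardrail on step size
    return Decision(event.sku, round(event.live_price + delta, 2))


def closed_loop(event: Event, history: dict[str, float],
                commit: Callable[[Decision], None],
                now: Callable[[], float], max_change: float = 1.0) -> str:
    """One pass: reason over history plus live data, then act if still in time."""
    decision = decide(event, history[event.sku], max_change)
    if decision is None:
        return "no-op"
    if now() > event.act_before:
        return "expired"    # the insight arrived too late to act on
    commit(decision)        # write back to the operational system
    return "committed"


operational: dict[str, float] = {}
history = {"sku-42": 12.0}


def write(decision: Decision) -> None:
    operational[decision.sku] = decision.new_price


event = Event("sku-42", live_price=10.0, act_before=5.0)
status = closed_loop(event, history, commit=write, now=lambda: 3.0)
late = closed_loop(event, history, commit=write, now=lambda: 9.0)
```

The first call commits a price of 11.0 (the 2.0 gap clamped to the 1.0 guardrail); the second returns "expired" and writes nothing. Injecting the clock keeps the example deterministic and makes the deadline check explicit, which is the property a legacy stack with batch latency cannot offer.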
Evidence from Production
An architecture is only as credible as its mileage. Three numbers signal that the agentic pattern is already load-bearing rather than theoretical:
The migration figure is the most pointed: it implies organizations are actively ripping out the previous generation's "intelligent" platforms in favor of this model. Underwriting it all is a foundation built on Site Reliability Engineering — the discipline Google originated — and the same infrastructure that nine of the top ten AI labs already run on, with specialized models like TimesFM and WeatherNext and curated datasets such as Earth Engine available natively to agents.
"To secure the AI era, we needed a foundation that could think as fast as the threats evolve… Leveraging Spanner's unified graph and non-graph capabilities, we can now provide our customers with a seamless, highly scalable identity posture that enables AI agents to perceive and act on security gaps in real time."— Sreejith Rajkumar, Director of Engineering, Palo Alto Networks
"The Agentic Data Cloud allows us to dismantle the legacy silos and technical debt that once slowed us down. By integrating the operational reliability of Cloud SQL with the deep reasoning of BigQuery, we've created a data ecosystem where our developers and AI agents can validate, optimize, and innovate in real time."— Kristofer Shane Sikora, Executive Director, Cloud Data Engineering, CME Group
"To deliver AI that actually works across HR, payroll, and workforce operations, you need a consistent, real-time data layer… People Fabric is the backbone of UKG's Workforce Operating Platform — turning fragmented systems into a single source of truth that powers intelligent, agent-driven experiences."— Radhi Chagarlamudi, Group VP, Product Engineering, UKG
Review
Strip away the product names and the durable idea is simple: when the consumer of data changes from a person to an agent, the platform's job changes from answering to acting — and a stack designed for the former breaks at the seams of the latter. The four failures (walled garden, trust gap, time factor, cost spiral) are a genuinely useful diagnostic regardless of vendor, because they describe what an autonomous consumer cannot tolerate: data it cannot reach safely, context it cannot trust, latency it cannot afford, and economics that punish it for running continuously.
The Agentic Data Cloud's answers map cleanly onto those failures — vertical integration for cost, an open cross-cloud lakehouse for the walled garden, and a knowledge flywheel for trust — with the Unified Action Plane as the piece that actually closes the loop between insight and execution. The honest caveat is the obvious one: vertical integration is both the source of the efficiency claim and the lock-in risk, and "tear down the walled garden" is a more comfortable slogan for the vendor whose garden you are standing in. Read the architecture for its ideas, not its allegiances.
Still, the direction of travel is hard to argue with. The era of passive observation — of data that waits to be asked — is closing. The platforms that matter next will be the ones that can be trusted to act. That is the bet behind the System of Action, and it is a bet worth understanding before it is one you have to make.
Analysis based on "Architecting the agentic data cloud" by Andi Gutmans, VP/GM, Data Cloud, Google Cloud (April 2026). Source: cloud.google.com/transform/shift-system-of-action-architecting-the-agentic-data-cloud-ai. All product names and quotations belong to their respective owners; this piece is independent commentary.
