Multi-Agentic AI deployment with ADK (Agent Development Kit) and MCP (Model Context Protocol) [Meridian]
Overview
Meridian Lite is a purpose-built demonstration of multi-agentic AI capabilities, deployed in the context of UK mergers and acquisitions intelligence. It coordinates a hierarchy of six AI specialists — each with distinct tools and domain expertise — to evaluate acquisition targets, assess market dynamics, and synthesise deal recommendations from private in-house and live public data.
The Lite designation is deliberate. This build strips away third-party data overhead and SaaS complexity, isolating the core agentic patterns (specialist routing, function tool-calling, session state sharing, and resilient failover) in a form that is legible, testable, and safe to demonstrate. The underlying infrastructure is production-grade: hardened access control, security controls, token budgeting, and session persistence. Only the data coverage is intentionally constrained.
- Multi-agent orchestration with intelligent specialist routing: a root coordinator invokes only the agents relevant to each query, not all five on every turn.
- Function tool-calling via the Model Context Protocol (MCP) against the Companies House REST API, Google Search as a grounded retrieval primitive, and a native Python financial screener tool.
- Persistent session state shared across agents: each specialist can read the outputs of its predecessors within the same conversation.
- Per-client access control and token budgeting suitable for live demos, with no restart required to rotate credentials.
- Resilient handling of model capacity limits: in-turn retry with exponential backoff, then automatic failover to a backup LLM (DeepSeek v3.1 via OpenRouter).
- A full-stack streaming implementation: WebSocket event pipeline from ADK runner through to real-time React UI.
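The retry-then-failover behaviour listed above can be sketched in a few lines. This is a minimal illustration, not the project's implementation: `CapacityError`, `call_with_failover`, the retry count, and the base delay are all hypothetical, and the two model clients are passed in as plain callables so the control flow can be exercised without a live LLM.

```python
import time
from typing import Callable, Tuple


class CapacityError(Exception):
    """Hypothetical stand-in for a 429 / 503 / RESOURCE_EXHAUSTED response."""


def call_with_failover(
    primary: Callable[[str], str],
    backup: Callable[[str], str],
    prompt: str,
    retries: int = 3,           # assumed value, not taken from the source
    base_delay: float = 0.5,    # assumed value, in seconds
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[str, str]:
    """Return (path, reply), where path is "primary" or "backup"."""
    for attempt in range(retries):
        try:
            return "primary", primary(prompt)
        except CapacityError:
            # Exponential backoff: base_delay, 2x, 4x, ...
            # There is no point sleeping after the final failed attempt.
            if attempt < retries - 1:
                sleep(base_delay * 2 ** attempt)
    # Every in-turn retry hit a capacity error: hand the turn to the backup model.
    return "backup", backup(prompt)
```

Returning the path alongside the reply lets the caller tell the user which model answered, which matters here because the backup path has no grounded search.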
Prerequisites:
The stack: Google ADK 2.1, Gemini 2.5 Flash, FastAPI, React 18, SQLite, MCP, Docker.
Architecture (The Agent Topology)
Meridian Lite runs six `LlmAgent` instances wired via the Google ADK `AgentTool` pattern. A root coordinator handles intent routing and the conversational persona. Five domain specialists handle the analytical work.
Each specialist carries a scoped system prompt and only the tools appropriate to its role. The `m_and_a_strategist` has no tools at all: it reasons over the outputs produced by the agents that ran before it in the same session, synthesising a deal perspective from structured financial data and market context without independently querying any external source.
Orchestration discipline is encoded in the root agent's system prompt: a single-domain query routes to one specialist only; a multi-domain question chains them in sequence — typically `financial_screener` → `industry_analyst` → `m_and_a_strategist`. The root never fans out to all five agents on a single turn.
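Because the routing rule lives in a system prompt, it is enforced by the model rather than by code, but it can still be checked after the fact. The sketch below is one way a test or eval check might validate a turn's trace. The function name `routing_violations` and the trace format (an ordered list of specialist names) are hypothetical, and the rule that the strategist must run last is an assumption inferred from its role of synthesising earlier outputs.

```python
from typing import List, Sequence

SPECIALISTS = (
    "financial_screener",
    "industry_analyst",
    "investor_matching_specialist",
    "owner_sentiment_analyst",
    "m_and_a_strategist",
)
STRATEGIST = "m_and_a_strategist"


def routing_violations(invoked: Sequence[str]) -> List[str]:
    """Return a list of human-readable problems for one turn's specialist trace."""
    problems: List[str] = []
    unknown = sorted(set(invoked) - set(SPECIALISTS))
    if unknown:
        problems.append("unknown specialist(s): " + ", ".join(unknown))
    # The root must never invoke all five specialists on a single turn.
    if set(SPECIALISTS) <= set(invoked):
        problems.append("fanned out to all five specialists in one turn")
    # Assumption: the strategist has no tools, so it only makes sense at the end.
    if STRATEGIST in invoked and invoked[-1] != STRATEGIST:
        problems.append("strategist ran before the agents whose output it synthesises")
    return problems
```

An empty list means the trace respects the routing rules as modelled here; the typical chain `financial_screener`, `industry_analyst`, `m_and_a_strategist` passes.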
The Request and Event Flow
ADK's `DatabaseSessionService` (SQLite) provides the shared state layer. Session data is scoped to a `(app, user, session)` key triple, persists across application restarts, and is available to every agent in the chain — meaning the strategist sees what the screener found without a second API call.
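To make the key-triple scoping concrete, here is a small stand-alone sketch using Python's built-in `sqlite3`. It does not reproduce ADK's actual schema or API: the table name, column names, and the helpers `open_store`, `save_state`, and `load_state` are hypothetical, chosen only to show how state written by one agent becomes readable by the next under the same `(app, user, session)` key.

```python
import json
import sqlite3
from typing import Tuple

Key = Tuple[str, str, str]  # (app, user, session)


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS session_state (
               app_name   TEXT NOT NULL,
               user_id    TEXT NOT NULL,
               session_id TEXT NOT NULL,
               state      TEXT NOT NULL,
               PRIMARY KEY (app_name, user_id, session_id)
           )"""
    )
    return conn


def save_state(conn: sqlite3.Connection, key: Key, state: dict) -> None:
    # One row per key triple; a later write replaces the earlier one.
    conn.execute(
        "INSERT OR REPLACE INTO session_state VALUES (?, ?, ?, ?)",
        (*key, json.dumps(state)),
    )
    conn.commit()


def load_state(conn: sqlite3.Connection, key: Key) -> dict:
    row = conn.execute(
        "SELECT state FROM session_state "
        "WHERE app_name = ? AND user_id = ? AND session_id = ?",
        key,
    ).fetchone()
    return json.loads(row[0]) if row else {}
```

In this model the screener would call `save_state` with its results, and the strategist would call `load_state` with the same key; any other key triple returns an empty dict.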
Fully Functional Elements
- Real-time chat with incremental WebSocket streaming.
- Companies House MCP: search, profile, officers, filings, charges, due diligence.
- Financial screener (`screen_companies`): ranked by EBITDA, turnover, margins, growth.
- Per-client access codes: SHA-256 hashed, per-code token budget, live revoke/mint.
- Multi-layer token budget: session 150k, daily 1M, per-turn 40k, per-request 8k.
- Session persistence: ADK `DatabaseSessionService` over SQLite, survives restart.
- Backup LLM failover: in-turn retry, then DeepSeek v3.1 via OpenRouter on capacity errors (429 / 503 / RESOURCE_EXHAUSTED).
- Signed 24h session tokens: `URLSafeTimedSerializer`.
- React chat UI: quota meter, tool trace, company cards, streaming agent response.
- Docker + Render Blueprint deployment.
A pytest suite covers token budgeting, authentication, access codes, agent topology wiring, orchestration routing, structured event emission, and failover messaging — verifying static wiring and budget logic without hitting the live LLM or Companies House API. Separately, a deterministic eval harness grades the agent topology across six dimensions — routing discipline, tool selection, grounding, output format, token-budget adherence, and reliability — against a golden set of M&A scenarios.
Placeholder Elements
These features are present in the architecture or dependency set but are not yet connected to live data sources. They were left out at this tier because of the third-party vendor overhead they require.
Web search on the backup path
The `industry_analyst`, `investor_matching_specialist`, and `owner_sentiment_analyst` agents rely on `google_search`, which is a Gemini-native ADK primitive unavailable when the system fails over to the OpenRouter path. On the backup, these specialists fall back to trained knowledge and emit an explicit disclaimer noting that figures are indicative and could not be verified against live sources. This is a known, accepted limitation of the Lite tier.
Owner and director sentiment
The sentiment agent draws on public web signals — news coverage, search-visible professional profiles — sourced through Google Search. No proprietary news feed, regulatory filing tracker, or social listening pipeline is wired at this tier. Outputs are qualitative and indicative.
Investor matching
Potential acquirer and PE/VC identification is driven by web search rather than a curated investor database. Results reflect publicly visible market participants; coverage is not exhaustive.
Advanced usage analytics
Quota data is streamed to the client after every turn and surfaced as a live budget meter. Richer historical analytics — usage charts, per-client reporting — are not yet built.
Security
Application-Level Controls
Access-code gate
Each client or demo session receives its own high-entropy code (192 bits via `secrets.token_urlsafe`). Only the SHA-256 hash is stored — plaintext codes never touch the database. Individual codes can be revoked and rotated live, with no application restart or redeploy required. Per-code token budgets allow different clients to receive different quota allocations.
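The mint-and-verify cycle described above can be sketched with the standard library alone. `secrets.token_urlsafe(24)` draws 24 random bytes, which is the 192 bits quoted. The function names and the in-memory set of hashes are hypothetical; the real store is a database, and its schema is not shown in this document.

```python
import hashlib
import hmac
import secrets
from typing import Iterable, Tuple


def mint_access_code() -> Tuple[str, str]:
    """Return (plaintext_code, sha256_hex). Only the hash should be persisted."""
    code = secrets.token_urlsafe(24)  # 24 random bytes = 192 bits of entropy
    digest = hashlib.sha256(code.encode("utf-8")).hexdigest()
    return code, digest


def verify_access_code(presented: str, stored_hashes: Iterable[str]) -> bool:
    """Hash the presented code and compare it against every active hash."""
    digest = hashlib.sha256(presented.encode("utf-8")).hexdigest()
    # compare_digest keeps each comparison constant-time.
    return any(hmac.compare_digest(digest, h) for h in stored_hashes)
```

Under this model, revoking a code means deleting its hash from the store, which takes effect on the next verification with no restart.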
Token budgeting as a security boundary
The four-layer budget (session, daily, per-turn, per-request) is not purely an operational cost control: it is also a prompt injection defence. An adversarial input designed to trigger a cascade of LLM calls — for example, a crafted message that instructs agents to loop, re-query, or produce verbose outputs — hits the per-turn ceiling before it can exhaust session or daily quota. The 8,000-token input ceiling independently blocks prompt-stuffing and context-flooding attacks at the request boundary.
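A rough sketch of how the four layers might interact is shown below, using the default limits quoted in this document. The class and function names are hypothetical, and the real enforcement points (and any per-code overrides) may differ. The point it illustrates is ordering: the per-request ceiling rejects oversized input before any model call, and the per-turn ceiling trips long before a runaway chain can drain the session or daily quota.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Limits:
    per_request: int = 8_000    # input ceiling for a single message
    per_turn: int = 40_000      # all LLM calls triggered by one user turn
    session: int = 150_000
    daily: int = 1_000_000


@dataclass
class Usage:
    turn: int = 0       # reset by the caller at the start of each turn
    session: int = 0
    daily: int = 0


def admit_request(input_tokens: int, limits: Limits = Limits()) -> bool:
    """Reject prompt-stuffing at the boundary, before any agent runs."""
    return input_tokens <= limits.per_request


def charge(usage: Usage, tokens: int, limits: Limits = Limits()) -> Optional[str]:
    """Record one LLM call; return the first exceeded layer, or None if within budget."""
    usage.turn += tokens
    usage.session += tokens
    usage.daily += tokens
    checks = (
        ("per_turn", usage.turn, limits.per_turn),
        ("session", usage.session, limits.session),
        ("daily", usage.daily, limits.daily),
    )
    for name, used, cap in checks:
        if used > cap:
            return name
    return None
```

For example, an injected instruction that makes agents loop at 10,000 tokens per call is stopped on the fifth call by the per-turn layer, having consumed only a third of the session budget.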
Signed session tokens
Tokens are issued by `itsdangerous`'s `URLSafeTimedSerializer` with a 24-hour TTL. The WebSocket handler rejects unsigned or expired tokens before any agent invocation.
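For readers unfamiliar with signed, time-limited tokens, the idea can be shown with the standard library. To be clear, this is a stand-in and not the `itsdangerous` API or wire format: `issue_token` and `verify_token` are hypothetical helpers that sign a timestamped payload with HMAC-SHA256 and reject anything tampered with or older than 24 hours, which is the behaviour `URLSafeTimedSerializer` provides in the real application.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

DAY_SECONDS = 24 * 3600


def issue_token(payload: dict, secret: bytes, now: Optional[float] = None) -> str:
    issued_at = int(time.time() if now is None else now)
    raw = json.dumps({"p": payload, "iat": issued_at}).encode("utf-8")
    body = base64.urlsafe_b64encode(raw).decode("ascii")
    sig = hmac.new(secret, body.encode("ascii"), hashlib.sha256).hexdigest()
    return body + "." + sig


def verify_token(token: str, secret: bytes, max_age: int = DAY_SECONDS,
                 now: Optional[float] = None) -> Optional[dict]:
    """Return the payload if the signature is valid and unexpired, else None."""
    body, sep, sig = token.rpartition(".")
    if not sep:
        return None  # no signature part at all
    expected = hmac.new(secret, body.encode("utf-8"), hashlib.sha256).hexdigest()
    # Check the signature before decoding anything from the body.
    if not hmac.compare_digest(sig.encode("utf-8"), expected.encode("utf-8")):
        return None
    data = json.loads(base64.urlsafe_b64decode(body))
    current = time.time() if now is None else now
    if current - data["iat"] > max_age:
        return None
    return data["p"]
```

A WebSocket handler following this pattern would call the verify step first and close the connection on `None`, so no agent is invoked for an unsigned or expired token.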
Admin surface minimised
The admin API (`/api/admin/codes`) returns HTTP 404 by default. It only activates when the `ADMIN_TOKEN` environment variable is explicitly set. All admin-token comparisons use constant-time equality to prevent timing attacks.
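The gating logic can be expressed as a small pure function, sketched below. The function name `admin_status` is hypothetical, and the 401 returned for a wrong or missing token is an assumption; the document only specifies the 404 when `ADMIN_TOKEN` is unset and the use of constant-time comparison.

```python
import hmac
from typing import Mapping, Optional


def admin_status(env: Mapping[str, str], presented: Optional[str]) -> int:
    """Return the HTTP status an admin request should receive."""
    expected = env.get("ADMIN_TOKEN")
    if not expected:
        # Admin surface disabled: behave as if the route does not exist.
        return 404
    if presented is None:
        return 401  # assumed status for a missing token
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(presented.encode("utf-8"), expected.encode("utf-8")):
        return 401  # assumed status for a wrong token
    return 200
```

In a real handler `env` would be `os.environ`; taking it as a parameter keeps the check easy to unit-test.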
CORS
Development origins are explicitly listed. Production restricts connections to the same-origin WebSocket endpoint (`wss://...onrender.com`).
LLM and Agentic Security (Google Gemini + ADK)
Gemini built-in safety filters
Google's responsible AI content classifiers evaluate every input and output at the model layer, independently of application code. Harmful, abusive, or policy-violating content is blocked before it reaches the agent's reasoning step.
System-instruction boundaries per specialist
Each of the six agents carries its own scoped system prompt. A crafted user message cannot override a specialist's domain constraints or re-task it outside its defined role — the system prompt is applied at the inference call, not derived from the conversation.
ADK agent isolation
Every specialist runs in its own `Runner` instance. A misbehaving sub-agent cannot directly access the state or tool results of a sibling. All cross-agent communication passes through the root coordinator as structured output, not as shared memory.
Session isolation
`DatabaseSessionService` scopes all state to a `(app, user, session)` key triple. Concurrent sessions cannot read each other's data, even when sharing the same SQLite file under load.
Grounding reduces hallucination surface
The `google_search` tool grounds industry analysis and sentiment outputs in verifiable web content. This reduces the viability of adversarial prompts designed to elicit confident but fabricated financial or legal claims — grounded answers are tied to retrievable sources rather than model interpolation.
Routing discipline as a prompt injection barrier
The root agent's orchestration prompt encodes strict constraints on which specialists are invoked and in what order. This makes it significantly harder for a crafted user message to force arbitrary tool calls, trigger out-of-sequence agent chains, or exfiltrate session state through a specialist's text output.
Further Enhancements
Meridian Lite demonstrates the foundational layer. The full platform extends it in several directions, each of which introduces additional third-party vendor access and operational overhead.
Notification on model response
Push alerts when a long-running multi-specialist analysis completes — relevant for deep research queries where the user moves away from the interface mid-run.
Deep research
Multi-turn, document-grounded research loops per acquisition target: iterative search, retrieval, and synthesis rather than single-pass agent chains.
ETL of bulk Companies House data
A snapshot ingestion pipeline over the Companies House bulk data extracts, transforming and loading filings at scale into a vector store. This enables semantic search over financial narratives — retrieving companies by conceptual similarity rather than exact metric filters — replacing on-demand MCP lookups with a pre-indexed corpus.
Data aggregation via MCP and third-party APIs
Richer signals at higher operational overhead: Financial Times deal coverage, LinkedIn director and ownership intelligence, Orbis company financial data, X/Twitter for real-time market sentiment. Each source is wired as an MCP server or API integration, extending the specialist tool sets without changing the agent architecture.
Full SaaS layer
Subscription billing, OAuth, multi-tenant organisation support, and a mobile client complete the transition from demo prototype to production product.
Review
Meridian Lite packages a meaningful and functional slice of the full advisory platform into a form that is safe to demonstrate, straightforward to deploy, and honest about its data boundaries. The agentic patterns at its core — specialist routing, function tool orchestration, cross-agent session state, and resilient failover — are production-transferable. They scale up with richer data connections; the architecture does not need to change.
For teams evaluating AI-augmented deal intelligence workflows, Meridian Lite offers a concrete, inspectable starting point: multi-specialist reasoning, live Companies House data, and hardened access control, running today on a single container.
