
Contract Testing for AI Agent Microservices: Pact and Schema Validation

Implement consumer-driven contract testing for AI agent microservices using Pact and JSON Schema validation. Catch breaking API changes before they reach production with automated CI integration.

Why Contract Testing Matters for Agent Microservices

In a monolithic agent, a changed function signature is caught almost immediately by the compiler, type checker, or linter. In a microservices architecture, if the RAG service changes its response format from {"documents": [...]} to {"results": [...]}, nothing flags the change until the conversation manager breaks at runtime. Integration tests might catch this, but they require running the entire system together, which is slow and fragile.

Contract testing sits between unit tests and integration tests. It verifies that two services agree on the shape of their API interaction without requiring both services to run simultaneously. Each side of the contract is tested independently, and mismatches are caught in CI before deployment.
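To make the idea concrete, here is a minimal, library-free sketch of what a contract check does with the response rename described above. The helper name `missing_fields` and the dotted-path notation are hypothetical, chosen only for illustration; real tools such as Pact do considerably more.

```python
# Illustrative only: the "contract" is the set of field paths the consumer reads.
# `missing_fields` and the path notation are hypothetical, not part of any library.

def missing_fields(response: dict, required_paths: list[str]) -> list[str]:
    """Return the required dotted paths that are absent from the response.

    A path segment of "[]" means "the first element of a non-empty list".
    """
    missing = []
    for path in required_paths:
        node = response
        for segment in path.split("."):
            if segment == "[]":
                if isinstance(node, list) and node:
                    node = node[0]
                    continue
            elif isinstance(node, dict) and segment in node:
                node = node[segment]
                continue
            # The segment could not be resolved, so the path is missing.
            missing.append(path)
            break
    return missing


# What the conversation manager relies on in the RAG response.
CONSUMER_EXPECTS = ["documents", "documents.[].content", "documents.[].score"]

old_response = {"documents": [{"content": "policy text", "score": 0.92}]}
new_response = {"results": [{"content": "policy text", "score": 0.92}]}

print(missing_fields(old_response, CONSUMER_EXPECTS))  # []
print(missing_fields(new_response, CONSUMER_EXPECTS))  # all three paths
```

The check needs only a sample provider response and the consumer's expectations, which is why neither service has to be running alongside the other.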

Consumer-Driven Contracts

In consumer-driven contract testing, the consumer (the service making the API call) defines what it expects from the provider (the service receiving the call). The conversation manager consumes the RAG service, so it defines the contract:

# test_rag_contract.py (consumer side: conversation manager)
import atexit

import httpx
from pact import Consumer, EachLike, Provider

pact = Consumer("ConversationManager").has_pact_with(
    Provider("RAGRetrieval"),
    pact_dir="./pacts",
)

# Start the Pact mock server once and stop it when the test process exits.
pact.start_service()
atexit.register(pact.stop_service)

def test_retrieve_documents_contract():
    """Define what the conversation manager expects from RAG."""
    # EachLike matches each list element by type rather than by exact value,
    # so the provider is not forced to return a score of exactly 0.92.
    expected_body = {
        "documents": EachLike(
            {
                "content": "Account balance policies state...",
                "score": 0.92,
                "metadata": {"source": "policy-docs"},
            }
        )
    }

    (
        pact.given("documents exist for the query")
        .upon_receiving("a retrieval request")
        .with_request(
            method="POST",
            path="/retrieve",
            headers={"Content-Type": "application/json"},
            body={
                "query": "account balance policy",
                "top_k": 5,
            },
        )
        .will_respond_with(
            status=200,
            headers={"Content-Type": "application/json"},
            body=expected_body,
        )
    )

    with pact:
        # Make the actual call against the Pact mock server
        response = httpx.post(
            f"{pact.uri}/retrieve",
            json={"query": "account balance policy", "top_k": 5},
        )
        assert response.status_code == 200
        data = response.json()
        assert "documents" in data
        assert len(data["documents"]) > 0
        assert "content" in data["documents"][0]
        assert "score" in data["documents"][0]

This test generates a Pact file, a JSON document describing the expected interaction, in the ./pacts directory. The Pact file is then shared with the provider team, for example as a CI artifact or through a Pact Broker.
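For orientation, a Pact file in the version 2 specification format looks roughly like the JSON embedded below. The content is abbreviated and hand-written for illustration, not actual output from the test above, and the `summarize` helper is hypothetical.

```python
import json

# Hand-written approximation of a v2-spec Pact file for the interaction above.
# Real files carry more detail (headers, matching rules, tool metadata).
SAMPLE_PACT = json.loads("""
{
  "consumer": {"name": "ConversationManager"},
  "provider": {"name": "RAGRetrieval"},
  "interactions": [
    {
      "description": "a retrieval request",
      "providerState": "documents exist for the query",
      "request": {
        "method": "POST",
        "path": "/retrieve",
        "body": {"query": "account balance policy", "top_k": 5}
      },
      "response": {
        "status": 200,
        "body": {
          "documents": [
            {"content": "Account balance policies state...", "score": 0.92}
          ]
        }
      }
    }
  ],
  "metadata": {"pactSpecification": {"version": "2.0.0"}}
}
""")


def summarize(pact: dict) -> list[str]:
    """One line per interaction: handy when reviewing a contract change."""
    lines = []
    for item in pact["interactions"]:
        req, resp = item["request"], item["response"]
        lines.append(
            f'{pact["consumer"]["name"]} -> {pact["provider"]["name"]}: '
            f'{req["method"]} {req["path"]} => {resp["status"]} '
            f'(given {item["providerState"]!r})'
        )
    return lines


for line in summarize(SAMPLE_PACT):
    print(line)
```

Because the file is plain JSON, a change to the contract shows up as an ordinary diff in code review.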

Provider Verification

The RAG service team replays the interactions in the Pact file against their actual service to verify that it honors the contract:

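# test_rag_provider.py (provider side: RAG service)
from pact import Verifier

# Written by the consumer test. pact-python names the file
# <consumer>-<provider>.json in lowercase; adjust if yours differs.
PACT_FILE = "./pacts/conversationmanager-ragretrieval.json"

def test_rag_provider_honors_contracts():
    # The RAG service must already be running at provider_base_url.
    verifier = Verifier(
        provider="RAGRetrieval",
        provider_base_url="http://localhost:8002",
    )

    # verify_pacts takes pact file paths (or URLs) as positional
    # arguments and returns (exit_code, logs).
    exit_code, _ = verifier.verify_pacts(
        PACT_FILE,
        provider_states_setup_url=(
            "http://localhost:8002/_pact/setup"
        ),
    )

    assert exit_code == 0, "Provider verification failed"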

The provider needs a state setup endpoint that configures test data:

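# Added to the RAG service for Pact verification.
# Expose this route in test environments only.
# Assumes a FastAPI app; `vector_store` is the service's own store object.
from fastapi import FastAPI, Request

app = FastAPI()  # in the real service, reuse the existing app instance

@app.post("/_pact/setup")
async def pact_provider_state(request: Request):
    # The verifier posts the provider state named in the contract.
    body = await request.json()
    state = body.get("state")

    if state == "documents exist for the query":
        # Seed the vector store with test documents
        await vector_store.insert_test_document(
            content="Account balance policies state...",
            metadata={"source": "policy-docs"},
        )
    elif state == "no documents exist":
        await vector_store.clear_test_data()

    return {"status": "ok"}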
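One weakness of an if/elif chain is that an unrecognised state falls through and still returns ok, so verification may run against unseeded data. A registry that rejects unknown states is one way around this. The sketch below is framework-free; `STATE_HANDLERS`, `provider_state`, `apply_state`, and the in-memory `FAKE_STORE` are hypothetical stand-ins for the service's real setup code.

```python
from typing import Callable

# In-memory stand-in for the vector store (hypothetical, for illustration).
FAKE_STORE: list[dict] = []

# Maps the state name used in the contract to its setup function.
STATE_HANDLERS: dict[str, Callable[[], None]] = {}


def provider_state(name: str):
    """Register a setup function under the state name used in the contract."""
    def register(func: Callable[[], None]) -> Callable[[], None]:
        STATE_HANDLERS[name] = func
        return func
    return register


@provider_state("documents exist for the query")
def seed_documents() -> None:
    FAKE_STORE.clear()
    FAKE_STORE.append({
        "content": "Account balance policies state...",
        "metadata": {"source": "policy-docs"},
    })


@provider_state("no documents exist")
def clear_documents() -> None:
    FAKE_STORE.clear()


def apply_state(name: str) -> None:
    """Run the handler for a state, failing loudly if none is registered."""
    try:
        handler = STATE_HANDLERS[name]
    except KeyError:
        raise ValueError(f"unknown provider state: {name!r}") from None
    handler()


apply_state("documents exist for the query")
print(len(FAKE_STORE))  # 1
```

Inside the setup endpoint, the `ValueError` could be translated into an error response so that a typo in a state name fails verification visibly instead of passing by accident.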

JSON Schema Validation as a Lightweight Alternative

When Pact feels too heavy, JSON Schema validation provides a simpler contract mechanism. Define schemas for each service's API and validate in tests:

# schemas/rag_response.json
{
    "$schema": "http://json-schema.org/draft-07/schema#",
    "type": "object",
    "required": ["documents"],
    "properties": {
        "documents": {
            "type": "array",
            "items": {
                "type": "object",
                "required": ["content", "score"],
                "properties": {
                    "content": {"type": "string", "minLength": 1},
                    "score": {
                        "type": "number",
                        "minimum": 0,
                        "maximum": 1
                    },
                    "metadata": {"type": "object"}
                }
            }
        }
    }
}

Validate responses against this schema inside the consumer's client, then exercise that client in your tests:

import json

import httpx
import jsonschema
import pytest

def load_schema(name: str) -> dict:
    with open(f"schemas/{name}.json") as f:
        return json.load(f)

class RAGClient:
    def __init__(self, base_url: str):
        self.base_url = base_url
        self.schema = load_schema("rag_response")
        self.client = httpx.AsyncClient()

    async def retrieve(self, query: str, top_k: int = 5) -> dict:
        resp = await self.client.post(
            f"{self.base_url}/retrieve",
            json={"query": query, "top_k": top_k},
        )
        resp.raise_for_status()
        data = resp.json()
        # Validate response matches expected schema
        jsonschema.validate(instance=data, schema=self.schema)
        return data

# Test (requires the pytest-asyncio plugin and a running RAG service)
@pytest.mark.asyncio
async def test_rag_response_matches_schema():
    client = RAGClient("http://localhost:8002")
    # retrieve() raises jsonschema.ValidationError if the response
    # no longer matches the schema, which fails the test.
    result = await client.retrieve("test query")
    assert isinstance(result["documents"], list)

CI Integration

Add contract verification to your CI pipeline so that breaking changes are caught before merge:

# .github/workflows/contract-tests.yml
name: Contract Tests

on:
  pull_request:
    branches: [main]

jobs:
  consumer-contracts:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - run: pip install pact-python pytest httpx
      - run: pytest tests/contracts/consumer/ -v
      - uses: actions/upload-artifact@v4
        with:
          name: pacts
          path: pacts/

  provider-verification:
    needs: consumer-contracts
    runs-on: ubuntu-latest
    strategy:
      matrix:
        service: [rag-retrieval, tool-execution, memory-service]
    steps:
      - uses: actions/checkout@v4
        with:
          repository: "org/${{ matrix.service }}"
      - uses: actions/download-artifact@v4
        with:
          name: pacts
          path: pacts/
      - run: pip install pact-python pytest
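      - run: |
          # --wait blocks until the service is running (or healthy, if a
          # healthcheck is defined), so verification does not race startup.
          docker compose up -d --wait ${{ matrix.service }}
          pytest tests/contracts/provider/ -v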

FAQ

How is contract testing different from integration testing?

Integration tests run multiple real services together and test end-to-end flows. Contract tests verify that two services agree on API shapes without running them simultaneously. Integration tests are slower (minutes), harder to debug, and catch issues late. Contract tests are fast (seconds), run independently per service, and catch API mismatches early. Use both — contracts in CI on every PR, integration tests nightly or before releases.

What should I include in an AI agent service contract?

Include the request path, method, required headers, request body shape, response status code, and response body shape. For agent services, pay special attention to the structure of LLM-related fields like token counts, model names, and streaming chunk formats. Do not include exact values for dynamic fields — use type matchers instead.
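To illustrate what a type matcher buys you, the sketch below compares a response against an example by type rather than by value. It is a simplified stand-in for Pact's matchers, not their implementation; `same_shape` and the field names (`model`, `usage`, `prompt_tokens`, `completion_tokens`, `latency_ms`) are assumptions made for illustration.

```python
def same_shape(example, actual) -> bool:
    """True if `actual` has the same keys and types as `example`, ignoring values.

    Dicts must contain every key in the example (extra keys are allowed);
    every element of a list must match the example's first element.
    """
    if isinstance(example, dict):
        return isinstance(actual, dict) and all(
            key in actual and same_shape(value, actual[key])
            for key, value in example.items()
        )
    if isinstance(example, list):
        if not isinstance(actual, list):
            return False
        return not example or all(same_shape(example[0], item) for item in actual)
    # bool is a subclass of int in Python, so compare exact types for scalars.
    return type(example) is type(actual)


# The example in the contract: values are placeholders, types are the point.
example = {
    "model": "example-model",
    "usage": {"prompt_tokens": 12, "completion_tokens": 34},
}

# Different values and an extra field: still compatible.
actual_ok = {
    "model": "another-model",
    "usage": {"prompt_tokens": 950, "completion_tokens": 8},
    "latency_ms": 41,
}

# Token count became a string: a breaking change for the consumer.
actual_bad = {
    "model": "another-model",
    "usage": {"prompt_tokens": "950", "completion_tokens": 8},
}

print(same_shape(example, actual_ok))   # True
print(same_shape(example, actual_bad))  # False
```

Exact-value matching would reject `actual_ok` as well, which is why dynamic fields such as token counts are better described by type.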

How do I handle contract testing for event-driven communication?

Pact supports message-based contracts. Instead of HTTP interactions, you define the expected message shape. The consumer specifies what events it expects to receive, and the provider verifies that it publishes events matching that shape. Because the contract covers the message payload rather than the transport, this works for events carried over Kafka, RabbitMQ, or NATS between agent services.
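The two-sided flow can be sketched without a broker or any library. Below, the consumer's handler is exercised with an example event, and the provider's event-building function is checked against the same example. All names (`EXPECTED_EVENT`, `handle_tool_result`, `build_tool_result_event`, `field_types`) and the event fields are hypothetical; Pact's message support supplies the real machinery.

```python
# Example event the consumer expects; this plays the role of the contract.
EXPECTED_EVENT = {"type": "tool.result", "call_id": "abc-123", "output": "42"}


# Consumer side: the handler must cope with the example event.
def handle_tool_result(event: dict) -> str:
    return f'{event["call_id"]}: {event["output"]}'


# Provider side: the function that builds the event before publishing it.
def build_tool_result_event(call_id: str, output: str) -> dict:
    return {"type": "tool.result", "call_id": call_id, "output": output}


def field_types(event: dict) -> dict:
    """Map each field name to its type name, e.g. {"call_id": "str"}."""
    return {key: type(value).__name__ for key, value in event.items()}


# Consumer test: runs against the example, no broker involved.
print(handle_tool_result(EXPECTED_EVENT))  # abc-123: 42

# Provider verification: the produced event has every expected field and type.
produced = build_tool_result_event("xyz-789", "done")
expected_types = field_types(EXPECTED_EVENT)
produced_types = field_types(produced)
print(all(produced_types.get(k) == t for k, t in expected_types.items()))  # True
```

Neither check touches the message transport, which mirrors how message contracts stay independent of the broker in use.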


Written by

CallSphere Team
