Contract Testing for AI Agent Microservices: Pact and Schema Validation

Why Contract Testing Matters for Agent Microservices

In a monolithic agent, a changed function signature is usually caught right away by the compiler, type checker, or linter. In a microservices architecture, if the RAG service changes its response format from {"documents": [...]} to {"results": [...]}, nothing fails until the conversation manager breaks at runtime. Integration tests might catch this, but they require running the entire system together, which is slow and fragile.

Contract testing sits between unit tests and integration tests. It verifies that two services agree on the shape of their API interaction without requiring both services to run simultaneously. Each side of the contract is tested independently, and mismatches are caught in CI before deployment.
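To make the idea concrete, here is a minimal sketch in plain Python rather than Pact itself. The names `RAG_CONTRACT`, `check_contract`, `format_answer`, and `rag_handler` are hypothetical; the only point is that the consumer and the provider are each checked against the same contract without ever calling each other.

```python
# Hypothetical shared contract: the response keys the consumer relies on.
RAG_CONTRACT = {"documents": list}


def check_contract(payload: dict, contract: dict) -> list[str]:
    """Return a list of human-readable contract violations (empty if none)."""
    problems = []
    for key, expected_type in contract.items():
        if key not in payload:
            problems.append(f"missing key: {key}")
        elif not isinstance(payload[key], expected_type):
            problems.append(f"{key} should be {expected_type.__name__}")
    return problems


def format_answer(rag_response: dict) -> str:
    """Consumer code under test: reads only what the contract promises."""
    return " ".join(doc["content"] for doc in rag_response["documents"])


def rag_handler(query: str) -> dict:
    """Stand-in for the provider's real request handler."""
    return {"documents": [{"content": f"result for {query}", "score": 0.9}]}


# Consumer side: run against a stub that satisfies the contract.
stub = {"documents": [{"content": "stubbed", "score": 1.0}]}
assert check_contract(stub, RAG_CONTRACT) == []
assert format_answer(stub) == "stubbed"

# Provider side: run the real handler and compare with the same contract.
assert check_contract(rag_handler("q"), RAG_CONTRACT) == []

# A renamed field is reported without the consumer running at all.
print(check_contract({"results": []}, RAG_CONTRACT))
```

Pact automates this pattern: the consumer test records the contract, and the provider replays it against the real service.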

Consumer-Driven Contracts

In consumer-driven contract testing, the consumer (the service making the API call) defines what it expects from the provider (the service receiving the call). The conversation manager consumes the RAG service, so it defines the contract:

# test_rag_contract.py: consumer side (conversation manager)
# Written against the pact-python v2-style API (Consumer / Provider).
import atexit

import httpx
from pact import Consumer, EachLike, Like, Provider

pact = Consumer("ConversationManager").has_pact_with(
    Provider("RAGRetrieval"),
    pact_dir="./pacts",
)
# Start the local Pact mock service; stop it when the test process exits.
pact.start_service()
atexit.register(pact.stop_service)

def test_retrieve_documents_contract():
    """Define what the conversation manager expects from RAG."""
    # Matchers pin types and structure, not exact values, so the provider
    # may return any non-empty list of documents with these field types.
    expected_body = {
        "documents": EachLike(
            {
                "content": Like("Account balance policies state..."),
                "score": Like(0.92),
                "metadata": Like({"source": "policy-docs"}),
            }
        )
    }

    (
        pact.given("documents exist for the query")
        .upon_receiving("a retrieval request")
        .with_request(
            method="POST",
            path="/retrieve",
            headers={"Content-Type": "application/json"},
            body={
                "query": "account balance policy",
                "top_k": 5,
            },
        )
        .will_respond_with(
            status=200,
            headers={"Content-Type": "application/json"},
            body=expected_body,
        )
    )

    with pact:
        # Make the actual call against the Pact mock server
        response = httpx.post(
            f"{pact.uri}/retrieve",
            json={"query": "account balance policy", "top_k": 5},
        )
        assert response.status_code == 200
        data = response.json()
        assert "documents" in data
        assert len(data["documents"]) > 0
        assert "content" in data["documents"][0]
        assert "score" in data["documents"][0]

This test generates a Pact file, a JSON document describing the expected interaction, in the ./pacts directory. The Pact file is then shared with the provider team, for example as a CI artifact or through a Pact Broker.
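For orientation, the sketch below shows roughly what such a file contains and how to summarise it. The dictionary follows the top-level layout of a Pact specification v2 file (`consumer`, `provider`, `interactions`), trimmed by hand for brevity; it is not output captured from a real test run, and `summarize_pact` is a hypothetical helper.

```python
import json

# Hand-written, trimmed example in the Pact v2 file layout (an assumption
# for illustration, not captured output).
EXAMPLE_PACT = {
    "consumer": {"name": "ConversationManager"},
    "provider": {"name": "RAGRetrieval"},
    "interactions": [
        {
            "description": "a retrieval request",
            "providerState": "documents exist for the query",
            "request": {"method": "POST", "path": "/retrieve"},
            "response": {"status": 200},
        }
    ],
}


def summarize_pact(pact: dict) -> list[str]:
    """One line per interaction: who calls whom, how, and what comes back."""
    pair = f'{pact["consumer"]["name"]} -> {pact["provider"]["name"]}'
    lines = []
    for item in pact["interactions"]:
        req, resp = item["request"], item["response"]
        lines.append(
            f'{pair}: {req["method"].upper()} {req["path"]} '
            f'=> {resp["status"]} ({item["description"]})'
        )
    return lines


# Round-trip through JSON to mimic reading the file from disk.
loaded = json.loads(json.dumps(EXAMPLE_PACT))
for line in summarize_pact(loaded):
    print(line)
```

A summary like this can be handy in code review, since it shows at a glance which interactions a consumer change has added or altered.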

Provider Verification

The RAG service team runs the Pact file against their actual service to verify they honor the contract:

# test_rag_provider.py: provider side (RAG service)
from pact import Verifier

def test_rag_provider_honors_contracts():
    verifier = Verifier(
        provider="RAGRetrieval",
        provider_base_url="http://localhost:8002",
    )

    # Pact files or directories are passed positionally; verify_pacts
    # returns (return_code, logs), where 0 means every interaction passed.
    return_code, _ = verifier.verify_pacts(
        "./pacts",
        provider_states_setup_url=(
            "http://localhost:8002/_pact/setup"
        ),
    )

    assert return_code == 0, "Provider verification failed"

The provider needs a state setup endpoint that configures test data:

# Added to the RAG service for Pact verification.
# Assumes a FastAPI app: `app` and `vector_store` are the service's
# existing application and vector store objects.
from fastapi import Request

@app.post("/_pact/setup")
async def pact_provider_state(request: Request):
    body = await request.json()
    state = body.get("state")

    if state == "documents exist for the query":
        # Seed the vector store with test documents
        await vector_store.insert_test_document(
            content="Account balance policies state...",
            metadata={"source": "policy-docs"},
        )
    elif state == "no documents exist":
        await vector_store.clear_test_data()

    return {"status": "ok"}
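As the number of provider states grows, the if/elif chain can be replaced with a lookup table that rejects unknown states instead of silently returning ok. The following is a sketch with hypothetical names: `FakeVectorStore` stands in for the service's real vector store, and `apply_state` is what the setup endpoint would call.

```python
import asyncio


class FakeVectorStore:
    """In-memory stand-in for the real vector store (hypothetical)."""

    def __init__(self):
        self.docs = []

    async def insert_test_document(self, content: str, metadata: dict):
        self.docs.append({"content": content, "metadata": metadata})

    async def clear_test_data(self):
        self.docs.clear()


async def seed_documents(store):
    await store.insert_test_document(
        content="Account balance policies state...",
        metadata={"source": "policy-docs"},
    )


async def clear_documents(store):
    await store.clear_test_data()


# Map each provider state named in a contract to its setup coroutine.
STATE_HANDLERS = {
    "documents exist for the query": seed_documents,
    "no documents exist": clear_documents,
}


async def apply_state(store, state: str) -> None:
    """Run the setup for one provider state; fail loudly on unknown states."""
    handler = STATE_HANDLERS.get(state)
    if handler is None:
        raise ValueError(f"unknown provider state: {state!r}")
    await handler(store)


store = FakeVectorStore()
asyncio.run(apply_state(store, "documents exist for the query"))
print(len(store.docs))
```

Failing on an unknown state matters because a typo in a state name would otherwise let verification run against unseeded data and fail for a misleading reason.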

JSON Schema Validation as a Lightweight Alternative

When Pact feels too heavy, JSON Schema validation provides a simpler contract mechanism. Define schemas for each service's API and validate in tests:

# schemas/rag_response.json
{
    "$schema": "http://json-schema.org/draft-07/schema#",
    "type": "object",
    "required": ["documents"],
    "properties": {
        "documents": {
            "type": "array",
            "items": {
                "type": "object",
                "required": ["content", "score"],
                "properties": {
                    "content": {"type": "string", "minLength": 1},
                    "score": {
                        "type": "number",
                        "minimum": 0,
                        "maximum": 1
                    },
                    "metadata": {"type": "object"}
                }
            }
        }
    }
}

Validate responses against this schema in your consumer tests:

import json

import httpx
import jsonschema
import pytest

def load_schema(name: str) -> dict:
    with open(f"schemas/{name}.json") as f:
        return json.load(f)

class RAGClient:
    def __init__(self, base_url: str):
        self.base_url = base_url
        self.schema = load_schema("rag_response")
        self.client = httpx.AsyncClient()

    async def retrieve(self, query: str, top_k: int = 5) -> dict:
        resp = await self.client.post(
            f"{self.base_url}/retrieve",
            json={"query": query, "top_k": top_k},
        )
        resp.raise_for_status()
        data = resp.json()
        # Validate response matches expected schema
        jsonschema.validate(instance=data, schema=self.schema)
        return data

# Test (the asyncio marker requires the pytest-asyncio plugin)
@pytest.mark.asyncio
async def test_rag_response_matches_schema():
    client = RAGClient("http://localhost:8002")
    # retrieve() raises jsonschema.ValidationError if the response no
    # longer matches the schema, which fails the test.
    result = await client.retrieve("test query")
    assert isinstance(result["documents"], list)
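`jsonschema.validate` raises a single `ValidationError`. When debugging a contract break it can help to see every violation at once, which `Draft7Validator.iter_errors` allows. The sketch below inlines a trimmed copy of the schema so it runs standalone; `contract_violations` is a hypothetical helper, and the exact message wording depends on the installed jsonschema version.

```python
from jsonschema import Draft7Validator

# Trimmed copy of the RAG response schema, inlined so the sketch runs alone.
RAG_RESPONSE_SCHEMA = {
    "type": "object",
    "required": ["documents"],
    "properties": {
        "documents": {
            "type": "array",
            "items": {
                "type": "object",
                "required": ["content", "score"],
                "properties": {
                    "content": {"type": "string", "minLength": 1},
                    "score": {"type": "number", "minimum": 0, "maximum": 1},
                },
            },
        }
    },
}


def contract_violations(payload: dict) -> list[str]:
    """Collect every schema violation as 'path: message' strings."""
    validator = Draft7Validator(RAG_RESPONSE_SCHEMA)
    errors = sorted(
        validator.iter_errors(payload),
        key=lambda err: [str(part) for part in err.absolute_path],
    )
    return [
        "/".join(str(part) for part in err.absolute_path) + ": " + err.message
        for err in errors
    ]


good = {"documents": [{"content": "policy text", "score": 0.92}]}
renamed = {"results": []}                                  # field was renamed
bad_item = {"documents": [{"content": "", "score": 1.7}]}  # two bad fields

for payload in (good, renamed, bad_item):
    print(contract_violations(payload))
```

The renamed payload yields one violation for the missing `documents` key, while the bad item yields two, one per offending field.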

CI Integration

Add contract verification to your CI pipeline so that breaking changes are caught before merge:

# .github/workflows/contract-tests.yml
name: Contract Tests

on:
  pull_request:
    branches: [main]

jobs:
  consumer-contracts:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - run: pip install pact-python pytest httpx
      - run: pytest tests/contracts/consumer/ -v
      - uses: actions/upload-artifact@v4
        with:
          name: pacts
          path: pacts/

  provider-verification:
    needs: consumer-contracts
    runs-on: ubuntu-latest
    strategy:
      matrix:
        service: [rag-retrieval, tool-execution, memory-service]
    steps:
      - uses: actions/checkout@v4
        with:
          repository: "org/${{ matrix.service }}"
      - uses: actions/download-artifact@v4
        with:
          name: pacts
          path: pacts/
      - run: pip install pact-python pytest
      - run: |
          docker compose up -d ${{ matrix.service }}
          pytest tests/contracts/provider/ -v
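One practical detail in the provider job: `docker compose up -d` returns as soon as the container starts, which can be before the service accepts requests. A small readiness check run before verification avoids flaky failures. This is a standard-library sketch; `http_ok` and `wait_until` are hypothetical helpers, and any health URL passed to them is an assumption about the service.

```python
import http.client
import time
import urllib.error
import urllib.request
from typing import Callable


def http_ok(url: str, timeout: float = 2.0) -> bool:
    """Probe: True if a GET to `url` returns a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeouts, and non-2xx responses all mean
        # "not ready yet" for the purposes of this check.
        return False


def wait_until(
    probe: Callable[[], bool],
    timeout: float = 30.0,
    interval: float = 0.5,
) -> bool:
    """Call `probe` repeatedly until it returns True or `timeout` elapses."""
    deadline = time.monotonic() + timeout
    while True:
        if probe():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

In a provider test this could run in a session-scoped fixture, for example `assert wait_until(lambda: http_ok("http://localhost:8002/health"))`, where the `/health` path is assumed rather than taken from the service above.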

FAQ

How is contract testing different from integration testing?

Integration tests run multiple real services together and test end-to-end flows. Contract tests verify that two services agree on API shapes without running them simultaneously. Integration tests are slower (minutes), harder to debug, and catch issues late. Contract tests are fast (seconds), run independently per service, and catch API mismatches early. Use both: contract tests in CI on every PR, and integration tests nightly or before releases.

What should I include in an AI agent service contract?

Include the request path, method, required headers, request body shape, response status code, and response body shape. For agent services, pay special attention to the structure of LLM-related fields like token counts, model names, and streaming chunk formats. Do not include exact values for dynamic fields — use type matchers instead.
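The idea behind type matchers can be illustrated without Pact: compare an actual response to an example by structure and type, ignoring concrete values. The function below is a simplified stand-in written for illustration, not Pact's matching implementation, and `same_shape` is a hypothetical name.

```python
def same_shape(example, actual) -> bool:
    """True if `actual` has the keys and value types that `example` has.

    Extra keys in `actual` are allowed. Lists must be non-empty and every
    element must match the first example element.
    """
    if isinstance(example, dict):
        return isinstance(actual, dict) and all(
            key in actual and same_shape(value, actual[key])
            for key, value in example.items()
        )
    if isinstance(example, list):
        if not isinstance(actual, list):
            return False
        if not example:
            return True  # no element example to compare against
        return len(actual) > 0 and all(
            same_shape(example[0], item) for item in actual
        )
    # bool is a subclass of int in Python, so handle it before numbers.
    if isinstance(example, bool) or isinstance(actual, bool):
        return isinstance(example, bool) and isinstance(actual, bool)
    if isinstance(example, (int, float)):
        return isinstance(actual, (int, float))
    return isinstance(actual, type(example))


EXAMPLE = {
    "documents": [
        {"content": "Account balance policies state...", "score": 0.92}
    ]
}

different_values = {
    "documents": [{"content": "Refund policy...", "score": 0.4, "extra": 1}]
}
wrong_type = {"documents": [{"content": "Refund policy...", "score": "high"}]}

print(same_shape(EXAMPLE, different_values))  # True: values differ, types match
print(same_shape(EXAMPLE, wrong_type))        # False: score is a string
```

Matching on type keeps the contract stable when dynamic values such as relevance scores or token counts change between runs.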

How do I handle contract testing for event-driven communication?

Pact supports message-based contracts. Instead of HTTP interactions, you define the expected message shape. The consumer specifies what events it expects to receive, and the provider verifies it publishes events matching that shape. This works for Kafka, RabbitMQ, and NATS events between agent services.
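The shape of a message contract can be sketched in plain Python before bringing in Pact's message support. Everything here is hypothetical: the `document.indexed` event, its fields, and the function names are invented for illustration. The consumer test feeds the agreed example to its handler, and the provider test checks that what it would publish has the same fields and types, with no broker involved.

```python
import json

# Hypothetical event the conversation manager expects to receive.
EXPECTED_EVENT = {
    "type": "document.indexed",
    "document_id": "doc-123",
    "chunk_count": 4,
}


def handle_document_indexed(raw: bytes) -> str:
    """Consumer handler: parses the event and uses only contracted fields."""
    event = json.loads(raw)
    return f'{event["document_id"]} ready ({event["chunk_count"]} chunks)'


def build_document_indexed(document_id: str, chunks: list[str]) -> bytes:
    """Provider code that builds the event body before publishing it."""
    payload = {
        "type": "document.indexed",
        "document_id": document_id,
        "chunk_count": len(chunks),
    }
    return json.dumps(payload).encode()


def field_types(event: dict) -> dict:
    """Reduce an event to {field: type name} for shape comparison."""
    return {key: type(value).__name__ for key, value in event.items()}


# Consumer side: the handler works on the contracted example.
print(handle_document_indexed(json.dumps(EXPECTED_EVENT).encode()))

# Provider side: the message it would publish has the contracted shape.
produced = json.loads(build_document_indexed("doc-9", ["a", "b"]))
assert field_types(produced) == field_types(EXPECTED_EVENT)
```

Because only the message body is checked, the same contract applies whichever transport carries the event.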
