
Build a Language Translation Agent: Multi-Language Support with Context Awareness

Create an AI translation agent that translates between multiple languages while preserving context, manages terminology databases for domain-specific vocabulary, and performs quality checks on translations.

Why Build a Translation Agent

Machine translation has improved dramatically, but raw translation APIs still struggle with context, domain terminology, and nuance. A translation agent wraps translation capabilities with context management, terminology databases, and quality checking. It remembers the subject matter of your conversation, applies domain-specific vocabulary correctly, and flags potential issues before delivering the final translation.

This tutorial builds a multi-language translation agent with mock translation, a terminology database, context tracking, and quality validation.

Project Setup

mkdir translation-agent && cd translation-agent
python -m venv venv && source venv/bin/activate
pip install openai-agents pydantic
mkdir -p src
touch src/__init__.py src/translator.py src/terminology.py
touch src/quality.py src/agent.py

Step 1: Build the Translation Engine

We simulate translation with a dictionary-based approach. In production, replace this with calls to the Google Cloud Translation, DeepL, or Amazon Translate APIs.

# src/translator.py
from pydantic import BaseModel

class TranslationResult(BaseModel):
    source_lang: str
    target_lang: str
    original: str
    translated: str
    confidence: float

SUPPORTED_LANGUAGES = [
    "english", "spanish", "french", "german",
    "japanese", "portuguese", "italian",
]

# Simple word-level mock translations for demonstration
MOCK_TRANSLATIONS: dict[str, dict[str, str]] = {
    "english->spanish": {
        "hello": "hola", "world": "mundo", "how": "cómo",
        "are": "estás", "you": "tú", "good": "bueno",
        "morning": "mañana", "thank": "gracias", "please": "por favor",
        "the": "el", "is": "es", "and": "y",
        "software": "software", "database": "base de datos",
        "server": "servidor", "network": "red",
        "meeting": "reunión", "report": "informe",
    },
    "english->french": {
        "hello": "bonjour", "world": "monde", "how": "comment",
        "are": "allez", "you": "vous", "good": "bon",
        "morning": "matin", "thank": "merci", "please": "s'il vous plaît",
        "the": "le", "is": "est", "and": "et",
        "software": "logiciel", "database": "base de données",
        "server": "serveur", "network": "réseau",
        "meeting": "réunion", "report": "rapport",
    },
}

class TranslationContext:
    """Tracks conversation context for better translations."""
    def __init__(self):
        self.domain: str = "general"
        self.previous_translations: list[TranslationResult] = []
        self.source_lang: str = "english"
        self.target_lang: str = "spanish"

    def set_context(self, domain: str, source: str, target: str):
        self.domain = domain
        self.source_lang = source.lower()
        self.target_lang = target.lower()

    def add_translation(self, result: TranslationResult):
        self.previous_translations.append(result)
        if len(self.previous_translations) > 20:
            self.previous_translations.pop(0)

context = TranslationContext()

def translate_text(
    text: str,
    source_lang: str = "",
    target_lang: str = "",
) -> TranslationResult:
    src = source_lang.lower() or context.source_lang
    tgt = target_lang.lower() or context.target_lang
    pair_key = f"{src}->{tgt}"

    word_map = MOCK_TRANSLATIONS.get(pair_key, {})
    words = text.lower().split()
    translated_words = [word_map.get(w, w) for w in words]
    translated = " ".join(translated_words)

    known = sum(1 for w in words if w in word_map)
    confidence = known / len(words) if words else 0.0

    result = TranslationResult(
        source_lang=src,
        target_lang=tgt,
        original=text,
        translated=translated,
        confidence=round(confidence, 2),
    )
    context.add_translation(result)
    return result
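A quick sanity check of the dictionary lookup and confidence scoring above, inlined so it runs standalone (the word map is a trimmed copy of the english->spanish table):

```python
# Trimmed copy of the english->spanish word map from MOCK_TRANSLATIONS
word_map = {"hello": "hola", "world": "mundo", "the": "el", "server": "servidor"}

text = "Hello world"
words = text.lower().split()
# Unknown words pass through unchanged, mirroring translate_text
translated = " ".join(word_map.get(w, w) for w in words)
known = sum(1 for w in words if w in word_map)
confidence = known / len(words) if words else 0.0

print(translated)   # hola mundo
print(confidence)   # 1.0
```

Because confidence is simply the fraction of words found in the map, a fully covered sentence scores 1.0 and an unknown sentence scores 0.0, which is what the quality checker keys off later.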

Step 2: Terminology Database

Domain-specific terms need consistent translations. A terminology database ensures "server" always translates to "servidor" in an IT context, never "camarero" (waiter).


# src/terminology.py
from pydantic import BaseModel

class TermEntry(BaseModel):
    term: str
    translations: dict[str, str]  # lang -> translation
    domain: str
    notes: str = ""

class TerminologyDB:
    def __init__(self):
        self.entries: dict[str, TermEntry] = {}
        self._load_defaults()

    def _load_defaults(self):
        defaults = [
            TermEntry(
                term="server",
                translations={
                    "spanish": "servidor",
                    "french": "serveur",
                },
                domain="technology",
                notes="Computing context, not restaurant",
            ),
            TermEntry(
                term="bug",
                translations={
                    "spanish": "error",
                    "french": "bogue",
                },
                domain="technology",
                notes="Software defect, not insect",
            ),
            TermEntry(
                term="cloud",
                translations={
                    "spanish": "nube",
                    "french": "nuage",
                },
                domain="technology",
                notes="Cloud computing context",
            ),
            TermEntry(
                term="sprint",
                translations={
                    "spanish": "sprint",
                    "french": "sprint",
                },
                domain="technology",
                notes="Agile methodology term, keep as-is",
            ),
        ]
        for entry in defaults:
            self.entries[entry.term.lower()] = entry

    def lookup(self, term: str, target_lang: str) -> str | None:
        entry = self.entries.get(term.lower())
        if entry:
            return entry.translations.get(target_lang.lower())
        return None

    def add_term(
        self, term: str, translations: dict[str, str],
        domain: str, notes: str = "",
    ) -> str:
        self.entries[term.lower()] = TermEntry(
            term=term, translations=translations,
            domain=domain, notes=notes,
        )
        return f"Added term '{term}' to terminology database"

    def list_terms(self, domain: str = "") -> str:
        entries = list(self.entries.values())
        if domain:
            entries = [e for e in entries if e.domain == domain]
        if not entries:
            return "No terms found."
        lines = []
        for e in entries:
            trans = ", ".join(
                f"{lang}: {word}"
                for lang, word in e.translations.items()
            )
            lines.append(f"  {e.term} [{e.domain}]: {trans}")
            if e.notes:
                lines.append(f"    Note: {e.notes}")
        return "\n".join(lines)

term_db = TerminologyDB()

Step 3: Quality Checker

# src/quality.py
from src.translator import TranslationResult

def check_quality(result: TranslationResult) -> dict:
    issues = []
    if result.confidence < 0.3:
        issues.append(
            "Low confidence: many words were not found in "
            "translation dictionary. Consider manual review."
        )
    if result.original.lower() == result.translated.lower():
        issues.append(
            "Translation identical to source. The text may "
            "already be in the target language or untranslatable."
        )
    if len(result.translated.split()) < len(result.original.split()) * 0.5:
        issues.append(
            "Translation significantly shorter than source. "
            "Some content may be lost."
        )
    return {
        "confidence": result.confidence,
        "issues": issues if issues else ["No issues detected."],
        "recommendation": (
            "Manual review recommended"
            if issues else "Translation looks good"
        ),
    }
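The three heuristics can be exercised standalone. This sketch inlines them and feeds in a low-confidence result whose translation came back identical to the source:

```python
def quality_issues(original: str, translated: str, confidence: float) -> list[str]:
    # Same three heuristics as check_quality, returning just the issue labels
    issues = []
    if confidence < 0.3:
        issues.append("low confidence")
    if original.lower() == translated.lower():
        issues.append("identical to source")
    if len(translated.split()) < len(original.split()) * 0.5:
        issues.append("much shorter than source")
    return issues

print(quality_issues("the quarterly report", "the quarterly report", 0.1))
# ['low confidence', 'identical to source']
```

Note the length check does not fire here: both strings are three words, and three is not less than half of three.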

Step 4: Assemble the Agent

# src/agent.py
import asyncio
import json
from agents import Agent, Runner, function_tool
from src.translator import translate_text, context, SUPPORTED_LANGUAGES
from src.terminology import term_db
from src.quality import check_quality

@function_tool
def translate(
    text: str, source_lang: str = "", target_lang: str = "",
) -> str:
    """Translate text between languages."""
    result = translate_text(text, source_lang, target_lang)
    quality = check_quality(result)
    return json.dumps({
        "original": result.original,
        "translated": result.translated,
        "confidence": result.confidence,
        "quality": quality,
    }, indent=2)

@function_tool
def set_translation_context(
    domain: str, source_lang: str, target_lang: str,
) -> str:
    """Set the translation context for the session."""
    context.set_context(domain, source_lang, target_lang)
    return f"Context set: {domain} domain, {source_lang} -> {target_lang}"

@function_tool
def lookup_term(term: str, target_lang: str = "") -> str:
    """Look up domain-specific terminology."""
    tgt = target_lang or context.target_lang
    result = term_db.lookup(term, tgt)
    if result:
        return f"'{term}' -> '{result}' in {tgt}"
    return f"Term '{term}' not found in terminology database"

@function_tool
def add_terminology(
    term: str, translations_json: str,
    domain: str, notes: str = "",
) -> str:
    """Add a term to the terminology database."""
    try:
        translations = json.loads(translations_json)
    except json.JSONDecodeError:
        return "Invalid JSON in translations_json; expected a lang -> translation object"
    return term_db.add_term(term, translations, domain, notes)

@function_tool
def list_supported_languages() -> str:
    """List supported languages."""
    return ", ".join(SUPPORTED_LANGUAGES)

translation_agent = Agent(
    name="Translation Agent",
    instructions="""You are a professional translation agent.
Translate text while preserving context and using correct
domain terminology. Always check quality after translating.
Use the terminology database for technical or specialized terms.
If confidence is low, warn the user and suggest alternatives.""",
    tools=[
        translate, set_translation_context,
        lookup_term, add_terminology,
        list_supported_languages,
    ],
)

async def main():
    result = await Runner.run(
        translation_agent,
        "Set context to technology domain, English to Spanish. "
        "Then translate: 'The server has a critical bug in "
        "the cloud deployment pipeline.'",
    )
    print(result.final_output)

if __name__ == "__main__":
    asyncio.run(main())

The agent sets the technology domain context, looks up "server," "bug," and "cloud" in the terminology database to get the correct technical translations, translates the full sentence, and runs a quality check.

FAQ

How do I replace the mock translator with a real translation API?

Install the googletrans library or use the official Google Cloud Translation or DeepL API. Replace the translate_text function body with an API call that sends the text, source language, and target language. Keep the TranslationResult model as the return type so the quality checker and context tracker continue to work without changes.
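One way to keep that seam clean is to inject the backend as a callable, so the quality checker and context tracker only ever see a TranslationResult. This is a hedged sketch, not a real client integration: the stub backend here is a stand-in for an actual API call.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class TranslationResult:
    source_lang: str
    target_lang: str
    original: str
    translated: str
    confidence: float

# Backend signature: (text, source_lang, target_lang) -> translated text
Backend = Callable[[str, str, str], str]

def translate_text(backend: Backend, text: str, src: str, tgt: str) -> TranslationResult:
    # A real backend would call Google Cloud Translation or DeepL here.
    # API translations don't expose word-level coverage, so confidence
    # defaults to 1.0 (or map a provider score here if one is returned).
    return TranslationResult(src, tgt, text, backend(text, src, tgt), 1.0)

# Stand-in backend for demonstration only; swap in a real API client
stub = lambda text, src, tgt: text.upper()
result = translate_text(stub, "hello world", "english", "spanish")
print(result.translated)  # HELLO WORLD
```

Because the rest of the pipeline only depends on the TranslationResult shape, swapping the stub for a real client is a one-line change at the call site.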

How does context awareness improve translation quality?

Context tracking ensures that when translating a series of related sentences, the agent remembers the domain and previous translations. This prevents inconsistencies like translating "server" as "servidor" in one sentence and "camarero" in the next. The terminology database enforces consistent vocabulary within a domain.

Can this handle document-level translation?

Yes. Split the document into paragraphs, translate each one sequentially while maintaining the context object, and reassemble the output. The context tracker accumulates domain signals across paragraphs, so translations improve as the agent processes more of the document and builds a stronger understanding of the subject matter.
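A minimal sketch of that paragraph loop, using a toy word map in place of the real engine:

```python
# Toy word map standing in for the translation engine
word_map = {"hello": "hola", "world": "mundo", "good": "bueno", "morning": "mañana"}

def translate_paragraph(paragraph: str) -> str:
    return " ".join(word_map.get(w, w) for w in paragraph.lower().split())

document = "hello world\n\ngood morning world"
# Translate paragraph by paragraph, then reassemble with blank lines
paragraphs = document.split("\n\n")
translated_doc = "\n\n".join(translate_paragraph(p) for p in paragraphs)
print(translated_doc)
```

In the full agent, translate_paragraph would call translate_text, so each paragraph also lands in context.previous_translations and benefits from the accumulated domain signal.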


#Translation #NLP #AIAgent #Python #MultiLanguage #AgenticAI #LearnAI #AIEngineering

Written by

CallSphere Team

Expert insights on AI voice agents and customer communication automation.
