---
title: "7 AI Coding Interview Questions From Anthropic, Meta & OpenAI (2026 Edition)"
description: "Real AI coding interview questions from Anthropic, Meta, and OpenAI in 2026. Includes implementing attention from scratch, Anthropic's progressive coding screens, Meta's AI-assisted round, and vector search — with solution approaches."
canonical: https://callsphere.ai/blog/ai-coding-interview-questions-2026-anthropic-meta-openai
category: "AI Interview Prep"
tags: ["AI Interview", "Coding Interview", "Anthropic", "Meta", "OpenAI", "Python", "PyTorch", "LeetCode", "2026"]
author: "CallSphere Team"
published: 2026-03-25T00:00:00.000Z
updated: 2026-05-06T23:10:51.596Z
---

# 7 AI Coding Interview Questions From Anthropic, Meta & OpenAI (2026 Edition)

> Real AI coding interview questions from Anthropic, Meta, and OpenAI in 2026. Includes implementing attention from scratch, Anthropic's progressive coding screens, Meta's AI-assisted round, and vector search — with solution approaches.

## AI Coding Interviews in 2026: Not Your Father's LeetCode

The coding bar for AI roles has shifted dramatically. Anthropic doesn't ask LeetCode at all — they test progressive system building. Meta now has an **AI-assisted coding round** where you work with real AI tools. OpenAI's coding questions focus on practical ML implementation.


Here are 7 real coding questions from these companies, with the approaches that pass.

> **Important**: Anthropic **strictly prohibits** AI assistance during live interviews. Meta explicitly provides AI tools. Know the rules before your interview.

---

**HARD** · OpenAI · Google DeepMind

**Q1: Implement Multi-Head Attention From Scratch**

### The Task

Implement scaled dot-product multi-head attention using only basic PyTorch tensor operations. No `nn.MultiheadAttention`.

### Solution Approach

```python
import torch
import torch.nn as nn
import math

class MultiHeadAttention(nn.Module):
    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        assert d_model % n_heads == 0

        self.d_model = d_model
        self.n_heads = n_heads
        self.d_k = d_model // n_heads

        # Projection matrices
        self.W_q = nn.Linear(d_model, d_model, bias=False)
        self.W_k = nn.Linear(d_model, d_model, bias=False)
        self.W_v = nn.Linear(d_model, d_model, bias=False)
        self.W_o = nn.Linear(d_model, d_model, bias=False)

    def forward(self, x: torch.Tensor, mask: torch.Tensor = None):
        batch_size, seq_len, _ = x.shape

        # Project and reshape: (B, N, d) -> (B, h, N, d_k)
        Q = self.W_q(x).view(batch_size, seq_len, self.n_heads, self.d_k).transpose(1, 2)
        K = self.W_k(x).view(batch_size, seq_len, self.n_heads, self.d_k).transpose(1, 2)
        V = self.W_v(x).view(batch_size, seq_len, self.n_heads, self.d_k).transpose(1, 2)

        # Scaled dot-product attention
        scores = torch.matmul(Q, K.transpose(-2, -1)) / math.sqrt(self.d_k)

        # Apply causal mask if provided
        if mask is not None:
            scores = scores.masked_fill(mask == 0, float('-inf'))

        attn_weights = torch.softmax(scores, dim=-1)

        # Apply attention to values
        context = torch.matmul(attn_weights, V)  # (B, h, N, d_k)

        # Reshape back: (B, h, N, d_k) -> (B, N, d)
        context = context.transpose(1, 2).contiguous().view(batch_size, seq_len, self.d_model)

        return self.W_o(context)
```

### What They Evaluate

| Criteria | What They Look For |
| --- | --- |
| **Correctness** | Proper scaling by sqrt(d_k), correct reshape/transpose operations |
| **Mask handling** | Causal mask for autoregressive, padding mask for variable-length |
| **Memory layout** | Using `.contiguous()` before `.view()` after transpose |
| **Edge cases** | What happens with seq_len=1? With d_model not divisible by n_heads? |
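To make the mask convention concrete, here is a minimal standalone sketch of building a causal mask with `torch.tril` and applying it the way the `forward` method above does (toy tensors, not tied to the class):

```python
import torch

seq_len = 4
# Lower-triangular causal mask: position i may attend only to positions <= i
mask = torch.tril(torch.ones(seq_len, seq_len))

scores = torch.randn(seq_len, seq_len)
masked = scores.masked_fill(mask == 0, float('-inf'))
weights = torch.softmax(masked, dim=-1)
# After softmax, every weight on a future position is exactly zero,
# and each row still sums to 1
```

Being able to explain *why* `-inf` before softmax yields exact zeros (rather than masking after softmax, which breaks normalization) is a common follow-up.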

**Common Follow-Up Questions**

- **"Add GQA support"** — Modify the module so `n_kv_heads < n_heads`: query heads are split into groups, and each group shares a single K/V head, shrinking the KV cache at inference time.
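One way the grouped-query attention modification can be sketched (hypothetical shapes, illustration only; `repeat_interleave` copies K/V — in production you would use `expand` to avoid the copy):

```python
import torch

batch, seq_len, d_k = 2, 5, 8
n_heads, n_kv_heads = 8, 2              # each K/V head serves 4 query heads
group = n_heads // n_kv_heads

Q = torch.randn(batch, n_heads, seq_len, d_k)
K = torch.randn(batch, n_kv_heads, seq_len, d_k)
V = torch.randn(batch, n_kv_heads, seq_len, d_k)

# Duplicate each K/V head across its query group so shapes line up with Q
K = K.repeat_interleave(group, dim=1)   # (batch, n_heads, seq_len, d_k)
V = V.repeat_interleave(group, dim=1)

scores = Q @ K.transpose(-2, -1) / d_k ** 0.5
out = torch.softmax(scores, dim=-1) @ V  # (batch, n_heads, seq_len, d_k)
```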

---

**HARD** · Anthropic

**Q2: Build an In-Memory Database With Progressive Complexity**

### The Format

Anthropic's coding interviews use **progressive rounds** — you start with a simple implementation and the interviewer adds complexity every 15-20 minutes. The question below is reconstructed from candidate reports.

### Round 1 — Basic Operations (15 min)

```python
class InMemoryDB:
    """Implement SET, GET, DELETE operations."""

    def __init__(self):
        self.store = {}

    def set(self, key: str, value: str) -> None:
        self.store[key] = value

    def get(self, key: str) -> str | None:
        return self.store.get(key)

    def delete(self, key: str) -> bool:
        if key in self.store:
            del self.store[key]
            return True
        return False
```

### Round 2 — Filtered Scans (15 min)

"Now add a SCAN operation that filters by a prefix and returns matching key-value pairs."

```python
def scan(self, prefix: str) -> list[tuple[str, str]]:
    return [(k, v) for k, v in self.store.items() if k.startswith(prefix)]
```

The interviewer pushes: "This is O(n) over all keys. How would you make prefix scan efficient?"

**Better approach**: Use a trie or sorted dict (`SortedDict` from `sortedcontainers`) for O(log n + k) prefix scans where k is the number of matches.
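The sorted-keys idea can be sketched with only the stdlib `bisect` module — all keys with a given prefix form a contiguous run in sorted order, so the scan touches only the matching range plus one probe (a trie or `sortedcontainers.SortedDict` would be the production choice; class name here is illustrative):

```python
import bisect

class SortedKVStore:
    """Keys kept in a sorted list so prefix scans skip non-matching keys."""

    def __init__(self):
        self.keys = []    # sorted list of keys
        self.values = {}  # key -> value

    def set(self, key: str, value: str) -> None:
        if key not in self.values:
            bisect.insort(self.keys, key)  # O(n) insert, O(log n) search
        self.values[key] = value

    def scan(self, prefix: str) -> list[tuple[str, str]]:
        # Binary search to the first key >= prefix, then walk the
        # contiguous run of matches
        lo = bisect.bisect_left(self.keys, prefix)
        out = []
        for k in self.keys[lo:]:
            if not k.startswith(prefix):
                break
            out.append((k, self.values[k]))
        return out
```

The tradeoff to mention: `insort` makes writes O(n), so for write-heavy workloads a trie (O(len(key)) inserts and scans) is the stronger answer.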

### Round 3 — TTL Support (15 min)

"Add TTL (time-to-live) support. Keys should expire after a specified duration."

```python
import time

class InMemoryDB:
    def __init__(self):
        self.store = {}        # key -> value
        self.ttls = {}         # key -> expiry_timestamp

    def set(self, key: str, value: str, ttl: int = None) -> None:
        self.store[key] = value
        if ttl is not None:
            self.ttls[key] = time.time() + ttl
        elif key in self.ttls:
            del self.ttls[key]  # Remove TTL if re-set without one

    def get(self, key: str) -> str | None:
        if key in self.ttls and time.time() > self.ttls[key]:
            self.delete(key)
            return None
        return self.store.get(key)

    def _lazy_cleanup(self):
        """Periodically clean expired keys."""
        now = time.time()
        expired = [k for k, exp in self.ttls.items() if now > exp]
        for k in expired:
            self.delete(k)
```

### Round 4 — Persistence (15 min)

"Add save/load to compress the database to a file and restore it."

```python
import json, gzip

def save(self, filepath: str) -> None:
    data = {"store": self.store, "ttls": self.ttls}
    with gzip.open(filepath, 'wt') as f:
        json.dump(data, f)

def load(self, filepath: str) -> None:
    with gzip.open(filepath, 'rt') as f:
        data = json.load(f)
    self.store = data["store"]
    self.ttls = {k: float(v) for k, v in data["ttls"].items()}
```

**What Anthropic Is Really Evaluating**

- **Code quality under pressure**: Clean, readable code even as complexity grows
- **Modular design**: Can you extend your initial design without rewriting everything?
- **Edge case awareness**: What happens when you GET a key that's expired? What about concurrent TTL cleanup?
- **Communication**: Do you talk through your approach before coding? Do you ask clarifying questions?
- **Progressive thinking**: Do you anticipate where this is going and design for extensibility?

---

**MEDIUM** · Anthropic

**Q3: Implement a Bank Application With Transaction Types**

### The Task

Build a banking system that handles deposits, withdrawals, and transfers with proper validation. Progressive complexity adds transaction history and balance queries.

### Core Implementation

```python
from dataclasses import dataclass, field
from datetime import datetime
from enum import Enum

class TxnType(Enum):
    DEPOSIT = "deposit"
    WITHDRAWAL = "withdrawal"
    TRANSFER = "transfer"

@dataclass
class Transaction:
    txn_type: TxnType
    amount: float
    timestamp: datetime
    from_account: str | None = None
    to_account: str | None = None

class Bank:
    def __init__(self):
        self.accounts: dict[str, float] = {}
        self.history: dict[str, list[Transaction]] = {}

    def create_account(self, account_id: str, initial_balance: float = 0) -> None:
        if account_id in self.accounts:
            raise ValueError(f"Account {account_id} already exists")
        if initial_balance < 0:
            raise ValueError("Initial balance cannot be negative")
        self.accounts[account_id] = initial_balance
        self.history[account_id] = []

    def deposit(self, account_id: str, amount: float) -> float:
        self._validate_account(account_id)
        if amount <= 0:
            raise ValueError("Deposit amount must be positive")
        self.accounts[account_id] += amount
        self.history[account_id].append(
            Transaction(TxnType.DEPOSIT, amount, datetime.now(), to_account=account_id)
        )
        return self.accounts[account_id]

    def withdraw(self, account_id: str, amount: float) -> float:
        self._validate_account(account_id)
        if amount <= 0:
            raise ValueError("Withdrawal amount must be positive")
        if amount > self.accounts[account_id]:
            raise ValueError("Insufficient funds")
        self.accounts[account_id] -= amount
        self.history[account_id].append(
            Transaction(TxnType.WITHDRAWAL, amount, datetime.now(), from_account=account_id)
        )
        return self.accounts[account_id]

    def transfer(self, from_id: str, to_id: str, amount: float) -> None:
        self._validate_account(from_id)
        self._validate_account(to_id)
        if from_id == to_id:
            raise ValueError("Cannot transfer to same account")
        self.withdraw(from_id, amount)
        self.deposit(to_id, amount)
        # Record transfer in both histories
        txn = Transaction(TxnType.TRANSFER, amount, datetime.now(), from_id, to_id)
        self.history[from_id].append(txn)
        self.history[to_id].append(txn)

    def _validate_account(self, account_id: str) -> None:
        if account_id not in self.accounts:
            raise ValueError(f"Account {account_id} not found")
```

**Progressive Follow-Ups**

- **"Add transaction rollback"**: If deposit in a transfer succeeds but something fails, undo the withdrawal. Implement a simple saga pattern.
- **"Add concurrent access"**: Use locks to handle multiple threads doing transfers simultaneously. Discuss deadlock prevention (always lock accounts in sorted order).
- **"Add interest calculation"**: Compound interest on all accounts, run monthly. Discuss precision issues with floating point.
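The sorted-lock-order idea can be sketched with standalone helpers (hypothetical names, not methods on the `Bank` class above). Two concurrent transfers A→B and B→A cannot deadlock, because both acquire the locks in the same global order:

```python
import threading

# Toy state: one lock per account, acquired in sorted account-id order
locks = {"A": threading.Lock(), "B": threading.Lock()}
balances = {"A": 100.0, "B": 50.0}

def transfer(from_id: str, to_id: str, amount: float) -> None:
    # Always lock in a globally consistent (sorted) order; this removes
    # the circular-wait condition that deadlock requires
    first, second = sorted([from_id, to_id])
    with locks[first], locks[second]:
        if balances[from_id] < amount:
            raise ValueError("Insufficient funds")
        balances[from_id] -= amount
        balances[to_id] += amount
```

Mentioning the alternative — a single global lock (simple, but serializes all transfers) — and why per-account locks scale better is worth a sentence in the interview.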

---

**MEDIUM** · Anthropic

**Q4: Debug Broken ML Notebooks**

### The Format

Anthropic's "Bug Fixing" round (reported March 2026): You're given a Jupyter notebook with ML training/inference code that has multiple bugs. Find and fix them.

### Common Bug Patterns to Watch For

**1. Shape Mismatches**

```python
# BUG: Wrong dimension for softmax
logits = model(x)  # shape: (batch, seq_len, vocab_size)
probs = torch.softmax(logits, dim=1)  # Bug! Should be dim=-1 (or dim=2)
```

**2. Device Mismatches**

```python
# BUG: Model on GPU, new tensor on CPU
model = model.cuda()
mask = torch.ones(batch_size, seq_len)  # CPU tensor!
output = model(x.cuda(), mask)  # RuntimeError: tensors on different devices
# Fix: mask = mask.cuda() or mask = mask.to(x.device)
```

**3. Gradient Bugs**

```python
# BUG: Forgetting to zero gradients
for batch in dataloader:
    loss = criterion(model(batch), targets)
    loss.backward()
    optimizer.step()
    # Missing: optimizer.zero_grad() — gradients accumulate!
```

**4. Data Leakage**

```python
# BUG: Fitting scaler on test data
scaler = StandardScaler()
X_all_scaled = scaler.fit_transform(X_all)  # Fits on ALL data including test
X_train, X_test = X_all_scaled[:800], X_all_scaled[800:]
# Fix: Fit on train only, transform test
```
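The corrected scaling step, sketched with plain numpy (the same math `StandardScaler` performs; data here is synthetic, for illustration only):

```python
import numpy as np

rng = np.random.default_rng(0)
X_all = rng.normal(loc=5.0, scale=2.0, size=(1000, 3))
X_train, X_test = X_all[:800], X_all[800:]

# Fit statistics on the training split only...
mu = X_train.mean(axis=0)
sigma = X_train.std(axis=0)

# ...then apply that same transform to both splits. The test set is
# scaled with train statistics, so no information leaks backward.
X_train_scaled = (X_train - mu) / sigma
X_test_scaled = (X_test - mu) / sigma
```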

**5. Off-By-One in Tokenization**

```python
# BUG: Not accounting for special tokens
max_length = 512
tokens = tokenizer(text, max_length=max_length, truncation=True)
# Actual content tokens = 510 (2 slots taken by [CLS] and [SEP])
```

**How to Approach This Round**

1. **Read the full notebook first** — understand the intended logic before looking for bugs
2. **Check shapes at each step** — most bugs are shape/dimension errors
3. **Trace the data flow** — input → preprocessing → model → loss → backward → update
4. **Look for silent bugs** — code that runs but produces wrong results (wrong dim for softmax, missing gradient zeroing) is harder to catch than crashes
5. **Test incrementally** — fix one bug, run the cell, check the output, move to the next

---

**HARD** · Anthropic

**Q5: Implement Concurrent System Components With Fault Tolerance**

### The Task

Build a concurrent task processor that executes independent tasks in parallel, handles failures gracefully, and reports results.

### Solution Approach

```python
import asyncio
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Any

class TaskStatus(Enum):
    PENDING = "pending"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"

@dataclass
class TaskResult:
    task_id: str
    status: TaskStatus
    result: Any = None
    error: str | None = None

class ConcurrentProcessor:
    def __init__(self, max_concurrency: int = 5, timeout: float = 30.0):
        self.semaphore = asyncio.Semaphore(max_concurrency)
        self.timeout = timeout

    async def _execute_task(
        self, task_id: str, func: Callable, *args
    ) -> TaskResult:
        async with self.semaphore:
            try:
                result = await asyncio.wait_for(
                    func(*args), timeout=self.timeout
                )
                return TaskResult(task_id, TaskStatus.COMPLETED, result=result)
            except asyncio.TimeoutError:
                return TaskResult(task_id, TaskStatus.FAILED, error="Timeout")
            except Exception as e:
                return TaskResult(task_id, TaskStatus.FAILED, error=str(e))

    async def process_all(
        self, tasks: list[tuple[str, Callable, tuple]]
    ) -> list[TaskResult]:
        """Execute all tasks concurrently, return all results."""
        coros = [
            self._execute_task(task_id, func, *args)
            for task_id, func, args in tasks
        ]
        return await asyncio.gather(*coros)

    async def process_with_retry(
        self, task_id: str, func: Callable, args: tuple,
        max_retries: int = 3, backoff: float = 1.0
    ) -> TaskResult:
        """Execute with exponential backoff retry."""
        for attempt in range(max_retries):
            result = await self._execute_task(task_id, func, *args)
            if result.status == TaskStatus.COMPLETED:
                return result
            if attempt < max_retries - 1:
                await asyncio.sleep(backoff * (2 ** attempt))
        return result
```
**Follow-Up Questions**

- **"Add a circuit breaker"**: After N consecutive failures, stop sending tasks to that function and return a fast failure for a cooldown period.
- **"Handle task dependencies"**: Some tasks depend on others. Build a DAG executor that respects ordering constraints.
- **"Add graceful shutdown"**: On shutdown signal, finish running tasks but don't start new ones. Return pending tasks as cancelled.
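The circuit-breaker follow-up can be sketched synchronously (a toy illustration with arbitrary names and thresholds; an async version would wrap `_execute_task` the same way):

```python
import time

class CircuitBreaker:
    """Trip open after `threshold` consecutive failures, fail fast for
    `cooldown` seconds, then allow one trial call (half-open)."""

    def __init__(self, threshold: int = 3, cooldown: float = 30.0):
        self.threshold = threshold
        self.cooldown = cooldown
        self.failures = 0
        self.opened_at = None  # timestamp when the breaker tripped

    def call(self, func, *args):
        if self.opened_at is not None:
            if time.time() - self.opened_at < self.cooldown:
                raise RuntimeError("circuit open: failing fast")
            self.opened_at = None  # cooldown elapsed: allow a trial call

        try:
            result = func(*args)
        except Exception:
            self.failures += 1
            if self.failures >= self.threshold:
                self.opened_at = time.time()  # trip the breaker
            raise
        self.failures = 0  # any success resets the count
        return result
```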

---

**NEW FORMAT** · Meta

**Q6: Meta's AI-Assisted Coding Round**

### What Is It?

Meta launched this new interview format in late 2025. You get a real multi-file codebase and **real AI tools** (GPT-4o mini, Claude Sonnet, Gemini 2.5 Pro, LLaMA 4). You're evaluated on how effectively you use AI to solve programming tasks.

### What You're Given

- A multi-file project (typically Python or Java)
- Access to AI chat (like Copilot Chat)
- 60 minutes to complete multiple tasks of increasing complexity

### What They Evaluate

| Criteria | Weight | What They Look For |
| --- | --- | --- |
| **Problem decomposition** | High | How you break tasks into AI-promptable sub-tasks |
| **Prompt quality** | High | Specific, contextual prompts that give the AI what it needs |
| **Verification** | High | Do you test AI output? Do you catch AI mistakes? |
| **Code understanding** | Medium | Can you read and navigate unfamiliar code? |
| **Speed & efficiency** | Medium | How much you accomplish in 60 minutes |

### Strategies That Work

1. **Read the codebase yourself first** — Don't immediately ask AI to explain everything. Understand the structure, then use AI for specific tasks.
2. **Give AI context** — "Here's the function signature, the test that should pass, and the error I'm getting. Fix the implementation." — much better than "write a function."
3. **Verify AI output** — Run the code. Check edge cases. AI will write plausible-looking code with subtle bugs.
4. **Use AI for boilerplate, think yourself for logic** — AI is great for generating test scaffolding, data classes, and configuration. Use your brain for the actual algorithm.

**Common Mistakes That Fail Candidates**

- Blindly copying AI output without reading it
- Spending too long prompting when you could write it faster yourself
- Not running/testing code after AI generates it
- Over-relying on AI for simple tasks (wastes time waiting for responses)
- Under-utilizing AI for complex boilerplate (reinventing the wheel)

---

**MEDIUM** · AI Startups · Amazon

**Q7: Implement Vector Similarity Search**

### The Task

Implement cosine similarity search over a collection of vectors. Then discuss how to scale it with approximate nearest neighbors.

### Exact Search Implementation

```python
import numpy as np
from typing import List, Tuple

class VectorStore:
    def __init__(self, dimension: int):
        self.dimension = dimension
        self.vectors: list[np.ndarray] = []
        self.metadata: list[dict] = []

    def add(self, vector: np.ndarray, meta: dict = None) -> int:
        assert vector.shape == (self.dimension,)
        # Normalize for cosine similarity
        norm = np.linalg.norm(vector)
        if norm > 0:
            vector = vector / norm
        self.vectors.append(vector)
        self.metadata.append(meta or {})
        return len(self.vectors) - 1

    def search(self, query: np.ndarray, top_k: int = 5) -> List[Tuple[int, float, dict]]:
        query_norm = query / np.linalg.norm(query)

        # Cosine similarity = dot product of normalized vectors
        if not self.vectors:
            return []

        matrix = np.stack(self.vectors)  # (N, d)
        similarities = matrix @ query_norm  # (N,)

        # Get top-k indices (clamp top_k so argpartition doesn't raise
        # when fewer than top_k vectors are stored)
        top_k = min(top_k, len(similarities))
        top_indices = np.argpartition(similarities, -top_k)[-top_k:]
        top_indices = top_indices[np.argsort(similarities[top_indices])[::-1]]

        return [
            (int(idx), float(similarities[idx]), self.metadata[idx])
            for idx in top_indices
        ]
```

### Scaling Discussion: ANN Algorithms

| Algorithm | How It Works | Tradeoff |
| --- | --- | --- |
| **HNSW** | Hierarchical navigable small world graph — multi-layer graph traversal | Best recall, but high memory (graph overhead) |
| **IVF** | Inverted file — cluster vectors, search only nearby clusters | Good speed, lower memory, tunable recall |
| **PQ** | Product quantization — compress vectors to compact codes | Lowest memory, but lower recall |
| **IVF-PQ** | Combine IVF and PQ | Best memory/speed/recall balance for large scale |
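The IVF row can be illustrated end to end in plain numpy — a toy sketch, not production code (real systems use FAISS or similar, and this skips residual encoding and PQ entirely):

```python
import numpy as np

rng = np.random.default_rng(42)
vectors = rng.normal(size=(1000, 32)).astype(np.float32)
vectors /= np.linalg.norm(vectors, axis=1, keepdims=True)

# "Train": pick k centroids with a few rounds of Lloyd's k-means
k = 16
centroids = vectors[rng.choice(len(vectors), k, replace=False)]
for _ in range(5):
    assign = np.argmax(vectors @ centroids.T, axis=1)  # nearest centroid (cosine)
    for c in range(k):
        members = vectors[assign == c]
        if len(members):
            centroids[c] = members.mean(axis=0)
            centroids[c] /= np.linalg.norm(centroids[c])

# Build inverted lists: centroid id -> row indices of its members
assign = np.argmax(vectors @ centroids.T, axis=1)
inv_lists = {c: np.where(assign == c)[0] for c in range(k)}

def ivf_search(query, top_k=5, nprobe=4):
    """Probe only the nprobe closest clusters instead of all N vectors."""
    q = query / np.linalg.norm(query)
    probe = np.argsort(centroids @ q)[::-1][:nprobe]
    cand = np.concatenate([inv_lists[c] for c in probe])
    sims = vectors[cand] @ q
    order = np.argsort(sims)[::-1][:top_k]
    return cand[order], sims[order]
```

`nprobe` is the recall/latency knob: `nprobe=1` searches one cluster (fast, lossy), `nprobe=k` degenerates to exact search.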

**The Discussion They Want**

"Exact search is O(n*d) per query — fine for up to a few hundred thousand vectors on a single machine. Beyond that, switch to ANN: HNSW when recall matters most and memory allows, IVF for a tunable speed/recall tradeoff, IVF-PQ when billions of vectors must fit in RAM. Then name the recall/latency/memory triangle and the knobs that tune it — `ef_search` for HNSW, `nprobe` for IVF."

---

## Frequently Asked Questions

### Does Anthropic ask LeetCode?

No. Anthropic's coding interviews focus on progressive system building (like the database question above) and bug fixing. They evaluate code quality, design thinking, and how you handle increasing complexity — not algorithm puzzle solving.

### What language should I use?

Python is standard for AI roles. Some companies (Meta, Google) accept C++ or Java. For ML-specific questions (attention implementation), PyTorch is expected. Anthropic's coding round is language-agnostic but most candidates use Python.

### How should I prepare for Meta's AI-assisted round?

Practice working with AI coding tools on real projects. The key skill is knowing when to use AI vs. when to code yourself. Practice giving specific, context-rich prompts. And always verify AI output — candidates who blindly accept AI suggestions fail.

### How much LeetCode do I still need?

For AI engineering roles specifically: Medium-level proficiency is sufficient. You should be comfortable with arrays, hashmaps, trees, and basic graph algorithms. Hard LeetCode problems are rarely asked for AI roles (except at Google, which still asks traditional coding).

---

Source: https://callsphere.ai/blog/ai-coding-interview-questions-2026-anthropic-meta-openai
