---
title: "Terraform for AI Voice Infrastructure: Ephemeral Resources + Vault (2026)"
description: "Provision an AI voice agent stack with Terraform 1.10+: ephemeral Vault credentials, AWS LB + EKS, OpenSearch for vector store, and OIDC trust without long-lived keys."
canonical: https://callsphere.ai/blog/vw6h-terraform-ai-voice-infrastructure-ephemeral-resources-2026
category: "AI Infrastructure"
tags: ["Terraform", "AWS", "Vault", "EKS", "Tutorial"]
author: "CallSphere Team"
published: 2026-03-23T00:00:00.000Z
updated: 2026-05-07T16:46:14.836Z
---

# Terraform for AI Voice Infrastructure: Ephemeral Resources + Vault (2026)

> Provision an AI voice agent stack with Terraform 1.10+: ephemeral Vault credentials, AWS LB + EKS, OpenSearch for vector store, and OIDC trust without long-lived keys.

> **TL;DR** — Terraform 1.10 ephemeral resources finally let you fetch a Vault-issued AWS token *during* an apply without it ever touching the state file. That removes the most common reason AI infra leaks credentials.

## What you'll set up

A Terraform configuration that stands up: an EKS cluster, an OpenSearch Serverless collection (vector store), an Application Load Balancer with WebSocket support, and IRSA roles for the voice-agent pod — all using ephemeral Vault credentials so no static AWS keys exist in TF state or CI.

## Architecture

```mermaid
flowchart TD
  TF[terraform apply] --> VAULT[(Vault)]
  VAULT -->|ephemeral STS| AWS[AWS APIs]
  AWS --> EKS[EKS cluster]
  AWS --> OSS[OpenSearch Serverless]
  AWS --> ALB[ALB WebSocket]
  EKS --> IRSA[Pod IRSA]
  IRSA --> OSS
```

## Step 1 — Configure the Vault provider with ephemeral output

```hcl
terraform {
  required_version = ">= 1.10"
  required_providers {
    vault = { source = "hashicorp/vault", version = "~> 4.4" }
    aws   = { source = "hashicorp/aws",   version = "~> 5.70" }
  }
}

provider "vault" { address = "https://vault.example.com" }

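# Assumes a Vault provider release that ships this ephemeral resource;
# check the provider changelog against the version pinned above.
ephemeral "vault_aws_access_credentials" "tf_role" {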
  backend = "aws"
  role    = "terraform-deployer"
}

provider "aws" {
  region     = "us-east-1"
  access_key = ephemeral.vault_aws_access_credentials.tf_role.access_key
  secret_key = ephemeral.vault_aws_access_credentials.tf_role.secret_key
  token      = ephemeral.vault_aws_access_credentials.tf_role.security_token
}
```

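Terraform reads the `ephemeral` block fresh in each phase (plan, apply, refresh), and its values are never written to `terraform.tfstate` or to a saved plan file. When the phase finishes, Terraform closes the ephemeral resource, and the short-lived STS credentials simply expire at the end of their TTL.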
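
The `terraform-deployer` role referenced above has to exist in Vault's AWS secrets engine before any of this works, and the article does not show that side. A minimal sketch, assuming the role is managed in a separate bootstrap configuration, that it issues `assumed_role` credentials, and that the account ID and IAM role ARN are placeholders:

```hcl
# Bootstrap configuration, applied separately with its own Vault auth.
# The account ID and IAM role ARN below are hypothetical placeholders.
resource "vault_aws_secret_backend_role" "terraform_deployer" {
  backend         = "aws"                # mount path the ephemeral block reads from
  name            = "terraform-deployer" # role name the ephemeral block asks for
  credential_type = "assumed_role"       # Vault calls sts:AssumeRole per request

  role_arns = ["arn:aws:iam::123456789012:role/terraform-deployer"]

  # TTLs are in seconds: 30 min by default, capped at 60 min.
  default_sts_ttl = 1800
  max_sts_ttl     = 3600
}
```

Sizing the TTL a little above your slowest apply keeps credentials from expiring mid-run without leaving them valid for long afterwards.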

## Step 2 — Stand up EKS with OIDC

```hcl
module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "~> 20.30"
  cluster_name    = "voice-prod"
  cluster_version = "1.31"
  vpc_id     = aws_vpc.voice.id
  subnet_ids = aws_subnet.private[*].id
  enable_irsa = true
  eks_managed_node_groups = {
    voice = {
      ami_type       = "AL2023_ARM_64_STANDARD" # Graviton instances need an arm64 AMI type
      instance_types = ["m7g.large"]
      min_size       = 2
      max_size       = 10
      desired_size   = 3
    }
  }
}
```

ARM Graviton (`m7g`) cuts our voice-agent compute spend ~30% vs x86, and the OpenAI Realtime client and LiveKit both run cleanly on arm64 in 2026.
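
The module call references `aws_vpc.voice` and `aws_subnet.private`, which are not defined anywhere in this walkthrough. A minimal sketch of what they might look like, assuming a `10.40.0.0/16` CIDR and three availability zones (both hypothetical); the internet gateway, NAT gateways, and route tables are left out for brevity:

```hcl
# Hypothetical network layout; adjust the CIDR and AZ count to your account.
data "aws_availability_zones" "available" {
  state = "available"
}

resource "aws_vpc" "voice" {
  cidr_block           = "10.40.0.0/16"
  enable_dns_support   = true
  enable_dns_hostnames = true

  tags = { Name = "voice" }
}

# Private subnets for EKS nodes: /20 blocks 0..2 of the VPC range.
resource "aws_subnet" "private" {
  count             = 3
  vpc_id            = aws_vpc.voice.id
  cidr_block        = cidrsubnet(aws_vpc.voice.cidr_block, 4, count.index)
  availability_zone = data.aws_availability_zones.available.names[count.index]

  tags = {
    Name                              = "voice-private-${count.index}"
    "kubernetes.io/role/internal-elb" = "1"
  }
}

# Public subnets for the load balancer: /20 blocks 8..10 of the VPC range.
resource "aws_subnet" "public" {
  count                   = 3
  vpc_id                  = aws_vpc.voice.id
  cidr_block              = cidrsubnet(aws_vpc.voice.cidr_block, 4, count.index + 8)
  availability_zone       = data.aws_availability_zones.available.names[count.index]
  map_public_ip_on_launch = true

  tags = {
    Name                     = "voice-public-${count.index}"
    "kubernetes.io/role/elb" = "1"
  }
}
```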

## Step 3 — OpenSearch Serverless vector store

```hcl
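resource "aws_opensearchserverless_collection" "vectors" {
  name = "voice-vectors"
  type = "VECTORSEARCH"

  # The matching encryption policy must exist before the collection is created.
  depends_on = [aws_opensearchserverless_security_policy.encr]
}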

resource "aws_opensearchserverless_security_policy" "encr" {
  name = "voice-vectors-encr"
  type = "encryption"
  policy = jsonencode({
    Rules = [{ ResourceType = "collection", Resource = ["collection/voice-vectors"] }]
    AWSOwnedKey = true
  })
}
```

We use this for tool-result caching and for the per-tenant FAQ embeddings. `VECTORSEARCH` is the right type — `SEARCH` will silently bill more.
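
An encryption policy alone does not make the collection reachable; OpenSearch Serverless also expects a network policy. A sketch of one possible setup, assuming access should stay inside the VPC through a dedicated endpoint. The resource names are illustrative, and `aws_vpc.voice` and `aws_subnet.private` are the same network resources the EKS module references:

```hcl
# Private endpoint so pods reach the collection without leaving the VPC.
resource "aws_opensearchserverless_vpc_endpoint" "vectors" {
  name       = "voice-vectors-vpce"
  vpc_id     = aws_vpc.voice.id
  subnet_ids = aws_subnet.private[*].id
}

# Network policy: no public access, only traffic from the endpoint above.
resource "aws_opensearchserverless_security_policy" "net" {
  name = "voice-vectors-net"
  type = "network"
  policy = jsonencode([{
    Description     = "VPC-only access to the voice-vectors collection"
    Rules           = [{ ResourceType = "collection", Resource = ["collection/voice-vectors"] }]
    AllowFromPublic = false
    SourceVPCEs     = [aws_opensearchserverless_vpc_endpoint.vectors.id]
  }])
}
```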

## Step 4 — IRSA for the voice-agent pod

```hcl
data "aws_iam_policy_document" "agent_assume" {
  statement {
    actions = ["sts:AssumeRoleWithWebIdentity"]
    principals {
      type        = "Federated"
      identifiers = [module.eks.oidc_provider_arn]
    }

    # IAM condition keys use the issuer host and path without the https:// scheme.
    condition {
      test     = "StringEquals"
      variable = "${replace(module.eks.cluster_oidc_issuer_url, "https://", "")}:sub"
      values   = ["system:serviceaccount:voice:voice-agent"]
    }
  }
}

resource "aws_iam_role" "agent" {
  name = "voice-agent"
  assume_role_policy = data.aws_iam_policy_document.agent_assume.json
}
```

Now the voice-agent pod assumes `voice-agent` via service-account annotation `eks.amazonaws.com/role-arn` — no static IAM keys, ever.
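
The role above can be assumed but is not yet allowed to do anything. For the pod to reach the vector store, it likely needs two grants: an IAM permission for the OpenSearch Serverless data plane and a data access policy on the collection side. A sketch under those assumptions; the policy names and the exact permission list are illustrative and should be trimmed to what the agent really needs:

```hcl
# IAM side: let the role call the OpenSearch Serverless data-plane API.
data "aws_iam_policy_document" "agent_aoss" {
  statement {
    actions   = ["aoss:APIAccessAll"]
    resources = [aws_opensearchserverless_collection.vectors.arn]
  }
}

resource "aws_iam_role_policy" "agent_aoss" {
  name   = "voice-agent-aoss"
  role   = aws_iam_role.agent.id
  policy = data.aws_iam_policy_document.agent_aoss.json
}

# OpenSearch side: data access policy naming the role as principal.
resource "aws_opensearchserverless_access_policy" "agent" {
  name = "voice-agent-data"
  type = "data"
  policy = jsonencode([{
    Rules = [{
      ResourceType = "index"
      Resource     = ["index/voice-vectors/*"]
      Permission = [
        "aoss:CreateIndex",
        "aoss:DescribeIndex",
        "aoss:ReadDocument",
        "aoss:WriteDocument",
      ]
    }]
    Principal = [aws_iam_role.agent.arn]
  }])
}
```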

## Step 5 — ALB with WebSocket idle timeout tuning

```hcl
resource "aws_lb" "voice" {
  name               = "voice-alb"
  load_balancer_type = "application"
  subnets            = aws_subnet.public[*].id
  idle_timeout       = 3600
  enable_http2       = true
}
```

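`idle_timeout = 3600` (1 hour) is critical. The 60s default drops long voice sessions whenever the line goes quiet; 3600s plus a keepalive every 30s from the agent keeps WebSockets alive. Prefer WebSocket ping frames for that keepalive, since the ALB's idle timer is reset by data on the connection rather than by empty TCP keepalive probes.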
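
The load balancer on its own accepts no traffic; it still needs a listener and a target group. A minimal sketch, assuming the agent listens on port 8080 with a `/healthz` health endpoint and that an ACM certificate ARN is supplied as a variable (all three are hypothetical):

```hcl
variable "certificate_arn" {
  type        = string
  description = "ACM certificate for the voice endpoint (hypothetical input)"
}

resource "aws_lb_target_group" "voice" {
  name        = "voice-agent"
  port        = 8080 # assumed agent port
  protocol    = "HTTP"
  target_type = "ip" # register pod IPs rather than instances
  vpc_id      = aws_vpc.voice.id

  health_check {
    path    = "/healthz" # assumed health endpoint
    matcher = "200"
  }
}

# HTTPS listener; WebSocket upgrades pass through an ALB listener as-is.
resource "aws_lb_listener" "wss" {
  load_balancer_arn = aws_lb.voice.arn
  port              = 443
  protocol          = "HTTPS"
  certificate_arn   = var.certificate_arn

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.voice.arn
  }
}
```

Registering pod IPs in the target group is typically left to something running in the cluster, such as the AWS Load Balancer Controller, which is outside the scope of this sketch.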

## Step 6 — Run `terraform plan` with a generated lockfile in CI

```yaml
# .github/workflows/terraform.yml (steps excerpt)
- run: terraform init -backend-config=bucket=tf-state -backend-config=key=voice/prod
- run: terraform validate
- run: terraform plan -out=tfplan
# Apply only on main; pull requests stop at plan.
- run: terraform apply -auto-approve tfplan
  if: github.ref == 'refs/heads/main'
```

The state goes to S3 with versioning and SSE-KMS; PRs only get `plan` permissions, `main` gets `apply`.
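
The `-backend-config` flags above imply a partial backend block in the configuration. A sketch of what that block might contain, assuming S3-native locking (available since Terraform 1.10) and a KMS alias whose ARN is a placeholder:

```hcl
terraform {
  backend "s3" {
    # bucket and key are supplied by -backend-config in CI.
    region  = "us-east-1"
    encrypt = true

    # Hypothetical KMS alias ARN; backend blocks cannot use variables.
    kms_key_id = "arn:aws:kms:us-east-1:123456789012:alias/tf-state"

    # S3-native state locking, so no separate lock table is needed.
    use_lockfile = true
  }
}
```

Bucket versioning is a property of the bucket itself, so it would be enabled wherever the state bucket is created rather than in this block.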

## Step 7 — Drift-detection cron

```yaml
on: { schedule: [{ cron: '0 3 * * *' }] }
jobs:
  drift:
    steps:
      - run: terraform plan -detailed-exitcode || (gh issue create --title "Drift detected" --body "see logs"; exit 1)
```

`-detailed-exitcode` returns 2 when the plan contains changes, which on a scheduled run means drift, and 1 on errors. Either way the job opens an issue, so you hear about a "fixed it in the console" change before it causes a 3am page.

## Pitfalls

- **Ephemeral resources in `output`** — root-module outputs cannot reference ephemeral values (child-module outputs can, if declared with `ephemeral = true`). The error message is misleading; just don't output ephemeral values from the root module.
- **OpenSearch Serverless eventual consistency** — collection ARNs aren't ready immediately; add a `time_sleep` of ~30s before granting policies.
- **ALB idle_timeout vs upstream** — set `server_side_timeout` on the agent ≥ ALB idle_timeout, otherwise upstream closes mid-call.
- **EKS OIDC issuer URL formatting** — the `replace(...,"https://","")` is required for the `sub` condition; many tutorials get this wrong.
- **Vault role TTL too short** — if `max_ttl` < apply duration, mid-apply refreshes fail. Set 30 min minimum.
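
The OpenSearch Serverless delay mentioned above can be sketched with the `time_sleep` resource from the `hashicorp/time` provider, which would need adding to `required_providers`. The reader role variable and the policy name are hypothetical, and 30s is the rough figure from the list, not a guaranteed bound:

```hcl
# Pause after the collection is created, before any policy references it.
resource "time_sleep" "wait_for_collection" {
  depends_on      = [aws_opensearchserverless_collection.vectors]
  create_duration = "30s"
}

variable "reader_role_arn" {
  type        = string
  description = "Hypothetical IAM role that should be able to inspect the collection"
}

# Any access policy can wait on the sleep through depends_on.
resource "aws_opensearchserverless_access_policy" "reader" {
  name       = "voice-vectors-reader"
  type       = "data"
  depends_on = [time_sleep.wait_for_collection]

  policy = jsonencode([{
    Rules = [{
      ResourceType = "collection"
      Resource     = ["collection/voice-vectors"]
      Permission   = ["aoss:DescribeCollectionItems"]
    }]
    Principal = [var.reader_role_arn]
  }])
}
```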

## How CallSphere does this in production

CallSphere's primary infra is a self-hosted k3s cluster (not EKS) with Postgres at 72.62.162.83 behind Cloudflare Tunnel — but tenant-isolated HIPAA installs use exactly this Terraform pattern on customer AWS accounts. We never store long-lived AWS keys in CI; Vault issues 30-min STS for every apply. 37 agents, 90+ tools, 115+ DB tables, $149/$499/$1499, 14-day [trial](/trial), 22% [affiliate](/affiliate), see [pricing](/pricing).

## FAQ

**Q: Terraform vs OpenTofu in 2026?**
OpenTofu is now wire-compatible with TF 1.10's ephemeral feature; pick OpenTofu if license matters, TF if you need HCP integrations.

**Q: How do I share state across teams?**
S3 with SSE-KMS plus state locking (a DynamoDB table, or the S3-native `use_lockfile` option introduced in Terraform 1.10), or HCP Terraform Workspaces with VCS-driven runs.

**Q: Why not store AWS keys in CI secrets and skip Vault?**
You can, but you're now committed to rotating those keys. OIDC + Vault is set-and-forget.

**Q: Where do model API keys go?**
External Secrets Operator pulls them from Vault into the pod at runtime. Never in TF state.

## Sources

- [Terraform 1.10 improves handling secrets with ephemeral values — HashiCorp](https://www.hashicorp.com/en/blog/terraform-1-10-improves-handling-secrets-in-state-with-ephemeral-values)
- [HashiCorp Terraform 1.10 adds Ephemeral Values — InfoQ](https://www.infoq.com/news/2024/11/terraform-1-10-ephemeral-values/)
- [Terraform Latest Trends 2026 — Clanker Cloud](https://clankercloud.ai/blog/terraform-latest-trends-2026-infrastructure-as-code)
- [Terraform Ephemeral Resources — Infisical](https://infisical.com/blog/terraform-ephemeral-resources)
- [Terraform + AI: 5 Futuristic Ways — TecAdmin](https://tecadmin.net/terraform-ai-futuristic-ways/)

