---
title: "Building a Floor Plan Analysis Agent: Room Detection, Measurement, and Description"
description: "Build an AI agent that analyzes architectural floor plans to detect rooms, classify their types, estimate areas, identify furniture, and generate natural language descriptions for real estate and interior design applications."
canonical: https://callsphere.ai/blog/building-floor-plan-analysis-agent-room-detection-measurement
category: "Learn Agentic AI"
tags: ["Floor Plan AI", "Room Detection", "Architecture", "Computer Vision", "Real Estate AI"]
author: "CallSphere Team"
published: 2026-03-18T00:00:00.000Z
updated: 2026-05-06T07:04:28.597Z
---

# Building a Floor Plan Analysis Agent: Room Detection, Measurement, and Description

> Build an AI agent that analyzes architectural floor plans to detect rooms, classify their types, estimate areas, identify furniture, and generate natural language descriptions for real estate and interior design applications.

## Why Floor Plan Analysis Matters

Real estate listings, interior design platforms, and property management systems all need structured data from floor plans. A floor plan image contains room layouts, dimensions, furniture placement, and spatial relationships — but this information is trapped in pixels. An AI agent that can parse floor plans into structured data unlocks automated property descriptions, accurate square footage calculations, and intelligent room comparisons.

The challenge is that floor plans come in wildly different styles: architectural blueprints, hand-drawn sketches, 3D-rendered marketing plans, and everything in between. A robust agent must handle this variety.

## The Analysis Pipeline

The agent processes floor plans through four stages: wall detection and room segmentation, room type classification, dimension estimation, and description generation.

```mermaid
flowchart LR
    PLAN(["Floor Plan Image"])
    subgraph SEG["Stage 1: Segmentation"]
        WALLS["Wall Detection
Morphology"]
        ROOMS["Room Segmentation
Connected Components"]
    end
    subgraph CLS["Stage 2: Classification"]
        OCR["OCR Labels
Tesseract"]
        TYPE{"Room Type
Classification"}
    end
    subgraph DIM["Stage 3: Measurement"]
        SCALE["Scale Detection"]
        AREA["Area Estimation"]
    end
    subgraph GEN["Stage 4: Description"]
        LLM["LLM Description
Generation"]
    end
    OUT(["Structured Listing
Description"])
    PLAN --> WALLS --> ROOMS
    ROOMS --> OCR --> TYPE
    ROOMS --> SCALE --> AREA
    TYPE --> LLM
    AREA --> LLM
    LLM --> OUT
    style PLAN fill:#f1f5f9,stroke:#64748b,color:#0f172a
    style TYPE fill:#4f46e5,stroke:#4338ca,color:#fff
    style OUT fill:#059669,stroke:#047857,color:#fff
```

## Wall Detection and Room Segmentation

Walls are the structural elements that define rooms. Detect them using morphological operations:

```python
import cv2
import numpy as np
from dataclasses import dataclass, field

@dataclass
class Room:
    room_id: int
    contour: np.ndarray
    bbox: tuple           # (x, y, w, h)
    area_pixels: float
    area_sqft: float = 0.0
    room_type: str = "unknown"
    center: tuple = (0, 0)
    furniture: list[str] = field(default_factory=list)

def detect_walls(image_path: str) -> np.ndarray:
    """Detect walls in a floor plan image."""
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)

    # Binarize: walls are typically dark lines
    _, binary = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY_INV)

    # Use morphological closing to connect broken wall segments
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5, 5))
    walls = cv2.morphologyEx(binary, cv2.MORPH_CLOSE, kernel)

    # Thicken walls slightly for better room segmentation
    walls = cv2.dilate(walls, kernel, iterations=1)

    return walls

def segment_rooms(wall_mask: np.ndarray) -> list[Room]:
    """Segment the floor plan into individual rooms."""
    # Invert: rooms are the spaces between walls
    room_spaces = cv2.bitwise_not(wall_mask)

    # Remove small noise regions
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
    room_spaces = cv2.morphologyEx(room_spaces, cv2.MORPH_OPEN, kernel)

    # Find connected components (each is a room)
    num_labels, labels, stats, centroids = cv2.connectedComponentsWithStats(
        room_spaces, connectivity=4
    )

    rooms = []
    for i in range(1, num_labels):  # Skip background (label 0)
        area = stats[i, cv2.CC_STAT_AREA]

        # Filter out very small regions (noise) and the exterior
        if area < 500 or area > 0.5 * wall_mask.size:  # thresholds are heuristic
            continue

        # Create contour for this room
        room_mask = (labels == i).astype(np.uint8) * 255
        contours, _ = cv2.findContours(
            room_mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE
        )

        if contours:
            rooms.append(Room(
                room_id=len(rooms),
                contour=contours[0],
                bbox=(
                    stats[i, cv2.CC_STAT_LEFT],
                    stats[i, cv2.CC_STAT_TOP],
                    stats[i, cv2.CC_STAT_WIDTH],
                    stats[i, cv2.CC_STAT_HEIGHT],
                ),
                area_pixels=area,
                center=(int(centroids[i][0]), int(centroids[i][1])),
            ))

    return rooms
```

## Scale Detection and Area Estimation

Floor plans often include a scale bar or dimension annotations. Detect these to convert pixel areas to real-world measurements:

```python
import pytesseract
from PIL import Image
import re

def detect_scale(image_path: str) -> float:
    """Detect the scale factor (pixels per foot) from annotations."""
    img = Image.open(image_path)
    text = pytesseract.image_to_string(img)

    # Look for dimension patterns like "10'" or "10ft" or "3m"
    ft_pattern = r"(\d+)['′]"
    m_pattern = r"(\d+\.?\d*)\s*m"

    ft_matches = re.findall(ft_pattern, text)
    m_matches = re.findall(m_pattern, text)

    if ft_matches:
        # Find the dimension annotation and its pixel length
        # This is a simplified version; production code would
        # locate the dimension line in the image
        return estimate_scale_from_annotation(image_path, ft_matches[0])

    return 10.0  # Default: 10 pixels per foot (rough estimate)

def estimate_scale_from_annotation(
    image_path: str,
    known_dimension: str,
) -> float:
    """Estimate pixels-per-foot from a known dimension annotation."""
    # In production, you would locate the dimension line endpoints
    # and compute: pixels_between_endpoints / dimension_in_feet
    known_ft = float(known_dimension)
    estimated_pixel_length = 100  # Placeholder
    return estimated_pixel_length / known_ft

def calculate_room_areas(
    rooms: list[Room],
    pixels_per_foot: float,
) -> list[Room]:
    """Convert pixel areas to square feet."""
    sqft_per_pixel = 1.0 / (pixels_per_foot ** 2)

    for room in rooms:
        room.area_sqft = round(room.area_pixels * sqft_per_pixel, 1)

    return rooms
```
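As a quick sanity check of the conversion above, with made-up numbers (a 12-foot annotation that spans 96 pixels on the drawing):

```python
# Hypothetical numbers: a "12'" dimension annotation spans 96 pixels
pixels_per_foot = 96 / 12                      # 8 px per foot
area_pixels = 12_800                           # a room's pixel area from segmentation
area_sqft = area_pixels / pixels_per_foot ** 2 # divide by (px/ft)^2
print(round(area_sqft, 1))                     # 200.0
```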

## Room Type Classification

Classify rooms based on their size, shape, position, and any detected text labels or furniture:

```python
from openai import OpenAI

def classify_rooms_with_context(
    rooms: list[Room],
    image_path: str,
) -> list[Room]:
    """Classify room types using spatial context and LLM reasoning."""
    # Extract any text labels near each room
    img = Image.open(image_path)
    full_text = pytesseract.image_to_string(img)

    room_descriptions = []
    for room in rooms:
        desc = (
            f"Room {room.room_id}: "
            f"area={room.area_sqft:.0f} sqft, "
            f"dimensions={room.bbox[2]}x{room.bbox[3]} pixels, "
            f"center=({room.center[0]}, {room.center[1]})"
        )
        room_descriptions.append(desc)

    client = OpenAI()
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": (
                "You are a floor plan analyst. Given room measurements "
                "and positions, classify each room as one of: living_room, "
                "bedroom, kitchen, bathroom, dining_room, hallway, closet, "
                "garage, office, laundry, entrance. Consider typical room "
                "sizes: bathrooms are small (30-80 sqft), bedrooms are "
                "medium (100-200 sqft), living rooms are large (200+ sqft)."
            )},
            {"role": "user", "content": (
                f"Text found on floor plan: {full_text}\n\n"
                f"Rooms:\n" + "\n".join(room_descriptions)
            )},
        ],
    )

    classifications = response.choices[0].message.content
    # Parse the LLM response line by line (simplified; production code
    # should request structured JSON output instead of free text)
    room_types = [
        "living_room", "bedroom", "kitchen", "bathroom", "dining_room",
        "hallway", "closet", "garage", "office", "laundry", "entrance",
    ]
    for line in classifications.splitlines():
        for room in rooms:
            if f"Room {room.room_id}" not in line:
                continue
            for room_type in room_types:
                if room_type in line.lower():
                    room.room_type = room_type
                    break

    return rooms
```

## Furniture and Fixture Detection

Detect common furniture symbols in the floor plan:

```python
FURNITURE_TEMPLATES = {
    "toilet": {"min_area": 200, "max_area": 800, "aspect_range": (0.5, 1.5)},
    "bathtub": {"min_area": 800, "max_area": 3000, "aspect_range": (0.3, 0.6)},
    "sink": {"min_area": 100, "max_area": 500, "aspect_range": (0.7, 1.3)},
    "bed": {"min_area": 2000, "max_area": 8000, "aspect_range": (0.5, 0.8)},
    "table": {"min_area": 500, "max_area": 3000, "aspect_range": (0.6, 1.4)},
}

def detect_furniture(
    image: np.ndarray,
    rooms: list[Room],
) -> list[Room]:
    """Detect furniture symbols within each room."""
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY) if len(image.shape) == 3 else image
    _, binary = cv2.threshold(gray, 200, 255, cv2.THRESH_BINARY_INV)

    for room in rooms:
        x, y, w, h = room.bbox
        room_region = binary[y:y+h, x:x+w]

        contours, _ = cv2.findContours(
            room_region, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE
        )

        for contour in contours:
            area = cv2.contourArea(contour)
            if area < 100:  # ignore tiny specks
                continue

            _, _, cw, ch = cv2.boundingRect(contour)
            aspect = cw / ch if ch else 0.0

            # Match the contour against the simple size/shape templates
            for name, spec in FURNITURE_TEMPLATES.items():
                lo, hi = spec["aspect_range"]
                if (spec["min_area"] <= area <= spec["max_area"]
                        and lo <= aspect <= hi):
                    room.furniture.append(name)
                    break

    return rooms
```

## Description Generation

Combine the structured room data into a listing-ready description:

```python
def generate_description(rooms: list[Room]) -> str:
    """Generate a natural language property description."""
    total_sqft = sum(r.area_sqft for r in rooms)
    bedrooms = [r for r in rooms if r.room_type == "bedroom"]
    bathrooms = [r for r in rooms if r.room_type == "bathroom"]

    client = OpenAI()
    room_details = "\n".join(
        f"- {r.room_type.replace('_', ' ').title()}: "
        f"{r.area_sqft:.0f} sqft"
        f"{', with ' + ', '.join(r.furniture) if r.furniture else ''}"
        for r in rooms
    )

    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": (
                "Write a professional real estate listing description "
                "based on these floor plan details. Be factual, "
                "highlight the layout flow, and mention room sizes."
            )},
            {"role": "user", "content": (
                f"Total area: {total_sqft:.0f} sqft\n"
                f"Bedrooms: {len(bedrooms)}\n"
                f"Bathrooms: {len(bathrooms)}\n\n"
                f"Room details:\n{room_details}"
            )},
        ],
    )

    return response.choices[0].message.content
```

## FAQ

### How accurate are the area measurements from floor plan analysis?

Pixel-based area estimation typically achieves 85-95% accuracy when a reliable scale reference is detected. The main error sources are perspective distortion in photographs of floor plans, inconsistent line weights, and scale bars that are not detected correctly. For critical measurements, always include a disclaimer that areas are estimates and should be verified by professional measurement.
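Because area scales with the square of the linear scale factor, an error in the detected pixels-per-foot value roughly doubles in the area estimate. A minimal sketch of that propagation:

```python
def area_error(scale_error: float) -> float:
    """Relative area error caused by a relative error in pixels-per-foot."""
    # area = pixels / scale**2, so overestimating the scale shrinks the area
    return 1.0 / (1.0 + scale_error) ** 2 - 1.0

print(round(area_error(0.05) * 100, 1))  # -9.3 (a 5% scale error -> ~9% area error)
```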

### Can this work on hand-drawn floor plans?

Yes, but with reduced accuracy. Hand-drawn plans have inconsistent line weights, imprecise angles, and often lack scale references. The wall detection stage needs more aggressive morphological operations, and room classification relies more heavily on text labels (which may be handwritten and harder to OCR). Expect 70-80% accuracy on room detection for clean hand-drawn plans.

### How do I handle multi-story buildings?

Process each floor plan image independently, then use the LLM to identify common elements (staircases, elevators) that connect floors. Generate a combined description that references the flow between levels. The key challenge is maintaining consistent room numbering across floors.
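One lightweight way to keep numbering consistent is to prefix each room ID with its floor index. The `renumber` helper and dict-based rooms here are purely illustrative:

```python
def renumber(floors: list[list[dict]]) -> list[dict]:
    """Give every room a globally unique label like '2-01' (floor-room)."""
    labeled = []
    for floor_idx, rooms in enumerate(floors, start=1):
        for room_idx, room in enumerate(rooms, start=1):
            # Copy the room dict and attach the floor-qualified label
            labeled.append({**room, "label": f"{floor_idx}-{room_idx:02d}"})
    return labeled

plans = [[{"type": "kitchen"}], [{"type": "bedroom"}, {"type": "bathroom"}]]
print([r["label"] for r in renumber(plans)])  # ['1-01', '2-01', '2-02']
```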

---

#FloorPlanAI #RoomDetection #RealEstateAI #ComputerVision #ArchitectureAI #PropertyTech #AgenticAI #Python

---

Source: https://callsphere.ai/blog/building-floor-plan-analysis-agent-room-detection-measurement
