AI News

Robotics Meets Agentic AI: Figure and Boston Dynamics Deploy LLM-Powered Robot Agents

Humanoid robots powered by large language models can now understand natural language commands and autonomously plan complex physical tasks, merging embodied AI with agentic reasoning.

When Language Models Get a Body

The convergence of large language models and physical robotics has produced what may be the most consequential development in AI since the transformer architecture itself. In Q1 2026, both Figure AI and Boston Dynamics demonstrated production-ready humanoid robots that use LLM-based agentic reasoning to understand natural language commands, plan multi-step physical tasks, and adapt to unexpected situations in real time.

This is not the scripted, pre-programmed robotics of the past decade. These systems combine the natural language understanding and reasoning capabilities of frontier LLMs with the perception, manipulation, and locomotion capabilities of modern humanoid platforms. The result is robots that can be instructed in plain English to perform complex, multi-step tasks they have never been explicitly programmed to execute.

"We've spent decades trying to hand-code every possible scenario a robot might encounter," said Brett Adcock, CEO of Figure AI. "LLMs give us general-purpose reasoning that transfers to the physical world. The robot doesn't need to have seen a specific task before — it can reason about novel situations using the same common-sense understanding that makes language models useful."

Figure 02: The First Commercial LLM-Powered Humanoid

Figure AI's second-generation humanoid robot, Figure 02, began commercial deployments in January 2026 at BMW's manufacturing facility in Spartanburg, South Carolina. The robot stands 5'6", weighs 130 pounds, and features 40 degrees of freedom with hands capable of manipulating objects as small as a pen.

What distinguishes Figure 02 from previous industrial robots is its cognitive architecture. The robot uses a multimodal LLM — developed in partnership with OpenAI — that processes visual input from stereo cameras, proprioceptive feedback from joint sensors, and natural language instructions simultaneously.

Task Planning and Execution

When given an instruction like "Sort these parts by size and place the defective ones in the red bin," Figure 02 decomposes the task into sub-steps: identify all parts, estimate relative sizes, determine sort order, detect defects using visual inspection, and execute the physical manipulation sequence. This decomposition happens in real time using chain-of-thought reasoning within the LLM.
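The decomposition step can be sketched as follows. Everything here is illustrative: `decompose` is a hypothetical stand-in for the multimodal LLM call, with the sorting example's sub-steps hard-coded rather than generated by a model.

```python
from dataclasses import dataclass

@dataclass
class SubTask:
    description: str
    done: bool = False

def decompose(instruction: str) -> list[SubTask]:
    """Stand-in for the LLM planning call.

    A real system would prompt the multimodal model with the instruction
    plus current camera and joint-sensor state; here the sub-steps for the
    sorting example are hard-coded for illustration.
    """
    return [
        SubTask("identify all parts in the workspace"),
        SubTask("estimate relative part sizes"),
        SubTask("determine sort order"),
        SubTask("detect defects via visual inspection"),
        SubTask("execute the pick-and-place sequence"),
    ]

plan = decompose("Sort these parts by size and place the defective ones in the red bin")
for step in plan:
    print("-", step.description)
```

The interesting engineering is in what `decompose` returns in practice: an ordered, machine-checkable structure rather than free text, so each sub-step can be verified against sensor data before the next one begins.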

The robot's planning system generates a hierarchical task graph that it executes while continuously monitoring for deviations. If a part slips from its grasp, it doesn't fail catastrophically — it recognizes the error, re-plans, and recovers. This robustness comes from the LLM's ability to reason about unexpected situations rather than relying on brittle pre-programmed error handlers.
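The monitor-and-recover loop described above might look like the outline below. `execute` and `replan` are hypothetical hooks into the motion controller and the LLM layer; here they are mocked with a grasp that slips once before the recovery attempt succeeds.

```python
def run(plan, execute, replan, max_retries=3):
    """Execute sub-tasks in order, re-planning on failure instead of aborting."""
    completed = []
    for step in plan:
        attempts = 0
        while not execute(step):
            attempts += 1
            if attempts >= max_retries:
                raise RuntimeError(f"could not recover on step: {step}")
            step = replan(step)  # ask the strategic (LLM) layer for a recovery step
        completed.append(step)
    return completed

# Mock hooks: the grasp fails once, then the re-planned attempt succeeds.
failures = {"grasp part": 1}

def flaky_execute(step):
    if failures.get(step, 0) > 0:
        failures[step] -= 1
        return False
    return True

def mock_replan(step):
    return step  # a real system would generate e.g. "re-locate and re-grasp part"

print(run(["locate part", "grasp part", "place in bin"], flaky_execute, mock_replan))
```

The key property is that a failed sub-step routes back through planning rather than raising immediately; only after bounded retries does the system escalate.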

Performance Metrics

In BMW's initial deployment, Figure 02 achieved the following metrics after a 90-day evaluation period:

  • Task completion rate: 94% for trained tasks, 78% for novel task variations
  • Mean time to complete: Within 1.3x of human worker speed for manipulation tasks
  • Unplanned downtime: Less than 2% over the evaluation period
  • Safety incidents: Zero reportable incidents across 10,000+ operating hours

Boston Dynamics Atlas: From Research Platform to Agentic Worker

Boston Dynamics took a different path to the same destination. Their electric Atlas humanoid, which replaced the hydraulic research platform in 2024, now ships with what the company calls the "Cognitive Layer" — an LLM-based planning system that sits atop their industry-leading locomotion and manipulation controllers.

The Cognitive Architecture

Atlas's cognitive architecture separates reasoning into three layers:

Strategic Layer (LLM): Processes natural language instructions, decomposes them into sub-tasks, and manages high-level planning. This layer uses a fine-tuned version of a frontier model that has been trained on millions of hours of robotic task execution data.

Tactical Layer (Neural Controllers): Converts high-level sub-tasks into motion plans, handling path planning, obstacle avoidance, and dynamic balance. These controllers were trained using reinforcement learning in simulation and transfer to the physical robot.

Reflexive Layer (Real-time Control): Handles sub-millisecond balance corrections, force feedback during manipulation, and safety-critical responses. This layer operates independently of the higher layers and can override any command that would put the robot in an unsafe state.
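Conceptually, the three layers compose like the sketch below. All class and method names are invented for illustration; the point the architecture makes is that the reflexive layer vets every command, independent of what the layers above decided.

```python
class StrategicLayer:
    """LLM planning layer: instruction -> ordered sub-tasks (mocked here)."""
    def plan(self, instruction):
        return ["approach shelf", "grasp box", "place box on pallet"]

class TacticalLayer:
    """Neural controllers: sub-task -> motion command with force targets."""
    def to_motion(self, subtask):
        force = 80.0 if "grasp" in subtask else 20.0  # grasp requests too much force
        return {"subtask": subtask, "force_n": force}

class ReflexiveLayer:
    """Real-time safety layer: can veto any command, independent of the rest."""
    MAX_FORCE_N = 50.0
    def vet(self, cmd):
        return cmd["force_n"] <= self.MAX_FORCE_N

def dispatch(instruction, strategic, tactical, reflexive):
    executed, vetoed = [], []
    for sub in strategic.plan(instruction):
        cmd = tactical.to_motion(sub)
        (executed if reflexive.vet(cmd) else vetoed).append(cmd["subtask"])
    return executed, vetoed

print(dispatch("move the box", StrategicLayer(), TacticalLayer(), ReflexiveLayer()))
# the over-force "grasp box" command is vetoed by the reflexive layer
```

In a real controller the reflexive layer runs on its own real-time loop rather than being called inline, but the veto relationship is the same: safety checks sit below planning, not inside it.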

"The LLM gives Atlas common sense," said Robert Playter, CEO of Boston Dynamics. "It knows that fragile things should be handled gently, that heavy objects need a wide base of support, and that it should ask for clarification if an instruction is ambiguous. These are things we could never hand-code for every possible scenario."

Warehouse Deployments

Atlas began warehouse deployments with Hyundai's logistics division in February 2026. In these environments, the robot performs mixed-SKU picking, palletizing, and inventory auditing — tasks that require the combination of physical dexterity, spatial reasoning, and language understanding that neither pure robotics nor pure AI could handle alone.

The Technical Challenges

Despite the impressive demonstrations, significant technical challenges remain before LLM-powered robots achieve widespread deployment.

Latency

LLM inference takes hundreds of milliseconds, which is acceptable for high-level planning but too slow for reactive physical control. The current architectures handle this through the layered approach described above, but edge cases still exist where the strategic layer's decisions arrive too late for the tactical layer to execute safely.
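One common pattern for this problem is deadline-bounded planning: the control loop asks the strategic layer for a plan, but falls back to a safe default action if inference misses the deadline. The sketch below is an assumption about how such a fallback could be structured, not a description of either vendor's system; `llm_plan` simulates inference latency with a sleep.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError

def llm_plan(latency_s: float) -> str:
    """Stand-in for strategic-layer LLM inference."""
    time.sleep(latency_s)
    return "place part in bin"

def next_action(pool, latency_s, deadline_s):
    """Return the LLM's plan if it beats the control deadline, else a safe default."""
    future = pool.submit(llm_plan, latency_s)
    try:
        return future.result(timeout=deadline_s)
    except TimeoutError:
        return "hold position"  # safe default while planning completes in the background

pool = ThreadPoolExecutor(max_workers=2)
print(next_action(pool, latency_s=0.2, deadline_s=0.05))  # slow model -> "hold position"
print(next_action(pool, latency_s=0.0, deadline_s=0.5))   # fast model -> "place part in bin"
```

The design choice here is that missing a deadline degrades to a conservative action rather than blocking the control loop; the late plan can still be consumed on the next tick.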

Hallucination in Physical Space

LLM hallucination in a text context is inconvenient. LLM hallucination in a physical context is dangerous. If a robot's language model incorrectly reasons that an object is lightweight when it's actually heavy, the resulting manipulation attempt could cause damage or injury. Both Figure and Boston Dynamics have invested heavily in grounding mechanisms that cross-reference LLM reasoning with sensor data, but the problem is not fully solved.
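One simple grounding pattern is to treat sensor readings as authoritative over the model's priors. A toy version of the mass example, with invented names and thresholds:

```python
def grounded_mass(llm_estimate_kg: float, sensed_kg: float, tolerance: float = 0.5) -> float:
    """Cross-check the LLM's mass estimate against the wrist force sensor.

    If the estimate disagrees with the measured load by more than `tolerance`
    (as a fraction of the sensed value), trust the sensor and discard the prior.
    """
    if abs(llm_estimate_kg - sensed_kg) > tolerance * sensed_kg:
        return sensed_kg  # grounding override: the model's prior is rejected
    return llm_estimate_kg

print(grounded_mass(1.0, 8.0))  # model guessed "light"; sensor wins
print(grounded_mass(5.0, 5.2))  # close enough; keep the model's estimate
```

Real grounding stacks are far richer than a single threshold check, but the principle is the same: every physically consequential belief the LLM holds gets validated against measurement before it drives actuation.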

Cost

Figure 02 is priced at approximately $60,000-$80,000 per unit for commercial customers, with a total cost of ownership, including maintenance and cloud compute for the LLM layer, estimated at $15-$20 per operating hour. While this is competitive with fully loaded human labor costs in many markets, it remains prohibitive for smaller operations.
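Using the article's midpoint figures ($70,000 per unit, $17.50 per operating hour), a rough payback calculation looks like the following. The $35/hour labor rate and 500 operating hours per month are illustrative assumptions, not figures from the article, and the model ignores financing, depreciation, and integration costs.

```python
def payback_months(unit_cost, robot_hourly_tco, human_hourly, hours_per_month):
    """Months for hourly savings to cover the purchase price (rough sketch only)."""
    monthly_saving = (human_hourly - robot_hourly_tco) * hours_per_month
    return unit_cost / monthly_saving

# Midpoints from the article; labor rate and utilization are assumptions.
print(payback_months(70_000, 17.5, 35.0, 500))  # -> 8.0 months
```

Under these assumptions the unit pays for itself in well under a year, which illustrates why the economics work for large manufacturers running high utilization but not for smaller operations with fewer hours to amortize over.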

The Competitive Landscape

Figure and Boston Dynamics are not alone. Tesla's Optimus program continues development with a target of sub-$20,000 unit cost. Chinese manufacturers including Unitree Robotics and UBTECH are shipping simpler humanoid platforms at lower price points. Agility Robotics' Digit, focused on logistics, has been deployed at Amazon facilities since 2024.

The race is now on to determine which architecture — and which business model — will dominate the emerging market for general-purpose humanoid robots. The LLM-powered agentic approach championed by Figure and Boston Dynamics represents the highest-capability but also highest-cost end of the spectrum.

What is clear is that the combination of large language models and physical robotics has crossed a threshold. Robots that can understand, reason, plan, and act in the physical world are no longer science fiction. They are shipping products with paying customers and measurable ROI.

Written by

CallSphere Team
