Playwright Page Interactions: Clicking, Typing, and Navigating with Python

Master Playwright's interaction API for AI agents — learn how to click buttons, fill forms, select dropdowns, use keyboard and mouse actions, and implement reliable waiting strategies.

Beyond Navigation: Interacting with Pages

Once your AI agent can navigate to a page and locate elements, the next step is interacting with those elements: clicking buttons, filling forms, selecting options from dropdowns, and handling keyboard shortcuts. Playwright's interaction API automatically waits for an element to be actionable before acting on it, which removes many of the flaky timing issues that plague other automation tools.

This post covers every major interaction method with practical examples you can use in your AI agents.

Clicking Elements

Playwright's click() method automatically waits for the element to be visible, stable (not animating), enabled, and not obscured by other elements:

from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    page.goto("https://example.com")

    # Click a button by role
    page.get_by_role("button", name="Submit").click()

    # Click a link by text
    page.get_by_text("Learn More").click()

    # Double click
    page.locator("#editable-field").dblclick()

    # Right click (context menu)
    page.locator("#item").click(button="right")

    # Click at specific position within an element
    page.locator("#canvas").click(position={"x": 100, "y": 200})

    # Force click — bypass actionability checks (use sparingly)
    page.locator("#hidden-button").click(force=True)

    browser.close()

The force=True option should be reserved for edge cases where Playwright's actionability checks conflict with unusual page behavior. In most situations, if an element is not clickable, that is a real problem your agent should handle rather than force through.
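
One way for an agent to treat an unclickable element as a condition to handle, rather than forcing the click, is to probe the element first. The sketch below is a minimal illustration: the helper name click_if_ready is hypothetical, and locator is assumed to be a Playwright sync Locator.

```python
def click_if_ready(locator) -> bool:
    """Click only when the element is currently visible and enabled.

    `locator` is assumed to be a Playwright sync Locator (hypothetical helper).
    Returns True if a click was issued, False if the element was not ready.
    """
    # is_visible() answers immediately instead of waiting, so this is a probe.
    if not locator.is_visible():
        return False
    if not locator.is_enabled():
        return False
    locator.click()
    return True
```

A False result gives the agent a chance to scroll, dismiss an overlay, or re-plan. The click() call itself still runs the normal actionability checks, so a timeout there remains possible and should be handled separately.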

Filling Forms

Form filling is one of the most common tasks for AI agents. Playwright provides specialized methods for different input types:

# Text input — clears existing content first
page.get_by_label("Username").fill("ai_agent_user")
page.get_by_label("Password").fill("secure_password_123")

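# Type character by character (simulates real typing).
# Locator.type() is deprecated in recent Playwright releases;
# press_sequentially() replaces it and accepts the same delay option.
page.get_by_label("Search").press_sequentially("machine learning", delay=50)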

# Clear a field
page.get_by_label("Email").clear()

# Fill and press Enter in one flow
page.get_by_label("Search").fill("agentic AI")
page.get_by_label("Search").press("Enter")

The difference between fill() and type() matters for AI agents. fill() sets the value in one step (fast, reliable), while type() simulates individual keystrokes with an optional delay (slower, but it triggers the keystroke event listeners that some sites rely on for validation or autocomplete). Recent Playwright releases deprecate Locator.type() in favor of press_sequentially(), which behaves the same way.

Selecting from Dropdowns

Playwright handles both native HTML <select> elements and custom dropdown components:

# Native <select> — by value
page.get_by_label("Country").select_option("us")

# By visible text
page.get_by_label("Country").select_option(label="United States")

# By index
page.get_by_label("Country").select_option(index=2)

# Multiple selection
page.get_by_label("Skills").select_option(["python", "javascript", "rust"])

# Custom dropdown (not a <select>) — click to open, then click option
page.locator(".custom-dropdown-trigger").click()
page.locator(".dropdown-option", has_text="United States").click()

Checkbox and Radio Button Interactions

# Check a checkbox
page.get_by_label("I agree to terms").check()

# Uncheck
page.get_by_label("Subscribe to newsletter").uncheck()

# Set to a specific state (check if unchecked, noop if already checked)
page.get_by_label("Enable notifications").set_checked(True)

# Verify state
is_checked = page.get_by_label("I agree to terms").is_checked()
print(f"Terms accepted: {is_checked}")

Keyboard Actions

AI agents sometimes need to trigger keyboard shortcuts or special keys:

# Press a single key
page.keyboard.press("Escape")
page.keyboard.press("Tab")
page.keyboard.press("Enter")

# Keyboard shortcuts
page.keyboard.press("Control+a")  # Select all
page.keyboard.press("Control+c")  # Copy
page.keyboard.press("Control+v")  # Paste

# Type a string (fires keydown, keypress, keyup for each char)
page.keyboard.type("Hello, World!", delay=100)

# Hold and release keys
page.keyboard.down("Shift")
page.keyboard.press("ArrowDown")
page.keyboard.press("ArrowDown")
page.keyboard.up("Shift")
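
The shortcuts above use Control, which is the primary modifier on Windows and Linux. On macOS the equivalent shortcuts use the Command key, which Playwright names Meta. A minimal sketch of a helper that picks the modifier follows; the function name shortcut is hypothetical, and it assumes the browser runs on the same machine as the script.

```python
import sys


def shortcut(key: str, platform: str = sys.platform) -> str:
    """Build a key combination with the platform's primary modifier.

    Hypothetical helper: "darwin" (macOS) maps to Meta, everything else
    maps to Control.
    """
    modifier = "Meta" if platform == "darwin" else "Control"
    return f"{modifier}+{key}"
```

The result can be passed straight to the keyboard API, for example page.keyboard.press(shortcut("a")) for select all.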

Mouse Actions

For complex interactions like drag-and-drop or hover menus:

# Hover to reveal a tooltip or dropdown
page.locator(".user-menu").hover()
page.locator(".dropdown-item", has_text="Settings").click()

# Drag and drop
page.locator("#source-item").drag_to(page.locator("#target-area"))

# Manual mouse movement
page.mouse.move(100, 200)
page.mouse.down()
page.mouse.move(300, 400)
page.mouse.up()

Waiting Strategies for Reliable Interactions

Playwright auto-waits before actions, but sometimes your agent needs explicit waits:

# Wait for the loading spinner to disappear
page.wait_for_selector(".loading-spinner", state="hidden")

# Wait for an element to be visible
page.locator(".results-panel").wait_for(state="visible")

# Wait for a specific condition with a custom timeout
page.get_by_role("button", name="Download").wait_for(
    state="visible",
    timeout=10000
)

# Wait for a function to return true
page.wait_for_function("document.querySelector('.data-loaded') !== null")

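# Wait for navigation after a click.
# expect_navigation() is deprecated as inherently racy; prefer
# page.wait_for_url() when the target URL is known.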
with page.expect_navigation():
    page.get_by_text("Next Page").click()
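
Agents often need two of these waits in sequence: first for a loading indicator to go away, then for the content to show up. The sketch below combines them under stated assumptions: the helper name wait_for_results is hypothetical, page is assumed to be a Playwright sync Page, and both selectors are supplied by the caller.

```python
def wait_for_results(page, spinner_selector: str, results_selector: str,
                     timeout_ms: float = 10000):
    """Wait for a spinner to disappear, then for the results to be visible.

    Hypothetical helper; `page` is assumed to be a Playwright sync Page.
    Returns the results locator so the caller can keep working with it.
    """
    # state="hidden" is also satisfied when the spinner is not in the DOM.
    page.locator(spinner_selector).wait_for(state="hidden", timeout=timeout_ms)
    results = page.locator(results_selector)
    results.wait_for(state="visible", timeout=timeout_ms)
    return results
```

Each wait gets its own timeout budget here, so the worst case is twice timeout_ms.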

Complete Form Automation Example

Here is a complete example that demonstrates a realistic AI agent form-filling workflow:

from playwright.sync_api import sync_playwright

def fill_contact_form():
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto("https://httpbin.org/forms/post")

        # Fill text fields
        page.get_by_label("Customer name").fill("AI Agent Demo")
        page.get_by_label("Telephone").fill("555-0100")
        page.get_by_label("E-mail address").fill("agent@example.com")

        # Select pizza size (radio buttons)
        page.get_by_label("Medium").check()

        # Select toppings (checkboxes)
        page.get_by_label("Bacon").check()
        page.get_by_label("Onion").check()

        # Fill delivery time
        page.get_by_label("Preferred delivery time").fill("19:30")

        # Add special instructions
        page.get_by_label("Delivery instructions").fill(
            "Ring doorbell twice. Leave at door if no answer."
        )

        # Submit the form
        page.get_by_role("button", name="Submit order").click()

        # Wait for and capture the response
        page.wait_for_load_state("networkidle")
        print("Form submitted successfully")
        print(f"Response URL: {page.url}")

        browser.close()

fill_contact_form()
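
When an agent decides the field values at runtime, the same workflow can be driven by data instead of hard-coded calls. This is a minimal sketch, not part of the example above: the helper name fill_form and its parameters are hypothetical, and page is assumed to be a Playwright sync Page whose fields are reachable by label.

```python
def fill_form(page, text_fields: dict[str, str], checked_labels: list[str]) -> None:
    """Fill labeled text inputs and check labeled radios or checkboxes.

    Hypothetical helper; `page` is assumed to be a Playwright sync Page.
    """
    for label, value in text_fields.items():
        page.get_by_label(label).fill(value)
    for label in checked_labels:
        page.get_by_label(label).check()
```

With the form above, a call could look like fill_form(page, {"Customer name": "AI Agent Demo", "Telephone": "555-0100"}, ["Medium", "Bacon"]).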

Assertions for Verification

After performing actions, your AI agent should verify the results:

import re

from playwright.sync_api import expect

# Verify text content
expect(page.locator(".success-message")).to_have_text("Form submitted")

# Verify visibility
expect(page.locator(".error-banner")).not_to_be_visible()

# Verify input value
expect(page.get_by_label("Email")).to_have_value("agent@example.com")

# Verify URL after navigation (regular expression for a partial match)
expect(page).to_have_url(re.compile(r".*/success.*"))
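
A failed expect() raises an AssertionError, which suits tests but is awkward for an agent that wants to branch on the outcome. One possible wrapper turns an assertion into a boolean; the name passes is hypothetical and the sketch makes no assumption about which assertion is wrapped.

```python
from typing import Callable


def passes(assertion: Callable[[], None]) -> bool:
    """Run an assertion callable and report whether it held.

    Hypothetical helper: returns False on AssertionError and lets any
    other exception propagate, since that signals a different problem.
    """
    try:
        assertion()
    except AssertionError:
        return False
    return True
```

Usage would look like passes(lambda: expect(page.locator(".success-message")).to_have_text("Form submitted")). Because expect() retries until its own timeout expires, a False result can take several seconds to come back.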

FAQ

When should an AI agent use type() instead of fill()?

Use type() (or press_sequentially(), its replacement in recent Playwright releases) when the website relies on keystroke events for functionality like autocomplete suggestions, real-time validation, or search-as-you-type features. Use fill() for everything else because it is faster and more reliable. A good heuristic is to start with fill() and switch to keystroke-level typing only if the site does not respond correctly.
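
The heuristic can be sketched as a small fallback helper. Everything named here is an assumption for illustration: fill_with_fallback is hypothetical, locator is assumed to be a Playwright sync Locator, and responded is a caller-supplied check (for example, whether an autocomplete list appeared), since what counts as a correct response depends on the site.

```python
from typing import Callable


def fill_with_fallback(locator, text: str, responded: Callable[[], bool]) -> str:
    """Try fill() first, then fall back to per-keystroke typing.

    Hypothetical helper; returns the name of the method that was used last.
    """
    locator.fill(text)
    if responded():
        return "fill"
    # Empty the field first so the typed text is not appended to the fill.
    locator.clear()
    locator.press_sequentially(text)
    return "press_sequentially"
```

The return value lets the agent remember which method a given site needs and skip the failed attempt next time.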

How does Playwright handle elements that are not yet on the page?

Playwright's locator API is lazy: it does not query the DOM until you perform an action. When you call page.get_by_role("button", name="Submit").click(), Playwright waits up to 30 seconds (the default timeout, which is configurable) for the button to appear, become visible, and be actionable before clicking. If the element never appears, it raises a TimeoutError that your agent can catch and handle.

Can Playwright interact with iframes?

Yes. Use page.frame_locator() to target elements inside iframes. For example, page.frame_locator("#payment-iframe").get_by_label("Card number").fill("4242..."). Each iframe is treated as a separate frame, and Playwright handles cross-origin iframes transparently.

