Learn Agentic AI
Learn Agentic AI archive page 20 of 146

Learn Agentic AI — Build Voice & Chat Agents

Step-by-step tutorials on building voice and chat AI agents using OpenAI Agents SDK, Realtime API, function calling, multi-agent orchestration, and production deployment patterns.

9 of 1309 articles

13 min read | 9 | Mar 17, 2026

Building a Claude Browser Agent: Automated Web Navigation with Anthropic SDK

Step-by-step guide to building a browser automation agent with Claude Computer Use — from SDK setup and screenshot capture to executing click, type, and scroll actions for real web navigation tasks.

12 min read | 9 | Mar 17, 2026

Building a Claude Web Scraper: Extracting Data Using Vision Instead of Selectors

Learn how to use Claude Computer Use for visual data extraction — reading HTML tables, parsing charts, extracting structured data from complex layouts, and converting visual information to JSON without any CSS selectors.

12 min read | 6 | Mar 17, 2026

Building Custom UFO Tasks: Automating Excel, Word, and Outlook with Natural Language

Practical examples of automating Microsoft Office applications with UFO — from Excel data manipulation and Word document formatting to Outlook email workflows, with multi-step task descriptions and result verification.

13 min read | 6 | Mar 17, 2026

Error Handling and Retry Patterns for Playwright AI Agents

Build resilient Playwright AI agents with comprehensive error handling for timeouts, missing elements, navigation failures, and network errors, plus retry decorators and graceful degradation strategies.

11 min read | 12 | Mar 17, 2026

Element Detection with GPT Vision: Finding Buttons, Forms, and Links Without Selectors

Discover how GPT Vision identifies interactive web elements visually, eliminating the need for CSS selectors or XPaths. Learn bounding box extraction, OCR-free text reading, and visual element classification.

11 min read | 6 | Mar 17, 2026

Using GPT-4 Vision to Understand Web Pages: Screenshot Analysis for AI Agents

Learn how to capture web page screenshots and send them to GPT-4 Vision for element identification, layout understanding, and structured analysis that powers browser automation agents.

13 min read | 8 | Mar 17, 2026

Playwright with Async Python: Concurrent Browser Automation for AI Agents

Learn how to use Playwright's async API with Python asyncio to run concurrent browser sessions, parallelize page interactions, and build high-throughput AI agent automation pipelines.

13 min read | 3 | Mar 17, 2026

Building a Vision-Based Web Navigator: GPT-4V Sees and Acts on Web Pages

Build a complete screenshot-action loop where GPT-4V analyzes web pages, decides where to click, and navigates autonomously. Learn coordinate extraction, click targeting, and navigation decision-making.