Page Agent: The GUI Agent Living in Your Webpage

30 minutes
beginner

Control web interfaces with natural language using Page Agent. JavaScript GUI automation toolkit for developers.

Page Agent is a JavaScript tool that lets you control web interfaces with natural-language commands. It runs as in-page JavaScript, so it needs no browser extension, Python, or headless browser.

Key Features

Easy Integration

  • No need for browser extensions / Python / headless browser
  • Just in-page JavaScript
  • Everything happens in your web page
  • Designed to let your agent control web pages directly

Text-based DOM Manipulation

  • No screenshots required
  • No OCR or multi-modal LLMs needed
  • No special permissions required
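
To illustrate what a text-based approach can look like, here is a minimal sketch that turns a list of interactive elements into indexed text lines that an LLM could read and refer to by number. This is not Page Agent's actual serialization format: the `describeElements` helper and the element shape (`tag`, `text`, `attrs`) are hypothetical and chosen only for illustration.

```javascript
// Hypothetical helper: serialize interactive elements into indexed text lines.
// Assumption: each element is a plain object { tag, text, attrs }; in a real
// page these values would be read from DOM nodes instead.
function describeElements(elements) {
  return elements
    .map((el, index) => {
      const attrs = Object.entries(el.attrs || {})
        .map(([name, value]) => ` ${name}="${value}"`)
        .join('');
      const text = (el.text || '').trim();
      return `[${index}] <${el.tag}${attrs}>${text}</${el.tag}>`;
    })
    .join('\n');
}

const snapshot = describeElements([
  { tag: 'button', text: ' Log in ', attrs: { id: 'login' } },
  { tag: 'input', attrs: { type: 'email', placeholder: 'Email' } },
]);
console.log(snapshot);
```

Because the model only sees short text lines like these, it can answer with an element index and an action instead of interpreting pixels.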

Flexible AI Integration

  • Bring your own LLMs
  • Compatible with various AI models
  • Support for custom configurations
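
As one way to picture "bring your own LLM", the sketch below validates a settings object before it is handed to the agent. The field names (`model`, `baseURL`, `apiKey`, `language`) are the ones used in the Quick Start example; the `buildAgentConfig` helper, its validation rules, and the `'en-US'` fallback are assumptions made for illustration, not part of the library.

```javascript
// Hypothetical helper: check LLM settings and return a config object.
// Field names follow the Quick Start example; everything else is assumed.
function buildAgentConfig(settings) {
  const required = ['model', 'baseURL', 'apiKey'];
  const missing = required.filter((key) => !settings[key]);
  if (missing.length > 0) {
    throw new Error(`Missing LLM settings: ${missing.join(', ')}`);
  }
  // new URL() throws a TypeError if baseURL is not a valid absolute URL.
  new URL(settings.baseURL);
  return {
    model: settings.model,
    baseURL: settings.baseURL,
    apiKey: settings.apiKey,
    language: settings.language || 'en-US', // assumed default
  };
}

const config = buildAgentConfig({
  model: 'my-model',
  baseURL: 'https://llm.example.com/v1', // placeholder endpoint
  apiKey: 'YOUR_API_KEY',
});
console.log(config.language);
```

The returned object could then be passed to `new PageAgent(...)`, which keeps provider-specific values in one place when switching models.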

User Experience

  • Pretty UI with human-in-the-loop functionality
  • Optional Chrome extension for multi-page tasks

Use Cases

SaaS AI Copilot

  • Ship an AI copilot in your product in a few lines of code
  • No backend rewrite needed

Smart Form Filling

  • Turn 20-click workflows into one sentence
  • Perfect for ERP, CRM, and admin systems
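
A rough sketch of the "one sentence" idea: application code can assemble a single natural-language instruction from the values it already has, then hand that sentence to the agent. The `buildFormInstruction` helper and the form and field names are hypothetical; only `agent.execute(...)` comes from the Quick Start example.

```javascript
// Hypothetical helper: turn field/value pairs into one instruction string.
function buildFormInstruction(formName, values, submit = true) {
  const steps = Object.entries(values).map(
    ([label, value]) => `set "${label}" to "${value}"`
  );
  const action = submit ? ', then submit the form' : '';
  return `In the ${formName} form, ${steps.join(', ')}${action}.`;
}

const instruction = buildFormInstruction('New Customer', {
  'Company name': 'Acme Ltd',
  Country: 'Germany',
});
console.log(instruction);
// The resulting string could be passed to agent.execute(instruction).
```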

Accessibility

  • Make any web app accessible through natural language
  • Voice commands, screen readers, zero barrier

Multi-page Agent

  • Extend your agent's reach across browser tabs
  • With the optional Chrome extension

Quick Start

One-line Integration

The fastest way to try Page Agent is with the free demo LLM:

<script src="https://cdn.jsdelivr.net/npm/page-agent@1.5.5/dist/iife/page-agent.demo.js" crossorigin="anonymous"></script>

For users in China:

<script src="https://registry.npmmirror.com/page-agent/1.5.5/files/dist/iife/page-agent.demo.js" crossorigin="anonymous"></script>

⚠️ Note: This demo CDN uses our free testing LLM API. By using it, you agree to its terms.
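
If you would rather load the demo build conditionally (for example, only on a staging site), the script tag can also be created from JavaScript. The sketch below is one possible way to do that: the URL patterns are copied from the two script tags above, while the function names and the `useChinaMirror` flag are hypothetical.

```javascript
// URL patterns are taken from the script tags in this section; the function
// names and the `useChinaMirror` flag are assumptions for illustration.
function demoScriptUrl(version, useChinaMirror = false) {
  const file = 'dist/iife/page-agent.demo.js';
  return useChinaMirror
    ? `https://registry.npmmirror.com/page-agent/${version}/files/${file}`
    : `https://cdn.jsdelivr.net/npm/page-agent@${version}/${file}`;
}

// Appends a script tag to the given document. In a browser, pass `document`.
function injectDemoScript(doc, url) {
  const script = doc.createElement('script');
  script.src = url;
  script.crossOrigin = 'anonymous';
  doc.head.appendChild(script);
  return script;
}
```

In a browser this would be called as `injectDemoScript(document, demoScriptUrl('1.5.5'))`. The same terms apply as for the static tag, since the demo build still uses the free testing LLM API.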

NPM Installation

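Install the package:

npm install page-agent

Then create an agent and run a command:

import { PageAgent } from 'page-agent'

// Point the agent at your own LLM endpoint and credentials
const agent = new PageAgent({
  model: 'qwen3.5-plus',
  baseURL: 'https://dashscope.aliyuncs.com/example/v1',
  apiKey: 'YOUR_API_KEY',
  language: 'en-US',
})

// Top-level await requires an ES module context
await agent.execute('Click the login button')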
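
In application code you will probably want to guard the call against failures and long-running tasks. The wrapper below is a minimal sketch under two assumptions: that `execute()` returns a promise (as the `await` in the example suggests), and that its resolved value can simply be passed through. `runTask` and the 60-second default are hypothetical and not part of the Page Agent API.

```javascript
// Hypothetical wrapper around agent.execute(). The resolved value of
// execute() is not specified in this article, so it is returned unchanged.
async function runTask(agent, instruction, timeoutMs = 60000) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`Task timed out after ${timeoutMs} ms`)),
      timeoutMs
    );
  });
  try {
    // Whichever settles first wins: the agent's promise or the timeout.
    const result = await Promise.race([agent.execute(instruction), timeout]);
    return { ok: true, result };
  } catch (error) {
    return { ok: false, error };
  } finally {
    clearTimeout(timer);
  }
}
```

Note that the timeout only stops waiting; it does not cancel whatever the agent is doing on the page. Any object with an `execute` method works here, which also makes the wrapper easy to exercise with a stub in tests.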

For more programmatic usage, refer to the 📖 Documentation.

Technical Details

  • Built with TypeScript (82.1%), JavaScript (11.2%), CSS (6.2%), and HTML (0.5%)
  • 697 commits in the repository
  • Active community with 3.7k stars, 8 watchers, and 285 forks
  • Licensed under the MIT License

Community & Support

Contributing

We welcome contributions from the community! Follow our instructions in CONTRIBUTING.md for environment setup and local development.

Acknowledgments

Page Agent builds on browser-use, and the project acknowledges its excellent work on web automation and DOM interaction patterns.

#ai #web-automation #javascript #llm #gui-agent
