Page Agent: The GUI Agent Living in Your Webpage

Control web interfaces with natural language using Page Agent, a JavaScript GUI automation toolkit for developers.
Page Agent is a JavaScript tool for controlling web interfaces with natural language commands. It runs inside your web page as ordinary in-page JavaScript, so it needs no browser extension, Python, or headless browser.
Key Features
Easy Integration
- No need for browser extensions / Python / headless browser
- Just in-page JavaScript
- Everything happens in your web page
- The best tool for your agent to control web pages
Text-based DOM Manipulation
- No screenshots required
- No OCR or multi-modal LLMs needed
- No special permissions required
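To make the text-based idea above concrete, here is a minimal sketch of how a page could be flattened into indexed text lines that a text-only LLM can read. It is an illustration under stated assumptions, not Page Agent's actual serializer: the `serializeInteractive` function, the list of interactive tags, and the plain-object node shape (`{ tag, text, attrs, children }`) are all hypothetical stand-ins for real DOM traversal.

```javascript
// Tags treated as interactive in this sketch (an assumption, not Page Agent's actual list).
const INTERACTIVE_TAGS = new Set(['a', 'button', 'input', 'select', 'textarea']);

/**
 * Walk a DOM-like tree depth-first and return one text line per interactive element.
 * Each node is a plain object: { tag, text?, attrs?, children? }.
 */
function serializeInteractive(root) {
  const lines = [];
  const visit = (node) => {
    const tag = node.tag.toLowerCase();
    if (INTERACTIVE_TAGS.has(tag)) {
      const attrs = node.attrs || {};
      const label = node.text || attrs['aria-label'] || attrs.placeholder || '';
      // The index lets a model refer to the element in its reply, e.g. "click [1]".
      lines.push(`[${lines.length}] <${tag}> ${label}`.trim());
    }
    for (const child of node.children || []) visit(child);
  };
  visit(root);
  return lines.join('\n');
}

// Hypothetical page fragment standing in for document.body.
const page = {
  tag: 'body',
  children: [
    { tag: 'h1', text: 'Sign in' },
    { tag: 'input', attrs: { placeholder: 'Email' } },
    { tag: 'div', children: [{ tag: 'button', text: 'Log in' }] },
  ],
};

console.log(serializeInteractive(page));
// [0] <input> Email
// [1] <button> Log in
```

Because the model only sees short text lines like these, no screenshot, OCR step, or multi-modal model is involved in this kind of approach.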
Flexible AI Integration
- Bring your own LLMs
- Compatible with various AI models
- Support for custom configurations
User Experience
- Pretty UI with human-in-the-loop functionality
- Optional Chrome extension for multi-page tasks
Use Cases
SaaS AI Copilot
- Ship an AI copilot in your product in a few lines of code
- No backend rewrite needed
Smart Form Filling
- Turn 20-click workflows into one sentence
- Perfect for ERP, CRM, and admin systems
Accessibility
- Make any web app accessible through natural language
- Voice commands, screen readers, zero barrier
Multi-page Agent
- Extend your agent's reach across browser tabs
- With the optional Chrome extension
Quick Start
One-line Integration
The fastest way to try Page Agent is with our free demo LLM:
<script src="https://cdn.jsdelivr.net/npm/page-agent@1.5.5/dist/iife/page-agent.demo.js" crossorigin="true"></script>
For users in China:
<script src="https://registry.npmmirror.com/page-agent/1.5.5/files/dist/iife/page-agent.demo.js" crossorigin="true"></script>
⚠️ Note: This demo CDN uses our free testing LLM API. By using it, you agree to its terms.
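The two script tags above differ only in their host and path layout, so the same demo bundle can also be injected from JavaScript, for example from a console snippet. The sketch below is one way to do that; `demoScriptUrl` and `loadDemoScript` are hypothetical helper names, not part of Page Agent, and the URL patterns are copied from the tags above.

```javascript
const DEMO_PATH = 'dist/iife/page-agent.demo.js';

/** Build the demo bundle URL for a given version, optionally using the npmmirror host. */
function demoScriptUrl(version, useChinaMirror = false) {
  return useChinaMirror
    ? `https://registry.npmmirror.com/page-agent/${version}/files/${DEMO_PATH}`
    : `https://cdn.jsdelivr.net/npm/page-agent@${version}/${DEMO_PATH}`;
}

/** Append a <script> element for the demo bundle to the given document's <head>. */
function loadDemoScript(doc, version, useChinaMirror = false) {
  const script = doc.createElement('script');
  script.src = demoScriptUrl(version, useChinaMirror);
  // crossorigin="true" is not a defined keyword, so browsers treat it as "anonymous".
  script.crossOrigin = 'anonymous';
  doc.head.appendChild(script);
  return script;
}

// Only runs in a browser; skipped in environments without a DOM.
if (typeof document !== 'undefined') {
  loadDemoScript(document, '1.5.5');
}
```

The same terms apply as for the static tags, since the injected bundle still calls the free testing LLM API.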
NPM Installation
npm install page-agent
import { PageAgent } from 'page-agent'
const agent = new PageAgent({
  model: 'qwen3.5-plus',
  baseURL: 'https://dashscope.aliyuncs.com/example/v1',
  apiKey: 'YOUR_API_KEY',
  language: 'en-US',
})
await agent.execute('Click the login button')
For more programmatic usage, refer to the 📖 Documentation.
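A multi-step workflow can be expressed as several `agent.execute` calls run in order. The sketch below shows one way to sequence them and stop at the first failure. `runSteps` is a hypothetical helper, not part of the page-agent package, and it assumes that `execute` returns a promise that rejects when a step fails, which the snippet above does not confirm.

```javascript
/**
 * Run instructions one at a time, stopping at the first failure.
 * `agent` is any object with an async execute(instruction) method,
 * such as the PageAgent instance created above.
 */
async function runSteps(agent, instructions) {
  const completed = [];
  for (const instruction of instructions) {
    try {
      await agent.execute(instruction);
      completed.push(instruction);
    } catch (error) {
      // Assumption: a failed step surfaces as a rejected promise.
      return { completed, failedAt: instruction, error };
    }
  }
  return { completed, failedAt: null, error: null };
}

// Usage with a real agent, inside an async function or ES module:
//   const result = await runSteps(agent, [
//     'Click the login button',
//     'Open the settings page',
//   ]);
//   if (result.failedAt) console.warn('Stopped at:', result.failedAt);
```

Running steps sequentially rather than in parallel keeps each instruction working against the page state left by the previous one.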
Technical Details
- Built with TypeScript (82.1%), JavaScript (11.2%), CSS (6.2%), and HTML (0.5%)
- 697 commits in the repository
- Active community with 3.7k stars, 8 watchers, and 285 forks
- Licensed under MIT License
Community & Support
Contributing
We welcome contributions from the community! Follow our instructions in CONTRIBUTING.md for environment setup and local development.
Acknowledgments
Page Agent builds on browser-use and acknowledges its excellent work on web automation and DOM interaction patterns.