🌐 MCP Browser Agent [Node.js - Playwright - Claude Desktop]

Ivan Luna
AI Agent, Node.js, MCP, Playwright, Claude Desktop, Browser Automation, API Client, Integration
10 May, 2025

MCP Browser Agent is a Model Context Protocol (MCP) integration that gives Claude Desktop autonomous browser automation capabilities. It lets Claude navigate web pages, manipulate DOM elements, execute JavaScript, and send API requests, all from natural-language instructions.

Demo

Timestamps:

00:00 - Google Search for MCP: Navigation to Google homepage and search for “Model Context Protocol”.
00:33 - Screenshot Capture: Taking a screenshot of the search results with a custom filename and showcasing it in Finder. Shows how Claude can capture and save visual content from web pages during browser automation.
01:00 - Wikipedia Search: Navigation to Wikipedia.org and search for “Model Context Protocol”. Illustrates Claude’s ability to interact with different websites and their search functionality.
01:38 - Dropdown Menu Interaction I: Navigation to a test website (the-internet.herokuapp.com/dropdown) and selection of “Option 1” from a dropdown menu. Demonstrates Claude’s capability to interact with form elements and make selections.
01:56 - Dropdown Menu Interaction II: Changing the selection to “Option 2” from the same dropdown menu. Shows Claude’s ability to manipulate the same form element multiple times and make different selections.
02:09 - Login Form Completion: Navigation to a login page (the-internet.herokuapp.com/login) and filling in the username field with “tomsmith” and password field with “SuperSecretPassword!”. Demonstrates form filling automation.
02:28 - Login Submission: Submitting the login credentials and completing the authentication process. Shows Claude’s ability to trigger form submissions and navigate through multi-step processes.
02:36 - API Request Execution: Performing a GET request to JSONPlaceholder API endpoint. Demonstrates Claude’s capability to make direct API calls and process the returned data through the MCP integration.

Key Features:

Advanced Browser Automation: Navigate to any URL, capture screenshots, interact with DOM elements, and execute JavaScript directly from Claude
Powerful API Client: Execute HTTP requests (GET, POST, PUT, PATCH, DELETE) with configurable headers and body content
Resource Management: Access browser console logs and screenshots through the MCP resource interface
Multi-browser Support: Compatible with Chrome, Firefox, Microsoft Edge, and WebKit (Safari engine)
Persistent Session: Maintains browser state across multiple commands for complex workflows
Error Handling: Provides detailed feedback and recovery options for automation challenges
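
To make the API client feature concrete, here is a minimal sketch of how a request with configurable headers and body could be assembled before it is sent. The names buildApiRequest and ApiRequest, the JSON default for object bodies, and the example URL are assumptions made for illustration; they are not taken from the project's source.

```typescript
// Hypothetical request builder; not the project's actual implementation.
type HttpMethod = "GET" | "POST" | "PUT" | "PATCH" | "DELETE";

interface ApiRequest {
  method: HttpMethod;
  url: string;
  headers: Record<string, string>;
  body?: string;
}

function buildApiRequest(
  method: HttpMethod,
  url: string,
  options: { headers?: Record<string, string>; body?: unknown } = {},
): ApiRequest {
  // Copy the caller's headers so the input object is never mutated.
  const headers: Record<string, string> = { ...options.headers };
  const request: ApiRequest = { method, url, headers };

  // Assumption: GET requests carry no body in this sketch.
  if (method === "GET" || options.body === undefined) {
    return request;
  }

  if (typeof options.body === "string") {
    // String bodies are passed through unchanged.
    request.body = options.body;
    return request;
  }

  // Any other body is serialized as JSON, with a default Content-Type
  // unless the caller already supplied one.
  request.body = JSON.stringify(options.body);
  const hasContentType = Object.keys(headers).some(
    (key) => key.toLowerCase() === "content-type",
  );
  if (!hasContentType) {
    headers["Content-Type"] = "application/json";
  }
  return request;
}
```

The resulting object maps directly onto the arguments of an HTTP client call, for example buildApiRequest("POST", "https://api.example.com/items", { body: { title: "demo" } }).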

Technical Overview:

Node.js Backend: Efficient server implementation with TypeScript
Model Context Protocol (MCP): Seamless integration with Claude Desktop
Playwright Framework: Industry-standard browser automation library
Multi-browser Support: Configure your preferred browser engine
Resource Exposure: Browser logs and screenshots available as queryable resources
Stateful Session Management: Persistent browser context between commands
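
One common way to keep a browser context alive between commands is to create it lazily on first use and hand the same instance to every later command. The sketch below illustrates that pattern in isolation. LazySession is a hypothetical name, and the project may manage its session differently.

```typescript
// Hypothetical lazy session holder; illustrates the pattern only.
class LazySession<T> {
  private readonly create: () => Promise<T>;
  private pending: Promise<T> | undefined;

  constructor(create: () => Promise<T>) {
    this.create = create;
  }

  // Returns the shared session, creating it on the first call.
  // Concurrent callers receive the same promise, so creation runs once.
  get(): Promise<T> {
    if (this.pending === undefined) {
      this.pending = this.create().catch((error: unknown) => {
        // Forget a failed attempt so the next command can retry.
        this.pending = undefined;
        throw error;
      });
    }
    return this.pending;
  }

  // Drops the cached session, e.g. after the browser has been closed.
  reset(): void {
    this.pending = undefined;
  }
}
```

With Playwright, the factory passed to the constructor could be something like () => chromium.launch(), so the browser starts on the first tool call and is reused afterwards.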

Installation Guide:

# Clone repository
git clone https://github.com/imprvhub/mcp-browser-agent
cd mcp-browser-agent

# Install dependencies
npm install

# Build the project
npm run build

MCP Server Configuration:

Configure Claude Desktop to integrate with the Browser Agent MCP:

{
  "mcpServers": {
    "browserAgent": {
      "command": "node",
      "args": [
        "ABSOLUTE_PATH_TO_DIRECTORY/mcp-browser-agent/dist/index.js",
        "--browser",
        "chrome"
      ]
    }
  }
}

Replace ABSOLUTE_PATH_TO_DIRECTORY with the complete path to your installation directory.

Browser Selection:

The MCP Browser Agent supports multiple browser types:

Chrome (default): Uses installed Chrome browser
Firefox: Uses Firefox Nightly browser
Edge: Uses Microsoft Edge
WebKit: Uses WebKit engine (Safari-like experience)

You can specify your preferred browser through:

Command line argument: --browser chrome
Environment variable: MCP_BROWSER_TYPE=firefox
Configuration file: .mcp_browser_agent_config.json
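
The three sources can be combined into a single lookup. The sketch below assumes a precedence of command line argument, then environment variable, then configuration file, then the Chrome default. That order, the browserType key inside the configuration file, and the function names are assumptions for illustration, not confirmed project behavior.

```typescript
type BrowserType = "chrome" | "firefox" | "edge" | "webkit";

const SUPPORTED: readonly BrowserType[] = ["chrome", "firefox", "edge", "webkit"];

// Returns the matching supported browser, or undefined for unknown values.
function asBrowserType(value: string | undefined): BrowserType | undefined {
  const normalized = value?.trim().toLowerCase();
  return SUPPORTED.find((browser) => browser === normalized);
}

// Hypothetical resolver. Assumed precedence: CLI > env > config file > default.
function resolveBrowserType(
  argv: readonly string[],
  env: Readonly<Record<string, string | undefined>>,
  fileConfig: { browserType?: string } = {}, // "browserType" key is an assumption
): BrowserType {
  const flagIndex = argv.indexOf("--browser");
  const fromCli = flagIndex >= 0 ? asBrowserType(argv[flagIndex + 1]) : undefined;

  return (
    fromCli ??
    asBrowserType(env["MCP_BROWSER_TYPE"]) ??
    asBrowserType(fileConfig.browserType) ??
    "chrome" // documented default
  );
}
```

In a Node.js entry point this could be called as resolveBrowserType(process.argv.slice(2), process.env, parsedConfig), where parsedConfig is the parsed content of .mcp_browser_agent_config.json.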

Resource Access:

The MCP Browser Agent exposes the following resources:

browser://logs: Access browser console logs
screenshot://[name]: Access screenshots by name
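
The two URI shapes above can be routed with a small parser. The following is a sketch under stated assumptions: the ResourceRef and ResourceState types, the in-memory storage, and the error messages are hypothetical and only show one way such URIs could be resolved.

```typescript
type ResourceRef = { kind: "logs" } | { kind: "screenshot"; name: string };

// Maps a resource URI to a typed reference, or undefined if unsupported.
function parseResourceUri(uri: string): ResourceRef | undefined {
  if (uri === "browser://logs") {
    return { kind: "logs" };
  }
  const prefix = "screenshot://";
  if (uri.startsWith(prefix) && uri.length > prefix.length) {
    return { kind: "screenshot", name: uri.slice(prefix.length) };
  }
  return undefined;
}

// Hypothetical in-memory state collected during the browser session.
interface ResourceState {
  logs: readonly string[];
  screenshots: ReadonlyMap<string, string>; // name -> base64-encoded image
}

// Resolves a URI to its content: joined log lines or image data.
function readResource(uri: string, state: ResourceState): string {
  const ref = parseResourceUri(uri);
  if (ref === undefined) {
    throw new Error(`Unsupported resource URI: ${uri}`);
  }
  if (ref.kind === "logs") {
    return state.logs.join("\n");
  }
  const image = state.screenshots.get(ref.name);
  if (image === undefined) {
    throw new Error(`No screenshot named "${ref.name}"`);
  }
  return image;
}
```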

System Requirements:

Node.js 16 or higher
Claude Desktop application
Playwright dependencies (auto-installed)
Supported browser (Chrome, Firefox, Edge, or WebKit)

Use Cases:

Web Research: Direct Claude to navigate websites and extract specific information
Task Automation: Automate multi-step web processes with AI-guided workflow
Testing Assistance: Help with web application testing and verification
API Exploration: Browse API documentation and try out requests against the documented endpoints
Content Extraction: Retrieve content from websites with complex navigation
Form Automation: Fill and submit complex forms with validation logic
Technical Demonstrations: Create interactive demos of web applications
Data Collection: Gather information from multiple sources for analysis

Technical Implementation:

The MCP Browser Agent consists of four main components:

Server: Initializes the MCP server according to the Model Context Protocol standard
Tools Registry: Defines browser and API tool schemas with parameters
Request Handlers: Handle MCP requests for tools and resources
Executor: Implements browser automation functions using Playwright
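
As a rough illustration of how the Tools Registry and Request Handlers could fit together, the sketch below checks an incoming tool call against a registered schema before it would be passed to the Executor. The tool names, the ToolSchema shape, and validateToolCall are hypothetical. The real project defines its own schemas.

```typescript
interface ToolSchema {
  description: string;
  required: readonly string[];
}

// Hypothetical registry entries; the actual tool names may differ.
const toolRegistry: ReadonlyMap<string, ToolSchema> = new Map<string, ToolSchema>([
  ["browser_navigate", { description: "Open a URL in the current page", required: ["url"] }],
  ["browser_screenshot", { description: "Capture the page under a given name", required: ["name"] }],
  ["api_get", { description: "Send an HTTP GET request", required: ["url"] }],
]);

// Returns a list of problems; an empty list means the call can be executed.
function validateToolCall(
  name: string,
  args: Readonly<Record<string, unknown>>,
): string[] {
  const schema = toolRegistry.get(name);
  if (schema === undefined) {
    return [`Unknown tool: ${name}`];
  }
  return schema.required
    .filter((param) => args[param] === undefined)
    .map((param) => `Missing required parameter "${param}" for ${name}`);
}
```

A request handler built this way would return the collected messages to Claude as an error result when the list is non-empty, and otherwise forward the call to the executor.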

Unlike basic integrations, MCP Browser Agent behaves as an AI agent rather than a one-shot tool: it maintains persistent state, captures detailed logs, and supports chained operations, so Claude can carry out multi-step interaction sequences.

This integration demonstrates the potential of combining large language models with browser automation: Claude can interact with the web much as a human would, but with programmatic precision and speed.