gemini-computer-use-mcp

block-88/gemini-computer-use-mcp

Summary

An MCP server that enables browser control agents to plan and execute UI tasks using Gemini Computer Use.

Gemini Computer Use MCP

![License: MIT](https://opensource.org/licenses/MIT)

An MCP (Model Context Protocol) server for building browser-control agents using Gemini Computer Use. This project enables agents to plan and perform UI actions in a browser.

✨ Features

  • Computer Use (Browser Control): Provides an MCP tool (run_browser_task) to instruct a browser to perform a high-level task using the Gemini Computer Use model.
  • Generative AI Integration: Utilizes @google/genai for planning and executing computer-use steps.
  • stdio Transport: Communicates using the standard MCP stdio transport mechanism.

Learn more in the official Gemini Computer Use documentation.

🚀 Usage

This project runs as an MCP server. It's typically invoked by an MCP client or controller.

Connecting an MCP Client

Point your MCP client to this server's executable. If your client supports a config file, use the following configs:

stdio Mode

// .mcp.json
{
  "mcpServers": {
    "gemini-computer-use": {
      "type": "stdio",
      "timeout": 300,
      "command": "npx",
      "args": ["--yes", "gemini-computer-use-mcp@latest"],
      "env": {
        "VERTEX_PROJECT_KEY": "vertex-project-key"
      }
    }
  }
}

# ~/.codex/config.toml
tool_timeout_sec = 300

[mcp_servers.gemini-computer-use]
command = "npx"
args = ["--yes", "gemini-computer-use-mcp@latest"]

[mcp_servers.gemini-computer-use.env]
VERTEX_PROJECT_KEY = "vertex-project-key"

SSE Mode

Start the server with:

VERTEX_PROJECT_KEY=vertex-project-key npx --yes gemini-computer-use-mcp@latest --server

Then add:

// .mcp.json
{
  "mcpServers": {
    "gemini-computer-use": {
      "type": "sse",
      "timeout": 300,
      "url": "http://localhost:8888/sse"
    }
  }
}

Streamable HTTP Mode

Start the server with the same command used for SSE mode:

VERTEX_PROJECT_KEY=vertex-project-key npx --yes gemini-computer-use-mcp@latest --server

Then add:

// .mcp.json
{
  "mcpServers": {
    "gemini-computer-use": {
      "type": "http",
      "timeout": 300,
      "url": "http://localhost:8888/mcp"
    }
  }
}

# ~/.codex/config.toml
tool_timeout_sec = 300

[mcp_servers.gemini-computer-use]
url = "http://localhost:8888/mcp"

Environment Variables

| Variable | Description | Required | Default |
| --- | --- | --- | --- |
| VERTEX_PROJECT_KEY | Vertex AI project key (preferred over GEMINI_API_KEY) | Yes, unless GEMINI_API_KEY is set | |
| GEMINI_API_KEY | Your Gemini API key | Yes, unless VERTEX_PROJECT_KEY is set | |
| MODEL | The model ID to use | No | gemini-2.5-computer-use-preview-10-2025 |
| PROJECT_PATH | Filesystem path used by some tools (defaults to current working directory) | No | (current working directory) |
| PORT | Server port to use (only for streamable HTTP) | No | 8888 |

Note: Either GEMINI_API_KEY or VERTEX_PROJECT_KEY must be provided (see src/helpers/config.ts).
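The precedence and defaults in the table can be sketched as a small resolver. This is an illustration only, not the actual contents of src/helpers/config.ts: the names `ServerConfig`, `Env`, and `resolveConfig`, the `authMode` field, and the port range check are assumptions made for the example.

```typescript
// Hypothetical sketch of the rules in the table above.
// Not the project's real config code; all names here are illustrative.
interface ServerConfig {
  authMode: "vertex" | "gemini";
  apiKey: string;
  model: string;
  projectPath: string;
  port: number;
}

type Env = Record<string, string | undefined>;

function resolveConfig(env: Env, cwd: string): ServerConfig {
  const vertexKey = env.VERTEX_PROJECT_KEY;
  const geminiKey = env.GEMINI_API_KEY;

  // VERTEX_PROJECT_KEY is preferred when both keys are present.
  const apiKey = vertexKey || geminiKey;
  if (!apiKey) {
    throw new Error("Set either VERTEX_PROJECT_KEY or GEMINI_API_KEY");
  }
  const authMode: "vertex" | "gemini" = vertexKey ? "vertex" : "gemini";

  // PORT falls back to 8888; the range check is an assumption of this sketch.
  const port = Number(env.PORT || "8888");
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid PORT: ${env.PORT}`);
  }

  return {
    authMode,
    apiKey,
    model: env.MODEL || "gemini-2.5-computer-use-preview-10-2025",
    projectPath: env.PROJECT_PATH || cwd,
    port,
  };
}

// With only GEMINI_API_KEY set, every other field takes its default.
const exampleConfig = resolveConfig({ GEMINI_API_KEY: "your-key" }, "/tmp/project");
// exampleConfig.authMode is "gemini" and exampleConfig.port is 8888
```

Passing the environment and working directory as arguments, rather than reading them globally, keeps the sketch easy to test.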

Tools

Once connected, the client can invoke the tools provided by this server.

run_browser_task

| Argument | Description | Required | Default |
| --- | --- | --- | --- |
| task | The high-level task to perform | Yes | |
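MCP clients normally build the tool call for you, but it can help to see what goes over the wire. The sketch below constructs a JSON-RPC `tools/call` request for run_browser_task in the shape the MCP specification describes. The helper name `buildRunBrowserTaskRequest` and the sample task text are hypothetical.

```typescript
// Illustrative only: the JSON-RPC message an MCP client would send
// to invoke run_browser_task. The helper name is hypothetical.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: {
    name: string;
    arguments: Record<string, unknown>;
  };
}

function buildRunBrowserTaskRequest(id: number, task: string): ToolCallRequest {
  // "task" is the tool's only argument and it is required.
  if (task.trim() === "") {
    throw new Error("task is required");
  }
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: {
      name: "run_browser_task",
      arguments: { task },
    },
  };
}

const exampleRequest = buildRunBrowserTaskRequest(
  1,
  "Open example.com and report the page heading",
);
// JSON.stringify(exampleRequest) gives the payload sent over the transport.
```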

This tool leverages Gemini Computer Use to plan and perform UI actions to accomplish the provided task. It implements:

  • Automatic browser management: Checks for an existing browser at localhost:9222 and starts a new instance if none is found
  • Agent loop: Repeatedly captures a screenshot, sends it to Gemini, receives UI actions, and executes them until the task is complete
  • All supported UI actions: mouse movement, clicks, keyboard input, scrolling, text extraction, and more
  • Safety guidelines: Follows Gemini's safety best practices from the official documentation

See the official Gemini Computer Use documentation for capabilities and safety considerations.
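The agent loop described above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the server's implementation: the `UiAction`, `Browser`, and `Planner` types, the `runAgentLoop` function, and the `maxTurns` limit are all hypothetical, and the planner stands in for the call to the Gemini Computer Use model.

```typescript
// Hypothetical types: a small subset of UI actions plus a "done" signal.
type UiAction =
  | { kind: "click"; x: number; y: number }
  | { kind: "type"; text: string }
  | { kind: "scroll"; deltaY: number }
  | { kind: "done"; summary: string };

interface Browser {
  screenshot(): Promise<string>; // assumed to return a base64-encoded image
  execute(action: UiAction): Promise<void>;
}

interface Planner {
  // Stand-in for the model call: given the task, the latest screenshot,
  // and the actions taken so far, return the next actions to perform.
  nextActions(task: string, screenshot: string, history: UiAction[]): Promise<UiAction[]>;
}

async function runAgentLoop(
  task: string,
  browser: Browser,
  planner: Planner,
  maxTurns = 20, // assumed safety limit so the loop cannot run forever
): Promise<string> {
  const history: UiAction[] = [];
  for (let turn = 0; turn < maxTurns; turn++) {
    // 1. Capture the current state of the page.
    const screenshot = await browser.screenshot();
    // 2. Ask the planner for the next UI actions.
    const actions = await planner.nextActions(task, screenshot, history);
    // 3. Execute each action, stopping as soon as the planner reports completion.
    for (const action of actions) {
      if (action.kind === "done") {
        return action.summary;
      }
      await browser.execute(action);
      history.push(action);
    }
  }
  throw new Error(`Task not finished after ${maxTurns} turns`);
}
```

Keeping the browser and planner behind interfaces means the loop can be exercised with fakes, without a real browser or an API key.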

⚙️ Development

Prerequisites

  • Git

Steps

  1. Install dependencies:
     npm install
  2. Configuration:
     • Set GEMINI_API_KEY or VERTEX_PROJECT_KEY. Optionally set MODEL and PROJECT_PATH.
  3. Run:
     • In IDEs: Reload the window and check that the MCP server is connected.
     • Manually: Run ./run in your terminal.

📜 License

This project is licensed under the MIT License - see the LICENSE file for details. Copyright (c) 2025 Khoa Nguyen
