iceberg-mcp-server-hive

frothkoetter/iceberg-mcp-server-hive
0 starsApache-2.0Community

Install to Claude Code

This server doesn't publish a one-line install command. Follow the setup in the source repository.

Summary

Provides read-only SQL access to Apache Iceberg tables via HiveServer2, enabling querying, schema discovery, and database listing on Cloudera Data Platform.

README.md

Cloudera Iceberg MCP Server (via Hive)

Fork of cloudera/iceberg-mcp-server that uses Apache Hive (HiveServer2) instead of Impala for read-only access to Iceberg tables on CDP.

MCP Tools

| Tool | Description | |------|-------------| | execute_query(query) | Run read-only SQL (SELECT, SHOW, DESCRIBE, WITH, EXPLAIN) | | get_schema(database?) | List tables in the configured or given database | | list_databases() | List all visible Hive databases | | list_iceberg_snapshots(database, table) | Snapshot history (db.table.HISTORY, with TBLPROPERTIES fallback) | | list_iceberg_refs(database, table) | Branches and tags (db.table.REFS) | | create_iceberg_branch(...) | Create branch from current state, snapshot ID, or timestamp | | drop_iceberg_branch(...) | Drop a branch | | fast_forward_iceberg_branch(...) | Fast-forward branch hierarchy | | query_iceberg_branch(...) | Read from db.table.branch_<name> | | execute_iceberg_branch_dml(...) | INSERT / UPDATE / DELETE on a branch |

Iceberg branching (and tagging) is supported in Hive on CDP, not Impala. See branching and tagging.

Audit / write branch workflow

  1. list_iceberg_snapshots — pick a snapshot ID or timestamp
  2. list_iceberg_refs — inspect existing branches/tags
  3. create_iceberg_branch — fork an audit branch (FOR SYSTEM_VERSION or current head)
  4. query_iceberg_branch — read branch state
  5. execute_iceberg_branch_dml — write changes on the branch only
  6. fast_forward_iceberg_branch — advance a branch when ready
  7. drop_iceberg_branch — cleanup

Branch refs use lowercase branch_ prefix: mydb.mytable.branch_audit.

Configuration

Connection uses impyla against HiveServer2 (HTTP transport for CDP/Knox).

Example JDBC URL from CDP Data Warehouse:

jdbc:hive2://hs2-cdw-aw-se-hive.dw-se-sandbox-aws.a465-9q4k.cloudera.site/default;transportMode=http;httpPath=cliservice;ssl=true

Maps to MCP env vars:

| JDBC / CDP | Env var | |------------|---------| | Host in URL | HIVE_HOST | | Path after host (/default) | HIVE_DATABASE | | httpPath=cliservice | HIVE_HTTP_PATH | | transportMode=http | HIVE_USE_HTTP_TRANSPORT=true | | ssl=true | HIVE_USE_SSL=true | | Port (443 implied) | HIVE_PORT=443 | | LDAP user/password | HIVE_USER, HIVE_PASSWORD |

| Variable | Default | Description | |----------|---------|-------------| | HIVE_HOST | — | HiveServer2 or Knox gateway host | | HIVE_PORT | 443 | HS2 port (443 for Knox HTTP) | | HIVE_USER | — | LDAP / service user | | HIVE_PASSWORD | — | Password | | HIVE_DATABASE | default | Default database for SHOW TABLES | | HIVE_AUTH_MECHANISM | LDAP | impyla auth mechanism | | HIVE_USE_HTTP_TRANSPORT | true | HTTP transport (typical on CDP) | | HIVE_HTTP_PATH | cliservice | Knox / HS2 HTTP path | | HIVE_USE_SSL | true | TLS | | MCP_TRANSPORT | stdio | stdio, http, or sse |

Claude Desktop / Agent Studio

Cloudera Agent Studio (recommended)

Agent Studio only supports stdio MCP servers launched with uvx (Python) or npx (Node.js). Use a git URL so the runtime can install the package; do not use uv run unless the repo is checked out on the same machine.

{
  "mcpServers": {
    "iceberg-mcp-server-hive": {
      "command": "uvx",
      "args": [
        "--from",
        "git+https://github.com/frothkoetter/iceberg-mcp-server-hive@main",
        "run-server"
      ],
      "env": {
        "HIVE_HOST": "hs2-cdw-aw-se-hive.dw-se-sandbox-aws.a465-9q4k.cloudera.site",
        "HIVE_PORT": "443",
        "HIVE_USER": "YOUR_USER",
        "HIVE_PASSWORD": "YOUR_PASSWORD",
        "HIVE_DATABASE": "default",
        "HIVE_USE_HTTP_TRANSPORT": "true",
        "HIVE_HTTP_PATH": "cliservice",
        "HIVE_USE_SSL": "true",
        "HIVE_AUTH_MECHANISM": "LDAP"
      }
    }
  }
}

Registration tips

  1. Use placeholder credentials during catalog registration; provide real HIVE_USER / HIVE_PASSWORD when attaching the MCP server to a workflow agent.
  2. If you see "We could not figure out the tools offered by the MCP server", the server may still work in workflows — Agent Studio documents occasional tool-discovery failures. Add the MCP server to your agent manually and select tools there.
  3. Avoid writing to stdout from wrapper scripts — stdio transport uses stdout for JSON-RPC. This server logs only through the MCP SDK.
  4. Ensure the Agent Studio environment can reach HIVE_HOST on port 443 (Knox / Hive VW).

See also Cloudera MCP registration docs.

Claude Desktop (local checkout)

{
  "mcpServers": {
    "iceberg-mcp-server-hive": {
      "command": "uv",
      "args": ["run", "run-server"],
      "env": {
        "HIVE_HOST": "hs2-your-cluster.example.cloudera.site",
        "HIVE_PORT": "443",
        "HIVE_USER": "username",
        "HIVE_PASSWORD": "password",
        "HIVE_DATABASE": "default"
      }
    }
  }
}

Local development

git clone https://github.com/<your-org>/iceberg-mcp-server-hive.git
cd iceberg-mcp-server-hive
uv sync --dev
export HIVE_HOST=... HIVE_USER=... HIVE_PASSWORD=...
uv run run-server

Differences from upstream (Impala)

  • Environment variables use HIVE_ instead of IMPALA_
  • get_schema returns {database, tables} and accepts an optional database name
  • Added list_databases tool
  • execute_query returns {columns, rows} for SELECT results

Examples

See ./examples for LangChain and OpenAI SDK notebooks (update env vars from IMPALA_ to HIVE_).

Copyright (c) 2025 - Cloudera, Inc. All rights reserved.

Related MCP servers

Browse all →