Remote OpenClaw Blog

Best MCP Servers for Data Engineers in 2026: Top Picks

9 min read · 20 October 2018

Postgres MCP Pro by Crystal DBA is the best MCP server for data engineers in 2026: it gives an AI agent configurable read-only or read-write access to Postgres plus index tuning and explain-plan analysis, which covers the database work data engineers do every day. This list ranks the 10 MCP servers that earn a slot in a data engineering setup, spanning transactional databases, cloud warehouses, dbt, and pipeline observability, each with a verified install command. For a role-agnostic ranking, see the broader best MCP servers in 2026 list and our best MCP servers for Claude Code guide; this post curates what belongs in a data engineer's stack.

How We Ranked These for Data Engineers

This ranking optimizes for one job: making an AI agent useful across the data lifecycle, from source databases to warehouse to transformation to monitoring. We scored each server on fit for data workflows (schema introspection, query execution, migration authoring, lineage), GitHub stars and maintenance checked against each repository in early July 2026, and install friction. Servers that shine for general coding but add little to a data pipeline live in the complete ranked MCP list instead.

All install commands follow the syntax in the official Claude Code MCP documentation, and the same servers work in any MCP client. Our directory indexes 13,870 MCP servers, and your agent can search that directory for you.

The 10 Best MCP Servers for Data Engineers

These ten cover transactional databases, document stores, cache, cloud warehouses, transformation, and pipeline observability: the full path from raw source to trusted table.

1. Postgres MCP Pro: safe database access (best overall)

Postgres MCP Pro by Crystal DBA (~3,000 stars, source on GitHub) gives an agent configurable read-only or read-write Postgres access plus index tuning, health checks, and explain-plan analysis. It is #1 because Postgres is one of the most common source-of-truth databases and the restricted access mode is its standout feature: the agent inspects real schemas and data distributions while drafting migrations, without the ability to write to production.

claude mcp add --env DATABASE_URI=postgresql://user:pass@localhost:5432/db postgres -- uvx postgres-mcp --access-mode=restricted

Setup note: keep it at local scope. A connection string is a credential and should never land in a committed .mcp.json. Add OPENAI_API_KEY only if you want the experimental index-tuning advisor.

2. dbt-mcp: the transformation and semantic layer

The official dbt-mcp from dbt Labs is the only pick that reaches your models, lineage, and metrics rather than raw tables. It lets an agent query the dbt Semantic Layer, read project metadata, and run dbt commands across dbt Core, Cloud, and Fusion. For anyone whose source of truth is dbt, this is the highest-leverage server on the list.

claude mcp add dbt -- uvx --env-file /absolute/path/to/.env dbt-mcp

Setup note: the .env file supplies DBT_HOST, DBT_TOKEN (a service token), and the numeric DBT_PROD_ENV_ID / DBT_ACCOUNT_ID. For dbt Core, set DBT_PROJECT_DIR instead.
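As a rough sketch of the file that command points at, the helper below writes a dbt Cloud style .env containing the four variable names from the setup note. The function name write_dbt_mcp_env, the default path, and every value are placeholders assumed for illustration; substitute your own host, token, and IDs.

```shell
# Hypothetical helper: write the .env file that `uvx --env-file` reads.
# Variable names come from the setup note; every value is a placeholder.
write_dbt_mcp_env() {
  env_file="${1:-$HOME/.config/dbt-mcp/.env}"
  mkdir -p "$(dirname "$env_file")"
  (
    umask 077  # a newly created file is readable by its owner only
    cat > "$env_file" <<'EOF'
DBT_HOST=cloud.getdbt.com
DBT_TOKEN=replace-with-your-service-token
DBT_PROD_ENV_ID=000000
DBT_ACCOUNT_ID=000000
EOF
  )
  # Print the path so it can be pasted into the --env-file argument.
  printf '%s\n' "$env_file"
}
```

Run write_dbt_mcp_env once, edit the values in the file it prints, then pass that absolute path to --env-file. Keeping the token in a separate owner-only file means it never appears in the claude mcp add command itself.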

3. ClickHouse MCP: real-time analytics queries

The official ClickHouse MCP (~810 stars) runs SQL and introspects schemas against ClickHouse or embedded chDB. It is the go-to for real-time and columnar analytics workloads where an agent needs to explore event data or validate an aggregation before you ship a dashboard.

claude mcp add clickhouse -- uv run --with mcp-clickhouse --python 3.10 mcp-clickhouse

Setup note: supply CLICKHOUSE_HOST, CLICKHOUSE_USER, and CLICKHOUSE_PASSWORD as environment variables, either by passing one --env flag per variable or by launching the server through a wrapper script. Keep it at local scope.
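One way to combine both options from the setup note is a small wrapper that passes each variable with its own --env flag. This is a sketch under assumptions: add_clickhouse_mcp is a hypothetical function name, the values are read from whatever shell calls it, and the flag order mirrors the other install commands in this post.

```shell
# Hypothetical wrapper around the install command above. It copies the three
# ClickHouse variables from the calling shell into the server registration.
add_clickhouse_mcp() {
  if [ -z "${CLICKHOUSE_HOST:-}" ] || [ -z "${CLICKHOUSE_USER:-}" ] || [ -z "${CLICKHOUSE_PASSWORD:-}" ]; then
    echo "set CLICKHOUSE_HOST, CLICKHOUSE_USER, and CLICKHOUSE_PASSWORD first" >&2
    return 1
  fi
  # One --env flag per variable; --scope local keeps the entry out of .mcp.json.
  claude mcp add \
    --env "CLICKHOUSE_HOST=$CLICKHOUSE_HOST" \
    --env "CLICKHOUSE_USER=$CLICKHOUSE_USER" \
    --env "CLICKHOUSE_PASSWORD=$CLICKHOUSE_PASSWORD" \
    --scope local \
    clickhouse -- uv run --with mcp-clickhouse --python 3.10 mcp-clickhouse
}
```

Export the three variables in your shell, then call add_clickhouse_mcp. The early check stops the wrapper from registering a server that would fail to connect because a credential was missing.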

4. MongoDB MCP: documents and Atlas administration

The official MongoDB MCP (~1,100 stars) queries collections and administers Atlas clusters through natural language. Data engineers pulling from document sources get schema inference and query building without hand-writing aggregation pipelines.

claude mcp add --env MDB_MCP_CONNECTION_STRING=mongodb://localhost:27017 mongodb -- npx -y mongodb-mcp-server@latest --readOnly

Setup note: the --readOnly flag disables writes and is the right default for exploration. Use Atlas API credentials (MDB_MCP_API_CLIENT_ID and MDB_MCP_API_CLIENT_SECRET) instead of a connection string for cluster administration.

5. MySQL MCP: read-only-by-default SQL access

mcp-server-mysql by benborla (~1,900 stars) is the most-starred MySQL server and ships with writes disabled by default. The agent inspects schemas and runs SELECT queries safely; you opt into inserts, updates, and deletes separately, one operation type at a time.

claude mcp add --env MYSQL_HOST=127.0.0.1 --env MYSQL_USER=root --env MYSQL_DB=analytics mysql -- npx -y @benborla29/mcp-server-mysql

Setup note: writes stay off until you set ALLOW_INSERT_OPERATION, ALLOW_UPDATE_OPERATION, or ALLOW_DELETE_OPERATION. Leave them unset for analytics work.

6. Neon MCP: serverless Postgres and branching

The official Neon MCP (~615 stars) manages serverless Postgres: create and branch databases, run SQL, and drive the Neon Management API. Branching a database from an agent prompt gives you isolated test data without touching production.

claude mcp add --transport http neon https://mcp.neon.tech/mcp

Setup note: the remote endpoint uses OAuth, so there is no key to manage. Install at user scope if you work across multiple Neon projects.

7. Redis MCP: cache and streaming data structures

The official Redis MCP (~540 stars) is a natural-language interface to store, query, and search Redis data structures including hashes, streams, JSON, and vectors. It matters for pipelines that use Redis as a cache, feature store, or stream buffer.

claude mcp add redis -- uvx --from redis-mcp-server@latest redis-mcp-server --url redis://localhost:6379/0

Setup note: pass the connection via --url or the REDIS_HOST / REDIS_PWD environment variables. Keep it at local scope.

8. BigQuery MCP: warehouse schema and SQL

mcp-server-bigquery (~127 stars, the smallest on this list) inspects BigQuery dataset and table schemas and executes SQL against Google BigQuery. It is the practical entry point for GCP-based warehouses when you want the agent to understand table shapes before writing a query.

claude mcp add bigquery -- uvx mcp-server-bigquery --project my-gcp-project --location US

Setup note: authentication uses Application Default Credentials from gcloud auth, or set BIGQUERY_KEY_FILE to a service-account JSON path. The --project and --location flags are required.

9. Grafana MCP: pipeline observability in your session

The official Grafana MCP (~3,200 stars) lets an agent query dashboards, datasources, alerts, and logs. For data engineers it closes the loop on freshness and quality: "which pipeline alert fired overnight" becomes a one-line prompt instead of a dashboard hunt.

claude mcp add --env GRAFANA_URL=http://localhost:3000 --env GRAFANA_SERVICE_ACCOUNT_TOKEN=glsa_xxx grafana -- docker run --rm -i -e GRAFANA_URL -e GRAFANA_SERVICE_ACCOUNT_TOKEN grafana/mcp-grafana -t stdio

Setup note: create a Grafana service account token scoped to read access. See our monitoring and observability MCP guide for the wider observability stack.

10. DuckDB MCP: local analytics without a warehouse

mcp-server-duckdb (~177 stars) gives an agent SQL access to a local DuckDB file, which is ideal for ad-hoc analysis of Parquet and CSV without standing up a warehouse. It ranks last only because it is single-file and local, not because it is weak; for prototyping transformations it is fast and free.

claude mcp add duckdb -- uvx mcp-server-duckdb --db-path ~/analytics.db --readonly

Setup note: the --readonly flag prevents writes; drop it only when you want the agent to materialize tables.

Comparison Table

The table compares all ten picks on the four things that matter at install time: what they connect to, transport, authentication, and the right scope.

Rank	Server	Best for	Transport	Auth needed	Recommended scope
1	Postgres MCP Pro	Postgres, migrations	stdio (uvx)	Connection string	local
2	dbt-mcp	Transformation, semantic layer	stdio (uvx)	dbt service token	project
3	ClickHouse MCP	Real-time analytics	stdio (uv)	Host + credentials	local
4	MongoDB MCP	Documents, Atlas	stdio (npx)	Connection string	local
5	MySQL MCP	MySQL, read-only SQL	stdio (npx)	Host + credentials	local
6	Neon MCP	Serverless Postgres, branching	HTTP (remote)	OAuth	user
7	Redis MCP	Cache, streams, vectors	stdio (uvx)	Connection URL	local
8	BigQuery MCP	GCP warehouse	stdio (uvx)	ADC or key file	local
9	Grafana MCP	Pipeline observability	stdio (docker)	Service token	user
10	DuckDB MCP	Local analytics	stdio (uvx)	None	local

Read-Only Mode and Credential Safety

The single most important setting for a data engineer connecting an AI agent to a database is read-only mode. Postgres MCP Pro's --access-mode=restricted, MySQL MCP's writes-off default, MongoDB MCP's --readOnly flag, and DuckDB MCP's --readonly flag all exist so an agent can explore production data shapes without the ability to mutate them. Keep them on by default and grant write access only on a scratch or branch database.

Connection strings are credentials. Keep any server that carries one at local scope so it stays in ~/.claude.json and never in a committed .mcp.json. When a team needs to share a database server, use ${VAR} expansion in .mcp.json so each teammate supplies their own credentials. Our database management MCP guide and Postgres MCP setup walkthrough cover this in more depth.
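The ${VAR} approach can be sketched as follows. The helper name write_shared_mcp_json is hypothetical, and the server entry simply restates the Postgres MCP Pro command from earlier in this post; treat it as an illustration of the pattern, not a canonical config.

```shell
# Hypothetical helper: write a project-scoped .mcp.json that is safe to commit.
# The quoted 'EOF' keeps the shell from expanding ${DATABASE_URI}, so the file
# stores the placeholder and each teammate's environment supplies the value.
write_shared_mcp_json() {
  target="${1:-.mcp.json}"
  cat > "$target" <<'EOF'
{
  "mcpServers": {
    "postgres": {
      "command": "uvx",
      "args": ["postgres-mcp", "--access-mode=restricted"],
      "env": {
        "DATABASE_URI": "${DATABASE_URI}"
      }
    }
  }
}
EOF
}
```

Each teammate exports DATABASE_URI in their own shell before starting the client, so the committed file names the variable but never contains a password.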

Limitations and Tradeoffs

These servers are not free wins. Every connected server adds tool definitions to your context window, so running all ten at once wastes tokens before you type a prompt; most data engineers should enable two or three per task, not the whole set. Two widely searched servers are worth a warning: the Snowflake Labs MCP is now deprecated in favor of Snowflake's official server, and the Elasticsearch MCP is in security-fix-only mode as Elastic moves users to its Agent Builder endpoint, so treat both as transitional. Stdio servers execute code on your machine, so vet what you install and prefer official or heavily starred servers. And when the agent can already run a query through a CLI you have configured, you may not need a dedicated server at all.
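To keep the connected set small, one option is to unregister servers a task does not need. The sketch below assumes only the claude mcp remove subcommand; remove_mcp_servers is a hypothetical helper name, and the server names are the ones used in the install commands above.

```shell
# Hypothetical helper: unregister the servers a task does not need.
# Relies only on `claude mcp remove <name>`, called once per server.
remove_mcp_servers() {
  for name in "$@"; do
    claude mcp remove "$name" || echo "could not remove $name" >&2
  done
}
```

Run claude mcp list to see what is connected, then call something like remove_mcp_servers redis grafana duckdb before a Postgres-only task. Removed servers have to be added again later, so this suits servers you use rarely.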

Frequently Asked Questions

What is the best MCP server for data engineers?

Postgres MCP Pro is the best MCP server for data engineers in 2026. It gives an AI agent configurable read-only or read-write Postgres access plus index tuning and explain-plan analysis, has around 3,000 GitHub stars as of July 2026, and its restricted access mode lets an agent inspect real schemas while writing migrations without touching production.

Can an AI agent connect to Snowflake or BigQuery through MCP?

Yes for BigQuery, with a caveat for Snowflake. The community mcp-server-bigquery server inspects schemas and runs SQL against Google BigQuery today. For Snowflake, the older Snowflake Labs MCP is deprecated in favor of Snowflake's official server, so check which one your Snowflake account documentation points to before installing.

Is it safe to give an AI agent access to a production database?

It is reasonably safe if you enable read-only mode and scope credentials carefully. Postgres MCP Pro, MySQL MCP, MongoDB MCP, and DuckDB MCP all ship read-only switches so the agent can explore data without mutating it. Keep connection strings at local scope, grant write access only on branch or scratch databases, and never commit credentials to a shared .mcp.json.

Does dbt have an official MCP server?

Yes. dbt-mcp is the official server maintained by dbt Labs. It lets an agent query the dbt Semantic Layer, read model metadata and lineage, and run dbt commands across dbt Core, Cloud, and Fusion. It is the only server on this list that reaches the transformation and metrics layer rather than raw tables.

How many MCP servers should a data engineer connect at once?

Two or three per task is the practical sweet spot. Every enabled server adds its tool definitions to the context window on every request, so connecting every server on this list at once wastes tokens. Install broadly useful servers at user scope and enable warehouse-specific ones per project.
