DataDog MCP Server
A FastMCP server providing DataDog log search and monitoring capabilities through the Model Context Protocol (MCP).
Features
- Service-Aware Log Search: Search DataDog logs with service-specific filtering for enhanced targeting
- Tasmania Space Filtering: Filter tasmania service logs by space ID, user, or tenant
- Meeting-Specific Debugging: Find all logs related to specific meetings (7-day default)
- User Activity Tracking: Search user-related logs and activities across services
- Webhook Event Analysis: Find integration webhook events for Zoom and Whereby
- Error Detection: Search for recent errors across services with intelligent filtering
- APM Trace Correlation: Find all logs for specific traces
- STDIO Transport: Self-contained server for local MCP usage
- Environment-Based Configuration: Secure API key management
Response Size Safeguards
The MCP server includes comprehensive safeguards to prevent overwhelming LLM contexts with massive log responses:
Automatic Protections
- Response Size Limit: Responses are capped at 500KB total
- Per-Log Limits: Individual log entries are limited to 10KB
- Payload Filtering: Large request/response data is automatically truncated with summaries
- Query Validation: Warns about overly broad searches before execution
Size Management Features
- Smart Truncation: Large fields show truncated content with original size information
- Early Termination: Processing stops when size limits are reached
- Clear Messaging: Responses indicate when truncation occurs and why
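The safeguards above can be pictured with a minimal sketch. This is not the server's actual implementation: the helper names (`json_size`, `truncate_log`, `build_response`), the 200-character preview length, and measuring size as serialized JSON are assumptions made for illustration. Only the 500KB and 10KB limits and the response field names come from this README.

```python
import json

MAX_RESPONSE_BYTES = 500 * 1024  # total response cap stated in this README
MAX_LOG_BYTES = 10 * 1024        # per-log cap stated in this README


def json_size(obj) -> int:
    """Size of obj in bytes when serialized as JSON (assumed size measure)."""
    return len(json.dumps(obj, default=str).encode("utf-8"))


def truncate_log(log: dict, max_bytes: int = MAX_LOG_BYTES, preview_chars: int = 200) -> dict:
    """Best-effort shrink: replace the largest fields with short summaries."""
    if json_size(log) <= max_bytes:
        return log
    trimmed = dict(log)
    # Visit the largest fields first so small, useful fields survive intact.
    for key in sorted(log, key=lambda k: json_size(log[k]), reverse=True):
        value = log[key]
        if isinstance(value, str):
            if len(value) > preview_chars:
                trimmed[key] = f"{value[:preview_chars]}... [truncated, original {len(value)} chars]"
        elif json_size(value) > preview_chars:
            trimmed[key] = f"[omitted {type(value).__name__}, original {json_size(value)} bytes]"
        if json_size(trimmed) <= max_bytes:
            break
    return trimmed


def build_response(logs: list, max_total: int = MAX_RESPONSE_BYTES) -> dict:
    """Collect logs until the size budget is used up, then stop early."""
    kept, used = [], 0
    for index, log in enumerate(logs):
        entry = truncate_log(log)
        entry_size = json_size(entry)
        if used + entry_size > max_total:
            # Early termination: report how many entries were left out.
            return {
                "logs": kept,
                "response_size_bytes": used,  # sum of kept entry sizes
                "truncated": True,
                "truncation_reason": "Response size limit exceeded",
                "skipped_logs": len(logs) - index,
            }
        kept.append(entry)
        used += entry_size
    return {"logs": kept, "response_size_bytes": used, "truncated": False}
```

Trimming the largest fields first is one possible design choice: it tends to keep identifiers and timestamps readable while shrinking request or response payloads.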
Response Indicators
{
  "logs": [...],
  "response_size_bytes": 45120,
  "truncated": true,
  "truncation_reason": "Response size limit exceeded",
  "skipped_logs": 15,
  "recommendation": "Use more specific filters or pagination to reduce response size",
  "query_warnings": {
    "risk_level": "high",
    "warnings": ["Literal string search without specific field filters"],
    "recommendations": ["Add specific field filters like service:meeting"]
  }
}
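A caller can use these indicator fields to decide what to do next. The helper below is a hypothetical client-side sketch: `next_step` is not part of the server, and the returned strings are illustrative only. It assumes the response has been parsed into a Python dict with the field names shown above.

```python
def next_step(response: dict) -> str:
    """Suggest a follow-up action based on the response indicator fields."""
    if response.get("truncated"):
        skipped = response.get("skipped_logs", 0)
        reason = response.get("truncation_reason", "unknown reason")
        return f"narrow: {skipped} logs skipped ({reason})"
    # query_warnings may be absent, so fall back to an empty dict.
    warnings = response.get("query_warnings") or {}
    if warnings.get("risk_level") == "high":
        return "narrow: query flagged as high risk"
    return "ok"
```

For the example response above, this would suggest narrowing the query, because `truncated` is true and 15 logs were skipped.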
Preview-Then-Execute Workflow (RECOMMENDED)
For large or uncertain queries, use the two-step preview workflow to avoid overwhelming responses:
Step 1: Preview the Query
# Preview using relative time (hours)
preview = preview_search(service="meeting", user_id=214413, hours=24, limit=100)
# Or preview using specific time range
preview = preview_search(
    service="meeting",
    user_id=214413,
    time_from="2025-01-15T10:00:00Z",
    time_to="2025-01-15T18:00:00Z",
    limit=100
)
Preview Response:
{
  "estimated_count": 150,
  "estimated_size_mb": 0.3,
  "sample_logs": [...],  // 3 sample entries for structure
  "cache_id": "abc123-def456-...",
  "expires_in_seconds": 30,
  "execution_recommendation": "OK",
  "query_warnings": {
    "risk_level": "low",
    "warnings": [],
    "recommendations": ["Query looks well-targeted with service and user filters"]
  }
}
Step 2: Execute or Refine
# If preview looks reasonable, execute with cache_id
if preview["execution_recommendation"] == "OK":
    results = search_logs(cache_id=preview["cache_id"])
else:
    # Refine with more specific structured filters
    refined_results = search_logs(
        service="meeting",
        meeting_id=136666,
        status="error",
        hours=6
    )
Best Practices
✅ Recommended Query Patterns
# BEST: Use structured filters with specific service (defaults to prod)
search_logs(service="meeting", meeting_id=136666, status="error", hours=2)
# Query staging environment
search_logs(service="meeting", env="staging", meeting_id=136666, status="error", hours=2)
# User-specific filtering across services
search_logs(service="tasmania", user_id=214413, hours=6)
# Space-specific filtering (Tasmania coaching spaces) in dev environment
search_logs(service="tasmania", env="dev", space_id=168565, hours=12)
# Actor email filtering (finds user activity across log formats)
search_logs(service="meeting", actor_email="user@example.com", hours=24)
# PREVIEW FIRST: For potentially large result sets
preview = preview_search(service="meeting", user_id=214413, hours=24)
if preview["execution_recommendation"] == "OK":
    results = search_logs(cache_id=preview["cache_id"])
# Preview staging environment
preview = preview_search(service="meeting", env="staging", user_id=214413, hours=24)
❌ Avoid These Patterns (Will Trigger Warnings)
# WRONG: Raw DataDog query strings (defeats smart filtering)
search_logs(query='env:prod "meeting_id:136666"', hours=24)
# TOO BROAD: No service or specific filters
search_logs(query="env:prod", hours=12)
# HIGH VOLUME: Long time range without targeted filters
search_logs(hours=48, limit=200)
Environment Filtering
All tools support environment filtering via the env parameter:
- Default: prod (production environment)
- Common values: staging, dev, test
- Custom: Any environment name matching your DataDog setup
# Search production (default)
search_logs(service="meeting", meeting_id=136666)
# Search staging explicitly
search_logs(service="meeting", env="staging", meeting_id=136666)
# Search dev environment
search_logs(service="meeting", env="dev", meeting_id=136666)
# Test connection to specific environment
test_connection(env="staging")
Service-Specific Filtering
The DataDog MCP server automatically adapts filtering based on the service being queried. Each service has its own set of available filters:
Supported Services
| Service | Available Filters | Description |
|---------|-------------------|-------------|
| tasmania | user_id, tenant_id, space_id | Coaching platform logs with space-based filtering |
| meeting | user_id, tenant_id, meeting_id, path_id | Meeting service logs |
| assessment | user_id, tenant_id, assessment_id | Assessment service logs |
| integration | user_id, tenant_id, meeting_id, provider | Integration service logs |
Filter Examples
# Tasmania space-specific filtering (maps to path filtering)
search_logs(service="tasmania", space_id=168565, user_id=214413)
# Meeting service error tracking
search_logs(service="meeting", meeting_id=136666, status="error")
# Actor email filtering (finds user activity across different log formats)
search_logs(service="meeting", actor_email="user@example.com", hours=24)
# Integration provider filtering
search_logs(service="integration", provider="zoom", meeting_id=136666)
Installation
Requirements
- Python 3.11+
- DataDog API key and Application key
- uv (recommended) or pip
Setup
Option 1: Install as Global CLI Tool (Recommended)
- Install as a global CLI tool using uv:
uv tool install git+https://github.com/everwise/torch-datadog-mcp.git
Or using pipx:
pipx install git+https://github.com/everwise/torch-datadog-mcp.git
Or add to project dependencies:
uv add git+https://github.com/everwise/torch-datadog-mcp.git
- Set environment variables:
export DD_API_KEY=your_api_key_here
export DD_APP_KEY=your_app_key_here
export DD_SITE=datadoghq.com # Optional, default shown
Option 2: Local Development Setup
- Clone the repository:
git clone https://github.com/everwise/torch-datadog-mcp.git
cd torch-datadog-mcp
- Install dependencies:
# Using uv (recommended)
uv sync
# Or using pip
pip install -e .
- Configure environment variables:
# Create .env file with your DataDog credentials
touch .env
- Set your DataDog API keys in .env:
DD_API_KEY=your_api_key_here
DD_APP_KEY=your_app_key_here
DD_SITE=datadoghq.com # Optional, default shown
Getting DataDog API Keys
- Go to DataDog Organization Settings > API Keys
- Create or copy your API Key
- Go to Application Keys
- Create or copy your Application Key
Usage
Running the Server
# If installed from GitHub
datadog-mcp
# For local development with uv
uv run --project /path/to/torch-datadog-mcp datadog-mcp
# Alternative local development commands
uv run python -m datadog_mcp.server
python src/datadog_mcp/server.py
The server runs in STDIO mode by default, making it suitable for MCP clients.
Available Tools
The server provides 8 focused tools for DataDog log analysis. All tools support environment filtering via the env parameter (defaults to prod):
Core Search Tools
- preview_search: Preview query size and count before execution (30s cache)
  - Supports: env, service, hours/time_from/time_to, filters
- search_logs: Enhanced main search with service-aware filtering and size safeguards
  - Supports: env, service, hours/time_from/time_to, filters, pagination
- get_trace_logs: Get all logs for a specific APM trace ID
  - Supports: trace_id, env, hours/time_from/time_to, pagination
Business Analysis Tools
- search_business_events: Find business events across services
  - Supports: event_type, env, service, hours/time_from/time_to
- trace_request_flow: Track requests across multiple services using correlation IDs
  - Supports: request_id, env, hours/time_from/time_to
Utility Tools
- test_connection: Test DataDog API connectivity
  - Supports: env (tests connection for specific environment)
- get_server_info: Get server configuration information
- debug_configuration: Get detailed debugging information
Service-Specific Structured Filters
Use these structured filters with the search_logs tool; they map automatically to the correct DataDog fields:
Tasmania Service:
- user_id → Maps to current user context fields
- tenant_id → Maps to current tenant context
- space_id → Maps to space path filtering (/api/v1/spaces/{id}*)
- actor_email → Maps to statement actor email fields
Meeting Service:
- meeting_id → Maps to meeting ID field
- user_id → Maps to user ID field
- tenant_id → Maps to tenant ID field
- path_id → Maps to notifiable ID (learning paths)
- actor_email → Maps to events actor email
Assessment Service:
- assessment_id → Maps to assessment ID field
- user_id, tenant_id → Standard user/tenant filtering
Integration Service:
- provider → Filter by integration provider (zoom, whereby)
- meeting_id → Maps to meeting ID field
- user_id, tenant_id → Standard user/tenant filtering
Claude Desktop Integration
Add this configuration to your Claude Desktop config file:
macOS
Edit ~/Library/Application Support/Claude/claude_desktop_config.json:
For Global Tool Installation (Recommended)
{
  "mcpServers": {
    "datadog": {
      "command": "datadog-mcp",
      "env": {
        "DD_API_KEY": "your_api_key_here",
        "DD_APP_KEY": "your_app_key_here",
        "DD_SITE": "datadoghq.com"
      }
    }
  }
}
For Local Development Setup
{
  "mcpServers": {
    "datadog": {
      "command": "uv",
      "args": ["run", "--project", "/path/to/torch-datadog-mcp", "datadog-mcp"],
      "env": {
        "DD_API_KEY": "your_api_key_here",
        "DD_APP_KEY": "your_app_key_here",
        "DD_SITE": "datadoghq.com"
      }
    }
  }
}
Windows
Edit %APPDATA%\Claude\claude_desktop_config.json and add the same configuration as shown for macOS.
Alternative: Using Environment Variables
If you set environment variables globally, you can omit the env section:
{
  "mcpServers": {
    "datadog": {
      "command": "datadog-mcp"
    }
  }
}
Example Usage Patterns
Debug Meeting Issues
# Find all logs for a specific meeting (prod by default)
search_logs(service="meeting", meeting_id=136666, hours=24)
# Focus on errors for that meeting
search_logs(service="meeting", meeting_id=136666, status="error", hours=168)
# Debug staging environment issues
search_logs(service="meeting", env="staging", meeting_id=136666, status="error", hours=24)
Find Integration Events
# Zoom integration events for a meeting
search_logs(service="integration", provider="zoom", meeting_id=136667)
# All integration activity for a user in dev environment
search_logs(service="integration", env="dev", user_id=214413, hours=48)
Monitor Service Health
# Recent errors in meeting service (prod)
search_logs(service="meeting", status="error", hours=2)
# Tasmania space-specific errors in staging
search_logs(service="tasmania", env="staging", space_id=168565, status="error", hours=6)
User Activity Tracking
# All activity for a user in Tasmania (prod)
search_logs(service="tasmania", actor_email="user@example.com", hours=24)
# User's meeting activity in specific environment
search_logs(service="meeting", env="staging", user_id=214413, hours=12)
APM Trace Investigation
# Find all logs for a trace using relative time (prod by default)
get_trace_logs(trace_id="1234567890abcdef", hours=1)
# Find all logs for a trace in staging environment
get_trace_logs(
    trace_id="1234567890abcdef",
    env="staging",
    time_from="2025-01-15T14:00:00Z",
    time_to="2025-01-15T15:00:00Z"
)
# Track a request across services with relative time
trace_request_flow(request_id="req_abc123", hours=2)
# Track a request in dev environment with specific time range
trace_request_flow(
    request_id="req_abc123",
    env="dev",
    time_from="now-30m",
    time_to="now"
)
Query Architecture
The server uses structured filters that automatically map to the correct DataDog fields:
Structured Filter Benefits
- Smart Field Mapping: user_id=214413 maps to the right field(s) per service
- OR Conditions: Automatically searches multiple possible field locations
- Health Check Exclusion: Removes noise from health check endpoints
- Service-Aware: Each service has optimized field mappings
Common Structured Patterns
# Service + specific entity
search_logs(service="meeting", meeting_id=136666)
# User activity across services
search_logs(service="tasmania", user_id=214413)
# Status filtering with context
search_logs(service="meeting", meeting_id=136666, status="error")
# Actor-based filtering (email)
search_logs(service="meeting", actor_email="user@example.com")
Time Range Specification
All time-based tools support two mutually exclusive approaches for specifying time ranges:
Option 1: Relative Time (Using hours)
# Search last 24 hours (default for search_logs)
search_logs(service="meeting", meeting_id=136666, hours=24)
# Search last 2 hours
search_logs(service="meeting", status="error", hours=2)
Option 2: Specific Time Range (Using time_from and time_to)
# Specific ISO timestamp range
search_logs(
    service="meeting",
    meeting_id=136666,
    time_from="2025-01-15T10:00:00Z",
    time_to="2025-01-15T12:00:00Z"
)
# Using DataDog's relative syntax
search_logs(
    service="meeting",
    time_from="now-6h",
    time_to="now-2h"
)
# Mix of formats
search_logs(
    service="tasmania",
    time_from="2025-01-15T09:00:00Z",
    time_to="now"
)
Default Time Ranges (when neither is specified)
- General log search: 1 hour
- Trace logs: 1 hour
- Business events: 1 hour
- Request flow tracing: 1 hour
Note: The hours and time_from/time_to parameters are mutually exclusive. Using both will result in an error.
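The mutual-exclusivity rule can be expressed as a small validation step. The sketch below is an assumption about how such a check might look, not the server's code. `resolve_time_range` is a hypothetical helper, and filling in a missing `time_from` or `time_to` with a default is a choice made here for illustration.

```python
def resolve_time_range(hours=None, time_from=None, time_to=None, default_hours=1):
    """Return a (time_from, time_to) pair, rejecting mixed parameter styles."""
    explicit = time_from is not None or time_to is not None
    if hours is not None and explicit:
        raise ValueError("hours and time_from/time_to are mutually exclusive")
    if explicit:
        # Assumption: a missing bound falls back to the default window or "now".
        return (time_from or f"now-{default_hours}h", time_to or "now")
    window = hours if hours is not None else default_hours
    return (f"now-{window}h", "now")


print(resolve_time_range(hours=24))
print(resolve_time_range(time_from="2025-01-15T09:00:00Z", time_to="now"))
```

Converting `hours` into DataDog's relative syntax (`now-24h`) keeps both styles on a single code path after validation.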
Development
Running Tests
uv run pytest
Code Formatting
uv run ruff format
uv run ruff check
Project Structure
torch-datadog-mcp/
├── src/
│   └── datadog_mcp/
│       ├── __init__.py
│       ├── server.py          # FastMCP server with tools
│       ├── client.py          # DataDog API client
│       └── filter_config.py   # Service-specific filter configuration
├── pyproject.toml             # Project configuration
├── .env                       # Environment variables (create this)
├── README.md                  # This file
└── datadog-mcp.fastmcp.json   # FastMCP configuration
Troubleshooting
Authentication Issues
- Verify DD_API_KEY and DD_APP_KEY are set correctly
- Check your DataDog site setting (DD_SITE)
- Ensure keys have appropriate permissions for log search
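A quick way to run the first two checks is to inspect the environment from Python before starting the server. This is a standalone sketch: `check_datadog_env` is a hypothetical helper and not part of the package. It only confirms that the variables are present, not that the keys are valid or have the right permissions.

```python
import os

REQUIRED_VARS = ("DD_API_KEY", "DD_APP_KEY")
DEFAULT_SITE = "datadoghq.com"  # default stated in this README


def check_datadog_env(environ=None) -> dict:
    """Report missing credentials and the effective DataDog site."""
    environ = os.environ if environ is None else environ
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    return {
        "ok": not missing,
        "missing": missing,
        "site": environ.get("DD_SITE", DEFAULT_SITE),
    }


if __name__ == "__main__":
    print(check_datadog_env())
```

If this reports no missing variables but requests still fail, the test_connection tool is the next step, since it exercises the API itself.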
Connection Problems
- Use the test_connection tool to verify API connectivity
- Check network connectivity to DataDog endpoints
- Verify your DataDog organization has log access enabled
No Results Found
- Check time ranges (use longer periods for older data)
- Verify query syntax matches DataDog's log search format
- Ensure the environment/service filters match your data
License
MIT License - see LICENSE file for details.






