mcp-server-graylog

[npm version](https://www.npmjs.com/package/mcp-server-graylog) | [Node.js >= 18](https://nodejs.org) | [License: MIT](https://opensource.org/licenses/MIT) | [Tests](https://github.com/Pranavj17/mcp-server-graylog)

Model Context Protocol (MCP) server for Graylog log searching. Search logs by absolute/relative timestamps, filter by streams, and debug production issues directly from Claude Desktop.

> Built for production debugging - Search Graylog logs using exact timestamps, filter by application streams, and get actionable insights for troubleshooting production issues.

Features

✅ Absolute timestamp search - Debug specific errors with exact time ranges
✅ Relative timestamp search - Search recent logs (last N seconds)
✅ Distributed tracing - Follow a trace_id across all services
✅ Surrounding-log context - See what happened ±N seconds around an error
✅ Composite incident analysis - One tool call fans out to trace + context + baseline
✅ Field aggregation - Group counts by service/level/pod/lead_id with bandwidth-efficient projection
✅ Stream discovery - List all available streams/applications
✅ System health check - Verify Graylog connectivity
✅ Comprehensive validation - ISO 8601 timestamps, query syntax, stream IDs
✅ Clear error messages - Actionable errors for auth, network, and API issues
✅ Timeout handling - 30-second timeouts prevent hanging
✅ Production-ready - 54 tests, 9.2/10 code quality score

Table of Contents

Installation
Configuration
Available Tools
Skills & agents
Query Examples
Troubleshooting
Development
Contributing
License

Installation

Option 1: Use with npx (Recommended)

# No installation needed - use directly with npx
npx mcp-server-graylog

Option 2: Global Installation

npm install -g mcp-server-graylog

Option 3: Local Installation

# Clone the repository
git clone https://github.com/Pranavj17/mcp-server-graylog.git
cd mcp-server-graylog

# Install dependencies
npm install

Configuration

Claude Desktop Setup

Add to your Claude Desktop config file:

macOS: ~/Library/Application Support/Claude/claude_desktop_config.json

Windows: %APPDATA%\Claude\claude_desktop_config.json

Using npx (Recommended)

{
  "mcpServers": {
    "graylog": {
      "command": "npx",
      "args": ["-y", "mcp-server-graylog"],
      "env": {
        "BASE_URL": "https://graylog.example.com",
        "API_TOKEN": "your_api_token_here"
      }
    }
  }
}

Using Local Installation

{
  "mcpServers": {
    "graylog": {
      "command": "node",
      "args": ["/path/to/mcp-server-graylog/src/index.js"],
      "env": {
        "BASE_URL": "https://graylog.example.com",
        "API_TOKEN": "your_api_token_here"
      }
    }
  }
}

Environment Variables

| Variable | Required | Description |
|----------|----------|-------------|
| BASE_URL | Yes | Graylog server URL (e.g., https://graylog.example.com) |
| API_TOKEN | Yes | Graylog API token (username for Basic Auth, password is "token") |
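
As a rough illustration of how these two variables might be consumed, the sketch below validates them and builds the Basic Auth header the table describes (token as the username, the literal string "token" as the password). The names `readConfig` and `buildAuthHeader` are hypothetical and are not taken from the server's source.

```javascript
// Hypothetical helpers; names are illustrative, not the server's actual API.

// Read and validate the two required environment variables.
function readConfig(env) {
  const baseUrl = (env.BASE_URL || '').trim();
  const apiToken = (env.API_TOKEN || '').trim();
  if (!baseUrl) throw new Error('BASE_URL is required');
  if (!apiToken) throw new Error('API_TOKEN is required');
  // Drop trailing slashes so API paths can be appended safely.
  return { baseUrl: baseUrl.replace(/\/+$/, ''), apiToken };
}

// Graylog token auth: the token is the username, the literal "token" is the password.
function buildAuthHeader(apiToken) {
  const encoded = Buffer.from(`${apiToken}:token`).toString('base64');
  return `Basic ${encoded}`;
}

const config = readConfig({ BASE_URL: 'https://graylog.example.com/', API_TOKEN: 'abc' });
console.log(config.baseUrl, buildAuthHeader(config.apiToken));
```

Failing fast on a missing variable gives a clearer startup error than a later 401 from Graylog.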

Getting Your Graylog API Token

1. Log in to Graylog web interface
2. Go to System → Users
3. Select your user
4. Click Edit tokens
5. Create a new token with read permissions
6. Copy the token value

Available Tools

1. search_logs_absolute

Search logs using absolute timestamps (from/to). Perfect for debugging errors with specific timestamps from monitoring tools or error tracking systems.

Parameters:

query (required): Search query using Elasticsearch syntax
from (required): Start timestamp in ISO 8601 format
to (required): End timestamp in ISO 8601 format
streamId (optional): Stream ID to filter results
limit (optional): Maximum results (default: 50, max: 1000)

Example:

{
  "query": "\"/api/v1/registrations\" AND \"PUT\"",
  "from": "2025-10-23T10:00:00.000Z",
  "to": "2025-10-23T11:00:00.000Z",
  "streamId": "646221a5bd29672a6f0246d8",
  "limit": 100
}
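
The parameter rules above could be checked along the following lines. This is a minimal sketch, not the server's actual validator: the function names, the exact regular expression, and the rule that `from` must be earlier than `to` are assumptions.

```javascript
// Hypothetical validator; the server's real checks may differ.
const ISO_8601 = /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d{1,3})?(Z|[+-]\d{2}:\d{2})$/;

// True for strings such as 2025-10-23T10:00:00.000Z that also parse to a real date.
function isIsoTimestamp(value) {
  return typeof value === 'string' && ISO_8601.test(value) && !Number.isNaN(Date.parse(value));
}

function validateAbsoluteSearch({ query, from, to, streamId, limit = 50 }) {
  if (typeof query !== 'string' || query.trim() === '') throw new Error('query is required');
  if (!isIsoTimestamp(from)) throw new Error('from must be an ISO 8601 timestamp');
  if (!isIsoTimestamp(to)) throw new Error('to must be an ISO 8601 timestamp');
  // Assumption: an inverted range is rejected rather than silently swapped.
  if (Date.parse(from) >= Date.parse(to)) throw new Error('from must be earlier than to');
  if (!Number.isInteger(limit) || limit < 1 || limit > 1000) {
    throw new Error('limit must be an integer between 1 and 1000');
  }
  return { query, from, to, streamId, limit };
}

console.log(validateAbsoluteSearch({
  query: 'level:ERROR',
  from: '2025-10-23T10:00:00.000Z',
  to: '2025-10-23T11:00:00.000Z',
}));
```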

2. search_logs_relative

Search logs using relative time range (e.g., last 15 minutes). Useful for recent log analysis.

Parameters:

query (required): Search query using Elasticsearch syntax
rangeSeconds (optional): Time range in seconds (default: 900 = 15 minutes, max: 86400 = 24 hours)
streamId (optional): Stream ID to filter results
limit (optional): Maximum results (default: 50, max: 1000)

Example:

{
  "query": "level:ERROR",
  "rangeSeconds": 3600,
  "limit": 100
}
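
For orientation, a relative search like the one above might translate into a Graylog REST request as sketched below. This assumes the server calls Graylog's legacy universal search endpoint (`/api/search/universal/relative`), which this README does not confirm; the function name and the lower bound of 1 second are also assumptions.

```javascript
// Sketch only: assumes Graylog's legacy universal search REST endpoint.
function buildRelativeSearchUrl(baseUrl, { query, rangeSeconds = 900, streamId, limit = 50 }) {
  if (!Number.isInteger(rangeSeconds) || rangeSeconds < 1 || rangeSeconds > 86400) {
    throw new Error('rangeSeconds must be an integer between 1 and 86400');
  }
  const params = new URLSearchParams({
    query,
    range: String(rangeSeconds),
    limit: String(limit),
  });
  // Stream filtering uses the "streams:<id>" filter syntax.
  if (streamId) params.set('filter', `streams:${streamId}`);
  return `${baseUrl}/api/search/universal/relative?${params}`;
}

console.log(buildRelativeSearchUrl('https://graylog.example.com', {
  query: 'level:ERROR',
  rangeSeconds: 3600,
  limit: 100,
}));
```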

3. trace_request

Trace a request across ALL services using a trace_id. Fetches logs from every stream, groups by service/pod, and sorts each service's messages chronologically. Essential for distributed debugging in microservice architectures.

Parameters:

traceId (required): The trace ID to follow (e.g., abbb27610a7fd76be8fb5af17edbe00d)
from (required): Start timestamp in ISO 8601 format (search window)
to (required): End timestamp in ISO 8601 format (search window)
limit (optional): Maximum results (default: 200, max: 1000)

Example:

{
  "traceId": "abbb27610a7fd76be8fb5af17edbe00d",
  "from": "2026-05-13T15:38:00.000Z",
  "to":   "2026-05-13T15:48:00.000Z"
}
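
The grouping and sorting step described above can be pictured roughly as follows. The message shape (`timestamp`, `service`, `message`) and the function name are assumptions made for illustration; the server's actual result format may differ.

```javascript
// Hypothetical message shape: { timestamp, service, message }.
function groupTraceByService(messages) {
  const groups = new Map();
  for (const msg of messages) {
    // Messages without a service field are collected under "unknown".
    const key = msg.service || 'unknown';
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(msg);
  }
  const result = {};
  for (const [service, hops] of groups) {
    // Sort each service's messages chronologically.
    result[service] = hops.sort((a, b) => Date.parse(a.timestamp) - Date.parse(b.timestamp));
  }
  return result;
}

const grouped = groupTraceByService([
  { timestamp: '2026-05-13T15:43:28.000Z', service: 'argus', message: 'response sent' },
  { timestamp: '2026-05-13T15:43:27.000Z', service: 'argus', message: 'request received' },
  { timestamp: '2026-05-13T15:43:27.500Z', service: 'telex', message: 'notification queued' },
]);
console.log(Object.keys(grouped));
```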

4. get_surrounding_logs

Return logs within ±N seconds of a timestamp, optionally filtered by source/pod/stream. Reveals what happened immediately before and after an error.

Parameters:

timestamp (required): Center timestamp in ISO 8601 format
source (optional): Source hostname or pod to filter by
streamId (optional): Stream ID filter
windowSeconds (optional): Window on each side (default: 5, max: 300)
limit (optional): Maximum results (default: 100)

Example:

{
  "timestamp": "2026-05-13T15:43:27.844Z",
  "source": "argus-production-f747f5d4d-x9hpp",
  "windowSeconds": 10
}
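
The ±N second window amounts to simple timestamp arithmetic. A minimal sketch, assuming the window is applied symmetrically and using a hypothetical function name:

```javascript
// Compute the absolute search range for a window of N seconds on each side.
function surroundingWindow(timestamp, windowSeconds = 5) {
  const center = Date.parse(timestamp);
  if (Number.isNaN(center)) throw new Error('timestamp must be ISO 8601');
  if (!Number.isInteger(windowSeconds) || windowSeconds < 1 || windowSeconds > 300) {
    throw new Error('windowSeconds must be an integer between 1 and 300');
  }
  const offsetMs = windowSeconds * 1000;
  return {
    from: new Date(center - offsetMs).toISOString(),
    to: new Date(center + offsetMs).toISOString(),
  };
}

console.log(surroundingWindow('2026-05-13T15:43:27.844Z', 10));
// { from: '2026-05-13T15:43:17.844Z', to: '2026-05-13T15:43:37.844Z' }
```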

5. analyze_incident

Composite tool. One call fans out to three searches and returns an aggregated incident report — saves 2-3 LLM orchestration rounds when investigating a specific trace.

Internally executes:

1. The full trace hop chain (trace_id:X)
2. Pod-scoped surrounding logs around the first ERROR/CRITICAL/FATAL hop (filters by pod: to avoid multi-tenant noise on shared hosts)
3. A trailing-hour error baseline for the anchor service

Parameters:

traceId (required): The trace ID to investigate
from (required): Start timestamp in ISO 8601 format
to (required): End timestamp in ISO 8601 format
window (optional): Surrounding-logs window in seconds (default: 10, max: 300)
baselineSeconds (optional): Trailing window for the baseline lookup (default: 3600, max: 86400)

Example:

{
  "traceId": "abbb27610a7fd76be8fb5af17edbe00d",
  "from": "2026-05-13T15:38:00.000Z",
  "to":   "2026-05-13T15:48:00.000Z",
  "window": 10,
  "baselineSeconds": 3600
}

Returns (abridged):

{
  "trace_id": "abbb27610a7fd76be8fb5af17edbe00d",
  "found": true,
  "steps_executed": 4,
  "summary": {
    "hops": 4,
    "services_involved": ["argus"],
    "errors_in_trace": 1,
    "anchor_service": "argus",
    "anchor_pod": "argus-production-f747f5d4d-x9hpp",
    "first_error": { "timestamp": "...", "service": "argus", "message": "nil fund_id ...", "lead_id": "..." },
    "request": { "http_path": "/api/v2/user/graph", "http_method": "POST", "http_status": 200, "duration_ms": 67 },
    "baseline_errors_in_service": 16,
    "baseline_window_seconds": 3600
  },
  "trace_hops": [...],
  "surrounding_logs": [...]
}
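
To make the fan-out concrete, the sketch below shows how the two follow-up searches might be derived from the trace hops. Everything here is an assumption for illustration: the hop shape (`timestamp`, `service`, `pod`, `level`), the function names, the query strings, and the choice to anchor the baseline window at the error timestamp rather than at the current time.

```javascript
// Sketch of the anchor-selection step; hop shape { timestamp, service, pod, level } is assumed.
const ERROR_LEVELS = new Set(['ERROR', 'CRITICAL', 'FATAL']);

// Return the earliest hop logged at ERROR, CRITICAL, or FATAL, or null if the trace is clean.
function findFirstErrorHop(hops) {
  const ordered = [...hops].sort((a, b) => Date.parse(a.timestamp) - Date.parse(b.timestamp));
  return ordered.find((hop) => ERROR_LEVELS.has(String(hop.level).toUpperCase())) || null;
}

// Derive the surrounding-logs and baseline searches from the anchor hop.
function planFollowUpSearches(hops, { window = 10, baselineSeconds = 3600 } = {}) {
  const anchor = findFirstErrorHop(hops);
  if (!anchor) return [];
  const center = Date.parse(anchor.timestamp);
  const iso = (ms) => new Date(ms).toISOString();
  return [
    // Pod-scoped context around the first error.
    {
      kind: 'surrounding',
      query: `pod:"${anchor.pod}"`,
      from: iso(center - window * 1000),
      to: iso(center + window * 1000),
    },
    // Assumption: the baseline window trails the anchor timestamp, not "now".
    {
      kind: 'baseline',
      query: `service:"${anchor.service}" AND level:ERROR`,
      from: iso(center - baselineSeconds * 1000),
      to: iso(center),
    },
  ];
}

console.log(planFollowUpSearches([
  { timestamp: '2026-05-13T15:43:27.000Z', service: 'argus', pod: 'argus-production-f747f5d4d-x9hpp', level: 'info' },
  { timestamp: '2026-05-13T15:43:27.844Z', service: 'argus', pod: 'argus-production-f747f5d4d-x9hpp', level: 'error' },
]));
```

Scoping the surrounding search by pod rather than host follows the rationale given above: shared hosts can carry logs from unrelated tenants.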

6. aggregate_logs

Count log entries grouped by a field, one of the most common Graylog operations, in a single call. Issues one search with fields=<group_field> projected (so only the column you want is downloaded) and aggregates client-side. Stands in for the legacy terms-aggregation endpoint that was removed in Graylog 5.x.

Parameters:

query (required): Filter (Elasticsearch syntax). Use * for all entries.
field (required): Field to group by. Common: service, logger_level, pod, lead_id, http_status, container_name.
from and to, or rangeSeconds (one form required, mutually exclusive): Time window, either absolute ISO 8601 timestamps or a relative range in seconds
size (optional): Top N to return (default 25, max 100). Rest summed into other.
fetchLimit (optional): Max messages to aggregate (default 5000, max 10000). When matched exceeds this, truncated: true is flagged.
streamId (optional): Stream ID to filter results

Example:

{
  "query": "logger_level:error",
  "field": "service",
  "rangeSeconds": 1800,
  "size": 10
}

Returns:

{
  "field": "service",
  "query": "logger_level:error",
  "time_range": "Last 1800 seconds",
  "total_matched": 30,
  "messages_aggregated": 30,
  "truncated": false,
  "unique_groups": 5,
  "top": { "milkyway": 8, "argus": 4, "telex": 4, "advisory": 3, "auth": 1 },
  "other": 0,
  "missing": 10,
  "api_calls": 1
}

The missing count is messages that matched the query but had no value for the group-by field — useful signal for log-hygiene issues.
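
The client-side aggregation can be approximated as below. This is a sketch under assumptions, not the server's code: the function name is hypothetical, messages are assumed to be plain objects keyed by field name, and empty strings are counted as missing.

```javascript
// Count messages per value of `field`, keep the top `size` groups, and sum the rest.
function aggregateByField(messages, field, size = 25) {
  const counts = new Map();
  let missing = 0;
  for (const msg of messages) {
    const value = msg[field];
    // Assumption: undefined, null, and empty string all count as "missing".
    if (value === undefined || value === null || value === '') {
      missing += 1;
      continue;
    }
    const key = String(value);
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  const sorted = [...counts.entries()].sort((a, b) => b[1] - a[1]);
  // Note: JS objects order integer-like keys numerically, so for fields such as
  // http_status an array of [value, count] pairs would preserve rank order better.
  const top = Object.fromEntries(sorted.slice(0, size));
  const other = sorted.slice(size).reduce((sum, [, n]) => sum + n, 0);
  return {
    field,
    messages_aggregated: messages.length,
    unique_groups: counts.size,
    top,
    other,
    missing,
  };
}

console.log(aggregateByField(
  [{ service: 'argus' }, { service: 'argus' }, { service: 'telex' }, {}],
  'service',
  10,
));
```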

7. list_streams

List all available Graylog streams (applications). Use this to discover stream IDs for filtering.

Parameters: None

Returns:

{
  "total": 3,
  "streams": [
    {
      "id": "646221a5bd29672a6f0246d8",
      "title": "application-api",
      "description": "API application logs",
      "disabled": false
    }
  ]
}

8. get_system_info

Get Graylog system information and health status. Verify connectivity and check server version.

Parameters: None

Returns:

{
  "version": "5.1.0",
  "codename": "graylog",
  "cluster_id": "abc123",
  "is_processing": true,
  "timezone": "UTC"
}

Skills & agents (v2.3.0+)

When installed as a Claude Code plugin, this package ships playbooks that teach Claude when and how to use the MCP tools above.

Skills

| Skill | When it triggers | What it does |
|---|---|---|
| graylog | "search logs", "check graylog", general log questions | Entry-point. Maps common questions to the right tool, explains streams / trace_id / query syntax, points at the specialty skills. |
| trace-debugging | "trace_id", "follow this request", "distributed trace" | Single-request investigation across services. Pulls the trace, finds error spans, gathers surrounding context, synthesizes a timeline. |
| incident-triage | "errors spiking", "outage", "alert fired" | Localizes an active incident to a service + pattern. Aggregates errors by service, baselines against previous window, drills into the top offender, checks for deploy correlation. |
| troubleshooting | Graylog tool failures (401, connection refused, empty results) | Diagnoses connectivity, auth, query syntax. Always starts with get_system_info. |

Agent

| Agent | When to dispatch | What it returns |
|---|---|---|
| graylog-trace-analyzer | Trace investigations expected to surface >200 log lines or span >5 services | A structured timeline (≤50 entries) plus origin, propagation, root-cause line, and a 2–4 sentence verdict. Keeps raw logs out of the parent context. |

Skills auto-load when the plugin is installed. The agent is dispatchable via Claude Code's subagent mechanism with subagent_type: "graylog-trace-analyzer".

Query Examples

Search for Errors

level:ERROR

Search for Specific Endpoint

"/api/v1/registrations" AND "PUT"

Search for HTTP Status Codes

status:500
status:>=400

Search for User Actions

user_id:12345 AND action:login

Search for Slow Requests

duration_ms:>1000

Search for Exceptions

exception:NullPointerException

Combine Multiple Conditions

level:ERROR AND source:nexus AND message:*timeout*

Search with Wildcards

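message:*connection* AND message:*refused*

An unquoted space ends the field term, so each word needs its own message: prefix. Leading wildcards are disabled by default in Graylog and must be enabled on the server before queries like this will run.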

Search by Field Existence

_exists_:error_code
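
Values that contain characters with special meaning in the query syntax, such as the slashes in an endpoint path, are safest wrapped in double quotes, as the endpoint example above does. The helpers below are a hypothetical sketch for building such clauses programmatically; they are not part of the server.

```javascript
// Hypothetical helpers for building query strings; not part of the server.

// Wrap a value in double quotes, escaping backslashes and embedded quotes.
function quoteTerm(value) {
  return `"${String(value).replace(/\\/g, '\\\\').replace(/"/g, '\\"')}"`;
}

// Build an exact-match clause such as source:"nexus".
function fieldEquals(field, value) {
  return `${field}:${quoteTerm(value)}`;
}

const query = [quoteTerm('/api/v1/registrations'), quoteTerm('PUT')].join(' AND ');
console.log(query); // "/api/v1/registrations" AND "PUT"
```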

Common Use Cases

1. Debug Production Error

When you get an error with a timestamp from your monitoring system:

1. Copy error timestamp from your monitoring tool
2. Use search_logs_absolute with ±5 minute window
3. Filter by application stream
4. Find root cause in logs

2. Monitor Recent Deployments

After deploying:

1. Use search_logs_relative with last 15 minutes
2. Search for level:ERROR
3. Verify no new errors introduced

3. Investigate API Failures

When an API endpoint fails:

1. Search for endpoint path: "/api/v1/endpoint"
2. Filter by status codes: status:>=400
3. Check error patterns

Error Messages

The server provides clear, actionable error messages:

| Error | Meaning | Solution |
|-------|---------|----------|
| Authentication failed | Invalid API token | Check API_TOKEN in configuration |
| Invalid query | Elasticsearch syntax error | Check query syntax and parameters |
| Endpoint not found | Wrong Graylog URL | Check BASE_URL in configuration |
| Cannot reach Graylog | Network connectivity issue | Verify Graylog is accessible |
| Invalid timestamp | Wrong timestamp format | Use ISO 8601 format (e.g., 2025-10-23T10:00:00.000Z) |
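
One plausible way to arrive at these messages is to map HTTP status codes and Node.js network error codes onto them, as sketched below. The specific status-to-message mapping and the function name are assumptions; the README does not state which conditions trigger each message.

```javascript
// Hypothetical mapping from a failed request to the messages in the table above.
function describeError({ status, code } = {}) {
  if (status === 401 || status === 403) {
    return 'Authentication failed: check API_TOKEN in configuration';
  }
  if (status === 400) return 'Invalid query: check query syntax and parameters';
  if (status === 404) return 'Endpoint not found: check BASE_URL in configuration';
  // Node.js network error codes for refused connections, DNS failures, and timeouts.
  if (code === 'ECONNREFUSED' || code === 'ENOTFOUND' || code === 'ETIMEDOUT') {
    return 'Cannot reach Graylog: verify Graylog is accessible';
  }
  return `Graylog request failed${status ? ` (HTTP ${status})` : ''}`;
}

console.log(describeError({ status: 401 }));
console.log(describeError({ code: 'ECONNREFUSED' }));
```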

Troubleshooting

Server Won't Start

Check environment variables:

# Verify BASE_URL and API_TOKEN are set in Claude Desktop config
# Check Claude Desktop logs:
# macOS: ~/Library/Logs/Claude/mcp*.log
# Windows: %APPDATA%\Claude\logs\mcp*.log

Verify Graylog accessibility:

curl -u "YOUR_API_TOKEN:token" https://graylog.example.com/api/system

Authentication Errors

Verify API token has read permissions in Graylog
Token format: Use token value as username, "token" as password
Check token hasn't expired

No Results Returned

Verify stream ID is correct using list_streams tool
Check timestamp range includes data
Try simplifying query to * to see if any data exists
Verify stream is not disabled

Integration Tests Failing

# Set environment variables for integration tests
export INTEGRATION_TESTS=true
export BASE_URL=https://graylog.example.com
export API_TOKEN=your_token_here

# Run integration tests
npm run test:integration

Development

Prerequisites

Node.js >= 18.0.0
npm >= 8.0.0
Access to a Graylog instance (for integration tests)

Development Workflow

# Install dependencies
npm install

# Run in development mode (auto-reload)
npm run dev

# Run tests
npm test

# Run tests in watch mode
npm run test:watch

# Run only unit tests
npm run test:unit

# Run integration tests (requires Graylog instance)
INTEGRATION_TESTS=true BASE_URL=https://graylog.example.com API_TOKEN=xxx npm run test:integration

# Check syntax
npm run lint

Project Structure

mcp-server-graylog/
├── src/
│   └── index.js           # Main server implementation (429 lines)
├── test/
│   ├── helpers.test.js    # Helper function tests (14 tests)
│   ├── validation.test.js # Input validation tests (24 tests)
│   ├── mcp-protocol.test.js # MCP protocol tests (16 tests)
│   └── integration.test.js  # Integration tests (7 tests)
├── example-config.json    # Claude Desktop config example
├── CONTRIBUTING.md        # Contributing guidelines
├── CHANGELOG.md          # Version history
└── package.json         # npm configuration

Running Tests

# Run all tests (54 tests)
npm test

# Expected output:
# tests 54
# pass 54
# fail 0

Architecture

Simple, focused architecture in a single file (429 lines):

Configuration & Validation - Environment variable checking
Helper Functions - ISO 8601 validation, error formatting
MCP Server Setup - Standard MCP protocol implementation
Tool Definitions - 8 tools with clear schemas
Tool Implementations - Clean, validated functions
Server Startup - Validation then connection

Design Principles:

✓ Simple and maintainable
✓ One file, easy to understand
✓ Clear separation of concerns
✓ Comprehensive error handling
✓ Input validation at boundaries
✓ Consistent response format

Contributing

Contributions are welcome! Please see CONTRIBUTING.md for guidelines.

Quick Start:

1. Fork the repository
2. Create a feature branch
3. Add tests for your changes
4. Ensure all tests pass (npm test)
5. Submit a pull request

Changelog

See CHANGELOG.md for version history and release notes.

Security

Environment variables for sensitive data (never hardcoded)
Basic authentication properly implemented
Input validation prevents injection attacks
Timeout prevents hanging requests
Error messages don't leak sensitive information

To report security vulnerabilities, please create a private security advisory on GitHub.

License

MIT License - see LICENSE file for details.

Acknowledgments

Built with @modelcontextprotocol/sdk
Inspired by the MCP community
Thanks to all contributors!

---

Made with ❤️ for the Claude Desktop community

For questions or support, please open an issue on GitHub
