Image Tools - Background Removal, Upscaling, Face Restoration

fasuizu-br/speech-ai-examples

Summary

Image tools: background removal (BiRefNet), upscaling, face restoration. GPU-accelerated.

Brainiall AI APIs

Production AI APIs for speech, text, image, and LLM inference. Available as REST endpoints and MCP servers for AI agents.

Base URL: https://apim-ai-apis.azure-api.net

Full API reference for LLMs: llms-full.txt | llms.txt

Products

| Product | Endpoints | Latency | Notes |
|---------|-----------|---------|-------|
| Pronunciation Assessment | /v1/pronunciation/assess/base64 | <500ms | 17MB ONNX, per-phoneme scoring (39 ARPAbet) |
| Text-to-Speech | /v1/tts/synthesize | <1s | 12 voices (American + British), 24kHz WAV |
| Speech-to-Text | /v1/stt/transcribe/base64 | <500ms | Compact 17MB model, English, word timestamps |
| Whisper Pro | /v1/whisper/transcribe/base64 | <3s | 99 languages, speaker diarization |
| NLP Suite | /v1/nlp/{toxicity,sentiment,entities,pii,language} | <50ms | CPU-only, ONNX, 5 endpoints |
| Image Processing | /v1/image/{remove-background,upscale,restore-face}/base64 | <3s | GPU (A10), BiRefNet + ESRGAN + GFPGAN |
| LLM Gateway | /v1/chat/completions | varies | 113+ models, OpenAI-compatible, streaming |

Authentication

Include ONE of these headers in every request:

Ocp-Apim-Subscription-Key: YOUR_KEY
Authorization: Bearer YOUR_KEY
api-key: YOUR_KEY
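
The three header options above can be wrapped in a small helper so the rest of a client only deals with one choice. This is a minimal sketch, assuming the three headers are interchangeable as the list states; the function name `auth_headers` and the `scheme` labels are hypothetical and not part of the API.

```python
def auth_headers(api_key: str, scheme: str = "subscription") -> dict:
    """Return exactly one of the three accepted authentication headers.

    The scheme labels ("subscription", "bearer", "api-key") are
    illustrative names chosen for this sketch, not API parameters.
    """
    if scheme == "subscription":
        return {"Ocp-Apim-Subscription-Key": api_key}
    if scheme == "bearer":
        return {"Authorization": f"Bearer {api_key}"}
    if scheme == "api-key":
        return {"api-key": api_key}
    raise ValueError(f"unknown scheme: {scheme!r}")
```

For example, `auth_headers("YOUR_KEY", "bearer")` yields the `Authorization: Bearer YOUR_KEY` form, which is the one the OpenAI SDK sends by default.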

Get API keys at the portal: sign in with GitHub, purchase credits, then create a key.

Quick Start

Python — LLM Gateway (OpenAI SDK)

from openai import OpenAI

client = OpenAI(
    base_url="https://apim-ai-apis.azure-api.net/v1",
    api_key="YOUR_KEY"
)

response = client.chat.completions.create(
    model="claude-sonnet",
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)

Python — Pronunciation Assessment

import base64
import requests

# Read the recording and base64-encode it for the JSON body
with open("audio.wav", "rb") as f:
    audio_b64 = base64.b64encode(f.read()).decode()
r = requests.post(
    "https://apim-ai-apis.azure-api.net/v1/pronunciation/assess/base64",
    headers={"Ocp-Apim-Subscription-Key": "YOUR_KEY"},
    json={"audio": audio_b64, "text": "Hello world", "format": "wav"}
)
print(r.json()["overallScore"])  # 0-100

Python — NLP Pipeline

import requests

headers = {"Ocp-Apim-Subscription-Key": "YOUR_KEY"}
base = "https://apim-ai-apis.azure-api.net/v1/nlp"

# Sentiment
r = requests.post(f"{base}/sentiment", headers=headers, json={"text": "I love this!"})
print(r.json())  # {"label": "positive", "score": 0.9987}

# PII detection with redaction
r = requests.post(f"{base}/pii", headers=headers, json={"text": "Email john@acme.com", "redact": True})
print(r.json()["redacted_text"])  # "Email [EMAIL]"

Node.js — LLM Gateway

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://apim-ai-apis.azure-api.net/v1",
  apiKey: "YOUR_KEY"
});

const res = await client.chat.completions.create({
  model: "claude-sonnet",
  messages: [{ role: "user", content: "Hello!" }]
});
console.log(res.choices[0].message.content);

curl — Image Background Removal

curl -X POST https://apim-ai-apis.azure-api.net/v1/image/remove-background/base64 \
  -H "Ocp-Apim-Subscription-Key: YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d "{\"image\": \"$(base64 < photo.jpg | tr -d '\n')\"}"
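
The same call can be written in Python to match the other quick-start snippets. This is a sketch using only the standard library; the request body (`{"image": "<base64>"}`) is taken from the curl example, while the helper names `build_remove_background_request` and `remove_background` are hypothetical. The response schema is not described in this README, so the code returns the parsed JSON as-is and assumes nothing about its fields.

```python
import base64
import json
import urllib.request

URL = "https://apim-ai-apis.azure-api.net/v1/image/remove-background/base64"


def build_remove_background_request(image_bytes: bytes, api_key: str) -> urllib.request.Request:
    """Build the POST request without sending it (hypothetical helper)."""
    # Same body as the curl example: {"image": "<base64 string>"}
    payload = {"image": base64.b64encode(image_bytes).decode("ascii")}
    return urllib.request.Request(
        URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Ocp-Apim-Subscription-Key": api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def remove_background(path: str, api_key: str):
    """Send the image at `path` and return the parsed JSON response."""
    with open(path, "rb") as f:
        req = build_remove_background_request(f.read(), api_key)
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)
```

Encoding in Python avoids the line-wrapping differences between `base64` implementations on macOS and Linux.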

LLM Gateway — Popular Models

| Model | Alias | Price ($/MTok in/out) |
|-------|-------|----------------------|
| Claude Opus 4.6 | claude-opus | $5 / $25 |
| Claude Sonnet 4.6 | claude-sonnet | $3 / $15 |
| Claude Haiku 4.5 | claude-haiku | $1 / $5 |
| DeepSeek R1 | deepseek-r1 | $1.35 / $5.40 |
| DeepSeek V3 | deepseek-v3 | $0.27 / $1.10 |
| Llama 3.3 70B | llama-3.3-70b | $0.72 / $0.72 |
| Amazon Nova Pro | nova-pro | $0.80 / $3.20 |
| Amazon Nova Micro | nova-micro | $0.035 / $0.14 |
| Mistral Large 3 | mistral-large-3 | $2 / $6 |
| Qwen3 32B | qwen3-32b | $0.35 / $0.35 |
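
Because prices are quoted per million tokens with separate input and output rates, the cost of a single request is a short piece of arithmetic. The sketch below copies a few rows from the table above; `PRICES` and `estimate_cost` are illustrative names, and the result is an estimate that ignores any fees or discounts the README does not mention.

```python
# (input, output) price in USD per million tokens, copied from the table above
PRICES = {
    "claude-opus": (5.00, 25.00),
    "claude-sonnet": (3.00, 15.00),
    "claude-haiku": (1.00, 5.00),
    "deepseek-v3": (0.27, 1.10),
    "nova-micro": (0.035, 0.14),
}


def estimate_cost(alias: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one request for the given model alias."""
    price_in, price_out = PRICES[alias]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000
```

For instance, a `claude-haiku` call with 2,000 input tokens and 500 output tokens comes to (2,000 × $1 + 500 × $5) / 1,000,000 = $0.0045.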

Full list: GET /v1/models (113+ models from 17 providers).

Supports: streaming SSE, tool calling, structured output (json_object/json_schema), extended thinking.
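
For clients that read the SSE stream directly instead of going through the OpenAI SDK, the text can be reassembled from the `data:` lines. This is a minimal sketch that assumes the gateway emits the standard OpenAI chat-completion chunk format (JSON chunks with `choices[].delta.content`, terminated by `data: [DONE]`); the README implies this by calling the endpoint OpenAI-compatible but does not spell it out. `iter_stream_text` is a hypothetical helper name.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style SSE lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel in the OpenAI format
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text
```

The function takes any iterable of decoded lines, so it can be fed from an HTTP response iterator or from a recorded transcript when testing.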

Works with: OpenAI SDK, LiteLLM, LangChain, Cline, Cursor, Aider, Continue, SillyTavern, Open WebUI.

MCP Servers (for AI Agents)

Three MCP servers expose 20 tools in total over the Streamable HTTP transport.

| Server | URL | Tools |
|--------|-----|-------|
| Speech AI | https://apim-ai-apis.azure-api.net/mcp/pronunciation/mcp | 10 tools + 8 resources + 3 prompts |
| NLP Tools | https://apim-ai-apis.azure-api.net/mcp/nlp/mcp | 6 tools + 3 resources + 3 prompts |
| Image Tools | https://apim-ai-apis.azure-api.net/mcp/image/mcp | 4 tools + 3 resources + 2 prompts |

MCP Configuration (Claude Desktop / Cursor / Cline)

{
  "mcpServers": {
    "brainiall-speech": {
      "url": "https://apim-ai-apis.azure-api.net/mcp/pronunciation/mcp",
      "headers": { "Ocp-Apim-Subscription-Key": "YOUR_KEY" }
    },
    "brainiall-nlp": {
      "url": "https://apim-ai-apis.azure-api.net/mcp/nlp/mcp",
      "headers": { "Ocp-Apim-Subscription-Key": "YOUR_KEY" }
    },
    "brainiall-image": {
      "url": "https://apim-ai-apis.azure-api.net/mcp/image/mcp",
      "headers": { "Ocp-Apim-Subscription-Key": "YOUR_KEY" }
    }
  }
}
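
The three entries in the configuration above differ only in the server name and one path segment, so the file can also be generated rather than edited by hand. This is a sketch, not an official tool; `mcp_config`, `BASE`, and `SERVERS` are names invented for this example, and the output mirrors the JSON shown above.

```python
import json

BASE = "https://apim-ai-apis.azure-api.net/mcp"

# Config entry name -> path segment, taken from the configuration above
SERVERS = {
    "brainiall-speech": "pronunciation",
    "brainiall-nlp": "nlp",
    "brainiall-image": "image",
}


def mcp_config(api_key: str, servers: dict = SERVERS) -> dict:
    """Build the mcpServers block for the selected servers."""
    return {
        "mcpServers": {
            name: {
                "url": f"{BASE}/{path}/mcp",
                "headers": {"Ocp-Apim-Subscription-Key": api_key},
            }
            for name, path in servers.items()
        }
    }


# Print a config ready to paste into the client's settings file
print(json.dumps(mcp_config("YOUR_KEY"), indent=2))
```

Passing a subset, for example `mcp_config(key, {"brainiall-image": "image"})`, produces a config that registers only the image server.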

Also available on: Smithery (score 95/100) | MCPize | Apify ($0.02/call) | MCP Registry

Examples

| File | Description |
|------|-------------|
| python/basic_usage.py | Speech APIs: assess, transcribe, synthesize |
| python/pronunciation_tutor.py | Interactive pronunciation tutor |
| javascript/basic_usage.js | Node.js examples for speech APIs |
| curl/examples.sh | curl commands for every endpoint |
| mcp/claude-desktop-config.json | MCP config for Claude Desktop |
| mcp/cursor-config.json | MCP config for Cursor IDE |
| llms-full.txt | Complete API reference for LLM consumption |

Pricing

| Product | Price | Unit |
|---------|-------|------|
| Pronunciation | $0.02 | per call |
| TTS | $0.01-0.03 | per 1K chars |
| STT (compact) | $0.01 | per request |
| Whisper Pro | $0.02 | per minute |
| NLP (any) | $0.001-0.002 | per call |
| Image (any) | $0.003-0.005 | per image |
| LLM Gateway | varies by model (see the Popular Models table) | per MTok |

Credit packages: $5, $10, $25, $50, $100. Portal | Azure Marketplace (search "Brainiall").

License

MIT — Brainiall
