OpenClaw · Skill
Nvidia Kimi Vision
Fast image analysis using Kimi K2.5 multimodal model from NVIDIA NIM.
Install
Start with the primary install command. Alternate entrypoints are included below for ClawHub and OpenClaw CLI users.
Primary command
clawhub install miladnoo/nvidia-kimi-visionClawHub installer
npx clawhub@latest install miladnoo/nvidia-kimi-visionOpenClaw CLI
openclaw skills install miladnoo/nvidia-kimi-visionDirect OpenClaw install
openclaw install miladnoo/nvidia-kimi-visionWhat this skill does
Fast image analysis using Kimi K2.5 multimodal model from NVIDIA NIM.
Why it matters
Gives any agent or model image understanding without requiring a native vision model, using NVIDIA NIM's free tier for fast inference.
Typical use cases
- Extract text from a screenshot
- Describe what's in a photo
- Analyze a meme or diagram
- Identify objects in an image
- Add vision to a non-vision LLM workflow
Source instructions
NVIDIA Kimi Vision
Fast image analysis using Kimi K2.5 multimodal model from NVIDIA NIM.
Why This Skill?
- Fast - NVIDIA NIM inference
- Quality - Kimi K2.5 is a solid vision model
- Simple - Just pass an image and prompt
- Free tier - Available through NVIDIA build.nvidia.com
API Setup (IMPORTANT)
When using this skill, if no API key is found, it will automatically guide the user through setup:
Step 1: Get a Free API Key
- Go to https://build.nvidia.com
- Sign up / Log in with GitHub or Google
- Search for "Kimi K2.5"
- Click on the model and get your free API key
Step 2: Save the Key
# Option A: Save to file (recommended)
mkdir -p ~/.config
echo 'your-api-key-here' > ~/.config/nvidia-kimi-api-key
# Option B: Pass directly when running
python3 scripts/analyze_image.py photo.jpg "What's this?" sk-your-key-here
First Time Setup (for agents)
When a user tries to use this skill without an API key, the script will output clear setup instructions. Guide them through:
- Visiting https://build.nvidia.com
- Getting their free API key
- Saving it to ~/.config/nvidia-kimi-api-key
Usage
python3 scripts/analyze_image.py <image_path> "<prompt>" [api_key]
Examples
# What's in this image?
python3 scripts/analyze_image.py "/path/to/image.jpg" "Describe what's in this image"
# Extract text from screenshot
python3 scripts/analyze_image.py "/path/screenshot.png" "Extract all text"
# Analyze a meme
python3 scripts/analyze_image.py "/path/meme.jpg" "Explain this meme"
# With API key inline
python3 scripts/analyze_image.py photo.jpg "What's this?" sk-xxxxx
Image Formats
Supports: png, jpg, jpeg, webp
Rate Limits
The free tier through NVIDIA NIM has some limits, but they're not clearly documented on the site. Check https://build.nvidia.com for the latest info on your specific key's limits.