Agnes AI Generation
Use this skill to call Agnes text, image, and video generation APIs through https://apihub.agnes-ai.com.
Quick Start
- Read
references/api.mdwhen endpoint details, parameters, or response fields are needed. - Use
scripts/agnes_api.pyfor real API calls instead of rewriting curl by hand. - Require an API key in
AGNES_API_KEY,AGNES_API_TOKEN, orAPIHUB_AGNES_API_KEY. Never print the key. - For light live verification, run
smoke-test; it avoids video creation by default. Add--include-image-editfor image-to-image, and add--video-case <case>explicitly for video modes. Treat the skill as fully tested only when basic text, text streaming, text tool calling, text-to-image, image-to-image, text-to-video, image-to-video, multi-image video, keyframe video, and video retrieval return successful responses.
Commands
Text generation:
python scripts/agnes_api.py text --prompt "Write a concise product tagline for an AI assistant."
Streaming text:
python scripts/agnes_api.py text --prompt "Write a short product intro." --stream
Streaming output is normalized and includes aggregated content, events, done, and a short raw_prefix.
Image generation:
python scripts/agnes_api.py image --prompt "A luminous floating city above a misty canyon at sunrise, cinematic realism" --size 1024x768
Image-to-image:
python scripts/agnes_api.py image --prompt "Turn the scene into a rainy cyberpunk night while preserving composition" --image https://example.com/input.png --size 1024x768
Text-to-video with polling:
python scripts/agnes_api.py video --prompt "A cinematic shot of a cat walking on the beach at sunset" --poll
Image-to-video:
python scripts/agnes_api.py video --prompt "Animate subtle camera movement and natural lighting" --image https://example.com/image.png --poll
Keyframe / multi-image video:
python scripts/agnes_api.py video --prompt "Create a smooth cinematic transition between the two keyframes" --image https://example.com/a.png --image https://example.com/b.png --mode keyframes --poll
Retrieve a video task:
python scripts/agnes_api.py video-get task_123456
Light live smoke test:
python scripts/agnes_api.py smoke-test
Image edit smoke test:
python scripts/agnes_api.py smoke-test --include-image-edit
Single video smoke test:
python scripts/agnes_api.py smoke-test --video-case text-to-video
Workflow
- Prefer
agnes-2.0-flashfor text chat/completions. - Prefer
agnes-image-2.1-flashfor text-to-image, image-to-image, and high-information-density image generation. High-density generation is prompt-driven; include subject hierarchy, environment, secondary details, lighting, composition, and quality requirements. - Prefer
agnes-video-v2.0for text-to-video, image-to-video, multi-image video, keyframe animation, prompt-based motion and scene control, cinematic output, asynchronous task creation, polling-based result retrieval, and seed-based reproducibility. - For image and video generation, convert any non-English user prompt to a fluent English generation prompt before calling the image/video API. English prompts are more stable for Agnes video generation. Preserve concrete visual details, style, lighting, composition, motion, camera instructions, and constraints during translation.
- For videos, remember the API is asynchronous: create a task first, then poll or retrieve by task id.
- The script validates image sizes, video frame counts, frame rates, and dimensions before sending requests.
num_framesmust be8n + 1and<= 441;81or121are good short values. - The video command defaults to
num_frames=121andframe_rate=24for more stable generation. Video smoke tests default tonum_frames=81andframe_rate=24. - Warn the user before costly or long-running live video generation unless they explicitly asked to test or generate video.
- Test video capabilities one at a time with
smoke-test --video-case <case>to avoid creating many tasks at once. Supported cases aretext-to-video,image-to-video,multi-image, andkeyframes.
Current Validation Notes
- Confirmed locally: skill metadata validation and Python syntax.
- Confirmed by live API: basic text, streaming text, tool-calling request shape, text-to-image, image-to-image, high-information-density text-to-image, Chinese prompt translation for image/video, completed text-to-video URL retrieval, and completed image-to-video URL retrieval.
- Caveat: Agnes may accept tool-calling request parameters without consistently returning
tool_calls; usesmoke-test --strict-toolswhen strict tool-call validation is required. - Supported by the script and smoke-test selector, but not re-run end-to-end in the latest pass: multi-image video and keyframe animation.
- Not yet confirmed end-to-end: completed URL retrieval for every multi-image video and keyframe animation task. A previous text-to-video task returned a provider-side
division by zeroerror, so keep video retries visible and report provider errors clearly.
Output Handling
- Return or save generated URLs from the JSON response.
- For image responses, expect URL-style results when
extra_body.response_formatisurl. - For video responses, extract URLs from
video_url,url, orremixed_from_video_idwhenstatusiscompleted. - If a request fails, report HTTP status and provider error body without exposing the API key.

