Hermes Agent · Optional

clip

CLIP (Contrastive Language-Image Pre-training) is OpenAI's model connecting vision and language. It enables zero-shot image classification, image-text matching, and cross-modal retrieval, and was trained on 400M image-text pairs. Use it for image search, content moderation, or vision-language tasks without fine-tuning. Best for general-purpose image understanding.
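To make the zero-shot classification idea concrete, here is a minimal sketch of the scoring step only: compare one image embedding against the embeddings of several candidate text prompts using cosine similarity, then turn the scaled similarities into probabilities with a softmax. This is not the skill's own code. The function names, the three-dimensional toy vectors, and the `logit_scale` value of 100 are all illustrative assumptions; in practice the embeddings would come from CLIP's image and text encoders and have hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def zero_shot_classify(image_embedding, text_embeddings, labels, logit_scale=100.0):
    """Score one image embedding against one text embedding per label.

    logit_scale is an assumed temperature multiplier; it sharpens the
    distribution before the softmax.
    """
    sims = [cosine_similarity(image_embedding, t) for t in text_embeddings]
    probs = softmax([logit_scale * s for s in sims])
    return dict(zip(labels, probs))


# Hypothetical toy data: made-up vectors standing in for encoder outputs.
labels = ["a photo of a cat", "a photo of a dog", "a photo of a car"]
text_embeddings = [
    [0.9, 0.1, 0.0],
    [0.1, 0.9, 0.0],
    [0.0, 0.1, 0.9],
]
image_embedding = [0.8, 0.2, 0.1]

probs = zero_shot_classify(image_embedding, text_embeddings, labels)
best_label = max(probs, key=probs.get)
print(best_label)
```

No fine-tuning is involved: changing the set of classes only means changing the list of text prompts, which is why the same scoring step also serves image search and content moderation.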

MLOps · Optional · v1.0.0 · MIT

What this skill is

This directory page tracks a Hermes-compatible skill reference and links back to the original source for install instructions, files, and updates.

Tags and platforms

Multimodal · CLIP · Vision-Language · Zero-Shot · Image Classification · OpenAI · Image Search · Cross-Modal Retrieval · Content Moderation
