OpenClaw · Skill
Orchata
This document describes how to effectively use Orchata, a RAG (Retrieval-Augmented Generation) platform with tree-based document indexing. Load this into your context to interact with Orchata knowledge bases.
Install
Start with the primary install command. Alternate entrypoints are included below for ClawHub and OpenClaw CLI users.
Primary command
clawhub install ehudsn/orchata
ClawHub installer
npx clawhub@latest install ehudsn/orchata
OpenClaw CLI
openclaw skills install ehudsn/orchata
Direct OpenClaw install
openclaw install ehudsn/orchata
What this skill does
This skill teaches an AI assistant to work with Orchata knowledge bases through MCP tools: managing spaces, uploading and updating documents, running semantic queries, and browsing document trees.
Why it matters
Tree-based document indexing lets you navigate large files by section and summary rather than retrieving unstructured chunks.
Typical use cases
- Searching internal documentation with natural language queries
- Uploading and indexing new PDFs for team knowledge bases
- Navigating large documents by section without reading everything
- Discovering which knowledge spaces contain relevant content
- Checking processing status of recently uploaded documents
Source instructions
Orchata Skills
What is Orchata?
Orchata is a knowledge management platform that:
- Organizes documents into Spaces - Logical containers for related content
- Uses tree-based indexing - Documents are parsed into hierarchical structures with sections, summaries, and page ranges
- Provides semantic search - Find relevant content using natural language queries
- Exposes MCP tools - AI assistants can directly manage and query knowledge bases
Core Concepts
Spaces
A Space is a container for related documents. Think of it as a folder with semantic search capabilities.
- Each space has a name, description, and optional icon
- Descriptions are used by smart_query to recommend relevant spaces
- Spaces can be archived (soft-deleted)
Documents
A Document is content within a space. Supported formats include:
- PDF (text-based and scanned with OCR)
- Word documents (.docx)
- Excel spreadsheets (.xlsx)
- PowerPoint presentations (.pptx)
- Markdown files (.md)
- Plain text files (.txt)
- Images (PNG, JPG, etc.)
Document Status:
| Status | Description |
|---|---|
| PENDING | Uploaded, waiting for processing |
| PROCESSING | Being parsed and indexed |
| COMPLETED | Ready for queries |
| FAILED | Processing error occurred |
Important: Only query documents with status: "COMPLETED". Other statuses won't return results.
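The status rule above can be sketched as a small filter. This assumes a list_documents response has already been parsed into an array of objects with filename and status fields; that shape and the helper name are assumptions for illustration, not something the source shows.

```javascript
// Hypothetical helper: keep only documents that are ready to query.
// Assumes each document object carries a `status` string (any letter case).
function queryableDocuments(documents) {
  return documents.filter(
    (doc) => String(doc.status).toUpperCase() === "COMPLETED"
  );
}

// Example data, made up for illustration.
const docs = [
  { filename: "guide.md", status: "COMPLETED" },
  { filename: "scan.pdf", status: "PROCESSING" },
  { filename: "notes.txt", status: "completed" },
  { filename: "broken.docx", status: "FAILED" },
];

// Logs the two ready filenames: guide.md and notes.txt.
console.log(queryableDocuments(docs).map((doc) => doc.filename));
```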
Document Trees
Documents are indexed into hierarchical tree structures:
- Each tree has nodes representing sections/chapters
- Nodes contain: title, summary, startPage, endPage, textContent
- Trees enable precise navigation of large documents
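To make the tree idea concrete, here is a minimal in-memory sketch. The field names title, summary, startPage, endPage, and textContent come from the list above; the children array, the sample data, and the pathToPage helper are assumptions made for illustration and may not match how Orchata stores trees.

```javascript
// Hypothetical tree: one section with two subsections.
// The `children` field is an assumption of this sketch.
const tree = [
  {
    title: "Installation",
    summary: "How to install",
    startPage: 6,
    endPage: 12,
    textContent: "",
    children: [
      { title: "Requirements", summary: "Supported systems", startPage: 6, endPage: 7, textContent: "...", children: [] },
      { title: "Steps", summary: "Install procedure", startPage: 8, endPage: 12, textContent: "...", children: [] },
    ],
  },
];

// Return the titles from the root down to the deepest node covering `page`.
function pathToPage(nodes, page) {
  for (const node of nodes) {
    if (page >= node.startPage && page <= node.endPage) {
      return [node.title, ...pathToPage(node.children || [], page)];
    }
  }
  return []; // no node covers this page
}

console.log(pathToPage(tree, 9).join(" > ")); // Installation > Steps
```

Walking page ranges this way is what lets a reader jump to one section of a long file instead of scanning every chunk.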
Queries
Two types of queries are available:
- query_spaces - Search document content using tree-based reasoning
- smart_query - Discover which spaces are relevant for a query
MCP Tools Reference
Space Management
list_spaces
List all knowledge spaces in the organization.
list_spaces
list_spaces with status="active"
list_spaces with page=1 pageSize=20
Parameters:
- page (number, optional): Page number (default: 1)
- pageSize (number, optional): Items per page (default: 10)
- status (string, optional): Filter by active, archived, or all
manage_space
Create, get, update, or delete a space.
manage_space with action="create" name="Product Docs" description="Technical documentation"
manage_space with action="create" name="Legal" description="Case files" icon="briefcase"
manage_space with action="get" id="space_abc123"
manage_space with action="update" id="space_abc123" description="Updated description"
manage_space with action="delete" id="space_abc123"
Parameters:
- action (string, required): create, get, update, or delete
- id (string): Space ID (required for get/update/delete)
- name (string): Space name (required for create)
- description (string, optional): Space description
- icon (string, optional): Icon name. Defaults to "folder"
- slug (string, optional): URL-friendly identifier
- isArchived (boolean, optional): Archive status (for update)
Valid Icons:
folder, book, file-text, database, package, archive, briefcase, inbox, layers, box
If an invalid icon is provided, the tool returns an error with the list of valid options.
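If you want to catch a bad icon before calling manage_space, a client-side check could look like the sketch below. The icon names are copied from the list above; the resolveIcon helper and its error wording are hypothetical, and the tool's own validation remains the authority.

```javascript
// Icon names copied from the Valid Icons list.
const VALID_ICONS = [
  "folder", "book", "file-text", "database", "package",
  "archive", "briefcase", "inbox", "layers", "box",
];

// Hypothetical pre-flight check, not part of Orchata itself.
function resolveIcon(icon) {
  if (icon === undefined) return "folder"; // documented default
  if (!VALID_ICONS.includes(icon)) {
    throw new Error(`Invalid icon "${icon}". Valid icons: ${VALID_ICONS.join(", ")}`);
  }
  return icon;
}

console.log(resolveIcon("briefcase")); // briefcase
console.log(resolveIcon());            // folder
```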
Document Management
list_documents
List documents in a space.
list_documents with spaceId="space_abc123"
list_documents with spaceId="space_abc123" status="completed"
list_documents with spaceId="space_abc123" status="all"
Parameters:
- spaceId (string, required): Space ID
- page (number, optional): Page number
- pageSize (number, optional): Items per page (max: 100)
- status (string, optional): Filter by status. Values: pending, processing, completed, failed, or all. Omitting returns all documents.
Note: Status values are case-insensitive (completed and COMPLETED both work).
save_document
Upload or upsert documents (single or batch).
Single document:
save_document with spaceId="space_abc123" filename="guide.md" content="# Guide\n\nContent here..."
Batch upload:
save_document with spaceId="space_abc123" documents=[{"filename": "doc1.md", "content": "..."}, {"filename": "doc2.md", "content": "..."}]
Parameters:
- spaceId (string, required): Space ID
- filename (string): Filename (required for single)
- content (string): Content (required for single)
- documents (array, optional): Array of {filename, content, metadata} for batch
- metadata (object, optional): Custom key-value pairs
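The single and batch forms differ only in how the arguments are shaped. The sketch below shows one way to choose between them; buildSaveArgs is a hypothetical helper, and it only assembles an argument object following the parameter list above. It does not call the tool.

```javascript
// Hypothetical helper: shape save_document arguments for one or many documents.
function buildSaveArgs(spaceId, docs) {
  if (!spaceId) throw new Error("spaceId is required");
  if (docs.length === 0) throw new Error("at least one document is required");
  for (const doc of docs) {
    if (!doc.filename || typeof doc.content !== "string") {
      throw new Error("each document needs a filename and string content");
    }
  }
  // One document: flat filename/content arguments. Several: a documents array.
  return docs.length === 1
    ? { spaceId, ...docs[0] }
    : { spaceId, documents: docs };
}

const single = buildSaveArgs("space_abc123", [
  { filename: "guide.md", content: "# Guide" },
]);
const batch = buildSaveArgs("space_abc123", [
  { filename: "doc1.md", content: "..." },
  { filename: "doc2.md", content: "..." },
]);
console.log(Object.keys(single)); // spaceId, filename, content
console.log(Object.keys(batch));  // spaceId, documents
```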
get_document
Get document content by ID or filename. Returns processed markdown text.
get_document with spaceId="space_abc123" id="doc_xyz789"
get_document with spaceId="space_abc123" filename="guide.md"
get_document with spaceId="*" filename="guide.md"
Parameters:
- spaceId (string, required): Space ID, or * to search all spaces (requires filename)
- id (string, optional): Document ID
- filename (string, optional): Filename
Notes:
- Either id or filename is required
- Use spaceId="*" to search all spaces when you know the filename but not the space
- For completed documents, returns the extracted markdown text (not raw PDF binary)
- When using *, the response includes the spaceId where the document was found
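The argument rules in these notes can be expressed as a short check. This is a sketch only: checkGetDocumentArgs is a hypothetical helper that mirrors the documented rules, and the messages it returns are invented here rather than taken from the tool.

```javascript
// Hypothetical pre-flight check for get_document arguments.
// Returns a problem description, or null when the documented rules are met.
function checkGetDocumentArgs({ spaceId, id, filename }) {
  if (!spaceId) return "spaceId is required";
  if (!id && !filename) return "either id or filename is required";
  if (spaceId === "*" && !filename) return 'spaceId="*" requires filename';
  return null;
}

console.log(checkGetDocumentArgs({ spaceId: "*", filename: "guide.md" })); // null
console.log(checkGetDocumentArgs({ spaceId: "*", id: "doc_xyz789" }));
// spaceId="*" requires filename
```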
update_document
Update document content or metadata.
update_document with spaceId="space_abc123" id="doc_xyz789" content="New content..."
update_document with spaceId="space_abc123" id="doc_xyz789" append=true content="Additional content"
Parameters:
- spaceId (string, required): Space ID
- id (string, required): Document ID
- content (string, optional): New content
- metadata (object, optional): New metadata
- append (boolean, optional): Append instead of replace
- separator (string, optional): Separator for append mode
delete_document
Permanently delete a document.
delete_document with spaceId="space_abc123" id="doc_xyz789"
Parameters:
- spaceId (string, required): Space ID
- id (string, required): Document ID
Query Tools
query_spaces
Search documents across one or more spaces using tree-based reasoning.
query_spaces with query="How do I authenticate API requests?"
query_spaces with query="installation guide" spaceIds="space_abc123"
query_spaces with query="error handling" spaceIds=["space_abc", "space_def"] topK=10
Parameters:
- query (string, required): Natural language search query
- spaceIds (string or array, optional): Space ID(s) to search. Omit or use * for all spaces
- topK (number, optional): Maximum results (default: 10)
- compact (boolean, optional): Use compact format (default: false). See When to Use Compact below.
When to Use Compact:
| Mode | When to use | What you get |
|---|---|---|
| compact=false (default) | Most queries. Any time you need actual data, facts, numbers, dates, or details from documents. | Full results with document metadata, tree context, page ranges, and complete content. |
| compact=true | Broad discovery queries where you only need to know which documents are relevant, not their content. | Minimal results: just content snippet, source filename, and score. |
Rule of thumb: Default to compact=false. Only use compact=true when you're browsing/surveying and don't need the actual content yet.
Response (compact=true format):
{
  "results": [
    {
      "content": "Relevant text content...",
      "source": "filename.pdf",
      "score": 0.95
    }
  ],
  "total": 5
}
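For a discovery pass, a compact response is often easiest to read once it is collapsed to one line per source file. The sketch below works on an object in the compact shape shown above; the sample values and the rankSources helper are made up for illustration.

```javascript
// Compact response in the documented shape (values are invented).
const compactResponse = {
  results: [
    { content: "Reset your password from...", source: "faq.md", score: 0.91 },
    { content: "Password rules...", source: "security.pdf", score: 0.78 },
    { content: "Account recovery...", source: "faq.md", score: 0.64 },
  ],
  total: 3,
};

// Collapse snippets into one entry per source file, keeping the best score,
// then sort from most to least relevant.
function rankSources(response) {
  const best = new Map();
  for (const { source, score } of response.results) {
    if (!best.has(source) || score > best.get(source)) best.set(source, score);
  }
  return [...best.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([source, score]) => ({ source, score }));
}

const ranked = rankSources(compactResponse);
console.log(ranked.map((r) => r.source).join(", ")); // faq.md, security.pdf
```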
smart_query
Discover which spaces are relevant for a query using LLM reasoning.
smart_query with query="How do I install the SDK?"
smart_query with query="billing questions" maxSpaces=3
Parameters:
- query (string, required): Query to find relevant spaces for
- maxSpaces (number, optional): Maximum spaces to return (default: 5)
Response:
{
  "query": "How do I install the SDK?",
  "relevantSpaces": [
    {"spaceId": "space_abc123", "relevance": "Contains SDK installation guides"},
    {"spaceId": "space_def456", "relevance": "Has developer tutorials"}
  ],
  "totalFound": 2
}
Use case: When you don't know which space to search, use smart_query first to discover relevant spaces, then use query_spaces with those space IDs.
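The hand-off from smart_query to query_spaces amounts to pulling the space IDs out of the response. A minimal sketch, assuming the response has the shape shown above; toQuerySpacesArgs is a hypothetical helper that only builds the follow-up arguments.

```javascript
// smart_query response in the documented shape.
const smartResponse = {
  query: "How do I install the SDK?",
  relevantSpaces: [
    { spaceId: "space_abc123", relevance: "Contains SDK installation guides" },
    { spaceId: "space_def456", relevance: "Has developer tutorials" },
  ],
  totalFound: 2,
};

// Build query_spaces arguments from a smart_query response.
// With no recommended spaces, spaceIds is omitted so the search covers all spaces.
function toQuerySpacesArgs(response, topK = 10) {
  const spaceIds = response.relevantSpaces.map((space) => space.spaceId);
  return spaceIds.length > 0
    ? { query: response.query, spaceIds, topK }
    : { query: response.query, topK };
}

console.log(toQuerySpacesArgs(smartResponse).spaceIds.join(", "));
// space_abc123, space_def456
```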
Tree Visibility Tools
These tools let you explore the hierarchical structure of indexed documents.
get_document_tree
Get the tree structure of a document showing sections, summaries, and page ranges.
get_document_tree with spaceId="space_abc123" documentId="doc_xyz789"
Parameters:
- spaceId (string, required): Space ID
- documentId (string, required): Document ID
Response:
{
  "documentId": "doc_xyz789",
  "totalPages": 45,
  "totalNodes": 12,
  "nodes": [
    {
      "nodeId": "0001",
      "title": "Introduction",
      "summary": "Overview of the system architecture...",
      "pages": "1-5",
      "depth": 0
    },
    {
      "nodeId": "0002",
      "title": "Installation",
      "summary": "Step-by-step installation guide...",
      "pages": "6-12",
      "depth": 0
    }
  ]
}
Use case: Use this to understand a document's structure before drilling into specific sections.
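One way to get that overview is to print the nodes as an indented outline. The sketch below assumes a response in the shape shown above; the third node (depth 1) and the renderOutline helper are additions made up for illustration.

```javascript
// Tree response in the documented shape; node 0003 is invented to show nesting.
const treeResponse = {
  documentId: "doc_xyz789",
  nodes: [
    { nodeId: "0001", title: "Introduction", pages: "1-5", depth: 0 },
    { nodeId: "0002", title: "Installation", pages: "6-12", depth: 0 },
    { nodeId: "0003", title: "Requirements", pages: "6-8", depth: 1 },
  ],
};

// Render one line per node, indented two spaces per depth level.
function renderOutline(response) {
  return response.nodes
    .map((n) => `${"  ".repeat(n.depth)}[${n.nodeId}] ${n.title} (pp. ${n.pages})`)
    .join("\n");
}

console.log(renderOutline(treeResponse));
// [0001] Introduction (pp. 1-5)
// [0002] Installation (pp. 6-12)
//   [0003] Requirements (pp. 6-8)
```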
get_tree_node
Get the full text content of a specific tree node/section.
get_tree_node with documentId="doc_xyz789" nodeId="0002"
Parameters:
- documentId (string, required): Document ID
- nodeId (string, required): Node ID from the tree structure
Response:
{
  "documentId": "doc_xyz789",
  "filename": "manual.pdf",
  "nodeId": "0002",
  "title": "Installation",
  "summary": "Step-by-step installation guide...",
  "pages": "6-12",
  "depth": 0,
  "content": "## Installation\n\nTo install the software, follow these steps:\n\n1. Download the installer...\n\n..."
}
Use case: After viewing the tree structure, use this to read the full content of a specific section.
Workflow Patterns
Pattern 1: Search for Information (Default Approach)
For most questions, a single query_spaces call is all you need. Start here before trying multi-step workflows.
query_spaces with query="your question"
This searches all spaces with full details (compact=false by default). One call, done.
If you want to narrow to specific spaces:
query_spaces with query="your question" spaceIds="known_space_id"
If you truly don't know which spaces exist:
smart_query with query="your question"
# Then use the returned spaceIds:
query_spaces with query="your question" spaceIds=["returned_space_id"]
Avoid over-searching. The multi-step workflow (smart_query -> query_spaces -> get_document_tree -> get_tree_node) is rarely necessary. For most questions, a single query_spaces call returns the answer directly. Only escalate to tree browsing if results are insufficient.
Pattern 2: Look Up Specific Data
When looking for specific facts, numbers, dates, names, or details:
Just query directly -- one call:
query_spaces with query="total amount on invoice #1234"
The default compact=false returns full content with document metadata, so you get the actual data you need in one step. Do not use compact=true for data lookups -- it strips the detail you need.
Pattern 3: Browse a Large Document
When you need to navigate a large document's structure:
- Get the document structure:
get_document_tree with spaceId="space_id" documentId="doc_id"
- Identify relevant sections from the node titles and summaries
- Read specific sections:
get_tree_node with documentId="doc_id" nodeId="relevant_node_id"
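The middle step, picking sections from titles and summaries, can be sketched as a keyword match over the nodes returned by get_document_tree. The matchNodes helper is hypothetical, and plain substring matching is a simplification chosen for this example; it is not how Orchata ranks sections.

```javascript
// Nodes as returned in the get_document_tree response shape.
const nodes = [
  { nodeId: "0001", title: "Introduction", summary: "Overview of the system architecture..." },
  { nodeId: "0002", title: "Installation", summary: "Step-by-step installation guide..." },
];

// Return the nodeIds whose title or summary mentions the keyword (case-insensitive).
function matchNodes(nodeList, keyword) {
  const needle = keyword.toLowerCase();
  return nodeList
    .filter((n) => `${n.title} ${n.summary}`.toLowerCase().includes(needle))
    .map((n) => n.nodeId);
}

console.log(matchNodes(nodes, "install"));      // one match: 0002
console.log(matchNodes(nodes, "Architecture")); // one match: 0001
```

Each returned nodeId can then be passed to get_tree_node to read that section.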
Pattern 4: Add New Content
When adding documents to a knowledge base:
- Find or create the appropriate space:
list_spaces
# or
manage_space with action="create" name="New Space" description="..."
- Upload the content:
save_document with spaceId="space_id" filename="document.md" content="..."
- Wait for processing (status will change from PENDING -> PROCESSING -> COMPLETED)
- Verify it's ready:
list_documents with spaceId="space_id" status="COMPLETED"
manage_space - Valid Icons
When creating or updating a space, use one of these icon values:
- folder (default)
- book
- file-text
- database
- package
- archive
- briefcase
- inbox
- layers
- box
Invalid icons will return a helpful error message with the list of valid options.
list_documents - Status Parameter
The status parameter accepts the following values (case-insensitive):
- "all" - Returns documents in any status (COMPLETED, FAILED, PENDING, PROCESSING)
- "completed" - Returns only successfully processed documents
- "failed" - Returns only documents that failed processing (includes errorMessage field)
- "pending" - Returns documents waiting to be processed
- "processing" - Returns documents currently being processed
Documents with status="FAILED" will include an errorMessage field explaining what went wrong during processing.
save_document - Processing Workflow
Documents are processed asynchronously:
- save_document returns immediately with status="PROCESSING"
- Background job generates embeddings and indexes the document (typically 1-3 seconds)
- Status changes to "COMPLETED" when ready
- Document becomes searchable via query_spaces
To check completion status:
- Use get_document to check a specific document's status
- Use list_documents with status="processing" to see all processing documents
- Use list_documents with status="failed" to see any failures
Example:
// Save document
const result = await save_document({...});
// result.document.status === "PROCESSING"
// Check status after a moment (get_document requires spaceId along with id or filename)
const doc = await get_document({spaceId: "space_abc123", id: result.document.id});
// doc.status === "COMPLETED" (when ready)
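"Check status after a moment" usually means polling. The sketch below shows one way to do it, assuming the caller supplies fetchStatus, a hypothetical async function (for example a wrapper around get_document) that resolves to the document's status string. The retry count and delay are arbitrary defaults chosen for this example.

```javascript
// Poll until the document is COMPLETED, fails, or the attempts run out.
// `fetchStatus` is a hypothetical caller-supplied async function.
async function waitForDocument(fetchStatus, { attempts = 10, delayMs = 500 } = {}) {
  for (let i = 0; i < attempts; i++) {
    const status = String(await fetchStatus()).toUpperCase();
    if (status === "COMPLETED") return status;
    if (status === "FAILED") throw new Error("Document processing failed");
    // Still PENDING or PROCESSING: wait before asking again.
    await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
  throw new Error(`Document not ready after ${attempts} attempts`);
}

// Usage with a stand-in status source that completes on the third check.
const fakeStatuses = ["PROCESSING", "PROCESSING", "COMPLETED"];
let checks = 0;
waitForDocument(async () => fakeStatuses[checks++], { delayMs: 1 })
  .then((status) => console.log(status, "after", checks, "checks"));
```

Stopping on FAILED matters: a failed document never reaches COMPLETED, so a loop that waits only for success would run until it timed out.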
get_tree_node - Content Availability
get_tree_node may return "(No text content cached for this node)" for certain nodes. This occurs for:
- Structural/organizational nodes without associated text content
- Nodes that serve as section headers in the tree hierarchy
This is expected behavior.
To read actual document content:
- Use get_document to retrieve the full processed markdown
- Use query_spaces to search and retrieve relevant content chunks
The tree structure (via get_document_tree) is always available and shows document organization, summaries, and page ranges.
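A caller can detect the placeholder and decide when to fall back. This is a sketch under one assumption: that the placeholder string arrives in the content field of the get_tree_node response. The source does not say which field carries it, and nodeHasText is a hypothetical helper.

```javascript
// Placeholder text quoted in this section.
const NO_CONTENT = "(No text content cached for this node)";

// True when the node carries real text rather than nothing or the placeholder.
// Assumes the text lives in `content`, as in the get_tree_node response example.
function nodeHasText(node) {
  const text = (node.content || "").trim();
  return text.length > 0 && text !== NO_CONTENT;
}

console.log(nodeHasText({ content: "## Installation\n\nTo install..." })); // true
console.log(nodeHasText({ content: NO_CONTENT }));                          // false
```

When it returns false, fall back to get_document or query_spaces as described above.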
Best Practices
DO
- Start with a single query_spaces call - it usually has the answer in one step
- Use compact=false (the default) for most queries - you get full content and context
- Check document status before querying - only COMPLETED documents are searchable
- Use descriptive queries - natural language works best
- Use tree tools for large documents - navigate structure instead of reading everything
- Write good space descriptions - they're used by smart_query for discovery
DON'T
- Don't over-search - avoid multi-step workflows (smart_query -> query_spaces -> get_document_tree -> get_tree_node) when a single query_spaces call suffices
- Don't use compact=true for data lookups - it strips the content you need; only use it for broad discovery
- Don't query PENDING/PROCESSING documents - they won't return results
- Don't use very short queries - more context = better results
- Don't forget to check processing status after uploading new documents
Error Handling
Common errors and solutions:
| Error | Cause | Solution |
|---|---|---|
| "Document not found" | Wrong ID or no access | Verify the document ID with list_documents |
| "Space not found" | Wrong ID or archived | Use list_spaces to find valid space IDs |
| Empty search results | Document not COMPLETED or no matches | Check document status; try broader query |
| "Tree not found" | Document uses vector indexing or not processed | Check if document status is COMPLETED |
| "Invalid icon" | Icon name not in allowed list | Use one of: folder, book, file-text, database, package, archive, briefcase, inbox, layers, box |
| "No text content cached" | Tree node content not cached | This is normal for structural nodes; use get_document for full content |
Troubleshooting Tips
If save_document fails:
- Verify the space exists with manage_space with action="get" id="..."
- Ensure content is valid text/markdown
- Check that the space is not archived
If list_documents returns 0 results:
- Try status="all" or omit the status parameter entirely
- Verify the spaceId is correct with list_spaces
- Check if documents are still processing (status="processing")
If get_tree_node returns no content:
- Some nodes are structural and don't have cached text content
- Use get_document to get the full processed document text instead
- Or use query_spaces to search for specific content
Quick Reference
| Task | Tool | Example |
|---|---|---|
| List all spaces | list_spaces | list_spaces with status="active" |
| Create a space | manage_space | manage_space with action="create" name="Docs" |
| List documents | list_documents | list_documents with spaceId="..." |
| Upload content | save_document | save_document with spaceId="..." filename="..." content="..." |
| Get document text | get_document | get_document with spaceId="..." id="..." |
| Search content | query_spaces | query_spaces with query="..." |
| Find relevant spaces | smart_query | smart_query with query="..." |
| View doc structure | get_document_tree | get_document_tree with spaceId="..." documentId="..." |
| Read a section | get_tree_node | get_tree_node with documentId="..." nodeId="..." |