Prerequisites
- uv: Read the uv skill and follow its Setup instructions to ensure
uv is installed and on PATH.
- User Notification: If LICENSE_NOTIFICATION.txt does not already exist in
this skill directory then (1) prominently notify the user to check the terms at https://search.foldseek.com/search and https://github.com/steineggerlab/foldseek, then (2) create the file recording the notification text and timestamp.
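The notification check above can be sketched in Python as follows. This is a minimal illustration, not part of the skill's scripts: the function name `ensure_license_notification`, the wording of `NOTICE_TEXT`, and the UTC ISO-8601 timestamp format are all assumptions made for the example.

```python
from datetime import datetime, timezone
from pathlib import Path

NOTICE_FILE = "LICENSE_NOTIFICATION.txt"
# Assumed wording; the skill only requires that the user is pointed at both URLs.
NOTICE_TEXT = (
    "NOTICE: Please check the terms of use at "
    "https://search.foldseek.com/search and "
    "https://github.com/steineggerlab/foldseek before using this skill."
)


def ensure_license_notification(skill_dir):
    """Notify once per skill directory; return True if a notice was shown."""
    marker = Path(skill_dir) / NOTICE_FILE
    if marker.exists():
        # The user was already notified on an earlier run.
        return False
    # (1) Notify the user first ...
    print(NOTICE_TEXT)
    # (2) ... then record the notification text and a timestamp.
    stamp = datetime.now(timezone.utc).isoformat()
    marker.write_text(f"{NOTICE_TEXT}\nNotified at: {stamp}\n", encoding="utf-8")
    return True
```

Calling the function a second time with the same directory does nothing, because the marker file already exists.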
Goal
Submit a user-provided 3D protein structure file (.cif, .mmcif, or .pdb) to the Foldseek web server API to find structurally similar proteins. Report the top structural hits, interpret key alignment metrics, summarize the inferred protein functions, save the Markdown-formatted table to a .md file, and save the full detailed results to a local JSON file.
Core Rules
- File Requirement: This tool absolutely cannot search by sequence, name,
or accession ID. It strictly requires a .pdb, .cif, or .mmcif file path.
- Strict Validation: Never bypass the input validation or the database
allowlist check.
- Do Not Parse the JSON: Rely entirely on the generated .md file for
your immediate summary. The JSON is saved purely for subsequent, specialized tool use.
- No Raw Parsing: Do not attempt to parse or read the raw 3D coordinates
yourself; always pass the file to the script.
- Notification: Whenever this skill is used, state in the output that it
was used.
Instructions
- Strict Input Validation: Verify that the user has explicitly provided a
valid path to a .cif, .mmcif, or .pdb file in their workspace.
- If the user provided a protein name, an amino acid sequence, or an
accession ID (e.g., a UniProt ID) but NO downloaded structure file, halt immediately. Do not run the script.
- Inform the user that Foldseek requires a physical 3D coordinate file,
and suggest downloading the structure first (e.g., using the AlphaFold fetch tool).
- Database Validation: Check if the user requested specific databases to
search.
- Allowed List: afdb50, afdb-swissprot, pdb100, BFVD,
mgnify_esm30, cath50, gmgcl_id, bfmd, afdb-proteome.
- If the user requests a database NOT on this list, halt immediately.
Do not run the script. Inform the user that the database is unsupported and provide them with the allowed list.
- Generate File Names: Generate descriptive output file names for both the
JSON data and the Markdown table based on the input file (e.g., proteinA_foldseek_results.json and proteinA_foldseek_results.md).
- Execute the Python script based on the user's request, redirecting the
standard output into your generated .md file:
- Default (No databases specified): `uv run scripts/search.py
<path-to-file> -o <generated-filename.json> > <generated-filename.md>`
- Custom (Valid databases specified): `uv run scripts/search.py
<path-to-file> -o <generated-filename.json> --databases <db1,db2,db3> > <generated-filename.md>`
- The script will query the databases, save the full JSON payload, and write a
Markdown-formatted table to your specified .md file.
- Read the Results: Open and read the newly generated .md file carefully
to view the Markdown table.
- Interpret the Metrics: For the user, summarize the top 3 to 5 structural matches that
have meaningful annotations. When reporting, assess the match quality using these specific fields:
- Prob (Probability): Values approaching 1.0 (100%) indicate high
confidence that the hit is a true structural homologue.
- Q-Cov (Query Coverage): High percentages mean the match covers the
majority of the query protein's overall shape, rather than just a small local motif.
- E-value & Seq Identity: Use these to provide additional evolutionary
context.
- Perform Functional Analysis: Analyze the text descriptions embedded
within the Target ID column for the reported matches.
- Explicitly report the specific protein names/functions of the top
structural homologues.
- Provide a synthesized overview summarizing the entire variety of
different functions, domains, or protein families found across the whole list of homologues (e.g., "Most hits are portal proteins, but there is also a distinct cluster of viral capsid matches...").
- Explicitly inform the user of both newly created files (.json and .md)
and their locations so they can be seamlessly used in subsequent analysis steps.
- If the API returns an error or the file is missing, inform the user clearly
and ask them to verify the file path.
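The validation, file-naming, and execution steps above can be sketched as one pre-flight helper. This is an illustration under stated assumptions, not part of the skill: `build_search_command` and `run_search` are hypothetical names, the extension check is assumed to be case-insensitive, database names are matched exactly as written in the allowed list, and output files are assumed to go in the current working directory. Only the `uv run scripts/search.py` invocation and its `-o` and `--databases` options come from the instructions.

```python
import subprocess
from pathlib import Path

ALLOWED_EXTENSIONS = {".pdb", ".cif", ".mmcif"}
ALLOWED_DATABASES = {
    "afdb50", "afdb-swissprot", "pdb100", "BFVD", "mgnify_esm30",
    "cath50", "gmgcl_id", "bfmd", "afdb-proteome",
}


def build_search_command(structure_path, databases=None):
    """Validate the inputs and return (argv, json_name, md_name)."""
    path = Path(structure_path)
    # Assumption: the extension is compared case-insensitively.
    if path.suffix.lower() not in ALLOWED_EXTENSIONS:
        raise ValueError(
            f"{path.name!r} is not a .pdb, .cif, or .mmcif structure file"
        )
    if not path.is_file():
        raise FileNotFoundError(f"Structure file not found: {path}")

    requested = list(databases or [])
    unsupported = [db for db in requested if db not in ALLOWED_DATABASES]
    if unsupported:
        raise ValueError(
            f"Unsupported database(s): {', '.join(unsupported)}. "
            f"Allowed: {', '.join(sorted(ALLOWED_DATABASES))}"
        )

    # Descriptive output names derived from the input file name.
    json_name = f"{path.stem}_foldseek_results.json"
    md_name = f"{path.stem}_foldseek_results.md"

    argv = ["uv", "run", "scripts/search.py", str(path), "-o", json_name]
    if requested:
        argv += ["--databases", ",".join(requested)]
    return argv, json_name, md_name


def run_search(structure_path, databases=None):
    """Run the search, redirecting standard output into the .md file."""
    argv, json_name, md_name = build_search_command(structure_path, databases)
    with open(md_name, "w", encoding="utf-8") as md_file:
        subprocess.run(argv, stdout=md_file, check=True)
    return json_name, md_name
```

Because `build_search_command` raises before any command is assembled, a missing structure file or an unsupported database halts the workflow without running the script. The caller can turn the exception message into the explanation given to the user.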

