OpenAlex Skill
Prerequisites
uv: Read theuvskill and follow its Setup instructions to ensure
uv is installed and on PATH.
- User Notification: If LICENSE_NOTIFICATION.txt does not already exist in
this skill directory then (1) prominently notify the user to check the terms at https://developers.openalex.org/ and to always check the license of the papers retrieved by the skill for any restrictions, then (2) create the file recording the notification text and timestamp.
.envfile: Make sure the.envfile exists in your home directory.
Create one if it does not exist.
OPENALEX_API_KEY(optional but recommended): Enables the OpenAlex
Premium API with higher rate limits. The skill works without it (using the free "polite pool"). If the variable is missing from .env, do NOT ask the user to paste it into the chat (this would leak the key into the agent's context). Instead, give the user this command — substituting ENV_FILE with the resolved literal path to the .env file:
printf "Enter OpenAlex API key (typing hidden): " && read -s key && echo && echo "OPENALEX_API_KEY=$key" >> "ENV_FILE" && echo "Saved."
The scripts load credentials automatically via dotenv. NEVER read, print, or inspect the .env file or its variables (e.g. no cat, grep, echo, printenv, or os.environ.get on keys). Credentials must stay out of the agent's context. See the Rate Limits section for more details.
Core Rules
- List Sources. If this skill is used, ensure this is mentioned in the
output AND list the URLs of all papers that were used in producing the output.
- Resolve before filter. NEVER filter by name. Always
resolvea name to
an ID first, then use that ID in --filter.
- Use the CLI only. Never call the API via
curl/urllib. The CLI
handles retries and rate limiting.
- No fabrication. Never invent OpenAlex IDs or DOIs. Use
resolve/get
to look them up. Report empty results accurately.
- API key. If a command returns 401/429 or you need high-volume queries,
follow the prerequisite instructions above to help the user add OPENALEX_API_KEY to the .env file. Keys are at OpenAlex.org → account settings.
- Keep output small. Always use
--selectand--per-page 5–10for
overview queries. Pipe filter output to a file (> results.json), then slim with jq before reading into context.
Rate Limits
- With key: ~10 req/s, $1/day free budget.
- Without key: Very limited, $0.01/day budget.
Operation | Cost ---------------------- | ------- Singleton get | Free filter | $0.0001 --search / resolve | $0.001 download-pdf | $0.01
CLI Reference
uv run scripts/openalex_cli.py [--api-key KEY] <command> [flags]
Entity types (shared across commands): works, authors, sources, institutions, topics, domains, fields, subfields, sdgs, countries, continents, languages, keywords, publishers, funders, work-types, source-types, institution-types, licenses
Commands
resolve <entity> <query> — Name → ID candidates. Returns id, display_name, hint. Use --per-page N for more candidates.
get <entity> <id> — Full metadata for one entity. Accepts short ID (W2741809807), full URL, or DOI URL. Use --select to limit fields.
filter <entity> — Search/filter entities. Key flags are:
--search <query>: Full-text search (10× cost of--filter)--filter <expr>: Filter expressions. Use,for AND and|for OR.--sort <field:dir>: Sort results (e.g.,cited_by_count:desc)--select <fields>: Limit the fields returned in the output.--group-by <field>: Aggregate results by a specific field.--per-page <N>: Number of results per page (default 25, max 100).--page <N>: Specify the page number to retrieve.--sample <N>: Get a random sample of up to 10,000 results.--seed <N>: Seed for reproducible sampling.
download-pdf <work-id> <output-path> — Download PDF (requires API key). Falls back to alternative pdf_url locations if primary fails. Whenever you download a PDF, verify it is not empty or corrupted.
rate-limit — Check current rate limit status (requires API key).
Search Tips
- If
resolvereturns no matches, try alternate spellings or abbreviations. - If
--searchreturns 0 results, try broader terms (max 3 retries). - If
resolvereturns multiple candidates, present them to the user with
display_name and hint for manual selection.
Entity References
Consult references/ for valid filter, sort, and group-by fields per entity:
— Taxonomy
Common Workflows
# Author's works (resolve → filter)
uv run scripts/openalex_cli.py resolve authors "Geoffrey Hinton"
uv run scripts/openalex_cli.py filter works \
--filter "authorships.author.id:A5108093963" \
--sort "cited_by_count:desc" --per-page 10 > papers.json
cat papers.json | jq '[.results[] | {id, title: .display_name, year: .publication_year, citations: .cited_by_count}]'
# DOI lookup
uv run scripts/openalex_cli.py get works "https://doi.org/10.1038/s41586-021-03819-2"
# Bulk DOI lookup (up to 100)
uv run scripts/openalex_cli.py filter works \
--filter "doi:10.1234/a|10.1234/b|10.1234/c" --per-page 100 > results.json
# Institutional impact by year
uv run scripts/openalex_cli.py resolve institutions "MIT"
uv run scripts/openalex_cli.py filter works \
--filter "authorships.institutions.id:I63966007" \
--group-by "publication_year" > mit_by_year.json
# Random sample
uv run scripts/openalex_cli.py filter works \
--filter "publication_year:2023,is_oa:true" \
--sample 100 --seed 42 > results.json
Error Handling
Code | Meaning | Action ---- | ------------------- | ------------------------------------------------ 401 | Unauthorized | Help user add API key to .env (see prereqs) 403 | Plan upgrade needed | Inform user; see https://openalex.org/pricing 404 | Not found | Verify ID; try resolve first 429 | Rate limited | Wait and retry; suggest adding API key to .env
Known premium-only filters: from_updated_date, to_updated_date.
Never fabricate results on empty responses — report accurately and suggest alternate search terms.

