OpenClaw · Skill
Arc Creator
Create FAIR Digital Objects following the nfdi4plants ARC specification v3.0.0.
Install
Start with the primary install command. Alternate entrypoints are included below for ClawHub and OpenClaw CLI users.
Primary command
clawhub install ingogiebel/arc-creator
ClawHub installer
npx clawhub@latest install ingogiebel/arc-creator
OpenClaw CLI
openclaw skills install ingogiebel/arc-creator
Direct OpenClaw install
openclaw install ingogiebel/arc-creator
What this skill does
Create FAIR Digital Objects following the nfdi4plants ARC specification v3.0.0.
Why it matters
The interactive, phase-by-phase approach helps prevent metadata omissions, which are easy to make when filling in fields manually and which would leave an ARC non-compliant with the ISA specification.
Typical use cases
- Setting up a new plant biology ARC from scratch
- Adding proteomics or RNA-seq assay metadata to an existing ARC
- Organizing raw mass spectrometry files into the correct ARC dataset folder
- Registering investigation contacts and publications in ISA format
- Pushing a completed ARC to a DataHUB instance for FAIR sharing
Source instructions
ARC Creator
Create FAIR Digital Objects following the nfdi4plants ARC specification v3.0.0.
Prerequisites
- git and git-lfs installed
- ARC Commander CLI at ~/bin/arc (optional but recommended)
- For DataHUB sync: Personal Access Token for git.nfdi4plants.org or datahub.hhu.de
Interactive ARC Creation Workflow
Guide the user through these phases in order. Ask questions conversationally — don't dump all questions at once. Batch 2-4 related questions per message.
Phase 1: Investigation Setup
Ask the user:
- Investigation identifier (short, lowercase-hyphenated, e.g. cold-stress-arabidopsis)
- Title (concise name for the investigation)
- Description (textual description of the research goals)
- Where to store the ARC locally (suggest /home/uranus/arc-projects/<identifier>/)
Then run scripts/create_arc.sh <path> <identifier> and set investigation metadata via:
arc investigation update -i "<id>" --title "<title>" --description "<desc>"
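The identifier convention above (short, lowercase-hyphenated) can be checked before any files are created. The following is a minimal POSIX shell sketch; is_valid_arc_id is a hypothetical helper name, not part of ARC Commander or of this skill's scripts, and the exact rule it enforces (lowercase letters and digits joined by single hyphens) is an assumption about what "lowercase-hyphenated" means here.

```shell
# Hypothetical helper (not part of ARC Commander): succeeds only for
# identifiers made of lowercase letters and digits joined by single hyphens.
is_valid_arc_id() {
    case "$1" in
        "" | -* | *- | *--*)
            return 1 ;;   # empty, leading/trailing hyphen, or doubled hyphen
        *[!abcdefghijklmnopqrstuvwxyz0123456789-]*)
            return 1 ;;   # any character outside a-z, 0-9 and "-"
        *)
            return 0 ;;
    esac
}

# Possible use before creating the ARC:
#   is_valid_arc_id "$id" && scripts/create_arc.sh "$path" "$id"
```

The character set is spelled out rather than written as a-z so the check does not depend on the locale's collation order.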
Phase 2: Studies
For each study, ask:
- Study identifier (e.g. plant-growth)
- Title and description
- Organism (for Characteristic [Organism])
- Growth conditions (temperature, light, medium, etc.)
- Source materials (what goes in — seeds, cell lines, etc.)
- Sample materials (what comes out — leaves, roots, extracts, etc.)
- Protocols — does the user have protocol documents to include?
- Factors — what experimental variables are being tested? (e.g., temperature, genotype, treatment)
Create with:
arc study init --studyidentifier "<id>"
arc study update --studyidentifier "<id>" --title "<title>" --description "<desc>"
Copy protocol files to studies/<id>/protocols/.
Copy resource files to studies/<id>/resources/.
Phase 3: Assays
For each assay, ask:
- Assay identifier (e.g. proteomics-ms, rnaseq, sugar-measurement)
- Measurement type (e.g., protein expression profiling, transcription profiling, metabolite profiling)
- Technology type (e.g., mass spectrometry, nucleotide sequencing, plate reader)
- Technology platform (e.g., Illumina NovaSeq, Bruker timsTOF)
- Data files: where are the raw data files? (will go into assays/<id>/dataset/)
- Processed data: any processed output files?
- Protocols: assay-specific protocols?
- Performers: who performed this assay? (name, affiliation, role)
Create with:
arc assay init -a "<id>" --measurementtype "<type>" --technologytype "<tech>"
Copy data to assays/<id>/dataset/, protocols to assays/<id>/protocols/.
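Because raw assay data should not change after placement, one option is to clear the write bits on the files as soon as they are copied. This is only a sketch under assumptions: place_assay_data is a hypothetical helper, it expects to run from the ARC root, and making files read-only is a local safeguard suggested here, not something the ARC specification or ARC Commander requires.

```shell
# Hypothetical helper: copy raw files into an assay's dataset folder, then
# clear the write bits on every file there. Run from the ARC root.
# Usage: place_assay_data <assay-id> <file>...
place_assay_data() {
    assay_id=$1
    shift
    dest="assays/$assay_id/dataset"
    mkdir -p "$dest" || return 1
    cp -- "$@" "$dest"/ || return 1
    # Read-only files make accidental in-place edits fail instead of
    # silently changing raw data. Directories stay writable.
    find "$dest" -type f -exec chmod a-w {} +
}
```

Only the files are changed, not the folders, so the dataset folder can still be removed or extended deliberately.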
Phase 4: Workflows (optional)
Ask if there are computational analysis steps. For each:
- Workflow identifier (e.g. deseq2-analysis, heatmap-generation)
- Description of what it does
- Code files (scripts, notebooks)
- Dependencies (Python packages, R libraries, Docker image)
Place code in workflows/<id>/.
Note: workflow.cwl is REQUIRED by the spec but is often created later. Inform the user.
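To know which workflows to mention to the user, the folders that still lack a workflow.cwl can be listed. A minimal sketch, assuming the layout described above (workflows/<id>/ under the ARC root); missing_cwl is a hypothetical helper name.

```shell
# Hypothetical helper: print every workflow folder under <arc-root>/workflows
# that does not contain a workflow.cwl yet.
# Usage: missing_cwl <arc-root>
missing_cwl() {
    for dir in "$1"/workflows/*/; do
        [ -d "$dir" ] || continue      # the glob matched nothing
        [ -f "${dir}workflow.cwl" ] || printf '%s\n' "${dir%/}"
    done
}
```

An empty result means every workflow folder already has its workflow.cwl.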
Phase 5: Runs (optional)
Ask if there are computation outputs. For each:
- Run identifier
- Which workflow produced it
- Output files (figures, tables, processed data)
Place outputs in runs/<id>/.
Phase 6: Contacts & Publications
Ask:
- Investigation contacts (name, email, affiliation, role — at minimum the PI)
- Publications (if any — DOI, PubMed ID, title, authors)
Add via:
arc investigation person register --lastname "<last>" --firstname "<first>" --email "<email>" --affiliation "<aff>"
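When there are several contacts, it can help to generate the register commands from a small list and review them before running anything. The sketch below only prints commands; it uses just the flags shown above. The tab-separated input format and the helper name print_register_commands are assumptions made for illustration.

```shell
# Hypothetical helper: read "last<TAB>first<TAB>email<TAB>affiliation" lines
# from stdin and print one register command per contact for review.
# Values containing double quotes are not escaped, and empty fields are not
# supported because consecutive tabs collapse into one separator.
print_register_commands() {
    tab=$(printf '\t')
    while IFS="$tab" read -r last first email aff; do
        [ -n "$last" ] || continue     # skip blank lines
        printf 'arc investigation person register --lastname "%s" --firstname "%s" --email "%s" --affiliation "%s"\n' \
            "$last" "$first" "$email" "$aff"
    done
}

# Possible use: print_register_commands < contacts.tsv
```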
Phase 7: Git Commit & DataHUB Sync
- Configure git user:
git config user.name "<name>"
git config user.email "<email>"
- Commit:
git add -A
git commit -m "Initial ARC: <investigation title>"
- Ask if the user wants to push to a DataHUB. If yes:
  - Ask which host (git.nfdi4plants.org, datahub.hhu.de, etc.)
  - Create remote repo (via browser or API)
  - Set remote and push
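The last step, setting the remote and pushing, might look like the sketch below. The URL layout https://<host>/<namespace>/<identifier>.git is an assumption based on the usual GitLab convention, so check it against the clone URL the DataHUB shows for the new repository. datahub_remote_url is a hypothetical helper, and <namespace> stands for the user or group that owns the repository.

```shell
# Hypothetical helper: build the HTTPS remote URL for an ARC on a DataHUB.
# Assumes the layout https://<host>/<namespace>/<identifier>.git.
# Usage: datahub_remote_url <host> <namespace> <identifier>
datahub_remote_url() {
    printf 'https://%s/%s/%s.git\n' "$1" "$2" "$3"
}

# Possible use, inside the ARC and after the remote repository exists:
#   git remote add origin "$(datahub_remote_url git.nfdi4plants.org <namespace> <identifier>)"
#   git push -u origin HEAD
```

Pushing HEAD avoids hard-coding a branch name such as main or master; when git asks for credentials over HTTPS, the Personal Access Token from the prerequisites is used as the password.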
ISA Metadata Reference
For detailed ISA-XLSX fields, annotation table columns, and ontology references, read references/arc-spec.md.
Key Reminders
- Assay data is immutable: never modify files in assays/<id>/dataset/ after initial placement
- Studies describe materials, assays describe measurements
- Workflows are code, runs are outputs
- Git LFS for files > 100 MB: git lfs track "*.fastq.gz" "*.bam" "*.raw"
- Don't store ARCs on OneDrive/Dropbox; Git + cloud sync causes conflicts
- ARC Commander CLI reference: arc <subcommand> --help
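Before the first commit, it can be useful to see which files actually cross the size threshold, so the git lfs track patterns cover them. A minimal sketch, assuming the threshold is read as 100 MiB (104857600 bytes); large_files is a hypothetical helper name.

```shell
# Hypothetical helper: list files under <dir> larger than <bytes>, skipping
# the .git folder. Defaults: current directory, 100 MiB (104857600 bytes).
# Usage: large_files [dir] [bytes]
large_files() {
    find "${1:-.}" -path '*/.git' -prune -o \
        -type f -size +"${2:-104857600}"c -print
}

# Possible use from the ARC root:
#   large_files . | sed 's/.*\.//' | sort -u    # extensions to consider tracking
```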