VectorSpaceLab/Auto-ML-Skills

Auto-ML-Skills

Discussions | Skill Library | License

💡 Introduction

Modern coding agents can already write useful machine-learning code, but they often struggle with real ML repositories:

  • Package selection: agents may not know which library best fits a task when multiple ML, LLM, RAG, bio/chem, vision, or MLOps projects overlap.
  • Repo-specific usage: agents often misuse local APIs, config files, launch commands, or data formats, then spend extra turns debugging avoidable errors.
  • Current-code awareness: agents need repository-grounded guidance when the right answer depends on today's source tree, examples, tests, and package metadata.
  • Costly trial and error: agents can waste time, tokens, downloads, or GPU runs when they explore an unfamiliar repository without a reliable operating map.

Auto-ML-Skills is a skill library for automated machine learning. It equips agents with repository-specific and paper-derived operating knowledge so they can work with ML software more accurately and with fewer wasted tokens.

This repository provides:

  • Runtime skill library: repo-skills/ contains skills for common ML, LLM, agent, RAG, bio/chem, vision, MLOps, and scientific Python projects.
  • DisCo CLI: src/ contains a pi-based CLI for creating, verifying, refreshing, extending, importing, and maintaining repo skills. Repo-skill creation includes assertion-backed usability cases, content-level self-refine, native example/test checks when safe, static verification, coverage reports, and import-readiness gates. The CLI also turns AI research papers into modular Agent Skills through the integrated Paper2Skills Distiller workflow.
  • Meta skills: meta-skills/ contains lightweight repo-skill and paper-to-skill workflows that can be copied into other agents when you do not need the full DisCo CLI.

With Auto-ML-Skills, you can:

  1. Use ready-made skills: install high-quality ML repo skills generated by DisCo into your own agent to improve its efficiency on ML tasks.
  2. Build new skills: use DisCo to create repo skills for your own repositories, verify them through the built-in verification workflow, and optionally contribute those skills back to this library.
  3. Distill research papers: run DisCo with --source paper to convert a PDF, arXiv ID or URL, paper title, or paper/repo pair into reusable module-level skills.
  4. Bring workflows into agents: import the provided meta skills into Codex or Claude Code so they can run DisCo-style repo-skill and paper-to-skill workflows.

📣 News

  • 2026-06-28: Initial release of Auto-ML-Skills, including the public runtime skill library, the DisCo CLI for repo-skill and paper-to-skill workflows, and the companion meta skills for bringing DisCo workflows into agents such as Codex and Claude Code.

⚙️ Installation

Install DisCo

Install the DisCo CLI from npm:

npm install -g @auto-ml-skills/disco
disco

DisCo requires Node.js >=22.19.0. The underlying pi toolkit natively supports 35 model providers, and DisCo inherits that provider layer. Configure at least one provider in the startup flow with /login, or set an environment variable such as OPENAI_API_KEY, ANTHROPIC_API_KEY, GEMINI_API_KEY, OPENROUTER_API_KEY, or MISTRAL_API_KEY.
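
Before installing, it can help to confirm that the local Node.js meets the stated minimum and that a provider key is exported. The following is a minimal sketch, not part of DisCo: version_at_least is a hypothetical helper, it assumes a sort that supports -V (GNU coreutils and recent BSD/macOS), and the key value is a placeholder.

```shell
# Minimum Node.js version stated in the requirements above.
required="22.19.0"

# Hypothetical helper: succeeds when version $1 is at least version $2.
# Relies on `sort -V` (version sort) being available.
version_at_least() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

if command -v node >/dev/null 2>&1; then
  current="$(node --version | sed 's/^v//')"   # strip the leading "v"
  if version_at_least "$current" "$required"; then
    echo "Node.js $current satisfies >=$required"
  else
    echo "Node.js $current is older than $required"
  fi
else
  echo "node not found on PATH"
fi

# Export at least one provider key; the value below is a placeholder.
export OPENAI_API_KEY="your-api-key-here"
```

Any of the other listed variables can be exported in the same way if you use a different provider.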

You can also build from source:

git clone https://github.com/VectorSpaceLab/Auto-ML-Skills.git
cd Auto-ML-Skills
bash scripts/build-from-source-link.sh

The script installs workspace dependencies, builds the TypeScript packages, and links the disco command globally for local use.

Install The Skill Library

Clone this repository and copy the runtime repo skills into DisCo's managed skills directory:

git clone https://github.com/VectorSpaceLab/Auto-ML-Skills.git
cd Auto-ML-Skills
mkdir -p ~/.disco/agent/skills
cp -R repo-skills/* ~/.disco/agent/skills/

Restart DisCo after copying so the managed skill index is reloaded.
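
To confirm the copy worked before restarting, you can list what landed in the managed skills directory. This is a small sketch under the assumption that each skill is a top-level directory there; list_skills is a hypothetical helper, not a DisCo command.

```shell
# Hypothetical helper: print the skill directory names under a skills root,
# one per line, sorted. Fails if the root directory does not exist.
list_skills() {
  root="$1"
  [ -d "$root" ] || { echo "missing: $root" >&2; return 1; }
  find "$root" -mindepth 1 -maxdepth 1 -type d -exec basename {} \; | sort
}

# Managed skills directory used in the copy step above.
list_skills "$HOME/.disco/agent/skills" || true
```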

Install Workflow Meta Skills

The top-level meta-skills/ directory contains workflow skills for agents that should run DisCo-style repo-skill or paper-to-skill workflows without relying on the full DisCo CLI.

If you do not already have a local checkout, clone this repository first:

git clone https://github.com/VectorSpaceLab/Auto-ML-Skills.git
cd Auto-ML-Skills

Install all workflow meta skills into Codex:

mkdir -p ~/.codex/skills
cp -R meta-skills/* ~/.codex/skills/

Install all workflow meta skills into Claude Code:

mkdir -p ~/.claude/skills
cp -R meta-skills/* ~/.claude/skills/

To install only the paper workflow into Codex, copy these seven directories:

mkdir -p ~/.codex/skills
cp -R \
  meta-skills/create-paper-skills \
  meta-skills/paper-skills-distiller \
  meta-skills/plan-paper-skill-modules \
  meta-skills/create-paper-module-skill \
  meta-skills/prepare-paper-recovery-env \
  meta-skills/recover-paper-result \
  meta-skills/analyze-paper-recovery \
  ~/.codex/skills/
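
The same copy can be wrapped in a function that reports any directory missing from the checkout instead of stopping at the first cp error. This is a sketch only: install_paper_skills is a hypothetical helper, and the directory names are the seven listed above.

```shell
# Hypothetical helper: copy the seven paper-workflow skill directories from a
# checkout ($1) into a target skills directory ($2). Missing directories are
# reported on stderr; the function succeeds only if all seven were copied.
install_paper_skills() {
  src_root="$1"   # path to the Auto-ML-Skills checkout
  dest="$2"       # target skills directory
  missing=0
  mkdir -p "$dest"
  for name in \
    create-paper-skills \
    paper-skills-distiller \
    plan-paper-skill-modules \
    create-paper-module-skill \
    prepare-paper-recovery-env \
    recover-paper-result \
    analyze-paper-recovery
  do
    if [ -d "$src_root/meta-skills/$name" ]; then
      cp -R "$src_root/meta-skills/$name" "$dest/"
    else
      echo "missing: meta-skills/$name" >&2
      missing=$((missing + 1))
    fi
  done
  [ "$missing" -eq 0 ]
}

# Example, run from the repository root:
# install_paper_skills . "$HOME/.codex/skills"
```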

See meta-skills/README.md for the workflow list, the Claude Code paper-only install command, copy-and-run agent installation prompts that clone the repository automatically, and the default workflow artifact layout.

🚀 Quick Start

Use Repo Skills In Codex Or Claude Code

After the skill library is installed in DisCo's managed skills directory, use DisCo's import workflow to export selected repo skills into your target agent. For example, import the router plus the vllm and sglang skills into Claude Code:

disco -p "/skill:import-repo-skills-to-agent import vllm and sglang to ~/.claude"

To import the same skills into Codex:

disco -p "/skill:import-repo-skills-to-agent import vllm and sglang to ~/.codex"

Restart the agent, then ask for a concrete deployment task:

Use the repo skills to compare vLLM and SGLang for deploying Qwen3-32B on this
machine, then prepare a minimal OpenAI-compatible serving plan with launch
commands, environment checks, and a smoke-test request.

Create A Skill For A Repository

Use DisCo to create and verify a repo-specific skill from source evidence:

disco -p "Create a repo skill for /path/to/repo."

The workflow analyzes repository structure, prepares or checks a Python inspection environment when needed, writes runtime guidance, records provenance, and then hands the draft to verify-repo-skill. Verification creates assertion-backed usability cases, runs content-level self-refine, checks safe native examples or tests when available, runs static quality gates, and writes coverage and review artifacts before the skill is treated as ready.

To let the agent choose the extraction scope and import the verified skill into DisCo's managed library without another confirmation round, delegate both decisions in the request:

disco -p "Create a repo skill for /path/to/repo with auto decide and auto import."

Create Skills From A Paper

Use the paper-to-skill workflow integrated in the DisCo CLI when the source is a research paper rather than a software repository. For repeatable runs, copy and fill the bundled run-config template, then pass it to DisCo:

cp meta-skills/create-paper-skills/assets/distiller-run-config-template.toml \
  /path/to/distiller_run_config.toml
disco --source paper -p "Use Distiller to process the runs in this config. config_path: /path/to/distiller_run_config.toml"

The paper source can be a local PDF or text file, direct PDF URL, arXiv URL or ID, or paper title. An implementation repository is optional and can be a local path, Git URL, none, or unknown.

Distiller modularizes the paper, creates and validates module-level skills, prepares bounded runtime evidence, runs the strongest feasible recovery experiment without reading the original implementation repo, analyzes gaps, refines within iteration_budget when needed, and writes attempt artifacts plus final reports under <attempt_dir>/reports/final/.

The default recovery_mode is hard, so reduced, proxy, toy, or fallback runs are recorded as diagnostics rather than accepted as successful recovery unless you explicitly choose soft mode.
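
Once a run finishes, the final reports can be located from the attempt directory. The sketch below assumes only the <attempt_dir>/reports/final/ layout described above; show_final_reports is a hypothetical helper, and the report file names are not specified here, so it simply lists whatever files exist.

```shell
# Hypothetical helper: list the files under <attempt_dir>/reports/final/.
# Fails with a message on stderr if that directory does not exist yet.
show_final_reports() {
  final_dir="$1/reports/final"
  if [ ! -d "$final_dir" ]; then
    echo "no final reports yet: $final_dir" >&2
    return 1
  fi
  find "$final_dir" -type f | sort
}

# Example (replace with the attempt directory of your run):
# show_final_reports /path/to/attempt_dir
```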

Extend An Existing Skill

Ask DisCo to extend an existing skill when it is correct but needs deeper coverage for a new workflow area:

disco -p "Add streaming inference coverage to the existing skill at /path/to/repo/skills/example-skill using /path/to/repo as evidence."

Refresh A Skill After Upstream Changes

Ask DisCo to refresh a skill when the upstream repository changes APIs, configs, examples, dependencies, or runtime behavior:

disco -p "Refresh the skill at /path/to/repo/skills/example-skill against the current /path/to/repo code."

Refresh should preserve correct existing guidance while updating stale instructions against the current source baseline.

🤝 Contributing

We welcome contributions in three main areas:

  1. Contribute generated repo skills. Add a publishable runtime skill under repo-skills/<skill-id>/, include provenance and routing metadata, and update repo-skills-router so agents can discover it.
  2. Extend or refresh existing repo skills. Improve stale, incomplete, or unclear skills with source-grounded changes. Update provenance or routing metadata when the upstream baseline or coverage changes.
  3. Improve the DisCo CLI source. Changes to the TypeScript CLI under src/ are welcome, including package/repo and paper-to-skill workflows. Run focused checks and document behavior changes. Repo-skill workflow changes should preserve the create/verify split, review/test artifact layout, import-readiness gates, and locked router-update transaction. Updates to the integrated Paper2Skills workflow should preserve its source-resolution, modularization, generated-skill validation, recovery, analysis, and final-report contracts.

For repo-skill PRs, list the model, provider, reasoning or thinking level, source repository commit, and verification steps used to produce or revise the skill. For DisCo CLI changes that touch paper-to-skill behavior, include the paper source, run config, recovery mode, validation artifacts, and final report path when applicable. See CONTRIBUTING.md for the full checklist.

📚 Documentation

Page | Description
Imported Repo Skills Catalog | Public catalog of included runtime repo skills, grouped by workflow area with upstream baselines.
Architecture | Repository layers, DisCo source layout, skill authoring pipeline, runtime skill shape, and managed library model.
Workflow Meta Skills | Copyable package/repo and paper-to-skill workflow skills for external agents.
DisCo CLI README | DisCo CLI usage for repo-skill creation, import, verification, and paper-to-skill workflows.
Contributing | Contribution rules for generated repo skills, router/catalog updates, documentation, meta skills, and CLI source.

🙏 Acknowledgement

DisCo's CLI and agent runtime are built on the foundation of earendil-works/pi, an open-source AI agent toolkit with a unified LLM API, agent loop, terminal UI, and coding-agent CLI.

Auto-ML-Skills is also made possible by the GitHub open-source community. The repo skills in this library exist because many researchers and engineers have released high-quality ML, agent, data, bio/chem, vision, and infrastructure projects for the community to build on.

📄 License

Auto-ML-Skills is released under the Apache License 2.0. Unless a file explicitly states otherwise, the license applies to both the DisCo CLI source code in src/ and the open-sourced runtime repo skills under repo-skills/.

See LICENSE for the full license text.

📝 Citation

TBA
