Quickstart¶
From a blank machine to a trained-and-evaluated VLA checkpoint — then into any of the five capability-specific paths.
Read top to bottom
Work through Installation → Baseline VLA first. Everything below that is optional and can be tackled in any order.
The path¶
| Step | Page | What it does | Default hardware |
|---|---|---|---|
| 0 | Installation | Conda envs, Flash Attention, pretrained weights, LIBERO data, .env config. | — |
| 1 | Baseline VLA | PaliGemmaOFT / Pi05 / LlamaOFT finetune + LIBERO eval — the first trained checkpoint from the default recipe. | 4× A800 |
| 2 | NeuroVLA | Pretrain a brain-inspired VLA, then R-STDP fine-tune the SNN action head. | 4× A800 |
| 3 | RL-Token | Off-policy TD3 RL on a pretrained QwenOFT (encoder pretrain + rollout + TD updates). | 6× A800 |
| 4 | World Model | V-JEPA / Cosmos / Wan backbones × GR00T / OFT / PI decoders (12 combinations) + CosmosPolicy. | 4× A800 |
| 5 | Continual Learning | Sequential finetuning across 10 LIBERO tasks with Experience Replay (LoRA + full-param). | 2× A800 |
Every capability page follows the same layout — Overview → Prerequisites → Quick Start → Tips → Pointers — so you can skim one and recognize the shape of the next.
Minimum requirements¶
- Python 3.10+
- CUDA 11.8 / 12.x with matching PyTorch 2.1+
- At least one A100/H100-class GPU (4× A800 recommended for the default Baseline VLA recipe)
- ~100 GB of local disk for LIBERO datasets + pretrained VLMs
Detailed environment setup lives in Installation.
Shared conventions¶
Every quickstart assumes the same project layout:
```
AlphaBrain/
├── .env                               # local paths (see Installation)
├── configs/
│   ├── finetune_config.yaml           # single-entry training/eval modes
│   ├── continual_learning/*.yaml      # CL-specific configs
│   └── rl_recipes/*.yaml              # RL-specific configs
├── scripts/
│   ├── run_finetune.sh                # training launcher
│   ├── run_eval.sh                    # eval launcher
│   ├── run_base_vla/                  # Baseline VLA wrappers
│   ├── run_brain_inspired_scripts/    # NeuroVLA wrappers
│   ├── run_rl_scripts/                # RL-Token wrappers
│   ├── run_continual_learning_scripts/
│   └── run_world_model/               # World Model wrappers
└── results/
    ├── training/<run_id>/             # finetune outputs
    └── evaluation/<run_id>/           # eval outputs
```
All launchers:
- Source `.env` at the project root
- Read a mode block from YAML (mode name → resolved config)
- Create the output dir and snapshot the config for reproducibility
- Call `accelerate launch` (training) or start a server + client (eval)
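The launcher flow above can be sketched roughly as follows. This is an illustrative outline, not the contents of any actual script; the mode name, run-ID scheme, and `accelerate launch` arguments are placeholders:

```shell
set -euo pipefail

# 1) Source local paths from the project root
# source .env   # (skipped in this sketch; .env is machine-specific)

# 2) Resolve a mode block from the config YAML
MODE="finetune_paligemma_oft"          # hypothetical mode name
CONFIG="configs/finetune_config.yaml"

# 3) Create the output dir and snapshot the config for reproducibility
RUN_ID="$(date +%Y%m%d_%H%M%S)_${MODE}"
OUT_DIR="results/training/${RUN_ID}"
mkdir -p "${OUT_DIR}"
# cp "${CONFIG}" "${OUT_DIR}/config_snapshot.yaml"

# 4) Launch training (eval launchers start a server + client instead)
# accelerate launch train.py --mode "${MODE}"
echo "would launch mode ${MODE} from ${CONFIG} into ${OUT_DIR}"
```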
Shared prerequisites¶
Run these once before any capability page:
- Complete Installation — main conda env + flash-attn + eval env + LIBERO data.
- Fill out `.env` with `PRETRAINED_MODELS_DIR`, `LEROBOT_LIBERO_DATA_DIR`, `LIBERO_DATA_ROOT`, `LIBERO_HOME`, `LIBERO_PYTHON`.
- Download the backbone(s) you need (see each page's Prerequisites section).
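As a reference point, a filled-out `.env` might look like the following. Every path here is an example; point each variable at the corresponding location on your own machine (Installation explains what each one is for):

```shell
# .env — local paths; all values below are illustrative examples
PRETRAINED_MODELS_DIR=/data/pretrained_models
LEROBOT_LIBERO_DATA_DIR=/data/lerobot_libero
LIBERO_DATA_ROOT=/data/libero
LIBERO_HOME=/opt/LIBERO
LIBERO_PYTHON=/opt/conda/envs/libero_eval/bin/python
```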
Which page should I read?¶
- "I just installed — does training work?" → Baseline VLA
- "How do I finetune a different backbone?" → Baseline VLA § Switch Backbone
- "Is there a bio-plausible fine-tuning path?" → NeuroVLA
- "Can I improve a pretrained VLA with RL?" → RL-Token
- "I want the SOTA visual backbone." → World Model (Cosmos 2.0 + GR00T currently leads)
- "How do I handle task drift across 10 tasks?" → Continual Learning
Getting help¶
- Full installation reference → Installation
- First trained-and-evaluated checkpoint → Baseline VLA
- Source-level reference → API Reference
- File an issue → github.com/AlphaBrainGroup/AlphaBrain/issues