Training › General Training¶
Source path: AlphaBrain/training/
Generic training entrypoints and shared utilities for VLA models. Continual Learning and Reinforcement Learning have their own dedicated pages.
Training entrypoints¶
train_alphabrain.py — main training entrypoint¶
train_alphabrain ¶
AlphaBrain’s trainer is built directly on native PyTorch + Accelerate + DeepSpeed, keeping the loop explicit and easy to hack. Conventions:

1. Store runtime state in dicts where possible (simplifies data info, processing info, config, etc.).
2. Use multiple dataloaders to handle heterogeneous data types / task mixtures.
3. Put each training strategy in its own trainer_*.py file (avoid large if-else chains).
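The multi-dataloader convention (point 2) can be sketched as a weighted mixture over per-task iterators. The function and task names below are illustrative, not AlphaBrain's actual API:

```python
import itertools
import random

def mixed_batches(loaders, weights, steps):
    """Yield (task_name, batch) pairs from several dataloaders,
    sampling each task according to its mixture weight.

    `loaders` maps a task name to an (infinite) batch iterator;
    `weights` gives the sampling probability for each task.
    """
    names = list(loaders)
    iters = {n: iter(loaders[n]) for n in names}
    probs = [weights[n] for n in names]
    for _ in range(steps):
        name = random.choices(names, weights=probs, k=1)[0]
        yield name, next(iters[name])

# Toy usage: two "dataloaders" as infinite iterators.
vla = itertools.cycle(["vla_batch"])
vlm = itertools.cycle(["vlm_batch"])
batches = list(mixed_batches({"vla": vla, "vlm": vlm},
                             {"vla": 0.8, "vlm": 0.2}, 10))
```

In the real trainers the loaders would be `torch.utils.data.DataLoader` instances prepared by Accelerate; the mixture idea is the same.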
VLATrainer ¶
Bases: TrainerUtils
Source code in AlphaBrain/training/train_alphabrain.py
train ¶
Execute the training loop.
Source code in AlphaBrain/training/train_alphabrain.py
eval_action_model ¶
Evaluate the model on the given dataset using the specified metric function.
:param eval_dataset: list of evaluation samples, each containing 'image', 'instruction', and 'action'.
:param metric_fn: function to compute the distance between predicted and ground-truth actions.
:return: average metric score across the evaluation dataset.
Source code in AlphaBrain/training/train_alphabrain.py
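A minimal sketch of the averaging logic described above, with a hypothetical `predict_fn` standing in for the model's action prediction:

```python
def eval_action_model_sketch(eval_dataset, metric_fn, predict_fn):
    """Average metric_fn(pred, gt) over a list of evaluation samples.

    Sample keys mirror the docstring: 'image', 'instruction', 'action'.
    `predict_fn` is a stand-in for the model's forward pass.
    """
    total = 0.0
    for sample in eval_dataset:
        pred = predict_fn(sample["image"], sample["instruction"])
        total += metric_fn(pred, sample["action"])
    return total / max(len(eval_dataset), 1)

# Toy usage with an L1 distance on scalar "actions".
data = [{"image": None, "instruction": "pick", "action": 1.0},
        {"image": None, "instruction": "place", "action": 3.0}]
score = eval_action_model_sketch(data, lambda p, g: abs(p - g),
                                 lambda img, ins: 2.0)
# score == 1.0  (average of |2-1| and |2-3|)
```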
setup_file_logging ¶
Add a FileHandler to root logger so all log messages are saved to a local file. Only the main process (rank 0) writes to avoid multi-process file conflicts.
Source code in AlphaBrain/training/train_alphabrain.py
setup_directories ¶
Create the output directory and save the config.
Source code in AlphaBrain/training/train_alphabrain.py
build_model ¶
Build the model framework.
Source code in AlphaBrain/training/train_alphabrain.py
prepare_data ¶
Prepare the training data.
Source code in AlphaBrain/training/train_alphabrain.py
setup_optimizer_and_scheduler ¶
setup_optimizer_and_scheduler(model, cfg) -> Tuple[torch.optim.Optimizer, torch.optim.lr_scheduler._LRScheduler]
Set up the optimizer and learning-rate scheduler.
Source code in AlphaBrain/training/train_alphabrain.py
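The docs don't state which schedule `setup_optimizer_and_scheduler` builds; assuming a common warmup-plus-cosine default, the learning-rate curve itself can be sketched as:

```python
import math

def warmup_cosine_lr(step, base_lr, warmup_steps, total_steps):
    """Linear warmup to base_lr, then cosine decay to zero.
    This schedule is an assumption, not AlphaBrain's documented choice."""
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * min(progress, 1.0)))
```

In practice a function like this would be wrapped in `torch.optim.lr_scheduler.LambdaLR` next to a `torch.optim.AdamW` optimizer, matching the `Tuple[Optimizer, _LRScheduler]` return type shown above.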
train_alphabrain_cotrain.py — co-training¶
train_alphabrain_cotrain ¶
AlphaBrain’s trainer is built directly on native PyTorch + Accelerate + DeepSpeed, keeping the loop explicit and easy to hack. Conventions:

1. Store runtime state in dicts where possible (simplifies data info, processing info, config, etc.).
2. Use multiple dataloaders to handle heterogeneous data types / task mixtures.
3. Put each training strategy in its own trainer_*.py file (avoid large if-else chains).
VLATrainer ¶
VLATrainer(cfg, model, vla_train_dataloader, vlm_train_dataloader, optimizer, lr_scheduler, accelerator)
Bases: TrainerUtils
Source code in AlphaBrain/training/train_alphabrain_cotrain.py
train ¶
Execute training loop.
Source code in AlphaBrain/training/train_alphabrain_cotrain.py
eval_action_model ¶
Evaluate action prediction with current model.
Source code in AlphaBrain/training/train_alphabrain_cotrain.py
setup_file_logging ¶
Add a FileHandler to root logger so all log messages are saved to a local file. Only the main process (rank 0) writes to avoid multi-process file conflicts.
Source code in AlphaBrain/training/train_alphabrain_cotrain.py
setup_directories ¶
Create output directory and checkpoint directory.
Source code in AlphaBrain/training/train_alphabrain_cotrain.py
prepare_data ¶
Prepare co-training data.
Source code in AlphaBrain/training/train_alphabrain_cotrain.py
setup_optimizer_and_scheduler ¶
setup_optimizer_and_scheduler(model, cfg) -> Tuple[torch.optim.Optimizer, torch.optim.lr_scheduler._LRScheduler]
Set optimizer and learning rate scheduler.
Source code in AlphaBrain/training/train_alphabrain_cotrain.py
train_alphabrain_vlm.py — VLM-only training¶
train_alphabrain_vlm ¶
AlphaBrain’s trainer is built directly on native PyTorch + Accelerate + DeepSpeed, keeping the loop explicit and easy to hack. Conventions:

1. Store runtime state in dicts where possible (simplifies data info, processing info, config, etc.).
2. Use multiple dataloaders to handle heterogeneous data types / task mixtures.
3. Put each training strategy in its own trainer_*.py file (avoid large if-else chains).
VLATrainer ¶
Bases: TrainerUtils
Source code in AlphaBrain/training/train_alphabrain_vlm.py
train ¶
Execute training loop.
Source code in AlphaBrain/training/train_alphabrain_vlm.py
setup_directories ¶
Create output directory and checkpoint directory.
Source code in AlphaBrain/training/train_alphabrain_vlm.py
prepare_data ¶
Prepare VLM training data.
Source code in AlphaBrain/training/train_alphabrain_vlm.py
setup_optimizer_and_scheduler ¶
setup_optimizer_and_scheduler(model, cfg) -> Tuple[torch.optim.Optimizer, torch.optim.lr_scheduler._LRScheduler]
Set optimizer and learning rate scheduler.
Source code in AlphaBrain/training/train_alphabrain_vlm.py
train_stdp.py — STDP spiking-model training¶
train_stdp ¶
STDP Fine-tuning Training Script for NeuroVLA.
This script loads a pretrained NeuroVLA checkpoint and fine-tunes the SNN action head using Reward-Modulated STDP (R-STDP), optionally blended with standard backpropagation gradients.
Modes
- hybrid: Δw = α·Δw_backprop + β·Δw_rstdp (default)
- pure_stdp: Δw = Δw_rstdp only (no backprop for SNN weights)
Usage
accelerate launch AlphaBrain/training/train_stdp.py --config_yaml configs/finetune_config.yaml --mode neuro_vla_stdp
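The two modes reduce to a simple per-weight update rule. A toy sketch with scalar weights (`alpha`/`beta` as blending-coefficient names are an assumption):

```python
def hybrid_weight_update(w, dw_backprop, dw_rstdp,
                         alpha=0.5, beta=0.5, mode="hybrid"):
    """Blend backprop and reward-modulated STDP deltas per the mode.

    hybrid:    Δw = α·Δw_backprop + β·Δw_rstdp
    pure_stdp: Δw = Δw_rstdp  (no backprop for SNN weights)
    """
    if mode == "pure_stdp":
        delta = dw_rstdp
    else:
        delta = alpha * dw_backprop + beta * dw_rstdp
    return w + delta
```

In the real trainer these deltas are tensors produced by the SpikeMonitor/STDPLearner machinery and applied by the RSTDPOptimizer; the blend itself is this one line.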
STDPTrainer ¶
Bases: TrainerUtils
Trainer for R-STDP fine-tuning of NeuroVLA.
Extends the standard training loop with:

1. SpikeMonitor to record spike timing from LIF layers
2. STDPLearner to compute STDP weight updates
3. RSTDPOptimizer to blend backprop and STDP updates
Source code in AlphaBrain/training/train_stdp.py
Trainer utilities¶
Shared training utilities: structured logging (overwatch), PEFT, finetune configuration, checkpoint tracking, and more.
Overwatch (unified logging)¶
overwatch ¶
overwatch.py
Original file from OpenVLA project (Prismatic), licensed under MIT License.¶
See https://github.com/openvla/openvla for full license text and contributors.¶
Modified by @JinhuiYE, [2025]¶
Utility class for creating a centralized/standardized logger (built on Rich) and accelerate handler.
DistributedOverwatch ¶
Initializer for an Overwatch object that wraps logging & accelerate.PartialState.
Source code in AlphaBrain/training/trainer_utils/overwatch.py
PureOverwatch ¶
Initializer for an Overwatch object that just wraps logging.
Source code in AlphaBrain/training/trainer_utils/overwatch.py
Finetune configuration¶
finetune_config ¶
Utilities for loading finetune_config.yaml as the primary training config.
Merge order (lowest → highest priority): configs/models/
expand_env_vars ¶
Expand bash-style ${VAR} / ${VAR:-default} in a string. No-op for non-strings.
Source code in AlphaBrain/training/trainer_utils/finetune_config.py
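A minimal sketch of the bash-style expansion using a regex; the real implementation may differ in edge-case handling:

```python
import os
import re

# Matches ${VAR} and ${VAR:-default}.
_ENV_RE = re.compile(r"\$\{(\w+)(?::-([^}]*))?\}")

def expand_env_vars_sketch(value):
    """Expand ${VAR} / ${VAR:-default} in a string; non-strings
    pass through untouched (matching the docstring's no-op rule)."""
    if not isinstance(value, str):
        return value
    return _ENV_RE.sub(
        lambda m: os.environ.get(m.group(1), m.group(2) or ""), value)
```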
build_config_from_finetune ¶
Build an OmegaConf training config from finetune_config.yaml + mode name.
Source code in AlphaBrain/training/trainer_utils/finetune_config.py
Configuration tracker¶
config_tracker ¶
AccessTrackedConfig ¶
AccessTrackedConfig(cfg: Union[DictConfig, ListConfig], parent: AccessTrackedConfig = None, key_path: str = '')
Wrapper for OmegaConf to track accessed parameters. Only saves configuration items that were actually accessed during execution.
Source code in AlphaBrain/training/trainer_utils/config_tracker.py
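The tracking idea can be illustrated with a plain-dict analogue (the real class wraps OmegaConf and supports nested configs):

```python
class AccessTrackedDict:
    """Minimal analogue of AccessTrackedConfig: record which keys are
    read, and export only those when asked."""

    def __init__(self, cfg):
        self._cfg = cfg
        self._accessed = set()

    def __getitem__(self, key):
        self._accessed.add(key)
        return self._cfg[key]

    def get(self, key, default=None):
        self._accessed.add(key)
        return self._cfg.get(key, default)

    def export_accessed(self):
        """Return only the config items actually read during execution."""
        return {k: self._cfg[k] for k in self._accessed if k in self._cfg}

cfg = AccessTrackedDict({"lr": 1e-4, "batch_size": 8, "unused": True})
_ = cfg["lr"]
_ = cfg.get("batch_size")
```

After the two reads above, `export_accessed()` returns only `lr` and `batch_size`; `unused` is dropped, which is exactly what makes the saved config a faithful record of the run.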
keys ¶
Return config keys (required for dict unpacking). Tracks all keys as accessed. Only works for DictConfig.
Source code in AlphaBrain/training/trainer_utils/config_tracker.py
values ¶
Return config values (tracks all keys as accessed)
Source code in AlphaBrain/training/trainer_utils/config_tracker.py
items ¶
Return config items (tracks all keys as accessed)
Source code in AlphaBrain/training/trainer_utils/config_tracker.py
get ¶
Get value with default fallback
Source code in AlphaBrain/training/trainer_utils/config_tracker.py
update ¶
Update config with values from another dict/config
Source code in AlphaBrain/training/trainer_utils/config_tracker.py
pop ¶
Remove and return a value
Source code in AlphaBrain/training/trainer_utils/config_tracker.py
append ¶
Append value to list (only for ListConfig)
Source code in AlphaBrain/training/trainer_utils/config_tracker.py
extend ¶
Extend list with values (only for ListConfig)
Source code in AlphaBrain/training/trainer_utils/config_tracker.py
setdefault ¶
Set default value if key doesn't exist
Source code in AlphaBrain/training/trainer_utils/config_tracker.py
copy ¶
Return a shallow copy (does not copy access tracking state)
Source code in AlphaBrain/training/trainer_utils/config_tracker.py
deepcopy ¶
Return a deep copy (does not copy access tracking state)
Source code in AlphaBrain/training/trainer_utils/config_tracker.py
merge_with ¶
Merge with other configs and return new tracked config
Source code in AlphaBrain/training/trainer_utils/config_tracker.py
to_dict ¶
to_yaml ¶
unwrap ¶
get_root ¶
export_accessed_config ¶
Export accessed configuration as dictionary (only leaf values)
Source code in AlphaBrain/training/trainer_utils/config_tracker.py
save_accessed_config ¶
Save accessed configuration to file
Source code in AlphaBrain/training/trainer_utils/config_tracker.py
get_access_summary ¶
Get summary of accessed configuration
Source code in AlphaBrain/training/trainer_utils/config_tracker.py
print_access_summary ¶
Print a formatted summary of accessed configuration
Source code in AlphaBrain/training/trainer_utils/config_tracker.py
wrap_config ¶
unwrap_config ¶
Unwrap AccessTrackedConfig to get underlying OmegaConf object
Trainer helper functions¶
trainer_tools ¶
metrics.py
Utility classes defining a Metrics container and multiple Trackers to enable model/stage-specific logging to various endpoints (e.g., JSONL local logs, Weights & Biases).
TrainerUtils ¶
freeze_backbones staticmethod ¶
Directly freeze the specified submodules based on a list of relative module paths (patterns), instead of recursively searching all submodule names. Patterns are read from config.trainer.freeze_modules as a comma-separated list of relative paths; for example, "qwen_vl_interface, action_model.net" freezes model.qwen_vl_interface and model.action_model.net.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| model | nn.Module | model object | required |
| freeze_modules | str | comma-separated list of relative module paths (patterns) | `''` |
Returns:
| Name | Type | Description |
|---|---|---|
| model | nn.Module | model object with the specified submodules frozen |
Source code in AlphaBrain/training/trainer_utils/trainer_tools.py
print_trainable_parameters staticmethod ¶
Print the total number of parameters and the number of trainable parameters of the model.
:param model: PyTorch model instance
Source code in AlphaBrain/training/trainer_utils/trainer_tools.py
load_pretrained_backbones staticmethod ¶
Load a checkpoint:
- if reload_modules is set, load by path part
- otherwise, load the entire model parameters (overwriting the model)

Returns:
- replace, loaded_modules: list of module paths that successfully loaded parameters; if a global load, then ["
Source code in AlphaBrain/training/trainer_utils/trainer_tools.py
print_freeze_status staticmethod ¶
Print the freeze status of each parameter in the model.
:param model: PyTorch model instance
Source code in AlphaBrain/training/trainer_utils/trainer_tools.py
setup_distributed_training staticmethod ¶
Use Accelerator to prepare distributed training components.
:param accelerator: Accelerate instance
:param components: any number of components (e.g., model, optimizer, dataloader)
:return: prepared distributed components (in the same order as input)
Source code in AlphaBrain/training/trainer_utils/trainer_tools.py
compute_grad_angle_with_stats staticmethod ¶
Compute the cosine angle (in degrees) between two groups of gradient vectors, along with the mean angle and its variance.
grads_a, grads_v: lists of gradient Tensors corresponding to the same parameter list (interface_params)
:return: mean_angle_deg (mean angle in degrees) and angle_variance (variance of the angles)
Source code in AlphaBrain/training/trainer_utils/trainer_tools.py
pcgrad_project staticmethod ¶
Apply PCGrad projection to the second group of gradients, grads_v, to suppress negative transfer between grads_a and grads_v. If the dot product of the two gradient groups is < 0, then: grads_v ← grads_v − (dot / ||grads_a||²) · grads_a. Returns the new grads_v list.
Source code in AlphaBrain/training/trainer_utils/trainer_tools.py
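The projection formula above, sketched with plain Python lists standing in for per-parameter gradient tensors:

```python
def pcgrad_project_sketch(grads_a, grads_v):
    """Project grads_v away from grads_a when they conflict (dot < 0),
    per PCGrad: grads_v <- grads_v - (dot / ||grads_a||^2) * grads_a."""
    dot = sum(a * v for a, v in zip(grads_a, grads_v))
    if dot >= 0:
        return list(grads_v)  # no conflict: leave gradients untouched
    norm_sq = sum(a * a for a in grads_a)
    scale = dot / norm_sq
    return [v - scale * a for a, v in zip(grads_a, grads_v)]
```

For example, with grads_a = [1, 0] and grads_v = [-1, 1] the dot product is −1, so the conflicting component along grads_a is removed, leaving [0, 1].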
eval_qwenpi staticmethod ¶
Evaluate the QwenQFormerDiT model, computing IoU and action distance.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| qwenpi | | QwenQFormerDiT model instance | required |
| dataloader | | data loader | required |
| num_batches | int | number of batches to evaluate | 20 |
Returns:
| Name | Type | Description |
|---|---|---|
| dict | | contains IoU and action distance evaluation results |
Source code in AlphaBrain/training/trainer_utils/trainer_tools.py
extract_json_from_string staticmethod ¶
extract valid JSON part from string and convert to dictionary.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| input_string | str | string containing extra characters | required |
Returns:
| Name | Type | Description |
|---|---|---|
| dict | | dictionary extracted and parsed from the string |
Source code in AlphaBrain/training/trainer_utils/trainer_tools.py
normalize_dotlist_args ¶
Convert ['--x.y', 'val'] and ['--flag'] → ['x.y=val', 'flag=true']
Source code in AlphaBrain/training/trainer_utils/trainer_tools.py
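A sketch of the conversion rule; the real helper may additionally handle `--key=value` forms and other edge cases:

```python
def normalize_dotlist_args_sketch(argv):
    """['--x.y', 'val'] -> ['x.y=val']; a bare '--flag' -> ['flag=true']."""
    out, i = [], 0
    while i < len(argv):
        key = argv[i].lstrip("-")
        if i + 1 < len(argv) and not argv[i + 1].startswith("--"):
            out.append(f"{key}={argv[i + 1]}")  # key followed by a value
            i += 2
        else:
            out.append(f"{key}=true")  # bare boolean flag
            i += 1
    return out
```

The resulting `key=value` strings are the dotlist format OmegaConf accepts via `OmegaConf.from_dotlist`.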
build_param_lr_groups ¶
Build multiple parameter groups based on cfg.trainer.learning_rate. Supports specifying different learning rates for different modules; the rest use the base rate.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| vla | nn.Module | model object | required |
| cfg | | config object, requires cfg.trainer.learning_rate dictionary | required |
Returns:
| Type | Description |
|---|---|
| List[Dict] | param_groups that can be used to build an optimizer with torch.optim |
Source code in AlphaBrain/training/trainer_utils/trainer_tools.py
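The grouping logic can be sketched over `(name, param)` pairs; the config shape shown (a dict with a `"base"` key plus module-prefix overrides) is an assumption based on the description above:

```python
def build_param_lr_groups_sketch(named_params, lr_cfg):
    """Group parameters by module-name prefix; 'base' is the fallback LR.

    `named_params` is an iterable of (name, param) pairs, as from
    model.named_parameters(); `lr_cfg` is e.g.
    {"base": 1e-4, "action_model": 5e-5}.
    """
    groups = {key: [] for key in lr_cfg}
    for name, param in named_params:
        key = next((k for k in lr_cfg
                    if k != "base" and name.startswith(k)), "base")
        groups[key].append(param)
    # torch.optim optimizers accept this list-of-dicts directly.
    return [{"params": ps, "lr": lr_cfg[k]} for k, ps in groups.items() if ps]
```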
only_main_process ¶
Decorator: run only in the main process (rank 0).
Source code in AlphaBrain/training/trainer_utils/trainer_tools.py
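A sketch of the decorator using the RANK environment variable as the rank source (an assumption; the real version likely queries the accelerator/overwatch state):

```python
import functools
import os

def only_main_process(fn):
    """Execute the wrapped function only on rank 0; other ranks
    skip the call and get None back."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        if int(os.environ.get("RANK", "0")) != 0:
            return None
        return fn(*args, **kwargs)
    return wrapper

@only_main_process
def save_report(msg):
    # Hypothetical rank-0-only side effect (e.g. writing a log file).
    return f"saved: {msg}"
```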
resize_images ¶
Recursively resize all images in a nested list.
:param images: nested list of images, or a single image
:param target_size: target size (width, height) after resizing
:return: list of resized images, preserving the original nested structure
Source code in AlphaBrain/training/trainer_utils/trainer_tools.py
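The recursion is straightforward; `resize_fn` below stands in for the actual image-resize call (e.g. PIL's `Image.resize`), so the sketch stays library-free:

```python
def resize_images_sketch(images, target_size, resize_fn):
    """Recursively apply resize_fn to every image in a nested list,
    preserving the nesting structure exactly."""
    if isinstance(images, list):
        return [resize_images_sketch(im, target_size, resize_fn)
                for im in images]
    return resize_fn(images, target_size)  # leaf: a single image
```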
PEFT integration¶
peft ¶
LoRA / PEFT helpers shared across all trainers.
Public API¶
is_lora_enabled(cfg) -> bool
apply_lora(model, cfg) -> model (in-place)
save_lora_checkpoint(accelerator, model, base_path, cfg)
load_and_merge(base_model_factory, lora_adapter_dir,
action_model_pt, output_path, vlm_module=None)
Schema¶
The lora: block in training yaml is parsed by LoRASpec.from_omega. See config.py for the recognized fields. Backward-compatible with all existing yaml configs under configs/continual_learning/.
Checkpoint layout (unchanged from the previous inline implementation; see the checkpoint module below).
LoRASpec dataclass ¶
LoRASpec(rank: int = 32, alpha: int = 16, dropout: float = 0.05, target_modules: Any = 'all-linear', init_lora_weights: str = 'gaussian', vlm_module: str | None = None, freeze_extra_modules: list[str] = list())
Backbone-agnostic LoRA application spec.
Resolved from yaml lora: block via :meth:from_omega.
from_omega classmethod ¶
Parse from yaml/OmegaConf lora: block.
Tolerant of:
- Missing lora key (returns defaults; caller should check is_lora_enabled)
- freeze_extra_modules as a comma-separated string OR a list
- target_modules as a string ("all-linear") OR a list
Source code in AlphaBrain/training/trainer_utils/peft/config.py
peft_config ¶
Build a peft.LoraConfig from this spec.
Source code in AlphaBrain/training/trainer_utils/peft/config.py
is_lora_enabled ¶
Return True iff cfg.lora.enabled is set.
Source code in AlphaBrain/training/trainer_utils/peft/config.py
apply_lora ¶
Apply LoRA in-place per spec.
Steps
1. Resolve the VLM interface (from lora.vlm_module, or auto-detect via _VLM_REGISTRY).
2. Freeze ALL params of the VLM interface wrapper.
3. Replace vlm_interface.model = get_peft_model(...) so PEFT injects LoRA layers (their params are trainable; the base remains frozen).
4. Freeze each module listed in lora.freeze_extra_modules.
5. Modules not touched by the above keep their original requires_grad (typically full-FT, e.g. action_model, dino).
Returns the same model instance (mutated in place).
Source code in AlphaBrain/training/trainer_utils/peft/injector.py
save_lora_checkpoint ¶
Save LoRA adapter + non-VLM weights for a checkpoint.
Creates the LoRA adapter directory and the non-VLM weights file.
Source code in AlphaBrain/training/trainer_utils/peft/checkpoint.py
load_and_merge ¶
load_and_merge(*, base_model_factory: Callable[[], 'torch.nn.Module'], lora_adapter_dir: str, action_model_pt: str, output_path: str, vlm_module: str | None = None) -> None
Build base model, attach LoRA adapter, merge, load extras, save full ckpt.
The output is a single .pt file usable by BaseFramework.from_pretrained, suitable for the standard server_policy + eval_libero pipeline.
Source code in AlphaBrain/training/trainer_utils/peft/checkpoint.py
checkpoint ¶
LoRA checkpoint save / load+merge.
File-name conventions are kept identical to the previous inline code, so existing checkpoints (5d / 5h / 5l etc.) remain merge-and-eval compatible:
<base_path>_lora_adapter/ ← PEFT adapter directory
adapter_config.json
adapter_model.safetensors
<base_path>_action_model.pt ← non-VLM weights (action_model + extras
like layer_qformer / edit_model / dino)
save_lora_checkpoint ¶
Save LoRA adapter + non-VLM weights for a checkpoint.
Creates the LoRA adapter directory and the non-VLM weights file.
Source code in AlphaBrain/training/trainer_utils/peft/checkpoint.py
load_and_merge ¶
load_and_merge(*, base_model_factory: Callable[[], 'torch.nn.Module'], lora_adapter_dir: str, action_model_pt: str, output_path: str, vlm_module: str | None = None) -> None
Build base model, attach LoRA adapter, merge, load extras, save full ckpt.
The output is a single .pt file usable by BaseFramework.from_pretrained, suitable for the standard server_policy + eval_libero pipeline.
Source code in AlphaBrain/training/trainer_utils/peft/checkpoint.py
config ¶
LoRA spec parsed from yaml lora: section.
Recognized fields (current schema, kept stable for backward compat):
- enabled: bool
- rank: int (default 32)
- alpha: int (default 16)
- dropout: float (default 0.05)
- target_modules: str | list[str] (default "all-linear")
- init_lora_weights: str (default "gaussian")
- vlm_module: str | None (default None → auto-detect)
- freeze_extra_modules: str | list[str] (default [])
LoRASpec dataclass ¶
LoRASpec(rank: int = 32, alpha: int = 16, dropout: float = 0.05, target_modules: Any = 'all-linear', init_lora_weights: str = 'gaussian', vlm_module: str | None = None, freeze_extra_modules: list[str] = list())
Backbone-agnostic LoRA application spec.
Resolved from yaml lora: block via :meth:from_omega.
from_omega classmethod ¶
Parse from yaml/OmegaConf lora: block.
Tolerant of:
- Missing lora key (returns defaults; caller should check is_lora_enabled)
- freeze_extra_modules as a comma-separated string OR a list
- target_modules as a string ("all-linear") OR a list
Source code in AlphaBrain/training/trainer_utils/peft/config.py
peft_config ¶
Build a peft.LoraConfig from this spec.
Source code in AlphaBrain/training/trainer_utils/peft/config.py
is_lora_enabled ¶
Return True iff cfg.lora.enabled is set.
Source code in AlphaBrain/training/trainer_utils/peft/config.py
injector ¶
LoRA injection: freeze backbone + wrap with PEFT + freeze extras.
Extracted verbatim from the (more complete) implementation that previously lived in AlphaBrain/training/continual_learning/train.py. The simpler implementation in AlphaBrain/training/train_alphabrain.py is replaced by this version (a strict superset: the auto-detect / freeze_extras paths are no-ops when the relevant yaml fields are absent, so QwenGR00T behavior is identical).
apply_lora ¶
Apply LoRA in-place per spec.
Steps
1. Resolve the VLM interface (from lora.vlm_module, or auto-detect via _VLM_REGISTRY).
2. Freeze ALL params of the VLM interface wrapper.
3. Replace vlm_interface.model = get_peft_model(...) so PEFT injects LoRA layers (their params are trainable; the base remains frozen).
4. Freeze each module listed in lora.freeze_extra_modules.
5. Modules not touched by the above keep their original requires_grad (typically full-FT, e.g. action_model, dino).
Returns the same model instance (mutated in place).
Source code in AlphaBrain/training/trainer_utils/peft/injector.py
merge_lora_checkpoint ¶
merge_lora_checkpoint.py — Merge LoRA adapter + non-VLM weights into a full checkpoint usable by the standard eval pipeline (server_policy.py + BaseFramework.from_pretrained).
Thin CLI wrapper around the sibling load_and_merge() helper. Located inside the peft module so it can be invoked via python -m without path hacks.
Usage (from repo root, starVLA env active): python -m AlphaBrain.training.trainer_utils.peft.merge_lora_checkpoint \ --base_config configs/continual_learning/qwengr00t_continual_libero.yaml \ --lora_adapter_dir results/Checkpoints/.../task_4_id4_steps_50000_lora_adapter \ --action_model_pt results/Checkpoints/.../task_4_id4_steps_50000_action_model.pt \ --output_path results/Checkpoints/.../task_4_id4_steps_50000_pytorch_model.pt