-
Debug SmolVLA
Status: SmolVLA and ACT can both be reproduced offline, which confirms that this is a model (SmolVLA) problem. ACT, even with a chunk size of 50 steps, does not seem to have this problem. Early evidence suggests that SmolVLA fits the training set well.
- [x] Infra: polish the SmolVLA reproduction code
- [x] Infra: enable ACT reproduction after refactoring
- [x] Exp: confirm that ACT doesn’t do this (with proper reproducibility)
- [x] Exp: check at what moment (step) SmolVLA starts to misbehave (retro motion)
- [x] Exp: confirm that SmolVLA fits the training set (measured as MSE, not just the denoising loss)
- [x] Exp: check SmolVLA on the ACT eval dataset
- [x] Exp: check early SmolVLA checkpoints
- [x] Analysis: visualise the “gradient field” for frame 100
- [x] Exp: cut the jump-back moments from the training dataset, then train and evaluate (1. offline, 2. online)
- [ ] LeRobot dataset issue
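
A rough sketch of two checks from the list above: per-step MSE between predicted and recorded actions, and a simple detector for jump-back (retro motion) steps. This is not the project's code. The names `per_step_mse` and `find_jump_back_steps`, the `window` and `min_norm` parameters, and the definition of a jump-back (a step pointing against the average motion of the preceding steps) are all assumptions made for illustration.

```python
import numpy as np


def per_step_mse(pred, target):
    """Mean squared error per time step, averaged over action dimensions."""
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    return ((pred - target) ** 2).mean(axis=-1)


def find_jump_back_steps(actions, window=5, min_norm=1e-3):
    """Return frame indices whose incoming step opposes the recent motion.

    Assumed definition: frame t+1 is flagged when the step
    actions[t+1] - actions[t] has negative cosine similarity with the
    mean of the previous `window` steps. Near-zero steps or trends are
    skipped because their direction is not meaningful.
    """
    actions = np.asarray(actions, dtype=float)
    actions = actions.reshape(len(actions), -1)  # accept (T,) or (T, D)
    steps = np.diff(actions, axis=0)             # shape (T-1, D)
    flagged = []
    for t in range(window, len(steps)):
        trend = steps[t - window:t].mean(axis=0)
        step = steps[t]
        trend_norm = np.linalg.norm(trend)
        step_norm = np.linalg.norm(step)
        if trend_norm < min_norm or step_norm < min_norm:
            continue
        cosine = float(step @ trend) / (trend_norm * step_norm)
        if cosine < 0.0:
            flagged.append(t + 1)
    return flagged
```

Running `find_jump_back_steps` over both recorded and predicted action sequences would show whether the retro motion is already present in the training data or appears only in the model output. The thresholds would need tuning per robot.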
-
Collect pick-and-place dataset
-
Make reproducible eval
- Inference from the default position
- Control starting conditions with a set of “tasks” and image overlay
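
One possible reading of the image overlay idea, as a minimal sketch: blend a stored reference frame of the desired starting scene over the live camera frame so that objects can be placed back in the same spot before each eval run. The function name `overlay_reference` and the `alpha` parameter are hypothetical, and frames are assumed to be `uint8` arrays of identical shape.

```python
import numpy as np


def overlay_reference(live, reference, alpha=0.5):
    """Alpha-blend a reference start image over the live camera frame.

    alpha=0 shows only the live frame, alpha=1 only the reference.
    Both inputs are assumed to be uint8 images with the same shape.
    """
    live = np.asarray(live, dtype=np.float32)
    reference = np.asarray(reference, dtype=np.float32)
    if live.shape != reference.shape:
        raise ValueError("live and reference frames must have the same shape")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    blended = (1.0 - alpha) * live + alpha * reference
    return np.clip(np.rint(blended), 0, 255).astype(np.uint8)
```

Storing one reference image per “task” would give each starting condition its own overlay.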
-
Test ACT against SmolVLA on the pick-and-place task
-
Implement a simple behavioural model from TRI (CLIP + action head) and test it on the pick-and-place task
-
Tool to edit a dataset (cut episode, delete episode)
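
A minimal sketch of the two edit operations, assuming a simplified in-memory layout that is not the LeRobot on-disk format: a dataset is a list of episodes, and each episode is a dict mapping a key to an array whose first axis is the frame index. The names `delete_episode` and `cut_episode` are hypothetical.

```python
import numpy as np


def delete_episode(episodes, index):
    """Return a new episode list without the episode at `index`."""
    if not 0 <= index < len(episodes):
        raise IndexError(f"episode index {index} out of range")
    return [ep for i, ep in enumerate(episodes) if i != index]


def cut_episode(episode, start, stop):
    """Return a copy of `episode` with frames [start, stop) removed.

    Every array in the episode is assumed to share the same number of
    frames along its first axis. The input episode is left unchanged.
    """
    length = len(next(iter(episode.values())))
    if not 0 <= start <= stop <= length:
        raise ValueError("need 0 <= start <= stop <= episode length")
    keep = np.ones(length, dtype=bool)
    keep[start:stop] = False
    return {key: np.asarray(value)[keep] for key, value in episode.items()}
```

Cutting frames from the middle of an episode leaves a discontinuity between the frames on either side of the cut, so action chunks that span it may need to be excluded from training or the episode split in two.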
-
Collect a diverse dataset: varied positions, objects, and targets, with language instructions
-
Train and eval models on these tasks
-
Implement voice control