Model Checkpoint

Practical

A model checkpoint is a saved snapshot of a neural network’s weights at a specific point during training. For robot policies, it is the artifact you actually deploy — it encodes everything the model learned from the training dataset up to that moment. When a robot runs a trained policy, it is running a checkpoint.

What’s in a Checkpoint

A checkpoint is a directory, not a single file. For a LeRobot ACT policy trained to 100K steps, it looks like this:

```
checkpoints/
└── 100000/
    └── pretrained_model/
        ├── config.json        ← model architecture + hyperparameters
        ├── model.safetensors  ← weight values (~200MB for ACT)
        └── stats.json         ← dataset normalization statistics (mean/std per channel)
```

model.safetensors stores the weight tensors in a safe, fast-loading binary format (an alternative to PyTorch’s .pt files that avoids arbitrary code execution on load). config.json records the architecture decisions — input dimensions, number of transformer layers, chunk size — so the model can be reconstructed exactly. Neither file is useful without the other.
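To make the "neither file is useful without the other" point concrete, here is a minimal sketch of the reconstruction step: read `config.json` to recover the architecture, read `stats.json` to recover the normalization, then apply both before inference. The field names (`chunk_size`, `input_shapes`, the shape of the stats entries) are illustrative assumptions, not the exact LeRobot schema, and the weights themselves would come from `model.safetensors` via the safetensors library.

```python
import json
import tempfile
from pathlib import Path

# Build a toy checkpoint directory for illustration. Field names here are
# assumptions for the sketch, not the exact LeRobot schema.
ckpt = Path(tempfile.mkdtemp()) / "checkpoints" / "100000" / "pretrained_model"
ckpt.mkdir(parents=True)

(ckpt / "config.json").write_text(json.dumps({
    "chunk_size": 100,                              # actions per forward pass
    "input_shapes": {"observation.state": [14]},    # joint-state dimension
}))
(ckpt / "stats.json").write_text(json.dumps({
    "observation.state": {"mean": [0.1] * 14, "std": [0.5] * 14},
}))

# Reconstruction starts by reading both files back; model.safetensors would
# then be loaded against the architecture that config.json describes.
config = json.loads((ckpt / "config.json").read_text())
stats = json.loads((ckpt / "stats.json").read_text())

# Observations are normalized with the stored dataset statistics before
# being fed to the policy -- this is why stats.json must ship with the weights.
raw = [0.6] * config["input_shapes"]["observation.state"][0]
mean = stats["observation.state"]["mean"]
std = stats["observation.state"]["std"]
normalized = [(x - m) / s for x, m, s in zip(raw, mean, std)]
print(normalized[0])  # (0.6 - 0.1) / 0.5 = 1.0
```

A checkpoint trained against one set of dataset statistics will misbehave if deployed with another, which is why the stats file is versioned alongside the weights rather than recomputed at load time.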

Training Curve and Checkpoint Selection

Loss decreases over the course of training, but not linearly. A typical ACT run on a 56-episode manipulation dataset:

Step 10K → loss 0.180 (model is barely tracking structure)
Step 50K → loss 0.072 (grasping emerges)
Step 80K → loss 0.058 (reliable on common positions)
Step 100K → loss 0.049 (diminishing returns begin)
Step 120K → loss 0.051 (slight overfit — worse than 100K)

The checkpoint at lowest validation loss is the best candidate, but evaluation on the real robot is the ground truth. A checkpoint at step 80K may generalize better than 100K if the dataset is small — the lower loss can reflect memorization of the training positions rather than genuine task understanding.
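The "lowest validation loss" heuristic from the table above is a one-liner; the point is that it only nominates a candidate, which physical evaluation must then confirm or overrule:

```python
# Validation losses per training step, from the run described above.
val_loss = {
    10_000: 0.180,
    50_000: 0.072,
    80_000: 0.058,
    100_000: 0.049,
    120_000: 0.051,
}

# Lowest validation loss nominates the default candidate -- here step 100K,
# since 120K has drifted back up (the slight-overfit signal).
best_step = min(val_loss, key=val_loss.get)
print(best_step)  # 100000
```

On a small dataset, it can be worth carrying the two or three lowest-loss checkpoints into real-robot trials rather than committing to the single minimum.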

Checkpoint Lifecycle

Once training completes, a checkpoint follows a predictable path before reaching the robot:

Training completes
Checkpoint saved to disk
Sim validation (optional — reject if below success threshold)
Deploy to robot via OTA update
Physical evaluation (N trials across M positions)
If regression: rollback to previous checkpoint

The rollback step is important in production. Because checkpoints are versioned snapshots, reverting to a known-good deployment is straightforward — the robot just loads the previous checkpoint file. This is qualitatively different from a software service rollback: the failure mode is behavioral (the robot picks up objects less reliably) rather than a crash, and it may not be obvious until enough trials have accumulated.
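The evaluate-then-rollback decision above can be sketched as a threshold check on measured success rates. The function name, the margin parameter, and the trial counts are illustrative assumptions, not a LeRobot or deployment-system API:

```python
def checkpoint_decision(candidate_rate: float, baseline_rate: float,
                        margin: float = 0.05) -> str:
    """Promote the new checkpoint unless it regresses past a tolerance margin.

    candidate_rate / baseline_rate are success fractions measured over the
    same N trials x M positions grid; the margin absorbs trial-to-trial noise.
    """
    if candidate_rate + margin < baseline_rate:
        return "rollback"   # behavioral regression: reload previous checkpoint
    return "promote"

# 13/20 successes for the new checkpoint vs. 18/20 for the deployed one:
# a clear regression, so the robot reverts to the known-good snapshot.
print(checkpoint_decision(13 / 20, 18 / 20))  # rollback
print(checkpoint_decision(19 / 20, 18 / 20))  # promote
```

Because the failure mode is a degraded success rate rather than a crash, the margin and the number of trials behind each rate matter: too few trials and a healthy checkpoint can be rolled back on noise alone.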

Sources

  • LeRobot checkpoint format — HuggingFace’s open-source library for robot imitation learning; defines the pretrained_model/ directory structure used above
  • HuggingFace safetensors — the safe, fast tensor serialization format used for model.safetensors; avoids the arbitrary code execution risk of Python pickle-based .pt files