Roadmap
RoboSandbox is a playground for building and evaluating manipulation agents. The project is organised along four axes: object diversity, task diversity, interaction, and loop closure (record → train → deploy).
This page lists what currently ships and what's deferred or open. For
the concrete test and benchmark status, run robo-sandbox-bench
locally or check the
CI badge.
Shipped
Core
- MuJoCo backend + built-in 6-DOF arm.
- 9 skills: pick, place_on, push, home, pour, tap, open_drawer, close_drawer, stack.
- Stub planner + OpenAI-compatible VLM planner.
Robot + object diversity
- URDF import — Scene(robot_urdf=...) + sidecar YAML; bundled Franka Panda.
- Mesh import — SceneObject(kind="mesh") with CoACD decomposition + hull cache.
- 10 bundled YCB items reachable via @ycb:<id> shorthand.
- Procedural scenes — scene.presets.tabletop_clutter(n, seed).
- Drawer primitive — SceneObject(kind="drawer"), first articulated primitive.
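A seeded procedural preset such as tabletop_clutter(n, seed) has to place n objects reproducibly without overlaps. The sketch below shows one common way to do that, rejection sampling with a seeded random generator. It is not RoboSandbox's implementation: the function name, the table half-extent, and the minimum gap are all assumptions chosen for illustration.

```python
import math
import random


def sample_clutter(n, seed, half_extent=0.25, min_gap=0.08, max_tries=1000):
    """Sample n (x, y, yaw) poses on a square tabletop.

    Hypothetical helper, not part of RoboSandbox. half_extent and
    min_gap are in metres and are assumed values.
    """
    rng = random.Random(seed)  # same seed gives the same layout
    placed = []
    for _ in range(max_tries):
        if len(placed) == n:
            break
        x = rng.uniform(-half_extent, half_extent)
        y = rng.uniform(-half_extent, half_extent)
        yaw = rng.uniform(-math.pi, math.pi)
        # Reject candidates that sit too close to an already placed object.
        if all(math.hypot(x - px, y - py) >= min_gap for px, py, _ in placed):
            placed.append((x, y, yaw))
    if len(placed) < n:
        raise RuntimeError("could not place all objects; relax min_gap")
    return placed
```

Calling sample_clutter(5, seed=0) twice returns identical poses, which is the property that makes per-seed benchmark runs comparable.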
Benchmark + evaluation
- Declarative success criteria (lifted, moved_above, displaced, all, any).
- randomize: YAML block + --seeds N aggregation with mean ± stderr.
- 8 default tasks + 1 experimental, covering pick / stack / push / pour / drawer.
- Authoring-time reachability pre-flight check.
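The mean ± stderr figure reported for --seeds N can be reproduced by hand from per-seed outcomes. The sketch below assumes each seed yields a single success score (0 or 1) and takes the standard error as the sample standard deviation divided by the square root of N. The function name is illustrative and not part of RoboSandbox, which may compute the aggregate differently.

```python
import math
from statistics import mean, stdev


def aggregate_success(outcomes):
    """Return (mean, standard error) for per-seed success scores.

    Assumption: one score per seed. The standard error is the sample
    standard deviation divided by sqrt(N).
    """
    n = len(outcomes)
    if n == 0:
        raise ValueError("need at least one seed")
    m = mean(outcomes)
    # stdev() needs at least two samples; report 0.0 for a single seed.
    se = stdev(outcomes) / math.sqrt(n) if n > 1 else 0.0
    return m, se


if __name__ == "__main__":
    # Hypothetical outcomes for five seeds of one task.
    m, se = aggregate_success([1, 1, 0, 1, 1])
    print(f"{m:.2f} ± {se:.2f}")
```

With few seeds the standard error is wide, so a difference between two runs is only meaningful once it clearly exceeds the combined error bars.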
Interaction
- Browser live viewer — FastAPI + WebSocket + SPA; task dropdown, Run/Reset, Record, Inspector scrubber.
- Keyboard teleop — WASD/QE drives EE; Space toggles gripper.
Loop closure
- LocalRecorder — per-episode MP4 + events.jsonl + result.json.
- LeRobot v3 export — robo-sandbox export-lerobot CLI.
- Policy protocol — ReplayTrajectoryPolicy, LeRobotPolicyAdapter, run_policy.
- RealRobotBackend — satisfies SimBackendProtocol; SO-101 skeleton under examples/so101_handoff/.
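The idea behind a policy protocol is that a replayed trajectory and a learned model can be driven by the same loop as long as both expose the same method. The sketch below illustrates that pattern with typing.Protocol. Every name in it (the act method, ReplayPolicy, rollout) is hypothetical and chosen for illustration; the actual signatures of RoboSandbox's Policy, ReplayTrajectoryPolicy, and run_policy are not given here and may differ.

```python
from typing import Protocol, Sequence


class Policy(Protocol):
    """Hypothetical interface: map an observation to an action."""

    def act(self, observation: dict) -> Sequence[float]: ...


class ReplayPolicy:
    """Plays back a recorded action sequence, ignoring observations."""

    def __init__(self, actions):
        self._actions = list(actions)
        if not self._actions:
            raise ValueError("need at least one recorded action")
        self._step = 0

    def act(self, observation: dict) -> Sequence[float]:
        # Hold the final action once the recording is exhausted.
        index = min(self._step, len(self._actions) - 1)
        self._step += 1
        return self._actions[index]


def rollout(policy: Policy, observations):
    """Feed observations to the policy and collect its actions."""
    return [policy.act(obs) for obs in observations]
```

Because Protocol uses structural typing, ReplayPolicy does not need to inherit from Policy; any object with a matching act method can be passed to rollout.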
Open / deferred
Pillar 2 — task diversity
- insert_peg skill — needs a peg-hole articulated primitive (prismatic-jointed hole with compliance). ~1 day slice.
- More composites — "put the apple in the drawer", "stack three cubes by colour", "tidy the table".
- Richer randomization fields — rgba, size, mass (currently only xy + yaw).
Pillar 3 — interaction
- Client-side orbit camera — move viewer render from server-side MJPEG to client-side Three.js. ~2–3 days; needs its own spec.
- Gamepad + continuous teleop — extend keyboard teleop with gamepad axes → velocity integration.
- Trajectory inspector — post-run scrubber; requires in-RAM episode buffer.
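For the gamepad item, "axes → velocity integration" means treating each stick deflection as a commanded velocity and integrating it over the control period to move the end-effector target. The sketch below is one minimal way to do that, with a deadzone to suppress stick drift. It is not taken from RoboSandbox: the function names, the 0.1 deadzone, and the 0.2 m/s maximum speed are assumptions.

```python
def apply_deadzone(value, deadzone=0.1):
    """Zero out small stick deflections and rescale the rest to [-1, 1]."""
    if abs(value) < deadzone:
        return 0.0
    sign = 1.0 if value > 0 else -1.0
    return sign * (abs(value) - deadzone) / (1.0 - deadzone)


def integrate_ee_target(position, axes, dt, max_speed=0.2):
    """Advance an (x, y, z) target by one control step.

    axes: stick deflections in [-1, 1], one per Cartesian axis.
    dt: control period in seconds.
    max_speed: speed in m/s at full deflection (assumed value).
    """
    return tuple(
        p + apply_deadzone(a) * max_speed * dt
        for p, a in zip(position, axes)
    )
```

Unlike discrete key presses, this gives motion proportional to how far the stick is pushed, and the target stays put when the stick rests inside the deadzone.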
Pillar 4 — loop closure
- First integration with a real policy checkpoint — wire a public LeRobot/ACT/Diffusion ckpt into load_policy, validate against pick_cube_franka.
- SO-101 reference backend — first concrete RealRobotBackend subclass on LeRobot's SO-101 driver.
Polish
- Full-mesh Franka visuals — --meshes full downloads menagerie's 33 MB of OBJ visuals on demand, caches locally.
- PyPI release — CI wheel+sdist on tag.
v0.3 directions (not committed)
- Full-scene randomization (lighting, camera, domain randomization).
- Grasp evaluation plugin (AnyGrasp / GraspNet / antipodal / learned).
- More robot arms (UR5, xArm, SO-100 variants, two-arm setups).
- Procedural kitchens / desks / shelves beyond tabletop_clutter.
- Soft-body objects (fabric, fluids).
- Multi-robot scenes + coordination skills.