HDF5 adapter
Reads imitation-learning HDF5 files and turns each trajectory into a
RoboTrace episode. It depends on h5py only - not robomimic,
lerobot, or torch - and covers the two layouts that dominate behavior
cloning today:
- robomimic - one file, many trajectories under data/demo_0, data/demo_1, … Each demo holds actions, an obs group (proprioception + camera image stacks), rewards, dones, states. One demo → one episode.
- ALOHA / ACT - one file per episode: /action plus an /observations group (qpos, qvel, effort, and images/<camera> stacks). The whole file → one episode.
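The two layouts can be told apart from the top-level structure alone. The following is a minimal sketch of that idea, not the adapter's actual detection code: `detect_layout` and the returned `"aloha"` / `"unknown"` strings are hypothetical names chosen for illustration. It accepts any mapping that supports `in`, `[]`, and `.keys()`, which an open `h5py.File` does; plain nested dicts stand in for files here.

```python
from typing import Mapping


def detect_layout(root: Mapping) -> str:
    """Guess the file layout from its top-level structure (hypothetical helper)."""
    # robomimic: trajectories live under data/demo_<N>.
    if "data" in root and any(k.startswith("demo_") for k in root["data"].keys()):
        return "robomimic"
    # ALOHA / ACT: a top-level action dataset plus an observations group.
    if "action" in root and "observations" in root:
        return "aloha"
    return "unknown"


# Nested dicts standing in for open HDF5 files.
robomimic_like = {"data": {"demo_0": {"actions": None, "obs": {}}, "demo_1": {}}}
aloha_like = {"action": None, "observations": {"qpos": None, "images": {"top": None}}}

print(detect_layout(robomimic_like))  # robomimic
print(detect_layout(aloha_like))      # aloha
```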
from robotrace.adapters import hdf5
# ALOHA single-episode file → one RoboTrace episode.
hdf5.upload_episode(
"episode_0.hdf5",
policy_version="act-v1",
env_version="aloha-cell-1",
fps=50,
)
# robomimic multi-demo file → one episode per demo.
hdf5.upload_dataset("low_dim.hdf5", policy_version="bc-v3", fps=20)
Install
# Proprioception + actions only (no video).
pip install 'robotrace-dev[hdf5]==0.3.0'
# With image streams → MP4 encoding.
pip install 'robotrace-dev[hdf5,video]==0.3.0'
[hdf5] pulls in h5py and numpy (a few MB; h5py is a thin libhdf5 wrapper).
[video] adds opencv-python to encode (T, H, W, C) image datasets
into video.mp4. A sensor-only file never pays the opencv cost.
The four verbs
| Verb | What it does |
|---|---|
| hdf5.scan_file(path, fps=...) | Read-only introspection. Returns a FileSummary with the detected layout, trajectory count, fps, robot, and camera datasets. No frames decoded - safe on a multi-GB file. |
| hdf5.encode_episode(path, out, episode_index=...) | Encodes one trajectory into video.mp4 / sensors.npz / actions.npz under out. Returns an EncodedEpisode. No network. |
| hdf5.upload_episode(path, ...) | One-shot: encode one trajectory to a tempdir → start_episode + upload_* + finalize. Returns the finalized Episode. |
| hdf5.upload_dataset(path, ...) | Bulk: walk every trajectory in a multi-demo file and upload each. Returns the list of finalized Episodes; accepts an on_progress callback. |
Start with scan_file to confirm the layout and fps before uploading:
summary = hdf5.scan_file("low_dim.hdf5")
print(summary.report())
# low_dim.hdf5
# layout: robomimic
# trajectories: 200
# fps: 20
# robot_type: Panda
# env: Lift
# cameras: obs/agentview_image, obs/robot0_eye_in_hand_image
Slot mapping
Dataset names within a trajectory are routed by classify_dataset, a
pure function you can call directly to pin behavior:
| Dataset name | Slot |
|---|---|
| action, actions, action_dict/* | actions.npz |
| observations/images/<cam>, *_image, *_rgb, *_depth | video.mp4 |
| rewards, dones, success, discount | episode metadata |
| qpos, qvel, robot0_eef_pos, states, anything else | sensors.npz |
| timestamp, frame_index, index | dropped (bookkeeping) |
A name that looks like an image but isn't a (T, H, W, C) uint8 stack
falls back to sensors with a note in skipped_datasets.
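To make the routing table concrete, here is a rough sketch of name-based routing with the image fallback. It is an illustration only: `route_dataset`, `RULES`, and the slot strings are hypothetical, and the real classify_dataset signature and matching order are not documented here. Patterns are matched against the full dataset name with shell-style wildcards.

```python
from fnmatch import fnmatchcase

# (pattern, slot) pairs checked in order; the first match wins.
RULES = [
    ("timestamp", "dropped"), ("frame_index", "dropped"), ("index", "dropped"),
    ("action", "actions"), ("actions", "actions"), ("action_dict/*", "actions"),
    ("observations/images/*", "video"), ("*_image", "video"),
    ("*_rgb", "video"), ("*_depth", "video"),
    ("rewards", "metadata"), ("dones", "metadata"),
    ("success", "metadata"), ("discount", "metadata"),
]


def looks_like_image_stack(shape, dtype) -> bool:
    """True for a (T, H, W, C) uint8 array description."""
    return shape is not None and len(shape) == 4 and str(dtype) == "uint8"


def route_dataset(name, shape=None, dtype=None) -> str:
    """Return the slot for a dataset name (hypothetical stand-in for classify_dataset)."""
    for pattern, slot in RULES:
        if fnmatchcase(name, pattern):
            # Image-looking names that are not real image stacks fall back to sensors.
            if slot == "video" and not looks_like_image_stack(shape, dtype):
                return "sensors"
            return slot
    return "sensors"  # anything unrecognised is treated as a sensor stream


print(route_dataset("obs/agentview_image", (137, 84, 84, 3), "uint8"))  # video
print(route_dataset("obs/agentview_image", (137, 7), "float32"))        # sensors
print(route_dataset("action_dict/abs_pos"))                              # actions
```

Keeping the router a pure function of name, shape, and dtype is what makes it easy to pin in a unit test.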
NPZ layout
Proprioception lands in sensors.npz, actions in actions.npz, using
the same namespaced layout as the ROS 2, LeRobot, and Gymnasium
adapters:
observations/qpos/value float32[T, K] # flattened per-step values
observations/qpos/_t_ns int64[T] # synthetic step clock
action/value float32[T, action_dim]
Each dataset is read as (T, …) and reshaped to (T, K).
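The (T, …) → (T, K) reshape can be sketched with NumPy as below. This is an assumption about the mechanics, not the adapter's source: `flatten_steps` is a hypothetical helper, and the cast to float32 simply follows the layout listed above.

```python
import numpy as np


def flatten_steps(name: str, data) -> dict:
    """Flatten a (T, ...) array to (T, K) under the `<name>/value` key (hypothetical helper)."""
    arr = np.asarray(data, dtype=np.float32)
    num_steps = arr.shape[0]
    # Keep the time axis and collapse every remaining axis into K.
    return {f"{name}/value": arr.reshape(num_steps, -1)}


# A (T=5, 2, 7) dataset, e.g. two arms with seven joints each.
qpos = np.zeros((5, 2, 7))
entries = flatten_steps("observations/qpos", qpos)
print(entries["observations/qpos/value"].shape)  # (5, 14)
```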
Timestamps & fps
HDF5 imitation files rarely store a per-step clock - the spacing is
uniform by construction - so timestamps are synthesised from fps.
Pass the real capture rate (ALOHA is typically 50, robomimic 20)
via fps=. robomimic's control_freq is read from data.attrs["env_args"]
automatically. When nothing declares a rate, the adapter assumes 30 and
sets fps_assumed: true in the episode metadata.
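The fallback chain and the synthetic clock can be sketched in a few lines. Several details here are assumptions rather than documented behavior: that an explicit fps= takes precedence over the file, that control_freq sits under an env_kwargs key inside the env_args JSON string (as robomimic typically writes it), and that step i is stamped at round(i * 1e9 / fps) nanoseconds. `resolve_fps` and `synthetic_clock_ns` are hypothetical names.

```python
import json

DEFAULT_FPS = 30.0


def resolve_fps(explicit_fps=None, env_args_json=None):
    """Return (fps, assumed): explicit value, then the file's control_freq, then 30."""
    if explicit_fps is not None:
        return float(explicit_fps), False
    if env_args_json:
        env_kwargs = json.loads(env_args_json).get("env_kwargs", {})
        if "control_freq" in env_kwargs:
            return float(env_kwargs["control_freq"]), False
    # Nothing declared a rate: fall back and flag it, mirroring fps_assumed.
    return DEFAULT_FPS, True


def synthetic_clock_ns(num_steps: int, fps: float) -> list:
    """Uniformly spaced step timestamps in integer nanoseconds."""
    return [round(i * 1_000_000_000 / fps) for i in range(num_steps)]


print(resolve_fps(env_args_json='{"env_kwargs": {"control_freq": 20}}'))  # (20.0, False)
print(resolve_fps())                                                      # (30.0, True)
print(synthetic_clock_ns(3, 50.0))  # [0, 20000000, 40000000]
```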
Images & color order
Image datasets are encoded to one video.mp4 (single camera) or a
horizontal tile (multiple cameras, frames aligned by index). Stored
arrays are assumed RGB and converted to BGR for the encoder; pass
image_color="bgr" if your file already stores BGR.
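For intuition, tiling and color conversion amount to a concatenation along the width axis plus a channel reversal. The sketch below assumes all cameras share the same height and that stacks of unequal length are truncated to the shortest; neither is confirmed by this page, and `tile_cameras` is a hypothetical helper rather than part of the adapter.

```python
import numpy as np


def tile_cameras(stacks, image_color="rgb"):
    """Tile (T, H, W, 3) uint8 stacks side by side and return BGR frames."""
    # Align frames by index; truncating to the shortest stack is an assumption.
    num_frames = min(s.shape[0] for s in stacks)
    tiled = np.concatenate([s[:num_frames] for s in stacks], axis=2)
    if image_color == "rgb":
        tiled = tiled[..., ::-1]  # reverse the channel axis: RGB -> BGR
    return np.ascontiguousarray(tiled)


top = np.zeros((4, 2, 3, 3), dtype=np.uint8)
top[..., 0] = 255  # pure red in RGB
wrist = np.zeros((5, 2, 3, 3), dtype=np.uint8)
wrist[..., 2] = 255  # pure blue in RGB

frames = tile_cameras([top, wrist])
print(frames.shape)  # (4, 2, 6, 3)
```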
hdf5.upload_episode(
"episode_0.hdf5",
fps=50,
canonical_camera="observations/images/top", # one camera, skip the tile
)
Episode metadata
The encoder merges HDF5 facts into episode metadata:
{
"adapter": "hdf5",
"hdf5_layout": "robomimic",
"hdf5_source": "low_dim.hdf5",
"hdf5_episode_index": 0,
"hdf5_episode_key": "demo_0",
"hdf5_trajectory_length": 137,
"hdf5_robot_type": "Panda",
"hdf5_env": "Lift",
"hdf5_episode_outcome": { "dones": 1, "reward_sum": 4.0 }
}
Reproducibility fields (policy_version, env_version, git_sha,
seed) come from the caller, same as every other adapter.
Defaults
| Parameter | Default |
|---|---|
| source | "replay" |
| episode_index | 0 |
| fps | read from file, else 30 (assumed) |
| image_color | "rgb" |
Roadmap
Not yet shipped:
- RLDS / Open X-Embodiment (TFDS) import
- Streaming row-group reads for files that don't fit in memory
See also: log_episode for raw NumPy logging,
LeRobot adapter, Gymnasium adapter.