robotrace.log_episode

The single one-shot entry point for ingesting an episode. Equivalent to start_episode(...) → upload all artifacts → finalize. Use this for the 95% case of "I have files on disk, log them and move on."

The contract is sacred

Per AGENTS.md in the RoboTrace monorepo, this signature is the sacred SDK contract. Once we cut 1.0.0, breaking it requires:

  • A major version bump (1.x → 2.0)
  • At least one minor release that emits a DeprecationWarning before the break ships, so existing training scripts get an early warning instead of a TypeError
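
As an illustration of what such a deprecation window usually looks like in Python, here is a minimal sketch. The retired keyword run_name, the rename to name, and the function log_episode_compat are all invented for this example; none of them is part of RoboTrace.

```python
import warnings


def log_episode_compat(*, name=None, **legacy):
    """Hypothetical shim: accept a retired keyword for one minor release."""
    if "run_name" in legacy:  # "run_name" is an invented legacy keyword
        warnings.warn(
            "run_name is deprecated; use name instead",
            DeprecationWarning,
            stacklevel=2,
        )
        legacy_name = legacy.pop("run_name")
        if name is None:
            name = legacy_name
    if legacy:
        # Anything else is a genuine mistake and still fails loudly.
        raise TypeError(f"unexpected keyword arguments: {sorted(legacy)}")
    return {"name": name}
```

Old call sites keep working and see a warning; once the shim is removed in the next major version, the same call would fail with a TypeError.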

Until 1.0.0 (we're at 0.1.0a0 today) we may still iterate on the shape — every change lands in CHANGELOG.md.

Signature

def log_episode(
    *,
    # Identification
    name: str | None = None,
    source: Literal["real", "sim", "replay"] = "real",
    robot: str | None = None,
 
    # Reproducibility — load-bearing per AGENTS.md
    policy_version: str | None = None,
    env_version: str | None = None,
    git_sha: str | None = None,
    seed: int | None = None,
 
    # Artifact paths (uploaded inline via signed PUT URLs)
    video: str | Path | None = None,
    sensors: str | Path | None = None,
    actions: str | Path | None = None,
 
    # Run details
    duration_s: float | None = None,
    fps: float | None = None,
    metadata: Mapping[str, Any] | None = None,
 
    # Final state — defaults to "ready". Pass "failed" when the run
    # errored before producing usable data.
    status: Literal["ready", "failed"] = "ready",
) -> Episode

All arguments are keyword-only — positional calls raise TypeError. This is intentional: it lets us add new params without breaking older call sites.
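
The keyword-only behavior comes from the bare * in the signature and is enforced by Python itself. The stand-in below (log_episode_stub is not the SDK function, only a two-parameter imitation of its shape) shows the effect:

```python
def log_episode_stub(*, name=None, source="real"):
    """Stand-in with the same bare `*` marker as the documented signature."""
    return {"name": name, "source": source}


# Keyword call works.
ok = log_episode_stub(name="pick_place_01", source="sim")

# Positional call is rejected by Python before the function body runs.
try:
    log_episode_stub("pick_place_01", "sim")
    positional_error = None
except TypeError as exc:
    positional_error = exc
```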

Identification

name: str | None

Human-readable label for the run, shown in the episodes list. Falls back to episode_<short_id> when omitted. Use the same naming scheme across runs of the same task — it makes the list filterable.

source: "real" | "sim" | "replay"

Where the episode came from:

  • real — physical robot. The default.
  • sim — simulator (MuJoCo, Genesis, Isaac, Drake, etc.).
  • replay — generated by re-rolling a policy against a previously recorded observation stream. The eval engine sets this for you; you generally don't pass it manually.

robot: str | None

Stable identifier for the physical robot or sim configuration that produced the episode. We recommend a short slug (halcyon-bimanual-01, franka-right, ur5-cell-3) so the portal can group runs across days.

Reproducibility (load-bearing)

These four fields exist so that you can later re-roll a new policy against this episode and know what changed. The signature lets you omit them, but don't drop them to "simplify": the eval engine can't run without them.

policy_version: str | None

A stable identifier for the policy / model checkpoint that produced this episode. Conventions we recommend:

Style              Example
SL/IL              ckpt_2026-05-01_step_180k
RL                 ppo_2026-05-01_seed42
Frozen baseline    baseline_v1
VLA                pap-v3.2.1 (semver against the policy)

Whatever you pick — make it resolvable. The re-roll feature can only re-run a policy version it can locate, so don't put random hashes here unless your registry can map them back to weights.

env_version: str | None

The environment / world version. For sim, the build hash or config tag (mujoco_warehouse_v3, genesis-rev412). For real-world, the workcell setup version (cell_a_2026-04-12). Required so re-rolls know whether comparing across policy_versions is fair.

git_sha: str | None

The git SHA of your training/inference code at the time the episode was produced. We don't validate that the SHA exists in any specific repo — that's between you and your CI. Seven characters minimum is the convention.
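
One way to fill git_sha automatically is to ask git at the start of the run. This is a sketch, not an SDK feature: current_git_sha and looks_like_sha are hypothetical helper names, and the code assumes the git CLI is on PATH.

```python
import re
import subprocess


def current_git_sha(repo_dir="."):
    """Return the HEAD commit SHA of repo_dir, or None if it cannot be read."""
    try:
        out = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            cwd=repo_dir,
            capture_output=True,
            text=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return None  # git missing, or repo_dir is not a repository
    return out.stdout.strip()


def looks_like_sha(value):
    """Check the 'seven characters minimum' convention for a hex SHA."""
    return bool(re.fullmatch(r"[0-9a-f]{7,64}", value))
```

The result can then be passed as git_sha=current_git_sha(). Returning None instead of raising keeps the episode loggable from an exported tarball, at the cost of losing the SHA.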

seed: int | None

The seed used by the policy / env. If your stack uses multiple seeds, pass the highest-level one and stash the rest in metadata.
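
A small sketch of that split, assuming a metadata key called "seeds" (the key name and the helper seed_fields are made up for this example; any key works, since metadata is free-form):

```python
def seed_fields(top_level_seed, **component_seeds):
    """Split seeds into the `seed` argument and a metadata entry."""
    return {
        "seed": top_level_seed,
        "metadata": {"seeds": dict(component_seeds)},
    }


kwargs = seed_fields(42, env=1042, policy=2042, sampler=3042)
# kwargs can be merged into the keyword arguments for log_episode.
```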

Artifacts

Local file paths. Each is uploaded to Cloudflare R2 via a short-lived signed PUT URL — bytes never touch the RoboTrace origin server. The SDK streams from disk so memory stays flat regardless of file size.

video: str | Path | None

A video file (.mp4, .webm, .mov). The signed URL is minted with Content-Type: video/mp4, so the file's actual content type needs to match. Files up to 8 GB are supported in Phase 1; split longer episodes.
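
If you want to catch oversized videos before starting an upload, a local check along these lines may help. It assumes "8 GB" means decimal gigabytes (8 * 1000**3 bytes), which this page does not specify, and video_within_limit is a hypothetical helper, not an SDK call.

```python
from pathlib import Path

MAX_VIDEO_BYTES = 8 * 1000**3  # assumption: "8 GB" read as decimal gigabytes


def video_within_limit(path, limit=MAX_VIDEO_BYTES):
    """Return True if the file at `path` is no larger than `limit` bytes."""
    return Path(path).stat().st_size <= limit
```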

sensors: str | Path | None

A serialized sensor blob — typically a .npy, .npz, .h5, or .bin file containing per-step sensor arrays ((T, ...) shaped, time axis first). Format is opaque to the server: we store the bytes and let your replay tooling deserialize them.

actions: str | Path | None

A serialized actions blob, typically a .parquet, .feather, or .npy file, containing the (T, action_dim) array of per-step action vectors. Required if you want to re-roll a different policy on this episode later.

The SDK sanity-checks file extensions against slot names. Passing actions="run.mp4" raises ConfigurationError — likely the kwargs got swapped.
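
The same kind of check can be run locally before calling the SDK. The sketch below is not the SDK's implementation: the extension table only lists the extensions mentioned on this page (the real table may differ), check_slot is a hypothetical name, and it raises ValueError rather than ConfigurationError.

```python
from pathlib import Path

# Assumption: the extensions named on this page, not the SDK's real table.
SLOT_EXTENSIONS = {
    "video": {".mp4", ".webm", ".mov"},
    "sensors": {".npy", ".npz", ".h5", ".bin"},
    "actions": {".parquet", ".feather", ".npy"},
}


def check_slot(slot, path):
    """Raise ValueError if path's extension is not expected for the slot."""
    suffix = Path(path).suffix.lower()
    if suffix not in SLOT_EXTENSIONS[slot]:
        raise ValueError(
            f"{slot}={str(path)!r} has unexpected extension {suffix!r}"
        )
    return Path(path)
```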

Run details

duration_s: float | None

Wall-clock duration of the run in seconds. Shown on the detail page and used by the dashboard heatmap to weight cells.

fps: float | None

Sampling rate for the recorded sensors / actions. Used by the replay viewer (when it ships) to align video and sensor tracks.

metadata: Mapping[str, Any] | None

Free-form JSON metadata stored as metadata jsonb on the episode row. Use it for anything that doesn't fit the standard fields: operator, lighting, shift, hardware revision, task outcome, etc.

Per AGENTS.md, don't put bytes or raw sensor values here — that's what the artifact slots are for. The DB column is indexed for JSONB search but not designed for multi-MB blobs.
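
A local guard can catch both mistakes (non-JSON values and oversized payloads) before the request is sent. This is a sketch under stated assumptions: the 64 KiB budget is an arbitrary local choice, not a documented server limit, and check_metadata is a hypothetical helper.

```python
import json

MAX_METADATA_BYTES = 64 * 1024  # assumption: local budget, not a server limit


def check_metadata(metadata, limit=MAX_METADATA_BYTES):
    """Return the JSON size in bytes, or raise if metadata is unsuitable."""
    try:
        encoded = json.dumps(metadata).encode("utf-8")
    except (TypeError, ValueError) as exc:
        # bytes objects and other non-JSON values end up here
        raise ValueError(f"metadata is not JSON-serializable: {exc}") from exc
    if len(encoded) > limit:
        raise ValueError(f"metadata is {len(encoded)} bytes; limit is {limit}")
    return len(encoded)
```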

status: "ready" | "failed"

Final state to flip the episode into. Defaults to "ready". Pass "failed" when you know the run errored before producing usable data — the episode still appears in the list but is filtered out of "recent successful runs" cards.
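
One way to decide the status at the call site is to wrap the run itself. In this sketch, run_and_classify and the metadata keys "error" and "outcome" are illustrative names, not SDK conventions.

```python
def run_and_classify(run_fn):
    """Run one episode and return the status/metadata to log for it."""
    try:
        outcome = run_fn()
    except Exception as exc:
        # The run errored before producing usable data.
        return {"status": "failed", "metadata": {"error": repr(exc)}}
    return {"status": "ready", "metadata": {"outcome": outcome}}
```

The returned dict can be merged into the keyword arguments for log_episode, so failed runs are still recorded instead of silently disappearing.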

Return value

@dataclass
class Episode:
    id: str                               # uuid, as str
    status: str                           # "ready" or "failed"
    storage: Literal["r2", "unconfigured"]
    upload_urls: dict[ArtifactKind, UploadUrl]

You rarely need the return value from log_episode: by the time it returns, everything's already uploaded and finalized. It is useful when you want to capture the episode id for your own logs:

ep = rt.log_episode(...)
my_logger.info("logged episode", episode_id=ep.id)

Errors

log_episode raises typed exceptions on every failure path. See Errors for the full hierarchy and recovery patterns. The most common ones in this call:

Exception             When
ConfigurationError    api_key / base_url missing, or a file path doesn't exist
AuthError             API key bad / revoked
ValidationError       Payload didn't pass server-side validation
ConflictError         (rare) Episode is somehow already archived
TransportError        Network / DNS / timeout
ServerError           5xx; flag for retries

If an upload fails partway through, the SDK auto-flips the run to status="failed" with the failure reason in metadata.failure_reason before re-raising — so you don't end up with ghostly "recording" runs in the portal.
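
For the transient failures (TransportError, ServerError), a retry wrapper with exponential backoff is a common recovery pattern. The sketch below defines local stand-in exception classes so it runs on its own; in real code you would catch the SDK's exceptions instead. The helper with_retries and its defaults are assumptions, not part of the SDK.

```python
import time


class TransportError(Exception):
    """Local stand-in for the SDK exception of the same name."""


class ServerError(Exception):
    """Local stand-in for the SDK exception of the same name."""


def with_retries(call, attempts=3, base_delay_s=1.0, sleep=time.sleep):
    """Call `call()` and retry transient failures with exponential backoff."""
    for attempt in range(attempts):
        try:
            return call()
        except (TransportError, ServerError):
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay_s * 2**attempt)
```

Because a failed attempt is flipped to status="failed", each retry would presumably appear as a separate episode in the portal, so keep the attempt count small.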

Don'ts

  • Don't call log_episode from inside your training inner loop. Rate-limit at episode boundaries, not per step.
  • Don't put episode bytes in metadata. The DB is for metadata, R2 is for bytes.
  • Don't log the API key in your training script — use environment variables. The SDK never logs the key value.
  • Don't pass positional arguments. The contract is keyword-only on purpose.
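
For the API key point above, a minimal sketch of reading the key from the environment and keeping it out of your own logs. The variable name ROBOTRACE_API_KEY and both helpers are assumptions made for this example; how the key is handed to the SDK is not covered on this page.

```python
import os


def api_key_from_env(var="ROBOTRACE_API_KEY"):
    """Read the API key from the environment; the variable name is assumed."""
    key = os.environ.get(var)
    if not key:
        raise RuntimeError(f"{var} is not set")
    return key


def redact(key):
    """Return a log-safe form of the key that reveals only its length."""
    return f"<api key, {len(key)} chars>"
```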