Platform Backbone

Unified Data Model & Traceability

The Unified Data Model & Traceability plane is what makes rFabric a platform instead of a collection of tools. It defines the canonical entities, lineage rules, promotion boundaries, and metadata authority that connect robot data, model development, release history, fleets, and field operations.

What This Surface Owns

Canonical entity graph

Every meaningful lifecycle object has a first-class place in the graph instead of living in component-local metadata islands.

Datasets, episodes, sensor streams, derived assets, and annotations
Training runs, checkpoints, promoted models, and artifacts
Deployments, telemetry, maintenance records, and intervention events
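
As a rough illustration of what "first-class place in the graph" could mean in practice, the sketch below registers entities under a (kind, id) key and only allows links between registered entities. The names Entity, EntityGraph, add, and link are hypothetical and are not taken from rFabric itself.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Entity:
    kind: str        # e.g. "dataset", "training_run", "deployment"
    entity_id: str


@dataclass
class EntityGraph:
    # (kind, id) -> Entity; one shared registry instead of per-tool metadata
    entities: dict = field(default_factory=dict)
    # set of (upstream_key, downstream_key) pairs
    edges: set = field(default_factory=set)

    def add(self, kind, entity_id):
        key = (kind, entity_id)
        self.entities.setdefault(key, Entity(kind, entity_id))
        return key

    def link(self, upstream, downstream):
        # Refuse edges that point at entities the graph does not know about.
        if upstream not in self.entities or downstream not in self.entities:
            raise KeyError("both endpoints must be registered first")
        self.edges.add((upstream, downstream))


graph = EntityGraph()
ds = graph.add("dataset", "ds-001")
run = graph.add("training_run", "run-042")
dep = graph.add("deployment", "dep-007")
graph.link(ds, run)
graph.link(run, dep)
```

Rejecting links to unregistered entities is one simple way to keep component-local identifiers from leaking into the shared graph.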

Lineage and promotion boundaries

The graph has to explain not only what exists, but also how it came to exist and when mutable work became immutable, promoted state.

Upstream and downstream relationship traversal
Immutable dataset versions, models, and release artifacts
Audit-friendly provenance across approvals and workflows
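
One way to picture the promotion boundary is a mutable draft that is frozen into a content-addressed, read-only version at promotion time. The sketch below assumes this approach. DraftDataset, DatasetVersion, promote, and the choice of SHA-256 are illustrative assumptions, not a description of how rFabric implements promotion.

```python
import hashlib
import json
from dataclasses import dataclass


class DraftDataset:
    """Mutable working state: episodes can still be added."""

    def __init__(self, name):
        self.name = name
        self.episode_ids = []

    def add_episode(self, episode_id):
        self.episode_ids.append(episode_id)


@dataclass(frozen=True)
class DatasetVersion:
    """Immutable promoted state: fields cannot be reassigned."""
    name: str
    episode_ids: tuple
    content_hash: str
    promoted_by: str   # actor recorded for audit purposes


def promote(draft, actor):
    # Copy and sort the contents so later edits to the draft cannot leak in
    # and so the hash does not depend on insertion order.
    episodes = tuple(sorted(draft.episode_ids))
    payload = json.dumps({"name": draft.name, "episodes": episodes}, sort_keys=True)
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return DatasetVersion(draft.name, episodes, digest, actor)


draft = DraftDataset("pick-and-place")
draft.add_episode("ep-2")
draft.add_episode("ep-1")
version = promote(draft, "human:alice")
draft.add_episode("ep-3")   # the draft keeps moving; the version does not
```

Because the hash is computed over sorted contents, two promotions of the same episode set yield the same identifier, which makes duplicate promotions easy to detect.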

Canonical Entity Families

Governance and ownership context

The platform needs explicit ownership and scope before it can do anything trustworthy across multiple teams or deployments.

Organization, workspace, region, and environment ancestry
Policy-linked and audit-linked state where required
Actor attribution for human, service, and robot actions
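
The ownership context above can be sketched as a scope path plus a typed actor attached to every recorded action. The Scope, Actor, and AuditedAction names below are hypothetical, as is the rule that a broader scope contains any scope sharing its prefix.

```python
from dataclasses import dataclass
from typing import Optional

ACTOR_KINDS = {"human", "service", "robot"}


@dataclass(frozen=True)
class Scope:
    organization: str
    workspace: Optional[str] = None
    region: Optional[str] = None
    environment: Optional[str] = None

    def path(self):
        # Ancestry runs organization -> workspace -> region -> environment
        # and stops at the first unset level.
        parts = []
        for level in (self.organization, self.workspace, self.region, self.environment):
            if level is None:
                break
            parts.append(level)
        return tuple(parts)

    def contains(self, other):
        mine = self.path()
        return other.path()[: len(mine)] == mine


@dataclass(frozen=True)
class Actor:
    kind: str
    actor_id: str

    def __post_init__(self):
        if self.kind not in ACTOR_KINDS:
            raise ValueError(f"unknown actor kind: {self.kind}")


@dataclass(frozen=True)
class AuditedAction:
    actor: Actor
    scope: Scope
    action: str


workspace = Scope("acme", "warehouse-bots")
prod = Scope("acme", "warehouse-bots", "eu-west", "prod")
event = AuditedAction(Actor("robot", "arm-17"), prod, "upload_episode")
```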

Data foundation entities

Robot data is more than a file upload. It is structured collection context, synchronized sensor streams, semantic labels, and quality decisions.

Dataset, episode, sensor stream, and derived asset
Annotation, annotation project, curation ruleset, and quality score
Versioned dataset outputs used downstream by model development
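
To make the relationship between episodes, quality scores, curation rulesets, and versioned outputs concrete, here is a minimal sketch. The field names, the 0.0 to 1.0 quality scale, and the build_version helper are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Episode:
    episode_id: str
    quality_score: float   # assumed scale: 0.0 (unusable) to 1.0 (clean)
    labels: frozenset      # semantic labels attached by annotation


@dataclass(frozen=True)
class CurationRuleset:
    min_quality: float
    required_labels: frozenset

    def accepts(self, episode):
        # An episode qualifies when it is clean enough and carries every
        # label the ruleset requires.
        return (episode.quality_score >= self.min_quality
                and self.required_labels <= episode.labels)


def build_version(episodes, ruleset):
    # The versioned output is the sorted, immutable set of accepted episodes.
    return tuple(sorted(e.episode_id for e in episodes if ruleset.accepts(e)))


episodes = [
    Episode("ep-1", 0.92, frozenset({"grasp", "success"})),
    Episode("ep-2", 0.40, frozenset({"grasp", "success"})),
    Episode("ep-3", 0.88, frozenset({"grasp"})),
]
ruleset = CurationRuleset(min_quality=0.8, required_labels=frozenset({"grasp", "success"}))
selected = build_version(episodes, ruleset)
```

Keeping the ruleset as data rather than ad hoc filtering code means the quality decision itself can be stored in the graph next to the version it produced.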

Model-development entities

Training lineage is only credible when every promoted result is connected to the exact data and evaluation context that produced it.

Experiment, training run, and checkpoint
Evaluation outputs and promotion evidence
Promoted model identity with parent-child lineage
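
Parent-child lineage for promoted models can be pictured as a chain of records, each pinned to the dataset version, training run, and evaluation that justified it. The PromotedModel fields and the ancestry helper below are hypothetical names used only for this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PromotedModel:
    model_id: str
    dataset_version: str        # exact data the model was trained on
    training_run: str
    evaluation_id: str          # promotion evidence
    parent_id: Optional[str] = None


def ancestry(registry, model_id):
    """Return the model and its ancestors, newest first."""
    chain, seen = [], set()
    current = model_id
    while current is not None:
        if current in seen:
            raise ValueError(f"lineage cycle at {current}")
        seen.add(current)
        model = registry[current]
        chain.append(model)
        current = model.parent_id
    return chain


registry = {
    "m-1": PromotedModel("m-1", "v10", "run-30", "eval-30"),
    "m-2": PromotedModel("m-2", "v11", "run-36", "eval-36", parent_id="m-1"),
    "m-3": PromotedModel("m-3", "v12", "run-42", "eval-42", parent_id="m-2"),
}
lineage = ancestry(registry, "m-3")
```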

Release and operations entities

The same graph needs to explain what was packaged, what was deployed, what happened in the field, and how those outcomes fed future decisions.

Artifact, deployment, deployment target, and rollout history
Telemetry event, maintenance record, and intervention event
Robot, fleet, site, and hardware context for field behavior

Relationship Chains That Matter

The point is not a long entity list. The point is that teams can move from any downstream outcome back to the exact upstream decisions that produced it.

From raw session to deployed model

Robot -> dataset -> episode -> annotation and quality decisions -> dataset version -> training run -> model -> artifact -> deployment -> live telemetry

From field failure back to source evidence

Incident -> deployment -> artifact -> model -> training run -> dataset version -> selected episodes -> collection context
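
The failure-to-source chain above amounts to walking upstream edges until nothing is left to visit. A minimal sketch, assuming lineage is available as a mapping from each entity to its direct upstream entities (the identifiers and the trace_upstream function are made up for illustration):

```python
from collections import deque

# Each key maps to the entities directly upstream of it (illustrative data).
UPSTREAM = {
    "incident:inc-09": ["deployment:dep-7"],
    "deployment:dep-7": ["artifact:art-3"],
    "artifact:art-3": ["model:m-2"],
    "model:m-2": ["training_run:run-42"],
    "training_run:run-42": ["dataset_version:v12"],
    "dataset_version:v12": ["episode:ep-1", "episode:ep-2"],
}


def trace_upstream(start, upstream=UPSTREAM):
    """Breadth-first walk from an outcome back to everything that produced it."""
    order, seen, queue = [], {start}, deque([start])
    while queue:
        node = queue.popleft()
        for parent in upstream.get(node, []):
            if parent not in seen:   # guard against shared ancestors
                seen.add(parent)
                order.append(parent)
                queue.append(parent)
    return order


evidence = trace_upstream("incident:inc-09")
```

With the sample data, evidence lists the deployment, artifact, model, training run, dataset version, and both selected episodes, in that order.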

From intervention to next model improvement

Intervention -> correction capture -> review -> curation ruleset -> updated dataset version -> retraining -> evaluation -> promoted model

Questions the graph can answer

Which field incidents trace back to dataset version v12?
Which operator sessions influenced the current model at one site?
Which benchmark cases were added because intervention frequency rose?
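
The first question above can be sketched as a downstream traversal that keeps only entities of one kind. The edge list, identifiers, and the descendants_of_kind helper are hypothetical. A real deployment would presumably query a graph store rather than an in-memory list.

```python
# (upstream, downstream) pairs; each entity is a (kind, id) tuple.
EDGES = [
    (("dataset_version", "v12"), ("training_run", "run-42")),
    (("training_run", "run-42"), ("model", "m-2")),
    (("model", "m-2"), ("artifact", "art-3")),
    (("artifact", "art-3"), ("deployment", "dep-7")),
    (("deployment", "dep-7"), ("incident", "inc-09")),
    (("deployment", "dep-7"), ("incident", "inc-11")),
    (("dataset_version", "v13"), ("training_run", "run-50")),
    (("training_run", "run-50"), ("model", "m-3")),
    (("model", "m-3"), ("artifact", "art-4")),
    (("artifact", "art-4"), ("deployment", "dep-8")),
    (("deployment", "dep-8"), ("incident", "inc-12")),
]


def descendants_of_kind(edges, start, kind):
    """Return sorted ids of every downstream entity of the given kind."""
    children = {}
    for upstream, downstream in edges:
        children.setdefault(upstream, []).append(downstream)

    stack, seen, hits = [start], {start}, []
    while stack:
        node = stack.pop()
        for child in children.get(node, []):
            if child not in seen:
                seen.add(child)
                if child[0] == kind:
                    hits.append(child[1])
                stack.append(child)
    return sorted(hits)


incidents_from_v12 = descendants_of_kind(EDGES, ("dataset_version", "v12"), "incident")
```

The same helper answers related questions by changing the start entity or the kind, for example listing every deployment downstream of a given training run.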

Why Teams Care

Traceability

Any deployment, incident, or maintenance event can be traced back to the exact data and decisions that produced it.

Cleaner handoffs

Data, ML, platform, and operations teams use one entity language instead of translating between disconnected tools.

Safer change

Robots, sensors, schemas, workflows, and environments can evolve without destroying historical continuity.

Compounding platform value

The more surface area a team adopts, the more useful the graph becomes because every new workflow enriches the same system of record.