Platform Backbone
Unified Data Model & Traceability
The Unified Data Model & Traceability plane is what makes rFabric a platform instead of a collection of tools. It defines the canonical entities, lineage rules, promotion boundaries, and metadata authority that connect robot data, model development, release history, fleets, and field operations.
What This Surface Owns
Canonical entity graph
Every meaningful lifecycle object has a first-class place in the graph instead of living in component-local metadata islands.
- Datasets, episodes, sensor streams, derived assets, and annotations
- Training runs, checkpoints, promoted models, and artifacts
- Deployments, telemetry, maintenance records, and intervention events
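As a rough illustration of what "first-class place in the graph" could mean, the sketch below models entities as typed nodes in a single shared registry. The kind names are taken from the lists above; the `Entity` and `EntityGraph` classes, the relation labels, and the IDs are hypothetical and are not rFabric's actual API.

```python
from dataclasses import dataclass, field

# Hypothetical kind names, derived from the entity lists in this section.
KINDS = {
    "dataset", "episode", "sensor_stream", "derived_asset", "annotation",
    "training_run", "checkpoint", "model", "artifact",
    "deployment", "telemetry", "maintenance_record", "intervention_event",
}


@dataclass(frozen=True)
class Entity:
    kind: str
    id: str


@dataclass
class EntityGraph:
    """One shared registry of entities and typed edges (illustrative only)."""
    entities: dict[tuple[str, str], Entity] = field(default_factory=dict)
    edges: set[tuple[Entity, str, Entity]] = field(default_factory=set)

    def add(self, kind: str, id: str) -> Entity:
        # Reject kinds the graph does not know about, so metadata cannot
        # quietly accumulate outside the canonical model.
        if kind not in KINDS:
            raise ValueError(f"unknown entity kind: {kind}")
        return self.entities.setdefault((kind, id), Entity(kind, id))

    def link(self, src: Entity, relation: str, dst: Entity) -> None:
        # Both endpoints must already be registered in this graph.
        for e in (src, dst):
            if self.entities.get((e.kind, e.id)) != e:
                raise KeyError(f"entity not registered: {e}")
        self.edges.add((src, relation, dst))


graph = EntityGraph()
dataset = graph.add("dataset", "ds-1")
episode = graph.add("episode", "ep-1")
graph.link(dataset, "contains", episode)
```

Registering the same kind and ID twice returns the existing node, which is one simple way to keep a single identity per lifecycle object.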
Lineage and promotion boundaries
The graph has to explain not only what exists, but how it came to exist and when mutable work became immutable promoted state.
- Upstream and downstream relationship traversal
- Immutable dataset versions, models, and release artifacts
- Audit-friendly provenance across approvals and workflows
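One way to picture a promotion boundary is a function that turns mutable working state into a frozen, content-addressed record. The sketch below is an assumption about how that could look: `DraftDataset`, `DatasetVersion`, `promote`, and the `approved_by` field are hypothetical names, and SHA-256 over a sorted episode list is only one possible digest scheme.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass
class DraftDataset:
    """Mutable working state: episodes can still be added or removed."""
    name: str
    episode_ids: list[str]


@dataclass(frozen=True)
class DatasetVersion:
    """Immutable promoted state: fields cannot be reassigned after creation."""
    name: str
    version: int
    episode_ids: tuple[str, ...]
    digest: str
    approved_by: str


def promote(draft: DraftDataset, version: int, approved_by: str) -> DatasetVersion:
    # Copy the episode list into a sorted tuple so later edits to the draft
    # cannot change the promoted version.
    episodes = tuple(sorted(draft.episode_ids))
    payload = json.dumps({"name": draft.name, "episodes": episodes}, sort_keys=True)
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return DatasetVersion(draft.name, version, episodes, digest, approved_by)


draft = DraftDataset("pick-place", ["ep-2", "ep-1"])
v1 = promote(draft, version=1, approved_by="alice")
draft.episode_ids.append("ep-3")  # the draft keeps evolving; v1 does not
```

Because the digest depends only on the dataset name and its episode set, two promotions of identical content yield the same digest regardless of who approved them.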
Canonical Entity Families
Governance and ownership context
The platform needs explicit ownership and scope before it can do anything trustworthy across multiple teams or deployments.
- Organization, workspace, region, and environment ancestry
- Policy and audit-linked state where required
- Actor attribution for human, service, and robot actions
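The ownership context above can be sketched as a scope record attached to every audited action. The nesting order (organization, then workspace, then region, then environment) is an assumption made for illustration, as are the `Scope`, `ActorKind`, and `AuditRecord` names and all sample values.

```python
from dataclasses import dataclass
from enum import Enum


class ActorKind(Enum):
    HUMAN = "human"
    SERVICE = "service"
    ROBOT = "robot"


@dataclass(frozen=True)
class Scope:
    organization: str
    workspace: str
    region: str
    environment: str

    def ancestry(self) -> tuple[str, ...]:
        # Outermost to innermost; the ordering is an assumption of this sketch.
        return (self.organization, self.workspace, self.region, self.environment)

    def within(self, prefix: tuple[str, ...]) -> bool:
        # True when this scope sits under the given ancestry prefix.
        return self.ancestry()[: len(prefix)] == prefix


@dataclass(frozen=True)
class AuditRecord:
    actor_kind: ActorKind
    actor_id: str
    action: str
    scope: Scope


scope = Scope("acme", "warehouse-bots", "eu-west", "production")
record = AuditRecord(ActorKind.ROBOT, "robot-17", "upload_episode", scope)
```

A check such as `record.scope.within(("acme", "warehouse-bots"))` would then let a policy decide whether an action falls inside a given workspace.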
Data foundation entities
Robot data is more than a file upload. It is structured collection context, synchronized sensor streams, semantic labels, and quality decisions.
- Dataset, episode, sensor stream, and derived asset
- Annotation, annotation project, curation ruleset, and quality score
- Versioned dataset outputs used downstream by model development
Model-development entities
Training lineage is only credible when every promoted result is connected to the exact data and evaluation context that produced it.
- Experiment, training run, and checkpoint
- Evaluation outputs and promotion evidence
- Promoted model identity with parent-child lineage
Release and operations entities
The same graph needs to explain what was packaged, what was deployed, what happened in the field, and how those outcomes fed future decisions.
- Artifact, deployment, deployment target, and rollout history
- Telemetry event, maintenance record, and intervention event
- Robot, fleet, site, and hardware context for field behavior
Relationship Chains That Matter
The value is not the length of the entity list. It is that teams can move from any downstream outcome back to the exact upstream decisions that produced it.
From raw session to deployed model
Robot -> dataset -> episode -> annotation and quality decisions -> dataset version -> training run -> model -> artifact -> deployment -> live telemetry
From field failure back to source evidence
Incident -> deployment -> artifact -> model -> training run -> dataset version -> selected episodes -> collection context
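The failure-to-evidence chain above amounts to an upstream traversal over lineage edges. The sketch below uses a plain dictionary and breadth-first search; the entity IDs and the `PARENTS` mapping are made up for illustration, and a real lineage store would presumably sit behind a query API rather than an in-memory dict.

```python
from collections import deque

# Hypothetical IDs. Each key maps to its direct upstream parents.
PARENTS = {
    "incident:inc-7": ["deployment:dep-3"],
    "deployment:dep-3": ["artifact:art-9"],
    "artifact:art-9": ["model:m-4"],
    "model:m-4": ["training_run:run-21"],
    "training_run:run-21": ["dataset_version:v12"],
    "dataset_version:v12": ["episode:ep-1", "episode:ep-2"],
    "episode:ep-1": ["collection_context:session-a"],
    "episode:ep-2": ["collection_context:session-a"],
}


def upstream(start: str, parents: dict[str, list[str]]) -> list[str]:
    """Return every ancestor of `start`, nearest first, each listed once."""
    seen = {start}
    order: list[str] = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for parent in parents.get(node, []):
            if parent not in seen:
                seen.add(parent)
                order.append(parent)
                queue.append(parent)
    return order


evidence = upstream("incident:inc-7", PARENTS)
```

The `seen` set matters once lineage fans in: both episodes share one collection context, and it should be reported once.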
From intervention to next model improvement
Intervention -> correction capture -> review -> curation ruleset -> updated dataset version -> retraining -> evaluation -> promoted model
Questions these chains should make answerable:
- Which field incidents trace back to dataset version v12?
- Which operator sessions influenced the current model at one site?
- Which benchmark cases were added because intervention frequency rose?
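The first of these questions is a downstream traversal filtered by entity kind. The following sketch shows one way it could be answered over a list of lineage edges; the edge data, the `kind:id` naming scheme, and the `downstream_of_kind` helper are all hypothetical.

```python
from collections import defaultdict, deque

# Hypothetical lineage edges as (upstream, downstream) pairs.
EDGES = [
    ("dataset_version:v12", "training_run:run-21"),
    ("dataset_version:v11", "training_run:run-20"),
    ("training_run:run-21", "model:m-4"),
    ("training_run:run-20", "model:m-3"),
    ("model:m-4", "artifact:art-9"),
    ("model:m-3", "artifact:art-8"),
    ("artifact:art-9", "deployment:dep-3"),
    ("artifact:art-8", "deployment:dep-2"),
    ("deployment:dep-3", "incident:inc-7"),
    ("deployment:dep-3", "incident:inc-8"),
    ("deployment:dep-2", "incident:inc-5"),
]


def downstream_of_kind(start: str, kind: str, edges: list[tuple[str, str]]) -> list[str]:
    """Find all descendants of `start` whose ID prefix matches `kind`."""
    children: dict[str, list[str]] = defaultdict(list)
    for up, down in edges:
        children[up].append(down)

    seen = {start}
    queue = deque([start])
    hits: list[str] = []
    while queue:
        node = queue.popleft()
        for child in children.get(node, []):
            if child in seen:
                continue
            seen.add(child)
            queue.append(child)
            if child.split(":", 1)[0] == kind:
                hits.append(child)
    return sorted(hits)


incidents_from_v12 = downstream_of_kind("dataset_version:v12", "incident", EDGES)
```

The other two questions follow the same pattern with a different start node and kind filter, provided the relevant edges (operator session to episode, intervention to benchmark case) are recorded in the graph.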
Why Teams Care
Traceability
Any deployment, incident, or maintenance event can be traced back to the exact data and decisions that produced it.
Cleaner handoffs
Data, ML, platform, and operations teams use one entity language instead of translating between disconnected tools.
Safer change
Robots, sensors, schemas, workflows, and environments can evolve without destroying historical continuity.
Compounding platform value
The more surface area a team adopts, the more useful the graph becomes because every new workflow enriches the same system of record.