Model Lifecycle at Miro AI

Every change to prompts, models, or tool wiring must pass through a simple but strict lifecycle. The lifecycle keeps quality measurable and behavior predictable for end users.

Stages

Experiment: freeform iteration; logs are sampled; offline evals encouraged but optional.
Staging: prompt version frozen; evaluation dataset defined; offline eval required.
Pilot: gated rollout to a small cohort; online metrics monitored; rollback plan ready.
Production: broad rollout with SLOs; changes require review and re-evaluation.
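
The four stages above can be read as a small state machine with a gate in front of each promotion. The sketch below is illustrative only: the Candidate record, its flag names, and the mapping of requirements to stages in ENTRY_REQUIREMENTS are assumptions made for this example, not the actual Miro AI tooling or policy.

```python
from dataclasses import dataclass
from enum import IntEnum


class Stage(IntEnum):
    EXPERIMENT = 0
    STAGING = 1
    PILOT = 2
    PRODUCTION = 3


@dataclass
class Candidate:
    """A change moving through the lifecycle (hypothetical record shape)."""
    name: str
    stage: Stage = Stage.EXPERIMENT
    prompt_frozen: bool = False
    dataset_defined: bool = False
    offline_eval_passed: bool = False
    rollback_plan_ready: bool = False
    review_approved: bool = False


# Assumed entry requirements per stage; adjust to the real policy.
ENTRY_REQUIREMENTS = {
    Stage.STAGING: ("prompt_frozen", "dataset_defined"),
    Stage.PILOT: ("offline_eval_passed", "rollback_plan_ready"),
    Stage.PRODUCTION: ("offline_eval_passed", "review_approved"),
}


def missing_requirements(candidate, target):
    """Return the names of unmet flags for entering `target`."""
    return [flag for flag in ENTRY_REQUIREMENTS[target]
            if not getattr(candidate, flag)]


def promote(candidate):
    """Move the candidate exactly one stage forward, or raise ValueError."""
    if candidate.stage is Stage.PRODUCTION:
        raise ValueError("already in Production")
    target = Stage(candidate.stage + 1)
    missing = missing_requirements(candidate, target)
    if missing:
        raise ValueError(
            f"cannot promote to {target.name}: missing {', '.join(missing)}"
        )
    candidate.stage = target
    return target
```

Promotion moves one stage at a time, so a change cannot skip Staging or Pilot on its way to Production.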

Prompt Management

Prompts are versioned in the Registry with semantic labels and changelogs.
Structured templates support variables, safety inserts, and system instructions.
Approval is required before promotion to Production.
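
As a rough illustration of the points above, the following sketch models a versioned prompt with a changelog, template variables, and an approval check. PromptRegistry, PromptVersion, and the variable names (system_instructions, safety_insert, request) are hypothetical stand-ins; the real Registry API is not described in this document. Rendering uses Python's standard string.Template.

```python
from dataclasses import dataclass, replace
from string import Template


@dataclass(frozen=True)
class PromptVersion:
    version: str            # semantic label, e.g. "1.0.0"
    template: str           # string.Template syntax: $variable
    changelog: str
    approved: bool = False


class PromptRegistry:
    """In-memory stand-in for a prompt registry (hypothetical)."""

    def __init__(self):
        self._versions = {}  # (name, version) -> PromptVersion

    def register(self, name, prompt_version):
        key = (name, prompt_version.version)
        if key in self._versions:
            # Published versions are treated as immutable in this sketch.
            raise ValueError(f"{name}@{prompt_version.version} already exists")
        self._versions[key] = prompt_version

    def approve(self, name, version):
        key = (name, version)
        self._versions[key] = replace(self._versions[key], approved=True)

    def get_for_production(self, name, version):
        pv = self._versions[(name, version)]
        if not pv.approved:
            raise PermissionError(f"{name}@{version} is not approved")
        return pv

    def render(self, name, version, **variables):
        # substitute() raises KeyError if a template variable is missing.
        pv = self._versions[(name, version)]
        return Template(pv.template).substitute(**variables)
```

A short usage example under the same assumptions:

```python
registry = PromptRegistry()
registry.register("helper", PromptVersion(
    version="1.0.0",
    template="$system_instructions\n$safety_insert\nUser request: $request",
    changelog="Initial version.",
))
text = registry.render(
    "helper", "1.0.0",
    system_instructions="You are a helpful assistant.",
    safety_insert="Decline unsafe requests.",
    request="Summarize these notes.",
)
```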

Evaluation

Offline evals use curated datasets with golden answers or heuristics. Regression tests run on each change. Online metrics (task success rate, handoff rate, latency, cost) are tracked during Pilot and Production.
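
A minimal sketch of an offline eval with golden answers and a regression check follows. It assumes a model can be treated as a plain function from input text to output text and that exact match after light normalization is an acceptable heuristic; the function names and the tolerance parameter are illustrative, not part of any existing Miro AI tooling.

```python
def normalize(text):
    """Light normalization so trivial formatting differences do not count."""
    return text.strip().lower()


def exact_match_score(model, dataset):
    """Fraction of examples where the model output matches the golden answer.

    `model` is any callable str -> str; `dataset` is a list of
    (input, golden_answer) pairs.
    """
    if not dataset:
        raise ValueError("dataset must not be empty")
    hits = sum(
        1 for prompt, golden in dataset
        if normalize(model(prompt)) == normalize(golden)
    )
    return hits / len(dataset)


def is_regression(candidate_score, baseline_score, tolerance=0.0):
    """True if the candidate scores worse than the baseline beyond tolerance."""
    return candidate_score < baseline_score - tolerance
```

Running the check on each change means comparing the candidate's score against the score of the currently promoted version on the same dataset, and blocking promotion when is_regression returns True.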

Tip

Keep datasets tiny but representative. Ten great examples beat one hundred noisy ones.
