Bounded eval and promotion pipeline. VMI evaluation harness with golden traces, pack scoring, regression detection, and governed self-improvement gates.
What it does
Golden traces. Curate a pack of golden traces: the canonical examples of correct behaviour. Every promotion candidate runs against the pack.
Score deltas surface regressions before they ship. Per-metric gates can block a promotion until the regression has been reviewed and explained.
Auto-research loop. Time-bounded experiments. If the result improves the metric, commit. If it regresses, reset. The codebase ratchets forward.
Each user's thumbs up / thumbs down flows into DreamLab as labelled traces. Personal models stay personal; org models stay org-scoped.
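The pack scoring, per-metric gates, and ratchet described above can be pictured with a small sketch. This is an illustration only, not DreamLab's API: `GoldenTrace`, `score_pack`, `blocked_metrics`, and `ratchet` are hypothetical names, a "model" is assumed to be any function from prompt to output, and the time bound is simplified to a fixed number of rounds.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical types; DreamLab's real data model is not described on this page.
@dataclass
class GoldenTrace:
    prompt: str
    expected: str

Model = Callable[[str], str]
Scorer = Callable[[str, str], float]  # (output, expected) -> score

def score_pack(model: Model, pack: List[GoldenTrace],
               metrics: Dict[str, Scorer]) -> Dict[str, float]:
    """Mean score per metric over every trace in the pack."""
    if not pack:
        raise ValueError("pack must contain at least one golden trace")
    totals = {name: 0.0 for name in metrics}
    for trace in pack:
        output = model(trace.prompt)
        for name, scorer in metrics.items():
            totals[name] += scorer(output, trace.expected)
    return {name: total / len(pack) for name, total in totals.items()}

def blocked_metrics(baseline: Dict[str, float], candidate: Dict[str, float],
                    max_drop: Dict[str, float]) -> List[str]:
    """Metrics whose score fell by more than the allowed drop (default 0)."""
    return [name for name in baseline
            if baseline[name] - candidate[name] > max_drop.get(name, 0.0)]

def ratchet(model: Model, propose: Callable[[Model], Model],
            pack: List[GoldenTrace], metrics: Dict[str, Scorer],
            target: str, max_drop: Dict[str, float],
            rounds: int) -> Tuple[Model, Dict[str, float]]:
    """Commit a candidate only if the target metric improves and no gate blocks it."""
    best, best_scores = model, score_pack(model, pack, metrics)
    for _ in range(rounds):
        candidate = propose(best)
        scores = score_pack(candidate, pack, metrics)
        improved = scores[target] > best_scores[target]
        if improved and not blocked_metrics(best_scores, scores, max_drop):
            best, best_scores = candidate, scores  # commit
        # otherwise reset: the candidate is discarded and `best` is kept
    return best, best_scores
```

Because a candidate is kept only when it beats the current best on the target metric without tripping a gate, the best score never decreases from one round to the next, which is the ratchet behaviour the text describes.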
Screens
Final screenshots ship alongside the public launch. Slots above match the canonical DreamLab surface today.
How to access
Personal trace packs, basic regression gates.
Shared packs, team-scoped feedback, promotion approvals.
Dedicated eval compute, BYO models, audit export.
Roadmap
Install community-curated golden trace packs by vertical.
Replay a trace with a new model to score head-to-head deltas.
Promote candidates to a percentage of live traffic with safety gates.
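Since percentage-based promotion is still a roadmap item, the following is only a sketch of one common way such a rollout could work, not a description of DreamLab's design. It assumes requests carry a stable identifier and uses hash-based bucketing so the same request always lands on the same side; `in_rollout`, `next_percent`, the salt, and the error-rate gate are all hypothetical.

```python
import hashlib

def in_rollout(request_id: str, percent: float, salt: str = "candidate-v1") -> bool:
    """Deterministically route roughly `percent` of request ids to the candidate."""
    if not 0.0 <= percent <= 100.0:
        raise ValueError("percent must be between 0 and 100")
    digest = hashlib.sha256(f"{salt}:{request_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 10_000  # bucket in 0..9999
    return bucket < percent * 100

def next_percent(current: float, error_rate: float,
                 max_error_rate: float, step: float) -> float:
    """Hypothetical safety gate: widen the rollout while healthy, else roll back to 0."""
    if error_rate > max_error_rate:
        return 0.0
    return min(100.0, current + step)
```

Hashing the identifier rather than sampling at random means that raising the percentage only adds requests to the candidate group; anything already routed to the candidate stays there.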
One platform. Every vertical. Switch between DreamLab and the rest of NOME without losing context.