/experiments
Most experiment tracking stops at artifacts and metrics. But the most valuable output of an experiment is the team’s interpretation: what worked, what didn’t, and what to do next. That context typically lives in someone’s head. When they leave, the next person starts from scratch.
Outcomes captures what your team tried, what happened, and what the team decided and why. Every decision logged is institutional knowledge that persists regardless of team changes.
Experiment Timeline
The main view shows all experiments for the currently selected project, sorted by date or impact.
Experiment Cards
Each experiment appears as a card showing:
- Name and creation date
- Source type icon — research resource, hypothesis, incident, or recommendation
- Status badge — Configure, In Progress, Complete, or Validated
- Target metric and observed delta (color-coded: green positive, red negative, gray pending)
- Tags for grouping and pattern detection
Metric Trend Chart
A collapsible line chart at the top tracks your target metric across completed experiments, showing improvement (or lack thereof) over time. When collapsed, key trend values remain visible in the summary bar.
Filtering and Search
| Control | What it does |
|---|---|
| Search bar | Instant filtering by name, hypothesis, initiative, or tags |
| Status chips | All / Configure / In Progress / Complete / Validated (with counts) |
| Source filters | Toggle by source type (Papers / Hypotheses) |
| Sort | By date (newest first) or by impact (largest delta first) |
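The two sort options map to simple comparators. A minimal sketch of that behavior, assuming illustrative field names (`createdAt`, `delta`) rather than the actual Remyx data model:

```typescript
// Illustrative card shape only; the real Remyx schema may differ.
type Status = "Configure" | "In Progress" | "Complete" | "Validated";

interface ExperimentCard {
  name: string;
  createdAt: Date;      // creation date shown on the card
  delta: number | null; // observed metric delta; null while pending
  status: Status;
}

// "By date": newest first.
function sortByDate(cards: ExperimentCard[]): ExperimentCard[] {
  return [...cards].sort((a, b) => b.createdAt.getTime() - a.createdAt.getTime());
}

// "By impact": largest delta first; pending (null) deltas sink to the end.
function sortByImpact(cards: ExperimentCard[]): ExperimentCard[] {
  return [...cards].sort((a, b) => (b.delta ?? -Infinity) - (a.delta ?? -Infinity));
}
```

Status chips then reduce to a straightforward filter (`cards.filter(c => c.status === chip)`) applied before either sort.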
Creating an Experiment
Click + New Experiment in the top-right corner. The create form adapts based on source type:
- From a Resource
- Custom Hypothesis
- From an Incident
Select the Search papers source mode. Search for a paper, repo, or model by title. Remyx autocompletes from its resource index.
| Field | Description |
|---|---|
| Name | Short descriptive name |
| Resource | Search and select from the index |
| Hypothesis | What you expect to happen |
| Target metric | Dropdown of metrics configured for this project |
| Project | Which initiative this belongs to |
| Tags | Comma-separated labels for grouping |
| Target repository | GitHub repo for the implementation (optional) |
| Tracker link | Link to Linear, Jira, or GitHub issue (optional) |
Experiment Detail
URL: /experiments/dashboard/<experiment_id>
A two-column layout showing the full lifecycle of a single experiment.
Origin Section
For research-sourced experiments, the Origin section shows the launch context — built automatically on first load (~2-4 seconds):
| Field | Description | Editable? |
|---|---|---|
| Resource title | Link to the resource viewer | — |
| Abstract excerpt | One-sentence summary | Click to edit |
| Key methods | Technique badges extracted from the resource | Add/remove inline |
| Target repository | Repo where the implementation lands | Change triggers context rebuild |
| Implementation plan | AI-generated plan referencing actual file paths | Collapsible, editable, regeneratable |
| Docker image | Pre-built environment reference | Read-only |
Analysis Card
Combines the Hypothesis and Decision in a single card:
- Hypothesis — the team’s prediction, always visible at top
- Decision — logged after results are in; includes text, author, and timestamp; click to edit
Implement Section
A compact bar for Claude Code integration:
- Copy-paste command to run Claude Code with the Remyx MCP connection
- Link to Connectors for setup
Activity Feed
Unified chronological feed combining:
- Comments with @mention support, edit/delete
- System events from the knowledge graph (experiment created, status changed, decision logged, PR opened)
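Building a unified feed amounts to merging the two streams by timestamp. A minimal sketch, with assumed item shapes (`kind`, `timestamp`, `text` are illustrative, not the actual Remyx schema):

```typescript
// Illustrative feed-item shape; the real event payloads carry more fields.
interface FeedItem {
  kind: "comment" | "system"; // comment vs. knowledge-graph event
  timestamp: Date;
  text: string;
}

// Merge comments and system events into one chronological feed, oldest first.
function buildFeed(comments: FeedItem[], events: FeedItem[]): FeedItem[] {
  return [...comments, ...events].sort(
    (a, b) => a.timestamp.getTime() - b.timestamp.getTime()
  );
}
```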
Sidebar
| Section | What it shows |
|---|---|
| Status | Dropdown: Configure → In Progress → Complete → Validated |
| Metric | Target metric, observed delta, confidence level |
| Resources | Linked artifacts: PR, ticket, repo, dataset, tracking run, custom links |
| Related Experiments | Bidirectional linking with cross-project search |
| Project | Initiative context from project settings |
| Tags | Editable tag list |
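The Status dropdown implies a linear progression. A sketch of that ordering, assuming the four stages advance strictly forward (the helper name `nextStatus` is illustrative):

```typescript
// The four stages from the Status dropdown, in order; illustrative only.
const STATUSES = ["Configure", "In Progress", "Complete", "Validated"] as const;
type Status = (typeof STATUSES)[number];

// Returns the next stage in the progression, or null once Validated.
function nextStatus(current: Status): Status | null {
  const i = STATUSES.indexOf(current);
  return i < STATUSES.length - 1 ? STATUSES[i + 1] : null;
}
```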
Logging a Decision
The most important step in the ExperimentOps workflow. After reviewing results:
- Scroll to the Decision section in the Analysis card
- Write what the team decided and why
- The decision is timestamped and attributed to the author
An example decision:
“Ship to 100%. The re-ranker specifically helps with multi-topic tickets where the old retriever returned tangentially related articles. Three retrieval experiments now, all positive. This is our best direction.”
Related
Insights
See cross-experiment patterns and recommended next steps
Overview
Leadership portfolio across all initiatives
Connectors
Link GitHub, Linear, Jira for bidirectional sync
Projects & Settings
Configure metrics, repos, and integrations per project