# Continuous Experimentation

> The loop that keeps running on top of ExperimentOps: discovery, drafting, evaluation, and decision, with each turn sharpening the next. CI/CD-style cadence for AI experimentation, with a human at the merge.

[ExperimentOps](/concepts/experimentops) is your system of record: what your team tried, why, and what you decided. **Continuous Experimentation** puts that record to work. It runs the discovery-to-decision loop on a cadence, so keeping up with the field becomes a background process instead of a second job, much as CI/CD turned "we deploy when someone remembers to" into "every change runs the pipeline."

The bottleneck it removes is the legwork before a decision. New techniques ship across arXiv, Hugging Face, and GitHub every week. Ideas are plentiful; the cost lies in finding the few that fit your codebase and turning them into something you can actually evaluate. The loop does that work (surface, validate, draft) and leaves the judgment to you, so your time goes to deciding between evaluated proposals rather than to finding them.

<Note>
  **What "continuous" means, precisely.** The loop runs on a schedule and does the watching, reading, and first-draft work automatically. It does **not** make the call — every cycle surfaces a reviewable artifact (a ranked recommendation, a draft PR, a scored variant) and a person decides. The CI/CD analogy, taken honestly: the pipeline runs on every change, but a human still approves the deploy.
</Note>

***

## The loop

A single experiment in Remyx already has a lifecycle: it comes from somewhere, gets implemented, gets evaluated, and ends in a decision. Continuous Experimentation is that lifecycle run as a *standing loop*, where the output of each turn feeds the input of the next.

<Steps>
  <Step title="Discover" icon="magnifying-glass">
    Your [Feed](/platform/discover/feed) surfaces new papers, repos, and models ranked against your team's actual shipping history. This runs daily without you asking.
  </Step>

  <Step title="Draft" icon="code">
    Remyx's [automated discovery-PR agent](/platform/discover/outrider) (**Outrider**), a scheduled GitHub Action, picks the candidate most *implementable* against your codebase and opens a **draft PR** wiring it into a real call site, or a **discussion Issue** when a clean integration isn't possible. This is the step that used to require a human to sit down and start reading.
  </Step>

  <Step title="Evaluate" icon="flask">
    The variant is scored against the [eval template and decision policy](/platform/experiments/evaluation) your team [committed to ahead of time](/tutorials/get-started/define-how-progress-gets-measured), so the bar is fixed before the result is known.
  </Step>

  <Step title="Decide" icon="gavel">
    A person reviews the evidence and logs the call — ship, iterate, or abandon — and why. This is the human gate, and it stays human.
  </Step>

  <Step title="Compound" icon="arrows-rotate">
    The decision becomes part of the record. The next discovery pass is ranked against a history that now includes this experiment. The loop tightens.
  </Step>
</Steps>

The first two steps are where Continuous Experimentation does new work: it closes the gap between *a relevant paper exists* and *there's a reviewable proposal in front of me*. Steps three through five are the [ExperimentOps](/concepts/experimentops) loop you already run — now fed automatically.
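
One turn of the loop can be sketched in a few lines of Python. This is a minimal illustration and not Remyx's implementation: `Candidate`, `Experiment`, `draft`, `decide`, and `run_turn` are hypothetical names, and the `integrates_cleanly` flag stands in for whatever validation the drafting agent actually performs. The point it shows is structural: the automated step only produces an artifact, the decision arrives as an input from a person, and the result is appended to the history that the next turn reads.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Decision(Enum):
    SHIP = "ship"
    ITERATE = "iterate"
    ABANDON = "abandon"


@dataclass
class Candidate:
    title: str
    integrates_cleanly: bool  # hypothetical flag: could the draft be wired into a call site?


@dataclass
class Experiment:
    candidate: Candidate
    artifact: str                       # "draft_pr" or "discussion_issue"
    decision: Optional[Decision] = None
    rationale: str = ""


def draft(candidate: Candidate) -> Experiment:
    """Automated step: produce a reviewable artifact, never a decision."""
    artifact = "draft_pr" if candidate.integrates_cleanly else "discussion_issue"
    return Experiment(candidate=candidate, artifact=artifact)


def decide(experiment: Experiment, decision: Decision, rationale: str) -> Experiment:
    """Human gate: the call and its reasoning are supplied by a person."""
    experiment.decision = decision
    experiment.rationale = rationale
    return experiment


def run_turn(history: list[Experiment], candidate: Candidate,
             decision: Decision, rationale: str) -> list[Experiment]:
    """One turn: draft, record the human decision, compound into the history."""
    experiment = decide(draft(candidate), decision, rationale)
    return history + [experiment]


history: list[Experiment] = []
history = run_turn(history, Candidate("paper A", True), Decision.SHIP, "cleared the bar")
history = run_turn(history, Candidate("repo B", False), Decision.ABANDON, "no clean call site")
print([(e.artifact, e.decision.value) for e in history])
# [('draft_pr', 'ship'), ('discussion_issue', 'abandon')]
```

Note that `run_turn` has no code path that chooses a `Decision` on its own; in this sketch the only way to record one is to pass it in.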

***

## CI/CD for AI experimentation

The analogy is exact in the places that matter, and it is worth stating plainly where it is not.

| Software CI/CD                                                    | Remyx Continuous Experimentation                                                                              |
| ----------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------- |
| A commit triggers the pipeline                                    | A schedule (or a context change) triggers a discovery + draft run                                             |
| The pipeline builds and tests automatically                       | Outrider selects, drafts, validates, and self-reviews automatically                                           |
| A failing build never reaches review                              | A recommendation that can't be cleanly integrated becomes a discussion Issue with the attempted diff attached |
| The team sets the gates (required checks, coverage) ahead of time | The team sets the eval template and decision policy ahead of time                                             |
| **A human approves the deploy**                                   | **A human reviews and merges the PR, and logs the decision**                                                  |

Where the analogy breaks, and where overclaiming would erode trust, is the last row. In CI/CD the approval can be automated away, because the correctness bar is a passing test suite. AI experimentation ends in a *judgment* about whether a change actually moved a business outcome, which is not something to hand to a cron job. Remyx automates the toil up to that judgment and stops there.
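
The fourth row of the table, gates set ahead of time, can be illustrated with a small sketch. Everything here is an assumption made for illustration: `DecisionPolicy`, `clears_bar`, the metric name, and the numbers are invented and do not describe Remyx's eval template format. The sketch shows two properties: the policy is immutable once committed, and the check reports evidence for the reviewer without logging a decision.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DecisionPolicy:
    """Hypothetical gate, committed before any variant is scored."""
    metric: str
    min_improvement: float  # required gain over baseline, in metric units


def clears_bar(policy: DecisionPolicy, baseline: float, variant: float) -> bool:
    """Report whether the variant meets the pre-set bar. Evidence, not the decision."""
    return (variant - baseline) >= policy.min_improvement


# Illustrative numbers only.
policy = DecisionPolicy(metric="task_accuracy", min_improvement=0.02)
print(clears_bar(policy, baseline=0.80, variant=0.83))  # True
print(clears_bar(policy, baseline=0.80, variant=0.81))  # False
```

Because the dataclass is frozen, lowering `min_improvement` after seeing a result raises an error instead of silently moving the bar.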

<Info>
  This is also why the [Maturity Progression](/concepts/maturity-progression) stages stay **read-only and passive** through Stage 3. Continuous Experimentation reads your repo, ranks against your history, and proposes changes you approve. It does not touch production behavior. The first capability that does (Stage 4 counterfactual perturbations) ships only behind shadow-mode audits.
</Info>

***

<h2 id="how-recommendations-get-sharper">
  How recommendations get sharper
</h2>

A loop is only worth running continuously if each turn is better than the last. Two mechanisms make Remyx's recommendations improve as your history grows, rather than re-surfacing the same generic results:

<AccordionGroup>
  <Accordion title="History-aware ranking" icon="list-check">
    The structured experiments Remyx extracts from your merge log feed the ranker as context, so a candidate aligned with the direction you've actually shipped ranks above a merely topical one. Compared with ranking from your interest description alone, this changes which candidates reach the top, and the reasoning cites specific past work instead of shallow keyword overlap.
  </Accordion>

  <Accordion title="A learned preference model" icon="wand-magic-sparkles">
    Remyx fits a per-team preference model over your past experiments — learning from the order and lineage in which you shipped things — and scores new candidates with it as a tiebreaker behind relevance. It populates lazily and becomes meaningful past a few dozen experiments. It sharpens ranking only; it is deliberately *not* wired to auto-generate or auto-select experiments, which would put it in the decision seat the human holds.
  </Accordion>
</AccordionGroup>
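
The phrase "tiebreaker behind relevance" has a concrete meaning that a short sketch makes visible. The code below is an assumption-laden illustration, not Remyx's ranker: `ScoredCandidate` and `rank` are hypothetical names, the scores are made up, and treating an unpopulated preference score as `0.0` is a choice made here for simplicity. It sorts by relevance first and consults the preference score only when two candidates have equal relevance.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScoredCandidate:
    title: str
    relevance: float                    # primary signal (history-aware ranking)
    preference: Optional[float] = None  # None until the preference model has populated


def rank(candidates: list[ScoredCandidate]) -> list[ScoredCandidate]:
    """Sort by relevance first; preference only breaks ties between equal relevance."""
    def key(c: ScoredCandidate) -> tuple[float, float]:
        pref = c.preference if c.preference is not None else 0.0
        return (-c.relevance, -pref)
    return sorted(candidates, key=key)


ranked = rank([
    ScoredCandidate("topical paper", relevance=0.7, preference=0.9),
    ScoredCandidate("aligned repo", relevance=0.9, preference=0.1),
    ScoredCandidate("aligned paper", relevance=0.9, preference=0.6),
])
print([c.title for c in ranked])
# ['aligned paper', 'aligned repo', 'topical paper']
```

The "topical paper" has the highest preference score and still ranks last, because preference never outranks relevance in this ordering. The function returns a sorted list and nothing else: it selects no experiment and opens no PR.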

Both mechanisms read from the same `ExperimentHistory`, whether you reached it through a [Project](/platform/manage/projects) or a repo-driven [Research Interest](/platform/discover/feed#research-interests-from-a-repo), so the loop sees one coherent picture of your work.

Beyond history, a [Deep Research](/platform/discover/deep-research) brief feeds the ranker as a forward-looking axis: it captures where your team intends to go next, so candidates aligned with that direction rank up even before you've shipped against it.

If a recommendation or draft is wrong, the cost is a PR you close. Nothing reaches your default branch or your users without a person putting it there.

***

## Get started

<CardGroup cols={2}>
  <Card title="Automated discovery PRs" icon="compass" href="/platform/discover/outrider">
    Set up the scheduled draft-PR loop on a repo
  </Card>

  <Card title="Feed" icon="newspaper" href="/platform/discover/feed">
    Create the Research Interest that drives recommendations
  </Card>

  <Card title="ExperimentOps" icon="book" href="/concepts/experimentops">
    The system of record this loop runs on top of
  </Card>

  <Card title="Maturity Progression" icon="stairs" href="/concepts/maturity-progression">
    Why the loop stays passive and read-only
  </Card>
</CardGroup>
