Roadmap
This page tracks the current state of Remyx’s capabilities. We publish it so customers, collaborators, and researchers can see where we’re going well before each capability ships. For the architectural through-line, see Causal Intelligence. For how these capabilities map to the customer journey, see Maturity Progression.

Status legend
| Status | Meaning |
|---|---|
| Shipped | Available in production today |
| In development | Actively being built |
| Planned | Committed to the roadmap, not yet started |
| Research | Exploring feasibility, not yet committed |
ExperimentOps
Except where marked In development, these capabilities are live in production today. They form the Stage 2 foundations and are documented in ExperimentOps and the platform pages.

| Status | Capability | Description |
|---|---|---|
| Shipped | Experiment capture | Full lifecycle from origin through decision, with target metric, hypothesis, and decision rationale captured directly on the experiment record |
| Shipped | Cross-experiment patterns | Tag-based clustering identifies which directions consistently produce results |
| Shipped | Resource discovery | Semantic search across papers, repos, models, and datasets, matched to your team’s experiment history |
| Shipped | History-aware recommendation ranking | The structured shipping history extracted from your repo feeds the recommendation ranker, so candidates that align with your team’s actual trajectory rank higher |
| Shipped | Preference-model ranking | A per-team preference model fit over past experiments scores candidate work and breaks ties behind relevance |
| Shipped | Automated discovery PRs (Outrider) | A scheduled GitHub Action that opens a reviewable draft PR — or a discussion Issue — for the next paper worth implementing; a person reviews and merges |
| Shipped | Portfolio view | Leadership-facing view of every project with health indicators, hit rates, and metric trends |
| Shipped | Connector framework | Bidirectional sync with GitHub (GitHub App), Linear, Jira, Slack, and Claude Code MCP |
| Shipped | MCP server | Programmatic access to Remyx capabilities from Claude Code and other MCP clients |
| In development | Standalone Stage 1 product surface | Milestone-driven recommendations for early-dev teams without production traffic |
Causal intelligence, evidence layer
Stage 2. The evidence layer feeding the causal model.

| Status | Capability | Description |
|---|---|---|
| Planned | Observational log ingestion | Pluggable adapters for Datadog, Honeycomb, structured JSON, OpenTelemetry. Customers connect existing telemetry, and Remyx populates the evidence layer without instrumentation changes |
| Planned | Commit-correlated regime boundary detection | Extension of the existing repo integration to identify regime changes in the data-generating process from commit history |
| Planned | Quasi-experiment identification | Combine observational logs with regime boundaries to produce identified causal effects via difference-in-differences, interrupted time series, or regression discontinuity |
| Planned | Causal discovery | Bootstrap a partial causal graph from observational data, supplemented by regime-boundary structure. Discovered structure is human-validated before being treated as the working model |
| Planned | Causal graph engine | Versioned causal graph as a top-level object, supporting interventional, counterfactual, and mediation-aware graphical models. Semi-Markovian formulation to handle latent confounders |
| Planned | Causal data fusion | Combine evidence from multiple sources into one coherent posterior. Conflict resolution, graph refinement proposals, continuous incremental updates |
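
To make the quasi-experiment row above concrete, here is a minimal sketch of a two-group, two-period difference-in-differences estimate around a regime boundary. It is not Remyx's implementation: the function name, the metric values, and the split into "treated" and "control" traffic are all hypothetical, and the estimate is only meaningful under the usual parallel-trends assumption.

```python
from statistics import mean


def diff_in_diff(treated_pre, treated_post, control_pre, control_post):
    """Classic 2x2 difference-in-differences estimate.

    Subtracts the change observed in the control group from the change
    observed in the treated group. Valid only if both groups would have
    followed parallel trends without the intervention.
    """
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Hypothetical metric values before and after a commit-correlated
# regime boundary (illustrative numbers, not real data).
treated_pre = [0.60, 0.62, 0.61]
treated_post = [0.70, 0.71, 0.72]
control_pre = [0.50, 0.52, 0.51]
control_post = [0.53, 0.54, 0.55]

effect = diff_in_diff(treated_pre, treated_post, control_pre, control_post)
print(f"estimated effect: {effect:.3f}")
```

In this toy data the treated group improves by about 0.10 and the control group by about 0.03, so the estimate attributes roughly 0.07 to the change at the boundary.
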
Causal intelligence, query and interaction
| Status | Capability | Description |
|---|---|---|
| Planned | Identification dispatcher | Given a question, classify it by required identification layer, route to evidence sources with appropriate identification, and dispatch to estimation logic |
| Planned | Natural language query layer | Customer-facing interface that takes natural language questions, parses into formal estimands, and returns natural language answers with identification status and recommendations |
Causal intelligence, Stage 3
| Status | Capability | Description |
|---|---|---|
| Planned | A/B test integration framework | Connectors for Statsig, Eppo, LaunchDarkly. A/B test results become an evidence source feeding the causal model |
Causal intelligence, Stage 4
| Status | Capability | Description |
|---|---|---|
| Planned | Shadow-decision SDK (log-only mode) | Python SDK that wraps decision points in your AI system. The first version captures natural policy output without applying overrides |
| Planned | Shadow-mode audit infrastructure | Dedicated product surface for the shadow-mode adoption phase. Audit trail viewer, override proposal review, and compliance reporting |
| Planned | CTF-RAND override policies | Extension of the SDK with counterfactual randomization. Trajectory-consistent semantics by default. Per-decision-point semantics opt-in for mediation analysis |
| Planned | ETT and NDE estimation | Counterfactual estimation procedures for effect-of-treatment-on-the-treated and natural direct effect |
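
As an illustration of what "log-only" wrapping of a decision point could look like, the sketch below records each decision and returns the natural policy output unchanged. The SDK is planned and its API is not published, so every name here (`shadow_decision`, `choose_model`, the record fields, the in-memory list used as a sink) is a hypothetical stand-in.

```python
import functools
import time


def shadow_decision(name, sink):
    """Hypothetical log-only wrapper for a decision point.

    Records the inputs and the natural policy output to `sink`,
    then returns that output untouched (no override is applied).
    """
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            result = fn(*args, **kwargs)
            sink.append({
                "decision_point": name,
                "args": repr(args),
                "kwargs": repr(kwargs),
                "output": repr(result),
                "timestamp": time.time(),
            })
            return result
        return wrapper
    return decorator


audit_log = []  # stand-in for a real log destination


@shadow_decision("choose_model", audit_log)
def choose_model(prompt_length):
    # Illustrative policy: route short prompts to a smaller model.
    return "small" if prompt_length < 100 else "large"


choice = choose_model(42)
print(choice, len(audit_log))
```

Because the wrapper never alters the return value, the wrapped system behaves as it did before. Only the audit trail is new, which matches the log-only framing of the first SDK version.
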
Hypothesis triage
The triage layer matures with the customer. Stage 2 customers see quasi-experimental recommendations. Stage 3 customers see A/B test recommendations. Stage 4 customers see CTF-RAND recommendations.

| Status | Capability | Description |
|---|---|---|
| Planned | Hypothesis ranking and evidence path recommendation | Rank hypotheses by expected information gain. Identify the cheapest evidence path to an answer for each |
| Planned | Orchestration scheduler | Coordinate active interventions (CTF-RAND randomizations, A/B tests, quasi-experiment analyses) for maximum concurrency without compromising estimate validity |
| Planned | Identification-enabling intervention proposals | Proactively propose CTF-RAND or A/B interventions that would make currently-unidentifiable hypotheses estimable |
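
One simple way to read "rank hypotheses by expected information gain" is sketched below for a binary hypothesis tested by a noisy binary evidence source. This is an assumption-laden toy, not the planned ranking logic: the hypothesis names, priors, error rates, and costs are invented, and dividing gain by cost is just one possible notion of the "cheapest evidence path".

```python
import math


def entropy(p):
    """Binary entropy in bits; defined as 0 at p = 0 or p = 1."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def expected_information_gain(prior, tpr, fpr):
    """Expected entropy reduction about a binary hypothesis.

    prior: P(hypothesis true)
    tpr:   P(positive evidence | hypothesis true)
    fpr:   P(positive evidence | hypothesis false)
    """
    p_pos = prior * tpr + (1 - prior) * fpr
    p_neg = 1 - p_pos
    post_pos = prior * tpr / p_pos if p_pos > 0 else prior
    post_neg = prior * (1 - tpr) / p_neg if p_neg > 0 else prior
    expected_posterior = p_pos * entropy(post_pos) + p_neg * entropy(post_neg)
    return entropy(prior) - expected_posterior


# Hypothetical candidates: (name, prior, tpr, fpr, cost of evidence path)
candidates = [
    ("retrieval depth drives accuracy", 0.5, 0.9, 0.1, 2.0),
    ("prompt template drives latency", 0.8, 0.7, 0.3, 1.0),
    ("cache size drives cost", 0.5, 0.5, 0.5, 0.5),
]

ranked = sorted(
    candidates,
    key=lambda c: expected_information_gain(c[1], c[2], c[3]) / c[4],
    reverse=True,
)
for name, *_ in ranked:
    print(name)
```

An evidence source whose outcome is independent of the hypothesis (equal `tpr` and `fpr`) has zero expected gain and sinks to the bottom regardless of how cheap it is.
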
How we sequence
Foundational evidence-layer capabilities ship first. The causal model and query layer come next. A/B integration follows. Counterfactual perturbations ship last. The dependency hot path through the architecture follows this order:

- Evidence schema.
- Observational log ingestion.
- Quasi-experiment identification.
- Causal graph engine and data fusion.
- Identification dispatcher and natural language query.
- A/B integration.
- Shadow-decision SDK.
- CTF-RAND override policies.
- ETT and NDE estimation.