feat(experimentation): experiment results model, task and endpoints #7796

Draft
gagantrivedi wants to merge 3 commits into feat/experiment-results-query from feat/experiment-results-model

@gagantrivedi

Member

Thanks for submitting a PR! Please check the boxes below:

  • I have read the Contributing Guide.
  • I have added information to docs/ if required so people know about the feature.
  • I have filled in the "Changes" section below.
  • I have filled in the "How did you test this code" section below.

Changes

Contributes to the experiment results stats layer (PR-7 of the stack; stacked on #7781).

Persists Bayesian results per experiment and exposes them, cloning the shipped exposures pair:

  • ExperimentResults model (+ migration 0008): OneToOne to Experiment, as_of / payload / last_error_at / refresh_requested_at, with is_final freezing a completed experiment's row before the warehouse 90-day TTL, and record_refresh / record_failure / record_refresh_request.
  • compute_results_summary orchestrator (services.py): derives the MetricSpec list from the attached metrics and the expected variant split from the environment's multivariate allocations — control takes the unallocated remainder, options with no variant key are skipped (SRM then skipped rather than tested against a split that doesn't describe the experiment) — then runs the PR-6 aggregation + kernel.
  • compute_experiment_results task: runs off the refresh endpoint; on warehouse failure stamps last_error_at and preserves the last good payload, logging results.compute_failed.
  • GET …/results/ + POST …/results/refresh/: clones of the exposures endpoints (202 + enqueue; 400 before start; 400 once final; 429 + Retry-After within the interval). The shared guard ladder is extracted into a _refresh_panel helper used by both panels.
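
The expected variant split described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the PR's `_expected_variant_shares`: the `Option` class, the `control` key name and the percentage scale (0 to 100) are hypothetical, and the sketch takes one reading of the description, namely that a keyless option makes the split unusable, so the function returns `None` and the caller skips the SRM check.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Option:
    """Hypothetical stand-in for a multivariate option and its allocation."""

    key: Optional[str]
    percentage_allocation: float  # assumed to be on a 0-100 scale


def expected_variant_shares(
    options: List[Option], control_key: str = "control"
) -> Optional[Dict[str, float]]:
    """Return the expected traffic share per variant, or None to skip SRM."""
    # Assumption: a keyless option cannot be matched to a variant, so the
    # split no longer describes the experiment and SRM is skipped.
    if any(option.key is None for option in options):
        return None
    shares = {
        option.key: option.percentage_allocation / 100
        for option in options
        if option.key is not None
    }
    # Control takes whatever the variants leave unallocated.
    shares[control_key] = 1.0 - sum(shares.values())
    return shares
```

For example, two variants allocated 25% each leave control with the remaining 50%.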

Events catalogue regenerated for the new results.compute_failed event and shifted line numbers.

How did you test this code?

make test for the experimentation app (unit tests for the model, task, orchestrator and both endpoints — 100% diff coverage). mypy (strict) and ruff clean; FT test linter and events-catalogue docgen run locally. The orchestrator's _expected_variant_shares is covered for keyed options, null keys and a missing live feature state; compute_results_summary wiring is verified end-to-end with a faked warehouse client.

@vercel

vercel Bot commented Jun 16, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

3 Skipped Deployments
| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| docs | Ignored | Preview | Jun 16, 2026 12:18pm |
| flagsmith-frontend-preview | Ignored | Preview | Jun 16, 2026 12:18pm |
| flagsmith-frontend-staging | Ignored | Preview | Jun 16, 2026 12:18pm |

@github-actions github-actions Bot added the api, docs and feature labels and removed the docs label Jun 16, 2026
@codecov

codecov Bot commented Jun 16, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 98.52%. Comparing base (1cdb491) to head (e62624f).

Additional details and impacted files
@@                        Coverage Diff                        @@
##           feat/experiment-results-query    #7796      +/-   ##
=================================================================
- Coverage                          98.57%   98.52%   -0.06%     
=================================================================
  Files                               1462     1463       +1     
  Lines                              56762    57021     +259     
=================================================================
+ Hits                               55955    56179     +224     
- Misses                               807      842      +35     

Persist Bayesian results in an ExperimentResults row (the ExperimentExposures
pattern: OneToOne, as_of/payload/last_error_at/refresh_requested_at, is_final
freezing completed experiments before the warehouse TTL).

compute_results_summary orchestrates the warehouse aggregation: it derives the
metric specs from the attached metrics and the expected variant split from the
environment's multivariate allocations (control takes the unallocated
remainder), then runs the kernel. compute_experiment_results runs it off the
refresh endpoint and records the row, preserving the last good payload on
warehouse failure.
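
A minimal sketch of the failure handling described above, assuming a plain-Python stand-in for the row rather than the Django model. `WarehouseError`, `ResultsRow` and `run_compute` are hypothetical names; only `record_refresh`, `record_failure`, `payload`, `as_of` and `last_error_at` come from the description.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Callable, Dict, Optional


class WarehouseError(Exception):
    """Hypothetical error raised when the warehouse query fails."""


@dataclass
class ResultsRow:
    """Plain-Python stand-in for the persisted row (not the Django model)."""

    payload: Optional[Dict[str, Any]] = None
    as_of: Optional[datetime] = None
    last_error_at: Optional[datetime] = None

    def record_refresh(self, payload: Dict[str, Any], now: datetime) -> None:
        # A successful run replaces the payload and moves as_of forward.
        self.payload = payload
        self.as_of = now

    def record_failure(self, now: datetime) -> None:
        # A failed run only stamps the error time; payload and as_of stay.
        self.last_error_at = now


def run_compute(
    row: ResultsRow,
    compute: Callable[[], Dict[str, Any]],
    now: Optional[datetime] = None,
) -> bool:
    """Run the computation once and record the outcome on the row."""
    now = now or datetime.now(timezone.utc)
    try:
        payload = compute()
    except WarehouseError:
        row.record_failure(now)
        return False
    row.record_refresh(payload, now)
    return True
```

The point of the split between the two record methods is that readers of the row keep seeing the last good payload while `last_error_at` signals that the most recent attempt failed.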

GET/POST .../results/ and .../results/refresh/ clone the exposures pair; the
shared guard ladder (pre-start, finality, throttle) moves into _refresh_panel.
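
The guard ladder (pre-start, finality, throttle) might look roughly like the sketch below. It is framework-free and hypothetical: `ComputationState`, `refresh_guard` and the `min_interval` parameter are illustrative names, the actual interval is not stated in this PR, and the real helper returns DRF responses rather than tuples.

```python
import math
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, Optional, Tuple


@dataclass
class ComputationState:
    """Hypothetical snapshot of what the guards need to know."""

    started_at: Optional[datetime]
    is_final: bool
    refresh_requested_at: Optional[datetime]


def refresh_guard(
    state: ComputationState, now: datetime, min_interval: timedelta
) -> Tuple[int, Dict[str, str]]:
    """Return (status code, headers) for a refresh request."""
    # Rung 1: the experiment has not started yet.
    if state.started_at is None or now < state.started_at:
        return 400, {}
    # Rung 2: the row is final, so there is nothing left to refresh.
    if state.is_final:
        return 400, {}
    # Rung 3: throttle repeated requests inside the interval.
    if state.refresh_requested_at is not None:
        elapsed = now - state.refresh_requested_at
        if elapsed < min_interval:
            wait = math.ceil((min_interval - elapsed).total_seconds())
            return 429, {"Retry-After": str(wait)}
    # All guards passed: accept and let the caller enqueue the task.
    return 202, {}
```

Keeping the rungs in one function means both endpoints reject requests in the same order and differ only in the task they enqueue and the messages they return.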
@gagantrivedi gagantrivedi force-pushed the feat/experiment-results-model branch from acb3e36 to d94bd85 Compare June 16, 2026 10:57
@github-actions github-actions Bot added the docs Documentation updates label Jun 16, 2026
@github-actions github-actions Bot added the feature label and removed the feature and docs labels Jun 16, 2026
ExperimentExposures and ExperimentResults had identical fields and lifecycle.
Hoist them into a generic abstract ExperimentComputation[SummaryT]: the
subclass binds which summary record_refresh stores, so it stays type-safe per
panel. Each concrete model keeps only its experiment OneToOneField (and so its
related_name); no schema change.
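
The typing idea behind `ExperimentComputation[SummaryT]` can be illustrated without Django. This is a sketch under assumptions: the summary dataclasses and their fields are invented for the example, and the Django side (an abstract model with each subclass declaring its own `OneToOneField`) is deliberately left out.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Generic, Optional, TypeVar

SummaryT = TypeVar("SummaryT")


@dataclass(frozen=True)
class ExposuresSummary:
    """Hypothetical summary shape for the exposures computation."""

    total_exposures: int


@dataclass(frozen=True)
class ResultsSummary:
    """Hypothetical summary shape for the results computation."""

    probability_to_beat_control: float


class Computation(Generic[SummaryT]):
    """Shared fields and lifecycle; subclasses bind the summary type."""

    def __init__(self) -> None:
        self.payload: Optional[SummaryT] = None
        self.as_of: Optional[datetime] = None

    def record_refresh(self, summary: SummaryT, as_of: datetime) -> None:
        # A type checker rejects a summary of the wrong type per subclass.
        self.payload = summary
        self.as_of = as_of


class Exposures(Computation[ExposuresSummary]):
    pass


class Results(Computation[ResultsSummary]):
    pass
```

With this shape, a type checker such as mypy would flag `Results().record_refresh(ExposuresSummary(10), ...)`, although nothing is enforced at runtime.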
@github-actions github-actions Bot added the docs and feature labels and removed the feature and docs labels Jun 16, 2026
- Rename the view-layer PanelT/_refresh_panel to ComputationT/
  _refresh_computation so the shared abstraction has one name everywhere
  (ExperimentComputation), rather than reintroducing UI "panel" vocabulary.
- _refresh_computation takes the finished 400 messages instead of a noun,
  dropping the unwritten assumption that the noun is a plural subject.
- _expected_variant_shares selects the current feature state by highest id
  (matching Environment's Max("id") convention) instead of relying on the
  default ascending ordering, which picked the oldest version; note the
  coupling to features' multivariate representation.
- Declare experiment on ExperimentComputation under TYPE_CHECKING so is_final
  is type-checked rather than suppressed with a type: ignore.
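
The feature-state selection fix in the third bullet comes down to picking the row with the highest id instead of the first row in ascending order. A minimal sketch, with `FeatureStateVersion` and `current_feature_state` as hypothetical stand-ins for the ORM objects and query (in Django the equivalent would typically be an explicit `order_by("-id").first()`):

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class FeatureStateVersion:
    """Hypothetical stand-in for one version of a feature state."""

    id: int
    label: str


def current_feature_state(
    states: Iterable[FeatureStateVersion],
) -> Optional[FeatureStateVersion]:
    """Pick the newest version by highest id, or None when there is none."""
    return max(states, key=lambda state: state.id, default=None)


versions = [
    FeatureStateVersion(1, "oldest"),
    FeatureStateVersion(3, "newest"),
    FeatureStateVersion(2, "middle"),
]
# Taking the first row in ascending id order would return "oldest" here.
current = current_feature_state(versions)
```

Making the ordering explicit also keeps the selection stable if the model's default ordering ever changes.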
@github-actions github-actions Bot added the docs and feature labels and removed the feature and docs labels Jun 16, 2026
Labels

api, feature
