feat(experimentation): experiment results model, task and endpoints #7796
Draft
gagantrivedi wants to merge 3 commits into
Codecov Report
✅ All modified and coverable lines are covered by tests.

@@                       Coverage Diff                        @@
##           feat/experiment-results-query   #7796     +/-   ##
=================================================================
- Coverage                          98.57%   98.52%   -0.06%
=================================================================
  Files                               1462     1463       +1
  Lines                              56762    57021     +259
=================================================================
+ Hits                               55955    56179     +224
- Misses                               807      842      +35
Persist Bayesian results in an ExperimentResults row (the ExperimentExposures pattern: OneToOne, as_of/payload/last_error_at/refresh_requested_at, is_final freezing completed experiments before the warehouse TTL). compute_results_summary orchestrates the warehouse aggregation: it derives the metric specs from the attached metrics and the expected variant split from the environment's multivariate allocations (control takes the unallocated remainder), then runs the kernel. compute_experiment_results runs it off the refresh endpoint and records the row, preserving the last good payload on warehouse failure. GET/POST .../results/ and .../results/refresh/ clone the exposures pair; the shared guard ladder (pre-start, finality, throttle) moves into _refresh_panel.
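The failure path in this commit message (keep the last good payload, only stamp the error time) can be sketched without Django. This is a minimal illustration under stated assumptions, not the PR's code: `ResultsRow` is a hypothetical in-memory stand-in for the `ExperimentResults` row, `WarehouseError` and the `compute` callable are invented for the example, and whether a successful refresh clears `last_error_at` is not stated in the PR, so the sketch leaves it alone. Only the field and method names come from the PR text.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Callable


class WarehouseError(Exception):
    """Hypothetical error type standing in for a warehouse failure."""


@dataclass
class ResultsRow:
    """In-memory stand-in for the ExperimentResults row (not the Django model)."""

    as_of: datetime | None = None
    payload: dict[str, Any] | None = None
    last_error_at: datetime | None = None

    def record_refresh(self, payload: dict[str, Any], now: datetime) -> None:
        # A successful run replaces the payload and moves as_of forward.
        self.payload = payload
        self.as_of = now

    def record_failure(self, now: datetime) -> None:
        # A failed run only stamps the error time; payload and as_of stay put.
        self.last_error_at = now


def compute_experiment_results(
    row: ResultsRow,
    compute: Callable[[], dict[str, Any]],
    now: datetime,
) -> None:
    """Run the computation and record either the new payload or the failure."""
    try:
        payload = compute()
    except WarehouseError:
        row.record_failure(now)
        return
    row.record_refresh(payload, now)


def failing() -> dict[str, Any]:
    raise WarehouseError("warehouse unavailable")


# One good run followed by one failed run.
row = ResultsRow()
t1 = datetime(2024, 1, 1, tzinfo=timezone.utc)
t2 = datetime(2024, 1, 2, tzinfo=timezone.utc)
compute_experiment_results(row, lambda: {"status": "ok"}, t1)
compute_experiment_results(row, failing, t2)
```

After the second call the row still carries the payload and `as_of` from the first run, with `last_error_at` set to the time of the failed run.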
acb3e36 to d94bd85
ExperimentExposures and ExperimentResults had identical fields and lifecycle. Hoist them into a generic abstract ExperimentComputation[SummaryT]: the subclass binds which summary record_refresh stores, so it stays type-safe per panel. Each concrete model keeps only its experiment OneToOneField (and so its related_name); no schema change.
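A rough picture of the hoisting described here, in plain Python rather than Django models. The two summary shapes (`ExposuresSummary`, `ResultsSummary`) and their keys are hypothetical placeholders, and the base class is an ordinary generic class; in the PR it is presumably a Django abstract model, which this sketch does not attempt to reproduce.

```python
from __future__ import annotations

from datetime import datetime
from typing import Generic, TypedDict, TypeVar

SummaryT = TypeVar("SummaryT")


class ExposuresSummary(TypedDict):
    """Hypothetical summary shape for the exposures computation."""

    total: int


class ResultsSummary(TypedDict):
    """Hypothetical summary shape for the results computation."""

    metrics: list[str]


class ExperimentComputation(Generic[SummaryT]):
    """Shared fields and lifecycle; each subclass binds SummaryT."""

    def __init__(self) -> None:
        self.as_of: datetime | None = None
        self.payload: SummaryT | None = None
        self.last_error_at: datetime | None = None

    def record_refresh(self, summary: SummaryT, now: datetime) -> None:
        # The parameter type follows the subclass binding, so a type checker
        # rejects an exposures summary passed to a results row.
        self.payload = summary
        self.as_of = now


class ExperimentExposures(ExperimentComputation[ExposuresSummary]):
    """Would keep only its own experiment OneToOneField in the real model."""


class ExperimentResults(ExperimentComputation[ResultsSummary]):
    """Would keep only its own experiment OneToOneField in the real model."""


results = ExperimentResults()
results.record_refresh({"metrics": ["conversion"]}, datetime(2024, 1, 1))
```

Under a strict type checker, `results.record_refresh({"total": 3}, ...)` should be flagged, which is the per-panel type safety the commit message refers to.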
- Rename the view-layer PanelT/_refresh_panel to ComputationT/
_refresh_computation so the shared abstraction has one name everywhere
(ExperimentComputation), rather than reintroducing UI "panel" vocabulary.
- _refresh_computation takes the finished 400 messages instead of a noun,
dropping the unwritten assumption that the noun is a plural subject.
- _expected_variant_shares selects the current feature state by highest id
(matching Environment's Max("id") convention) instead of relying on the
default ascending ordering, which picked the oldest version; note the
coupling to features' multivariate representation.
- Declare experiment on ExperimentComputation under TYPE_CHECKING so is_final
is type-checked rather than suppressed with a type: ignore.
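For reviewers, the guard ladder behind `_refresh_computation` can be pictured roughly as below. This is a sketch, not the view code: the `Response` dataclass, the parameter names and the 429 wording are assumptions made for the example. What is taken from the PR is the order of the guards (pre-start, finality, throttle), the status codes, and the fact that the caller passes the finished 400 messages.

```python
from __future__ import annotations

import math
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Response:
    """Hypothetical stand-in for the HTTP response returned by the view."""

    status: int
    detail: str = ""
    retry_after: int | None = None


def refresh_computation(
    *,
    started_at: datetime | None,
    is_final: bool,
    refresh_requested_at: datetime | None,
    now: datetime,
    min_interval: timedelta,
    not_started_message: str,
    final_message: str,
) -> Response:
    # 1. Pre-start: nothing to compute before the experiment has started.
    if started_at is None or now < started_at:
        return Response(400, not_started_message)
    # 2. Finality: a frozen row is never recomputed.
    if is_final:
        return Response(400, final_message)
    # 3. Throttle: reject requests inside the interval and say when to retry.
    if refresh_requested_at is not None:
        remaining = refresh_requested_at + min_interval - now
        if remaining > timedelta(0):
            return Response(
                429,
                "Refresh requested too recently.",
                math.ceil(remaining.total_seconds()),
            )
    # Accepted: the caller would stamp refresh_requested_at and enqueue the task.
    return Response(202)
```

Passing complete messages keeps the helper free of any grammar about the thing being refreshed, which is the point of dropping the noun argument.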
Changes
Contributes to the experiment results stats layer (PR-7 of the stack; stacked on #7781).
Persists Bayesian results per experiment and exposes them, cloning the shipped exposures pair:
- `ExperimentResults` model (+ migration `0008`): `OneToOne` to `Experiment`, `as_of` / `payload` / `last_error_at` / `refresh_requested_at`, with `is_final` freezing a completed experiment's row before the warehouse 90-day TTL, and `record_refresh` / `record_failure` / `record_refresh_request`.
- `compute_results_summary` orchestrator (`services.py`): derives the `MetricSpec` list from the attached metrics and the expected variant split from the environment's multivariate allocations. `control` takes the unallocated remainder, and options with no variant key are skipped (SRM is then skipped rather than tested against a split that doesn't describe the experiment). It then runs the PR-6 aggregation + kernel.
- `compute_experiment_results` task: runs off the refresh endpoint; on warehouse failure it stamps `last_error_at` and preserves the last good payload, logging `results.compute_failed`.
- `GET …/results/` + `POST …/results/refresh/`: clones of the exposures endpoints (202 + enqueue; 400 before start; 400 once final; 429 + `Retry-After` within the interval). The shared guard ladder is extracted into a `_refresh_panel` helper used by both panels.

Events catalogue regenerated for the new `results.compute_failed` event and shifted line numbers.

How did you test this code?

`make test` for the experimentation app (unit tests for the model, task, orchestrator and both endpoints; 100% diff coverage). `mypy` (strict) and `ruff` clean; FT test linter and events-catalogue docgen run locally. The orchestrator's `_expected_variant_shares` is covered for keyed options, null keys and a missing live feature state; `compute_results_summary` wiring is verified end-to-end with a faked warehouse client.
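The three `_expected_variant_shares` cases listed above (keyed options, null keys, missing live feature state) might look like this in a Django-free sketch. Everything here is illustrative: `Option`, `FeatureState` and the `percentage_allocation` field are hypothetical stand-ins, and the sketch assumes one reading of the description, namely that a null variant key yields no expected split at all so that SRM is skipped. The current feature state is taken to be the one with the highest id.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Option:
    """Hypothetical multivariate option: a variant key and its allocation in percent."""

    key: str | None
    percentage_allocation: float


@dataclass
class FeatureState:
    """Hypothetical feature state version; a higher id means a newer version."""

    id: int
    options: list[Option] = field(default_factory=list)


def expected_variant_shares(
    feature_states: list[FeatureState],
) -> dict[str, float] | None:
    """Return the expected share per variant, or None when no split is known."""
    if not feature_states:
        # No live feature state: nothing to test SRM against.
        return None
    # Pick the current version explicitly instead of trusting default ordering.
    current = max(feature_states, key=lambda fs: fs.id)
    allocated = 0.0
    shares: dict[str, float] = {}
    for option in current.options:
        if option.key is None:
            # Assumption: an unkeyed option means the split does not describe
            # the experiment, so no split is returned.
            return None
        shares[option.key] = option.percentage_allocation / 100
        allocated += option.percentage_allocation
    # Control takes whatever the variants leave unallocated.
    shares["control"] = (100 - allocated) / 100
    return shares


old = FeatureState(1, [Option("a", 50)])
new = FeatureState(2, [Option("a", 30), Option("b", 20)])
shares = expected_variant_shares([old, new])
```

With the two versions above, the split comes from the newer feature state, so variant `a` is expected at 30% rather than the stale 50%.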