chore: Metric Views typegen #433
Pull request overview
Adds build-time type generation for Unity Catalog Metric Views, emitting a MetricRegistry module augmentation plus a frontend-safe semantic metadata bundle when config/queries/metric-views.json is present.
Changes:
- Introduces metric-view config parsing/validation + DESCRIBE-driven schema extraction, emitting `metric.d.ts` and `metrics.metadata.json`.
- Extends the Vite typegen plugin with metric output options and watcher support for `metric-views.json` changes.
- Adds extensive unit + snapshot coverage for metric registry generation and plugin option plumbing.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| packages/appkit/src/type-generator/vite-plugin.ts | Adds metric output options and triggers regeneration on metric-views.json edits. |
| packages/appkit/src/type-generator/index.ts | Wires metric-view generation into generateFromEntryPoint and exports metric artifact constants/types. |
| packages/appkit/src/type-generator/metric-registry.ts | Implements metric config resolution, DESCRIBE parsing, type/metadata emission, and sync failure reporting. |
| packages/appkit/src/type-generator/tests/vite-plugin.test.ts | Tests watcher behavior for metric-views.json and option plumbing for metric outputs. |
| packages/appkit/src/type-generator/tests/index.test.ts | Tests end-to-end emission/dormancy behavior for metric artifacts in generateFromEntryPoint. |
| packages/appkit/src/type-generator/tests/metric-registry.test.ts | Adds comprehensive unit tests for config validation, extraction, time grains, and metadata formatting. |
| packages/appkit/src/type-generator/tests/snapshots/metric-registry.test.ts.snap | Snapshot coverage for emitted metric.d.ts and metrics.metadata.json. |
…c failure logs Copilot review response (#433): (1) the metric emit block now gates DESCRIBEs on warehouse state in non-blocking mode — one read-only status GET (never starts a warehouse); when not RUNNING (or the probe fails) it skips all DESCRIBEs and emits degraded artifacts (every configured key with empty measures/dimensions) that the vite plugin's warehouse-watch regen (blocking mode) refreshes once the warehouse is up. Blocking mode and injected metricFetcher bypass the gate. (2) syncMetrics is now log-free: the three internal per-failure warns are removed and the generateFromEntryPoint caller owns surfacing failures exactly once. Co-authored-by: Isaac Signed-off-by: Atila Fassina <atila@fassina.eu>
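The gate described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `emitMetricEntries`, `GateDeps`, `getWarehouseState`, and `describe` are hypothetical names, and the status probe is injected so the sketch never touches a real warehouse.

```typescript
// Hypothetical names throughout; the PR's real signatures are not shown in this thread.
type WarehouseState = "RUNNING" | "STOPPED" | "STARTING" | "UNKNOWN";

interface MetricEntry {
  measures: Record<string, string>;
  dimensions: Record<string, string>;
}

interface GateDeps {
  // One read-only status probe; it must never start the warehouse.
  getWarehouseState: () => Promise<WarehouseState>;
  // Runs the DESCRIBE for one configured key.
  describe: (key: string) => Promise<MetricEntry>;
}

async function emitMetricEntries(
  keys: string[],
  mode: "blocking" | "non-blocking",
  deps: GateDeps,
): Promise<{ degraded: boolean; entries: Record<string, MetricEntry> }> {
  const entries: Record<string, MetricEntry> = {};

  if (mode === "non-blocking") {
    let state: WarehouseState;
    try {
      state = await deps.getWarehouseState();
    } catch {
      state = "UNKNOWN"; // a failed probe is treated like a non-running warehouse
    }
    if (state !== "RUNNING") {
      // Degraded artifacts: every configured key, with empty measures/dimensions.
      for (const key of keys) entries[key] = { measures: {}, dimensions: {} };
      return { degraded: true, entries };
    }
  }

  // Blocking mode bypasses the gate and always runs the DESCRIBEs.
  for (const key of keys) entries[key] = await deps.describe(key);
  return { degraded: false, entries };
}
```

In this sketch the degraded result still lists every configured key, so a later blocking-mode regeneration only has to fill in the fields.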
e66fc62 to e38d0f2
Build-time type generator for UC Metric Views: reads config/queries/metric-views.json, runs DESCRIBE TABLE EXTENDED per declared view, and emits the MetricRegistry .d.ts augmentation (metric.d.ts) plus the metrics.metadata.json semantic bundle. Non-blocking-mode aware (degraded types when the warehouse is unavailable), blocking-mode warehouse preflight, bounded-concurrency DESCRIBEs, and a retry-driven describe cache with last-known-good degradation. Co-authored-by: Isaac Signed-off-by: Atila Fassina <atila@fassina.eu>
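"Bounded-concurrency DESCRIBEs" can be pictured with a small worker-lane helper like the one below. It is a generic sketch of the technique, assuming nothing about the PR's actual implementation; `mapWithConcurrency` is a hypothetical name.

```typescript
// Generic bounded-concurrency map: at most `limit` workers are in flight at once.
// `mapWithConcurrency` is an illustrative helper, not a function from the PR.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  const results = new Array<R>(items.length);
  let next = 0; // shared cursor; safe because JS runs each lane step on a single thread

  async function runLane(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index] as T, index);
    }
  }

  const laneCount = Math.max(1, Math.min(limit, items.length));
  const lanes: Promise<void>[] = [];
  for (let i = 0; i < laneCount; i++) lanes.push(runLane());
  await Promise.all(lanes);
  return results; // same order as `items`, regardless of completion order
}
```

Each lane pulls the next index only after its previous call settles, so the number of concurrent DESCRIBE statements never exceeds `limit`.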
Config caps (200 metric views, 255 chars per FQN segment, 767 total, 100 decimal places), reject metricViews:null, backtick-quote validated FQN segments in DESCRIBE, null-prototype metadata bundle, exact-basename watcher match, and locale-independent artifact key order. Cache: sticky vs transient retry classification, pruning to the configured key set, and structural validation of revived entries. Co-authored-by: Isaac Signed-off-by: Atila Fassina <atila@fassina.eu>
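As an illustration of the caps and quoting listed above, the sketch below validates a three-part FQN against the 255-per-segment and 767-total limits and backtick-quotes each segment before building the statement. The function name `buildDescribeStatement` and the error messages are assumptions; the segment pattern mirrors the `isValidFqn` regex quoted later in this thread.

```typescript
// Caps taken from the commit message; everything else here is a hypothetical sketch.
const MAX_SEGMENT_CHARS = 255;
const MAX_FQN_CHARS = 767; // 3 * 255 + 2 separating dots
const SEGMENT_PATTERN = /^[a-zA-Z0-9_][a-zA-Z0-9_-]*$/;

function buildDescribeStatement(fqn: string): string {
  if (fqn.length > MAX_FQN_CHARS) {
    throw new Error("FQN exceeds " + MAX_FQN_CHARS + " characters");
  }
  const segments = fqn.split(".");
  if (segments.length !== 3) {
    throw new Error("expected catalog.schema.view, got " + segments.length + " segment(s)");
  }
  for (const segment of segments) {
    if (segment.length > MAX_SEGMENT_CHARS || !SEGMENT_PATTERN.test(segment)) {
      throw new Error("invalid FQN segment: " + JSON.stringify(segment));
    }
  }
  // Quote only after validation, so a segment can never contain a backtick.
  const quoted = segments.map((segment) => "`" + segment + "`").join(".");
  return "DESCRIBE TABLE EXTENDED " + quoted + " AS JSON";
}
```

Validating before quoting is what makes the quoting safe: no accepted segment can contain a backtick, so the quoted identifier cannot be broken out of.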
Failure-outcome helper, unified renderer block builders, currency-symbol map, parallel-array removal, hoisted allowlist sets, and relocation of the revival validator + cache-hash helper into cache.ts. The Vite plugin now defers metric artifact defaults to the generator so plugin- and CLI-driven runs agree under a custom outFile. Co-authored-by: Isaac Signed-off-by: Atila Fassina <atila@fassina.eu>
SDK executeStatement returns ARROW_STREAM by default (rows in result.attachment, data_array empty), so the metric and query DESCRIBE fetchers silently degraded on warehouses that don't default to JSON_ARRAY. Add normalizeResultRows (apache-arrow tableFromIPC) and request ARROW_STREAM + INLINE in both fetchers; downstream parsers read the populated data_array unchanged. Verified live against a real warehouse: real measure/dimension unions, cache no longer degraded. Hardening: refuse to emit partial types when a DESCRIBE result is multi-chunk (next_chunk_* present) — fail loud rather than cache a truncated schema; extract row values via the positional StructRow iterator ([...row]) rather than Object.values, which reorders integer-named columns. Co-authored-by: Isaac Signed-off-by: Atila Fassina <atila@fassina.eu>
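The `Object.values` hazard mentioned above is plain JavaScript behaviour and can be reproduced without Arrow. In the sketch below a `Map` stands in for a positional row iterator (it is not the Arrow row type), and the column names are made up for the demonstration.

```typescript
// Cells in their true column order; two columns have integer-like names.
type Cell = [name: string, value: string];

function valuesViaObject(cells: readonly Cell[]): string[] {
  const asObject: Record<string, string> = {};
  for (const [name, value] of cells) asObject[name] = value;
  // Integer-like keys are enumerated first, in ascending numeric order,
  // so the original column order is lost here.
  return Object.values(asObject);
}

function valuesViaPosition(cells: readonly Cell[]): string[] {
  // A Map iterates in insertion order whatever the keys look like.
  const positional = new Map<string, string>(cells);
  return Array.from(positional.values());
}

const cells: Cell[] = [
  ["name", "first"],
  ["1", "second"],
  ["0", "third"],
];

console.log(valuesViaObject(cells)); // [ 'third', 'second', 'first' ]
console.log(valuesViaPosition(cells)); // [ 'first', 'second', 'third' ]
```

Any parser that indexes DESCRIBE rows by position would silently read the wrong cell in the first variant, which is the failure the commit guards against.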
Split the ~1.4k-line metric-registry.ts into focused mv-registry/ modules (config, describe, metadata, render-types, sync, types). Consumers import directly from the relevant submodule; there is no aggregating barrel. The package's public type surface is unchanged — type-generator/index.ts still re-exports the same metric types from mv-registry/types. Behavior-preserving: 2962 tests green and the dogfood live run still emits real metric types. Co-authored-by: Isaac Signed-off-by: Atila Fassina <atila@fassina.eu>
The prior Arrow fix hardcoded INLINE+ARROW_STREAM for DESCRIBE, which only the Reyden engine accepts. Standard DBSQL (PRO/CLASSIC) rejects that pairing and requires JSON_ARRAY, so metric and query typegen broke on real warehouses. The two engines have opposite requirements — no single hardcoded format works on both. describeAdaptive tries JSON_ARRAY first and falls back to ARROW_STREAM only when the warehouse rejects the format (merge_json_arrays / disposition mismatch), memoizing the accepted format per run. SQL errors, degrades, and connectivity failures pass through unchanged. The Arrow decoder stays as the Reyden branch. Covers the metric (describe.ts) and query (query-registry.ts) paths. Verified live on revenue_arr_demo (PRO) and a type=REYDEN warehouse. Co-authored-by: Isaac Signed-off-by: Atila Fassina <atila@fassina.eu>
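The try-then-fall-back flow described above might look roughly like this. It is a sketch under stated assumptions: `createAdaptiveDescribe`, `Execute`, and `isFormatRejection` are hypothetical names, the executor is injected rather than calling the SDK, and how the real `describeAdaptive` recognises a format rejection is not shown in this thread.

```typescript
type ResultFormat = "JSON_ARRAY" | "ARROW_STREAM";

// Injected executor: runs one statement with one format and returns decoded rows.
type Execute = (sql: string, format: ResultFormat) => Promise<string[][]>;

function createAdaptiveDescribe(
  execute: Execute,
  // Assumption: the caller can tell a format rejection from any other failure.
  isFormatRejection: (err: unknown) => boolean,
): (sql: string) => Promise<string[][]> {
  let accepted: ResultFormat | undefined; // memoized for the lifetime of this run

  return async function describe(sql: string): Promise<string[][]> {
    if (accepted !== undefined) return execute(sql, accepted);
    try {
      const rows = await execute(sql, "JSON_ARRAY");
      accepted = "JSON_ARRAY";
      return rows;
    } catch (err) {
      // SQL errors and connectivity failures pass through unchanged.
      if (!isFormatRejection(err)) throw err;
      const rows = await execute(sql, "ARROW_STREAM");
      accepted = "ARROW_STREAM";
      return rows;
    }
  };
}
```

Memoizing the accepted format means only the first DESCRIBE of a run pays for the rejected attempt; later calls go straight to the format the warehouse accepted.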
    export function isValidFqn(fqn: string): boolean {
      return /^[a-zA-Z0-9_][a-zA-Z0-9_-]*\.[a-zA-Z0-9_][a-zA-Z0-9_-]*\.[a-zA-Z0-9_][a-zA-Z0-9_-]*$/.test(
        fqn,
      );
Looks like this regex is more restrictive than the UC identifier naming limitations? https://docs.databricks.com/aws/en/sql/language-manual/sql-ref-names
If a failed match can fail the production build, should we make this regex match those rules exactly?
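To make the question concrete, here is what the quoted pattern accepts and rejects on a few sample names. The samples are invented for illustration; whether Unity Catalog itself permits the rejected ones is what the linked naming reference would settle.

```typescript
// Same pattern as the snippet under review, wrapped so it can be exercised directly.
const FQN_PATTERN =
  /^[a-zA-Z0-9_][a-zA-Z0-9_-]*\.[a-zA-Z0-9_][a-zA-Z0-9_-]*\.[a-zA-Z0-9_][a-zA-Z0-9_-]*$/;

function isValidFqn(fqn: string): boolean {
  return FQN_PATTERN.test(fqn);
}

// Illustrative inputs only; none of these names come from the PR.
const samples = [
  "main.finance.revenue_metrics", // accepted: ASCII letters, digits, underscore
  "my-catalog.sales.arr", // accepted: hyphen allowed after the first character
  "-main.finance.arr", // rejected: a segment may not start with a hyphen
  "main.finance", // rejected: exactly three segments are required
  "main.finanças.arr", // rejected: non-ASCII letter
];

for (const sample of samples) {
  console.log(sample, isValidFqn(sample));
}
```

Every rejected sample here would surface as a config validation failure, which is why the gap between this pattern and the real naming rules matters for builds.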
Tried to post the comments separately but ran into some issues with Claude, so here's an aggregated version:
[blocking] comments
B1 —
When `config/queries/metric-views.json` is present, the type generator runs `DESCRIBE TABLE EXTENDED ... AS JSON` per declared metric view and emits two artifacts:

- `shared/appkit-types/metric.d.ts`: the `MetricRegistry` module augmentation. Each entry carries typed `measures`/`dimensions` row fields plus `measureKeys`/`dimensionKeys`/`timeGrains` literal unions, the base the `MeasureKey<K>`/`DimensionKey<K>`/`MetricRow<K>`/`TimeGrain<K>` helpers derive from on the appkit-ui side.
- `shared/appkit-types/metrics.metadata.json`: the semantic-metadata bundle, entries shaped `{ measures, dimensions }` (display names, format specs, descriptions, time-grain hints). Frontend-safe by construction: UC FQNs and execution lanes are deliberately excluded.

Note
Dormancy invariant: absent config means nothing executes — zero artifacts, zero logs, no fallback to any legacy filename. Apps that never adopt metric views see no change, which is what keeps merging this incrementally safe ahead of the runtime.
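The dormancy invariant can be expressed as an early return before any work or logging happens. The sketch below is an assumption-laden illustration: `ProjectFiles` and `planMetricArtifacts` are hypothetical, file access is injected, and only the config path and the two artifact paths come from the description above.

```typescript
// Hypothetical shapes; only the config path and artifact paths come from the PR text.
interface ProjectFiles {
  // Returns the file's text, or undefined when the file does not exist.
  read(path: string): string | undefined;
}

const METRIC_CONFIG_PATH = "config/queries/metric-views.json";
const METRIC_ARTIFACTS = [
  "shared/appkit-types/metric.d.ts",
  "shared/appkit-types/metrics.metadata.json",
];

function planMetricArtifacts(
  files: ProjectFiles,
  log: (message: string) => void,
): string[] {
  const raw = files.read(METRIC_CONFIG_PATH);
  // Dormant path: no artifacts, no logs, no fallback to another filename.
  if (raw === undefined) return [];

  const config = JSON.parse(raw) as { metricViews?: Record<string, unknown> };
  const count = Object.keys(config.metricViews ?? {}).length;
  log("metric views configured: " + count);
  return METRIC_ARTIFACTS.slice();
}
```

Because the absent-config branch returns before the first `log` call, an app without `metric-views.json` observes no output at all from this step.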
Output

The schema already exists in `main`, so it's not part of this PR. This feature creates the artifact according to the defined schema.

`config/queries/metric-views.json`, entity-first `metricViews` map per the #429 `metric-source` schema:

    {
      "metricViews": {
        "revenue": { "source": "main.finance.revenue_metrics" },
        "customer_metrics": { "source": "main.cs.customer_metrics", "executor": "user" }
      }
    }

`executor` is `app_service_principal` (default) or `user` (per-user OBO); the internal sp/obo lane is derived at the parse boundary, so downstream code only ever sees lanes.

Failure semantics
Vite plugin grows `mvOutFile`/`mvMetadataOutFile` options, and the dev watcher regenerates on `metric-views.json` edits through the exact same single-flight regen flow as `.sql` files.

Adaptive `DESCRIBE` format per warehouse

Some warehouses have opposite requirements for `DESCRIBE … AS JSON`:

`disposition`+`format`:
- `INLINE`+`JSON_ARRAY`: `data_array`, `merge_json_arrays`
- `INLINE`+`ARROW_STREAM`
- `EXTERNAL_LINKS`+`ARROW_STREAM`

Manual Testing
fallback:

    databricks warehouses list -p DEFAULT -o json \
      | jq -r '.[] | select(.creator_name=="<your_account_email>") | "\(.id)\t\(.warehouse_type)\t\(.name)"'
    # → 1075664542a32710 PRO revenue_arr_demo

The metric views referenced in the config must exist in that workspace.
Build the SDK (so the CLI runs the patched describeAdaptive)
(Point `source` at metric views that exist in your workspace.)

    DATABRICKS_CONFIG_PROFILE=DEFAULT DATABRICKS_WAREHOUSE_ID=1075...2710 \
      pnpm exec tsx packages/shared/src/cli/index.ts generate-types \
      /tmp/mv-typegen-test /tmp/mv-typegen-test/shared/appkit-types/analytics.d.ts \
      --no-cache --wait

- `--no-cache` → forces a fresh DESCRIBE (actually hits the warehouse).
- `--wait` → blocks until RUNNING (auto-starts a stopped serverless warehouse).
(a) no format errors:
the run output should NOT contain INVALID_PARAMETER_VALUE / merge_json_arrays / "metric sync failed"
(b) real unions (not `string`) in the generated types:
(c) cache shows not-degraded (cache lives under the cwd = worktree):