Desktop editor startup and playback latency optimizations#1975

Merged
richiemcilroy merged 32 commits into main from desktop-optimisations-etc
Jul 3, 2026

Conversation

@richiemcilroy richiemcilroy commented Jul 3, 2026

Member

The desktop editor now opens faster, starts playback sooner, and feels more responsive, especially on long recordings and with Bluetooth audio.

Editor playback & audio

  • Background audio loading — Mic/system audio decodes in the background via AudioLoader instead of blocking editor open. Export still waits for decodes and fails loudly if a track is missing.
  • Progressive audio pre-render — Play no longer renders the entire timeline mix synchronously on every press (which could take seconds on long recordings). A background thread renders a window around the playhead first; the rest fills in progressively.
  • Persistent audio output stream — One cpal stream per editor session, prewarmed at open. Play attaches a source to the live stream instead of opening the device each time, cutting press-to-audio from hundreds of ms to ~5–15ms (Bluetooth included).
  • Parallel startup — Segment setup (decoders + audio decode kickoff) runs concurrently with GPU/renderer init; screen and camera decoders init in parallel on all platforms.
  • Telemetry & benchmarks — Play-start events (AudioSegmentsResolved, AudioPipelineReady, ClockStarted) and --press-starts benchmark mode for measuring press-to-clock latency.
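
The persistent-stream bullet above pairs naturally with the generation-based stop described later in the review. The following is a minimal std-only sketch of that idea, not the code in audio_output.rs: `Output`, `Slot`, `play`, `stop`, and `callback` are hypothetical names, and a `Mutex` stands in for whatever synchronization the real callback uses (a production audio callback would normally avoid taking a lock).

```rust
use std::sync::Mutex;

/// A playback source: fills the device buffer it is handed.
type Source = Box<dyn FnMut(&mut [f32]) + Send>;

struct Slot {
    generation: u64,
    source: Option<Source>,
}

/// Hypothetical stand-in for a long-lived output stream. The stream is
/// created once; pressing play only swaps the source inside it.
pub struct Output {
    slot: Mutex<Slot>,
}

impl Output {
    pub fn new() -> Self {
        Self { slot: Mutex::new(Slot { generation: 0, source: None }) }
    }

    /// Install a source and return the generation token that owns it.
    pub fn play(&self, source: Source) -> u64 {
        let mut slot = self.slot.lock().unwrap();
        slot.generation += 1;
        slot.source = Some(source);
        slot.generation
    }

    /// Remove the source only if `generation` is still current, so a late
    /// stop from an old playback cannot cut its successor's audio.
    pub fn stop(&self, generation: u64) -> bool {
        let mut slot = self.slot.lock().unwrap();
        if slot.generation != generation {
            return false;
        }
        slot.source = None;
        true
    }

    /// What the device callback would do: pull from the installed source,
    /// or write silence when nothing is playing.
    pub fn callback(&self, out: &mut [f32]) {
        let mut slot = self.slot.lock().unwrap();
        match slot.source.as_mut() {
            Some(source) => (*source)(out),
            None => out.fill(0.0),
        }
    }
}
```

In this sketch, calling `play` twice and then `stop` with the first token leaves the second source installed, which is the property the review attributes to the generation-based stop.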

Desktop UX & window behavior

  • Faster editor open — Native window background color before webview paint; window shown immediately when transparency is off; first frame pre-rendered so the UI has something to show on connect.
  • No skeleton hang — Waveforms load via signals (not Suspense); custom domain query has placeholder data; transparent editor reveal avoids throttled requestAnimationFrame.
  • Startup polish — font-display: block for bundled fonts; idle font/emoji cache prewarm to avoid first-render jank.
  • Visual cleanup — Removed fade-in animations on window open and redundant cursor-pointer classes across editor, settings, and screenshot editor.

Fixes

  • macOS liquid glass — Stopped disabling window/WebView occlusion detection, which could wedge WindowServer after sleep/lid-close and soft-restart the login session.
  • License refetch — Fixed query key (licenseQuery instead of bruh).
  • API parsing — JSON responses with charset suffix in Content-Type now parse correctly.
  • Changelog settings — Removed debug logging; simplified loading/error handling.

Dependencies

  • Pinned tauri-plugin-http to 2.5.2 (Rust + npm + lockfile).
  • Regenerated tauri specta bindings.

Greptile Summary

This PR delivers a broad set of startup and playback latency improvements to the desktop editor, replacing per-press stream creation with a persistent AudioOutput session, background audio decoding via AudioLoader, and progressive pre-rendering so the first play press no longer blocks on the full timeline mix. Alongside Rust changes, several long-standing frontend bugs are fixed (wrong licenseQuery key, Content-Type charset parsing, editor skeleton hang) and the macOS occlusion-suppression SPI that was wedging WindowServer is removed.

  • Audio pipeline overhaul — AudioLoader decodes tracks in the background at editor open; AudioOutput keeps a prewarmed cpal stream alive for the session; PrerenderedAudioBuffer fills a progressive ring buffer on a background thread so audio starts within one callback period of clock start.
  • Parallel and early startup — Segment decoder init and GPU renderer init now run concurrently; window is shown with a themed native background colour before the webview paints; the editor skeleton hang caused by createResource on waveform fetches and a slow customDomainQuery is resolved by switching to signals with placeholderData.
  • Bug fixes — queryKey: ["bruh"] → ["licenseQuery"] restores license refetch; isJsonContentType handles charset-suffixed content types; occlusion-detection SPI removed to prevent login-session soft-restarts on macOS 26.

Confidence Score: 5/5

Safe to merge; the audio pipeline refactor, progressive pre-render, and macOS occlusion fix are all structurally sound and well-tested.

The new AudioLoader, AudioOutput, and PrerenderedAudioBuffer implementations are correctly designed — the watch-channel-backed decode cache, lock-free watermark scheme, and generation-token stop mechanism all hold up under concurrent access. Export validation, the macOS SPI removal, and the frontend bug fixes (license key, content-type, skeleton hang) are clean and correct. The only finding is a pre-existing first-segment-only audio check that was preserved unchanged from the old code.

No files require special attention; the Rust audio pipeline changes in audio_output.rs and audio.rs are the most complex but are well-structured.
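
The lock-free watermark scheme mentioned above can be illustrated with a small std-only sketch. It assumes a single writer that renders forward from sample 0 and a single watermark; the PR renders a window around the playhead first, so the real PrerenderedAudioBuffer presumably tracks more than one bound. `ProgressiveBuffer`, `publish`, and `read` are hypothetical names.

```rust
use std::sync::atomic::{AtomicU32, AtomicUsize, Ordering};

/// Samples are stored as f32 bit patterns in atomics so the render thread
/// can write while the audio callback reads, without a lock.
pub struct ProgressiveBuffer {
    samples: Vec<AtomicU32>,
    /// Watermark: every index below this value has been fully written.
    rendered: AtomicUsize,
}

impl ProgressiveBuffer {
    pub fn new(len: usize) -> Self {
        Self {
            samples: (0..len).map(|_| AtomicU32::new(0)).collect(),
            rendered: AtomicUsize::new(0),
        }
    }

    /// Writer side (single render thread): store samples with Relaxed,
    /// then publish them by advancing the watermark with Release.
    pub fn publish(&self, chunk: &[f32]) -> usize {
        let start = self.rendered.load(Ordering::Relaxed);
        let end = (start + chunk.len()).min(self.samples.len());
        for (slot, sample) in self.samples[start..end].iter().zip(chunk) {
            slot.store(sample.to_bits(), Ordering::Relaxed);
        }
        self.rendered.store(end, Ordering::Release);
        end
    }

    /// Reader side (audio callback): an Acquire load of the watermark makes
    /// all samples below it visible, so Relaxed sample loads are enough.
    /// Anything not rendered yet is returned as silence.
    pub fn read(&self, pos: usize, out: &mut [f32]) -> usize {
        let ready = self.rendered.load(Ordering::Acquire);
        let start = pos.min(ready);
        let available = (ready - start).min(out.len());
        let src = &self.samples[start..start + available];
        for (dst, slot) in out[..available].iter_mut().zip(src) {
            *dst = f32::from_bits(slot.load(Ordering::Relaxed));
        }
        out[available..].fill(0.0);
        available
    }
}
```

The Release store pairs with the Acquire load: a reader that observes the new watermark is guaranteed to observe the sample stores that preceded it, which is why the per-sample accesses can stay Relaxed.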

Important Files Changed

| Filename | Overview |
| --- | --- |
| crates/editor/src/audio_output.rs | New module: persistent cpal output stream with generation-based source install/remove; prewarm, play, and stop_playback correctly handle Bluetooth device wakeup, stream rebuild on failure, and concurrent stop/start via generation tokens. |
| crates/editor/src/audio.rs | Replaces AudioPlaybackBuffer with PrerenderedAudioBuffer (progressive + streaming fallback for >512 MB recordings); lock-free atomic watermark scheme correctly uses Release/Acquire on watermarks with Relaxed sample loads; unsafe allocate_samples transmutation is sound per documented AtomicU32/u32 layout guarantee. |
| crates/editor/src/editor_instance.rs | Introduces AudioLoader (watch-channel backed, correct handling of ready/spawn/none cases) and wires AudioOutput into EditorInstance; segment setup and GPU init now run concurrently via tokio::spawn; audio output is shut down on close. |
| crates/editor/src/playback.rs | Old AudioPlayback (per-press stream + thread) replaced by a call to audio_output.play(); audio segments awaited before entering the sync playback thread; telemetry events added for press-to-clock latency measurement; generation-based stop correctly avoids cutting a successor's source. |
| crates/editor/src/segments.rs | get_audio_segments is now async; awaits AudioLoader.get() for each track sequentially, but decodes run in parallel (all spawn_blocking tasks are already in-flight); failed decodes degrade to silence for playback while export validates separately. |
| crates/export/src/lib.rs | Adds early audio-decode validation loop so exports fail loudly if any background decode errored, rather than silently falling back to silence as playback does. |
| apps/desktop/src-tauri/src/platform/macos/mod.rs | Removes disable_window_occlusion_detection and disable_webview_occlusion_detection calls that were wedging WindowServer on macOS 26; enable path kept as defense-in-depth to re-assert OS default. |
| apps/desktop/src-tauri/src/lib.rs | Adds shared resolution helpers (with unit tests verifying match to frontend defaults), removes debug println, waveform commands now await AudioLoader, and triggers first-frame pre-render at editor open. |
| apps/desktop/src-tauri/src/windows.rs | Editor window shown immediately with themed native background (skipped for transparency mode); orphaned prewarm instances cleaned up on window build failure. |
| apps/desktop/src/routes/editor/context.ts | Waveform data switched from createResource (would suspend the editor skeleton) to plain signals populated via onMount promises; errors surface in console but never block the UI. |
| apps/desktop/src/utils/web-api.ts | Fixes Content-Type comparison to strip charset suffix (e.g. application/json; charset=utf-8 now parsed as JSON correctly). |
| apps/desktop/src/routes/(window-chrome)/settings/license.tsx | Fixes three occurrences of incorrect queryKey ["bruh"] to ["licenseQuery"] so license refetch and deactivation actually invalidate the right cache entry. |
| apps/desktop/app.config.ts | Vite transform swaps font-display: swap to block in bundled font CSS to eliminate FOUT; correctly scoped to desktop build via file-pattern matching. |
| crates/rendering/src/lib.rs | Screen and camera decoders now always init concurrently (try_join!) instead of sequentially on non-Windows; the #[cfg] split is removed as the optimization is safe on all platforms. |
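
The split between playback and export described for segments.rs and export/src/lib.rs (playback degrades to silence, export fails loudly) can be sketched as follows. This is an illustration under stated assumptions, not the PR's AudioLoader: the real one is async and watch-channel backed, while this version blocks on a `Condvar` so it stays std-only. `Loader`, `for_playback`, and `for_export` are hypothetical names.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

/// Result of one background decode, cheap to clone for every consumer.
type Decoded = Result<Arc<Vec<f32>>, String>;

/// Hypothetical loader: the decode starts when the loader is created and
/// every later `get` returns the same cached result.
#[derive(Clone)]
pub struct Loader {
    state: Arc<(Mutex<Option<Decoded>>, Condvar)>,
}

impl Loader {
    pub fn spawn<F>(decode: F) -> Self
    where
        F: FnOnce() -> Result<Vec<f32>, String> + Send + 'static,
    {
        let state: Arc<(Mutex<Option<Decoded>>, Condvar)> =
            Arc::new((Mutex::new(None), Condvar::new()));
        let worker = Arc::clone(&state);
        thread::spawn(move || {
            let result = decode().map(Arc::new);
            *worker.0.lock().unwrap() = Some(result);
            worker.1.notify_all();
        });
        Self { state }
    }

    /// Block until the decode has finished, then return its result.
    pub fn get(&self) -> Decoded {
        let (lock, ready) = &*self.state;
        let mut slot = lock.lock().unwrap();
        while slot.is_none() {
            slot = ready.wait(slot).unwrap();
        }
        slot.clone().unwrap()
    }
}

/// Playback policy: a failed decode becomes an empty (silent) track.
pub fn for_playback(loader: &Loader) -> Arc<Vec<f32>> {
    loader.get().unwrap_or_else(|_| Arc::new(Vec::new()))
}

/// Export policy: a failed decode is surfaced as an error.
pub fn for_export(loader: &Loader) -> Result<Arc<Vec<f32>>, String> {
    loader.get()
}
```

Because the decode is kicked off at construction time, awaiting several loaders one after another still overlaps their work, which matches the table's note that decodes run in parallel even though they are awaited sequentially.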

Comments Outside Diff (1)

  1. apps/desktop/src/routes/(window-chrome)/settings/changelog.tsx, lines 28-56

    P2 ErrorBoundary replaced by Show — render errors no longer contained

    The old code wrapped the entry list in an ErrorBoundary, which catches both async query errors and synchronous JavaScript exceptions thrown during SolidJS rendering (e.g., a null dereference on an unexpected entry shape). The new Show when={!changelog.isError} only reflects the TanStack Query error state; a runtime render exception inside the For loop would propagate up and crash the entire Settings panel instead of being contained to the changelog section. Adding a lightweight ErrorBoundary around the For would restore the isolation without undoing the Suspense → Show simplification.

Reviews (3): Last reviewed commit: "fix: bound long audio playback buffering"

// so the clock below never runs ahead of audible audio.
let audio_spawn_start = Instant::now();
let audio_generation =
if audio_segments.is_empty() || audio_segments[0].tracks.is_empty() {

This condition only looks at the first segment. If segment 0 is silent but later segments have audio, playback will incorrectly skip audio.

Suggested change
- if audio_segments.is_empty() || audio_segments[0].tracks.is_empty() {
+ if audio_segments.is_empty() || audio_segments.iter().all(|s| s.tracks.is_empty()) {
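
The difference between the two conditions is easiest to see on a timeline whose first segment is silent. The sketch below uses a hypothetical `AudioSegment` type with only a `tracks` field; the real struct in the PR carries more than this.

```rust
/// Hypothetical, reduced version of an audio segment: one Vec per track.
pub struct AudioSegment {
    pub tracks: Vec<Vec<f32>>,
}

/// The preserved behavior: only segment 0 decides whether audio is played.
pub fn first_segment_has_audio(segments: &[AudioSegment]) -> bool {
    segments.first().is_some_and(|s| !s.tracks.is_empty())
}

/// The suggested behavior: any segment with at least one track counts.
pub fn any_segment_has_audio(segments: &[AudioSegment]) -> bool {
    segments.iter().any(|s| !s.tracks.is_empty())
}
```

Note that `Iterator::all` returns `true` for an empty iterator, so in the suggested condition the leading `is_empty()` check is redundant, though harmless.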

Comment on lines +25 to +28
return {
code: code.replace(/font-display:\s*swap;/g, "font-display: block;"),
map: null,
};

Small thing: this returns a transformed result even when there are no font-display: swap matches, which can cause unnecessary downstream work.

Suggested change
- return {
-   code: code.replace(/font-display:\s*swap;/g, "font-display: block;"),
-   map: null,
- };
+ const updated = code.replace(/font-display:\s*swap;/g, "font-display: block;");
+ if (updated === code) return;
+ return {
+   code: updated,
+   map: null,
+ };

Comment thread crates/export/src/lib.rs
Comment on lines +154 to +157
for segment in &segments {
segment.audio.get().await.map_err(Error::MediaLoad)?;
segment.system_audio.get().await.map_err(Error::MediaLoad)?;
}

Since both loaders are awaited just to validate decode success, these can be awaited concurrently per segment to shave a bit off export start on long recordings.

Suggested change
- for segment in &segments {
-     segment.audio.get().await.map_err(Error::MediaLoad)?;
-     segment.system_audio.get().await.map_err(Error::MediaLoad)?;
- }
+ for segment in &segments {
+     tokio::try_join!(segment.audio.get(), segment.system_audio.get())
+         .map_err(Error::MediaLoad)?;
+ }

Comment thread crates/editor/src/editor_instance.rs
Comment thread apps/desktop/src-tauri/src/lib.rs Outdated
@CapSoftware CapSoftware deleted a comment from polarityinc Bot Jul 3, 2026
Comment thread apps/desktop/src/routes/(window-chrome)/settings/changelog.tsx Outdated
Comment thread apps/desktop/src/routes/(window-chrome)/settings/changelog.tsx Outdated
@richiemcilroy

Member Author

hey @greptileai, please re-review the PR

import { resolveServerRequestPath } from "./server-url-routing";

const isJsonContentType = (contentType: string | null) =>
contentType?.toLowerCase().split(";")[0]?.trim() === "application/json";

This only handles exactly application/json. If the server ever returns a structured JSON type (e.g. application/problem+json), we’ll currently parse it as text.

Suggested change
- contentType?.toLowerCase().split(";")[0]?.trim() === "application/json";
+ const isJsonContentType = (contentType: string | null) => {
+   const normalized = contentType?.toLowerCase().split(";")[0]?.trim();
+   return normalized === "application/json" || normalized?.endsWith("+json");
+ };

</div>
<SolidMarkdown
components={{
a: (props) => <a {...props} target="_blank" />,

Minor security hardening: target="_blank" should generally include rel="noopener noreferrer" to prevent tabnabbing.

Suggested change
- a: (props) => <a {...props} target="_blank" />,
+ a: (props) => (
+   <a {...props} target="_blank" rel="noreferrer noopener" />
+ ),

@richiemcilroy

Member Author

hey @greptileai, please re-review the PR

self.resampled_buffer.vacant_len() <= 2 * Self::PROCESSING_SAMPLES_COUNT * self.channels
}

fn render_chunk(&mut self) -> bool {

Streaming fallback currently renders + resamples (and allocates typed_data) from inside the cpal callback path via fill(). On very long recordings this seems like the exact scenario where we want callback work to be minimal to avoid xruns (Bluetooth / small buffers especially).

Might be worth moving the streaming producer onto a background thread (producer fills the ringbuf; callback only pops), or at least reusing a scratch buffer to avoid per-chunk heap allocs.
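
A rough shape for the "producer fills, callback only pops" split is sketched below. It is an assumption-laden illustration rather than a proposed patch: a bounded `std::sync::mpsc::sync_channel` stands in for a lock-free SPSC ring buffer, and `spawn_producer` and `fill` are hypothetical names. The point is only that rendering, resampling, and allocation happen on the producer thread, while the callback side never blocks.

```rust
use std::sync::mpsc::{sync_channel, Receiver};
use std::thread;

/// Start a background producer. `render_chunk` does the expensive work
/// (render + resample) and returns `None` at the end of the timeline.
/// `capacity` bounds how far the producer may run ahead of playback.
pub fn spawn_producer<F>(capacity: usize, mut render_chunk: F) -> Receiver<f32>
where
    F: FnMut() -> Option<Vec<f32>> + Send + 'static,
{
    let (tx, rx) = sync_channel(capacity);
    thread::spawn(move || {
        while let Some(chunk) = render_chunk() {
            for sample in chunk {
                // Blocks while the queue is full; stops once the consumer
                // side has been dropped.
                if tx.send(sample).is_err() {
                    return;
                }
            }
        }
    });
    rx
}

/// Callback side: pop whatever is ready and pad the rest with silence.
/// Returns the number of real samples written. Never waits on the producer.
pub fn fill(rx: &Receiver<f32>, out: &mut [f32]) -> usize {
    let mut written = 0;
    while written < out.len() {
        match rx.try_recv() {
            Ok(sample) => {
                out[written] = sample;
                written += 1;
            }
            // Underrun or end of stream: stop popping.
            Err(_) => break,
        }
    }
    out[written..].fill(0.0);
    written
}
```

A per-sample channel is far less efficient than a real ring buffer and is used here only to keep the example dependency-free; the structure (bounded queue, blocking producer, non-blocking consumer) is what carries over.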

@richiemcilroy richiemcilroy merged commit 79eb938 into main Jul 3, 2026
19 of 21 checks passed