Desktop editor startup and playback latency optimizations by richiemcilroy · Pull Request #1975 · CapSoftware/Cap

richiemcilroy · 2026-07-03T00:29:29Z

The desktop editor now opens faster, starts playback sooner, and feels more responsive, especially on long recordings and with Bluetooth audio.

Editor playback & audio

Background audio loading — Mic/system audio decodes in the background via AudioLoader instead of blocking editor open. Export still waits for decodes and fails loudly if a track is missing.
Progressive audio pre-render — Play no longer renders the entire timeline mix synchronously on every press (which could take seconds on long recordings). A background thread renders a window around the playhead first; the rest fills in progressively.
Persistent audio output stream — One cpal stream per editor session, prewarmed at open. Play attaches a source to the live stream instead of opening the device each time, cutting press-to-audio from hundreds of ms to ~5–15ms (Bluetooth included).
Parallel startup — Segment setup (decoders + audio decode kickoff) runs concurrently with GPU/renderer init; screen and camera decoders init in parallel on all platforms.
Telemetry & benchmarks — Play-start events (AudioSegmentsResolved, AudioPipelineReady, ClockStarted) and --press-starts benchmark mode for measuring press-to-clock latency.
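
To make the "window around the playhead first" idea concrete, here is a minimal sketch of one possible render ordering. It is illustrative only: `render_order` and `chunk_for_time` are hypothetical helper names, the fixed-size chunking is an assumption, and the actual ordering used by the PR may differ (for example, it may also render a little before the playhead).

```rust
/// Returns chunk indices in the order a background renderer might process
/// them: the chunk containing the playhead first, then everything after it,
/// then everything before it.
/// Hypothetical ordering for illustration, not taken from the PR's code.
pub fn render_order(total_chunks: usize, playhead_chunk: usize) -> Vec<usize> {
    if total_chunks == 0 {
        return Vec::new();
    }
    // Clamp so a playhead past the end still yields a valid starting chunk.
    let start = playhead_chunk.min(total_chunks - 1);
    (start..total_chunks).chain(0..start).collect()
}

/// Maps a playhead position in seconds to a chunk index.
/// Assumes `chunk_secs` is greater than zero.
pub fn chunk_for_time(playhead_secs: f64, chunk_secs: f64) -> usize {
    (playhead_secs.max(0.0) / chunk_secs).floor() as usize
}
```

With this ordering, pressing Play in the middle of a long recording only has to wait for the first chunk rather than the full timeline mix.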

Desktop UX & window behavior

Faster editor open — Native window background color before webview paint; window shown immediately when transparency is off; first frame pre-rendered so the UI has something to show on connect.
No skeleton hang — Waveforms load via signals (not Suspense); custom domain query has placeholder data; transparent editor reveal avoids throttled requestAnimationFrame.
Startup polish — font-display: block for bundled fonts; idle font/emoji cache prewarm to avoid first-render jank.
Visual cleanup — Removed fade-in animations on window open and redundant cursor-pointer classes across editor, settings, and screenshot editor.

Fixes

macOS liquid glass — Stopped disabling window/WebView occlusion detection, which could wedge WindowServer after sleep/lid-close and soft-restart the login session.
License refetch — Fixed query key (licenseQuery instead of bruh).
API parsing — JSON responses with charset suffix in Content-Type now parse correctly.
Changelog settings — Removed debug logging; simplified loading/error handling.

Dependencies

Pinned tauri-plugin-http to 2.5.2 (Rust + npm + lockfile).
Regenerated tauri specta bindings.

Greptile Summary

This PR delivers a broad set of startup and playback latency improvements to the desktop editor, replacing per-press stream creation with a persistent AudioOutput session, background audio decoding via AudioLoader, and progressive pre-rendering so the first play press no longer blocks on the full timeline mix. Alongside Rust changes, several long-standing frontend bugs are fixed (wrong licenseQuery key, Content-Type charset parsing, editor skeleton hang) and the macOS occlusion-suppression SPI that was wedging WindowServer is removed.

Audio pipeline overhaul — AudioLoader decodes tracks in the background at editor open; AudioOutput keeps a prewarmed cpal stream alive for the session; PrerenderedAudioBuffer fills a progressive ring buffer on a background thread so audio starts within one callback period of clock start.
Parallel and early startup — Segment decoder init and GPU renderer init now run concurrently; window is shown with a themed native background colour before the webview paints; the editor skeleton hang caused by createResource on waveform fetches and a slow customDomainQuery is resolved by switching to signals with placeholderData.
Bug fixes — queryKey: ["bruh"] → ["licenseQuery"] restores license refetch; isJsonContentType handles charset-suffixed content types; occlusion-detection SPI removed to prevent login-session soft-restarts on macOS 26.

Confidence Score: 5/5

Safe to merge; the audio pipeline refactor, progressive pre-render, and macOS occlusion fix are all structurally sound and well-tested.

The new AudioLoader, AudioOutput, and PrerenderedAudioBuffer implementations are correctly designed — the watch-channel-backed decode cache, lock-free watermark scheme, and generation-token stop mechanism all hold up under concurrent access. Export validation, the macOS SPI removal, and the frontend bug fixes (license key, content-type, skeleton hang) are clean and correct. The only finding is a pre-existing first-segment-only audio check that was preserved unchanged from the old code.

No files require special attention; the Rust audio pipeline changes in audio_output.rs and audio.rs are the most complex but are well-structured.
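
The watermark scheme mentioned above can be sketched roughly as follows. This is a simplified, assumed design with a single forward-growing watermark and one writer; `ProgressiveBuffer`, `push`, and `read` are hypothetical names, and the real `PrerenderedAudioBuffer` likely tracks more state.

```rust
use std::sync::atomic::{AtomicU32, AtomicUsize, Ordering};

/// Samples are stored as raw f32 bits in atomics so a writer thread and the
/// audio callback can share the buffer without locks.
pub struct ProgressiveBuffer {
    samples: Vec<AtomicU32>,
    /// Watermark: number of samples that are safe to read.
    ready: AtomicUsize,
}

impl ProgressiveBuffer {
    pub fn new(len: usize) -> Self {
        Self {
            samples: (0..len).map(|_| AtomicU32::new(0)).collect(),
            ready: AtomicUsize::new(0),
        }
    }

    /// Producer side (single writer assumed): write samples with Relaxed
    /// stores, then publish them with a Release store on the watermark.
    /// Returns how many samples fit.
    pub fn push(&self, rendered: &[f32]) -> usize {
        let start = self.ready.load(Ordering::Relaxed);
        let n = rendered.len().min(self.samples.len() - start);
        for (slot, s) in self.samples[start..start + n].iter().zip(rendered) {
            slot.store(s.to_bits(), Ordering::Relaxed);
        }
        self.ready.store(start + n, Ordering::Release);
        n
    }

    /// Consumer side: an Acquire load of the watermark makes every sample
    /// below it visible; anything not yet rendered is filled with silence.
    /// Returns how many real samples were copied.
    pub fn read(&self, pos: usize, out: &mut [f32]) -> usize {
        let ready = self.ready.load(Ordering::Acquire);
        let available = ready.saturating_sub(pos).min(out.len());
        for (i, o) in out.iter_mut().enumerate() {
            *o = if i < available {
                f32::from_bits(self.samples[pos + i].load(Ordering::Relaxed))
            } else {
                0.0
            };
        }
        available
    }
}
```

The Release/Acquire pair on the watermark is what makes the Relaxed sample accesses safe: a reader that observes a watermark value also observes every sample written before it was published.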

Important Files Changed

Filename	Overview
crates/editor/src/audio_output.rs	New module: persistent cpal output stream with generation-based source install/remove; prewarm, play, and stop_playback correctly handle Bluetooth device wakeup, stream rebuild on failure, and concurrent stop/start via generation tokens.
crates/editor/src/audio.rs	Replaces AudioPlaybackBuffer with PrerenderedAudioBuffer (progressive + streaming fallback for >512 MB recordings); lock-free atomic watermark scheme correctly uses Release/Acquire on watermarks with Relaxed sample loads; unsafe allocate_samples transmutation is sound per documented AtomicU32/u32 layout guarantee.
crates/editor/src/editor_instance.rs	Introduces AudioLoader (watch-channel backed, correct handling of ready/spawn/none cases) and wires AudioOutput into EditorInstance; segment setup and GPU init now run concurrently via tokio::spawn; audio output is shut down on close.
crates/editor/src/playback.rs	Old AudioPlayback (per-press stream + thread) replaced by a call to audio_output.play(); audio segments awaited before entering the sync playback thread; telemetry events added for press-to-clock latency measurement; generation-based stop correctly avoids cutting a successor's source.
crates/editor/src/segments.rs	get_audio_segments is now async; awaits AudioLoader.get() for each track sequentially, but decodes run in parallel (all spawn_blocking tasks are already in-flight); failed decodes degrade to silence for playback while export validates separately.
crates/export/src/lib.rs	Adds early audio-decode validation loop so exports fail loudly if any background decode errored, rather than silently falling back to silence as playback does.
apps/desktop/src-tauri/src/platform/macos/mod.rs	Removes disable_window_occlusion_detection and disable_webview_occlusion_detection calls that were wedging WindowServer on macOS 26; enable path kept as defense-in-depth to re-assert OS default.
apps/desktop/src-tauri/src/lib.rs	Adds shared resolution helpers (with unit tests verifying match to frontend defaults), removes debug println, waveform commands now await AudioLoader, and triggers first-frame pre-render at editor open.
apps/desktop/src-tauri/src/windows.rs	Editor window shown immediately with themed native background (skipped for transparency mode); orphaned prewarm instances cleaned up on window build failure.
apps/desktop/src/routes/editor/context.ts	Waveform data switched from createResource (would suspend the editor skeleton) to plain signals populated via onMount promises; errors surface in console but never block the UI.
apps/desktop/src/utils/web-api.ts	Fixes Content-Type comparison to strip charset suffix (e.g. application/json; charset=utf-8 now parsed as JSON correctly).
apps/desktop/src/routes/(window-chrome)/settings/license.tsx	Fixes three occurrences of incorrect queryKey ["bruh"] to ["licenseQuery"] so license refetch and deactivation actually invalidate the right cache entry.
apps/desktop/app.config.ts	Vite transform swaps font-display: swap to block in bundled font CSS to eliminate FOUT; correctly scoped to desktop build via file-pattern matching.
crates/rendering/src/lib.rs	Screen and camera decoders now always init concurrently (try_join!) instead of sequentially on non-Windows; the #[cfg] split is removed as the optimization is safe on all platforms.
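
The generation-based stop described for audio_output.rs and playback.rs can be illustrated with a small sketch. The method names `play` and `stop_playback` come from the summary above, but their signatures, the `OutputSession` type, and the use of a `Mutex` are assumptions made for illustration; a real audio callback would likely avoid locking.

```rust
use std::sync::Mutex;

/// Stand-in for a persistent output session with one active source slot.
/// Hypothetical type: the PR's AudioOutput wraps a live cpal stream instead.
pub struct OutputSession<S> {
    /// (latest generation issued, currently installed source)
    slot: Mutex<(u64, Option<S>)>,
}

impl<S> OutputSession<S> {
    pub fn new() -> Self {
        Self { slot: Mutex::new((0, None)) }
    }

    /// Installs a source, replacing any previous one, and returns the
    /// generation token that identifies this particular playback.
    pub fn play(&self, source: S) -> u64 {
        let mut slot = self.slot.lock().unwrap();
        slot.0 += 1;
        slot.1 = Some(source);
        slot.0
    }

    /// Removes the source only if `generation` is still current, so a late
    /// stop from an older playback cannot cut off its successor.
    pub fn stop_playback(&self, generation: u64) -> bool {
        let mut slot = self.slot.lock().unwrap();
        if slot.0 == generation && slot.1.is_some() {
            slot.1 = None;
            true
        } else {
            false
        }
    }

    pub fn is_playing(&self) -> bool {
        self.slot.lock().unwrap().1.is_some()
    }
}
```

If the user presses Play twice in quick succession, the stop issued for the first press carries a stale token and becomes a no-op, which is the behaviour the summary describes as "avoids cutting a successor's source".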

Comments Outside Diff (1)

apps/desktop/src/routes/(window-chrome)/settings/changelog.tsx, line 28-56 (link)

ErrorBoundary replaced by Show — render errors no longer contained

The old code wrapped the entry list in an ErrorBoundary, which catches both async query errors and synchronous JavaScript exceptions thrown during SolidJS rendering (e.g., a null dereference on an unexpected entry shape). The new Show when={!changelog.isError} only reflects the TanStack Query error state; a runtime render exception inside the For loop would propagate up and crash the entire Settings panel instead of being contained to the changelog section. Adding a lightweight ErrorBoundary around the For would restore the isolation without undoing the Suspense → Show simplification.


Reviews (3): Last reviewed commit: "fix: bound long audio playback buffering"

tembo · 2026-07-03T00:33:21Z

+            // so the clock below never runs ahead of audible audio.
+            let audio_spawn_start = Instant::now();
+            let audio_generation =
+                if audio_segments.is_empty() || audio_segments[0].tracks.is_empty() {


This condition only looks at the first segment. If segment 0 is silent but later segments have audio, playback will incorrectly skip audio.

Suggested change

-                if audio_segments.is_empty() || audio_segments[0].tracks.is_empty() {
+                if audio_segments.is_empty() || audio_segments.iter().all(|s| s.tracks.is_empty()) {
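
A small standalone sketch shows where the two conditions diverge. `AudioSegment` here is a simplified stand-in with an assumed `tracks` shape, and both function names are hypothetical.

```rust
/// Simplified stand-in for the editor's audio segment type.
pub struct AudioSegment {
    pub tracks: Vec<Vec<f32>>,
}

/// Original condition: only inspects the first segment.
pub fn skips_audio_first_only(segments: &[AudioSegment]) -> bool {
    segments.is_empty() || segments[0].tracks.is_empty()
}

/// Suggested condition: skip audio only when no segment has any track.
/// `all` returns true for an empty slice, so the empty case is covered
/// even without a separate `is_empty()` check.
pub fn skips_audio_all(segments: &[AudioSegment]) -> bool {
    segments.iter().all(|s| s.tracks.is_empty())
}
```

For a recording whose first segment is silent and whose second has a microphone track, the first function skips audio and the second does not.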

tembo · 2026-07-03T00:33:25Z

+		return {
+			code: code.replace(/font-display:\s*swap;/g, "font-display: block;"),
+			map: null,
+		};


Small thing: this returns a transformed result even when there are no font-display: swap matches, which can cause unnecessary downstream work.

Suggested change

-		return {
-			code: code.replace(/font-display:\s*swap;/g, "font-display: block;"),
-			map: null,
-		};
+		const updated = code.replace(/font-display:\s*swap;/g, "font-display: block;");
+		if (updated === code) return;
+		return {
+			code: updated,
+			map: null,
+		};

tembo · 2026-07-03T00:33:29Z

+        for segment in &segments {
+            segment.audio.get().await.map_err(Error::MediaLoad)?;
+            segment.system_audio.get().await.map_err(Error::MediaLoad)?;
+        }


Since both loaders are awaited just to validate decode success, these can be awaited concurrently per segment to shave a bit off export start on long recordings.

Suggested change

-        for segment in &segments {
-            segment.audio.get().await.map_err(Error::MediaLoad)?;
-            segment.system_audio.get().await.map_err(Error::MediaLoad)?;
-        }
+        for segment in &segments {
+            tokio::try_join!(segment.audio.get(), segment.system_audio.get())
+                .map_err(Error::MediaLoad)?;
+        }

richiemcilroy · 2026-07-03T00:44:04Z

hey @greptileai, please re-review the PR

tembo · 2026-07-03T00:47:33Z

 import { resolveServerRequestPath } from "./server-url-routing";

+const isJsonContentType = (contentType: string | null) =>
+	contentType?.toLowerCase().split(";")[0]?.trim() === "application/json";


This only handles exactly application/json. If the server ever returns a structured JSON type (e.g. application/problem+json), we’ll currently parse it as text.

Suggested change

-	contentType?.toLowerCase().split(";")[0]?.trim() === "application/json";
+const isJsonContentType = (contentType: string | null) => {
+	const normalized = contentType?.toLowerCase().split(";")[0]?.trim();
+	return normalized === "application/json" || normalized?.endsWith("+json");
+};

tembo · 2026-07-03T00:47:38Z

+												</div>
+												<SolidMarkdown
+													components={{
+														a: (props) => <a {...props} target="_blank" />,


Minor security hardening: target="_blank" should generally include rel="noopener noreferrer" to prevent tabnabbing.

Suggested change

-														a: (props) => <a {...props} target="_blank" />,
+														a: (props) => (
+															<a {...props} target="_blank" rel="noreferrer noopener" />
+														),

richiemcilroy · 2026-07-03T02:46:16Z

hey @greptileai, please re-review the PR

tembo · 2026-07-03T02:51:43Z

+        self.resampled_buffer.vacant_len() <= 2 * Self::PROCESSING_SAMPLES_COUNT * self.channels
+    }
+
+    fn render_chunk(&mut self) -> bool {


Streaming fallback currently renders + resamples (and allocates typed_data) from inside the cpal callback path via fill(). On very long recordings this seems like the exact scenario where we want callback work to be minimal to avoid xruns (Bluetooth / small buffers especially).

Might be worth moving the streaming producer onto a background thread (producer fills the ringbuf; callback only pops), or at least reusing a scratch buffer to avoid per-chunk heap allocs.
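
As a rough illustration of the producer/consumer split suggested here, the sketch below uses a bounded `std::sync::mpsc` channel as a stand-in for the ring buffer. Everything in it is hypothetical (`spawn_producer`, this `fill` signature, per-sample messaging), and a production callback would more likely pop from a lock-free ring buffer in bulk.

```rust
use std::sync::mpsc::{sync_channel, Receiver};
use std::thread::{self, JoinHandle};

/// Spawns a background producer that pushes already-rendered samples into a
/// bounded queue. `capacity` should be greater than zero; the producer blocks
/// when the queue is full, which bounds memory use.
pub fn spawn_producer(samples: Vec<f32>, capacity: usize) -> (Receiver<f32>, JoinHandle<()>) {
    let (tx, rx) = sync_channel(capacity);
    let handle = thread::spawn(move || {
        for s in samples {
            // Stop quietly if the consumer side has gone away.
            if tx.send(s).is_err() {
                break;
            }
        }
    });
    (rx, handle)
}

/// Callback-side work: pop whatever is queued without blocking, then pad the
/// remainder with silence on underrun. No rendering, resampling, or heap
/// allocation happens here. Returns the number of real samples written.
pub fn fill(rx: &Receiver<f32>, out: &mut [f32]) -> usize {
    let mut filled = 0;
    while filled < out.len() {
        match rx.try_recv() {
            Ok(s) => {
                out[filled] = s;
                filled += 1;
            }
            Err(_) => break, // queue empty or producer finished
        }
    }
    out[filled..].fill(0.0);
    filled
}
```

The point of the split is that the expensive work happens on the producer thread, so a slow render shows up as a brief stretch of silence rather than a stalled audio callback.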

richiemcilroy added 27 commits July 3, 2026 03:27

perf(rendering): initialize screen and camera decoders concurrently

75df302

feat(editor): add AudioLoader for background audio decoding

a3a3e7d

feat(export): await background audio decodes before export

c358628

feat(editor): add persistent AudioOutput session stream

972d3f0

perf(editor): progressive audio pre-render for faster play start

660f13a

refactor(editor): route playback through persistent audio output

24a3bea

feat(editor): add play-start latency telemetry events

5be7c2a

test(editor): extend playback benchmark with press-start metrics

8ddc65c

docs(editor): document play-start latency optimization findings

ce13433

fix(macos): stop disabling window occlusion for liquid glass

c7464ab

feat(desktop): show windows early with native background color

eac5c35

feat(desktop): pre-render first frame and lazy-load waveforms

4b61f56

test(desktop): wire AudioOutput into display transport benchmark

e7ce872

fix(desktop): load waveforms without suspending editor UI

5019ce5

fix(desktop): reveal transparent editor without throttled rAF

a2ce530

fix(desktop): add custom domain query placeholder data

7cb2323

fix(desktop): parse JSON content-type with charset suffix

d499458

fix(desktop): use licenseQuery key when refetching license state

56d49e8

refactor(desktop): simplify changelog settings page loading

640d5b9

perf(desktop): use font-display block for bundled fonts

e6a890b

perf(desktop): prewarm font and emoji caches on app mount

f2447fb

style(desktop): remove redundant cursor-pointer from editor UI

6f11977

style(desktop): remove redundant cursor-pointer from settings UI

ca49eed

style(desktop): remove redundant cursor-pointer from screenshot editor

0e3bce3

style(desktop): remove fade-in animations on window open

7089fb2

build(desktop): pin tauri-plugin-http to 2.5.2

581c9bf

chore(desktop): regenerate tauri specta bindings

9f7e1e3

tembo Bot reviewed Jul 3, 2026

greptile-apps Bot reviewed Jul 3, 2026

Comment thread crates/editor/src/editor_instance.rs

Comment thread apps/desktop/src-tauri/src/lib.rs Outdated

CapSoftware deleted a comment from polarityinc Bot Jul 3, 2026

lockfile fix

0a38ed1

tembo Bot reviewed Jul 3, 2026

Comment thread apps/desktop/src/routes/(window-chrome)/settings/changelog.tsx Outdated

Comment thread apps/desktop/src/routes/(window-chrome)/settings/changelog.tsx Outdated

richiemcilroy added 3 commits July 3, 2026 03:43

fix: avoid duplicate audio decode warnings

f26fb5d

fix: derive editor preview pre-render size

1157ee5

fix: restore changelog render error boundary

b519947

tembo Bot reviewed Jul 3, 2026

fix: bound long audio playback buffering

2b0a911

tembo Bot reviewed Jul 3, 2026

richiemcilroy merged commit 79eb938 into main Jul 3, 2026
19 of 21 checks passed

richiemcilroy merged 32 commits into main from desktop-optimisations-etc
