ChatGPT Extension
Enhancing ChatGPT with a custom browser extension
Overview
If you keep ChatGPT open while juggling tasks, the slowest part of the workflow is waiting for a reply that finishes off-screen. This extension closes that gap: it tracks the request, notifies you when the stream stops, and shows exactly which question is still pending.
The implementation stays deliberately small. A Manifest V3 content script sits in the ChatGPT tab, clocks the time between prompt and completion, and the service worker plays the audio cue. No background polling, no extra permissions—just a thin layer of observability on top of the UI.
Building it came down to a short checklist: detect when a prompt is submitted, monitor the DOM for the completion signal, and persist the timing so it is visible the next time you focus the tab. Every feature either reduces the "did it finish yet?" loop or captures the answer latency for later review.
What you end up with is a dependable helper that records the prompt, chimes when the response is done, and syncs a compact history via chrome.storage.sync.
Extension Architecture
Ship it like any other Manifest V3 project: a service worker and one declarative content script scoped to https://chat.openai.com.
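A sketch of the manifest that layout implies; the file names and the exact permission list are assumptions rather than the project's real values:

```json
{
  "manifest_version": 3,
  "name": "ChatGPT Notifier",
  "version": "0.1.0",
  "permissions": ["storage"],
  "background": { "service_worker": "service-worker.js" },
  "content_scripts": [
    {
      "matches": ["https://chat.openai.com/*"],
      "js": ["content.js"]
    }
  ],
  "action": { "default_title": "ChatGPT Notifier" }
}
```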
Inside the tab, the script maps the conversation ID from the URL,
mirrors the active textarea, and overlays a "Waiting for response…"
banner while the model streams. Every DOM observation feeds a simple
state machine so we always know where the request sits:
Idle → Prompted → Streaming → Complete.
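In TypeScript the whole machine fits in a few lines; the state names are the ones above, while the transition table is a guess at the shape:

```typescript
// The request lifecycle is a simple cycle, so each state has exactly
// one legal successor.
type RequestState = "Idle" | "Prompted" | "Streaming" | "Complete";

const NEXT: Record<RequestState, RequestState> = {
  Idle: "Prompted",
  Prompted: "Streaming",
  Streaming: "Complete",
  Complete: "Idle",
};

let state: RequestState = "Idle";

function transition(next: RequestState): void {
  if (NEXT[state] !== next) {
    // Hard-fail on illegal transitions so drift is loud, not silent.
    throw new Error(`Illegal transition: ${state} -> ${next}`);
  }
  state = next;
}
```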
Packaging was refreshingly dull. Rollup compiles the TypeScript into a single content bundle, and the rest is stock chrome.scripting and chrome.action wiring.
The content script proxies every prompt into a tiny state machine: on submit it marks the tab as Prompted and starts the clock.
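A plausible way to catch that submit moment, assuming a hypothetical send-button test id (verify it against the live DOM before trusting it):

```typescript
// content.ts — mark the tab as Prompted the moment a prompt goes out.
let promptedAt = 0;

function onPromptSubmitted(): void {
  promptedAt = Date.now(); // start of the prompt-to-completion clock
  transition("Prompted");  // state machine from the sketch above
}

document.addEventListener("click", (event) => {
  const el = event.target as HTMLElement;
  // "send-button" is an illustrative hook, not a guaranteed one.
  if (el.closest('[data-testid="send-button"]')) onPromptSubmitted();
});

document.addEventListener("keydown", (event) => {
  // Enter without Shift submits from the composer textarea.
  const el = event.target as HTMLElement;
  if (event.key === "Enter" && !event.shiftKey && el.matches("textarea")) {
    onPromptSubmitted();
  }
});
```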
While ChatGPT streams tokens, the script watches DOM mutations and keeps timers alive. Nothing is persisted yet; we only emit metrics once a response is truly complete.
Completion hands control to the service worker, which plays the notification chime and stores the timings in chrome.storage.sync.
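On the worker side, persisting the timing is a few lines of chrome.storage.sync; the history key and record shape here are invented for illustration:

```typescript
// service-worker.ts — persist completion timings per conversation.
chrome.runtime.onMessage.addListener((message) => {
  if (message?.type !== "chatgpt-response-complete") return;
  chrome.storage.sync.get({ history: [] }, ({ history }) => {
    history.push({
      conversationId: message.conversationId,
      completedAt: message.completedAt,
    });
    // chrome.storage.sync has tight quotas, so keep the log compact.
    chrome.storage.sync.set({ history: history.slice(-50) });
  });
});
```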
Detecting ChatGPT Response Completion
Completion detection is the only tricky part because ChatGPT provides no "done" event. The approach here is mechanical: wire a MutationObserver to the conversation container, filter new nodes where data-message-author-role is assistant, then check whether the node still contains [data-testid="result-streaming-indicator"]. When the indicator is gone, the response is finished.
Defensive guards keep the observer resilient: bail out if the container disappears, validate every added node before trusting it, and re-attach when navigation swaps the conversation. Those guardrails stop the extension from reporting false positives when OpenAI shuffles class names or introduces new layout experiments.
Submitting a prompt moves the machine from Idle to Prompted. Once the first assistant token lands we enter Streaming; when the observer confirms the turn is finished we pass through Complete and settle back at Idle.
The transition into Complete is the one that does the work: crossing it emits the completion event.
Once the event fires, the service worker plays the chime, updates the overlay, and increments a per-conversation counter. The next time you focus the tab you see the outstanding prompt, the elapsed time, and a button to replay the notification if you missed it.
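One plausible shape for that counter, surfaced through the toolbar badge (an assumption; the article does not say where the counter lives):

```typescript
// service-worker.ts — per-conversation completion counts on the badge.
// Note: the Map resets when the MV3 worker sleeps; persist via
// chrome.storage if the counts need to survive.
const pendingCounts = new Map<string, number>();

function recordCompletion(conversationId: string, tabId: number): void {
  const count = (pendingCounts.get(conversationId) ?? 0) + 1;
  pendingCounts.set(conversationId, count);
  chrome.action.setBadgeText({ tabId, text: String(count) });
}
```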
Anchor the observer to the same container ChatGPT uses for each turn, then climb one level so subtree mutations capture every new message stream. The data-testid attribute is the steadiest hook the UI exposes.
Hard-failing on layout drift is intentional: when the DOM no longer matches expectations we would rather throw, surface the error in the console, and disable notifications than silently miss completions.
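A minimal sketch of that wiring, with an illustrative test-id prefix standing in for the real one:

```typescript
// content.ts — attach the observer one level above the turn container so
// subtree mutations cover every message stream.
const TURN_SELECTOR = '[data-testid^="conversation-turn"]';

function attachObserver(onMutations: MutationCallback): MutationObserver {
  const turn = document.querySelector(TURN_SELECTOR);
  if (!turn?.parentElement) {
    // Hard-fail on layout drift: a loud error beats a silent miss.
    throw new Error("ChatGPT layout changed: conversation container not found");
  }
  const observer = new MutationObserver(onMutations);
  observer.observe(turn.parentElement, { childList: true, subtree: true });
  return observer;
}
```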
Each mutation batch is filtered down to newly added assistant turns that lack the streaming indicator. Leaning on semantic attributes keeps this resilient when class names churn, and the early returns avoid needless work during rapid-fire token streams.
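In sketch form the filter might look like this; the two attributes are the ones named above, everything else is assumed:

```typescript
declare function emitCompletion(turn: HTMLElement): void; // next sketch

// Type guard: true only for a finished assistant turn.
function isCompletedAssistantTurn(node: Node): node is HTMLElement {
  if (!(node instanceof HTMLElement)) return false; // skip text nodes early
  if (node.getAttribute("data-message-author-role") !== "assistant") return false;
  // Complete = the streaming indicator is no longer present in the turn.
  return !node.querySelector('[data-testid="result-streaming-indicator"]');
}

const onMutations: MutationCallback = (batch) => {
  for (const mutation of batch) {
    for (const added of mutation.addedNodes) {
      if (isCompletedAssistantTurn(added)) emitCompletion(added);
    }
  }
};
```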
The emitted payload is the contract with the service worker: a normalized event type, the active conversation ID, and a millisecond timestamp so the history log can compute latencies later.
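A sketch of that contract; the field and event names are chosen here for illustration:

```typescript
// Shared message shape between the content script and the service worker.
interface CompletionEvent {
  type: "chatgpt-response-complete";
  conversationId: string; // parsed from the /c/<id> URL segment
  completedAt: number;    // milliseconds, so the history can compute latency
}

function emitCompletion(turn: HTMLElement): void {
  const id = location.pathname.match(/\/c\/([^/]+)/)?.[1] ?? "unknown";
  const event: CompletionEvent = {
    type: "chatgpt-response-complete",
    conversationId: id,
    completedAt: Date.now(),
  };
  chrome.runtime.sendMessage(event);
}
```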
Watching both childList and subtree means we catch list-item insertions as well as mutations deeper in the message hierarchy, which protects us from minor DOM reshuffles.
Closing Thoughts
Nothing here is fancy—just a few well-placed observers and a standard MV3 setup—but the payback shows up immediately. You stop checking tabs, you capture real latency numbers, and you can tweak the workflow without waiting on OpenAI to expose new hooks. When the UI shifts again, swap out the selectors and keep moving.