Speech-to-text for Google Docs on Mac
Dictate into Google Docs and get clean, punctuated text instead of raw, filler-laden transcripts. SpeechFlow works in any browser, with no tab focus required. Free to start.
Google Docs has its own Voice Typing, and it sounds like the obvious choice — until you use it. The transcripts land raw: no punctuation, every “um” intact, and the moment you click another tab the mic cuts out. SpeechFlow takes a different approach: hold Control, speak, release, and a clean, punctuated sentence appears at your cursor in Docs — whether you’re in Chrome, Safari, or the Arc browser, and whether that tab is in focus or not.
The real problem with Google Voice Typing
Voice Typing in Google Docs is genuinely useful for a quick test, but its limits frustrate daily use. It only runs in Chrome. It stops recording the instant the Docs tab loses focus, so glancing at your notes in another tab kills the session. Most painfully, it gives you raw output: “um so basically what i wanted to say was that the deadline is friday” lands verbatim, and you edit it yourself. For a quick sentence that’s fine; for a 500-word draft it’s a second job.
Apple’s built-in dictation shares the same flaw: clean-up is on you. If you’ve ever tried to speak a long document and ended up with a wall of unpunctuated text, you already know the cost.
How SpeechFlow works in Google Docs
SpeechFlow is a native macOS app (~50 MB, Apple Silicon). It operates at the system level — not inside the browser — so it doesn’t care which browser you use or whether the Docs tab is active. The workflow:
- Open your Google Doc and click where you want text to appear.
- Hold Control and speak naturally for as long as you need.
- Release. The cleanup LLM strips fillers, adds punctuation, and adjusts tone; SpeechFlow then inserts the finished text right at the cursor.
Because insertion happens at the OS cursor, it works in every part of Docs: body text, comments, the title field, tables, and even the “Outline” heading panel. No copy-paste, no dictation window to manage.
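To make the cleanup step concrete, here is a minimal Python sketch of what “strip fillers, add punctuation” means for the example sentence quoted earlier. It is a rule-based stand-in, not SpeechFlow’s implementation, which uses an LLM; the function name `clean_transcript`, the filler patterns, and the capitalization rules are all assumptions made for illustration.

```python
import re

# Hypothetical patterns; a real cleanup model covers far more cases.
HESITATIONS = re.compile(r"\b(?:um+|uh+)\b[,\s]*", re.IGNORECASE)
LEADING_MARKERS = re.compile(r"^(?:(?:so|basically|like|well)\b[,\s]*)+", re.IGNORECASE)


def clean_transcript(raw: str) -> str:
    """Turn a raw dictation string into a single tidy sentence."""
    text = HESITATIONS.sub("", raw)           # drop "um" / "uh" anywhere
    text = LEADING_MARKERS.sub("", text)      # drop sentence-opening markers
    text = re.sub(r"\s+", " ", text).strip()  # collapse leftover whitespace
    text = re.sub(r"\bi\b", "I", text)        # standalone "i" becomes "I"
    if not text:
        return text
    text = text[0].upper() + text[1:]         # capitalize the first letter
    if text[-1] not in ".!?":
        text += "."                           # close the sentence
    return text


if __name__ == "__main__":
    raw = "um so basically what i wanted to say was that the deadline is friday"
    print(clean_transcript(raw))
    # prints: What I wanted to say was that the deadline is friday.
```

A fixed rule list like this misses context (it leaves “friday” lowercase, for instance), which is the kind of gap an LLM-based cleanup pass is meant to close.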
SpeechFlow vs. Google Voice Typing
| Feature | Google Voice Typing | SpeechFlow |
|---|---|---|
| Punctuation | None — raw transcript | Auto-added by cleanup LLM |
| Filler removal | No — every “um” stays | Yes — stripped on release |
| Browser lock-in | Chrome only | Any browser or desktop app |
| Tab focus required | Yes — stops if you switch tabs | No — works system-wide |
| Privacy | Audio processed by Google | Zero retention; BYOK routes audio to your own provider |
| Works outside Docs | No | Yes — every Mac app, same shortcut |
Where dictation actually shines in Google Docs
Speaking runs at 150–180 words per minute; typing caps around 40–60. That gap, roughly 3× at typical speeds, matters most for volume work:
- First drafts — speak the skeleton of a report or proposal, then edit. Getting words on the page is the hard part; SpeechFlow handles it.
- Comment threads — long inline comments are painful to type. Dictate them in seconds without leaving the document.
- Collaborative docs — if you share a Doc and need to add notes quickly during a call, holding Control while the video call is in the foreground works fine.
- Non-native language writing — speaking in your native language while the LLM cleans up grammar is a surprisingly effective drafting strategy.
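As a rough worked example of the rates quoted above, the Python sketch below estimates how long a 500-word draft takes to speak versus type. The function names and the 500-word figure are illustrative choices; real speeds vary by person, and the estimate ignores editing time.

```python
SPEAKING_WPM = (150, 180)  # range quoted above
TYPING_WPM = (40, 60)      # range quoted above
DRAFT_WORDS = 500          # illustrative draft length


def minutes_for(words: int, wpm: float) -> float:
    """Minutes needed to produce `words` at a steady `wpm`."""
    return words / wpm


def speedup_range() -> tuple[float, float]:
    """Smallest and largest speaking-to-typing ratio implied by the ranges."""
    smallest = min(SPEAKING_WPM) / max(TYPING_WPM)  # 150 / 60
    largest = max(SPEAKING_WPM) / min(TYPING_WPM)   # 180 / 40
    return smallest, largest


if __name__ == "__main__":
    for label, (low, high) in (("Speaking", SPEAKING_WPM), ("Typing", TYPING_WPM)):
        fastest = minutes_for(DRAFT_WORDS, high)
        slowest = minutes_for(DRAFT_WORDS, low)
        print(f"{label}: {fastest:.1f} to {slowest:.1f} minutes")
    print("Speedup: {:.1f}x to {:.1f}x".format(*speedup_range()))
```

Under these assumptions, a 500-word draft takes about 3 minutes to speak and roughly 8 to 12.5 minutes to type, a speedup of 2.5× to 4.5×.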
The same shortcut works across your whole Mac, so you can dictate into Microsoft Word, Notion, Slack, email — everything — without switching tools. For a full picture of how AI dictation compares to legacy voice tools, see the AI dictation guide. And if you’re writing longer-form content, writing a book by voice on Mac walks through the same workflow at scale.
FAQ
Does SpeechFlow work in Google Docs on Safari or Firefox, not just Chrome?
Yes. SpeechFlow types at the system cursor, so it works in any browser — Chrome, Safari, Firefox, Arc, or Brave — and in the Google Docs progressive web app if you use it.
Will it keep working if I switch tabs while speaking?
Yes. Because SpeechFlow runs at the macOS level rather than inside the browser, switching tabs or apps doesn’t interrupt the recording. Text is inserted when you release Control, wherever your cursor is.
Does it store my voice or document content?
No. SpeechFlow has zero data retention. In BYOK (bring your own key) mode, your audio goes directly to your chosen provider (OpenAI, Gemini, or Groq), and nothing passes through or is stored on a SpeechFlow server.
Is there a free plan?
Yes — 2,500 words per week, no credit card required. Pro is €10/month or €70/year. BYOK is a one-time €69 for lifetime access.
Do I need a Chrome extension or a Docs add-on?
No. There is nothing to install in the browser or in Google Docs. SpeechFlow is a standalone Mac app that inserts text anywhere your cursor is placed.
Ready to stop fighting Voice Typing? Try SpeechFlow free — 2,500 words a week, no card needed.