Speech-to-text for ChatGPT: better prompts by voice
Dictate long, detailed prompts into ChatGPT on Mac. SpeechFlow inserts clean, punctuated text at your cursor — faster than typing, works in every AI tool. Free to start.
Typing long prompts into ChatGPT is the hidden friction in AI work. You know a richer, more detailed prompt gets a better answer — but writing it out kills the flow. Speaking runs at 150–180 words per minute; typing tops out at 40–60. SpeechFlow closes that gap: hold Control, speak your prompt, release, and clean punctuated text lands right in the ChatGPT input box.
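To put those rates in concrete terms, here is a rough back-of-the-envelope sketch. The 300-word prompt length is a made-up figure for illustration, and the calculation ignores thinking pauses and any editing afterwards.

```python
def minutes_to_enter(words: int, words_per_minute: float) -> float:
    """Minutes needed to produce `words` at a steady rate."""
    return words / words_per_minute


# Hypothetical prompt length, chosen only for illustration.
PROMPT_WORDS = 300

# Rates quoted in the text: speaking 150-180 wpm, typing 40-60 wpm.
speaking = (minutes_to_enter(PROMPT_WORDS, 180), minutes_to_enter(PROMPT_WORDS, 150))
typing = (minutes_to_enter(PROMPT_WORDS, 60), minutes_to_enter(PROMPT_WORDS, 40))

print(f"Speaking: {speaking[0]:.1f} to {speaking[1]:.1f} min")  # Speaking: 1.7 to 2.0 min
print(f"Typing:   {typing[0]:.1f} to {typing[1]:.1f} min")      # Typing:   5.0 to 7.5 min
```

On these assumptions, a prompt that takes five minutes or more to type takes about two minutes to say.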
The problem: detailed prompts are tedious to type
The single biggest lever on output quality is prompt quality — context, constraints, examples, tone. But writing all that out feels like work, so you shortcut it and get a generic answer. The friction isn't creativity; it's the keyboard. Your best prompts are the ones you'd explain out loud to a colleague, not the ones you whittle down to fit what's comfortable to type.
Apple's built-in dictation can put words in the box, but it leaves in every filler word and skips punctuation, so you spend the time you saved cleaning up the raw transcription. That's not faster; it's just differently slow.
How SpeechFlow works in the ChatGPT box
SpeechFlow is a native macOS app (Apple Silicon, ~50 MB) that operates at the system cursor — meaning it works in any Mac app, including ChatGPT in the browser and the ChatGPT desktop app. The flow is three steps:
1. Click into the ChatGPT prompt field.
2. Hold Control and speak your prompt naturally (fillers, run-ons and all).
3. Release. A cleanup LLM strips the “ums”, adds punctuation, adapts the tone and inserts polished text right where your cursor sits.
Because it inserts at the cursor, it works identically in Claude.ai, Gemini, Perplexity, Cursor, or any other tool you have open. You're not locked into one surface.
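To make the cleanup step concrete, here is a deliberately simplified stand-in. SpeechFlow uses an LLM for this step; the regex below is not its implementation, and the filler list and the `toy_cleanup` name are hypothetical. It only illustrates the kind of before-and-after change the cleanup produces.

```python
import re

# Hypothetical filler pattern for illustration only.
# The real cleanup is done by an LLM, not by a regex like this.
FILLERS = re.compile(r"\b(?:um+|uh+)\b[,.]?\s*", re.IGNORECASE)


def toy_cleanup(raw: str) -> str:
    """Drop simple fillers, tidy spacing, add end punctuation, capitalise."""
    text = FILLERS.sub("", raw).strip()
    text = re.sub(r"\s{2,}", " ", text)
    if text and text[-1] not in ".?!":
        text += "."
    return text[:1].upper() + text[1:]


print(toy_cleanup("um so write me a uh summary of this report"))
# So write me a summary of this report.
```

A pattern like this cannot restructure run-on sentences or adapt tone, which is the part the cleanup LLM handles.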
What to dictate into ChatGPT
| Use case | Why voice wins |
|---|---|
| Long, detailed prompts | Speak the full context — role, goal, constraints, format — in one breath instead of backspacing forever. |
| Follow-up questions | React to a response naturally (“can you make that more concise and give me three bullet examples”) without switching to your keyboard. |
| Pasting in context | Dictate surrounding explanation for code, doc snippets or data you’re about to paste — adds the “why” without friction. |
| Drafting with AI help | Speak a rough draft and let ChatGPT refine it: faster than typing the draft yourself, and much faster than typing out a prompt that asks ChatGPT to write it. |
| Multi-step instructions | Chain numbered steps out loud; SpeechFlow's cleanup preserves the list structure so ChatGPT follows each step correctly. |
SpeechFlow vs ChatGPT's own Voice Mode
ChatGPT has a built-in Voice Mode, and it's genuinely good — but it's a separate conversational UI. You talk, it talks back, and there's no persistent text box. That's perfect for hands-free Q&A; it's not designed for composing a carefully structured prompt, adding code snippets, or switching mid-sentence to paste something in.
SpeechFlow works inside the normal ChatGPT text box. You stay in the standard chat interface, keep your paste shortcuts, use the web search toggle — everything stays the same, you just stop typing. And because it's system-level, the same Control + speak shortcut works identically in Claude, Gemini, terminal prompts, email, Slack, or anywhere else on your Mac. One habit, every tool.
Privacy and pricing
SpeechFlow has a zero data retention policy. In BYOK (bring your own key) mode, a one-time €69 purchase, your audio goes straight from your microphone to the AI provider you chose (OpenAI, Gemini or Groq), and nothing passes through or is stored on a SpeechFlow server. The standard plans (Free: 2,500 words/week, no card; Pro: €10/month or €70/year) route through SpeechFlow's infrastructure with the same no-retention policy.
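For a rough sense of how the paid options compare over time, here is a small sketch using only the prices listed above. It assumes yearly Pro is billed per started year, and it leaves out whatever your chosen provider charges for API usage in BYOK mode, so treat it as an estimate and not a full cost model.

```python
import math

# Prices quoted in the text (EUR).
PRO_MONTHLY_EUR = 10
PRO_YEARLY_EUR = 70
BYOK_ONCE_EUR = 69


def pro_monthly_cost(months: int) -> int:
    """Total paid on the monthly Pro plan after `months` months."""
    return PRO_MONTHLY_EUR * months


def pro_yearly_cost(months: int) -> int:
    """Total paid on the yearly Pro plan, assuming billing per started year."""
    return PRO_YEARLY_EUR * math.ceil(months / 12)


# First month in which monthly Pro has cost at least as much as BYOK.
break_even_months = math.ceil(BYOK_ONCE_EUR / PRO_MONTHLY_EUR)
print(break_even_months)  # 7

for months in (6, 12, 24):
    print(months, pro_monthly_cost(months), pro_yearly_cost(months), BYOK_ONCE_EUR)
# 6 60 70 69
# 12 120 70 69
# 24 240 140 69
```

Before provider API fees, the one-time BYOK price overtakes monthly Pro in the seventh month and is already below a single year of yearly Pro.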
If you're curious how the same workflow applies to writing and coding tools, see the AI dictation overview for the full picture.
FAQ
Does it work in the ChatGPT desktop app as well as the browser?
Yes. SpeechFlow inserts text at the system cursor, so it works in the ChatGPT Mac app, ChatGPT in Chrome or Safari, and any other AI tool open on your Mac.
How is this different from ChatGPT's built-in Voice Mode?
ChatGPT Voice Mode is a conversational audio interface — you speak and it speaks back. SpeechFlow types into the standard text input, so you keep the full chat UI: paste shortcuts, file uploads, toggles, and the ability to edit before you send.
Does SpeechFlow store my prompts or voice recordings?
No. SpeechFlow has a zero data retention policy. In BYOK mode, audio goes directly to your chosen provider with no SpeechFlow server in between.
Does it only work with ChatGPT?
No — it works in every Mac app at the cursor. Claude, Gemini, Perplexity, Cursor, email, Notion, Slack — the same shortcut covers your whole workflow.
What does it cost?
Free tier: 2,500 words per week, no credit card needed. Pro is €10/month or €70/year. BYOK is €69 once for lifetime access with your own API keys.
Stop shortcutting your prompts. Try SpeechFlow free — 2,500 words a week, no card required.