Speech-to-text vs typing: which is faster?

Speaking runs 150–180 wpm; typing averages 40–60. But speed isn’t the whole story. Here’s an honest comparison to help you choose the right tool.

Most people type at 40–60 words per minute. Most people speak at 150–180. That roughly three-to-one gap has always existed, yet we still draft emails, docs and notes by keyboard. AI dictation has finally addressed most of speech’s old weaknesses, but typing still wins in some situations. Here’s what the evidence actually says.

The numbers: speaking vs typing speed

Raw throughput comparisons come with caveats, but the ranges below reflect what researchers and practitioners consistently observe:

Metric	Typing (average)	Typing (fast)	Speaking
Words per minute	40–60 wpm	70–90 wpm	150–180 wpm
Raw accuracy	Very high (you see errors live)	High (some misstrikes)	95–99% with modern AI
Editing overhead	Low — inline corrections	Low	Low with AI cleanup; higher without
Fatigue / RSI risk	Moderate to high over hours	High at sustained pace	Low — vocal cords tire slowly
Best setting	Quiet or noisy, public spaces	Quiet or noisy, public spaces	Private or semi-private
Works for code / syntax	Excellent	Excellent	Poor — brackets and operators are awkward to dictate

Where typing still wins

Speed benchmarks flatter speech, but keyboard input has real, durable advantages:

Code and structured syntax. Bracket pairs, camelCase identifiers, SQL and shell commands all flow from fingers far more naturally than voice. Saying “open paren close paren semicolon” is slower than pressing the keys.
Noisy or public environments. Open offices, cafés, trains — anywhere you’d disturb others or risk being overheard, typing is simply more practical.
Heavy iterative editing. Rewriting a paragraph five times, rearranging sentences, cutting-and-pasting structure: fine motor control on a keyboard beats the back-and-forth of voice commands.
Precise formatting. Markdown, HTML, tables and numbered lists are all faster to key than to narrate, especially when the output format matters as much as the content.
Short bursts. A two-word reply, a filename, a quick search — the overhead of switching to voice isn’t worth it.

Where speech wins

Voice’s throughput advantage becomes decisive for anything long-form and prose-shaped:

First drafts. Getting words on the page is where speaking’s 3× speed advantage is unbeatable. Dictate the rough structure, then edit by keyboard — a workflow explored in our dictation productivity guide.
Emails and messages. A 200-word email takes under 90 seconds to speak; typing the same at 50 wpm takes four minutes.
Long-form writing. Blog posts, reports, meeting notes, journal entries — anything where volume matters more than pixel-perfect formatting.
Accessibility and RSI. For anyone managing repetitive strain injury or a typing-related condition, voice isn’t a productivity hack — it’s a lifeline.
Mobile. On-screen keyboards cap out well below desktop typing speed. Dictating is almost always faster on a phone or tablet.
Thinking out loud. Speaking often produces more natural, readable prose than composed typing, because you’re talking to a reader rather than performing for a cursor.
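
The email timing above is simple division, and it is easy to check for your own speeds. The sketch below is illustrative only: the function name `minutes_to_produce` is made up for this example, and the speeds are the article’s ranges rather than measurements.

```python
def minutes_to_produce(words: int, words_per_minute: float) -> float:
    """Return the minutes needed to produce `words` at a steady pace."""
    if words_per_minute <= 0:
        raise ValueError("words_per_minute must be positive")
    return words / words_per_minute


EMAIL_WORDS = 200  # the example email length used in the list above

# Speeds are assumptions taken from the ranges quoted in this article.
scenarios = [
    ("typing at 50 wpm", 50),
    ("speaking at 150 wpm", 150),
    ("speaking at 180 wpm", 180),
]

for label, wpm in scenarios:
    seconds = minutes_to_produce(EMAIL_WORDS, wpm) * 60
    print(f"{label}: {seconds:.0f} s")
```

This counts raw input time only. It ignores pauses to think and any editing afterwards, both of which affect typing and dictation alike.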

How AI cleanup changed the equation

The classic knock on dictation was transcript quality: filler words (“um”, “like”, “you know”), missing punctuation, and misheard words. That was a fair criticism of basic speech-to-text five years ago.

Modern AI-assisted dictation changes this. A language model post-processes the raw transcript — stripping fillers, adding correct punctuation, smoothing awkward phrasing — so the output you insert reads like something you typed carefully, not something you mumbled. The accuracy gap between speaking and typing is now largely closed for prose. What remains is a situational choice, not a quality compromise.
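
To make the cleanup step concrete, here is a deliberately crude, rule-based stand-in. It is not how a language model works and not how any particular product is implemented; it only shows the kind of transformation involved. The filler list and the function name `rough_cleanup` are hypothetical.

```python
import re

# Hypothetical filler list for illustration only. A real cleanup model decides
# from context; a fixed list cannot tell a filler "you know" from a literal one.
FILLERS = ("um", "uh", "you know")

# Match a whole-word filler, plus an optional trailing comma and whitespace.
_FILLER_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(f) for f in FILLERS) + r")\b,?\s*",
    flags=re.IGNORECASE,
)


def rough_cleanup(transcript: str) -> str:
    """Strip listed fillers, collapse whitespace, capitalise, add a full stop."""
    text = _FILLER_RE.sub("", transcript)
    text = re.sub(r"\s+", " ", text).strip()
    if not text:
        return text
    text = text[0].upper() + text[1:]
    if text[-1] not in ".!?":
        text += "."
    return text


print(rough_cleanup("um so I think uh we should ship on friday you know"))
```

A rule like this would also delete the “you know” in “do you know the time”, which is exactly why context-aware post-processing does better than pattern matching.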

The speech-to-text software landscape has shifted accordingly: the best tools are no longer transcription engines — they’re writing assistants that happen to accept voice input.

The practical answer: use both

The fastest writers aren’t pure typists or pure dictation users; they switch modes depending on the task. A pragmatic split:

Dictate first drafts, emails, meeting notes, long prose sections, and anything where getting words down fast matters.
Type code, short replies, anything that needs precise formatting, and edits to dictated text.

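Even moving half of the typing sessions where voice would work better over to dictation is a meaningful productivity shift. If you write 2,000 words of prose a day, the raw speed difference alone saves roughly 20 to 30 minutes: about 33 to 40 minutes of typing at 50–60 wpm versus about 12.5 minutes of speaking at 160 wpm.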
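
How much time you actually save depends on your typing speed and on how much of your writing you move to voice. The following is a rough sketch under stated assumptions: a 160 wpm speaking rate (the midpoint used in this article), no allowance for editing time, and a hypothetical helper named `daily_minutes_saved`.

```python
def daily_minutes_saved(
    words_per_day: float,
    typing_wpm: float,
    speaking_wpm: float = 160.0,  # assumed midpoint of the 150-180 wpm range
    dictated_share: float = 1.0,  # fraction of the day's words moved to voice
) -> float:
    """Estimate raw input minutes saved per day by dictating instead of typing."""
    if typing_wpm <= 0 or speaking_wpm <= 0:
        raise ValueError("speeds must be positive")
    if not 0.0 <= dictated_share <= 1.0:
        raise ValueError("dictated_share must be between 0 and 1")
    dictated_words = words_per_day * dictated_share
    return dictated_words / typing_wpm - dictated_words / speaking_wpm


WORDS_PER_DAY = 2000  # the example volume used above

for typing_wpm in (40, 50, 60):
    for share in (0.5, 1.0):
        saved = daily_minutes_saved(WORDS_PER_DAY, typing_wpm, dictated_share=share)
        print(f"typing {typing_wpm} wpm, {share:.0%} dictated: {saved:.1f} min saved")
```

With these assumed inputs, a 60 wpm typist who dictates everything saves about 21 minutes a day and a 50 wpm typist about 27.5; dictating only half of the day’s words halves those figures.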

Where SpeechFlow fits

SpeechFlow is a native macOS app built around this hybrid workflow. Hold Control, speak naturally, release: a cleanup LLM strips fillers, adds punctuation and deposits finished text at your cursor in any app (Mail, Notion, Slack, Google Docs, anywhere). There’s no dictation window and nothing is stored; with BYOK (bring your own key) mode, your audio goes straight to your chosen provider. The free plan covers 2,500 words a week, enough to feel the 3× speed difference without a credit card.

FAQ

Is speaking really 3× faster than typing for everyone?
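The 3× figure compares average speakers (~160 wpm) to average typists (~50 wpm). Against a fast typist at 90 wpm, speech is still nearly twice as fast. The gap narrows as typing skill rises, but on a standard keyboard it rarely closes: even professional typists seldom sustain much more than about 120 wpm. Stenographers are the exception, since chorded stenotype machines let them exceed 200 wpm, but that takes specialised equipment and training.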

What about accuracy — isn’t dictated text full of errors?
Raw speech-to-text from older engines was error-prone. AI-assisted dictation (with a post-processing LLM) reaches accuracy comparable to careful typing for normal prose. The main remaining issues are proper nouns, technical jargon and homonyms, which a quick read-through catches.

Can I dictate code with speech-to-text?
Not efficiently. Natural language and programming syntax don’t map well — brackets, underscores and precise capitalisation are cumbersome to narrate. Voice works well for code comments, commit messages and documentation, but not for actual source code.

Is there a fatigue difference between speaking and typing?
Yes. Sustained keyboard use is associated with repetitive strain injury (RSI) in the hands and wrists, and neck strain from posture. Speaking engages different muscles entirely; vocal fatigue is real but typically takes hours of continuous talking, not the minutes of typing that trigger discomfort for RSI sufferers.

How do I start using voice without disrupting my current workflow?
The lowest-friction entry point is to dictate one type of task — emails are a good start — for a week. Once the habit is established you’ll naturally extend it. A tool like SpeechFlow that inserts text at the cursor means there’s no context-switch: you dictate in the same window you’re already working in.

If you want to try the speed difference yourself, SpeechFlow is free to start — 2,500 words a week, no card required.