Text is convenient. But the voice vs text statistics are remarkably consistent: voice builds stronger bonds, gets misread far less, and carries emotion that words on a screen leave out — and we systematically underestimate all of it.
Last updated: June 2026 · Compiled by the Mindfuse editorial team
Stronger: bonds from voice vs text (Kumar & Epley)
Underestimated: people wrongly expect calls to be awkward
50%: near-chance accuracy at reading tone in email
Prosody: tone and rhythm carry meaning text drops
Voice: largely avoids the "Zoom fatigue" of video
Anxiety: rising phone-call avoidance among the young
Misquoted: "93% nonverbal" is a misreading of Mehrabian
Intimacy: voice builds parasocial closeness (podcasts)
Voice builds stronger bonds than text
Voice calls create significantly stronger social bonds than text — yet people expect the opposite.
Across a series of experiments, Kumar and Epley found that reconnecting by voice (phone or video) made people feel significantly more connected than reconnecting by text, with no meaningful increase in awkwardness. Crucially, participants systematically predicted calls would be more uncomfortable, so they chose text — and missed out on closeness as a result.
Kumar, A. & Epley, N., "It’s Surprisingly Nice to Hear You: Misunderstanding the Impact of Communication Media Can Lead to Suboptimal Choices of How to Connect With Others," Journal of Experimental Psychology: General (2021).
The barrier to calling is a forecasting error, not a real cost.
In the same research, the awkwardness people feared before a call largely failed to materialise. Because we overestimate the discomfort and underestimate the connection, we default to lower-bonding media — a small, fixable misjudgement repeated millions of times a day.
Kumar, A. & Epley, N., Journal of Experimental Psychology: General (2021).
Why text gets misread
People are barely better than chance at conveying tone in text — but feel sure they have nailed it.
Kruger and colleagues showed that email senders were confident their intended tone (e.g. sarcasm vs sincerity) would be understood, yet recipients identified it at close to chance levels. Without voice, senders hear their own intonation in their head and assume the reader hears it too. The reader does not.
Kruger, J., Epley, N., Parker, J. & Ng, Z.-W., "Egocentrism Over E-Mail: Can We Communicate as Well as We Think?," Journal of Personality and Social Psychology (2005).
Prosody — pitch, rhythm, pace, emphasis — carries emotional meaning text simply cannot encode.
A large body of work in psycholinguistics shows that the musical qualities of speech signal emotion, irony, sincerity and emphasis. Stripping them out, as text does, removes a channel the brain evolved to read — which is why the same words can land warmly by voice and coldly in a message.
See research on vocal prosody and emotion recognition (e.g. work building on Banse & Scherer, Journal of Personality and Social Psychology, 1996).
Why voice-only can beat video
Video calls produce "Zoom fatigue" through mechanisms voice-only largely avoids.
Stanford researcher Jeremy Bailenson proposed four drivers of videoconference exhaustion: excessive close-up eye contact, constantly seeing yourself, reduced mobility, and the cognitive load of decoding faces on a grid. Voice-only conversation sidesteps most of these, which is part of why a phone call can feel less draining than a video meeting.
Bailenson, J. N., "Nonverbal Overload: A Theoretical Argument for the Causes of Zoom Fatigue," Technology, Mind, and Behavior (2021).
Adding a face does not always add connection — and can subtract from it.
Because video imposes its own load, voice can be the sweet spot: rich enough to carry tone and turn-taking, light enough to stay comfortable for long, open conversation. For anonymous or vulnerable talk, the absence of a watching face can also make people more candid.
Bailenson, J. N. (2021); Kumar & Epley (2021).
How a generation lost the call — and what voice still does
Younger generations increasingly favour text and report anxiety about phone calls.
Pew’s long-running work on mobile habits shows texting overtaking calling, especially among younger users, while multiple consumer surveys from 2019–2022 report widespread "phone-call anxiety" among Gen Z and younger millennials. Avoidance becomes self-reinforcing: the less we call, the harder calling feels.
Pew Research Center, mobile messaging and calling data; consumer surveys on phone-call anxiety (2019–2022).
Voice builds intimacy on its own — the reason podcasts feel like friendship.
Research on parasocial relationships finds that hearing a voice over time fosters a sense of closeness and trust, even one-directionally. The intimacy of the human voice in the ear is a major reason audio formats like podcasts create feelings of companionship that text rarely does.
Horton, D. & Wohl, R. R., "Mass Communication and Para-Social Interaction," Psychiatry (1956); subsequent parasocial and podcast research.
The "93% of communication is nonverbal" claim is a misquote — but voice still carries far more than words.
Albert Mehrabian’s often-cited 7%–38%–55% figures (words, tone of voice and facial expression, with the last two summing to the popular "93%") applied only to communicating feelings and attitudes when channels conflict; he repeatedly warned against generalising them to all communication. The accurate takeaway is narrower but real: tone of voice conveys a great deal of emotional meaning that plain text loses.
Mehrabian, A., "Silent Messages" (1971); Mehrabian’s own clarifications on misuse of the figures.
For media enquiries, fact-checks or citation requests: [email protected]
Some things are better said out loud.
Mindfuse pairs you with a stranger in another country for an anonymous 1-on-1 voice conversation. Available on iOS and Android.