Behind the voice: an interview with TikTok’s text-to-speech actress

As a radio host for Canada’s 91.5 The Beat, Kat Callaghan struggled to keep her role as TikTok’s new text-to-speech voice under wraps. When her voice appeared on the app last year, listeners started sending her countless TikToks pointing out the similarities. “Have you heard this? It kind of sounds like you,” she recalls listeners saying.

But it wasn’t until last month that Callaghan confirmed what her listeners had long suspected: “Finally, I can tell you guys,” she said in a TikTok video, the first she’d ever posted. “It is me.” The video went viral almost immediately, amassing more than 45 million views since it went live.

Callaghan’s voice, known as “Jessie” in the app, has become a popular way for creators to convey their thoughts without actually saying anything themselves. All creators have to do is type whatever they want to say, and Callaghan’s voice will read it line for line in a cheery manner, no matter how goofy (or profane) the text is.

“Over the last year and a half, it’s been really interesting to see how people use [the voice], especially given that they have options,” Callaghan said in an interview with The Verge.

Callaghan realized she had a knack for voiceovers after her radio listeners started coming forward and asking to use her voice in their projects. “A light bulb went off for me,” Callaghan explained. “And I’m like, ‘Of course, why don’t I do that?’ I wanted to do that more.” In 2015, she decided to start her own voiceover business.

Since launching the business, Callaghan has landed a number of voiceover gigs, including e-learning content, YouTube videos, and commercials, but she says TikTok’s text-to-speech feature is the biggest job she’s had so far. “It’s certainly different than anything else I’ve ever done,” Callaghan said. “It’s been really, really cool working with TikTok.”

When Callaghan’s voice first appeared on the platform last year, users weren’t sure how to react. Some were taken aback at the cheerfulness of the voice (an earlier voice had sounded more monotone), while others began testing the limits of what it could and couldn’t say. Occasionally, the artificial intelligence platform that the text-to-speech tool runs on doesn’t know how to pronounce a particular word or phrase, resulting in absurd or goofy readouts. For example, the voice pronounces “lol” as “lawl,” prompting Callaghan to create a cheeky TikTok video to prove that, yes, she does know how to say it correctly.

“At first, it was so mixed, and there were some negative reactions,” Callaghan said. “So I was perfectly fine kind of sitting back and being like, ‘Let’s see how people use it.’”

Callaghan wasn’t TikTok’s first text-to-speech actor. Last year, TikTok pulled its original voice after the artist behind it, Beverly Standing, claimed the company used her voice without permission. She also took issue with how users could make her voice say anything they want — including curse words and other offensive language — potentially causing her “irreparable harm” in the process. She reached a settlement with TikTok last September, months after Callaghan’s voice had been added to the app as a replacement.

“That’s the part of me that lives in the app”

While users almost immediately started using Callaghan’s voice to say silly things and make some nasty remarks, Callaghan says she has a way of distancing herself from that kind of content, and from the overall weirdness that comes with people having total control of her voice.

“In my mind, I understand that that’s not me — I never said that. So that’s fine if that’s what people want to do on their own platform, but it certainly isn’t anything that I would ever say,” Callaghan explains. “But I can compartmentalize that in a way. That’s just Jessie… That’s the part of me that lives in the app, but it’s not me saying all those things: it’s the content creators that are saying it.”

Callaghan’s voice on TikTok has become so widespread that it’s even become known among some non-TikTok users (myself included). I can’t even count the number of times I’ve heard the Jessie text-to-speech voice in videos sent to me by friends and family, as well as in the viral clips floating around the web. It’s almost become the voice of TikTok, similar to how Siri has become the voice of Apple.

But the “Jessie” voice isn’t just for making fun videos. Text-to-speech is an important accessibility feature that helps users with impaired vision understand a video’s content. There are currently several different text-to-speech options available on TikTok; Jessie is just one of them, but it is, by far, the most lively of the bunch.

“Jessie’s definitely different, and people are obviously hearing that, and they’re using it for that reason: because she is different and she’s upbeat,” Callaghan said. “If you do allow Jessie to speak for you and your videos, I say thank you. I’m happy to do it.”
