Susan Bennett is the voice of Siri.

Kevin Grady/Radcliffe Institute

‘Siri, who provided your voice?’

At Radcliffe symposium, actor explains her early role in communicating for iPhone

She’s had conversations with millions of people and most of them have no idea what she looks like.

The woman behind one of the most famous voices in North America visited Harvard on Friday to explain how she became the sound of the first iterations of the Apple iPhone’s virtual assistant, Siri. Instead of a stilted, computerized voice for its newest product in 2011, the high-tech giant opted for the dulcet tones of voice actor Susan Bennett.

“I feel like I know a lot of you already,” Bennett joked with the crowd at the Radcliffe Institute for Advanced Study’s symposium “Beyond Words: Gender and the Aesthetics of Communication,” a daylong conference that examined body communication, and included talks on perfumes, tattoos, sign language, dance, and fashion.

Bennett was part of a panel that explored advances in social robotics and artificial intelligence (AI), a field of computer science set to revolutionize the way we live in the years and decades to come. While certain AI technologies (think self-driving cars) are still in the testing phase, others, like Siri, have become essential parts of our daily lives. Today, in addition to Siri, there is Alexa, Amazon’s virtual assistant that responds to queries and requests when its name is called; Cortana, the virtual assistant created by Microsoft for several Windows platforms; and the nameless Google Assistant.

Bennett, a longtime singer and voice-over artist, said she had no idea she would become the voice of Siri when, several years ago, she began the tedious work of recording scripts “created just for sound and not at all for content or meaning.” Sample lines included phrases such as “cowboys in the cow pod today” and “say the shredding again.”

They were “pretty wacky,” Bennett said of the nonsensical phrases, which took four months to record at a pace of four hours a day, five days a week. But those silly strings of words allowed engineers to “get all of the sound combinations in the language” in a digital format, she said, and were essential to the next phase of the project: concatenation, during which computer programmers rearranged the sounds into new sentences.

“These are what ended up on our phones,” said Bennett, who lamented the fact that the original Siri, known for responding with a dose of attitude, has already been supplanted by a hipper, higher-pitched, and younger-sounding voice, one that likes to reply in a more casual, offhanded manner.

Siri, Bennett said, has “become a millennial.”

Offering up another perspective on the virtual assistant was Noelle LaCharite, a software engineer and a lead developer for Alexa, who reminded listeners that while the ability to converse with a computerized device may seem like magic, it takes thousands of people to make it a reality.

“It’s built on the backs of developers,” who take Alexa’s failure to understand a command or request and “hard-code in the solution,” said LaCharite. As part of a team of coders who helped program Amazon’s voice assistant in its earliest days, LaCharite said she made it her mission to write 800 lines of code for Alexa that resulted in 600 different daily affirmations for users who might be in need of a little pep talk or motivation. “I realized now as storytellers, as really just human beings,” she said, “our job is to be constantly thinking about these infinite ways we should be able to communicate.”

Cynthia Breazeal, associate professor of media arts and sciences in the MIT Media Lab and founder and chief scientist behind the social robot Jibo, said her work has focused on how to design robots that can more effectively communicate with humans and how that communication in turn affects human behavior. Breazeal said her research has found that the more human-like a robot, the better people seem to respond. Her studies have repeatedly shown that children relate better to, and often emulate, a robot that is more expressive with its voice and its movements and that displays a sense of empathy. “I think we are at the beginning of thinking about this world interacting with AI with intention,” said Breazeal.

In a day of discussions devoted to how humans use their bodies to communicate, it seemed only fitting to have a speaker dedicated to fashion. Valerie Steele, director and chief curator of The Museum at the Fashion Institute of Technology, delivered the keynote address, noting that through the centuries clothing has been used to send messages, attract attention, and cover flaws. The corset endured from the 1500s through the mid-20th century, she said, because it was a way for women to show off their bodies’ best features while hiding their worst.

“The corset is a lie but we prefer the lie to the truth,” said Steele, quoting a 19th-century doctor during her talk. Similarly, she credited the business suit’s lasting popularity to the fact that it camouflages “all the myriad flaws in male bodies like corsets were supposed to do in female bodies.”