Benedict Cumberbatch Can Charm Humans, but Can He Fool a Computer?

November 26, 2014

The Imitation Game, a movie portraying the life of Alan Turing (who would have celebrated his 100th birthday on Mathematica’s 23rd birthday; read our blog post), was released this week, and we’ve been looking forward to it. Turing machines are one of the focal points of the movie, and in 2007 we launched a prize to determine whether the 2,3 Turing machine was universal.

So of course, Cumberbatch’s promotional video, in which he impersonates other beloved actors, reached us as well. That got me wondering: could Mathematica’s machine learning capabilities recognize his voice, or could he fool a computer too?

I personally can’t stop myself from chuckling uncontrollably while watching his impressions; however, I wanted to look beyond the entertainment factor.

So I started wondering: Is he actually good at doing these impressions? Or are we all just charmed by his persona?

Is my psyche just being fooled by the meta-language, perhaps? If we take the data of pure voices, does he actually cut the mustard in matching these?

To determine the answer 10 years ago, we would have needed to stroll the streets and play 300 people audio snippets from the James Bond movies, The Shining, and Batman, along with Cumberbatch’s impressions, and then survey whether those people were fooled.

But there’s no need for that if you have Mathematica handy!

With Mathematica’s machine learning capabilities, it’s possible to classify sample voice snippets easily, which means we can determine whether Benedict’s impressions would be able to fool a computer. So I set myself the challenge of building a decent enough database of voice samples, took snippets from each of Benedict’s impression attempts, and let Mathematica do its magic.

We built a path to each person’s snippet database, which Mathematica exported for analysis:

[Image: Classify sample voice snippets]

We imported all of the real voices:
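The code for these two steps is not reproduced in this text. As a rough sketch only, building the paths and importing the files might look something like the following. The folder layout (one subfolder of WAV files per actor), the base folder name, and the helper names `snippetFiles` and `loadVoices` are assumptions made for illustration, not the code from the original post.

```wolfram
(* Assumed layout (hypothetical): baseDir/<actor name>/*.wav *)
snippetFiles[baseDir_String, actor_String] :=
  FileNames["*.wav", FileNameJoin[{baseDir, actor}]]

(* Build <|actor -> {imported snippet, ...}, ...|> for a list of actors *)
loadVoices[baseDir_String, actors : {__String}] :=
  AssociationMap[
    Function[actor, Import /@ snippetFiles[baseDir, actor]],
    actors]

(* Usage, with a hypothetical base folder called "voices" *)
realVoices = loadVoices["voices", {"Alan Rickman", "Benedict Cumberbatch"}];
```

An association keyed by actor is convenient here because it is one of the input forms that Classify accepts directly.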

The classifier was trained simply by providing the associated real voices to Classify; in the interest of speed, a pre-trained ClassifierFunction was loaded from cfActorWDX.wdx:

[Image: Classifier was trained simply by providing the associated real voices to Classify]
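To give a feel for the shape of this step, here is a minimal sketch that trains a classifier on toy data. The two-number lists stand in for audio snippets and the class names are made up, so this only illustrates how Classify is called, not how the actual voice classifier was built.

```wolfram
(* Toy stand-in data: short numeric lists instead of audio.
   Class names and values are invented for illustration. *)
training = <|
   "Actor A" -> {{0.9, 0.1}, {0.8, 0.2}, {1.0, 0.0}, {0.85, 0.1}, {0.95, 0.2}},
   "Actor B" -> {{0.1, 0.9}, {0.2, 0.8}, {0.0, 1.0}, {0.1, 0.85}, {0.2, 0.95}}
   |>;

(* Classify accepts an association of class -> examples and
   returns a ClassifierFunction *)
classifier = Classify[training];

(* Most likely class for a new example *)
classifier[{0.9, 0.15}]

(* Probability assigned to each class for the same example *)
classifier[{0.9, 0.15}, "Probabilities"]
```

The "Probabilities" property returns an association of class to probability, which is the kind of per-class breakdown discussed in the results further down.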

My audio database needed to include snippets of Benedict’s own voice, snippets of the impersonated actors’ own voices, and the impressions from Cumberbatch. The sources for the training were the following: Alan Rickman, Christopher Walken, Jack Nicholson, John Malkovich, Michael Caine, Owen Wilson, Sean Connery, Tom Hiddleston, and Benedict Cumberbatch. I used a total of 560 snippets, but of course, the more data used, the more reliable the results. The snippets needed to be as “clean” as possible (no laughter, music, chatter, etc. in the background).

The snippets all needed to be exactly the same length (3.00 seconds), which we enforced with this function in the Wolfram Language:

[Image: Making sure snippets are all same length]
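The helper that was actually used, soundPartition, is posted by Martin Hadley in the comments. The underlying idea can be sketched on a plain list of samples, as below; `fixedLengthChunks` is a hypothetical name, and the sketch assumes a mono signal with an integer sample rate.

```wolfram
(* Hypothetical helper: split a list of samples into consecutive,
   non-overlapping chunks of exactly `seconds` seconds each.
   Samples left over at the end that do not fill a chunk are dropped. *)
fixedLengthChunks[samples_List, rate_Integer?Positive, seconds_?Positive] :=
  Partition[samples, Round[seconds*rate]]

(* Toy example: 10 samples at 2 samples per second, 1.5-second chunks *)
fixedLengthChunks[Range[10], 2, 1.5]
(* {{1, 2, 3}, {4, 5, 6}, {7, 8, 9}} *)
```

Dropping the remainder, rather than padding it, means every chunk passed to the classifier has the same number of samples.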

Some of the files weren’t single-channel audio, so during the export stage we needed to remove the number of channels as an additional feature in order to optimize our results:

[Image: Excluding single-channel audio files]
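One common way to remove the channel count as a variable is to mix every file down to a single channel by averaging. Whether the original code did exactly this is not stated, so treat the following as an assumption; `toMono` is a hypothetical name, and the input is a list of equal-length channels.

```wolfram
(* Hypothetical helper: average the channels sample by sample.
   Mean of a list of equal-length lists works element-wise. *)
toMono[channels : {__List}] := Mean[channels]

(* Two-channel toy signal *)
toMono[{{2, 4, -6}, {0, 2, 2}}]
(* {1, 3, -2} *)

(* A signal that is already single-channel comes back unchanged *)
toMono[{{1, 2, 3}}]
(* {1, 2, 3} *)
```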

Thanks go to Martin Hadley and Jon McLoone for the code.

Drum-roll… time for the verdict!

I have to break everyone’s heart now, and I’m not sure I want to be the one to do it… so I will “blame” Mathematica, because machine learning could indeed mostly tell the difference between the actors’ real voices and the impressions (bar two).

As the results below reveal, Mathematica provides 97–100% confidence on the impressions tested:

[Image: Mathematica provides 97–100% confidence on the impressions tested]

For most impressions, there is a very small reported probability of any classification other than Benedict Cumberbatch or Alan Rickman.

It might be worth noting that Rickman, Connery, and Wilson all have a slow rhythm to their speech, with many pauses (especially noticeable in the snippets I used), which could have confused the algorithm.

[Image: Sad Benedict Cumberbatch]

Now it’s time to be grown up about this, and not hold it against Benedict. He is still a beloved charmer, after all.

My admiration for him lives on, and I look forward to seeing him in The Imitation Game!

Download the accompanying code for this blog post as a Computable Document Format (CDF) file.

Rita Crook, Marketing Projects Manager, European Sales

Comments

13 comments

Great post, I was definitely charmed by his persona.

Jane

November 26, 2014 at 10:52 am
now THIS is truly beyond fantastic :)

Mikey

November 26, 2014 at 12:24 pm
Fascinating work!

Would love to see this done with someone who actually does/did impressions for a living. Your Rich Littles, your Frank Gorshins, your Andre-Philippe Gagnons.

That would be impressive!

Thanks

Paul Thomson

November 26, 2014 at 1:06 pm
Might have to go and watch the film now! This Cumberbatch fella seems decent.

Riccardo

November 27, 2014 at 5:40 am
How did you get rid of the clock ticking noise? Or did you? Could an algorithm not be set on finding that single “chime” in each sample to ID Cumberbatch?

Luke Stanley

November 28, 2014 at 9:17 pm
Now what would be truly impressive is: If Mathematica could tell why BC has such wonderful hair in all his flicks, and his hair looks like a total dork in real life.

YouRang

January 22, 2015 at 7:45 pm
This is a fantastic use for Mathematica. I love it. And am very impressed.

Jennifer

January 22, 2015 at 7:53 pm
I am charmed by your wonderful idea to use Mathematica for such a cool set of voice recognition tests! Great idea. And excellent execution. My hat is off to YOU!

David D-VA

January 23, 2015 at 10:50 am
Resurrecting an old post, but it appears that the code for soundPartition isn’t listed in either the post or the attached CDF article. Do you mind pointing me to it? Or was soundPartition renamed to soundTake?

David Koslicki

September 7, 2015 at 4:07 pm
- Only a year late…

  (* Extract the samples between low and high (in seconds) from every channel *)
  soundTake[Sound[SampledSoundList[data_List, rate_], opts2___],
     {low_?NumberQ, high_?NumberQ}] :=
   Block[{
     lov = Floor[low*rate] + 1,
     (* Floor[high*rate] keeps hiv within the data and gives exactly
        (high - low)*rate samples when that product is an integer *)
     hiv = Floor[high*rate]},
    Sound[SampledSoundList[Take[#, {lov, hiv}] & /@ data, rate], opts2]]

  (* Duration in seconds, measured on the first channel *)
  soundDuration[Sound[SampledSoundList[data_, rate_], opts___]] :=
   Length[First[data]]/rate;

  (* A random n-second excerpt *)
  randomSoundTake[source_Sound, n_?NumberQ] :=
   Block[{duration = soundDuration[source], start},
    start = RandomReal[{0, duration - n}];
    soundTake[source, {start, start + n}]]

  (* Consecutive, non-overlapping n-second pieces; any remainder is dropped *)
  soundPartition[source_Sound, n_?NumberQ] :=
   Block[{duration = soundDuration[source]},
    Table[soundTake[source, {i, i + n}],
     {i, 0, n Floor[duration/n] - n, n}]]

  Martin Hadley

  November 26, 2015 at 10:45 am
