Wolfram Blog
Carlo Giacometti

New in the Wolfram Language: Audio

September 23, 2016 — Carlo Giacometti, Mathematica Algorithm R&D

I have always liked listening to music. In high school, I started wondering how it is that music seems to be so universally pleasing, and how it differs from other kinds of sounds and noises. I picked up the guitar around the same time, and later, at the University of Trieste, I learned about acoustics and signal processing. Once I began learning to program, the idea of being able to create and process any sound using a computer was liberating. I didn’t need to buy expensive and esoteric gear; I just needed to write some (or a lot of!) code. There are many programming languages that focus on music and sound, but in them, complex operations (such as sampling a number from a special distribution, or simulating random processes) often require a lot of effort. That’s why the audio capabilities in the Wolfram Language are special: the ability to deal with audio objects is combined with all the knowledge and computational power of the Wolfram Language!

First, we needed a brand-new atomic object in the language: the Audio object.

a = Import["http://exampledata.wolfram.com/bach.mp3"]

The Audio object is represented by a playable user interface and stores the signal as a collection of sample values, along with some properties such as sample rate.

In addition to importing and storing every sample value in memory, an Audio object can reference an external object, which means that all the processing is done by streaming the samples from a local or remote file. This allows us to deal with big recordings or large collections of audio files without the need for any special attention.

Held entirely in memory, the uncompressed two-minute Bach piece above takes up almost 50 MB.

ByteCount[a]

47960528

The out-of-core representation of the same file is only a few hundred bytes.

afile = Audio["http://exampledata.wolfram.com/bach.mp3"]

ByteCount[afile]

576

Audio objects can be created using an explicit list of values.

f[t_] := Mod[
   t*BitAnd[BitOr[BitShiftRight[t, 12], BitShiftRight[t, 8]],
     BitAnd[63, BitShiftRight[t, 4]]], 256, -128];
data = Table[f[t], {t, 0, 100000}];
data // Short

Audio[data, "SignedInteger8", SampleRate -> 8000]

Various commonly generated audio signals can be easily and efficiently created using the new AudioGenerator function, ranging from basic waveform and noise models to more complex signals.

Table[Labeled[
  AudioPlot[AudioGenerator[wave, 0.01], PlotTheme -> "Minimal"],
  wave], {wave, {"Sin", "Sawtooth", "White"}}]


The AudioGenerator function also supports pure functions, random processes and TimeSeries as input.
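
The three kinds of input can be sketched as follows. This is an illustrative reconstruction, not the notebook's original code; the decay rate, the ARProcess parameters and the TimeSeries points are arbitrary choices made for the example.

```wolfram
(* Pure function of time (in seconds): a decaying 440 Hz tone *)
fromFunction = AudioGenerator[Exp[-3 #] Sin[2 Pi 440 #] &, 1];

(* Random process: one second of a first-order autoregressive process
   (coefficient and variance are illustrative) *)
fromProcess = AudioGenerator[ARProcess[{0.9}, 0.01], 1];

(* TimeSeries given as {time, value} pairs *)
fromSeries = AudioGenerator[TimeSeries[{{0, 0}, {0.5, 1}, {1, 0}}], 1];

AudioPlot /@ {fromFunction, fromProcess, fromSeries}
```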


Now that we know what Audio objects are and how to create them, what can we do with them?

The Wolfram Language has a lot of native features for audio processing. As an example, we have complex filters at our disposal with very little effort.

Use LowpassFilter to make a recording less harsh.
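
A minimal sketch of the idea, using generated white noise as a stand-in for a harsh recording (the 1000 Hz cutoff is an assumption chosen for illustration):

```wolfram
(* Two seconds of white noise stands in for a harsh recording *)
harsh = AudioGenerator["White", 2];

(* Attenuate content above roughly 1 kHz *)
softer = LowpassFilter[harsh, Quantity[1000, "Hertz"]];

(* Compare the spectrograms before and after filtering *)
Spectrogram /@ {harsh, softer}
```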


WienerFilter can be useful in removing background noise.
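
Something along these lines should work on a synthetic example. The noisy signal here is built by hand, and the filter radius of 20 samples is a guess that would need tuning for a real recording.

```wolfram
(* A clean 440 Hz tone with white noise mixed in at a low level *)
clean = AudioGenerator[{"Sin", 440}, 1];
noisy = clean + 0.1*AudioGenerator["White", 1];

(* Wiener filtering over a neighborhood of 20 samples (illustrative value) *)
denoised = WienerFilter[noisy, 20];

AudioPlot /@ {noisy, denoised}
```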


A lot of audio-specific functionality has been developed for working with Audio objects, ranging from editing (AudioTrim, AudioPad, AudioNormalize, AudioResample) and visualization (AudioPlot, Spectrogram, Periodogram) to special effects (AudioPitchShift, AudioTimeStretch, AudioReverb) and analysis (AudioLocalMeasurements, AudioMeasurements, AudioIntervals).

It is easy to manipulate sample values or perform basic edits, such as trimming.
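
For instance, assuming a generated three-second tone in place of a real recording, trimming and direct sample manipulation might look like this:

```wolfram
tone = AudioGenerator[{"Sin", 440}, 3];

(* Keep only the part between 1 s and 2 s *)
middle = AudioTrim[tone, {1, 2}];

(* Extract the raw samples (one list per channel), halve them,
   and wrap them back into an Audio object at the same sample rate *)
samples = AudioData[tone];
quieter = Audio[0.5*samples, SampleRate -> AudioSampleRate[tone]];

Duration /@ {tone, middle, quieter}
```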

A fun special effect consists of increasing the pitch of a recording without changing the speed.
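
A short sketch with a synthetic sawtooth standing in for the recording; the ratio 1.5 is an arbitrary choice:

```wolfram
voice = AudioGenerator[{"Sawtooth", 220}, 1];

(* Raise the pitch by a ratio of 1.5 while keeping the duration *)
higher = AudioPitchShift[voice, 1.5];

Duration /@ {voice, higher}
```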


And maybe adding an echo to the result.
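
One way to get an echo is AudioDelay, although the original notebook may have done it differently. The delay time (0.25 s) and feedback (0.5) below are illustrative values.

```wolfram
(* A short decaying pluck makes the repeats easy to hear *)
pluck = AudioGenerator[Exp[-20 #] Sin[2 Pi 440 #] &, 2];

(* Repeat the signal every 0.25 s, each repeat at half the previous level *)
echoed = AudioDelay[pluck, 0.25, 0.5];

AudioPlot[echoed]
```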


With a little effort, it is also possible to apply more refined processing. Let’s try to replicate what often happens at the end of commercials: speed up a normal recording without losing words.

We can start by deleting silent intervals.

Delete the silences from the recording.
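
A possible approach, sketched on a synthetic recording with a pause in the middle: find the intervals that are not silent with AudioIntervals, then trim and rejoin them. The RMS threshold of 0.01 is an assumption and would need adjusting for real speech.

```wolfram
(* Tone, half a second of silence, tone *)
tone = AudioGenerator[{"Sin", 440}, 0.5];
gap = AudioGenerator["Silence", 0.5];
rec = AudioJoin[{tone, gap, tone}];

(* Intervals whose local RMS amplitude exceeds the (illustrative) threshold *)
loud = AudioIntervals[rec, #RMSAmplitude > 0.01 &];

(* Keep only those intervals and concatenate them *)
noSilence = AudioJoin[AudioTrim[rec, #] & /@ loud];

Duration /@ {rec, noSilence}
```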


Next, speed up the result using AudioTimeStretch.
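
As an illustration on a generated signal, a stretch factor below 1 shortens the audio without changing its pitch (0.7 is an arbitrary choice):

```wolfram
speech = AudioGenerator[{"Sawtooth", 150}, 2];

(* A factor of 0.7 makes the result about 30% shorter *)
faster = AudioTimeStretch[speech, 0.7];

Duration /@ {speech, faster}
```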


To make the result sound less dry, we can apply some reverberation using AudioReverb.
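
With default settings this is a single call. The dry signal below is a synthetic placeholder for the sped-up recording.

```wolfram
(* A short, dry pluck-like sound *)
dry = AudioGenerator[Exp[-20 #] Sin[2 Pi 330 #] &, 1];

(* Apply reverberation with the default parameters *)
wet = AudioReverb[dry];

AudioPlot /@ {dry, wet}
```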


Much of the processing can be done by using the Wolfram Language’s arithmetic functions; all of them work seamlessly on Audio objects. This is all the code we need for amplitude modulation.
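
A sketch of amplitude modulation with plain arithmetic, assuming a 440 Hz carrier and a 4 Hz modulator (both frequencies are illustrative):

```wolfram
carrier = AudioGenerator[{"Sin", 440}, 2];
modulator = AudioGenerator[{"Sin", 4}, 2];

(* Rescale the modulator from [-1, 1] to [0, 1], then multiply sample by sample *)
modulated = carrier*(1 + modulator)/2;

AudioPlot[modulated]
```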


Or you can do a weighted average of a list of recordings.
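
For example, with three generated tones standing in for the recordings and hypothetical weights:

```wolfram
recordings = AudioGenerator[{"Sin", #}, 1] & /@ {220, 330, 440};
weights = {0.5, 0.3, 0.2};

(* Scale each recording by its weight, sum them, and normalize by the total weight *)
mix = Total[weights*recordings]/Total[weights];

AudioPlot[mix]
```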


A lot of the analysis tasks can be made easier by AudioLocalMeasurements. This function can automatically compute a collection of features from a recording. Say you want to synthesize a sound with the same pitch and amplitude as a recording.


AudioLocalMeasurements makes the extraction of the fundamental frequency and the amplitude profile a one-liner.
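
Roughly, and using a synthetic decaying tone in place of the recording, the two measurements can be extracted like this (each call returns a time series of locally computed values):

```wolfram
(* A 220 Hz tone with a decaying amplitude envelope *)
rec = AudioGenerator[Exp[-2 #] Sin[2 Pi 220 #] &, 1];

(* Local estimates of the fundamental frequency and the RMS amplitude *)
f0 = AudioLocalMeasurements[rec, "FundamentalFrequency"];
amp = AudioLocalMeasurements[rec, "RMSAmplitude"];

ListLinePlot /@ {f0, amp}
```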

Using these two measurements, one can reconstruct pitch and amplitude of the original signal using AudioGenerator.


We get a huge bonus by using the results of AudioLocalMeasurements as an input to any of the advanced capabilities the Wolfram Language has in many different fields.

Potential applications include machine learning tasks like classifying a collection of recordings.

And then there’s 3D printing! Produce a 3D-printed version of the waveform of a recording.

You can get an idea of the variety of applications at Wolfram’s Computational Audio page, or by looking at the audio documentation pages and tutorials.

Sounds are a big part of everyone’s life, and the Audio framework in the Wolfram Language can be a powerful tool to create and understand them.

Posted in: Developer Insights

5 Comments


Martin

Could you demo going through an audio file and pairing the distinct speakers with the locations of their blurbs?

Posted by Martin    September 23, 2016 at 2:25 pm
Michael

Carlo

Thanks for an interesting post. Is it possible to get a copy of this notebook in either .nb or .cdf format? It would be interesting to play around with the examples you have provided.

Michael

Posted by Michael    September 23, 2016 at 2:49 pm
BreezeMaxWeb

Great article on audio objects! I have a quick question: you can take weighted averages, yes, but can you use AudioLocalMeasurements to weight each audio value differently? Or does it have to be an average of all the sounds?

Thanks

Posted by BreezeMaxWeb    September 29, 2016 at 11:23 am
Christopher Purcell

This is an impressive array of tools. Well done!
One more would really complete this set – Mathematica needs a robust means of recording Audio. The only tool currently documented (SystemDialogInput["RecordSound"]) is not completely stable, and lacks the programmatic controls to enable precision measurements. With a robust Recording capability, your Audio tool kit could be widely used in physics and engineering.

Posted by Christopher Purcell    September 30, 2016 at 2:24 pm
Adalberto Schuck Jr.

Nice package. An extra incentive to use Mathematica with numerical audio processing as well.

Posted by Adalberto Schuck Jr.    October 24, 2016 at 4:59 am

