Doing Spy Stuff with Mathematica
July 8, 2010 — Jon McLoone , Director, Technical Communication & Strategy
I was reading about the IT problems of the recently arrested, alleged Russian spies, and I wondered if they could have managed secret communications better with Mathematica.
One of the claims was that they were using digital steganography tools that kept crashing. I wanted to see how quickly I could implement digital image steganography in Mathematica using a method known as “least significant bit insertion”.
The idea of steganography is to hide messages within other information so that no one notices your communications. The word itself comes from a Latin-Greek combination meaning “covered writing”, from earlier physical methods that apparently included tattooing a message on a messenger’s head before letting him grow his hair back to hide it. In the case of digital steganography, it is all done in the math.
The first thing that we have to do is to pick an innocent-looking image to transmit to our spy masters, perhaps via some public online forum. This picture wouldn’t look suspicious if posted on a poultry-rearing discussion site…
Amazingly, it is possible to hide another, larger, full-color picture within this image and get it back again with about a dozen lines of Mathematica code.
The key to the whole process is to use the least significant bit in each color channel of each pixel as a place to hide information. We start by clearing that bit to zero regardless of its current value by ANDing each byte with the binary word 11111110.
We have to force Mathematica to use a strictly 8-bits-per-channel integer representation, as it will naturally work in a much more accurate color space—hence the use of “Byte” in various places in our code.
In a sufficiently complex image, the human eye doesn’t see the loss of information. The fact that one color channel on a particular pixel might or might not be darker by 1/256 than before cannot be noticed…
Even if we subtract the truncated version from the original to show only the values of the differences, you still don’t see anything unless you have very good eyes and a very good monitor.
Only by exaggerating these differences to the maximum possible contrast do we see that the differences are not zero.
Next we must convert our secret content into a sequence of bits and insert each of them into these now-empty bits in the carrier. We can put one in each pixel channel; that’s three per pixel in an RGB image.
To start the encoding we use ToCharacterCode to convert the characters into ASCII numbers…
Then we convert those to binary, padding them to a fixed number of bits even if the leading values are zero (or we won’t know where one byte begins and the next ends).
We then flatten that list and pad it with zeros to equal the total number of pixel channel values in which we are able to hide information (image width * height * 3 for RGB images). Now that we have the right number of bits, we reshape the data into the dimensions of the image and add it on to the carrier image with the cleared bits.
Here is all of that put together…
To reverse the process, we will have to pull the least significant bit from every pixel channel, group them into words of 8 bits, and convert back to text.
Let’s test it…
Now reverse the process…
You must export the image in lossless format like PNG, in the original image size. Any conversion to a lossy format, like JPG, or rescaling will destroy information in the low-value bits where we are storing our secret message.
Matilda the chicken is 450*450 pixels and has three color channels. That means that we can store 450*450*3/8 characters. Over 75,000. More than enough to hide the whole of Alice’s Adventures in Wonderland.
The added information is no greater than the information that we removed. The file will still be about the same size and still look the same to the human eye.
In fact, on average, half the pixel channels are the same as the original image, around 1/4 of the pixel channels are one bit lower than the original, and 1/4 are one bit higher, so it is closer to the original than the truncated version.
This gives us a solution to transmitting any information, as long as we can convert it to 8-bit ASCII first. Mathematica provides a set of tools for this, so for extended-character-set transmission we can use this:
We can use conversion to ASCII to insert files or data in special formats. For example, we can insert a picture. Here we import a PNG file and then re-encode it as a string of ASCII in a JPG format.
This string is only around 32,000 characters, thanks to JPG compression, somewhat smaller than the Alice text, and so, amazingly, it is easy to hide in the carrier image. We can export the image to any lossless format; the JPG encoding of the secret part is only important when we come to interpret the decoded string.
Again the carrier image looks the same to us, yet hidden inside is a secret image of “Agent H” and “Agent B”…
None of this is cryptographically secure; it is all about not being noticed transmitting information. Cryptography should be applied to the secret message before inserting it into the carrier image. And of course none of this will help at all if you can’t trust the recipient, are already under surveillance, or are silly enough to ask an undercover FBI agent to fix your laptop!