Exploding Art: da Vinci Code of Another Sort
April 12, 2013 — Vitaliy Kaurov, Academic Director, Wolfram Science and Innovation Initiatives
What does programming have to do with a passion for the arts and history? Well, if you turn education into a game and add a bit of coding, then you can easily end up in the realm of a modern paradigm called, fancily, “gamification.” Though gamification is a very wide concept based on game use in non-game contexts (design, security, marketing, even protein folding, you name it), at heart it is very simple: play, have fun, and get things done. I may have oversimplified things here for the sake of a rhyme, but if you bear with my lengthy prelude, we may just see a simple case of turning passion into software.
My obsession with diagrams and simple line drawings began almost unnoticeably in the winter of 2003 in New York City after attending an exhibition at The Metropolitan Museum of Art: “the first comprehensive survey of Leonardo da Vinci’s drawings ever presented in America.” You may think it’d be a drag—crowds marching very slowly in a single long line coiling through the exhibition hallways. But perception of time transforms when you stare at 500-year-old craft. I think it was then that it started to dawn on me what special value a first sketch has. A first act when an idea, something very subjective, evasive, living solely inside one’s mind, materializes as a solid reality, now perceivable by another human being. Imagine it happened ages ago. Wouldn’t you be curious what was going on at that moment in time, what got frozen in this piece of craft in front of you?
People have been expressing things visually since time immemorial. The oldest examples we can find, such as cave paintings, date back some 40,000 years. And we still keep doodling on napkins, sand, and touch screens, solidifying ideas. There is a tremendous body of work, celebrated or little known, that has accumulated through the ages in libraries, antique book shops, private collections, ruins, burial sites, and forgotten attics. These abiding footprints of imagination possess a striking visual power that has outlived their creators. They fuse art and history because the enigma of their stories is no less elegant than their art. I am being intentionally dramatic (you’ll miss it when we get to coding), because art is a tricky thing. How do you share with others a fascination with subjective matters, an understanding gained through time and experience? How do you engage others and transfer your perception? Well, maybe you can’t do it directly, but you can still orchestrate how others encounter art. For example, instead of simply showing an image, you can turn it into a discovery process. An image is a mystery until we find a meaning and learn its story. I wonder if this can be a metaphor for how an image is presented. Can we tease and intrigue beholders with a conundrum and set them on a quest? The way we comprehend art echoes the way the art was created in the first place. Everything is a process, everything has a story, even merely finding an interesting image lost in the pages of an old manuscript. By analogy, we turn presentation into a process, so that an encounter with an image becomes more personal. A longer positive experience brings deeper immersion. Jigsaw puzzles are a perfect example.
So now that we’re done with drama and profundities, how can we get to action? Recently I was lucky to come across the brilliant iOS game Blueprint3D by FDG Entertainment, which has been critically acclaimed and praised for its originality. The app shatters blueprint drawings into myriad pieces scattered almost randomly in 3D space; “almost” because there is a single correct perspective from which the fragments suddenly assemble back into the original drawing. The gameplay is to find this perspective by rotating a 3D cloud of scattered pieces. The images seem to have been created by the company’s developers in the style of simple technical blueprints. While I was indulging in several game rounds, it suddenly struck me: this would make a perfect game for the “actual” historical drawings that I enjoy so much! And it would be wonderful to have a short story behind the images in the game too. I had no idea what algorithms the app used or how to get around the many challenges of full app development. But I really wanted to make it work. And what do you do when you just can’t wait for something to happen?
I launch Mathematica. For me it’s the shortest path between A and B, idea and result, a universal beeline for almost every task. And so, without further ado, below is the result of my implementation in Mathematica of Blueprint3D’s idea. It is far from its final polished form, but was achieved in a surprisingly short time (less than an hour) and with a small amount of code (see the attached notebook). Rotate 3D graphics with your mouse until you reveal a hidden image. The 3D graphic object has different rotation modes depending on whether you drag it with your mouse from the middle or from the corners. The “hint” control can help you to navigate in the space of 3D rotations: the puzzle is solved when the green and red dots overlap. Though “hint” is very approximate, it helps to understand if the correct up-down and left-right orientation of the image is found. It is important especially when images have text.
Please let me know in the comments which images you found to be most interesting. And now let’s go over major ideas and challenges of this app.
The game mechanics
So how does it work? We start by texture mapping a square image onto a large square polygon drawn in 3D space. The whiter a pixel is in the original image, the more transparent it is on the polygon. We then subdivide the square polygon into hundreds of small random polygons with the help of a Voronoi tessellation, which covers the square without gaps. It is very easy to get a Voronoi tessellation in Mathematica, and I will use the following function:
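As a rough illustration of the two steps just described, and not the Mathematica function used for the game, the following Python sketch may help. Both names are hypothetical: `opacity_from_gray` makes whiter pixels more transparent, and `voronoi_labels` builds a brute-force, pixel-level Voronoi partition by assigning every pixel to its nearest random seed. Because each pixel always has a nearest seed, the cells cover the square without gaps.

```python
import random


def opacity_from_gray(gray):
    """Map a gray level in [0, 255] to an opacity in [0, 1].

    White (255) becomes fully transparent, black (0) fully opaque.
    """
    return 1.0 - gray / 255.0


def voronoi_labels(width, height, seeds):
    """Label every pixel of a width x height grid with the index of its nearest seed."""
    labels = []
    for y in range(height):
        row = []
        for x in range(width):
            nearest = min(
                range(len(seeds)),
                key=lambda i: (x - seeds[i][0]) ** 2 + (y - seeds[i][1]) ** 2,
            )
            row.append(nearest)
        labels.append(row)
    return labels


# Illustrative values only: 12 random seeds on a 32 x 32 "image".
rng = random.Random(0)
seeds = [(rng.uniform(0, 32), rng.uniform(0, 32)) for _ in range(12)]
labels = voronoi_labels(32, 32, seeds)
```

Each distinct label in `labels` corresponds to one fragment. A real implementation would work with polygon outlines rather than pixel labels, so this is only meant to show why the tessellation leaves no uncovered area.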
The next step is to apply random geometric transformations to each small polygon. We will stretch, rotate, and scatter them around in 3D space in almost perfect chaos, but with one restriction: there is a single direction in space along which things happen in a similar way for all polygons. This is where we will hide the image. Now for a few more details.
The age of interactive blogging
The technology behind this app is CDF, the Computable Document Format, which we embedded in the blog web page. Electronic publishing is old news. Please meet interactive blogging. We’ve been practicing this on the Wolfram Blog for a while already, but it is truly a new paradigm. It gives a blogger the freedom to sculpt ideas not only in writing but also with an app, a puzzle, or a discovery tool, and so to engage readers in a new dimension. CDF is powerfully flexible. For instance, I wanted to keep an expandable library of images that I could add to and remove from without having to redeploy the CDF app on the site every time. So I decided to store the images and their descriptions on a server and have the app access them via URLs. Such advanced CDF deployment features are available in Mathematica Enterprise Edition.
The magic of HistogramTransform
As I mentioned already, I needed a pure white image background so that it would be transparent in the chaotic cloud of polygons. But the images I found online had various non-white backgrounds, especially the antique ones, which had heavy yellowish-gray backgrounds like the top images in this post. And though I could convert them to gray levels, the opacity of the background was a problem. It needed to be completely transparent for the game to be challenging, so that one can see through the cloud and get visually overwhelmed by a random mixture of line fragments. If the background is opaque, then the polygon boundaries of the fragments are visible, and the puzzle becomes much easier and less interesting to solve. After trying to code a few image processing techniques for background removal, I realized that, as often happens in Mathematica, there is a suitable function for that. In the latest version we introduced the almost magical HistogramTransform, which can make one image look like another by matching their histograms. The following code compares an original image, the same image converted to “Grayscale,” and the image converted via HistogramTransform to look like a sample image:
You can clearly see that simple color conversion to “Grayscale” is not enough to remove heavy background residue. But if we find another image (here the celebrated Benson Lossing Reverse) with a mostly clean background, then the result after HistogramTransform is incredibly clear. It also nicely preserves various intensities of gray levels corresponding to different pencil pressure or ink concentration in the paper. This simple “statistical” method based on probability distributions for image pixels is much better than my custom coding based on blunt morphological and threshold operations. White backgrounds will become transparent in a 3D fragment cloud.
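To make the idea of matching histograms concrete, here is a minimal Python sketch of classical histogram matching through cumulative distributions. It is an assumption about the general technique, not a description of how HistogramTransform is implemented internally, and `cdf` and `match_histogram` are hypothetical names. Images are represented simply as flat lists of integer gray levels.

```python
from bisect import bisect_left


def cdf(values, levels=256):
    """Cumulative distribution of integer gray levels 0..levels-1."""
    counts = [0] * levels
    for v in values:
        counts[v] += 1
    total = len(values)
    running, out = 0, []
    for c in counts:
        running += c
        out.append(running / total)
    return out


def match_histogram(source, reference, levels=256):
    """Remap source gray levels so their distribution resembles the reference's."""
    src_cdf = cdf(source, levels)
    ref_cdf = cdf(reference, levels)
    # For each source level, pick the smallest reference level whose
    # cumulative share of pixels reaches the source level's share.
    lookup = [
        min(bisect_left(ref_cdf, src_cdf[g]), levels - 1) for g in range(levels)
    ]
    return [lookup[v] for v in source]


# Toy data: a yellowish-gray "paper" level (180) with some ink (60),
# matched against a clean sample with a white background (255) and dark ink (20).
source = [180] * 90 + [60] * 10
reference = [255] * 90 + [20] * 10
cleaned = match_histogram(source, reference)
```

In this toy case the 90% of pixels that form the dull background are pushed to pure white, while the ink pixels are mapped to the reference's ink level, which is the effect the puzzle needs before white is turned into transparency.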
Hiding power of ShearingTransform
The main trick in this application is to figure out how to hide the original 2D image in a single 3D direction, or line of sight, among infinitely many others. I had no knowledge of the algorithm behind Blueprint3D and had to come up with my own. I suspected that something like ShearingTransform was the key. This particular geometric transformation function in Mathematica has quite a few parameters. In the code below I illustrate their action. We can start from any 3D graphics object, in this case a hexagon, and transform it in such a way that it stretches only along a particular direction (red arrow), perpendicular to some “normal” (yellow arrow), and by a certain “shear” angle. There is also a point we can choose, marked green, that always stays fixed. Try playing with the various controls and rotating the 3D graphics. This transformation has an interesting hiding property: if you look directly from above or below (parallel to the red arrow) from far away, all sheared polygons will look the same, hiding behind each other.
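The hiding property can be checked numerically. The Python sketch below is an illustration under a stated assumption: a shear by angle θ along a unit direction v, with unit normal n perpendicular to v and fixed point p, moves a point x to x + tan(θ) ((x − p)·n) v. The helper names are hypothetical. Since every point moves only along v, a view along v with no perspective cannot tell the sheared object from the original.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    length = math.sqrt(dot(v, v))
    return tuple(x / length for x in v)


def shear(point, theta, direction, normal, fixed=(0.0, 0.0, 0.0)):
    """Shear `point` by angle theta along `direction`.

    The shift grows with the point's signed distance from the plane through
    `fixed` with the given `normal`; `direction` must be perpendicular to `normal`.
    """
    v, n = normalize(direction), normalize(normal)
    height = dot(tuple(p - f for p, f in zip(point, fixed)), n)
    shift = math.tan(theta) * height
    return tuple(p + shift * vi for p, vi in zip(point, v))


def drop_along(point, direction):
    """Orthographic view along `direction`: remove the component parallel to it."""
    v = normalize(direction)
    along = dot(point, v)
    return tuple(p - along * vi for p, vi in zip(point, v))


view = (0.0, 0.0, 1.0)  # plays the role of the red arrow
corner = (1.0, 2.0, 3.0)
sheared = shear(corner, math.radians(35), view, normal=(1.0, 0.0, 0.0))

# The point really moved in 3D, but its orthographic view along `view` did not.
before = drop_along(corner, view)
after = drop_along(sheared, view)
```

Changing the angle or the normal changes `sheared` but never `after`, which is why fragments sheared in many different ways can still share one line of sight.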
Enigma of a shattered image
The red arrow designates that unique single perspective where we will hide the 2D image. If we break an image into fragments and shear them under different angles and normals, but keep the shearing direction the same, the view from below or above will give us the whole unbroken image. We can illustrate this with the app below. The polygon subdivision was done with the help of Voronoi diagrams. Try the controls and rotate the 3D graphics. As you can see, we also displaced the polygons to random heights. Note how looking from “very close above” leaves unhealed gaps between the polygons, and how these gaps heal when the height is increased. This explains intuitively why we should look “from far” in order not to see the gaps. It relates to the fact that during shearing, all points of an object move along parallel lines.
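Why the gaps heal with distance can be seen with a little arithmetic. The Python sketch below assumes a simple pinhole camera on the axis of the hidden direction (taken here to be the z axis) and hypothetical helper names; it is not the rendering model of the app. A point lifted to height z appears at its (x, y) position scaled by distance / (distance − z), so its on-screen drift shrinks as the camera moves away.

```python
import math


def perspective_xy(point, distance):
    """Project a 3D point onto the plane z = 0 as seen by a pinhole camera
    on the z axis at height `distance`, looking down (requires distance > z)."""
    x, y, z = point
    scale = distance / (distance - z)
    return (x * scale, y * scale)


def max_gap(points, distance):
    """Largest on-screen drift of any point from its undisplaced (x, y) position."""
    worst = 0.0
    for x, y, z in points:
        px, py = perspective_xy((x, y, z), distance)
        worst = max(worst, math.hypot(px - x, py - y))
    return worst


# Illustrative fragment corners, each displaced to a different height z.
points = [(1.0, 0.5, 0.8), (-0.7, 0.2, -0.6), (0.3, -0.9, 0.4)]
gaps = [max_gap(points, d) for d in (3.0, 30.0, 300.0)]
```

The values in `gaps` decrease as the viewing distance grows, and in the limit of an infinitely distant viewer the parallel displacement lines project to single points, so the image closes up completely.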
You may have noticed that in the game (the very first CDF app above), fragments are scattered inside a cubic volume. This is the simplest solution, but unfortunately the sharp edges of the cube reveal its orientation. To make the puzzle more challenging, we gave the red cloud above a spherical shape, so there are no edges.
When in trouble, dot multiply
I don’t think this game is difficult, especially after some practice. Yet I think some people, especially kids, would appreciate a bit of help, at least sometimes. This is why I added the “hint” option. When the red and green spots of the hint overlap, you are close to the solution. It is not a perfect hint, but it does help you to see whether the image is flipped from left to right or upside down, and how far you are from a solution. How does this work? If no panning or zooming is involved, just rotation, the orientation of an object in 3D space is controlled by two properties: ViewPoint and ViewVertical. Both are three-element vectors, or lists. While we rotate a 3D object, the elements of these lists change. When a vector is aligned with a specific direction in 3D space, its normalized dot product with that direction equals 1. Any other alignment produces a smaller number. So these two dot products can be interpreted as the coordinates of a point in a 2D plane. The simplest example is shown below: rotate the 3D cube on the left to understand the relation to the 2D plane.
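The arithmetic behind the hint fits in a few lines. The following Python sketch is only an illustration of the dot-product idea described above, with hypothetical function names and a made-up tolerance; the vectors stand in for the current and the target values of ViewPoint and ViewVertical.

```python
import math


def normalized_dot(a, b):
    """Cosine of the angle between a and b: 1.0 when they point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def hint_point(view_point, view_vertical, target_point, target_vertical):
    """2D hint coordinates: one normalized dot product per view property."""
    return (
        normalized_dot(view_point, target_point),
        normalized_dot(view_vertical, target_vertical),
    )


def is_solved(hint, tolerance=0.02):
    """The puzzle counts as solved when both coordinates are close to 1."""
    return all(1.0 - c <= tolerance for c in hint)


target_point, target_vertical = (0.0, 0.0, 1.0), (0.0, 1.0, 0.0)

# Looking along the hidden direction with the right side up.
aligned = hint_point((0.0, 0.0, 5.0), (0.0, 1.0, 0.0), target_point, target_vertical)

# Same line of sight, but the image is upside down.
flipped = hint_point((0.0, 0.0, 5.0), (0.0, -1.0, 0.0), target_point, target_vertical)
```

Here `aligned` sits at the target corner of the hint plane, while `flipped` has its second coordinate at the opposite extreme, which is how the hint reveals an upside-down image even when the line of sight is already correct.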
I keep getting surprised at how easily we can solve various challenges just because Mathematica always has something up its sleeve—it is an amazing collection of algorithms. You can find the complete code of the game in the attached CDF file. No doubt there are many possible improvements. Let me know in the comments about any suggestions, ideas, or your favorite images. I suspect this game can be quite beneficial for kids, giving basic notions about the relation between 2D and 3D space and some history lessons.