Searching Genomes with Mathematica and HadoopLink
Editorial note: This post was written by Paul-Jean Letourneau as a follow-up to his post Mathematica Gets Big Data with HadoopLink.
In my previous blog post, I described how to write MapReduce algorithms in Mathematica using the HadoopLink package. Now let's go a little deeper and write a more serious MapReduce algorithm.
I've blogged in the past about some of the cool genomics features in Wolfram|Alpha. You can even search the human genome for DNA sequences you're interested in. Biologists often need to find where DNA fragments from the lab occur in known genomes, to learn which organism a fragment belongs to or which chromosome it comes from. Let's use HadoopLink to build a genome search engine!