User Research: Deep Learning for Gravitational Wave Detection with the Wolfram Language
Daniel George is a graduate student at the University of Illinois at Urbana-Champaign, Wolfram Summer School alum and Wolfram intern whose award-winning research on deep learning for gravitational wave detection recently landed in the prestigious pages of Physics Letters B in a special issue commemorating the Nobel Prize in 2017.
We sat down with Daniel to learn more about his research and how the Wolfram Language plays a part in it.
How did you become interested in researching gravitational waves?
This was actually a perfect choice in my research area, and the timing was perfect, since within one week after I joined the group, there was the first gravitational wave detection by LIGO, and things got very exciting from there.
I was very fortunate to work in the most exciting fields of astronomy as well as computer science. At the [NCSA] Gravity Group, I had complete freedom to work on any project that I wanted, and funding to avoid any teaching duties, and a lot of support and guidance from my advisors and mentors who are experts in astrophysics and supercomputing. Also, NCSA was an ideal environment for interdisciplinary research.
Initially, my research was focused on developing gravitational waveform models using post-Newtonian methods, calibrated with massively parallel numerical relativity simulations using the Einstein Toolkit on the Blue Waters petascale supercomputer.
These waveform models are used to generate templates that are required for the existing matched-filtering method (a template-matching method) to detect signals in the data from LIGO and estimate their properties.
However, these template-matching methods are slow and extremely computationally expensive, and not scalable to all types of signals. Furthermore, they are not optimal for the complex non-Gaussian noise background in the LIGO detectors. This meant a new approach was necessary to solve these issues.
Your research is also being published in Physics Letters B—that must be pretty exciting…
My article was featured in the special issue commemorating the Nobel Prize in 2017.
Even though peer review is done for free by referees in the scientific community and the expenses to host online articles are negligible, most high-profile journals today are behind expensive paywalls and charge thousands of dollars for publication. However, Physics Letters B is completely open access to everyone in the world for free and has no publication charges for the authors. I believe all journals should follow this example to maximize scientific progress by promoting open science.
This was the main reason why we chose Physics Letters B as the very first journal where we submitted this article.
You recently won an award at SC17 for your work—how was your demo received?
I think the attendees and judges found this very impressive, since it was connecting high-performance parallel numerical simulations with artificial intelligence methods based on deep learning to enable real-time analysis of big data from LIGO for gravitational wave and multimessenger astrophysics. Basically, this research is at the interface of all these exciting topics receiving a lot of hype recently.
Deep learning seems like a novel approach. What led you to explore this?
I was always interested in artificial intelligence since my childhood, but I had no background in deep learning or even machine learning until November 2016, when I attended the Supercomputing Conference (SC16).
There was a lot of hype about deep learning at this conference, especially a lot of demos and workshops by NVIDIA, which got me excited to try out these techniques for my research. This was also right after the new neural network functionality was released in Version 11 of the Wolfram Language. I already had the training data of gravitational wave signals from my research with the NCSA Gravity Group, as mentioned before. So all these came together, and this was a perfect time to try out applying deep learning to tackle the problem of gravitational wave analysis.
Since I had no background in this field, I started out by taking an online course by Geoffrey Hinton on Coursera and CS231 at Stanford, and quickly read through the Deep Learning book by Bengio [Courville and Goodfellow], all in about a week.
Then it took only a couple of days to get used to the neural net framework in the Wolfram Language by reading the documentation. I decided to give time series inputs directly into 1D convolutional neural networks instead of images (spectrograms). Amazingly, the very first convolutional network I tried performed better than expected for gravitational wave analysis, which was very encouraging.
What advantages does deep learning have over other methods?
Here are some advantages of using deep learning over matched filtering:
1) Speed: The analysis can be carried out within milliseconds using deep learning (with minimal computational resources), which will help in finding the electromagnetic counterpart using telescopes faster. Enabling rapid followup observations can lead to new physical insights.
2) Covering more parameters: Only a small subset of the full parameter space of signals can be searched for using matched filtering (template matching), since the computational cost explodes exponentially with the number of parameters. Deep learning is highly scalable and requires only a one-time training process, so the high-dimensional parameter space can be covered.
3) Generalization to new sources: The article shows that signals from new classes of sources beyond the training data, such as spin precessing or eccentric compact binaries, can be automatically detected with this method with the same sensitivity. This is because, unlike template-matching techniques, deep learning can interpolate to points within the training data and generalize beyond it to some extent.
4) Resilience to non-Gaussian noise: The results show that this deep learning method can distinguish signals from transient non-Gaussian noises (glitches) and works even when a signal is contaminated by a glitch, unlike matched filtering. For instance, the occurrence of a glitch in coincidence with the recent detection of the neutron star merger delayed the analysis by several hours using existing methods and required manual inspection. The deep learning technique can automatically find these events and estimate their parameters.
5) Interpretability: Once the deep learning method detects a signal and predicts its parameters, this can be quickly cross-validated using matched filtering with a few templates around these predicted parameters. Therefore, this can be seen as a method to accelerate matched filtering by narrowing down the search space—so the interpretability of the results is not lost.
Why did you choose the Wolfram Language for this research?
I have been using Mathematica since I was an undergraduate at IIT Bombay. I have used it for symbolic calculation as well as numerical computation.
The Wolfram Language is very coherent, unlike other languages such as Python, and includes all the functionality across different domains of science and engineering without relying on any external packages that have to be loaded. All the 6,000 or so functions have explicit names and are designed with a very similar syntax, which means that most of the time you can simply guess the name and usage without referring to any documentation. The documentation is excellent, and it is all in one place.
Overall, the Wolfram Language saves a researcher’s time by a factor of 2–3x compared to other programming languages. This means you can do twice as much research. If everyone used Mathematica, we could double the progress of science!
I also used it for all my coursework, and submitted Mathematica notebooks exported into PDFs, while everyone else in my class was still writing things down with pen and paper.
The Wolfram Language neural network framework was extremely helpful for me. It is a very high-level framework and doesn’t require you to worry about what is happening under the hood. Even someone with zero background in deep learning can use it successfully for their projects by simply referring to just the documentation.
What about GPUs for neural net training?
Using GPUs to do training with the Wolfram Language was as simple as including the string TargetDevice->"GPU" in the code. With this small change, everything ran on GPUs like magic on any of my machines on Windows, OSX or Linux, including my laptop, Blue Waters, the Campus Cluster, the Volta and Pascal NVIDIA DGX-1 deep learning supercomputers and the hybrid machine with four P100 GPUs at the NCSA Innovative Systems Lab.
I used about 12 GPUs in parallel to try out different neural network architectures as well.
Was the Wolfram Language helpful in quick prototyping for successful grant applications?
I completed the whole project, including the research, writing the paper and posting on arXiv, within two weeks after I came up with the idea at SC16, even though I had never done any deep learning–related work before. This was only possible because I used the Wolfram Language.
I had drafted the initial version of the research paper as a Mathematica notebook. This allowed me to write paragraphs of text and typeset everything, even mathematical equations and figures, and organize into sections and subsections just like in a Word document. At the end, I could export everything into a LaTeX file and submit to the journal.
Everything, including the data preparation, preprocessing, training and inference with the deep convolutional neural nets, along with the preparation of figures and diagrams of the neural net architecture, was done with the Wolfram Language.
Apart from programming, I regularly use Mathematica notebooks as a word processor and to create slides for presentations. All this functionality is included with Mathematica.
What would you say to people who are new either to the Wolfram Language or deep learning to get them started?
Read the documentation, which is one of the greatest strengths of the language.
There are a lot of included examples about using deep learning for various types of problems, such as classification, regression in fields such as time series analysis, natural language processing, image processing, etc.
The Wolfram Neural Net Repository is a unique feature in the Wolfram Language that is super helpful. You can directly import state-of-the-art neural network models that are pre-trained for hundreds of different tasks and use them in your code. You can also perform “net surgery” on these models to customize them as you please for your research/applications.
George’s Research and Publications
Glitch Classification and Clustering for LIGO with Deep Transfer Learning (NIPS 2017, Deep Learning for Physical Science)
Deep Neural Networks to Enable Real-Time Multimessenger Astrophysics (Physics Review D)