Wolfram Blog
Jamie Peterson

Unlocking Next-Level Certification at the Wolfram Data Science Boot Camp

September 24, 2020 — Jamie Peterson, Technical Programs Manager, Wolfram U

Our second year of the Wolfram Data Science Boot Camp (and the first fully virtual edition) wrapped at the end of July. After completing final project assessments last week and issuing certificates, we can confidently say it was a success! Wolfram U mentors helped dozens of budding data scientists learn the multiparadigm approach and develop valuable skills in analysis, visualization, interface construction and more. Campers collaborated on projects of their own design, earning certifications along the way.

We’re proud of everyone who participated, and their efforts deserve some recognition! So without further ado, here’s a quick recap: how we ran the camp, what kinds of projects we saw and the lowdown on our new Level II Certification program.

Coordinating a Virtual Boot Camp

At Wolfram, we’re pretty well accustomed to remote work. But running a full boot camp online posed some extra technical challenges. Thankfully, participants navigated Zoom lectures quite well, and we got through with surprisingly few technical glitches. Although the lecture times were tricky for some of our international campers, we were excited to welcome people from around the world—including several who wouldn’t have been able to travel for an in-person experience.

For ongoing conversations and announcements, we ran a Slack workspace. This was a new experience for many of us, but it was very easy to set up and use. We were happy to see people working through exercises, discussing project details and connecting with like-minded folks. Some of our instructors could even be found hanging out after hours to help bring student projects to fruition.

Our final reception took place within the High Fidelity virtual environment, in which people move around as avatars in a shared “room” with realistic spatial sound. This gave instructors, developers and students a chance to talk in a less formal setting. People split off into groups in much the same way they would at a real reception. The simulated soundscape almost felt like in-person conversation.

Wrangling Campers, Mentors and Projects

The two-week boot camp program started with an introduction to the Wolfram technology stack and an outline of the multiparadigm data science workflow. Instructors presented their own code examples in morning lectures, followed by guided explorations to help campers learn and apply the new concepts quickly. Later lectures featured start-to-finish case studies, demonstrating the complete data science process across several methods and problem types.

Each participant was encouraged to come prepared with a project idea based on public datasets or personal research, and our students did not disappoint! Topics ranged from economics to materials science, and we saw analyses using graph theory, machine learning, image processing and other advanced methods.

We couldn’t fit all of these impressive results in one post, but here are a few camper favorites:

Recommending Foreign Documents

Using linguistic data from the Wolfram Data Repository and other sources, information systems professor Michalis Vlachos attempted to build a predictor of readability for foreign-language documents. More specifically, his function rates a given French text on how difficult it would be for a native English speaker to read and understand. In addition to judging the overall readability of a text, the function detects and rates cognates (individual words that share an origin and look and mean much the same in both languages).

Evidently, the Wikipedia page for French actor Jean Reno is an easier read for an English speaker than those of his counterparts, but only by a little bit:

[Figure: Recommending Foreign Documents]
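To make the cognate idea concrete, here is a minimal sketch in Python rather than Wolfram Language. It is not Michalis's function: it assumes that spelling similarity alone is a reasonable stand-in for "cognate," and the names `strip_accents`, `cognate_score` and `cognate_fraction`, along with the 0.8 threshold, are hypothetical choices made for illustration.

```python
from difflib import SequenceMatcher
import unicodedata


def strip_accents(word: str) -> str:
    """Lowercase a word and drop diacritics, so 'théâtre' becomes 'theatre'."""
    decomposed = unicodedata.normalize("NFD", word.lower())
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


def cognate_score(french: str, english: str) -> float:
    """Spelling similarity in [0, 1]; 1.0 means identical after accent removal."""
    return SequenceMatcher(None, strip_accents(french), strip_accents(english)).ratio()


def cognate_fraction(french_words, english_vocab, threshold=0.8):
    """Fraction of French words whose closest English spelling reaches the threshold.

    The threshold is an assumed cutoff, not a value taken from the project.
    """
    if not french_words:
        return 0.0
    hits = sum(
        1
        for word in french_words
        if max((cognate_score(word, e) for e in english_vocab), default=0.0) >= threshold
    )
    return hits / len(french_words)
```

A text with a higher `cognate_fraction` would, under this assumption, be easier for an English speaker to read. Spelling similarity cannot tell true cognates from false friends, so a real readability function would need more than this.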

Personal Fitness Data

This project by computer science student Vlad Dobrin examines the data collected from an Apple Watch. His analysis focuses on predictions that can be made from different types of workouts, visualizations of embedded data and machine learning to classify and predict various aspects of future data.

When initially looking at his data, Vlad noticed a few outliers. With a quick visual inspection, he was able to recognize these as a 60-mile bike ride and a 2.5-hour trail run:

[Figure: Personal Fitness Data]

One interesting result was a predictor of workout type, which he trained using data from his outdoor activities. Even with fewer than one hundred examples, he was able to achieve about 90% accuracy:

[Figure: Personal Fitness Data Predictor]

Neural Network Spellchecker

Camper Tim McDevitt, a developer with Wolfram’s visualization group, created a tool for automated spelling correction, building a custom neural network to detect and correct incorrect letter sequences within a given word or sentence. Though his model still needs some work to reduce its error rate, Tim was able to gain some interesting insights on a difficult problem. For instance, he discovered trends of vowels being mistaken for each other, common letters being mistaken for other common letters and certain uncommon letters with surprisingly consistent replacements (J→M, Q→S, W→T). You can see these results summed up in this plot of conditional distributions for each letter:

[Figure: Neural Network Spellchecker]
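A per-letter conditional distribution like the one in that plot can be tallied from pairs of intended and typed letters. The sketch below is in Python rather than Wolfram Language and is not Tim's code. The function name `conditional_distributions` is hypothetical, and the sample pairs are made up purely for illustration.

```python
from collections import Counter, defaultdict


def conditional_distributions(pairs):
    """Return {intended: {typed: P(typed | intended)}} from (intended, typed) pairs."""
    counts = defaultdict(Counter)
    for intended, typed in pairs:
        counts[intended][typed] += 1
    return {
        intended: {typed: n / sum(tally.values()) for typed, n in tally.items()}
        for intended, tally in counts.items()
    }


# Made-up observations for illustration only, not data from the project.
pairs = [("a", "e"), ("a", "e"), ("a", "o"), ("a", "a"), ("t", "r")]
dist = conditional_distributions(pairs)
print(dist["a"])  # share of each typed letter when "a" was intended
```

Each inner dictionary sums to 1, so a letter with one dominant entry is one that gets replaced consistently.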

Applying Graph Theory to Subsurface Flow Problems

Inspired by a recent paper on the subject, this project from Kevin Parks of Deep Time Advisory Services explores how graph theory can be useful for analyzing and predicting pathways for subsurface flow in hydrogeology. Traditionally, the field has relied on computationally expensive simulations, so the use of efficient graph-theoretic algorithms is enticing for engineers. Kevin was able to represent geocellular grid data using Wolfram Language graphs and replicate a path-length comparison from the original paper:

[Figure: Applying Graph Theory to Subsurface Flow Problems]

The implication is that the proposed “patchy” graph structure may be a more accurate representation of subterranean flow than a random graph. From there, he was able to show some distinguishing properties of the structure, such as a smaller number of more populous communities (i.e. higher-density flow paths) than a random graph:

[Figure: Graph]

This work uses a unique computational angle to give new life to some theoretical ideas in hydrogeology. But to quote Kevin himself, “as in all research—more work is required.”
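For readers curious what treating a geocellular grid as a graph looks like in practice, here is a toy sketch in Python rather than Wolfram Language. It is not Kevin's code or the paper's method. It assumes each cell connects to its four neighbors, that impermeable cells are simply left out, and that path length is counted in cell-to-cell steps. The names `grid_graph` and `shortest_path_length` are hypothetical.

```python
from collections import deque


def grid_graph(rows, cols, blocked=frozenset()):
    """Adjacency dict for a rows x cols grid with 4-neighbor connectivity.

    Cells in `blocked` (for example, impermeable rock) are left out of the graph.
    """
    cells = {(r, c) for r in range(rows) for c in range(cols)} - set(blocked)
    return {
        (r, c): [n for n in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)) if n in cells]
        for (r, c) in cells
    }


def shortest_path_length(graph, source, target):
    """Edges on a shortest path found by breadth-first search, or None if unreachable."""
    if source not in graph or target not in graph:
        return None
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == target:
            return dist
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


open_grid = grid_graph(3, 3)
partly_blocked = grid_graph(3, 3, blocked={(0, 1), (1, 1)})
print(shortest_path_length(open_grid, (0, 0), (0, 2)))       # direct route along the top row
print(shortest_path_length(partly_blocked, (0, 0), (0, 2)))  # detour around the blocked cells
```

Comparing path lengths across differently structured graphs is then a matter of running the same search on each one, which is far cheaper than a full flow simulation.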

A Study on Plastics: Category, Production and Pollution

This comprehensive study by recent chemical engineering graduate Zifeng Qu looks at the life cycle of plastic throughout the world using a number of open datasets. Her analysis includes several informative visualizations, such as these 3D pie charts of the most prevalent types and uses of plastic:

[Figure: 3D pie charts]

Using a dataset that rated nations’ mismanagement of plastic waste in 2010 and predicted the levels for 2025, Zifeng used machine learning to extrapolate further out to 2040—a rough estimate, but certainly food for thought:

Getting Wolfram Certified

Lectures and presentations are only part of the story. All attendees had the opportunity to earn certifications to validate their new skills. Everyone who finished the program received a certificate of completion. By successfully solving a series of graded exercises, many students also earned a Level I Certification for multiparadigm data science.

Campers who wanted to take their projects to the next level submitted their final papers for Level II Certification in multiparadigm data science. This advanced certification is achieved after instructor review of a completed data science project. It’s a step up from our other certifications, both in the scope of the requirements and the level of personalized attention. Students get to apply their skills to a topic that interests them, get expert feedback and create a final product for their data science portfolio.

As of today, several campers have achieved the Level II Certification, with more in progress. Congratulations, everyone!

Stay Tuned for More

If you missed this year’s Data Science Boot Camp, don’t worry; Wolfram U always has more learning opportunities in the works. In addition to offering attendance and completion certificates, our programs provide pathways to Level I Certification for areas such as calculus and linear algebra, with more Level I Certifications going online soon as part of our interactive courses in image processing and data science. To stay informed of the latest offerings and upcoming events, keep an eye on the Wolfram U calendar. And who knows—maybe we’ll see you at camp next year!

Get recognized for your computational achievements with Wolfram Certifications.

Posted in: Education, Events, Wolfram U