WOLFRAM

A Year of Runkeeper: Analysis and Visualization

About a year ago, I decided to record every single move I make using Runkeeper, and now I want to make some visualizations of my activity throughout the whole year. This is a fairly straightforward project where I will download the data from Runkeeper, then use the Wolfram Language to process, analyze, and visualize my activities. I will show how to create animations like this one that superimposes 24 minutes of all my activities recorded in Barcelona:

Runkeeper Barcelona GPS

To export my Runkeeper data, I follow the steps explained on this page. Another way to get my activities is to connect my Runkeeper account to the Wolfram Language with ServiceConnect. But since this last method is limited to 25 activities, I use the manual export this time.

The export gives me a ZIP file containing a pair of CSV files and all the GPX files I have recorded so far. Let me first save this notebook inside the unzipped folder, and then set the current directory there:

Saving current notebook and setting current directory

Now let’s take a look at the header line of “cardioActivities.csv”:

Looking at header line of cardioActivities.csv

Runkeeper is providing several measurements with different physical units. To correctly interpret these quantities, I will use SemanticImport with the following column types:

Using SemanticImport to interpret quantities
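For readers working outside the Wolfram Language, the idea of attaching a type to each column at import time can be sketched in plain Python. This is only an illustration: the column names and the two sample rows are assumptions, not the actual export, and the converters stand in for the column types given to SemanticImport.

```python
import csv
import io
from datetime import datetime

# Hypothetical excerpt; the real export has more columns and rows.
SAMPLE = """Date,Type,Distance (mi),Duration,Climb (ft),Notes
2016-04-02 09:15:00,Cycling,3.10,0:16:40,120,Barcelona
2016-04-03 18:05:12,Hiking,11.25,4:02:10,2300,Boston
"""

def parse_duration(text):
    """Convert 'H:MM:SS' or 'MM:SS' into a number of seconds."""
    seconds = 0
    for part in text.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds

# One converter per column, playing the role of the column types.
CONVERTERS = {
    "Date": lambda s: datetime.strptime(s, "%Y-%m-%d %H:%M:%S"),
    "Type": str,
    "Distance (mi)": float,
    "Duration": parse_duration,
    "Climb (ft)": float,
    "Notes": str,
}

def load_activities(text):
    """Read CSV text into a list of dicts with typed values."""
    reader = csv.DictReader(io.StringIO(text))
    return [{k: CONVERTERS[k](v) for k, v in row.items()} for row in reader]

activities = load_activities(SAMPLE)
print(activities[0]["Duration"])  # 1000
```

Keeping the unit in the column name (miles, feet) is a much weaker substitute for real Quantity objects, so any unit conversion would have to be done by hand.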

This gives me a Dataset object that can be analyzed with ease. Here is how to get some insights about my activities:

1. Count the different types of activities:

Counting the different types of activities

2. Compute the average distance:

Computing average distance

3. Make a histogram of all the distances:

Histogram of distances

4. Make a DateListStepPlot of the average speeds:

DateListStepPlot of average speeds

5. Select activities with distances longer than 10 miles:

Activities with distances longer than 10 miles

6. Find out how many times I would have climbed Mount Everest:

How many times climbed Mount Everest

7. Group the activities by their notes:

Grouping activities by their notes

8. Select activities labeled with the note “Boston” and import their GPX files:

Importing GPX files with note Boston

9. Map the starting positions of all the activities:

Mapping the starting positions of activities

10. And last but not least, make a TimelinePlot of all the activities:

TimelinePlot of activities
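Several of the queries in the list above (counting activity types, averaging distances, selecting long activities, and the Mount Everest comparison) reduce to a few lines in any language. The following Python sketch uses made-up activities purely for illustration; only the height of Mount Everest (about 29,029 ft) is a real figure.

```python
from collections import Counter
from statistics import mean

# Hypothetical activities: (type, distance in miles, climb in feet).
ACTIVITIES = [
    ("Cycling", 3.1, 120.0),
    ("Cycling", 2.8, 95.0),
    ("Hiking", 11.25, 2300.0),
    ("Running", 5.0, 150.0),
]

EVEREST_FT = 29029.0  # commonly quoted height of Mount Everest in feet

# 1. Count the different types of activities.
type_counts = Counter(kind for kind, _, _ in ACTIVITIES)

# 2. Compute the average distance.
average_distance = mean(dist for _, dist, _ in ACTIVITIES)

# 5. Select activities with distances longer than 10 miles.
long_activities = [a for a in ACTIVITIES if a[1] > 10.0]

# 6. Express the total climb as a fraction of Mount Everest.
everest_climbs = sum(climb for _, _, climb in ACTIVITIES) / EVEREST_FT

print(type_counts["Cycling"])  # 2
print(len(long_activities))    # 1
```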

So far, so good. The next step is to pull data out of the GPX files. Import returns a GeoGraphics with the GPS track drawn as a black line:

Using Import to provide GeoGraphics with GPS track drawn in black

But what if I want to get the elevation I was at and the speed I was going? Import has a "Data" element that gives access to the GPX file in a generic Wolfram Language form (list, string, etc.):

Accessing the GPX in a Wolfram Language form

This contains a list of the recorded GeoPosition, elevation, and time stamp points. Since I’m only interested in the points, I define a function that takes the positions and creates a TimeSeries of the elevation:

Creating a TimeSeries of the elevation

Given GPX data, this returns an association with a pair of keys:

An association with a pair of keys
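The same extraction can be sketched with Python's standard XML parser. The two-point track below is hypothetical, and the sketch assumes GPX 1.1 with UTC time stamps of the form `2016-04-02T09:15:00Z`; the keys "Geometry" and "Elevation" mirror the association described above, but the function itself is an illustration, not the code from the post.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Hypothetical two-point track; real files hold thousands of points.
GPX = """<gpx version="1.1" xmlns="http://www.topografix.com/GPX/1/1">
  <trk><trkseg>
    <trkpt lat="41.3870" lon="2.1700"><ele>12.0</ele><time>2016-04-02T09:15:00Z</time></trkpt>
    <trkpt lat="41.3875" lon="2.1710"><ele>14.5</ele><time>2016-04-02T09:15:10Z</time></trkpt>
  </trkseg></trk>
</gpx>
"""

NS = {"gpx": "http://www.topografix.com/GPX/1/1"}  # GPX 1.1 namespace

def read_track(gpx_text):
    """Return the positions and an elevation series keyed by time stamp."""
    root = ET.fromstring(gpx_text)
    geometry, elevation = [], []
    for pt in root.iterfind(".//gpx:trkpt", NS):
        when = datetime.strptime(pt.find("gpx:time", NS).text, "%Y-%m-%dT%H:%M:%SZ")
        geometry.append((float(pt.get("lat")), float(pt.get("lon"))))
        elevation.append((when, float(pt.find("gpx:ele", NS).text)))
    return {"Geometry": geometry, "Elevation": elevation}

track = read_track(GPX)
print(track["Elevation"][1][1])  # 14.5
```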

Now it’s easy to take the "Elevation" to make a DateListPlot colored by elevation:

DateListPlot colored by elevation

Or take the "Geometry", Rescale the elevation points, and then use that to color the GPS track by elevation:

Rescaling the elevation points and coloring the GPS track by elevation
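Rescaling simply maps the smallest elevation to 0 and the largest to 1, so that each point can index into a color gradient. A minimal Python sketch of that step follows; the elevations and the blue-to-red blend are arbitrary choices made for illustration.

```python
def rescale(values):
    """Map values linearly onto [0, 1]; a flat series maps to all zeros."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]

def to_rgb(t):
    """Blend from blue (t = 0) to red (t = 1); an arbitrary color choice."""
    return (t, 0.0, 1.0 - t)

elevations = [12.0, 14.5, 22.0, 17.0]  # hypothetical elevations in meters
colors = [to_rgb(t) for t in rescale(elevations)]
print(colors[0], colors[2])  # (0.0, 0.0, 1.0) (1.0, 0.0, 0.0)
```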

Another interesting thing to do with this data is to compute the instantaneous speed. Recently a fellow Wolfram Community member, Sander Huisman, showed how to compute instantaneous speed to colorize his GPS track. Here is the function that I defined to compute a time series of the instantaneous speed from the GeoPosition points and the elevation time series:

Time series of the instantaneous speed from the GeoPosition points and the elevation time series
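As a rough illustration of the idea, instantaneous speed can be approximated as the distance between consecutive GPS points divided by the time between them. The Python sketch below uses the haversine formula on a spherical Earth and ignores elevation, so it is a simplification and not the function defined in this post; the sample points are hypothetical.

```python
import math
from datetime import datetime, timedelta

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in meters

def haversine_m(p, q):
    """Great-circle distance in meters between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def speed_series(positions, times):
    """Return (time, speed in km/h) for each pair of consecutive points."""
    series = []
    for i in range(1, len(positions)):
        dt = (times[i] - times[i - 1]).total_seconds()
        if dt > 0:  # skip duplicate time stamps
            meters = haversine_m(positions[i - 1], positions[i])
            series.append((times[i], meters / dt * 3.6))
    return series

start = datetime(2016, 4, 2, 9, 15, 0)
positions = [(41.387, 2.170), (41.388, 2.170)]  # 0.001 degrees of latitude apart
times = [start, start + timedelta(seconds=10)]
print(round(speed_series(positions, times)[0][1]))  # 40
```

Raw GPS speeds computed this way tend to be noisy, so some smoothing over a few points is usually worthwhile before plotting.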

If I apply this to the previous example, a DateListPlot can immediately tell me if I had a break during that hike:

DateListPlot indicating when breaks occurred

Short breaks come from my stops to take pictures of the rock formations that inspired the surrealist painter Salvador Dalí. Let’s map those breaks by colorizing the GPS track according to its speed:

Mapping the breaks by colorizing the GPS track
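Breaks can also be located programmatically rather than read off the plot: they are the runs of consecutive samples whose speed falls below some threshold. The Python sketch below shows one way to do that; the threshold of 1 km/h and the sample speeds are assumptions chosen for illustration.

```python
def find_breaks(speeds, threshold=1.0):
    """Return (start, end) index pairs of runs where speed < threshold."""
    breaks, start = [], None
    for i, v in enumerate(speeds):
        if v < threshold and start is None:
            start = i                       # a break begins here
        elif v >= threshold and start is not None:
            breaks.append((start, i - 1))   # the break ended at the previous sample
            start = None
    if start is not None:                   # the series ended during a break
        breaks.append((start, len(speeds) - 1))
    return breaks

speeds = [4.2, 4.0, 0.3, 0.1, 3.9, 4.1, 0.2]  # hypothetical speeds in km/h
print(find_breaks(speeds))  # [(2, 3), (6, 6)]
```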

Now that I have a systematic way of taking the positions, elevation, and instantaneous speed of a given activity, it’s time to add these into a new dataset:

Dataset of positions, elevation, and instantaneous speed of a given activity

A year ago, I moved from the countryside to the city of Barcelona:

Barcelona

To select activities within Barcelona, I could use functions like GeoWithinQ or GeoDistance, but since I added specific notes to my Barcelona activities, these functions are not needed this time:

Selecting activities within Barcelona
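The two selection strategies mentioned above, filtering by note or filtering by position, can be compared in a short Python sketch. The bounding box is a rough, assumed approximation of Barcelona and the two activities are hypothetical; a plain box test is also much cruder than GeoWithinQ, which works with the actual region.

```python
# Rough, assumed bounding box around Barcelona: (south, west, north, east).
BARCELONA_BOX = (41.32, 2.07, 41.47, 2.23)

def within_box(position, box):
    """True if a (lat, lon) position lies inside the bounding box."""
    lat, lon = position
    south, west, north, east = box
    return south <= lat <= north and west <= lon <= east

# Hypothetical activities with a note and a starting position.
activities = [
    {"Notes": "Barcelona", "Start": (41.387, 2.170)},
    {"Notes": "Boston", "Start": (42.360, -71.059)},
]

by_note = [a for a in activities if a["Notes"] == "Barcelona"]
by_position = [a for a in activities if within_box(a["Start"], BARCELONA_BOX)]
print(by_note == by_position)  # True
```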

Before mapping these activities in GeoGraphics, I want to make sure that the activities will be highlighted over a black-and-white uniform map with no labels. To do that, I add advanced GeoStyling options for the GeoBackground to make a negative grayscale background. I also add a GeoScaleBar and I constrain the map in a convenient GeoRange:

Mapping activities over a black-and-white uniform map with no labels

Looks good to me. Let’s overlay it with all the activities recorded so far:

Activities recorded so far

I’ve almost covered the entire city! If I map only the starting (yellow) and ending (red) positions, it’s quite clear where I live. For most of these activities, it’s me commuting from one place to another using the city’s bicycle-sharing system:

Mapping starting and ending positions

Now let’s compute the shortest tour to revisit all these places:

Shortest tour to revisit all locations

A quadcopter would need to fly about 170 km to visit them all:

Distance to visit them all
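To get a feel for the tour problem, here is a Python sketch of a greedy nearest-neighbor heuristic over straight-line (great-circle) distances. It is a swapped-in technique for illustration: it is not guaranteed to find the shortest tour and is not the method used in this post. The four points are hypothetical.

```python
import math

def distance_km(p, q):
    """Great-circle (haversine) distance in km between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(a))

def nearest_neighbor_tour(points):
    """Greedy closed tour starting at points[0]; returns (order, length in km)."""
    order, remaining = [0], set(range(1, len(points)))
    while remaining:
        here = points[order[-1]]
        nxt = min(remaining, key=lambda i: distance_km(here, points[i]))
        order.append(nxt)
        remaining.remove(nxt)
    closed = order + [0]  # return to the starting point
    length = sum(distance_km(points[a], points[b])
                 for a, b in zip(closed, closed[1:]))
    return order, length

# Hypothetical locations lined up along one meridian.
points = [(41.38, 2.15), (41.40, 2.15), (41.39, 2.15), (41.41, 2.15)]
order, length = nearest_neighbor_tour(points)
print(order)  # [0, 2, 1, 3]
```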

Not bad if I compare it to my total annual distance:

Total annual distance

One thing that I’ve noticed over this year using Runkeeper is that my path to move from point A (home) to point B (capoeira training center) has been evolving a lot. The thing is that I’m not yet sure what’s the best way to get there by bicycle. The new function TravelDirections might have something new to say here:

Using TravelDirections to find best path

When one draws the "TravelPath" for the three travel methods, "Biking" (green), "Walking" (blue), and "Driving" (red), one notices that there isn’t just one simple way to get from point A to point B:

TravelPath for three different travel methods

When one looks at their TravelDistance, "Walking" is clearly the winner:

Using TravelDistance to compare distance

But since this path goes through Barcelona’s downtown, or Ciutat Vella (Old City), which is a maze of medieval streets restricted to pedestrians, it would take me nearly an hour to get from point A to point B. The TravelTime for "Biking" is way faster:

TravelTime to compare travel methods

I must say that over this year I’ve tried a myriad of different ways to get to point B by bicycle. My current favorite is a variation of the "Driving" path in red:

Variations on Driving path

Now let’s plot all 55 trips colored by their travel times:

55 trips colored by their travel times

The blue/green paths in the center are closer to the "Walking" path, and these seem to be the shortest ways to get to point B. My record time is about 13 minutes:

Record time for blue/green path in center

If one looks at the average time, this is quite close to that predicted by the TravelDirectionsData:

TravelDirectionsData to predict time

This data is really allowing me to time travel over the past year. In the following GeoGraphics, I color the GPS tracks according to speed, and I add the dates for each activity using Tooltip:

GPS tracks colored according to speed and add dates using Tooltip

It’s time to take each one of these activities as a new frame and animate the whole year. The code to generate the animation below and the animation at the beginning of this post is available at the end of this post as a CDF. I invite you to use my code to analyze your own Runkeeper data. Want more ideas? Have a look at “A Rat Race, or a Great Way to Start the Day.”
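The core of such an animation can be sketched independently of the CDF. The Python sketch below assumes that every activity is replayed from its own start time, so that one frame shows where each activity was after the same elapsed time; that alignment, the function names, and the sample tracks are assumptions made for illustration.

```python
from bisect import bisect_right

def position_at(track, elapsed):
    """Interpolate (lat, lon) at `elapsed` seconds; None outside the track.

    `track` is a list of (seconds since start, lat, lon), sorted by time.
    """
    times = [t for t, _, _ in track]
    if elapsed < times[0] or elapsed > times[-1]:
        return None
    i = bisect_right(times, elapsed)  # first sample strictly after `elapsed`
    if i == len(track):               # exactly at the final sample
        return track[-1][1:]
    (t0, lat0, lon0), (t1, lat1, lon1) = track[i - 1], track[i]
    w = (elapsed - t0) / (t1 - t0)
    return (lat0 + w * (lat1 - lat0), lon0 + w * (lon1 - lon0))

def frame(tracks, elapsed):
    """Positions of all activities still under way at the same elapsed time."""
    points = (position_at(track, elapsed) for track in tracks)
    return [p for p in points if p is not None]

# Two hypothetical tracks: one lasting 10 seconds, one lasting 4 seconds.
tracks = [
    [(0, 41.0, 2.0), (10, 41.1, 2.0)],
    [(0, 42.0, 3.0), (4, 42.0, 3.4)],
]
print(len(frame(tracks, 2)), len(frame(tracks, 5)))  # 2 1
```

Calling `frame` for a sequence of elapsed times (for example, once per second up to 24 minutes) yields one set of points per animation frame.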

Download this post as a Computable Document Format (CDF) file.

Download the supporting code and documents as a ZIP file.

Comments

3 comments

  1. This is a very impressive example of “data is beautiful”. Stunning visualisations! Thanks a lot for sharing.

  2. RunKeeper also stores heart rate data and makes it available via API. Is it possible to access that via Mathematica?

  3. Great article! I read it in Russian here: https://habrahabr.ru/company/wolfram/blog/302462/

    Greetings from Barcelona! This is a nice place to live! =)
