python4oceanographers

Turning ripples into waves

An open source story: from the classroom to a bug fix

Last week I had the pleasure of teaching my first Software Carpentry Workshop at UFBA. After the 2-day workshop we had 3 extra days of Python training for oceanographers.

We did not have an internet connection for the workshop, nor a room with enough power outlets. That was a bummer, but we managed by providing copies of all the software we used on flash drives, and by bringing extension cords and power strips. Only a few examples that required downloading had to be left out. Besides that, the workshop went smoothly.

I will write more about the workshop in another post for SWC. Here I want to tell a story of what happened in the workshop.

On the last day, during an exercise, a student decided to mix a cartopy plot of an OpenStreetMap image with another cartopy example using quiver. To our surprise, we found a bug in cartopy: when transforming coordinates to plot a quiver, it raised a ValueError for 1D data.

What followed is a nice example of how the Open Source (OS) world works. We opened an issue on GitHub and sent our own hackish fix as a PR.

After a typical OS conversation on the issue tracker and on the PR page, one of cartopy's core developers, Phil Elson, started a proper PR with unit tests and everything to fix the bug. Phil Elson, as the polite British lord he is, even used part of our original commit in his PR.

Long story short: we can now pass 1D data to cartopy when transforming between coordinate systems.

This story might be your everyday bread and butter if you are already in the OS world. But for students who are just starting with programming, or coming from closed source software like Matlab, it means a lot to see that they can make a difference. I hope that next time they open the issue themselves, and maybe even send the PR on their own.

To finish my post I will add an SWC Instructors/Attendees graph.

Let's load the data with pandas.

In [3]:
from pandas import read_csv


df = read_csv('./data/per-capita.csv', index_col=0)
In [4]:
df['Population'] = df['Population'] / 1e6
df['Attendees'] = df['Attendees'] / df['Population'] 
df['Instructors'] = df['Instructors'] / df['Population'] 
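
The cell above rescales the population to millions and then divides the raw counts by it, so both columns end up as "per million inhabitants". Here is a minimal sketch of the same normalization on a toy table. The country names and numbers are made up for illustration (they are not the values in per-capita.csv), and the helper name `per_million` is hypothetical.

```python
import pandas as pd

# Hypothetical numbers, for illustration only; NOT the contents of per-capita.csv.
toy = pd.DataFrame(
    {
        "Population": [2_000_000, 500_000],
        "Attendees": [40, 5],
        "Instructors": [2, 1],
    },
    index=["Country A", "Country B"],
)


def per_million(frame):
    """Return a copy with the counts expressed per million inhabitants."""
    out = frame.copy()  # leave the caller's DataFrame untouched
    out["Population"] = out["Population"] / 1e6  # people -> millions of people
    out["Attendees"] = out["Attendees"] / out["Population"]
    out["Instructors"] = out["Instructors"] / out["Population"]
    return out


normalized = per_million(toy)
print(normalized)
```

Working on a copy keeps the cell safe to re-run: dividing the original columns in place a second time would silently rescale the data again.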

The cell below defines some simple CSS for the tooltip table.

In [5]:
table = """
table
{
  border-collapse: collapse;
}
th
{
  color: #ffffff;
  background-color: #000000;
}
td
{
  background-color: #cccccc;
}
table, th, td
{
  font-family:Arial, Helvetica, sans-serif;
  border: 1px solid black;
  text-align: right;
}
"""

Then we will plot it using an interactive figure created with mpld3.

In [6]:
import numpy as np
import matplotlib.pyplot as plt

import mpld3
from mpld3 import plugins

%matplotlib inline

mpld3.enable_notebook()


labels = []
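for k in range(df.index.size):
    # Each tooltip is a one-column HTML table with the values for one country.
    # `.ix` was removed from pandas; `.iloc` does the same positional lookup.
    label = df.iloc[[k], :].T
    labels.append(str(label.to_html()))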

fig, ax = plt.subplots()
ax.grid(True, alpha=0.3)
kw = dict(marker='o', color='b', markeredgecolor='k', markersize=5,
          markeredgewidth=1, alpha=0.6, linestyle='none')
points = ax.plot(df['Attendees'], df['Instructors'], **kw)
ax.set_xlabel('Attendees per million pop')
ax.set_ylabel('Instructors per million pop')
ax.axis([-0.5, 35, -0.1, 1.4])

tooltip = plugins.PointHTMLTooltip(points[0], labels,
                                   voffset=10, hoffset=10, css=table)
plugins.connect(fig, tooltip)
In [7]:
mpld3.disable_notebook()

That is nice, but let's get fancy with matplotlib and add some flags.

(You have to download the flags from http://vathanx.deviantart.com/art/World-Flag-Icons-PNG-108083900 first.)

In [8]:
from glob import glob
flags = glob('./data/flags/*.png')


def split_flag(flag):
    return ' '.join(flag.split('.')[1:-1][0].split()[2:])
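
The one-liner above splits the path on '.' and then drops the first two whitespace-separated words, so it relies on the file names having two leading words before the country name. A slightly more defensive sketch with pathlib is shown below. The file names in the usage lines (e.g. "Flag of Brazil.png") and the helper name `flag_country` are assumptions for illustration; the real names in the downloaded flag pack may differ.

```python
from pathlib import Path


def flag_country(path, skip_words=2):
    """Extract the country name from a flag file path.

    Assumes the file stem starts with `skip_words` words that are not
    part of the country name (hypothetical pattern: 'Flag of <Country>').
    """
    stem = Path(path).stem  # file name without directory and extension
    return " ".join(stem.split()[skip_words:])


# Hypothetical file names, used only to show the behaviour.
print(flag_country("./data/flags/Flag of Brazil.png"))
print(flag_country("./data/flags/Flag of United Kingdom.png"))
print(flag_country("./data/flags/Flag of St. Lucia.png"))
```

Taking the stem first means a dot inside the country name does not truncate the result, which splitting the whole path on '.' would do.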
In [9]:
import matplotlib.pyplot as plt
from matplotlib.offsetbox import AnnotationBbox, OffsetImage


def make_plot(ax):
    # Plain scatter underneath; the flags are drawn on top of the points.
    ax.scatter(df['Attendees'], df['Instructors'])

    for (label, x, y) in zip(df.index, df['Attendees'], df['Instructors']):
        for flag in flags:
            flag_name = split_flag(flag)
            if label == flag_name:
                # plt.imread replaces the private matplotlib._png.read_png,
                # which is no longer available in recent matplotlib releases.
                png = plt.imread(flag)
                imagebox = OffsetImage(png, zoom=.1)
                ab = AnnotationBbox(imagebox, xy=(x, y), frameon=False,
                                    xycoords='data',
                                    boxcoords='data')
                ab.set_alpha(0)
                ax.add_artist(ab)

fig, ax = plt.subplots(figsize=(7, 7))
ax.set_xlabel('Attendees per million pop')
ax.set_ylabel('Instructors per million pop')
make_plot(ax)

If you could not find Brazil in any of those graphs, or if you did find it and saw the numbers, you now understand the need for more workshops!


This post was written as an IPython notebook. It is available for download or as a static HTML page.

Creative Commons License
python4oceanographers by Filipe Fernandes is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Based on a work at https://ocefpaf.github.io/.
