This post is an update on of the previous iso-surface post.
After reading a little bit more on cython I found out that, when using memory views, we can release the GIL using the nogil annotation. I wonder if releasing the GIL can get the cython performance closer to numba's.
First let's set the same data as before up to run some tests.
import numpy as np
p = np.linspace(-100, 0, 30)[:, None, None] * np.ones((50, 70))
x, y = np.mgrid[0:20:50j, 0:20:70j]
q = np.sin(x) + p
p0 = -50.
The two cells below will compiled two versions of the zslice
function.
The first with the GIL and the second releasing during the loop.
%load_ext Cython
%%cython
cimport cython
import numpy as np
cimport numpy as np
NaN = np.NaN
@cython.boundscheck(False)
@cython.wraparound(False)
def gil_zslice(double[:, :, ::1] q,
double[:, :, ::1] p,
double p0,
mask_val=NaN):
cdef int L = q.shape[2]
cdef int M = q.shape[1]
cdef int N = q.shape[0]
cdef double dp, dq, dq0
cdef int i, j, k
cdef np.ndarray[double, ndim=2, mode='c'] q_iso = np.empty((M, L), dtype=np.float64)
for i in range(L):
for j in range(M):
q_iso[j, i] = mask_val
for k in range(N-1):
if (((p[k, j, i] < p0) and (p[k+1, j, i] > p0)) or
((p[k, j, i] > p0) and (p[k+1, j, i] < p0))):
dp = p[k+1, j, i] - p[k, j, i]
dp0 = p0 - p[k, j, i]
dq = q[k+1, j, i] - q[k, j, i]
q_iso[j, i] = q[k, j, i] + dq*dp0/dp
return q_iso
%%cython
cimport cython
import numpy as np
cimport numpy as np
NaN = np.NaN
@cython.boundscheck(False)
@cython.wraparound(False)
def nogil_zslice_2D(double[:, ::1] q,
double[:, ::1] p,
double p0,
double mask_val=NaN):
cdef int IJ = q.shape[1]
cdef int K = q.shape[0]
cdef int ij, k
cdef np.ndarray[double, ndim=1, mode='c'] q_iso = np.empty(IJ, dtype=np.float64)
with nogil:
for ij in range(IJ):
q_iso[ij] = mask_val
for k in range(K-1):
if (((p[k, ij] < p0) and (p[k+1, ij] > p0)) or
((p[k, ij] > p0) and (p[k+1, ij] < p0))):
q_iso[ij] = (q[k, ij] +
(q[k+1, ij] - q[k, ij]) * # dq
(p0 - p[k, ij]) / # dp0
(p[k+1, ij] - p[k, ij])) # dp
return q_iso
Note that cython won't allow any conversion to Python objects without GIL.
That is why we had to eliminate the temporary variables (dq
, dp0
, and dp
).
Hopefully we are sacrificing readability to gain some speed.
The second difference has nothing to do with the GIL! I modified the code to work with 2D arrays (depth, y_x_space) instead of 3D arrays (depth, y, x). The reason is to make this code more UGRID friendly.
Again we will use our poor man's benchmarking.
K, J, I = q.shape
qr = q.reshape(K, -1)
pr = p.reshape(K, -1)
gil = %timeit -n1000 -o gil_zslice(q, p, p0)
nogil = %timeit -n1000 -o nogil_zslice_2D(qr, pr, p0)
gil.best / nogil.best
Note that, under the same test conditions, we got a slightly better performance than numba (175 µs).
Due to the the input shape change and the fact that the cython code expects only np.float64
we need to wrap the cython function to prepare the input.
That is a small price to pay to have a single code that can deal with any ocean model grid.
The two functions below should take care of the input for us.
def _save_typecast(arr):
if arr.dtype is not 'float64' and np.can_cast(arr, np.float64):
arr = arr.astype(np.float64)
return arr
def zslice(q, p, p0):
q = _save_typecast(q)
p = _save_typecast(p)
p0 = -abs(p0)
if q.ndim == 3:
K, J, I = q.shape
iso = nogil_zslice_2D(q.reshape(K, -1), p.reshape(K, -1), p0)
return iso.reshape(J, I)
elif q.ndim == 2:
return nogil_zslice_2D(q, p, p0)
else:
msg = "Expected 2 (ugrid) or 3 (r-/s-grid) dimensions. Got {}."
raise ValueError(msg(q.ndim))
All that is left is to test the nogil and UGRID friendly version against real data. First some UGRID (FVCOM) data.
import iris
url = ('http://crow.marine.usf.edu:8080/thredds/dodsC/'
'FVCOM-Nowcast-Agg.nc')
cubes = iris.load_raw(url)
# Last time step.
temp = cubes.extract_strict('sea_water_potential_temperature')[-1, ...]
p = temp.coord('sea_surface_height_above_reference_ellipsoid').points
q = temp.data
(In the future iris will use pyugrid
and the next cell won't be necessary.)
import pyugrid
ugrid = pyugrid.UGrid.from_ncfile(url)
lon = ugrid.nodes[:, 0]
lat = ugrid.nodes[:, 1]
triangles = ugrid.faces[:]
import matplotlib.tri as tri
triang = tri.Triangulation(lon, lat, triangles=triangles)
import matplotlib.pyplot as plt
import cartopy.crs as ccrs
from cartopy.io import shapereader
from cartopy.mpl.gridliner import LONGITUDE_FORMATTER, LATITUDE_FORMATTER
cmap = plt.cm.viridis
def make_map(projection=ccrs.PlateCarree()):
fig, ax = plt.subplots(figsize=(9, 13),
subplot_kw=dict(projection=projection))
gl = ax.gridlines(draw_labels=True)
gl.xlabels_top = gl.ylabels_right = False
gl.xformatter = LONGITUDE_FORMATTER
gl.yformatter = LATITUDE_FORMATTER
ax.coastlines('50m')
return fig, ax
Let's compute a slice of potential temperature at -25 meters.
I could not figure out how to use masked array and tricontourf
.
The code was segfaulting!
I ended up filling the data with -999
and specifying the contour levels instead.
import numpy.ma as ma
temp_slice = zslice(q, p, -25)
mask = temp_slice = ma.masked_invalid(temp_slice)
vmin, vmax = temp_slice.min(), temp_slice.max()
temp_slice = temp_slice.filled(fill_value=-999)
fig, ax = make_map()
extent = [lon.min(), lon.max(),
lat.min(), lat.max()]
ax.set_extent(extent)
levels = np.arange(vmin, vmax, 0.5)
kw = dict(cmap=cmap, alpha=0.9, levels=levels)
cs = ax.tricontourf(triang, temp_slice, **kw)
kw = dict(shrink=0.5, orientation='vertical')
cbar = fig.colorbar(cs, **kw)
UGRID seems to be OK. Let's try some SGRID (ROMS) now.
url = ('http://tds.marine.rutgers.edu/thredds/dodsC/roms/espresso/2013_da/avg/'
'ESPRESSO_Real-Time_v2_Averages_Best')
cubes = iris.load_raw(url)
salt = cubes.extract_strict('sea_water_salinity')[-1, ...] # Last time step.
lon = salt.coord(axis='X').points
lat = salt.coord(axis='Y').points
p = salt.coord('sea_surface_height_above_reference_ellipsoid').points
q = salt.data
salt_slice = ma.masked_invalid(zslice(q, p, 300))
fig, ax = make_map()
extent = [lon.min(), lon.max(),
lat.min(), lat.max()]
ax.set_extent(extent)
cs = ax.pcolormesh(lon, lat, salt_slice, cmap=cmap)
kw = dict(shrink=0.5, orientation='vertical', extend='both')
cbar = fig.colorbar(cs, **kw)
I known we still have a bug when requesting levels where the data will fall in the top or bottom levels. I am working to find a proper way to return a valid slice in those places. However, we are pretty close to a code that works with any ocean model grid!
HTML(html)