xray is an interesting package that has been under my radar for a while.
According to the docs xray is:
... an open source project and Python package that aims to bring the
labeled data power of pandas to the physical sciences, by providing
N-dimensional variants of the core pandas data structures, Series and
DataFrame: the xray DataArray and Dataset.
Our goal is to provide a pandas-like and pandas-compatible toolkit for
analytics on multi-dimensional arrays, rather than the tabular data for
which pandas excels. Our approach adopts the Common Data Model for
self- describing scientific data in widespread use in the Earth sciences
(e.g., netCDF and OPeNDAP): xray.Dataset is an in-memory representation of
a netCDF file.
This post is my first attempt using it for ocean models output. Because xray is under heavy development I will be using the latest git checkout.
import xray
print(xray.__version__)
xray uses python-netCDF4 under the hood, so it "speaks" OPeNDAP.
url = ('http://tds.marine.rutgers.edu/thredds/dodsC/roms/'
       'espresso/2013_da/his_Best/'
       'ESPRESSO_Real-Time_v2_History_Best_Available_best.ncd')
nc = xray.open_dataset(url)
It has a nice and meaniful __repr__ for the DataArray object.
salt = nc['salt']
salt
My first impression is that, when compared to raw netCDF4 interface, the
xray.DataArray object is richer (more metadata).  However, if compared to his older brother iris.cube.Cube, it falls short.  For example: xray
does not attach the 2D coordinates to the object. (As of the latest git
checkout it does attach the coordinates.)
print(salt.coords['xi_rho'].coords)
print(salt.coords['eta_rho'].coords)
print(salt.coords['lon_rho'].coords)
print(salt.coords['lat_rho'].coords)
The Dataset is another important xray object.  Here is a small snippet on how to construct one:
ds = xray.Dataset({'temperature': nc['temp']},
                  coords={'lon': (['eta_rho', 'xi_rho'], nc['lon_rho']),
                          'lat': (['eta_rho', 'xi_rho'], nc['lat_rho']),
                          's': nc['s_rho'],
                          'time': nc['time']})
ds
The best feature, IMHO, is the use of Pandas "fuzzyness" when slicing with
strings dates.  Iris uses the cumbersome lambda approach, but using strings is much easier.
s = salt.loc['2014-12-02']
s.coords['time']
import cartopy.crs as ccrs
import matplotlib.pyplot as plt
import cartopy.feature as cfeature
from cartopy.mpl.gridliner import LONGITUDE_FORMATTER, LATITUDE_FORMATTER
LAND = cfeature.NaturalEarthFeature('physical', 'land', '50m',
                                    edgecolor='face',
                                    facecolor=cfeature.COLORS['land'])
def make_map(bbox, projection=ccrs.PlateCarree()):
    fig, ax = plt.subplots(figsize=(8, 6),
                           subplot_kw=dict(projection=projection))
    ax.set_extent(bbox)
    ax.add_feature(LAND, facecolor='0.75')
    ax.coastlines(resolution='50m')
    gl = ax.gridlines(draw_labels=True)
    gl.xlabels_top = gl.ylabels_right = False
    gl.xformatter = LONGITUDE_FORMATTER
    gl.yformatter = LATITUDE_FORMATTER
    return fig, ax
import numpy.ma as ma
import matplotlib.pyplot as plt
from oceans.colormaps import cm
sal = s[0, -1, ...].values
lon, lat = nc['lon_rho'], nc['lat_rho']
fig, ax = make_map([lon.min(), lon.max(), lat.min(), lat.max()])
cs = ax.pcolormesh(lon.values, lat.values,
                   ma.masked_invalid(sal), cmap=cm.odv)
cbar = fig.colorbar(cs, extend='both')
Note that when plotting I had to use the .values method.  The reason is because the
DataArray object does not behave entirely like a numpy array.  The .ravel()
method, need for pcolormesh, is missing.  Also note that xray does not support masked array (yet).
One might ask: why use labeled arrays? The main argument is to write idiomatic expressions like,
s.mean(dim='time')
there is no doubt on which dimension I am averaging.
Because xray is designed to behave like pandas, when dealing with time series
one can take advantage of all pandas features.  The same may be accomplished with
iris.pandas.as_series, but with xray there is no awkward conversions.
Now let's try an example straight from xray's documentation.
First I will reduce the data from hourly to daily observations,
then compute weekly anomalies.
def daily_aggregator(data_array):
    time = data_array.coords['time']
    values = time.to_index().to_period('D').to_timestamp()
    return xray.DataArray(values, [time], name='daily')
temp = nc['temp'].loc['2013-06-01':'2013-07-01']  # I get a weird error if extending this.
ts = temp.isel(eta_rho=16, xi_rho=40, s_rho=-1)  # Cannot use lon, lat because xray does not support 2D dimensions.
daily_avg = ts.groupby(daily_aggregator(ts)).mean()
week_avg = daily_avg.groupby('daily.week').mean('daily')
anomalies = daily_avg.groupby('daily.week') - week_avg
ax = anomalies.reset_coords(drop=True).to_series().plot()
Those anomalies could very well be the GS front meandering around that point.
surface = temp.isel(s_rho=-1)
fig, ax = make_map([lon.min(), lon.max(), lat.min(), lat.max()])
cs = ax.pcolormesh(lon.values, lat.values,
                   ma.masked_invalid(surface[0].values),
                                     cmap=cm.avhrr, alpha=0.5, zorder=1)
p, = ax.plot(lon.isel(eta_rho=16, xi_rho=40),
             lat.isel(eta_rho=16, xi_rho=40),
             'ko', zorder=2)
HTML(html)
