Community powered packaging: conda-forge

$ whoami?

  • Filipe Fernandes
  • twitter, GitHub, etc: ocefpaf

What do I do?

IOOS

The Problem

The scientific Python community has always wanted a

  • cross-platform package manager
  • that does not require elevated privileges
  • handles all types of packages
  • including compiled Python packages and non-Python packages
  • and generally lets Python be the awesome scientific toolbox of choice.

History

pip before wheels

History

distributions

EPD, Anaconda, Pythonxy, ActivePython, etc

History

conda and the various 3rd party channels

History

pip after wheels

We are happy, right!?

Packaging is not easy

  • We still have trouble installing packages on our machines
  • It is worse when a package uses ctypes to access an odd dependency,
  • or has a compilation step: Cython, C/C++, or Fortran extensions
  • Windows!? vcvarsall.bat!!!

(╯°□°)╯︵ ┻━┻

What is conda?

  • An open-source packaging system developed for, and used by, the scientific software community.
  • From their own webpage:
  • "Package Everything!"
  • "And share your repositories with clients or colleagues."

Why conda?

  • Why not apt-get, yum, chocolatey, or brew?
  • Why not pip and wheels?

What is a conda channel?

  • It is similar to a Linux repository (or app store)
  • The service is hosted for free at Continuum's Anaconda Cloud
  • We can upload pre-compiled binaries using the conda package manager

Packaging is hard and you should not do it alone!

How is conda-forge different from any other 3rd party channel?

  • It is a community-led collection of recipes,
  • build infrastructure,
  • and packages for the conda package manager.

conda-forge

Recipes 0

package:
  name: pandas
  version: {{ version }}
source:
  url: https://github.com/pydata/pandas/archive/v{{ version }}.tar.gz
  sha256: d9f67bb17f334ad395e01b2339c3756f3e0d0240cb94c094ef711bbfc5c56c80
build:
  number: 0
  script: python setup.py install --single-version-externally-managed --record=record.txt
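
Aside: the sha256 field pins the exact source archive, so the build fails if the download differs from what the maintainer reviewed. A minimal sketch of how a maintainer might compute that value for a downloaded tarball, using only the standard library. The helper name sha256_of_file is hypothetical and is not part of conda-build.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`.

    Hypothetical helper: the result is the kind of string that goes
    into the recipe's `sha256:` field.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large source archives do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling sha256_of_file on the downloaded archive and pasting the result into the recipe is enough; the path is whatever the maintainer saved the archive as.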

Recipes 1

about:
  home: http://pandas.pydata.org
  license: BSD 3-clause
  summary: 'High-performance, easy-to-use data structures and data analysis tools.'
extra:
  recipe-maintainers:
    - jreback
    - jorisvandenbossche
    - TomAugspurger

Infrastructure

The community by numbers

outdated daily

  • 458 people
  • 2,456 teams
  • 2,461 repositories

How to use the channel?

conda config --add channels conda-forge

conda install gdal -c conda-forge

How to help the community?

  • reporting issues
  • updating existing packages
  • adding new packages
  • reviewing new recipes

Reporting Issues

Updating a recipe

Adding new packages

everything starts with a PR

  • The point of entry is staged-recipes
  • Once the PR is accepted, a GitHub team is created based on the maintainers list
  • The maintainers have commit rights only to their own recipes

staged-recipes PR

What is a feedstock?

  • The new package lives in the feedstock
  • The repository, CI configuration, and team permissions are created automatically
  • The tooling lives in conda-smithy

Reviewing recipes

  • not everything can be automated
  • it is one of the most human-time demanding activities in conda-forge
  • consists of catching errors and suggesting "best practices"

Looking under the hood

  • conda-smithy tool to lint, re-render, and create the feedstocks
  • Heroku services to update the feedstocks (auto re-render)
  • and to pin the dependencies to a 'known' set of versions

linter

conda smithy recipe-lint
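
To give a feel for the kind of check a recipe linter performs, here is a toy sketch. It is not conda-smithy's implementation: the REQUIRED table and the lint function are assumptions made for illustration, and the recipe is assumed to be already parsed into a plain dictionary.

```python
# Hypothetical table of sections and the fields expected in each one.
REQUIRED = {
    "about": ["home", "license", "summary"],
    "extra": ["recipe-maintainers"],
}


def lint(meta):
    """Return a list of human-readable problems found in a parsed recipe."""
    problems = []
    for section, fields in REQUIRED.items():
        # A missing or empty section is treated as having no fields at all.
        content = meta.get(section) or {}
        for field in fields:
            if not content.get(field):
                problems.append(f"The {section} section is missing '{field}'.")
    return problems
```

An empty list means the recipe passed these checks; anything else is a message a reviewer could paste into the PR.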

conda-smithy templates

osx_image: xcode6.4
{% block env -%}
{% if matrix[0] or travis.secure -%}
env:
  {%- if matrix[0] %}
  matrix:
    {% for case in matrix %}
    - {{ matrix_env(case) }}
    {%- endfor %}
  {%- endif %}

MNT updating a feedstock

conda smithy rerender

MNT pinning the dependencies

Pinning bot

pinned = {
    'boost': 'boost 1.64.*',
    'bzip2': 'bzip2 1.0.*',
    'cairo': 'cairo 1.14.*',
    'ffmpeg': 'ffmpeg 2.8.*',
    'freetype': 'freetype 2.7|2.7.*',
    'geos': 'geos 3.5.1',
    'giflib': 'giflib 5.1.*',
    # ...
}
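
A minimal sketch of how a table like this could be applied to a recipe's requirements. The apply_pins function is hypothetical and is not the pinning bot's actual code. It also assumes that a requirement which already carries a version constraint is left alone, which may not match what the bot really does.

```python
# A small subset of the pinning table above, repeated so the sketch is self-contained.
pinned = {
    "boost": "boost 1.64.*",
    "bzip2": "bzip2 1.0.*",
    "geos": "geos 3.5.1",
}


def apply_pins(requirements, pins):
    """Return a new requirement list with bare package names replaced by their pins."""
    result = []
    for requirement in requirements:
        parts = requirement.split()
        if not parts:
            continue  # skip blank entries
        name = parts[0]
        if len(parts) == 1 and name in pins:
            # Bare name with a known pin: substitute the pinned spec.
            result.append(pins[name])
        else:
            # Already constrained, or not in the table: keep as written.
            result.append(requirement)
    return result
```

For example, apply_pins(["python", "boost", "geos 3.4.*"], pinned) pins boost, leaves python untouched because it is not in the table, and keeps the explicit geos constraint.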

Who are the core members?

But why not just upload wheels for dose to PyPI?

  • PyPI is a first come first served publishing platform
  • One will need the author's permission to do that
  • Wheels can be built on GitHub using CIs but in a decentralized way
  • Hard to share experience and take advantage of the community
  • The dream? Use the conda-forge workflow to publish wheels on PyPI too

Want to know more?

Aside: conda

  • conda is a cross-platform package manager (compare conda to apt-get/yum/zypper, not pip)
  • manages multiple sources of binaries: numpy+atlas, numpy+openblas, numpy+mkl
  • cannot install directly from source (can install Python)
  • should not be used as a canonical source for Python packages

Aside: pip

  • will install into your system Python easily (but cannot install Python)
  • can install packages directly from the source distribution
  • cannot manage multiple sources of binaries: numpy+openblas built with manylinux1
  • pip (and PyPI) is the canonical source for Python packages

conda+pip

name: IOOS
channels:
  - conda-forge
dependencies:
  - jsanimation
  - pynco
  - emacs
  - r-oce
  - pip:
    - dolfyn

conda env create -f environment.yml

A tale of two packages

(actually only one with two hard to install dependencies)

dose install instructions:

  • Install gtk library and headers (dev/devel/etc)
  • Install wxpython
  • pip install dose

Not easy for most users!

conda-forge solution for dose?

  • Same steps! But we need to do it only once.
  • The build steps (recipe) and the binary are shared on the cloud
  • Then anyone can just conda install dose --channel conda-forge

Questions?