Summary and Setup

What is a package?

  • An organized collection of modules containing functions and classes with a common purpose.
  • Packages allow developers to organize code in manageable and reusable structured units.
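To make "a collection of modules" concrete, the sketch below lists the modules inside a package that ships with Python. The choice of `json` is only an illustration (the lesson itself does not use it); any importable package with a `__path__` attribute would work the same way.

```python
import json
import pkgutil

# A package is a module that has a __path__ attribute pointing at the
# directory (or directories) holding its submodules.
print(json.__path__)

# List the submodules found in that directory.
module_names = sorted(m.name for m in pkgutil.iter_modules(json.__path__))
print(module_names)
```

The exact list printed depends on your Python version, so no expected output is shown here.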

Why is packaging important in scientific research?

  • Reusability: Common code can be used across many projects
  • Collaboration: Distribute your work with collaborators
  • Reproducibility: Allows anyone to run code with the same dependencies to reproduce your results
  • Portability: Work in more than one place (e.g. distributed computing)

If you have used Python for scientific computation, you’ve likely encountered widely distributed open-source packages like NumPy, SciPy, or pandas. However, packages exist at many scales and in many contexts, ranging from personal and organizational use to wide public distribution.

In this tutorial we’ll walk through methods and tools used to create and distribute your own Python packages.

Lesson Plan


We’re going to create a Python package from scratch, then publish it. Packages often include additional files for configuration, documentation, and automation. We’ll look at the following important groups of files:

.
├── src/                    ─┐
│   └── example_package_YOUR_USERNAME_HERE/
│       ├── __init__.py      │
│       └── rescale.py       ├─ Minimum to make the code
├── tests/                   │  work and be installable
│   └── test_rescale.py      │
├── pyproject.toml          ─┘ ← Metadata and versioning
│
├── .git/                   ─┬─ Store the history of the code
├── .gitignore              ─┘
│
├── .venv/                  ─── Environment to run the code
│
├── README                  ─┐
├── CHANGELOG                ├─ Tell people browsing the code what it's for,
├── LICENSE                  │  how it's changed, under what conditions they
├── CITATION.cff            ─┘  can use it, and how they can cite it
│
├── docs/                   ─┐
│   ├── index.md             ├─ Document the code in more detail
│   └── conf.py             ─┘
│
├── .pre-commit-config.yaml ─┐
├── noxfile.py               │
└── .github/                 ├─ Automate tasks which run _on_ the code
    └── workflows/           │  like style-checks, tests, and publishing
        ├── test.yaml        │
        └── publish.yaml    ─┘
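
The first group in the tree above (the minimum needed to make the code work and be installable) can be sketched as a short script that creates empty placeholder files. This is an illustration of the layout only: the helper name `make_skeleton` is hypothetical, the files are left empty, and the script writes into a temporary directory rather than your repository.

```python
import tempfile
from pathlib import Path


def make_skeleton(root, package_name):
    """Create empty placeholder files for the minimal layout.

    Returns the created paths, relative to root, as sorted strings.
    """
    files = [
        Path("src") / package_name / "__init__.py",
        Path("src") / package_name / "rescale.py",
        Path("tests") / "test_rescale.py",
        Path("pyproject.toml"),
    ]
    for rel in files:
        target = Path(root) / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.touch()  # empty placeholder only, no real content
    return sorted(rel.as_posix() for rel in files)


# Build the skeleton in a throwaway directory and show what was created.
with tempfile.TemporaryDirectory() as tmp:
    for path in make_skeleton(tmp, "example_package_YOUR_USERNAME_HERE"):
        print(path)
```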

See Also

This is a tutorial. For reference material, you should bookmark the following guides:

Prerequisites

  • Basic Python

User Accounts


If you don’t already have them, sign up for accounts on:

Development Environment


If this is your first time, use the Cloud instructions

If you have not installed and used these tools before, then in the interests of time for the lesson, please use the Cloud setup instructions. After the session you can repeat the setup using the Local setup instructions.

Support will be available for the rest of the week if you have trouble with setting up these tools locally!

Cloud

You can get all the prerequisites for this lesson by using a GitHub Codespace.

  • Create a new repository called example-package-YOUR-USERNAME-HERE by going to https://github.com/new.
    • Ensure that the owner is you.
    • Initialize the repository with a README file by ticking the checkbox next to Add a README file.
  • After the repository is created, click on the < > Code button, click the Codespaces tab, and click Create codespace on main. This will create an editor with a shell, Python, and Git pre-configured for you.

Local

If you want to run this example on your own computer, you will need to install each of the following tools separately.

Editor

We recommend using the text editor Visual Studio Code (VS Code) from https://code.visualstudio.com/ for this lesson. We won’t be using any of its special features, so if you prefer a different editor, please use that.

Shell

This lesson uses shell commands which you can run in a terminal emulator. Your options depend on your operating system:

  • Linux: you can use any terminal emulator. Common options are GNOME Terminal and Konsole (KDE).
  • macOS: you can use any terminal emulator. Common options are Terminal.app and iTerm2.
  • Windows: we recommend using the Windows Subsystem for Linux to install Linux (https://learn.microsoft.com/en-us/windows/wsl/install). You could also use PowerShell, but some commands will be different and others unavailable.

Python

You will need to have Python installed for this lesson.

To check whether Python is available in your shell, run python3 --version. You should see output like:

BASH

% python3 --version
Python 3.12.3

If this command returns something like command not found: python3, you can install Python using the instructions at https://wiki.python.org/moin/BeginnersGuide/Download.
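
Once an interpreter is running, the same information is available from inside Python, which can help when several versions are installed. The sketch below is only illustrative: the function name `interpreter_summary` is hypothetical, and the default minimum of 3.9 is an arbitrary assumption, since the lesson does not state a required version.

```python
import shutil
import sys


def interpreter_summary(minimum=(3, 9)):
    """Report which python3 is on PATH and which version is running."""
    current = tuple(sys.version_info[:3])
    return {
        # Full path of the python3 executable on PATH, or None if absent.
        "python3_on_path": shutil.which("python3"),
        # Version of the interpreter executing this script.
        "running_version": ".".join(str(part) for part in current),
        # Tuple comparison: (3, 12, 3) >= (3, 9) is True.
        "meets_minimum": current >= tuple(minimum),
    }


for key, value in interpreter_summary().items():
    print(f"{key}: {value}")
```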

Developers often need to manage multiple projects that use different Python versions. This can be tricky, and there are specialized tools which can help, like:

Git & GitHub

You will need a GitHub account and an installation of Git.

If you don’t already have a GitHub account you can sign up at https://github.com/join.

For beginners, we recommend using GitHub Desktop when working with GitHub. Installation instructions are at https://desktop.github.com/.

Empty repository

Create a new repository using GitHub Desktop.

  • Open GitHub Desktop.
  • Select File > New Repository.
  • Specify the repository data:
    • Name: example-package-YOUR-USERNAME-HERE (use dashes -, don’t use underscores _ for this name)
    • Local Path: pick somewhere which isn’t synced to a cloud provider like Dropbox, OneDrive or iCloud, because these can interfere with Git.
    • Initialize the repository with a README.
    • Leave the .gitignore and License both as “None”.
  • Click Create repository.
  • Click Publish repository to make it available on GitHub.
  • Click Open in Visual Studio Code, or open Visual Studio Code, click File > Open folder, and select the directory with your new repository.
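
The repository name uses dashes, while the package directory under src/ in the lesson plan uses underscores. The reason is that a name used in a Python import statement must be a valid identifier, and identifiers cannot contain dashes. A minimal sketch of that conversion follows; the function name `import_name_from_repo` is hypothetical and is not part of any packaging tool.

```python
import keyword


def import_name_from_repo(repo_name):
    """Convert a dashed repository name into an importable package name."""
    candidate = repo_name.replace("-", "_")
    # Reject names Python could not import, such as "1_bad" or "class".
    if not candidate.isidentifier() or keyword.iskeyword(candidate):
        raise ValueError(f"{candidate!r} is not a valid Python import name")
    return candidate


print(import_name_from_repo("example-package-YOUR-USERNAME-HERE"))
# example_package_YOUR_USERNAME_HERE
```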