Summary and Setup
What is a package?
- An organized collection of modules containing functions and classes with a common purpose.
- Packages allow developers to organize code in manageable and reusable structured units.
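To make the definition concrete, the sketch below inspects a package that ships with Python. It uses the standard library's json package purely as an illustration (it is not part of this lesson's example package), and the helper name is_package is made up for this example. A package is a module that carries a __path__ attribute and can contain other modules, while a plain module such as json.decoder does not.

```python
import importlib
import json  # a standard-library package, used here only as an illustration


def is_package(module) -> bool:
    """Return True if the imported module is a package (it has a __path__)."""
    return hasattr(module, "__path__")


# json is a package: a directory of modules with an __init__.py file.
print(is_package(json))

# json.decoder is one of the modules organized inside that package.
decoder = importlib.import_module("json.decoder")
print(is_package(decoder))
```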
Why is packaging important in scientific research?
- Reusability: Common code can be used across many projects
- Collaboration: Distribute your work with collaborators
- Reproducibility: Allows anyone to run code with the same dependencies to reproduce your results
- Portability: Work in more than one place (e.g. distributed computing)
If you have used Python for scientific computation, you’ve likely encountered widely distributed open-source packages like NumPy, SciPy, or Pandas. However, it’s important to note that packages can exist at many scales and contexts, ranging from personal and organizational use to wide public distributions.
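As a small illustration of working with distributed packages, the sketch below asks the current environment which version of a package is installed, if any. The helper name installed_version is hypothetical. Whether NumPy, SciPy, or Pandas are present depends entirely on your own environment, so the output will differ from machine to machine.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution_name: str):
    """Return the installed version string, or None if it is not installed."""
    try:
        return version(distribution_name)
    except PackageNotFoundError:
        return None


# The names come from the paragraph above; any of them may be missing
# from your environment, in which case None is printed.
for name in ["numpy", "scipy", "pandas"]:
    print(name, installed_version(name))
```

Recording these versions alongside your results is one simple way to support the reproducibility goal listed above.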
In this tutorial we’ll walk through methods and tools used to create and distribute your own Python packages.
Lesson Plan
We’re going to create a Python package from scratch, then publish it. Packages often include additional files for configuration, documentation, and automations. We’ll look at the following important groups of files:
.
├── src/                                   ─┐
│   └── example_package_YOUR_USERNAME_HERE/ │
│       ├── __init__.py                     │
│       └── rescale.py                      ├─ Minimum to make the code
├── tests/                                  │  work and be installable
│   └── test_rescale.py                     │
├── pyproject.toml                         ─┘ ← Metadata and versioning
│
├── .git/                                  ─┬─ Store the history of the code
├── .gitignore                             ─┘
│
├── .venv/                                 ─── Environment to run the code
│
├── README                                 ─┐
├── CHANGELOG                               ├─ Tell people browsing the code what it's for,
├── LICENSE                                 │  how it's changed, under what conditions they
├── CITATION.cff                           ─┘  can use it, and how they can cite it
│
├── docs/                                  ─┐
│   ├── index.md                            ├─ Document the code in more detail
│   └── conf.py                            ─┘
│
├── .pre-commit-config.yaml                ─┐
├── noxfile.py                              │
└── .github/                                ├─ Automate tasks which run _on_ the code
    └── workflows/                          │  like style-checks, tests, and publishing
        ├── test.yaml                       │
        └── publish.yaml                   ─┘
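The first group in the tree, the minimum needed for installable code, can be sketched as a short script that creates empty placeholder files in that layout. This is only an illustration of the directory structure: the function name create_skeleton is hypothetical, the files are left empty, and the package name is the placeholder from the tree, which you would replace with your own username.

```python
import tempfile
from pathlib import Path

# Placeholder from the tree above; substitute your own username.
PACKAGE_NAME = "example_package_YOUR_USERNAME_HERE"

# Relative paths of the "minimum" group of files shown in the tree.
MINIMAL_FILES = [
    f"src/{PACKAGE_NAME}/__init__.py",
    f"src/{PACKAGE_NAME}/rescale.py",
    "tests/test_rescale.py",
    "pyproject.toml",
]


def create_skeleton(root: Path) -> list:
    """Create empty placeholder files for the minimal layout under root."""
    created = []
    for relative in MINIMAL_FILES:
        path = root / relative
        # Create src/<package>/ and tests/ as needed.
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch()
        created.append(path)
    return created


# Build the skeleton in a throwaway directory and list what was created.
with tempfile.TemporaryDirectory() as tmp:
    for path in create_skeleton(Path(tmp)):
        print(path.relative_to(tmp))
```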
See Also
This is a tutorial. For reference material, you should bookmark the following guides:
Prerequisites
- Basic Python
User Accounts
If you don’t already have them, sign up for accounts on:
- https://github.com/join (required for the part on “continuous integration”, optional for setting up a development environment)
- https://test.pypi.org/account/register/ (required for the part on publishing)
- https://zenodo.org/signup/ (optional for the part on citation)
Development Environment
If this is your first time, use the Cloud instructions
If you have not installed and used these tools before, then in the interests of time for the lesson, please use the Cloud setup instructions. After the session you can repeat the setup using the Local setup instructions.
Support will be available for the rest of the week if you have trouble with setting up these tools locally!
Cloud
You can get all the prerequisites for this lesson by using a GitHub Codespace.
- Create a new repository called example-package-YOUR-USERNAME-HERE by going to https://github.com/new.
  - Ensure that the owner is you.
  - Initialize the repository with a README file by ticking the checkbox next to Add a README file.
- After the repository is created, click on the < > Code button, click the Codespaces tab, and click Create codespace on main. This will create an editor with a shell, python and git pre-configured for you.
Local
If you want to run this example on your own computer, you will need to install the parts independently.
Editor
We recommend using the text editor VSCode from https://code.visualstudio.com/ for this lesson. We won’t be using any of its special features, so if you prefer a different editor, please use that.
Shell
This lesson uses shell commands which you can run in a terminal emulator. Depending on the operating system you use, you have different options.
- Linux: you can use any terminal emulator. Common options are GNOME Terminal and Konsole (KDE).
- macOS: you can use any terminal emulator. Common options are Terminal.app and iTerm2.
- Windows: we recommend using the Windows Subsystem for Linux to install Linux (https://learn.microsoft.com/en-us/windows/wsl/install). You could also use PowerShell, but some commands will be different and others unavailable.
Python
You will need to have Python installed for this lesson.
To check if Python is available in your shell, run python3 --version. The output should be a single line starting with Python, followed by the installed version number.
If this command returns something like command not found: python3 then you can install Python using the instructions on https://wiki.python.org/moin/BeginnersGuide/Download.
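Once an interpreter is running, the same information is available from inside Python. The sketch below is one way to print it, with a hypothetical helper name version_string. It formats only the major, minor, and micro numbers, so for pre-release builds it will be slightly shorter than what python3 --version reports.

```python
import sys


def version_string(info=sys.version_info) -> str:
    """Format a version tuple roughly the way `python3 --version` reports it."""
    return f"Python {info[0]}.{info[1]}.{info[2]}"


# The version of the interpreter running this script.
print(version_string())

# The path of that interpreter, useful when several Pythons are installed.
print(sys.executable)
```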
Often, developers will need to manage multiple projects which might all use several different python versions. This is sometimes tricky, and there are specialized tools which can help, like:
Git & GitHub
You will need a GitHub account and an installation of Git.
If you don’t already have a GitHub account you can sign up at https://github.com/join.
For beginners, we recommend using GitHub Desktop when working with GitHub. Installation instructions are at https://desktop.github.com/.
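To confirm that Git is installed and visible to your shell, a check along the following lines may help. It is a minimal sketch: tool_version is a hypothetical helper, and it assumes the tool accepts a --version flag, which git does.

```python
import shutil
import subprocess


def tool_version(command: str):
    """Return the first line printed by `<command> --version`, or None if missing."""
    # shutil.which returns None when the command is not on the PATH.
    if shutil.which(command) is None:
        return None
    result = subprocess.run(
        [command, "--version"], capture_output=True, text=True, check=False
    )
    output = (result.stdout or result.stderr).strip()
    return output.splitlines()[0] if output else None


# Prints a line such as "git version ..." or None if Git is not installed.
print(tool_version("git"))
```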
Empty repository
Create a new repository using GitHub Desktop.
- Open GitHub Desktop
- Select File > New Repository
- Specify the repository data:
  - Name: example-package-YOUR-USERNAME-HERE (use dashes -, don’t use underscores _ for this name)
  - Local Path: pick somewhere which isn’t synced to a cloud provider like Dropbox, OneDrive or iCloud, because they can interfere with git.
  - Initialize the repository with a README.
  - Leave the .gitignore and License both as “None”.
- Click Create repository
- Click Publish repository to make it available on GitHub.
- Click Open in Visual Studio Code, or open Visual Studio Code, click File > Open folder and select the directory with your new repository.
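The repository name above uses dashes, while the package directory in the lesson plan uses underscores. One reason for the difference is that a name containing a dash is not a valid Python identifier, so it cannot be used in an import statement. The sketch below illustrates this with a hypothetical helper, import_name, applied to the placeholder name from this page.

```python
def import_name(repo_name: str) -> str:
    """Convert a dash-separated repository name to an importable package name."""
    return repo_name.replace("-", "_")


repo = "example-package-YOUR-USERNAME-HERE"
package = import_name(repo)

# str.isidentifier reports whether a name could appear after `import`.
print(repo, repo.isidentifier())
print(package, package.isidentifier())
```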