Not an Albertan pipeline, but a CI/CD pipeline.

This week’s lab for my Open Source Development class has us working on a CD pipeline (as opposed to last week’s, which covered the CI aspect) for our ever-changing, semester-long, link-checking program.

For the uninitiated: a Continuous Delivery (CD) pipeline deploys a new version of the software to the public for use (whenever a new version is available). Releasing a new version can be done by hand, but a CD pipeline makes the process automatic, so developers don’t have to release a new version manually and can instead focus on development.

My program is written in Python, and as such the process of creating my CD pipeline for it will be described below from the perspective of the Python ecosystem.

Let’s dive in!


For delivering new versions of my program, naturally the first step was to investigate and evaluate potential websites for hosting my deliverables. Since PyPI is the go-to standard for Python packages, I decided to use that. The next step is obviously to RTFM.

Since we’re automating the packaging of our Python project, we have to ensure that the package has the right structure. Thankfully, my project knows its place, and was already structured properly. Here is a structure example from the above link, with my project’s structure below it:

packaging_tutorial
├── LICENSE
├── README.md
├── example_pkg
│   └── __init__.py
├── tests
│   └── __init__.py
└── setup.py
He-s-Dead-Jim
├── LICENSE
├── README.md
├── src
│   ├── __init__.py
│   ├── hdj.py
│   ├── hdj_fileio.py
│   ├── hdj_linkchecker.py
│   └── hdj_util.py
├── tests
│   ├── __init__.py
│   ├── test_hdj_fileio.py
│   └── test_hdj_linkchecker.py
└── setup.py

Keener readers (all 2 of you) will have noticed the setup.py file. setup.py is important for a couple of reasons:

  1. It declares your project’s dependencies, without which your project will never run.
  2. It also drives the packaging process (see below), determining which dependencies and files get included in the distribution.

I actually had a setup.py file at one point, but recently deleted it in favour of the (much) easier requirements.txt. So I made it again (and actually learned how to use it this time):

import setuptools

with open("docs/README.md", "r", encoding="utf-8") as fh:
    long_description = fh.read()

setuptools.setup(
    name="He-s-Dead-Jim",  # PyPI package names can't contain spaces, apostrophes, or commas
    version="1.0.4",
    author="Chris Pinkney",
    author_email="[email protected]",
    description="A command-line tool for finding and reporting dead/broken links in a file or webpage.",
    long_description=long_description,
    long_description_content_type="text/markdown",
    url="https://github.com/chrispinkney/He-s-Dead-Jim",
    install_requires=[
        "argparse == 1.4.0",
        "requests == 2.24.0",
        "beautifulsoup4 == 4.9.1",
        "datetime == 4.3",
        "colorama == 0.4.4",
        "black == 20.8b1",
        "flake8 == 3.8.4",
        "pre-commit == 2.7.1",
        "pytest == 6.1.2",
        "pytest-cov == 2.10.1",
    ],
    packages=setuptools.find_packages(),
    classifiers=[
        "Programming Language :: Python :: 3",
        "License :: OSI Approved :: MIT License",
        "Operating System :: OS Independent",
    ],
    entry_points={
        "console_scripts": [
            "hdj = src.hdj:main_wrapper",
        ]
    },
    python_requires=">=3.6",
)

An important step is getting the entry_points dict just right. I had to restructure a fair bit of my project to accommodate setup.py, namely the main file: console_scripts needs an entry point it can call with no arguments, and the main function already in place expected command line arguments, making it not runnable when given none. I’m sure there’s a better way to do this, but for now it works. I also had to qualify my import statements with the package name, as the built version of my project was having trouble finding the proper files. Not sure why, but my hack fixed it.
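For anyone hitting the same wall, the general shape of the fix looks something like this (a rough sketch of src/hdj.py; the argument names and the body of main() are simplified stand-ins, not the project’s real code):

import argparse
import sys


def main(args):
    """The original main(): expects command line arguments to already be parsed."""
    print(f"Checking {args.path} for dead links...")  # stand-in for the real work
    return 0


def main_wrapper():
    """Zero-argument entry point referenced by setup.py's console_scripts.

    console_scripts functions are called with no arguments, so the wrapper
    parses sys.argv itself before handing off to main().
    """
    parser = argparse.ArgumentParser(prog="hdj")
    parser.add_argument("path", help="file or URL to check")
    sys.exit(main(parser.parse_args()))


if __name__ == "__main__":
    main_wrapper()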

In order to test that my setup.py file works, I created a virtual environment to see if all my dependencies were being installed properly:

  1. python -m venv hdj_env
  2. .\hdj_env\Scripts\activate.bat
  3. pip install .
  4. pip list
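If everything is wired up correctly, step 4 should list the dependencies pinned in setup.py, plus the package itself, something like this (an abridged, hand-written sample rather than real captured output):

Package        Version
-------------- -------
beautifulsoup4 4.9.1
colorama       0.4.4
He-s-Dead-Jim  1.0.4
requests       2.24.0
...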

As your shiny new virtual environment does not come with ANY preinstalled dependencies, if your project’s dependencies show up after running step 4, you know you did dun did it all good-like. With my setup.py file created and functioning properly, let’s see if we can actually package the project.

In order to package my project, I ran python setup.py sdist bdist_wheel to generate the Source Archive (.tar.gz) and the Built Distribution (.whl) files. We generate these two files in particular because they’re what PyPI accepts for upload. If our project can generate these files, we know our automated CD pipeline can too.
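If the build succeeds, a dist/ directory appears containing the two deliverables, named roughly like this (roughly, because the wheel filename normalizes the package name, swapping dashes for underscores):

dist
├── He-s-Dead-Jim-1.0.4.tar.gz
└── He_s_Dead_Jim-1.0.4-py3-none-any.whl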

Let’s test manually uploading these files to PyPI to see if they will even be accepted.

PyPI uses a piece of software called Twine to upload packages to its servers, so naturally let’s start there: python -m pip install --user --upgrade twine. With Twine installed, we now need to create an account and an API token on PyPI. But every upload will prompt you for that API token, which gets annoying, so let’s register the token locally. Simply create a file called .pypirc in C:\Users\Chris (aka ~/) and put the following contents into it so Twine won’t prompt for a username and password:

[distutils]
  index-servers =
    pypi

[pypi]
  username = __token__
  password = <your API key here>
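Before uploading, it’s also worth running python -m twine check dist/* (an optional sanity check of my own, not a required step): it flags problems like a long_description that won’t render on PyPI’s project page.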

Note that PyPI only allows any given version of your software on its servers once, forever. Even if you delete that version, you must increment your version number in order to push new source code, so if your pushes or your CD pipeline fail, that might be why.
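In practice, incrementing just means editing the version= field in setup.py (e.g. from "1.0.4" to "1.0.5") before rebuilding, and keeping your git tags in sync with it.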

And now the moment of truth: pushing our package to PyPI. Remember that the archive and .whl files we generated earlier are located in the dist/ directory, so let’s point our command there: python -m twine upload dist/*

Great success! Since my project is now uploaded, I can (kind of) be my own tester using a virtual environment! Let’s create a virtual environment, activate it, and see if our project installs directly from PyPI: python -m pip install He-s-Dead-Jim
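And since the console_scripts entry point puts an hdj command on the virtual environment’s PATH, something like hdj https://example.com (a hypothetical invocation; the real arguments are documented in the README) should run the link checker without ever touching the source tree.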

Woohoo! We’re now live here.

Okay, so what? We can upload files manually. That’s all fine and dandy, but let’s take this one step further and add a CD pipeline to our GitHub repo to automate the process for us. The process is surprisingly easy, and similar to creating a CI pipeline: generate a secret, create yet another YML file (heh) to register our GitHub Actions, and push. Let’s begin (RTFM here.)

Remember that API token I created earlier? Well, we need another one. This one is for GitHub (which is effectively another user submitting our project on our behalf, so naturally it needs an API token too.) I generated a new one and added it to my repo’s secrets (on GitHub: Your Repo > Settings > Secrets), and named it PYPI_PASSWORD (the secret name will be referenced later). Then I created the following workflow .yml file in my .github/workflows/ folder:

name: CD Pipeline - Publish to PyPI and TestPyPI

on:
  push:  # Pattern matched against refs/tags
    tags:
      - '*'  # Push events to every tag

jobs:
  build-n-publish:
    name: Publish Python distributions to PyPI and TestPyPI
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@master
      - name: Set up Python 3.9
        uses: actions/setup-python@v1
        with:
          python-version: 3.9
      - name: Install pypa/build
        run: >-
          python -m
          pip install
          build
          --user          
      - name: Build a binary wheel and a source tarball
        run: >-
          python -m
          build
          --sdist
          --wheel
          --outdir dist/
          .          
      - name: Publish distribution to PyPI
        if: startsWith(github.ref, 'refs/tags')
        uses: pypa/gh-action-pypi-publish@master
        with:
          password: ${{ secrets.pypi_password }}

Now that my project has a proper setup.py file configured, and can properly generate source archives and wheels for automated CD upload, let’s try adding a tag to the project and see if it uploads automatically. (Reminder: we need a tag because our CD pipeline will run this process each and every time a new tag is pushed):

git tag -a 1.0.4 -m "Big release!"

git push --follow-tags

Amazing. I can’t believe all of this is free. Unreal.

And that, my friends, is how I spent 9 hours avoiding studying for my finals which take place in 9 days.


With my project now live, I asked my friend Nilan to test my project out and see if it installs, runs, and works for him out of the box. The process went something like this:

Me: Hey, I updated my project’s README, can you test it for me?
Me: Just follow the steps listed here.
Nilan: wtf, I have to install Python?

<15 minutes later>

Nilan: Wow I can’t believe that just worked out of the box.
Me: Cool eh?
Nilan: This took you four months to build?
Me: …
Nilan: Also fix up your readme, it’s kind of confusing. Add some more what/where/why/how details, but otherwise your program downloaded and ran perfectly.
Me: … ok.

It was great to see that the program just worked right off the bat from a fresh install, and his feedback regarding the wording was really helpful; I think the document reads really nicely now.


For a very brief moment in time, I was number one in something:

Also, adding emojis to GitHub Actions makes things really obnoxious, haha.

All I want for Christmas is to pass my Data Structures and Algorithms class.