Skip to content
Snippets Groups Projects
package-loader-tutorial.rst 26.30 KiB

Package Loader Tutorial

In this tutorial, we will see how to write a loader for |swh| that loads packages from a package manager, such as PyPI or Debian's.

First, you should be familiar with Python, unit-testing, |swh|'s :ref:`data-model` and :ref:`architecture`, and go through the :ref:`developer-setup`.

Creating the files hierarchy

Once this is done, you should create a new directory (ie. a (sub)package from Python's point of view) for you loader. It can be either a subdirectory of swh-loader-core/swh/loader/package/ like the other package loaders, or it can be in its own package.

If you choose the latter, you should also create the base file of any Python package (such as setup.py), you should import them from the swh-py-template repository.

In the rest of this tutorial, we will assume you chose the former and your loader is named "New Loader", so your package loader is in swh-loader-core/swh/loader/package/newloader/.

Next, you should create boilerplate files needed for SWH loaders: __init__.py, tasks.py, tests/__init__.py, and tests/test_tasks.py; copy them from an existing package, such as swh-loader-core/swh/loader/package/pypi/, and replace the names in those with your loader's.

Finally, create an entrypoint in :file:`setup.py`, so your loader can be discovered by the SWH Celery workers:

entry_points="""
    [swh.workers]
    loader.newloader=swh.loader.package.newloader:register
""",

Writing a minimal loader

It is now time for the interesting part: writing the code to load packages from a package manager into the |swh| archive.

Create a file named :file:`loader.py` in your package's directory, with two empty classes (replace the names with what you think is relevant):

from typing import Optional

import attr

from swh.loader.package.loader import BasePackageInfo, PackageLoader
from swh.model.model import Person, Release, Sha1Git, TimestampWithTimezone


@attr.s
class NewPackageInfo(BasePackageInfo):
    pass

class NewLoader(PackageLoader[NewPackageInfo]):
    visit_type = "newloader"

We now have to fill some of the methods declared by :class:`swh.loader.package.PackageLoader`: in your new NewLoader class.

Listing versions

get_versions should return the list of names of all versions of the origin defined at self.url by the default constructor; and get_default_version should return the name of the default version (usually the latest stable release).