Package Loader Tutorial
In this tutorial, we will see how to write a loader for |swh| that loads packages from a package manager, such as PyPI or Debian's.
First, you should be familiar with Python, unit-testing, |swh|'s :ref:`data-model` and :ref:`architecture`, and have gone through the :ref:`developer-setup`.
Creating the files hierarchy
Once this is done, you should create a new directory (i.e. a (sub)package from
Python's point of view) for your loader.
It can be either a subdirectory of :file:`swh-loader-core/swh/loader/package/`,
like the other package loaders, or it can be in its own package.
If you choose the latter, you should also create the base files of any Python
package (such as :file:`setup.py`); you should import them from the
``swh-py-template`` repository.
In the rest of this tutorial, we will assume you chose the former and
your loader is named "New Loader", so your package loader is in
:file:`swh-loader-core/swh/loader/package/newloader/`.
Next, you should create the boilerplate files needed for SWH loaders:
:file:`__init__.py`, :file:`tasks.py`, :file:`tests/__init__.py`, and
:file:`tests/test_tasks.py`; copy them from an existing package, such as
:file:`swh-loader-core/swh/loader/package/pypi/`, and replace the names in those
files with your loader's.
Finally, create an entrypoint in :file:`setup.py`, so your loader can be discovered by the SWH Celery workers:
entry_points="""
[swh.workers]
loader.newloader=swh.loader.package.newloader:register
""",
Writing a minimal loader
It is now time for the interesting part: writing the code to load packages from a package manager into the |swh| archive.
Create a file named :file:`loader.py` in your package's directory, with two empty classes (replace the names with what you think is relevant):
from typing import Optional

import attr

from swh.loader.package.loader import BasePackageInfo, PackageLoader
from swh.model.model import Person, Release, Sha1Git, TimestampWithTimezone


@attr.s
class NewPackageInfo(BasePackageInfo):
    pass


class NewLoader(PackageLoader[NewPackageInfo]):
    visit_type = "newloader"
We now have to fill in some of the methods declared by
:class:`swh.loader.package.PackageLoader` in your new ``NewLoader`` class.
Listing versions
``get_versions`` should return the list of names of all versions of the origin
defined at ``self.url`` by the default constructor, and ``get_default_version``
should return the name of the default version (usually the latest stable release).
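The contract of these two methods can be sketched as follows. To stay runnable without |swh| installed, the sketch uses a stand-alone class instead of subclassing ``PackageLoader``; in a real loader both methods would live on ``NewLoader``. The ``FAKE_METADATA`` dictionary, its keys, and the example URL are all assumptions standing in for whatever the package manager's API actually returns.

```python
from typing import Dict, Sequence

# Hypothetical metadata keyed by origin URL; a real loader would fetch
# this from the package manager instead of hard-coding it.
FAKE_METADATA: Dict[str, dict] = {
    "https://example.org/packages/foo": {
        "latest_stable": "0.2.0",
        "versions": ["0.1.0", "0.2.0", "1.0.0rc1"],
    }
}


class VersionListingSketch:
    """Stand-in for NewLoader, showing only the version-listing methods."""

    def __init__(self, url: str) -> None:
        # Mirrors the default constructor, which stores the origin URL
        self.url = url
        self._metadata = FAKE_METADATA[url]

    def get_versions(self) -> Sequence[str]:
        # Names of all known versions of the origin at self.url
        return list(self._metadata["versions"])

    def get_default_version(self) -> str:
        # Here the (assumed) metadata names the latest stable release directly
        return self._metadata["latest_stable"]


loader = VersionListingSketch("https://example.org/packages/foo")
versions = loader.get_versions()
default_version = loader.get_default_version()
```

Note that the default version is one of the names returned by ``get_versions``, and that the pre-release ``1.0.0rc1`` is listed but not chosen as the default.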