From b2c6ddc35b6b126990099d923e91a365727aeccb Mon Sep 17 00:00:00 2001
From: Archit Agrawal <archit18221@iiitd.ac.in>
Date: Thu, 13 Jun 2019 15:39:16 +0530
Subject: [PATCH] tutorial: Add testing and how to run section

Add testing section for lister. Also added how to run a
new lister section which elaborates about the steps
required to run the now lister in docker
---
 docs/tutorial.rst | 73 ++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 69 insertions(+), 4 deletions(-)

diff --git a/docs/tutorial.rst b/docs/tutorial.rst
index 14e23f9c..6c656f2e 100644
--- a/docs/tutorial.rst
+++ b/docs/tutorial.rst
@@ -56,18 +56,18 @@ Fundamentally, a basic lister must follow these steps:
 3. Populate a work queue for fetching and ingesting source repositories.
 
 Steps 1 and 3 are generic problems, so they can get generic solutions hidden
-away in base code, most of which never needs to change. That leaves us to
-implement step 2, which can be trivially done now for services with clean web
+away in the base code, most of which never needs to change. That leaves us to
+implement step 2, which can be trivially done now for services with a clean web
 APIs.
 
-In the new code we've tried to hide away as much generic functionality as
+In the new code, we've tried to hide away as much generic functionality as
 possible, turning it into set-and-forget plumbing between a few simple
 customized elements. Different hosting services might use different network
 protocols, rate-limit messages, or pagination schemes, but, as long as there is
 some way to get a list of the hosted repositories, we think that the new base
 code will make getting those repositories much easier.
 
-First let me give you the 30,000 foot view…
+First, let me give you the 30,000 foot view…
 
 The old GitHub-specific lister code looked like this (265 lines of Python):
 
@@ -164,6 +164,71 @@ above are already provided for 99% of services by the HTTP mix-in module. It
 looks much simpler when we look at the actual implementations of the two
 new-style indexing listers we currently have…
 
+An important aspect for making a new lister is its testing. To register the 
+celery tasks of your new lister, you need to add your lister in the main
+conftest.py (swh/lister/core/tests/conftest.py)
+
+After testing, it is suggested to run your new lister in docker as it provides
+good, almost-production like test. Here are the steps you need to follow to run
+a new lister in docker.
+
+1. You must write a docker-compose override file (`docker-compose.override.yml`).
+   An example is given in the `docker-compose.override.yml.example` file ::
+
+        version: '2'
+
+        services:
+        swh-lister:
+            volumes:
+            - "$SWH_ENVIRONMENT_HOME/swh-lister:/src/swh-lister"
+
+   The file named `docker-compose.override.yml` will automatically be loaded by
+   `docker-compose`. For more details, you may refer to README.md present in
+   swh-docker-dev.
+2. Follow the instruction mentioned under heading Preparation steps and 
+   Configuration file sample in README.md of swh-lister.  
+3. Make sure to run storage (5002) and scheduler (5008) services locally.
+   You can run them by the following command::
+
+    ~/swh-environment/swh-docker-dev$ docker-compose up -d swh-scheduler-api \
+      swh-storage
+4. Add the lister task-type in the scheduler.  For example, if you want to
+   add pypi lister task-type ::
+
+    ~/swh-environment$swh-scheduler task-type add list-pypi recurring \
+     "Full pypi lister"
+
+  You can check all the task-type by::
+
+    ~/swh-environment$swh scheduler task-type list
+    Known task types:
+    list-bitbucket-incremental:
+      Incrementally list BitBucket
+    list-cran:
+      Full CRAN Lister
+    list-debian-distribution:
+      List a Debian distribution
+    list-github-full:
+      Full update of GitHub repos list
+    list-github-incremental:
+    ...
+
+  If your lister is creating new loading task not yet registered, you need
+  to register that task type as well. Like for GNU lister::
+
+     ~/swh-environment$swh scheduler task-type add load-gnu-full recurring \
+      "GNU Loader"
+
+5. Run your lister with the help of scheduler cli.You need to add the task in 
+   the schedular using its cli. For example you need to execute this command
+   to run gnu lister ::
+ 
+     ~/swh-environment$swh scheduler --url http://localhost:5008/ task add \
+      list-gnu-full --policy oneshot  
+
+After the execution of lister is complete you can see the loading task created.
+    ~/swh-environment/swh-lister$swh scheduler task list
+
 This is the entire source code for the BitBucket repository lister::
 
     # Copyright (C) 2017 the Software Heritage developers
-- 
GitLab