models: Keep scheduler task ids reference on deposit model

First step toward easing the rescheduling of new deposits: keep a reference to the scheduler task identifiers on the deposit side.

Related #1703 (closed)

Test Plan

tox

Then using docker-dev:

doco up -d
  • Trigger a deposit
swh deposit upload --url http://localhost:5006/1 --username test \
                   --password test --collection test \
                   --archive ../swh-docker-dev.tgz  --author mg \
                   --name 'swh-docker-dev'
  • Check in the db that check_task_id and load_task_id are referenced within the deposit record
$ doco exec swh-deposit bash -c 'psql swh-deposit -c "select id, status, swh_id, check_task_id, load_task_id from deposit"'
 id | status |                       swh_id                       | check_task_id | load_task_id
----+--------+----------------------------------------------------+---------------+--------------
  1 | done   | swh:1:dir:3b0919ddd42be1ba0405d33f383b6e0ee8dedcba | 1             | 2
  (1 row)
  • Check that those correspond to the scheduler tasks:
$ swh scheduler task list
Found 2 tasks

Task 1
  Next run: 19 minutes ago (2019-05-07 11:25:32+00:00)
  Interval: 1 day, 0:00:00
    Type: swh-deposit-archive-checks
    Policy: oneshot
    Status: completed
    Priority:
    Args:
    Keyword args:
      deposit_check_url: '/1/private/test/1/check/'

Task 2
  Next run: 3 minutes ago (2019-05-07 11:41:27+00:00)
    Interval: 1 day, 0:00:00
    Type: swh-deposit-archive-loading
    Policy: oneshot
    Status: next_run_scheduled  # <- strange as the scheduling took place, issue unrelated to the deposit's code though
    Priority:
    Args:
    Keyword args:
      archive_url: '/1/private/test/1/raw/'
      deposit_meta_url: '/1/private/test/1/meta/'
      deposit_update_url: '/1/private/test/1/update/'
  • Empty the record and change status to 'verified'
swh-deposit=# update deposit
swh-deposit-# set status='verified', swh_id=null, swh_anchor_id=null, swh_id_context=null, swh_anchor_id_context=null
swh-deposit-# where id=1;
UPDATE 1
swh-deposit=# select id, status, swh_id, check_task_id, load_task_id from deposit;
 id |  status  | swh_id | check_task_id | load_task_id
----+----------+--------+---------------+--------------
  1 | verified |        | 1             | 2
  (1 row)
  • Manually respawn the loading task using the associated task id
$ swh scheduler task respawn 2
  • Wait for the loading to catch up

  • Check that the deposit's status is 'done' again, with the same ids as initially

swh-deposit=# select id, status, swh_id, check_task_id, load_task_id from deposit;
 id | status |                       swh_id                       | check_task_id | load_task_id
----+--------+----------------------------------------------------+---------------+--------------
  1 | done   | swh:1:dir:3b0919ddd42be1ba0405d33f383b6e0ee8dedcba | 1             | 2
  (1 row)
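
The reset-and-respawn steps above can be sketched in Python. This is only an illustration under stated assumptions: an in-memory sqlite3 table stands in for the PostgreSQL deposit table, the schema is reduced to the columns queried in this test plan, and `reset_to_verified` is a hypothetical helper, not part of swh-deposit. The point is that, with load_task_id kept on the deposit record, the task to respawn can be read straight from the deposit.

```python
import sqlite3

# Assumption: simplified stand-in for the deposit table (the real one is in PostgreSQL
# and has more columns, e.g. swh_anchor_id, swh_id_context).
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table deposit ("
    " id integer primary key, status text, swh_id text,"
    " check_task_id text, load_task_id text)"
)
conn.execute(
    "insert into deposit values (1, 'done',"
    " 'swh:1:dir:3b0919ddd42be1ba0405d33f383b6e0ee8dedcba', '1', '2')"
)


def reset_to_verified(conn, deposit_id):
    """Hypothetical helper: clear the loading result, keep the task ids,
    and return the scheduler task id of the loading task."""
    conn.execute(
        "update deposit set status = 'verified', swh_id = null where id = ?",
        (deposit_id,),
    )
    row = conn.execute(
        "select load_task_id from deposit where id = ?", (deposit_id,)
    ).fetchone()
    return row[0]


task_id = reset_to_verified(conn, 1)
# Build the respawn command used in the test plan from the stored task id.
print(f"swh scheduler task respawn {task_id}")
```

The sketch only builds the command line; running the respawn itself is left to the scheduler CLI as in the step above.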

Migrated from D1446 (view on Phabricator)
