
tar.loader: Make the loader-tar able to download remote artifacts

Also:

  • Keep the legacy behavior: use the revision and the snapshot's branch provided by the caller (the deposit loader, for example, depends on this)
  • Increase code coverage (from 50% to 93%)
  • Remove the dir loader inheritance (a nightmare to maintain). We still need the dir loader for some functions, but the main issue is resolved.
  • Remove the deprecated producer code (we use the scheduler now)

Related T1351
Related T1431
Depends on !9 (closed)

Sample usage:

>>> url = 'https://ftp.gnu.org/gnu/8sync/8sync-0.1.0.tar.gz'
>>> origin = {'url': url, 'type': 'tar'}
>>> visit_date = 'Tue, 3 May 2017 17:16:32 +0200'
>>> last_modified = '2016-04-22 16:35'
>>> import logging
>>> logging.basicConfig(level=logging.DEBUG)
>>>
>>> from swh.loader.tar.tasks import LoadTarRepository
>>> l = LoadTarRepository()
>>> l.run_task(origin=origin, visit_date=visit_date,
...            last_modified=last_modified)
DEBUG:swh.scheduler.task.LoadTarRepository:Creating tar origin for https://ftp.gnu.org/gnu/8sync/8sync-0.1.0.tar.gz
DEBUG:swh.scheduler.task.LoadTarRepository:Done creating tar origin for https://ftp.gnu.org/gnu/8sync/8sync-0.1.0.tar.gz
DEBUG:swh.scheduler.task.LoadTarRepository:Creating origin_visit for origin 3 at time Tue, 3 May 2017 17:16:32 +0200
DEBUG:swh.scheduler.task.LoadTarRepository:Done Creating origin_visit for origin 3 at time Tue, 3 May 2017 17:16:32 +0200
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): ftp.gnu.org:443
DEBUG:urllib3.connectionpool:https://ftp.gnu.org:443 "GET /gnu/8sync/8sync-0.1.0.tar.gz HTTP/1.1" 200 221837
INFO:swh.scheduler.task.LoadTarRepository:Uncompress /tmp/swh.loader.tar._nh29j5e-18075/8sync-0.1.0.tar.gz to /tmp/swh.loader.tar._nh29j5e-18075/swh.loader.tar-7wox5_rx
DEBUG:swh.scheduler.task.LoadTarRepository:Sending 28 contents
DEBUG:swh.scheduler.task.LoadTarRepository:Done sending 28 contents
DEBUG:swh.scheduler.task.LoadTarRepository:Sending 7 directories
DEBUG:swh.scheduler.task.LoadTarRepository:Done sending 7 directories
DEBUG:swh.scheduler.task.LoadTarRepository:Sending 1 revisions
DEBUG:swh.scheduler.task.LoadTarRepository:Done sending 1 revisions
DEBUG:swh.scheduler.task.LoadTarRepository:Updating origin_visit for origin 3 with status full
DEBUG:swh.scheduler.task.LoadTarRepository:Done updating origin_visit for origin 3 with status full
DEBUG:swh.scheduler.task.LoadTarRepository:Clean up /tmp/swh.loader.tar._nh29j5e-18075
{'status': 'eventful'}
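
For readers who want the download, uncompress and clean-up steps visible in the log above in isolation, here is a minimal standard-library sketch. It is not the loader's actual code: the function names `fetch_and_unpack` and `load_tarball` and the temporary directory prefixes are hypothetical, and it only lists the extracted files instead of sending contents, directories and revisions to the archive.

```python
import shutil
import tarfile
import tempfile
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import urlopen


def fetch_and_unpack(url, workdir):
    """Download the tarball at `url` into `workdir` and extract it there.

    Hypothetical helper for illustration; returns the extraction directory.
    """
    workdir = Path(workdir)
    # Name the local copy after the last path component of the URL.
    archive = workdir / Path(urlparse(url).path).name
    with urlopen(url) as response, open(archive, "wb") as out:
        shutil.copyfileobj(response, out)
    # Extract into a fresh subdirectory next to the downloaded archive.
    target = Path(tempfile.mkdtemp(prefix="extracted-", dir=workdir))
    with tarfile.open(archive) as tar:
        # Archives from untrusted sources should be validated before extraction.
        tar.extractall(target)
    return target


def load_tarball(url):
    """Fetch and unpack `url`, list the extracted files, then clean up."""
    workdir = tempfile.mkdtemp(prefix="tar-sketch-")
    try:
        extracted = fetch_and_unpack(url, workdir)
        return sorted(
            path.relative_to(extracted).as_posix()
            for path in extracted.rglob("*")
            if path.is_file()
        )
    finally:
        # Mirror the "Clean up" step: remove the working directory in all cases.
        shutil.rmtree(workdir)
```

Calling `load_tarball(url)` with the URL from the session above would return the relative paths of the files in the archive. The `try`/`finally` keeps the temporary directory from leaking when the download or the extraction fails.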

Test Plan

tox


Migrated from D795 (view on Phabricator)
