- Mar 13, 2017
  - Antoine R. Dumont authored: Related: T494
- Mar 07, 2017
  - Nicolas Dandrimont authored
  - Nicolas Dandrimont authored
  - Nicolas Dandrimont authored
  - Nicolas Dandrimont authored: This will allow us to support another storage backend for the archiver data.
  - Nicolas Dandrimont authored
- Mar 02, 2017
  - Nicolas Dandrimont authored: By default we would try to copy objects from all the archives, even those for which we had no configuration.
  - Nicolas Dandrimont authored: The default value for content copies is "missing", so we don't need to make it explicit.
  - Nicolas Dandrimont authored
  - Nicolas Dandrimont authored
  - Antoine R. Dumont authored: Implementation-wise, this uses a COPY statement and drops duplicates if encountered during content_archive insertion.
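The COPY-with-deduplication insertion described in the entry above can be sketched as follows. This is a hypothetical illustration (the table and column names are assumed, not taken from the actual swh code); duplicate content ids are dropped before the COPY so that the bulk insert cannot fail on a conflict:

```python
import io


def dedup_rows(rows):
    """Drop rows whose content_id was already seen, keeping the first
    occurrence (duplicates would make the bulk COPY fail)."""
    seen = set()
    result = []
    for content_id, copies in rows:
        if content_id not in seen:
            seen.add(content_id)
            result.append((content_id, copies))
    return result


def bulk_insert_content_archive(cur, rows):
    """Bulk-insert (content_id, copies) rows into content_archive with a
    single COPY statement; cur is a psycopg2-style cursor."""
    buf = io.StringIO()
    for content_id, copies in dedup_rows(rows):
        buf.write('%s\t%s\n' % (content_id, copies))
    buf.seek(0)
    cur.copy_expert(
        'COPY content_archive (content_id, copies) FROM STDIN', buf)
```

COPY is much faster than row-by-row INSERTs for bulk loads, which is presumably why it was chosen here.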
  - Antoine R. Dumont authored: Use the same insertion pattern as swh.storage.content_add.
  - Antoine R. Dumont authored: This also has the benefit of hiding the celery task names (an implementation detail of swh.scheduler).
  - Antoine R. Dumont authored: Related: T494
  - Antoine R. Dumont authored: This use case cannot happen with ArchiverWithRetentionPolicyDirector:
    - If a row entry is referenced in the archiver db, it is present in the objstorage.
    - If a row entry is not referenced in the archiver db, it will not be listed as missing, since the archiver db is what is read to list the contents we want to archive.
  - Antoine R. Dumont authored: Related: T494, T569
  - Antoine R. Dumont authored: Related: T494
- Feb 16, 2017
  - Nicolas Dandrimont authored
- Feb 14, 2017
  - Nicolas Dandrimont authored: To make sure corruptions such as T680 don't happen again, use the same normalization function as swh.model before inserting timestamps into our database. This makes swh.storage reject non-integer timestamp values as well. Update tests to reflect this change.
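A simplified sketch of such a normalization, assuming the timestamp arrives either as an integer number of seconds or as a {'seconds': ..., 'microseconds': ...} mapping. The real swh.model function also handles other input types and may differ in detail; this only illustrates the rejection of non-integer values:

```python
def normalize_timestamp(ts):
    """Normalize a timestamp into {'seconds': int, 'microseconds': int},
    rejecting non-integer values (simplified sketch, not the actual
    swh.model implementation)."""
    if ts is None:
        return None
    if isinstance(ts, dict):
        seconds = ts.get('seconds', 0)
        microseconds = ts.get('microseconds', 0)
    else:
        seconds, microseconds = ts, 0
    for name, value in (('seconds', seconds),
                        ('microseconds', microseconds)):
        # bool is a subclass of int, so reject it explicitly
        if isinstance(value, bool) or not isinstance(value, int):
            raise ValueError('timestamp %s must be an integer' % name)
    return {'seconds': seconds, 'microseconds': microseconds}
```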
- Feb 09, 2017
  - Nicolas Dandrimont authored: Closes T672
  - Nicolas Dandrimont authored
  - Antoine Pietri authored
  - Antoine Pietri authored
- Feb 07, 2017
  - Nicolas Dandrimont authored: The buckets use the last two bytes of the object id, so that sequential archivings spread the load across different lines.
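The bucketing scheme above can be illustrated with a small helper (the function name is hypothetical). Two bytes give 65536 possible buckets, and since sha1 object ids are uniformly distributed, their last two bytes scatter neighbouring ids across buckets:

```python
def archive_bucket(object_id: bytes) -> int:
    """Derive a bucket index from the last two bytes of an object id
    (hypothetical helper; 2 bytes yield 65536 possible buckets)."""
    if len(object_id) < 2:
        raise ValueError('object id too short')
    return int.from_bytes(object_id[-2:], 'big')
```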
- Jan 26, 2017
  - Antoine R. Dumont authored: Closes T646
- Jan 03, 2017
  - Nicolas Dandrimont authored
- Dec 31, 2016
  - Antoine R. Dumont authored
- Dec 20, 2016
  - Antoine R. Dumont authored
- Nov 15, 2016
  - Antoine R. Dumont authored
- Nov 03, 2016
  - Nicolas Dandrimont authored: The check_config method allows a dynamic check of the configuration of a running storage. We can make sure that we have the proper permissions on the object storage as well as the database before running things.
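The idea behind check_config can be sketched with a standalone function. The real method lives on the storage class and its exact checks may differ; db and objstorage here are stand-ins for the configured backends:

```python
def check_config(db, objstorage, check_write=False):
    """Dynamically check that a running storage is properly configured:
    the database must answer a trivial query and, when check_write is
    set, the object storage must accept writes (illustrative sketch)."""
    try:
        db.execute('select 1')  # verifies connectivity and permissions
    except Exception:
        return False
    if check_write:
        try:
            objstorage.check_write()
        except Exception:
            return False
    return True
```

Running such a check at service startup surfaces misconfiguration (bad credentials, missing grants) before any real traffic hits the storage.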
- Oct 13, 2016
  - Antoine R. Dumont authored: Related: T575
- Sep 29, 2016
  - Antoine R. Dumont authored
  - Antoine R. Dumont authored
  - Antoine R. Dumont authored
- Sep 23, 2016
  - Antoine R. Dumont authored
  - Antoine R. Dumont authored
  - Antoine R. Dumont authored
  - Antoine R. Dumont authored
  - Antoine R. Dumont authored: The 'unknown sha1 path' cannot happen in the default archiver, since it reads from the archive db (so the fallback code is not necessary in the worker). By contrast, since 'archiver to backend' reads from stdin (for now), that source could contain unregistered sha1s. This commit makes the director deal with that before sending sha1s to workers. It is also the director's job to set the state to 'missing' when force_copy is true, before sending sha1s to workers.
  - Antoine R. Dumont authored