
Adapt content indexer to allow journal objects processing

This only adds the content mimetype indexer for now. Testing the fossology license indexer needs some extra mocking work, so it will go in another diff.

Note that it also refactors the test dataset to stop hard-coding wrong ids and to use proper hashes computed from our model.
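
As a rough illustration of deriving test identifiers from the content instead of hard-coding them, here is a minimal sketch. It assumes the ids in question are git-style blob hashes (sha1_git) and uses only `hashlib`. The names `git_blob_sha1`, `RAW_CONTENTS` and `CONTENT_SHA1GIT` are hypothetical and made up for this example; the diff itself relies on the hashing provided by the model, not on this helper.

```python
import hashlib
from typing import Dict


def git_blob_sha1(data: bytes) -> bytes:
    """Return the git blob identifier (sha1_git) of raw content bytes.

    Git hashes a blob as sha1(b"blob <length>\\0" + data).
    """
    header = b"blob %d\0" % len(data)
    return hashlib.sha1(header + data).digest()


# Hypothetical test dataset: description -> raw content (assumption, not the
# real fixtures from swh/indexer/tests/utils.py).
RAW_CONTENTS: Dict[str, bytes] = {
    "text:some": b"this is some text",
    "empty": b"",
}

# Ids are computed from the data, so they cannot drift from the content.
CONTENT_SHA1GIT: Dict[str, bytes] = {
    description: git_blob_sha1(data) for description, data in RAW_CONTENTS.items()
}
```

With a mapping built this way, a test can compare a directory entry target against `CONTENT_SHA1GIT[description]` and the expected value stays consistent with the dataset by construction.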

In a further diff after that, we'll drop the obsolete parts and refactor to simplify the existing base code.

Related to #4273 (closed)


Migrated from D8147 (view on Phabricator)


Activity

  @indexer_cli_group.command("journal-client")
  @click.argument(
      "indexer",
-     type=click.Choice(["origin-intrinsic-metadata", "extrinsic-metadata", "*"]),
+     type=click.Choice(
  • Add tests

    TODO: Unstick regressions in tests (caused by the tests themselves, not by runtime changes)

  • Build has FAILED

    Patch application report for D8147 (id=29444)

    Could not rebase; Attempt merge onto 7cf9cf59...

    Updating 7cf9cf5..0ac477b
    Fast-forward
     swh/indexer/cli.py                          |  26 +-
     swh/indexer/indexer.py                      |  58 +++-
     swh/indexer/tests/conftest.py               |  15 +-
     swh/indexer/tests/test_cli.py               |  91 +++++-
     swh/indexer/tests/test_ctags.py             |  17 +-
     swh/indexer/tests/test_fossology_license.py |  20 +-
     swh/indexer/tests/test_metadata.py          |   6 +-
     swh/indexer/tests/test_mimetype.py          |  50 ++--
     swh/indexer/tests/utils.py                  | 427 ++++++++++++++--------------
     9 files changed, 440 insertions(+), 270 deletions(-)
    Changes applied before test
    commit 0ac477be06e8ec659a5ac501385b7cf933598e6a
    Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
    Date:   Thu Jul 21 19:22:41 2022 +0200
    
        Add tests around new content-mimetype journal client indexer
    
    commit 9c16a32833f72a5baac4fd06ecf6bc38eddb92f2
    Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
    Date:   Wed Jul 20 19:16:12 2022 +0200
    
        Adapt content indexer to allow journal objects processing
        
        Related to #4273

    Link to build: https://jenkins.softwareheritage.org/job/DCIDX/job/tests-on-diff/372/
    See console output for more information: https://jenkins.softwareheritage.org/job/DCIDX/job/tests-on-diff/372/console

  • Add missing requirements whose absence makes some objstorage tests fail

  • Build has FAILED

    Patch application report for D8147 (id=29447)

    Rebasing onto 7cf9cf59...

    Current branch diff-target is up to date.
    Changes applied before test
    commit 6736fcab0aad2b87b900fa3188f9c4bedd6857cb
    Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
    Date:   Thu Jul 21 19:22:41 2022 +0200
    
        Add tests around new content-mimetype journal client indexer
    
    commit c7bdb5b4ec17ad0cdf242119c9c5d84bbc20b4ea
    Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
    Date:   Wed Jul 20 19:16:12 2022 +0200
    
        Adapt content indexer to allow journal objects processing
        
        Related to #4273

    Link to build: https://jenkins.softwareheritage.org/job/DCIDX/job/tests-on-diff/373/
    See console output for more information: https://jenkins.softwareheritage.org/job/DCIDX/job/tests-on-diff/373/console

  • Fix tests (I've kept some setup in the fixture because dropping it made the tests fail and I don't want to dig into it right now)

  • Build has FAILED

    Patch application report for D8147 (id=29448)

    Rebasing onto 7cf9cf59...

    Current branch diff-target is up to date.
    Changes applied before test
    commit 58c257a6c936e8b61b224838ab24610f5acb3caf
    Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
    Date:   Thu Jul 21 19:22:41 2022 +0200
    
        Add tests around new content-mimetype journal client indexer
    
    commit f069be81d76a0bd39684d6d06c415ddf4a9047c4
    Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
    Date:   Wed Jul 20 19:16:12 2022 +0200
    
        Adapt content indexer to allow journal objects processing
        
        Related to #4273

    Link to build: https://jenkins.softwareheritage.org/job/DCIDX/job/tests-on-diff/374/
    See console output for more information: https://jenkins.softwareheritage.org/job/DCIDX/job/tests-on-diff/374/console

  • Fix inconsistency in test dataset

  • Keep fossology license indexer out of the diff for now. It will go in another diff.

  •   assert tool is not None
      dir_ = DIRECTORY2

    + assert (
    +     dir_.entries[0].target
    +     == MAPPING_DESCRIPTION_CONTENT_SHA1GIT["json:yarn-parser-package.json"]
    + )