Closed
Issue created Oct 04, 2021 by Antoine R. Dumont (@ardumont, Maintainer)

Consider dropping pull request references from the git loader ingestion

The git loader currently filters out references it considers uninteresting [1]:

  • auto-merged GitHub pull requests (reference names starting with refs/pull/ and ending with /merge)
  • peeled refs (reference names ending with ^{})
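
As a rough illustration of the two rules above, the filter can be sketched as a single predicate on the raw reference name. This is a minimal sketch, not the loader's actual code (see [1] for that): the function name `is_filtered_ref` is hypothetical, and it assumes GitHub's `refs/pull/<number>/merge` naming for auto-merged pull requests.

```python
def is_filtered_ref(ref_name: bytes) -> bool:
    """Return True if a reference matches one of the two filter rules.

    Hypothetical helper for illustration only; the name and signature
    are assumptions, not the loader's API.
    """
    # Peeled refs: the name ends with the literal suffix ^{}
    if ref_name.endswith(b"^{}"):
        return True
    # Auto-merged GitHub pull requests: refs/pull/<number>/merge
    if ref_name.startswith(b"refs/pull/") and ref_name.endswith(b"/merge"):
        return True
    return False
```

Note that under these rules a name such as `refs/pull/42/head` is not filtered, which is why pull request references still reach the snapshot.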

Nonetheless, the git loader still loads pull request references, and there can be a lot of them depending on the repository. See for example a recent snapshot of the torvalds/linux repository [2].

We should consider whether it is still relevant to ingest those references.

The webapp already considers them noise and has filtered them out of the browsing views since version v0.0.288 [3]. That argues for ignoring them during ingestion as well.

Doing so should also alleviate other current concerns [4].
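
The change under discussion, dropping every pull request reference at ingestion time, might look something like the sketch below. It is only an illustration under stated assumptions: `drop_pull_request_refs` is a hypothetical name, the references are assumed to arrive as a mapping from raw reference name to target, and pull request references are assumed to live under GitHub's `refs/pull/` namespace.

```python
from typing import Dict


def drop_pull_request_refs(refs: Dict[bytes, bytes]) -> Dict[bytes, bytes]:
    """Return a copy of `refs` without any pull request reference.

    Hypothetical helper: the mapping shape (reference name -> target)
    and the refs/pull/ prefix are assumptions made for this sketch.
    """
    return {
        name: target
        for name, target in refs.items()
        if not name.startswith(b"refs/pull/")
    }


# Example with made-up targets: only the branch and the tag survive.
remote_refs = {
    b"refs/heads/master": b"target-1",
    b"refs/tags/v1.0": b"target-2",
    b"refs/pull/1/head": b"target-3",
    b"refs/pull/1/merge": b"target-4",
}
kept = drop_pull_request_refs(remote_refs)
```

Filtering on the prefix alone covers both the `/head` and `/merge` variants, so the existing `/merge` rule would become redundant if this approach were adopted.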

  • [1] https://forge.softwareheritage.org/source/swh-loader-git/browse/master/swh/loader/git/utils.py$89-90

  • [2] https://archive.softwareheritage.org/api/1/snapshot/c2847dfd741eae21606027cf29250d1ebcd63fb4/

  • [3] rDWAPPScc652d5240

  • [4] #3625 (closed)


Migrated from T3627 (view on Phabricator)

Edited Jan 07, 2023 by Phabricator Migration user