  1. Sep 23, 2022
    • Pierre-Yves David's avatar
      model: inline the call to `_check_swhid` · 2d65a24a
      Pierre-Yves David authored
      This reduces the number of function calls and should be faster.
      
      The mashup of blind optimisations in the previous changesets yields some
      interesting results in total.
      
      It would be insightful to measure them individually, but that would
      take more time than we currently have.
      
      When testing all the validator changes on our previous "benchmark" we
      see a quite interesting improvement.
      
          swh loader run mercurial https://foss.heptapod.net/mercurial/mercurial-devel directory=/data/repos/mercurial-devel
      
      = Median time of 3 runs =
      base:   17 minutes 48 seconds
      before: 11 minutes 50 seconds
      after:  11 minutes 11 seconds
      
      On a profile of the same run, the `to_model` call of the from_disk's `Directory` class took the following percentage:
      base:   43%
      before: 15%
      after:  11%
      v6.5.0
      2d65a24a
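To put the median timings quoted above in relative terms, here is a small arithmetic sketch. The helper names (`to_seconds`, `reduction`) are illustrative only and are not part of the commit; the numbers are the ones from the commit message.

```python
def to_seconds(minutes: int, seconds: int) -> int:
    """Convert a 'M minutes S seconds' timing to seconds."""
    return minutes * 60 + seconds


# Median wall-clock times quoted in the commit message.
timings = {
    "base": to_seconds(17, 48),
    "before": to_seconds(11, 50),
    "after": to_seconds(11, 11),
}


def reduction(old: int, new: int) -> float:
    """Fraction of run time saved when going from `old` to `new`."""
    return (old - new) / old


total_saved = reduction(timings["base"], timings["after"])
last_step_saved = reduction(timings["before"], timings["after"])

print(f"base -> after:   {total_saved:.1%} less time")
print(f"before -> after: {last_step_saved:.1%} less time")
```

With these figures the whole series saves about 37% of the run time, and this last changeset alone accounts for about 5.5%.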
    • Pierre-Yves David's avatar
      model: optimization pass on custom validator · 3608271a
      Pierre-Yves David authored
      (This commit is actually doing two things /o\)
      
      - we inline the type-checking in the custom validators to reduce the
        number of function calls.
      
      - we optimize some of the custom validators by skipping the creation of
        intermediate tuples.
      3608271a
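The two optimizations described above can be illustrated with a minimal sketch. Everything here is hypothetical: `check_type`, `tuple_of_str_generic` and `tuple_of_str_inlined` are invented names, not the validators from the repository. They only follow the usual `(instance, attribute, value)` validator signature.

```python
from types import SimpleNamespace


def check_type(attribute, expected, value):
    """Generic helper: costs one extra function call per checked value."""
    if type(value) is not expected:
        raise TypeError(
            f"{attribute.name}: expected {expected.__name__}, "
            f"got {type(value).__name__}"
        )


def tuple_of_str_generic(instance, attribute, value):
    # Delegates every check to the helper and materializes a throwaway tuple.
    check_type(attribute, tuple, value)
    tuple(check_type(attribute, str, item) for item in value)


def tuple_of_str_inlined(instance, attribute, value):
    # Same checks written inline: no helper call, no intermediate tuple.
    if type(value) is not tuple:
        raise TypeError(f"{attribute.name}: expected tuple")
    for item in value:
        if type(item) is not str:
            raise TypeError(f"{attribute.name}: expected str items")


# Stand-in for the attribute descriptor a validator normally receives.
attr = SimpleNamespace(name="names")
tuple_of_str_generic(None, attr, ("a", "b"))
tuple_of_str_inlined(None, attr, ("a", "b"))
```

Both variants accept and reject the same values; the inlined one simply does less work per call, which matters when the validator runs for every entry of every directory.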
    • Pierre-Yves David's avatar
      model: delete unused validator code · 3796e5ba
      Pierre-Yves David authored
      Since all `generic_type_validator` calls are optimized away, that code will no
      longer be called. So we remove it to avoid any drift.
      
      A nice "exception" is provided in case it starts getting called again
      in the future.
      3796e5ba
    • Pierre-Yves David's avatar
      model: remove the try/except · b7267a89
      Pierre-Yves David authored
      Since try/except contexts are known to be expensive in Python, it seems
      useful to remove them.
      b7267a89
    • Pierre-Yves David's avatar
      model: also optimize combined validator · cf529cd1
      Pierre-Yves David authored
      This ensures we don't have any remaining `generic_type_validator` calls
      that have not been optimized away.
      cf529cd1
    • Pierre-Yves David's avatar
      model: drop the `type_validator()` indirection · 6ababdeb
      Pierre-Yves David authored
      This indirection seems useless and is probably the remains of some long
      forgotten rituals.
      6ababdeb
    • Pierre-Yves David's avatar
      model: implement specialized attribute-validator functions · edb57fb1
      Pierre-Yves David authored
      This should reduce function calls and speed things up.
      
      It might be useful to introduce even more specialized validators in the
      future. It would also be useful to skip the intermediate try/except.
      
      Some of this will be done in later changesets.
      edb57fb1
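As a rough illustration of what "specialized attribute-validator functions" could look like, the sketch below picks a dedicated validator from a type annotation once, at class-definition time, instead of interpreting the annotation on every call. All names (`specialized_validator`, `_exact`, `_optional`) are assumptions made for this example and are not taken from the repository; the exact-type check is also a simplification.

```python
from typing import Any, Callable, Optional, Union, get_args, get_origin

Validator = Callable[[Any, Any, Any], None]


def _exact(expected: type) -> Validator:
    """Validator for a plain class; subclasses are rejected in this sketch."""

    def validate(instance, attribute, value):
        if type(value) is not expected:
            raise TypeError(f"expected {expected.__name__}")

    return validate


def _optional(expected: type) -> Validator:
    """Validator for Optional[expected]."""

    def validate(instance, attribute, value):
        if value is not None and type(value) is not expected:
            raise TypeError(f"expected {expected.__name__} or None")

    return validate


def specialized_validator(annotation) -> Validator:
    """Pick a fast validator for `annotation`, or fail loudly."""
    if isinstance(annotation, type):
        return _exact(annotation)
    if get_origin(annotation) is Union:
        args = get_args(annotation)
        if len(args) == 2 and type(None) in args:
            (inner,) = [arg for arg in args if arg is not type(None)]
            if isinstance(inner, type):
                return _optional(inner)
    # Unsupported annotations are reported instead of silently taking a slow path.
    raise NotImplementedError(f"no specialized validator for {annotation!r}")


validate_id = specialized_validator(bytes)
validate_message = specialized_validator(Optional[bytes])
validate_id(None, None, b"\x00" * 20)
validate_message(None, None, None)
```

The dispatch cost is paid once per attribute; each call then runs only the few comparisons that the attribute's type actually needs.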
    • Pierre-Yves David's avatar
      model: prepare the filtering of type_validator into something faster · 1dfea324
      Pierre-Yves David authored
      This currently does nothing, but prepares for actually changing the
      generic validator into faster specialized variants.
      1dfea324
    • Pierre-Yves David's avatar
      from_disk: skip intermediate dictionary creation when building model · a2e8f18c
      Pierre-Yves David authored
      Before this change we would do the following:
      
      1) translate from_disk's objects into `dict`s,
      2) sort these dicts,
      3) feed the list to `Directory.from_dict`,
      4) create DirectoryEntry objects from these dicts.
      
      Skipping the dictionary creation and directly creating the
      DirectoryEntries provides us with a small but stable and noticeable
      performance win.
      
      We tested this change on a simple invocation of the Mercurial loader,
      with a no-op loader storage:
      
          swh loader run mercurial https://foss.heptapod.net/mercurial/mercurial-devel directory=/data/repos/mercurial-devel
      
      = Median time of 3 runs =
      before: 11 minutes 56 seconds
      after:  11 minutes 50 seconds
      
      On a profile of the same run, the `to_model` call of the from_disk's `Directory` class took the following percentage:
      before: 17%
      after:  15%
      a2e8f18c
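The difference between the two code paths described above can be sketched as follows. `Entry` is a hypothetical stand-in for the model's DirectoryEntry, plain tuples stand in for from_disk's objects, and sorting by name is a simplification chosen for illustration only.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple

# (name, type, target, perms): a stand-in for an on-disk node.
Node = Tuple[bytes, str, bytes, int]


@dataclass(frozen=True)
class Entry:
    """Hypothetical stand-in for a model-level directory entry."""

    name: bytes
    type: str
    target: bytes
    perms: int


def entries_via_dicts(nodes: Iterable[Node]) -> Tuple[Entry, ...]:
    # Old path: build a dict per node, sort the dicts, then build entries.
    dicts = [
        {"name": name, "type": type_, "target": target, "perms": perms}
        for name, type_, target, perms in nodes
    ]
    dicts.sort(key=lambda d: d["name"])
    return tuple(Entry(**d) for d in dicts)


def entries_direct(nodes: Iterable[Node]) -> Tuple[Entry, ...]:
    # New path: sort the raw nodes and build each entry exactly once.
    ordered = sorted(nodes, key=lambda node: node[0])
    return tuple(Entry(*node) for node in ordered)


nodes = [
    (b"src", "dir", b"\x02" * 20, 0o040000),
    (b"README", "file", b"\x01" * 20, 0o100644),
]
same = entries_via_dicts(nodes) == entries_direct(nodes)
```

Both functions return the same entries; the direct version just avoids allocating and unpacking one dictionary per entry.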
    • Pierre-Yves David's avatar
      model: avoid another extra creation of Model object · ad3ecac9
      Pierre-Yves David authored
      Do not create model objects while sorting entries before creating the model
      object.
      
      This is another case of "let us create object X to prepare the creation
      of object X", slowing things down.
      
      In practice, we will likely skip this code path after the next
      changeset; however, it seems useful to get this performance footgun
      out of the way.
      
      We tested this change on a simple invocation of the Mercurial loader,
      with a no-op loader storage:
      
          swh loader run mercurial https://foss.heptapod.net/mercurial/mercurial-devel directory=/data/repos/mercurial-devel
      
      = Median time of 3 runs =
      before: 12 minutes 59 seconds
      after:  11 minutes 56 seconds
      
      On a profile of the same run, the `to_model` call of the from_disk's `Directory` class took the following percentage:
      before: 24%
      after:  17%
      ad3ecac9
    • Pierre-Yves David's avatar
      from_disk: only build a model object once · 814a6c84
      Pierre-Yves David authored
      Before this change, a Directory object was built just to compute the `id`
      that we fed to the Directory object we built for `to_model`.
      
      We tested this change on a simple invocation of the Mercurial loader,
      with a no-op loader storage:
      
          swh loader run mercurial https://foss.heptapod.net/mercurial/mercurial-devel directory=/data/repos/mercurial-devel
      
      = Median time of 3 runs =
      before: 17 minutes 48 seconds
      after:  12 minutes 59 seconds
      
      On a profile of the same run, the `to_model` call of the from_disk's `Directory` class took the following percentage:
      before: 43%
      after:  24%
      814a6c84
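A minimal sketch of the "build it once" idea follows. `Dir`, `compute_id` and the two `to_model_*` functions are hypothetical, and the hashing shown is not the real directory identifier computation. A counter makes the number of constructions visible.

```python
import hashlib
from typing import Tuple

# Counts how many Dir objects get constructed (validation would run each time).
counter = {"built": 0}


class Dir:
    """Hypothetical stand-in for a validated model object."""

    def __init__(self, entries: Tuple[bytes, ...], id: bytes = b""):
        counter["built"] += 1
        self.entries = entries
        self.id = id


def compute_id(entries: Tuple[bytes, ...]) -> bytes:
    # Illustrative identifier only; NOT the real directory hashing scheme.
    return hashlib.sha1(b"\0".join(entries)).digest()


def to_model_twice(entries: Tuple[bytes, ...]) -> Dir:
    # Old shape: a throwaway object is built only to derive the id.
    temporary = Dir(entries)
    return Dir(entries, id=compute_id(temporary.entries))


def to_model_once(entries: Tuple[bytes, ...]) -> Dir:
    # New shape: derive the id from the raw entries, then build once.
    return Dir(entries, id=compute_id(entries))


entries = (b"README", b"src")
old = to_model_twice(entries)
new = to_model_once(entries)
```

The two results carry the same `id`, but the second path constructs (and would validate) half as many objects.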
  2. Aug 30, 2022
  3. Aug 08, 2022
  4. Aug 04, 2022
  5. Jul 19, 2022
  6. Jul 11, 2022
  7. Jul 06, 2022
    • vlorentz's avatar
      model: Add Directory.from_possibly_duplicated_entries factory · 0f7a1cbe
      vlorentz authored
      It will be used by swh.storage.backfiller (so indirectly, swh.scrubber)
      to load directories from the postgresql database, whose schema accidentally
      allowed directories with duplicate entries -- without corrupting the
      shape of the directory too much.
      0f7a1cbe
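One way such a factory could keep the overall shape of a directory while removing duplicate names is to rename later duplicates. The sketch below is purely an assumption about the general approach: `dedup_entry_names`, its suffix scheme and its return value are invented for illustration and do not describe the actual `Directory.from_possibly_duplicated_entries` behavior.

```python
from typing import Iterable, List, Tuple

# (name, target): a simplified stand-in for a directory entry.
EntryT = Tuple[bytes, bytes]


def dedup_entry_names(entries: Iterable[EntryT]) -> Tuple[bool, List[EntryT]]:
    """Rename later duplicates so that every entry name is unique.

    Returns (had_duplicates, entries); entry order and targets are preserved.
    """
    seen = set()
    result: List[EntryT] = []
    had_duplicates = False
    for name, target in entries:
        new_name = name
        suffix = 0
        # Hypothetical renaming scheme: append _1, _2, ... until unique.
        while new_name in seen:
            had_duplicates = True
            suffix += 1
            new_name = name + b"_" + str(suffix).encode()
        seen.add(new_name)
        result.append((new_name, target))
    return had_duplicates, result


had_duplicates, cleaned = dedup_entry_names(
    [(b"a", b"1"), (b"a", b"2"), (b"b", b"3")]
)
```

Returning a flag alongside the cleaned entries lets a caller record that the stored directory was malformed while still being able to process it.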
  8. Jul 04, 2022
  9. May 09, 2022
  10. May 01, 2022
  11. Apr 27, 2022
  12. Apr 26, 2022
  13. Apr 21, 2022
  14. Apr 11, 2022
  15. Apr 08, 2022
  16. Mar 31, 2022
  17. Mar 30, 2022
  18. Mar 23, 2022
  19. Mar 22, 2022
    • Antoine Lambert's avatar
      pytest: Exclude build directory for tests discovery · f6ad1ed1
      Antoine Lambert authored
      Because setuptools copies test modules into subdirectories of the
      build directory, pytest fails with ImportPathMismatchError
      exceptions when invoked from the root directory of the module.
      
      So ignore the build folder when discovering tests.
      f6ad1ed1
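This page does not show how the commit excludes the build directory. One way to get the same effect, assuming a `conftest.py` at the repository root, is pytest's `collect_ignore` variable:

```python
# conftest.py (assumed to live at the repository root)
# pytest skips the listed paths, relative to this file, during test collection,
# so the module copies that setuptools places under build/ are never imported.
collect_ignore = ["build"]
```

The `norecursedirs` ini option is another standard way to keep pytest out of a directory.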
  20. Mar 18, 2022
  21. Mar 16, 2022