
Prevent erroneous HashCollisions by using the same ctime for all rows.

  1. Apr 08, 2020
      Prevent erroneous HashCollisions by using the same ctime for all rows. · 8e8577e8
      vlorentz authored
      'swh_content_add' tries to avoid this issue with a DISTINCT clause
      on the entire row, but the clause has no effect because the 'ctime'
      cells differ by a few microseconds.
      This commit ensures all ctime values are exactly the same, so
      duplicate rows are filtered out.
      
      An alternative would be to change 'swh_content_add' to do:
      
      ```
      select distinct on (sha1, sha1_git, sha256, blake2s256, length, status) sha1, sha1_git, sha256, blake2s256, length, status, ctime from tmp_content
      ```
      
      instead of:
      
      ```
      select distinct sha1, sha1_git, sha256, blake2s256, length, status, ctime from tmp_content
      ```
      
      but this is more verbose and there's no good reason to call 'now()' for
      every row.
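
The effect described in the commit message can be reproduced with a minimal sketch. It is not the project's code: it assumes an in-memory SQLite table as a stand-in for the PostgreSQL 'tmp_content' table, and the hash values, the `count_distinct_rows` helper, and the timestamps are hypothetical. It only shows that a DISTINCT over the whole row cannot merge rows whose 'ctime' differs, while a single shared 'ctime' lets it do so.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

# Column list taken from the queries in the commit message.
COLUMNS = "sha1, sha1_git, sha256, blake2s256, length, status, ctime"


def count_distinct_rows(ctimes):
    """Insert one otherwise-identical row per ctime and count the rows
    that survive a SELECT DISTINCT over the entire row."""
    conn = sqlite3.connect(":memory:")  # stand-in for PostgreSQL (assumption)
    conn.execute(f"create table tmp_content ({COLUMNS})")
    rows = [
        # Placeholder hashes; only ctime varies between rows.
        ("aa", "bb", "cc", "dd", 3, "visible", ctime.isoformat())
        for ctime in ctimes
    ]
    conn.executemany("insert into tmp_content values (?, ?, ?, ?, ?, ?, ?)", rows)
    result = conn.execute(f"select distinct {COLUMNS} from tmp_content").fetchall()
    conn.close()
    return len(result)


base = datetime(2020, 4, 8, tzinfo=timezone.utc)

# One timestamp per row, a few microseconds apart: DISTINCT keeps every row.
drifting_count = count_distinct_rows(
    [base + timedelta(microseconds=i) for i in range(3)]
)

# One timestamp computed once and reused: DISTINCT collapses the duplicates.
shared_count = count_distinct_rows([base] * 3)

print(drifting_count, shared_count)  # prints: 3 1
```

Computing the timestamp once on the caller's side keeps the existing `select distinct` query unchanged, which matches the rationale given above for not switching to `distinct on`.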