  1. Aug 07, 2023
  2. Jun 23, 2023
  3. Feb 17, 2023
  4. Feb 16, 2023
  5. Feb 02, 2023
  6. Dec 19, 2022
  7. Oct 18, 2022
  8. May 09, 2022
  9. Apr 26, 2022
  10. Apr 21, 2022
  11. Apr 08, 2022
  12. Apr 06, 2022
  13. Mar 22, 2022
    • pytest: Exclude build directory from test discovery · 1e4e63af
      Antoine Lambert authored
      setuptools copies test modules into subdirectories of the build
      directory, which makes pytest fail with ImportPathMismatchError
      exceptions when it is invoked from the root directory of the
      module.
      
      So ignore the build folder when discovering tests.
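
One way to get this effect, shown as a sketch only: pytest skips any path listed in a `collect_ignore` variable of a `conftest.py`. Placing the file at the repository root is an assumption, and the commit itself may have used a different mechanism (for example a setting in the pytest configuration file).

```python
# conftest.py at the repository root (assumed location).
# pytest reads this list during collection and does not descend into
# the listed paths, so the copies of the test modules that setuptools
# places under build/ are never imported a second time.
collect_ignore = ["build"]
```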
      1e4e63af
  14. Mar 03, 2022
  15. Feb 10, 2022
  16. Feb 07, 2022
  17. Jan 25, 2022
  18. Dec 16, 2021
  19. Dec 15, 2021
  20. Dec 09, 2021
  21. Dec 08, 2021
  22. Nov 10, 2021
    • Create and look up a Read Shard with a perfect hash · 9266eaa6
      Loïc Dachary authored
      
      This package is intended as a low-level dependency of the new
      object storage, used to create a Read Shard and look up objects in it.
      
      It is implemented in C and based on the cmph library for better
      performance. It will be used when a Read Shard must be created with
      around fifty million objects, totaling around 100GB.
      
      The objects and their keys (their cryptographic signatures) will be
      retrieved, in Python, from the PostgreSQL database where the Write
      Shard lives. They will be inserted into the Read Shard one after the
      other using the **write** method. At the end, the **save** method
      will create the perfect hash table using the cmph library and store
      it in the file (this typically takes a few seconds).
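
The write-then-save sequence can be sketched in pure Python. `ToyShardWriter`, its on-disk layout (an 8-byte length prefix before each object) and the plain dict returned by `save` are all hypothetical; the package itself is written in C and builds a cmph perfect hash instead.

```python
import io
import struct


class ToyShardWriter:
    """Hypothetical stand-in for the Read Shard writer; not the package's API."""

    def __init__(self, stream):
        self.stream = stream
        self.offsets = {}  # key -> offset of the object's length prefix

    def write(self, key, obj):
        # Append only: remember where the object starts, then write it once.
        self.offsets[key] = self.stream.tell()
        self.stream.write(struct.pack("<Q", len(obj)))  # 8-byte length (assumed layout)
        self.stream.write(obj)

    def save(self):
        # The real package builds a cmph perfect hash at this point;
        # this sketch simply hands back the key -> offset mapping.
        return dict(self.offsets)


stream = io.BytesIO()
writer = ToyShardWriter(stream)
writer.write(b"key-a", b"first object")
writer.write(b"key-b", b"second")
index = writer.save()
```

Nothing is read back or rewritten while the objects are inserted, which is the property the commit message describes in its next paragraph.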
      
      There is no write amplification during the creation of the Read Shard:
      each byte is written exactly once, sequentially. There is no read
      operation. The memory footprint is 2*n*32 where n is the number of
      inserted keys.
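
As a quick check of the formula for the shard size mentioned above, here is the arithmetic for fifty million keys. The function name is illustrative, and reading the result as bytes is an assumption: the commit message does not state the unit.

```python
def footprint(n, key_size=32):
    """Evaluate the 2*n*32 formula from the commit message."""
    return 2 * n * key_size


keys = 50_000_000  # "around fifty million objects"
total = footprint(keys)
print(f"{total:,}")  # 3,200,000,000
```

If the unit is bytes, that is about 3.2 GB of memory for a shard holding roughly 100GB of objects.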
      
      The **lookup** method relies on the hash function, which is loaded
      into memory when the **load** method is called. It obtains the
      object's offset in the file from an index that may hold up to 2x as
      many entries as there are keys (the perfect hash is not minimal).
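
To illustrate what a non-minimal perfect hash index looks like, the sketch below searches by brute force for a seed that maps every key to a distinct slot in a table twice the size of the key set. This is not how cmph works internally; `slot`, `build_index`, `lookup` and the sample offsets are hypothetical names and values chosen for the example.

```python
import hashlib


def slot(key, seed, size):
    # Seeded hash of the key, reduced to a position in the index.
    digest = hashlib.blake2b(
        key, digest_size=8, salt=seed.to_bytes(8, "little")
    ).digest()
    return int.from_bytes(digest, "little") % size


def build_index(offsets):
    size = 2 * len(offsets)  # non-minimal: twice as many slots as keys
    for seed in range(10_000):
        index = [None] * size
        for key, offset in offsets.items():
            position = slot(key, seed, size)
            if index[position] is not None:
                break  # collision: try the next seed
            index[position] = offset
        else:
            return seed, index  # every key landed in its own slot
    raise RuntimeError("no collision-free seed found")


def lookup(key, seed, index):
    # One hash evaluation and one array access, no probing.
    return index[slot(key, seed, len(index))]


offsets = {b"key-a": 0, b"key-b": 20, b"key-c": 34}
seed, index = build_index(offsets)
```

Half of the slots stay empty, which is the price of not requiring a minimal hash. Note that a key which was never inserted still hashes to some slot, so a real implementation has to verify that the entry found there belongs to the requested key.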
      
      Signed-off-by: Loïc Dachary <loic@dachary.org>
      v0.1.0
      9266eaa6
  23. Oct 06, 2021