  1. Dec 15, 2021
  2. Dec 09, 2021
  3. Dec 08, 2021
  4. Nov 10, 2021
      create and lookup a Read Shard with a perfect hash · 9266eaa6
      Loïc Dachary authored
      
      This package is intended to be used by the new object storage as
      a low-level dependency to create and look up a Read Shard.
      
      It is implemented in C and based on the cmph library for better
      performance. It will be used when a Read Shard must be created with
      around fifty million objects, totaling around 100GB.
      
      The objects and their keys (their cryptographic signatures) will be
      retrieved in Python from the PostgreSQL database where the Write Shard
      lives. They will be inserted in the Read Shard one after the other
      using the **write** method. In the end, the **save** method will create
      the perfect hash table using the cmph library and store it in the
      file (it typically takes a few seconds).
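
The write-then-save flow can be illustrated with a small pure-Python sketch. It is not the package's API: the class name `ToyShardWriter`, the length-prefixed record layout, and the sorted key/offset table written by `save` are all assumptions made for illustration. The real package builds a cmph perfect hash at that point instead.

```python
import io
import struct


class ToyShardWriter:
    """Toy stand-in for the write/save flow (hypothetical, not the real API)."""

    def __init__(self, stream):
        self.stream = stream   # any writable binary stream
        self.offsets = {}      # key -> offset of its object in the stream

    def write(self, key, obj):
        # Append-only: remember where the object starts, then write an
        # 8-byte length prefix followed by the payload.
        self.offsets[key] = self.stream.tell()
        self.stream.write(struct.pack("<Q", len(obj)))
        self.stream.write(obj)

    def save(self):
        # Assumption: a sorted (key, offset) table replaces the cmph
        # perfect hash that the real package would build here.
        index_start = self.stream.tell()
        for key in sorted(self.offsets):
            self.stream.write(key + struct.pack("<Q", self.offsets[key]))
        return index_start


stream = io.BytesIO()
writer = ToyShardWriter(stream)
writer.write(b"k" * 32, b"first object")
writer.write(b"j" * 32, b"second")
index_start = writer.save()
```

Every byte is appended at the current end of the stream, so nothing is rewritten or read back while the shard is being built.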
      
      There is no write amplification during the creation of the Read Shard:
      each byte is written exactly once, sequentially. There is no read
      operation. The memory footprint is 2*n*32 where n is the number of
      inserted keys.
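
As a rough worked example of the 2*n*32 formula, the snippet below plugs in the fifty million objects mentioned earlier. The message does not state the unit, so reading the result as bytes is an assumption, and the function name `footprint` is hypothetical.

```python
def footprint(n, per_key=32):
    # 2*n*32 as given in the text; interpreting the result as bytes
    # is an assumption, the text does not state the unit.
    return 2 * n * per_key


n = 50_000_000
total = footprint(n)
print(total)           # 3200000000
print(total / 10**9)   # 3.2
```

Under that assumption, a shard of fifty million keys would need about 3.2GB of memory while it is being created.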
      
      The **lookup** method relies on the hash function, which is loaded in
      memory when the **load** function is called. It obtains the offset of
      the object in the file from an index which may hold up to 2x the
      number of keys (the perfect hash is not minimal).
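
To show what a lookup through a non-minimal perfect hash looks like, here is a minimal sketch. It does not use cmph: it brute-forces a seed for a SHA-256 based hash until all keys land in distinct slots of a table with 2*n entries. The names `slot`, `build_index` and `lookup` are hypothetical, and the brute-force search is only practical for a handful of keys.

```python
import hashlib


def slot(key, seed, size):
    # Hash the key together with a seed and map it into [0, size).
    digest = hashlib.sha256(seed.to_bytes(4, "little") + key).digest()
    return int.from_bytes(digest[:8], "little") % size


def build_index(offsets):
    """Brute-force a seed that maps every key to its own slot among 2*n."""
    size = 2 * len(offsets)
    seed = 0
    while len({slot(key, seed, size) for key in offsets}) != len(offsets):
        seed += 1
    index = [None] * size  # half of the slots stay unused (not minimal)
    for key, offset in offsets.items():
        index[slot(key, seed, size)] = offset
    return seed, index


def lookup(key, seed, index):
    # One hash evaluation plus one array read. Keys are not stored in
    # the index, so only keys that were inserted give a meaningful answer.
    return index[slot(key, seed, len(index))]


offsets = {bytes([i]) * 32: 100 * i for i in range(5)}
seed, index = build_index(offsets)
found = [lookup(key, seed, index) for key in offsets]
```

The index trades space for simplicity: it holds twice as many slots as keys, but each lookup is a single hash computation followed by a single read.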
      
      Signed-off-by: Loïc Dachary <loic@dachary.org>
  5. Oct 06, 2021