Using a custom Sorted String Table format

This is a format for storing what is conceptually a Sorted String Table (SSTable). There is no reference definition of a Sorted String Table, and implementations vary depending on the context. It is often said to have been introduced in a paper from Google. In essence, it is a key/value map sorted by key.

Format

The custom format is a header:

  • Format version
  • Number of entries in the index

followed by an index, which is a sorted list of fixed-size entries:

  • SHA256, offset, size

The content of the objects follows the index.
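
As an illustration only, the layout above could be described with Python's struct module. The field widths and byte order are assumptions, since the task does not specify them: a 32-bit version, a 64-bit entry count, and 64-bit offset and size, all big-endian and unpadded. The helper names are hypothetical.

```python
import struct

# Assumed binary layout (not specified in the task): big-endian, no padding.
HEADER = struct.Struct(">IQ")    # format version, number of entries in the index
ENTRY = struct.Struct(">32sQQ")  # SHA256 digest (32 bytes), offset, size


def pack_header(version, count):
    """Serialize the header."""
    return HEADER.pack(version, count)


def pack_entry(digest, offset, size):
    """Serialize one fixed-size index entry."""
    if len(digest) != 32:
        raise ValueError("a SHA256 digest is exactly 32 bytes")
    return ENTRY.pack(digest, offset, size)


def entry_position(i):
    """Byte position of the i-th index entry, counted from the start of the file."""
    return HEADER.size + i * ENTRY.size
```

Because every entry has the same size, the position of any entry can be computed directly, which is what makes a binary search over the index possible without loading it entirely.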

Writing

It is assumed that writing is done in batch, sequentially.
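
A minimal sketch of such a batch writer follows. It assumes the whole batch fits in memory, that offsets are absolute positions from the start of the file, and a big-endian layout with a 32-bit version and 64-bit count, offset and size. None of these choices comes from the task, and write_sstable is a hypothetical name.

```python
import hashlib
import struct

# Assumed binary layout (not specified in the task): big-endian, no padding.
HEADER = struct.Struct(">IQ")    # format version, number of entries in the index
ENTRY = struct.Struct(">32sQQ")  # SHA256 digest, offset, size


def write_sstable(path, blobs, version=1):
    """Write all objects in one sequential pass: header, index, then content."""
    # Key each object by its SHA256 and sort by key; identical objects collapse.
    objects = sorted({hashlib.sha256(b).digest(): b for b in blobs}.items())
    # The first object starts right after the header and the index.
    offset = HEADER.size + len(objects) * ENTRY.size
    with open(path, "wb") as f:
        f.write(HEADER.pack(version, len(objects)))
        for digest, data in objects:
            f.write(ENTRY.pack(digest, offset, len(data)))
            offset += len(data)
        for _digest, data in objects:
            f.write(data)
    return len(objects)
```

The index has to be complete before any content is written, so the writer needs to know every key and size up front. That is consistent with the batch assumption: the file is produced once and never appended to.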

Reading

  • Binary search for the SHA256 in the index
  • Seek to the object content to stream it to the caller in chunks of a given size
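
The two reading steps above can be sketched as follows. The binary layout (big-endian, 32-bit version, 64-bit count, offset and size, absolute offsets) is an assumption, and find_entry and stream_object are hypothetical names. The file object must be opened in binary mode and be seekable.

```python
import struct

# Assumed binary layout (not specified in the task): big-endian, no padding.
HEADER = struct.Struct(">IQ")    # format version, number of entries in the index
ENTRY = struct.Struct(">32sQQ")  # SHA256 digest, offset, size


def find_entry(f, digest):
    """Binary search the on-disk index; return (offset, size) or None."""
    f.seek(0)
    _version, count = HEADER.unpack(f.read(HEADER.size))
    lo, hi = 0, count
    while lo < hi:
        mid = (lo + hi) // 2
        # Entries are fixed size, so the mid-th entry can be reached directly.
        f.seek(HEADER.size + mid * ENTRY.size)
        key, offset, size = ENTRY.unpack(f.read(ENTRY.size))
        if key == digest:
            return offset, size
        if key < digest:
            lo = mid + 1
        else:
            hi = mid
    return None


def stream_object(f, digest, chunk_size=65536):
    """Yield the object content in chunks of at most chunk_size bytes."""
    found = find_entry(f, digest)
    if found is None:
        raise KeyError(digest.hex())
    offset, remaining = found
    f.seek(offset)
    while remaining > 0:
        chunk = f.read(min(chunk_size, remaining))
        if not chunk:
            raise EOFError("object content is truncated")
        remaining -= len(chunk)
        yield chunk
```

This version reads one index entry per search step instead of loading the whole index. The generator relies on the file position between chunks, so the same file object should not be used for another lookup while an object is being streamed.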

Migrated from T3048 (view on Phabricator)