pw_kvs#

Note

The documentation for this module is currently under construction.

pw_kvs is Pigweed’s Key Value Store (KVS) library. KVS is a flash-backed persistent storage system with integrated wear-leveling that serves as a relatively lightweight alternative to a file system.

KeyValueStore#

The KVS system stores key and value data pairs. The key value pairs are stored in flash memory as a key-value entry (KV entry) that consists of a header/metadata, the key data, and value data. KV entries are accessed through Put, Get, and Delete operations.

Each flash sector is written sequentially in an append-only manner, with each following entry write being at a higher address than all of the previous entry writes to that sector since erase. Once information (header, metadata, data, etc) is written to flash, that information is not modified or cleared until a full sector erase occurs as part of garbage collection.

Individual KV entries are contained within a single flash sector (do not cross sector boundaries). Flash sectors can contain as many KV entries as fit in the sector.

KVS does not store any data/metadata/state in flash beyond the KV entries. All KVS system state can be derived from the stored KV entries. Current KVS system state is determined at boot from flash-stored KV entries and then maintained in ram by the KVS. The KVS is at all times in a valid state on-flash, so there are no windows of vulnerability to unexpected power loss or crash. The old entry for a key is maintained until the new entry for that key is written and verified.

Each key-value entry has a unique transaction ID that is incremented for each KVS update transaction. When determining system state from flash-stored KV entries, the valid entry with the highest transaction ID is considered to be the “current” entry of the key. All stored entries of the same key with lower transaction ID are considered old or “stale”.

Updates/rewrites of a key that has been previously stored is done as a new KV entry with an updated transaction ID and the new value for the key. The KVS internal state is updated to reflect the new entry. The previously stored KV entries for that key are not modified or removed from flash storage, until garbage collection reclaims the “stale” entries.

Garbage collection is done by copying any currently valid KV entries in the sector to be garbage collected to a different sector and then erasing the sector.

Flash Memory#

Pigweed AI summary: The flash storage used by KVS consists of two layers: FlashMemory and FlashPartition. FlashMemory manages the raw read/write/erase of the flash memory device, while FlashPartition is a subset of FlashMemory that represents different parts of the FlashMemory. Each FlashPartition has its own separate logical address space, and writes to flash must have a start address that is a multiple of the flash write alignment. FlashPartitions may have a different alignment than the FlashMemory they are part of, and sectors

The flash storage used by KVS is comprised of two layers, FlashMemory and FlashPartition.

FlashMemory is the lower level that manages the raw read/write/erase of the flash memory device.

FlashPartition is a subset of a FlashMemory. A FlashMemory may have one or multiple FlashPartitions that represent different parts of the FlashMemory - such as partitions for KVS, OTA, snapshots/crashlogs, etc. Each FlashPartition has its own separate logical address space starting from zero to size bytes of the partition. FlashPartition logical address does not always map directly to FlashMemory addresses due to partition encryption, sector headers, etc.

Writes to flash must have a start address that is a multiple of the flash write alignment. Write size must also be a multiple of flash write alignment. Write alignment varies by flash device and partition type. Reads from flash do not have any address or size alignment requirement - reads always have a minimum alignment of 1.

FlashPartitions may have a different alignment than the FlashMemory they are part of, so long as the partition’s alignment is a multiple of the alignment for the FlashMemory.

Sectors are the minimum erase size for both FlashMemory and FlashPartition. FlashPartitions may have a different logical sector size than the FlashMemory they are part of. Partition logical sectors may be smaller due to partition overhead (encryption, wear tracking, etc) or larger due to combining raw sectors into larger logical sectors.

FlashPartition supports access via NonSeekableWriter and SeekableReader. The reader defaults to the full size of the partition but can optionally be limited to a smaller range.

Size report#

The following size report showcases the memory usage of the KVS and FlashPartition.

Label

Segment

Delta

KeyValueStore

FLASH

+24

[section .code]

-40

main

+26

pw::kvs::FlashPartition::Read()

-2

pw::kvs::FlashPartition::size_bytes()

+22

pw::Vector<>::operator[]()

DEL

-16

_GLOBAL__sub_I_test_partition

NEW

+656

pw::kvs::KeyValueStore::InitializeMetadata()

NEW

+412

pw::kvs::internal::EntryCache::Find()

NEW

+380

pw::kvs::KeyValueStore::Init()

NEW

+338

pw::kvs::internal::Sectors::Find()

NEW

+294

pw::kvs::KeyValueStore::WriteEntry()

NEW

+286

pw::kvs::KeyValueStore::UpdateEntriesToPrimaryFormat()

NEW

+276

pw::kvs::KeyValueStore::FullMaintenanceHelper()

NEW

+256

pw::kvs::KeyValueStore::GetSectorForWrite()

NEW

+256

pw::kvs::internal::Sectors::FindSectorToGarbageCollect()

NEW

+214

pw::kvs::KeyValueStore::RemoveDeletedKeyEntries()

NEW

+208

pw::kvs::internal::Entry::VerifyChecksumInFlash()

NEW

+204

pw::kvs::internal::EntryCache::AddNewOrUpdateExisting()

NEW

+196

pw::kvs::KeyValueStore::GarbageCollectSector()

NEW

+192

pw::kvs::internal::EntryCache::RemoveEntry()

NEW

+184

pw::kvs::KeyValueStore::LoadEntry()

NEW

+180

pw::kvs::KeyValueStore::AppendEntry()

NEW

+170

pw::kvs::KeyValueStore::RelocateEntry()

NEW

+166

pw::AlignedWriter::Write()

NEW

+166

pw::kvs::internal::Entry::CalculateChecksumFromFlash()

NEW

+164

pw::kvs::internal::Entry::Read()

NEW

+160

pw::kvs::internal::Entry::Write()

NEW

+156

pw::kvs::internal::Entry::Copy()

NEW

+146

pw::kvs::KeyValueStore::PutBytes()

NEW

+144

pw::kvs::KeyValueStore::FixedSizeGet()

NEW

+144

pw::kvs::KeyValueStore::FullMaintenanceHelper()::{lambda()#1}::operator()()

NEW

+140

pw::kvs::KeyValueStore::WriteEntryForNewKey()

NEW

+136

pw::kvs::KeyValueStore::CreateEntry()

NEW

+134

pw::kvs::KeyValueStore::CopyEntryToSector()

NEW

+134

pw::kvs::internal::Entry::ValueMatches()

NEW

+128

pw::kvs::KeyValueStore::AddRedundantEntries()

NEW

+128

pw::kvs::KeyValueStore::GetStorageStats()

NEW

+126

pw::kvs::internal::Entry::ReadValue()

NEW

+122

pw::kvs::KeyValueStore::UpdateKeyDescriptor()

NEW

+118

pw::kvs::internal::(anonymous namespace)::Contains<>()

NEW

+116

pw::kvs::KeyValueStore::CreateOrUpdateKeyDescriptor()

NEW

+116

pw::kvs::KeyValueStore::Get()

NEW

+110

pw::kvs::KeyValueStore::CheckForErrors()

NEW

+110

pw::kvs::KeyValueStore::RepairCorruptSectors()

NEW

+108

_GLOBAL__sub_I_working_buffer

NEW

+104

pw::kvs::KeyValueStore::EnsureEntryRedundancy()

NEW

+104

pw::kvs::KeyValueStore::ReadEntry()

NEW

+104

pw::kvs::KeyValueStore::ScanForEntry()

NEW

+98

pw::kvs::internal::Entry::CalculateChecksum()

NEW

+92

pw::kvs::internal::Entry::Entry()

NEW

+88

pw::kvs::KeyValueStore::RelocateKeyAddressesInSector()

NEW

+88

pw::kvs::internal::Entry::VerifyChecksum()

NEW

+78

pw::kvs::KeyValueStore::KeyValueStore()

NEW

+78

pw::kvs::internal::Entry::AddPaddingBytesToChecksum()

NEW

+76

pw::kvs::internal::EntryCache::AddNew()

NEW

+72

pw::kvs::KeyValueStore::EnsureFreeSectorExists()

NEW

+70

pw::kvs::KeyValueStore::GetAddressesForWrite()

NEW

+66

pw::AlignedWriter::Flush()

NEW

+60

pw::kvs::KeyValueStore::FindEntry()

NEW

+58

pw::kvs::KeyValueStore::FixErrors()

NEW

+58

pw::kvs::KeyValueStore::WriteEntryForExistingKey()

NEW

+56

pw::AlignedWriter::AddBytesToBuffer()

NEW

+56

pw::kvs::internal::EntryMetadata::RemoveAddress()

NEW

+52

pw::kvs::internal::EntryCache::Iterator<>::operator*()

NEW

+48

pw::kvs::KeyValueStore::GarbageCollect()

NEW

+48

pw::kvs::internal::EntryMetadata::Reset()

NEW

+44

pw::kvs::ChecksumAlgorithm::Verify()

NEW

+44

pw::kvs::internal::EntryCache::FindIndex()

NEW

+44

pw::kvs::internal::EntryCache::ResetAddresses()

NEW

+42

pw::kvs::FlashPartition::Output::DoWrite()

NEW

+42

pw::kvs::KeyValueStore::FindExisting()

NEW

+42

pw::kvs::internal::Sectors::WearLeveledSectorFromIndex()

NEW

+40

pw::kvs::FlashPartition::AppearsErased()

NEW

+40

pw::kvs::KeyValueStore::Repair()

NEW

+40

pw::kvs::internal::Sectors::AddressInSector()

NEW

+38

pw::kvs::KeyValueStore::ValueSize()

NEW

+38

pw::kvs::internal::EntryCache::addresses()

NEW

+38

pw::kvs::internal::Sectors::FromAddress()

NEW

+38

pw::kvs::internal::Sectors::NextWritableAddress()

NEW

+36

pw::kvs::FlashPartition::Input::DoRead()

NEW

+36

pw::kvs::internal::EntryCache::AddAddressIfRoom()

NEW

+34

pw::kvs::KeyValueStore::CheckWriteOperation()

NEW

+32

memcmp

NEW

+32

pw::kvs::internal::Entry::Update()

NEW

+30

pw::kvs::internal::Entry::size()

NEW

+30

pw::kvs::internal::EntryCache::present_entries()

NEW

+30

pw::kvs::internal::EntryFormats::Find()

NEW

+28

pw::kvs::internal::Entry::ReadKey()

NEW

+26

pw::kvs::ChecksumAlgorithm::Finish()

NEW

+26

pw::kvs::internal::EntryMetadata::AddNewAddress()

NEW

+24

pw::kvs::internal::Sectors::BaseAddress()

NEW

+20

pw::Output::Write()

NEW

+20

pw::kvs::ChecksumAlgorithm::Update()

NEW

+12

pw::kvs::FlashPartition::Input

NEW

+12

pw::kvs::FlashPartition::Output

NEW

+8

kvs_format

+10,208

FlashPartition

FLASH

+20

[section .code]

-16

_ctype_

+140

main

-2

pw::allocator::FreeList::RemoveChunk()

+2

pw_varint_ZigZagEncode

+2

pw_log_tokenized_HandleLog

NEW

+216

pw::kvs::FakeFlashMemory::Write()

NEW

+168

_GLOBAL__sub_I__ZN2pw3kvs18FlashTestPartitionEv

NEW

+162

pw::kvs::FlashPartition::IsRegionErased()

NEW

+140

pw::kvs::FlashPartition::Write()

NEW

+116

pw::kvs::FlashPartition::Erase()

NEW

+110

pw::kvs::FlashError::Check()

NEW

+100

pw::kvs::FakeFlashMemory::Erase()

NEW

+94

pw::kvs::FakeFlashMemory::Read()

NEW

+80

pw::kvs::FlashPartition::FlashPartition()

NEW

+68

pw::kvs::FlashPartition::Read()

NEW

+64

pw::kvs::FlashPartition::CheckBounds()

NEW

+48

pw::kvs::FakeFlashMemory::FlashAddressToMcuAddress()

NEW

+48

pw::kvs::FakeFlashMemoryBuffer<>

NEW

+48

pw::kvs::FlashPartition

NEW

+28

_GLOBAL__sub_I__ZN2pw3kvs10FlashError5CheckENS_4spanIS1_Lj4294967295EEEmj

NEW

+28

__cxa_atexit

NEW

+26

pw::kvs::FlashPartition::size_bytes()

NEW

+20

pw::kvs::FakeFlashMemoryBuffer<>::~FakeFlashMemoryBuffer()

NEW

+20

pw::kvs::FlashPartition::PartitionToFlashAddress()

NEW

+18

pw::kvs::FlashPartition::~FlashPartition()

NEW

+16

_GLOBAL__sub_I_test_partition

NEW

+16

__wrap_free

NEW

+10

__aeabi_atexit

NEW

+8

[section .ARM]

NEW

+8

operator delete()

NEW

+8

pw::kvs::FlashTestPartition()

NEW

+6

pw::kvs::FakeFlashMemory::Disable()

NEW

+6

pw::kvs::FakeFlashMemory::Enable()

NEW

+6

pw::kvs::FlashMemory::SelfTest()

NEW

+6

pw::kvs::FlashPartition::Init()

NEW

+6

pw::kvs::FlashPartition::sector_size_bytes()

NEW

+4

pw::kvs::FakeFlashMemory::IsEnabled()

NEW

+4

pw::kvs::FlashPartition::sector_count()

+1,852

Storage Allocation#

Pigweed AI summary: KVS requires more storage space than the actual size of the key-value data stored due to the always free sector required for garbage collection and the "write and garbage collect later" approach. It works best with stored data being less than 50% of the available storage and poorly with stored data being more than 75%. Applications that prefer/need to do garbage collection at scheduled times or that write heavily can benefit from additional flash store space. The flash storage used by KVS is multiplied by the redundancy

KVS requires more storage space than the size of the key-value data stored. This is due to the always free sector required for garbage collection and the “write and garbage collect later” approach KVS uses.

KVS works poorly with stored data being more than 75% of the available storage. It works best with stored data being less than 50% of the available storage. For applications that prefer/need to do garbage collection at scheduled times or that write very heavily can benefit from additional flash store space.

The flash storage used by KVS is multiplied by redundancy used. A redundancy of 2 will use twice the storage.

Key-Value Entry#

Pigweed AI summary: This paragraph explains the structure and behavior of key-value (KV) entries, which consist of header/metadata, key data, and value data. Each KV entry is contained within a single flash sector and has a maximum size equal to the partition sector size. KV entries are appended to sectors over time and can be written to multiple active sectors simultaneously. When a key is rewritten, the new entry is stored at a new location with a higher transaction ID, while the previous entry remains unaltered but is

Each key-value (KV) entry consists of a header/metadata, the key data, and value data. Individual KV entries are contained within a single flash sector (do not cross sector boundaries). Because of this the maximum KV entry size is the partition sector size.

KV entries are appended as needed to sectors, with append operations spread over time. Each individual KV entry is written completely as a single high-level operation. KV entries are appended to a sector as long as space is available for a given KV entry. Multiple sectors can be active for writing at any time.

When a key is rewritten (writing a new KV entry of an existing key), the KV entry is stored at a new location that may or may not be located in the same sector as the previous entry for that key. The new entry uses a transaction ID greater than the previous transaction ID. The previous KV entry for that key remains unaltered “on-disk” but is considered “stale”. It is garbage collected at some future time.

Redundancy#

Pigweed AI summary: KVS allows for storing redundant copies of KV entries, with N total copies for a given redundancy level. These copies are stored in different sectors to protect against corruption or sector loss. However, redundancy also increases flash and RAM usage.

KVS supports storing redundant copies of KV entries. For a given redundancy level (N), N total copies of each KV entry are stored. Redundant copies are always stored in different sectors. This protects against corruption or even full sector loss in N-1 sectors without data loss.

Redundancy increases flash usage proportional to the redundancy level. The RAM usage for KVS internal state has a small increase with redundancy.

Garbage Collection#

Pigweed AI summary: This section discusses the process of garbage collection in a Key-Value Store (KVS). Stale KV entries are reclaimed through a base garbage collection operation, which is done one sector at a time. The KVS always keeps at least one sector free for garbage collection, and this free sector is rotated as part of wear leveling. Full Maintenance and Heavy Maintenance are two types of garbage collection, with Heavy Maintenance removing entries for deleted keys. Garbage collection can be performed manually or automatically to make space for

Storage space occupied by stale KV entries is reclaimed and made available for reuse through a garbage collection process. The base garbage collection operation is done to reclaim one sector at a time.

KVS always keeps at least one sector free at all times to ensure the ability to garbage collect. This free sector is used to copy valid entries from the sector to be garbage collected before erasing the sector to be garbage collected. The always free sector is rotated as part of the KVS wear leveling.

Full Maintenance does garbage collection of all sectors except those that have current valid KV entries.

Heavy Maintenance does garbage collection of all sectors, including removing entries for deleted keys. Use strong caution when doing Heavy Maintenance as it can, compared to Full Maintenance, result in a significant amount of moving valid entries,

Garbage collection can be performed by request of higher level software or automatically as needed to make space available to write new entries.

Flash wear management#

Pigweed AI summary: Flash wear management is achieved through wear leveling, which involves cycling the selection of the next sector to write to in order to spread flash wear across all free sectors. New writes and rewrites of key-values prefer sectors that are already in use, with new sectors used only when no in-use sectors have enough available space. The first available sector is searched for, starting from the current write sector +1 and wrapping around to the end of the partition. This reduces wear on any single sector and erase count is

Wear leveling is accomplished by cycling selection of the next sector to write to. This cycling spreads flash wear across all free sectors so that no one sector is prematurely worn out.

Wear leveling through cycling selection of next sector to write

  • Location of new writes/rewrites of key-values will prefer sectors already in-use (partially filled), with new (blank) sectors used when no in-use sectors have large enough available space for the new write

  • New (blank) sectors selected cycle sequentially between available free sectors

  • Search for the first available sector, starting from current write sector + 1 and wrap around to start at the end of partition.

  • This spreads the erase/write cycles for heavily written/rewritten key-values across all free sectors, reducing wear on any single sector

  • Erase count is not considered as part of the wear leveling decision making process

  • Sectors with already written key-values that are not modified will remain in the original sector and not participate in wear-leveling, so long as the key-values in the sector remain unchanged

Configuration#

PW_KVS_MAX_FLASH_ALIGNMENT#

The maximum flash alignment supported.

PW_KVS_REMOVE_DELETED_KEYS_IN_HEAVY_MAINTENANCE#

Whether to remove deleted keys in heavy maintanence. This feature costs some code size (>1KB) and is only necessary if arbitrary key names are used. Without this feature, deleted key entries can fill the KVS, making it impossible to add more keys, even though most keys are deleted.