feat(kvp): add store trait and Hyper-V KVP storage crate#288
feat(kvp): add store trait and Hyper-V KVP storage crate#288peytonr18 wants to merge 16 commits intoAzure:mainfrom
Conversation
b134a98 to
f435f70
Compare
There was a problem hiding this comment.
Pull request overview
Adds a new standalone workspace crate (libazureinit-kvp) that defines a storage abstraction (KvpStore) and a production Hyper-V pool-file implementation (HyperVKvpStore), along with updated technical reference documentation.
Changes:
- Introduces
libazureinit-kvpcrate withKvpStoreandKvpLimits(Hyper-V/Azure presets). - Implements Hyper-V binary pool-file backend with flock-based concurrency and stale-file truncation support.
- Rewrites
doc/libazurekvp.mdas a technical reference for the new crate/API.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
Cargo.toml |
Adds libazureinit-kvp to the workspace members. |
libazureinit-kvp/Cargo.toml |
Defines the new crate and its dependencies (fs2, sysinfo). |
libazureinit-kvp/src/lib.rs |
Defines KvpStore, KvpLimits, and exports HyperVKvpStore plus size constants. |
libazureinit-kvp/src/hyperv.rs |
Implements Hyper-V pool-file encoding/decoding and the KvpStore backend, plus unit tests. |
doc/libazurekvp.md |
Updates documentation to describe the new crate’s record format, semantics, truncation behavior, and limits. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
bpryan99
left a comment
There was a problem hiding this comment.
My comments were addressed, LGTM! Great work.
| #[repr(u8)] | ||
| pub enum KvpPool { | ||
| /// Host-to-guest data pushed by the host admin (`.kvp_pool_0`). | ||
| External = 0, |
There was a problem hiding this comment.
these should be the file paths so we don't new two params to new()
There was a problem hiding this comment.
There are kinda two ways to do this, depending if you want KvpPoolStore to be pool oriented or path oriented.
The first is this one, which just does an impl and new() will just take in the path (or a custom path if the user wants instead)
libazureinit-kvp/src/kvp_pool.rs
/// Hyper-V KVP pool indices.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(u8)]
pub enum KvpPool {
/// Host-to-guest data pushed by the host admin (`.kvp_pool_0`).
External = 0,
/// Guest-to-host data; cloud-init / azure-init write here (`.kvp_pool_1`).
Guest = 1,
// ...
}
impl From<KvpPool> for PathBuf {
fn from(pool: KvpPool) -> Self {
pool.default_path()
}
}
impl KvpPoolStore {
/// Create a new KVP pool store.
///
/// - `path`: pool file path. Pass a [`KvpPool`] variant for a
/// well-known pool (e.g. `KvpPool::Guest`), or any
/// [`PathBuf`]/[`&Path`] for a custom location.
/// - `mode`: size-limit policy (see [`PoolMode`]).
/// - `clear_if_stale`: if `true`, clears stale data from a
/// previous boot.
pub fn new(
path: impl Into<PathBuf>,
mode: PoolMode,
clear_if_stale: bool,
) -> Result<Self, KvpError> {
let store = Self {
path: path.into(),
mode,
};
if clear_if_stale && store.is_stale()? {
store.clear()?;
}
Ok(store)
}
}
Or, we can semantically change KvpPoolStore to be path oriented as a whole instead.
use std::path::{Path, PathBuf};
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum KvpPool {
External,
Guest,
Auto,
AutoExternal,
AutoInternal,
}
impl KvpPool {
pub fn as_path(self) -> &'static Path {
match self {
Self::External => Path::new("/var/lib/hyperv/.kvp_pool_0"),
Self::Guest => Path::new("/var/lib/hyperv/.kvp_pool_1"),
Self::Auto => Path::new("/var/lib/hyperv/.kvp_pool_2"),
Self::AutoExternal => Path::new("/var/lib/hyperv/.kvp_pool_3"),
Self::AutoInternal => Path::new("/var/lib/hyperv/.kvp_pool_4"),
}
}
}
impl From<KvpPool> for PathBuf {
fn from(pool: KvpPool) -> Self {
pool.as_path().to_path_buf()
}
}
I was leaning towards the first approach, but am open to either depending on what you envisioned here!
| /// - `mode`: size-limit policy (see [`PoolMode`]). | ||
| /// - `truncate_on_stale`: if `true`, clears stale data from a | ||
| /// previous boot. | ||
| pub fn new( |
There was a problem hiding this comment.
let's leave path out of the default new....
impl KvpPool {
pub fn filename(self) -> &'static str {
match self {
Self::External => ".kvp_pool_1",
Self::Guest => ".kvp_pool_2",
}
}
pub fn default_dir() -> &'static Path {
Path::new("/var/lib/hyperv")
}
pub fn default_path(self) -> PathBuf {
Self::default_dir().join(self.filename())
}
}
#[derive(Debug)]
pub struct KvpStore {
pool: KvpPool,
path: PathBuf,
mode: PoolMode,
}
impl KvpStore {
/// Open using the default Hyper-V directory.
pub fn new(pool: KvpPool, mode: PoolMode) -> Result<Self, KvpError> {
Self::new_in(pool, KvpPool::default_dir(), mode)
}
/// Open using a custom directory (filename derived from pool).
pub fn new_in(
pool: KvpPool,
dir: impl AsRef<Path>,
mode: PoolMode,
) -> Result<Self, KvpError> {
let path = dir.as_ref().join(pool.filename());
Ok(Self { pool, path, mode })
}
pub fn path(&self) -> &Path {
&self.path
}
...| pub use kvp_pool::KvpPoolStore; | ||
|
|
||
| /// Key-value store with Hyper-V KVP semantics. | ||
| pub trait KvpStore: Send + Sync { |
There was a problem hiding this comment.
let's remove this in favor of the definition in kvp_pool.rs
and let's do a little restructuring, e.g.:
src/
├── lib.rs
├── error.rs
└── store.rs
and lib.rs is something like:
mod error;
mod store;
pub use error::KvpError;
pub use store::{KvpPool, KvpStore, PoolMode};3fa165a to
cfc44ac
Compare
| /// Release a lock on the entire file. | ||
| /// | ||
| /// Unlock errors are non-actionable (the fd is about to close, which | ||
| /// releases the lock unconditionally), so the return value is discarded. |
There was a problem hiding this comment.
it seems like this should raise the error, but then have the caller ignore it if it's not needed.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #288 +/- ##
==========================================
+ Coverage 84.77% 89.34% +4.56%
==========================================
Files 17 19 +2
Lines 3435 4907 +1472
==========================================
+ Hits 2912 4384 +1472
Misses 523 523 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
c5531ac to
4e1f57a
Compare
…ation; add PoolMode to handle size limits
…ics and JSON dump Overhaul the KvpStore trait into a pure customer-facing interface with no default implementations or backend_* methods. All logic (validation, file I/O, locking) now lives in the KvpPoolStore implementation. Key changes: - Replace append-based write with upsert (insert-or-update) so each key has exactly one record in the file. Eliminates entries_raw and last-write-wins deduplication. - Move validate_key/validate_value from trait defaults to private KvpPoolStore methods. - Decouple read validation from PoolMode: reads accept keys up to the full wire-format max (512 bytes) regardless of mode; only writes are restricted by PoolMode. - Add len(), is_empty(), and dump() (JSON via serde_json) to the trait. - Add 4 multi-threaded concurrency tests covering concurrent upserts to different keys, same key, mixed readers/writers, and key cap boundary.
… locks Use libc fcntl(F_OFD_SETLKW/F_OFD_SETLK) for shared/exclusive/unlock operations so pool file locking interoperates with Linux KVP consumers that use fcntl record locks. Also replace fs2 with libc and keep existing pool behavior/tests intact.
Introduce explicit insert (overwrite existing key) and append (always add record) APIs, refactor pool operations to iterator-based in-place updates/deletes, and document record-count vs unique-key behavior with expanded concurrency/integrity test coverage.
…verage - Split kvp_pool.rs into error.rs (KvpError) and store.rs (KvpStore trait + KvpPoolStore) - lib.rs reduced to module declarations and re-exports - Replace infallible dump_json map_err with expect - Inline get_uptime into boot_time - Add permission-denied, stale-clear, and missing-key-delete tests - 98.5% line coverage; remaining gaps are untestable kernel-level fcntl error paths
Add 8 tests for uncovered paths: invalid UTF-8 decoding, size_hint, iterator read errors, fcntl lock failures, and FIFO error propagation. Minor production cleanups (no behavior change): boot_time() uses .expect instead of unreachable .map_err, lock context message moved into fcntl_lock_* functions, fcntl_unlock changed to void return.
- Replace validate_key_for_read and per-method validation with consistent standalone validate_key/validate_value free functions. - Remove key validation from read and delete (read-path leniency allows Safe-mode stores to read keys written in Unsafe mode). - Move is_empty default impl to the KvpStore trait. - Narrow decode_record/encode_record visibility to private. - Use open_for_read_write helper in clear().
- Merge KvpStore trait methods directly into impl KvpPoolStore. The trait added indirection without a consumer; testability is already covered by KvpPoolStore + TempDir. Can be re-introduced if a second backend is ever needed. - fcntl_unlock now returns io::Result<()> so callers can propagate or ignore unlock errors explicitly. clear() surfaces unlock errors via write_result.and(unlock_result), preserving the primary error when both fail. Drop impl ignores the result (can't return from Drop); KvpPoolIter::new error paths rely on fd-drop for lock release, consistent with how its other ? paths already work. - Update lib.rs exports and doc links to reflect the removed trait.
4e1f57a to
680057c
Compare
What this PR introduces
libazureinit-kvp.libazureinit-kvp/src/lib.rs,libazureinit-kvp/src/store.rs, andlibazureinit-kvp/src/error.rs.libc,serde,serde_json,sysinfo.libazureinit. A follow-up PR will replacelibazureinit/src/kvp.rswith this implementation.KVP design
Single concrete type — no trait indirection.
KvpPoolStore: file-backed KVP pool store.KvpPool: enum identifying one of the five Hyper-V pool indices.PoolMode: policy for write-path size limits.KvpError: error type.KvpPoolStoreAPIConstruction:
KvpPoolStore::new(pool, mode)— uses the default directory/var/lib/hyperv.KvpPoolStore::new_in(pool, dir, mode)— uses a caller-supplied directory (file name derived frompool).Reads / queries (do not enforce mode limits on the key argument; a
Safe-mode store can read keys written inUnsafemode):read(key) -> Result<Option<String>, KvpError>entries() -> Result<HashMap<String, String>, KvpError>(deduplicated, last-write-wins)dump() -> Result<Vec<(String, String)>, KvpError>(on-disk order, preserves duplicates)dump_json(path) -> Result<(), KvpError>(writes deduplicated entries as JSON)len() -> Result<usize, KvpError>(on-disk record count, not unique keys)is_empty() -> Result<bool, KvpError>is_stale() -> Result<bool, KvpError>Writes (enforce
PoolModekey and value size limits):insert(key, value) -> Result<(), KvpError>— upsert; overwrites first match in place, removes any further duplicates, and enforces a 1024 unique-key cap for new keys.append(key, value) -> Result<(), KvpError>— blind append; no duplicate check, no unique-key cap. Intended for log/telemetry workloads.delete(key) -> Result<bool, KvpError>— removes all records matching the key; returnstrueif any were present.clear() -> Result<(), KvpError>— truncates the file.clear_if_stale() -> Result<(), KvpError>— convenience:is_stale()+clear().Metadata accessors:
pool() -> KvpPoolmode() -> PoolModepath() -> &Pathmax_key_size() -> usizemax_value_size() -> usizePool identity
KvpPoolenumerates the five standard Hyper-V KVP pool indices:External.kvp_pool_0Guest.kvp_pool_1Auto.kvp_pool_2AutoExternal.kvp_pool_3AutoInternal.kvp_pool_4KvpPool::default_dir()returns/var/lib/hyperv.KvpPool::default_path()returns the full default path for a pool.On-disk format and limits
Hyper-V wire format: fixed-size records, identical across environments.
insert, notappend)PoolModecontrols write-side size limits:SafeUnsafeRead-side operations do not enforce these limits, so a
Safe-mode store can read keys written inUnsafemode.Locking change (
flock->fcntlOFD)fs2advisory locks withlibc::fcntlopen file description (OFD) locks.fcntl(F_OFD_SETLKW, F_RDLCK)fcntl(F_OFD_SETLKW, F_WRLCK)fcntl(F_OFD_SETLK, F_UNLCK)flockandfcntllocks are separate lock domains and do not coordinate. OFD locks are also per-fd (safe across threads sharing a process) while still being compatible with the POSIXfcntllocks used byhv_kvp_daemonand cloud-init.fcntl-style locks.fcntl_unlockreturnsio::Result<()>so callers can choose to surface unlock failures or ignore them (e.g. inDropimpls, where returning errors isn't possible).File I/O behavior
Standard file I/O via
OpenOptions,read_exact,write_all,set_len,seek,flush.open_for_read_write_create()explicitly sets.truncate(false)to avoid implicit truncation on open.KvpPoolIterthat holds the file, keeps the lock for its lifetime, and supports in-place value overwrite and swap-remove. The lock is released when the iterator is dropped.Flows:
insert— open read+write+create, take exclusive lock, iterate records (collecting unique keys into a set). If the key exists, overwrite the first match in place and remove any further duplicates. If the key is new, enforce the 1024 unique-key cap and append at EOF. Flush, unlock on drop.append— open read+write+create, take exclusive lock, seek to EOF, write a new record. No duplicate check, no unique-key cap. Flush, unlock on drop.delete— open read+write, take exclusive lock, iterate records. For each match, swap with the final record and truncate by one. Flush if any were removed, unlock on drop.clear— open read+write, take exclusive lock,set_len(0). Unlock error surfaces only if the truncate succeeded (viawrite_result.and(unlock_result)).read/entries/dump— open read-only, take shared lock, decode records. Unlock on drop.Stale-data handling
is_stale()compares the pool file's mtime to the system boot time (derived fromsysinfo::System::uptime()).clear_if_stale()is a convenience that clears the file only when stale.Validation
Key and value validation are implemented as free functions:
validate_key(key, max)— checks non-empty, no null bytes, and length ≤ max.validate_value(value, max)— checks length ≤ max.Both are called by
appendandinsertusing limits derived from the store'sPoolMode.readanddeletedo not validate the key argument — absent keys simply returnNone/false.Error model
KvpErrorvariants:EmptyKeyKeyContainsNullKeyTooLarge { max, actual }ValueTooLarge { max, actual }MaxUniqueKeysExceeded { max }Io(io::Error)Test coverage
88 tests cover:
SafeandUnsafemodes.insertsemantics: upsert, duplicate collapse, unique-key cap (allows 1024 then rejects 1025th, allows overwrite at cap, allows new key after delete, cap uses unique keys not record count).appendsemantics: no cap, duplicate-key last-wins on read, size validation.readleniency: accepts over-limit keys, returnsNonefor missing keys without validation errors.delete: removes single and duplicate records, returnsfalsefor missing file and missing key, no key validation.clear,clear_if_stale,len,is_empty,is_stale,is_stale_at_boot.dump_jsonround-trips with and without data.next().next().fcntlfailure paths (lock on mismatched fd modes, FIFO truncation / seek errors on Linux).KvpErrordisplay, source, andFrom<io::Error>conversion.Coverage Report