Feat/shared tensorstore context #407
Draft
edyoshikun wants to merge 4 commits into main from
Conversation
Add ``recheck_cached_data`` to ``TensorStoreConfig`` and forward it into ``ts.open`` in ``TensorStoreImplementation.open_array``. The option controls whether cached chunk data is revalidated on every read (the TensorStore driver default) or only at open time (``"open"``), which is the recommended setting for long-running, read-heavy workloads on networked filesystems (NFS/VAST), where revalidation costs one stat/GETATTR per chunk per read.

``None`` (the default) preserves existing behaviour by omitting the kwarg, so the TensorStore driver keeps its own default. ``True``, ``False``, and ``"open"`` are forwarded verbatim.

Covered by a parametrized test that monkey-patches ``_ts_open`` to assert the kwarg reaches TensorStore for each configured value and is absent when unset.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
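A minimal sketch of the forwarding behaviour described above. The names (``TensorStoreConfig``, ``open_array``, and a ``_ts_open`` stand-in for ``ts.open``) mirror the description but are hypothetical here, since the actual iohub code is not shown in this PR excerpt:

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class TensorStoreConfig:
    # None (default) means the kwarg is omitted entirely, so the
    # TensorStore driver keeps its own default behaviour.
    # True, False, and "open" are forwarded verbatim.
    recheck_cached_data: Any = None


def _ts_open(spec: dict, **kwargs: Any) -> dict:
    # Stand-in for ts.open; records the kwargs it would receive,
    # which is also how the parametrized monkeypatch test can assert
    # what reaches TensorStore.
    return {"spec": spec, "kwargs": kwargs}


def open_array(spec: dict, config: TensorStoreConfig) -> dict:
    kwargs: dict[str, Any] = {}
    if config.recheck_cached_data is not None:
        # Forward the configured value exactly; no coercion.
        kwargs["recheck_cached_data"] = config.recheck_cached_data
    return _ts_open(spec, **kwargs)


# Unset: the kwarg is absent and the driver default applies.
assert "recheck_cached_data" not in open_array({}, TensorStoreConfig())["kwargs"]

# "open": revalidate only at open time (the recommended setting for
# read-heavy workloads on networked filesystems).
result = open_array({}, TensorStoreConfig(recheck_cached_data="open"))
assert result["kwargs"] == {"recheck_cached_data": "open"}
```

Note that the ``None`` check must be an explicit ``is not None`` rather than truthiness, because ``False`` is a meaningful value that must still be forwarded.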
Let callers reuse a single ts.Context across many open_ome_zarr calls.
Problem: every open_ome_zarr(implementation="tensorstore", ...) creates
a fresh TensorStoreImplementation, and each instance lazily builds its
own ts.Context. Workloads that open dozens of plates (multi-experiment
training) end up with N disjoint cache pools and thread pools, none of
which share chunk data. This is a silent regression from iohub 0.2.x's
Position.tensorstore(context=...) API, which allowed Context sharing.
Fix: add shared_context: Any = None on TensorStoreConfig. When set,
TensorStoreImplementation._context() returns it verbatim instead of
building a fresh Context from the other knobs. Fully backwards-compatible:
the default (None) preserves the existing per-instance behavior.
Usage:
import tensorstore as ts
from iohub import open_ome_zarr
from iohub.core.config import TensorStoreConfig
shared_ctx = ts.Context({"cache_pool": {"total_bytes_limit": 4_000_000_000}})
cfg = TensorStoreConfig(shared_context=shared_ctx, recheck_cached_data="open")
plate_a = open_ome_zarr(path_a, implementation="tensorstore", implementation_config=cfg)
plate_b = open_ome_zarr(path_b, implementation="tensorstore", implementation_config=cfg)
# plate_a and plate_b now share one Context, one cache pool, one thread pool.
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
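The short-circuit described above can be sketched as follows. The class and attribute names mirror the commit message, but the bodies are illustrative only (a plain dict stands in for ts.Context, and ``cache_pool_bytes`` is a hypothetical example of the "other knobs"):

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class TensorStoreConfig:
    cache_pool_bytes: int = 0   # example of one of the "other knobs"
    shared_context: Any = None  # caller-provided ts.Context, if any


class TensorStoreImplementation:
    def __init__(self, config: TensorStoreConfig) -> None:
        self._config = config
        self._ctx: Any = None

    def _context(self) -> Any:
        if self._config.shared_context is not None:
            # Return the caller's Context verbatim: every implementation
            # built from the same config shares one cache pool and one
            # thread pool.
            return self._config.shared_context
        if self._ctx is None:
            # Default path: lazily build a fresh per-instance context
            # from the other knobs (dict standing in for ts.Context).
            self._ctx = {
                "cache_pool": {
                    "total_bytes_limit": self._config.cache_pool_bytes
                }
            }
        return self._ctx


shared = object()  # stands in for a real ts.Context(...)
cfg = TensorStoreConfig(shared_context=shared)
a = TensorStoreImplementation(cfg)
b = TensorStoreImplementation(cfg)
# Both implementations hand back the very same object.
assert a._context() is b._context() is shared

# Without shared_context, each instance builds its own disjoint context.
c = TensorStoreImplementation(TensorStoreConfig())
d = TensorStoreImplementation(TensorStoreConfig())
assert c._context() is not d._context()
```

The identity check (``is``) is the point: sharing a Context is about reusing one object, not about configs that merely compare equal.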