feat(metrics): add binomial splitting for single-image FRC/FSC#40
feat(metrics): add binomial splitting for single-image FRC/FSC#40alxndrkalinin merged 24 commits intomainfrom
Conversation
Add a procedural commit workflow checklist (list → group by concept → plan N commits → execute) to prevent grouping changes by workflow step. Simplify CLAUDE.md to just reference AGENTS.md. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Implement random binomial splitting (Rieger et al., Optics Express 2024) as an alternative to checkerboard splitting for single-image FRC/FSC. Each pixel's photon count is split into n1 ~ Binomial(n, 0.5) and n2 = n - n1, producing two noise-independent images with preserved Poisson statistics and no spatial subsampling. New `split_type="binomial"` parameter threads through calculate_frc, frc_resolution, fsc_resolution, and calculate_sectioned_fsc. Supports counts mode (raw camera data with gain/offset/readout noise correction) and poisson_thinning mode (fallback for float/deconvolved images). Multi-repeat averaging via `n_repeats` produces correlation-std and resolution-std for uncertainty quantification. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
- Use cubic.cuda abstraction (to_same_device/asnumpy) in binomial_split - Use nanmean/nanstd for multi-repeat FRC averaging - Add xy_std/z_std keys to single-pass binomial fsc_resolution - Document resolution averaging asymmetry in FRC - Add ndim, gain, readout_noise_rms validation to binomial_split - Fix gain convention docstring in binomial_split Co-Authored-By: Claude Sonnet 4.6 (1M context) <[email protected]>
- Warn when n_repeats>1 is silently ignored (checkerboard or two-image) - Clarify n_repeats docstring: requires single-image + binomial mode - Add xy_std/z_std keys to fsc_resolution mask backend return path Co-Authored-By: Claude Sonnet 4.6 (1M context) <[email protected]>
…, iterators.py - preprocess_images: assert image2 is not None in else branch and before hamming_window - iterators.py: accept int | float for d_extract_angle - calculate_sectioned_fsc: use eff_spacing for FourierCorrelationAnalysis, add shape3d for sectioned_bin_id - fsc_resolution: assert spacing_list is not None after _normalize_spacing guard - calculate_frc/fsc_resolution: assert original_image1 is not None in reverse/do_average blocks - grid_crop_resolution (both): raise ValueError for None spacing, use spacing_list for indexing, annotate aggregate_fn as Callable, rename pad_size to half/pad_arg to avoid int/tuple reassignment - image_utils.py: merge poisson_thinning img1/img2 assignments with to_same_device - bandlimited.py: type _APODIZATION_FNS as dict[str, Callable] to fix apo_fn operator errors Co-Authored-By: Claude Sonnet 4.6 (1M context) <[email protected]>
- Add targeted type-ignore comments for pre-existing mypy errors - Narrow exception handling in estimate_cutoff to (ValueError, RuntimeError, TypeError) - Replace print/assert with logging/ValueError in average_precision - Fix typos in README - Enforce mypy type checking (remove ignore_errors = true from pyproject.toml) - Add class-level annotations to CUDAManager, cast mesh volume - Align spacing to Sequence[float] in dcr.py internals - Replace ArrayLike with ndarray in segmentation modules Co-Authored-By: Claude Sonnet 4.6 (1M context) <[email protected]>
- Replace assert with RuntimeError in _calculate_frc_single_pass for original_image1 None check (assert stripped by python -O) - Warn when n_repeats>1 is silently ignored on deprecated mask backend in fsc_resolution Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Reviewer's GuideAdds a new binomial photon-count splitting method for single-image FRC/FSC (with counts and Poisson-thinning modes and multi-repeat uncertainty estimates), threads the new split configuration through FRC/FSC APIs, and tightens type-safety and error handling across metrics, segmentation, CUDA, and documentation to satisfy mypy/ruff constraints. Class diagram for FourierCorrelationData with new statistics fieldsclassDiagram
class FourierCorrelationData {
+FixedDictionary correlation
+FixedDictionary resolution
+__init__(data)
}
class FourierCorrelationDataCollection {
+__getitem__(index) FourierCorrelationData
+__setitem__(index, value) void
+__iter__() FourierCorrelationData
}
class FixedDictionary {
-keys: list~str~
-data: dict~str, Any~
+__getitem__(key) Any
+__setitem__(key, value) void
}
FourierCorrelationData --> FixedDictionary : uses correlation
FourierCorrelationData --> FixedDictionary : uses resolution
%% Correlation keys
class CorrelationFields {
+correlation: np.ndarray
+frequency: np.ndarray
+points_x_bin: np.ndarray
+curve_fit: np.ndarray
+curve_fit_coefficients: np.ndarray
+correlation_std: np.ndarray
}
%% Resolution keys
class ResolutionFields {
+threshold: float
+criterion: str
+resolution_point: float
+resolution_threshold_coefficients: np.ndarray
+resolution: float
+spacing: float
+resolution_std: float
}
FourierCorrelationData o-- CorrelationFields : logical_fields
FourierCorrelationData o-- ResolutionFields : logical_fields
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Hey - I've found 3 issues, and left some high level feedback:
- The informational log on
n_repeats == 1for binomial splits (_BINOMIAL_SINGLE_REPEAT_MSG) is emitted on every call and may be noisy in real pipelines; consider downgrading this tologger.debug, or gating it with a one-time flag or usingwarnings.warnif you specifically want user-facing feedback. - The binomial FRC/FSC parameter set (
split_type,counts_mode,gain,offset,readout_noise_rms,n_repeats,rng) is threaded manually through several public functions; consider introducing a small configuration object or helper to reduce duplication and the risk of parameter drift across these call sites.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The informational log on `n_repeats == 1` for binomial splits (`_BINOMIAL_SINGLE_REPEAT_MSG`) is emitted on every call and may be noisy in real pipelines; consider downgrading this to `logger.debug`, or gating it with a one-time flag or using `warnings.warn` if you specifically want user-facing feedback.
- The binomial FRC/FSC parameter set (`split_type`, `counts_mode`, `gain`, `offset`, `readout_noise_rms`, `n_repeats`, `rng`) is threaded manually through several public functions; consider introducing a small configuration object or helper to reduce duplication and the risk of parameter drift across these call sites.
## Individual Comments
### Comment 1
<location path="cubic/preprocessing/deconvolution.py" line_range="301-303" />
<code_context>
for i, result in enumerate(results):
- results[i]["iter_image"] = asnumpy(result["iter_image"])
+ iter_image = result["iter_image"]
+ assert isinstance(iter_image, np.ndarray)
+ results[i]["iter_image"] = asnumpy(iter_image)
return (thresh_iter, results)
</code_context>
<issue_to_address>
**issue (bug_risk):** The new ndarray assertion in `decon_observer` may break GPU / arraylike results.
This assertion changes behavior: `result['iter_image']` used to be passed directly to `asnumpy`, allowing any arraylike your CUDA helpers support (e.g., CuPy). Now `assert isinstance(iter_image, np.ndarray)` will fail for CuPy or other non-NumPy arrays that `asnumpy` can still handle. If you just want to guard against `None` or non-array values, consider dropping the assertion, checking minimal properties (e.g., shape/dtype), or using a duck-typed check (e.g., `hasattr(iter_image, 'shape')`) instead of requiring `np.ndarray`.
</issue_to_address>
### Comment 2
<location path="tests/metrics/spectral/test_frc.py" line_range="524-533" />
<code_context>
+def test_frc_binomial_readout_noise_plateau() -> None:
</code_context>
<issue_to_address>
**issue (testing):** Readout noise plateau test assertion does not match the stated expectation
The docstring states that "Readout noise correction reduces high-frequency FRC plateau bias", but `tail_with <= tail_no + 0.05` permits the corrected plateau to be slightly higher than the uncorrected one. To better enforce the intended behavior, consider an assertion like `tail_with <= tail_no` (optionally with a small tolerance for numerical noise, e.g. `tail_with <= tail_no * 1.01`), so the test fails when the correction degrades the plateau instead of allowing non-improving or slightly worse outcomes to pass.
</issue_to_address>
### Comment 3
<location path="CLAUDE.md" line_range="3" />
<code_context>
-This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
-
[email protected]
+Read @AGENTS.md file and follow instructions closely.
</code_context>
<issue_to_address>
**nitpick (typo):** Add missing articles/possessive for smoother grammar in this sentence.
Consider rephrasing to: `Read the @AGENTS.md file and follow its instructions closely.` This fixes the missing article and possessive while keeping the same meaning.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
- grid_crop_resolution / five_crop_resolution: replace assert for input validation with raise ValueError (asserts are stripped by python -O) - fsc_resolution: replace assert spacing_list is not None with RuntimeError - Remove narration comment "No cutoff correction for binomial" in calculate_frc Co-Authored-By: Claude Sonnet 4.6 (1M context) <[email protected]>
- Extract calibration warning thresholds as named constants in image_utils.py - Add clipping-fraction warning after readout noise correction — large readout_noise_rms relative to photon counts zeros many pixels silently - Replace assert image2 is not None with raise RuntimeError in preprocess_images — asserts are stripped by python -O - Warn when counts_mode != 'counts' with split_type='checkerboard' in calculate_frc and frc_resolution — params are silently ignored there - Document RNG state mutation in _make_repeat_rngs docstring Co-Authored-By: Claude Sonnet 4.6 (1M context) <[email protected]>
…s_mode warning - test_frc_binomial_camera_calibration: verifies gain/offset/readout_noise_rms flow through frc_resolution end-to-end with simulated raw camera data - test_counts_mode_warns_without_binomial: verifies UserWarning is raised when counts_mode='poisson_thinning' is passed with split_type='checkerboard' Co-Authored-By: Claude Sonnet 4.6 (1M context) <[email protected]>
n1_cpu is int64; comparing against float32 half_var upcasts to float64, giving a numerically different result than what actually gets clipped (float32 img1). Cast to float32 before comparison to match the clipping dtype. Co-Authored-By: Claude Sonnet 4.6 (1M context) <[email protected]>
assert isinstance(iter_image, np.ndarray) fails for CuPy arrays, which asnumpy() handles correctly. The assert was overly strict and violated the device-agnostic contract of the module. Co-Authored-By: Claude Sonnet 4.6 (1M context) <[email protected]>
The fixed seed produces tail_with=0.039 vs tail_no=0.121, so the +0.05 additive slack was masking potential regressions where the correction makes things worse. Remove the slack to enforce that corrected plateau is strictly no worse than uncorrected. Co-Authored-By: Claude Sonnet 4.6 (1M context) <[email protected]>
Co-Authored-By: Claude Sonnet 4.6 (1M context) <[email protected]>
There was a problem hiding this comment.
Pull request overview
Adds a binomial photon-count splitting strategy for single-image FRC/FSC (as an alternative to checkerboard), threads the new parameters through spectral metrics APIs, and tightens typing/docs/tests across the project.
Changes:
- Add
binomial_splitand integratesplit_type="binomial"into FRC/FSC workflows (incl. multi-repeat uncertainty outputs). - Extend/adjust spectral metrics (FRC/FSC/DCR) typing, spacing handling, and internal helpers.
- Add/expand tests and documentation; tighten mypy configuration and various type hints.
Reviewed changes
Copilot reviewed 19 out of 19 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/test_image_utils.py | Adds unit tests for binomial_split behavior and edge cases. |
| tests/metrics/spectral/test_frc.py | Adds FRC/FSC tests covering binomial splitting, repeats, calibration/read-noise behavior. |
| README.md | Fixes typos/wording in project overview. |
| pyproject.toml | Enables stricter mypy by removing ignore_errors. |
| cubic/segmentation/segment_utils.py | Updates type hints and watershed flow; adds assertions and refactors CPU masking. |
| cubic/segmentation/cellpose.py | Replaces npt.ArrayLike with np.ndarray typing in signatures. |
| cubic/metrics/spectral/README.md | Documents splitting methods and adds binomial usage guidance. |
| cubic/metrics/spectral/iterators.py | Broadens d_extract_angle type to allow floats. |
| cubic/metrics/spectral/frc.py | Implements binomial split plumbing, repeat averaging, and cutoff behavior changes. |
| cubic/metrics/spectral/dcr.py | Adjusts spacing typing and sectioned DCR internals. |
| cubic/metrics/spectral/analysis.py | Extends result containers to include std fields. |
| cubic/metrics/bandlimited.py | Tightens typing for apodization function registry. |
| cubic/metrics/average_precision.py | Hoists optional numba JIT helper and replaces asserts with exceptions. |
| cubic/image_utils.py | Introduces binomial_split implementation and related calibration warnings. |
| cubic/feature/voxel.py | Clarifies spacing handling for cucim memoization. |
| cubic/feature/mesh.py | Attempts to harden mesh feature extraction against missing volume values. |
| cubic/cuda.py | Adds annotations for CUDAManager singleton fields. |
| CLAUDE.md | Updates contributor guidance pointer to AGENTS.md. |
| AGENTS.md | Adds a commit workflow checklist. |
Comments suppressed due to low confidence (1)
cubic/metrics/spectral/dcr.py:530
- This adds a new
# type: ignore[arg-type]when callingsectioned_bin_id. Sincesectioned_bin_idrequires atuple[int, int, int], consider constructing a properly-typedshape3d = (int(shape[0]), int(shape[1]), int(shape[2]))(similar to the approach used infrc.py) so mypy passes without suppressions.
# Get sectioned bin IDs
radial_id, angle_id = sectioned_bin_id(
shape, # type: ignore[arg-type]
r_edges,
angle_edges,
spacing=list(spacing) if spacing is not None else None,
exclude_axis_angle=exclude_axis_angle,
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Resolve conflicts from 8883d9c landing on main after our branch diverged. All conflicts resolved by keeping HEAD (superset of main's changes) plus applying main's label() -> np.ndarray return type fix that was lost in the conflict block. Co-Authored-By: Claude Sonnet 4.6 (1M context) <[email protected]>
…nt_watershed Same pattern as deconvolution.py: assert isinstance(distance, np.ndarray) fails for CuPy arrays returned by distance_transform_edt on GPU. Replace with targeted type: ignore comments on the downstream usages that mypy cannot narrow without the assert. Co-Authored-By: Claude Sonnet 4.6 (1M context) <[email protected]>
- frc.py: fix gain docstring in calculate_frc and fsc_resolution — gain is ADU/electron (electrons = (image - offset) / gain), not electrons/ADU as previously stated - mesh.py: replace `float(mesh.volume or 0.0)` with an explicit isfinite guard; NaN is truthy in Python so `nan or 0.0` == nan, which propagates silently through Extent/Solidity divisions Co-Authored-By: Claude Sonnet 4.6 (1M context) <[email protected]>
When n_repeats > 1, the averaged mean FRC curve can stay above the threshold at all frequencies (especially with poisson_thinning on moderate-signal data), yielding NaN for the mean-curve resolution even though individual repeats produce valid resolutions. Fall back to the mean of per-repeat resolutions, matching the FSC multi-repeat path which already uses nanmean. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Side-by-side comparison of the two single-image splitting strategies for FRC (2D) and FSC (3D): checkerboard (Koho 2019) vs binomial (Rieger 2024). Demonstrates poisson_thinning mode, multi-repeat uncertainty quantification, and directional 3D FSC on pollen data. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
…on floor Binomial counts split produces anticorrelated noise (n1 + n2 = n). The FRC hist backend took abs() of the cross-spectrum numerator, flipping negative noise correlation to a false ~0.3 positive floor at high frequencies. Preserve the sign when signed=True so the FRC correctly goes negative in the noise-dominated regime. The fix threads a signed flag from calculate_frc through _calculate_frc_single_pass to _calculate_frc_core. It is enabled only for split_type="binomial" + counts_mode="counts". Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
…odes - Show all three FRC variants: checkerboard, binomial counts, binomial poisson_thinning (was only showing poisson_thinning) - Fix bug: res_checker was assigned from frc_binom instead of frc_checker - Explain why poisson_thinning gives different results (measures noise model self-consistency, not physical resolution) - Note that checkerboard matches Koho et al. 2019 published values - Set y-axis to [-0.5, 1.05] to show signed FRC negative tail Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Add low-frequency bin exclusion for DC / background / autofluorescence suppression. nbins_low zeros the first N bins (hard cutoff); taper_low applies a half-cosine ramp (soft transition). taper_low takes precedence when both are set. Ported from feat/experimental-spectral-metrics without the experimental weighting methods (smooth_wiener, snr2, etc.) that remain parked. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 15 out of 15 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
The half-cosine ramp used arange(1, n+1)/n which started at 1/n instead of 0, so taper_low=1 had multiplier 1.0 (no taper) and DC was never fully suppressed. Use arange(n)/(n-1) so the first bin is 0 and the last bin is 1, with a special case for n=1. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Summary
binomial_splittoimage_utils.py: implements photon-count splitting (Rieger et al., Optics Express 2024) as an alternative to checkerboard for single-image FRC/FSC. Supports counts mode (with gain/offset/readout noise correction) and Poisson thinning fallback. Multi-repeat averaging vian_repeatsgives correlation-std and resolution-std.split_type="binomial"throughcalculate_frc,frc_resolution,fsc_resolution,calculate_sectioned_fsc.# type: ignoresuppressions.Test plan
pytest tests/metrics/spectral/test_frc.py— binomial split tests (223 lines)pytest tests/test_image_utils.py— binomial_split unit tests (107 lines)uv run mypy --ignore-missing-imports cubic/— no new errorsuv run ruff check . && uv run ruff format --check .— clean🤖 Generated with Claude Code
Summary by Sourcery
Add support for binomial photon-count-based splitting as an alternative to checkerboard for single-image FRC/FSC, including multi-repeat averaging with uncertainty estimates, while tightening typing, validation, and documentation across metrics, segmentation, and feature modules.
New Features:
Bug Fixes:
Enhancements:
Documentation:
Tests:
Chores: