Optimize algorithmic complexity and fix go vet warnings #578

Open
mattn wants to merge 7 commits into gorgonia:master from mattn:optimize-performance

mattn (Member) commented Mar 12, 2026

Improve algorithmic complexity of hot-path graph operations (O(n²) → O(n) for Nodes.remove, Nodes.Equals, containsDuplicate), pre-populate the node ID cache to eliminate linear fallback searches, pre-size maps in the code generator, and fix go vet warnings.
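
The three complexity changes named above follow well-known patterns. The following is a minimal sketch of those patterns, not the PR's actual code: the `node` type and the helpers `removeAll` and `equalAsMultiset` are hypothetical stand-ins, and it assumes equality is order-insensitive and keyed on pointer identity. The bounds are expected O(n), since the map-based versions rely on hashing.

```go
package main

import "fmt"

// node is a hypothetical stand-in for a graph node.
type node struct{ id int64 }

// removeAll drops every occurrence of target in a single pass. It reuses the
// backing array of ns, so no element is shifted more than once.
func removeAll(ns []*node, target *node) []*node {
	out := ns[:0]
	for _, n := range ns {
		if n != target {
			out = append(out, n)
		}
	}
	// Nil out the unused tail so removed pointers can be garbage collected.
	for i := len(out); i < len(ns); i++ {
		ns[i] = nil
	}
	return out
}

// equalAsMultiset reports whether a and b hold the same nodes with the same
// multiplicities, regardless of order (an assumed definition of equality).
func equalAsMultiset(a, b []*node) bool {
	if len(a) != len(b) {
		return false
	}
	counts := make(map[*node]int, len(a))
	for _, n := range a {
		counts[n]++
	}
	for _, n := range b {
		if counts[n] == 0 {
			return false
		}
		counts[n]--
	}
	return true
}

// containsDuplicate reports whether any node appears twice, using a seen-set
// instead of comparing every pair.
func containsDuplicate(ns []*node) bool {
	seen := make(map[*node]struct{}, len(ns))
	for _, n := range ns {
		if _, ok := seen[n]; ok {
			return true
		}
		seen[n] = struct{}{}
	}
	return false
}

func main() {
	a, b, c := &node{id: 1}, &node{id: 2}, &node{id: 3}
	fmt.Println(len(removeAll([]*node{a, b, a, c}, a)))              // 2
	fmt.Println(equalAsMultiset([]*node{a, b, c}, []*node{c, a, b})) // true
	fmt.Println(containsDuplicate([]*node{a, b, a}))                 // true
}
```

Each map is pre-sized with the input length, which trades one up-front allocation for fewer rehashes on larger inputs.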

Benchmark (benchstat)

                                                         │    before     │       after        │
                                                         │    sec/op     │    sec/op     diff  │
TypeSystem-16                                                 1.214µ          1.050µ     -13.5%
Reshape_Dense/simple-16                                       5.943µ          5.096µ     -14.3%
Reshape_Dense/simple_big_tensor-16                            5.713µ          5.039µ     -11.8%
Reshape_Dense/negative_dim1_1-16                              5.283µ          4.903µ      -7.2%
Reshape_Dense/negative_dim1_2-16                              4.829µ          4.930µ         ~
Reshape_Dense/negative_dim0_1-16                              4.996µ          5.037µ         ~
Reshape_Dense/negative_dims0.1_with_error-16                  1.203µ          1.353µ         ~
Reshape_Dense/devative_dim0_with_error-16                     1.553µ          1.586µ         ~
OneMil-16                                                     12.98m          12.77m      -1.6%
CTCLossForward/Benchmark_#1_float64_((4,_4,_4))-16           35.54µ          37.13µ         ~
CTCLossForward/Benchmark_#2_float64_((4,_1024,_4))-16        775.1µ          783.7µ         ~
CTCLossForward/Benchmark_#3_float64_((4,_1024,_1024))-16     33.35m          34.33m         ~
CTCLossForward/Benchmark_#4_float64_((4,_2048,_8))-16        1.779m          2.125m         ~
SoftmaxLargeOldAxis0-16                                       64.56           64.19          ~
TrainingConcurrent-16                                         180.6m          181.1m         ~
TrainingNonConcurrent-16                                      78.46m          76.39m      -2.6%
TapeMachineExecution-16                                       45.94m          46.83m         ~
geomean                                                       335.7µ          333.8µ    -0.57%

mattn added 3 commits March 12, 2026 22:30
- collections.go: Nodes.remove O(n²) → O(n) single-pass filter
- collections.go: Nodes.Equals O(n²) → O(n) with map-based lookup
- operations.go: containsDuplicate O(n²) → O(n) with seen-set
- graph.go: pre-populate byID cache in addToAll to avoid linear fallback
- compile.go: pre-size maps in newCodeGenerator with len(sorted)
- blas.go: use sync.RWMutex for read-heavy WhichBLAS path
- op_tensor.go: remove unreachable dead code after return
- examples/stacked_autoencoder: use keyed fields in jpeg.Options

- actions/cache v2 → v4 (v2 has been shut down)
- actions/checkout v2 → v4
- actions/setup-go v2 → v5
- codecov/codecov-action v1 → v4
- Replace deprecated set-output with GITHUB_OUTPUT

macos-latest now resolves to arm64 which does not support Go 1.15/1.16.
Pin to macos-13 (last Intel runner) for amd64 builds.
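
Two of the items in the first commit, pre-populating an ID cache on insertion and guarding a read-heavy setting with `sync.RWMutex`, can be sketched as follows. This is an illustration under assumptions, not the code in graph.go or blas.go: the `graph`, `node`, and `backend` types, their methods, and the backend names `"native"` and `"cblas"` are all hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// node is a hypothetical graph node.
type node struct {
	id   int64
	name string
}

// graph keeps nodes in insertion order plus an index by ID.
type graph struct {
	all  []*node
	byID map[int64]*node
}

// newGraph pre-sizes both containers when the node count is known up front.
func newGraph(sizeHint int) *graph {
	return &graph{
		all:  make([]*node, 0, sizeHint),
		byID: make(map[int64]*node, sizeHint),
	}
}

// add records n and indexes it in the same step, so lookup never needs to
// fall back to scanning g.all.
func (g *graph) add(n *node) {
	if _, ok := g.byID[n.id]; ok {
		return // already present
	}
	g.all = append(g.all, n)
	g.byID[n.id] = n
}

// lookup is a single map access.
func (g *graph) lookup(id int64) (*node, bool) {
	n, ok := g.byID[id]
	return n, ok
}

// backend holds a setting that is read often and written rarely.
type backend struct {
	mu   sync.RWMutex
	name string
}

// get takes the read lock, so concurrent readers do not block each other.
func (b *backend) get() string {
	b.mu.RLock()
	defer b.mu.RUnlock()
	return b.name
}

// set takes the exclusive write lock.
func (b *backend) set(name string) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.name = name
}

func main() {
	g := newGraph(2)
	g.add(&node{id: 1, name: "x"})
	g.add(&node{id: 2, name: "y"})
	if n, ok := g.lookup(2); ok {
		fmt.Println(n.name) // y
	}

	b := &backend{name: "native"}
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = b.get() // readers proceed in parallel
		}()
	}
	wg.Wait()
	b.set("cblas")
	fmt.Println(b.get()) // cblas
}
```

An `RWMutex` only pays off when reads clearly outnumber writes; with frequent writes a plain `sync.Mutex` is usually simpler and no slower.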

codecov Bot commented Mar 12, 2026

Codecov Report

❌ Patch coverage is 90.62500% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 60.18%. Comparing base (d7a3ce2) to head (cb7a057).

Files with missing lines    Patch %    Lines
operations.go               60.00%     1 Missing and 1 partial ⚠️
collections.go              91.66%     0 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #578      +/-   ##
==========================================
+ Coverage   57.55%   60.18%   +2.63%     
==========================================
  Files          77       77              
  Lines       13202    14915    +1713     
==========================================
+ Hits         7598     8977    +1379     
- Misses       4663     4980     +317     
- Partials      941      958      +17     
Flag         Coverage Δ
unittests    60.18% <90.62%> (+2.63%) ⬆️

mattn added 4 commits March 12, 2026 22:42
macos-13 is deprecated and macos-latest is now arm64.
Go 1.15/1.16 don't support darwin/arm64, so bump to 1.21/1.22.

vecf32 asm stubs are amd64-only, so avx/sse build tags fail on
darwin/arm64. Restrict the macOS matrix to tag-less builds only.

Set ASSUME_NO_MOVING_GC_UNSAFE_RISK_IT_WITH=$(go env GOVERSION) for
all test steps. Also bump coverage.yml Go version from 1.15 to 1.22.

go env GOVERSION returns go1.22.10 but the library expects go1.22.
Use sed to strip the trailing .N patch version.
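
The last commit does this transformation with sed in the workflow. For illustration only, here is the same string rewrite expressed in Go; `stripPatch` is a hypothetical helper, not part of the PR or of any library.

```go
package main

import (
	"fmt"
	"strings"
)

// stripPatch reduces a version such as "go1.22.10" to "go1.22". Strings that
// have no patch component are returned unchanged.
func stripPatch(v string) string {
	parts := strings.SplitN(v, ".", 3)
	if len(parts) < 3 {
		return v
	}
	return parts[0] + "." + parts[1]
}

func main() {
	fmt.Println(stripPatch("go1.22.10")) // go1.22
	fmt.Println(stripPatch("go1.22"))    // go1.22
}
```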