Releases: bbeckca/mini-tensor

Mini-Tensor v0.1 – PyTorch-style tensor library in C++ with CUDA + IR tracing

13 Jul 15:04

This is the first stable release of mini-tensor, a PyTorch-inspired C++ tensor library designed for systems-level learning and performance experimentation.

🔧 Key Features

  • 2D and 3D tensors with slicing, reshaping, and broadcasting
  • Arithmetic and matrix ops (manual, Eigen, and CUDA-based)
  • Neural network layers: Linear, ReLU, Softmax, Sequential
  • IR tracing system with named tensor IDs and introspection
  • CUDA support for mat_mul_cuda() and bmm_cuda()
  • Fused CUDA kernel: bmm_add_cuda() for matmul + bias
  • Benchmarks, reproducible builds, and device-safe memory
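To make the fused matmul + bias idea concrete, here is a minimal CPU sketch of what a kernel like bmm_add_cuda() computes. This is not mini-tensor's actual API: the function name bmm_add_reference, the flat row-major layout, and the assumption that the bias has shape [n] and is broadcast across rows and batches are all hypothetical choices made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference for a fused batched matmul + bias.
// Assumed shapes (row-major, flattened):
//   a:    [batch, m, k]
//   b:    [batch, k, n]
//   bias: [n], broadcast over every row of every batch (an assumption)
// Returns out: [batch, m, n], where out[p] = a[p] * b[p] + bias.
std::vector<float> bmm_add_reference(const std::vector<float>& a,
                                     const std::vector<float>& b,
                                     const std::vector<float>& bias,
                                     std::size_t batch, std::size_t m,
                                     std::size_t k, std::size_t n) {
    assert(a.size() == batch * m * k);
    assert(b.size() == batch * k * n);
    assert(bias.size() == n);

    std::vector<float> out(batch * m * n);
    for (std::size_t p = 0; p < batch; ++p) {
        for (std::size_t i = 0; i < m; ++i) {
            for (std::size_t j = 0; j < n; ++j) {
                // "Fused": the accumulator starts at the bias, so no second
                // pass over the output is needed to add it afterwards.
                float acc = bias[j];
                for (std::size_t q = 0; q < k; ++q) {
                    acc += a[(p * m + i) * k + q] * b[(p * k + q) * n + j];
                }
                out[(p * m + i) * n + j] = acc;
            }
        }
    }
    return out;
}
```

On a GPU the same fusion typically saves a separate kernel launch and one extra read and write of the output tensor. A reference like this can also be used to check a CUDA kernel's results on small inputs.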

📊 Performance

  • Up to 600–700× speedup on batched matmul (GPU vs. CPU)
  • Benchmarks shown in README and demo.md

🧠 Why This Exists

Built as a solo project to explore inference system internals, GPU kernel integration, and forward-pass execution from first principles.


🔗 Full code, tests, and benchmarks: README