PINNs-Torch, Physics-informed Neural Networks (PINNs) implemented in PyTorch.
Updated Feb 8, 2026 - Python
Foundry materializes CUDA graphs, along with their execution context, to disk to support fast cold starts of serving engines.
A comprehensive guide to using CUDA Graphs effectively with PyTorch, covering CUDA fundamentals, PyTorch integration, Megatron-LM implementations, and practical troubleshooting.
This repository contains the source code for Grape.
Enhancing CUDA Intra-Streaming-Multiprocessor Parallelism for Large Language Models via Fine-Grained Task Graph
A minimalist, production-ready framework for training small language models through 4 sequential stages (Pretraining, SFT, Alignment, Reasoning). Optimized for extreme performance with a hardware-aware inference engine featuring CUDA Graphs, Static KV Caching, and Topic-Aware Conversational Memory.
Enable large-scale transformer model training with GPU-optimized tools and parallelism strategies for research and custom workflows.