🤗 ml-intern: an open-source ML engineer that reads papers, trains models, and ships ML models
Simple & Scalable Pretraining for Neural Architecture Research
Cisco Time Series Model is a time series forecasting model developed by Cisco via continued pretraining.
Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe
Unified benchmarking framework for time series forecasting, comparing traditional and foundation models with automated pipelines and isolated execution.
Dissecting the Duck's Innards — A DuckDB-based course on the Design and Implementation of Database System Internals
lewtun / parameter-golf (forked from openai/parameter-golf): Train the smallest LM you can that fits in 16MB. Best model wins!
MiSS is a novel PEFT method with a low-rank structure but a new update mechanism distinct from LoRA, striking an excellent balance between performance and efficiency.
dTRPO: Trajectory Reduction in Policy Optimization of Diffusion Large Language Models
Code for “Enhancing Diffusion-Based Sampling with Molecular Collective Variables”
A Lean formalisation of Maryna Viazovska's Fields Medal-winning solution to the sphere packing problem in dimensions 8 and 24.
Course website for 6.S184/6.S975: Generative AI with Stochastic Differential Equations
Code for the papers: “Four Over Six: More Accurate NVFP4 Quantization with Adaptive Block Scaling” and “Adaptive Block-Scaled Data Types”
Course on FlashAttention in Triton
Generic building-block toolbox for training neural networks with adaptive and recursive execution. It provides reusable components to control iteration, stopping, and unrolling during training, ena…
Official Implementation of pMF https://arxiv.org/abs/2601.22158
Official Implementation of "Meta Flow Maps enable scalable reward alignment"
[ICLR 2026] Official code for TraceRL: Revolutionizing post-training for Diffusion LLMs, powering the SOTA TraDo series.
[ICLR 2026] Discrete Diffusion Divergence Instruct (DiDi-Instruct)