Popular repositories
- ParallelBench (Public): [ICLR 2026] ParallelBench: Understanding the Tradeoffs of Parallel Decoding in Diffusion LLMs
- eta-inversion (Public): [ECCV 2024] Official PyTorch Implementation for "Eta Inversion: Designing an Optimal Eta Function for Diffusion-based Real Image Editing"
- draft-based-approx-llm (Public): [ICLR 2026] Draft-based Approximate Inference for LLMs
Repositories
Showing 10 of 151 repositories
- vllm-compression-part (Public, forked from vllm-project/vllm): A high-throughput and memory-efficient inference and serving engine for LLMs
- furiosa-rngd-validator (Public)
- furiosa-opt (Public)
- furiosa-feature-discovery (Public)
- furiosa-metrics-exporter (Public)
- furiosa-device-plugin (Public)
- furiosa-smi-go (Public)
- llm-compressor-compression-part (Public, forked from vllm-project/llm-compressor): Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM
- compressed-tensors-compression-part (Public, forked from vllm-project/compressed-tensors): A safetensors extension to efficiently store sparse quantized tensors on disk