Stars
Search-based optimizer for MLX/Metal on Apple Silicon.
Hugging Face native LLM inference on Apple Silicon via direct Metal
vLLM Metal plugin powered by mlx-swift — high-performance LLM inference on Apple Silicon
Apple Silicon (MLX) port of Karpathy's autoresearch — autonomous AI research loops on Mac, no PyTorch required.
TriAttention — Efficient long reasoning with trigonometric KV cache compression. Enables OpenClaw local deployment on memory-constrained GPUs.
⚡ Native MLX Swift LLM inference server for Apple Silicon. OpenAI-compatible API (see the example sketch after this list), SSD streaming for 100B+ MoE models, TurboQuant KV cache compression, macOS + iOS app.
LocalAI is the open-source AI engine. Run any model - LLMs, vision, voice, image, video - on any hardware. No GPU required.
TheTom / llama-cpp-turboquant
Forked from ggml-org/llama.cpp
LLM inference in C/C++
llama.cpp TurboQuant implementation with CUDA support
miolini / autoresearch-macos
Forked from karpathy/autoresearch
AI agents automatically running research on single-GPU nanochat training, adapted for macOS
I'm crazy and trying to make a FORScan OBD reader work on my Mac.
The missing DevTools for Claude Code — inspect session logs, tool calls, token usage, subagents, and context window in a visual UI. Free, open source.
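The MLX Swift inference server above advertises an OpenAI-compatible API, which means any standard OpenAI client should be able to talk to it. Below is a minimal Python sketch of that pattern; the port, base URL, and model id are assumptions for illustration, not values taken from the server's docs.

# Minimal sketch: calling a local OpenAI-compatible server with the
# standard openai Python client. Endpoint and model id are assumed.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",  # assumed local endpoint
    api_key="not-needed",                 # local servers typically ignore the key
)

response = client.chat.completions.create(
    model="mlx-community/Qwen2.5-7B-Instruct-4bit",  # hypothetical model id
    messages=[{"role": "user", "content": "Hello from Apple Silicon!"}],
    stream=True,  # stream tokens as they are generated
)

for chunk in response:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
print()

Because the wire format matches OpenAI's, the same snippet works against any of the OpenAI-compatible servers in this list (e.g. LocalAI) by changing only the base_url and model name.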