Stars
turbo-tan / llama.cpp-tq3
Forked from ggml-org/llama.cpp
llama.cpp fork with TQ3_1S/4S CUDA kernels: 3.5-bit WHT quantization achieving Q4s quality at 10% smaller size. Based on RaBitQ-inspired Walsh-Hadamard transform. Enables 27B models on 16GB GPUs w…
An LM Studio plugin that allows the assistant to view the links, image URLs and text content of web pages.
A theoretical reconstruction of the Claude Mythos architecture, built from first principles using the available research literature.
LLM inference in C/C++
KV cache compression via block-diagonal rotation. Beats TurboQuant: better PPL (6.91 vs 7.07), 28% faster decode, 5.3x faster prefill, 44x fewer params. Drop-in llama.cpp integration.
The agent that grows with you
Adaptive Precision for EXpert Models: MoE-aware mixed-precision quantization
LocalAGI is a powerful, self-hostable AI Agent platform designed for maximum privacy and flexibility. A complete drop-in replacement for OpenAI's Responses APIs with advanced agentic capabilities. …
🧠 100% Local Memory layer and Knowledge base for agents with WebUI
⛵ The immutable, decentralized, statically built p2p VPN without any central server and automatic discovery! Create decentralized introspectable tunnels over p2p with shared tokens
LocalAI is the open-source AI engine. Run any model - LLMs, vision, voice, image, video - on any hardware. No GPU required.
TheTom / llama-cpp-turboquant
Forked from ggml-org/llama.cpp
LLM inference in C/C++
Claw Code for local models — run claw against Ollama, LM Studio, or any OpenAI-compatible endpoint
Agent skills for coding CLIs, multi-agent runtimes, context engines, MCP extensions, and terminal tooling. Instead of using Claude Code's source code, give your agent skills to create your own!
The repo is finally unlocked. Enjoy the party! The fastest repo in history to surpass 100K stars ⭐. Join Discord: https://discord.gg/5TUQKqFWd Built in Rust using oh-my-codex.
Open Claude is an open-source coding-agent CLI for OpenAI, Gemini, DeepSeek, Ollama, Codex, GitHub Models, and 200+ models via OpenAI-compatible APIs.
Browse media content with your own rules on Android TV
"OpenHarness: Open Agent Harness with a Built-in Personal Agent--Ohmo!"
A unified TypeScript SDK for building chat bots across Slack, Microsoft Teams, Google Chat, Discord, and more.
Qwen3.6 is the large language model series developed by Qwen team, Alibaba Group.
OpenClaw: Use All Major AI Models NO API Token! Claude/ChatGPT/Gemini/DeepSeek/Doubao/Grok/Qwen/Manus/Kimi
A complete AI agency at your fingertips - From frontend wizards to Reddit community ninjas, from whimsy injectors to reality checkers. Each agent is a specialized expert with personality, processes…
Model Context Protocol Servers
Claude Code is an agentic coding tool that lives in your terminal, understands your codebase, and helps you code faster by executing routine tasks, explaining complex code, and handling git workflo…