kiminh

Ramsey kiminh

Starred repositories

cxcscmu / AutoGEO

[ICLR'26] AutoGEO: a Generative Engine Optimization framework to automatically learn generative engine preferences, and rewrite web contents for more traction.

Python 127 12 Updated Apr 11, 2026

unslothai / unsloth

Web UI for training and running open models like Gemma 4, Qwen3.5, DeepSeek, gpt-oss locally.

Python 62,788 5,475 Updated Apr 23, 2026

kakao / diatool-dpo

Python 14 Updated Aug 25, 2025

kakao / OrchestrationBench

Python 45 10 Updated Apr 17, 2026

KAOPU-XiaoPu / web-design

A Claude Code SKILL for designing beautiful, consistent web pages — spec first, code second.

Python 178 8 Updated Apr 16, 2026

ZJU-REAL / ClawGUI

Build, Evaluate, and Deploy GUI Agents — online RL training, standardized benchmarks, and real-device deployment in one framework.

Python 828 30 Updated Apr 21, 2026

inclusionAI / DR-Venus

Python 42 5 Updated Apr 23, 2026

luoyanze07 / SearchLLM-Exp-v2

Python 1 Updated Mar 1, 2026

AppLovin / AxonCache

C++ 6 Updated Apr 22, 2026

tsinghua-ideal / flash-topk-attention

Efficient and unified implementations for TopK-based sparse attention

Cuda 35 Updated Apr 20, 2026

tottenjordan / adk_pipe

offline agentic workflow for generating ad creatives

Jupyter Notebook 5 1 Updated Apr 1, 2026

qqhard / zero-claw

JavaScript 5 1 Updated Apr 20, 2026

lvzhuojun / video-recsys-pipeline

Industrial-grade video recommendation system: Two-Tower recall + Faiss + DeepFM/DIN ranking

Python 1 Updated Apr 12, 2026

kanru-wang / MIND_Multistage_and_Distill_DLRM_Student

Trained a two-tower ranker and distilled it into a student ranker. Reranking to enforce item diversity, category diversity/coverage, and exposure fairness for categories/new items. Found the best s…

Python 1 Updated Apr 14, 2026

ExpediaGroup / kamae

Feature engineering for big data and quick inference

Python 20 2 Updated Apr 21, 2026

rabiloo / llm-finetuning

Sample for Fine-Tuning LLMs & VLMs

Python 3 4 Updated Apr 3, 2025

ReinierKoops / IR_project_02

CoreIR project: Reproduction of "Query Auto-Completion for Rare Prefixes"

Jupyter Notebook 3 1 Updated Jul 2, 2020

tushar4891 / Space-efficient-Top-k-completion

Virtually every modern search application features some kind of query auto-completion. In its basic form, the problem consists of retrieving from a string set a small number of completions, i.e. st…

C++ 1 Updated Feb 4, 2021

Varunn-31 / Netflix-GPT

This project is a feature-rich, responsive clone of the Netflix UI, supercharged with an AI-powered movie recommendation engine. It allows users to browse popular and trending movies, watch trailer…

TypeScript 1 Updated Jan 16, 2026

Ruggero1912 / CroQS-benchmark

CroQS: a Benchmark for Cross-modal Query Suggestion

HTML 5 Updated Feb 3, 2026

IRLab-UDC / SIQSE

Simulation-based Interactive Query Suggestion Evaluation

Python 7 1 Updated Feb 13, 2026

QiSun123 / GenRDK

Python 20 1 Updated Feb 20, 2025

zhinianboke / xianyu-auto-reply

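The Xianyu auto-reply management system is an automated customer-service system built with Python + FastAPI and designed specifically for the Xianyu platform. It connects to the Xianyu servers over WebSocket, receives and processes messages in real time, and provides intelligent automatic replies.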

Python 4,296 1,216 Updated Apr 23, 2026

hyo-genie / recsys-study

Big-tech recommender systems study (Meta HSTU, MS Recommenders, NVIDIA recsys-examples, YouTube)

Python 1 Updated Apr 22, 2026

rynfar / meridian

Use your Claude Max subscription with OpenCode, Pi, Droid, Aider, Crush, Cline. Proxy that bridges Anthropic's official SDK to enable Claude Max in third-party tools.

TypeScript 933 126 Updated Apr 22, 2026

forrestchang / andrej-karpathy-skills

A single CLAUDE.md file to improve Claude Code behavior, derived from Andrej Karpathy's observations on LLM coding pitfalls.

81,895 7,771 Updated Apr 20, 2026

kaichen / agent-dispatch

TypeScript 1 Updated Apr 20, 2026

saitejasrivilli / flash-attn-from-scratch

Custom LLM inference kernels in Triton & CUDA C++: Flash Attention (beats torch SDPA at seqlen≥1024), int8 GEMM+dequant (104% of fp16 cuBLAS at M=2048), fused RMSNorm+Linear. Benchmarked on NVIDIA …

Python 1 Updated Apr 20, 2026

saitejasrivilli / sglang_spec_decode

SGLang speculative decoding on 4× NVIDIA A30 — implemented lossless draft/verify with a RadixAttention-safe provisional KV cache (insert/commit/evict), accept-reject sampling verified mathematicall…

arrays

parameter-server

lottery-ticket-hypothesis

covariate-shift

submodular-optimization