kiminh

Starred repositories

[ICLR'26] AutoGEO: a Generative Engine Optimization framework to automatically learn generative engine preferences, and rewrite web contents for more traction.

Python 127 12 Updated Apr 11, 2026

Web UI for training and running open models like Gemma 4, Qwen3.5, DeepSeek, gpt-oss locally.

Python 62,788 5,475 Updated Apr 23, 2026
Python 14 Updated Aug 25, 2025
Python 45 10 Updated Apr 17, 2026

A Claude Code SKILL for designing beautiful, consistent web pages — spec first, code second.

Python 178 8 Updated Apr 16, 2026

Build, Evaluate, and Deploy GUI Agents — online RL training, standardized benchmarks, and real-device deployment in one framework.

Python 828 30 Updated Apr 21, 2026
Python 42 5 Updated Apr 23, 2026
Python 1 Updated Mar 1, 2026
C++ 6 Updated Apr 22, 2026

Efficient and unified implementations for TopK-based sparse attention

Cuda 35 Updated Apr 20, 2026

offline agentic workflow for generating ad creatives

Jupyter Notebook 5 1 Updated Apr 1, 2026
JavaScript 5 1 Updated Apr 20, 2026

Industrial-grade video recommendation system: Two-Tower recall + Faiss + DeepFM/DIN ranking

Python 1 Updated Apr 12, 2026

Trained a two-tower ranker and distilled it into a student ranker. Reranking to enforce item diversity, category diversity/coverage, and exposure fairness for categories/new items. Found the best s…

Python 1 Updated Apr 14, 2026

Feature engineering for big data and quick inference

Python 20 2 Updated Apr 21, 2026

Sample for Fine-Tuning LLMs & VLMs

Python 3 4 Updated Apr 3, 2025

CoreIR project: Reproduction of "Query Auto-Completion for Rare Prefixes"

Jupyter Notebook 3 1 Updated Jul 2, 2020

Virtually every modern search application features some kind of query auto-completion. In its basic form, the problem consists of retrieving from a string set a small number of completions, i.e. st…

C++ 1 Updated Feb 4, 2021

This project is a feature-rich, responsive clone of the Netflix UI, supercharged with an AI-powered movie recommendation engine. It allows users to browse popular and trending movies, watch trailer…

TypeScript 1 Updated Jan 16, 2026

CroQS: a Benchmark for Cross-modal Query Suggestion

HTML 5 Updated Feb 3, 2026

Simulation-based Interactive Query Suggestion Evaluation

Python 7 1 Updated Feb 13, 2026
Python 20 1 Updated Feb 20, 2025

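The Xianyu Auto-Reply Management System is an automated customer service system built with Python + FastAPI and designed specifically for the Xianyu platform. It connects to the Xianyu servers over WebSocket, receives and processes messages in real time, and provides intelligent automatic replies.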

Python 4,296 1,216 Updated Apr 23, 2026

Big Tech recommender systems study (Meta HSTU, MS Recommenders, NVIDIA recsys-examples, YouTube)

Python 1 Updated Apr 22, 2026

Use your Claude Max subscription with OpenCode, Pi, Droid, Aider, Crush, Cline. Proxy that bridges Anthropic's official SDK to enable Claude Max in third-party tools.

TypeScript 933 126 Updated Apr 22, 2026

A single CLAUDE.md file to improve Claude Code behavior, derived from Andrej Karpathy's observations on LLM coding pitfalls.

81,895 7,771 Updated Apr 20, 2026
TypeScript 1 Updated Apr 20, 2026

Custom LLM inference kernels in Triton & CUDA C++: Flash Attention (beats torch SDPA at seqlen≥1024), int8 GEMM+dequant (104% of fp16 cuBLAS at M=2048), fused RMSNorm+Linear. Benchmarked on NVIDIA …

Python 1 Updated Apr 20, 2026

SGLang speculative decoding on 4× NVIDIA A30 — implemented lossless draft/verify with a RadixAttention-safe provisional KV cache (insert/commit/evict), accept-reject sampling verified mathematicall…

Python 1 Updated Apr 20, 2026

1.95× faster LLM inference via compiler-level kernel fusion

Python 1 Updated Apr 21, 2026