ShowUI-π is an open-source, end-to-end, lightweight Vision-Language-Action model designed for GUI drag interactions.
📑 Paper | 🤗 Model | 🤗 Datasets
ShowUI-π: Flow-based Generative Models as GUI Dexterous Hands
Siyuan Hu*, Kevin Qinghong Lin*, Mike Zheng Shou
Show Lab @ National University of Singapore
- [2026.2.20] ShowUI-π has been accepted to CVPR 2026.
- [2025.12.31] We released ShowUI-π for GUI dragging.
- [2025.12.31] We released the DEX Benchmark for GUI drag-and-drop evaluation.
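For readers new to flow-based action models, the sketch below illustrates the flow-matching idea that models like ShowUI-π build on: a velocity field is learned along a straight path between noise and the target action, and actions are generated by integrating that field. This is a minimal, hypothetical toy for 2-D drag endpoints, not the ShowUI-π implementation.

```python
import numpy as np

# Toy flow matching for a 2-D drag endpoint (normalized screen coords).
# Illustrative only -- not the ShowUI-pi architecture or training code.

rng = np.random.default_rng(0)

def target_velocity(x0, x1):
    # The linear (rectified-flow) path x_t = (1-t)*x0 + t*x1 has
    # constant velocity x1 - x0, which is the regression target.
    return x1 - x0

def euler_sample(velocity_fn, x0, steps=10):
    # Integrate dx/dt = v(x, t) from t=0 to t=1 with Euler steps.
    x, dt = x0.copy(), 1.0 / steps
    for i in range(steps):
        t = i * dt
        x = x + dt * velocity_fn(x, t)
    return x

# With a perfectly "trained" velocity field (constant x1 - x0 on the
# linear path), Euler integration from the noise sample recovers x1.
x0 = rng.normal(size=2)          # noise sample
x1 = np.array([0.25, 0.75])      # target drag endpoint
v = lambda x, t: target_velocity(x0, x1)
pred = euler_sample(v, x0)
print(np.allclose(pred, x1))     # True
```

In practice the velocity field is a neural network conditioned on the screenshot and instruction, and it is queried at each integration step rather than being constant.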
Installation:

```bash
git clone https://github.com/showlab/showui-pi.git
cd showui-pi
pip install -e .
```

Training:

```bash
bash scripts/train_showui_pi.sh
```

See scripts/train_showui_pi.sh for all flags and defaults.
The DEX benchmark is downloaded automatically on first run.
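As a hypothetical illustration of how drag-and-drop success could be scored (this is an assumption for exposition, not necessarily the metric DEX uses), one can count an episode as a success when the predicted drop point lands within a pixel threshold of the ground truth:

```python
import numpy as np

# Hypothetical drag-success scoring: success iff the predicted drop point
# is within `threshold` pixels of the ground-truth drop point.
# Illustrative only -- not necessarily the actual DEX metric.

def drag_success(pred_drop, gt_drop, threshold=10.0):
    dist = np.linalg.norm(np.asarray(pred_drop, float) - np.asarray(gt_drop, float))
    return float(dist <= threshold)

# Toy evaluation over three episodes against one ground-truth target.
preds = [(100, 200), (105, 204), (300, 50)]
gts = [(100, 200), (100, 200), (100, 200)]
rate = sum(drag_success(p, g) for p, g in zip(preds, gts)) / len(preds)
print(round(rate, 4))  # 0.6667
```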
Evaluate on DEX:

```bash
PYTHONPATH=lerobot/src \
python scripts/eval_dex.py \
    --ckpt <path/to/checkpoint> \
    --output_dir outputs/eval_dex
```

Evaluate on ScreenSpot-Pro:

```bash
PYTHONPATH=lerobot/src \
python scripts/eval_screenspot_pro.py \
    --ckpt <path/to/checkpoint> \
    --annotations_root <path/to/ScreenSpot-Pro/annotations> \
    --images_root <path/to/ScreenSpot-Pro/images>
```

We extend our gratitude to LeRobot and SmolVLA for the training framework, and to SeeClick for the grounding data.
```bibtex
@article{hu2025showui,
  title={ShowUI-$\pi$: Flow-based Generative Models as GUI Dexterous Hands},
  author={Hu, Siyuan and Lin, Kevin Qinghong and Shou, Mike Zheng},
  journal={arXiv preprint arXiv:2512.24965},
  year={2025}
}
```
If you like our project, please give us a star ⭐ on GitHub for the latest updates.