About me
I am a first-year Ph.D. student in the Paul G. Allen School of Computer Science & Engineering at the University of Washington, where I'm fortunate to be advised by Simon Du and Banghua Zhu. Previously, I received my bachelor's degree in Computer Science (Yao Class), with a minor in Literature, from Tsinghua University. I'm currently interested in algorithm design and theoretical analysis for deep learning and foundation models.
Contact
Selected Papers
Understanding the Performance Gap in Preference Learning: A Dichotomy of RLHF and DPO.
Decoding-Time Language Model Alignment with Multiple Objectives.
Rethinking Transformers in Solving POMDPs.
Notes and Slides
One-step flow matching paper reading. [slide]
Understanding the gaps between two-stage and direct preference-based policy learning. [slide]
The crucial role of samplers in online direct preference optimization. [slide][recording]
Logit mixing and RLHF paper reading. [slide]
Decoding-time language model alignment with multiple objectives. [slide][recording]
An incomplete list of books I like, randomly maintained. [list]