Ziqi Gao (Roy)

Bio: I am an incoming PhD student at the University of Washington, advised by Prof. Ranjay Krishna in the RAIVN Lab at UW CSE. I am currently a Predoctoral Young Investigator (PYI) at the Allen Institute for AI (AI2), where I work closely with senior PhD student Jieyu Zhang.

Research Interests: My research focuses on computer vision and multimodal large language models (MLLMs), with a particular interest in grounding reasoning in vision-language models and building AI models and systems that connect perception, language, and the physical world in 2D/3D environments. I hope to enable machines to perceive, understand, and interact with the world in a human-like way.

📝 Publications

(* denotes equal contribution, + denotes corresponding author)

Preprints

Vision-Language Grounding as Bidirectional Concept Correspondence
Jieyu Zhang*, Ziqi Gao*, Luke Zettlemoyer, Ranjay Krishna+.

Peer-reviewed

Synthetic Visual Genome 2: Extracting Large-scale Spatio-Temporal Scene Graphs from Videos
Ziqi Gao, Jieyu Zhang, Wisdom Oluchi Ikezogwo, Jae Sung Park, Tario G. You, Daniel Ogbu, Chenhao Zheng, Weikai Huang, Yinuo Yang, Winson Han, Quan Kong, Rajat Saini, Ranjay Krishna+.
ECCV 2026
Molmo2: Open Weights and Data for Vision-Language Models with Video Understanding and Grounding
Christopher Clark, Jieyu Zhang, Zixian Ma, Jae Sung Park, Mohammadreza Salehi, Rohun Tripathi, Sangho Lee, Jason Ren, Chris Dongjoo Kim, Yinuo Yang, Vincent Shao, Yue Yang, Weikai Huang, Ziqi Gao, Taira Anderson, Jianrui Zhang, Jitesh Jain, George Stoica, Winson Han, Ali Farhadi, Ranjay Krishna+.
CVPR 2026
Synthetic Object Compositions for Scalable and Accurate Learning in Detection, Segmentation, and Grounding
Weikai Huang, Jieyu Zhang, Taoyang Jia, Chenhao Zheng, Ziqi Gao, Jae Sung Park, Ranjay Krishna+.
CVPR 2026
Generate Any Scene: Evaluating and Improving Text-to-Vision Generation with Scene Graph Programming
Ziqi Gao*, Weikai Huang*, Jieyu Zhang, Aniruddha Kembhavi, Ranjay Krishna+.
ICLR 2026
One Trajectory, One Token: Grounded Video Tokenization via Panoptic Sub-object Trajectory
Chenhao Zheng, Jieyu Zhang, Mohammadreza Salehi, Ziqi Gao, Vishnu Iyengar, Norimasa Kobori, Quan Kong, Ranjay Krishna+.
ICCV 2025 Highlight
MMTSA: Multi-Modal Temporal Segment Attention Network for Efficient Human Activity Recognition
Ziqi Gao*, Yuntao Wang*+, Jianguo Chen, Junliang Xing, Shwetak Patel, Xin Liu, Yuanchun Shi.
IMWUT 2023

🎓 Education

University of Washington
M.S. in Technology Innovation
2023 – 2025
Tsinghua University
M.S. in Data Science and Information Technology
2022 – 2025

📫 Contact

Email: gzq@cs.washington.edu