Zhi Gao | Vision-Language Models | Best Researcher Award

Postdoctoral Research Fellow at Peking University, China.

Dr. Zhi Gao is a Postdoctoral Research Fellow at the School of Intelligence Science and Technology, Peking University. His research focuses on multimodal learning, vision-language models, and human-robot interaction. With expertise in computer vision and machine learning, he explores the development of intelligent agents capable of understanding and interacting with complex environments.

Professional Profile:

Google Scholar Profile

Education Background 🎓📖

  • Ph.D. in Computer Science and Technology, Beijing Institute of Technology (2018–2023)
  • M.S. in Computer Science and Technology, Beijing Institute of Technology (2017–2018)
  • B.S. in Computer Science and Technology, Beijing Institute of Technology (2013–2017)

Professional Development 📈💡

Dr. Gao is currently a Postdoctoral Research Fellow at Peking University under the supervision of Prof. Song-Chun Zhu, focusing on multimodal learning and agent development. Concurrently, he serves as a Research Scientist at the Beijing Institute for General Artificial Intelligence, working on vision-language models in the Machine Learning Lab. His research integrates deep learning, data representation, and human-centered AI to enhance machine perception and reasoning.

Research Focus 🔬📖

His work spans computer vision and machine learning, particularly in developing multimodal agents capable of learning from human-robot interactions and adapting to dynamic environments. He is also interested in leveraging the geometry of data space to address challenges such as insufficient annotations and distribution shifts.

Author Metrics

  • Publications in top-tier AI and computer vision conferences and journals
  • Research contributions in multimodal intelligence, vision-language understanding, and AI-driven reasoning

Awards & Honors 🏆🎖️

  • National Science Foundation for Young Scientists of China (2025–2027) for research on Riemannian multimodal large language models for video understanding
  • Distinguished Dissertation Award from SIGAI CHINA (October 202X)

Top Publications

1. A Hyperbolic-to-Hyperbolic Graph Convolutional Network

Authors: Jindou Dai, Yuwei Wu, Zhi Gao, Yunde Jia
Published in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021, pp. 154-163
Abstract: This paper introduces a hyperbolic-to-hyperbolic graph convolutional network (H2H-GCN) that operates directly on hyperbolic manifolds. The proposed method includes a manifold-preserving graph convolution with hyperbolic feature transformation and neighborhood aggregation, avoiding distortions from tangent space approximations. Extensive experiments demonstrate substantial improvements in tasks such as link prediction, node classification, and graph classification.

2. Curvature Generation in Curved Spaces for Few-Shot Learning

Authors: Zhi Gao, Yuwei Wu, Yunde Jia, Mehrtash Harandi
Published in: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2021, pp. 8671-8680
Abstract: This research addresses few-shot learning by proposing task-aware curved embedding spaces using hyperbolic geometry. By generating task-specific embedding spaces with appropriate curvatures, the method enhances the generality of embeddings. The study leverages intra-class and inter-class context information to create discriminative class prototypes, showing benefits over existing embedding methods in both inductive and transductive few-shot learning scenarios.

3. Deep Convolutional Network with Locality and Sparsity Constraints for Texture Classification

Authors: Xiaoyu Bu, Yuwei Wu, Zhi Gao, Yunde Jia
Published in: Pattern Recognition, vol. 91, 2019, pp. 34-46
Abstract: This paper presents a deep convolutional network incorporating locality and sparsity constraints to improve texture classification. The proposed model enhances feature representation by enforcing local connectivity and sparse activation, leading to improved classification performance on texture datasets.

4. Meta-Causal Learning for Single Domain Generalization

Authors: Jin Chen, Zhi Gao, Xinxiao Wu, Jiebo Luo
Published in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023
Abstract: The study introduces a meta-causal learning framework aimed at enhancing generalization in single-domain settings. By leveraging causal relationships within the data, the approach seeks to improve model robustness when applied to unseen domains, addressing challenges in domain generalization.

5. A Robust Distance Measure for Similarity-Based Classification on the SPD Manifold

Authors: Zhi Gao, Yuwei Wu, Mehrtash Harandi, Yunde Jia
Published in: IEEE Transactions on Neural Networks and Learning Systems, vol. 31, no. 9, 2020, pp. 3230-3244
Abstract: This research proposes a robust distance measure tailored for similarity-based classification tasks on the Symmetric Positive Definite (SPD) manifold. The developed measure enhances classification accuracy by effectively capturing the intrinsic geometry of the SPD manifold, demonstrating robustness in various similarity-based classification scenarios.

Conclusion:

Dr. Zhi Gao is a strong candidate for the Best Researcher Award, given his contributions to vision-language models, hyperbolic learning, and multimodal AI. His academic background, publications in top-tier venues, and national recognition make him a well-qualified nominee. To further strengthen his impact, he could pursue industry collaborations, real-world AI applications, and international leadership roles in the AI community.

Verdict: Highly suitable for the Best Researcher Award, with minor areas for improvement to strengthen long-term impact.