Dr. Benyou Wang exemplifies the modern researcher's ideal, seamlessly combining technical depth, interdisciplinary application, global engagement, and measurable societal impact. His pioneering contributions—such as the development and real-world deployment of HuatuoGPT in healthcare, the creation of multilingual LLM benchmarks, and innovative work on fine-grained AI-generated content detection—underscore his leadership in advancing language model research. With a proven trajectory of sustained excellence, high-impact publications, and international recognition, Dr. Wang is not only a strong nominee but also a front-runner for the Best Researcher Award in Language Models. His continued expansion into global collaboration and theoretical grounding promises to shape the future landscape of natural language processing.
Dr. Benyou Wang | Language Models | Best Researcher Award
School of Data Science at The Chinese University of Hong Kong, Shenzhen, China
Professional Profile
Scopus
ORCID
Google Scholar
Summary
Dr. Benyou Wang is an Assistant Professor jointly appointed in the School of Data Science and the School of Medicine at The Chinese University of Hong Kong, Shenzhen. He is also a Vice Director at the Center for Medical Artificial Intelligence and a Principal Investigator (PI) of multiple projects related to large language models (LLMs) and healthcare AI. Dr. Wang is a recipient of the prestigious Marie Skłodowska-Curie Fellowship and has spearheaded the development of "HuatuoGPT," the first large-scale medical LLM successfully deployed across 11 public hospitals in Shenzhen, impacting over half a million residents. His work bridges scientific innovation, real-world implementation, and industrial transformation, earning widespread media and institutional acclaim.
Educational Background
Dr. Wang earned his Ph.D. in Information Science and Technology from the University of Padua, Italy (2018–2022), funded by the EU's Marie Skłodowska-Curie Actions. He holds an M.Sc. in Computer Science from Tianjin University, China (2014–2017), where he specialized in pattern recognition and intelligent systems, and a B.Sc. in Software Engineering from Hubei University of Automotive Technology (2010–2014). He completed his secondary education at the prestigious Huanggang Middle School in Hubei Province.
Professional Experience
Dr. Wang began his career as an associate researcher at Tencent and later joined the University of Padua as a full-time Marie Curie Researcher. He has held visiting research appointments at institutions including the University of Montreal, the University of Amsterdam, the University of Copenhagen, and the Chinese Academy of Sciences. He interned at Huawei's Noah's Ark Lab and has delivered numerous invited talks worldwide. Since 2022, he has been teaching and leading research at CUHK-Shenzhen while supervising multiple Ph.D. and undergraduate students.
Research Interests
Dr. Wang’s research focuses on large language models (LLMs), their applications in vertical domains like healthcare and multilingual systems, quantum-inspired natural language processing, and information retrieval. He is deeply involved in the development of explainable AI, multimodal LLMs, and efficient LLM training. His work often explores the theoretical foundations of LLM alignment and evaluation and has recently expanded into medical reasoning and visual-language integration.
Author Metrics
Dr. Wang has authored more than 40 peer-reviewed papers in top-tier venues such as ICLR, NeurIPS, ICML, NAACL, ACL, EMNLP, SIGIR, and AAAI. As of April 2024, his Google Scholar profile reported 4,965 citations and an H-index of 37. He is the first or corresponding author on several high-impact studies and serves as a reviewer and Area Chair for major NLP and ML conferences.
Awards and Honors
Dr. Wang has received multiple Best Paper Awards, including at ICLR Financial AI 2025, NLPCC 2022, NAACL 2019, and SIGIR 2017. He was honored with the Huawei Spark Award (presented by Ren Zhengfei), and his HuatuoGPT project has been recognized in national strategic AI deployment plans. He also earned funding from Tencent’s Rhino-Bird Project, Huawei’s AI Top 100 Universities Program, and CCF-DiDi’s Gaia Scholar Initiative. His work has been featured in Nature, CCTV, Financial Times, and Global Times, among others.
Publication Top Notes
1. Learning from Peers in Reasoning Models
Authors: T. Luo, W. Du, J. Bi, S. Chung, Z. Tang, H. Yang, M. Zhang, B. Wang
Venue: arXiv preprint arXiv:2505.07787
Year: 2025
Summary:
This paper proposes a novel peer-learning framework where multiple large language models (LLMs) interact to enhance their reasoning abilities. By sharing intermediate reasoning steps and critiques, the models improve logical consistency and performance across various reasoning tasks.
2. Pushing the Limit of LLM Capacity for Text Classification
Authors: Y. Zhang, M. Wang, Q. Li, P. Tiwari, J. Qin
Venue: Companion Proceedings of the ACM on Web Conference 2025, pp. 1524–1528
Year: 2025
Summary:
The study investigates the potential of large language models for multi-class text classification without traditional fine-tuning. Using prompt engineering and strategic data augmentation, the authors demonstrate competitive or superior performance compared to classical approaches.
3. Beyond Binary: Towards Fine-Grained LLM-Generated Text Detection via Role Recognition and Involvement Measurement
Authors: Z. Cheng, L. Zhou, F. Jiang, B. Wang, H. Li
Venue: Proceedings of the ACM on Web Conference 2025, pp. 2677–2688
Year: 2025
Summary:
This work moves beyond binary classification of AI-generated text and introduces a fine-grained detection system that recognizes multiple roles and the degree of AI involvement. The proposed model offers better insights into collaborative human-AI authored content.
4. UCL-Bench: A Chinese User-Centric Legal Benchmark for Large Language Models
Authors: R. Gan, D. Feng, C. Zhang, Z. Lin, H. Jia, H. Wang, Z. Cai, L. Cui, Q. Xie, …, B. Wang, et al.
Venue: Findings of the Association for Computational Linguistics: NAACL 2025, pp. 7945–7988
Year: 2025
Summary:
This paper introduces UCL-Bench, a comprehensive legal benchmark in Chinese designed for evaluating LLMs in real-world legal advisory tasks. It emphasizes user intent, fairness, and practical utility, and serves as a tool for the responsible deployment of legal AI systems.
5. Huatuo-26M: A Large-scale Chinese Medical QA Dataset
Authors: X. Wang, J. Li, S. Chen, Y. Zhu, X. Wu, Z. Zhang, X. Xu, J. Chen, J. Fu, X. Wan, …, B. Wang, et al.
Venue: Findings of the Association for Computational Linguistics: NAACL 2025, pp. 3828–3848
Year: 2025
Summary:
Huatuo-26M is a large-scale, high-quality Chinese medical question-answering dataset comprising 26 million entries. It supports the development of specialized medical LLMs such as HuatuoGPT, which has been deployed in clinical settings across hospitals in Shenzhen.