Zhipeng Zhang

I am a PhD candidate at the School of Computer Science, Northwestern Polytechnical University (NWPU), advised by Prof. Peng Wang. I received my B.Eng. in Computer Science and Technology from NWPU's Honors College, where I was enrolled in the integrated B.Eng.-M.Sc.-Ph.D. experimental program. I completed the M.Sc. coursework ahead of schedule with top-ranked performance and advanced directly to the Ph.D. stage.

My research focuses on vision-language models, with an emphasis on multimodal generation, visual grounding, and real-world applications. I previously spent two years as a research intern at Alibaba Group and one year as a visiting PhD student / Academic Guest at ETH Zurich's Computer Vision Lab (CVL).

Google Scholar | LinkedIn | Email

News

2026/07
Our paper MoRe-UAV was accepted to ACM Multimedia (ACM MM) 2026.
2026/06
I was invited to serve on the AAAI 2027 Program Committee.
2026/05
I was invited to serve as a NeurIPS Reviewer.
2026/05
I was invited to serve as a BMVC Area Chair.
2026
Our paper on uncertainty-aware visual grounding in remote-sensing images was published in IEEE TGRS.
2025
Our paper on infrared image super-resolution was accepted by the International Journal of Computer Vision (IJCV).
2024
Our paper Image Fusion via Vision-Language Model appeared at ICML 2024.
2023
Our robustness evaluation for referring expression comprehension was selected as a BMVC 2023 oral.

Publications

2026

Faster and Better: An Efficient Training-Free Framework for Object Goal Navigation

Z. Zhang, W. Suo, J. Wang, P. Wang

Under review

2026

MoRe-UAV: A Large-Scale Benchmark for Motion-Aware Visual Grounding in UAV Videos

Z. Zhang, Y. Zhang, W. Suo, L. Liu, J. Wang, P. Wang

ACM Multimedia (ACM MM), 2026

2026

Adaptive Scale Fusion via Uncertainty Estimation for Visual Grounding in Remote Sensing Images

Z. Zhang, Y. Zou, J. Wang, P. Wang

IEEE Transactions on Geoscience and Remote Sensing, 2026

2025

Contourlet Refinement Gate Framework for Thermal Spectrum Distribution-Regularized Infrared Image Super-Resolution

Y. Zou, Z. Chen, Z. Zhang, X. Li, L. Ma, J. Liu, P. Wang, Y. Zhang

International Journal of Computer Vision, 2025

2024

Image Fusion via Vision-Language Model

Z. Zhao, L. Deng, H. Bai, Y. Cui, Z. Zhang, Y. Zhang, H. Qin, D. Chen, J. Zhang, P. Wang, L. Van Gool

ICML, 2024

2024

Self-Explainable Affordance Learning with Embodied Caption

Z. Zhang, Z. Wei, G. Sun, P. Wang, L. Van Gool

arXiv, 2024

2023

A Critical Robustness Evaluation for Referring Expression Comprehension Methods

Z. Zhang, Z. Wei, P. Wang

BMVC, 2023, Oral Presentation

2023

One for All: One-stage Referring Expression Comprehension with Dynamic Reasoning

Z. Zhang, Z. Wei, Z. Huang, R. Niu, P. Wang

Neurocomputing, 2023

2023

Fine-Grain Alignment for Text-Based Person Retrieval via Semantics-Centric Visual Division

Z. Wei*, Z. Zhang*, P. Wu, J. Wang, P. Wang, Y. Zhang

IEEE TCSVT, 2023

Experience

Aug 2023 - Jul 2024

ETH Zurich, Computer Vision Lab

Visiting PhD Student (Academic Guest). Research on vision-language modeling for embodied intelligence, grounding natural-language instructions in perception-centric decision-making and action planning.

Apr 2021 - Mar 2023

Alibaba Group, Beijing

Research Intern (Algorithm Engineer). Led large-scale multimodal modeling for e-commerce content understanding and generation, including product copywriting generation and customer-oriented question answering.

Dec 2019 - Dec 2020

Jiachuang Intelligent Technology Co., Ltd., Xi'an

Founding Team Member / Software Engineer. Worked on UAV visual navigation and built a central AI system that integrates computer vision, language commands, and UAV flight maneuvers.

Academic Service

Conference: AAAI 2027 Program Committee; BMVC Area Chair; NeurIPS Reviewer; CVPR 2026, AAAI 2026, ICML 2025, ACM MM 2024-2025, BMVC 2023-2025.

Journal: TCSVT, TGRS, TIP, Neurocomputing, and others.

Awards and Honors

Research Funding

Generalizable and Interpretable Vision Reasoning with External Knowledge

Doctoral Research Funding, Principal Applicant / Project Lead. $50,000

Robot Navigation and Map Building Based on Computer Vision and Radar

Doctoral Research Funding, Principal Applicant / Project Lead. $10,000

Scholarships and Honors

National Inspirational Scholarship.
Outstanding Graduate Student Award.
Excellent Student Award.
First-Class School Scholarship.

Intellectual Property

Two granted national patents; one international patent application under review.