Northwestern Polytechnical University

School of Computer Science

Zhipeng Zhang

PhD candidate at the School of Computer Science, Northwestern Polytechnical University

I am a PhD candidate at the School of Computer Science, Northwestern Polytechnical University, advised by Prof. Peng Wang. My research focuses on vision-language models, multimodal generative modeling, visual grounding, and embodied vision-language intelligence.

Selected Publications

2026

Adaptive Scale Fusion via Uncertainty Estimation for Visual Grounding in Remote Sensing Images

Z. Zhang, Y. Zou, J. Wang, P. Wang

IEEE Transactions on Geoscience and Remote Sensing, 2026

2026

MoRe-UAV: A Large-Scale Benchmark for Motion-Aware Visual Grounding in UAV Videos

Z. Zhang, Y. Zhang, W. Suo, L. Liu, J. Wang, P. Wang

2025

Contourlet Refinement Gate Framework for Thermal Spectrum Distribution-Regularized Infrared Image Super-Resolution

Y. Zou, Z. Chen, Z. Zhang, X. Li, L. Ma, J. Liu, P. Wang, Y. Zhang

International Journal of Computer Vision, 2025

2024

Image Fusion via Vision-Language Model

Z. Zhao, L. Deng, H. Bai, Y. Cui, Z. Zhang, Y. Zhang, H. Qin, D. Chen, J. Zhang, P. Wang, L. Van Gool

ICML, 2024

2024

Self-Explainable Affordance Learning with Embodied Caption

Z. Zhang, Z. Wei, G. Sun, P. Wang, L. Van Gool

arXiv, 2024

2023

A Critical Robustness Evaluation for Referring Expression Comprehension Methods

Z. Zhang, Z. Wei, P. Wang

BMVC, 2023 (Oral Presentation)

2023

One for All: One-stage Referring Expression Comprehension with Dynamic Reasoning

Z. Zhang, Z. Wei, Z. Huang, R. Niu, P. Wang

Neurocomputing, 2023

2023

Fine-Grain Alignment for Text-Based Person Retrieval via Semantics-Centric Visual Division

Z. Wei*, Z. Zhang*, P. Wu, J. Wang, P. Wang, Y. Zhang

IEEE TCSVT, 2023

Experience

Mar 2023 - Present

Northwestern Polytechnical University

PhD candidate in Computer Science and Technology. Direct PhD program, advised by Prof. Peng Wang and Prof. Yanning Zhang.

Aug 2023 - Jul 2024

ETH Zurich, Computer Vision Lab

Visiting PhD student / Academic Guest. Research on vision-language modeling for embodied intelligence and perception-centric decision-making.

Apr 2021 - Mar 2023

Alibaba Group, Beijing

Research intern, Algorithm Engineer. Large-scale multimodal modeling for e-commerce content understanding, copywriting generation, and customer QA.

Sep 2017 - Oct 2022

Northwestern Polytechnical University

B.Eng. and M.Sc. coursework in Computer Science and Technology. Selected for the direct PhD track.

Academic Service

Conference reviewer for CVPR 2026, AAAI 2026, ICML 2025, and BMVC 2023-2025. Journal reviewer for TCSVT, TGRS, TIP, Neurocomputing, and other venues.

Awards and Honors

Doctoral research funding as principal applicant and project lead; robotics and innovation competition awards; scholarships; two granted national patents; and one international patent application under review.

Skills

C/C++, Python, Java, PyTorch, Linux, ROS, model design, model training, optimization, and robotics-oriented deployment. Research expertise includes vision-language models, multimodal generative modeling, referring expression comprehension, image-text retrieval, and embodied vision-language intelligence.