Yuanhan Zhang (张 元瀚)

Hi! I'm Yuanhan Zhang (here is the standard Chinese pronunciation of my first name: Yuanhan), a second-year PhD student at MMLab@NTU, supervised by Prof. Ziwei Liu.

My research interests lie in computer vision and deep learning. In particular, I focus on adapting foundation models, from vision to multi-modal, for real-world applications. This involves benchmarking these models to identify the most suitable one for a specific problem, and efficiently adapting them through techniques such as parameter-efficient tuning, in-context learning, and instruction tuning.

Email (yuanhan002@e.ntu.edu.sg) / Google Scholar / Twitter / GitHub

News
Pre-prints
MMBench: Is Your Multi-modal Model an All-around Player?
Yuan Liu, Haodong Duan, Yuanhan Zhang, Bo Li, Songyang Zhang, Wangbo Zhao, Yike Yuan, Jiaqi Wang, Conghui He, Ziwei Liu, Kai Chen, Dahua Lin
arXiv Preprint, 2023
PDF / Dataset and Code

Benchmarking the 20 abilities of vision-language models.

Otter: A multi-modal model with in-context instruction tuning
Bo Li*, Yuanhan Zhang*, Liangyu Chen, Jinghao Wan, Fanyi Pu, Jingkang Yang, Chunyuan Li, Ziwei Liu
arXiv Preprint, 2023
PDF / Dataset and Code

A vision-language model with in-context instruction tuning.

Neural Prompt Search
Yuanhan Zhang, Kaiyang Zhou, Ziwei Liu.
arXiv Preprint, 2022
PDF / Project Page / Code

Searching prompt modules for parameter-efficient transfer learning.

Bamboo: Building Mega-Scale Vision Dataset Continually with Human-Machine Synergy
Yuanhan Zhang, Qinghong Sun, Yichun Zhou, Zexin He, Zhenfei Yin, Kun Wang, Lu Sheng, Yu Qiao, Jing Shao, Ziwei Liu.
arXiv Preprint, 2022
PDF / Project Page / Demo / Code

4 times larger than ImageNet; 2 times larger than Objects365; built by active learning.

Publications
Knowledge augmented instruction tuning for zero-shot animal species recognition
Zalan Fabian, Zhongqi Miao, Chunyuan Li, Yuanhan Zhang, Ziwei Liu, Andrés Hernández, Andrés Montes-Rojas, Rafael Escucha, Laura Siabatto, Andrés Link, Pablo Arbeláez, Rahul Dodhia, Juan Lavista Ferres
Instruction Tuning and Instruction Following Workshop@NeurIPS 2023.

PDF

A knowledge-augmented vision-language model for AI conservation.

What Makes Good Examples for Visual In-Context Learning?
Yuanhan Zhang, Kaiyang Zhou, Ziwei Liu.
NeurIPS, 2023
PDF / Code

Retrieving prompts for visual in-context learning.

Benchmarking Omni-Vision Representation through the Lens of Visual Realms
Yuanhan Zhang, Zhenfei Yin, Jing Shao, Ziwei Liu.
ECCV, 2022
PDF / Project Page / Leaderboard / Challenge: ImageNet1k-Pretrain Track / Challenge: Open-Pretrain Track / Dataset and Code

A new benchmark for evaluating vision foundation models; a new supervised contrastive learning framework.

CelebA-Spoof: Large-Scale Face Anti-Spoofing Dataset with Rich Annotations
Yuanhan Zhang, Zhenfei Yin, Yidong Li, Guojun Yin, Junjie Yan, Jing Shao, Ziwei Liu.
ECCV, 2020
PDF / Dataset / Demo / Code

A large-scale face anti-spoofing dataset.

Activities
Public Office Hour

Last updated in Dec. 2023.

Homepage credits: Jon Barron.