Yuanhan Zhang (张 元瀚)
Hi! I'm Yuanhan Zhang (standard Chinese pronunciation of my first name: Yuanhan), a second-year PhD student at MMLab@NTU, supervised by Prof. Ziwei Liu.
My research interests lie in computer vision and deep learning. In particular, I focus on adapting foundation models, from vision to multi-modal, for real-world applications. This involves benchmarking these models to identify the one best suited to a specific problem, and adapting them efficiently through techniques such as parameter-efficient tuning, in-context learning, and instruction tuning.
Email (yuanhan002@e.ntu.edu.sg)  / 
Google Scholar  / 
Twitter  / 
Github
MMBench: Is Your Multi-modal Model an All-around Player?
Yuan Liu,
Haodong Duan,
Yuanhan Zhang,
Bo Li,
Songyang Zhang,
Wangbo Zhao,
Yike Yuan,
Jiaqi Wang,
Conghui He,
Ziwei Liu,
Kai Chen,
Dahua Lin
arXiv Preprint, 2023
PDF /
Dataset and Code
Benchmarking 20 fine-grained abilities of vision-language models.
Otter: A multi-modal model with in-context instruction tuning
Bo Li*,
Yuanhan Zhang*,
Liangyu Chen,
Jinghao Wang,
Fanyi Pu,
Jingkang Yang,
Chunyuan Li,
Ziwei Liu
arXiv Preprint, 2023
PDF /
Dataset and Code
A vision-language model with in-context instruction tuning.
Neural Prompt Search
Yuanhan Zhang,
Kaiyang Zhou,
Ziwei Liu.
arXiv Preprint, 2022
PDF /
Project Page /
Code
Searching prompt modules for parameter-efficient transfer learning.
Bamboo: Building Mega-Scale Vision Dataset Continually with Human-Machine Synergy
Yuanhan Zhang,
Qinghong Sun,
Yichun Zhou,
Zexin He,
Zhenfei Yin,
Kun Wang,
Lu Sheng,
Yu Qiao,
Jing Shao,
Ziwei Liu.
arXiv Preprint, 2022
PDF /
Project Page /
Demo /
Code
4 times larger than ImageNet; 2 times larger than Objects365; built continually via human-machine synergy (active learning).
Knowledge Augmented Instruction Tuning for Zero-shot Animal Species Recognition
Zalan Fabian,
Zhongqi Miao,
Chunyuan Li,
Yuanhan Zhang,
Ziwei Liu,
Andrés Hernández,
Andrés Montes-Rojas,
Rafael Escucha,
Laura Siabatto,
Andrés Link,
Pablo Arbeláez,
Rahul Dodhia,
Juan Lavista Ferres
Instruction Tuning and Instruction Following Workshop @ NeurIPS, 2023
PDF
A knowledge augmented vision-language model for AI conservation.
What Makes Good Examples for Visual In-Context Learning?
Yuanhan Zhang,
Kaiyang Zhou,
Ziwei Liu.
NeurIPS, 2023
PDF /
Code
Retrieving prompts for visual in-context learning.
Benchmarking Omni-Vision Representation through the Lens of Visual Realms
Yuanhan Zhang,
Zhenfei Yin,
Jing Shao,
Ziwei Liu.
ECCV, 2022
PDF /
Project Page /
Leaderboard /
Challenge: ImageNet1k-Pretrain Track /
Challenge: Open-Pretrain Track /
Dataset and Code
A new benchmark for evaluating vision foundation models, plus a new supervised contrastive learning framework.
CelebA-Spoof: Large-Scale Face Anti-Spoofing Dataset with Rich Annotations
Yuanhan Zhang,
Zhenfei Yin,
Yidong Li,
Guojun Yin,
Junjie Yan,
Jing Shao,
Ziwei Liu.
ECCV, 2020
PDF /
Dataset /
Demo /
Code
A large-scale face anti-spoofing dataset with rich annotations.
Last updated in Dec. 2023.
Homepage credits: Jon Barron.