YUANHAN (John) ZHANG

张元瀚

Coder at Meta Superintelligence Lab.

PhD from MMLab@NTU, advised by Prof. Ziwei Liu.

I build and evaluate multimodal large language models, with a focus on post-training and benchmarking.

New Video-MMMU ACL'26

Video-TT ICCV'25 · LLaVA-Video TMLR'25

Email · Scholar · Twitter · GitHub · LinkedIn

Current Interests

Information-dense Video Reasoning: Building models that identify salient details, connect dispersed evidence, and infer causal structure in complex visual narratives.

Harness-centric LLM Systems: Designing harnesses that structure memory for effective in-context comprehension and encode success criteria as robust executable checks.

Benchmarking Model–Human Gaps: Developing diagnostic benchmarks that reveal systematic differences between models and humans, and suggest better optimization targets.

Featured Work

Video-MMMU: Evaluating Knowledge Acquisition from Multi-Discipline Professional Videos

Kairui Hu, Penghao Wu, Fanyi Pu, Wang Xiao, Xiang Yue, Bo Li, Yuanhan Zhang, Ziwei Liu

A multi-discipline video benchmark for knowledge acquisition — adopted by Gemini-3 as part of their evaluation suite.

ACL 2026 Paper Project Dataset and Code

Towards Video Thinking Test: A Holistic Benchmark for Advanced Video Reasoning and Understanding

Yuanhan Zhang*, Yunice Chew*, Yuhao Dong, Aria Leo, Bo Hu, Ziwei Liu

A video reasoning benchmark that eliminates key-frame sampling bias and exposes a gap of nearly 48 points between GPT-4o (36.6%) and humans (84.3%).

ICCV 2025 Paper Project Dataset and Code

LLaVA-Video: Video Instruction Tuning With Synthetic Data

Yuanhan Zhang, Jinming Wu, Wei Li, Bo Li, Zejun Ma, Ziwei Liu, Chunyuan Li

An open video LMM built entirely on synthetic data — 1.2M+ HuggingFace downloads.

TMLR 2025 Paper Project Dataset, Model and Code

LLaVA-OneVision: Easy Visual Task Transfer

Bo Li, Yuanhan Zhang, Dong Guo, Renrui Zhang, Feng Li, Hao Zhang, Kaichen Zhang, Yanwei Li, Ziwei Liu, Chunyuan Li

One model for image, multi-image, and video — SOTA across all three.

TMLR 2025 Paper Project Dataset, Model and Code

Full publication list →