Chen Zhao

Research Scientist

King Abdullah University of Science and Technology (KAUST)

About Me

I am a Research Scientist at King Abdullah University of Science and Technology (KAUST), and Lead of the Video Group in the Image and Video Understanding Lab (IVUL) with Prof. Bernard Ghanem. I obtained my Ph.D. from Peking University (PKU), advised by Prof. Wen Gao and Prof. Siwei Ma. My research interests focus on image/video understanding (since completing my Ph.D.) and image/video compression (during my Ph.D. studies).

I have published 40+ papers in representative journals and conferences in both fields, such as TPAMI, CVPR, ICCV, and ECCV in image/video understanding, and TCSVT, TIP, and DCC in image/video compression. I received the Best Paper Nomination at CVPR 2022, the Best Paper Award at a CVPR 2023 workshop, and the Best Paper Award at NCMT 2015. I have also been awarded the First Prize of the Qualcomm Innovation Fellowship Contest (QInF) (only 2 in China) and the Goldman Sachs Global Leaders Award (only 26 in mainland China and 150 worldwide).

Interests

Image/video understanding
Vision-language learning
Efficient neural networks
Image/video processing
Image/video compression
Image/video continual learning

Education

Ph.D. in Computer Science, 2016
Peking University (PKU), Beijing, China
Research Intern, 2016
National Institute of Informatics (NII), Tokyo, Japan
Joint Ph.D. student, 2012
University of Washington (UW), Seattle, USA
B.Eng. in Software Engineering, 2010
Sichuan University (SCU), Chengdu, China

News

2024

[2024-10-07] I gave a lecture on “Reversifying Neural Networks: Efficient Memory Optimization Strategies for Finetuning Large Models” in the KAUST CS Seminar.
[2024-06-18] I gave a talk on “Towards More Realistic Continual Learning at Scale” as an invited speaker in the CLVision Workshop in CVPR 2024.
[2024-06-17] We won first place in four challenges at CVPR 2024: Epic-kitchens audio-based interaction detection, Epic-kitchens action detection, Epic-kitchens action recognition, and Ego4D Visual Queries 3D!
[2024-06-11] I gave a talk on “Optimizing Memory Efficiency in Pretrained Model Finetuning” in the Berkeley Artificial Intelligence Research (BAIR) Lab, UC Berkeley.
[2024-05-05] I gave a lecture in KAUST CEMSE graduate seminar on “Toward Long-form Video Understanding” as part of KAUST Research Open Week!
[2024-03-28] We released OpenTAD, an open-source toolbox for temporal action detection (TAD), comprising 14 methods with 8 datasets.
[2024-02-27] Four papers were accepted to CVPR 2024: Dr2Net, AdaTAD, TGT, and Ego-Exo4D!
[2024-02-19] I gave a spotlight talk in the Rising Star in AI Symposium 2024!

2023

[2023-12-15] I gave a talk in HIT Webinar on “Challenges and innovation for long-form video understanding: compute, algorithm, and data”.
[2023-08-08] EgoLoc is selected as an ORAL in ICCV'23!
[2023-08-07] Ego4D was accepted to TPAMI (recommended submission as a CVPR'22 award winner)!
[2023-07-14] All three papers (LAE, FreeDoM, EgoLoc) submitted to ICCV'23 were accepted!
[2023-06-22] SMILE won the Best Paper Award in CVPRW'23 CLVision!
[2023-06-22] We won first place in the CVPR'23 Ego4D VQ3D Challenge!
[2023-04-07] ETAD was accepted to CVPRW'23 ECV!
[2023-04-04] OWL was accepted to CVPRW'23 L3D-IVU!
[2023-03-29] SMILE was accepted to CVPRW'23 CLVision!
[2023-02-27] Re2TAL and LF-VSN were accepted to CVPR'23!
[2023-02-20] I gave a spotlight talk in the Rising Star in AI Symposium 2023!

2022

[2022-12-02] I was the lecturer in the Artificial Intelligence Bootcamp on behalf of KAUST to Saudi Arabia’s smartest undergraduate students!
[2022-07-04] R-DFCIL and EASEE were accepted into ECCV'22!
[2022-06-21] Ego4D was selected as a CVPR'22 Best Paper Finalist!
[2022-04-18] All Ego4D challenges are live now!
[2022-03-29] Ego4D was accepted to CVPR'22 as an ORAL presentation!
[2022-03-29] MAD was accepted to CVPR'22!

2021

[2021-11-30] I gave a talk virtually in the computer vision group of University of Bristol on “Detecting Actions in Videos via Graph Convolutional Networks”.
[2021-10-15] Ego4D was released, and the paper is on arXiv!
[2021-07-23] VSGN was accepted to ICCV'21!
[2021-05-20] I was recognized by CVPR’21 as an Outstanding Reviewer!

2020

[2020-07-29] ThumbNet was accepted to ACM MM'20!
[2020-06-07] We won 2nd place in the HACS’20 Weakly-supervised Action Detection Challenge!
[2020-02-27] G-TAD was accepted to CVPR'20!

2019

[2019-10-23] Our paper for the YouTube-8M challenge was accepted as an oral presentation in an ICCV'19 Workshop!
[2019-10-12] We missed the gold medal by only 0.0004 in Kaggle’s 3rd YouTube-8M Video Understanding Challenge, ranking 9th/11th out of 283 teams on the public/private leaderboards!

Publications

Dr2Net: Dynamic Reversible Dual-Residual Networks for Memory-Efficient Finetuning

IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024.

Chen Zhao, Shuming Liu, Karttikeya Mangalam, Guocheng Qian, Fatimah Zohra, Abdulmohsen Alghannam, Jitendra Malik, Bernard Ghanem

Large pretrained models are increasingly crucial in modern computer vision tasks. These models are …

Re2TAL: Rewiring Pretrained Video Backbones for Reversible Temporal Action Localization

IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023.

Chen Zhao, Shuming Liu, Karttikeya Mangalam, Bernard Ghanem

Temporal action localization (TAL) requires long-form reasoning to predict actions of various …

EgoLoc: Revisiting 3D Object Localization from Egocentric Videos with Visual Queries

International Conference on Computer Vision (ICCV), 2023. [First place in the Ego4D VQ3D Challenge 2023, Oral].

Jinjie Mai, Abdullah Hamdi, Silvio Giancola, Chen Zhao, Bernard Ghanem

With the recent advances in video and 3D understanding, novel 4D spatio-temporal methods fusing both …

A Unified Continual Learning Framework with General Parameter-Efficient Tuning

International Conference on Computer Vision (ICCV), 2023.

Qiankun Gao, Chen Zhao, Yifan Sun, Teng Xi, Gang Zhang, Bernard Ghanem, Jian Zhang

The ‘pre-training → downstream adaptation’ presents both new opportunities and …

Just a Glimpse: Rethinking Temporal Information for Video Continual Learning

IEEE Conference on Computer Vision and Pattern Recognition Workshop (CVPRW), 2023. [Best Paper Award, Oral].

Lama Alssum, Juan León Alcázar, Merey Ramazanova, Chen Zhao, Bernard Ghanem

Class-incremental learning is one of the most important settings for the study of Continual …

Ego4D: Around the World in 3,000 Hours of Egocentric Video

IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022. [Best Paper Nominee, Oral].

Chen Zhao, with 84 other authors

We introduce Ego4D, a massive-scale egocentric video dataset and benchmark suite. It offers 3,670 …

SegTAD: Precise Temporal Action Detection via Semantic Segmentation

European Conference on Computer Vision Workshop (ECCVW), 2022.

Chen Zhao, Merey Ramazanova, Mengmeng Xu, Bernard Ghanem

Temporal action detection (TAD) is an important yet challenging task in video analysis. Most …

Video Self‑Stitching Graph Network for Temporal Action Localization

IEEE International Conference on Computer Vision (ICCV), 2021.

Chen Zhao, Ali Thabet, Bernard Ghanem

Short actions are critical and challenging in the task of action localization. We target this problem and propose a video self-stitching graph network (VSGN), which enhances short action by video self-stitching (VSS) and a cross-scale graph pyramid network (xGPN).

See all publications

Selected Awards

2024 First place, Ego4D Visual Queries 3D at CVPR 2024
2024 First place, Epic-kitchens audio-based interaction detection at CVPR 2024
2024 First place, Epic-kitchens action detection at CVPR 2024
2024 First place, Epic-kitchens action recognition at CVPR 2024
2023 Best Paper Award, CVPR workshop CLVision
2023 First place, Visual Queries 3D Localization Challenge in Ego4D Workshop at CVPR 2023
2022 First place, Visual Queries 3D Localization Challenge in Ego4D Workshop at ECCV 2022
2021 Outstanding Reviewer, IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
2020 Finalist, MIT Enterprise Forum Saudi Startup Competition
2020 Second place, HACS Temporal Action Localization Challenge
2019 Finalist, Taqadam Startup Accelerator, Saudi Arabia
2016 Outstanding Graduate, Peking University
2016 Scholarship of Outstanding Talent, Peking University
2015 Best Paper Award, National Conference on Multimedia Technology (NCMT)
2012 First Prize of Qualcomm Innovation Fellowship Contest (QInF), only 2 in China
2012 Outstanding Individual in the Summer Social Practice, Peking University
2010 Outstanding Graduate Leader, Sichuan University
2008 Goldman Sachs Global Leaders Award (only 26 in mainland China and 150 worldwide)
2007 National Scholarship (Top 1 out of 329 students), Sichuan University
2007 First‑Class Scholarship (Top 1 out of 329 students), Sichuan University

Contact

chen[dot]zhao[at]kaust[dot]edu[dot]sa
Office No. 3124, Building 13, KAUST, Thuwal, Makkah Province 23955-6900