Distinguished Seminar Series on Data Science & Artificial Intelligence - "Exploring Knowledge Learning from Video Generative Modeling" by Dr Jiashi FENG
Research Seminar
Date
14 Feb 2025
Organiser
Department of Computing
Time
16:00 - 17:00
Venue
Online via Zoom / FJ302
Speaker
Dr Jiashi FENG
Summary
Current AI models primarily acquire knowledge through generative text modeling, as exemplified by large language models. However, videos, which encode rich information relevant to visual reasoning and task execution, remain largely underexplored as a source of knowledge. This modality offers complementary knowledge that text alone cannot capture.
In this talk, I will present our recent efforts to leverage video generative modeling for learning from video data. First, I will introduce VideoWorld, a generative video learning framework designed to extract visual reasoning and planning capabilities from unlabeled videos. The model has demonstrated its applicability in tasks such as playing Go and robotic manipulation. Next, I will discuss our recent findings on the limitations of video generative models when used as world models. Finally, I will conclude with a discussion of existing challenges and outline potential future directions.
Keynote Speaker
Dr Jiashi FENG
Head of Vision Research
ByteDance
Dr Jiashi FENG is the Head of Vision Research at ByteDance. Before taking on this role, he served as an Assistant Professor in the Department of Electrical and Computer Engineering at the National University of Singapore (NUS). He earned his PhD from NUS in 2014 and worked as a Research Fellow at UC Berkeley from 2014 to 2015. His research focuses on deep learning and its applications in computer vision. Over the years, he has received several awards, including the Best Technical Demo Award at ACM MM 2012, the Best Paper Award at TASK-CV ICCV 2015, and the Best Student Paper Award at ACM MM 2018. In 2018, he was also named to MIT Technology Review's Innovators Under 35 Asia list.