The School of EECS is hosting the following HDR Progress Review 1 Confirmation Seminar:

 
Towards Unified Online 3D World Modeling with Vision Foundation Models
 
Speaker: Fengyi Zhang
Host: Rocky Chen
 
Abstract: Recent 3D Vision Foundation Models (3DVFMs) have substantially advanced 3D world modeling by directly predicting key 3D attributes from uncalibrated images within a unified feed-forward architecture. Their strong generalization ability marks a promising shift away from traditional multi-stage SfM and MVS pipelines. Nevertheless, current 3DVFMs suffer from two fundamental limitations that hinder their real-world deployment: they fail to maintain consistency between independent submap predictions in online settings, and they lack the semantic understanding required for downstream perception tasks. To advance 3DVFMs toward consistent online reconstruction, we introduce TALO, which employs a higher-DOF, long-term alignment framework based on Thin Plate Splines, leveraging globally propagated control points to correct spatially varying inconsistencies. In addition, TALO adopts a point-agnostic submap registration design that is inherently robust to noisy geometry predictions. Together, these components enable globally consistent geometry and trajectory reconstruction in online settings. Building on TALO, we further propose TT-Occ, which extends 3DVFMs toward semantic 3D world perception by incrementally integrating 3DVFMs with 2D foundation models and 3D Gaussian Splatting at runtime. TT-Occ yields accurate geometric reconstruction and open-vocabulary semantic understanding without requiring any network training or fine-tuning. These efforts advance our broader research vision of unified online 3D world modeling with vision foundation models, enabling robust and consistent 3D perception for challenging 3D understanding tasks such as autonomous driving and embodied intelligence.
 
Bio: Fengyi Zhang is a second-year PhD student in the School of Electrical Engineering and Computer Science at the University of Queensland, supervised by Dr. Yadan Luo and Prof. Zi Huang. He obtained his Bachelor of Engineering from the School of Software at Shandong University, China, and his Master of Engineering from the School of Software Engineering at Tongji University, China. His research focuses on data-driven 3D reconstruction for real-world visual understanding.

About the Data Science Seminar

This seminar series is hosted by EECS Data Science.

Venue

Room: 78-411, General Purpose South Building
Zoom: https://uqz.zoom.us/j/86449618497