
Distilling Foundation Models for a Semantic Reconstruction of the 3D World

Date: November 25, 2025

Time: 11:30 - 12:30

Location: 506, Zisapel Building


Lecturer: Sagie Benaim

Research Areas:

Image & Signal Processing, Computer Vision & Bio-signals

Enabling machines to reconstruct and semantically understand the 3D world is a fundamental goal for applications such as robotics, autonomous systems, and immersive telepresence. While recent advances achieve photorealistic 3D reconstruction and novel-view synthesis, this talk presents a framework that goes beyond visual accuracy to build scenes that are semantically rich. By distilling the vast knowledge of pre-trained 2D foundation models into our 3D representations, we unlock powerful open-vocabulary understanding, allowing scenes to be interactively segmented with simple text or clicks without relying on scarce 3D supervised data.

First, I will demonstrate how we can build a queryable 3D representation of large-scale static environments using Lang3D-XL, which enables interactive, text-based exploration by augmenting 3D Gaussian Splatting with low-dimensional semantic features. Building on this, I will extend the concept to dynamic scenes with Dynamic 3D Gaussian Distillation (DGD). This method reconstructs a full 4D semantic scene from a single monocular video, allowing for instantaneous, open-vocabulary segmentation of any object or agent. This enables, for instance, fine-grained interaction and analysis within complex, unstructured 4D data. Finally, I will reverse this information flow with “Splat and Distill,” showing how 3D reconstruction can serve as a geometric teacher to instill robust 3D awareness back into 2D foundation models.

Sagie Benaim is an Assistant Professor (Senior Lecturer) at the School of Computer Science and Engineering at the Hebrew University of Jerusalem. Previously, he was a postdoc at the University of Copenhagen, where he worked with Prof. Serge Belongie and was a member of the Pioneer Center for AI. Prior to that, he completed his PhD at Tel Aviv University in the Deep Learning Lab under the supervision of Prof. Lior Wolf. His research interests lie in computer vision, machine learning, and computer graphics, with a particular focus on generative models, neural-based signal representations, and inverse graphics.

Seminar: Pixel Club

