4DNeX: Feed-Forward 4D Generative Modeling Made Easy
Zhaoxi Chen, Tianqi Liu, Long Zhuo, Jiawei Ren, Zeng Tao, He Zhu, Fangzhou Hong, Liang Pan, Ziwei Liu
Abstract
We present 4DNeX, the first feed-forward framework for generating 4D (i.e., dynamic 3D) scene representations from a single image. In contrast to existing methods that rely on computationally intensive optimization or require multi-frame video inputs, 4DNeX enables efficient, end-to-end image-to-4D generation by fine-tuning a pretrained video diffusion model. Specifically, 1) to alleviate the scarcity of 4D data, we construct 4DNeX-10M, a large-scale dataset with high-quality 4D annotations generated using advanced reconstruction approaches; 2) we introduce a unified 6D video representation that jointly models RGB and XYZ sequences, facilitating structured learning of both appearance and geometry; 3) we propose a set of simple yet effective adaptation strategies to repurpose pretrained video diffusion models for 4D modeling. 4DNeX produces high-quality dynamic point clouds that enable novel-view video synthesis. Extensive experiments demonstrate that 4DNeX outperforms existing 4D generation methods in efficiency and generalizability, offering a scalable solution for image-to-4D modeling and laying the foundation for generative 4D world models that simulate dynamic scene evolution.
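To make the unified 6D video representation concrete, the following is a minimal sketch of how per-frame RGB images and per-pixel XYZ pointmaps could be packed into a single six-channel video tensor. The function name, the min-max normalization of XYZ, and the channel-concatenation layout are illustrative assumptions for exposition, not the paper's exact design.

```python
import numpy as np

def make_6d_video(rgb_frames: np.ndarray, xyz_frames: np.ndarray) -> np.ndarray:
    """Pack RGB frames and per-pixel XYZ pointmaps into one 6-channel video.

    rgb_frames: (T, H, W, 3) float array in [0, 1]
    xyz_frames: (T, H, W, 3) float array of per-pixel 3D coordinates
    Returns a (T, H, W, 6) array. The channel-concatenation layout is an
    illustrative assumption, not necessarily the paper's exact scheme.
    """
    assert rgb_frames.shape == xyz_frames.shape
    # Normalize XYZ to a bounded range so both modalities share a scale.
    lo, hi = xyz_frames.min(), xyz_frames.max()
    xyz_norm = (xyz_frames - lo) / (hi - lo + 1e-8)
    # Joint appearance (RGB) + geometry (XYZ) sequence along the channel axis.
    return np.concatenate([rgb_frames, xyz_norm], axis=-1)

# Example: a 16-frame, 64x64 clip with random placeholder data.
T, H, W = 16, 64, 64
video_6d = make_6d_video(np.random.rand(T, H, W, 3),
                         np.random.rand(T, H, W, 3))
print(video_6d.shape)  # (16, 64, 64, 6)
```

Coupling the two modalities in one tensor lets a video diffusion model denoise appearance and geometry jointly, which is the structural idea behind the 6D representation described above.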