DreamUDF: Generating Unsigned Distance Fields from A Single Image

 

Yu-Tao Liu1,2          Xuan Gao1,2          Weikai Chen3          Jie Yang1          Xiaoxu Meng3          Bo Yang3          Lin Gao1,2*               

 

1 Institute of Computing Technology, Chinese Academy of Sciences

 

2University of Chinese Academy of Sciences          3Tencent America

 

* Corresponding author  

Accepted to SIGGRAPH Asia 2024  

 

 

 

 

Figure 1: DreamUDF generates both open and closed surfaces from a single RGB image. It leverages both the 2D data prior from multi-view diffusion models and the 3D geometry prior from a data-free unsigned distance field (UDF) reconstructor, with the two priors employed by different modules of a joint network. A Field Coupler and an alternating training strategy allow the two modules to mutually boost each other. Experiments show that DreamUDF generates diverse shapes with open boundaries, such as the outer canopy of the red umbrella, the wild daffodil with pale white tepals and a yellow central trumpet, and the collars of garments. Note that front faces are rendered in cyan and back faces in grey.

 

 

Abstract

 

Recent advances in diffusion models and neural implicit surfaces have shown promising progress in generating 3D models. However, existing generative frameworks are limited to closed surfaces and fail to cope with a wide range of commonly seen shapes that have open boundaries. In this work, we present DreamUDF, a novel framework for generating high-quality 3D objects with arbitrary topologies from a single image. To address the challenge of generating proper topology from sparse and ambiguous observations, we propose to incorporate both the data prior from a multi-view diffusion model and the geometry prior brought by an unsigned distance field (UDF) reconstructor. In particular, we leverage a joint framework that consists of 1) a generation module that produces a neural radiance field for photorealistic renderings from arbitrary views; and 2) a reconstruction module that distills the learnable radiance field into surfaces with arbitrary topologies. We further introduce a field coupler that bridges the radiance field and the UDF under a novel optimization scheme, allowing the two modules to mutually boost each other during training. Extensive experiments and evaluations demonstrate that DreamUDF achieves higher-quality reconstruction and more robust 3D generation on both closed and open surfaces with arbitrary topologies than previous works.

 

 

 

Paper

 

Coming Soon

 

Code

 

Coming Soon

 

 

 

Methodology

 

We use a dedicated hybrid network that combines data and geometry priors through two modules: 1) a generation module that carries the data prior, and 2) a reconstruction module that carries the geometry prior. An alternating training strategy lets the two modules mutually enhance each other, enabled by a Field Coupler designed to bridge them.
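To make the alternating schedule concrete, below is a minimal PyTorch-style sketch. All module and method names (gen, recon, coupler, diffusion_guidance_loss, distillation_loss) are hypothetical placeholders for illustration, not the released implementation:

    import torch

    # Hypothetical sketch of the alternating training schedule.
    # `gen`, `recon`, and `coupler` stand in for the generation module,
    # the reconstruction module, and the Field Coupler; their interfaces
    # are assumptions, not the paper's actual code.
    def train_step(gen, recon, coupler, opt_gen, opt_recon, rays, step):
        if step % 2 == 0:
            # Generation phase: optimize the shallow density network against
            # the diffusion data prior, with the coupler feeding back geometry
            # from the UDF (detached, so UDF gradients are untouched).
            opt_gen.zero_grad()
            sigma = coupler(gen.density(rays), recon.udf(rays).detach())
            loss = gen.diffusion_guidance_loss(sigma, rays)
            loss.backward()
            opt_gen.step()
        else:
            # Reconstruction phase: distill the frozen radiance field into
            # the deep UDF network via rendered supervision.
            opt_recon.zero_grad()
            with torch.no_grad():
                target = gen.render(rays)
            loss = recon.distillation_loss(rays, target)
            loss.backward()
            opt_recon.step()

Keeping the UDF detached during the generation phase reflects the stated design goal that the generation process never pushes gradients back into the UDF network.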

 

 

 

 

Figure 2: Pipeline of DreamUDF. We leverage a joint framework that consists of a generation module and a reconstruction module. The generation module provides consistent renderings from arbitrary views by optimizing a shallow density network 𝜎𝑔 using the data prior brought by diffusion models. The reconstruction module introduces a geometry prior to regularize a deep UDF network U, which distills the renderings from the generation module into accurate geometry. The Hash Encoding in the generation module and the Frequency Encoding in the reconstruction module encode the position (𝑥, 𝑦, 𝑧). The Field Coupler establishes positive feedback from the UDF to the renderings in the diffusion procedure, greatly improving the geometry-awareness of the generation module without negatively impacting the UDF distribution.
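For reference, the Frequency Encoding used by the UDF branch is the standard NeRF-style positional encoding; a minimal PyTorch sketch follows (the n_freqs value is an assumption, and the Hash Encoding in the generation module would typically be a multi-resolution hash grid in the style of Instant-NGP, e.g. via tiny-cuda-nn):

    import torch

    def frequency_encoding(x, n_freqs=6):
        # Standard NeRF-style frequency encoding of a 3D position p:
        # gamma(p) = (sin(2^k * pi * p), cos(2^k * pi * p)), k = 0..n_freqs-1.
        freqs = (2.0 ** torch.arange(n_freqs, device=x.device)) * torch.pi
        angles = x[..., None] * freqs                   # (..., 3, n_freqs)
        return torch.cat([angles.sin(), angles.cos()], dim=-1).flatten(-2)

    enc = frequency_encoding(torch.rand(1024, 3))       # -> (1024, 36)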

 

 

 

 

Figure 3: Illustration and motivation of the Field Coupler. The renderings produced by the generation module are diverse, especially in regions that cannot be supervised by the input image. As shown in the figure, a shirt whose interior is hazy, filled, or empty is in each case a feasible NeRF field distribution given the single input image, while only the empty one is desirable. To address this issue, the Field Coupler incorporates the geometry prior brought by the reconstruction module into the generation module in an iterative manner. Specifically, it 1) provides positive feedback for regions with more consistent supervision (e.g., the outer surface) and helps them converge to a clean and sharp UDF, 2) provides negative feedback when the supervision is ambiguous and the UDF remains chaotic (e.g., the inner region), and 3) prevents the generation process from adversely influencing the UDF gradient.
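One possible reading of this behavior, expressed as code: the sketch below blends the generated density with a density derived from the detached UDF. The blending formula is our assumption for illustration only; the paper's actual coupling rule may differ:

    import torch

    def field_coupler(sigma_gen, udf_vals, beta=10.0):
        # Hypothetical coupling rule: derive a density from the detached UDF
        # and blend it with the generated density.  Near a confident zero
        # level set the UDF-derived density is sharp and reinforces the
        # rendering (positive feedback); where the UDF is still chaotic it is
        # diffuse and damps the rendering (negative feedback).
        udf_vals = udf_vals.detach()                     # point 3): no UDF grads
        sigma_udf = beta * torch.exp(-beta * udf_vals)   # peaks at udf = 0
        return 0.5 * (sigma_gen + sigma_udf)

Because the UDF-derived term is sharp only near a confident zero level set, consistently supervised regions are reinforced while ambiguous ones are damped, and the detach() realizes point 3) above.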

 

 

Results

 

 

Figure 4: Comparisons on the DeepFashion3D dataset. Our method generates meshes with open boundaries, while the other methods can only produce closed shapes.

 

 

 

 

Figure 5: Comparisons on open-boundary structures (butterfly, four-leaf clover, sunny doll) and closed shapes (teddy bear from the GSO dataset and chicken toy from the RealFusion dataset). Our method generates shapes with both open and closed surfaces.

 

 

Video

 

 

 

BibTex

 

@article{DreamUDF2024,
    author = {Liu, Yu-Tao and Gao, Xuan and Chen, Weikai and Yang, Jie and Meng, Xiaoxu and Yang, Bo and Gao, Lin},
    title = {DreamUDF: Generating Unsigned Distance Fields from A Single Image},
    journal = {ACM Transactions on Graphics (Proceedings of ACM SIGGRAPH Asia 2024)},
    year = {2024}
}