DE-NeRF: DEcoupled Neural Radiance Fields for View-Consistent Appearance Editing and High-Frequency Environmental Relighting

 

Tong Wu1          Jia-Mu Sun1         Yu-Kun Lai2       Lin Gao1         

 

 

1Institute of Computing Technology, Chinese Academy of Sciences

 

2Cardiff University      

 

Accepted to SIGGRAPH 2023 (Conference Proceedings)

 

 

 

 

Figure: Given a set of input images, we train a neural radiance field that decouples geometry, appearance, and lighting. Our method supports not only geometry manipulation and appearance editing but also rendering of the captured or modified scene under a novel lighting condition.

 

 

 

Abstract

 

Neural Radiance Fields (NeRF) have shown promising results in novel view synthesis. While achieving state-of-the-art rendering quality, NeRF usually encodes all properties related to the geometry and appearance of the scene together into several MLP (Multi-Layer Perceptron) networks, which hinders downstream manipulation of geometry, appearance, and illumination. Recently, researchers have made attempts to edit the geometry, appearance, and lighting of NeRF. However, these methods fail to render view-consistent results after editing the appearance of the input scene. Moreover, high-frequency environmental relighting is also beyond their capability, as lighting is modeled with Spherical Gaussian (SG) or Spherical Harmonic (SH) functions or a low-resolution environment map. To solve the above problems, we propose DE-NeRF, which decouples view-independent and view-dependent appearance in the scene with a hybrid lighting representation. Specifically, we first train a signed distance function to reconstruct an explicit mesh for the input scene. A decoupled NeRF then learns to attach view-independent appearance to the reconstructed mesh by defining learnable, disentangled features representing geometry and view-independent appearance on its vertices. We approximate lighting with an explicit learnable environment map and an implicit lighting network to support both low-frequency and high-frequency relighting. By modifying the view-independent appearance, rendered results are consistent across different viewpoints. Our method also supports high-frequency environmental relighting by replacing the lighting representation with a target environment map.

 

 

 


Paper

 

DE-NeRF: DEcoupled Neural Radiance Fields for View-Consistent Appearance Editing and High-Frequency Environmental Relighting

[ArXiv Preprint]

 

Code

 

[Link]

Methodology

Overview of DE-NeRF

 

 

Figure: Given a set of images, we learn a signed distance function to reconstruct the geometry. Then, on the vertices of the reconstructed mesh, we set up learnable geometry features lg and appearance features la, lr, lp (corresponding to diffuse, roughness, and specular components) to decompose geometry, appearance, and lighting in the scene. A sample point's geometry feature lwg and appearance features lwa, lwr, lwp are obtained by KNN (K-nearest neighbor) interpolation. The geometry feature lwg and the distance to the mesh h are fed into an SDF decoder to predict its signed distance value s. Similarly, the appearance features lwa, lwr, lwp and the distance h go through several appearance decoders to predict the diffuse albedo a, roughness value r, and specular tint p. A learnable environment map Ed is integrated with the diffuse albedo to get the diffuse color cd. We also train a specular lighting decoder Fs to predict the specular lighting cl, which is multiplied by the specular tint p to produce the specular color cs. Combining cd and cs, we get the color c for this point.
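For readers who prefer code, the minimal PyTorch sketch below illustrates this decoding step. It is an illustration rather than the released implementation: the decoder modules, the per-vertex feature tensors, the exact decoder inputs, and the env_irradiance helper (standing in for the integration of the learnable environment map Ed around the surface normal) are all assumptions made for the example.

import torch

def knn_interpolate(query, verts, vert_feats, k=8):
    """Inverse-distance weighted interpolation of per-vertex features at the
    query points over their K nearest mesh vertices."""
    dist, idx = torch.cdist(query, verts).topk(k, dim=-1, largest=False)
    w = 1.0 / (dist + 1e-8)
    w = w / w.sum(dim=-1, keepdim=True)                    # (Q, K) weights
    return (w.unsqueeze(-1) * vert_feats[idx]).sum(dim=1)  # (Q, C) features

def decode_point(x, h, n, wo, verts, feats, decoders, env_irradiance):
    """x: sample points (Q, 3); h: distances to the mesh (Q, 1); n: surface
    normals (Q, 3); wo: view directions (Q, 3). feats maps names to per-vertex
    feature tensors and decoders maps names to small MLPs (both assumed)."""
    l_g = knn_interpolate(x, verts, feats["geometry"])
    l_a = knn_interpolate(x, verts, feats["albedo"])
    l_r = knn_interpolate(x, verts, feats["roughness"])
    l_p = knn_interpolate(x, verts, feats["specular"])

    s = decoders["sdf"](torch.cat([l_g, h], -1))        # signed distance value s
    a = decoders["albedo"](torch.cat([l_a, h], -1))     # diffuse albedo a
    r = decoders["roughness"](torch.cat([l_r, h], -1))  # roughness value r
    p = decoders["tint"](torch.cat([l_p, h], -1))       # specular tint p

    c_d = a * env_irradiance(n)                              # diffuse color cd
    c_l = decoders["spec_light"](torch.cat([wo, r, n], -1))  # specular lighting cl
    c_s = p * c_l                                            # specular color cs
    return s, c_d + c_s                                      # color c = cd + cs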

Figure: Given a sample point in the scene (the red point), we sample multiple directions wo from the sample point to points (black points on the blue frame) on the sky sphere. We treat these directions as view directions and feed them along with the roughness value of the sample point into the specular lighting decoder to get the specular lighting colors from different view directions. These predicted specular lighting colors are unwrapped to the 2D image space as an environment map.
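The unwrapping step can be sketched as follows; the latitude-longitude parameterization, the image resolution, and the decoder's input layout are assumptions of this example rather than specifics from the paper, and a single sample point's roughness is assumed.

import torch

def extract_env_map(roughness, spec_light_decoder, height=64, width=128):
    """Query the specular lighting decoder along directions sampled on the sky
    sphere and unwrap the predicted colors into an H x W latitude-longitude
    environment map image."""
    theta = torch.linspace(0.0, torch.pi, height)      # polar angle
    phi = torch.linspace(0.0, 2.0 * torch.pi, width)   # azimuth
    th, ph = torch.meshgrid(theta, phi, indexing="ij")
    dirs = torch.stack([torch.sin(th) * torch.cos(ph),
                        torch.sin(th) * torch.sin(ph),
                        torch.cos(th)], dim=-1).reshape(-1, 3)  # (H*W, 3) directions

    # Use the sample point's roughness value for every queried direction.
    r = roughness.reshape(1, -1).expand(dirs.shape[0], -1)
    with torch.no_grad():
        colors = spec_light_decoder(torch.cat([dirs, r], dim=-1))  # (H*W, 3) colors
    return colors.reshape(height, width, 3)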

Geometry Reconstruction

 

 

 

Figure: Qualitative comparison of geometry reconstruction. Our method recovers better surface details than NeuS [Wang et al. 2021], PhySG [Zhang et al. 2021a], and NvDiffRec [Munkberg et al. 2022].

 

 

 

Novel View Synthesis

 

 

 

Figure: Novel view synthesis comparisons with PhySG [Zhang et al. 2021a], NeRFactor [Zhang et al. 2021b], NvDiffRec [Munkberg et al. 2022], and NeuMesh [Bao et al. 2022].

 

 

 

Appearance Editing

 

 

 

Figure: Scene appearance editing comparison with NeuMesh [Bao et al. 2022]. NeuMesh can generate plausible rendering results from the editing viewpoint, but results rendered from other viewpoints may be inconsistent with the input edit. Our method produces more faithful editing results from both the editing viewpoint and novel viewpoints.

 

 

 

Relighting

 

 

 

Figure: Scene relighting comparisons with PhySG [Zhang et al. 2021a], InvRender [Zhang et al. 2022b], NeRFactor [Zhang et al. 2021b], NvDiffRec [Munkberg et al. 2022], and NvDiffRecMC [Hasselgren et al. 2022]. In each row, the input scene and target environment map are shown in the first column. In other columns, we show relighting results by different methods and the ground truth relighting result. With the help of our reconstructed geometry and hybrid lighting representation, our method can produce more faithful relighting results with high-frequency details.
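In terms of the decoding sketch given after the pipeline figure, environmental relighting amounts to swapping the lighting representation. The outline below is speculative and only for orientation: the loader and the fine-tuning step are assumptions of the example, not the paper's procedure.

# Hypothetical outline, reusing decode_point from the earlier sketch.
env_irradiance = load_target_env_map("target_envmap.hdr")  # assumed loader for the target Ed
finetune(decoders["spec_light"], env_irradiance)            # assumed: fit Fs to the new lighting
_, color = decode_point(x, h, n, wo, verts, feats, decoders, env_irradiance)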

BibTex

 

@inproceedings{DE-NeRF,
    author = {Tong Wu and Jia-Mu Sun and Yu-Kun Lai and Lin Gao},
    title = {DE-NeRF: DEcoupled Neural Radiance Fields for View-Consistent Appearance Editing and High-Frequency Environmental Relighting},
    booktitle = {ACM SIGGRAPH},
    year = {2023}
}

 

 


Last updated in June 2023.