PARF: Primitive-Aware Radiance Fusion for Indoor Scene Novel View Synthesis

Abstract

We propose a method for fast scene radiance field reconstruction with strong novel view synthesis performance and convenient scene editing functionality. The key idea is to fully utilize semantic parsing and primitive extraction to constrain and accelerate the radiance field reconstruction process. To this end, we propose a primitive-aware hybrid rendering strategy that enjoys the best of both volumetric and primitive rendering. We further contribute a reconstruction pipeline that conducts primitive parsing and radiance field learning iteratively for each input frame, successfully fusing semantic, primitive, and radiance information into a single framework. Extensive evaluations demonstrate the fast reconstruction ability, high rendering quality, and convenient editing functionality of our method.

Performance comparison with the state-of-the-art radiance field reconstruction methods

Representation

Standard volume-based rendering methods like NeRF can model complex scenes but suffer from heavy sampling and ambiguous geometry. Primitive-based rendering methods like NeurMiPs enjoy fast rendering but have difficulty representing complex geometry.
We propose a novel hybrid representation to take advantage of both kinds of methods. Specifically, we represent the scene with a semantic volume, which consists of three kinds of voxels with different sampling strategies. For D-voxels we use dense sampling as in NeRF, while for P-voxels we apply a sparse, primitive-aware sampling strategy. We skip sampling within E-voxels entirely.
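The per-voxel sampling rule above can be illustrated with a minimal sketch. It is not the released implementation: the label constants, the `Plane` class, the `hybrid_samples` function, and the choice of a single sample at the ray-plane intersection inside a P-voxel are all assumptions made for illustration. The ray's voxel crossings (`segments`) are assumed to be precomputed by some voxel traversal routine.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

Vec3 = Tuple[float, float, float]

# Hypothetical label encoding for the page's E-, D-, and P-voxels.
E_VOXEL, D_VOXEL, P_VOXEL = 0, 1, 2


@dataclass
class Plane:
    normal: Vec3   # unit normal n
    offset: float  # the plane is the set {x : n . x + offset = 0}


def dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def ray_plane_t(origin: Vec3, direction: Vec3, plane: Plane) -> Optional[float]:
    """Ray parameter t of the ray-plane intersection, or None if there is none."""
    denom = dot(plane.normal, direction)
    if abs(denom) < 1e-8:  # ray is (nearly) parallel to the plane
        return None
    t = -(dot(plane.normal, origin) + plane.offset) / denom
    return t if t > 0 else None


def hybrid_samples(
    origin: Vec3,
    direction: Vec3,
    segments: Sequence[Tuple[float, float, int, Optional[int]]],
    planes: Sequence[Plane],
    dense_per_voxel: int = 4,
) -> List[float]:
    """Collect sample depths along one ray.

    segments: (t_in, t_out, label, plane_id) for each voxel the ray crosses,
    ordered front to back. plane_id is only used for P-voxels.
    """
    ts: List[float] = []
    for t_in, t_out, label, plane_id in segments:
        if label == E_VOXEL:
            continue  # E-voxel: skip sampling
        if label == D_VOXEL:
            # D-voxel: evenly spaced midpoint samples, as in dense NeRF-style sampling.
            step = (t_out - t_in) / dense_per_voxel
            ts.extend(t_in + (i + 0.5) * step for i in range(dense_per_voxel))
        elif label == P_VOXEL:
            # P-voxel (assumption): one sample where the ray hits the primitive.
            t = ray_plane_t(origin, direction, planes[plane_id])
            if t is not None and t_in <= t <= t_out:
                ts.append(t)
    return ts


if __name__ == "__main__":
    planes = [Plane(normal=(0.0, 0.0, 1.0), offset=-2.5)]  # the plane z = 2.5
    segments = [(0.0, 1.0, E_VOXEL, None), (1.0, 2.0, D_VOXEL, None), (2.0, 3.0, P_VOXEL, 0)]
    print(hybrid_samples((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), segments, planes))
```

In this toy ray, the first voxel contributes no samples, the second contributes four, and the third contributes one. This shows how the sample count, and therefore the number of network queries, drops wherever the scene is empty or well described by a primitive.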

Framework

We show the optimization framework for incremental radiance field reconstruction. For each input RGB-D image, we first detect primitives such as planes and merge the newly detected primitives into a global primitive list. Next, we back-project the semantic frame into a 3D semantic volume to update the global semantic state. Finally, primitive-aware hybrid rendering produces RGB, depth, and semantic values, which are supervised by the input frames.
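The back-projection step can be sketched as follows, assuming a pinhole camera with intrinsics `fx, fy, cx, cy` and a 4x4 camera-to-world pose. The function names, the per-voxel vote counter, and the majority-vote update rule are illustrative assumptions. The page only states that the semantic frame is back-projected into the semantic volume.

```python
import math
from collections import Counter, defaultdict


def backproject_labels(depth, labels, fx, fy, cx, cy, cam_to_world, voxel_size, votes=None):
    """Accumulate per-pixel semantic labels into a sparse voxel grid.

    depth, labels: row-major 2D lists of equal shape (depth in scene units).
    cam_to_world:  4x4 pose as nested lists (only the top three rows are used).
    votes:         running {voxel index: Counter of labels}, shared across frames.
    """
    if votes is None:
        votes = defaultdict(Counter)
    for v, (depth_row, label_row) in enumerate(zip(depth, labels)):
        for u, (d, label) in enumerate(zip(depth_row, label_row)):
            if d <= 0:
                continue  # invalid depth measurement
            # Pinhole back-projection into camera coordinates (homogeneous).
            p_cam = ((u - cx) * d / fx, (v - cy) * d / fy, d, 1.0)
            # Transform into world coordinates.
            p_world = [sum(cam_to_world[r][c] * p_cam[c] for c in range(4)) for r in range(3)]
            # Quantize to an integer voxel index and cast one vote.
            key = tuple(math.floor(x / voxel_size) for x in p_world)
            votes[key][label] += 1
    return votes


def voxel_label(votes, key):
    """Majority label of a voxel (assumed update rule, not confirmed by the page)."""
    return votes[key].most_common(1)[0][0]


if __name__ == "__main__":
    identity = [[1.0, 0, 0, 0], [0, 1.0, 0, 0], [0, 0, 1.0, 0], [0, 0, 0, 1.0]]
    depth = [[1.0, 1.0], [0.0, 2.0]]
    labels = [["P", "P"], ["D", "D"]]
    votes = backproject_labels(depth, labels, 1.0, 1.0, 0.0, 0.0, identity, voxel_size=2.0)
    print({k: dict(c) for k, c in votes.items()})
```

Passing the same `votes` dictionary to successive frames lets the global semantic state be refined incrementally, which mirrors the per-frame update described above.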

Incremental Performance

We compare the performance of PARF with NeRF-SLAM, which is a depth-supervised version of InstantNGP. Note that PARF converges much faster thanks to its primitive-aware hybrid representation.

1. Replica Office_0

2. Replica Office_2

3. Replica Room_0

Extrapolation Performance

We also show the extrapolation ability of our method. Note that under extrapolated views, PARF produces robust renderings thanks to its primitive-aware representation, while NeRF-SLAM produces blurry renderings due to ambiguous geometry.

1. Replica Office_0

2. Replica Office_2

3. Replica Room_0

4. BundleFusion apt0

Real-time Interaction and Rendering

We show that our method is capable of real-time rendering and interactions.

Replica Office_0

Replica Room_0

Sparse Reconstruction

Given only sparse views as input, PARF shows robust rendering performance thanks to the primitive-based hybrid representation.

Scene Editing

Our primitive-aware hybrid representation also enables convenient scene editing.

Conclusion

We introduce PARF, a Primitive-Aware Radiance Fusion method for indoor scene radiance field reconstruction and editing. By combining volumetric and primitive rendering in a hybrid neural representation, we successfully merge semantic parsing, primitive extraction, and radiance fusion into a single framework. PARF achieves significant improvement in convergence speed, strong view extrapolation performance, and realistic semantic editing effects simultaneously. Since the discrete semantic volume may lead to jagged primitive boundaries for novel view synthesis, future work includes combining the semantic information in a more compact manner and adding more kinds of primitives for more effective reconstruction.

Bibtex

@inproceedings{Ying:etal:ICCV2023,
  author    = {Haiyang Ying and Baowei Jiang and Jinzhi Zhang and Di Xu and Tao Yu and Qionghai Dai and Lu Fang},
  title     = {PARF: Primitive-Aware Radiance Fusion for Indoor Scene Novel View Synthesis},
  booktitle = {Proceedings of the International Conference on Computer Vision (ICCV)},
  year      = {2023}
}