NeRF.cpp

C++ / LibTorch implementation of NeRF. Given RGB images of a static scene with known camera intrinsics and per-image extrinsics, fits a SIREN-trunk radiance field with Fourier positional encoding and renders RGB and depth from arbitrary novel viewpoints.


Figure (RGB and depth renders): the radiance field forming over training. The object rotates while the reconstruction sharpens from the noisy SIREN initialisation to the final render.

Pipeline

Ray generation. For each pixel the camera-space direction ((x − W/2)/f, −(y − H/2)/f, −1) is rotated into world space by the extrinsics; the camera origin is the translation column.
Sampling. Three strategies are supported at train and eval time: UNIFORM, STRATIFIED (NeRF §5.2: partition the interval into bins and draw one jittered sample per bin), and PROPOSAL (hierarchical: a coarse pass defines a per-ray PDF from which extra points are importance-sampled via inverse-transform sampling, concentrating samples near surfaces). For a fair comparison, single-pass samplers use 192 depth samples per ray (n_samples + n_importance); the proposal path uses 64 coarse + 128 importance samples, then evaluates the full radiance field at all 192 merged depths. The coarse pass uses a lightweight density-only proposal head (SirenNeRF::proposal_sigma): Fourier-encoded points, detached by a stop-gradient, feed a small trunk + σ head (no view directions, no RGB), so guiding importance sampling is much cheaper than re-running the full model. An interlevel loss (Mip-NeRF 360) trains the coarse proposal histogram to upper-bound the final main-network weight distribution; the two histograms have different bin locations, as in the paper.
Encoding + MLP. Inputs use NeRF-style Fourier positional encoding (L = 10 for xyz, L = 4 for view directions). See Architecture for layer counts.
Volume render. Discretised quadrature of σ along the ray gives per-sample α and transmittance; the resulting weights composite RGB and depth, alpha-composited onto a configurable background.
Loss. Pseudo-Huber photometric loss (δ = 0.1) plus the interlevel loss when training with PROPOSAL. Ray-batched training draws 8192 rays per step pooled across all training images (for PROPOSAL and STRATIFIED).
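The volume-render step above can be illustrated for a single ray. The following is a minimal sketch in plain C++ rather than the repository's LibTorch code: `composite_ray`, `RenderResult`, and the edge-based interval layout are hypothetical names chosen for this example, and taking depth at interval midpoints is an assumption.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct RenderResult {
    std::array<double, 3> rgb;  // composited colour including background
    double depth;               // weight-averaged sample depth
    double opacity;             // sum of weights = 1 - final transmittance
};

// Composite one ray. `t` holds n + 1 sorted interval edges; `sigma` and `rgb`
// hold n per-interval densities and colours (layout assumed for this sketch).
RenderResult composite_ray(const std::vector<double>& t,
                           const std::vector<double>& sigma,
                           const std::vector<std::array<double, 3>>& rgb,
                           const std::array<double, 3>& background) {
    assert(t.size() == sigma.size() + 1 && sigma.size() == rgb.size());
    RenderResult out{};          // all fields start at zero
    double transmittance = 1.0;  // fraction of light reaching interval i
    for (std::size_t i = 0; i < sigma.size(); ++i) {
        const double delta = t[i + 1] - t[i];
        const double alpha = 1.0 - std::exp(-sigma[i] * delta);
        const double weight = transmittance * alpha;
        for (std::size_t c = 0; c < 3; ++c) out.rgb[c] += weight * rgb[i][c];
        out.depth += weight * 0.5 * (t[i] + t[i + 1]);  // midpoint depth (assumption)
        out.opacity += weight;
        transmittance *= 1.0 - alpha;
    }
    // Whatever light is not absorbed along the ray shows the background.
    for (std::size_t c = 0; c < 3; ++c)
        out.rgb[c] += (1.0 - out.opacity) * background[c];
    return out;
}
```

A ray through empty space (all σ = 0) returns the background colour with zero opacity, while a ray hitting a very dense first interval returns that interval's colour and midpoint depth.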

Architecture

Default widths and depths are set in SirenNeRF (include/siren_nerf.h, src/nerf.cpp). Fourier positional encoding (L = 10 xyz, L = 4 view) is applied before the SIREN trunks; view directions use the same encoding scheme as positions and feed the RGB head only.
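As an illustration of the encoding, here is a minimal sketch that follows the formula from the NeRF paper, γ(p) = (sin(2^k π p), cos(2^k π p)) for k = 0 … L − 1. The function name `fourier_encode`, the output ordering, and the choice not to append the raw input are assumptions of this example; the repository's layout may differ.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Encode each coordinate of `p` with `num_freqs` sine/cosine pairs.
// Output length is p.size() * 2 * num_freqs (raw input not appended here).
std::vector<double> fourier_encode(const std::vector<double>& p, int num_freqs) {
    assert(num_freqs > 0);
    const double pi = std::acos(-1.0);
    std::vector<double> out;
    out.reserve(p.size() * 2 * static_cast<std::size_t>(num_freqs));
    double freq = pi;  // 2^0 * pi, doubled at every octave
    for (int k = 0; k < num_freqs; ++k) {
        for (double x : p) out.push_back(std::sin(freq * x));
        for (double x : p) out.push_back(std::cos(freq * x));
        freq *= 2.0;
    }
    return out;
}
```

Under these assumptions a 3-D position with L = 10 maps to 60 features and a 3-D view direction with L = 4 maps to 24.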

Component	Width	SIREN layers	Other layers	Output
Main trunk (`nerf_net_`)	128	6 (1 input + 5 hidden)	—	features
Main σ head	128 → 1	—	linear + softplus	density
Main RGB head	128 + view enc → 128	2	linear + sigmoid	RGB
Proposal trunk (`prop_net_`)	128	3 (1 input + 2 hidden)	—	features
Proposal σ head	128 → 1	—	linear + softplus	density

Main network total: 6 SIREN layers in the position trunk (view-independent σ), plus 2 SIREN layers in the RGB branch after view concat (8 SIREN layers on the RGB path), with separate linear density/RGB heads. The proposal path uses xyz Fourier encoding only (stop-gradient).

Proposal network total: 3 SIREN layers + 1 linear σ head; no RGB, no view input. Trained only via the interlevel loss, not photometric loss.
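To make the interlevel loss concrete, the sketch below evaluates the Mip-NeRF 360 proposal loss for one ray: for each main-network interval, the proposal weights of all overlapping proposal intervals are summed into an upper bound, and only the excess of the main weight over that bound is penalised. It is a plain C++ illustration with a hypothetical function name and a simple O(n·m) overlap test, not the repository's implementation. In an autograd setting the main weights would be treated as constants (stop-gradient), so only the proposal network receives gradients from this term.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// t / w: interval edges and weights of the main network (n + 1 edges, n weights).
// t_prop / w_prop: the same for the proposal histogram (bins may differ).
double interlevel_loss(const std::vector<double>& t,
                       const std::vector<double>& w,
                       const std::vector<double>& t_prop,
                       const std::vector<double>& w_prop) {
    assert(t.size() == w.size() + 1 && t_prop.size() == w_prop.size() + 1);
    const double eps = 1e-7;  // avoids division by zero for empty intervals
    double loss = 0.0;
    for (std::size_t i = 0; i < w.size(); ++i) {
        // Upper bound: total proposal weight overlapping [t[i], t[i+1]].
        double bound = 0.0;
        for (std::size_t j = 0; j < w_prop.size(); ++j) {
            const bool overlaps = t_prop[j] < t[i + 1] && t_prop[j + 1] > t[i];
            if (overlaps) bound += w_prop[j];
        }
        // Penalise only the part of w[i] that exceeds the bound.
        const double excess = std::max(0.0, w[i] - bound);
        loss += excess * excess / (w[i] + eps);
    }
    return loss;
}
```

When the proposal histogram matches or dominates the main weights the loss is zero; with main weights {0.8, 0.2} and proposal weights {0.5, 0.5} on the same bins, only the first interval contributes, giving 0.3² / 0.8 ≈ 0.1125.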

Training defaults (src/main.cpp): 160×160 images, TrainSampler::PROPOSAL, 64 coarse + 128 importance samples, 8192 rays/step, AdamW (lr = 5e-4, weight decay = 1e-2), warmup_iters = 0, interlevel weight 1.0, pseudo-Huber δ = 0.1, 10000 iterations.
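For reference, the pseudo-Huber loss with δ = 0.1 can be sketched as below, using the standard form δ²(√(1 + (r/δ)²) − 1) averaged over residuals r. The function name and the mean reduction are assumptions of this example; the repository may scale or reduce the loss differently.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Mean pseudo-Huber loss between predicted and target colour values.
// Quadratic (about r^2 / 2) for |r| << delta, linear (about delta * |r|) for |r| >> delta.
double pseudo_huber(const std::vector<double>& pred,
                    const std::vector<double>& target,
                    double delta = 0.1) {
    assert(!pred.empty() && pred.size() == target.size() && delta > 0.0);
    double sum = 0.0;
    for (std::size_t i = 0; i < pred.size(); ++i) {
        const double r = (pred[i] - target[i]) / delta;  // residual in units of delta
        sum += delta * delta * (std::sqrt(1.0 + r * r) - 1.0);
    }
    return sum / static_cast<double>(pred.size());
}
```

With δ = 0.1, a residual of 0.001 costs about 5e-7 (close to r²/2), while a residual of 1.0 costs about 0.0905 rather than the 0.5 that a pure squared loss of r²/2 would give, which limits the influence of outlier pixels.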

Results

Trained and evaluated on the NeRF synthetic lego and ship scenes (held-out test split: every 8th view). Training uses 160×160 images, 10000 AdamW iterations (lr = 5e-4, weight decay = 1e-2), ray-batched PROPOSAL sampling (64 coarse + 128 importance → 192 main-model evals/ray), pseudo-Huber loss (δ = 0.1), on an NVIDIA RTX A6000. Metrics are averaged over the 13 test views at the final evaluation step.

Quality (PSNR / SSIM)

Scene	PSNR ↑	SSIM ↑	Output dir
lego	25.05	0.901	`output_lego_160_10k`
ship	25.03	0.767	`output_ship_160_10k`

At 10k iterations, lego reaches 0.901 SSIM (+0.004 over the 6k-iteration run at 160×160) and 25.05 dB PSNR (+0.26 dB). Ship holds at ~25.0 dB PSNR; its SSIM is slightly below that of earlier 6k-iteration ship runs, which is common for this scene at this resolution.
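For readers who want to reproduce the PSNR column, the metric for images with values in [0, 1] is −10 · log10(MSE). The sketch below computes it per view and averages over views; the function names are hypothetical, and whether the repository averages per-view PSNR or derives it from a pooled MSE is an assumption here.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// PSNR in dB between two images stored as flat arrays with values in [0, 1].
double psnr(const std::vector<double>& pred, const std::vector<double>& target) {
    assert(!pred.empty() && pred.size() == target.size());
    double mse = 0.0;
    for (std::size_t i = 0; i < pred.size(); ++i) {
        const double diff = pred[i] - target[i];
        mse += diff * diff;
    }
    mse /= static_cast<double>(pred.size());
    return -10.0 * std::log10(mse);  // peak value is 1, so 20*log10(1) vanishes
}

// Average of per-view PSNR values, e.g. over the held-out test views.
double mean_psnr(const std::vector<double>& per_view_psnr) {
    assert(!per_view_psnr.empty());
    double sum = 0.0;
    for (double value : per_view_psnr) sum += value;
    return sum / static_cast<double>(per_view_psnr.size());
}
```

A uniform error of 0.1 per channel gives an MSE of 0.01 and therefore 20 dB; 25 dB corresponds to an MSE of roughly 0.0032.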

Inference efficiency (coarse pass)

When rendering with the hierarchical PROPOSAL strategy, the coarse pass can reuse the full radiance field (Option A) or the cheap proposal head (Option B). Timing one deterministic 120×120 test view, averaged over 10 runs on the RTX A6000 (lego, trained with proposal):

Coarse pass	ms / view ↓	vs Option A
full model (Option A)	167	—
proposal head (Option B)	136	−19 %

Using the proposal head for the coarse pass cuts render time per view by ~19 % (167 ms → 136 ms), because the head skips view-dependent RGB and uses a shallower density-only trunk (3 SIREN layers instead of 6). The total still includes a full 192-sample fine pass; the win is making the coarse PDF step cheap enough to use at both train and eval time without a large quality penalty.
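The coarse PDF step itself is inexpensive once the coarse weights are known. A minimal sketch of inverse-transform sampling from a per-ray histogram follows; `sample_pdf` is a hypothetical name, the uniform variates `u` are passed in so the example stays deterministic, and the small `eps` added to every bin is an assumption made to keep the CDF invertible.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Draw depths from the piecewise-constant PDF defined by non-negative coarse
// `weights` over bins with the given `edges`. Each u in [0, 1) yields one depth.
std::vector<double> sample_pdf(const std::vector<double>& edges,
                               const std::vector<double>& weights,
                               const std::vector<double>& u) {
    assert(!weights.empty() && edges.size() == weights.size() + 1);
    const double eps = 1e-5;  // keeps the CDF strictly increasing in empty bins
    double total = 0.0;
    for (double w : weights) total += w + eps;

    // cdf[i] is the probability mass lying before edges[i].
    std::vector<double> cdf(edges.size(), 0.0);
    for (std::size_t i = 0; i < weights.size(); ++i)
        cdf[i + 1] = cdf[i] + (weights[i] + eps) / total;

    std::vector<double> depths;
    depths.reserve(u.size());
    for (double ui : u) {
        // The first CDF entry strictly above ui is the upper edge of the bin.
        const auto it = std::upper_bound(cdf.begin(), cdf.end(), ui);
        std::size_t hi = static_cast<std::size_t>(it - cdf.begin());
        hi = std::min<std::size_t>(std::max<std::size_t>(hi, 1), cdf.size() - 1);
        const std::size_t lo = hi - 1;
        // Invert the CDF linearly inside the selected bin.
        double frac = (ui - cdf[lo]) / (cdf[hi] - cdf[lo]);
        frac = std::min(1.0, std::max(0.0, frac));
        depths.push_back(edges[lo] + frac * (edges[hi] - edges[lo]));
    }
    return depths;
}
```

With weights {0, 1, 0} over edges {2, 3, 4, 5}, u = 0.5 lands at depth 3.5, in the middle of the only heavily weighted bin. In a setup like the one described above, 128 such depths would be merged and sorted with the 64 coarse depths before the full model is evaluated.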

Build

Requires a C++20 compiler, CMake ≥ 3.20, LibTorch (with CUDA if GPU training is wanted), and nlohmann_json.

mkdir build && cd build
cmake ..
make

Run

./NeRF.cpp <data_path> <output_path>

After training, assemble README GIFs (orbit + training progress):

bash make_gifs.sh output_lego_160_10k 30 10000 100 5 output/

load_dataset reads <data_path>/transforms.json — a shared camera_angle_x (FoV) and a transform_matrix (4×4 camera-to-world) per frame — plus the referenced image files. Scenes from the NeRF synthetic dataset (e.g. lego, ship) work directly.
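To connect these fields to the ray-generation step, the sketch below derives the focal length from camera_angle_x with the usual pinhole relation f = (W/2) / tan(camera_angle_x / 2) and builds one world-space ray from the pixel formula given in the Pipeline section. The names `focal_from_fov`, `pixel_ray`, `Mat4`, and `Ray` are hypothetical, the matrix is assumed to be row-major, and the direction is left unnormalised; the repository may differ on any of these points.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

using Mat4 = std::array<std::array<double, 4>, 4>;  // row-major camera-to-world

struct Ray {
    std::array<double, 3> origin;
    std::array<double, 3> direction;  // not normalised in this sketch
};

// Focal length in pixels from the horizontal field of view in radians.
double focal_from_fov(int width, double camera_angle_x) {
    assert(width > 0 && camera_angle_x > 0.0);
    return 0.5 * width / std::tan(0.5 * camera_angle_x);
}

// World-space ray through pixel (x, y); the camera looks down its -z axis.
Ray pixel_ray(double x, double y, int width, int height, double focal,
              const Mat4& c2w) {
    const std::array<double, 3> d_cam{(x - 0.5 * width) / focal,
                                      -(y - 0.5 * height) / focal,
                                      -1.0};
    Ray ray{};  // origin and direction start at zero
    for (std::size_t r = 0; r < 3; ++r) {
        ray.origin[r] = c2w[r][3];  // translation column
        for (std::size_t c = 0; c < 3; ++c)
            ray.direction[r] += c2w[r][c] * d_cam[c];  // rotate into world space
    }
    return ray;
}
```

For a 160-pixel-wide image with a 90° field of view the focal length is 80 pixels, and with an identity rotation the centre pixel looks along (0, 0, −1) from the camera position stored in the last column.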

References

NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis (Mildenhall et al., ECCV 2020)
Implicit Neural Representations with Periodic Activation Functions (Sitzmann et al., NeurIPS 2020)
Mip-NeRF 360: Unbounded Anti-Aliased Neural Radiance Fields (Barron et al., CVPR 2022)
cNeRF by rafaelanderka

License

BSD 3-Clause.
