The light field camera has significantly advanced conventional imaging and microscopy over the past decades, encoding high-dimensional information in 2D images and enabling a variety of applications. However, inherent shortcomings persist, mainly due to the complex optical setup and the trade-off between spatial and angular resolution. In this work, we propose a Neural Defocus Light Field (NDLF) rendering method, which constructs the light field without a micro-lens array while retaining the full resolution of the original image. The basic unit of NDLF is the 3D point spread function (3D-PSF), which extends the 2D-PSF by incorporating the focus-depth axis. NDLF directly solves for the distribution of PSFs in 3D space, enabling direct manipulation of the PSF in 3D and deepening our understanding of the defocus process. NDLF renders focused images by redefining them as slices of the NDLF, i.e., superpositions of cross-sections of the 3D-PSFs. NDLF modulates the 3D-PSFs using three multilayer perceptron modules, corresponding to three blur models from coarse to fine, from simple Gaussians to a non-parametric kernel. NDLF is trained on 20 high-resolution (1024 × 1024) images at different focus depths, enabling it to render focused images at any given focus depth. The structural similarity index between the predicted and measured focused images is 0.9794. Moreover, we developed a hardware system to collect high-resolution focused images with corresponding focus depths and depth maps. NDLF achieves high-resolution light field imaging with a single-lens camera and also resolves the distribution of 3D-PSFs in 3D space, paving the way for novel light-field synthesis techniques and deeper insights into defocus blur.
Defocus Light Field
The fundamental unit of our system is the 3D Point Spread Function (3D-PSF), where the depth axis represents focus level, and each 2D slice corresponds to a conventional 2D-PSF, also known as the Circle of Confusion (CoC). A focused image at a given depth can be rendered as a slice along this axis—obtained by superposing all relevant 2D-PSFs. In the figure above, the green and yellow slices illustrate focal planes at two different depths.
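To make the rendering step concrete, we can write it as follows (notation introduced here for clarity; it is not quoted verbatim from the paper). A focused image $I_d$ at focus depth $d$ is the superposition of the 2D cross-sections, taken at depth $d$, of the 3D-PSFs of all scene points $p$ with radiance $L(p)$:

$$I_d(x, y) = \sum_{p} L(p)\, \mathrm{PSF}_p(x, y; d),$$

where $\mathrm{PSF}_p(\,\cdot\,,\,\cdot\,; d)$ denotes the slice of the 3D-PSF of point $p$ at focus depth $d$.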
Algorithm Overview
Our method comprises three stages. First, MLP-a generates a basic CoC using a Gaussian model, parameterized by $\sigma_x$ and $\sigma_y$, based on input light rays and focus depth. Second, MLP-b refines this by producing a more expressive Gaussian with covariance, outputting five parameters: $\sigma_x$, $\sigma_y$, $\mu_x$, $\mu_y$, and $\rho$. Finally, MLP-c models non-Gaussian blur effects by learning a non-parametric kernel conditioned on the previous output, enabling accurate modeling of complex defocus behaviors not captured by Gaussian assumptions.
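As a minimal sketch of this coarse-to-fine pipeline, the following PyTorch module wires the three MLPs together. Layer widths, the input encoding, the softplus/softmax activations, and the kernel size are assumptions made for illustration, not the paper's exact architecture.

```python
# Minimal PyTorch sketch of the coarse-to-fine PSF modules described above.
# Layer widths, input encoding, and kernel size are assumed for illustration.
import torch
import torch.nn as nn

def mlp(in_dim, out_dim, hidden=128, depth=4):
    """Small fully connected network used by all three modules (assumed sizes)."""
    layers, d = [], in_dim
    for _ in range(depth - 1):
        layers += [nn.Linear(d, hidden), nn.ReLU()]
        d = hidden
    layers.append(nn.Linear(d, out_dim))
    return nn.Sequential(*layers)

class NDLFModules(nn.Module):
    """Coarse-to-fine defocus kernels: basic Gaussian -> Gaussian with covariance
    -> non-parametric kernel refinement."""
    def __init__(self, in_dim=3, ksize=21):
        super().__init__()
        self.mlp_a = mlp(in_dim, 2)                  # sigma_x, sigma_y
        self.mlp_b = mlp(in_dim, 5)                  # sigma_x, sigma_y, mu_x, mu_y, rho
        self.mlp_c = mlp(in_dim + 5, ksize * ksize)  # non-parametric kernel
        self.ksize = ksize

    def forward(self, ray_and_depth):
        # ray_and_depth: (N, in_dim) ray/pixel coordinates plus focus depth
        sigma_ab = torch.nn.functional.softplus(self.mlp_a(ray_and_depth))  # coarse CoC widths
        gauss5 = self.mlp_b(ray_and_depth)                                  # covariance Gaussian params
        kernel = self.mlp_c(torch.cat([ray_and_depth, gauss5], dim=-1))     # kernel logits
        kernel = torch.softmax(kernel, dim=-1).view(-1, self.ksize, self.ksize)
        return sigma_ab, gauss5, kernel
```

In this sketch, MLP-c is conditioned on both the input rays and MLP-b's Gaussian parameters, mirroring the description above that the non-parametric kernel is learned from the previous output.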
Dataset Overview
We collected a high-resolution defocus image dataset consisting of 20 samples (1024×1024 pixels) captured at distinct focus depths. Each sample is paired with an accurate depth map, providing spatially resolved focus information across the scene. This dataset enables supervised training of our NDLF model and supports evaluation under various optical configurations.
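As a hypothetical illustration of how one training sample (a focused image, its focus depth, and the corresponding depth map) could be assembled, consider the sketch below; file names, formats, and units are placeholders rather than the released dataset layout.

```python
# Hypothetical loader for one dataset sample; paths and formats are placeholders.
import numpy as np
from PIL import Image

def load_sample(image_path: str, focus_depth_mm: float, depth_map_path: str) -> dict:
    """Pair a 1024x1024 focused image with its focus depth and per-pixel depth map."""
    image = np.asarray(Image.open(image_path), dtype=np.float32) / 255.0
    depth_map = np.asarray(Image.open(depth_map_path), dtype=np.float32)
    return {"image": image, "focus_depth_mm": focus_depth_mm, "depth_map": depth_map}
```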
Figure 4: Results of focused image rendering. (a) Focused image and error map generated by MLP-a parameters. (b) Focused image and error map generated by MLP-a+b. (c) Focused image and error map generated by MLP-a+b+c. (d) Ground-truth image. The error map represents the absolute difference between the ground-truth image and the synthesized image, scaled by a factor of 8.5. The effectiveness of the three modules can be observed through the error maps, as the rendered focused images progressively approach the ground-truth image.
Figure Description: (Top) The 2D-PSFs at different focus depths (800–1200 mm) are extracted as slices of the 3D-PSF at pixel (320, 200), illustrating how PSF shape evolves with depth. (Bottom) Focused images paired with defocus maps, generated using the MLP-b module. The defocus level is computed as $\sigma(x,y) = \frac{1}{2} \left( \sigma_{x}(x,y) + \sigma_{y}(x,y) \right)$, reflecting the spatially-varying blur across the image plane.
Figure: (Left) The 3D Point Spread Function (3D-PSF) exhibits a double-cone structure with focus-depth offset and non-circular slices, reflecting physical aberrations. (Right) The 2D-PSFs plotted across the image X-Y plane reveal spatially varying shapes, normalized for visualization. Centers are marked with gray lines to highlight spatial shift.
@ARTICLE{11023177,
author={He, Renzhi and Hong, Hualin and Cheng, Zhou and Liu, Fei},
journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
title={Neural Defocus Light Field Rendering},
year={2025},
volume={},
number={},
pages={1-12},
keywords={Light fields;Three-dimensional displays;Cameras;Image resolution;Kernel;Rendering (computer graphics);Optical imaging;Accuracy;Computational modeling;Training;Computational Imaging;Point Spread Function;Light Field;Defocus Light Field},
doi={10.1109/TPAMI.2025.3576638}}