Citation:

SGARNet: a deep artifact removal approach for lensless multi-core fiber imaging


  • Light: Advanced Manufacturing  7, Article number: 50 (2026)
  • Corresponding authors:
    Jiachen Wu (wjc2022@mail.tsinghua.edu.cn), Liangcai Cao (clc@tsinghua.edu.cn)
  • Received: 03 December 2025
    Revised: 30 March 2026
    Accepted: 31 March 2026
    Accepted article preview online: 15 April 2026
    Published online: 15 April 2026

doi: https://doi.org/10.37188/lam.2026.050

  • [1] Boppart, S. A., Deutsch, T. F. & Rattner, D. W. Optical imaging technology in minimally invasive surgery: current status and future directions. Surgical Endoscopy 13, 718-722 (1999). doi: 10.1007/s004649901081
    [2] Gulati, S. et al. The future of endoscopy: Advances in endoscopic image innovations. Digestive Endoscopy 32, 512-522 (2020). doi: 10.1111/den.13481
    [3] Zhi, Z. W. et al. Supercontinuum light source enables in vivo optical microangiography of capillary vessels within tissue beds. Optics Letters 36, 3169-3171 (2011). doi: 10.1364/OL.36.003169
    [4] Park, C. H. et al. Role of probe-based confocal laser endomicroscopy-targeted biopsy in the molecular and histopathological study of gastric cancer. Journal of Gastroenterology and Hepatology 34, 84-91 (2019). doi: 10.1111/jgh.14471
    [5] Bouma, B. E. et al. Evaluation of intracoronary stenting by intravascular optical coherence tomography. Heart 89, 317-320 (2003). doi: 10.1136/heart.89.3.317
    [6] Kaur, M., Lane, P. M. & Menon, C. Endoscopic optical imaging technologies and devices for medical purposes: state of the art. Applied Sciences 10, 6865 (2020). doi: 10.3390/app10196865
    [7] Kim, Y. et al. Smartphone-based rigid endoscopy device with hemodynamic response imaging and laser speckle contrast imaging. Biosensors 13, 816 (2023). doi: 10.3390/bios13080816
    [8] Yuan, W. et al. Theranostic OCT microneedle for fast ultrahigh-resolution deep-brain imaging and efficient laser ablation in vivo. Science Advances 6, eaaz9664 (2020). doi: 10.1126/sciadv.aaz9664
    [9] Zhang, Q. et al. Diffractive optical elements 75 years on: from micro-optics to metasurfaces. Photonics Insights 2, R09 (2023). doi: 10.3788/PI.2023.R09
    [10] Gissibl, T. et al. Sub-micrometre accurate free-form optics by three-dimensional printing on single-mode fibres. Nature Communications 7, 11763 (2016). doi: 10.1038/ncomms11763
    [11] Ren, H. R. et al. An achromatic metafiber for focusing and imaging across the entire telecommunication range. Nature Communications 13, 4183 (2022). doi: 10.1038/s41467-022-31902-3
    [12] Vasquez-Lopez, S. A. et al. Subcellular spatial resolution achieved for deep-brain imaging in vivo using a minimally invasive multimode fiber. Light: Science & Applications 7, 110 (2018). doi: 10.1038/s41377-018-0111-0
    [13] Turtaev, S. et al. High-fidelity multimode fibre-based endoscopy for deep brain in vivo imaging. Light: Science & Applications 7, 92 (2018). doi: 10.1038/s41377-018-0094-x
    [14] Li, S. H. et al. Memory effect assisted imaging through multimode optical fibres. Nature Communications 12, 3751 (2021). doi: 10.1038/s41467-021-23729-1
    [15] Cifuentes, A. et al. Polarization-resolved second-harmonic generation imaging through a multimode fiber. Optica 8, 1065-1074 (2021). doi: 10.1364/OPTICA.430295
    [16] Stellinga, D. et al. Time-of-flight 3D imaging through multimode optical fibers. Science 374, 1395-1399 (2021). doi: 10.1126/science.abl3771
    [17] Wen, Z. et al. Single multimode fibre for in vivo lightfield-encoded endoscopic imaging. Nature Photonics 17, 679-687 (2023). doi: 10.1038/s41566-023-01240-x
    [18] Zhan, N. et al. Enhanced ultrafine multimode fiber imaging based on mode modulation through singular value decomposition. Photonics Research 12, 2214-2225 (2024). doi: 10.1364/PRJ.529353
    [19] Yu, H. Y. et al. All-optical image transportation through a multimode fibre using a miniaturized diffractive neural network on the distal facet. Nature Photonics 19, 486-493 (2025). doi: 10.1038/s41566-025-01621-4
    [20] Liu, N. H. et al. Rotational memory effect-inspired radon domain learning empowers image transmission through multimode fibers. Laser & Photonics Reviews 19, e00089 (2025). doi: 10.1002/lpor.202500089
    [21] Zhou, Y. B. et al. Multiplexing-enhanced computational imaging: High-fidelity reconstruction and color imaging in multimode fibers. Laser & Photonics Reviews 20, e01267 (2026).
    [22] Du, Y. et al. Hybrid multimode-multicore fibre based holographic endoscope for deep-tissue neurophotonics. Light: Advanced Manufacturing 3, 408-416 (2022).
    [23] Zolnacz, K. et al. Multicore fiber with thermally expanded cores for increased collection efficiency in endoscopic imaging. Light: Advanced Manufacturing 5, 580-587 (2024). doi: 10.37188/lam.2024.049
    [24] Shanker, A. et al. Quantitative phase imaging endoscopy with a metalens. Light: Science & Applications 13, 305 (2024). doi: 10.1038/s41377-024-01587-y
    [25] Li, H. et al. 500 μm field-of-view probe-based confocal microendoscope for large-area visualization in the gastrointestinal tract. Photonics Research 9, 1829-1841 (2021). doi: 10.1364/PRJ.431767
    [26] Fröch, J. E. et al. Real time full-color imaging in a meta-optical fiber endoscope. eLight 3, 13 (2023). doi: 10.1186/s43593-023-00044-4
    [27] Lich, J. et al. Single-shot 3D incoherent imaging with diffuser endoscopy. Light: Advanced Manufacturing 5, 218-228 (2024). doi: 10.37188/lam.2024.015
    [28] Skarsoulis, K. et al. Ptychographic imaging with a fiber endoscope via wavelength scanning. Optica 11, 782-790 (2024). doi: 10.1364/OPTICA.519965
    [29] Wu, J. C. et al. Single-shot lensless imaging with Fresnel zone aperture and incoherent illumination. Light: Science & Applications 9, 53 (2020). doi: 10.1038/s41377-020-0289-9
    [30] Huang, Z. Z. & Cao, L. C. Quantitative phase imaging based on holography: trends and new perspectives. Light: Science & Applications 13, 145 (2024). doi: 10.1038/s41377-024-01453-x
    [31] Gao, Y. H. & Cao, L. C. Model-based deep learning enables time-resolved computational microscopy. PhotoniX 7, 3 (2026). doi: 10.1186/s43074-025-00222-2
    [32] Tsvirkun, V. et al. Flexible lensless endoscope with a conformationally invariant multi-core fiber. Optica 6, 1185-1189 (2019). doi: 10.1364/OPTICA.6.001185
    [33] Kuschmierz, R. et al. Ultra-thin 3D lensless fiber endoscopy using diffractive optical elements and deep neural networks. Light: Advanced Manufacturing 2, 415-424 (2021). doi: 10.37188/lam.2021.030
    [34] Wu, J. C. et al. Learned end-to-end high-resolution lensless fiber imaging towards real-time cancer diagnosis. Scientific Reports 12, 18846 (2022). doi: 10.1038/s41598-022-23490-5
    [35] Sun, J. W. et al. Quantitative phase imaging through an ultra-thin lensless fiber endoscope. Light: Science & Applications 11, 204 (2022). doi: 10.1038/s41377-022-00898-2
    [36] Badt, N. & Katz, O. Real-time holographic lensless micro-endoscopy through flexible fibers via fiber bundle distal holography. Nature Communications 13, 6055 (2022). doi: 10.1038/s41467-022-33462-y
    [37] Stephan, R. et al. Bendable fiber lens for minimally invasive endoscopy. Laser & Photonics Reviews 19, 2401757 (2025). doi: 10.1002/lpor.202401757
    [38] Sun, J. W. et al. Lensless fiber endomicroscopy in biomedicine. PhotoniX 5, 18 (2024). doi: 10.1186/s43074-024-00133-8
    [39] Chen, Z. Q. et al. Diffusion-driven lensless fiber endomicroscopic quantitative phase imaging towards digital pathology. Advanced Imaging 2, 041003 (2025). doi: 10.3788/AI.2025.10010
    [40] Dremel, J. et al. Lensless single-shot multicore fiber endomicroscopy using a single multispectral hologram. Light: Advanced Manufacturing 6, 896-903 (2025). doi: 10.37188/lam.2025.027
    [41] Lee, C. Y. & Han, J. H. Integrated spatio-spectral method for efficiently suppressing honeycomb pattern artifact in imaging fiber bundle microscopy. Optics Communications 306, 67-73 (2013). doi: 10.1016/j.optcom.2013.05.045
    [42] Lee, C. Y. & Han, J. H. Elimination of honeycomb patterns in fiber bundle imaging by a superimposition method. Optics Letters 38, 2023-2025 (2013). doi: 10.1364/OL.38.002023
    [43] Shao, J. B. et al. Resolution enhancement for fiber bundle imaging using maximum a posteriori estimation. Optics Letters 43, 1906-1909 (2018). doi: 10.1364/OL.43.001906
    [44] Wang, J. Y. et al. Honeycomb effect elimination in differential phase fiber-bundle-based endoscopy. Optics Express 32, 20682-20694 (2024). doi: 10.1364/OE.526033
    [45] Tsvirkun, V. et al. Widefield lensless endoscopy with a multicore fiber. Optics Letters 41, 4771-4774 (2016). doi: 10.1364/OL.41.004771
    [46] Andresen, E. R. et al. Measurement and compensation of residual group delay in a multi-core fiber for lensless endoscopy. Journal of the Optical Society of America B 32, 1221-1228 (2015). doi: 10.1364/JOSAB.32.001221
    [47] Kim, Y. et al. Semi-random multicore fibre design for adaptive multiphoton endoscopy. Optics Express 26, 3661-3673 (2018). doi: 10.1364/OE.26.003661
    [48] Zhao, J. et al. High-fidelity imaging through multimode fibers via deep learning. Journal of Physics: Photonics 3, 015003 (2021). doi: 10.1088/2515-7647/abcd85
    [49] Resisi, S., Popoff, S. M. & Bromberg, Y. Image transmission through a dynamically perturbed multimode fiber by deep learning. Laser & Photonics Reviews 15, 2000553 (2021). doi: 10.1002/lpor.202000553
    [50] Liu, Z. T. et al. All-fiber high-speed image detection enabled by deep learning. Nature Communications 13, 1433 (2022). doi: 10.1038/s41467-022-29178-8
    [51] Hu, X. W. et al. Unsupervised full-color cellular image reconstruction through disordered optical fiber. Light: Science & Applications 12, 125 (2023). doi: 10.1038/s41377-023-01183-6
    [52] Abdulaziz, A. et al. Robust real-time imaging through flexible multimode fibers. Scientific Reports 13, 11371 (2023). doi: 10.1038/s41598-023-38480-4
    [53] Feng, H. G., Zhu, R. Z. & Xu, F. Feature-enhanced fiber bundle imaging based on light field acquisition. Advanced Imaging 1, 011002 (2024). doi: 10.3788/AI.2024.10002
    [54] Sun, J. W. et al. Calibration-free quantitative phase imaging in multi-core fiber endoscopes using end-to-end deep learning. Optics Letters 49, 342-345 (2024). doi: 10.1364/OL.509772
    [55] Wang, F., Czarske, J. W. & Situ, G. Deep learning for computational imaging: from data-driven to physics-enhanced approaches. Advanced Photonics 7, 054002 (2025).
    [56] Sun, J. W. et al. AI-driven projection tomography with multicore fibre-optic cell rotation. Nature Communications 15, 147 (2024). doi: 10.1038/s41467-023-44280-1
    [57] Wang, T. J. et al. Resolution-enhanced multi-core fiber imaging learned on a digital twin for cancer diagnosis. Neurophotonics 11, S11505 (2024).
    [58] Shao, J. B. et al. Fiber bundle image restoration using deep learning. Optics Letters 44, 1080-1083 (2019). doi: 10.1364/OL.44.001080
    [59] Kim, E. et al. Honeycomb artifact removal using convolutional neural network for fiber bundle imaging. Sensors 23, 333 (2023).
    [60] Chen, J. L., Shang, W. F. & Xu, S. Endoir: A GAN-based method for fiber bundle endoscope image restoration. Optics and Lasers in Engineering 184, 108588 (2025). doi: 10.1016/j.optlaseng.2024.108588
    [61] Renteria, C. et al. Depixelation and enhancement of fiber bundle images by bundle rotation. Applied Optics 59, 536-544 (2020). doi: 10.1364/AO.59.000536
    [62] Chen, L. Y. et al. Simple baselines for image restoration. In Computer Vision–ECCV 2022 (eds Avidan, S. et al.) 17–33 (Springer Nature Switzerland, Cham, 2022).
    [63] LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436-444 (2015). doi: 10.1038/nature14539
    [64] Guay, P. et al. Correcting photodetector nonlinearity in dual-comb interferometry. Optics Express 29, 29165-29174 (2021). doi: 10.1364/OE.435701
    [65] Russakovsky, O. et al. ImageNet large scale visual recognition challenge. International Journal of Computer Vision 115, 211-252 (2015). doi: 10.1007/s11263-015-0816-y
    [66] Wang, K. Q. & Lam, E. Y. Deep learning phase recovery: Data-driven, physics-driven, or a combination of both?. Advanced Photonics Nexus 3, 056006 (2024). doi: 10.1117/1.apn.3.5.056006
    [67] Shen, Y. X. et al. Comparative study of the influence of imaging resolution on linear retardance parameters derived from the Mueller matrix. Biomedical Optics Express 12, 211-225 (2020).
    [68] Pshenay-Severin, E. et al. Multimodal nonlinear endomicroscopic imaging probe using a double-core double-clad fiber and focus-combining micro-optical concept. Light: Science & Applications 10, 207 (2021). doi: 10.1038/s41377-021-00648-w
    [69] Schmidt, K. et al. Chromatic aberration correction employing reinforcement learning. Optics Express 31, 16133-16147 (2023). doi: 10.1364/OE.487045


SGARNet: a deep artifact removal approach for lensless multi-core fiber imaging

  • 1. State Key Laboratory of Precision Measurement Technology and Instruments, Department of Precision Instrument, Tsinghua University, Beijing 100084, China
  • 2. Laboratory of Measurement and Sensor System Technique, Technische Universität Dresden, Dresden 01069, Germany
  • 3. Competence Center BIOLAS, Technische Universität Dresden, Dresden 01069, Germany
  • 4. Institute of Applied Physics, Faculty of Science, Technische Universität Dresden, Dresden 01069, Germany
  • 5. Cluster of Excellence Physics of Life, Technische Universität Dresden, Dresden 01069, Germany
  • Corresponding author:

    Jiachen Wu, wjc2022@mail.tsinghua.edu.cn

    Liangcai Cao, clc@tsinghua.edu.cn

doi: https://doi.org/10.37188/lam.2026.050

Abstract: Multi-core fiber (MCF) imaging is essential for minimally invasive endoscopy in medicine and industrial inspection. However, bulky distal optics increase the probe diameter and invasiveness, risking tissue damage. Applications are further constrained by low spatial resolution and prominent honeycomb artifacts. We present a lensless MCF imaging approach based on the Spectral-Guided Artifact Removal Network (SGARNet). In this framework, a physics-informed prior is embedded in a lightweight SpectralGate module to suppress lattice-frequency artifacts in the feature domain. The experimental results show a 12.12 dB improvement in PSNR and a 0.4064 increase in SSIM, indicating superior performance over previous methods. Robustness and generalizability are confirmed by successful reconstructions across diverse textural complexities and biological tissue samples. These results demonstrate the potential for practical deployment in compact, safer biomedical endoscopes.

Research Summary

Fiber endoscopy: Physics-guided network erases honeycomb artifacts

Lensless multi-core fiber endoscopes can capture images through extremely small openings, but their regularly arranged fiber cores create honeycomb-like artifacts that hide fine structures. Researchers from Tsinghua University and Technische Universität Dresden report a physics-guided deep learning network, SGARNet. This network identifies these honeycomb patterns from their unique spectral features and suppresses them using a learnable frequency-domain mask. By incorporating physical principles into the model, the method mitigates aliasing while preserving important image details. Tests on various textures, resolution targets and biological tissues show that SGARNet achieves better image quality and stability than existing methods, demonstrating potential for high-fidelity biomedical endoscopic imaging.

Introduction
  • Real-time, high-resolution in vivo imaging is essential for biomedical diagnosis and minimally invasive therapy1,2. Such imaging enables microvascular perfusion mapping3 and image-guided interventions such as targeted biopsy4 and stent placement5. Conventional lens-based endoscopes have been widely employed for examining luminal organs; however, their inherent rigidity and the trade-off between flexibility and resolution restrict them from accessing confined regions, such as the deep brain or cardiovascular tissue6-9. To address these limitations, fiber-optic imaging has emerged as an active and rapidly evolving research frontier, with recent advances spanning single-mode fiber10,11, multimode fiber imaging12-22, and multi-core fiber (MCF) imaging23-28. To make imaging systems more compact, lensless architectures have been applied, eliminating distal optics and thereby reducing the size and weight of the system29-31. In particular, lensless MCF imaging enables distal endoscope tips a few hundred micrometers in diameter while maintaining imaging stability against fiber bending and environmental perturbations32-40. Nevertheless, the intrinsic spacing between MCF cores imposes a fundamental sampling limit. While each core effectively integrates the incident light within its numerical aperture (NA) to act as a spatial low-pass filter, the object's spatial frequencies often exceed the maximum resolvable frequency defined by the inter-core spacing. Consequently, this discrete sampling lattice inevitably undersamples the object field, violating the Nyquist–Shannon theorem and producing spectral aliasing that manifests as pronounced honeycomb artifacts.

    Traditional approaches to suppressing MCF honeycomb artifacts employed filtering and interpolation, such as morphological closing combined with Fourier-domain peak correction and Gaussian notch filtering41. These methods are computationally efficient but yield only modest improvements over raw acquisitions. More advanced algorithms exploit multi-frame fusion42, maximum a posteriori (MAP) estimation43, and spatial pixel-shift integration44 to reduce honeycomb artifacts and enhance image resolution simultaneously. These approaches improve image quality but incur significant computational costs. Optical strategies can also mitigate imaging artifacts. By using the MCF as a “fiber lens” placed away from the image plane, honeycomb artifacts can be significantly reduced or eliminated37,45. This approach mitigates sampling-induced artifacts but introduces ghost replicas from higher diffraction orders, which must be suppressed to achieve a large field of view. One effective strategy employs femtosecond laser pulses; however, it requires considerable technical effort to compensate for the inherent inter-core group delay dispersion46. Alternatively, higher-order replicas can be suppressed computationally using physics-informed neural networks or structurally by adopting a randomized core arrangement47.

    In recent years, deep learning has become a powerful framework for fiber-optic imaging37,48-57. GARNN58, HAR-CNN59, and GAN-based methods60 have enabled the end-to-end removal of honeycomb artifacts in MCF imaging, facilitating real-time endoscopic applications. However, deep-learning-based methods still suffer from two key limitations that stem from their methodological design. First, most rely purely on data-driven approaches without integrating physics-informed priors. Their performance therefore depends heavily on the quality and scale of the training datasets, which typically require tens of thousands of samples. Acquiring such paired data with fiber-optic imaging is challenging: existing methods use additional optical paths58 to acquire reference images, and each acquisition requires manual replacement of the imaging target, typically a biological tissue or sample, significantly increasing cost and difficulty. Second, some studies have used synthetic datasets to mitigate real-data shortages59,60. This solution is flawed because synthetic data generation generally overlooks inter-core crosstalk, a critical physical phenomenon in MCF imaging. These two issues make it difficult to ensure the robustness of trained models when they are applied to different scenarios.

    To address these challenges, we propose the Spectral-Guided Artifact Removal Network (SGARNet), which embeds a physics-informed prior into the bottleneck feature spectra through a lightweight mid-layer SpectralGate that is implemented as a learnable notch mask. A degradation model for lensless MCF imaging is first established to link honeycomb artifacts to the core geometry quantitatively. Guided by this model, SpectralGate is introduced to encode the frequency-domain characteristics, and the network is designed without explicit nonlinear activation functions, thereby streamlining the data flow and lowering the computational cost and latency. In addition, we develop a self-registered paired-dataset acquisition strategy for real MCF imaging without additional optical paths. Using only 2,800 paired samples, the network is trained within a supervised learning framework using a joint loss function and evaluated on images of diverse textural complexities. Experimental results demonstrate that SGARNet generalizes across different image complexities, suppresses honeycomb artifacts, and enables accurate full-color image restoration with an end-to-end processing speed of 10 images per second.

    • The honeycomb artifacts produced in MCF image transmission are closely related to the core arrangement at the facets as well as the fiber parameters. The most common packing geometries are square and hexagonal lattices. In this section, a physics model is established to analyze the influence of the core diameter and inter-core spacing on honeycomb artifacts under hexagonal packing.

      In lensless MCF image transmission, each fiber core directly collects coupled incident light within the critical angle. As shown in Fig. 1, the transmitted image can be regarded as a local-weighted average of the input intensity, followed by convolution with the point spread function (PSF) of each fiber, resulting in a low-quality output image with honeycomb artifacts. The PSF of each fiber can be expressed in terms of the coupling efficiency at the fiber facet, and is commonly modeled as a Gaussian distribution61:

      Fig. 1  Lensless MCF degradation mechanism and frequency-domain-guided SGARNet image restoration. MCF transmission process: Fiber cores sample the ground truth (GT) via locally weighted averaging; the sampled field is then convolved with the PSF of each core, yielding a degraded image with honeycomb artifacts. For the proximal facet with a hexagonal arrangement, $ a_1 $ and $ a_2 $ are the basis vectors in the spatial domain, whereas $ b_1 $ and $ b_2 $ are the corresponding basis vectors in the frequency domain; $ f_0 $ denotes the fundamental spatial frequency and $ \theta_0 $ is the angle between the first reciprocal-lattice vector and the $ k_x $-axis.

      $$ \mathrm{PSF}(\vec r) = \exp \left(-\frac{\lvert \vec r \rvert^{2}}{2\sigma^{2}}\right) $$ (1)

      where $ \sigma $ denotes the standard deviation of the Gaussian distribution, and the weighting function at the input facet is expressed as

      $$ G(\vec r_0) = \frac{1}{2\pi \sigma_c^{2}}\,\exp \left(-\frac{\lvert \vec r_0 \rvert^{2}}{2\sigma_c^{2}}\right) $$ (2)

      where $ \vec r_0 $ is the position vector with the origin defined at the center of the distal facet plane perpendicular to the optical axis. $ {\sigma _c} $ is related to the fiber parameters. When the full width at half maximum of the Gaussian function is approximately equal to the core diameter, it can be expressed as

      $$ \sigma _c = \sigma = \frac{d} {2\sqrt {2\ln 2}} $$ (3)

      For MCF with hexagonal packing, the center position of each fiber core can be expressed as

      $$ {\vec r_{m,n}} = m{\vec a_1} + n{\vec a_2} $$ (4)

      where

      $$ \left\{ \begin{array}{l} {\vec a_1} = l(1, 0) \\ {\vec a_2} = l\left( \dfrac{1}{2}, \dfrac{\sqrt{3}}{2}\right) \end{array} \right. $$ (5)

      are basis vectors, determined by the inter-core spacing $ l $. Then the output intensity distribution at the proximal facet can be expressed as

      $$ Y(\vec r) = \sum\limits_{m,n} \left[X(\vec r_0) * G(\vec r_0)\right] \cdot \delta (\vec r_0 - \vec r_{m,n}) * \mathrm{PSF} $$ (6)

      where * denotes convolution, $ X(\vec r_0) $ represents the ideal distribution of image information on the distal facet, and $ G(\vec r_0) $ is the weighting function. Eq. 6 captures the sampling effect introduced by MCF image transmission. The simulated degraded image using the proposed physics model and experimental parameters, including the core diameter $ d = 2 $ µm and inter-core spacing $ l = 3 $ µm, is presented in Fig. 2b. The resulting honeycomb artifacts closely match those observed in the experimentally acquired image shown in Fig. 2c. The influence of the fiber parameters $ l $ and $ d $ on the formation of honeycomb artifacts is then analyzed in the frequency domain.
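      The degradation model in Eq. 6 is straightforward to prototype. Below is a minimal Python sketch (our own illustration, not the authors' released code) that applies the Gaussian weighting of Eq. 2, samples on the hexagonal lattice of Eqs. 4 and 5, and blurs with the per-core PSF of Eq. 1; the 0.25 µm pixel pitch and nearest-pixel core rounding are simplifying assumptions.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def simulate_mcf_degradation(gt, d_um=2.0, l_um=3.0, px_um=0.25):
    """Sketch of Eq. 6: local weighting, hexagonal delta-comb sampling, PSF blur.

    gt: 2-D float array, ground-truth intensity at px_um micrometers per pixel.
    """
    sigma_px = d_um / (2 * np.sqrt(2 * np.log(2))) / px_um  # Eq. 3, in pixels
    weighted = gaussian_filter(gt, sigma_px)                # X * G: per-core average
    h, w = gt.shape
    sampled = np.zeros_like(gt)
    a1 = np.array([l_um, 0.0]) / px_um                      # lattice basis, Eq. 5
    a2 = np.array([l_um / 2, l_um * np.sqrt(3) / 2]) / px_um
    for n in range(int(h / a2[1]) + 1):                     # rows of cores
        for m in range(-(n // 2) - 1, int(w / a1[0]) + 1):  # cores within each row
            x, y = m * a1 + n * a2                          # core center, Eq. 4
            xi, yi = int(round(x)), int(round(y))
            if 0 <= yi < h and 0 <= xi < w:
                sampled[yi, xi] = weighted[yi, xi]          # sampling at r_mn
    return gaussian_filter(sampled, sigma_px)               # convolve with PSF, Eq. 1
```

      With the experimental parameters $ d = 2 $ µm and $ l = 3 $ µm, this sketch qualitatively reproduces the honeycomb pattern of Fig. 2b.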

      Fig. 2  Simulation and analysis of honeycomb artifacts in lensless MCF imaging: a Simulated artifacts under different core diameters $ d $ and inter-core spacings $ l $ in the spatial and frequency domains. b Simulated degraded image. c Experimentally acquired image. d, e Correlation coefficient $ C $ versus $ d $ and $ d/l $. f Azimuthally averaged radial spectra of the ground-truth, experimental, and simulated images, and the hexagonal lattice mask, each normalized individually; the gray dashed line indicates the fundamental frequency $ f_0 $.

      In the frequency domain, the reciprocal-lattice vectors of the hexagonal core arrangement with lattice constant $ l $ are expressed as

      $$ \left\{ \begin{array}{l} \vec b_1 = \dfrac{2\pi}{l}\left(1, - \dfrac{1}{\sqrt 3}\right)\\ \vec b_2 = \dfrac{2\pi}{l}\left(0, \dfrac{2}{\sqrt 3}\right) \end{array} \right. $$ (7)

      The optical transfer function (OTF) of a single fiber is approximated by a Gaussian function,

      $$ H(f) = \exp ( - 2\pi^{2}\sigma^{2}f^{2}) $$ (8)

      where $ \sigma $ is related to the core diameter $ d $ as indicated in Eq. 3.

      The fundamental spatial frequency of the hexagonal lattice is $ f_0 = |\vec b_1|/(2\pi) = 2/(\sqrt{3}\,l) $. Therefore, the inter-core spacing $ l $ primarily determines the spatial frequency of the honeycomb artifacts via its influence on the lattice frequencies. The sampling bandwidth increases as $ l $ decreases; thus, smaller values of $ l $ reduce the visibility of the artifacts. The core diameter $ d $ primarily affects the OTF through $ \sigma $. A larger $ d $ suppresses high spatial frequencies, leading to blurrier artifact edges and lower contrast, hence a less pronounced appearance. Since the dominant honeycomb pattern arises from the first ring of reciprocal-lattice frequencies with magnitude $ f_0 $, substituting $ f_0 $ and Eq. 3 into Eq. 8 yields the following expression for the artifact contrast:

      $$ C \propto \exp \left( - \frac{\pi^{2}}{3\ln 2}\frac{d^{2}}{l^{2}}\right) $$ (9)

      As shown in Fig. 2a, the prominence of honeycomb artifacts is primarily related to the ratio of the core diameter $ d $ to the inter-core spacing $ l $. Given that $ l $ must physically exceed $ d $, a smaller $ d/l $ leads to higher artifact contrast and a more distinct honeycomb pattern, confirming that the artifact contrast can be expressed by Eq. 9. The curves in Fig. 2d, e further illustrate the influence of the fiber parameters on $ C $.
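      As a quick worked example of Eq. 9 (the proportionality constant is omitted, so only relative values are meaningful):

```python
import numpy as np

def artifact_contrast(d, l):
    """Relative contrast of the first-ring honeycomb artifacts, Eq. 9."""
    return np.exp(-(np.pi ** 2 / (3 * np.log(2))) * (d / l) ** 2)

print(artifact_contrast(2.0, 3.0))  # experimental fiber, d/l = 0.67 -> ~0.12
print(artifact_contrast(2.0, 4.0))  # sparser lattice,    d/l = 0.50 -> ~0.31
```

      The sparser lattice shows roughly 2.5-fold higher relative artifact contrast, consistent with the trends in Fig. 2d, e.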

    • Hexagonal core layouts in MCF impose a periodic sampling pattern on the imaging plane, producing pronounced peaks at the reciprocal-lattice frequencies in the Fourier domain (Fig. 2f). To weaken these lattice peaks explicitly while preserving the broadband image content, we propose a physics-guided frequency-domain gate, termed SpectralGate, and integrate it into a lightweight U-Net framework.

      SpectralGate. SpectralGate constructs a soft mask $ M(u,v) $ at the fundamental reciprocal-lattice frequency $ f_0 $ along the six hexagonal orientations $ \theta_m = \theta_0 + m\pi/3 $ $ (m = 0, \cdots, 5) $, and applies a learnable attenuation $ 1-\alpha M(u,v) $ in the Fourier domain. With Gaussian standard deviation $ \sigma>0 $, the mask is

      $$ M(u,v) = \mathrm{Norm}\left\{ \sum_{m=0}^{5} \exp \left(-\frac{u^{2}+v^{2}+f_{0}^{2}}{2\sigma^{2}}\right) \cosh \left[\frac{f_{0}\left(u\cos\theta_{m}+v\sin\theta_{m}\right)}{\sigma^{2}}\right] \right\} $$ (10)

      where $ \mathrm{Norm}(\cdot) $ denotes the normalization function, and $ (u,v) $ represents the normalized spatial frequencies in the fast Fourier transform (FFT) baseband. For a feature map $ X(x,y) $ with a Fourier transform $ \hat{X}(u,v) $, SpectralGate applies

      $$ \hat Y(u,v)=\bigl[1-\alpha M(u,v)\bigr]\hat X(u,v) $$ (11)

      where $ 0\le \alpha \le \alpha_{\text{max}} $. The output is expressed as $ Y={\cal{F}}^{-1}\{\hat{Y}\} $, where $ {\cal{F}}\{\cdot\} $ represents the Fourier transform. This task-adaptive notch formulation attenuates the fundamental lattice peaks that are responsible for honeycomb artifacts while minimally disturbing the non-lattice frequencies.
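      A minimal NumPy sketch of the gate follows; the function names, the peak-to-one scaling for $ \mathrm{Norm}\{\cdot\} $, and the fixed scalar $ \alpha $ are our assumptions (in SGARNet, $ \alpha $ is learnable and the gate acts on bottleneck feature maps rather than images):

```python
import numpy as np

def spectral_gate_mask(h, w, f0, theta0, sigma):
    """Soft hexagonal notch mask M(u, v) of Eq. 10 on the FFT baseband grid."""
    U, V = np.meshgrid(np.fft.fftfreq(w), np.fft.fftfreq(h))  # normalized (u, v)
    M = np.zeros((h, w))
    for m in range(6):  # six lattice orientations theta_m = theta0 + m * pi / 3
        theta = theta0 + m * np.pi / 3
        rho = U * np.cos(theta) + V * np.sin(theta)
        # exp(.) * cosh(.) places a Gaussian bump at +/- f0 along each orientation;
        # for very small sigma, evaluate in log space to avoid cosh overflow
        M += np.exp(-(U ** 2 + V ** 2 + f0 ** 2) / (2 * sigma ** 2)) \
             * np.cosh(f0 * rho / sigma ** 2)
    return M / M.max()  # Norm{.}: assumed scaling of the lattice peaks to 1

def apply_spectral_gate(x, f0, theta0, sigma, alpha=0.9):
    """Eq. 11: Y = F^{-1}{ (1 - alpha * M) * F{X} }."""
    M = spectral_gate_mask(*x.shape, f0=f0, theta0=theta0, sigma=sigma)
    return np.real(np.fft.ifft2((1 - alpha * M) * np.fft.fft2(x)))
```

      Because the mask depends only on the image size and the fiber lattice parameters, it can be precomputed once and cached for a given resolution.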

      SGARNet architecture. We instantiate SpectralGate within a compact U-Net encoder-decoder, named SGARNet. Since SpectralGate explicitly resolves global periodic artifacts in the frequency domain, we deliberately employ a U-Net backbone rather than computationally expensive Transformer architectures to efficiently extract hierarchical multi-scale features. The network is built from NAFBlocks62 that are equipped with simplified channel attention (SCA) and SimpleGate units, which offer high computational efficiency while capturing multi-scale contextual features. SpectralGate is inserted into the U-Net bottleneck layer as shown in Fig. 3, providing a resolution-aware, cacheable mask without increasing model complexity.

      Fig. 3  Architecture of the proposed SGARNet. $ C $ denotes the number of channels, $ H $ and $ h $ denote the heights of the feature maps, $ W $ and $ w $ denote the widths of the feature maps.

      Overview of NAFBlocks. The primary role of nonlinear activation functions in neural networks is to break the limitation of linear superposition and provide the network with the capacity to represent complex mappings63. Conventional pointwise activations (e.g., ReLU and GELU) expand the spectra via self-convolution64, which can reinforce periodic components at the reciprocal-lattice locations and distort the amplitude distributions. In contrast, SimpleGate eliminates explicit pointwise activations and realizes nonlinearity through multiplicative channel interactions. The basic formulation of SimpleGate is expressed as

      $$ Y = X_1 \odot X_2 $$ (12)

      where $ \odot $ denotes element-wise multiplication. This operation corresponds to the convolution of two sub-spectra in the frequency domain:

      $$ {\cal{F}}\{ Y\} = {\cal{F}}\{ {X_1}\} * {\cal{F}}\{ {X_2}\} $$ (13)

      This promotes cross-channel frequency fusion rather than harmonic pile-up at the lattice frequencies, complementing the physics-guided suppression of SpectralGate.
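      For concreteness, SimpleGate reduces to a channel split followed by an element-wise product; a short PyTorch sketch:

```python
import torch

def simple_gate(x: torch.Tensor) -> torch.Tensor:
    """SimpleGate (Eq. 12): nonlinearity from a product, no pointwise activation."""
    x1, x2 = x.chunk(2, dim=1)  # split (B, C, H, W) into two (B, C/2, H, W) halves
    return x1 * x2              # Y = X1 (element-wise) X2; spectra convolve per Eq. 13
```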

      Implementation and loss function. Model training was conducted for 100,000 iterations with a batch size of 8. The optimizer was AdamW with an initial learning rate of $ 1\times10^{-3} $. To balance pixel fidelity and structural consistency, the objective combines a pixel-level term and a VGG-based perceptual term,

      $$ {{{\cal{L}}}_\text{total}} = {{{\cal{L}}}_{\text{PSNR}}} + {\lambda _p} \cdot {{{\cal{L}}}_{\text{percep}}} $$ (14)

      $ {\cal{L}}_{\text{PSNR}} $ represents the pixel-level loss and is defined as follows:

      $$ {{{\cal{L}}}_{{\text{PSNR}}}} = \frac{{10}}{{\ln (10)}} \cdot \ln \left( {\frac{1}{N}\sum\limits_{i = 1}^N {{{\left( {I_{{\text{pred}}}^{(i)} - I_{{\text{gt}}}^{(i)}} \right)}^2}} + \varepsilon } \right) $$ (15)

      where $ I_\text{pred}^{(i)} $ and $ I_\text{gt}^{(i)} $ represent the predicted and GT intensity values at the $ i $-th pixel, respectively. $ N $ is the total number of pixels in an image, that is, $ N = C \times H \times W $ for an image with $ C $ channels, height $ H $, and width $ W $. $ \varepsilon $ is a small positive constant added to ensure numerical stability by avoiding the logarithm of zero. $ {\lambda _p} $ in Eq. 14 is the weight coefficient of the perceptual loss:

      $$ {\cal{L}}_\text{percep} = \sum\limits_{j \in {\cal{J}}} \frac{1}{C_j H_j W_j} \left\| \phi_j (I_\text{pred}) - \phi_j (I_\text{gt}) \right\|_F^2 $$ (16)

      where $ \|\cdot\|_F $ denotes the Frobenius norm, and $ {\phi _j}( \cdot ) $ denotes the feature map extracted from the $ j $-th selected layer of a pre-trained VGG-19 network. $ C_j $, $ H_j $, and $ W_j $ are the number of channels, height, and width of the feature map in layer $ j $, respectively. In our implementation, $ {\cal{J}}= \{\text{relu1}\_\text{1},\; \text{relu2}\_\text{1},\; \text{relu3}\_\text{1},\; \text{relu4}\_\text{1},\; \text{relu5}\_\text{1}\} $, corresponding to the five selected VGG-19 layers used for feature extraction and perceptual loss computation, as shown in Fig. 3. This joint loss function enables the effective suppression of regular honeycomb patterns while preserving the tissue details and global structure.
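      A condensed PyTorch sketch of this objective is shown below. The perceptual weight $ \lambda_p $ and the torchvision feature indices {1, 6, 11, 20, 29} (corresponding to relu1_1 through relu5_1 of VGG-19) are our assumptions for illustration:

```python
import math
import torch
import torch.nn.functional as F
import torchvision

class JointLoss(torch.nn.Module):
    """Sketch of L_total = L_PSNR + lambda_p * L_percep (Eqs. 14-16)."""

    def __init__(self, lambda_p=0.1, eps=1e-8):
        super().__init__()
        vgg = torchvision.models.vgg19(weights="IMAGENET1K_V1").features.eval()
        for p in vgg.parameters():
            p.requires_grad_(False)     # frozen pre-trained feature extractor
        self.vgg = vgg
        self.taps = (1, 6, 11, 20, 29)  # assumed relu1_1 ... relu5_1 indices
        self.lambda_p, self.eps = lambda_p, eps

    def forward(self, pred, gt):
        # Eq. 15: (10 / ln 10) * ln(MSE + eps), a differentiable negative PSNR
        l_psnr = (10.0 / math.log(10.0)) * torch.log(F.mse_loss(pred, gt) + self.eps)
        # Eq. 16: mean squared feature distance accumulated at the selected layers
        l_percep, x, y = 0.0, pred, gt
        for i, layer in enumerate(self.vgg):
            x, y = layer(x), layer(y)
            if i in self.taps:
                l_percep = l_percep + F.mse_loss(x, y)
            if i == self.taps[-1]:
                break
        return l_psnr + self.lambda_p * l_percep  # Eq. 14
```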

    Experiments and results
    • We designed a self-registered paired-dataset acquisition method for lensless MCF imaging, with the experimental setup illustrated in Fig. 4. The GT images from the ImageNet dataset65 were displayed using a projector (DLPDLCR4710EVM-G2) featuring a digital micromirror device with a pixel pitch of 5.4 µm and a resolution of 1,920 × 1,080 pixels. The GT images were optically demagnified by the first objective lens (OBJ1: 50×, NA 0.55) and projected 30 µm away from the distal MCF facet37. The MCF (SUMITA HDIG) had an overall diameter of 350 µm and a length of 1,000 mm, containing approximately 10,000 cores with a core diameter $ d $ of 2 µm and an inter-core spacing $ l $ of 3 µm. The proximal facet was magnified by the second objective lens (OBJ2: 20×, NA 0.42) onto an industrial camera (GS3-U3-123S6C-C), which utilizes a sensor with a 3.45 µm pixel pitch and a resolution of 4,096 × 3,000 pixels.

      Fig. 4  Experimental setup for paired-dataset acquisition. The projector displays the GT; OBJ1 demagnifies the GT onto the distal facet of the MCF; OBJ2 (20× microscope objective) magnifies the proximal facet onto the CMOS sensor.

      Compared with previous methods for obtaining paired datasets, the proposed approach requires no additional optical path for reference images, thereby reducing the system complexity and acquisition cost. Furthermore, it is more reliable than artificially synthesized datasets because it inherently reflects the physical characteristics of real image transmission through MCF.

    • We used an NVIDIA GeForce RTX 3090 GPU with 24 GB memory. With this hardware configuration, training SGARNet required approximately 21 h. For each input image, the raw measurement captured from the MCF was cropped from 4,096 × 3,000 to 368 × 368 with a negligible computational overhead, and the cropped input was then processed by SGARNet to produce an artifact-free restoration. Overall, the end-to-end pipeline reconstructed approximately 10 images per second.

    • First, we evaluated the trained end-to-end network using a validation set consisting of 200 paired images. The peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM), which are two widely used metrics for image restoration and enhancement, were employed for the quantitative assessment. The proposed SGARNet effectively suppressed artifacts, resulting in significant improvements in both PSNR and SSIM, as well as noticeable enhancements in the image contrast and color balance. A summary of the quantitative results for different methods is provided in Table 1.

      Method          PSNR (dB)   SSIM
      Raw             14.15       0.3382
      Filtering       13.97       0.4069
      Interpolation   15.57       0.4461
      HAR-CNN59       18.62       0.4866
      Res-UNet66      25.08       0.6997
      SGARNet         26.27       0.7446

      Table 1.  Average PSNR and SSIM over a 200-image test set for different restoration methods.

    • The gray-level co-occurrence matrix (GLCM) is a fundamental, widely used method for extracting image texture features67. In this study, for each image, we computed the GLCM with an inter-pixel distance $ d=1 $ at four orientations: $ 0 $, $ \pi/4 $, $ \pi/2 $, and $ 3\pi/4 $. We then defined the textural complexity as follows:

      $$ \mathrm{Texcomp} = 0.5 \times \frac{1}{\mathrm{Norm}(\mathrm{ASM}) + \varepsilon} + 0.5 \times \mathrm{Norm}(\mathrm{Con}) $$ (17)

      where ASM denotes the angular second moment of the GLCM, reflecting the uniformity of the gray-level distribution and coarseness of the texture. A small constant $ \varepsilon $ is introduced to avoid division by zero, as in Eq. 15. Con denotes the GLCM contrast, which reflects the image sharpness and depth of the texture grooves.
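      A sketch of this metric using scikit-image is given below; treating $ \mathrm{Norm}(\cdot) $ as min-max normalization over the whole test set is our assumption (the Texcomp values reported in Figs. 5 and 6 suggest a final rescaling to [0, 1] as well):

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

def glcm_stats(img_u8):
    """Mean ASM and contrast of the GLCM at distance 1 and four orientations."""
    glcm = graycomatrix(img_u8, distances=[1],
                        angles=[0, np.pi / 4, np.pi / 2, 3 * np.pi / 4],
                        levels=256, symmetric=True, normed=True)
    return graycoprops(glcm, "ASM").mean(), graycoprops(glcm, "contrast").mean()

def texcomp(images, eps=1e-8):
    """Eq. 17 over a set of uint8 grayscale images."""
    asm, con = map(np.array, zip(*(glcm_stats(im) for im in images)))
    norm = lambda a: (a - a.min()) / (a.max() - a.min() + eps)  # assumed Norm(.)
    return 0.5 / (norm(asm) + eps) + 0.5 * norm(con)
```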

      A comparison of textural complexities using the PSNR and SSIM across different methods on the test dataset is shown in Fig. 5. The proposed SGARNet exhibited consistent superiority in terms of honeycomb artifact removal.

      Fig. 5  PSNR and SSIM according to various textural complexities across the entire test set.

      For images with Texcomp values below 0.5, where the intrinsic contrast is low and the texture distribution is relatively uniform, the proposed SGARNet demonstrated outstanding restoration capability: the PSNR improved from 15.14 dB to 31.66 dB, and the SSIM increased from 0.3894 to 0.8760. A comparison between SGARNet and other deep-learning-based methods is shown in Fig. 6a, with the quantitative metrics annotated at the bottom right of the images.

      Fig. 6  Restoration results for images with Texcomp of a 0.166 and b 0.906. Top row: full 368 × 368 images; bottom row: zoomed-in crops. Columns (left to right): MCF raw images with honeycomb artifacts; HAR-CNN-restored images; Res-UNet-restored images; SGARNet-restored images; GT.

      For images with Texcomp values above 0.5, as shown in Fig. 6b, richer high-frequency details are presented. As revealed by our degradation model, high-frequency information is inevitably lost during MCF transmission owing to discretized sampling. Consequently, single-frame deep-learning methods struggle to achieve faithful recovery of the missing high-frequency content.

      Res-UNet is a widely used neural network for improving image quality. The version of Res-UNet used here employs LeakyReLU pointwise activations, which lead to irregular artifacts in the reconstruction of high-frequency details. This observation is consistent with our earlier analysis that pointwise nonlinear activations are unfavorable for the smooth suppression of regular honeycomb artifacts. In contrast, the proposed SGARNet, which eliminates pointwise activations and is trained with a joint perceptual loss, delivered significant improvements in the quantitative metrics and smoother, higher-fidelity suppression of honeycomb artifacts.

    • SGARNet was trained on a projector-based paired dataset, but realistic imaging conditions rely on reflected or transmitted illumination. To assess the generalization capability of the model in such scenarios, we replaced the projector shown in Fig. 4 with physical objects and conducted MCF imaging under broadband white-LED illumination.

      As shown in Fig. 7a-d, we first validated SGARNet using a standard USAF 1951 resolution test target, which features a pattern with transparent lines and a reflective background. The target includes 10 groups of line pairs with equal line spacing and width (from −2 to +7), each group containing six elements (horizontal and vertical line pairs).

      Fig. 7  Restoration results for real-object images. a Raw images of the USAF 1951 resolution test target acquired by lensless imaging at the distal facet of the MCF. b SGARNet-restored images without honeycomb artifacts. c, d Intensity profiles along dashed lines L1 and L2 (marked by red arrows in a), respectively, where the horizontal axes represent the actual physical dimensions at the MCF distal facet. e Example of experimentally observed biological sample sections. f-i SGARNet-restored images of biological tissue sections. From left to right: longitudinal section of wheat caryopsis, neural tissue, agaric section, transverse section of a woody dicot stem.

      Fig. 7a, b present the raw and SGARNet-restored images of elements 1 to 6 from the first to third groups of the resolution target. The MCF raw images showed pronounced hexagonal honeycomb artifacts, whereas the SGARNet restoration effectively eliminated these artifacts. The line intensity profiles extracted along the white dashed segments marked with red arrows are shown in Fig. 7c, d. These profiles demonstrate that SGARNet clearly resolved the sixth element of the third group of line pairs in the resolution target. Based on the line width of the original resolution target and magnification of the objective lens used in the experiment, the minimum line width in the projected image on the fiber facet was calculated to be 2.1 µm. This indicates that, under SGARNet enhancement, the lensless MCF imaging system can resolve a minimum line width of 2.1 µm, which is consistent with the MCF core diameter of 2 µm and inter-core spacing of 3 µm used in the experiment. These results demonstrate that SGARNet improves the image fidelity and resolution while removing honeycomb artifacts under realistic object-imaging conditions.

      Furthermore, we applied SGARNet to biological tissue sections acquired under the same transmitted white-LED illumination. Fig. 7e shows an example of a neural tissue sample, and Fig. 7f-i present four comparison sets, each including the raw MCF image, the SGARNet-restored image, and the corresponding GT image captured by a commercial optical microscope. These results demonstrate that SGARNet remains robust for real-world color tissue imaging and can be directly transferred from projector-based training to practical biomedical imaging scenarios.

    Conclusion and discussion
    • Lensless MCF-based endoscopy is attractive for computational imaging in biomedicine because it enables high image quality with keyhole access, which is crucial for intra-operative monitoring. However, the periodic core arrangement of the MCF leads to sampling-induced honeycomb artifacts that manifest as pixelation effects in the image. These artifacts degrade the image quality, making it difficult to distinguish real features, so suppressing them is essential for obtaining high-quality images. In addition to optical techniques such as using the MCF as a “fiber lens” placed away from the image plane, these artifacts can be removed digitally. However, conventional methods have low restoration efficiency, and learning-based methods lack physics-informed priors and rely on the quality of the dataset, leading to poor robustness and generalization.

      In conclusion, we have proposed SGARNet, a deep artifact removal approach for lensless MCF imaging that enables real-time, high-quality computational imaging. Compared with existing deep-learning-based methods, SGARNet offers several key innovations: (1) A degradation model was established for MCF image transmission, and the unique spectral characteristics of the honeycomb artifacts were analyzed. (2) Guided by this model, we devised the SpectralGate module, which transforms intermediate feature maps into the frequency domain and selectively attenuates peaks at the reciprocal-lattice frequencies using a learnable frequency mask. (3) A reliable dataset containing real physical information from the imaging process was acquired for training, and the data-acquisition method did not require an additional optical path that bypasses the MCF. (4) As opposed to previous deep-learning approaches that are typically evaluated on a single type of image, our method was validated across images with varying textural complexities, real-world resolution targets, and colored biological tissue samples, ensuring robustness and significant application potential.

      The results showed that SGARNet outperforms existing methods in terms of quantitative metrics and exhibits strong robustness across images with varying textural complexities. Even in challenging scenarios in which high-frequency details risk aliasing with honeycomb artifacts, SGARNet effectively suppresses these artifacts while preserving key image features. Notably, SGARNet significantly enhances the resolution and eliminates honeycomb artifacts, making it highly suitable for real-world applications that demand high image fidelity. Furthermore, the method demonstrates strong generalizability, as shown by its success with both resolution test targets and biological tissue samples. These findings highlight the potential of SGARNet for reliable, real-time and high-quality computational imaging in a variety of biomedical applications.

      Despite the advantages of the proposed method, certain limitations remain to be addressed in future work. First, the current imaging setups utilize either active projection or transmissive LED illumination. Notably, the transmissive mode requires the samples to be transparent. However, clinical minimally invasive surgery typically requires a single MCF to perform simultaneous illumination and imaging in reflection mode. This requirement poses a challenge for future optical system designs that integrate illumination and imaging functions. Potential strategies to overcome the associated back-reflection noise include the use of cross-polarization schemes or double-clad fiber architectures68.

      Fiber bending and deformation are critical practical factors in endoscopic scenarios. From a theoretical perspective, we anticipate that the SpectralGate mask will remain effective in suppressing lattice-related honeycomb artifacts under dynamic stress. This stability arises because the mask relies on a fixed-core arrangement at the proximal facet. Furthermore, the inherent lateral memory effect of the MCF ensures that speckle patterns remain highly correlated, even under bending or twisting. Nevertheless, bending-induced alterations to the fiber transmission characteristics36 could still affect the fidelity of fine-detail restoration, making dynamic stress testing an important avenue for future physical validation.

      Beyond physical constraints, the generalization performance of the network is closely tied to the training data and architecture. For instance, the slight edge shadows in binary targets (Fig. 7b) arise because the network, which was trained primarily on continuous-tone images, misinterpreted PSF-blurred sharp transitions as grayscale structures. In color imaging, because the honeycomb pattern is wavelength independent, a single SGARNet can efficiently process all color channels using shared weights, ensuring structural consistency. To improve color fidelity, future work could explore channel-adaptive modulation and integrate deep-learning-based chromatic aberration correction69, which is particularly important for advanced applications like multi-color fluorescence endoscopy. Owing to the strong frequency-domain guidance provided by SpectralGate, SGARNet is data efficient, and its performance largely saturates beyond 1,000 training pairs. Therefore, we used 2,800 pairs mainly to increase the diversity across textures and imaging modalities, enabling robust generalization with a single model for both monochrome and color cameras.

      In terms of scalability and real-time performance, the SpectralGate mask is intrinsically linked to the specific fiber core arrangement. Consequently, adapting the method to fibers with different core layouts requires mask redesign and network retraining. This process is particularly challenging for fibers with irregular lattices, in which the spectral peaks are less pronounced. At present, retraining takes approximately 21 h, and the inference speed is approximately 10 fps. To bridge this gap for real-time applications, which require speeds exceeding 30 fps and to enable rapid adaptation to new fibers, future work will explore strategies such as multi-GPU parallel training and transfer learning57. Addressing these aspects will enhance the capability of the system for practical applications in minimally invasive endoscopy.

    Acknowledgements
    • This work was supported by the National Natural Science Foundation of China (Grant Nos. W2511066, 62235009 and 62305183), and partially funded by the Deutsche Forschungsgemeinschaft (DFG, Cz 55/61-1).
