We modeled a random phase diffuser as a phase-only mask whose complex transmission coefficient $ {t}_{D}\left(x,y\right) $ is defined by the refractive index difference between air and the diffuser material ($ \mathrm{\Delta }n\approx 0.74 $) and a random height map $ D\left(x,y\right) $ at the diffuser plane, i.e.,
$${t}_{D}\left(x,y\right)=\exp\left(j\dfrac{2\pi \mathrm{\Delta }n}{\lambda }D\left(x,y\right)\right) $$ (1) where $ j=\sqrt{-1} $. The random height map $ D\left(x,y\right) $ is defined as
$$D\left(x,y\right)=W\left(x,y\right)*K\left(\mathrm{\sigma }\right) $$ (2) where $ W\left(x,y\right) $ follows a normal distribution with a mean $ \mathrm{\mu } $ and a standard deviation $ {\mathrm{\sigma }}_{0} $, i.e.
$$ W\left(x,y\right)\sim N\left(\mathrm{\mu },{\mathrm{\sigma }}_{0}^{2}\right)$$ (3) $ K\left(\mathrm{\sigma }\right) $ is a zero-mean Gaussian smoothing kernel with a standard deviation of $ \mathrm{\sigma } $, and ‘ $ \mathrm{*} $ ’ denotes the 2D convolution operation. The phase-autocorrelation function $ {R}_{d}\left(x,y\right) $ of a random phase diffuser is related to the correlation length L as:
$$ {R}_{d}\left(x,y\right)=\mathrm{exp}\left(-\pi \left({x}^{2}+{y}^{2}\right)/{L}^{2}\right) $$ (4) By numerically fitting this Gaussian model to the phase-autocorrelation functions of randomly generated diffusers, we can statistically estimate the correlation length L. In this work, for $ \mathrm{\mu }=25\mathrm{\lambda } $, $ {\mathrm{\sigma }}_{0}=8\lambda $ and $ \mathrm{\sigma }=7\mathrm{\lambda } $, we calculated the average correlation length as $ L\sim 14\mathrm{\lambda } $ based on 2000 randomly generated phase diffusers. We accordingly adjusted the $ \mathrm{\sigma } $ values to generate the corresponding random phase diffusers for the other correlation lengths used in this work.
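The diffuser model of Eqs. (1)-(3) can be sketched numerically. Below is a minimal NumPy/SciPy illustration, not the authors' implementation: the function name `random_diffuser`, the use of `scipy.ndimage.gaussian_filter` to realize the kernel $ K(\sigma) $, and the boundary handling are our assumptions. All lengths are expressed in units of the wavelength, with the 0.4λ grid spacing stated later in the text.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def random_diffuser(n=240, dx=0.4, mu=25.0, sigma0=8.0, sigma=7.0,
                    dn=0.74, seed=None):
    """Complex transmission t_D(x, y) of a random phase diffuser (Eqs. 1-3).

    All lengths (dx, mu, sigma0, sigma) are in units of the wavelength, so the
    phase 2*pi*dn*D/lambda reduces to 2*pi*dn*D with D in wavelength units.
    """
    rng = np.random.default_rng(seed)
    W = rng.normal(mu, sigma0, size=(n, n))        # W ~ N(mu, sigma0^2), Eq. (3)
    D = gaussian_filter(W, sigma=sigma / dx)       # D = W * K(sigma), Eq. (2); sigma in pixels
    return np.exp(1j * 2 * np.pi * dn * D)         # phase-only mask, Eq. (1)
```

Averaging the autocorrelation of many such masks and fitting Eq. (4) to the result would then recover the correlation length L, as described above.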
Free-space propagation in air between the diffractive layers was formulated using the Rayleigh-Sommerfeld diffraction equation. The propagation can be modeled as a shift-invariant linear system with the impulse response:
$$ w\left(x,y,z\right)=\frac{z}{{r}^{2}}\left(\frac{1}{2\mathrm{\pi }r}+\frac{1}{j\mathrm{\lambda }}\right)\mathrm{exp}\left(\frac{j2\mathrm{\pi }r}{\mathrm{\lambda }}\right) $$ (5) where $ r=\sqrt{{x}^{2}+{y}^{2}+{z}^{2}} $ and the refractive index is $ n=1 $ for air. Considering a plane wave incident on a phase-modulated object $ h\left(x,y,z=0\right) $ positioned at $ z=0 $, we formulated the distorted field right after the random phase diffuser located at $ {z}_{0} $ as:
$$ {u}_{0}\left(x,y,{z}_{0}\right)={t}_{D}\left(x,y\right)\cdot \left[h\left(x,y,0\right)*w\left(x,y,{z}_{0}\right)\right] $$ (6) This distorted field serves as the input to the subsequent diffractive layers, which were modeled as thin phase-only elements. Consequently, the transmission coefficient of layer $ m $ located at $ z={z}_{m} $ can be formulated as:
$$ {t}_{m}=\mathrm{exp}\left(j\mathrm{\varphi }\left(x,y,{z}_{m}\right)\right) $$ (7) The optical field $ {u}_{m}\left(x,y,{z}_{m}\right) $ right after the $ {m}^{th} $ diffractive layer at $ z={z}_{m} $ can be written as:
$$ {u}_{m}\left(x,y,{z}_{m}\right)={t}_{m}\left(x,y,{z}_{m}\right)\cdot \left[{u}_{m-1}\left(x,y,{z}_{m-1}\right)\mathrm{*}w\left(x,y,\mathrm{ }\mathrm{\Delta }{z}_{m}\right)\right]$$ (8) where $ \mathrm{\Delta }{z}_{m}={z}_{m}-{z}_{m-1} $ is the axial distance between two successive diffractive layers, which was selected as $ 2.67\mathrm{\lambda } $ throughout this paper. After being modulated by all the $ K $ diffractive layers, the optical field was collected at an output plane which was $ \mathrm{\Delta }{z}_{d}=9.3\mathrm{\lambda } $ away from the last diffractive layer. The intensity of this optical field is used as the raw output of the QPI D2NN:
$$ {I}_{raw}\left(x,y\right)={\left|{u}_{K}*w\left(x,y,\mathrm{\Delta }{z}_{d}\right)\right|}^{2} $$ (9)
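The forward model of Eqs. (5)-(9) can be sketched as an FFT-based convolution cascade. The following is a minimal NumPy illustration under our own assumptions (function names, the circular-convolution approximation of Eq. (5)'s linear convolution, and the $ dx^{2} $ discretization factor are not from the paper); all lengths are in units of the wavelength.

```python
import numpy as np

def rs_propagate(u, z, dx, wavelength=1.0):
    """Free-space propagation (Eq. 5) as an FFT-based convolution with the
    Rayleigh-Sommerfeld impulse response w(x, y, z); n = 1 for air."""
    n = u.shape[0]
    coords = (np.arange(n) - n // 2) * dx
    X, Y = np.meshgrid(coords, coords)
    r = np.sqrt(X**2 + Y**2 + z**2)
    w = (z / r**2) * (1 / (2 * np.pi * r) + 1 / (1j * wavelength)) \
        * np.exp(1j * 2 * np.pi * r / wavelength) * dx**2   # discretized kernel
    # circular convolution via FFT (approximates the linear convolution in Eq. 5)
    return np.fft.ifft2(np.fft.fft2(u) * np.fft.fft2(np.fft.ifftshift(w)))

def forward(h, t_D, phis, z0=53.3, dz=2.67, dz_d=9.3, dx=0.4):
    """Cascade of Eqs. (6)-(9): object -> diffuser -> K phase layers -> intensity."""
    u = t_D * rs_propagate(h, z0, dx)                        # Eq. (6)
    for phi in phis:                                         # K trainable phase layers
        u = np.exp(1j * phi) * rs_propagate(u, dz, dx)       # Eqs. (7)-(8)
    return np.abs(rs_propagate(u, dz_d, dx))**2              # Eq. (9)
```

In a training implementation these operations would be expressed with differentiable tensor ops (e.g., `torch.fft`) so that the layer phases `phis` can be optimized by backpropagation.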
During the training process of QPI D2NNs, we sampled the 2D space with a grid spacing of 0.4$ \mathrm{\lambda } $, which is also the size of each diffractive feature on the diffractive layers. Coherent illumination at a wavelength of $ \lambda \approx 0.75\;\mathrm{m}\mathrm{m} $ was assumed for the diffractive neural networks. As for the physical layout of the QPI D2NN, the input field-of-view (FOV) was set to $ 96\mathrm{\lambda }\times 96\mathrm{\lambda } $, corresponding to 240×240 pixels defining the phase distribution of the input objects. Handwritten digits from the MNIST training dataset were first normalized to the range $ [0, 1] $ and bilinearly interpolated from 28×28 pixels to 14×14 ($ {\varphi }_{target} $). The resulting images were up-sampled to 168×168 pixels using ‘nearest’ mode and then zero-padded to 240×240 pixels ($ {\varphi }_{i} $), matching the size of the input FOV. Stated differently, without loss of generality, we defined an object-free region with a constant transmission coefficient of 1 surrounding the samples of interest. The values of the padded images $ {\varphi }_{i}(x,y) $ defined the input phase, and the amplitude at each pixel was taken as 1. Another parameter ($ \alpha $) was introduced to control the range of the input phase; accordingly, the complex amplitude at the input FOV can be expressed as $ input={e}^{j\alpha \pi {\varphi }_{i}} $ with a size of 240×240 pixels, and the target (ground truth) output intensity is $ {I}_{target}=\alpha \pi {\varphi }_{target} $ with a size of 14×14 pixels.
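The preprocessing pipeline above can be sketched as follows. This is a minimal NumPy illustration; the function name is hypothetical, and the 2×2 block averaging used as a stand-in for bilinear 28→14 interpolation (the two coincide for an integer factor of 2 without corner alignment) is our assumption.

```python
import numpy as np

def preprocess_digit(img28, alpha=1.0):
    """Map a 28x28 MNIST digit (uint8, [0, 255] assumed) to the 240x240 complex
    input field and the 14x14 ground-truth phase target."""
    x = img28.astype(float) / 255.0                      # normalize to [0, 1]
    # bilinear 28 -> 14 downsampling (2x2 block averaging for an integer factor)
    phi_target = x.reshape(14, 2, 14, 2).mean(axis=(1, 3))
    phi = np.kron(phi_target, np.ones((12, 12)))         # 'nearest' upsample to 168x168
    phi_i = np.pad(phi, 36)                              # zero-pad to 240x240 (object-free border)
    field = np.exp(1j * alpha * np.pi * phi_i)           # unit amplitude, phase-encoded input
    I_target = alpha * np.pi * phi_target                # ground truth in radians
    return field, I_target
```

The zero-padded border has $ \varphi_i = 0 $, i.e., a constant transmission coefficient of 1, matching the object-free region described above.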
The physical size of each diffractive layer was also set to $ 96\lambda \times 96\lambda $, i.e., each diffractive layer contained 240×240 trainable diffractive features, which only modulated the phase of the incident light field. The axial distances between the input phase object and the random diffuser, the diffuser and the first diffractive layer, two successive diffractive layers, and the last diffractive layer and the output plane were set to $ 53.3\lambda , 2.67\lambda , 2.67\lambda $ and $ 9.3\lambda $, respectively. The size of the signal area at the output plane, including the reference region, was set to $ 69.6\lambda \times 69.6\lambda $ (174×174 pixels), in which we cropped the central $ 67.2\lambda \times 67.2\lambda $ (168×168 pixels) region as the QPI signal area, while the surrounding 3-pixel-wide edge region (along both the $ x $ and $ y $ axes) was set as the reference region. According to our forward model, the QPI signal $ {I}_{QPI}(x,y) $ can be written as:
$$ {I}_{QPI}\left(x,y\right)=\frac{{I}_{raw}\left(x,y\right)}{Ref} $$ (10) where $ Ref $ is the mean background intensity within the reference region at the output plane, and $ {I}_{QPI}(x,y) $ represents the quantitative phase image in radians. We further cropped the central 168×168 pixels of $ {I}_{QPI} $ and binned every 12×12 block of pixels into one pixel by averaging, such that the final $ {I}_{QPI} $ had a size of 14×14 pixels, representing the input object phase in radians.
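Eq. (10) together with the cropping and binning step can be sketched as below. This is a NumPy illustration; the exact indexing of the 174×174 signal-plus-reference area within the 240×240 output plane is our assumption, inferred from the sizes given above.

```python
import numpy as np

def qpi_signal(I_raw):
    """Eq. (10): normalize the raw output by the mean intensity of the
    3-pixel-wide reference frame around the central 168x168 signal area,
    then average-bin 12x12 blocks down to 14x14."""
    c = I_raw.shape[0] // 2
    outer = I_raw[c - 87:c + 87, c - 87:c + 87]   # 174x174 signal + reference area
    inner = outer[3:-3, 3:-3]                     # central 168x168 QPI signal area
    mask = np.ones_like(outer, dtype=bool)
    mask[3:-3, 3:-3] = False                      # 3-pixel-wide reference frame
    ref = outer[mask].mean()                      # Ref in Eq. (10)
    I_qpi = inner / ref                           # phase image in radians
    return I_qpi.reshape(14, 12, 14, 12).mean(axis=(1, 3))  # 12x12 binning
```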
During the training, $ n $ uniquely different phase diffusers were randomly generated at each epoch. In each training iteration, a batch of $ B=10 $ different objects from the MNIST handwritten digit dataset was sampled randomly; each input object in a batch was numerically duplicated $ n $ times and separately perturbed by a set of $ n $ randomly selected diffusers. Therefore, $ B\times n $ different optical fields were obtained, and these distorted fields were individually forward-propagated through the same state of the diffractive network. As a result, we obtained $ B\times n $ different normalized intensity patterns at the output plane ($ {I}_{QPI\_1},\dots ,{I}_{QPI\_Bn} $), which were used to calculate the mean square error (MSE)-based training loss function:
$$ Loss=\frac{\dfrac{1}{{N}_{QPI}}{\displaystyle\sum }_{i=1}^{Bn}\displaystyle\sum _{x,y}{\left|{I}_{target}\left(x,y\right)-{I}_{QPI\_i}\left(x,y\right)\right|}^{2}}{Bn} $$ (11) where $ {N}_{QPI}=14\times 14 $.
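Eq. (11) can be sketched directly; the following NumPy snippet (function name assumed) averages the per-pixel squared error over $ N_{QPI} $ pixels and over the $ B\times n $ diffuser-perturbed outputs.

```python
import numpy as np

def qpi_loss(I_target, I_qpi_batch):
    """Eq. (11): MSE between the 14x14 target and each of the B*n
    diffuser-perturbed outputs, averaged over pixels and over the batch."""
    I_qpi_batch = np.asarray(I_qpi_batch)                 # shape (B*n, 14, 14)
    per_sample = ((I_target - I_qpi_batch) ** 2).sum(axis=(1, 2)) / I_target.size
    return per_sample.mean()                              # divide by B*n
```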
Pearson Correlation Coefficient (PCC) was used to evaluate the linear correlation between the output QPI image $ {I}_{QPI}(x,y) $ and the target $ {I}_{target}(x,y) $, which can be expressed as:
$$PCC=\frac{\displaystyle\sum \left({I}_{QPI}\left(x,y\right)-\overline{{I}_{QPI}}\right)\cdot \left({I}_{target}\left(x,y\right)-\overline{{I}_{target}}\right)}{\sqrt{\displaystyle\sum {\left({I}_{QPI}\left(x,y\right)-\overline{{I}_{QPI}}\right)}^{2}\cdot \displaystyle\sum {\left({I}_{target}\left(x,y\right)-\overline{{I}_{target}}\right)}^{2}}} $$ (12) We also calculated the absolute phase error to assess the phase recovery performance of a QPI D2NN:
$$phase\;error=\frac{1}{{N}_{QPI}}\sum _{x,y}\left|{I}_{target}\left(x,y\right)-{I}_{QPI}\left(x,y\right)\right| $$ (13) while the percent phase error is:
$$ phase\;error\;\left(\%\right)=\frac{1}{{N}_{QPI}}\sum _{x,y}\frac{\left|{I}_{target}\left(x,y\right)-{I}_{QPI}\left(x,y\right)\right|}{{I}_{target}\left(x,y\right)} $$ (14) In analyzing the impact of reduced input phase contrast on the QPI performance, we trained a QPI D2NN using the MNIST dataset and tested it with binary gratings and handwritten digits; the MNIST samples were binarized using a threshold of 0.5 during the testing stage. In exploring the tradeoff between the QPI performance and the output diffraction efficiency, we calculated the power-efficiency $ E\left({I}_{raw}\right) $ of the QPI D2NN as:
$$ E\left({I}_{raw}\right)=\frac{\displaystyle\sum {I}_{raw}\left(x,y\right)}{\displaystyle\sum {\left|input\left(x,y\right)\right|}^{2}}=\frac{\displaystyle\sum {I}_{raw}\left(x,y\right)}{{240}^{2}} $$ (15) where the denominator equals $ {240}^{2} $ since the input amplitude is 1 at every pixel; the corresponding diffraction efficiency penalty was calculated as follows:
$$ Los{s}_{eff}\left({I}_{raw}\right)=\mathrm{max}\left\{0,{E}_{target}-E\left({I}_{raw}\right)\right\}$$ (16) where $ {E}_{target} $ was the target power-efficiency, which varied from 0 to 0.03 for the models presented in Fig. 8; the diffractive model presented in Fig. 2 was trained without any diffraction efficiency penalty. Based on these definitions, the total loss function that included the power-efficiency penalty can be rewritten as:
$$ \begin{split}Loss=&\frac{\dfrac{1}{{N}_{QPI}}{\displaystyle\sum }_{i=1}^{Bn}\displaystyle\sum _{x,y}{\left|{I}_{target}\left(x,y\right)-{I}_{QPI\_i}\left(x,y\right)\right|}^{2}}{Bn}+\\&\frac{{\displaystyle\sum }_{i=1}^{Bn}\mathrm{max}\left\{0,{E}_{target}-E\left({I}_{raw\_i}\right)\right\}}{Bn}\end{split} $$ (17)
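The total loss of Eq. (17), combining the QPI MSE term with the efficiency penalty of Eqs. (15)-(16), can be sketched as below (a NumPy illustration; the function name and argument layout are our assumptions).

```python
import numpy as np

def total_loss(I_target, I_qpi_batch, I_raw_batch, E_target=0.03, n_in=240**2):
    """Eq. (17): QPI MSE term plus the diffraction-efficiency penalty of
    Eqs. (15)-(16), both averaged over the B*n diffuser-perturbed fields."""
    I_qpi_batch = np.asarray(I_qpi_batch)                  # (B*n, 14, 14)
    I_raw_batch = np.asarray(I_raw_batch)                  # (B*n, 240, 240)
    mse = (((I_target - I_qpi_batch) ** 2).sum(axis=(1, 2))
           / I_target.size).mean()                         # first term of Eq. (17)
    E = I_raw_batch.sum(axis=(1, 2)) / n_in                # Eq. (15), unit-amplitude input
    penalty = np.maximum(0.0, E_target - E).mean()         # Eq. (16), averaged over B*n
    return mse + penalty
```

Setting `E_target=0` recovers the penalty-free training used for the model in Fig. 2.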
To assess the effectiveness of the vaccination strategy against the negative impact of potential fabrication inaccuracies and mechanical misalignments, we trained a QPI D2NN while intentionally introducing random displacements during the training stage. Specifically, a random lateral displacement $ \left({D}_{x},{D}_{y}\right) $ was added to the diffractive layer positions, where $ {D}_{x} $ and $ {D}_{y} $ were randomly and independently sampled, i.e.,
$$ \begin{array}{c}{D}_{x}\sim\text{U}\left(-0.2\;\mathrm{m}\mathrm{m},\;0.2\;\mathrm{m}\mathrm{m}\right),\;{D}_{y}\sim\text{U}\left(-0.2\;\mathrm{m}\mathrm{m},\;0.2\;\mathrm{m}\mathrm{m}\right)\end{array} $$ (18) where $ {D}_{x} $ and $ {D}_{y} $ are not necessarily equal to each other in each misalignment step. Additionally, a random axial displacement $ {D}_{z} $ was also added to the axial distance between any two successive planes after the diffuser, including the distances between the diffuser and the first diffractive layer, between two successive diffractive layers, and from the last diffractive layer to the output plane. $ {D}_{z} $ was also randomly sampled:
$$ \begin{array}{c}{D}_{z}\sim\text{U}\left(-0.375\;\mathrm{m}\mathrm{m},\;0.375\;\mathrm{m}\mathrm{m}\right)\end{array} $$ (19)
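The displacement sampling of Eqs. (18)-(19) can be sketched as follows (a NumPy illustration; the function name is hypothetical, and displacements are in mm, with 0.2 mm ≈ 0.27λ at λ ≈ 0.75 mm).

```python
import numpy as np

def sample_displacements(rng, d_xy=0.2, d_z=0.375):
    """Eqs. (18)-(19): independent uniform lateral (Dx, Dy) and axial (Dz)
    displacements, in mm, applied to the diffractive layers at each
    'vaccinated' training step."""
    Dx = rng.uniform(-d_xy, d_xy)     # Eq. (18), lateral x
    Dy = rng.uniform(-d_xy, d_xy)     # Eq. (18), lateral y (independent of Dx)
    Dz = rng.uniform(-d_z, d_z)       # Eq. (19), axial
    return Dx, Dy, Dz
```

A fresh triplet would be drawn at every iteration so that the trained model becomes robust to misalignments anywhere within these ranges.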
The QPI diffractive neural networks were trained using Python (v3.6.13) and PyTorch (v1.11, Meta AI) with a GeForce GTX 1080 Ti graphics processing unit (GPU, Nvidia Corp.), an Intel® Core™ i7-7700K central processing unit (CPU, Intel Corp.) and 64 GB of RAM, running the Windows 10 operating system (Microsoft Corp.). The calculated loss values were backpropagated to update the diffractive layer transmission values using the Adam optimizer [55] with a decaying learning rate of $ {0.99}^{epoch}\times {10}^{-3} $, where $ epoch $ refers to the current epoch number. Training a typical QPI D2NN model takes ~72 h to complete with 200 epochs and $ n=20 $ diffusers per epoch.
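The learning-rate schedule above is a simple exponential decay; a minimal sketch (function name assumed) is:

```python
def learning_rate(epoch, base=1e-3, gamma=0.99):
    """Decaying learning rate 0.99**epoch * 1e-3 used with the Adam optimizer.
    In PyTorch this corresponds to torch.optim.lr_scheduler.ExponentialLR
    with gamma=0.99 applied once per epoch."""
    return gamma ** epoch * base
```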
Quantitative phase imaging (QPI) through random diffusers using a diffractive optical network
- Light: Advanced Manufacturing 4, Article number: (2023)
- Received: 19 January 2023
- Revised: 17 June 2023
- Accepted: 20 June 2023
- Published online: 22 July 2023
doi: https://doi.org/10.37188/lam.2023.017
Abstract: Quantitative phase imaging (QPI) is a label-free computational imaging technique used in various fields, including biology and medical research. Modern QPI systems typically rely on digital processing using iterative algorithms for phase retrieval and image reconstruction. Here, we report a diffractive optical network trained to convert the phase information of input objects positioned behind random diffusers into intensity variations at the output plane, all-optically performing phase recovery and quantitative imaging of phase objects completely hidden by unknown, random phase diffusers. This QPI diffractive network is composed of successive diffractive layers, axially spanning in total ~70λ, where λ is the illumination wavelength; unlike existing digital image reconstruction and phase retrieval methods, it forms an all-optical processor that does not require external power beyond the illumination beam to complete its QPI reconstruction at the speed of light propagation. This all-optical diffractive processor can provide a low-power, high frame rate and compact alternative for quantitative imaging of phase objects through random, unknown diffusers and can operate at different parts of the electromagnetic spectrum for various applications in biomedical imaging and sensing. The presented QPI diffractive designs can be integrated onto the active area of standard CCD/CMOS-based image sensors to convert an existing optical microscope into a diffractive QPI microscope, performing phase recovery and image reconstruction on a chip through light diffraction within passive structured layers.
Research Summary
All-optical quantitative phase imaging through random diffusers using diffractive networks
Quantitative phase imaging (QPI) is a label-free computational technique frequently used for imaging cells and tissue samples. Modern QPI systems heavily rely on digital processing and face challenges when diffusive media obstruct the optical path. A team led by Aydogan Ozcan at the University of California, Los Angeles (UCLA), reported a new approach to perform QPI through random unknown phase diffusers using a diffractive optical network. This diffractive network, optimized through deep learning, consists of a set of spatially-engineered diffractive surfaces designed to transform the phase information of input samples positioned behind random diffusers into intensity variations that quantitatively represent the object’s phase information at the output. The team anticipates the potential integration of QPI diffractive networks onto the active area of image sensor-arrays, converting an existing optical microscope into a diffractive QPI microscope that performs all-optical phase recovery and image reconstruction on a chip.