Deep learning optimized single-pixel LiDAR
Cite as: Appl. Phys. Lett. 115, 231101 (2019); doi: 10.1063/1.5128621
Submitted: 20 September 2019; Accepted: 29 October 2019; Published Online: 2 December 2019
Neal Radwell,1 Steven D. Johnson,1,a) Matthew P. Edgar,1 Catherine F. Higham,2 Roderick Murray-Smith,2 and Miles J. Padgett1,b)
AFFILIATIONS
1SUPA, School of Physics and Astronomy, University of Glasgow, Glasgow G12 8QQ, United Kingdom
2School of Computing Science, University of Glasgow, Glasgow G12 8QQ, United Kingdom
a)Electronic mail: steven.johnson@glasgow.ac.uk
b)Electronic mail: miles.padgett@glasgow.ac.uk
ABSTRACT
Interest in autonomous transport has led to a demand for 3D imaging technologies capable of resolving fine details at long range. Light
detection and ranging (LiDAR) systems have become a key technology in this area, with depth information typically gained through time-of-
flight photon-counting measurements of a scanned laser spot. Single-pixel imaging methods offer an alternative approach to spot-scanning,
which allows a choice of sampling basis. In this work, we present a prototype LiDAR system, which compressively samples the scene using a
deep learning optimized sampling basis and reconstruction algorithms. We demonstrate that this approach improves scene reconstruction
quality compared to an orthogonal sampling method, with reflectivity and depth accuracy improvements of 57% and 16%, respectively, for
one frame per second acquisition rates. This method may pave the way for improved scan-free LiDAR systems for driverless cars and for
fully optimized sampling to decision-making pipelines.
© 2019 Author(s). All article content, except where otherwise noted, is licensed under a Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/). https://doi.org/10.1063/1.5128621
Imaging our surroundings in 3D has become a key challenge
across a range of sectors, including gaming, robotics, health-care, secu-
rity, manufacturing, and automotive industries. Mature technologies
such as radar and ultrasound sensing are effective at long and short
ranges, respectively, but are unable to provide the spatially detailed
maps necessary for many applications within the 10–100 m range. An
attractive solution for these intermediate ranges is light detection and
ranging (LiDAR), capable of providing millimetric depth precision at
ranges of around 100 m with a good spatial resolution.1
LiDAR is capable of recovering depth information using direct time-of-flight measurement,2 frequency modulation continuous wave (FMCW),3 or amplitude modulation continuous wave (AMCW)4 approaches. The FMCW method ramps the modulation frequency
and measures the difference between the outgoing and return frequen-
cies, and AMCW correlates the outgoing and return intensities to
determine the distance to a target. Both these methods require a significant return signal, and therefore, high laser powers are required and the techniques cannot be used at long range. By contrast, time-of-flight
methods measure the return delay of a short pulse and can be sensitive
at the single photon level, with a temporal resolution in the tens of
picoseconds. Time-of-flight LiDAR methods have therefore become
the state-of-the-art for long-range, high-precision depth mapping.
To obtain millimetric precision for LiDAR, subnanosecond temporal resolutions are required, ruling out the use of conventional cameras. In recent years, there has been significant development of single-photon avalanche diode (SPAD) array cameras, which combine multiple SPADs and their timing electronics on-chip.5 SPAD array technology allows scan-free LiDAR with longer integration times and a consequently improved signal-to-noise ratio (SNR). However, at present, SPAD array technology suffers from high dark counts and readout noise, while also placing stringent demands on readout clock rates and data transfer rates, both of which can limit its performance and application readiness. Typical LiDAR systems are therefore currently based on spot-scanning techniques, requiring mechanical scanning systems whose reconstruction resolution and acquisition rates are limited by scan speed.6
Single-pixel imaging (SPI) techniques are an alternative imaging modality for recovering spatial information.7,8 SPI uses a single bucket detector combined with a spatial modulator, most commonly a digital micromirror device (DMD), to measure the spatial overlap between the scene and the spatial patterns of an imaging basis. Simple orthogonal imaging bases such as the Hadamard basis9 are able to reconstruct an N-pixel image from N sequential measurements, with the acquisition time traditionally limited by modulator speed.
However, we predict that in the case of single-photon counting
(Geiger-mode) LiDAR measurements, the maximum count rate in
photons per second (due to technical constraints of current single-
photon detectors) becomes the limiting factor.
To explore whether the photon measurement rate is the limiting factor in the signal-to-noise ratio (SNR), we make an order-of-magnitude estimate of the number of photons required to produce a recognizable image. If we consider a scene with an equal fraction of light and dark, and as each mask in our sampling patterns has 50% of its pixels "on," the average overlap between the reflected signal and our pattern will be a quarter of the maximum possible signal for a fully reflecting scene. There will be an average measured signal for all patterns, with fluctuations around this average level due to the varying back-scattered signals as the patterns of the basis are cycled. The information in SPI is contained within these fluctuations. If we say that the average signal level is N/4, then the degree of overlap between the pattern and the object gives fluctuations of order √(N/4), i.e., a fractional fluctuation of 1/√(N/4) relative to the mean signal. If each of these measurements is based upon P photons, the fractional shot noise is 1/√P, and we require that the shot noise is not greater than the information-carrying fluctuations. We can therefore state that the minimum number of photons required per pattern measurement satisfies √P > √(N/4). This implies that for an arbitrary scene, we will need P ∼ N. For a fully sampled reconstruction of the image, we require N patterns and therefore a total of ∼N² photons per reconstruction. In a conventional scanning LiDAR or in a focal plane array, N photons are needed.
With our system, a 64 × 64 image will require N = 4096 photons per pattern, which will take 400 μs to acquire at 10⁷ photons/s (a typical single-photon measurement rate of our commercially available detector), corresponding to a pattern rate of only 2.5 kHz and hence an image acquisition time of 1.6 s. Therefore, in this low-photon-number regime, the pattern modulation rate (typically tens of kilohertz) is no longer the limiting factor; rather, maximizing the number of photons per pattern becomes essential. By reducing the number of patterns needed, the speed of acquisition can be increased while still producing an acceptable SNR.
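As a quick sanity check, the figures above follow from the detector count rate alone; a minimal sketch (values as quoted in the text, rounded):

import numpy as np  # not strictly needed; plain arithmetic suffices

# Rough photon-budget check (detector rate is the ~10^7 photons/s quoted above).
resolution = 64 * 64              # N pixels
photons_per_pattern = resolution  # P ~ N from the shot-noise argument
count_rate = 1e7                  # photons per second

time_per_pattern = photons_per_pattern / count_rate   # ~4.1e-4 s (~400 us)
pattern_rate = 1.0 / time_per_pattern                 # ~2.4 kHz
acquisition_time = resolution * time_per_pattern      # ~1.7 s for N patterns

print(f"{time_per_pattern * 1e6:.0f} us per pattern, "
      f"{pattern_rate / 1e3:.1f} kHz pattern rate, "
      f"{acquisition_time:.1f} s per fully sampled image")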
Unlike scanning systems, the freedom to choose the sampling
basis in SPI provides the opportunity to use compressed sensing tech-
niques, where a high-quality N-pixel image can be reconstructed with fewer than N pattern projections and measurements.10 The most common compressed sensing approaches use sampling bases that are maximally incoherent with the object basis, and use convex optimization techniques with a strong prior, such as total curvature minimization, to recover the missing information. While effective, these techniques are computationally expensive and require highly sparse scenes, making them problematic for real-time (less than one second per frame) and therefore low-resolution applications. Several other approaches have been proposed for compressed sensing that retain fast reconstruction speeds, including evolutionary compressed sensing,11 Russian-dolls ordering,12 terahertz imaging,13 and cake-cutting ordering,14 typically achieving compression ratios of around 25%. More recently, a method of compressed sensing has been demonstrated using an optimized imaging basis and reconstruction algorithm derived from a trained convolutional neural network.15 This deep learning (DL) approach achieves a 4% compression ratio.
LiDAR systems have previously incorporated SPI techniques16,17 and photon counting.18,19 Here, we present a LiDAR system that uses SPI techniques with DL compressed sensing together with single-photon-sensitive detection. Our single-pixel 3D imaging system is shown in Fig. 1. The illumination laser has a pulse length of 120 fs, a repetition rate of 80 MHz, and a center wavelength of 780 nm (frequency-doubled Toptica FemtoFErb 1560). The laser illumination is expanded uniformly with a light integration tube, the output from which is reimaged onto a DMD. The high-speed DMD (Vialux model V-7001) provides time-varying structured illumination and consists of an array of 1024 × 768 micromirrors, each approximately 10 μm in size, capable of pattern switching rates of over 20 kHz. Each micromirror is actuated to direct light either to the scene or to an internal beam stop, allowing binary amplitude modulation of the projected patterns. The DMD mirrors are grouped into larger "superpixels" to convert the 1024 × 768 micromirrors into the more typical 32 × 32, 64 × 64, or 128 × 128 resolutions for SPI. The DMD plane is imaged onto the scene using a standard high-quality camera lens, chosen to provide the appropriate field of view for the chosen range. A fast-response Geiger-mode photomultiplier tube (Horiba PPD-900) detects back-scattered photons from the scene, which are spectrally filtered (Semrock LL01-780-12.5 and Thorlabs FB780-10 in series) to greatly reduce spurious photon events from background thermal light. The collection optics comprise a second camera lens, with a variable aperture, to match the field of view of the detector to the structured illumination. A time-correlated single-photon counting (TCSPC) system is an efficient triggering device that can resolve single-photon detection times with a jitter of a few tens of picoseconds. Our TCSPC electronics (a customized Horiba DeltaHub) record the delay time for each detected photon "event" relative to the synchronization signal provided by the illuminating laser.
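As an illustrative aside (a standard time-of-flight conversion, not a figure quoted in the text), each 25 ps timing bin corresponds to a two-way depth increment of

\Delta z = \frac{c\,\Delta t}{2} = \frac{(3\times10^{8}\ \mathrm{m\,s^{-1}})(25\ \mathrm{ps})}{2} \approx 3.7\ \mathrm{mm},

so the full 512-bin histogram spans roughly 1.9 m of depth.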
For each illumination pattern, the time-of-flight of many photon
events is collected into a single histogram before being transferred to a
laptop computer for image reconstruction. The TCSPC electronics
enable continuous streaming of up to 20 000 histograms per second,
each with 512 time bins of 25 ps width, enabling real-time image construction. In practice, however, using all 512 time bins can result in reconstruction lagging behind acquisition, as each time bin has to be reconstructed individually. Therefore, in order to maintain real-time reconstruction, we group 8 time bins together, such that we have 64 distinct time bins, allowing 3D reconstruction within an acquisition time of 0.5 s. This pipeline bottleneck could be alleviated by computing the frames simultaneously with a graphics processing unit.

FIG. 1. LiDAR prototype. (a) A pulsed laser uniformly illuminates a digital micromirror device, used to provide structured illumination onto a scene, and the back-scattered light is collected onto a photomultiplier tube. The TCSPC electronics record photon arrival events and collect the delay times into histograms. The DMD triggers the histograms to be sent to the computer, where they are reconstructed into 3D point clouds. (b) Photograph of the enclosed LiDAR demonstrator unit.
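A minimal sketch of this rebinning step, assuming the raw data arrive as one 512-bin histogram per displayed pattern (array shapes and names are illustrative, not taken from the authors' code):

import numpy as np

# One 512-bin TCSPC histogram per displayed DMD pattern (illustrative data).
num_patterns, fine_bins, group = 666, 512, 8
histograms = np.random.poisson(2.0, size=(num_patterns, fine_bins))

# Sum groups of 8 adjacent 25 ps bins into 64 coarse 200 ps bins so that
# every depth plane can be reconstructed within the 0.5 s acquisition time.
coarse = histograms.reshape(num_patterns, fine_bins // group, group).sum(axis=2)
print(coarse.shape)  # (666, 64)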
SPI sequentially illuminates the scene with patterned light, each pattern forming part of a sampling basis; many SPI sampling bases are possible.9,20 In recent work,15 deep learning, specifically a multilayer neural network called an autoencoder, was used to produce both an optimized, highly compressed measurement basis and an algorithm capable of recovering real-time high-resolution images. In order for the measurement basis to be suitable for a general scene, a large image recognition dataset, the STL-10 dataset (100 000 images comprising 10 classes: airplane, bird, car, cat, deer, dog, horse, monkey, ship, and truck, with natural backgrounds),21 was used for training. The multilayered autoencoder used a series of filters, known as layers, to encode the training images and decode the image from the encoded measurements. This deep learning method was used to derive the optimal binary basis for this training dataset from the encoding layers of the neural network. A multilayered mapping algorithm to reconstruct an image from a series of spatially encoded measurements was developed from the decoding layers of the neural network. In this 3D reconstruction work, we use this general-case measurement basis but adapt the reconstruction algorithm to recover 3D scene information rather than 2D images.
The basis set consists of 666 patterns of 128 × 128 pixels (i.e., a 4% compression ratio), which were calculated using deep learning, specifically a deep convolutional neural network. We generalize the method from 2D scene reconstruction to 3D depth mapping by performing a reconstruction of every depth plane (time bin), and though this is somewhat outside of the scope of the 2D image DL training, we will show that the system is robust enough to provide impressive 3D results.
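The measurement basis and decoder come from the convolutional autoencoder of Ref. 15; the fragment below is only a structural sketch, in PyTorch with illustrative layer sizes, of how such a network can yield a binary basis from its encoding layer and a reconstruction mapping from its decoding layers (the published architecture and training-time binarization differ):

import torch
import torch.nn as nn

M, N = 666, 128 * 128          # measurements and pixels (~4% compression)

class SinglePixelAutoencoder(nn.Module):
    """Toy autoencoder: encoder weights play the role of illumination
    patterns; the decoder maps M measurements back to an N-pixel image."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Linear(N, M, bias=False)   # learned patterns
        self.decoder = nn.Sequential(                # learned reconstruction
            nn.Linear(M, 4096), nn.ReLU(),
            nn.Linear(4096, N),
        )

    def binary_patterns(self):
        # After training, threshold the encoder weights to the +/-1 patterns
        # that are displayed on the DMD as pattern/inverse pairs.
        w = self.encoder.weight.detach()
        return torch.where(w >= 0, torch.ones_like(w), -torch.ones_like(w))

    def forward(self, image_flat):
        return self.decoder(self.encoder(image_flat))

model = SinglePixelAutoencoder()
patterns = model.binary_patterns()     # (666, 16384) of +/-1 values
recon = model(torch.rand(1, N))        # (1, 16384) reconstructed image

For the 3D case described in the text, the same learned decoder would simply be applied to the 666-element measurement vector of every coarse time bin.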
We test our DL basis against a conventional fully sampled orthogonal imaging basis, namely, the Hadamard basis. To avoid patterns producing abnormally high, or low, return signals, we randomly permute the columns of H before reshaping. This "democratization" makes optimum use of the dynamic range of the detection. Both the Hadamard basis and the DL basis consist of −1 and +1 values, whereas our modulator produces 0 (no light) and 1 (light). By sequentially counting photons from a pattern and its contrast-reversed inverse pattern and taking the difference, the signal (photon number) corresponding to the −1 and +1 patterns can be deduced. For every new pattern displayed on the DMD, a trigger signal is sent to the DeltaHub histogrammer, transferring the current histogram to the computer and beginning acquisition of a new histogram.
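A noise-free numerical sketch of the democratization and differential measurement described above, shown here for a 32 × 32 basis; a real implementation would use a fast Hadamard transform rather than explicit matrix products, and the counts would come from the TCSPC histograms:

import numpy as np
from scipy.linalg import hadamard

n = 32                          # reconstruction resolution (n x n pixels)
N = n * n
H = hadamard(N)                 # rows are +1/-1 Hadamard patterns

# "Democratization": randomly permute the columns of H so that no single
# pattern produces an abnormally high or low return signal.
rng = np.random.default_rng(0)
perm = rng.permutation(N)
H_dem = H[:, perm]

# The DMD displays 0/1 masks, so each +/-1 pattern is shown as a
# pattern / inverse-pattern pair and the two photon counts are subtracted.
pos_masks = (H_dem + 1) // 2
neg_masks = 1 - pos_masks

scene = rng.random(N)                              # stand-in reflectivity map
signals = pos_masks @ scene - neg_masks @ scene    # differential signals

# Hadamard reconstruction, then invert the democratization to unscramble.
scrambled = H.T @ signals / N
recon = scrambled[perm]
print(np.allclose(recon, scene))                   # True in this noise-free sketch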
Once a "set" of differential histograms (666 for DL sampling or N for Hadamard sampling) has been collected, the reconstruction is performed for each time bin of the histogram using our DL reconstruction or a fast Hadamard transform. For Hadamard sampling, since the patterns are democratized, this returns an image in which the pixels are scrambled, and we unscramble it by inverting the democratization process. Reconstruction of an image at every time bin gives a 3D point cloud of intensity. For typical scenes, each transverse position has a single depth (i.e., we are imaging nontransparent surfaces), and so we choose to reduce noise by applying a 3 × 3 × 3 Gaussian 3D smoothing kernel.22 This sparsity also lends itself to a much more compact representation by converting the 3D intensity map to a depth map in which each transverse position (pixel) has a reflectivity and a depth, determined as the time bin with the maximum signal; an example of this conversion can be seen in Fig. 2.
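A compact sketch of the point-cloud-to-depth-map step, assuming the differential signals for each coarse time bin are already available; reconstruct_frame is a placeholder for either the DL decoder or the inverse Hadamard transform:

import numpy as np
from scipy.ndimage import gaussian_filter

def cube_to_depth_map(signals, reconstruct_frame, shape=(128, 128)):
    """signals: (num_patterns, num_bins) differential counts per time bin."""
    frames = [reconstruct_frame(signals[:, b]).reshape(shape)
              for b in range(signals.shape[1])]
    cube = np.stack(frames, axis=-1)          # (rows, cols, time bins)

    # 3 x 3 x 3 Gaussian smoothing of the intensity point cloud.
    cube = gaussian_filter(cube, sigma=1.0, truncate=1.0)

    # Keep, for every pixel, the time bin with the maximum signal (depth)
    # and its value (reflectivity).
    depth_bin = cube.argmax(axis=-1)
    reflectivity = cube.max(axis=-1)
    return reflectivity, depth_bin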
Quantitative analysis of the performance of the deep-learned and Hadamard imaging methods is gained by comparison to a ground-truth reflectivity and depth map. This ground truth is estimated by fully sampling the same static scene using 16 384 Hadamard patterns of 128 × 128 pixels and acquiring and averaging data for over 2 h (these are the data in Fig. 2). This provides low-noise, high-resolution reflectivity and depth maps against which the short-acquisition-time data can be compared, giving a relative measure of performance, though these maps may of course still have absolute errors with respect to the physical reality. To estimate the difference between our test methods and this "ground truth," we use a normalized root mean square error, which is defined as
\mathrm{NRMSE} = \frac{\sqrt{\frac{1}{N}\sum_{n=1}^{N}\left(R_n - GT_n\right)^2}}{\sqrt{\frac{1}{N}\sum_{n=1}^{N}\left(GT_n\right)^2}}, \qquad (1)
where R_n is the nth pixel of the test reconstruction, GT_n is the nth pixel of the ground-truth data, and N is the number of pixels (one depth-tagged pixel for each lateral position in the image). We then define our depth and reflectivity "accuracy" as 1 − NRMSE. For images with a lower resolution, we first upscale the test image to 128 × 128 by pixel repetition. We produce two separate accuracy figures for the reflectivity and depth maps. To improve the relevance of our depth accuracy estimates, we do not include depth accuracy for pixels with apparent reflectivity less than 5% of the maximum, as these pixels are likely noise and will contain no meaningful depth information. For reflectivity comparison, we first normalize the data, as the Hadamard and DL reconstructions return images with large differences in absolute reflectivity. We normalize the reflectivity of both our test image and ground truth by subtracting the mean value and dividing by the range. The depth accuracy is calculated with no normalization, as the depth should be absolute.

FIG. 2. Composite image breakdown. (a) shows a photograph of the target. (b) shows the intensity measured at the maximum signal for each pixel. (c) shows the depth measurement for each pixel. (d) shows the combined data in a single image. (e) shows the point cloud of the combined data showing the depth of the object.
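The accuracy metric can be written in a few lines; names are illustrative, and lower-resolution reconstructions would first be upscaled to 128 × 128 by pixel repetition (e.g., np.kron(image, np.ones((4, 4)))):

import numpy as np

def nrmse(recon, ground_truth):
    """Normalized root mean square error of Eq. (1)."""
    return (np.sqrt(np.mean((recon - ground_truth) ** 2))
            / np.sqrt(np.mean(ground_truth ** 2)))

def accuracies(refl, depth, refl_gt, depth_gt):
    """Reflectivity and depth accuracy (1 - NRMSE), following the text:
    reflectivity maps are normalized by mean and range, depth is kept in
    absolute units, and depth accuracy is evaluated only where the
    apparent reflectivity exceeds 5% of its maximum."""
    norm = lambda im: (im - im.mean()) / (im.max() - im.min())
    refl_acc = 1.0 - nrmse(norm(refl), norm(refl_gt))

    mask = refl > 0.05 * refl.max()
    depth_acc = 1.0 - nrmse(depth[mask], depth_gt[mask])
    return refl_acc, depth_acc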
We compare the DL reconstruction method against three reconstructions using a Hadamard imaging basis,9 with reconstruction resolutions of 32 × 32, 64 × 64, and 128 × 128. The results while imaging a scene at a distance of 2 m are shown in Fig. 3 for acquisition times between 0.5 and 100 s. The Hadamard basis methods follow the predicted behavior that, for the same acquisition time (photon number), higher resolutions suffer greatly increased noise. Our DL method, however, performs well even at short acquisition times, offering similar noise levels to the 32 × 32 Hadamard reconstructions but with a detail level much closer to the 64 × 64 reconstructions.
We quantify the quality of reconstructions using the accuracy metric defined previously, and the results for the data in Fig. 3 are shown in Fig. 4. The data were acquired as 200 individual 0.5 s acquisitions, and the longer acquisitions are formed by summing. The accuracy is given as the mean of the values from the whole 200-acquisition dataset, and the errors are given as the standard deviation. The reflectivity accuracy results show the expected trend of increasing accuracy with acquisition time and reveal the DL reconstruction to outperform even the 32 × 32 Hadamard basis for acquisition times above 0.5 s. The smaller error can be attributed to the finer details present in the (in principle) 128 × 128 DL reconstructions, which are smoothed over in the lower-resolution 32 × 32 Hadamard reconstruction. The detail level of the DL reconstruction is qualitatively similar to the 64 × 64 reconstructions, and comparing the DL results with 64 × 64 Hadamards shows a 50% improvement for short acquisition times and 33% on average across all acquisition times. The depth accuracy data shown in Fig. 4(b) again show an advantage for the DL method. Absolute depth accuracies can be acquired directly from the error calculations, which for an acquisition time of 1 s are 40 mm, 49 mm, and 167 mm for the DL, 32 × 32 Hadamard, and 64 × 64 Hadamard methods, respectively, with an average DL advantage of 11 mm (75 mm) over 32 × 32 (64 × 64) across all acquisition times.
In addition to the lab image acquisitions presented, we have tested our DL LiDAR system at longer range in a corridor with ambient lighting. Measurements at 5 m and 28 m are shown in Fig. 5; for these measurements, the data have been taken with the 32 × 32 Hadamard basis and the DL basis. Both long- and short-range data show the advantage of the DL imaging method, which again shows similar SNR levels to the 32 × 32 Hadamard reconstruction, but with increased image sharpness.
FIG. 3. Composite image reconstructions. Columns are reconstructions for varying acquisition times. Rows show different reconstruction methods. 128 × 128 Hadamards lack 0.5 s and 1 s acquisitions, as the modulator speed is not sufficient to fully sample within those time limits. The color scale is the same as in Fig. 2.
FIG. 4. Reconstruction results. (a) Reflectivity accuracy and (b) depth accuracy of the four reconstruction methods for varying acquisition times. DL: deep-learned reconstruction method. 32, 64, and 128 denote Hadamard reconstructions at 32 × 32, 64 × 64, and 128 × 128 resolution.
FIG. 5. Short- and long-range measurements. (a) Wide field-of-view measurement of a mannequin at 5 m. (b) Narrow field-of-view measurement of the mannequin against a flat background at a range of 30 m.
We have demonstrated a single-pixel LiDAR system using a deep
learning optimized imaging basis and reconstruction algorithm, which
was derived from a trained convolutional neural network. This application of deep learning takes the decoding scheme developed for 2D images and extends the methodology to a 3D LiDAR system.
We show that this DL LiDAR system outperforms traditional Hadamard-basis reconstruction methods in both reflectivity and depth accuracy. Comparing the reflectivity accuracy of our DL method with a similarly detailed 64 × 64 Hadamard reconstruction shows a 57% improvement. Depth accuracy is also improved for the DL method, providing an advantage for all acquisition times, with a 16% improvement over 64 × 64 at an acquisition time of 1 s. We discussed how photon-counting LiDAR is limited by the photon acquisition rate; with a reduced number of patterns and a DL reconstruction, more precise images are produced than with conventional pattern sets. We hope that this can improve the performance of SPI
LiDAR techniques and provide an alternative to 3D imaging methods
based on scanning, thereby miniaturizing systems for applications
such as autonomous vehicles. This can also pave the way for the devel-
opment of compressed sensing technologies, which are tuned to meet the needs of downstream decision-making systems. Full 3D informa-
tion may not be necessary to make effective decisions, and compressed
sensing can greatly reduce the amounts of acquired and analyzed data
in complex systems.
We wish to acknowledge the financial support from the
Engineering and Physical Sciences Research Council (EPSRC)
QuantIC (No. EP/M01326X/1) and the H2020 European Research Council (ERC) (TWISTS, No. 340507; PhotUntangle, No. 804626).
REFERENCES
1. A. McCarthy, X. Ren, A. D. Frera, N. R. Gemmell, N. J. Krichel, C. Scarcella, A. Ruggeri, A. Tosi, and G. S. Buller, "Kilometer-range depth imaging at 1550 nm wavelength using an InGaAs/InP single-photon avalanche diode detector," Opt. Express 21, 22098–22113 (2013).
2. J. S. Massa, G. S. Buller, A. C. Walker, S. Cova, M. Umasuthan, and A. M. Wallace, "Time-of-flight optical ranging system based on time-correlated single-photon counting," Appl. Opt. 37, 7298–7304 (1998).
3. D. Pierrottet, F. Amzajerdian, L. Petway, B. Barnes, G. Lockard, and M. Rubio, "Linear FMCW laser radar for precision range and vector velocity measurements," MRS Proc. 1076, 1076-K04-06 (2008).
4. B. Behroozpour, P. A. M. Sandborn, M. C. Wu, and B. E. Boser, "Lidar system architectures and circuits," IEEE Commun. Mag. 55, 135–142 (2017).
5. X. Ren, P. W. R. Connolly, A. Halimi, Y. Altmann, S. McLaughlin, I. Gyongy, R. K. Henderson, and G. S. Buller, "High-resolution depth profiling using a range-gated CMOS SPAD quanta image sensor," Opt. Express 26, 5541–5557 (2018).
6. S. Xiang, S. Chen, X. Wu, D. Xiao, and X. Zheng, "Study on fast linear scanning for a new laser scanner," Opt. Laser Technol. 42, 42–46 (2010).
7. M. P. Edgar, G. M. Gibson, and M. J. Padgett, "Principles and prospects for single-pixel imaging," Nat. Photonics 13, 13–20 (2019).
8. M. J. Padgett and R. W. Boyd, "An introduction to ghost imaging: Quantum and classical," Philos. Trans. R. Soc. A 375, 20160233 (2017).
9. W. K. Pratt, J. Kane, and H. C. Andrews, "Hadamard transform image coding," Proc. IEEE 57, 58–68 (1969).
10. M. F. Duarte, M. A. Davenport, D. Takhar, J. N. Laska, T. Sun, K. F. Kelly, and R. G. Baraniuk, "Single-pixel imaging via compressive sampling," IEEE Signal Process. Mag. 25, 83–91 (2008).
11. N. Radwell, K. J. Mitchell, G. M. Gibson, M. P. Edgar, R. Bowman, and M. J. Padgett, "Single-pixel infrared and visible microscope," Optica 1, 285–289 (2014).
12. M.-J. Sun, L.-T. Meng, M. P. Edgar, M. J. Padgett, and N. Radwell, "A Russian Dolls ordering of the Hadamard basis for compressive single-pixel imaging," Sci. Rep. 7, 3464 (2017).
13. W. L. Chan, K. Charan, D. Takhar, K. F. Kelly, R. G. Baraniuk, and D. M. Mittleman, "A single-pixel terahertz imaging system based on compressed sensing," Appl. Phys. Lett. 93, 121105 (2008).
14. W.-K. Yu, "Super sub-Nyquist single-pixel imaging by means of cake-cutting Hadamard basis sort," Sensors 19, 4122 (2019).
15. C. F. Higham, R. Murray-Smith, M. J. Padgett, and M. P. Edgar, "Deep learning for real-time single-pixel video," Sci. Rep. 8, 2369 (2018).
16. M.-J. Sun, M. P. Edgar, G. M. Gibson, B. Sun, N. Radwell, R. Lamb, and M. J. Padgett, "Single-pixel three-dimensional imaging with time-based depth resolution," Nat. Commun. 7, 12010 (2016).
17. M. P. Edgar, S. D. Johnson, D. B. Phillips, and M. J. Padgett, "Real-time computational photon-counting LiDAR," Opt. Eng. 57, 031304 (2017).
18. G. A. Howland, P. B. Dixon, and J. C. Howell, "Photon-counting compressive sensing laser radar for 3D imaging," Appl. Opt. 50, 5917–5920 (2011).
19. G. A. Howland, D. J. Lum, M. R. Ware, and J. C. Howell, "Photon counting compressive depth mapping," Opt. Express 21, 23822–23837 (2013).
20. Z. Zhang, X. Wang, G. Zheng, and J. Zhong, "Hadamard single-pixel imaging versus Fourier single-pixel imaging," Opt. Express 25, 19619–19639 (2017).
21. A. Coates, H. Lee, and A. Y. Ng, "An analysis of single layer networks in unsupervised feature learning," in Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics (2011), Vol. 15.
22. S. D. Johnson, D. B. Phillips, Z. Ma, S. Ramachandran, and M. J. Padgett, "A light-in-flight single-pixel camera for use in the visible and short-wave infrared," Opt. Express 27, 9829–9837 (2019).
A great deal of research has focused on algorithms for learning features from un- labeled data. Indeed, much progress has been made on benchmark datasets like NORB and CIFAR by employing increasingly complex unsupervised learning al- gorithms and deep models. In this paper, however, we show that several very sim- ple factors, such as the number of hidden nodes in the model, may be as important to achieving high performance as the choice of learning algorithm or the depth of the model. Specifically, we will apply several off-the-shelf feature learning al- gorithms (sparse auto-encoders, sparse RBMs and K-means clustering, Gaussian mixtures) to NORB and CIFAR datasets using only single-layer networks. We then present a detailed analysis of the effect of changes in the model setup: the receptive field size, number of hidden nodes (features), the step-size (stride) be- tween extracted features, and the effect of whitening. Our results show that large numbers of hidden nodes and dense feature extraction are as critical to achieving high performance as the choice of algorithm itselfso critical, in fact, that when these parameters are pushed to their limits, we are able to achieve state-of-the- art performance on both CIFAR and NORB using only a single layer of features. More surprisingly, our best performance is based on K-means clustering, which is extremely fast, has no hyper-parameters to tune beyond the model structure it- self, and is very easy implement. Despite the simplicity of our system, we achieve performance beyond all previously published results on the CIFAR-10 and NORB datasets (79.6% and 97.0% accuracy respectively).