Neural Network Based 3D Tracking with a Graphene
Transparent Focal Stack Imaging System
Dehui Zhang (dehui@umich.edu)
University of Michigan
Zhen Xu
University of Michigan
Zhengyu Huang
University of Michigan
Audrey Rose Gutierrez
University of Michigan
Cameron Blocker
University of Michigan
Che-Hung Liu
University of Michigan
Miao-Bin Lien
University of Michigan
Gong Cheng
University of Michigan https://orcid.org/0000-0002-1206-1762
Zhe Liu
University of Michigan
Il Yong Chun
University of Hawai’i at Manoa
Jeffrey Fessler
University of Michigan https://orcid.org/0000-0001-9998-3315
Zhaohui Zhong
University of Michigan https://orcid.org/0000-0001-5050-7182
Theodore Norris
University of Michigan https://orcid.org/0000-0003-1387-7074
Article
DOI: https://doi.org/10.21203/rs.3.rs-54444/v1
License: This work is licensed under a Creative Commons Attribution 4.0 International License.
Abstract
Recent years have seen the rapid growth of new approaches to optical imaging, with an emphasis on extracting three-dimensional (3D) information from what is normally a two-dimensional (2D) image capture. Perhaps most importantly, the rise of computational imaging, defined as the synergistic design of optical systems in conjunction with image reconstruction algorithms, enables both new physical layouts of optical components and new algorithms to be implemented. This paper concerns the convergence of two advances: the development of transparent photodetectors with high responsivity, and the rapid expansion of the capabilities of machine learning, including the development of powerful neural networks. In particular, we demonstrate that the use of transparent photodetector arrays stacked vertically along the optical axis of an imaging system, called a focal stack, together with a feedforward neural network, provides a powerful new approach to real-time 3D optical imaging, including object tracking. The focal stack imaging system is realized through the development of graphene transparent photodetector arrays. As a proof of concept, 3D tracking of point-like objects was successfully demonstrated with multilayer feedforward neural networks, and this approach was then extended to tracking the positions of multi-point objects. Our computer model further demonstrates how this optical system can track extended objects in 3D, highlighting the promise of combining nanophotonic devices, new optical system designs, and machine learning for new frontiers in 3D imaging.
Introduction:
Emerging technologies such as autonomous vehicles demand imaging technologies that can capture not only a 2D image but also the 3D spatial position and orientation of objects. Multiple solutions have been proposed, including LiDAR systems [1-3] and light-field cameras [4-7], though existing approaches suffer from significant limitations. For example, LiDAR is constrained by size and cost, and most importantly requires active illumination of the scene using a laser, which poses challenges of its own, including safety. Light-field cameras of various configurations have also been proposed and tested. A common approach places a microlens array in front of the sensor array of a camera [4,5]; light emitted from the same point at different angles is then mapped to different pixels to create angular information. However, this mapping to a lower dimension carries a tradeoff between spatial and angular resolution. Alternatively, one can use optical masks [6] or camera arrays [7] for light field acquisition. However, the former method sacrifices signal-to-noise ratio and may need a longer exposure time in compensation, while for the latter the device size could become a limiting factor in developing compact cameras. Recently, a light-field imaging system using stacks of graphene-based transparent photodetector arrays was proposed as a new approach to 3D imaging [8], but only simple 1D ranging was demonstrated experimentally due to the limitation of single-pixel devices. In this work, we present the development of 2D transparent detector arrays, specifically 4 × 4 (16-pixel) all-graphene arrays, and demonstrate simultaneous imaging at multiple focal planes, with data acquired using only two sensor planes serving as input to a feedforward neural network.
To have the highest possible sensitivity to light, a photodetector would ideally absorb all the light incident upon the active region of the device. It is possible, however, to design a detector with a photoresponse sufficiently large for a given application that nevertheless does not absorb all the incident light [11-16]. Indeed, we have shown that a photodetector in which the active region consists of two graphene layers can operate with quite high responsivity while absorbing only about 5% of the incident light [17]. By fabricating the detector on a transparent substrate, it is possible to obtain responsivities of several A/W while transmitting 80-90% of the incident light, allowing multiple sensor planes to be stacked along the axis of an optical system. We have previously demonstrated a simple 1D ranging application using a single pixel of such detectors [8]. We also showed how focal stack imaging is possible in a single exposure if transparent detector arrays can be realized, and developed models showing how light-field imaging and 3D reconstruction could be accomplished.
While the emphasis in ref. [8] was on 4D light field imaging and reconstruction from a focal stack, some optical applications, e.g., ranging and tracking, do not require computationally expensive 4D light field reconstruction [18,19]. The question naturally arises as to whether the focal stack geometry allows optical sensor data to provide the necessary information for a given application without reconstructing a 4D light field or estimating a 3D scene structure via a depth map. The simple intuition behind the focal stack geometry is that each sensor array sharply images a specific region of the object space, corresponding to the depth of field for each sensor plane. A stack of sensors thus greatly expands the total system depth of field. The use of sophisticated algorithms, however, may provide useful information even for regions of the object space that are not in precise focus. To this end, we demonstrate how combining focal stacks obtained with transparent sensor arrays and machine learning algorithms enables 3D object tracking without the need for light-field reconstruction. Experimental results illustrate that the implemented neural networks using focal stack data can achieve accurate 3D object tracking efficiently (millisecond inference time using conventional GPU computing power; see details in SI B-VI).
The concept of a focal-stack imaging system based on simultaneous imaging at multiple focal planes is shown in Fig. 1(a). In the typical imaging process, the camera lens projects an arbitrary object (in this case a ball-and-stick model) onto a set of transparent imaging arrays stacked at different focal planes. With the sensor arrays having a typical transparency on the order of 90%, sufficient light propagates to all planes for sensitive detection of the projected light field. (Of course, the final sensor in the stack need not be transparent and could be a conventional opaque sensor array.) Each of the images in the stack records the light distribution at a specific depth, so that depth information is encoded in the image stack. We can then use neural networks to process the 3D focal stack data and estimate the 3D position and configuration of the object.
As a proof of concept, we first show that the system can estimate a point object's position in 3D space. To accurately estimate the 3D position of point objects from two-plane focal stack data, we train and apply feedforward neural networks that have the form of a multilayer perceptron (MLP) [20]. A point object is scanned through a set of points in the object space, and the focal stack data is used to train MLP neural networks; using the trained systems, we track an object throughout the full object space. Using the same data, we also designed another neural network to track the positions of multi-point objects. We first demonstrate the concept with a stack of two 4 × 4 (16-pixel) graphene sensors. Then, to examine the capability of future higher-resolution sensor arrays for tracking, we performed an emulation by also acquiring focal stack data sets using a conventional CMOS camera with separate exposures for each focal plane. Our experimental results show that the graphene-based transparent photodetector array is a scalable solution for 3D information acquisition, and that a combination of transparent photodetector arrays and machine learning algorithms can lead to a compact camera design capable of capturing real-time 3D information with high resolution. This type of optical system is potentially useful for many emerging technologies, such as face recognition, autonomous vehicle and unmanned aerial vehicle navigation, and video-rate 3D biological microscopy, without the need for an integrated illumination source. An additional benefit of the graphene-based transparent photodetectors is their ability to detect light over a broad bandwidth from the visible to the mid-infrared, enabling 3D infrared imaging for even more applications.
All-graphene Transparent Photodetector Arrays:
Photodetector arrays with high responsivity and high transparency are central to realizing a focal stack imaging system. To this end, we fabricated all-graphene transparent photodetector arrays as individual sensor planes. Briefly, CVD-grown graphene on copper foil was wet transferred [21] onto a glass substrate and patterned into the floating gates of phototransistors using photolithography. We then sputtered 6 nm of undoped silicon on top as a tunneling barrier, followed by another layer of graphene transferred on top and patterned into the interconnects and device channels (Fig. 1(b), bottom inset; details in SI). In particular, using an atomically thin graphene sheet for the interconnects dramatically reduces light scattering compared to using ITO or other conductive thin films, which is crucial for recording the photocurrent signal across all planes of the focal stack. As a proof of concept, we fabricated 4 × 4 (16-pixel) transparent graphene photodetector arrays, as shown in Fig. 1(b). The active region of each device, the interconnects, and the transparent substrate are clearly differentiated in the optical image due to their differing numbers of graphene layers. The device has an overall raw transparency > 80%; further simulation shows that the transparency can be improved to 96% by refractive index compensation (see SI). The devices are wired out separately and connected to metal pads, which are then wire-bonded to a customized signal readout circuit. During normal operation, a bias voltage is applied across the graphene channel and the current flowing through the channel is measured; light illumination induces a change in the current, producing a photocurrent as the readout (details in SI). The photodetection mechanism of our device is attributed to the photogating effect [17,22,23] in the graphene transistor.
The yield and uniformity of the devices were first characterized by measuring the channel conductance. Remarkably, the use of graphene interconnects still leads to a high device yield; 99% of the 192 devices tested show good conductivity (see SI, Fig. 1S). The DC photoresponsivity of an individual pixel within the array can reach ~ 3 A/W at a bias voltage of 0.5 V, consistent with the response of single-pixel devices reported previously [8]. We also observe the large device-to-device variation that is intrinsic to most nanoelectronics. Normalization within the array, however, can compensate for this nonuniformity, a common practice even in commercial CCD arrays.
To reduce noise and minimize device hysteresis, the AC photocurrent of each pixel is recorded for 3D tracking and imaging. This measurement scheme sacrifices responsivity but makes the measurement faster and more reliable. As shown in Fig. 2(a), a chopper modulates the light and a lock-in amplifier records the AC current at the chopper frequency. The power dependence of the AC photocurrent was also examined (see SI A-II). The responsivity remains constant over the power range used in our tests, so only a single exposure is required to calibrate the nonuniformity between the pixels.
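A minimal sketch of the per-pixel normalization described above is given below, assuming a single hypothetical calibration exposure under roughly uniform illumination; the array shape and variable names are illustrative rather than part of the actual processing pipeline.

```python
import numpy as np

# Hypothetical per-pixel gain calibration for a 4x4 array: one calibration frame
# gives each pixel's relative responsivity, which is divided out of later frames.
rng = np.random.default_rng(0)
flat_field = 1.0 + 0.3 * rng.standard_normal((4, 4))   # stand-in calibration frame
gain = flat_field / flat_field.mean()                   # relative responsivity per pixel

def normalize(frame):
    """Remove fixed pixel-to-pixel responsivity variation from a raw photocurrent frame."""
    return frame / gain

raw_frame = gain * rng.uniform(0.5, 1.0, size=(4, 4))   # simulated raw readout
print(normalize(raw_frame))
```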
Focal Stack Imaging With Transparent Sensors:
The concept of focal stack imaging was demonstrated using two vertically stacked transparent graphene arrays. As shown in Fig. 2(a), two 4 × 4 sensor arrays were mounted along the optical axis, separated by a controlled distance, to form a stack of imaging planes. This double-focal-plane stack essentially serves as the camera of the imaging system. A convex lens focuses a 532 nm laser beam, with the beam focus serving as a point object. The focusing lens was mounted on a 3D motorized stage to vary the position of the point object in 3D. The AC photocurrent is recorded for individual pixels on both the front and back detector arrays while the point object moves along the optical axis.
Figure2(b) shows a representative set of images captured experimentally by the two detector arrays
when a point object is scanned at different positions along the optical axis (12mm, 18mm, 22mm)
respectively, corresponding to focus shifting from the back plane toward the front plane (Fig.2(c)). The
grayscale images show the normalized photoresponse, with white (black) color representing high (low)
intensity. As the focus point shifts from the back plane toward the front plane, the image captured by the
front plane shrinks and sharpens, while the image captured by the back plane expands and blurs. Even
though the low pixel density limits the image resolution, these results nevertheless verify the validity of
simultaneously capturing images at multiple focal planes.
3D Tracking Of Point Objects:
While a single image measures the lateral position of objects as in conventional cameras, differences
between images captured in different sensor planes contain the depth information of the point object.
Hence focal stack data can be used to reconstruct the 3D position of the point object. Here we consider
three different types of point objects: a single-point object, a three-point object, and a two-point object
that is rotated and translated in three dimensions.
First, we consider single-point tracking. In this experiment, we scanned the point source (dotted circle in Fig. 2(a)) over a 3D spatial grid of size 0.6 mm × 0.6 mm (x, y axes) × 20 mm (z axis, i.e., the longitudinal direction). The grid spacing was 0.06 mm along the x and y axes, and 2 mm along the z axis, leading to 1,331 grid points in total. For each measurement, two images were recorded from the graphene sensor planes. We randomly split the data into two subsets: training data with 1,131 samples (85% of the total) and testing data with 200 samples (15% of the total); all experiments used this data splitting procedure.
To estimate the three spatial coordinates of the point object from the focal stack data, we trained three separate MLP neural networks [20] (one for each spatial dimension) with a mean-square error (MSE) loss. The results (Fig. 3(a)(b)) show that even with the limited resolution provided by the 4 × 4 arrays, and only two sensor planes, the point object positions can be determined very accurately. We used the root-mean-square error (RMSE) to quantify the estimation accuracy on the testing dataset and obtained RMSE values of 0.012 mm, 0.014 mm, and 1.196 mm along the x, y, and z directions, respectively.
Given the good tracking performance with the small-scale (i.e., 4 × 4 array) graphene transistor focal stack, we studied how the tracking performance scales with array size. We assessed the performance advantages of larger arrays by using conventional CMOS sensors to acquire the focal stack data. For each point source position, we obtained multi-focal-plane image stacks by multiple exposures with varying CMOS sensor depth (note that focal stack data collected by a CMOS sensor with multiple exposures is comparable to that obtained by the proposed transparent array with a single exposure, as long as the scene being imaged is static), and down-sampled the high-resolution (1280 × 1024) images captured by the CMOS sensor to 4 × 4, 9 × 9, and 32 × 32. We observed that tracking performance improves as the array size increases; results are presented in SI B-V.
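The emulation of smaller arrays from the CMOS frames amounts to block averaging; a sketch is given below, where the square crop and the averaging scheme are assumptions and the exact preprocessing in SI B-V may differ.

```python
import numpy as np

# Sketch: emulate a low-resolution sensor array from a high-resolution CMOS frame
# by average pooling to out_size x out_size pixels. Illustrative only.
def downsample(frame, out_size):
    h, w = frame.shape
    h_crop, w_crop = h - h % out_size, w - w % out_size   # make dimensions divisible
    f = frame[:h_crop, :w_crop]
    f = f.reshape(out_size, h_crop // out_size, out_size, w_crop // out_size)
    return f.mean(axis=(1, 3))

cmos_frame = np.random.rand(1024, 1024)   # stand-in for a (cropped) CMOS image
for n in (4, 9, 32):
    print(n, downsample(cmos_frame, n).shape)
```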
We next considered the possibility of tracking multi-point objects. Here, the object consisted of three point objects, and these three points can take three possible relative positions with respect to each other. We synthesized 1,880 three-point object images as sums of single-point object images from either the graphene detectors or the CMOS detectors (see details of focal stack synthesis in SI B-II). This synthesis approach is reasonable given that the detector response is sufficiently linear, and it avoids the complexity of precisely positioning multiple point objects in the optical setup. To estimate the spatial coordinates of the three-point synthetic objects, we trained an MLP neural network with an MSE loss that accounts for the ordering ambiguity of the network outputs (see Eq. (1) in SI). We used three-point object data synthesized from the CMOS-sensor readout in the single-point tracking experiment (with each CMOS image smoothed by spatial averaging and then down-sampled to 9 × 9). We found that the trained MLP neural network can estimate a multi-point object's position with remarkable accuracy; see Fig. 3(c-d). The RMSE values calculated from the entire test set are 0.017 mm, 0.016 mm, and 0.59 mm along the x-, y-, and z-directions, respectively. As in the single-point object tracking experiment, the multi-point object tracking performance improves with increasing sensor resolution (see SI B-V).
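One plausible form of such an ordering-invariant loss is the minimum MSE over all permutations of the predicted points, sketched below; the actual Eq. (1) in the SI may differ from this construction.

```python
import itertools
import torch

# Ordering-invariant MSE for a 3-point object: take the minimum MSE over all
# permutations of the predicted points (one possible realization, for illustration).
def permutation_invariant_mse(pred, target):
    """pred, target: (batch, 3, 3) tensors of three predicted / true (x, y, z) points."""
    losses = [((pred[:, list(p), :] - target) ** 2).mean(dim=(1, 2))
              for p in itertools.permutations(range(3))]
    return torch.stack(losses, dim=0).min(dim=0).values.mean()

pred = torch.rand(8, 3, 3, requires_grad=True)
target = pred.detach()[:, [2, 0, 1], :]         # same points, different ordering
print(permutation_invariant_mse(pred, target))  # ~0: ordering is no longer penalized
```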
Finally, we considered tracking of a two-point object that is rotated and translated in three dimensions. This task aims to demonstrate 3D tracking of a continuously moving object, such as a rotating solid rod. As in the three-point object tracking experiment, we synthesized two-point object focal stacks from single-point object focal stacks captured using the graphene transparent transistor array. The two points are located in the same x-y plane and are separated by a fixed distance, as if connected by a solid rod. The rod is allowed to rotate in the x-y plane and translate along the z-axis, forming helical trajectories, as shown in Fig. 3(e). We trained an MLP neural network on 242 training trajectories using an MSE loss to estimate the object's spatial coordinates and tested its performance on 38 rotating test trajectories. Figure 3(e) shows the results for one test trajectory. The neural network estimated the orientation (x- and y-coordinates) and depth (z-coordinate) of the test objects with good accuracy: the RMSE values along the x-, y-, and z-directions for the entire test set are 0.016 mm, 0.024 mm, and 0.65 mm, respectively.
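The synthesis of the rotating two-point object can be sketched as follows, exploiting the detector linearity noted above: the two endpoint positions of the rod are computed for each pose and the corresponding single-point focal stacks are summed. The grid lookup, rod length, and trajectory parameters are illustrative assumptions, not the procedure of SI B-II.

```python
import numpy as np

def rod_endpoints(theta, z, half_length=0.2):
    """Endpoints (x, y, z) of a rod rotating in the x-y plane at depth z (lengths in mm)."""
    dx, dy = half_length * np.cos(theta), half_length * np.sin(theta)
    return (dx, dy, z), (-dx, -dy, z)

def synth_two_point(p1, p2, lookup):
    """Sum the measured single-point focal stacks nearest to the two endpoints."""
    return lookup(p1) + lookup(p2)

# Example helical trajectory: rotation in x-y while translating along z (cf. Fig. 3(e)).
thetas = np.linspace(0.0, 4 * np.pi, 50)
zs = np.linspace(-10.0, 10.0, 50)                  # mm
dummy_lookup = lambda p: np.zeros((2, 4, 4))       # stand-in for measured single-point stacks
frames = [synth_two_point(*rod_endpoints(t, z), dummy_lookup) for t, z in zip(thetas, zs)]
print(len(frames), frames[0].shape)
```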
SI B-IV gives further details on the MLP neural network architectures and training.
3D Extended Object Tracking:
The aforementioned objects consisted of a few point sources. For non-point-like (extended) objects, the 4 × 4 pixel graphene array fails to accurately estimate the configuration, given the limited information available from such a small array. To illustrate the possibilities for 3D tracking of a complex object and estimating its orientation, we used a ladybug as an extended object and moved it over a 3D spatial grid of size 8.5 mm × 8.5 mm × 45 mm. The grid spacing was 0.85 mm along both the x- and y-directions, and 3 mm along the z-direction. At each grid point, the object took 8 possible orientations in the x-z plane, with 45° angular separation between neighboring orientations (see experiment details in SI B-III). We acquired 15,488 high-resolution focal stack images using the CMOS sensor (at two different planes) and trained two convolutional neural networks (CNNs), one to estimate the ladybug's position and the other to estimate its orientation, with the MSE loss and the cross-entropy loss, respectively. Figure 4 shows the results for five test samples. The CNNs correctly classified the orientation of all five samples and estimated their 3D positions accurately. For the entire test set, the RMSE along the x-, y-, and z-directions is 0.11 mm, 0.13 mm, and 0.65 mm, respectively, and the orientation is classified with 99.35% accuracy. We note that at least two imaging planes are needed to achieve good estimation accuracy along the depth (z) direction: when only the sensor at the front position is used, the RMSE value along the z-direction is 2.14 mm, and when only the sensor at the back position is used, the RMSE value along the z-direction is 1.60 mm.
SI B-IV describes the CNN architectures and training details.
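A minimal sketch of the two-network setup is given below: one small CNN regresses the (x, y, z) position with an MSE loss and a second classifies one of the 8 orientations with a cross-entropy loss. The layer sizes and the assumed input resolution (two focal planes treated as a 2-channel 64 × 64 image) are placeholders, not the architectures described in SI B-IV.

```python
import torch
import torch.nn as nn

# Two small CNNs operating on a two-plane focal stack treated as a 2-channel image:
# one regresses position (MSE loss), the other classifies orientation (cross-entropy).
def make_cnn(out_dim):
    return nn.Sequential(
        nn.Conv2d(2, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        nn.Flatten(),
        nn.Linear(32 * 16 * 16, 128), nn.ReLU(),
        nn.Linear(128, out_dim),
    )

position_net = make_cnn(out_dim=3)      # trained with nn.MSELoss()
orientation_net = make_cnn(out_dim=8)   # trained with nn.CrossEntropyLoss()

stacks = torch.rand(4, 2, 64, 64)       # batch of stand-in two-plane focal stacks
print(position_net(stacks).shape, orientation_net(stacks).shape)
```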
Discussion And Conclusion:
In conclusion, we designed and demonstrated a focal stack imaging system enabled by graphene transparent photodetector arrays and the use of feedforward neural networks. Even with limited pixel density, we successfully demonstrated simultaneous imaging at multiple focal planes, which can be used for 3D tracking of point objects with high speed and high accuracy. Our computer model further shows that such an imaging system has the potential to track an extended object and estimate its orientation at the same time. Future advances in graphene detector technology, such as higher-density arrays and smaller hysteresis enabled by higher-quality tunnel barriers, will be necessary to move beyond the current proof-of-concept demonstration. We also want to emphasize that the proposed focal stack imaging concept is not limited to graphene detectors alone. Transparent (or semi-transparent) detectors made from other 2D semiconductors and ultra-thin semiconductor films can also be implemented as the transparent sensor planes within the focal stack. The resulting ultra-compact, high-resolution, and fast 3D object detection technology can be advantageous over existing technologies such as LiDAR and light-field cameras. Our work also showcases that combining nanophotonic devices, which are intrinsically high-performance but nondeterministic, with machine learning algorithms can complement existing approaches and open new frontiers in computational imaging.
Declarations:
Acknowledgements
The authors gratefully acknowledge financial support from the W. M. Keck Foundation and National Science Foundation grant IIS 1838179. Devices were fabricated in the Lurie Nanofabrication Facility at the University of Michigan, a member of the National Nanotechnology Infrastructure Network funded by the National Science Foundation.
Author Contributions
D.Z., Z.X., Z.H., J.A.F., Z.Z. and T.B.N. conceived the experiments. D.Z. and Z.L. fabricated the devices. Z.X.,
D.Z., C.L., M.L. and G.C. built the optical setup. D.Z., Z.X. and A.R.G. performed the nanodevice
optoelectrical measurements. Z.H. performed the CMOS camera data collection. Z.H., I.Y.C. and C.J.B.
worked on neural network based 3D reconstructions. All authors discussed the results and co-wrote the
manuscript.
Competing Interest Declaration:
The authors declare no competing interests.
Additional Information:
Supplementary Information is available for this paper. Correspondence and requests for materials should be addressed to T.B.N. and Z.Z. Reprints and permissions information is available at www.nature.com/reprints.
References:
1. Schwarz, B. "LIDAR: Mapping the world in 3D." Nature Photonics 4, 429 (2010).
2. Oggier, T., Smith, S. T. & Herrington, A. "Line scan depth sensor." U.S. Patent Application No. 15/700,231.
3. Niclass, C. L. et al. "Light detection and ranging sensor." U.S. Patent Application No. 15/372,411.
4. Ng, R. et al. "Light field photography with a hand-held plenoptic camera." Computer Science Technical Report CSTR 2.11, 1–11 (2005).
5. Navarro, H. et al. "High-resolution far-field integral-imaging camera by double snapshot." Optics Express 20, 890–895 (2012).
6. Xu, Z., Ke, J. & Lam, E. Y. "High-resolution lightfield photography using two masks." Optics Express 20, 10971–10983 (2012).
7. Venkataraman, K. et al. "PiCam: An ultra-thin high performance monolithic camera array." ACM Transactions on Graphics (TOG) 32, 166 (2013).
8. Lien, M. B., Liu, C. H., Chun, I. Y. et al. "Ranging and light field imaging with transparent photodetectors." Nature Photonics 14, 143–148 (2020).
9. Rumelhart, D. E., Hinton, G. E. & Williams, R. J. "Learning internal representations by error propagation." Technical report, University of California San Diego, Institute for Cognitive Science (1985).
10. Krizhevsky, A., Sutskever, I. & Hinton, G. E. "ImageNet classification with deep convolutional neural networks." Advances in Neural Information Processing Systems, 1097–1105 (2012).
11. Liu, N. et al. "Large-area, transparent, and flexible infrared photodetector fabricated using P-N junctions formed by N-doping chemical vapor deposition grown graphene." Nano Letters 14, 3702–3708 (2014).
12. Zheng, Z. et al. "Flexible, transparent and ultra-broadband photodetector based on large-area WSe2 film for wearable devices." Nanotechnology 27, 225501 (2016).
13. Tsai, S.-Y., Hon, M.-H. & Lu, Y.-M. "Fabrication of transparent p-NiO/n-ZnO heterojunction devices for ultraviolet photodetectors." Solid-State Electronics 63, 37–41 (2011).
14. Tanaka, H. et al. "Transparent image sensors using an organic multilayer photodiode." Advanced Materials 18, 2230–2233 (2006).
15. Stiebig, H. et al. "Standing wave detection by thin transparent n–i–p diodes of amorphous silicon." Thin Solid Films 427, 152–156 (2003).
16. Jovanov, V. et al. "Transparent Fourier transform spectrometer." Optics Letters 36, 274–276 (2011).
17. Liu, C. H., Chang, Y. C., Norris, T. B. et al. "Graphene photodetectors with ultra-broadband and high responsivity at room temperature." Nature Nanotechnology 9, 273–278 (2014).
18. Blocker, C. J., Chun, I. Y. & Fessler, J. A. "Low-rank plus sparse tensor models for light-field reconstruction from focal stack data." Proc. IEEE Image, Video, and Multidimensional Signal Processing (IVMSP) Workshop, 1–5, Zagori, Greece (2018).
19. Chun, I. Y., Huang, Z., Lim, H. & Fessler, J. A. "Momentum-Net: Fast and convergent iterative neural network for inverse problems." Preprint at http://arxiv.org/abs/1907.11818 (2019).
20. Rosenblatt, F. "The perceptron: a probabilistic model for information storage and organization in the brain." Psychological Review 65, 386 (1958).
21. Lee, S. et al. "Homogeneous bilayer graphene film based flexible transparent conductor." Nanoscale 4, 639–644 (2012).
22. Konstantatos, G. et al. "Hybrid graphene–quantum dot phototransistors with ultrahigh gain." Nature Nanotechnology 7, 363 (2012).
23. Sun, Z. et al. "Infrared photodetectors based on CVD-grown graphene and PbS quantum dots with ultrahigh responsivity." Advanced Materials 24, 5878–5883 (2012).
Figures
Figure 1
Concept of a focal stack imaging system enabled by stacks of transparent all-graphene photodetector arrays. (a) Schematic showing simultaneous capture of multiple images of a 3D object (ball-and-stick model) on different focal planes. Transparent detector arrays (transparent blue sheets) are placed after the lens (green oval) to form the camera system. The depth information is encoded in the image stacks. Artificial neural networks process the image data and extract the important 3D configuration information of the object. Inset: photograph of the imaging system used in experiments, with two transparent focal planes. (b) Upper panel: Optical image of a 4 × 4 transparent graphene photodetector array; scale bar: 500 μm. The upper-left corner is shown in false color with enhanced contrast to highlight the patterns. Lower panel: Schematic of the all-graphene phototransistor design. It includes a top graphene layer as the transistor channel and a bottom graphene patch as the floating gate, separated by a 6-nm silicon tunneling barrier (purple). The device is fabricated on a transparent glass substrate (blue), and the active detector region is wired out with wider graphene stripes as interconnects.
Figure 2
Experimental demonstration of focal stack imaging using a double stack of graphene detector arrays. (a) Schematic of the measurement setup. A point object (dotted circle) is generated by focusing a green laser beam (532 nm) with the lens. Its position is controlled by a 3D motorized stage. Two detector arrays (blue sheets) are placed behind the lens. An objective and a CCD camera are placed behind the detector arrays for sample alignment. A chopper modulates the light at 500 Hz and a lock-in amplifier records the AC current at the chopper frequency. (b) Images captured by the front and back photodetector planes with objects at three different positions along the optical axis (12 mm, 18 mm, and 22 mm, respectively). The grayscale images are generated using the responsivities of individual pixels within the array, normalized by the maximum value for better contrast. The point source is slightly off-axis in the images presented, leading to the shift of the spot center. (c) Illustrations of the beam profiles corresponding to the imaging planes in 2(b). The focus shifts from the back plane (top panel) toward the front plane (bottom panel).
Figure 3
3D point object tracking using focal stack data for three different types of point objects. (a-b) Tracking results for a single point object (only 10 test samples are shown). Results are based on images captured with the graphene photodetector arrays. (c-d) Tracking results for three-point objects (only 4 test samples are shown). Results are based on data synthesized from multi-focal-plane CMOS images (downsampled to 9 × 9) of a single point source. (e) Tracking results for rotating two-point objects on one test trajectory. The object rotates counter-clockwise (viewed from the left) while moving from z = -10 mm to z = 10 mm. Results are based on data synthesized from single point source images captured with the graphene photodetector arrays.
Figure 4
3D extended-object tracking and orientation estimation using focal stack data collected by a CMOS camera, shown in (a) the x-y plane perspective and (b) the x-z plane perspective. The estimated (true) position and orientation of the ladybug are indicated by green (orange) dots and green (orange) overlaid ladybug images. Note that the ladybug images are not part of the neural network output and are shown for illustration only.
Supplementary Files
Supplementary information associated with this preprint is provided in the file supp0616reformat.docx.