An example PNN, implemented experimentally using broadband optical SHG
a, Input data are encoded into the spectrum of a laser pulse (Methods, Supplementary Section 2). To control transformations implemented by the broadband SHG process, a portion of the pulse’s spectrum is used as trainable parameters (orange). The physical computation result is obtained from the spectrum of a blue (about 390 nm) pulse generated within a χ⁽²⁾ medium. b, To construct a deep PNN, the outputs of the SHG transformations are used as inputs to subsequent SHG transformations, with independent trainable parameters. c, d, After training the SHG-PNN (see main text, Fig. 3), it classifies test vowels with 93% accuracy. c, The confusion matrix for the PNN on the test set. d, Representative examples of final-layer output spectra, which show the SHG-PNN’s prediction.
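As a rough intuition for how such a stack composes (a hedged numerical sketch, not the authors' experiment or training method), each layer below applies a fixed quadratic, χ⁽²⁾-like mixing to the concatenation of its input spectrum and a trainable parameter spectrum, and the output spectrum feeds the next layer; all sizes and the random mixing tensors are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def shg_layer(x, theta, mix):
    """One caricatured broadband-SHG transformation: the data spectrum x
    and the trainable spectrum theta enter the chi(2) medium together,
    and each output (second-harmonic) bin sums products of pairs of
    input bins, here through a fixed random mixing tensor."""
    field = np.concatenate([x, theta])
    return np.einsum('i,j,kij->k', field, field, mix)

n_in, n_theta, n_layers = 16, 8, 3                  # illustrative sizes
dim = n_in + n_theta
mixes = [rng.normal(size=(n_in, dim, dim)) / dim for _ in range(n_layers)]
thetas = [rng.normal(size=n_theta) for _ in range(n_layers)]  # trainable

x = rng.normal(size=n_in)                 # data encoded in the pulse spectrum
for mix, theta in zip(mixes, thetas):
    x = shg_layer(x, theta, mix)          # each output spectrum feeds the next layer
prediction = int(np.argmax(np.abs(x)))    # class read off the final spectrum
```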


Source publication
Article
Deep-learning models have become pervasive tools in science and engineering. However, their energy requirements now increasingly limit their scalability¹. Deep-learning accelerators²⁻⁹ aim to perform deep learning energy-efficiently, usually targeting the inference phase and often by exploiting physical substrates beyond conventional electronics...

Citations

... In this study, we introduce a deep-learning approach that facilitates efficient and reliable analysis of twisted van der Waals magnets. Deep learning has showcased remarkable success in tackling scientific challenges across various physical systems [22-36]. Specifically, the application of deep neural network (DNN) techniques to 2D magnetic systems has proven highly effective in extracting magnetic Hamiltonian parameters from magnetic domain images [22]. ...
Article
The application of twist engineering in van der Waals magnets has opened new frontiers in the field of two-dimensional magnetism, yielding distinctive magnetic domain structures. Despite the introduction of numerous theoretical methods, limitations persist in terms of accuracy or efficiency due to the complex nature of the magnetic Hamiltonians pertinent to these systems. In this study, we introduce a deep-learning approach to tackle these challenges. Utilizing customized, fully connected networks, we develop two deep-neural-network kernels that facilitate efficient and reliable analysis of twisted van der Waals magnets. Our regression model is adept at estimating the magnetic Hamiltonian parameters of twisted bilayer CrI3 from its magnetic domain images generated through atomistic spin simulations. The ‘generative model’ excels in producing precise magnetic domain images from the provided magnetic parameters. The trained networks for these models undergo thorough validation, including statistical error analysis and assessment of robustness against noisy injections. These advancements not only extend the applicability of deep-learning methods to twisted van der Waals magnets but also streamline future investigations into these captivating yet poorly understood systems.
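To make the direction of the regression kernel concrete (domain image in, Hamiltonian parameters out), here is a minimal, hypothetical sketch using scikit-learn's MLPRegressor; the paper's customized fully connected networks, layer sizes, and spin-simulation data are all replaced by assumptions here:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(1)

# Hypothetical stand-ins: flattened 64x64 magnetic domain images and
# three Hamiltonian parameters assumed to have generated them (real
# data would come from atomistic spin simulations).
images = rng.random((500, 64 * 64))
params = rng.random((500, 3))

# Fully connected regression kernel; the layer sizes are assumptions,
# not the paper's customized architecture.
model = MLPRegressor(hidden_layer_sizes=(256, 64), max_iter=300)
model.fit(images[:400], params[:400])
print(model.score(images[400:], params[400:]))  # held-out R^2
```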
... In particular, the Deep Echo State Network, with hyperbolic tangents as the nonlinearity, has shown remarkable performance improvements on prediction tasks for nonlinear autoregressive moving average models and chaotic dynamical systems when the connection weights between layers are trained by linear regression against the targets 29,30. By contrast, few attempts at multilayering have been reported for physical reservoirs or physical NNs, and those that exist are limited to methods that either leave the inter-reservoir connection weights untrained (so the network is not highly flexible) 34,35 or train them (or the layers of a physical NN) with backpropagation algorithms, which require complex calculations relying on external circuits and carry large computational costs 33,36. Notably, there are no reports of deep RC with nanodevices, which are advantageous for the integration needed to realize practical AI devices; it is thus unclear whether multilayering is effective at improving the performance of physical reservoirs. ...
... In this network, the connection weights between reservoir layers are trained with a simple linear regression algorithm, which provides higher network flexibility than schemes in which the inter-reservoir weights are left untrained 34,35, and it requires no backpropagation algorithm. Backpropagation is an effective method that greatly improves the expressive power of a network, but it is difficult to apply to PRCs based on the complex, dynamic nonlinearities (black-box functions) of physical systems, because it requires detailed information on the nonlinearities in the reservoir layer and their derivatives 33,36. In this respect, learning the weights between layers by linear regression against targets is well suited to physical implementation: it needs no detailed information on the physical system's nonlinearities (or their derivatives), and it does not require feeding errors back into the physical system to backpropagate them [28-30]. ...
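A minimal sketch of this training scheme, under the assumption of standard echo-state reservoirs: each stacked reservoir's outgoing weights are fit by ridge (linear) regression against the target, and the resulting signal drives the next reservoir; no gradients flow backward. All sizes and the toy task are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)

def run_reservoir(u, W_in, W_res, leak=0.5):
    """Drive a fixed random echo-state reservoir with input series u."""
    x = np.zeros(W_res.shape[0])
    states = []
    for u_t in u:
        x = (1 - leak) * x + leak * np.tanh(W_res @ x + W_in @ np.atleast_1d(u_t))
        states.append(x.copy())
    return np.array(states)

def ridge(states, target, lam=1e-4):
    """Linear-regression training of the weights after a reservoir:
    only states and targets are needed, no backpropagation."""
    return np.linalg.solve(states.T @ states + lam * np.eye(states.shape[1]),
                           states.T @ target)

T, N = 1000, 100
u = rng.uniform(-1, 1, T)          # input series
y = 0.5 * np.roll(u, 2) + u ** 2   # toy nonlinear target with memory

signal = u
for layer in range(3):             # three stacked reservoirs
    W_in = rng.uniform(-1, 1, (N, 1))
    W_res = rng.normal(size=(N, N))
    W_res *= 0.9 / np.max(np.abs(np.linalg.eigvals(W_res)))  # echo-state scaling
    states = run_reservoir(signal, W_in, W_res)
    w = ridge(states, y)           # fit inter-layer weights to the target
    signal = states @ w            # series fed to the next reservoir
```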
Article
While physical reservoir computing is a promising route to low-power neuromorphic computing, its computational performance is still insufficient for practical use. One promising approach to improving it is deep reservoir computing, in which the component reservoirs are multi-layered. However, all deep-reservoir schemes reported so far have been effective only for simulated reservoirs and a limited set of physical reservoirs, and no nanodevice implementations have been reported. Here, as an ionics-based neuromorphic nanodevice implementation of deep reservoir computing, we demonstrate deep physical reservoir computing with up to four layers using an ion gating reservoir, a small, high-performance physical reservoir. While the previously reported deep-reservoir scheme did not improve the performance of the ion gating reservoir, our deep ion gating reservoir achieved a normalized mean squared error of 9.08 × 10⁻³ on a second-order nonlinear autoregressive moving average task, the best performance of any physical reservoir reported on this task. More importantly, the device outperformed full simulation reservoir computing. The dramatic performance improvement of the ion gating reservoir with our deep-reservoir computing architecture paves the way for high-performance, large-scale physical neural network devices.
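For context, the benchmark and metric quoted above are conventionally defined as follows; this is the standard formulation of the second-order NARMA task and the normalized mean squared error, and the paper's exact conventions may differ:

```latex
y_{t+1} = 0.4\,y_t + 0.4\,y_t y_{t-1} + 0.6\,u_t^3 + 0.1,
\qquad
\mathrm{NMSE} = \frac{\sum_t (\hat{y}_t - y_t)^2}{\sum_t (y_t - \bar{y})^2}
```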
... Unlike the χ⁽³⁾ nonlinearity that is ubiquitous across all material platforms, the second-order (χ⁽²⁾) nonlinearity is present only in noncentrosymmetric media. As a lower-order nonlinearity, χ⁽²⁾ effects are more efficient given proper phase matching, which also makes them interesting for nonlinear optical computation. In this Article, we demonstrate a large-scale photonic NN that combines linear scattering and χ⁽²⁾ optical nonlinearity for a wide range of ML applications. The core processing unit consists of a disordered polycrystalline lithium niobate (LN) slab assembled from nanocrystals 41, which not only is multiply scattering but also generates second-harmonic (SH) light assisted by random quasi-phase-matching 42. ...
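As a numerical caricature of this scheme (an assumption-laden sketch, not a model of the actual optics), multiple scattering can be treated as a fixed complex random matrix acting on the input field, with random quasi-phase-matched SHG adding a term quadratic in the field before the camera records intensities:

```python
import numpy as np

rng = np.random.default_rng(3)

# Scaled-down stand-ins for the demonstrated 27,648 inputs and up to
# 3,500 nonlinear outputs.
n_in, n_out = 1024, 256

# Fixed complex random matrices standing in for multiple scattering.
A = (rng.normal(size=(n_out, n_in)) + 1j * rng.normal(size=(n_out, n_in))) / np.sqrt(n_in)
B = (rng.normal(size=(n_out, n_in)) + 1j * rng.normal(size=(n_out, n_in))) / np.sqrt(n_in)

def speckle_features(x):
    linear = np.abs(A @ x) ** 2                   # linear speckle, detected as intensity
    second_harmonic = np.abs((B @ x) ** 2) ** 2   # SH field ~ (field)^2, then intensity
    return linear, second_harmonic

x = rng.random(n_in)                 # flattened input image
lin_feats, nl_feats = speckle_features(x)
# A linear classifier trained on nl_feats realizes the nonlinear random
# projection benchmarked against lin_feats (linear random projection).
```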
Article
Neural networks find widespread use in scientific and technological applications, yet their implementations in conventional computers have encountered bottlenecks due to ever-expanding computational needs. Photonic computing is a promising neuromorphic platform with potential advantages of massive parallelism, ultralow latency and reduced energy consumption but mostly for computing linear operations. Here we demonstrate a large-scale, high-performance nonlinear photonic neural system based on a disordered polycrystalline slab composed of lithium niobate nanocrystals. Mediated by random quasi-phase-matching and multiple scattering, linear and nonlinear optical speckle features are generated as the interplay between the simultaneous linear random scattering and the second-harmonic generation, defining a complex neural network in which the second-order nonlinearity acts as internal nonlinear activation functions. Benchmarked against linear random projection, such nonlinear mapping embedded with rich physical computational operations shows improved performance across a large collection of machine learning tasks in image classification, regression and graph classification. Demonstrating up to 27,648 input and 3,500 nonlinear output nodes, the combination of optical nonlinearity and random scattering serves as a scalable computing engine for diverse applications.
... Existing backpropagation-based training strategies for neuromorphic platforms include in-silico training, which requires a faithful digital model of the system, and physics-aware backpropagation [4], which combines physical inference with a simulated backward pass and thereby relaxes these constraints. However, it is a central question whether not only inference but also training can exploit the physical dynamics [5], making full use of the energy efficiency of neuromorphic systems. ...
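A minimal sketch of that physics-aware pattern under toy assumptions (the "hardware" here is a noisy black-box function and the digital model an idealized copy; all names and dynamics are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(4)

def physical_forward(x, theta):
    """Stand-in for hardware inference: the true physics, including
    effects (here, noise) that the digital model does not capture."""
    return np.tanh(theta * x) + 0.01 * rng.normal(size=x.shape)

def model_backward(x, theta, grad_y):
    """Simulated backward pass: gradients of the idealized digital
    model y = tanh(theta * x), not of the hardware itself."""
    y_model = np.tanh(theta * x)
    return np.sum(grad_y * (1.0 - y_model ** 2) * x)  # dL/dtheta

x = rng.normal(size=8)
target = np.tanh(0.9 * x)            # reachable toy target
theta, lr = 0.1, 0.1
for _ in range(200):
    y = physical_forward(x, theta)   # inference runs on the "hardware"
    theta -= lr * model_backward(x, theta, 2.0 * (y - target))
```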
Preprint
The widespread adoption of machine learning and artificial intelligence in all branches of science and technology has created a need for energy-efficient, alternative hardware platforms. While such neuromorphic approaches have been proposed and realised for a wide range of platforms, physically extracting the gradients required for training remains challenging as generic approaches only exist in certain cases. Equilibrium propagation (EP) is such a procedure that has been introduced and applied to classical energy-based models which relax to an equilibrium. Here, we show a direct connection between EP and Onsager reciprocity and exploit this to derive a quantum version of EP. This can be used to optimize loss functions that depend on the expectation values of observables of an arbitrary quantum system. Specifically, we illustrate this new concept with supervised and unsupervised learning examples in which the input or the solvable task is of quantum mechanical nature, e.g., the recognition of quantum many-body ground states, quantum phase exploration, sensing and phase boundary exploration. We propose that in the future quantum EP may be used to solve tasks such as quantum phase discovery with a quantum simulator even for Hamiltonians which are numerically hard to simulate or even partially unknown. Our scheme is relevant for a variety of quantum simulation platforms such as ion chains, superconducting qubit arrays, neutral atom Rydberg tweezer arrays and strongly interacting atoms in optical lattices.
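For reference, classical EP, as formulated for energy-based models that relax to an equilibrium, estimates gradients by comparing a free equilibrium s⋆⁰ with one weakly nudged toward the target, s⋆ᵝ, where E is the energy, θ the parameters, and β the nudge strength:

```latex
\frac{\partial \mathcal{L}}{\partial \theta}
\;\approx\; \frac{1}{\beta}
\left( \left.\frac{\partial E}{\partial \theta}\right|_{s_\star^{\beta}}
     - \left.\frac{\partial E}{\partial \theta}\right|_{s_\star^{0}} \right)
```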
... Isomorphic PNNs perform mathematical transformations by designing hardware for strict, operation-by-operation mathematical isomorphism, such as memristor crossbars for performing matrix-vector multiplications (see Fig. 1a). In contrast, broken-isomorphism PNNs break mathematical isomorphism to directly train the hardware's physical transformations 30 . One complication with broken-isomorphism PNNs is that it is often unknown what features are required for universal computation or universal function approximation. ...
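The isomorphism in the crossbar case is direct: Ohm's law and Kirchhoff's current law make the array compute a matrix-vector product in analog. A minimal sketch (dimensions and conductance values are illustrative):

```python
import numpy as np

# With row voltages V applied across programmed conductances G, each
# column current is I_j = sum_i G_ij V_i, i.e. a matrix-vector product.
G = np.random.default_rng(5).uniform(0.0, 1e-3, size=(4, 3))  # conductances, siemens
V = np.array([0.2, 0.5, 0.1, 0.3])                            # row voltages, volts
I = G.T @ V                                                   # column currents, amperes
```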
... The notion of trainable broken-isomorphism PNNs emerged, in part, from untrained physical systems being used for machine learning: physical reservoir computing 46 . There were also several theoretical proposals 70-72 of broken-isomorphism PNNs prior to the general framework and experimental demonstrations presented in Ref. 30 . Broken-isomorphism PNNs could potentially perform certain computations much more efficiently than digital methods, leading to a path for more scalable, energy-efficient, and faster machine learning. ...
... Digital forward models might fail to capture all physical phenomena in the actual PNN hardware, such as detection noise, misalignment, and fabrication and material imperfections 30,34, among other experimental factors; this challenges the accurate deployment of trained PNNs at large scale, across many devices. The computational demands of these forward-model simulations form another potential hurdle. ...
Preprint
Physical neural networks (PNNs) are a class of neural-like networks that leverage the properties of physical systems to perform computation. While PNNs are so far a niche research area with small-scale laboratory demonstrations, they are arguably one of the most underappreciated opportunities in modern AI. Could we train AI models 1000x larger than current ones? Could we do this and also have them perform inference locally and privately on edge devices, such as smartphones or sensors? Research over the past few years has shown that the answer to all these questions is likely "yes, with enough research": PNNs could one day radically change what is possible and practical for AI systems. Doing so, however, will require rethinking both how AI models work and how they are trained, primarily by considering the problems through the constraints of the underlying hardware physics. To train PNNs at large scale, many methods, including backpropagation-based and backpropagation-free approaches, are now being explored. These methods have various trade-offs, and so far no method has been shown to scale to the same scale and performance as the backpropagation algorithm widely used in deep learning today. However, this is rapidly changing, and a diverse ecosystem of training techniques provides clues for how PNNs may one day be used both to create more efficient realizations of current-scale AI models and to enable unprecedented-scale models.
... Looking to the future, it is intriguing to contemplate the potential for advanced information acquisition systems to function inside increasingly intricate scattering spaces, facilitated by Neuroute's capacity to communicate physical information through scientific neural networks [51,52]. An additional valuable enhancement involves the intelligent Neuroute generation method, employing on-site learning to adapt to a more conventional open-loop operating system [53], thereby enhancing its resilience to unforeseen stimuli. ...
Article
Pushing the efficiency of information-state acquisition has been a long-held goal in reaching the measurement precision limit inside scattering spaces. Recent studies have indicated that maximal information states can be attained through engineered modes; however, partial intrusion is generally required. While non-invasive designs have been substantially explored across diverse physical scenarios, non-invasive acquisition of information states inside dynamic scattering spaces remains challenging due to the intractable non-unique mapping problem, particularly in multi-target scenarios. Here, we establish the feasibility of non-invasive information-state acquisition experimentally for the first time by introducing a tandem generative adversarial network framework inside dynamic scattering spaces. To illustrate the framework's efficacy, we demonstrate that efficient information-state acquisition for multi-target scenarios can reach the Fisher information limit solely through the external scattering matrix of the system. Our work provides insightful perspectives for precise measurements inside dynamic complex systems.
... However, these methods only consider the setting of closed-loop control. The combination of an exact forward pass with an approximate backward pass, which our methods are based on, has also been explored in different settings in the deep learning literature, such as spiking [Lee et al., 2016] or physical [Wright et al., 2022] neural networks, or networks that include nondifferentiable procedures, for example used for rendering [Niemeyer et al., 2020] or combinatorial optimization [Vlastelica et al., 2020]. ...
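A minimal sketch of this exact-forward/approximate-backward pattern, in the straight-through/surrogate-gradient spirit (the threshold, the surrogate, and the shapes are illustrative assumptions):

```python
import numpy as np

def forward_exact(x):
    """Exact forward operation: a hard threshold, nondifferentiable."""
    return (x > 0).astype(float)

def backward_surrogate(grad_out, x):
    """Approximate backward pass: differentiate a smooth sigmoid
    surrogate in place of the true step function."""
    s = 1.0 / (1.0 + np.exp(-x))
    return grad_out * s * (1.0 - s)

x = np.array([-1.0, 0.3, 2.0])
y = forward_exact(x)                               # exact forward
grad_x = backward_surrogate(np.ones_like(y), x)    # surrogate gradient
```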
Preprint
Reinforcement learning has traditionally focused on learning state-dependent policies to solve optimal control problems in a closed-loop fashion. In this work, we introduce the paradigm of open-loop reinforcement learning where a fixed action sequence is learned instead. We present three new algorithms: one robust model-based method and two sample-efficient model-free methods. Rather than basing our algorithms on Bellman's equation from dynamic programming, our work builds on Pontryagin's principle from the theory of open-loop optimal control. We provide convergence guarantees and evaluate all methods empirically on a pendulum swing-up task, as well as on two high-dimensional MuJoCo tasks, demonstrating remarkable performance compared to existing baselines.
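A minimal sketch of the open-loop idea under toy assumptions: the learned object is a fixed action sequence, not a policy, optimized by descending the rollout cost of a known pendulum-like model. A finite-difference gradient stands in for the adjoint (Pontryagin) computation of the actual methods, and all dynamics and constants are illustrative.

```python
import numpy as np

def rollout_cost(actions, dt=0.05):
    """Roll a known pendulum model forward under a fixed, open-loop
    action sequence; cost penalizes control effort and missing the
    upright state (theta = 0, measured from upright) at the end."""
    theta, omega, cost = np.pi, 0.0, 0.0     # start hanging down
    for a in actions:
        omega += (9.8 * np.sin(theta) + a) * dt
        theta += omega * dt
        cost += 0.001 * a ** 2
    return cost + np.arctan2(np.sin(theta), np.cos(theta)) ** 2 + 0.1 * omega ** 2

actions = np.zeros(60)                  # the learned object: a sequence, not a policy
for _ in range(100):                    # gradient descent on the whole sequence
    grad = np.empty_like(actions)
    for i in range(actions.size):       # finite differences stand in for the adjoint
        e = np.zeros_like(actions)
        e[i] = 1e-4
        grad[i] = (rollout_cost(actions + e) - rollout_cost(actions - e)) / 2e-4
    actions -= 0.5 * grad
```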
... One possibility is to act on the training procedure, for example by reintroducing back-propagation, as demonstrated in [13] and shown experimentally in [14]-[16], or by including knowledge about the dynamics of the computational substrate in the training procedure, as in the "physics-aware" training scheme proposed in [17]. A second possibility, which is the one pursued in this work, is to concatenate multiple reservoirs to form a more powerful network. ...
Article
Speech recognition is a critical task in the field of artificial intelligence (AI) and has witnessed remarkable advancements thanks to large and complex neural networks, whose training process typically requires massive amounts of labeled data and computationally intensive operations. An alternative paradigm, reservoir computing (RC), is energy efficient and is well adapted to implementation in physical substrates, but exhibits limitations in performance when compared with more resource-intensive machine learning algorithms. In this work, we address this challenge by investigating different architectures of interconnected reservoirs, all falling under the umbrella of deep RC (DRC). We propose a photonic-based deep reservoir computer and evaluate its effectiveness on different speech recognition tasks. We show specific design choices that aim to simplify the practical implementation of a reservoir computer while simultaneously achieving high-speed processing of high-dimensional audio signals. Overall, with the present work, we hope to help the advancement of low-power and high-performance neuromorphic hardware.
... Deep learning models, particularly convolutional neural networks [8] and transformer models [9,10], have become a central area of research and now surpass human-level performance on supervised, unsupervised, and semi-supervised tasks. However, even though transformers and convolutional neural networks are applied to multitasking, major challenges remain in multitask learning: the lack of universality in multi-domain models, complexity and overfitting, difficulty in interpreting model decisions, managing and balancing multiple inputs and outputs, maintaining model stability and accuracy, ensuring effective control of the model, and poor performance [11-14]. ...
Article
The application of deep learning has demonstrated impressive performance in computer vision tasks such as object detection, image classification, and image captioning. Though most models excel at performing single vision or language tasks, designing a single architecture that balances task specialization, performance, and adaptability across diverse tasks is challenging. To effectively address vision and language integration challenges, a combination of text embeddings and visual representation is necessary to understand dependencies of each subarea for multiple tasks. This paper proposes a single architecture that can handle various tasks in computer vision with fine-tuning capabilities for other specific vision and language tasks. The proposed model employs a modified DenseNet201 as a feature extractor (network backbone), an encoder-decoder architecture, and a task-specific head for inference. To tackle overfitting and improve precision, enhanced data augmentation and normalization techniques are employed. The model’s robustness is evaluated on over five datasets for different tasks: image classification, object detection, image captioning, and adversarial attack and defense. The experimental results demonstrate competitive performance compared to other works on CIFAR-10, CIFAR-100, Flickr8, Flickr30, Caltech10, and other task-specific datasets such as OCT, BreakHis, and so on. The proposed model is flexible and easy to adapt to new tasks, as it can also be extended to other vision and language tasks through fine-tuning with task-specific input indices.
... At their very core, neuromorphic computers (initially proposed by Mead 23,24) seek to mimic the intricate workings of the brain's nervous system and its vastly distributed nature [25-31]. There are two main approaches to developing neuromorphic systems. The first involves transferring and translating existing digital neural architectures to physical substrates, while the second pertains to creating new algorithms that more accurately emulate the functionality of biological neurons and synapses. ...
Article
The ability of mechanical systems to perform basic computations has gained traction over recent years, providing an unconventional alternative to digital computing in off grid, low power, and severe environments, which render the majority of electronic components inoperable. However, much of the work in mechanical computing has focused on logic operations via quasi-static prescribed displacements in origami, bistable, and soft deformable matter. Here, we present a first attempt to describe the fundamental framework of an elastic neuromorphic metasurface that performs distinct classification tasks, providing a new set of challenges, given the complex nature of elastic waves with respect to scattering and manipulation. Multiple layers of reconfigurable waveguides are phase-trained via constant weights and trainable activation functions in a manner that enables the resultant wave scattering at the readout location to focus on the correct class within the detection plane. We further demonstrate the neuromorphic system’s reconfigurability in performing two distinct tasks, eliminating the need for costly remanufacturing.