Figure - available from: New Journal of Physics
A typical PQC with two qubits and two layers. The ZZ gates in the entangling layer are trainable, with parameter ν_e. The architecture is composed of alternating layers of encoding unitaries U_enc(s, λ) and variational unitaries U_var(ν). The entangling layer can instead use CZ gates, which have no trainable parameters. For L layers and n ⩾ 3 qubits, the number of trainable parameters with ZZ entangling gates in a circular structure is (6L−3)n. With CZ entangling gates, the number of parameters scales as (5L−2)n.
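
As a quick check of the two counting formulas in the caption, the following minimal sketch evaluates (6L−3)n and (5L−2)n for a given number of layers and qubits. The function name `pqc_parameter_count` and its `entangler` argument are hypothetical names chosen for illustration and do not come from the source article. The n ⩾ 3 restriction is enforced only for the ZZ case, because the caption states it only there.

```python
def pqc_parameter_count(num_layers: int, num_qubits: int, entangler: str = "zz") -> int:
    """Evaluate the parameter-count formulas quoted in the figure caption.

    entangler="zz": trainable ZZ gates, circular structure -> (6L - 3) * n
    entangler="cz": parameter-free CZ gates                -> (5L - 2) * n
    The function name and arguments are illustrative assumptions.
    """
    if num_layers < 1:
        raise ValueError("num_layers must be at least 1")
    if entangler == "zz":
        # The caption states the ZZ formula for n >= 3 qubits only.
        if num_qubits < 3:
            raise ValueError("the ZZ formula is stated for n >= 3 qubits")
        return (6 * num_layers - 3) * num_qubits
    if entangler == "cz":
        if num_qubits < 1:
            raise ValueError("num_qubits must be at least 1")
        return (5 * num_layers - 2) * num_qubits
    raise ValueError("entangler must be 'zz' or 'cz'")


if __name__ == "__main__":
    for layers, qubits in [(2, 3), (5, 4)]:
        zz = pqc_parameter_count(layers, qubits, "zz")
        cz = pqc_parameter_count(layers, qubits, "cz")
        print(f"L={layers}, n={qubits}: ZZ={zz}, CZ={cz}, difference={zz - cz}")
```

For L = 2 and n = 3 this gives 27 parameters with ZZ gates and 24 with CZ gates. Subtracting the two formulas shows that the ZZ variant carries (L−1)n more parameters than the CZ variant.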

Source publication
Article
Investigating quantum advantage in the NISQ era is a challenging problem, and quantum machine learning has become the most promising application to turn to. However, no proposal has yet investigated the arguably challenging task of inverse reinforcement learning as a way to demonstrate this potential advantage. In this work, we propose a hybrid quantum...

Citations

... Moreover, Meyer et al. [23] investigated the impact of post-processing methods for variational quantum circuits (VQC) on the performance of quantum policy gradients. Several improved quantum policy gradient algorithms have been proposed, including actor-critic [24], soft actor-critic (SAC) [9], [25], deep deterministic policy gradient (DDPG) [26], quantum asynchronous advantage actor-critic (A3C) [27], and generative adversarial RL [28], aiming to enhance the efficiency and effectiveness of QRL methods. QRL has found applications in quantum control [29] and has been extended to multi-agent settings with promising results [30]-[32]. ...
... Wu et al. [39] introduced a quantum version of deep deterministic policy gradient (DDPG), and Chen et al. [7] presented a quantum asynchronous advantage actor-critic (A3C) algorithm. Xiao et al. [40] introduced generative adversarial RL in the quantum regime. These modifications aim to further enhance the efficiency and effectiveness of QRL methods. ...
Preprint
This paper introduces the QDQN-DPER framework to enhance the efficiency of quantum reinforcement learning (QRL) in solving sequential decision tasks. The framework incorporates prioritized experience replay and asynchronous training into the training algorithm to reduce the high sampling complexities. Numerical simulations demonstrate that QDQN-DPER outperforms the baseline distributed quantum Q learning with the same model architecture. The proposed framework holds potential for more complex tasks while maintaining training efficiency.
Article
An important practical problem in the field of quantum metrology and sensors is to find the optimal sequences of controls for the quantum probe that realize optimal adaptive estimation. In Belliardo et al. [arXiv:2312.16985], we solved this problem in general by introducing a procedure capable of optimizing a wide range of tasks in quantum metrology and estimation by combining model-aware reinforcement learning with Bayesian inference. We take a model-based approach to the optimization, in which the physics describing the system is explicitly taken into account in the training through automatic differentiation. In this follow-up paper we present some applications of the framework. The first family of examples concerns the estimation of magnetic fields, hyperfine interactions, and decoherence times for electronic spins in diamond. For these examples, we perform multiple Ramsey measurements on the spin. The second family of applications concerns the estimation of phases and coherent states on photonic circuits, without squeezing elements, where the bosonic lines are measured by photon counters. This exposition showcases the broad applicability of the method, which has been implemented in the qsensoropt library, released on PyPI and installable with pip.