Jun Qi
Fudan University

Doctor of Philosophy

About

48 Publications
10,234 Reads
1,206 Citations
Introduction
I received my Ph.D. from the School of Electrical and Computer Engineering at the Georgia Institute of Technology, advised by Prof. Chin-Hui Lee and Prof. Xiaoli Ma. I will be joining the Department of Electronic Engineering at Fudan University, Shanghai, as an Assistant Professor.

Publications (48)
Conference Paper
Full-text available
Distributed deep neural networks are commonly employed for building automatic speech recognition (ASR) systems. In this work, we employ the robust submodular partitioning approach, which aims to split the training data into small disjoint data subsets and use each of these subsets to train a particular deep neural network. Two efficient algorithms...
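For intuition, here is a minimal sketch of greedy data partitioning under a facility-location utility; it is an illustrative stand-in, not the paper's robust submodular formulation or its two algorithms, and all names and shapes below are assumptions:

```python
# Greedy balanced partitioning of training items into k disjoint blocks,
# assigning each item where the facility-location marginal gain is largest.
import numpy as np

def facility_location_gain(block, item, sim):
    """Marginal gain of adding `item` to `block` under
    F(S) = sum_j max_{i in S} sim[i, j]."""
    if not block:
        return sim[item].sum()
    current = sim[list(block)].max(axis=0)
    return np.maximum(current, sim[item]).sum() - current.sum()

def greedy_partition(sim, k, rng):
    n = sim.shape[0]
    blocks = [set() for _ in range(k)]
    cap = int(np.ceil(n / k))  # keep the blocks balanced in size
    for item in rng.permutation(n):
        open_blocks = [b for b in range(k) if len(blocks[b]) < cap]
        gains = [facility_location_gain(blocks[b], item, sim) for b in open_blocks]
        blocks[open_blocks[int(np.argmax(gains))]].add(int(item))
    return blocks

rng = np.random.default_rng(0)
emb = rng.normal(size=(100, 16))   # toy utterance embeddings
sim = emb @ emb.T                  # pairwise similarities
parts = greedy_partition(sim, k=4, rng=rng)
print([len(p) for p in parts])     # four disjoint subsets of 25
```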
Conference Paper
Full-text available
This work originated from the MLSP 2014 Classification Challenge, which aims to automatically detect subjects with schizophrenia and schizoaffective disorder by analyzing multi-modal features derived from magnetic resonance imaging (MRI) data. We employ Deep Neural Network (DNN)-based multi-view representation learning for combining multi-modal...
Conference Paper
Full-text available
The bottleneck (BN) feature, particularly based on deep structures, has gained significant success in automatic speech recognition (ASR). However, applying the BN feature to small/medium-scale tasks is nontrivial. An obvious reason is that the limited training data prevent training a complicated deep network; another reason, which is more sub...
Conference Paper
Full-text available
Recent work demonstrates impressive success of the bottleneck (BN) feature in speech recognition, particularly with deep networks plus appropriate pre-training. A widely admitted advantage associated with the BN feature is that the network structure can learn multiple environmental conditions with abundant training data. For tasks with limited t...
Conference Paper
Full-text available
A major challenge for automatic speech recognition (ASR) relates to significant performance reduction in noisy environments. Recent research has shown that auditory features based on Gammatone filters are promising to improve robustness of ASR systems against noise, though the research is far from extensive and generalizability of the new features...
Article
Full-text available
This work focuses on investigating an end-to-end learning approach for quantum neural networks (QNN) on noisy intermediate-scale quantum devices. The proposed model combines a quantum tensor network (QTN) with a variational quantum circuit (VQC), resulting in a QTN-VQC architecture. This architecture integrates a QTN with a horizontal or vertical s...
Preprint
Full-text available
The variational quantum circuit (VQC) is a promising approach for implementing quantum neural networks on noisy intermediate-scale quantum (NISQ) devices. Recent studies have shown that a tensor-train network (TTN) for VQC, namely TTN-VQC, can improve the representation and generalization powers of VQC. However, the barren plateau problem leads to the...
Article
Full-text available
The noisy intermediate-scale quantum devices enable the implementation of the variational quantum circuit (VQC) for quantum neural networks (QNN). Although the VQC-based QNN has succeeded in many machine learning tasks, the representation and generalization powers of VQC still require further investigation, particularly when the dimensionality of c...
Article
Full-text available
This work focuses on designing low-complexity hybrid tensor networks by considering trade-offs between the model complexity and practical performance. Firstly, we exploit a low-rank tensor-train deep neural network (TT-DNN) to build an end-to-end deep learning pipeline, namely LR-TT-DNN. Secondly, a hybrid model combining LR-TT-DNN with a convoluti...
Preprint
Full-text available
We propose an ensemble learning framework with Poisson subsampling to effectively train a collection of teacher models that provide a differential privacy (DP) guarantee for the training data. Through boosting under DP, a student model derived from the training data suffers little degradation relative to models trained with no privacy protection. Ou...
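The sampling step itself is simple; a minimal sketch of Poisson subsampling for a teacher ensemble is below (the paper's DP accounting and boosting are not reproduced, and all sizes are toy assumptions):

```python
# Poisson subsampling: each teacher sees each training example
# independently with probability q, the sampling scheme that standard
# DP amplification-by-subsampling analyses assume.
import numpy as np

def poisson_subsample(n_examples, q, rng):
    """Indices of the examples included in one teacher's subset."""
    return np.flatnonzero(rng.random(n_examples) < q)

rng = np.random.default_rng(0)
n, q, n_teachers = 10_000, 0.05, 20
teacher_subsets = [poisson_subsample(n, q, rng) for _ in range(n_teachers)]
print([len(s) for s in teacher_subsets][:5])  # roughly n * q each
# Each teacher is trained only on its own subset; a student model is then
# derived from aggregated teacher outputs to obtain the DP guarantee.
```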
Preprint
Full-text available
The noisy intermediate-scale quantum (NISQ) devices enable the implementation of the variational quantum circuit (VQC) for quantum neural networks (QNN). Although the VQC-based QNN has succeeded in many machine learning tasks, the representation and generalization powers of VQC still require further investigation, particularly when the dimensionali...
Preprint
Full-text available
This work focuses on designing low-complexity hybrid tensor networks by considering trade-offs between the model complexity and practical performance. Firstly, we exploit a low-rank tensor-train deep neural network (TT-DNN) to build an end-to-end deep learning pipeline, namely LR-TT-DNN. Secondly, a hybrid model combining LR-TT-DNN with a convoluti...
Preprint
Full-text available
The rapid development of quantum computing has demonstrated many unique characteristics of quantum advantages, such as richer feature representation and more secure protection of model parameters. This work proposes a vertical federated learning architecture based on variational quantum circuits to demonstrate the competitive performance of a quan...
Preprint
Full-text available
This work aims to design a low-complexity spoken command recognition (SCR) system by considering different trade-offs between the number of model parameters and classification accuracy. More specifically, we exploit a deep hybrid architecture of a tensor-train (TT) network to build an end-to-end SCR pipeline. Our command recognition system, namely...
Article
Multicarrier transmissions, such as orthogonal frequency/chirp division multiplexing (OF/CDM), offer high spectral efficiency and low complexity equalization in multipath fading channels at the cost of high peak-to-average power ratio (PAPR). High peak powers can occur randomly and may drive the power amplifier (PA) into saturation, resulting in no...
Preprint
Full-text available
This work investigates extending transfer learning, as used in classical machine learning, to the emerging hybrid end-to-end quantum neural network (QNN) for spoken command recognition (SCR). Our QNN-based SCR system is composed of classical and quantum components: (1) the classical part mainly relies on a 1D convolutional neural network (CNN)...
Preprint
Full-text available
The advent of noisy intermediate-scale quantum (NISQ) computers raises a crucial challenge to design quantum neural networks for fully quantum learning tasks. To bridge the gap, this work proposes an end-to-end learning framework named QTN-VQC, by introducing a trainable quantum tensor network (QTN) for quantum embedding on a variational quantum ci...
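As a rough illustration of the variational part, here is a minimal VQC sketch written with PennyLane (a library choice assumed here; the paper's trainable QTN embedding is replaced by plain angle encoding, and all sizes are illustrative):

```python
# A 4-qubit variational quantum circuit: angle-encode classical features,
# apply one layer of trainable rotations plus an entangling CNOT ring,
# and read out Pauli-Z expectation values.
import pennylane as qml
from pennylane import numpy as np

n_qubits = 4
dev = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(dev)
def vqc(inputs, weights):
    for i in range(n_qubits):
        qml.RY(inputs[i], wires=i)               # quantum embedding of inputs
    for i in range(n_qubits):
        qml.Rot(*weights[i], wires=i)            # trainable single-qubit gates
    for i in range(n_qubits):
        qml.CNOT(wires=[i, (i + 1) % n_qubits])  # entanglement
    return [qml.expval(qml.PauliZ(i)) for i in range(n_qubits)]

inputs = np.random.uniform(0, np.pi, n_qubits)
weights = np.random.uniform(0, 2 * np.pi, (n_qubits, 3))
print(vqc(inputs, weights))
```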
Article
Full-text available
This paper proposes a novel tensor-train deep neural network (TT-DNN) based channel estimator to tackle challenges of time-varying channel estimation in multiple-input multiple-output (MIMO) systems. A centralized DNN channel estimator can be realized by a distributed TT-DNN with parallel paths. The TT-DNN provides a compact representation by decom...
Preprint
Full-text available
We propose a novel decentralized feature extraction approach in federated learning to address privacy-preservation issues for speech recognition. It is built upon a quantum convolutional neural network (QCNN) composed of a quantum circuit encoder for feature extraction, and a recurrent neural network (RNN) based end-to-end acoustic model (AM). To e...
Preprint
This paper proposes to generalize the variational recurrent neural network (RNN) with variational inference (VI)-based dropout regularization employed for the long short-term memory (LSTM) cells to more advanced RNN architectures like gated recurrent unit (GRU) and bi-directional LSTM/GRU. The new variational RNNs are employed for slot filling, whi...
Article
Full-text available
In this paper, we exploit the properties of mean absolute error (MAE) as a loss function for the deep neural network (DNN) based vector-to-vector regression. The goal of this work is twofold: (i) presenting performance bounds of MAE, and (ii) demonstrating new properties of MAE that make it more appropriate than mean squared error (MSE) as a loss f...
Preprint
In this paper, we exploit the properties of mean absolute error (MAE) as a loss function for the deep neural network (DNN) based vector-to-vector regression. The goal of this work is two-fold: (i) presenting performance bounds of MAE, and (ii) demonstrating new properties of MAE that make it more appropriate than mean squared error (MSE) as a loss...
Preprint
In this paper, we show that, in vector-to-vector regression utilizing deep neural networks (DNNs), a generalized loss of mean absolute error (MAE) between the predicted and expected feature vectors is upper bounded by the sum of an approximation error, an estimation error, and an optimization error. Leveraging upon error decomposition techniques in...
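The contrast these papers draw is easy to see in code; a minimal sketch of MAE versus MSE for vector-to-vector regression follows (model and dimensions are illustrative assumptions, not the papers' setup):

```python
# MAE (L1) vs. MSE (L2) as training losses for a DNN that maps one
# feature vector to another (e.g., noisy speech features to clean ones).
import torch
import torch.nn as nn

dnn = nn.Sequential(
    nn.Linear(257, 1024), nn.ReLU(),
    nn.Linear(1024, 1024), nn.ReLU(),
    nn.Linear(1024, 257),              # predicts the target vector
)

mae = nn.L1Loss()    # the loss whose properties the papers analyze
mse = nn.MSELoss()   # the common baseline it is compared against

x = torch.randn(32, 257)       # toy noisy input vectors
target = torch.randn(32, 257)  # toy clean target vectors
pred = dnn(x)
print(mae(pred, target).item(), mse(pred, target).item())
# MAE's penalty grows linearly in the residual, so outlier frames
# dominate the gradient less than under MSE's quadratic penalty.
```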
Conference Paper
Full-text available
This paper investigates different trade-offs between the number of model parameters and enhanced speech qualities by employing several deep tensor-to-vector regression models for speech enhancement. We find that a hybrid architecture, namely CNN-TT, is capable of maintaining a good quality performance with a reduced model parameter size. CNN-TT is...
Preprint
Full-text available
This paper investigates different trade-offs between the number of model parameters and enhanced speech qualities by employing several deep tensor-to-vector regression models for speech enhancement. We find that a hybrid architecture, namely CNN-TT, is capable of maintaining a good quality performance with a reduced model parameter size. CNN-TT is...
Article
Full-text available
The state-of-the-art machine learning approaches are based on classical von Neumann computing architectures and have been widely used in many industrial and academic domains. With the recent development of quantum computing, researchers and tech giants have explored new quantum circuits for machine learning tasks. However, the existing quantum com...
Conference Paper
Full-text available
We propose a tensor-to-vector regression approach to multi-channel speech enhancement in order to address the issue of input size explosion and hidden-layer size expansion. The key idea is to cast the conventional deep neural network (DNN) based vector-to-vector regression formulation under a tensor-train network (TTN) framework. TTN is a recently...
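The tensor-train format underlying this line of work can be built with sequential truncated SVDs; a minimal TT-SVD sketch is below (shapes and ranks are illustrative assumptions, not the TTN layers used in the paper):

```python
# TT-SVD: factor a high-order tensor into a chain of 3-way cores,
# the compact representation that tensor-train networks exploit.
import numpy as np

def tt_svd(tensor, max_rank):
    dims = tensor.shape
    cores, r_prev = [], 1
    mat = np.asarray(tensor)
    for d in dims[:-1]:
        mat = mat.reshape(r_prev * d, -1)
        u, s, vt = np.linalg.svd(mat, full_matrices=False)
        r = min(max_rank, len(s))
        cores.append(u[:, :r].reshape(r_prev, d, r))
        mat = s[:r, None] * vt[:r]   # carry the remainder forward
        r_prev = r
    cores.append(mat.reshape(r_prev, dims[-1], 1))
    return cores

# Reconstruct and check the relative approximation error.
t = np.random.default_rng(0).random((4, 8, 8, 4))
cores = tt_svd(t, max_rank=6)
approx = cores[0]
for core in cores[1:]:
    approx = np.tensordot(approx, core, axes=([-1], [0]))
approx = approx.reshape(t.shape)
print(np.linalg.norm(t - approx) / np.linalg.norm(t))
```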
Article
Full-text available
In this paper, we show that, in vector-to-vector regression utilizing deep neural networks (DNNs), a generalized loss of mean absolute error (MAE) between the predicted and expected feature vectors is upper bounded by the sum of an approximation error, an estimation error, and an optimization error. Leveraging upon error decomposition techniques in...
Preprint
Full-text available
Recent studies have highlighted adversarial examples as ubiquitous threats to the deep neural network (DNN) based speech recognition systems. In this work, we present a U-Net based attention model, U-Net$_{At}$, to enhance adversarial speech signals. Specifically, we evaluate the model performance by interpretable speech recognition metrics and dis...
Preprint
Full-text available
Recent deep neural network-based techniques, especially those equipped with the ability of self-adaptation at the system level such as deep reinforcement learning (DRL), are shown to possess many advantages for optimizing robot learning systems (e.g., autonomous navigation and continuous robot arm control). However, the learning-based systems and t...
Preprint
Full-text available
We propose a tensor-to-vector regression approach to multi-channel speech enhancement in order to address the issue of input size explosion and hidden-layer size expansion. The key idea is to cast the conventional deep neural network (DNN) based vector-to-vector regression formulation under a tensor-train network (TTN) framework. TTN is a recently...
Conference Paper
Full-text available
Distributed automatic speech recognition (ASR) requires aggregating the outputs of distributed deep neural network (DNN)-based models. This work studies the use of submodular functions to design a rank aggregation on score-based permutations, which can be used for distributed ASR systems in both supervised and unsupervised modes. Specifically, we comp...
Preprint
Distributed automatic speech recognition (ASR) requires aggregating the outputs of distributed deep neural network (DNN)-based models. This work studies the use of submodular functions to design a rank aggregation on score-based permutations, which can be used for distributed ASR systems in both supervised and unsupervised modes. Specifically, we comp...
Article
Full-text available
This paper focuses on a theoretical analysis of deep neural network (DNN) based functional approximation. Leveraging upon two classical theorems on universal approximation, an artificial neural network (ANN) with a single hidden layer of neurons is used. With modified ReLU and Sigmoid activation functions, we first generalize the related concepts...
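A toy experiment makes the single-hidden-layer setting concrete; the sketch below (an assumed setup, not the paper's construction) fits one hidden ReLU layer to a smooth target:

```python
# One hidden layer approximating sin(x), in the spirit of the universal
# approximation theorems the paper builds on: error shrinks as the
# hidden width and training budget grow.
import torch
import torch.nn as nn

x = torch.linspace(-3, 3, 512).unsqueeze(1)
y = torch.sin(x)                                   # target function

net = nn.Sequential(nn.Linear(1, 64), nn.ReLU(), nn.Linear(64, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-2)

for _ in range(2000):
    opt.zero_grad()
    loss = nn.functional.mse_loss(net(x), y)
    loss.backward()
    opt.step()

print(f"final MSE: {loss.item():.6f}")
```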
Conference Paper
Full-text available
Huge training datasets for automatic speech recognition (ASR) typically contain redundant information so that a subset of data is generally enough to obtain similar ASR performance to that obtained when the entire dataset is employed for training. Although the centralized submodular-based data selection methods have been successfully applied to obt...
Article
Full-text available
Unsupervised rank aggregation on score-based permutations, which is widely used in many applications, has not been deeply explored yet. This work studies the use of submodular optimization for rank aggregation on score-based permutations in an unsupervised way. Specifically, we propose an unsupervised approach based on the Lovasz Bregman divergence...
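To fix notation for the problem these rank-aggregation papers study, here is a naive mean-score (Borda-style) baseline; it is not the submodular Lovasz-Bregman method the papers propose, and the data are toy assumptions:

```python
# Score-based rank aggregation: each row holds one model's scores for
# the candidates; averaging and sorting yields a consensus permutation.
import numpy as np

scores = np.array([
    [0.9, 0.2, 0.5, 0.7],   # model 1
    [0.8, 0.1, 0.6, 0.4],   # model 2
    [0.7, 0.3, 0.4, 0.9],   # model 3
])

mean_scores = scores.mean(axis=0)
consensus = np.argsort(-mean_scores)   # best candidate first
print(consensus)                       # -> [0 3 2 1]
```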
