All content in this area was uploaded by Manuel Eugenio Morocho-Cayamcela on Mar 25, 2020
Learning to Communicate with Autoencoders:
Rethinking Wireless Systems with Deep Learning
Manuel Eugenio Morocho-Cayamcela, Judith Nkechinyere Njoku
Dept. of Electronic Engineering
Kumoh National Institute of Technology
Gumi, South Korea
{eugeniomorocho, judithnjoku24}@kumoh.ac.kr
Jeonghun Park, Wansu Lim
Dept. of IT Convergence Engineering
Kumoh National Institute of Technology
Gumi, South Korea
{wlsemrl8, wansu.lim}@kumoh.ac.kr
Abstract—The design and implementation of conventional
communication systems are based on strong probabilistic models
and assumptions. These fixed and conventional communication
theories exhibit limitations in the utilization of the limited
spectrum resources and the complexity of optimization for
emerging wireless applications. Currently, new generations of
wireless systems supported by artificial intelligence can learn
from the wireless spectrum data, and optimize their utilization to
enhance their performance. In this paper, we describe how deep
learning can be used to design an end-to-end communication
system using an encoder to replace the transmitter tasks such as
modulation and coding, and a decoder for the receiver tasks such
as demodulation and decoding. This flexible design can capture
channel impairments effectively and optimize the operations of
the transmitter and receiver jointly. We evaluate the case of a
single-antenna system, incorporating impairments in the channel
layer of the autoencoder and evaluating the response of different
neural network optimization algorithms.
Index Terms—Deep learning, autoencoders, wireless systems,
physical layer, channel estimation.
I. INTRODUCTION
The design and implementation of conventional communication
systems are built upon strong probabilistic models and
assumptions [1]. Furthermore, they struggle to carry theory over
to practice when handling the complexity of optimization for new
wireless applications with high degrees of freedom. Deep learning
has shown a high potential to address this challenge via
data-driven solutions, improving the utilization of limited
wireless spectrum resources [2]–[5]. Instead of following a rigid
design, new generations of wireless systems empowered by
cognitive radio can learn from spectrum data, and optimize their
spectrum utilization to enhance their performance. These smart
communication systems rely on various detection, classification,
and prediction tasks, such as signal detection and signal type
identification for spectrum sensing [6]. To address these tasks,
deep learning provides powerful automated means for communication
systems to learn from spectrum data and adapt to its dynamics
[7]. Wireless communications data come in large volumes and at
high rates, and are subject to interference and security threats
due to the shared nature of the medium [8], [9]. Traditional
modeling often falls short when capturing the delicate
relationships in highly complex spectrum data, whereas deep
learning has a robust capacity to meet the requirements (e.g.,
data rate, mobility, latency, connection density, energy
efficiency, and traffic capacity) of the next-generation mobile
and wireless communication systems [10].
In this paper, we rethink the approach to a conventional
communication system and describe how deep learning can
be used to design an end-to-end communication system using
an encoder to replace the transmitter functionalities such as
modulation and coding, and a decoder for the receiver func-
tionalities such as demodulation and decoding. We consider
an autoencoder to improve the performance of conventional
systems by jointly optimizing the communication between
the transmitter and the receiver, instead of optimizing their
individual modules. An autoencoder is a deep neural network
that consists of an encoder that learns a (latent) representation
of the given data, and a decoder that reconstructs the input data
from the encoded data. In this setting, joint modulation and
coding at the transmitter correspond to the encoder, and joint
decoding and demodulation at the receiver correspond to the
decoder. The proposed convolutional encoder-decoder design
captures channel impairments and optimizes the transmitter
and receiver operations jointly for a single-antenna system.
Additionally, we compare different optimizers for the param-
eter update of our design, as well as for the constellations
generated by the autoencoder. Results demonstrate the power
of deep learning optimization in providing novel means to
design wireless communications.
II. AN END-TO-END COMMUNICATION SEQUENCE WITH
DEEP LEARNING
A communication system comprises a transmitter, a receiver, and
a channel that carries the information between the transmitter
and the receiver. In his original paper on communication theory
[11], Claude E. Shannon stated that the fundamental problem of
communication is "reproducing at one point either exactly or
approximately a message selected at another point". That
statement mirrors the concept of a modern autoencoder, whose job
is to reconstruct a given input at its output. In this section,
we revisit the physical layer of a conventional communication
system design and reformulate it as an end-to-end reconstruction
task that aims to optimize the transmitter and receiver
components in a single operation.
978-1-7281-4985-1/20/$31.00 ©2020 IEEE
ICAIIC 2020
Fig. 1. Block diagram of a conventional communication system.
(Blocks shown. Transmitter: information source, source encoder,
channel encoder, modulator. Channel. Receiver: detection, channel
estimation, demodulator, channel decoder, source decoder,
reconstructed information.)
A. The Limitation of Conventional Communication Systems
Conventional communication systems are divided into multiple
independent blocks for the transmitter and receiver. These
independent pieces are optimized individually for different
tasks [12] (Fig. 1). Each block at the transmitter prepares
the signal for the effects of the communication channel and the
noise at the receiver. The source encoder compresses the
input data and removes redundancy. The channel encoder adds
redundancy to the output of the source encoder in a controlled
way. The modulator changes the characteristics of the signal
based on the required data rate. The transmitted signal is
then distorted and attenuated by the channel. On top of that,
the impairments of the receiver's hardware introduce extra
noise to the signal. The transmitter processes are reversed at
the receiver to recover the information. Optimizing these
processing blocks individually is known to be suboptimal,
given that it does not optimize the overall system collectively
[13]. In this conventional communication system, the
transmitter communicates one of the $M$ available messages
$s \in \mathcal{M} = \{1, 2, \ldots, M\}$ to the receiver, making
$n$ uses of the channel. The transmitter applies the modulation
$f: \mathcal{M} \to \mathbb{R}^n$ to the message $s$, and
generates the signal $x = f(s) \in \mathbb{R}^n$ to be
transmitted. Digital modulation maps the input symbols
from a discrete alphabet to complex numbers that represent
the points on the constellation diagram. The process of digital
modulation in conventional communication systems has fixed
and pre-established constellation diagrams. The desired data
rate determines the constellation scheme and the grouping of
the input bits for symbol construction. Linear decision regions
make it simple to decode the information at the receiver.
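To make the idea of a fixed, pre-established constellation concrete, the following minimal sketch maps bit pairs to Gray-coded QPSK points and detects them by minimum distance. QPSK, the mapping table, and the function names are illustrative assumptions and are not taken from the paper.

```python
import math

# Gray-mapped QPSK (assumed example): each 2-bit group maps to one fixed,
# unit-energy constellation point.
QPSK = {
    (0, 0): complex(1, 1) / math.sqrt(2),
    (0, 1): complex(-1, 1) / math.sqrt(2),
    (1, 1): complex(-1, -1) / math.sqrt(2),
    (1, 0): complex(1, -1) / math.sqrt(2),
}

def modulate(bits):
    """Group bits in pairs and map each pair to its constellation point."""
    pairs = zip(bits[0::2], bits[1::2])
    return [QPSK[p] for p in pairs]

def demodulate(symbols):
    """Minimum-distance detection: pick the closest constellation point."""
    bits = []
    for y in symbols:
        best = min(QPSK, key=lambda b: abs(y - QPSK[b]))
        bits.extend(best)
    return bits

tx_bits = [1, 0, 1, 0]
# A mild, fixed distortion stands in for the channel in this sketch.
rx = [x + complex(0.1, -0.2) for x in modulate(tx_bits)]
rx_bits = demodulate(rx)
```

Because the points and decision regions are fixed in advance, nothing in this chain adapts to the channel, which is the limitation the autoencoder approach targets.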
B. An End-to-End Optimization Process with Autoencoders
Fig. 2. An autoencoder-based end-to-end communication system.
(Blocks shown. The one-hot input $\mathbf{1}_s$ of message $s$
feeds the transmitter $f(s)$: dense layers, real-to-complex
mapping, normalization layer. The output $x$ passes through the
channel $p(y|x)$ with AWGN $n$. The receiver $g(y)$, with dense
layers, softmax activation, and argmax, maps $y$ to $\hat{s}$.)
As opposed to the independent block optimization of conventional
communication systems, deep learning can jointly optimize
multiple communication blocks at the transmitter and receiver by
training them as deep neural networks (DNNs). In an autoencoder
system for a single antenna, the output constellation diagrams
are not pre-defined but learned, based on the desired performance
metric to be minimized at the receiver (e.g., the symbol error
rate, coherence time, distance, or propagation loss). The
hardware of the transmitter imposes the following constraints
[14]:
(a) an energy constraint $\|x\|_2^2 \le n$,
(b) an amplitude constraint $|x_i| \le 1 \; \forall i$,
(c) an average power constraint $\mathbb{E}[|x_i|^2] \le 1 \; \forall i$ on $x$.
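As an illustration of how a normalization step can enforce such constraints, the sketch below scales a vector to meet the energy constraint (a) with equality and clips a vector to meet the amplitude constraint (b). Which constraint the authors' normalization layer enforces is not stated in the text, so the function names and the example vector are assumptions. Constraint (c) is an average over many transmitted vectors and is not shown.

```python
import math

def normalize_energy(x):
    """Scale x so that ||x||_2^2 = n, meeting constraint (a) with equality.

    Assumes x is not the all-zero vector.
    """
    n = len(x)
    norm = math.sqrt(sum(v * v for v in x))
    return [math.sqrt(n) * v / norm for v in x]

def clip_amplitude(x):
    """Clip each entry to [-1, 1], meeting constraint (b)."""
    return [max(-1.0, min(1.0, v)) for v in x]

x_raw = [3.0, -4.0]                  # hypothetical pre-normalization output
x_energy = normalize_energy(x_raw)   # total energy becomes n = 2
x_clipped = clip_amplitude(x_raw)    # every entry now lies in [-1, 1]
```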
The data rate of this system is computed as $R = k/n$
[bit/channel use], where $k = \log_2(M)$ represents the number
of input bits and $n$ includes both the input bits and the
redundant bits added to reduce channel effects. The notation
$(n, k)$ implies that a communication system sends one of the
$M = 2^k$ messages (i.e., $k$ bits) over $n$ channel uses.
Figure 2 illustrates a block diagram of the channel autoencoder.
The learning process exploits the distribution of the
communication channel data under impairments. The communication
channel is described by the conditional probability density
$p(y|x)$, where $y \in \mathbb{R}^n$ denotes the signal at the
receiver. The transmitted message $s$ arrives as $y$ at the
receiver, where the operation $g: \mathbb{R}^n \to \mathcal{M}$
is applied to produce the estimate $\hat{s}$. The channel
autoencoder is optimized end to end so that $s$ can be recovered
from $y$ with minimum probability of error. The autoencoder
components are summarized as follows:
1) Input: The input symbol $s$ is encoded as a one-hot
vector $\mathbf{1}_s$, that is, a vector that can only take legal
combinations of values with a single high '1' bit and all the
others low '0'. This encoding allows a state machine to run at a
faster clock rate than any other encoding of that state machine.
Determining the state of a one-hot vector has a low and constant
cost of accessing one flip-flop.
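A short sketch of the one-hot input together with the rate computation defined earlier may help here. The values M = 16 and n = 7 are illustrative choices corresponding to a (7, 4) system, and the helper name `one_hot` is hypothetical.

```python
import math

def one_hot(s, M):
    """Return the one-hot vector for message s in {1, ..., M}."""
    if not 1 <= s <= M:
        raise ValueError("message index out of range")
    return [1.0 if i == s else 0.0 for i in range(1, M + 1)]

M = 16                 # number of messages (illustrative value)
n = 7                  # channel uses per message (illustrative value)
k = int(math.log2(M))  # bits per message, k = log2(M)
R = k / n              # rate in bit/channel use, R = k/n

v = one_hot(3, M)      # a single 1.0 at position 3, zeros elsewhere
```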
2) Transmitter: The transmitter is composed of a feedforward
neural network (FNN) with multiple dense layers. The output of
the last dense layer is reshaped to represent two complex
numbers with real (in-phase, I) and imaginary (quadrature, Q)
parts for each modulated input symbol. The normalization
layer ensures that the physical constraints on $x$ are met.
3) Channel: The channel layer is not trainable, and is
represented by an additive white Gaussian noise (AWGN) layer
with variance $\beta = (2 R E_b/N_0)^{-1}$, where $E_b/N_0$ is
the ratio of the energy per bit ($E_b$) to the noise power
spectral density ($N_0$). The noise varies for every training
example, and it is used in the forward pass to distort the
transmitted signal, but neglected in the backward pass.
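The noise variance formula can be sketched as follows, assuming Eb/N0 is given in dB and converted to a linear ratio before use. The rate 4/7, the 7 dB operating point, and the example input vector are assumed values for illustration only.

```python
import math
import random

def noise_variance(ebno_db, R):
    """beta = (2 * R * Eb/N0)^-1, with Eb/N0 converted from dB to linear."""
    ebno = 10.0 ** (ebno_db / 10.0)
    return 1.0 / (2.0 * R * ebno)

def awgn(x, ebno_db, R, rng):
    """Add zero-mean Gaussian noise of variance beta to every entry of x."""
    sigma = math.sqrt(noise_variance(ebno_db, R))
    return [v + rng.gauss(0.0, sigma) for v in x]

rng = random.Random(0)          # fixed seed only for reproducibility
R = 4 / 7                       # illustrative rate of a (7, 4) system
beta = noise_variance(7.0, R)   # Eb/N0 = 7 dB is an assumed operating point
y = awgn([1.0, -1.0, 1.0, 1.0, -1.0, 1.0, -1.0], 7.0, R, rng)
```

Fresh noise is drawn on every call, which mirrors the statement that the noise varies for every training example.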
4) Receiver: Similar to the transmitter, the receiver is
implemented as an FNN. The softmax activation of its last layer
outputs the probability vector $p \in (0, 1)^M$ over all possible
messages. The index of the element of $p$ with the highest
probability is selected as $\hat{s}$.
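The final decision step at the receiver can be sketched as a softmax followed by an argmax. The logits below are hypothetical last-layer outputs for M = 4 messages, not values from the paper.

```python
import math

def softmax(z):
    """Numerically stable softmax: subtract the max before exponentiating."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    total = sum(e)
    return [v / total for v in e]

def decide(p):
    """Return the 1-based message index with the highest probability."""
    return max(range(len(p)), key=lambda i: p[i]) + 1

logits = [0.2, 2.5, -1.0, 0.3]   # hypothetical last-layer outputs, M = 4
p = softmax(logits)              # entries lie in (0, 1) and sum to 1
s_hat = decide(p)                # index of the largest entry of p
```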
5) Training: The autoencoder is trained with different
optimizers to update the weights of the FNNs and compare their
behavior. The optimizers used during training are: stochastic
gradient descent (SGD), root mean square propagation (RMSprop),
adaptive gradient (Adagrad), adaptive learning rate
(Adadelta), adaptive moment estimation (Adam), Adam-based
infinity norm (Adamax), and Nesterov Adam (Nadam). The training
batch is the set of all possible messages $s \in \mathcal{M}$.
The gradient is derived from a categorical cross-entropy loss
function between $\mathbf{1}_s$ and $p$.
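The loss described above can be sketched as the categorical cross-entropy between the one-hot target and the receiver output, averaged over a batch containing every message once. The probability table `P` is a hypothetical receiver output for M = 2, and the averaging over the batch is an assumption about the reduction used.

```python
import math

def cross_entropy(one_hot_s, p):
    """Categorical cross-entropy: -sum_i 1_s[i] * log(p[i])."""
    return -sum(t * math.log(q) for t, q in zip(one_hot_s, p) if t > 0)

def batch_loss(P):
    """Mean loss over the batch of all M messages.

    Row s-1 of P is the receiver output p when message s was sent.
    """
    M = len(P)
    total = 0.0
    for s in range(M):
        target = [1.0 if i == s else 0.0 for i in range(M)]
        total += cross_entropy(target, P[s])
    return total / M

# Hypothetical receiver outputs for M = 2 messages.
P = [[0.9, 0.1],
     [0.2, 0.8]]
loss = batch_loss(P)   # equals -(log(0.9) + log(0.8)) / 2
```

Only the probability assigned to the transmitted message contributes to the loss, so the loss reaches zero when the receiver is certain and correct for every message.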
III. SIMULATION RESULTS AND PERFORMANCE
EVALUATION
The autoencoder uses the data generated for transmission
and the same data at the reception point. The autoencoder
is considered an unsupervised learning system since the
data used is not labeled externally. This concept allows the
autoencoder to learn without any prior knowledge. According
to [15], an autoencoder can achieve performance equivalent to
that of the Hamming (7, 4) code with maximum likelihood decoding
(MLD). The autoencoder achieves the same block error rate (BLER)
as uncoded BPSK for a (2, 2) system, and outperforms uncoded
BPSK for an (8, 8) system. We have reproduced the latter results
and found that the autoencoder learns the coding and
modulation scheme by jointly optimizing the cost function
for the entire end-to-end model. Optimizing the encoder and
decoder together forces the autoencoder to extract only the
features that are necessary to characterize the input data and
to store them in the bottleneck layer (i.e., where the smaller
and dense representations are). After the training stage, the
autoencoder has learned a compression scheme heavily tailored to
the specific communication system. Figure 3 presents a block
diagram of the simulated autoencoder architecture used to
compare the different optimizers. Figure 4 shows how the
loss decreases until it converges to almost zero after around
100 epochs. The plot of SNR vs. BLER of
our autoencoder (1,2) with different optimizers can be seen
in Fig. 5. We found that our autoencoder trained with a
categorical cross-entropy loss and optimized with Adadelta [16]
gives the best performance in terms of BLER across the SNR
range. The constellations generated by our
autoencoder with diverse optimizers are illustrated in Fig. 6.
Fig. 3. Block diagram of the autoencoder used to compare the optimizers.
(Layers shown, in order: input_1 (InputLayer), dense_1 (Dense),
dense_2 (Dense), batch_normalization_1 (BatchNormalization),
lambda_1 (Lambda), gaussian_noise_1 (GaussianNoise),
dense_3 (Dense), dense_4 (Dense).)
Fig. 4. The number of epochs against the loss of Adadelta optimization.
We notice that for some optimizers, e.g., SGD, the constellation
points deviate from their ideal positions. This deviation
increases the modulation error at the receiver, which agrees
with Fig. 5, where SGD tends to diverge as we increase the
SNR.
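For readers who want a reference point for the BLER comparisons, the following sketch estimates the block error rate of uncoded BPSK over AWGN by Monte Carlo simulation. It is a baseline under stated assumptions (unit-energy symbols, rate R = 1, block length n = 2, 20000 trials per point) and does not reproduce the autoencoder results. The function name `bpsk_bler` is hypothetical.

```python
import math
import random

def bpsk_bler(ebno_db, n, trials, rng):
    """Estimate the block error rate of uncoded BPSK over AWGN.

    A block of n bits counts as one error if any of its bits is wrong.
    """
    ebno = 10.0 ** (ebno_db / 10.0)
    sigma = math.sqrt(1.0 / (2.0 * ebno))   # R = 1 for uncoded BPSK
    errors = 0
    for _ in range(trials):
        bits = [rng.randint(0, 1) for _ in range(n)]
        # Map 0 -> +1 and 1 -> -1, add noise, then decide by sign.
        rx = [(1.0 - 2.0 * b) + rng.gauss(0.0, sigma) for b in bits]
        decided = [0 if r >= 0.0 else 1 for r in rx]
        if decided != bits:
            errors += 1
    return errors / trials

rng = random.Random(1)   # fixed seed only for reproducibility
bler_0db = bpsk_bler(0.0, n=2, trials=20000, rng=rng)
bler_10db = bpsk_bler(10.0, n=2, trials=20000, rng=rng)
```

In theory the bit error rate of BPSK is Q(sqrt(2 Eb/N0)), so the estimate at 0 dB should land near 0.15 for two-bit blocks, and the estimate at 10 dB should be far smaller.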
IV. CONCLUSIONS AND FUTURE WORK
We have reviewed how deep learning architectures can help
in the optimization of communication systems. First, we
Fig. 5. Signal to noise ratio vs. block error rate for different autoencoder
(AE) optimizers.
Fig. 6. Constellations generated by our autoencoder under different parameter
optimizers.
discussed how to formulate a transmitter and receiver as an
autoencoder for the physical layer. We have used an end-to-end
optimization for the reconstruction loss, instead of
optimizing the individual blocks of a conventional communication
system (e.g., synchronization, symbol estimation, error
correction, channel coding, and modulation). We showed that
this formulation makes it possible to capture channel
impairments of single-antenna systems, and can match modulation
baselines by just applying off-the-shelf DNNs. Future work in the
field might include channel generalization by scaling from a
simple AWGN model to more complex real-world channels.
This channel generalization might be studied by combining
generative RF models with discriminative RF models in an
adversarial way to improve both. Additionally, researchers
may leverage known propagation theory and physics
to propose better impairment models. On the autoencoder
side, several learning strategies can be studied, such as
different weight initializations, hyperparameter selection, and
emerging autoencoder architectures. Moreover, additional
autoencoders may be employed to extend this approach to
multi-user and multiple-antenna systems. It would be
interesting to see what new solutions autoencoders yield as
we scale systems. Finally, this work may be transferred to
specific domains, such as satellite communications, backhaul
radios, dense urban wireless, and 5G MIMO. There is still a
wide body of engineering knowledge that researchers might
incorporate to take advantage of autoencoders in wireless
communications and finally enable a fully deep learning-based
communication system.
ACKNOWLEDGMENT
This work was supported by Kumoh National Institute of
Technology (2019-104-155), and by the Technology Develop-
ment Program (S2508336) funded by the Ministry of SMEs
and Startups (MSS, Korea).
REFERENCES
[1] T. L. Marzetta, E. G. Larsson, H. Yang, and H. Q. Ngo, Fundamentals
of Massive MIMO, 1st ed. Cambridge, United Kingdom: Cambridge
University Press, 2016.
[2] T. O’Shea and J. Hoydis, “An Introduction to Deep Learning for the
Physical Layer,” IEEE Transactions on Cognitive Communications and
Networking, vol. 3, no. 4, pp. 563–575, Dec. 2017.
[3] M. E. Morocho-Cayamcela and W. Lim, “Finding the optimal path for
V2V multi-hop connectivity with Q-learning and Convolutional Neural
Networks,” in 2019 Korean Institute of Communications and
Information Sciences Conference (KICS), Jeju, South Korea, Jun. 2019.
[4] L. Wang and D. T. Delaney, “QoE Oriented Cognitive Network
Based on Machine Learning and SDN,” in 2019 IEEE 11th
International Conference on Communication Software and
Networks (ICCSN). IEEE, Jun. 2019, pp. 678–681.
[5] M. E. Morocho-Cayamcela and W. Lim, “Proposed cost function using
wireless propagation for self-organizing networks,” in 2019
Korean Institute of Communications and Information Sciences
Conference (KICS), Seoul, South Korea, Nov. 2019, pp. 172–174.
[6] M. E. Morocho-Cayamcela and W. Lim, “Artificial Intelligence in 5G
Technology: A Survey,” in 2018 International Conference on
Information and Communication Technology Convergence (ICTC), 2018,
pp. 860–865.
[7] I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning, 1st ed.,
T. Dietterich, Ed. The MIT Press, 2016.
[8] A. Osseiran, J. F. Monserrat, and P. Marsch, 5G Mobile and Wireless
Communications Technology, 1st ed. Cambridge University Press, 2017.
[9] M. E. Morocho-Cayamcela, S. R. Angsanto, W. Lim, and A. Caliwag,
“An artificially structured step-index metasurface for 10GHz leaky
waveguides and antennas,” in 2018 IEEE 4th World Forum on Internet of
Things (WF-IoT). IEEE, Feb. 2018, pp. 568–573.
[10] M. E. Morocho-Cayamcela, H. Lee, and W. Lim, “Machine Learning for
5G/B5G Mobile and Wireless Communications: Potential, Limitations,
and Future Directions,” IEEE Access, vol. 7, pp. 137184–137206,
Sep. 2019.
[11] C. E. Shannon, “A Mathematical Theory of Communication,” Bell
System Technical Journal, vol. 27, no. 3, pp. 379–423, 1948.
[12] E. Björnson, J. Hoydis, and L. Sanguinetti, Massive MIMO Networks,
1st ed. Pisa, Italy: now Publishers Inc., 2019, vol. 1.
[13] S. Dörner, S. Cammerer, J. Hoydis, and S. ten Brink, “Deep Learning
Based Communication Over the Air,” IEEE Journal of Selected Topics
in Signal Processing, vol. 12, no. 1, pp. 132–143, Feb. 2018.
[14] T. Erpek, T. J. O’Shea, Y. E. Sagduyu, Y. Shi, and T. C. Clancy,
“Deep Learning for Wireless Communications,” in Development and
Analysis of Deep Learning Architectures. Springer, Cham, 2020, pp.
223–266.
[15] T. J. O’Shea, T. Erpek, and T. C. Clancy, “Physical layer deep learning of
encodings for the MIMO fading channel,” in 2017 55th Annual
Allerton Conference on Communication, Control, and
Computing (Allerton). IEEE, Oct. 2017, pp. 76–80.
[16] M. D. Zeiler, “ADADELTA: An Adaptive Learning Rate Method,”
arXiv:1212.5701v1, Dec. 2012.