End-to-End Learning-Based Full-Duplex Amplify-and-Forward Relay Networks

Ankit Gupta, Mathini Sellathurai and Tharmalingam Ratnarajah
Abstract—Full duplex (FD) relaying can provide double spec-
tral efficiency. Despite advanced self-interference cancellation
techniques, residual self-interference (RSI) limits the perfor-
mance significantly. We present an autoencoder (AE)-based block
coded modulation (BCM) and differential BCM (d-BCM) for
an FD amplify-and-forward (FD-AF) relay network that can
tackle the deteriorating impacts of RSI. Existing works treat
AE frameworks as black-box with minimal/no focus on training
convergence, limiting AE’s practical deployment. Focussing on
training convergence, firstly, we show that training of AE
converges above minimum signal-to-noise-ratio (SNR) and below
maximum RSI level, and CSI helps in faster convergence.
Secondly, we establish a relationship between training hyper-
parameters and AE-based BCM/d-BCM design, by showing that,
for any given hyper-parameters, training of the AE has converged
to its maximum potential of decoding if the AE's encoder has designed
$2^k$ codewords, with an emphasis on the minimum required
training samples. To open the black-box AE, we reveal five
observations in the AE-based designed codewords concerning
Euclidean distance, packing density, Hamming distance, and
kurtosis, that resemble the desired observations of theoretical
random coded modulations. By extensive simulations, we show
that the proposed AE outperforms the conventional methods
considerably for varying SNR, RSI, transmission rate, channel
estimation errors, and small/practical block lengths.
Index Terms—Amplify-and-forward, autoencoder, block coded
modulation, deep learning, differential block coded modulation,
full-duplex, neural networks, relay networks, and residual
self-interference.
I. INTRODUCTION
The end-to-end learning-based autoencoder (AE) framework has
emerged as a promising solution for block coded modulation (BCM)
design with channel state information (CSI) knowledge and for
differential BCM (d-BCM) design without CSI knowledge, achieving
significant bit-error-rate (BER) performance gains for rate
$R = k/n$ [bits/channel-reuse] [1]–[12]. The AE frameworks have been investigated
extensively for the amplify-and-forward (AF) [13], [15] and
decode-and-forward (DF) [14], [16] relaying protocols utilizing
the half-duplex (HD) mode in [4]–[8] and [9]–[11], respectively.

A. Gupta and M. Sellathurai are with the Engineering and physical science
(EPS) department at Heriot-Watt University, Edinburgh EH14 4AS, U.K.
(e-mail: {ag104, m.sellathurai}@hw.ac.uk). T. Ratnarajah is with the Institute
for Digital Communications, the University of Edinburgh, Edinburgh EH9
3FG, U.K. (e-mail: t.ratnarajah@ed.ac.uk).
This work is supported in part by the U.K. Engineering and Physical
Sciences Research Council under Grant EP/P009670/1, the COG-MHEAR:
Towards cognitively-inspired 5G IoT enabled, multi-modal Hearing Aids
under Grant EP/T021063/1, and the Signal Processing in the Information Age
under Grant EP/S000631/1.

Recently, the full-duplex (FD) relay has been recognized
as an enabling technology to realize the expected gains in
the future networks, as it can double the spectral efficiency
by establishing concurrent transmission and reception on the
same temporal and spectral resources [15]–[25]. However,
none of the AE frameworks in [1]–[12] have considered FD
relaying network, where the self-interference (SI) leaking from
the signal transmitted by the FD relay node interferes with
the signal received at the FD relay node, thereby limiting
the spectral efficiency gains. Recent works [15]–[21] show
that FD operation is made feasible by combining multiple
self-interference cancellation (SIC) methods, such as antenna
isolation, analog-domain suppression, and digital-domain suppression [22].
Even with multiple SIC techniques, a residual self-interference
(RSI) is always present in the system. Several studies have
focused on conventional ways of analysis and optimization
of FD-AF relaying in the presence of RSI [23]–[25], wherein,
the relay requires channel gains of the source to relay link and
the self-interference channel for determining the amplification
factor. Besides, the destination node needs the CSI of the
overall source-relay-destination channel, and sometimes the
channel gain of the self-interference channel. Further,
estimating the CSI increases the feedback overhead, which will
increase exponentially in future internet-of-things networks.
However, to the best of the authors' knowledge, differential
FD-AF relaying networks have never been studied. Further, none
of the prior AE or conventional works [1]–[25] have performed
BCM or d-BCM for the FD-AF relay networks.
In the seminal work of 1991 [26], Oliveira investigated
random short BCM design in 2-dimensional space using
the minimum squared Euclidean distance metric and showed
the existence of the optimum random BCM design, without
providing the method to obtain the same. The AE frameworks
provide us the method to perform the BCM/d-BCM design
without mathematical formulations, in 2n-dimensional space.
However, the AE frameworks [1]–[12] are considered “black
box” models, where insights into the obtained solutions almost
remain non-existent. The t-stochastic neighbor embedding (t-
SNE) [27] has been utilized for insights into the AE-based
BCM/d-BCM designed constellations in higher-dimensional
space [1], [6]. However, t-SNE only reveals the number of
constellation points (codewords) designed by the AE framework;
beyond that, it yields no relevant information about the
designed codewords. Thus, there is an urgent need to open the
black box of AE frameworks for their full understanding and
practical realization in future networks.
The process of training the AE frameworks includes determining
hyper-parameter settings, such as weight initialization,
activation functions, learning rate, batch size, etc. [27].
Although the training of neural networks (NNs) has seen
advancements, no universal technique exists to optimize these
hyper-parameter settings, leading to suboptimal choices that
force the AE to get stuck in a local minimum while minimizing
the cross entropy (CE) loss (c.f. [28], [29]). Thus, in the
existing literature [1]–[12], hyper-parameter settings are
obtained sub-optimally by a hit-and-trial method, which involves
training an AE framework with various hyper-parameter settings,
monitoring the validation CE loss, and picking the
hyper-parameter settings that give the minimum validation CE
loss during training. However, from the coded modulation
perspective, a major problem with determining the AE's
convergence by simply monitoring the validation CE loss is that
the validation CE loss for most non-optimal hyper-parameter
settings also reduces with training epochs; thus, we cannot
determine with certainty whether the AE-based design of the BCM
and d-BCM has converged by only monitoring the validation CE
loss. Thus, for any given hyper-parameter settings, we need to
determine the relationship between the BCM and d-BCM designs
performed by the AE and the hyper-parameter settings, which can
indicate whether the trained AE has converged to its maximum
potential. Furthermore, for greater insights into the training
convergence of AE frameworks in practical settings, we also need
to determine the impact of varying signal-to-noise-ratio (SNR)
and RSI levels, and of the presence/absence of CSI knowledge, on
the training convergence.
The major contributions of this work are summarized as follows:
• We propose a bit-wise AE-based FD-AF relay network, where we
  consider an NN-based encoder-decoder at the source and
  destination nodes, and a conventional FD-AF relay node
  operating in the presence of RSI. Depending on the
  availability of the CSI knowledge, we consider three
  scenarios. Firstly, we propose AE-based BCM design in the
  presence of perfect CSI knowledge. Secondly, we also analyze
  the proposed AE-based BCM design with imperfect CSI knowledge.
  Thirdly, we completely remove the necessity of CSI knowledge
  by proposing differential FD-AF relay networks, where (i) we
  design the amplification factor for the conventional FD-AF
  relay node by including the second-order channel statistics of
  the RSI, and (ii) we propose AE-based d-BCM design. In
  contrast to the existing literature [1], [5]–[10], which has
  analyzed AE frameworks with the rates $R = 4/7, 8/7$, we design
  a single NN architecture for the AE framework that can handle
  varying high-rate transmissions such as
  $R = \{4/7, 8/7, 12/7, 16/7\}$ for short block length ($n = 7$).
  Furthermore, we train the AE to generalize well across the
  testing SNR or RSI levels.
• Focusing on the training convergence of the proposed AE
  framework for the FD-AF relay network, we show that:
  - For any given hyper-parameter settings, the two necessary
    conditions for the training convergence are:
    C1: The validation CE loss of the AE has converged with
        respect to the training epochs and number of training
        samples.
    C2: The NN encoder of the AE designs $2^k$ codewords.
  - The training of the AE, of sufficiently large block length
    ($n$), converges to a local minimum above a minimum required
    SNR and below a maximum RSI level.
  - The CSI knowledge helps in faster convergence of the AE
    frameworks.
• With the aim of opening the black box of the AE-based BCM and
  d-BCM designs, we utilize the minimum Euclidean distance,
  packing density, average Hamming distance, and kurtosis to
  reveal five distinct observations of the $2^k$ codewords
  designed in $2n$-dimensional space at the NN encoder of the
  source node. These observations resemble the desired
  theoretical observations of optimal random BCM design
  discussed in [26].
• By performing extensive simulations, we show that the proposed
  AE-based BCM and d-BCM designs outperform the traditional
  FD-AF relay networks using (7,4) Hamming code-based error
  correction as a baseline for varying transmission rates ($R$),
  SNR, and RSI levels by considerable margins. Further, for
  longer block lengths, we consider 5G-NR low density parity
  check (LDPC) codes as outer codes for the AE-based BCM and
  d-BCM designs and show significant BER performance gains.
  Lastly, we also show that the proposed AE framework remains
  highly reproducible even with different training samples and
  weight initializations.
The rest of the work is organized as follows. In Sec. II we
propose the FD-AF relay network, model the RSI, propose a
differential scenario, and signal transmission-reception model.
In Sec. III we propose the AE-based FD-AF relay network, and
detail the hyper-parameter settings. In Sec. IV, we study the
convergence of AE with its necessary conditions. In Sec. V,
we analyze the observations of BCM and d-BCM designs. We
perform extensive performance evaluation in Section VI and
conclude this work in Section VII.
II. SYSTEM MODEL
We consider an FD-AF relay network as shown in Fig. 1,
consisting of a source node (S) that wants to transmit its signal
to the destination node (D), with the aid of an FD-AF relay
node (R). Each of the source and destination nodes has a single
antenna for transmission and reception, respectively. The relay
node has two antennas, one for the reception and the other
for transmission. We assume that the direct link between the
source and destination node is strongly attenuated because of
severe path-loss and shadowing effects.
A. Modelling the Residual Self Interference (RSI)
Fig. 1: System model for FD-AF relay networks.
The RSI at the FD-AF relay node (R) can be modeled in two ways:
(1) the complex Gaussian random model, where the RSI is modeled
as independent and identically distributed (i.i.d.) complex
Gaussian random variables, having a similar effect as the noise
and aiming at emphasizing the effect of RSI on the performance
[30], and (2) the general fading effect model, where the RSI is
modeled by a statistical fading distribution, such as i.i.d.
Rician/Rayleigh fading, to model the RSI channel effectively
[31]. In this work, we utilize the general fading effect model
to characterize the RSI channel at the relay node R. In
particular, the RSI is modelled by an i.i.d. Rayleigh
block-fading (RBF) channel $h_{rr} \sim \mathcal{CN}(0, \sigma_{rr}^2)$
[23], [24], such that it remains constant for $n$ transmissions [25].
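As a minimal illustration of the block-fading RSI model above (a sketch, not the simulation code used in this work), the following NumPy snippet draws one coefficient $h_{rr} \sim \mathcal{CN}(0, \sigma_{rr}^2)$ per block and holds it for $n$ channel uses. The function name, the dB-to-linear conversion of $\sigma_{rr}^2$, and the example value of -20 dB are assumptions made for illustration.

```python
import numpy as np

def sample_rsi_block_fading(sigma2_rr_db, n, num_blocks, rng=None):
    """Draw h_rr ~ CN(0, sigma_rr^2) once per block and hold it for n uses."""
    rng = np.random.default_rng() if rng is None else rng
    sigma2_rr = 10.0 ** (sigma2_rr_db / 10.0)  # dB -> linear variance (assumed convention)
    # CN(0, s2): real and imaginary parts are each N(0, s2 / 2)
    h_block = np.sqrt(sigma2_rr / 2.0) * (
        rng.standard_normal(num_blocks) + 1j * rng.standard_normal(num_blocks)
    )
    # Repeat each block coefficient n times -> shape (num_blocks, n)
    return np.repeat(h_block[:, None], n, axis=1)

# Illustrative values only: RSI variance of -20 dB, block length n = 7
h_rr = sample_rsi_block_fading(sigma2_rr_db=-20.0, n=7, num_blocks=20000,
                               rng=np.random.default_rng(0))
```

Each row of `h_rr` is constant, which is the block-fading assumption; the empirical mean of $|h_{rr}|^2$ approaches the chosen variance as the number of blocks grows.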
B. Signal Transmission Model and MLD Decoding
The source node (S) intends to transmit $\mathbf{x} \in \{0,1\}^k$ bits;
thus, it first performs channel encoding $\bar{\mathbf{x}}_s = u_c(\mathbf{x})$
to obtain $\{0,1\}^j$ bits that are modulated to $n$ complex baseband
symbols $\mathbf{x}_s = u_m(\bar{\mathbf{x}}_s) \in \mathbb{C}^n$, such that
$\|\mathbf{x}_s\|_2^2 = n$, where $u_c$ and $u_m$ denote the channel-coding
and modulation functions. Then, the source node performs
symbol-by-symbol transmission, and the signal received by the
relay node after the SIC, in the presence of the RSI, at
time-instant $\kappa$, is given by
$$y_r[\kappa] = \sqrt{P_s[\kappa]}\, h_{sr}[\kappa]\, x_s[\kappa] + \underbrace{h_{rr}[\kappa]\, x_r[\kappa]}_{\text{RSI}} + n_r[\kappa] \tag{1}$$
where $P_s$ denotes the transmission power of the source node, $h_{sr}$
is the i.i.d. RBF channel with $h_{sr} \sim \mathcal{CN}(0, \sigma_{h_{sr}}^2 = 1)$,
$n_r$ is the AWGN at the relay node with $n_r \sim \mathcal{CN}(0, \sigma_r^2)$,
and $x_r$ is the amplified signal transmitted by the FD-AF relay node
at the same time-instant $\kappa$, given by
$$x_r[\kappa] = \sqrt{P_r[\kappa]}\, \alpha[\kappa-1]\, y_r[\kappa-1] \tag{2}$$
where $P_r$ denotes the relay's transmission power and the
amplification factor $\alpha$ is represented as
$$\alpha[\kappa] = \left(P_s[\kappa]\,|h_{sr}[\kappa]|^2 + P_r[\kappa]\,|h_{rr}[\kappa]|^2 + \sigma_r^2\right)^{-1/2} \tag{3}$$
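The recursion formed by (1)–(3) can be sketched in a few lines of NumPy. This is a simplified illustration under stated assumptions, not the simulator used in this work: the channels and powers are held constant over the block (block fading), the relay is silent at the first time instant because it has nothing to forward yet, and the names `cn_noise` and `fd_af_relay_block` are hypothetical.

```python
import numpy as np

def cn_noise(var, size, rng):
    """Circularly symmetric complex Gaussian samples with variance `var`."""
    return np.sqrt(var / 2.0) * (rng.standard_normal(size)
                                 + 1j * rng.standard_normal(size))

def fd_af_relay_block(x_s, h_sr, h_rr, P_s=1.0, P_r=1.0, sigma2_r=0.0, rng=None):
    """Run the relay recursion of (1)-(3) over one block of source symbols."""
    rng = np.random.default_rng() if rng is None else rng
    n = len(x_s)
    n_r = cn_noise(sigma2_r, n, rng)
    # Amplification factor (3); constant here because the channels are block-faded
    alpha = (P_s * abs(h_sr) ** 2 + P_r * abs(h_rr) ** 2 + sigma2_r) ** -0.5
    y_r = np.zeros(n, dtype=complex)
    x_r = np.zeros(n, dtype=complex)
    for kappa in range(n):
        if kappa > 0:
            # (2): forward the previously received sample (assumed: x_r[0] = 0)
            x_r[kappa] = np.sqrt(P_r) * alpha * y_r[kappa - 1]
        # (1): source signal + RSI leaking from the relay's own transmission + noise
        y_r[kappa] = (np.sqrt(P_s) * h_sr * x_s[kappa]
                      + h_rr * x_r[kappa] + n_r[kappa])
    return y_r, x_r, alpha

x_s = np.array([1, -1, 1j, -1j, 1, 1, -1], dtype=complex)
y_r, x_r, alpha = fd_af_relay_block(x_s, h_sr=0.8 + 0.3j, h_rr=0.0)
```

With $h_{rr} = 0$ and no noise, the relay output has magnitude $\sqrt{P_r}$ for unit-modulus inputs, which shows that $\alpha$ in (3) acts as a power normalization of the received signal.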
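Now, the signal received by the destination node is given as
$$
\begin{aligned}
y_d[\kappa] &= h_{rd}[\kappa]\, x_r[\kappa] + n_d[\kappa] \\
&= h_{rd}[\kappa] \sqrt{P_r[\kappa]}\, \alpha[\kappa-1]\, y_r[\kappa-1] + n_d[\kappa] && (4) \\
&= \underbrace{\sqrt{P_s[\kappa-1] P_r[\kappa]}\, \alpha[\kappa-1]\, h_{sr}[\kappa-1]\, h_{rd}[\kappa]\, x_s[\kappa-1]}_{\text{Desired Signal}} \\
&\quad + \underbrace{\sqrt{P_r[\kappa]}\, \alpha[\kappa-1]\, h_{rr}[\kappa-1]\, h_{rd}[\kappa]\, x_r[\kappa-1]}_{\text{RSI Signal}} \\
&\quad + \underbrace{\sqrt{P_r[\kappa]}\, \alpha[\kappa-1]\, h_{rd}[\kappa]\, n_r[\kappa-1] + n_d[\kappa]}_{\text{Noise}} && (5)
\end{aligned}
$$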
where $h_{rd} \sim \mathcal{CN}(0, 1)$ is the i.i.d. RBF channel in the
second hop and $n_d$ is the AWGN at the destination node with
$n_d \sim \mathcal{CN}(0, \sigma_d^2)$. Thus, using (5), the
signal-to-interference-and-noise-ratio (SINR) at the destination
node, denoted by $\Gamma[\kappa]$, can be given as (6). For clarity,
we have detailed the impact of transmit SNR and RSI on the SINR
in Appendix A. The destination node performs optimal MLD
decoding, as follows
$$\hat{x}_d = \arg\min_{x \in \mathcal{C}} \left\| y_d[\kappa] - \sqrt{P_s[\kappa-1] P_r[\kappa]}\, \alpha[\kappa-1]\, h_{sr}[\kappa-1]\, h_{rd}[\kappa]\, x \right\|^2 \tag{7}$$
where $\mathcal{C}$ denotes all the possible alphabets. The decoder
performs block-by-block channel decoding using the function $u_d$
to obtain $\hat{\mathbf{x}}_s = u_d(\hat{\mathbf{x}}_d)$, where
$\hat{\mathbf{x}}_d \in \{0,1\}^j$ and $\hat{\mathbf{x}}_s \in \{0,1\}^k$.
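The minimum-distance rule in (7) can be sketched as follows. This is an illustrative NumPy sketch rather than the decoder used in this work: the end-to-end gain $\sqrt{P_s P_r}\,\alpha\, h_{sr} h_{rd}$ is passed in as a single complex number `gain`, and the QPSK alphabet in the usage lines is an assumption chosen only to make the example concrete.

```python
import numpy as np

def mld_symbol_decisions(y_d, gain, alphabet):
    """For each received sample, pick the alphabet point minimising |y_d - gain * x|^2."""
    y_d = np.asarray(y_d, dtype=complex)
    alphabet = np.asarray(alphabet, dtype=complex)
    # Squared distances, shape (len(y_d), len(alphabet))
    dist = np.abs(y_d[:, None] - gain * alphabet[None, :]) ** 2
    idx = np.argmin(dist, axis=1)
    return alphabet[idx], idx

# Usage with an assumed unit-energy QPSK alphabet and a small perturbation
qpsk = np.array([1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]) / np.sqrt(2)
idx_true = np.array([0, 3, 2, 1, 1, 0, 2])
gain = 0.6 * np.exp(1j * 0.7)
rng = np.random.default_rng(1)
noise = 0.01 * (rng.standard_normal(7) + 1j * rng.standard_normal(7))
x_hat, idx_hat = mld_symbol_decisions(gain * qpsk[idx_true] + noise, gain, qpsk)
```

The search evaluates every alphabet point for every sample, so its cost grows linearly with the alphabet size; the channel-decoding step $u_d$ is not part of this sketch.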
C. Differential FD-AF Relay Networks - Without CSI
In the absence of the CSI knowledge, we propose to
utilize traditional differential modulation and demodulation
techniques at the source and destination nodes. For such
scenarios, we propose to design the amplification factor for
the FD-AF relay node by utilizing the variances of the first-
hop channel between the source and relay node, and the RSI
channel as
$$\alpha[\kappa] = \left(\sigma_{h_{sr}}^2 + \sigma_{rr}^2 + \sigma_r^2\right)^{-1/2} \tag{8}$$
where the variances $\{\sigma_{h_{sr}}^2, \sigma_{rr}^2, \sigma_r^2\}$ can be
obtained via long-term averaging of the received signals. Similar
approximations have been employed for HD-AF relay networks in
[32], [33]. To include the impact of RSI in FD scenarios, we
introduce the variance of the RSI channel in (8).
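To make the relation between (3) and (8) concrete, the sketch below compares the statistics-based factor with the long-term average of the instantaneous normalization term. It is a hedged illustration: unit transmit powers are assumed so that (8) is the expectation of the bracket in (3), and the variance values and function names are hypothetical.

```python
import numpy as np

def differential_amplification_factor(sigma2_hsr, sigma2_rr, sigma2_r):
    """Amplification factor of (8), built from channel and noise variances only."""
    return (sigma2_hsr + sigma2_rr + sigma2_r) ** -0.5

def instantaneous_power_term(h_sr, h_rr, sigma2_r, P_s=1.0, P_r=1.0):
    """Bracketed term of (3), i.e. 1 / alpha^2 for the CSI-based factor."""
    return P_s * np.abs(h_sr) ** 2 + P_r * np.abs(h_rr) ** 2 + sigma2_r

rng = np.random.default_rng(0)
N = 200_000
sigma2_hsr, sigma2_rr, sigma2_r = 1.0, 0.1, 0.05  # illustrative values (assumed)
h_sr = np.sqrt(sigma2_hsr / 2) * (rng.standard_normal(N) + 1j * rng.standard_normal(N))
h_rr = np.sqrt(sigma2_rr / 2) * (rng.standard_normal(N) + 1j * rng.standard_normal(N))

# Long-term average of the instantaneous term approaches the sum of variances
avg_power = np.mean(instantaneous_power_term(h_sr, h_rr, sigma2_r))
alpha_diff = differential_amplification_factor(sigma2_hsr, sigma2_rr, sigma2_r)
```

Under these assumptions, `avg_power ** -0.5` is close to `alpha_diff`, which is why no instantaneous CSI is needed at the relay in the differential scenario.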
III. PROPOSED AE-BASED FD-AF RELAY NETWORKS
The fundamental distinction between the BCM and d-BCM
design by the AE and the conventional networks is that the
AE aims to design the block codes using a learning-based
approach by updating the NN weights, while the conventional
network uses conventional channel codes (such as Hamming
code). Similar to our work [6] for HD-AF relay networks,
we propose bit-wise AE for BCM and d-BCM design for the
FD-AF relay network, as shown in Fig. 2.
Specifically, we consider an NN-based source node that
performs block-by-block encoding that transforms the $k$ input
bits $\mathbf{x} \in \{0,1\}^k$ to $n$ complex baseband symbols
$\mathbf{x}_s \in \mathbb{C}^n$. We now perform symbol-by-symbol
transmission, and at any time-instant $\kappa$ the symbol received by
the FD-AF relay node in the presence of RSI is given by (1). Traditionally, the AF
relaying scheme is designed to have minimal implementation
complexity by receiving, amplifying and re-transmitting the
signal. Similar to our work in [6]–[8] for HD-AF relay
networks, we propose to utilize a conventional FD-AF relay
node without using an NN-based relay. This is because NN-based
processing at the FD-AF relay node would make it similar to a
decode-and-forward relay, which decodes and re-encodes the
signal. Further, this keeps the relay structure simple, with
only analogue-domain signal reception, amplification, and
re-transmission, along with the necessary SIC for FD transmission.
Thus, the signal transmitted by the FD-AF relay node be-
comes (2), where we utilize the amplification factor given in
(3) and (8) for the BCM and d-BCM designs, respectively.
The signal received by the destination node is given as (5).
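Fig. 2: Block diagram of proposed AE-based FD-AF relay network.
$$\Gamma[\kappa] = \frac{P_s[\kappa-1]\, P_r[\kappa]\, \alpha[\kappa-1]^2\, |h_{sr}[\kappa-1]\, h_{rd}[\kappa]|^2}{P_r[\kappa]\, \alpha[\kappa-1]^2\, |h_{rr}[\kappa-1]\, h_{rd}[\kappa]|^2 + P_r[\kappa]\, \alpha[\kappa-1]^2\, |h_{rd}[\kappa]|^2 \sigma_r^2 + \sigma_d^2} \tag{6}$$
Also, we consider an NN-based destination node that performs
block-by-block decoding that transforms the $n$ input complex
baseband symbols $\mathbf{y}_d \in \mathbb{C}^n$ to $k$ soft probabilistic
outputs $\tilde{p}_{d\theta_d}(x_u|\mathbf{y}_d) \in [0,1]$, where
$u = \{1, \ldots, k\}$, that correspond to the log-likelihood ratios
(LLRs), as
$$\mathrm{LLR}(u) = \log \frac{1 - \tilde{p}_{d\theta_d}(x_u = 0|\mathbf{y}_d)}{\tilde{p}_{d\theta_d}(x_u = 0|\mathbf{y}_d)}, \quad \forall u = 1, \ldots, k \tag{9}$$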
Thus, the designed AE framework solves the bit-decoding problem
as a multi-label binary classification problem. Specifically,
each bit is considered a separate label; thus, there are $k$
labels, each taking a binary value 0/1. Further, the proposed
AE's NN decoder generates soft probabilistic outputs that
correspond to the LLRs, which can be thresholded for
hard-decision decoding or used directly by more powerful outer
decoders such as LDPC and Turbo codes (as detailed in Sec. VI.E).
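The mapping from the decoder's soft outputs to LLRs in (9), together with the thresholding mentioned above, can be sketched as follows. The snippet assumes the decoder output is available as $\tilde{p}(x_u = 0|\mathbf{y}_d)$, as written in (9); if a sigmoid layer instead returns the probability of a 1, it would need to be complemented first. The clipping constant `eps` and the function names are illustrative choices.

```python
import numpy as np

def llr_from_bit_probabilities(p0, eps=1e-12):
    """LLR(u) = log((1 - p0) / p0), where p0 is the probability that bit u is 0."""
    p0 = np.clip(np.asarray(p0, dtype=float), eps, 1.0 - eps)  # avoid log(0)
    return np.log((1.0 - p0) / p0)

def hard_decisions(llr):
    """Threshold at zero: a positive LLR favours bit 1 under the convention of (9)."""
    return (np.asarray(llr) > 0).astype(int)

p0 = np.array([0.5, 0.1, 0.9, 0.999])
llr = llr_from_bit_probabilities(p0)
bits_hat = hard_decisions(llr)
```

An outer soft-input decoder would consume `llr` directly, while `bits_hat` corresponds to the hard-decision path.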
We train the AE with the aim of maximizing the chances of
reconstructing the intended signal $\mathbf{x}_s$ by learning the NN
functions at the source and destination nodes, represented by
$(\boldsymbol{\theta}_e, \boldsymbol{\theta}_d)$. As the input and output of
the AE are $k$ bits, we formulate the proposed AE framework as a
multi-label binary classification problem, wherein we utilize
the binary CE loss to quantify the de-mapping error at the
destination node. We train the AE by mini-batch training, where
the weights and bias terms in the NN are updated using the
stochastic gradient descent (SGD) method with back-propagation
[27]. In contrast to the training dataset creation methods for
AE frameworks in [1]–[12], which consider HD transmission and
provide no methodology for the presence of RSI in FD networks,
we design a training dataset such that the AE can generalize
well for varying testing RSI or SNR values, as follows:
• For Varying RSI: For any given rate $R$, we create a training
  dataset with $S_{\text{Train}}$ samples with fixed transmit SNR
  $E_b/N_0 = 30$ dB and multiple RSI levels
  $\sigma_{rr}^2 = \{-60, -20, 0, 20\}$ dB. Then, we train a single AE
  framework (performing BCM or d-BCM design) until convergence,
  as detailed in Remark 3 later. Then, we test for varying RSI
  levels $\sigma_{rr}^2 = [-60, 20]$ dB.
• For Varying SNR: For any given rate $R$, we create a training
  dataset with $S_{\text{Train}}$ samples with fixed RSI
  $\sigma_{rr}^2 = 0$ dB and multiple transmit SNRs
  $E_b/N_0 = \{3, 10, 23, 28, 38\}$ dB. Then, we train a single AE
  framework (performing BCM or d-BCM design) until convergence,
  as detailed in Remark 3 later. Then, we test for varying
  transmit SNRs $E_b/N_0 = [0, 30]$ dB.
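The dataset construction described in the two items above can be sketched as follows for the varying-SNR case. This is a hypothetical sketch: the text does not state how the $S_{\text{Train}}$ samples are divided among the training levels, so an equal share per level followed by a shuffle is assumed, and the function name is illustrative.

```python
import numpy as np

def make_training_set(k, num_samples, levels_db, rng=None):
    """Random k-bit inputs, each tagged with one of the training levels (in dB)."""
    rng = np.random.default_rng() if rng is None else rng
    bits = rng.integers(0, 2, size=(num_samples, k))
    # Assumed split: cycle through the levels for an (almost) equal share, then shuffle
    level_idx = np.arange(num_samples) % len(levels_db)
    rng.shuffle(level_idx)
    levels = np.asarray(levels_db, dtype=float)[level_idx]
    return bits, levels

# Varying-SNR case: fixed RSI of 0 dB, transmit SNR levels as listed above
train_snr_db = [3, 10, 23, 28, 38]
bits, snr_db = make_training_set(k=4, num_samples=1000, levels_db=train_snr_db,
                                 rng=np.random.default_rng(0))
```

Each row of `bits` would then be passed through the encoder and through the channel model of (1)–(5) at the SNR stored in the matching entry of `snr_db`.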
In contrast to the bit-wise AE-based frameworks in [12], where
SNR information is required at the encoder-decoder, we remove
the SNR requirement. Further, for generalizability, in this
work we propose the same NN architecture for both AE-based BCM
and d-BCM design, as shown in Table I, with the only difference
being the Lambda layers $L_L(\mathbf{y}_d)$ at the NN decoder. In
general, radio transformer networks (RTNs) are used for
scenarios without CSI knowledge as a means to estimate the
channel [1], as also considered in AE-based HD-AF relay networks
in [6]–[8]. However, by experiments, we find that, due to the
presence of RSI at the FD-AF relay node, the proposed RTN
improves the performance for the BCM design rather than for the
d-BCM design. Thus, we include an RTN in the BCM design and not
in the d-BCM design (please see details in Appendix B).
TABLE I: NN architecture.
Node      Layer No. (l)   Neurons (δ_l)   Remarks
Encoder   l = 0           k               Input ($\mathbf{x}$)
          l = 1           256             σ1 = Tanh
          l = 2           128             σ2 = Tanh
          l = 3           64              σ3 = Tanh
          l = 5           2n              σ4 = Linear
          l = 6           2n              PN ($P_N^S$)
          l = 7           2n              Output ($\mathbf{x}_s$)
Decoder   l = 0           2n              Input (Output of $L_L(\mathbf{y}_d)$)
          l = 1           1024            σ1 = Tanh
          l = 2           512             σ2 = Tanh
          l = 3           256             σ3 = Tanh
          l = 4           64              σ4 = Tanh
          l = 5           k               σ5 = Sigmoid
          l = 7           k               Output $\tilde{p}_{d\theta_d}(\mathbf{x}|\mathbf{y}_d)$

For the AE-based BCM, we perform channel equalization and
employ an RTN in the Lambda layers. Firstly, we include two
Lambda layers $L_L^1$ and $L_L^2$ to perform channel equalization for
the dual-hop channels $(h_{sr})$ and $(h_{rd})$. Secondly, we include
an RTN made of three Lambda layers: $L_L^3$ is a dense layer with
16 neurons and Tanh activation, $L_L^4$ is a dense layer with $2n$
neurons and Linear activation, and $L_L^5$ adds the output of $L_L^4$
and the received signal $\mathbf{y}_d$ before providing it to the NN
decoder. However, for the AE-based d-BCM, we consider no Lambda
layers in the NN decoder, i.e., $\mathbf{y}_d$ directly becomes the
input to the NN decoder.
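To illustrate the shapes involved in the encoder of Table I, the following NumPy sketch runs an untrained forward pass. It is not the Keras model used in this work. Several details are assumptions made only for illustration: the "PN" row is interpreted as a per-codeword power normalization enforcing $\|\mathbf{x}_s\|_2^2 = n$, the $2n$ real outputs are mapped to $n$ complex symbols by taking the first $n$ as real parts and the last $n$ as imaginary parts, and small random biases stand in for trained values so that the all-zero input does not map to the origin.

```python
import numpy as np

def glorot_uniform(fan_in, fan_out, rng):
    """Glorot (Xavier) uniform initialisation for a dense weight matrix."""
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

def init_encoder(k, n, rng):
    """Dense layers k -> 256 -> 128 -> 64 -> 2n, following the widths in Table I."""
    widths = [k, 256, 128, 64, 2 * n]
    # Small random biases are a stand-in for trained biases (assumption)
    return [(glorot_uniform(a, b, rng), 0.1 * rng.standard_normal(b))
            for a, b in zip(widths[:-1], widths[1:])]

def encode(bits, params, n):
    """Map k-bit rows to n complex symbols with squared norm n per codeword."""
    h = np.asarray(bits, dtype=float)
    for W, b in params[:-1]:
        h = np.tanh(h @ W + b)          # Tanh hidden layers
    W, b = params[-1]
    h = h @ W + b                       # Linear layer with 2n outputs
    # Assumed power normalisation: scale each codeword to squared norm n
    h = np.sqrt(n) * h / np.linalg.norm(h, axis=1, keepdims=True)
    return h[:, :n] + 1j * h[:, n:]     # assumed real/imaginary split

rng = np.random.default_rng(0)
params = init_encoder(k=8, n=7, rng=rng)
bits = np.vstack([np.zeros((1, 8), dtype=int), rng.integers(0, 2, size=(31, 8))])
x_s = encode(bits, params, n=7)
```

Because the hidden widths are fixed, only the first weight matrix changes shape with $k$, which is what allows one architecture to serve all the considered rates.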
In contrast to [1]–[12], where the number of neurons in the NN
encoder and decoder is a factor of $2^k$, thereby increasing the
AE's complexity exponentially with $k$, the hidden-layer widths
in Table I do not depend on $k$. Another advantage of the
proposed NN architecture in Table I is that, for the first time,
we design a single NN architecture for the AE framework that
can handle all the varying high-rate transmissions
$R = \{4/7, 8/7, 12/7, 16/7\}$, later analyzed in Section VI.
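The complexity argument above can be checked with a short parameter count over the dense layers listed in Table I. The sketch is illustrative: it counts only the weights and biases of the encoder and decoder dense layers, and it deliberately excludes the Lambda/RTN layers, which is a simplifying assumption.

```python
def dense_params(widths):
    """Weights plus biases of a stack of fully connected layers."""
    return sum(a * b + b for a, b in zip(widths[:-1], widths[1:]))

def ae_param_count(k, n):
    """Trainable parameters of the encoder and decoder dense layers in Table I."""
    encoder = [k, 256, 128, 64, 2 * n]
    decoder = [2 * n, 1024, 512, 256, 64, k]
    return dense_params(encoder) + dense_params(decoder)

# Parameter counts for the four considered rates with n = 7
counts = {k: ae_param_count(k, n=7) for k in (4, 8, 12, 16)}
```

Under these assumptions the count grows by a fixed 321 parameters per additional input bit (256 encoder input weights, 64 decoder output weights, and one output bias), i.e., linearly rather than exponentially in $k$.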
The AE is implemented in Keras [35] with TensorFlow [36] as the
backend. For training, we utilize the Adam optimizer [37], where
the weights are initialized using the Glorot initializer [38].
We keep the training samples $S_{\text{Train}} = 3 \times 10^5$ and the
learning rate $\tau = 0.001$. By parameter searching, we note that
a smaller batch size $(B)$ and fewer epochs $(E)$ lead to better
performance for the BCM design in comparison to the d-BCM
design. Thus, we keep $B = 128$, $E = 15$ for the BCM design and
$B = 6000$, $E = 60$ for the d-BCM design, respectively. This is
because, without CSI, a large batch size provides the AE with
sufficient samples from the low-probability region, and more
epochs help the AE in estimating and removing the channel
impairments.
IV. UNDERSTANDING THE TRAINING CONVERGENCE OF
AE FRAMEWORKS
In this section, we focus on the training convergence of the
proposed AE framework and provide three useful conjectures
based on empirical observations, labelled as Remarks 1–3.
Please note that we use the term BCM loosely in this section to
refer to both the BCM and d-BCM designs. For clarity, for
any $(n, k)$ block or rate $R = k/n$, we define the following:
Definition 1 (Symbol): A complex baseband symbol is de-
fined as a complex number indicating the symbol transmitted
or received at various nodes in the network.
Definition 2 (Codeword): A codeword is a collection of n
complex baseband symbols together.
A. Impact of SNR, RSI and CSI on Training Convergence
In this subsection, we focus on the impact of varying
SNR, RSI, and presence/absence of CSI knowledge on the
training convergence of the proposed AE framework. Many
prior works have determined the relationship between the
binary CE loss and the mutual information (MI) for
point-to-point (P2P) networks [12], [40]–[42]. In this work, we
propose the FD-AF relaying network with the binary CE loss
calculated between the source and destination nodes and with a
conventional FD-AF relay node. Thus, the relationship between
the binary CE loss and the MI remains the same as for P2P
networks, obtained as follows
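$$
\begin{aligned}
J(\boldsymbol{\theta}_e, \boldsymbol{\theta}_d) &= -\sum_{u=1}^{k} \int_{\mathbf{y}_d} p(\mathbf{y}_d)\, p_{e\theta_e}(x_u|\mathbf{y}_d) \log \tilde{p}_{d\theta_d}(x_u|\mathbf{y}_d)\, d\mathbf{y}_d \\
&= D_{KL}\left(p_{e\theta_e}(\mathbf{x}|\mathbf{y}_d)\,\|\,\tilde{p}_{d\theta_d}(\mathbf{x}|\mathbf{y}_d)\right) + H(X) - I_{e\theta_e}(X; Y_d)
\end{aligned}
\tag{10}
$$
where $D_{KL}\left(p_{e\theta_e}(\mathbf{x}|\mathbf{y}_d)\,\|\,\tilde{p}_{d\theta_d}(\mathbf{x}|\mathbf{y}_d)\right)$
denotes the Kullback-Leibler (KL)-divergence loss between the
true and learnt distributions, $H(X)$ denotes the entropy of the
input bits $X$, and $I_{e\theta_e}(X; Y_d)$ is the MI between the
input bits $X$ and the received signal $Y_d$ [34]. Now, using (10),
we can define the estimated MI $(\hat{I})$ [9], [12], [40]–[42] as
the difference between the MI $I_{e\theta_e}(X; Y_d)$ and the
KL-divergence loss
$D_{KL}\left(p_{e\theta_e}(\mathbf{x}|\mathbf{y}_d)\,\|\,\tilde{p}_{d\theta_d}(\mathbf{x}|\mathbf{y}_d)\right)$,
given as follows
$$
\begin{aligned}
\hat{I} &:= I_{e\theta_e}(X; Y_d) - D_{KL}\left(p_{e\theta_e}(\mathbf{x}|\mathbf{y}_d)\,\|\,\tilde{p}_{d\theta_d}(\mathbf{x}|\mathbf{y}_d)\right) \\
&= H(X) - J(\boldsymbol{\theta}_e, \boldsymbol{\theta}_d)
\end{aligned}
\tag{11}
$$
Since the first term on the R.H.S. of (11), $H(X)$, remains
constant, the changes in the estimated MI $(\hat{I})$ in (11)
depend only on the binary CE loss term $J(\boldsymbol{\theta}_e, \boldsymbol{\theta}_d)$.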
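The relation in (11) translates into a one-line computation once the units are fixed. The sketch below assumes that the $k$ input bits are independent and uniform, so that $H(X) = k$ bits, and that the reported validation loss is the binary CE averaged over the $k$ labels in nats, which is the convention of common deep learning libraries. Both points are assumptions about the measurement, not statements from this work.

```python
import math

def estimated_mi_bits(mean_bce_nats, k):
    """Estimated MI of (11): H(X) - J, with H(X) = k bits for uniform input bits."""
    # Convert the per-label mean loss in nats into a total loss in bits
    total_ce_bits = k * mean_bce_nats / math.log(2.0)
    return k - total_ce_bits

mi_perfect = estimated_mi_bits(0.0, k=16)           # zero loss
mi_guessing = estimated_mi_bits(math.log(2.0), k=16)  # loss of a coin-flip decoder
```

A zero loss gives the upper bound of $k$ bits, while a decoder that outputs probability 1/2 for every bit gives an estimated MI of zero.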
Lastly, by simulations, we analyze the convergence of the
training of the proposed AE frameworks. In particular, we train
a separate AE for each SNR or RSI level; once the AE is trained,
we note the validation CE loss $J(\boldsymbol{\theta}_e, \boldsymbol{\theta}_d)$
at the last epoch and obtain the estimated MI $(\hat{I})$ as
described in (11). Specifically, we train the proposed AE for
fixed SNR $E_b/N_0 = 30$ dB and varying RSI in Fig. 3a, and for
fixed RSI $\sigma_{rr}^2 = -20$ dB and varying SNR $(E_b/N_0)$ in
Fig. 3b. For greater insights, we also vary the rate $R = k/n$
[bits/channel reuse] and keep the block size $n = 7$. In Fig. 3,
we can see that as the RSI decreases or the SNR increases, the
estimated MI increases until it reaches the upper bound of $k$.
Directly from (11), this suggests that the KL-divergence loss
approaches 0, making $I_{e\theta_e}(X; Y_d) = H(X)$. It is important
to note that we cannot find the global minimum of the binary CE
loss with respect to the NN parameters. But, surprisingly, we do
not need to find the global minimum. Empirically, the authors in
[43], [44] found that, despite the non-convexity, local minima
are rare and are all very similar to each other and to the
global minimum. Interested readers may refer to the theoretical
insights presented in [43], [44]. Thus, the training of AE-based
FD-AF relay networks converges to a local minimum above a
minimum required SNR (in Fig. 3b) and below a maximum RSI level
(in Fig. 3a). Further, in the absence of the CSI, the estimated
MI converges to the upper bound at a higher transmit SNR and
lower RSI levels. Thus, the training of the AE performing BCM
design with CSI knowledge converges faster than that of the
d-BCM design without CSI knowledge (in Fig. 3).
Fig. 3: Estimated mutual information for proposed AE framework.
(a) Varying RSI for fixed transmit SNR $E_b/N_0 = 30$ dB.
(b) Varying transmit SNR for fixed RSI $\sigma_{rr}^2 = -20$ dB.
Both panels plot the estimated mutual information against SINR
[dB] for $(n,k) = (7,4), (7,8), (7,16)$ with BCM and
$(n,k) = (7,4), (7,8), (7,12)$ with d-BCM.
Thus, based on the above empirical observations, we give the
following Remarks.
Remark 1: The training of the AE framework, of sufficiently
large block length $(n)$, converges to a local minimum above a
minimum required SNR and below a maximum RSI level.
Remark 2: The CSI knowledge helps in faster convergence
of the AE frameworks.
From Remarks 1 and 2, the proposed AE converges above a minimum
SNR and below a maximum RSI. Exploiting this directly would
require training a separate AE framework for each SNR and/or
RSI level, which is impractical. For practical purposes, we
propose to train a single AE framework (performing BCM or d-BCM
design) on varying SNR or RSI levels, such that the proposed AE
can generalize well across the varying SNR and RSI levels. As a
result, although the AE's estimated MI never reaches the upper
bound because of training on low SNR or high RSI, the AE
generalizes well.
B. Necessary Conditions for Training Convergence
In this subsection, we focus on tackling the problem of
finding the training convergence of the AE-based BCM/d-
BCM designs for any heuristically chosen hyper-parameter
settings.
Firstly, we need to understand the training of the AE with
respect to epochs, for a given number of training samples. This
focuses on preventing overfitting of the AE framework during
training. There are well-known techniques, such as early
stopping, that stop the training of an AE once the validation CE
loss starts to increase, as any further training will reduce its
generalizability. This is because early stopping of gradient
descent creates generalizable NN frameworks that also remain
robust to corrupted labels [44, Theorem 2.2], [39]. Secondly, we
need to understand the training of the AE with respect to the
number of training samples. Since the proposed AE learns the BCM
and d-BCM design in the presence of channels and noise, the AE
must be trained with enough samples to generalize in the future
testing phase. Thus, the training of the proposed AE has
converged if increasing the training epochs and training
samples, respectively, does not reduce the validation CE loss.
Furthermore, the proposed AE framework is modelled as a
multi-label binary classification problem. In particular, the
$k$ input-output bits represent $k$ labels, with each label taking
binary 0/1 values; thus, there exist $2^k$ possible classes for
the proposed AE. Thus, the AE aims to design $2^k$ possible
codewords, each representing a different class in a
higher-dimensional space. For any given hyper-parameter
settings, once these codewords are designed, the AE has
converged because we cannot improve the performance any further.
We now empirically analyze the above discussion. For example,
we train an AE for $R = 16/7$ in Fig. 4 for varying training data
size $S_{\text{Train}} = \{2^{13}, \ldots, 2^{23}\}$ at a fixed SNR
$E_b/N_0 = 30$ dB. Specifically, we divide the $S_{\text{Train}}$
training samples in a 4:1 ratio into a training set $S_T$ and a
validation set $S_V$. Then, we train the AE on $S_T$ and determine
the number of codewords formed by the NN encoder and the binary
CE loss at the last epoch (15th epoch) on $S_T$ and $S_V$. Lastly,
we determine the BER using the testing samples $S_{\text{Test}}$.
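Counting the codewords formed by the NN encoder, as used in the procedure above, can be sketched as follows. This is an illustrative approach rather than the exact procedure of this work: the encoder is treated as a black-box function returning complex codewords, and two codewords are considered identical if they agree after rounding to a fixed number of decimals, which is an assumed tolerance.

```python
import numpy as np

def count_codewords(encode_fn, bits, decimals=4):
    """Number of distinct codewords the encoder produces for the given inputs."""
    codewords = np.asarray(encode_fn(bits))
    # Stack real and imaginary parts so that rows can be compared directly
    flat = np.hstack([codewords.real, codewords.imag])
    # Adding 0.0 turns -0.0 into +0.0 so both signs of zero compare equal
    flat = np.round(flat, decimals) + 0.0
    return np.unique(flat, axis=0).shape[0]

# Toy encoders (hypothetical stand-ins for a trained NN encoder)
def bpsk_encoder(b):
    return (2.0 * np.asarray(b) - 1.0).astype(complex)   # one symbol per bit

def collapsed_encoder(b):
    return bpsk_encoder(np.asarray(b)[:, :1])            # ignores all but the first bit

all_bits = np.array([[(i >> j) & 1 for j in range(3)] for i in range(8)])
repeated = np.vstack([all_bits, all_bits])
n_full = count_codewords(bpsk_encoder, repeated)
n_collapsed = count_codewords(collapsed_encoder, repeated)
```

An encoder that separates all inputs yields $2^k$ distinct rows, whereas one that merges inputs yields fewer, which is the quantity tracked against the training set size.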
In Fig. 4a, we can see that as the training dataset grows, the number of codewords formed by the NN encoder of the trained AE on the training and validation sets increases until it reaches $2^{16}$ codewords, each representing one of the $2^{16}$ possible bit combinations. The NN encoder forms these $2^{16}$ codewords at $2^{18}$ and $2^{21}$ training samples on the training and validation sets, respectively. Further, in Fig. 4a, the binary CE loss, recorded at the last (15th) epoch of training, reduces as the training dataset grows and converges for both the training and validation sets at $2^{21}$ training samples.
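One plausible way to count the codewords formed on a given sample set is sketched below. This is an assumption about the procedure, not a description of the authors' code: `toy_encoder` is a hypothetical stand-in for the trained, deterministic NN encoder (here a fixed random codebook), and `count_codewords` and the rounding precision `decimals` are illustrative choices.

```python
import numpy as np

def count_codewords(encoder, bit_samples, decimals=6):
    """Count distinct encoder outputs over a set of k-bit input samples."""
    outputs = np.array([encoder(b) for b in bit_samples])             # (S, n) complex
    real_view = np.concatenate([outputs.real, outputs.imag], axis=1)  # (S, 2n) real
    return len(np.unique(np.round(real_view, decimals), axis=0))

# Hypothetical stand-in for a trained NN encoder: a fixed random codebook.
k, n = 4, 3
rng = np.random.default_rng(0)
codebook = rng.standard_normal((2**k, n)) + 1j * rng.standard_normal((2**k, n))

def toy_encoder(bits):
    index = int("".join(str(int(b)) for b in bits), 2)   # k bits -> message index
    return codebook[index]

samples = rng.integers(0, 2, size=(2 ** (k + 2), k))     # 2^(k+2) random k-bit inputs
n_seen_messages = len(np.unique(samples, axis=0))        # distinct inputs in the set
n_codewords = count_codewords(toy_encoder, samples)      # distinct outputs in the set
```

With an encoder that separates all messages, the count equals the number of distinct inputs present in the set and can never exceed $2^k$.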
In Fig. 4b, we can see that as the training dataset grows, the performance of the proposed AE on the unseen testing samples improves, whereas once the training dataset size exceeds $2^{18}$ the performance improvement of the AE starts converging, because $2^{16}$ codewords are created by the AE's NN encoder on the training set.

[Fig. 4: Training convergence of the proposed AE framework. (a) Codewords formed by the NN encoder and binary CE loss on the training and validation sets; axes: $\log_2$(size of training data $S_{\text{Train}}$) versus $\log_2$(no. of codewords formed) and binary cross-entropy loss at the last epoch. (b) BER analysis on the test set for varying number of training samples $S_{\text{Train}} = 2^{13}, \ldots, 2^{19}$; axes: SINR [dB] versus BER.]
Thus, for any given hyper-parameter setting, we need at least training samples in the range $[2^{k+2}, 2^{k+5}]$ to ensure that the AE creates $2^k$ codewords, the validation CE loss has converged, and the AE's performance converges to its maximum potential of decoding the $2^k$ possible classes.
Based on the above empirical observations, we provide the following remark.
Remark 3: For any given hyper-parameter setting and sufficiently large block length ($n$), the two necessary conditions for the convergence of training of the AE frameworks performing BCM and d-BCM designs are as follows:
C1: The validation CE loss of the AE has converged with respect to the training epochs and the number of training samples.
C2: The NN encoder of the AE designs $2^k$ codewords.
From Remark 3, the number of training samples required for the convergence of the AE frameworks increases exponentially with the number of input-output bits ($k$), with at least $2^{k+2}$ samples required. However, these training samples are required only in the offline training phase. Once trained, the AE can be deployed online.
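The sample range and the two conditions of Remark 3 can be expressed as simple checks. The following is a minimal sketch under stated assumptions: the function names and the plateau tolerance `tol` are hypothetical, and the loss history may be taken over epochs or over increasing training set sizes.

```python
def training_sample_range(k):
    """Empirical range [2^(k+2), 2^(k+5)] of training samples quoted in the text."""
    return 2 ** (k + 2), 2 ** (k + 5)

def has_converged(val_losses, n_codewords, k, tol=1e-3):
    """C1: validation CE loss has plateaued; C2: the encoder forms 2^k codewords."""
    c1 = len(val_losses) >= 2 and abs(val_losses[-2] - val_losses[-1]) < tol
    c2 = n_codewords == 2 ** k
    return c1 and c2

low, high = training_sample_range(16)   # k = 16, as in the R = 16/7 example
```

For $k = 16$ the range evaluates to $[2^{18}, 2^{21}]$, which matches the training set sizes at which the $2^{16}$ codewords appear on the training and validation sets in Fig. 4a.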
V. OPENING THE BLACK-BOX OF BCM AND D-BCM
DESIGNED BY AE FRAMEWORKS
In this section, we reveal distinct observations about the BCM designed by the AE. For brevity, we only consider BCM because both BCM and d-BCM exhibit similar trends. Throughout this section, we train the proposed AE for various rates $R = k/n$, where $n \in \mathcal{N} = \{1, 3, 5, 7, 10\}$ and $k \in \mathcal{K} = \{1, 4, 8, 12, 16\}$, until convergence using Remark 3.
Once trained, the NN encoder becomes deterministic. Thus, if we input any $k$ bits to the NN encoder of the trained AE, we obtain the same $n$ complex baseband symbols as output every time, representing a codeword for the $k$ input bits. We can therefore obtain all the possible codewords from the NN encoder by feeding it all the possible combinations of $k$ input bits.

[Fig. 5: Minimum Euclidean distance $d_{E_{\min}}$ for varying $(n, k)$.]
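Extracting the full codebook from a deterministic encoder amounts to querying it once per $k$-bit input. A minimal sketch follows; `toy_encoder` is a hypothetical stand-in (a QPSK-style mapping that needs an even $k$), not the trained NN encoder, and the helper names are assumptions.

```python
import itertools
import numpy as np

def all_bit_patterns(k):
    """Return a (2^k, k) array with every k-bit input, in counting order."""
    return np.array(list(itertools.product([0, 1], repeat=k)), dtype=np.int8)

def build_codebook(encoder, k):
    """Query a deterministic encoder once per k-bit input -> (2^k, n) complex array."""
    return np.array([encoder(bits) for bits in all_bit_patterns(k)])

# Hypothetical deterministic encoder for illustration: pairs of bits -> QPSK symbols.
def toy_encoder(bits):
    b = 1.0 - 2.0 * np.asarray(bits, dtype=float)   # bit 0 -> +1, bit 1 -> -1
    return (b[0::2] + 1j * b[1::2]) / np.sqrt(2)    # k/2 unit-energy symbols

codebook = build_codebook(toy_encoder, k=4)          # 16 codewords of n = 2 symbols
```

The resulting array has one row per codeword and is the input assumed by the distance and moment computations discussed in this section.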
Observation 1: The AE framework designs $2^k$ codewords in $2n$-dimensional space.
In Sec. IV-B and Remark 3, we have already shown that the training of the AE converges after designing $2^k$ codewords. Since the NN encoder outputs $2n$ real values for each of the $2^k$ codewords, i.e., each of the $2^k$ codewords is represented by a unique set of $n$ complex baseband symbols, the $2^k$ codewords are designed in $2n$-dimensional space.
Observation 2a: As the block length increases, the minimum Euclidean distance between any of the possible codewords increases.
Observation 2b: When the number of codewords becomes extremely large, the minimum Euclidean distance between any two codewords follows a Gaussian distribution for sufficiently large block length ($n$).
Observation 2c: As the block length increases, the Euclidean distance between the codewords concentrates to the average Euclidean distance.

[Fig. 6: Minimum Euclidean distance $d_E$ between each $f$th codeword and its closest $g$th codeword. Histogram panels for $n \in \{1, 3, 5, 7, 10\}$ (rows) and $k \in \{1, 4, 8, 12, 16\}$ (columns); axes: $d_E$ versus the number of codewords $N_{cw}$ formed by the NN encoder of the trained AE.]
We can determine the minimum Euclidean distance [45] between each $f$th codeword, $f \in \{1, \ldots, 2^k\}$, and its closest $g$th codeword as follows:
$$d_E^f = \min_{g \in \{1, \ldots, 2^k\},\; g \neq f} \| x_f - x_g \|_2, \quad \forall f \tag{12}$$
where $2^k$ denotes the number of possible codewords and $x_{(\cdot)}$ denotes a vector comprising $n$ complex values representing each of the $2^k$ possible codewords. The minimum Euclidean distance between any of the possible codewords is given as
$$d_{E_{\min}} = \min_{f} d_E^f, \quad f \in \{1, \ldots, 2^k\} \tag{13}$$
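Equations (12) and (13) translate directly into a pairwise-distance computation. The sketch below assumes the codebook is stored as a $(2^k, n)$ complex array; the function names are illustrative, and unit-energy QPSK is used only as a small test codebook.

```python
import numpy as np

def nearest_neighbour_distances(codebook):
    """d_E^f of (12): distance from each codeword f to its closest other codeword."""
    diff = codebook[:, None, :] - codebook[None, :, :]     # (M, M, n) differences
    dist = np.sqrt(np.sum(np.abs(diff) ** 2, axis=-1))     # pairwise l2 norms
    np.fill_diagonal(dist, np.inf)                         # exclude g = f
    return dist.min(axis=1)

def min_distance(codebook):
    """d_Emin of (13): smallest nearest-neighbour distance over all codewords."""
    return nearest_neighbour_distances(codebook).min()

# Unit-energy QPSK viewed as a codebook with n = 1 and k = 2.
qpsk = np.array([[1 + 1j], [1 - 1j], [-1 + 1j], [-1 - 1j]]) / np.sqrt(2)
d_f = nearest_neighbour_distances(qpsk)
d_min = min_distance(qpsk)
```

The full pairwise matrix needs memory quadratic in $2^k$, so for $k = 16$ the distances would have to be computed in row blocks rather than in one shot.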
To analyze this observation, we train the proposed AE for $(n \in \mathcal{N}, k \in \mathcal{K})$ and, using (13), determine the minimum Euclidean distance between all the $2^k$ designed codewords ($d_{E_{\min}}$) in Fig. 5. We can see that $d_{E_{\min}}$ increases as the block length ($n$) increases and decreases as the number of input bits ($k$) increases, because the $2^k$ codewords are being designed in $2n$-dimensional space.
We also plot a histogram of the minimum Euclidean distance between each $f$th codeword and its closest $g$th codeword, by calculating $d_E = \{d_E^1, \ldots, d_E^{2^k}\}$ using (12) for varying $(n \in \mathcal{N}, k \in \mathcal{K})$ in Fig. 6. Interestingly, when $k \geq 8$ and $n \leq 3$, i.e., in scenarios where the number of codewords formed is very large and the block length is very small, $d_{E_{\min}}$ approaches zero (Fig. 5) and the minimum Euclidean distance between each codeword and its closest codeword is also zero (marked in red in Fig. 6). In such scenarios, the AE learns to cheat by placing the $2^k$ codewords on top of each other, because the space is too small to accommodate the large number of codewords. Moreover, in Fig. 6, when the block length ($n$) increases, the mean of the histogram also increases because $d_{E_{\min}}$ increases, indicating that the spacing between any two closest codewords grows with the block length.
Moreover, in Fig. 6, when the number of codewords becomes extremely large (for $k \geq 8$) and the block length is sufficiently large (for $n \geq 5$), we can see that (i) although the overall minimum Euclidean distance $d_{E_{\min}}$ (obtained using (13)) is small (Fig. 5), the minimum Euclidean distance $d_E$ between each $f$th codeword and its closest $g$th codeword is comparatively large for almost all the codewords and follows a Gaussian distribution (marked in green in Fig. 6), as a consequence of the central limit theorem; and (ii) as the block length increases, the standard deviation (spread) of the Gaussian distribution decreases, indicating that the Euclidean distance of the codewords concentrates to the average Euclidean distance. These observations resemble the desired theoretical observations of BCM design discussed in [26]. Therefore, we claim that the proposed AE framework can produce a random BCM design that achieves the best possible distance observations for sufficiently large block length ($n$) for any number of input bits ($k$).
Observation 3: The packing density improves as the rate $R$ decreases.
We can define the normalized second-order moment [45] as the average squared Euclidean distance between a point in the packing and the origin of the coordinate system, normalized by the square of the minimum Euclidean distance $d_{E_{\min}}$, as
$$E_n = \frac{1}{2^k\, d_{E_{\min}}^2} \sum_{f=1}^{2^k} \| x_f \|_2^2 \tag{14}$$
[Fig. 7: Packing density. Axes: block length ($n$) versus normalized second-order moment ($E_n$); curves for $k = 1, 4, 8, 12, 16$.]
This metric is invariant to scaling and is thus pivotal for differentiating the packing densities. The lower the $E_n$, the better the designed BCM. In Fig. 7 we analyze the packing density $E_n$ with varying rate $R = k/n$. We can see that the packing density of the AE-based BCM improves as the block length ($n$) increases or the number of input bits ($k$) decreases, for all $(n \in \mathcal{N}, k \in \mathcal{K})$, because the $2^k$ codewords are being designed in the $2n$-dimensional space.
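The scale invariance of $E_n$ in (14) can be checked numerically. This is a minimal sketch: the function name is an assumption, the minimum distance is passed in as a known value, and unit-energy QPSK serves only as a small example codebook.

```python
import numpy as np

def normalized_second_moment(codebook, d_min):
    """E_n of (14): mean squared norm of the codewords divided by d_Emin^2."""
    energy = np.sum(np.abs(codebook) ** 2, axis=1)   # ||x_f||_2^2 per codeword
    return energy.mean() / d_min ** 2

qpsk = np.array([[1 + 1j], [1 - 1j], [-1 + 1j], [-1 - 1j]]) / np.sqrt(2)
e_n = normalized_second_moment(qpsk, d_min=np.sqrt(2))
# Scaling every codeword by 3 also scales the minimum distance by 3.
e_n_scaled = normalized_second_moment(3 * qpsk, d_min=3 * np.sqrt(2))
```

Both calls return the same value, because the codeword energy and the squared minimum distance scale by the same factor.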
Observation 4: The codes designed by the AE framework are spherical codes.
The normalized fourth-order moment, or kurtosis [45], measures the variation of the squared Euclidean norm among the constellation points, defined as
$$\chi = \frac{1}{E_n^2\, 2^k\, d_{E_{\min}}^4} \sum_{f=1}^{2^k} \| x_f \|_2^4 \tag{15}$$
By simulations, we find that the proposed AE creates 'spherical codes' with $\chi = 1$, i.e., equal norm for all the $2^k$ codewords, for all the $(n \in \mathcal{N}, k \in \mathcal{K})$ scenarios.
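Substituting (14) into (15) cancels $d_{E_{\min}}$, so $\chi$ reduces to the mean of $\|x_f\|_2^4$ divided by the squared mean of $\|x_f\|_2^2$. The sketch below uses this simplified form; the function name and the two reference constellations (QPSK and square 16-QAM) are illustrative choices, not codebooks from the paper.

```python
import numpy as np

def kurtosis(codebook):
    """chi of (15) after cancelling d_Emin: mean(||x||^4) / mean(||x||^2)^2."""
    energy = np.sum(np.abs(codebook) ** 2, axis=1)     # ||x_f||_2^2 per codeword
    return np.mean(energy ** 2) / np.mean(energy) ** 2

qpsk = np.array([[1 + 1j], [1 - 1j], [-1 + 1j], [-1 - 1j]]) / np.sqrt(2)
levels = np.array([-3, -1, 1, 3])
qam16 = np.array([[a + 1j * b] for a in levels for b in levels])

chi_qpsk = kurtosis(qpsk)      # all points have the same norm
chi_qam16 = kurtosis(qam16)    # three different norms
```

An equal-norm constellation such as QPSK gives $\chi = 1$, whereas square 16-QAM gives $\chi = 1.32$, so $\chi = 1$ is a compact test for a spherical code.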
Observation 5: As the block length increases, the average Hamming distance between codewords increases.
From Observation 2, we know that the $2^k$ codewords are designed in the $2n$-dimensional space such that the minimum Euclidean distance between the $f$th codeword and its closest $g$th codeword is Gaussian distributed. Thus, the distance between any two codewords is different and does not follow a grid-like structure, so we cannot directly use the minimum Euclidean distance to determine the average Hamming distance between two closest codewords. Hence, in this work, we first determine the minimum Euclidean distance $d_{E_{\min}}$ of the $2^k$ codewords using (13). Then, for each $f$th codeword, we determine all the codewords within the sphere whose radius is given by the minimum Euclidean distance, $d^f \leq d_{E_{\min}} + \xi$ with $\xi \geq 0$, and represent these codewords by a set $S_f$. We then determine the average Hamming distance between each $f$th codeword and all the codewords in its corresponding set $S_f$ [46], [47], as
$$d_H^{\text{avg},f} = \sum_{g \in S_f} \frac{d_H(f, g)}{|S_f|} \tag{16}$$
[Fig. 8: Average Hamming distance. Axes: radius of the sphere used to determine the codeword set ($d_{E_{\min}}$ to $3 d_{E_{\min}}$) versus average Hamming distance; curves for $n = 1, 3, 5, 7, 10$.]
where $d_H(f, g)$ denotes the Hamming distance between codewords $f$ and $g$, and $|S_f|$ is the cardinality of the set $S_f$. Now, we can determine the average Hamming distance over all the $f \in \{1, \ldots, 2^k\}$ codewords as
$$d_H^{\text{avg}} = \frac{1}{|d_H^{\text{avg},f} > 0|} \sum_{f=1}^{2^k} d_H^{\text{avg},f} \tag{17}$$
where $|d_H^{\text{avg},f} > 0|$ is the number of codewords $f$ with non-zero $d_H^{\text{avg},f}$. For fixed input bits $k = 8$, we determine the average Hamming distance in (17) for varying block lengths $n \in \mathcal{N}$ and varying $\xi = \{0, 0.5, 1, 1.5, 2\}$ in Fig. 8. As expected, as the radius of the sphere ($d_{E_{\min}} + \xi$) used to determine the codeword set $S_f$ increases, the average Hamming distance $d_H^{\text{avg}}$ increases. Interestingly, as the block length ($n$) increases, the average Hamming distance $d_H^{\text{avg}}$ also increases, because, as seen in Fig. 5, the minimum Euclidean distance between the codewords $d_{E_{\min}}$ also increases with the block length ($n$).
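The two-step averaging in (16) and (17) can be sketched as follows. The code assumes that the sphere around each codeword has radius $d_{E_{\min}} + \xi$ and that the bit label of every codeword is known; the function name, the small numerical tolerance, and the Gray-labelled QPSK example are assumptions for illustration.

```python
import numpy as np

def average_hamming_distance(codebook, bits, xi=0.0):
    """Average Hamming distance of (17) over neighbour sets S_f of radius d_Emin + xi."""
    diff = codebook[:, None, :] - codebook[None, :, :]
    dist = np.sqrt(np.sum(np.abs(diff) ** 2, axis=-1))       # pairwise Euclidean distances
    np.fill_diagonal(dist, np.inf)                           # a codeword is not its own neighbour
    radius = dist.min() + xi                                 # d_Emin + xi
    hamming = np.sum(bits[:, None, :] != bits[None, :, :], axis=-1)   # d_H(f, g)
    per_codeword = np.zeros(len(codebook))
    for f in range(len(codebook)):
        in_sphere = dist[f] <= radius + 1e-9                 # membership in S_f
        if in_sphere.any():
            per_codeword[f] = hamming[f, in_sphere].mean()   # (16)
    nonzero = per_codeword > 0
    return per_codeword[nonzero].sum() / nonzero.sum()       # (17)

# Gray-labelled unit-energy QPSK: first bit sets the real sign, second the imaginary sign.
qpsk = np.array([[1 + 1j], [1 - 1j], [-1 + 1j], [-1 - 1j]]) / np.sqrt(2)
labels = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
d_h_nearest = average_hamming_distance(qpsk, labels, xi=0.0)
d_h_wide = average_hamming_distance(qpsk, labels, xi=1.0)
```

For Gray-labelled QPSK the nearest neighbours differ in exactly one bit, and widening the sphere to include the diagonal point raises the average to $4/3$, which mirrors the trend of the average Hamming distance growing with the sphere radius.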
VI. PERFORMANCE EVALUATION
In this section, we evaluate the performance of the proposed AE and of the conventional HD-AF and FD-AF relay networks for fixed RSI and varying transmit SNR, and vice versa. Please note that we show the plots with respect to the SINR in (6) (for details, please refer to Appendix A). As this is the first time an NN-based AE framework is proposed in the context of FD networks, for a fair comparison we consider the conventional FD-AF relay networks as the benchmark, wherein we utilize traditional modulation techniques and the (7,4) Hamming code as a baseline error correction code, with the MLD decoding detailed in (7). Also, we utilize RBF channels that remain constant for $n = 7$ transmissions only.
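For readers unfamiliar with the baseline code, the sketch below builds a (7,4) Hamming code in one common systematic form and decodes by minimum Hamming distance. It is an illustration under assumptions: the particular parity matrix `P` is one standard choice and may differ from the one used in the paper, and the hard-decision decoder is a simplified stand-in for the MLD in (7), which operates on the received signal.

```python
import itertools
import numpy as np

# One common systematic form G = [I_4 | P]; the paper's exact matrix is not given.
P = np.array([[1, 1, 0], [1, 0, 1], [0, 1, 1], [1, 1, 1]])
G = np.hstack([np.eye(4, dtype=int), P])

messages = np.array(list(itertools.product([0, 1], repeat=4)))
codewords = (messages @ G) % 2                      # all 16 codewords, shape (16, 7)

def decode_min_distance(received):
    """Hard-decision decoding: pick the message whose codeword is closest in Hamming distance."""
    distances = np.sum(codewords != received, axis=1)
    return messages[np.argmin(distances)]

# For a linear code, the minimum distance equals the minimum non-zero codeword weight.
min_weight = codewords[1:].sum(axis=1).min()

message = np.array([1, 0, 1, 1])
corrupted = (message @ G) % 2
corrupted[2] ^= 1                                   # flip a single bit
recovered = decode_min_distance(corrupted)
```

The code maps 4 bits to 7 bits (rate 4/7) and has minimum distance 3, so any single bit error per block is corrected.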
A. AE-based d-BCM Design - Without CSI Knowledge
In Fig. 9, we analyze the BER of the proposed AE-based d-BCM design. In Fig. 9a–9c, we fix the transmit SNR $E_b/N_0 = 30$ dB and vary the RSI, for varying input bits ($k$)¹.
¹ For fixed $n = 7$, we set $k$ to 4, 8, and 12, which correspond to the d-BPSK, d-QPSK and d-8PSK modulation designs in conventional networks with (7,4) Hamming coding.
[Fig. 9: Performance evaluation for d-BCM for FD-AF relay networks; BER versus SINR [dB] for d-BPSK, d-QPSK and d-PSK-8 + (7,4) HC (FD and HD) and the AE-based d-BCM. Panels (a)–(c): fixed transmit SNR $E_b/N_0 = 30$ dB with $R = 4/7$, $8/7$, $12/7$. Panels (d)–(f): fixed RSI $\sigma_{rr}^2 = 0$ dB with $R = 4/7$, $8/7$, $12/7$. Please note we vary the RSI in (a)–(c) and the transmit SNR in (d)–(f).]
We can see that for small RSI ($\sigma_{rr}^2 \leq -30$ dB) the BER performance of (i) the conventional FD-AF and HD-AF relay networks becomes the same and (ii) the proposed AE converges, because the RSI becomes too small to impact the signal at the FD-AF relay node. Furthermore, the proposed AE outperforms the conventional FD-AF relay networks for all rates ($R$) and RSI levels, even outperforming the conventional HD-AF relay networks for small RSI levels. In Fig. 9d–9f, we fix the RSI at $\sigma_{rr}^2 = 0$ dB and vary the transmit SNR, for varying ($k$). The conventional FD-AF relay network is not able to decode the signals even for $E_b/N_0 = 30$ dB as the RSI is high ($\sigma_{rr}^2 = 0$ dB), but the proposed AE can decode the signals and its BER reduces with the SNR.
We explain the reasons for the gains achieved by the AE as follows. The proposed AE is able to design $2^k$ codewords in $2n$-dimensional space with automatic bit-labelling, by maximizing the bit-wise MI (as detailed in Sec. IV-A). The AE learns this d-BCM design to remove the deteriorating impacts of the RSI, the RBF channels and the AWGN at the nodes, through the proposed end-to-end training until convergence using Remark 3. This leads to the maximization of the minimum Euclidean distance and of the minimum average Hamming distance of the designed codewords, as detailed in Sec. V, and thus to an improvement in the BER performance.
In Fig. 9a–9c, the proposed AE is able to design the d-BCM for $2^k$ codewords in $2n$-dimensional space with the observations detailed in Sec. V, so the BER performance of the proposed AE is almost the same for any rate $R \leq 12/7$. Thus, as the modulation order or rate increases, the proposed AE can outperform the conventional HD-AF relay networks even for higher RSI, i.e. at 10 dB (for $k = 8$) and 5 dB (for $k = 12$) for $n = 7$. For similar reasons, in Fig. 9d–9f, the AE's BER performance becomes closer to that of the conventional HD-AF relay networks as the modulation order or rate increases, indicating that the impact of the RSI is removed by the AE even in the absence of CSI and at very high RSI levels.
B. AE-based BCM Design - With Perfect CSI Knowledge
In Fig. 10, we analyze the BER performance of the proposed AE-based BCM design. For varying ($k$)², in Fig. 10a–10c we fix the transmit SNR $E_b/N_0 = 30$ dB and vary the RSI, and in Fig. 10d–10f we fix $\sigma_{rr}^2 = 0$ dB and vary the SNR. We see similar BER performance trends for BCM as for d-BCM in Sec. VI-A. Unlike the d-BCM design, the BER performance of the BCM designs deteriorates with increasing modulation order or rate, because in the presence of perfect CSI the BER performance of the conventional FD-AF and HD-AF relay networks is already very good, and the advantage the AE had in tackling the RBF channels more effectively than conventional differential schemes is not present for the BCM design.
C. AE-based BCM Design - With Imperfect CSI Knowledge
Now, we analyze the AE's performance in the presence of channel estimation errors. We utilize linear minimum mean squared error (LMMSE) based channel estimation [48], denoted by $h^{\omega}_{(\cdot)} \sim \mathcal{CN}(0, \sigma^2_{h^{\omega}})$, where the error in channel estimation is $e_{(\cdot)} \sim \mathcal{CN}(0, \sigma_e^2)$ for both hops $(\cdot) = \{sr, rd\}$. From the orthogonality principle of the LMMSE, we know that the channel estimation error is independent of the estimated channel; thus, we have
$$h^{\omega}_{(\cdot)} = h_{(\cdot)} + e_{(\cdot)}, \quad (\cdot) = \{sr, rd\} \tag{18}$$
² For fixed $n = 7$, we set $k$ to 4, 8, and 16, which correspond to the BPSK, QPSK and QAM-16 modulation designs in conventional networks with (7,4) Hamming coding.
[Fig. 10: Performance evaluation for BCM for FD-AF relay networks; BER versus SINR [dB] for BPSK, QPSK and QAM-16 + (7,4) HC (FD and HD) and the AE-based BCM. Panels (a)–(c): fixed transmit SNR $E_b/N_0 = 30$ dB with $R = 4/7$, $8/7$, $16/7$. Panels (d)–(f): fixed RSI $\sigma_{rr}^2 = 0$ dB with $R = 4/7$, $8/7$, $16/7$. Please note we vary the RSI in (a)–(c) and the transmit SNR in (d)–(f).]
[Fig. 11: Impact of the CEQ ($\varsigma$) on FD-AF relay networks for rate $R = 8/7$ for fixed transmit SNR and varying RSI; BER versus channel estimation quality ($\varsigma$) for the conventional scheme and the proposed AE at SINR = 7, 21 and 28 dB.]
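We denote the channel estimation quality (CEQ) by $\varsigma$ and assume that the error variance depends on the SNR, denoted by $\gamma$, such that $\sigma_e^2 = \frac{\sigma_h^2}{1 + \varsigma\gamma\sigma_h^2} \approx \frac{1}{1 + \varsigma\gamma}$ and $\sigma^2_{h^{\omega}} = \frac{\varsigma\gamma\sigma_h^2}{1 + \varsigma\gamma\sigma_h^2} \approx \frac{\varsigma\gamma}{1 + \varsigma\gamma}$ [48].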
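The dependence of the two variances on the CEQ and the SNR can be tabulated with a few lines of code. This is a sketch under assumptions: the function names are hypothetical, a unit channel variance ($\sigma_h^2 = 1$) is used as the default, and the estimate variance is written in the algebraically equivalent form $1 - 1/(1 + \varsigma\gamma\sigma_h^2)$ so that $\varsigma = \infty$ is handled without producing an undefined ratio.

```python
import numpy as np

def estimation_variances(ceq, snr_db, sigma_h2=1.0):
    """Return (sigma_e^2, sigma_{h^omega}^2) for CEQ `ceq` and SNR `snr_db` in dB."""
    gamma = 10.0 ** (snr_db / 10.0)            # SNR on a linear scale
    denom = 1.0 + ceq * gamma * sigma_h2
    sigma_e2 = sigma_h2 / denom                # error variance
    sigma_hw2 = 1.0 - 1.0 / denom              # equals ceq*gamma*sigma_h2/denom, safe for ceq = inf
    return sigma_e2, sigma_hw2

def sample_complex_gaussian(variance, size, rng):
    """Draw circularly symmetric complex Gaussian samples CN(0, variance)."""
    scale = np.sqrt(variance / 2.0)
    return scale * (rng.standard_normal(size) + 1j * rng.standard_normal(size))

rng = np.random.default_rng(0)
table = {ceq: estimation_variances(ceq, snr_db=30.0) for ceq in (0.1, 0.5, 1.0, np.inf)}
errors = sample_complex_gaussian(table[0.1][0], size=100_000, rng=rng)
```

At $\varsigma = \infty$ the error variance is zero and the estimate variance equals one, while at 30 dB even $\varsigma = 0.1$ leaves an error variance of only about $1/101$.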
In Fig. 11 we analyze the impact of the CEQ ($\varsigma$) on the proposed AE and on the conventional QPSK + (7,4) Hamming code for a fixed rate $R = 8/7$ and transmit SNR $E_b/N_0 = 30$ dB. Please note that $\varsigma = 0$ indicates a completely erroneous channel estimate, whereas $\varsigma = \infty$ denotes perfect channel estimation. To create an AE that remains robust to varying channel estimation errors, we train a single AE framework on $S_{\text{Train}}$ samples drawn from varying $\varsigma = \{0.1, 0.5, 1, \infty\}$ until convergence, as detailed in Remark 3, and test on unseen $S_{\text{Test}}$ samples of varying CEQs ($\varsigma$). Clearly, the proposed AE outperforms the conventional FD-AF relay networks for all the CEQs, for reasons similar to those in Sec. VI-A. In fact, the BER performance of the proposed AE framework with almost fully erroneous channel estimation ($\varsigma = 0.1$) is better than that of the conventional FD-AF relay networks with perfect channel estimation ($\varsigma = \infty$), and the BER improvement achieved by the proposed AE grows as the RSI increases. This is because the proposed AE-based BCM designs $2^k$ codewords in $2n$-dimensional space with the observations in Sec. V, such that it can handle the impacts of the RSI and channel estimation errors effectively.
D. Reproducibility of Proposed AE Framework
Definition 3 (Reproducibility of AE): An AE is defined to be reproducible, for a given hyper-parameter setting $P_S$, if and only if we can reproduce any trained AE model $M(\theta)$ with a very high probability, such that different training weight initializations and training-validation samples of the AE do not lead to large variations in BER.
We analyze the reproducibility by training the AE 25 times with varying training-validation data and weight initializations, and reporting the mean and standard deviation of the BER on the testing data in Fig. 12. In particular, we evaluate the reproducibility of the proposed AE framework for different RSI levels, while we fix the rate $R = 16/7$ in the BCM design and the rate $R = 12/7$ in the d-BCM design, with transmit SNR $E_b/N_0 = 30$ dB. We can see that the proposed AE framework is highly reproducible, because the standard deviation of the 25 BER values obtained from the 25 different runs lies in the range $10^{-2}$ to $10^{-4}$. This is due to the fact that we train the AE until convergence using Remark 3. Also, as the RSI increases, the variation in BER increases by a factor of two, showing that higher RSI levels negatively impact the reproducibility of the trained AE framework in an FD-AF relay network.
[Fig. 12: Reproducibility of AE ($R = 16/7$) and AE ($R = 12/7$) frameworks. Axes: SINR [dB] versus standard deviation and mean of the BER over 25 iterations.]
E. AE-based BCM and d-BCM design with Outer 5G-NR
LDPC Codes
Until now, we have considered the AE-based BCM and d-BCM design for a short block length ($n = 7$). Recently, the 5G-NR standards proposed to utilize outer LDPC codes to facilitate parallel execution and meet the low-latency and high-throughput requirements of 5G URLLC networks [49]. Thus, we use the 5G-NR LDPC codes with base graph 1 and rate 1/3 as outer codes [49], [50]. Specifically³, we employ 5G-NR LDPC codes with rate 1/3 as the outer code for the scenarios in Fig. 9c, 9f, 10c, 10f. We consider a block length of $n = 11{,}616$ for designing the LDPC codes. In Fig. 13, we compare the BER performance with LDPC outer codes for varying RSI (Fig. 13a) and SNR (Fig. 13b). In Fig. 13a, at a BER of $10^{-4}$, we see that the proposed AE-based BCM and d-BCM outperform the conventional MLD scenario by 11 dB and 17 dB, respectively. In Fig. 13b, we see that even with powerful outer LDPC codes, the conventional MLD scenario cannot decode the signal because of the high RSI ($\sigma_{rr}^2 = 0$ dB), whereas the proposed AE-based BCM and d-BCM designs can decode the signal, and the waterfall in BER appears at 12 dB and 16 dB. In summary, the BER performance gains attained by the proposed AE-based BCM and d-BCM design over MLD for the short block length in Fig. 9c, 9f, 10c, 10f are carried over and enhanced by the outer LDPC codes. This demonstrates the greatly improved decoding abilities of the proposed AE-based BCM and d-BCM, even when operated at long block lengths with the help of an outer code.
³ For the conventional scenario, we employ 5G-NR LDPC codes with rate 1/3 as outer codes and the Hamming code with rate 4/7 as the inner code; we also utilize 16-QAM and d-PSK-8 modulation for the scenarios with and without CSI, respectively. For the proposed AE, we employ 5G-NR LDPC codes with rate 1/3 as outer codes for the AE-based BCM and d-BCM designs with rates of 16/7 and 12/7, respectively.
[Fig. 13: Performance evaluation for AE-based BCM and d-BCM design with outer 5G-NR LDPC codes; BER versus SINR [dB] for the conventional MLD and the proposed AE, (i) with and (ii) without CSI knowledge. (a) Varying RSI and fixed SNR $E_b/N_0 = 30$ dB (annotated improvements: 20 dB with CSI, 32 dB without CSI). (b) Varying transmit SNR and fixed RSI $\sigma_{rr}^2 = 0$ dB.]
VII. CONCLUSION
In this work, we propose end-to-end learning-based FD-AF relay networks in the presence of RSI using AEs for high transmission rates $R = k/n$. We propose $(n, k)$ AE-based BCM and d-BCM designs depending upon the availability of CSI knowledge. Further, counter-intuitively in the presence of RSI in FD-AF relay networks, we propose to utilize a radio transformer network for the AE framework with CSI knowledge to improve the NN-based decoding and BER performance. We design a single AE framework (performing BCM or d-BCM design) that generalizes well to varying testing SNR or RSI levels, outperforming not only the conventional FD-AF relay networks with remarkable gains but also the half-duplex AF relay networks for d-BCM designs. We analyze the AE's performance in the presence of channel estimation errors and note that, for moderate RSI, the proposed AE framework with almost fully erroneous channel estimation still outperforms the conventional FD-AF relay networks with perfect channel estimation. Moreover, we show that the proposed AE is highly reproducible for varying training weight initializations and sample sets, as the BER across different training runs varies with a standard deviation of $10^{-2}$ to $10^{-4}$, depending on the RSI level at the FD-AF relay node. Lastly, we consider 5G-NR LDPC codes as outer codes for the AE-based BCM and d-BCM designs and observe BER performance gains of up to 17 dB.
[Fig. 14: Impact of transmit SNR and RSI on the received SINR. (a) Transmit SNR $E_b/N_0$ versus SINR for RSI $\sigma_{rr}^2 = 0$, $-40$ and 10 dB. (b) RSI $\sigma_{rr}^2$ versus SINR for transmit SNR $E_b/N_0 = 0$, 15 and 30 dB.]
[Fig. 15: Impact of including an RTN in the AE frameworks; BER versus SINR [dB] for the AE-based BCM and d-BCM with and without an RTN (annotated gaps: 1.8 dB and 5 dB).]
With a focus on training convergence, we show that the AE converges above a minimum required SNR and below a maximum RSI, depending on the transmission rate and CSI availability. Furthermore, we provide the necessary conditions for the AE's convergence by showing that once the binary cross-entropy validation loss has converged and the NN encoder of the AE designs $2^k$ codewords during the training phase, the AE has converged to its maximum potential of decoding the signal. Lastly, by analyzing the AE-based BCM design, we determine distinct observations of the designed codewords in $2n$-dimensional space: (i) the AE forms $2^k$ codewords in $2n$-dimensional space; (ii) as the block length increases, the minimum Euclidean distance between any of the possible codewords increases, and for sufficiently large block length ($n$), when the number of codewords becomes extremely large, the minimum Euclidean distance between any two codewords follows a Gaussian distribution and the Euclidean distance between the codewords concentrates to the average Euclidean distance; (iii) the packing density improves as the rate $R$ decreases; (iv) the codewords designed by the AE framework are spherical codes; and (v) as the block length increases, the average Hamming distance between codewords increases. We aim to consider the transmitter/receiver distortion model discussed in [17] in future work. Re-training of the AE frameworks in an online setting using transfer learning [51] is an interesting topic that we also leave for future work.
APPENDIX A
IMPACT OF TRANSMIT SNR AND RSI ON THE RECEIVED
SINR AT THE DESTINATION NODE
The received SINR at the destination node is given in (6) and depends on two factors: (1) the transmit SNR ($E_b/N_0$) of the source and relay nodes, and (2) the RSI ($\sigma_{rr}^2$) at the relay node. For the sake of simplicity, we consider equal transmit SNR at the source and relay nodes. In Fig. 14, we show the impact of the transmit SNR and RSI on the received SINR. Clearly, the received SINR increases as the transmit SNR increases or the RSI decreases. Please note that for the evaluations in Sec. VI, we either fix the RSI at $\sigma_{rr}^2 = 0$ dB and vary the transmit SNR over $[0, 30]$ dB, so that the SINR varies over $[-5, 3]$ dB (as shown in Fig. 14a), or fix the transmit SNR at $E_b/N_0 = 30$ dB and vary the RSI over $[-60, 20]$ dB, so that the SINR varies over $[-12, 30]$ dB (as shown in Fig. 14b).
APPENDIX B
IMPACT OF INCLUDING RTN IN AE FRAMEWORKS
In the AE works for HD-AF relay networks [6], an RTN is included in the d-BCM design and excluded in the BCM design. In Fig. 15, we analyze the impact of including an RTN in the NN decoder of the proposed AE frameworks for $(n, k) = (7, 4)$, $E_b/N_0 = 30$ dB and varying RSI. We can see that including an RTN in the AE-based BCM design improves the BER performance by at least 5 dB for lower RSI ($\sigma_{rr}^2 \leq -20$ dB), whereas including an RTN in the AE-based d-BCM design worsens the BER performance by at least 1.8 dB for higher RSI ($\sigma_{rr}^2 \geq 0$ dB). Thus, in this work, in contrast to [6], we include an RTN in the BCM design and do not include an RTN in the d-BCM design.
REFERENCES
[1] T. O’Shea and J. Hoydis, An Introduction to Deep Learning for the
Physical Layer, in IEEE Transactions on Cognitive Communications and
Networking, vol. 3, no. 4, pp. 563–575, Dec. 2017.
[2] S. D¨
orner, S. Cammerer, J. Hoydis and S. t. Brink, “Deep Learning Based
Communication Over the Air, in IEEE Journal of Selected Topics in
Signal Processing, vol. 12, no. 1, pp. 132-143, Feb. 2018.
[3] E. Balevi and J. G. Andrews, Autoencoder-Based Error Correction Cod-
ing for One-Bit Quantization,” in IEEE Transactions on Communications,
vol. 68, no. 6, pp. 3440–3451, June 2020.
[4] T. Matsumine, T. Koike-Akino and Y. Wang, “Deep Learning-Based
Constellation Optimization for Physical Network Coding in Two-Way
Relay Networks,” ICC 2019 - 2019 IEEE Intern. Conf. on Commun.
(ICC), China, 2019, pp. 1–6.
[5] A. Gupta and M. Sellathurai, “End-to-End Learning-based Amplify-and-
Forward Relay Networks using Autoencoders, ICC 2020 - 2020 IEEE
International Conference on Communications (ICC), Dublin, Ireland,
2020, pp. 1–6.
[6] A. Gupta and M. Sellathurai, ”End-to-End Learning-Based Framework
for Amplify-and-Forward Relay Networks, in IEEE Access, vol. 9, pp.
81660-81677, 2021.
[7] A. Gupta and M. Sellathurai, “End-to-End Learning-based Two-Way AF
Relay Networks with I/Q Imbalance,” 2021 IEEE 22nd International
Workshop on Signal Processing Advances in Wireless Communications
(SPAWC), 2021, pp. 111-115.
[8] A. Gupta and M. Sellathurai, “A Novel Average Autoencoder-based
Amplify-and-Forward Relay Networks with Hardware Impairments,
IEEE Transactions on Cognitive Communications and Networking, Ac-
cepted for Publication, 2021.
[9] Y. Lu, P. Cheng, Z. Chen, Y. Li, W. H. Mow and B. Vucetic, “Deep
Autoencoder Learning for Relay-Assisted Cooperative Communication
Systems,” in IEEE Transactions on Communications, vol. 68, no. 9, pp.
5471–5488, Sept. 2020.
[10] A. Gupta and M. Sellathurai, “A Stacked-Autoencoder Based End-to-
End Learning Framework for Decode-and-Forward Relay Networks,
ICASSP 2020 - IEEE Intern. Conf. Acoustics, Speech, Signal Process.
(ICASSP), 2020, pp. 5245–5249.
[11] A. Gupta and M. Sellathurai, “A Stacked Autoencoder-based Decode-
and-Forward Relay Networks with I/Q Imbalance, AI-6G Workshop,
IEEE World Congress on Computational intelligence (WCCI Workshop),
Accepted, 2022.
[12] S. Cammerer, F. A. Aoudia, S. Drner, M. Stark, J. Hoydis and S. ten
Brink, “Trainable Communication Systems: Concepts and Prototype, in
IEEE Transactions on Communications, vol. 68, no. 9, pp. 5489–5503,
Sept. 2020.
[13] A. Gupta, K. Singh and M. Sellathurai, “Time-Switching EH-Based
Joint Relay Selection and Resource Allocation Algorithms for Multi-
User Multi-Carrier AF Relay Networks,” in IEEE Trans. Green Commun.
Netw., vol. 3, no. 2, pp. 505-522, June 2019.
[14] K. Singh, A. Gupta, T. Ratnarajah and M. Ku, A General Approach
Toward Green Resource Allocation in Relay-Assisted Multiuser Commu-
nication Networks,” in IEEE Trans. Wireless Commun., vol. 17, no. 2,
pp. 848-862, Feb. 2018.
[15] A. Gupta, S. Biswas, K. Singh, T. Ratnarajah and M. Sellathurai, An
Energy-Efficient Approach Towards Power Allocation in Non-Orthogonal
Multiple Access Full-Duplex AF Relay Systems,” 2018 IEEE 19th Intern.
Work. Signal Process. Advances in Wireless Commun, (SPAWC), 2018, pp.
1-5.
[16] K. Singh, A. Gupta and T. Ratnarajah, “Efficient joint subcarrier and
power allocation for achieving green multiuser full-duplex decode-and-
forward relay networks,” 2017 IEEE Intern. Conf. on Commun. (ICC),
2017, pp. 1-6.
[17] J. Xue, S. Biswas, A. C. Cirik, H. Du, Y. Yang, T. Ratnarajah, and M.
Sellathurai, “Transceiver Design of Optimum Wirelessly Powered Full-
Duplex MIMO IoT Devices, in IEEE Trans. Commun., vol. 66, no. 5,
pp. 1955-1969, May 2018.
[18] C. Zhong, M. Matthaiou, G. K. Karagiannidis and T. Ratnarajah,
“Generic Ergodic Capacity Bounds for Fixed-Gain AF Dual-Hop Re-
laying Systems,” in IEEE Trans. Vehicular Technol., vol. 60, no. 8, pp.
3814-3824, Oct. 2011.
[19] Z. Ding, T. Ratnarajah and K. K. Leung, “On the study of network
coded AF transmission protocol for wireless multiple access channels,”
in IEEE Trans. Wireless Commun., vol. 8, no. 1, pp. 118-123, Jan. 2009.
[20] A. Bishnu, M. Holm and T. Ratnarajah, “Performance Evaluation of
Full-Duplex IAB Multi-Cell and Multi-User Network for FR2 Band,” in
IEEE Access, vol. 9, pp. 72269-72283, 2021.
[21] H. He, S. Biswas, P. Aquilina, T. Ratnarajah and J. Yang, “Performance
Analysis of Multi-Cell Full-Duplex Cellular Networks,” in IEEE Access,
vol. 8, pp. 206914-206930, 2020.
[22] J. R. Krier and I. F. Akyildiz, “Active self-interference cancellation
of passband signals using gradient descent,” 2013 IEEE 24th Ann.
Intern. Symp. on Personal, Indoor, and Mobile Radio Commun. (PIMRC),
London, UK, 2013, pp. 1212–1216.
[23] T. P. Do and T. V. T. Le, “Power Allocation and Performance Compar-
ison of Full Duplex Dual Hop Relaying Protocols,” in IEEE Communi-
cations Letters, vol. 19, no. 5, pp. 791–794, May 2015.
[24] K. -G. Wu, F. -T. Chien, Y. -F. Lin and M. -K. Chang, “SINR and Delay
Analyses in Two-Way Full-Duplex SWIPT-Enabled Relaying Systems,”
in IEEE Transactions on Communications, Early Access, 2021.
[25] K. Yang, H. Cui, L. Song and Y. Li, “Efficient Full-Duplex Relaying
With Joint Antenna-Relay Selection and Self-Interference Suppression,”
in IEEE Transactions on Wireless Communications, vol. 14, no. 7, pp.
3991-4005, July 2015.
[26] H. M. D. Oliveira and G. Battail, “The random coded modulation: perfor-
mance and Euclidean distance spectrum evaluation,” Ann. Télécommun.,
vol. 47, pp. 107-124, 1992.
[27] I. Goodfellow et al., Deep Learning. MIT Press, 2016.
[28] D. Masters and C. Luschi, “Revisiting Small Batch Training for Deep
Neural Networks,” arXiv, 2018.
[29] Y. Huang et al., “GPipe: Efficient Training of Giant Neural Networks
using Pipeline Parallelism,” in Proc. Thirty-third Conference on Neural
Information Processing Systems (NIPS), 2019.
[30] L. Jiménez Rodríguez, N. H. Tran and T. Le-Ngoc, “Performance of
Full-Duplex AF Relaying in the Presence of Residual Self-Interference,”
in IEEE Journal on Selected Areas in Communications, vol. 32, no. 9,
pp. 1752-1764, Sept. 2014.
[31] T. K. Baranwal, D. S. Michalopoulos and R. Schober, “Outage
Analysis of Multihop Full Duplex Relaying,” in IEEE Communi-
cations Letters, vol. 17, no. 1, pp. 63–66, January 2013, doi:
10.1109/LCOMM.2012.112812.121826.
[32] M. R. Avendi and H. H. Nguyen, “Performance of Selection Combin-
ing for Differential Amplify-and-Forward Relaying Over Time-Varying
Channels,” in IEEE Trans. on Wireless Communications, vol. 13, no. 8,
pp. 4156–4166, Aug. 2014.
[33] Y. Lou, Y. Ma, Q. Yu, H. Zhao and W. Xiang, “A Differential ML Com-
biner for Differential Amplify-and-Forward System in Time-Selective
Fading Channels,” in IEEE Trans. on Vehicular Technology, vol. 65, no.
12, pp. 10157–10163, Dec. 2016.
[34] T. M. Cover and J. A. Thomas, Elements of Information Theory. John
Wiley & Sons, Nov. 2012.
[35] N. Ketkar, “Introduction to Keras,” in Deep Learning with Python, Springer,
pp. 97–111, 2017.
[36] Martín Abadi et al., “TensorFlow: Large-scale machine learning on
heterogeneous systems,” Technical Report, Google Brain, arXiv, 2015.
[Online]: https://arxiv.org/abs/1605.08695.
[37] D. Kingma and J. Ba, “Adam: A method for stochastic optimization,”
arXiv preprint arXiv:1412.6980, 2014. [Online].
[38] X. Glorot and Y. Bengio, “Understanding the difficulty of training deep
feed-forward neural networks,” in Proceedings International Conference
AI Statistics, vol. 9, pp. 249–256, May 2010.
[39] M. Li, M. Soltanolkotabi and S. Oymak, “Gradient Descent with Early
Stopping is Provably Robust to Label Noise for Overparameterized Neural
Networks,” arXiv, 2019. [Online]: https://arxiv.org/abs/1903.11680.
[40] F. Alberge, “Deep Learning Constellation Design for the AWGN
Channel With Additive Radar Interference,” in IEEE Transactions on
Communications, vol. 67, no. 2, pp. 1413–1423, Feb. 2019.
[41] M. Stark, F. Ait Aoudia and J. Hoydis, “Joint Learning of Geometric and
Probabilistic Constellation Shaping,” 2019 IEEE Globecom Workshops
(GC Wkshps), 2019, pp. 1-6.
[42] F. A. Aoudia and J. Hoydis, “Joint Learning of Probabilistic and
Geometric Shaping for Coded Modulation Systems,” GLOBECOM 2020
- 2020 IEEE Global Communications Conference, 2020, pp. 1-6.
[43] A. Choromanska, M. Henaff, M. Mathieu, G. B. Arous, and Y. LeCun,
“The loss surfaces of multilayer networks,” in Proc. 18th Int. Conf. Artificial
Intelligence and Statistics (AISTATS), 2015, pp. 192–204.
[44] Y. N. Dauphin et al., “Identifying and attacking the saddle point problem
in high-dimensional non-convex optimization,” in Proc. 27th Advances in
Neural Information Processing Systems, pp. 2933–2941, 2014.
[45] E. Agrell, “Database of sphere packings,” [Online]: http://codes.se/packings,
2014, accessed Mar. 1, 2019.
[46] S. Park and Moo-Kwang Byeon, “Irregularly distributed triangular
quadrature amplitude modulation,” 2008 IEEE 19th International Sym-
posium on Personal, Indoor and Mobile Radio Communications, Cannes,
France, 2008, pp. 1–5.
[47] T. G. Markiewicz, “Construction and Labeling of Triangular QAM,” in
IEEE Communications Letters, vol. 21, no. 8, pp. 1751–1754, Aug. 2017.
[48] A. R. Heidarpour, M. Ardakani and C. Tellambura, “Network Coded
Cooperation Based on Relay Selection with Imperfect CSI,” 2017 IEEE
86th Vehicular Technology Conference (VTC-Fall), 2017, pp. 1-5.
[49] 3GPP TS 38.212 v16.0.0, “NR; Multiplexing and channel coding.” 3rd
Generation Partnership Project (3GPP), Technical Specification Group
Radio Access Network, Jan. 2020.
[50] M. Shirvanimoghaddam et al., “Short Block-Length Codes for Ultra-
Reliable Low Latency Communications,” in IEEE Communications Mag-
azine, vol. 57, no. 2, pp. 130–137, February 2019.
[51] C. Tan, F. Sun, T. Kong, W. Zhang, C. Yang, and C. Liu, “A survey
on deep transfer learning,” arXiv, 2018, arXiv:1808.01974. [Online].
Available: https://arxiv.org/abs/1808.01974.