
End-to-End Learning-Based Full-Duplex Amplify-and-Forward Relay Networks

January 2022
IEEE Transactions on Communications PP(99):1-1

DOI:10.1109/TCOMM.2022.3225460

Authors:

Ankit Gupta

Heriot-Watt University

Mathini Sellathurai

Heriot-Watt University

T. Ratnarajah

The University of Edinburgh

Full duplex (FD) relaying can provide double spectral efficiency. Despite advanced self-interference cancellation techniques, residual self-interference (RSI) limits the performance significantly. We present an autoencoder (AE)-based block coded modulation (BCM) and differential BCM (d-BCM) for an FD amplify-and-forward (FD-AF) relay network that can tackle the deteriorating impacts of RSI. Existing works treat AE frameworks as a black box with minimal or no focus on training convergence, limiting the AE's practical deployment. Focusing on training convergence, firstly, we show that the training of the AE converges above a minimum signal-to-noise-ratio (SNR) and below a maximum RSI level, and that CSI helps in faster convergence. Secondly, we establish a relationship between training hyper-parameters and AE-based BCM/d-BCM design, by showing that, for any given hyper-parameters, training of the AE has converged to its maximum potential of decoding if the AE's encoder has designed $2^k$ codewords, with an emphasis on the minimum required training samples. To open the black-box AE, we reveal five observations in the AE-based designed codewords concerning Euclidean distance, packing density, Hamming distance, and kurtosis, that resemble the desired observations of theoretical random coded modulations. By extensive simulations, we show that the proposed AE outperforms the conventional methods considerably for varying SNR, RSI, transmission rate, channel estimation errors, and small/practical block lengths.

Ankit Gupta, Mathini Sellathurai and Tharmalingam Ratnarajah

Index Terms—Amplify-and-forward, autoencoder, block coded

modulation, deep learning, differential block coded modulation,

full-duplex, neural networks, relay networks, and residual self-interference.

I. INTRODUCTION

The end-to-end learning-based autoencoder (AE) framework has emerged as a promising solution for performing block coded modulation (BCM) design with channel state information (CSI) knowledge and differential BCM (d-BCM) design without CSI knowledge, achieving significant bit-error-rate (BER) performance gains for rate $R = k/n$ [bits/channel-reuse] [1]–[12]. AE frameworks have been investigated extensively for the amplify-and-forward (AF) [13], [15] and decode-and-forward (DF) [14], [16] relaying protocols utilizing the half-duplex (HD) mode in [4]–[8] and [9]–[11], respectively. Recently, the full-duplex (FD) relay has been recognized as an enabling technology to realize the expected gains in future networks, as it can double the spectral efficiency by establishing concurrent transmission and reception on the same temporal and spectral resources [15]–[25]. However, none of the AE frameworks in [1]–[12] have considered an FD relaying network, where the self-interference (SI) leaking from the signal transmitted by the FD relay node interferes with the signal received at the FD relay node, thereby limiting the spectral efficiency gains. Recent works [15]–[21] attest that FD operation is facilitated by combining multiple self-interference cancellation (SIC) methods, such as antenna isolation, analog-domain suppression, and digital-domain suppression [22]. Even with multiple SIC techniques, a residual self-interference (RSI) is always present in the system. Several studies have focused on conventional analysis and optimization of FD-AF relaying in the presence of RSI [23]–[25], wherein the relay requires the channel gains of the source-to-relay link and of the self-interference channel to determine the amplification factor. Besides, the destination node needs CSI knowledge of the overall source-relay-destination channel, and sometimes the channel gains of the self-interference channel. Further, estimating CSI increases the feedback overhead, which will increase exponentially in future internet-of-things networks. However, differential FD-AF relaying networks have never been studied, to the best of the authors' knowledge. Further, none of the prior AE or conventional works [1]–[25] have performed BCM or d-BCM for FD-AF relay networks.

A. Gupta and M. Sellathurai are with the Engineering and Physical Science (EPS) department at Heriot-Watt University, Edinburgh EH14 4AS, U.K. (e-mail: {ag104, m.sellathurai}@hw.ac.uk). T. Ratnarajah is with the Institute for Digital Communications, the University of Edinburgh, Edinburgh EH9 3FG, U.K. (e-mail: t.ratnarajah@ed.ac.uk).

This work is supported in part by the U.K. Engineering and Physical Sciences Research Council under Grant EP/P009670/1, the COG-MHEAR: Towards cognitively-inspired 5G IoT enabled, multi-modal Hearing Aids under Grant EP/T021063/1, and the Signal Processing in the Information Age under Grant EP/S000631/1.

In the seminal work of 1991 [26], Oliveira investigated

random short BCM design in 2-dimensional space using

the minimum squared Euclidean distance metric and showed

the existence of the optimum random BCM design, without

providing the method to obtain the same. The AE frameworks

provide us the method to perform the BCM/d-BCM design

without mathematical formulations, in 2n-dimensional space.

However, the AE frameworks [1]–[12] are considered “black

box” models, where insights into the obtained solutions almost

remain non-existent. The t-stochastic neighbor embedding (t-

SNE) [27] has been utilized for insights into the AE-based

BCM/d-BCM designed constellations in higher-dimensional

space [1], [6]. However, t-SNE only provides information about the number of constellation points (codewords); it does not yield any other relevant information about the designed codewords. Thus, there is an urgent need to open the black box of AE frameworks for the full understanding and practical realization of AE frameworks in future networks.

The process of training AE frameworks includes determining hyper-parameter settings, such as weight initialization, activation functions, learning rate, batch size, etc. [27]. Although the training process of neural networks (NNs) has seen advancements, no universal technique exists to optimize these hyper-parameter settings, leading to suboptimal choices that force the AE to get stuck in a local minimum while minimizing the cross entropy (CE) loss (cf. [28], [29]). Thus, in the existing literature [1]–[12], hyper-parameter settings are obtained sub-optimally by a hit-and-trial method, which involves training an AE framework with various hyper-parameter settings, monitoring the validation CE loss, and picking the hyper-parameter settings that give the minimum validation CE loss during training. However, from a coded

modulation perspective, a major problem with determining

the AE’s convergence by simply monitoring the validation

CE loss is that the validation CE loss for most of the non-

optimal hyper-parameter settings also reduces with training

epochs, and thus we cannot determine with certainty whether the AE-based design of the BCM and d-BCM has converged by only monitoring the validation CE loss. Thus, for any given hyper-parameter settings, we need to determine the relationship between the BCM and d-BCM designs performed by the AE and the hyper-parameter settings, which can indicate whether the trained AE has converged to its maximum potential. Furthermore, for greater insight into the training convergence of AE frameworks in practical settings, we also need to determine the impact of varying signal-to-noise-ratio (SNR) and RSI levels, and of the presence/absence of CSI knowledge, on the training convergence.

The major contributions of this work are summarized as:

• We propose a bit-wise AE-based FD-AF relay network, where we consider an NN-based encoder-decoder at the source and destination nodes, and a conventional FD-AF relay node operating in the presence of RSI. Depending on the availability of CSI knowledge, we consider three scenarios. Firstly, we propose an AE-based BCM design with CSI knowledge. Secondly, we also analyze the proposed AE-based BCM design with imperfect CSI knowledge. Thirdly, we completely remove the necessity of CSI knowledge by proposing differential FD-AF relay networks: (i) we design the amplification factor for the conventional FD-AF relay node by including the second-order channel statistics of the RSI, and (ii) we propose an AE-based d-BCM design. In contrast to the existing literature [1], [5]–[10], which has analyzed AE frameworks with the rates $R = 4/7, 8/7$, we design a single NN architecture for the AE framework that can handle varying high-rate transmissions such as $R = \{4/7, 8/7, 12/7, 16/7\}$ for short block length ($n = 7$). Furthermore, we train the AE to remain highly generalizable across the testing SNR or RSI levels.

• Focusing on the training convergence of the proposed AE framework for the FD-AF relay network, we show that:

– For any given hyper-parameter settings, the two necessary conditions for the training convergence are:

C1: The validation CE loss of the AE has converged with respect to the training epochs and number of training samples.

C2: The NN encoder of the AE designs $2^k$ codewords.

– The training of the AE, of sufficiently large block length ($n$), converges to a locally optimal minimum above a minimum required SNR and below a maximum RSI level.

– CSI knowledge helps in faster convergence of the AE frameworks.

• With the aim of opening the black box of the AE-based BCM and d-BCM designs, we utilize the minimum Euclidean distance, packing density, average Hamming distance, and kurtosis to reveal five distinct observations on the $2^k$ codewords designed in $2n$-dimensional space at the NN encoder of the source node. These observations resemble the desired theoretical observations of the optimal random BCM design discussed in [26].

• By performing extensive simulations, we show that the

proposed AE-based BCM and d-BCM designs outper-

form the traditional FD-AF relay networks using (7,4)

Hamming code-based error correction as a baseline for

varying transmission rates (R), SNR and RSI levels by

considerable margins. Further, for longer block lengths,

we consider 5G-NR low density parity check (LDPC)

codes as outer codes for the AE-based BCM and d-BCM

designs and show signiﬁcant BER performance gains.

Lastly, we also show that the proposed AE framework

remains highly reproducible even with different training

samples and weight initializations.

The rest of the work is organized as follows. In Sec. II, we propose the FD-AF relay network, model the RSI, propose a differential scenario, and present the signal transmission-reception model.

In Sec. III we propose the AE-based FD-AF relay network, and

detail the hyper-parameter settings. In Sec. IV, we study the

convergence of AE with its necessary conditions. In Sec. V,

we analyze the observations of BCM and d-BCM designs. We

perform extensive performance evaluation in Section VI and

conclude this work in Section VII.

II. SYSTEM MODEL

We consider an FD-AF relay network, as shown in Fig. 1,

consisting of a source node (S) that wants to transmit its signal

to the destination node (D), with the aid of an FD-AF relay

node (R). Each of the source and destination nodes has a single

antenna for transmission and reception, respectively. The relay

node has two antennas, one for the reception and the other

for transmission. We assume that the direct link between the

source and destination node is strongly attenuated because of

severe path-loss and shadowing effects.

A. Modelling the Residual Self Interference (RSI)

The RSI at the FD-AF relay node (R) can be modeled in two ways: (1) the complex Gaussian random model, where the RSI is modeled as independent and identically distributed (i.i.d.) complex Gaussian random variables, which has a similar effect to noise and aims at emphasizing the effect of RSI on the performance [30], and (2) the general fading effect model, where the RSI is modeled as a statistical fading distribution, such as i.i.d. Rician/Rayleigh fading, to model the RSI channel effectively [31]. In this work, we utilize the general fading effect model to characterize the RSI channel at the relay node R. In particular, the RSI is modelled by an i.i.d. Rayleigh block-fading (RBF) channel $h_{rr} \sim \mathcal{CN}(0, \sigma_{rr}^2)$ [23], [24], such that it remains constant for $n$ transmissions [25].

Fig. 1: System model for FD-AF relay networks.

B. Signal Transmission Model and MLD Decoding

The source node (S) intends to transmit $\mathbf{x} \in \{0,1\}^k$ bits; it first performs channel encoding $\bar{\mathbf{x}}_s = u_c(\mathbf{x})$ to obtain $\{0,1\}^j$ bits that are modulated to $n$ complex baseband symbols $\mathbf{x}_s = u_m(\bar{\mathbf{x}}_s) \in \mathbb{C}^n$, such that $\|\mathbf{x}_s\|_2^2 = n$, where $u_c$ and $u_m$ denote the channel-coding and modulation functions. The source node then performs symbol-by-symbol transmission, and the signal received by the relay node after the SIC, i.e., in the presence of the RSI, at time-instant $\kappa$ is given by

$$y_r[\kappa] = \sqrt{P_s[\kappa]}\, h_{sr}[\kappa]\, x_s[\kappa] + \underbrace{h_{rr}[\kappa]\, x_r[\kappa]}_{\text{RSI}} + n_r[\kappa] \tag{1}$$

where $P_s$ denotes the transmission power of the source node, $h_{sr}$ is the i.i.d. RBF channel with $h_{sr} \sim \mathcal{CN}(0, \sigma_{h_{sr}}^2 = 1)$, $n_r$ is the AWGN at the relay node with $n_r \sim \mathcal{CN}(0, \sigma_r^2)$, and $x_r$ is the amplified signal transmitted by the FD-AF relay node at the same time-instant $\kappa$, given by

$$x_r[\kappa] = \sqrt{P_r[\kappa]}\, \alpha[\kappa-1]\, y_r[\kappa-1] \tag{2}$$

where $P_r$ denotes the relay's transmission power and the amplification factor $\alpha$ is represented as

$$\alpha[\kappa] = \left( P_s[\kappa]\, |h_{sr}[\kappa]|^2 + P_r[\kappa]\, |h_{rr}[\kappa]|^2 + \sigma_r^2 \right)^{-1/2} \tag{3}$$

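Now, the signal received by the destination node is given as

$$\begin{aligned} y_d[\kappa] &= h_{rd}[\kappa]\, x_r[\kappa] + n_d[\kappa] \\ &= h_{rd}[\kappa] \sqrt{P_r[\kappa]}\, \alpha[\kappa-1]\, y_r[\kappa-1] + n_d[\kappa] \end{aligned} \tag{4}$$

$$\begin{aligned} y_d[\kappa] = {} & \underbrace{\sqrt{P_s[\kappa-1]\, P_r[\kappa]}\, \alpha[\kappa-1]\, h_{sr}[\kappa-1]\, h_{rd}[\kappa]\, x_s[\kappa-1]}_{\text{Desired Signal}} \\ & + \underbrace{\sqrt{P_r[\kappa]}\, \alpha[\kappa-1]\, h_{rr}[\kappa-1]\, h_{rd}[\kappa]\, x_r[\kappa-1]}_{\text{RSI Signal}} \\ & + \underbrace{\sqrt{P_r[\kappa]}\, \alpha[\kappa-1]\, h_{rd}[\kappa]\, n_r[\kappa-1] + n_d[\kappa]}_{\text{Noise}} \end{aligned} \tag{5}$$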

where $h_{rd} \sim \mathcal{CN}(0,1)$ is the i.i.d. RBF channel in the second hop and $n_d$ is the AWGN at the destination node with $n_d \sim \mathcal{CN}(0, \sigma_d^2)$. Thus, using (5), the signal-to-interference-and-noise-ratio (SINR) at the destination node, denoted by $\Gamma[\kappa]$, is given by (6). For clarity, we have detailed the impact of transmit SNR and RSI on the SINR in Appendix A. The destination node performs optimal MLD decoding, as follows

$$\hat{x}_d = \arg\min_{x \in \mathcal{C}} \left\| y_d[\kappa] - \sqrt{P_s[\kappa-1]\, P_r[\kappa]}\, \alpha[\kappa-1]\, h_{sr}[\kappa-1]\, h_{rd}[\kappa]\, x \right\|^2 \tag{7}$$

where $\mathcal{C}$ denotes all the possible alphabets. The decoder performs block-by-block channel decoding using the function $u_d$ to obtain $\hat{\mathbf{x}}_s = u_d(\hat{\mathbf{x}}_d)$, where $\hat{\mathbf{x}}_d \in \{0,1\}^j$ and $\hat{\mathbf{x}}_s \in \{0,1\}^k$.
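As a rough illustration of the signal model in (1)–(5), the following NumPy sketch passes one block of $n$ symbols through the FD-AF relay and evaluates the SINR expression referenced as (6). It is only a sketch under stated assumptions: the powers are held constant within a block, the relay is taken to be silent at the first time-instant, and the function names (`cn`, `simulate_fd_af_block`, `sinr`) are hypothetical and not taken from the paper.

```python
import numpy as np

def cn(rng, var, size=None):
    """Draw circularly-symmetric complex Gaussian samples CN(0, var)."""
    return np.sqrt(var / 2) * (rng.standard_normal(size)
                               + 1j * rng.standard_normal(size))

def simulate_fd_af_block(x_s, P_s, P_r, var_rr, var_r, var_d, rng):
    """Pass one block of n symbols through the FD-AF model of (1)-(5)."""
    n = len(x_s)
    # Block fading: each channel is drawn once and kept for n transmissions.
    h_sr, h_rd, h_rr = cn(rng, 1.0), cn(rng, 1.0), cn(rng, var_rr)
    # Amplification factor of (3); constant here because the block is static.
    alpha = (P_s * abs(h_sr) ** 2 + P_r * abs(h_rr) ** 2 + var_r) ** -0.5
    y_r = np.zeros(n, dtype=complex)
    x_r = np.zeros(n, dtype=complex)
    y_d = np.zeros(n, dtype=complex)
    for kappa in range(n):
        if kappa > 0:  # assumption: the relay transmits nothing at kappa = 0
            x_r[kappa] = np.sqrt(P_r) * alpha * y_r[kappa - 1]      # (2)
        y_r[kappa] = (np.sqrt(P_s) * h_sr * x_s[kappa]
                      + h_rr * x_r[kappa] + cn(rng, var_r))         # (1)
        y_d[kappa] = h_rd * x_r[kappa] + cn(rng, var_d)             # (4)
    return y_d, (h_sr, h_rd, h_rr, alpha)

def sinr(P_s, P_r, alpha, h_sr, h_rd, h_rr, var_r, var_d):
    """SINR at the destination, following the structure of (6)."""
    g = P_r * alpha ** 2 * abs(h_rd) ** 2
    return P_s * g * abs(h_sr) ** 2 / (g * abs(h_rr) ** 2 + g * var_r + var_d)
```

With the RSI and noise variances set to zero, the second received sample reduces to the desired-signal term of (5), which gives a quick sanity check of the one-symbol processing delay at the relay.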

C. Differential FD-AF Relay Networks - Without CSI

In the absence of the CSI knowledge, we propose to

utilize traditional differential modulation and demodulation

techniques at the source and destination nodes. For such

scenarios, we propose to design the ampliﬁcation factor for

the FD-AF relay node by utilizing the variances of the ﬁrst-

hop channel between the source and relay node, and the RSI

channel as

$$\alpha[\kappa] = \left( \sigma_{h_{sr}}^2 + \sigma_{rr}^2 + \sigma_r^2 \right)^{-1/2} \tag{8}$$

where the variances $\{\sigma_{h_{sr}}^2, \sigma_{rr}^2, \sigma_r^2\}$ can be obtained via a long-term average of the received signals. Similar approximations have been employed for HD-AF relay networks in [32],

[33]. To include the impact of RSI in FD scenarios we

introduce the variance of the RSI channel in (8).
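A minimal sketch of the statistics-based amplification factor in (8) is given below. It assumes, purely for illustration, that zero-mean sample records of the first-hop channel, the RSI channel, and the relay noise are available so that each variance can be estimated by a long-term average of squared magnitudes; the paper does not specify the estimation procedure, and the function name `differential_alpha` is hypothetical.

```python
import numpy as np

def differential_alpha(h_sr_samples, h_rr_samples, n_r_samples):
    """Amplification factor of (8) from long-term second-order statistics."""
    # Assumption: all records are zero-mean, so mean(|.|^2) estimates the variance.
    var_sr = np.mean(np.abs(h_sr_samples) ** 2)
    var_rr = np.mean(np.abs(h_rr_samples) ** 2)
    var_r = np.mean(np.abs(n_r_samples) ** 2)
    return (var_sr + var_rr + var_r) ** -0.5
```

Unlike (3), this factor does not depend on the instantaneous channel realizations, so the relay needs no CSI at run time.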

III. PROPOSED AE-BASED FD-AF RELAY NETWORKS

The fundamental distinction between the BCM and d-BCM

design by the AE and the conventional networks is that the

AE aims to design the block codes using a learning-based

approach by updating the NN weights, while the conventional

network uses conventional channel codes (such as Hamming

code). Similar to our work [6] for HD-AF relay networks,

we propose bit-wise AE for BCM and d-BCM design for the

FD-AF relay network, as shown in Fig. 2.

Specifically, we consider an NN-based source node that performs block-by-block encoding that transforms the $k$ input bits $\mathbf{x} \in \{0,1\}^k$ to $n$ complex baseband symbols $\mathbf{x}_s \in \mathbb{C}^n$. We then perform symbol-by-symbol transmission, and at any time-instant $\kappa$ the symbol received by the FD-AF relay node in the presence of RSI is given by (1). Traditionally, the AF

relaying scheme is designed to have minimal implementation

complexity by receiving, amplifying and re-transmitting the

signal. Similar to our work in [6]–[8] for HD-AF relay

networks, we propose to utilize a conventional FD-AF relay

node without using a NN-based relay. This is because NN-

based processing at the FD-AF relay node will make it similar

to the decode-and-forward relay decoding and re-encoding

the signal. Further, this will simplify the relay structure with

only analogue-domain signal reception ampliﬁcation and re-

transmission, with the necessary SIC for FD transmission.

Thus, the signal transmitted by the FD-AF relay node be-

comes (2), where we utilize the ampliﬁcation factor given in

(3) and (8) for the BCM and d-BCM designs, respectively.

The signal received by the destination node is given as (5).

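$$\Gamma[\kappa] = \frac{P_s[\kappa-1]\, P_r[\kappa]\, \alpha[\kappa-1]^2\, |h_{sr}[\kappa-1]\, h_{rd}[\kappa]|^2}{P_r[\kappa]\, \alpha[\kappa-1]^2\, |h_{rr}[\kappa-1]\, h_{rd}[\kappa]|^2 + P_r[\kappa]\, \alpha[\kappa-1]^2\, |h_{rd}[\kappa]|^2 \sigma_r^2 + \sigma_d^2} \tag{6}$$

Fig. 2: Block diagram of proposed AE-based FD-AF relay network.

Also, we consider an NN-based destination node that performs block-by-block decoding that transforms the $n$ input complex baseband symbols $\mathbf{y}_d \in \mathbb{C}^n$ to $k$ soft probabilistic outputs $\tilde{p}_{d_{\theta_d}}(x_u \mid \mathbf{y}_d) \in [0,1]$, where $u \in \{1, \dots, k\}$, that correspond to the log-likelihood ratios (LLRs), as

$$\mathrm{LLR}(u) = \log \frac{1 - \tilde{p}_{d_{\theta_d}}(x_u = 0 \mid \mathbf{y}_d)}{\tilde{p}_{d_{\theta_d}}(x_u = 0 \mid \mathbf{y}_d)}, \quad \forall u = 1, \dots, k \tag{9}$$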

Thus, the designed AE framework solves the bit-decoding problem as a multi-label binary classification problem. Specifically, each bit is considered as a separate label; thus there are $k$ labels, and each of these labels can take the binary values 0/1. Further, the proposed AE's NN decoder generates soft probabilistic outputs that correspond to the LLRs, which can be converted to hard decisions by applying a threshold or used directly by more powerful outer decoders such as LDPC and Turbo codes (as detailed in Sec. VI.E).
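The mapping in (9) from soft outputs to LLRs, and the thresholding used for hard decisions, can be sketched as follows. The sketch assumes the decoder output is available as the probability that each bit equals 0 (named `p0` here); the clipping constant `eps` and both function names are illustrative choices, not part of the paper.

```python
import numpy as np

def llr_from_probs(p0, eps=1e-12):
    """LLR(u) = log((1 - p(x_u = 0 | y_d)) / p(x_u = 0 | y_d)), as in (9)."""
    p0 = np.clip(np.asarray(p0, dtype=float), eps, 1.0 - eps)  # avoid log(0)
    return np.log((1.0 - p0) / p0)

def hard_decision(llr):
    """Threshold at zero: a positive LLR means bit 1 is the more likely value."""
    return (np.asarray(llr) > 0).astype(int)
```

An outer soft-input decoder would consume the LLRs directly rather than the thresholded bits.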

We train the AE with the aim of maximizing the chances of reconstructing the intended signal $\mathbf{x}_s$ by learning the NN functions at the source and destination nodes, represented by $(\boldsymbol{\theta}_e, \boldsymbol{\theta}_d)$. As the input and output of the AE are $k$ bits, we can formulate the proposed AE framework as a multi-label binary classification problem, wherein we utilize the binary CE loss to quantify the de-mapping error at the destination node. We train the AE by performing mini-batch training, where the weights and bias terms in the NN are updated using the stochastic gradient descent (SGD) method employing back-propagation [27]. The training dataset creation methods for AE frameworks in [1]–[12] consider HD transmission and offer no methodology for the presence of RSI in FD networks. In contrast, we design a training dataset such that the AE can generalize well for varying testing RSI or SNR values, as

• For Varying RSI – For any given rate $R$, we create a training dataset with $S_{\text{Train}}$ samples with fixed transmit SNR $E_b/N_0 = 30$ dB and multiple RSI levels $\sigma_{rr}^2 = \{-60, -20, 0, 20\}$ dB. Then we train a single AE framework (performing BCM or d-BCM design) until convergence, as detailed in Remark 3 later. Then, we test for varying RSI levels $\sigma_{rr}^2 \in [-60, 20]$ dB.

• For Varying SNR – For any given rate $R$, we create a training dataset with $S_{\text{Train}}$ samples with fixed RSI $\sigma_{rr}^2 = 0$ dB and multiple transmit SNRs $E_b/N_0 = \{3, 10, 23, 28, 38\}$ dB. Then we train a single AE framework (performing BCM or d-BCM design) until convergence, as detailed in Remark 3 later. Then, we test for varying transmit SNRs $E_b/N_0 \in [0, 30]$ dB.
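The two dataset recipes above can be sketched as a single helper that pairs random input bits with a per-sample level drawn from the listed set. This is only an illustration: the paper does not state how the levels are distributed across the $S_{\text{Train}}$ samples, so uniform random assignment is an assumption here, and `make_training_set` is a hypothetical name.

```python
import numpy as np

def make_training_set(k, num_samples, levels_db, rng):
    """Random k-bit inputs, each tagged with one of the training levels (in dB)."""
    bits = rng.integers(0, 2, size=(num_samples, k))
    # Assumption: each sample gets one level chosen uniformly at random.
    level_db = rng.choice(np.asarray(levels_db, dtype=float), size=num_samples)
    level_lin = 10.0 ** (level_db / 10.0)  # dB -> linear scale
    return bits, level_db, level_lin

rng = np.random.default_rng(0)
# Varying-RSI recipe: fixed Eb/N0 = 30 dB, RSI levels {-60, -20, 0, 20} dB.
bits, rsi_db, rsi_lin = make_training_set(4, 1000, [-60, -20, 0, 20], rng)
```

The varying-SNR recipe follows by passing the SNR levels $\{3, 10, 23, 28, 38\}$ dB instead and holding the RSI at 0 dB.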

In contrast to the bit-wise AE-based frameworks in [12]

where SNR information is required at the encoder-decoder, we

remove the SNR requirement. Further, for generalizability, we propose the same NN architecture for both the AE-based BCM and d-BCM designs, as shown in Table I, with the only difference being in the Lambda layers $L_L(\mathbf{y}_d)$ at the NN decoder. In general, radio transformer networks (RTNs) are used for scenarios without CSI knowledge as a means to

estimate the channel [1], as also considered in AE-based HD-

AF relay networks in [6]–[8]. But, by experiments, we ﬁnd

that due to the presence of RSI at the FD-AF relay node the

proposed RTN is improving the performance for BCM design,

instead of the d-BCM design. Thus, we propose an RTN and

include it for BCM design and do not include RTN in d-BCM

design (please see details in Appendix B).

For the AE-based BCM, we perform channel equalization and employ an RTN in the Lambda layers. Firstly, we include two Lambda layers $L_L^1$ and $L_L^2$ to perform channel equalization for the dual-hop channels $(h_{sr})$ and $(h_{rd})$. Secondly, we include an RTN made of three Lambda layers: $L_L^3$ is a dense layer with 16 neurons and Tanh activation, $L_L^4$ is a dense layer with $2n$ neurons and Linear activation, and $L_L^5$ adds the output of $L_L^4$ and the received signal $\mathbf{y}_d$ before providing it to the NN decoder. However, for the AE-based d-BCM, we consider no Lambda layers in the NN decoder, i.e., $\mathbf{y}_d$ directly becomes the input to the NN decoder.

TABLE I: NN architecture.

| Node | Layer No. ($l$) | Neurons ($\delta_l$) | Remarks |
|---|---|---|---|
| Encoder | $l = 0$ | $k$ | Input ($\mathbf{x}$) |
| | $l = 1$ | 256 | $\sigma_1$ = Tanh |
| | $l = 2$ | 128 | $\sigma_2$ = Tanh |
| | $l = 3$ | 64 | $\sigma_3$ = Tanh |
| | $l = 5$ | $2n$ | $\sigma_4$ = Linear |
| | $l = 6$ | $2n$ | PN (PS |
| | $l = 7$ | $2n$ | Output ($\mathbf{x}_s$) |
| Decoder | $l = 0$ | $2n$ | Input (output of $L_L(\mathbf{y}_d)$) |
| | $l = 1$ | 1024 | $\sigma_1$ = Tanh |
| | $l = 2$ | 512 | $\sigma_2$ = Tanh |
| | $l = 3$ | 256 | $\sigma_3$ = Tanh |
| | $l = 4$ | 64 | $\sigma_4$ = Tanh |
| | $l = 5$ | $k$ | $\sigma_5$ = Sigmoid |
| | $l = 7$ | $k$ | Output $\tilde{p}_{d_{\theta_d}}(\mathbf{x} \mid \mathbf{y}_d)$ |

In contrast to [1]–[12], where the number of neurons employed in the NN encoder and decoder scales with a factor of $2^k$, thereby increasing the AE's complexity exponentially with $k$, the hidden-layer widths of the proposed NN architecture in Table I do not grow with $2^k$. Another advantage of the proposed NN architecture in Table I is that, for the first time, we design a single NN architecture for the AE framework that can handle all the varying high-rate transmissions $R = \{4/7, 8/7, 12/7, 16/7\}$, later analyzed in Section VI.
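To make the encoder side of Table I concrete, the following NumPy sketch runs an untrained forward pass with the listed hidden widths (256, 128, 64), a linear $2n$-wide layer, and a normalization enforcing $\|\mathbf{x}_s\|_2^2 = n$. Several details are assumptions made for illustration only: the split of the $2n$ real outputs into real and imaginary parts, the per-codeword form of the power normalization, zero-initialized biases, and the helper names. The paper's implementation uses Keras, which is not reproduced here.

```python
import numpy as np

def glorot_uniform(rng, fan_in, fan_out):
    """Glorot (Xavier) uniform initialization for a dense weight matrix."""
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

def init_encoder(k, n, rng, widths=(256, 128, 64)):
    """Weights for k -> 256 -> 128 -> 64 -> 2n; hidden widths do not depend on 2^k."""
    sizes = [k, *widths, 2 * n]
    return [(glorot_uniform(rng, a, b), np.zeros(b))
            for a, b in zip(sizes[:-1], sizes[1:])]

def encode(bits, params, n):
    """Map a batch of k-bit inputs to n complex symbols with ||x_s||^2 = n."""
    h = np.asarray(bits, dtype=float)
    for W, b in params[:-1]:
        h = np.tanh(h @ W + b)          # Tanh hidden layers
    W, b = params[-1]
    h = h @ W + b                       # linear output layer of width 2n
    # Assumption: first n outputs are real parts, last n are imaginary parts.
    x = h[..., :n] + 1j * h[..., n:]
    norm = np.linalg.norm(x, axis=-1, keepdims=True)
    # Guard against a zero vector (e.g. all-zero bits with zero biases).
    return np.sqrt(n) * x / np.maximum(norm, 1e-12)
```

Because only the input width changes with $k$, the same construction can be reused for $k \in \{4, 8, 12, 16\}$ with $n = 7$.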

The AE is implemented in Keras [35] with TensorFlow [36]

as backend. For training we utilize Adam optimizer [37],

where the weights are initialized using Glorot initializer [38].

We keep training samples $S_{\text{Train}} = 3 \times 10^5$ and learning rate $\tau = 0.001$. By parameter searching, we note that a smaller batch size $(B)$ and fewer epochs $(E)$ lead to better performance for the BCM design, whereas the d-BCM design benefits from larger values. Thus, we keep $B = 128$, $E = 15$ for the BCM design and $B = 6000$, $E = 60$ for the d-BCM design. This is because, without CSI, a large batch size provides the AE with sufficient samples from low-probability regions, and more epochs help the AE in estimating and removing the channel impairments.

IV. UNDERSTANDING THE TRAINING CONVERGENCE OF AE FRAMEWORKS

In this section, we focus on the training convergence of the

proposed AE framework and provide three useful conjectures

based on empirical observations, labelled as Remarks 1–3.

Please note that we use the term BCM loosely in this section to refer to both the BCM and d-BCM designs. For clarity, for any $(n, k)$ block or rate $R = k/n$, we define the following.

Deﬁnition 1 (Symbol): A complex baseband symbol is de-

ﬁned as a complex number indicating the symbol transmitted

or received at various nodes in the network.

Deﬁnition 2 (Codeword): A codeword is a collection of n

complex baseband symbols together.

A. Impact of SNR, RSI and CSI on Training Convergence

In this subsection, we focus on the impact of varying

SNR, RSI, and presence/absence of CSI knowledge on the

training convergence of the proposed AE framework. Many prior works have determined the relationship between the binary CE loss and mutual information (MI) for point-to-point (P2P) networks [12], [40]–[42].

In this work, we propose the FD-AF relaying network with

binary CE loss calculated between source and destination node

and a conventional FD-AF relay node. Thus, the relationship

of binary CE loss and MI remains same as P2P networks,

obtained as follows

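$$\begin{aligned} \mathcal{J}(\boldsymbol{\theta}_e, \boldsymbol{\theta}_d) &= -\sum_{u=1}^{k} \int_{\mathbf{y}_d} p(\mathbf{y}_d)\, p_{e_{\theta_e}}(x_u \mid \mathbf{y}_d) \log \tilde{p}_{d_{\theta_d}}(\hat{x}_u \mid \mathbf{y}_d)\, d\mathbf{y}_d \\ &= D_{KL}\!\left( p_{e_{\theta_e}}(\mathbf{x} \mid \mathbf{y}_d) \,\|\, \tilde{p}_{d_{\theta_d}}(\mathbf{x} \mid \mathbf{y}_d) \right) + H(X) - I_{e_{\theta_e}}(X; Y_d) \end{aligned} \tag{10}$$

where $D_{KL}\!\left( p_{e_{\theta_e}}(\mathbf{x} \mid \mathbf{y}_d) \,\|\, \tilde{p}_{d_{\theta_d}}(\mathbf{x} \mid \mathbf{y}_d) \right)$ denotes the Kullback-Leibler (KL) divergence loss between the true and learnt distributions, $H(X)$ denotes the entropy of the input bits $X$, and $I_{e_{\theta_e}}(X; Y_d)$ is the MI between the input bits $X$ and the received signal $Y_d$ [34]. Now, using (10), we can define the estimated MI $(I)$ [9], [12], [40]–[42] as the difference between the MI $I_{e_{\theta_e}}(X; Y_d)$ and the KL-divergence loss, given as follows

$$\begin{aligned} I &:= I_{e_{\theta_e}}(X; Y_d) - D_{KL}\!\left( p_{e_{\theta_e}}(\mathbf{x} \mid \mathbf{y}_d) \,\|\, \tilde{p}_{d_{\theta_d}}(\mathbf{x} \mid \mathbf{y}_d) \right) \\ &= H(X) - \mathcal{J}(\boldsymbol{\theta}_e, \boldsymbol{\theta}_d) \end{aligned} \tag{11}$$

Since the first term on the R.H.S. of (11), $H(X)$, remains constant, the changes in the estimated MI $(I)$ in (11) depend only on the binary CE loss term $\mathcal{J}(\boldsymbol{\theta}_e, \boldsymbol{\theta}_d)$.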
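The relation in (11) can be turned into a small estimator, sketched below. It assumes i.i.d. uniform input bits, so that $H(X) = k$ bits, and a binary CE that is summed over the $k$ bit positions and measured in bits (base-2 logarithm). A loss reported in nats or averaged over bit positions would first need rescaling, and the function names are illustrative.

```python
import numpy as np

def binary_ce_bits(bits, p1, eps=1e-12):
    """Binary CE in bits, summed over the k bit positions, averaged over samples."""
    bits = np.asarray(bits, dtype=float)
    p1 = np.clip(np.asarray(p1, dtype=float), eps, 1.0 - eps)  # p1 = P(bit = 1)
    ce = -(bits * np.log2(p1) + (1.0 - bits) * np.log2(1.0 - p1))
    return ce.sum(axis=1).mean()

def estimated_mi(bits, p1):
    """Estimated MI of (11): H(X) - J, with H(X) = k for uniform input bits."""
    k = np.asarray(bits).shape[1]
    return k - binary_ce_bits(bits, p1)
```

A decoder that outputs 0.5 for every bit yields an estimated MI of 0, while near-perfect predictions approach the upper bound of $k$.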

Lastly, by simulations, we analyze the convergence of the training of the proposed AE frameworks. In particular, we train a separate AE for each SNR or RSI level; once the AE is trained, we note the validation CE loss $(\mathcal{J}(\boldsymbol{\theta}_e, \boldsymbol{\theta}_d))$ at the last epoch and obtain the estimated MI $(I)$ as described in (11). Specifically, we train the proposed AE for fixed SNR $E_b/N_0 = 30$ dB and varying RSI in Fig. 3a, and for fixed RSI $\sigma_{rr}^2 = -20$ dB and varying SNR $(E_b/N_0)$ in Fig. 3b. For greater insight, we also vary the rate $R = k/n$ [bits/channel-reuse] and keep the block size $n = 7$. In Fig. 3, we can see that as the RSI decreases or the SNR increases, the estimated MI increases until it reaches the upper bound of $k$. Directly from (11), this suggests that the KL-divergence loss approaches 0, making $I_{e_{\theta_e}}(X; Y_d) = H(X)$. It is important to note that we cannot find the global minimum of the NN parameters with respect to the binary CE loss. Surprisingly, however, we do not need to find the global minimum. Empirically, the authors in [43], [44] found that, despite the non-convexity, the local minima are rare and are all very similar to each other and to the global minimum. Interested readers may refer to the theoretical insights presented in [43], [44]. Thus, the training of AE-based FD-AF relay networks converges to a locally optimal minimum above a minimum required SNR (in Fig. 3b) and below a maximum RSI level (in Fig. 3a). Further, in the absence of CSI, the estimated MI converges to the upper bound at a higher transmit SNR and lower RSI levels. Thus, the convergence of the training of the AE performing BCM design with CSI knowledge is faster than that of the d-BCM design without CSI knowledge (in Fig. 3).

Fig. 3: Estimated mutual information for proposed AE framework. (a) Varying RSI for fixed transmit SNR $E_b/N_0 = 30$ dB. (b) Varying transmit SNR for fixed RSI $\sigma_{rr}^2 = -20$ dB. [Plots not reproduced: estimated mutual information versus SINR [dB], with curves for $(n,k) = (7,4), (7,8), (7,16)$ BCM and $(n,k) = (7,4), (7,8), (7,12)$ d-BCM.]

Thus, based on the above empirical observations, we give the following Remarks.

Remark 1: The training of the AE framework, of sufficiently large block length ($n$), converges to a locally optimal minimum above a minimum required SNR and below a maximum RSI level.

Remark 2: The CSI knowledge helps in faster convergence

of the AE frameworks.

From Remarks 1, 2, the proposed AE converges above a

minimum SNR and below a maximum RSI. For this, we need

to train a separate AE framework for each SNR and/or RSI

level, which is impractical in nature. For practical purposes,

we propose to train a single AE framework (performing BCM

or d-BCM design) on varying SNR or RSI levels, such that

the proposed AE can generalize well for the varying SNR and

RSI levels. As a result, although the AE's estimated MI never reaches the upper bound because of training on low SNR or high RSI, this enables the AE to generalize well.

B. Necessary Conditions for Training Convergence

In this subsection, we focus on tackling the problem of

ﬁnding the training convergence of the AE-based BCM/d-

BCM designs for any heuristically chosen hyper-parameter

settings.

Firstly, we need to understand the training of the AE with respect to epochs, for a given number of training samples. This focuses on preventing overfitting of the AE framework during training. There are well-known techniques, like early stopping, that stop the training of an AE once the validation CE loss starts to increase, as any more training of the AE will reduce its generalizability. This is because early stopping of gradient descent creates generalizable NN frameworks that also remain robust to corrupted labels [44, Theorem 2.2] [39]. Secondly, we need to understand the training of the AE with respect to the number of training samples. Since the proposed AE learns the BCM and d-BCM design in the presence of channels and noise, the AE must be trained with enough samples to be generalizable in the future testing phase. Thus, the training of the proposed AE has converged if increasing the training epochs or the training samples does not reduce the validation CE loss further.

Furthermore, the proposed AE framework is modelled as a multi-label binary classification problem. In particular, the $k$ input-output bits represent $k$ labels, with each label taking the binary values 0/1; thus there exist $2^k$ possible classes for the proposed AE. Thus, the AE aims to design $2^k$ possible codewords, each representing a different class in a higher-dimensional space. For any given hyper-parameter settings, once these codewords are designed the AE has converged, because we cannot improve the performance any further.
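One way to check the codeword-count condition is to feed all $2^k$ bit patterns through the encoder and count how many distinct codewords come out. The sketch below does this for any callable `encoder` that maps a `(batch, k)` bit array to a `(batch, n)` complex array. The rounding tolerance `decimals` used to decide when two codewords coincide is an assumption, since the paper does not state its counting criterion, and `count_codewords` is a hypothetical name.

```python
import itertools
import numpy as np

def count_codewords(encoder, k, decimals=4):
    """Count distinct codewords produced over all 2^k input bit patterns."""
    all_bits = np.array(list(itertools.product([0, 1], repeat=k)))
    codewords = np.asarray(encoder(all_bits))
    # Stack real and imaginary parts so each codeword is one real-valued row.
    as_real = np.concatenate([codewords.real, codewords.imag], axis=1)
    # Assumption: codewords equal after rounding are treated as the same point.
    return len(np.unique(np.round(as_real, decimals), axis=0))
```

A count below $2^k$ indicates that at least two bit patterns share a codeword and therefore cannot be distinguished at the decoder.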

We now empirically analyze the above discussion. For example, we train an AE for $R = 16/7$ in Fig. 4 for varying training data size $S_{\text{Train}} = \{2^{13}, \dots, 2^{23}\}$ at fixed SNR $E_b/N_0 = 30$ dB. Specifically, we divide the $S_{\text{Train}}$ training samples in a 4:1 ratio into a training set $S_T$ and a validation set $S_V$. Then, we train the AE on $S_T$ and determine the number of codewords formed by the NN encoder and the binary CE loss at the last epoch (15th epoch) on $S_T$ and $S_V$. Lastly, we determine the BER using the testing samples $S_{\text{Test}}$.

In Fig. 4a, we can see that as the training dataset increases, the number of codewords formed by the NN encoder of the trained AE on the training and validation sets increases until it reaches $2^{16}$ codewords, each representing one of the $2^{16}$ possible bit combinations. The NN encoder forms these $2^{16}$ codewords at $2^{18}$ and $2^{21}$ training samples using the training and validation sets, respectively. Further, in Fig. 4a, we can see that the binary CE loss, noted at the last (15th) epoch of training, reduces as the training dataset increases and converges for the training and validation sets at $2^{21}$ training samples.

In Fig. 4b, we can see that as the training dataset increases, the performance of the proposed AE on the unseen testing samples improves, whereas once the training dataset size exceeds $2^{18}$ the performance improvement of the AE starts converging, because $2^{16}$ codewords have been created by the AE's NN encoder on the training set.

Fig. 4: Training convergence of the proposed AE framework. (a) Codewords formed by the NN encoder and binary CE loss on the training and validation sets, plotted against $\log_2$(size of training data $S_{Train}$). (b) BER versus SINR [dB] on the test set for a varying number of training samples $S_{Train}$.

Thus, for any given hyper-parameter settings, we need at least a number of training samples in the range $[2^{k+2}, 2^{k+5}]$ to ensure that the AE creates $2^k$ codewords, the validation CE loss has converged, and the AE's performance has converged to its maximum potential of decoding the $2^k$ possible classes.
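As a quick arithmetic check of this range, the sketch below evaluates $[2^{k+2}, 2^{k+5}]$ for a given $k$. The function name is hypothetical; only the formula comes from the text.

```python
def training_sample_range(k):
    """Return (lower, upper) = (2^(k+2), 2^(k+5)) training samples
    for an AE with k input-output bits."""
    return 2 ** (k + 2), 2 ** (k + 5)


low, high = training_sample_range(16)
# For k = 16 this gives 2^18 and 2^21, the two dataset sizes at which
# the 2^16 codewords were reported on the training and validation sets.
print(low == 2 ** 18, high == 2 ** 21)
```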

Based on the above empirical observations, we provide the following remark.

Remark 3: For any given hyper-parameter settings and sufficiently large block length ($n$), the two necessary conditions for the convergence of training of the AE frameworks performing BCM and d-BCM designs are as follows:

C1: The validation CE loss of the AE has converged with respect to the training epochs and the number of training samples.

C2: The NN encoder of the AE designs $2^k$ codewords.

From Remark 3, the number of training samples increases exponentially with the input-output bits ($k$), with at least $2^{k+2}$ samples required for the convergence of the AE frameworks. However, these training samples are required only in the offline training phase. Once trained, the AE can be deployed online.

V. OPENING THE BLACK-BOX OF BCM AND D-BCM

In this section, we reveal distinct observations about the AE-designed BCM. For brevity, we only consider BCM because BCM and d-BCM exhibit similar trends. Throughout this section, we train the proposed AE for various rates $R = k/n$, where $n \in \mathcal{N} = \{1, 3, 5, 7, 10\}$ and $k \in \mathcal{K} = \{1, 4, 8, 12, 16\}$, until convergence using Remark 3. Once trained, the NN encoder becomes deterministic. Thus, if we input any $k$ bits to the NN encoder of the trained AE, we obtain the same $n$ complex baseband symbols as output every time, representing a codeword for the $k$ input bits. We can therefore obtain all the possible codewords from the NN encoder using all the possible combinations of $k$ input bits.

Fig. 5: Minimum Euclidean distance $d_{E_{min}}$ for varying $(n, k)$.
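The enumeration described above (feeding every combination of $k$ bits through the deterministic encoder and collecting the codewords) can be sketched as follows. This is an illustration only: `toy_encoder` is a hypothetical stand-in for the trained NN encoder, and the rounding precision used to decide when two codewords coincide is an assumption.

```python
import itertools
import numpy as np


def all_bit_patterns(k):
    """All 2^k binary input vectors of length k, one per row."""
    return np.array(list(itertools.product([0, 1], repeat=k)), dtype=np.uint8)


def count_distinct_codewords(encoder, k, decimals=6):
    """Encode every k-bit pattern and count distinct codewords.

    `encoder` maps a length-k bit vector to n complex symbols.
    Codewords equal after rounding to `decimals` places are merged.
    """
    codewords = np.stack([encoder(bits) for bits in all_bit_patterns(k)])
    # View each codeword as a point in 2n-dimensional real space.
    as_real = np.concatenate([codewords.real, codewords.imag], axis=1)
    return len(np.unique(np.round(as_real, decimals), axis=0))


def toy_encoder(bits):
    """Hypothetical encoder: one BPSK symbol per bit (n = k)."""
    return (1.0 - 2.0 * bits).astype(np.complex128)


# The toy encoder is injective, so all 2^4 = 16 codewords are distinct.
print(count_distinct_codewords(toy_encoder, 4))
```

With a real trained encoder, a count below $2^k$ would indicate that condition C2 of Remark 3 is not yet met.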

Observation – 1: The AE framework designs $2^k$ codewords in $2n$-dimensional space.

In Sec. IV-B and Remark 3, we have already shown that the training of the AE converges after designing $2^k$ codewords. Since the NN encoder outputs $2n$ real values for each of the $2^k$ codewords, i.e., each of the $2^k$ codewords is represented by $n$ unique complex baseband symbols, the $2^k$ codewords are designed in $2n$-dimensional space.

Observation – 2a: As the block length increases, the minimum Euclidean distance between any of the possible codewords increases.

Observation – 2b: When the number of codewords becomes extremely large, the minimum Euclidean distance between any two codewords follows a Gaussian distribution for sufficiently large block length ($n$).

Observation – 2c: As the block length increases, the Euclidean distance between the codewords concentrates to the average Euclidean distance.

Fig. 6: Minimum Euclidean distance $d_E$ between each $f$th codeword and its closest $g$th codeword. Histogram panels for $n \in \{1, 3, 5, 7, 10\}$ and $k \in \{1, 4, 8, 12, 16\}$; horizontal axis: minimum Euclidean distance $d_E$; vertical axis: number of codewords $N_{cw}$ formed by the NN encoder of the trained AE.

We can determine the minimum Euclidean distance [45] between each $f$th codeword, $f \in \{1, ..., 2^k\}$, and its closest $g$th codeword as follows

$$d_E^f = \min_{g \in \{1, ..., 2^k\},\, g \neq f} \|x_f - x_g\|_2, \quad \forall f \qquad (12)$$

where $2^k$ denotes the number of possible codewords and $x_{(\cdot)}$ denotes a vector comprising $n$ complex values representing each of the $2^k$ possible codewords. The minimum Euclidean distance between any of the possible codewords is given as

$$d_{E_{min}} = \min_f d_E^f, \quad \forall f \in \{1, ..., 2^k\} \qquad (13)$$
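Equations (12) and (13) can be evaluated numerically along the following lines. This is a minimal NumPy sketch under stated assumptions: the function names are hypothetical, and a toy QPSK codebook stands in for the output of a trained NN encoder.

```python
import numpy as np


def nearest_neighbour_distances(codebook):
    """d_E^f of (12): distance from each codeword to its closest other codeword.

    `codebook` has shape (2^k, n) with complex entries.
    """
    x = np.asarray(codebook)
    # Pairwise Euclidean distances, shape (2^k, 2^k).
    dist = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.inf)  # enforce g != f
    return dist.min(axis=1)


def minimum_distance(codebook):
    """d_Emin of (13): the smallest nearest-neighbour distance."""
    return nearest_neighbour_distances(codebook).min()


# Toy codebook (assumption): QPSK, i.e. k = 2 bits in n = 1 symbol.
qpsk = np.array([[1], [1j], [-1], [-1j]], dtype=np.complex128)
print(nearest_neighbour_distances(qpsk))  # every entry equals sqrt(2)
print(minimum_distance(qpsk))
```

The full pairwise matrix needs $2^k \times 2^k$ entries, so for large $k$ (e.g., $k = 16$) the distances would have to be computed in chunks rather than all at once.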

To analyze this observation, we train the proposed AE for $(n \in \mathcal{N}, k \in \mathcal{K})$ and use (13) to determine the minimum Euclidean distance $d_{E_{min}}$ among all the $2^k$ designed codewords in Fig. 5. We can see that $d_{E_{min}}$ increases as the block length ($n$) increases and decreases as the number of input bits ($k$) increases. This is because the $2^k$ codewords are designed in $2n$-dimensional space.

We also plot a histogram of the minimum Euclidean distance between each $f$th codeword and its closest $g$th codeword, by calculating $d_E = \{d_E^1, ..., d_E^{2^k}\}$ using (12) for varying $(n \in \mathcal{N}, k \in \mathcal{K})$ in Fig. 6. Interestingly, when $k \geq 8$ and $n \leq 3$, i.e., scenarios where the number of codewords is very large and the block length is very small, $d_{E_{min}}$ approaches zero (Fig. 5) and the minimum Euclidean distance between each codeword and its closest codeword is also zero (marked in red in Fig. 6). In such scenarios, the AE learns to cheat by placing the $2^k$ codewords on top of each other, because the space is too small to hold the large number of codewords. Moreover, in Fig. 6, when the block length ($n$) increases, the mean of the histogram also increases because $d_{E_{min}}$ increases, indicating that as the block length increases the spacing between any two closest codewords also increases.

Moreover, in Fig. 6, when the number of codewords becomes extremely high (for $k \geq 8$) and the block length is sufficiently large (for $n \geq 5$), we can see that: (i) although the overall minimum Euclidean distance $d_{E_{min}}$ (obtained using (13)) is small (Fig. 5), the minimum Euclidean distance $d_E$ between each $f$th codeword and its closest $g$th codeword is comparatively large for almost all the codewords and follows a Gaussian distribution (marked in green in Fig. 6), as a consequence of the central limit theorem; and (ii) as the block length increases, the standard deviation (spread) of the Gaussian distribution decreases, indicating that the Euclidean distance of the codewords concentrates to the average Euclidean distance. These observations resemble the desired theoretical observations of BCM design discussed in [26]. Therefore, we claim that the proposed AE framework can produce a random BCM design that achieves the best possible distance observations for sufficiently large block length ($n$) for any input bits ($k$).

Observation – 3: The packing density improves as the rate $R$ decreases.

We can define the normalized second-order moment [45] as the average squared Euclidean distance between a point in the packing and the origin of the coordinate system, normalized by the square of the minimum Euclidean distance $d_{E_{min}}$, as

$$E_n = \frac{1}{2^k d_{E_{min}}^2} \sum_{f=1}^{2^k} \|x_f\|_2^2 \qquad (14)$$

Fig. 7: Packing density. Normalized second-order moment ($E_n$) versus block length ($n$) for $k \in \{1, 4, 8, 12, 16\}$.

This metric is invariant to scaling and is therefore pivotal for differentiating packing densities. The lower the $E_n$, the better the designed BCM. In Fig. 7, we analyze the packing density $E_n$ with varying rate $R = k/n$. We can see that the packing density of the AE-based BCM improves as the block length ($n$) increases or the input bits ($k$) decrease, for all $(n \in \mathcal{N}, k \in \mathcal{K})$, because the $2^k$ codewords are designed in $2n$-dimensional space.
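The normalized second-order moment in (14), and its invariance to scaling, can be checked with a short sketch. The function name and the toy QPSK codebook are assumptions for illustration; the formula is the one given in (14).

```python
import numpy as np


def normalized_second_moment(codebook):
    """E_n of (14): mean codeword energy divided by d_Emin squared.

    `codebook` has shape (2^k, n) with complex entries.
    """
    x = np.asarray(codebook)
    dist = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.inf)  # ignore the zero self-distances
    d_min = dist.min()
    mean_energy = np.mean(np.sum(np.abs(x) ** 2, axis=1))
    return mean_energy / d_min ** 2


# Toy codebook (assumption): QPSK with unit energy and d_Emin = sqrt(2).
qpsk = np.array([[1], [1j], [-1], [-1j]], dtype=np.complex128)
print(normalized_second_moment(qpsk))        # 1 / 2 = 0.5
print(normalized_second_moment(3.0 * qpsk))  # unchanged by scaling
```

Scaling the codebook by a constant multiplies both the energy and $d_{E_{min}}^2$ by the same factor, which is why the ratio can be used to compare packings of different power.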

Observation – 4: The codes designed by the AE framework are spherical codes.

The normalized fourth-order moment or kurtosis [45] measures the variation of the squared Euclidean norm among the constellation points, defined as

$$\chi = \frac{1}{n\, 2^k d_{E_{min}}^4} \sum_{f=1}^{2^k} \|x_f\|_2^4 \qquad (15)$$

By simulations, we find that the proposed AE creates 'spherical codes' with $\chi = 1$, i.e., equal norm for all the $2^k$ codewords, for all the $(n \in \mathcal{N}, k \in \mathcal{K})$ scenarios.
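The equal-norm property that defines a spherical code can also be tested directly on a codebook, without going through the kurtosis. The sketch below is an illustrative check only: the function name and tolerance are assumptions, and it tests norm equality rather than evaluating (15).

```python
import numpy as np


def is_spherical(codebook, rtol=1e-6):
    """Return True if all codewords have the same Euclidean norm,
    i.e. they lie on a common sphere in 2n-dimensional real space."""
    norms = np.linalg.norm(np.asarray(codebook), axis=1)
    return bool(np.allclose(norms, norms[0], rtol=rtol))


# Toy codebooks (assumptions): QPSK is equal-norm, 4-PAM is not.
qpsk = np.array([[1], [1j], [-1], [-1j]], dtype=np.complex128)
pam4 = np.array([[-3], [-1], [1], [3]], dtype=np.complex128)
print(is_spherical(qpsk), is_spherical(pam4))
```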

Observation – 5: As the block length increases, the average Hamming distance between codewords increases.

From Observation – 2, we know that the $2^k$ codewords are designed in $2n$-dimensional space such that the minimum Euclidean distance between the $f$th codeword and its closest $g$th codeword is Gaussian distributed. Thus, the distance between any two codewords is different and does not follow a grid-like structure, so we cannot directly use the minimum Euclidean distance to determine the average Hamming distance between two closest codewords. Hence, in this work, we first determine the minimum Euclidean distance $d_{E_{min}}$ of the $2^k$ codewords using (13). Then, for each $f$th codeword, we determine all the codewords within the sphere whose radius is set by the minimum Euclidean distance, $d^f \leq d_{E_{min}} + \xi$ with $\xi \geq 0$, and represent these codewords by a set $S_f$. We then determine the average Hamming distance between each $f$th codeword and all the codewords in its corresponding set $S_f$ [46], [47], as

$$d_H^{avg,f} = \sum_{g \in S_f} \frac{d_H(f, g)}{|S_f|} \qquad (16)$$

where $d_H(f, g)$ denotes the Hamming distance between codewords $f$ and $g$, and $|S_f|$ is the cardinality of the set $S_f$. Now, we can determine the average Hamming distance over all the $f \in \{1, ..., 2^k\}$ codewords as

$$d_H^{avg} = \frac{1}{|d_H^{avg,f} > 0|} \sum_{f=1}^{2^k} d_H^{avg,f} \qquad (17)$$

where $|d_H^{avg,f} > 0|$ is the number of non-zero elements among the values $d_H^{avg,f}$. For fixed input bits $k = 8$, we determine the average Hamming distance in (17) for varying block lengths $n \in \mathcal{N}$ and varying $\xi = \{0, 0.5, 1, 1.5, 2\}$ in Fig. 8. As expected, as the radius of the sphere $(d_{E_{min}} + \xi)$ used to determine the codeword set $S_f$ increases, the average Hamming distance $d_H^{avg}$ increases. Interestingly, as the block length ($n$) increases, the average Hamming distance $d_H^{avg}$ also increases, because, as seen in Fig. 5, the minimum Euclidean distance between the codewords $d_{E_{min}}$ also increases with the block length ($n$).

Fig. 8: Average Hamming distance versus the radius of the sphere used to determine the codeword set (from $d_{E_{min}}$ to $3 d_{E_{min}}$), for $n \in \{1, 3, 5, 7, 10\}$.
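The procedure behind (16) and (17) can be sketched as below. Several details are assumptions made for this illustration and are not stated in the text: the set $S_f$ is taken to exclude the codeword $f$ itself, the Hamming distance is computed between the $k$-bit labels of the codewords, and a small tolerance `tol` guards the radius comparison against floating-point error. All names are hypothetical.

```python
import numpy as np


def average_hamming_distance(codebook, bit_labels, xi=0.0, tol=1e-12):
    """d_H^avg of (17) for a codebook and its k-bit labels.

    codebook:   shape (2^k, n), complex codewords
    bit_labels: shape (2^k, k), the input bits of each codeword
    xi:         radius margin, so S_f holds codewords within d_Emin + xi
    """
    x = np.asarray(codebook)
    b = np.asarray(bit_labels)
    dist = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.inf)  # assumption: f is not in its own S_f
    radius = dist.min() + xi        # d_Emin from (13) plus the margin
    in_sphere = dist <= radius + tol
    hamming = (b[:, None, :] != b[None, :, :]).sum(axis=-1)
    counts = in_sphere.sum(axis=1)              # |S_f|
    sums = (hamming * in_sphere).sum(axis=1)    # sum of d_H(f, g) over S_f
    per_codeword = np.divide(sums, counts,
                             out=np.zeros(len(x)), where=counts > 0)  # (16)
    nonzero = per_codeword > 0
    if not nonzero.any():
        return 0.0
    return float(per_codeword[nonzero].sum() / nonzero.sum())  # (17)


# Toy example (assumption): QPSK with Gray and with natural bit labels.
qpsk = np.array([[1], [1j], [-1], [-1j]], dtype=np.complex128)
gray = np.array([[0, 0], [0, 1], [1, 1], [1, 0]])
natural = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
print(average_hamming_distance(qpsk, gray))     # neighbours differ in 1 bit
print(average_hamming_distance(qpsk, natural))  # neighbours differ in 1 or 2 bits
```

In the toy example, Gray labelling gives an average of 1.0 and natural labelling gives 1.5, which shows how the metric reflects the bit-labelling of nearby codewords.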

VI. PERFORMANCE EVALUATION

In this section, we evaluate the performance of the proposed AE and of conventional HD-AF and FD-AF relay networks for fixed RSI and varying transmit SNR, and vice versa. Please note that we show the plots with respect to the SINR in (6) (for details, please refer to Appendix A). As this is the first time an NN-based AE framework is proposed in the context of FD networks, for a fair comparison we consider the conventional FD-AF relay networks as the benchmark, wherein we utilize traditional modulation techniques and the (7,4) Hamming code as a baseline error correction code, with the MLD decoding detailed in (7). Also, we utilize RBF channels that remain constant for $n = 7$ transmissions only.
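A short check shows why each conventional baseline is rate-matched to an AE with $n = 7$: a modulation carrying $m$ bits per symbol combined with the (7,4) Hamming code yields $4m/7$ information bits per channel use. The sketch below is illustrative; the names are hypothetical, and the pairing of $k$ with each modulation follows the footnotes of this section.

```python
from fractions import Fraction

HAMMING_RATE = Fraction(4, 7)  # (7,4) Hamming code: 4 info bits per 7 coded bits
BITS_PER_SYMBOL = {"BPSK": 1, "QPSK": 2, "8-PSK": 3, "16-QAM": 4}


def conventional_rate(modulation):
    """Information bits per channel use for modulation + (7,4) Hamming code."""
    return BITS_PER_SYMBOL[modulation] * HAMMING_RATE


def ae_rate(k, n=7):
    """Rate R = k/n of an (n, k) AE-based design."""
    return Fraction(k, n)


for modulation, k in [("BPSK", 4), ("QPSK", 8), ("8-PSK", 12), ("16-QAM", 16)]:
    print(modulation, conventional_rate(modulation), ae_rate(k))
```

The differential variants (d-BPSK, d-QPSK, d-8PSK) carry the same number of bits per symbol as their coherent counterparts, so the same rates apply.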

A. AE-based d-BCM Design - Without CSI Knowledge

In Fig. 9, we analyze the BER of the proposed AE-based d-BCM design. In Fig. 9a–9c, we fix the transmit SNR $E_b/N_0 = 30$ dB and vary the RSI, for varying input bits ($k$)¹.

¹ For fixed $n = 7$, we set $k$ to 4, 8, and 12, which correspond to the d-BPSK, d-QPSK, and d-8PSK modulation designs in conventional networks with (7,4) Hamming coding.

Fig. 9: Performance evaluation for d-BCM for FD-AF relay networks (BER versus SINR [dB]; curves: differential modulation + (7,4) HC in FD, differential modulation + (7,4) HC in HD, and AE-based d-BCM). Panels (a)-(c): fixed transmit SNR $E_b/N_0 = 30$ dB and varying RSI, for $R = 4/7$ (d-BPSK), $R = 8/7$ (d-QPSK), and $R = 12/7$ (d-PSK-8). Panels (d)-(f): fixed RSI $\sigma_{rr}^2 = 0$ dB and varying transmit SNR, for $R = 4/7$, $R = 8/7$, and $R = 12/7$.

We can see that for small RSI ($\sigma_{rr}^2 \leq -30$ dB), the BER performance of (i) the conventional FD-AF and HD-AF relay networks becomes the same and (ii) the proposed AE converges, because the RSI becomes too small to impact the signal at the FD-AF relay node. Furthermore, the proposed AE outperforms the conventional FD-AF relay networks for all rates ($R$) and RSI levels, even outperforming conventional HD-AF relay networks for small RSI levels. In Fig. 9d–9f, we fix the RSI at $\sigma_{rr}^2 = 0$ dB and vary the transmit SNR, for varying ($k$). The conventional FD-AF relay network is not able to decode the signals even for $E_b/N_0 = 30$ dB as the RSI is high ($\sigma_{rr}^2 = 0$ dB), but the proposed AE can decode the signals and its BER reduces with SNR.

We explain the reasons for the gains achieved by the AE as follows. The proposed AE is able to design $2^k$ codewords in $2n$-dimensional space with automatic bit-labelling, by maximizing the bit-wise MI (as detailed in Sec. IV-A). The AE aims to learn this d-BCM design to remove the deteriorating impacts of RSI, RBF channels, and AWGN at the nodes, through the proposed end-to-end training until convergence using Remark 3. This leads to the maximization of the minimum Euclidean distance and minimum average Hamming distance, as detailed in Sec. V for the designed codewords, and thus to an improvement in the BER performance.

In Fig. 9a–9c, the proposed AE is able to design the d-BCM for $2^k$ codewords in $2n$-dimensional space with the observations detailed in Sec. V, leading to an almost similar BER performance of the proposed AE for any rate $R \leq 12/7$. Thus, as the modulation order or rate increases, the proposed AE can outperform the conventional HD-AF relay networks even for higher RSI, i.e., at $-10$ dB (for $k = 8$) and $-5$ dB (for $k = 12$) for $n = 7$. For similar reasons, in Fig. 9d–9f, the AE's BER performance becomes closer to that of the conventional HD-AF relay networks as the modulation order or rate increases, indicating that the impact of RSI is removed by the AE even in the absence of CSI and at very high RSI levels.

B. AE-based BCM Design - With Perfect CSI Knowledge

In Fig. 10, we analyze the BER performance of the proposed AE-based BCM design. For varying ($k$)², in Fig. 10a–10c we fix the transmit SNR $E_b/N_0 = 30$ dB and vary the RSI, and in Fig. 10d–10f we fix $\sigma_{rr}^2 = 0$ dB and vary the SNR. We see similar BER performance trends for BCM as for d-BCM in Sec. VI-A. Unlike the d-BCM design, the BER performance of the BCM designs deteriorates with increasing modulation order or rate, because in the presence of perfect CSI the BER performance of the conventional FD-AF and HD-AF relay networks is already very good, and the advantage the AE had in tackling the RBF channels more effectively than conventional differential schemes is not present for the BCM design.

C. AE-based BCM Design - With Imperfect CSI Knowledge

Now, we analyze the AE's performance in the presence of channel estimation error. We utilize linear minimum mean squared error (LMMSE) based channel estimation [48], denoted by $h^{\omega}_{(\cdot)} \sim \mathcal{CN}(0, \sigma^2_{h^{\omega}})$, where the error in channel estimation is $e_{(\cdot)} \sim \mathcal{CN}(0, \sigma_e^2)$ for both hops $(\cdot) = \{sr, rd\}$. From the orthogonality principle of the LMMSE, we know that the errors in the channel estimation remain independent of the estimated channel; thus, we have

$$h^{\omega}_{(\cdot)} = h_{(\cdot)} + e_{(\cdot)}, \quad \forall (\cdot) = \{sr, rd\} \qquad (18)$$

² For fixed $n = 7$, we set $k$ to 4, 8, and 16, which correspond to the BPSK, QPSK, and QAM-16 modulation designs in conventional networks with (7,4) Hamming coding.

Fig. 10: Performance evaluation for BCM for FD-AF relay networks (BER versus SINR [dB]; curves: modulation + (7,4) HC in FD, modulation + (7,4) HC in HD, and AE-based BCM). Panels (a)-(c): fixed transmit SNR $E_b/N_0 = 30$ dB and varying RSI, for $R = 4/7$ (BPSK), $R = 8/7$ (QPSK), and $R = 16/7$ (QAM-16). Panels (d)-(f): fixed RSI $\sigma_{rr}^2 = 0$ dB and varying transmit SNR, for $R = 4/7$, $R = 8/7$, and $R = 16/7$.

Fig. 11: Impact of the CEQ ($\varsigma$) on FD-AF relay networks for rate $R = 8/7$, for fixed transmit SNR and varying RSI (BER versus channel estimation quality $\varsigma$; conventional scheme and proposed AE at SINR = 7, 21, and 28 dB).

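We denote the channel estimation quality (CEQ) by $\varsigma$ and assume that the error variance depends on the SNR, denoted by $\gamma$, such that $\sigma_e^2 = \frac{\sigma^2}{1 + \varsigma\gamma\sigma^2} \approx \frac{1}{1 + \varsigma\gamma}$ and $\sigma^2_{h^{\omega}} = \frac{\varsigma\gamma\sigma^2}{1 + \varsigma\gamma\sigma^2} \approx \frac{\varsigma\gamma}{1 + \varsigma\gamma}$ [48].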
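The approximate variance expressions above can be evaluated with a few lines of code. This is a sketch under stated assumptions: it uses the approximate forms $1/(1+\varsigma\gamma)$ and $\varsigma\gamma/(1+\varsigma\gamma)$, takes $\gamma$ in linear scale, and treats $\varsigma = \infty$ as the limiting case of perfect estimation. The function names are hypothetical.

```python
import math


def db_to_linear(value_db):
    """Convert a power ratio from dB to linear scale."""
    return 10.0 ** (value_db / 10.0)


def lmmse_variances(ceq, snr_linear):
    """Return (error variance, estimate variance) for CEQ `ceq`.

    Uses sigma_e^2 ~ 1 / (1 + ceq * snr) and
         sigma_h^2 ~ ceq * snr / (1 + ceq * snr).
    """
    if math.isinf(ceq):
        return 0.0, 1.0  # limit: perfect channel estimation
    scaled = ceq * snr_linear
    return 1.0 / (1.0 + scaled), scaled / (1.0 + scaled)


gamma = db_to_linear(30.0)  # 30 dB SNR in linear scale
for ceq in (0.0, 0.1, 0.5, 1.0, math.inf):
    sigma_e2, sigma_h2 = lmmse_variances(ceq, gamma)
    print(ceq, sigma_e2, sigma_h2)
```

In this approximation the two variances always sum to one, so $\varsigma = 0$ puts all the variance in the error and $\varsigma = \infty$ puts all of it in the estimate.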

In Fig. 11, we analyze the impact of the CEQ ($\varsigma$) on the proposed AE and the conventional QPSK + (7,4) Hamming code for fixed rate $R = 8/7$ and transmit SNR $E_b/N_0 = 30$ dB. Please note that $\varsigma = 0$ indicates a completely erroneous channel estimate, whereas $\varsigma = \infty$ denotes perfect channel estimation. To create an AE that remains unaffected by varying channel estimation errors, we train a single AE framework on $S_{Train}$ samples drawn from varying $\varsigma = \{0.1, 0.5, 1, \infty\}$ until convergence as detailed in Remark 3, and test it on unseen $S_{Test}$ samples of varying CEQs ($\varsigma$). Clearly, the proposed AE outperforms the conventional FD-AF relay networks for all the CEQs, for reasons similar to those in Sec. VI-A. In fact, the BER performance of the proposed AE framework with almost fully erroneous channel estimation ($\varsigma = 0.1$) is better than that of the conventional FD-AF relay networks with perfect channel estimation ($\varsigma = \infty$), and the BER performance improvement of the proposed AE increases as the RSI increases. This is because the proposed AE-based BCM designs $2^k$ codewords in $2n$-dimensional space with the observations in Sec. V, such that it can handle the impacts of RSI and channel estimation errors effectively.

D. Reproducibility of Proposed AE Framework

Definition 3 (Reproducibility of AE): An AE is defined to be reproducible, for a given hyper-parameter setting $P_S$, if and only if we can reproduce any trained AE model $M(\theta)$ with a very high probability, such that different training weight initializations and training-validation samples of the AE do not lead to large variations in BER.

We analyze the reproducibility by varying the training-validation data and weight initializations, training the AE 25 times, and reporting the standard deviation and mean of the BER on the testing data in Fig. 12. In particular, we evaluate the reproducibility of the proposed AE framework for different RSI levels, while we fix the rate $R = 16/7$ in the BCM design and $R = 12/7$ in the d-BCM design, with transmit SNR $E_b/N_0 = 30$ dB. We can see that the proposed AE framework is highly reproducible because the standard deviation of the 25 BER values obtained from 25 different runs lies in the range $10^{-2}$ to $10^{-4}$. This is due to the fact that we train the AE until convergence using Remark 3. Also, as the RSI increases, the variations in BER increase by a factor of two, showing that higher RSI levels negatively impact the reproducibility of the trained AE framework in an FD-AF relay network.
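The reproducibility statistics described above amount to a mean and standard deviation of the BER across independent training runs at each operating point. The sketch below illustrates the computation only: the function name is hypothetical, the input values are made-up placeholders rather than results from the paper, and the use of the population standard deviation (`ddof=0`) is an assumption.

```python
import numpy as np


def reproducibility_stats(ber_runs):
    """Mean and standard deviation of BER across training runs.

    ber_runs: shape (num_runs, num_sinr_points), one row per trained AE.
    Returns two arrays of length num_sinr_points.
    """
    ber = np.asarray(ber_runs, dtype=float)
    return ber.mean(axis=0), ber.std(axis=0)


# Placeholder BER values for 3 runs at 2 SINR points (illustrative only).
placeholder_runs = [[0.10, 0.010],
                    [0.12, 0.012],
                    [0.11, 0.011]]
mean_ber, std_ber = reproducibility_stats(placeholder_runs)
print(mean_ber, std_ber)
```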

Fig. 12: Reproducibility of AE ($R = 16/7$) and AE ($R = 12/7$) frameworks (standard deviation and mean of the BER over 25 iterations versus SINR [dB]).

E. AE-based BCM and d-BCM Design with Outer 5G-NR LDPC Codes

Until now, we have considered the AE-based BCM and d-BCM design for a short block length ($n = 7$). Recently, the 5G-NR standards proposed to utilize outer LDPC codes for facilitating parallel execution to meet the low-latency and high-throughput requirements in 5G's URLLC networks [49]. Thus, we use the 5G-NR LDPC codes with base graph 1 and rate 1/3 as outer codes [49], [50]. Specifically³, we employ 5G-NR LDPC codes with rate 1/3 as the outer code for the scenarios in Fig. 9c, 9f, 10c, and 10f. We consider a block length of $n = 11{,}616$ for designing the LDPC codes. In Fig. 13, we compare the BER performance with LDPC as outer codes for varying RSI (Fig. 13a) and SNR (Fig. 13b). In Fig. 13a, at a BER of $10^{-4}$, we see that the proposed AE-based BCM and d-BCM outperform the conventional MLD scenario by 11 dB and 17 dB, respectively. In Fig. 13b, we see that even with powerful outer LDPC codes, the conventional MLD scenario cannot decode the signal because of the high RSI ($\sigma_{rr}^2 = 0$ dB), but the proposed AE-based BCM and d-BCM designs can decode the signal, and the waterfall in BER appears at 12 dB and 16 dB. In summary, the BER performance gains attained by the proposed AE-based BCM and d-BCM design over the MLDs for short block length in Fig. 9c, 9f, 10c, and 10f are translated and enhanced by utilizing the outer LDPC codes. This demonstrates the greatly improved decoding abilities of the proposed AE-based BCM and d-BCM, even when operated at long block lengths with the help of an outer code.

³ For the conventional scenario, we employ 5G-NR LDPC codes with rate 1/3 as outer codes and the (7,4) Hamming code with rate 4/7 as the inner code; we also utilize 16-QAM and d-PSK-8 modulation for the scenarios with and without CSI, respectively. For the proposed AE, we employ 5G-NR LDPC codes with a rate of 1/3 as outer codes for the AE-based BCM and d-BCM designs with rates of 16/7 and 12/7, respectively.

Fig. 13: Performance evaluation for AE-based BCM and d-BCM design with outer 5G-NR LDPC codes (BER versus SINR [dB]; conventional MLD and proposed AE, (i) with CSI knowledge and (ii) without CSI knowledge). (a) Varying RSI and fixed SNR $E_b/N_0 = 30$ dB. (b) Varying transmit SNR and fixed RSI $\sigma_{rr}^2 = 0$ dB.

VII. CONCLUSION

In this work, we propose end-to-end learning-based FD-AF relay networks in the presence of RSI using AEs for high transmission rates $R = k/n$. We propose $(n, k)$ AE-based BCM and d-BCM designs depending upon the availability of CSI knowledge. Further, counter-intuitively in the presence of RSI in the FD-AF relay networks, we propose to utilize a radio transformer network for the AE framework with CSI knowledge to improve the NN-based decoding and BER performance. We design a single AE framework (performing BCM or d-BCM design) that generalizes well on varying testing SNR or RSI levels, outperforming not only the conventional FD-AF relay networks with remarkable gains, but also the half-duplex AF relay networks for d-BCM designs. We analyze the AE's performance in the presence of channel estimation error and note that, for moderate RSI, the proposed AE framework with almost fully erroneous channel estimation still outperforms the conventional FD-AF relay networks with perfect channel estimation. Moreover, we show that the proposed AE is highly reproducible for varying training weight initializations and sample sets, as the BER for different trainings varies by a standard deviation of $10^{-2}$ to $10^{-4}$ depending on the RSI level at the FD-AF relay node. Lastly, we consider 5G-NR LDPC codes as outer codes for the AE-based BCM and d-BCM designs, and observe BER performance gains of up to 17 dB.

With a focus on training convergence, we show that the AE converges above a minimum required SNR and below a maximum RSI, depending on the transmission rate and CSI availability. Furthermore, we provide the necessary conditions for the AE's convergence by showing that once the binary cross-entropy validation loss has converged and the NN encoder of the AE designs $2^k$ codewords during the training phase, the AE has converged to its maximum potential of decoding the signal. Lastly, by analyzing the AE-based BCM design, we determine distinct observations of the designed codewords in $2n$-dimensional space: (i) the AE forms $2^k$ codewords in $2n$-dimensional space; (ii) as the block length increases, the minimum Euclidean distance between any of the possible codewords increases, and for sufficiently large block length ($n$), when the number of codewords becomes extremely large, the minimum Euclidean distance between any two codewords follows a Gaussian distribution and the Euclidean distance between the codewords concentrates to the average Euclidean distance; (iii) the packing density improves as the rate $R$ decreases; (iv) the codewords designed by the AE framework are spherical codes; and (v) as the block length increases, the average Hamming distance between codewords increases. We aim to consider the transmitter/receiver distortion model discussed in [17] in future work. Re-training of the AE frameworks in an online setting using transfer learning [51] is an interesting topic, which we also leave for future work.

Fig. 14: Impact of transmit SNR and RSI on the received SINR. (a) Transmit SNR $E_b/N_0$ versus SINR, for RSI $\sigma_{rr}^2 = 0$ dB, $-40$ dB, and 10 dB. (b) RSI $\sigma_{rr}^2$ versus SINR, for transmit SNR $E_b/N_0 = 0$ dB, 15 dB, and 30 dB.

Fig. 15: Impact of including an RTN in the AE frameworks (BER versus SINR [dB]; AE-based BCM and AE-based d-BCM, each with and without an RTN).

APPENDIX A

IMPACT OF TRANSMIT SNR AND RSI ON THE RECEIVED SINR AT THE DESTINATION NODE

The received SINR at the destination node is given in (6) and depends on two factors: (1) the transmit SNR ($E_b/N_0$) of the source and relay nodes, and (2) the RSI ($\sigma_{rr}^2$) at the relay node. For the sake of simplicity, we consider equal transmit SNR at the source and relay nodes. In Fig. 14, we show the impact of transmit SNR and RSI on the received SINR. Clearly, as the transmit SNR increases or the RSI decreases, the received SINR increases. Please note that for the evaluations in Sec. VI, we fix the RSI $\sigma_{rr}^2 = 0$ dB and vary the transmit SNR over $[0, 30]$ dB, so the SINR varies over $[-5, 3]$ dB (as shown in Fig. 14a), and we fix the transmit SNR $E_b/N_0 = 30$ dB and vary the RSI over $[-60, 20]$ dB, so the SINR varies over $[-12, 30]$ dB (as shown in Fig. 14b).

APPENDIX B

IMPACT OF INCLUDING RTN IN AE FRAMEWORKS

In the AE works for HD-AF relay networks [6], an RTN is included in d-BCM and excluded in BCM. In Fig. 15, we analyze the impact of including an RTN in the NN decoder of the proposed AE frameworks for $(n, k) = (7, 4)$, $E_b/N_0 = 30$ dB, and varying RSI. We can see that including an RTN in the AE-based BCM design helps to improve the BER performance by at least 5 dB for lower RSI ($\sigma_{rr}^2 \leq -20$ dB), whereas including an RTN in the AE-based d-BCM design worsens the BER performance by at least 1.8 dB for higher RSI ($\sigma_{rr}^2 \geq 0$ dB). Thus, in this work, in contrast to [6], we include an RTN in the BCM design and do not include an RTN in the d-BCM design.

REFERENCES

[1] T. O’Shea and J. Hoydis, “An Introduction to Deep Learning for the

Physical Layer,” in IEEE Transactions on Cognitive Communications and

Networking, vol. 3, no. 4, pp. 563–575, Dec. 2017.

[2] S. D¨

orner, S. Cammerer, J. Hoydis and S. t. Brink, “Deep Learning Based

Communication Over the Air,” in IEEE Journal of Selected Topics in

Signal Processing, vol. 12, no. 1, pp. 132-143, Feb. 2018.

[3] E. Balevi and J. G. Andrews, “Autoencoder-Based Error Correction Cod-

ing for One-Bit Quantization,” in IEEE Transactions on Communications,

vol. 68, no. 6, pp. 3440–3451, June 2020.

[4] T. Matsumine, T. Koike-Akino and Y. Wang, “Deep Learning-Based

Constellation Optimization for Physical Network Coding in Two-Way

Relay Networks,” ICC 2019 - 2019 IEEE Intern. Conf. on Commun.

(ICC), China, 2019, pp. 1–6.

[5] A. Gupta and M. Sellathurai, “End-to-End Learning-based Amplify-and-

Forward Relay Networks using Autoencoders,” ICC 2020 - 2020 IEEE

International Conference on Communications (ICC), Dublin, Ireland,

2020, pp. 1–6.

[6] A. Gupta and M. Sellathurai, ”End-to-End Learning-Based Framework

for Amplify-and-Forward Relay Networks,” in IEEE Access, vol. 9, pp.

81660-81677, 2021.

[7] A. Gupta and M. Sellathurai, “End-to-End Learning-based Two-Way AF

Relay Networks with I/Q Imbalance,” 2021 IEEE 22nd International

Workshop on Signal Processing Advances in Wireless Communications

(SPAWC), 2021, pp. 111-115.

[8] A. Gupta and M. Sellathurai, “A Novel Average Autoencoder-based

Amplify-and-Forward Relay Networks with Hardware Impairments,”

IEEE Transactions on Cognitive Communications and Networking, Ac-

cepted for Publication, 2021.

[9] Y. Lu, P. Cheng, Z. Chen, Y. Li, W. H. Mow and B. Vucetic, “Deep

Autoencoder Learning for Relay-Assisted Cooperative Communication

Systems,” in IEEE Transactions on Communications, vol. 68, no. 9, pp.

5471–5488, Sept. 2020.

[10] A. Gupta and M. Sellathurai, “A Stacked-Autoencoder Based End-to-

End Learning Framework for Decode-and-Forward Relay Networks,”

ICASSP 2020 - IEEE Intern. Conf. Acoustics, Speech, Signal Process.

(ICASSP), 2020, pp. 5245–5249.

[11] A. Gupta and M. Sellathurai, “A Stacked Autoencoder-based Decode-

and-Forward Relay Networks with I/Q Imbalance,” AI-6G Workshop,

IEEE World Congress on Computational intelligence (WCCI Workshop),

Accepted, 2022.

[12] S. Cammerer, F. A. Aoudia, S. Dörner, M. Stark, J. Hoydis and S. ten Brink, “Trainable Communication Systems: Concepts and Prototype,” in IEEE Transactions on Communications, vol. 68, no. 9, pp. 5489–5503, Sept. 2020.

[13] A. Gupta, K. Singh and M. Sellathurai, “Time-Switching EH-Based

Joint Relay Selection and Resource Allocation Algorithms for Multi-

User Multi-Carrier AF Relay Networks,” in IEEE Trans. Green Commun.

Netw., vol. 3, no. 2, pp. 505-522, June 2019.

[14] K. Singh, A. Gupta, T. Ratnarajah and M. Ku, “A General Approach Toward Green Resource Allocation in Relay-Assisted Multiuser Communication Networks,” in IEEE Trans. Wireless Commun., vol. 17, no. 2, pp. 848–862, Feb. 2018.

[15] A. Gupta, S. Biswas, K. Singh, T. Ratnarajah and M. Sellathurai, “An Energy-Efficient Approach Towards Power Allocation in Non-Orthogonal Multiple Access Full-Duplex AF Relay Systems,” 2018 IEEE 19th Intern. Work. Signal Process. Advances in Wireless Commun. (SPAWC), 2018, pp. 1–5.

[16] K. Singh, A. Gupta and T. Ratnarajah, “Efficient joint subcarrier and power allocation for achieving green multiuser full-duplex decode-and-forward relay networks,” 2017 IEEE Intern. Conf. on Commun. (ICC), 2017, pp. 1–6.

[17] J. Xue, S. Biswas, A. C. Cirik, H. Du, Y. Yang, T. Ratnarajah, and M.

Sellathurai, “Transceiver Design of Optimum Wirelessly Powered Full-

Duplex MIMO IoT Devices,” in IEEE Trans. Commun., vol. 66, no. 5,

pp. 1955-1969, May 2018.

[18] C. Zhong, M. Matthaiou, G. K. Karagiannidis and T. Ratnarajah, “Generic Ergodic Capacity Bounds for Fixed-Gain AF Dual-Hop Relaying Systems,” in IEEE Trans. Vehicular Technol., vol. 60, no. 8, pp. 3814–3824, Oct. 2011.

[19] Z. Ding, T. Ratnarajah and K. K. Leung, “On the study of network

coded AF transmission protocol for wireless multiple access channels,”

in IEEE Trans. Wireless Commun., vol. 8, no. 1, pp. 118-123, Jan. 2009.

[20] A. Bishnu, M. Holm and T. Ratnarajah, “Performance Evaluation of

Full-Duplex IAB Multi-Cell and Multi-User Network for FR2 Band,” in

IEEE Access, vol. 9, pp. 72269-72283, 2021.

[21] H. He, S. Biswas, P. Aquilina, T. Ratnarajah and J. Yang, “Performance

Analysis of Multi-Cell Full-Duplex Cellular Networks,” in IEEE Access,

vol. 8, pp. 206914-206930, 2020.

[22] J. R. Krier and I. F. Akyildiz, “Active self-interference cancellation

of passband signals using gradient descent,” 2013 IEEE 24th Ann.

Intern. Symp. on Personal, Indoor, and Mobile Radio Commun. (PIMRC),

London, UK, 2013, pp. 1212–1216.

[23] T. P. Do and T. V. T. Le, “Power Allocation and Performance Comparison of Full Duplex Dual Hop Relaying Protocols,” in IEEE Communications Letters, vol. 19, no. 5, pp. 791–794, May 2015.

[24] K.-G. Wu, F.-T. Chien, Y.-F. Lin and M.-K. Chang, “SINR and Delay Analyses in Two-Way Full-Duplex SWIPT-Enabled Relaying Systems,” in IEEE Transactions on Communications, Early Access, 2021.

[25] K. Yang, H. Cui, L. Song and Y. Li, “Efficient Full-Duplex Relaying With Joint Antenna-Relay Selection and Self-Interference Suppression,” in IEEE Transactions on Wireless Communications, vol. 14, no. 7, pp. 3991–4005, July 2015.

[26] H. M. D. Oliveira and G. Battail, “The random coded modulation: performance and Euclidean distance spectrum evaluation,” Ann. Télécommun., vol. 47, pp. 107–124, 1992.

[27] I. Goodfellow et al., Deep Learning. MIT Press, 2016.

[28] D. Masters and C. Luschi, “Revisiting Small Batch Training for Deep

Neural Networks,” arXiv, 2018.

[29] Y. Huang et al., “GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism,” in Proc. Thirty-third Conference on Neural Information Processing Systems (NIPS), 2019.

[30] L. Jiménez Rodríguez, N. H. Tran and T. Le-Ngoc, “Performance of Full-Duplex AF Relaying in the Presence of Residual Self-Interference,” in IEEE Journal on Selected Areas in Communications, vol. 32, no. 9, pp. 1752–1764, Sept. 2014.

[31] T. K. Baranwal, D. S. Michalopoulos and R. Schober, “Outage Analysis of Multihop Full Duplex Relaying,” in IEEE Communications Letters, vol. 17, no. 1, pp. 63–66, January 2013, doi: 10.1109/LCOMM.2012.112812.121826.

[32] M. R. Avendi and H. H. Nguyen, “Performance of Selection Combining for Differential Amplify-and-Forward Relaying Over Time-Varying Channels,” in IEEE Trans. on Wireless Communications, vol. 13, no. 8, pp. 4156–4166, Aug. 2014.

[33] Y. Lou, Y. Ma, Q. Yu, H. Zhao and W. Xiang, “A Differential ML Combiner for Differential Amplify-and-Forward System in Time-Selective Fading Channels,” in IEEE Trans. on Vehicular Technology, vol. 65, no. 12, pp. 10157–10163, Dec. 2016.

[34] T. M. Cover and J. A. Thomas, Elements of Information Theory. John Wiley & Sons, Nov. 2012.

[35] N. Ketkar, “Introduction to keras,” Deep Learning with Python, Springer,

pp. 97–111, 2017.

[36] Martín Abadi et al., “TensorFlow: Large-scale machine learning on heterogeneous systems,” Technical Report, Google Brain, arXiv, 2015. [Online]: https://arxiv.org/abs/1605.08695.

[37] D. Kingma and J. Ba, “Adam: A method for stochastic optimization,” arXiv preprint arXiv:1412.6980, 2014.

[38] X. Glorot and Y. Bengio, “Understanding the difficulty of training deep feed-forward neural networks,” in Proceedings International Conference AI Statistics, vol. 9, pp. 249–256, May 2010.

[39] M. Li, M. Soltanolkotabi and S. Oymak, “Gradient Descent with Early Stopping is Provably Robust to Label Noise for Overparameterized Neural Networks,” arXiv, 2019. [Online]: https://arxiv.org/abs/1903.11680.

[40] F. Alberge, “Deep Learning Constellation Design for the AWGN

Channel With Additive Radar Interference,” in IEEE Transactions on

Communications, vol. 67, no. 2, pp. 1413–1423, Feb. 2019.

[41] M. Stark, F. Ait Aoudia and J. Hoydis, “Joint Learning of Geometric and Probabilistic Constellation Shaping,” 2019 IEEE Globecom Workshops (GC Wkshps), 2019, pp. 1–6.

[42] F. A. Aoudia and J. Hoydis, “Joint Learning of Probabilistic and Geometric Shaping for Coded Modulation Systems,” GLOBECOM 2020 - 2020 IEEE Global Communications Conference, 2020, pp. 1–6.

[43] A. Choromanska, M. Henaff, M. Mathieu, G. B. Arous, and Y. LeCun, “The loss surfaces of multilayer networks,” in Proc. 18th Int. Conf. Artificial Intelligence and Statistics (AISTATS), 2015, pp. 192–204.

[44] Y. N. Dauphin et al., “Identifying and attacking the saddle point problem in high-dimensional non-convex optimization,” in Proc. 27th Advances in Neural Information Processing Systems, 2014, pp. 2933–2941.

[45] E. Agrell, “Database of sphere packings,” [Online]: http://codes.se/packings, 2014, accessed Mar. 1, 2019.

[46] S. Park and M.-K. Byeon, “Irregularly distributed triangular quadrature amplitude modulation,” 2008 IEEE 19th International Symposium on Personal, Indoor and Mobile Radio Communications, Cannes, France, 2008, pp. 1–5.

[47] T. G. Markiewicz, “Construction and Labeling of Triangular QAM,” in

IEEE Communications Letters, vol. 21, no. 8, pp. 1751–1754, Aug. 2017.

[48] A. R. Heidarpour, M. Ardakani and C. Tellambura, “Network Coded

Cooperation Based on Relay Selection with Imperfect CSI,” 2017 IEEE

86th Vehicular Technology Conference (VTC-Fall), 2017, pp. 1-5.

[49] 3GPP TS 38.212 v16.0.0, “NR; Multiplexing and channel coding,” 3rd Generation Partnership Project (3GPP), Technical Specification Group Radio Access Network, Jan. 2020.

[50] M. Shirvanimoghaddam et al., “Short Block-Length Codes for Ultra-Reliable Low Latency Communications,” in IEEE Communications Magazine, vol. 57, no. 2, pp. 130–137, February 2019.

[51] C. Tan, F. Sun, T. Kong, W. Zhang, C. Yang, and C. Liu, “A survey

on deep transfer learning,” arXiv, 2018, arXiv:1808.01974. [Online].

Available: https://arxiv.org/abs/1808.01974.


