Joint Selection of Local Trainers and Resource
Allocation for Federated Learning in Open RAN
Intelligent Controllers
Amardip Kumar Singh, Kim Khoa Nguyen
Synchromedia Lab, École de Technologie Supérieure, Montreal, Canada
amardip-kumar.singh.1@ens.etsmtl.ca, kim-khoa.nguyen@etsmtl.ca
Abstract—Recently, Federated Learning (FL) has been applied in various research domains, especially because of its privacy-preserving and decentralized approach to model training. However, very few FL applications have been developed for the Radio Access Network (RAN) due to the lack of efficient deployment models. Open RAN (O-RAN) promises a high standard of 5G service delivery through its disaggregated, hierarchical, and distributed network function processing framework. Moreover, it comes with built-in intelligent controllers to instill smart decision-making ability into the RAN. In this paper, we propose a framework named O-RANFed to deploy and optimize FL tasks in O-RAN to provide 5G slicing services. To improve the performance of FL, we formulate a joint mathematical optimization model of local learner selection and resource allocation to perform model training in every iteration. We solve this non-convex problem using the decomposition method. First, we propose a slicing-based and deadline-aware client selection algorithm. Then, we solve the reduced resource allocation problem using the successive convex approximation (SCA) method. Our simulation results show the proposed model outperforms state-of-the-art FL methods such as FedAvg and FedProx in terms of convergence, learning time, and resource costs.
Index Terms—Federated Learning, O-RAN, 5G, Resource Allocation, RAN Intelligent Controller, Network Slicing, RIC
I. INTRODUCTION
Federated Learning (FL) is a new approach fusing the concepts of distributed computing and Artificial Intelligence [1]. Unlike traditional machine learning, in which models are trained on a centralized data set, FL does not require the data sets to be located at one central point. It collaborates with the local computing nodes through a global aggregation point by transferring only the model update vectors in a periodic communication mode. Therefore, it respects the privacy of the local data set, distributes the burden of computing, and still trains a centralized model [1]. These features enable many applications of FL in edge computing, where users' devices serve as local trainers. However, it is yet to be well investigated in carriers' networks due to the lack of a standard implementation model. The characteristics of FL pave the way for its use in the domain of 5G RAN, particularly to enhance radio resource management policies [2].
5G network services promise to deliver an ever faster and more reliable user experience, and on a massive scale. This places a burden on improving how the access network is managed. Using the operational and maintenance data that is collected periodically, RAN performance can be improved by incorporating AI capabilities. O-RAN is a newly introduced radio access network architectural framework that supports different use cases of 5G services through network slicing and disaggregation. The key elements of O-RAN are its intelligent controllers (RICs), which monitor and support guaranteed performance for different slice user groups [3]. On the other hand, O-RAN operates on a multi-vendor, shared-resource system. Therefore, it has to function at minimal resource cost and yet with guaranteed service delivery. These constraints pose heavy challenges for the O-RAN RICs.
In this paper, we investigate the possibility of FL model training in O-RAN to provide 5G slicing services. 5G services are governed by slicing of the physical network into several logical, isolated, self-adaptable networks hosted by general processors located at cloud-based data centres [4]. 5G defines three classes of services based on their quality of service (QoS) metrics: ultra-Reliable Low Latency Communication (uRLLC), extreme Mobile Broad Band (eMBB), and massive Machine Type Communications (mMTC). Accordingly, the physical RAN infrastructure is also divided into three logical slices of network elements that are assigned and maintained dynamically [1]. The two kinds of O-RAN Intelligent Controllers, namely the near-real-time RAN Intelligent Controller (near-RT-RIC) and the Non-RT-RIC, coordinate with each other to select the slice-specific local training points and then train the FL models.
In Fig. 1, we map the FL framework onto the RIC specifications of O-RAN [5]. The slice operational data is collected and saved into distributed databases through the E2 interface. The O1 interface transfers data to the near-RT-RICs for local processing. The interaction of FL parameters between the Non-RT-RIC and the near-RT-RICs is enabled by the A1 interface. Since the FL tasks are distributed over edge cloud sites in O-RAN RICs and shared by various operators, the resources required to facilitate this learning process need to be optimized. In addition, an FL model is trained iteratively, so learning time is also an issue. Due to the stochastic nature of this distributed learning process, minimizing the learning time while guaranteeing the accuracy of the global model is challenging. Hence, resource usage cost reduction and FL time minimization should be dealt with together for the O-RAN intelligent controllers.
[Fig. 1 here depicts the FL setup: the Non-RT-RIC, within the Service & Management Orchestration functions, hosts the ML model; the near-RT-RICs host local training over O-CU-UP/O-CU-CP/O-DU/O-RU stacks with distributed databases for the uRLLC, eMBB, and mMTC slices; the elements are connected via the A1, O1, O2, E1, E2, and F1 interfaces. Local model update vectors are uploaded and the aggregated global model vector is broadcast.]
Fig. 1. Federated Learning set-up for O-RAN Intelligent Controllers
A plethora of recent works have focused on the adaptive
optimization of FL under communication-constrained edge computing. Prior works [6], [7], and [8] have investigated different resource-constrained federated learning approaches. Their objective is to minimize the energy usage of edge devices involved in FL training by allocating optimal transmission power. The work in [8] is too general: it can only be implemented in a single edge (with a single base station) and cannot be used for a carrier network with multiple edges. Moreover, the problem of resource allocation from the perspective of the System Management and Orchestration (SMO), which is important for O-RAN, has not been considered so far. In this paper, we take into account new parameters of the O-RAN architecture to design an algorithm to select local trainers, then allocate resources for the selected trainers, and propose an aggregation method. Our contributions in this paper can be summarized as follows:
• A mathematical formulation of the joint optimal resource allocation and local trainers' selection problem for the O-RANFed learning tasks. Then, we propose a solution for this non-convex optimization problem using the decomposition method.
• An O-RAN slicing-based and deadline-aware algorithm to select representative instances of near-RT-RICs as local model participants in each global iteration of FL.
• An FL algorithm, called O-RANFed, for O-RAN slicing services, where the near-RT-RICs host the local training instances and the Non-RT-RIC hosts the global aggregation point of the ML model.
To the best of our knowledge, this is the first work proposed
to optimize ML training through federated settings in O-
RAN. The remainder of this paper is organized as follows. In
Section II, the system model and the problem formulation are
presented. In Section III, we describe our proposed solution
approach. In Section IV, we present the numerical results to
evaluate the performance of our proposed solution. Finally, we
conclude the paper and discuss our future work.
II. SYSTEM MODEL AND PROBLEM FORMULATION
Consider an O-RAN system with a single regional cloud and a set M of M distributed edges cooperatively performing an FL algorithm. In this FL setup, each edge cloud uses its locally collected training data to train a local FL model. The Non-RT-RIC at the regional cloud integrates the local FL models from the participating edge clouds and generates an aggregated FL model. This aggregated FL model is further used to improve the local FL model of each near-RT-RIC, enabling the local models to collaboratively perform a learning algorithm without transferring their training data. We call this aggregated FL model, generated using the local FL models, the global FL model. As illustrated in Fig. 1, the uplink from the near-RT-RICs to the Non-RT-RIC is used to send the local FL model update parameters, and the downlink is used to broadcast the global FL model in the global rounds of training.
A. The Learning Model
In this model, each near-RT-RIC collects a dataset D_i = [x_{i,1}, ..., x_{i,S_i}] of input data, where S_i is the number of input samples collected by near-RT-RIC i and each element x_{is} is the FL model's input vector. Let y_{is} be the output of x_{is}. For simplicity, we consider an FL model with a single output, which can be readily generalized to a case with multiple outputs. The output data vector for training the FL model of near-RT-RIC i is y_i = [y_{i,1}, ..., y_{i,S_i}]. We assume that the data collected by each near-RT-RIC is different from that of the other near-RT-RICs, i.e., x_i ≠ x_j for i ≠ j, i, j ∈ M. So, each local trainer will train the model using a different dataset. This is in line with the real scenario, as each local near-RT-RIC collects the operational data from the corresponding slice-specific users. We define a vector g_i to capture the parameters of the local FL model that is trained on x_i and y_i; g_i determines the local FL model of each near-RT-RIC i. For example, in a linear regression prediction algorithm, g_i^T x_{is} represents the output, and g_i determines the prediction accuracy. The training process of an FL model solves:
min_{g_1,...,g_M} (1/S) Σ_{i=1}^{M} Σ_{s=1}^{S_i} f(g_i, x_{is}, y_{is})    (1)
s.t. g_1 = g_2 = ... = g_M = g, ∀i ∈ M    (1a)
where S = Σ_{i=1}^{M} S_i is the total size of the training data of all near-RT-RICs, g is the global FL model generated by the Non-RT-RIC, and f(g_i, x_{is}, y_{is}) is a loss function that captures the FL prediction accuracy. Different FL algorithms use different loss functions. Constraint (1a) ensures that, once the FL model converges, all of the near-RT-RICs and the Non-RT-RIC share the same model: the Non-RT-RIC transmits the parameters g of the global FL model to its connected near-RT-RICs so that they train their local FL models; then the near-RT-RICs transmit their local FL models to the Non-RT-RIC to update the global FL model. The update of each near-RT-RIC i's local FL model g_i depends on all near-RT-RICs' local FL models. The update of the local FL model g_i depends on the learning algorithm; for example, one can use gradient descent or randomized coordinate descent. The update of the global model g is given by:

g = Σ_{i=1}^{M} (S_i · g_i) / S    (2)
Since we consider wireless transmissions through the A1 interface between the near-RT-RICs and the Non-RT-RIC [5], there is a resource constraint on the communication model, which in turn affects the performance of the FL algorithm. Therefore, we need to jointly consider these two aspects.
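The weighted aggregation in (2) can be sketched as follows. This is an illustrative Python snippet, not the authors' code; the function name `aggregate` and the toy parameter vectors are our own:

```python
# Illustrative sketch of Eq. (2): the global model g is the
# sample-size-weighted average of the local models, g = sum_i (S_i/S) g_i.
# Model parameters are plain lists of floats for simplicity.

def aggregate(local_models, sample_counts):
    """Weighted average of local parameter vectors with weights S_i / S."""
    total = sum(sample_counts)
    dim = len(local_models[0])
    g = [0.0] * dim
    for gi, si in zip(local_models, sample_counts):
        for d in range(dim):
            g[d] += (si / total) * gi[d]
    return g

# Two near-RT-RICs with unequal dataset sizes: the larger one dominates.
print(aggregate([[1.0, 0.0], [0.0, 1.0]], [30, 10]))  # [0.75, 0.25]
```

Note that the trainer with the larger S_i pulls the global model toward its local solution, which is exactly why the selection of local trainers matters in each round.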
B. FL Resource Model
In each global iteration, the O-RAN system has to decide which local training points, i.e., which near-RT-RICs, participate. This is because at each time interval only a limited number of clients can participate, due to delay constraints originating from the control loops of O-RAN. The selected clients then upload their local FL model updates over the wireless medium. We define a binary variable a_m^t ∈ {0, 1} to decide whether or not trainer m is selected in round t, and a^t = (a_1^t, ..., a_M^t) collects the overall trainer selection decisions. A near-RT-RIC selected in round t, i.e., with a_m^t = 1, consumes compute resources to train locally on the collected data. At the same time, these selected trainers also consume bandwidth resources to transmit their update vectors. We consider orthogonal frequency division multiple access (OFDMA) for local model uploading with a total bandwidth B. Let b_m^t ∈ [0, 1] be the bandwidth allocation ratio for trainer m in round t; hence its allocated bandwidth is b_m^t · B. Let b^t = (b_1^t, ..., b_M^t). The bandwidth allocation must satisfy Σ_{m∈M} b_m^t = 1, ∀t. Clearly, if a_m^t = 0, namely trainer m is not selected in round t, then no bandwidth is allocated to it, i.e., b_m^t = 0. On the other hand, if a_m^t = 1, then we require at least a minimum bandwidth ratio b_min to be allocated to trainer m, i.e., b_m^t ≥ b_min. To make the problem feasible, we assume b_min ≤ 1/M. Therefore, the total resource cost for using communication bandwidth is:
R^co = Σ_{m=1}^{M} R_m^co, where R_m^co = Σ_{t=1}^{T} a_m^t · b_m^t · B · p_tr    (3)
for T global rounds, where p_tr is the unit cost of bandwidth usage. For each near-RT-RIC m, let R_m^cp denote its local training compute resource cost in every round, which depends on its computing host and dataset. To process the local dataset, each near-RT-RIC uses the CPU cycle frequency of its host. Let the CPU power of the m-th host be f_m cycles/s and the per-unit-time usage cost be p_c. Then the total compute resource cost is:

R^cp = Σ_{m=1}^{M} R_m^cp, where R_m^cp = Σ_{t=1}^{T} a_m^t · (D_m · c_m / f_m) · p_c    (4)

where c_m is the number of CPU cycles required to process one bit of data and D_m is the size of the local dataset.
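Given the selection and bandwidth decisions, the cost model of (3)-(4) can be evaluated directly. The sketch below (our own naming, not from the paper) mirrors the notation a_m^t, b_m^t, B, p_tr, D_m, c_m, f_m, p_c; the numbers in the example are illustrative, not the paper's settings:

```python
# Hedged sketch of Eqs. (3)-(4): communication cost a_m^t * b_m^t * B * p_tr
# and compute cost a_m^t * (D_m * c_m / f_m) * p_c, summed over all
# trainers m and global rounds t.

def total_cost(a, b, B, p_tr, D, c, f, p_c):
    """a[t][m], b[t][m]: selection and bandwidth fraction per round/trainer."""
    r_co = sum(a[t][m] * b[t][m] * B * p_tr
               for t in range(len(a)) for m in range(len(a[0])))
    r_cp = sum(a[t][m] * (D[m] * c[m] / f[m]) * p_c
               for t in range(len(a)) for m in range(len(a[0])))
    return r_co, r_cp

# One round, two trainers, only the first selected with half the bandwidth.
r_co, r_cp = total_cost(a=[[1, 0]], b=[[0.5, 0.0]], B=2.0, p_tr=1.0,
                        D=[4.0, 4.0], c=[3.0, 3.0], f=[6.0, 6.0], p_c=1.0)
print(r_co, r_cp)  # 1.0 2.0
```

The binary a_m^t gates both terms, which is what couples the trainer selection sub-problem to the resource allocation sub-problem later in the paper.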
C. FL Accuracy Model
The target for each of these local models is to attain a local accuracy level θ ∈ (0, 1), defined as below:

||∇f_m^t(g_m^t)|| ≤ θ ||∇f_m^t(g_m^{t-1})||, ∀m ∈ {1, 2, ..., M}    (5)

A near-RT-RIC takes several iterations, called local iterations, to attain this accuracy. For the global model placed at the Non-RT-RIC, the target is to attain the optimal model weights reaching an ε level of global accuracy, defined as below:

|F(g^t) − F(g*)| ≤ ε, ∀t ≥ T    (6)

Constraint (6) states that g* is the optimal model parameter, i.e., for every global round beyond T, the difference between the loss function values falls within the defined accuracy level ε. Here, F(·) denotes the global loss function, defined over all the local loss functions as:

F(g) := Σ_{i=1}^{M} (S_i / S) · f_i(g_i)    (7)

In [9], it is proven that the number of global rounds required to attain an ε level of global accuracy at local accuracy θ can be upper bounded by:

K(ε, θ) = O(log(1/ε)) / (1 − θ)    (8)
We use this relationship among the local accuracy level θ, the global model accuracy ε, and the upper limit on the number of required global rounds to model the FL time. In order to ensure the convergence of the gradient descent approximation, the following assumptions are made on the loss functions at each near-RT-RIC training point:
(i) F_i(g) is convex.
(ii) F_i(g) is ρ-Lipschitz, i.e., ||F_i(g) − F_i(g')|| ≤ ρ ||g − g'||, for any g, g'.
(iii) F_i(g) is β-smooth, i.e., ||∇F_i(g) − ∇F_i(g')|| ≤ β ||g − g'||, for any g, g'.
(iv) For any g and i, the difference between the global gradient and the local gradient can be bounded by ||∇F_i(g) − ∇F(g)|| ≤ δ_i, and δ := Σ_i S_i δ_i / S.
These assumptions are in line with recent works [10], [11], [6], [12] on the convergence analysis of FL.
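The bound in (8), which also appears as constraint (12f) with a multiplicative factor μ, can be sketched numerically as follows (the function name and default μ are our own; the paper later chooses μ so that the numerator equals 1):

```python
import math

# Sketch of Eq. (8): K(eps, theta) = mu * log(1/eps) / (1 - theta),
# the upper bound on global rounds for global accuracy eps and local
# accuracy theta. Both eps and theta lie in (0, 1).

def global_rounds(eps, theta, mu=1.0):
    """Upper bound on the number of global FL rounds."""
    assert 0.0 < eps < 1.0 and 0.0 < theta < 1.0
    return mu * math.log(1.0 / eps) / (1.0 - theta)

# A looser local accuracy criterion (theta closer to 1) inflates the
# number of global rounds needed for the same global accuracy.
print(global_rounds(0.05, 0.2) < global_rounds(0.05, 0.8))  # True
```

This trade-off is why θ appears as a decision variable in (12): tighter local training means fewer global rounds but more local compute per round.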
D. Latency Model
We consider synchronous communication; in other words, all the near-RT-RICs send their local update vectors to the Non-RT-RIC before the t-th round of global aggregation starts. Therefore, before entering this communication round, all the near-RT-RICs must finish their local ML processing. In each global round, the FL tasks span three operations: (i) computation, (ii) communication of local updates to the Non-RT-RIC using the uplink, and (iii) broadcast communication to all the involved near-RT-RICs using the downlink. Let the computation time required for one local round of the m-th near-RT-RIC be T_m^cp, and let there be K_l local iterations in each interval of global communication. Then, the computation time in one global iteration round is K_l · T_m^cp. Let the communication time required to transfer the local update vector from the m-th near-RT-RIC to the Non-RT-RIC be T_m^co in the uplink phase, and let d_m be the data size of the update vector of the m-th trainer. Therefore, the learning time in one global round of FL for the m-th local FL model trainer is:

T_m = K_l · T_m^cp + T_m^co, ∀m ∈ M    (9)
where T_m^co is calculated as:

T_m^co = d_m / (b_m^t · B), ∀m ∈ M    (10)

In the downlink phase, we do not consider the delay because it is negligible compared to the uplink delay, as a result of high-speed downlink communication. Let K be the total number of global rounds needed to attain the global accuracy, as established in (8). Therefore, the total learning time can be modeled as:

T_total = K · T_max = K · max{T_m ; m ∈ M}    (11)
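Equations (9)-(11) can be composed into a small latency calculator; the sketch below uses our own function names and made-up numbers, and simply mirrors the per-trainer round time K_l·T_m^cp + d_m/(b_m·B) and the straggler-dominated total K·max_m T_m:

```python
# Sketch of Eqs. (9)-(11): per-trainer round time and total learning time
# under synchronous rounds, where the slowest selected trainer (the
# straggler) determines T_max.

def round_time(K_l, t_cp, d, b, B):
    """T_m = K_l * T_m^cp + d_m / (b_m * B)."""
    return K_l * t_cp + d / (b * B)

def total_learning_time(K, K_l, t_cp, d, b, B):
    """T_total = K * max_m T_m over all trainers m."""
    per_trainer = [round_time(K_l, t_cp[m], d[m], b[m], B)
                   for m in range(len(t_cp))]
    return K * max(per_trainer)

# Two trainers; the one with the slower CPU sets the pace of every round.
T = total_learning_time(K=10, K_l=5, t_cp=[0.2, 0.4], d=[8, 8],
                        b=[0.5, 0.5], B=16)
print(T)  # 10 * (5*0.4 + 8/(0.5*16)) = 30.0
```

The max in (11) is what motivates deadline-aware trainer selection: dropping one straggler can shorten every global round.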
E. Problem Formulation
Our goal is to jointly minimize the resource cost and the learning time under the constraints of model accuracy and of the available compute and bandwidth resources. This can be done by optimizing the selection of local trainers, i.e., near-RT-RICs, the bandwidth allocation, and the number of local training rounds, as formulated in the optimization model (12).
P: min_{a^t, b^t, θ, K_l} {(1 − ρ) · R_total + ρ · T_total}    (12)
subject to:
0 < θ < 1,    (12a)
Σ_{m=1}^{M} a_m^t · b_m^t · B ≤ B,    (12b)
Σ_{m=1}^{M} b_m^t = 1,    (12c)
b_min ≤ b_m^t ≤ 1, ∀m ∈ M,    (12d)
max_m {c_m · D_m / f_m + T_m^co} = T_max,    (12e)
K = μ · log(1/ε) / (1 − θ),    (12f)
a_m^t ∈ {0, 1}.    (12g)
The objective function (12) has two components balanced by a trade-off parameter ρ, because the two goals are conflicting: the total resource cost R_total = R^cp + R^co, and the FL training time T_total as given by (11). Minimizing the resource cost naturally leads to a higher learning time and vice versa. Constraint (12a) limits the local accuracy level. Constraint (12b) bounds the total bandwidth allocated for the FL tasks. Constraint (12c) presents the definition of b_m^t, i.e., the sum of the bandwidth fractions must be 1. (12d) denotes the boundaries of the fractional bandwidth allocation. Since we have assumed the synchronous communication mode for update vectors in each global round, (12e) imposes this criterion. The relationship between the local accuracy and the number of global rounds is stated in (12f), where μ is a multiplicative factor. (12g) represents the defining domain of the selection decision variable.
III. PROPOSED SOLUTION
(12) is a non-convex optimization problem because of the non-convex objective function and constraints (12e)-(12g). So, we decompose the problem into two sub-problems and then use an iterative solution to reach the optimal solution. We first solve the problem of trainer selection and then use this solution to allocate resources optimally to the selected local trainers. Fig. 2 shows the scheme of the proposed solution.
[Fig. 2 here depicts the decomposition: the original problem (12), minimizing resource cost and FL time subject to optimal client selection and bandwidth allocation, is split into slicing-based and deadline-aware client selection (13), solved using Algorithm 1, and resource allocation for the selected near-RT-RICs (14), solved using the SCA method; the two solutions are then used to implement Algorithm 2.]
Fig. 2. Schematic Diagram of the Proposed Solution
Due to the variation in traffic patterns across the different slicing services of O-RAN, the local FL models might encounter an inconsistency problem. This may lead to a degradation in prediction accuracy. We take this differentiation into account and propose a trainer selection algorithm that respects the formation of slices in O-RAN while maintaining deadline awareness.
A. Local Trainers' Selection
According to the specifications defined by the O-RAN Alliance, the collected RAN operational data can be separated based on slice-user groups. Each near-RT-RIC is then fed with slice-specific network data. The selection of a near-RT-RIC corresponding to a slice must be incorporated in each iteration of the gradient descent training of the model. However, not all the local models can be accommodated in each iteration, because of the deadline constraint (13a) and the limited computational and bandwidth resources assigned to this learning task. So, we propose Algorithm 1 for this selection, in alignment with the O-RAN slice definition. In this algorithm, we categorize the set of near-RT-RICs into three classes corresponding to the eMBB, uRLLC, and mMTC slicing services.
Our objective in this trainer selection algorithm is to maximize the number of near-RT-RICs participating in each global round and allow the Non-RT-RIC to aggregate all received data. This is based on the idea that a larger fraction of trainers in each round saves the total time required for a global FL model to attain the desired accuracy performance [13]. Let N (⊆ M) be the set of selected near-RT-RICs, t_round be the deadline for each global round, t_1 be the time elapsed in performing Algorithm 1, and t_agg be the time taken in aggregating the update parameters at the Non-RT-RIC.
Algorithm 1: Deadline-aware and Slicing-based Local Trainers' Selection
1: Input: M: set of all near-RT-RICs
2: Initialize N_u, N_e, N_m = ∅
3: for t_round^i defined for each i ∈ {N_u, N_e, N_m} do
4:   while |N| > 0 do
5:     x ← arg min_{n∈N} (1/2) · (t_n^{k−1} + α · t_n^k(estimated))
6:     t ← t_1 + t_agg + t_n^k
7:     N ← N \ {x}
8:     if t < t_round^i then
9:       t ← t + t_n^k
10:    end if
11:  end while
12: end for
13: Output: N = N_u ∪ N_e ∪ N_m
Therefore, the
mathematical optimization problem for the trainer selection becomes:

max_N |N|    (13)
s.t. t_1 + t_agg + (1/2) · (t_n^{k−1} + α · t_n^k) ≤ t_round.    (13a)
(13) is a combinatorial optimization problem, which makes it non-trivial. So, we employ a greedy heuristic to solve it, as shown in Algorithm 1. We repeat the steps in each global round until we reach the desired accuracy. Here, constraint (13a) prevents the violation of the deadline for every near-RT-RIC in each global round. The deadline is assigned separately for each slice-user group, while the total deadline in each round is varied experimentally to observe its impact on the overall learning time of the FL model.
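A simplified reading of this greedy heuristic is sketched below. This is our own interpretation, not the authors' code: for one slice, trainers are admitted in increasing order of their estimated round time (1/2)(t_n^{k−1} + α·t_n^k) while the deadline budget t_round, net of the selection and aggregation overheads t_1 and t_agg, is not exceeded. The names `select_trainers`, `t_prev`, and `t_est` are ours:

```python
# Simplified sketch of the greedy deadline-aware selection for one slice.
# candidates maps a trainer name to (t_prev, t_est): its previous-round
# time and its estimated time for the current round k.

def select_trainers(candidates, t_round, t1, t_agg, alpha=0.5):
    """Greedily admit trainers by smallest score until the deadline binds."""
    score = lambda item: 0.5 * (item[1][0] + alpha * item[1][1])
    selected, elapsed = [], t1 + t_agg
    for name, (t_prev, t_est) in sorted(candidates.items(), key=score):
        est = 0.5 * (t_prev + alpha * t_est)
        if elapsed + est <= t_round:
            selected.append(name)
            elapsed += est
    return selected

# Trainer "b" is too slow to fit under the round deadline and is dropped.
picked = select_trainers({"a": (1.0, 1.0), "b": (4.0, 4.0)},
                         t_round=3.0, t1=0.5, t_agg=0.5, alpha=1.0)
print(picked)  # ['a']
```

Running this once per slice class (uRLLC, eMBB, mMTC) and taking the union matches the structure of Algorithm 1's output N = N_u ∪ N_e ∪ N_m.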
B. Resource Allocation
From the trainer selection phase, we obtain a^t, i.e., a binary-valued vector of the trainers selected in the k-th global round. The next phase is to allocate the compute and bandwidth resources to support the local training, parameter uploading, model aggregation, and broadcast of the updated model weights. For this, we solve the optimization problem (12) with the variable a^t known. Still, (12) is a non-convex optimization problem, an exact solution of which is infeasible using traditional methods. Therefore, we employ an approximation approach with equivalent surrogate functions. The multiplicative factor μ in (12f) is chosen such that the whole numerator equals 1. (12d) is replaced by an inequality preserving the same lower bound on the T_max value. With these changes, and substituting the defining expressions, the optimization problem (12) reduces to the following mathematical form:
P1: min_{b^t, θ, K_l} {(1 − ρ) · [Σ_{m=1}^{M} Σ_{t=1}^{T} a_m^t · b_m^t · B · p_tr + K_l · Σ_{m=1}^{M} Σ_{t=1}^{T} a_m^t · (D_m · c_m / f_m) · p_c] + ρ · (1/(1 − θ)) · K_l · T_max}    (14)
subject to: (12a), (12b), (12c), (12d), and (12e)
The number of local iterations (K_l) in each global round is determined experimentally as required to attain the local accuracy value θ. We solve this problem using the Successive Convex Approximation (SCA) method.
C. Federated Training in O-RAN RICs (O-RANFed)
Using the solutions of the trainer selection and of the resource allocation in (14), we train the FL model as described in Algorithm 2. In each global round, a subset of participating local trainers is selected first, followed by resource allocation, and then the interaction of the local FL models with the global FL model. This loop continues for K iterations, which is the maximum number of global rounds required to attain the prescribed accuracy of the model.
Algorithm 2: O-RANFed
1: Initialize: untrained local model at each near-RT-RIC i ∈ M;
2: for k ≤ K(ε) do
3:   The Non-RT-RIC uses Alg. 1 for client selection;
4:   Compute and bandwidth resources are assigned to the selected near-RT-RICs (N);
5:   Each near-RT-RIC trains using its local data until it achieves an accuracy θ and obtains g_{i,k};
6:   The model update parameters are sent by the edge clouds to the Non-RT-RIC;
7:   The Non-RT-RIC aggregates the local weights through (2);
8:   The Non-RT-RIC broadcasts the aggregated parameters;
9:   The Non-RT-RIC calculates the global accuracy attained via (6);
10: end for
11: The trained model is finally sent to the SMO for deployment
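The control flow of Algorithm 2 can be expressed as a short skeleton. This is a structural sketch only, under the assumption that client selection (Algorithm 1), SCA-based resource allocation, local training, and aggregation per (2) are supplied as callables; the function name `oranfed` and the callable signatures are our own:

```python
# High-level skeleton of the O-RANFed loop (Algorithm 2). Each stage is
# injected as a function so the skeleton stays independent of any
# particular selection, allocation, or training implementation.

def oranfed(init_model, K, select, allocate, train_local, aggregate):
    g = init_model
    for k in range(K):                      # K global rounds, from Eq. (8)
        chosen = select(k)                  # Algorithm 1: pick near-RT-RICs
        budget = allocate(chosen)           # SCA-based bandwidth/compute
        local_models = [train_local(m, g, budget[m]) for m in chosen]
        g = aggregate(local_models, chosen) # Eq. (2) weighted average
    return g                                # handed to the SMO for deployment
```

With stub callables (e.g. each local step adds 1 to a scalar model and aggregation averages), three rounds starting from 0 return 3.0, which confirms the select/allocate/train/aggregate ordering of the loop.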
D. Complexity Analysis
O-RANFed consists of trainer selection in step 3 and assignment of resources in step 4. Steps 5 to 9 train the FL model iteratively. So, its complexity can be analysed in two parts. In the first part, (13) is solved using Algorithm 1, which has a time complexity of O(L), where L is the cardinality of the set M. In the second part, (14) is solved using the SCA approach, which has complexity O(J_SCA) [14], where J_SCA is the total number of iterations within the SCA algorithm.
IV. NUMERICAL RESULTS
TABLE I
SIMULATION SETTINGS
Parameter | Value           | Parameter | Value
N         | 50              | B         | 1 MHz
c_m       | 15 cycles/bit   | f_m       | U(1, 1.6) GHz
p_tr      | 1               | p_c       | 1
D_m       | U(5, 10) MB     | d         | 1
b_min     | 0.1             | ρ         | (0, 1)
Federated Learning Task: We trained a prediction model in which each near-RT-RIC processes time series data containing the volume of traffic requested by its corresponding slice over a period of one month. The dataset represents hourly operational data of lower-level network traffic. Using this dataset, the trained model predicts the amount of traffic required in the next hour. We used a Long Short-Term Memory (LSTM) based neural network with 4 layers to train this regression model. We ran this training on an Intel(R) Core(TM) i5-8265U CPU. The model attains approximately 96.3% accuracy in the centralized ML setting. Therefore, the target global accuracy of the FL model is taken as 0.96.
Wireless Network: In order to compare our proposed model with state-of-the-art FL methods, we considered a compatible wireless network setting, as described in Table I. For simplicity, all near-RT-RICs have the same data processing rate c_m. We used a uniform random distribution to assign the CPU frequency (f_m) of each host. The maximum bandwidth capacity (B) is 1 MHz, whereas the minimum allocation ratio (b_min) is kept at 0.1. To better present the results, and without loss of generality, the communication and compute costs (p_tr, p_c) are set to unit value. The local dataset size is distributed uniformly in the range of (5, 10) MB. As a benchmark, we consider the FedAvg [13] algorithm with a fixed number of clients (N = 50), which is the maximum number of near-RT-RICs in each global round. Another prominent FL method is FedProx [15], which uses a probability distribution to select the number of clients in each global round. These two methods are suitable for comparative analysis, as the former sets the upper limit on client selection while the latter follows a variable selection policy that differs from our proposed O-RANFed.
[Figs. 3-6 (plots): Fig. 3. Trainer selection pattern; Fig. 4. Accuracy convergence; Fig. 5. Resource cost comparison; Fig. 6. Learning time cost.]
Fig. 3 presents a comparison of the three FL approaches in terms of the number of clients selected in each global round with respect to the total learning time elapsed in the training process. FedAvg serves as the baseline,
keeping a constant value. O-RANFed gradually attains the maximum number of clients as time progresses, which shows its efficiency over FedProx. Then, we compare the accuracy achieved after each global round by each of the FL methods. In Fig. 4, O-RANFed takes significantly fewer global rounds to achieve the same accuracy compared to the two other methods. This helps O-RANFed save FL time as well as resources.
In terms of the objective costs (resource and time), O-RANFed performs better than FedAvg and FedProx. Figs. 5 and 6 show the learning time and the resources consumed by each method. In these figures, the behaviour of the FL methods is plotted against the Pareto coefficient (ρ). We can see that the learning time of O-RANFed is the lowest, and it is much lower for higher values of ρ. Moreover, the resource cost of O-RANFed is the lowest, and it is lower for smaller values of ρ.
V. CONCLUSION
In this paper, we proposed a federated learning method designed for the O-RAN slicing environment. Our model takes into account the importance of slice-specific local trainers as well as the resource allocation for performing FL tasks. The simulation results show that FL can be implemented to predict the data traffic of different slices in O-RAN. Our proposed model outperforms state-of-the-art FL methods in terms of learning time and resource cost in the simulations. Therefore, it can be deployed in the control loops of O-RAN to guarantee the slice QoS. In future work, we will investigate the location of the distributed data collection points to improve O-RANFed in a highly distributed environment.
ACKNOWLEDGMENT
The authors thank Mitacs, Ciena, and ENCQOR for funding
this research under the IT13947 grant.
REFERENCES
[1] S. Abdulrahman et al., "A survey on federated learning: The journey from centralized to distributed on-site learning and beyond," IEEE IoT Journal, vol. 8, no. 7, pp. 5476–5497, 2021.
[2] Z. Zhao et al., "Federated-learning-enabled intelligent fog radio access networks: Fundamental theory, key techniques, and future trends," IEEE MCW, vol. 27, no. 2, pp. 22–28, 2020.
[3] S. K. Singh et al., "The evolution of radio access network towards open-RAN: Challenges and opportunities," in IEEE WCNCW, 2020, pp. 1–6.
[4] O-RAN Alliance, "O-RAN-WG1.OAM-Architecture-v02.00," 2019.
[5] H. Lee et al., "Hosting AI/ML workflows on O-RAN RIC platform," in 2020 IEEE GC Wkshps, 2020, pp. 1–6.
[6] H. H. Yang et al., "Scheduling policies for federated learning in wireless networks," IEEE TCOMM, vol. 68, no. 1, pp. 317–333, 2020.
[7] W. Shi et al., "Joint device scheduling and resource allocation for latency constrained wireless federated learning," IEEE TWC, vol. 20, no. 1, pp. 453–467, 2021.
[8] C. T. Dinh et al., "Federated learning over wireless networks: Convergence analysis and resource allocation," IEEE/ACM TNET, vol. 29, no. 1, pp. 398–409, 2021.
[9] J. Konečný et al., "Federated optimization: Distributed machine learning for on-device intelligence," preprint arXiv:1610.02527, 2016.
[10] Z. Yang et al., "Energy efficient federated learning over wireless communication networks," IEEE TWC, vol. 20, no. 3, pp. 1935–1949, 2021.
[11] S. Wang et al., "Adaptive federated learning in resource constrained edge computing systems," IEEE JSAC, vol. 37, no. 6, pp. 1205–1221, 2019.
[12] M. Chen et al., "A joint learning and communications framework for federated learning over wireless networks," IEEE TWC, vol. 20, no. 1, pp. 269–283, 2021.
[13] B. McMahan et al., "Communication-efficient learning of deep networks from decentralized data," in Artificial Intelligence and Statistics. PMLR, 2017, pp. 1273–1282.
[14] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge University Press, 2004.
[15] T. Li et al., "Federated optimization in heterogeneous networks," preprint arXiv:1812.06127, 2018.