Article · PDF available

Abstract

Fueled by the availability of more data and computing power, recent breakthroughs in cloud-based machine learning (ML) have transformed every aspect of our lives, from face recognition and medical diagnosis to natural language processing. However, classical ML exerts severe demands in terms of energy, memory and computing resources, limiting its adoption by resource-constrained edge devices. The new breed of intelligent devices and high-stake applications (drones, augmented/virtual reality, autonomous systems, etc.) requires a novel paradigm shift calling for distributed, low-latency and reliable ML at the wireless network edge (referred to as edge ML). In edge ML, training data is unevenly distributed over a large number of edge nodes, each of which has access to only a tiny fraction of the data. Moreover, training and inference are carried out collectively over wireless links, where edge devices communicate and exchange their learned models (not their private data). In a first exposition of its kind, this article explores key building blocks of edge ML, different neural network architectural splits and their inherent tradeoffs, as well as theoretical and technical enablers stemming from a wide range of mathematical disciplines. Finally, several case studies pertaining to various high-stake applications are presented, demonstrating the effectiveness of edge ML in unlocking the full potential of 5G and beyond.
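To make the model-exchange idea in the abstract concrete, the following is a minimal NumPy sketch of federated-averaging-style edge ML over synthetic, unevenly sized local datasets. The linear-regression task, the unweighted averaging rule and all names (local_sgd, federated_round) are illustrative assumptions, not the article's own algorithm.

```python
# Minimal sketch of model-exchange-based edge ML (federated-averaging style).
# The linear-regression task and unweighted averaging are illustrative
# assumptions, not taken from the article.
import numpy as np

rng = np.random.default_rng(0)

def local_sgd(w, X, y, lr=0.1, epochs=5):
    """Run a few epochs of gradient descent on one device's private data."""
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)   # MSE gradient
        w = w - lr * grad
    return w

def federated_round(w_global, devices):
    """Each device trains locally; only model weights are exchanged and averaged."""
    local_models = [local_sgd(w_global.copy(), X, y) for X, y in devices]
    return np.mean(local_models, axis=0)        # simple (unweighted) averaging

# Synthetic, unevenly sized local datasets (raw data never leaves a device).
true_w = np.array([1.0, -2.0, 0.5])
devices = []
for n in (20, 5, 50):                            # uneven data across edge nodes
    X = rng.normal(size=(n, 3))
    y = X @ true_w + 0.1 * rng.normal(size=n)
    devices.append((X, y))

w = np.zeros(3)
for t in range(30):                              # communication rounds
    w = federated_round(w, devices)
print("learned weights:", np.round(w, 2))
```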
... The latter is usually referred to as multi-agent decentralized learning (MADL) [4], [5]. ...
... For example, the maximum accuracy level is achieved within just 18 communication rounds. This is intuitively expected, since the agent placement is a very desirable setting in terms of information spread: the relatively more informative agent (i.e., agent 1) is at the center of the star and the less informative agents (i.e., agents 2–6) are at the edge nodes. On the other hand, the results show that even a slight change in the agent placement can significantly aggravate the convergence properties; see the plot for NegM, in which agents 1 and 2 both lie at the edge of the star graph. ...
... Since agents have access only to their own distributions, the proposed formulations allow agents to sample from P_j instead of P_X [13]. A strongly connected graph is a directed graph in which paths exist in both directions connecting any two distinct vertices of the graph. ...
Article
Full-text available
Multi-agent Decentralized Learning (MADL) is a scalable approach that enables agents to learn from their local datasets. However, it presents significant challenges related to the impact of dataset heterogeneity and the communication graph structure on learning speed, as well as the lack of a robust method for quantifying prediction uncertainty. To address these challenges, we propose BayGO, a novel fully decentralized framework that combines multi-agent local Bayesian learning with local averaging (usually referred to as non-Bayesian social learning) and graph optimization. Within BayGO, agents locally learn a posterior distribution over the model parameters, updating it using their own datasets and sharing this information with their neighbors. We derive an aggregation rule for combining received posterior distributions to achieve optimality and consensus. Moreover, we theoretically derive the convergence rate of agents' posterior distributions. This convergence rate accounts for both the network structure and the information heterogeneity among agents. To expedite learning, agents employ the derived convergence rate as an objective, optimizing it with respect to the network structure alternately with their posterior distributions. As a consequence, agents can fine-tune their network connections according to the information content of their neighbors. This leads to a sparse graph configuration in which each agent communicates exclusively with the neighbor that offers the highest information gain, enhancing communication efficiency. Our simulations corroborate that the BayGO framework accelerates learning compared to fully-connected and star topologies owing to its capacity for selecting neighbors based on information gain.
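The non-Bayesian social learning update mentioned in this abstract (a local Bayesian update combined with averaging of neighbors' beliefs) can be sketched for a finite hypothesis set as follows. The ring topology, mixing matrix and Bernoulli likelihoods are assumptions made for illustration; this is not BayGO's derived aggregation rule or its graph-optimization step.

```python
# Sketch of non-Bayesian social learning over a finite hypothesis set:
# each agent mixes its neighbors' beliefs log-linearly, then applies a local
# Bayesian update with its own data. Mixing weights, likelihoods and the
# two-hypothesis setup are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(1)
hypotheses = [0.3, 0.7]          # candidate Bernoulli parameters
true_theta = 0.7

def likelihood(x, theta):
    return theta if x == 1 else 1.0 - theta

def social_update(beliefs, weights, x_local):
    """Log-linear pooling of neighbor beliefs followed by a Bayesian update."""
    pooled = np.exp(weights @ np.log(beliefs))          # geometric averaging
    lik = np.array([likelihood(x_local, th) for th in hypotheses])
    post = pooled * lik
    return post / post.sum()

n_agents = 4
# Doubly-stochastic mixing matrix for a ring of 4 agents (assumed topology).
W = np.array([[0.5, 0.25, 0.0, 0.25],
              [0.25, 0.5, 0.25, 0.0],
              [0.0, 0.25, 0.5, 0.25],
              [0.25, 0.0, 0.25, 0.5]])
beliefs = np.full((n_agents, len(hypotheses)), 0.5)

for t in range(200):
    data = rng.binomial(1, true_theta, size=n_agents)   # one local sample each
    beliefs = np.array([social_update(beliefs, W[i], data[i])
                        for i in range(n_agents)])

print("belief in the true hypothesis:", np.round(beliefs[:, 1], 3))
```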
... Federated learning (FL) is an emerging paradigm of distributed machine learning that enables a large number of end users to collaboratively learn a global model without directly accessing their raw data [1]- [4]. Inspired by FL, there is a growing trend of upgrading wireless networks by integrating FL into the mobile system to achieve network intelligence. ...
Conference Paper
Full-text available
THIS PAPER IS ELIGIBLE FOR THE STUDENT PAPER AWARD. Federated Learning (FL) is envisioned as the cornerstone of the next-generation mobile system, whereby integrating FL into the network edge elements (i.e., user terminals and edge/cloud servers), it is expected to unleash the potential of network intelligence by learning from the massive amount of users' data while concurrently preserving privacy. In this paper, we develop an analytical framework that quantifies the interplay of user mobility, a fundamental property of mobile networks, and data heterogeneity, the salient feature of FL, on the model training efficiency. Specifically, we derive the convergence rate of a hierarchical FL system operated in a mobile network, showing how user mobility amplifies the divergence caused by data heterogeneity. The theoretical findings are corroborated by experimental simulations.
... Another benefit of FbFTL is dataset balancing. If the overall dataset is imbalanced, so that samples with certain types of output appear much more frequently than those with other outputs, it is hard for FL and FTL to distinguish this via gradient updates, and such an imbalanced data distribution can significantly degrade FL performance [69], [70]. However, FbFTL, with direct output information, enables techniques such as re-sampling specific classes or merging near-identical classes to mitigate dataset imbalance. ...
Article
Full-text available
In this paper, we propose feature-based federated transfer learning as a novel approach to improve communication efficiency by reducing the uplink payload by multiple orders of magnitude compared to that of existing approaches in federated learning and federated transfer learning. Specifically, in the proposed feature-based federated learning, we design the extracted features and outputs to be uploaded instead of parameter updates. For this distributed learning model, we determine the required payload and provide comparisons with the existing schemes. Subsequently, we analyze the robustness of feature-based federated transfer learning against packet loss, data insufficiency, and quantization. Finally, we address privacy considerations by defining and analyzing label privacy leakage and feature privacy leakage, and investigating mitigating approaches. For all aforementioned analyses, we evaluate the performance of the proposed learning scheme via experiments on an image classification task and a natural language processing task to demonstrate its effectiveness.
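A hedged sketch of the feature-based upload idea described above: clients push extracted features and labels once, and the server trains only the task head on the pooled features. The random-projection extractor, the softmax head and the synthetic labeling rule are simplified stand-ins, not the architecture used in the paper.

```python
# Sketch of feature-based uplink: each client uploads extracted features and
# labels (not gradients); the server fits the task head on the pooled features.
import numpy as np

rng = np.random.default_rng(2)
d_in, d_feat, n_classes = 20, 8, 3

# Frozen, pre-trained feature extractor shared by all clients (assumed).
W_frozen = rng.normal(size=(d_in, d_feat))
def extract(X):
    return np.tanh(X @ W_frozen)

def client_payload(X, y):
    """Uplink payload: features and labels only, sent once per sample."""
    return extract(X), y

def train_head(feats, labels, lr=0.5, steps=300):
    """Server side: train a softmax head on pooled features by gradient descent."""
    W = np.zeros((d_feat, n_classes))
    Y = np.eye(n_classes)[labels]                   # one-hot labels
    for _ in range(steps):
        logits = feats @ W
        p = np.exp(logits - logits.max(axis=1, keepdims=True))
        p /= p.sum(axis=1, keepdims=True)
        W -= lr * feats.T @ (p - Y) / len(labels)   # cross-entropy gradient
    return W

# Synthetic clients with local data and a hidden labeling rule (toy setup).
M_true = rng.normal(size=(d_in, n_classes))
payloads = []
for _ in range(5):
    X = rng.normal(size=(40, d_in))
    y = np.argmax(X @ M_true, axis=1)
    payloads.append(client_payload(X, y))

feats = np.vstack([f for f, _ in payloads])
labels = np.concatenate([y for _, y in payloads])
W_head = train_head(feats, labels)
acc = np.mean(np.argmax(feats @ W_head, axis=1) == labels)
print("training accuracy on pooled features:", round(acc, 2))
```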
Article
Full-text available
With advancements in distributed autonomous systems (e.g., vehicles, sensors, and robots) in the 5G/6G era, sidelink communication technology has evolved as a distributed communication system in the third-generation partnership project (3GPP). However, the current sidelink communication design, focusing on information dissemination or low-rate point-to-point communication, is not suited to the rapid development of such autonomous systems. Instead, developing sidelink-based distributed wireless personal area networks (WPANs) with a drastically higher rate for transmitting user data is essential. The overarching goal of this study is to explore the possibility of sidelink communication evolution to 1) form a distributed and autonomous WPAN and 2) support millimeter wave (mmWave) bands. Our core idea is to merge several design concepts of the preceding mmWave WPAN standards, i.e., IEEE 802.15.3c/11ad, into sidelink communications, thereby bridging the gap between the two separate systems. This paper presents the anatomy of the IEEE 802.15.3c/11ad system with a focus on the formation of mmWave WPANs among distributed nodes and their operation. In addition, the current status of sidelink communication system design is highlighted, along with the missing building blocks required to develop 3GPP sidelink-based mmWave WPAN systems. Simulation results shed light on merging IEEE 802.15.3c/11ad concepts into 3GPP sidelink communication with respect to a control data transmission scheme, which should be designed to enhance robustness and is a crucial step toward subsequent high-rate user data transmission.
Chapter
The rapid resurgence of Artificial Intelligence (AI) has revolutionized nearly every field of science and technology. The ubiquity of smartphones and Internet of Things (IoT) platforms has led to an expectation that most smart applications will run at the edge of cellular networks. This has increased the focus on introducing a digital edge to support AI-friendly applications on cutting-edge platforms. Thus, a new research field named “edge computing” has developed, covering and revolutionizing two areas: wireless networking and computing. The restricted computing capacity and minimal data available at each unit are addressed by distributed learning, accomplished through the Mobile Edge Computing (MEC) interface and the utilization of the vast volume of data spread over a wide range of edge computers. Distributed knowledge processing and connectivity between the edge server and devices are two significant, connected elements of these networks, and their convergence presents a range of new research challenges. This article introduces a set of ideas known as learning-driven wireless networking in advanced computing and provides examples of the feasibility of these concepts and specific research tools.
Article
Multi-access edge computing (MEC) has emerged as a promising computing paradigm to push computing resources and services to the network edge. It allows applications/services to be deployed on edge servers for provisioning low-latency services to nearby users. However, in the MEC environment, edge servers may suffer from failures while the app vendor has to guarantee continuously available services to its users, thereby securing its revenue for the application instances deployed. In this paper, we focus on available service provisioning when cost-effectively deploying application instances on edge servers. We first formulate a novel Availability-aware Revenue-effective Application Deployment (ARAD) problem in the MEC environment with the aim of maximizing the overall revenue by considering both service availability benefit and deployment cost. We prove that the ARAD problem is NP-hard. Then, we propose an approximation algorithm named ARAD-A to find the ARAD solution efficiently with a constant approximation ratio of 1/2. We extensively evaluate the performance of ARAD-A against five representative approaches. Experimental results demonstrate that ARAD-A achieves the best performance in securing the app vendor's overall revenue.
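As a rough illustration of revenue-aware deployment (not the ARAD-A algorithm or its 1/2-approximation guarantee), the toy greedy heuristic below repeatedly adds the edge server whose deployment most increases availability benefit minus cost, under an assumed coverage and independent-failure model.

```python
# Toy greedy heuristic for revenue-aware instance deployment. The coverage,
# failure and revenue models are assumptions for illustration only; this is
# not the ARAD-A approximation algorithm from the paper.
import numpy as np

rng = np.random.default_rng(3)
n_servers, n_users = 6, 30
fail_prob = rng.uniform(0.05, 0.3, size=n_servers)     # server failure rates
deploy_cost = rng.uniform(1.0, 3.0, size=n_servers)
coverage = rng.random((n_servers, n_users)) < 0.4      # server s can serve user u
revenue_per_user = 1.0

def availability_benefit(deployed):
    """Expected revenue: a user is served if at least one covering deployed
    server is up (independent failures assumed)."""
    if not deployed:
        return 0.0
    all_fail = np.prod(
        [np.where(coverage[s], fail_prob[s], 1.0) for s in deployed], axis=0)
    return revenue_per_user * (1.0 - all_fail).sum()

def objective(deployed):
    return availability_benefit(deployed) - sum(deploy_cost[s] for s in deployed)

def greedy_deploy():
    deployed = []
    while True:
        candidates = [s for s in range(n_servers) if s not in deployed]
        if not candidates:
            break
        best = max(candidates, key=lambda s: objective(deployed + [s]))
        if objective(deployed + [best]) <= objective(deployed):
            break                                       # no positive marginal gain
        deployed.append(best)
    return deployed

chosen = greedy_deploy()
print("deployed on servers:", chosen, "objective:", round(objective(chosen), 2))
```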
Article
Full-text available
Rapid progress in wireless communication and networks plays a critical role in applications carrying heterogeneous, ever-increasing traffic in multimedia communication, remote health, and commercial services. The fast developments and continuous upgrades of recent telecommunication networks around the deployment of advanced 5G, which is being rolled out, draw attention to various research challenges and demanding transition requirements from existing 4G to 5G, beyond 5G (B5G) and toward 6G. These network upgrades strongly stipulate the need for system flexibility along with intelligent run-time decisions through the integration of Artificial Intelligence (AI) and Machine Learning (ML)-based techniques. In this paper, various research challenges regarding the roll-out of the advanced 5G network, the need for artificial intelligence in the rapidly emerging next-generation wireless network (NGWN), state-of-the-art AI and ML approaches, target applications, and challenges regarding the incorporation of intelligence in the network are investigated in a unified way. The paper highlights contemporary cutting-edge developments in the telecom industry, research trends in network edge intelligence including the integration of federated learning (FL), allied challenges, and open research issues in promising 6G networks.
Article
Full-text available
Establishing and tracking beams in millimeter-wave (mmWave) vehicular communication is a challenging task. Large antenna arrays and narrow beams introduce significant system overhead when configuring the beams using exhaustive beam search. In this paper, we propose to learn the optimal beam pair index by exploiting the locations and types of the receiver vehicle and its neighboring vehicles (situational awareness), leveraging machine learning classification and past beam training data. We formulate mmWave beam selection as a multi-class classification problem based on hand-crafted features that capture the situational awareness in different coordinates. We then provide a comprehensive comparison of different classification models and various levels of situational awareness. Furthermore, we examine several practical issues in the implementation: localization is susceptible to inaccuracy; situational awareness at the base station (BS) can be outdated due to vehicle mobility and limited location reporting frequencies; and the situational awareness may be incomplete since vehicles could be invisible to the BS if they are not connected. To demonstrate the scalability of the proposed beam selection solution in the large antenna array regime, we propose two solutions that recommend multiple beams and exploit an extra phase of beam sweeping among the recommended beams. The numerical results show that situational awareness-assisted beam selection using machine learning is able to provide beam prediction with accuracy that increases with more complete knowledge of the environment.
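A minimal sketch of casting beam-pair selection as multi-class classification on situational-awareness features, including the top-k beam recommendation idea. The synthetic geometry, toy ground-truth beam labels and the random-forest model are assumptions; the paper compares several classifiers and feature sets.

```python
# Sketch of beam-pair selection as multi-class classification from
# situational-awareness features (receiver and neighbor positions).
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(4)
n_samples, n_beams, n_neighbors = 2000, 16, 3

# Features: receiver (x, y) plus neighbor (x, y) positions, flattened.
rx = rng.uniform(-50, 50, size=(n_samples, 2))
neighbors = rx[:, None, :] + rng.normal(0, 5, size=(n_samples, n_neighbors, 2))
X = np.hstack([rx, neighbors.reshape(n_samples, -1)])

# Synthetic "optimal beam" label: quantized angle toward an assumed base
# station at the origin (toy ground truth; real labels come from beam training).
angle = np.arctan2(rx[:, 1], rx[:, 0])
y = ((angle + np.pi) / (2 * np.pi) * n_beams).astype(int) % n_beams

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X[:1500], y[:1500])

# Top-3 beam recommendation: sweep only the most likely beams.
proba = clf.predict_proba(X[1500:])
top3 = np.argsort(proba, axis=1)[:, -3:]
hit = np.mean([y[1500 + i] in clf.classes_[top3[i]] for i in range(len(top3))])
print("top-3 beam hit rate:", round(hit, 2))
```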
Article
Full-text available
When data is distributed across multiple servers, lowering the communication cost between the servers (or workers) while solving the distributed learning problem is an important problem and is the focus of this paper. In particular, we propose a fast and communication-efficient decentralized framework to solve the distributed machine learning (DML) problem. The proposed algorithm, Group Alternating Direction Method of Multipliers (GADMM), is based on the Alternating Direction Method of Multipliers (ADMM) framework. The key novelty of GADMM is that it solves the problem in a decentralized topology where at most half of the workers compete for the limited communication resources at any given time. Moreover, each worker exchanges its locally trained model only with two neighboring workers, thereby training a global model with a lower communication overhead in each exchange. We prove that GADMM converges to the optimal solution for convex loss functions, and numerically show that it converges faster and is more communication-efficient than state-of-the-art communication-efficient algorithms such as the Lazily Aggregated Gradient (LAG) and dual averaging, in linear and logistic regression tasks on synthetic and real datasets. Furthermore, we propose Dynamic GADMM (D-GADMM), a variant of GADMM, and prove its convergence under a time-varying network topology of the workers.
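The alternating head/tail structure of GADMM can be sketched for quadratic local losses on a chain of workers, where each primal update has a closed form and only two neighbors are ever contacted. The losses, penalty parameter and worker count below are illustrative assumptions.

```python
# Sketch of a GADMM-style update for a chain of workers with quadratic local
# losses f_n(θ) = 0.5 * ||A_n θ - b_n||²: one group of alternate workers
# updates first, then the other group, then the duals on each chain link.
import numpy as np

rng = np.random.default_rng(5)
N, d, rho = 6, 3, 1.0
true_theta = np.array([1.0, -1.0, 2.0])

A = [rng.normal(size=(10, d)) for _ in range(N)]
b = [A[n] @ true_theta + 0.05 * rng.normal(size=10) for n in range(N)]
theta = [np.zeros(d) for _ in range(N)]
lam = [np.zeros(d) for _ in range(N - 1)]        # one dual per chain link

def primal_update(n):
    """Closed-form minimizer of worker n's augmented Lagrangian term."""
    rhs = A[n].T @ b[n]
    links = 0
    if n > 0:                                    # link with left neighbor
        rhs += lam[n - 1] + rho * theta[n - 1]
        links += 1
    if n < N - 1:                                # link with right neighbor
        rhs += -lam[n] + rho * theta[n + 1]
        links += 1
    H = A[n].T @ A[n] + links * rho * np.eye(d)
    return np.linalg.solve(H, rhs)

for it in range(100):
    for n in range(0, N, 2):                     # "head" group updates first
        theta[n] = primal_update(n)
    for n in range(1, N, 2):                     # then the "tail" group
        theta[n] = primal_update(n)
    for n in range(N - 1):                       # dual ascent on each link
        lam[n] = lam[n] + rho * (theta[n] - theta[n + 1])

print("worker 0 estimate:", np.round(theta[0], 2))
```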
Article
Full-text available
Designing distributed, fast and reliable wireless consensus protocols is instrumental in enabling mission-critical decentralized systems, such as robotic networks in the industrial Internet of Things (IIoT), drone swarms in rescue missions, and so forth. However, achieving both low latency and reliability in consensus protocols is a challenging task. The problem is aggravated under wireless connectivity, which may be slower and less reliable than wired connections. To tackle this issue, we investigate fundamental relationships between consensus latency and reliability through the lens of wireless connectivity, and co-design communication and consensus protocols for low-latency and reliable decentralized systems. Specifically, we propose a novel communication-efficient distributed consensus protocol, termed Random Representative Consensus (R2C), and show its effectiveness under gossip and broadcast communication protocols. To this end, we derive a closed-form end-to-end (E2E) latency expression for R2C that guarantees a target reliability, and compare it with a baseline consensus protocol, referred to as Referendum Consensus (RC). The results show that R2C is faster than RC, and is more reliable when co-designed with the broadcast protocol than with the gossip protocol.
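To illustrate the idea of sampling representatives instead of polling every node, here is a toy Monte-Carlo comparison of a full referendum against k random representatives over erasure-prone broadcasts. The vote model, erasure probability and majority rule are assumptions for illustration and do not reproduce the R2C protocol or its closed-form E2E latency analysis.

```python
# Toy Monte-Carlo comparison of a full "referendum" vote against a
# random-representative vote over unreliable broadcast links.
import numpy as np

rng = np.random.default_rng(6)
n_nodes, n_yes = 31, 20            # ground-truth majority is "yes"
p_erasure = 0.2                    # probability a node's broadcast is lost
votes = np.array([1] * n_yes + [0] * (n_nodes - n_yes))

def decide(sampled_votes):
    """Majority decision among the votes that were actually received."""
    received = sampled_votes[rng.random(len(sampled_votes)) > p_erasure]
    return int(received.sum() * 2 > len(received)) if len(received) else 0

def reliability(k, trials=20000):
    """Fraction of trials in which k random representatives reach the same
    decision as the true (erasure-free) majority."""
    truth = int(votes.sum() * 2 > len(votes))
    hits = sum(decide(rng.choice(votes, size=k, replace=False)) == truth
               for _ in range(trials))
    return hits / trials

for k in (5, 11, 31):              # k = 31 corresponds to a full referendum
    print(f"representatives={k:2d}  reliability={reliability(k):.3f}")
```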
Conference Paper
This paper presents a new class of gradient methods for distributed machine learning that adaptively skip gradient calculations to learn with reduced communication and computation. Simple rules are designed to detect slowly-varying gradients and, therefore, trigger the reuse of outdated gradients. The resultant gradient-based algorithms are termed Lazily Aggregated Gradient, justifying the acronym LAG used henceforth. Theoretically, the merits of this contribution are: i) the convergence rate is the same as batch gradient descent in strongly convex, convex, and nonconvex cases; and ii) if the distributed datasets are heterogeneous (quantified by certain measurable constants), the communication rounds needed to achieve a targeted accuracy are reduced thanks to the adaptive reuse of lagged gradients. Numerical experiments on both synthetic and real data corroborate a significant communication reduction compared to alternatives.
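A minimal sketch of the lazy-aggregation idea: a worker re-uploads its gradient only when it has changed enough since its last upload, and the server otherwise reuses the stale copy. The quadratic losses and the simple norm-based trigger below are simplifications of LAG's actual communication-skipping condition.

```python
# Sketch of lazily aggregated gradients with a simplified skipping rule.
import numpy as np

rng = np.random.default_rng(7)
M, d, lr, thresh = 5, 4, 0.05, 1e-3
A = [rng.normal(size=(20, d)) for _ in range(M)]
b = [A[m] @ np.ones(d) for m in range(M)]

def grad(m, theta):
    return A[m].T @ (A[m] @ theta - b[m]) / len(b[m])

theta = np.zeros(d)
stale = [grad(m, theta) for m in range(M)]       # last uploaded gradients
uploads = 0

for k in range(200):
    for m in range(M):
        g = grad(m, theta)
        # Communicate only if the gradient moved noticeably since last upload.
        if np.linalg.norm(g - stale[m]) ** 2 >= thresh:
            stale[m] = g
            uploads += 1
    theta = theta - lr * np.sum(stale, axis=0)   # server sums (possibly stale) gradients

print("uploads used:", uploads, "of", 200 * M)
print("solution:", np.round(theta, 2))
```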
Article
Conditional Value at Risk (CVaR) is a prominent risk measure that is used extensively in various domains. We develop a new formula for the gradient of the CVaR in the form of a conditional expectation. Based on this formula, we propose a novel sampling-based estimator for the gradient of the CVaR, in the spirit of the likelihood-ratio method. We analyze the bias of the estimator and prove the convergence of a corresponding stochastic gradient descent algorithm to a local CVaR optimum. Our method makes it possible to consider CVaR optimization in new domains. As an example, we consider a reinforcement learning application and learn a risk-sensitive controller for the game of Tetris.
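The conditional-expectation form of the CVaR gradient lends itself to a simple sampling-based SGD loop, sketched below for a toy newsvendor-style loss. Note that the paper's estimator is a likelihood-ratio one for parameters of the sampling distribution, so this illustrates the general idea rather than that specific estimator.

```python
# Sketch of sampling-based CVaR minimization by SGD: per step, estimate the
# CVaR gradient as the average loss-gradient over the worst (1 - alpha)
# fraction of sampled scenarios (conditional-expectation form).
import numpy as np

rng = np.random.default_rng(8)
alpha, lr, n_samples = 0.9, 0.05, 2000

def loss(theta, demand):
    """Convex piecewise-linear cost: overstock vs. understock (toy example)."""
    return 1.0 * np.maximum(theta - demand, 0) + 4.0 * np.maximum(demand - theta, 0)

def loss_grad(theta, demand):
    return np.where(theta >= demand, 1.0, -4.0)

theta = 0.0
for step in range(300):
    demand = rng.normal(10.0, 2.0, size=n_samples)
    ell = loss(theta, demand)
    var = np.quantile(ell, alpha)                 # empirical VaR at level alpha
    tail = demand[ell >= var]                     # worst-case scenarios
    grad_cvar = loss_grad(theta, tail).mean()     # E[grad of loss | loss >= VaR]
    theta -= lr * grad_cvar

print("CVaR-optimal order quantity:", round(theta, 2))
```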
Conference Paper
This paper presents Tofu, a system that partitions very large DNN models across multiple GPU devices to reduce per-GPU memory footprint. Tofu is designed to partition a dataflow graph of fine-grained tensor operators used by platforms like MXNet and TensorFlow. In order to automatically partition each operator, we propose to describe the semantics of an operator in a simple language inspired by Halide. To optimally partition different operators in a dataflow graph, Tofu uses a recursive search algorithm that minimizes the total communication cost. Our experiments on an 8-GPU machine show that Tofu enables the training of very large CNN and RNN models. It also achieves 25% - 400% speedup over alternative approaches to train very large models.
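A toy sketch of searching per-operator partition choices to minimize total communication, here by memoized recursion over a chain of operators. The partition schemes, tensor sizes and re-partitioning cost model are invented for illustration; Tofu itself operates on full dataflow graphs using operator semantics described in a Halide-inspired language.

```python
# Toy dynamic-programming search over partition choices along a chain of
# operators, minimizing total re-partitioning (communication) cost.
from functools import lru_cache

schemes = ("row", "col", "replicate")
tensor_sizes = [4096, 4096, 1024, 1024, 256]   # output size per operator (assumed)

def repartition_cost(prev, nxt, size):
    """Elements moved when the next operator needs a different layout (toy model)."""
    if prev == nxt:
        return 0
    if "replicate" in (prev, nxt):
        return size          # all-gather / scatter of the whole tensor
    return size // 2         # row <-> col: exchange roughly half the tensor

@lru_cache(maxsize=None)
def best(i, prev_scheme):
    """Minimum communication cost for operators i.. given the layout feeding them."""
    if i == len(tensor_sizes):
        return 0, ()
    options = []
    for s in schemes:
        cost = repartition_cost(prev_scheme, s, tensor_sizes[i])
        tail_cost, tail_plan = best(i + 1, s)
        options.append((cost + tail_cost, (s,) + tail_plan))
    return min(options)

total, plan = best(0, "replicate")   # input assumed replicated on all GPUs
print("partition plan:", plan, "total comm cost:", total)
```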
Article
Mission-critical applications require Ultra-Reliable Low Latency (URLLC) wireless connections, where the packet error rate (PER) goes down to 10⁻⁹. Fulfillment of such bold reliability figures becomes meaningful only if it can be related to a statistical model in which the URLLC system operates. However, this model is generally not known and needs to be learned by sampling the wireless environment. In this paper, we treat this fundamental problem in the simplest possible communication-theoretic setting: selecting a transmission rate over a dynamic wireless channel in order to guarantee high transmission reliability. We introduce a novel statistical framework for the design and assessment of URLLC systems, consisting of three key components: (i) channel model selection; (ii) learning the model using training; and (iii) selecting the transmission rate to satisfy the required reliability. As it is insufficient to specify the URLLC requirements only through the PER, two types of statistical constraints are introduced: Averaged Reliability (AR) and Probably Correct Reliability (PCR). The analysis and evaluations show that adequate model selection and learning are indispensable for designing a consistent physical layer that asymptotically behaves as if the channel were known perfectly, while maintaining the reliability requirements of URLLC systems.
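One way to make the "probably correct" flavor of reliability concrete is nonparametric rate selection from channel samples via an order-statistic (binomial) bound, sketched below. The Rayleigh fading model, the rate formula log2(1 + g) and the chosen (eps, delta) targets are assumptions; this is not the paper's full AR/PCR framework.

```python
# Sketch of rate selection with a "probably correct" style target: from n
# i.i.d. channel-gain samples, pick the largest order statistic whose true
# outage probability is below eps with confidence 1 - delta.
import numpy as np
from scipy.stats import binom

rng = np.random.default_rng(9)
n, eps, delta = 500, 1e-2, 0.05
gains = np.sort(rng.exponential(1.0, size=n))     # Rayleigh fading -> exponential gain

# Largest k with P(Binomial(n, eps) >= k) >= 1 - delta, so the outage
# probability of threshold gains[k-1] is <= eps with confidence 1 - delta.
ks = np.arange(1, n + 1)
valid = ks[binom.sf(ks - 1, n, eps) >= 1 - delta]

if len(valid) == 0:
    print("not enough samples for the target (eps, delta)")
else:
    k = valid.max()
    threshold = gains[k - 1]
    rate = np.log2(1 + threshold)                 # rate supported at the threshold
    true_outage = 1 - np.exp(-threshold)          # known only in this toy model
    print(f"k={k}, rate={rate:.4f} b/s/Hz, true outage={true_outage:.2e}")
```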
Article
The traditional role of a communication engineer is to address the technical problem of transporting bits reliably over a noisy channel. With the emergence of 5G, and the availability of a variety of competing and coexisting wireless systems, wireless connectivity is becoming a commodity. This article argues that communication engineers in the post-5G era should extend the scope of their activity, in terms of design objectives and constraints, beyond connectivity to encompass the semantics of the transferred bits within the given applications and use cases. To provide a platform for semantic-aware connectivity solutions, this paper introduces the concept of a semantic-effectiveness (SE) plane as a core part of future communication architectures. The SE plane augments the protocol stack by providing standardized interfaces that enable information filtering and direct control of functionalities at all layers of the protocol stack. The advantages of the SE plane are described in the perspective of recent developments in 5G and illustrated through a number of example applications. The introduction of an SE plane may help replace the current “next-G paradigm” in wireless evolution with a framework based on continuous improvements and extensions of systems and standards.
Article
Federated learning involves training statistical models over remote devices or siloed data centers, such as mobile phones or hospitals, while keeping data localized. Training in heterogeneous and potentially massive networks introduces novel challenges that require a fundamental departure from standard approaches for large-scale machine learning, distributed optimization, and privacy-preserving data analysis. In this article, we discuss the unique characteristics and challenges of federated learning, provide a broad overview of current approaches, and outline several directions of future work that are relevant to a wide range of research communities.