Conference Paper · PDF available

Data-Driven Analytics for Automated Cell Outage Detection in Self-Organizing Networks

Abstract

In this paper, we address the challenge of autonomous cell outage detection (COD) in Self-Organizing Networks (SON). COD is a prerequisite for triggering fully automated self-healing recovery actions following cell outages or network failures. A special case of cell outage, referred to as a Sleeping Cell (SC), remains particularly challenging to detect in state-of-the-art SON, since it triggers no alarms for the Operation and Maintenance (O&M) entity. Consequently, no SON compensation function can be launched unless site visits or drive tests are performed, or complaints are received from affected customers. To address this issue, we present and evaluate a COD framework based on Minimization of Drive Tests (MDT) reports, a functionality recently specified by the Third Generation Partnership Project (3GPP) in Release 10 for LTE networks. Our proposed framework aims to detect cell outages in an autonomous fashion by first pre-processing the MDT measurements using a multidimensional scaling method, and then employing the resulting embedding together with machine learning algorithms to detect and localize anomalous network behaviour. We validate and demonstrate the effectiveness of our proposed solution using data obtained from simulating the network under various operational settings.
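As a rough illustration of the pipeline described in the abstract, the sketch below embeds synthetic MDT-style measurements with multidimensional scaling and then scores each embedded report with a simple k-nearest-neighbour distance detector. The feature layout, detector choice, and threshold are assumptions for illustration, not the authors' implementation.

```python
# Illustrative sketch (not the authors' code): embed synthetic MDT-style
# measurements with multidimensional scaling, then flag anomalous reports
# with a k-NN distance score. Feature names and thresholds are assumptions.
import numpy as np
from sklearn.manifold import MDS
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)

# Synthetic MDT reports: [serving RSRP (dBm), RSRQ (dB), strongest neighbour RSRP (dBm)]
normal = rng.normal(loc=[-85.0, -10.0, -95.0], scale=[4.0, 1.5, 5.0], size=(500, 3))
outage = rng.normal(loc=[-115.0, -19.0, -90.0], scale=[3.0, 1.0, 5.0], size=(20, 3))
reports = np.vstack([normal, outage])

# Pre-process with MDS to obtain a low-dimensional embedding of the reports.
embedding = MDS(n_components=2, random_state=0).fit_transform(reports)

# Score each embedded report by its mean distance to its k nearest neighbours.
k = 10
nn = NearestNeighbors(n_neighbors=k + 1).fit(embedding)
dist, _ = nn.kneighbors(embedding)
score = dist[:, 1:].mean(axis=1)          # drop the self-distance in column 0

threshold = np.percentile(score, 95)       # assumed operating point
flagged = np.where(score > threshold)[0]
print(f"{len(flagged)} reports flagged as anomalous")
```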
... Several works have applied machine learning to analyze Minimization of Drive Tests (MDT) measurement data. Zoha et al. [39] proposed a local-outlier-factor-based detector and a support vector machine based detector to analyze MDT data for autonomous cell outage detection. Onireti et al. [22] applied a k-nearest neighbor (k-NN) and a local outlier factor anomaly detector to analyze MDT data to detect control- and data-plane problems in cells. ...
... The above related work on applications of machine learning to mobile network management can be classified along two dimensions: analysis method and the data used in the analysis. In terms of analysis method, k-NN [7], [22], local-outlier-factor-based detectors [39], support vector machines [39], diffusion mapping [6], deep learning [13], [19], semi-supervised learning [14], and k-means [23] have been used. In terms of data used in the analysis, call data records (CDR) [6], [13], [14], [23], high-dimensional radio signal power and quality data obtained from the network [6], Minimization of Drive Tests (MDT) data [7], [19], [22], [39], and eNodeB data [9] are used. ...
Article
Full-text available
With increasing network topology complexity and the continuous evolution of new wireless technology, it is challenging to address network service outages with traditional methods. In long-term evolution (LTE) networks, a large number of base stations, called eNodeBs, are deployed to cover entire service areas spanning various kinds of geographical regions. Each eNodeB generates a large number of key performance indicators (KPIs). Hundreds of thousands of eNodeBs are typically deployed to cover a nation-wide service area, so operators need to handle hundreds of millions of KPIs. It is impractical to handle such a huge amount of KPI data manually, and automation of data processing is therefore desired. To improve network operation efficiency, a suitable machine learning technique is used to learn and classify individual eNodeBs into different states based on multiple performance metrics during a specific time window. However, an issue with supervised learning is that it requires a large amount of labeled data, and annotating data is costly in human labor and time. To mitigate the cost and time issues, we propose a method based on few-shot learning that uses the Prototypical Networks algorithm to complement the eNodeB state analysis. Using a dataset from a live LTE network consisting of thousands of eNodeBs, our experimental results show that the proposed technique provides high performance while using a small number of labeled data.
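The prototypical-network idea referenced in this abstract can be sketched in a few lines: class prototypes are the means of a handful of labelled support examples, and unlabelled eNodeB windows are assigned to the nearest prototype. The sketch below omits the learned embedding network and uses raw synthetic KPI vectors; all KPI shapes and state names are assumptions.

```python
# Minimal sketch of the prototypical-network idea for eNodeB state
# classification (not the paper's implementation). The learned embedding
# network is omitted; raw synthetic KPI vectors, states, and sizes are assumptions.
import numpy as np

rng = np.random.default_rng(1)
n_kpi = 8                                  # KPIs per eNodeB per time window

def fake_kpis(center, n):                  # synthetic KPI vectors around a state "center"
    return rng.normal(loc=center, scale=0.3, size=(n, n_kpi))

states = ["normal", "degraded", "outage"]
centers = [np.zeros(n_kpi), np.full(n_kpi, 1.0), np.full(n_kpi, 2.5)]

# Few-shot setting: only 5 labelled examples (the "support set") per state.
support = {s: fake_kpis(c, 5) for s, c in zip(states, centers)}
prototypes = np.stack([support[s].mean(axis=0) for s in states])   # one prototype per state

# Classify unlabelled query windows by nearest prototype (Euclidean distance).
queries = fake_kpis(centers[2], 10)        # e.g. windows from an eNodeB in outage
d = np.linalg.norm(queries[:, None, :] - prototypes[None, :, :], axis=2)
predicted = [states[i] for i in d.argmin(axis=1)]
print(predicted)
```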
... The boundary is then used to classify live data as normal or anomalous. In [65] and [68], this is done by using a one-class Support Vector Machine (SVM), which works generally as described in Section VI above. In the specific case of the one-class SVM, the budget is used to allow a small number of outliers in the normal data to be misclassified as faults [58], in order to achieve optimum anomaly detection performance on live data. ...
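A minimal sketch of the one-class SVM detector with a small outlier budget, as described in the excerpt above, might look as follows; the data, features, and nu value are assumptions for illustration only.

```python
# Hedged sketch of a one-class SVM detector: a small "nu" budget tolerates a
# few outliers in the nominally normal training data. Data, features, and the
# nu value are assumptions.
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(2)
train = rng.normal(loc=[-85.0, -10.0], scale=[4.0, 1.5], size=(400, 2))   # mostly normal KPI pairs

clf = OneClassSVM(kernel="rbf", gamma="scale", nu=0.05)  # allow ~5% of training points outside the boundary
clf.fit(train)

live = np.vstack([
    rng.normal(loc=[-85.0, -10.0], scale=[4.0, 1.5], size=(50, 2)),   # normal live samples
    rng.normal(loc=[-115.0, -19.0], scale=[3.0, 1.0], size=(5, 2)),   # outage-like samples
])
labels = clf.predict(live)      # +1 = normal, -1 = anomalous
print(int((labels == -1).sum()), "live samples flagged as anomalous")
```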
... (Table excerpt, references grouped by fault-management building block.) Pre-Processing: [29], [33], [42], [53], [64], [65], [66], [83]. Detection: [27], [33], [42], [59], [64], [65], [66], [67], [68], [69], [70], [71], [72], [73], [74], [75], [76], [77], [78], [79], [80], [81], [82]. Diagnosis 1 (Action Determination): [27], [70], [82], [83], [84], [85], [86], [87], [60]. Compensation: [33], [92], [89], [62], [90], [91], [88]. Diagnosis 2 (Root Cause Analysis): [96], [47], [61], [97]. ... measure; a random detector would score 0.5 and an ideal detector would score 1.0. Typical examples of the best results currently available are shown in Table 12. ...
Article
Full-text available
This paper surveys the literature relating to the application of machine learning to fault management in cellular networks from an operational perspective. We summarise the main issues as 5G networks evolve, and their implications for fault management. We describe the relevant machine learning techniques through to deep learning, and survey the progress which has been made in their application, based on the building blocks of a typical fault management system. We review recent work to develop the abilities of deep learning systems to explain and justify their recommendations to network operators. We discuss forthcoming changes in network architecture which are likely to impact fault management and offer a vision of how fault management systems can exploit deep learning in the future. We identify a series of research topics for further study in order to achieve this.
... Advanced ML models, similar to those used for traffic prediction, have been adapted to detect anomalies by learning from vast datasets of network activity. These models are trained to recognize patterns indicative of potential issues, enabling network operators to preemptively address problems before they escalate [41]-[44]. c) Cell rate and UE spectral efficiency predictions: Gijon et al. assessed cell throughput prediction accuracy using real network key performance indicator (KPI) counters, exploring a variety of ML methodologies such as support vector regression, k-nearest neighbors, decision trees, and artificial neural network (ANN)-based ensemble approaches [45]. ...
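The kind of KPI-based throughput regression comparison attributed to Gijon et al. above can be sketched on synthetic data as follows; the features, model settings, and KPI-to-throughput relationship are assumptions, not results from the cited work.

```python
# Sketch of a KPI-based cell-throughput regression comparison on synthetic
# data; the KPI features and the toy throughput model are assumptions.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR
from sklearn.neighbors import KNeighborsRegressor
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(3)
n = 600
X = np.column_stack([
    rng.uniform(0.0, 1.0, n),     # assumed KPI: PRB utilisation
    rng.uniform(0.0, 30.0, n),    # assumed KPI: average channel-quality indicator
    rng.uniform(1, 200, n),       # assumed KPI: connected users
])
y = 50.0 * X[:, 1] / 30.0 * (1.0 - 0.5 * X[:, 0]) + rng.normal(0.0, 2.0, n)   # toy throughput in Mbps

models = [
    ("SVR", make_pipeline(StandardScaler(), SVR(C=10.0))),
    ("k-NN", make_pipeline(StandardScaler(), KNeighborsRegressor(n_neighbors=7))),
    ("Decision tree", DecisionTreeRegressor(max_depth=6, random_state=0)),
]
for name, model in models:
    r2 = cross_val_score(model, X, y, cv=5, scoring="r2").mean()
    print(f"{name}: mean cross-validated R^2 = {r2:.2f}")
```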
Article
Full-text available
The energy consumption of mobile networks poses a critical challenge. Mitigating this concern necessitates the deployment and optimization of network energy-saving solutions, such as carrier shutdown, to dynamically manage network resources. Traditional optimization approaches encounter complexity due to factors like the large number of cells, stochastic traffic, channel variations, and intricate trade-offs. This paper introduces the simulated reality of communication networks (SRCON) framework, a novel, data-driven modeling paradigm that harnesses live network data and employs a blend of machine learning (ML)- and expert-based models. This mix of models accurately characterizes the functioning of network components and predicts network energy efficiency and user equipment (UE) quality of service for any energy carrier shutdown configuration in a specific network. Distinguishing itself from existing methods, SRCON eliminates the reliance on expensive expert knowledge, drive testing, or incomplete maps for predicting network performance. This paper details the pipeline employed by SRCON to decompose the large network energy efficiency modeling problem into ML- and expert-based submodels. It demonstrates how, by embracing stochasticity and carefully crafting the relationships between such submodels, the overall computational complexity can be reduced and prediction accuracy enhanced. Results derived from real network data underscore the paradigm shift introduced by SRCON, showcasing significant gains over a state-of-the-art method used by an operator for network energy efficiency modeling. The reliability of this local, data-driven modeling of the network proves to be a key asset for network energy-saving optimization.
... Furthermore, stochastic geometry-based models are unable to capture network dynamics such as mobility management and transmission latency. Therefore, several machine learning (ML) based techniques have been proposed in the current literature that leverage training and tuning of ML-based models to determine the behavior of different configuration and optimization parameters (COPs), such as antenna tilt, transmit power, and cell load, in relation to different key performance indicators (KPIs), such as coverage, capacity, or energy efficiency [10]-[12]. These COP-KPI relationships can then be used for COP-KPI optimization. ...
Article
Full-text available
The future of cellular networks is contingent on artificial intelligence (AI) based automation, particularly for radio access network (RAN) operation, optimization, and troubleshooting. To achieve such zero-touch automation, a myriad of AI-based solutions are being proposed in the literature to leverage AI for modeling and optimizing network behavior. However, to work reliably, AI-based automation requires a deluge of training data. Consequently, the success of the proposed AI solutions is limited by a fundamental challenge faced by the cellular network research community: scarcity of training data. In this paper, we present an extensive review of classic and emerging techniques to address this challenge. We first identify the common data types in RAN and their known use-cases. We then present a taxonomized survey of techniques used in the literature to address training data scarcity for various data types. This is followed by a framework to address the training data scarcity. The proposed framework builds on available information and a combination of techniques including interpolation, domain-knowledge-based methods, generative adversarial neural networks, transfer learning, autoencoders, few-shot learning, simulators and testbeds. Potential new techniques to enrich scarce data in cellular networks are also proposed, such as matrix completion theory and domain-knowledge-based techniques leveraging different types of network geometries and network parameters. In addition, an overview of state-of-the-art simulators and testbeds is presented to make readers aware of current and emerging platforms to access real data in order to overcome the data scarcity challenge. The extensive survey of techniques for addressing training data scarcity, combined with the proposed framework for selecting a suitable technique for a given type of data, can assist researchers and network operators in choosing appropriate methods to overcome the data scarcity challenge in leveraging AI for radio access network automation.
Preprint
Full-text available
The future of cellular networks is contingent on artificial intelligence (AI) based automation, particularly for radio access network (RAN) operation, optimization, and troubleshooting. To achieve such zero-touch automation, a myriad of AI-based solutions are being proposed in the literature for modeling and optimizing network behavior. However, to work reliably, AI-based automation requires a deluge of training data. Consequently, the success of AI solutions is limited by a fundamental challenge faced by the cellular network research community: scarcity of training data. We present an extensive review of classic and emerging techniques to address this challenge. We first identify the common data types in RAN and their known use-cases. We then present a taxonomized survey of techniques to address training data scarcity for various data types. This is followed by a framework to address the training data scarcity. The framework builds on available information and a combination of techniques including interpolation, domain-knowledge-based methods, generative adversarial neural networks, transfer learning, autoencoders, few-shot learning, simulators, and testbeds. Potential new techniques to enrich scarce data in cellular networks are also proposed, such as matrix completion theory and domain-knowledge-based techniques leveraging different network parameters and geometries. An overview of state-of-the-art simulators and testbeds is also presented to make readers aware of current and emerging platforms for real data access. The extensive survey of techniques for addressing training data scarcity, combined with the proposed framework for selecting a suitable technique for a given type of data, can assist researchers and network operators in choosing appropriate methods to overcome the data scarcity challenge in leveraging AI for radio access network automation.
... It implements the SON paradigm, as this is one of the most promising areas for an operator to save Capital Expenditure (CAPEX), Implementation Expenditure (IMPEX), and Operational Expenditure (OPEX), and can simplify network management through self-directed functions (self-planning, self-deployment, self-configuration, self-optimization, and self-healing) [13]. A clear example of SON applications related to resilient mobile networks is autonomous Cell Outage Detection (COD), which is a prerequisite for triggering fully automated self-healing recovery actions after cell outages or network failures [14]. ...
Article
Full-text available
Fault tolerance and the availability of applications, computing infrastructure, and communications systems during unexpected events are critical in cloud environments. The microservices architecture, and the technologies that it uses, should be able to maintain acceptable service levels in the face of adverse circumstances. In this paper, we discuss the challenges faced by cloud infrastructure in relation to providing resilience to applications. Based on this analysis, we present our approach for a software platform based on a microservices architecture, as well as the resilience mechanisms to mitigate the impact of infrastructure failures on the availability of applications. We demonstrate the capacity of our platform to provide resilience to analytics applications, minimizing service interruptions and keeping acceptable response times.
... Since our framework is designed to detect anomalies within minutes, whereas the conventional techniques involving subscriber complaints and drive tests consume hours and sometimes days to detect an anomaly (cell outage) [34], it potentially improves QoS and trims OPEX, as timely identification of an anomalous cell means quicker problem resolution. Detection of surged traffic activity in a region can also act as an early warning of potential congestion that might choke the network. ...
Article
Full-text available
Escalating cell outages and congestion, treated here as anomalies, cause substantial revenue loss to cellular operators and severely affect subscriber quality of experience. State-of-the-art literature applies a feed-forward deep neural network at the core network (CN) for the detection of the above problems in a single cell; however, the solution is impractical, as it would overload the CN, which monitors thousands of cells at a time. Inspired by mobile edge computing and the breakthroughs of deep convolutional neural networks (CNNs) in computer vision research, we split the network into several 100-cell regions, each monitored by an edge server, and propose a framework that pre-processes raw call detail records containing user activities to create an image-like volume, which is fed to a CNN model. The framework outputs a multi-labeled vector identifying the anomalous cell(s). Our results suggest that our solution can detect anomalies with up to 96% accuracy, and is scalable and expandable for industrial Internet of Things environments.
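A hedged sketch of the regional CNN detector described in this abstract: a 10x10 grid of cells with a few CDR-derived channels is treated as an image-like volume and mapped to one anomaly logit per cell. The architecture, channels, and shapes below are assumptions, not the authors' exact model.

```python
# Hedged sketch of a regional CNN anomaly detector on an image-like CDR
# volume; architecture, channel choice, and shapes are assumptions.
import torch
import torch.nn as nn

class RegionCNN(nn.Module):
    def __init__(self, in_channels=3, grid=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
        )
        self.head = nn.Linear(64 * grid * grid, grid * grid)   # one logit per cell

    def forward(self, x):                       # x: (batch, channels, grid, grid)
        z = self.features(x).flatten(1)
        return self.head(z)                     # multi-label logits

model = RegionCNN()
cdr_volume = torch.randn(8, 3, 10, 10)          # batch of 8 regions, 3 CDR-derived channels
labels = (torch.rand(8, 100) < 0.05).float()    # synthetic per-cell anomaly labels

loss = nn.BCEWithLogitsLoss()(model(cdr_volume), labels)
loss.backward()                                 # one illustrative training step (no optimizer shown)
print(float(loss))
```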
Chapter
In order to meet the challenges of ambitious capacity, user experience, and resource efficiency gains, next-generation cellular networks need to leverage end-to-end user and network behavior intelligence. This intelligence can be gathered from mobile network big data, which includes massive telemetric data about network health and status as well as data about user whereabouts, preferences, context, and mobility patterns. As a result, exploitation of big data in wireless cellular networks is emerging as an indispensable approach for harnessing intelligence in future wireless communication networks. In this article, we first identify and classify the big data that can be gathered from different layers and ends of a wireless cellular network. We then discuss several new utilities of big data that can bridge the existing gaps to meet 5G requirements. After that, we summarize the existing literature on data analytics for cellular network performance. We present different platforms and two different frameworks to implement big data analytics-based solutions in 5G and beyond and compare their pros and cons. We then discuss how key performance indicator (KPI)-based data collection may not suffice in 5G. Through an exemplary study, we show how to unleash the full potential hidden within big data and the granularity of low-level performance indicators, and how context is essential. Finally, we highlight the opportunities that big data offers in cellular networks and the challenges therein.
Article
Full-text available
5G is anticipated to embed artificial intelligence (AI) empowerment to adroitly plan, optimize, and manage the highly complex network by leveraging data generated at different positions of the network architecture. Outages and situations leading to congestion in a cell pose severe hazards for the network. High false-alarm rates and inadequate accuracy are the major limitations of modern approaches to detecting anomalies, namely outages and sudden surges in traffic activity that may result in congestion, in mobile cellular networks. This means wasting limited resources, which ultimately leads to elevated operational expenditure (OPEX) and interrupts quality of service (QoS) and quality of experience (QoE). Motivated by the outstanding success of deep learning (DL) technology, our study applies it to the detection of the above-mentioned anomalies and also supports the mobile edge computing (MEC) paradigm, in which the core network (CN)'s computations are divided across the cellular infrastructure among different MEC servers (co-located with base stations) to relieve the CN. Each server monitors the user activities of multiple cells and utilizes an L-layer feedforward deep neural network (DNN) fueled by a real call detail record (CDR) dataset for anomaly detection. Our framework achieved 98.8% accuracy with a 0.44% false positive rate (FPR), notable improvements that surmount the deficiencies of earlier studies. The numerical results demonstrate the usefulness and superiority of our proposed detector.
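A minimal stand-in for the L-layer feed-forward detector described above, using scikit-learn's MLPClassifier on synthetic per-cell CDR features; the features, layer sizes, and injected anomalies are assumptions.

```python
# Minimal sketch of a feed-forward detector on per-cell CDR-style features;
# MLPClassifier stands in for the paper's DNN. Features, layer sizes, and
# labels are assumptions.
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(4)
n = 2000
X = np.column_stack([rng.poisson(50, n), rng.poisson(30, n), rng.poisson(80, n)]).astype(float)  # SMS, call, data activity
y = np.zeros(n, dtype=int)
y[rng.choice(n, 100, replace=False)] = 1           # synthetic anomalous windows
X[y == 1] *= rng.uniform(3.0, 5.0, size=(100, 1))  # anomalies modelled as a sudden activity surge

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0, stratify=y)
clf = MLPClassifier(hidden_layer_sizes=(64, 32, 16), max_iter=500, random_state=0).fit(X_tr, y_tr)
print("test accuracy:", clf.score(X_te, y_te))
```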
Article
Full-text available
This article surveys the literature over the last decade on the emerging field of self organisation as applied to wireless cellular communication networks. Self organisation has been extensively studied and applied in ad hoc networks, wireless sensor networks and autonomic computer networks; however, in the context of wireless cellular networks, this is the first attempt to put the various efforts in perspective in the form of a tutorial/survey. We provide a comprehensive survey of the existing literature, projects and standards in self organising cellular networks. Additionally, we also aim to present a clear understanding of this active research area, identifying a clear taxonomy and guidelines for the design of self organising mechanisms. We compare the strengths and weaknesses of existing solutions and highlight the key research areas for further development. This paper serves as a guide and a starting point for anyone willing to delve into research on self organisation in wireless cellular communication networks.
Article
Full-text available
We address the problem of detecting “anomalies” in the network traffic produced by a large population of end-users, following a distribution-based change detection approach. In the considered scenario, different traffic variables are monitored at different levels of temporal aggregation (timescales), resulting in a grid of variable/timescale nodes. For every node, a set of per-user traffic counters is maintained and then summarized into histograms for every time bin, obtaining a time series of empirical (discrete) distributions for every variable/timescale node. Within this framework, we tackle the problem of designing a formal Distribution-based Change Detector (DCD) able to identify statistically significant deviations from the past behavior of each individual time series. For the detection task we propose a novel methodology based on a Maximum Entropy (ME) modeling approach. Each empirical distribution (sample observation) is mapped to a set of ME model parameters, called the “characteristic vector”, via closed-form Maximum Likelihood (ML) estimation. This allows us to derive a detection rule based on a formal hypothesis test (Generalized Likelihood Ratio Test, GLRT) to measure the coherence of the current observation, i.e., its characteristic vector, with a given reference. The latter is dynamically identified, taking into account the typical non-stationarity displayed by real network traffic. Numerical results on synthetic data demonstrate the robustness of our detector, while the evaluation on a labeled dataset from an operational 3G cellular network confirms the capability of the proposed method to identify real traffic anomalies.
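A much-simplified stand-in for the distribution-based change detector described above: per-time-bin histograms of a per-user traffic counter are compared against a reference distribution using a KL-divergence score rather than the paper's Maximum-Entropy/GLRT statistic. All data and the threshold are assumptions.

```python
# Simplified stand-in for a distribution-based change detector: KL divergence
# of hourly traffic histograms from a reference distribution; not the paper's
# ME/GLRT method. Data and threshold are assumptions.
import numpy as np

rng = np.random.default_rng(5)
bins = np.linspace(0, 100, 21)

def histogram(samples):
    h, _ = np.histogram(samples, bins=bins)
    p = h.astype(float) + 1e-6             # smoothing to avoid log(0)
    return p / p.sum()

reference = histogram(rng.gamma(shape=2.0, scale=10.0, size=5000))   # "typical" per-user traffic

def kl(p, q):
    return float(np.sum(p * np.log(p / q)))

for t in range(24):                         # 24 hourly time bins
    shape = 2.0 if t != 17 else 6.0         # inject a distribution change at hour 17
    obs = histogram(rng.gamma(shape=shape, scale=10.0, size=1000))
    score = kl(obs, reference)
    if score > 0.2:                         # assumed detection threshold
        print(f"hour {t:02d}: change detected (KL = {score:.2f})")
```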
Conference Paper
Full-text available
For many KDD applications, such as detecting criminal activities in E-commerce, finding the rare instances or the outliers can be more interesting than finding the common patterns. Existing work in outlier detection regards being an outlier as a binary property. In this paper, we contend that for many scenarios, it is more meaningful to assign to each object a degree of being an outlier. This degree is called the local outlier factor (LOF) of an object. It is local in that the degree depends on how isolated the object is with respect to the surrounding neighborhood. We give a detailed formal analysis showing that LOF enjoys many desirable properties. Using real-world datasets, we demonstrate that LOF can be used to find outliers which appear to be meaningful, but cannot otherwise be identified with existing approaches. Finally, a careful performance evaluation of our algorithm confirms that our approach of finding local outliers can be practical.
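The LOF idea can be exercised directly with scikit-learn's implementation; the synthetic data and contamination value below are assumptions, not the paper's experiments.

```python
# Sketch of the local outlier factor (LOF) on synthetic 2-D data; dataset and
# contamination value are assumptions.
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(6)
inliers = rng.normal(0.0, 1.0, size=(300, 2))
outliers = rng.uniform(low=-6.0, high=6.0, size=(10, 2))
X = np.vstack([inliers, outliers])

lof = LocalOutlierFactor(n_neighbors=20, contamination=0.03)
labels = lof.fit_predict(X)                 # -1 = outlier, +1 = inlier
scores = -lof.negative_outlier_factor_      # larger score = more isolated from its neighbourhood
print(int((labels == -1).sum()), "points flagged; top LOF score:", round(scores.max(), 2))
```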
Book
Covering the key functional areas of LTE Self-Organising Networks (SON), this book introduces the topic at an advanced level before examining the state-of-the-art concepts. The required background on LTE network scenarios, technologies and general SON concepts is first given to allow readers with basic knowledge of mobile networks to understand the detailed discussion of key SON functional areas (self-configuration, -optimisation, -healing). Later, the book provides details and references for advanced readers familiar with LTE and SON, including the latest status of 3GPP standardisation. Based on the defined next generation mobile networks (NGMN) and 3GPP SON use cases, the book elaborates to give the full picture of a SON-enabled system including its enabling technologies, architecture and operation. "Heterogeneous networks" including different cell hierarchy levels and multiple radio access technologies as a new driver for SON are also discussed. Introduces the functional areas of LTE SON (self-optimisation, -configuration and -healing) and its standardisation, also giving NGMN and 3GPP use cases. Explains the drivers, requirements, challenges, enabling technologies and architectures for a SON-enabled system. Covers multi-technology (2G/3G) aspects as well as core network and end-to-end operational aspects. Written by experts who have been contributing to the development and standardisation of the LTE self-organising networks concept since its inception. Examines the impact of new network architectures ("Heterogeneous Networks") on network operation, for example multiple cell layers and radio access technologies.
Conference Paper
With the rapid development of mobile wireless systems, operators are experiencing unprecedented challenges in service maintenance and operational expenditure, which drives the demand for realizing automation in current networks. Cell outage detection is considered an effective way to automatically detect network faults. Our work presents an automated cell outage detection mechanism in which a clustering technique, the Dynamic Affinity Propagation (DAP) clustering algorithm, is introduced. Performance metrics are collected from the network during its regular operation and then fed into the algorithm to produce optimal clusters for further anomaly detection. The proposed mechanism has been implemented in an LTE-Advanced simulation environment, through which we have successfully detected the configured cell outages and located their specific outage areas.
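A hedged sketch of clustering-based outage detection in the spirit of this abstract, using scikit-learn's standard AffinityPropagation as a stand-in for the dynamic (DAP) variant; the KPIs and the degradation rule are assumptions.

```python
# Hedged sketch of clustering-based outage detection: standard
# AffinityPropagation stands in for the dynamic variant (DAP); cells whose
# KPI vectors fall into a degraded-looking cluster are flagged. KPIs and
# thresholds are assumptions.
import numpy as np
from sklearn.cluster import AffinityPropagation

rng = np.random.default_rng(7)
healthy = rng.normal(loc=[0.95, 30.0], scale=[0.02, 3.0], size=(57, 2))   # [RRC success rate, throughput]
degraded = rng.normal(loc=[0.40, 3.0], scale=[0.05, 1.0], size=(3, 2))    # outage-like cells
kpis = np.vstack([healthy, degraded])

labels = AffinityPropagation(random_state=0).fit_predict(kpis)

# Flag clusters whose mean KPIs look degraded (assumed rule of thumb).
for c in np.unique(labels):
    members = np.where(labels == c)[0]
    mean_kpi = kpis[members].mean(axis=0)
    if mean_kpi[0] < 0.7:
        print("suspected outage cells:", members.tolist(), "mean KPIs:", np.round(mean_kpi, 2))
```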
Article
Nonmetric multidimensional scaling (MDS) is adapted to give configurations of points that lie on the surface of a sphere. There are data sets where it can be argued that spherical MDS is more relevant than the usual planar MDS. The theory behind the adaption of planar MDS to spherical MDS is outlined and then its use is illustrated on three data sets.
Chapter
Suppose dissimilarity data have been collected on a set of n objects or individuals, where there is a value of dissimilarity measured for each pair. The dissimilarity measure used might be a subjective judgement made by a judge, where for example a teacher subjectively scores the strength of friendship between pairs of pupils in her class; or, as an alternative and more objective measure, she might count the number of contacts made in a day between each pair of pupils. In other situations the dissimilarity measure might be based on a data matrix. The general aim of multidimensional scaling is to find a configuration of points in a space, usually Euclidean, where each point represents one of the objects or individuals, and the distances between pairs of points in the configuration match as well as possible the original dissimilarities between the pairs of objects or individuals. Such configurations can be found using metric and non-metric scaling, which are covered in Sects. 2 and 3. A number of other techniques are covered by the umbrella title of multidimensional scaling (MDS), and here the techniques of Procrustes analysis, unidimensional scaling, individual differences scaling, correspondence analysis and reciprocal averaging are briefly introduced and illustrated with pertinent data sets.
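The core MDS task described here, recovering a low-dimensional configuration from a pairwise dissimilarity matrix, can be sketched as follows on synthetic data.

```python
# Small sketch of metric MDS: given a pairwise dissimilarity matrix, find a
# 2-D configuration whose Euclidean distances approximate the dissimilarities.
# Data are synthetic.
import numpy as np
from sklearn.manifold import MDS
from scipy.spatial.distance import squareform, pdist

rng = np.random.default_rng(8)
true_points = rng.normal(size=(12, 5))               # 12 objects in a hidden 5-D space
dissimilarity = squareform(pdist(true_points))       # observed pairwise dissimilarities

mds = MDS(n_components=2, dissimilarity="precomputed", random_state=0)
config = mds.fit_transform(dissimilarity)             # 2-D configuration of the 12 objects

print("embedding shape:", config.shape, "stress:", round(mds.stress_, 2))
```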
Conference Paper
Base stations experiencing hardware or software failures have a negative impact on network performance and customer satisfaction. The timely detection of such so-called outage or sleeping cells can be a difficult and costly task, depending on the type of error. As a first step towards self-healing capabilities of mobile communication networks, operators have formulated a need for automated cell outage detection. This paper presents and evaluates a novel cell outage detection algorithm, which is based on the neighbor cell list reporting of mobile terminals. Using statistical classification techniques as well as a manually designed heuristic, the algorithm is able to detect most of the outage situations in our simulations.
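An illustrative heuristic in the spirit of this approach (not the paper's algorithm): a cell is flagged as a possible sleeping cell when it largely disappears from the neighbour-cell-list reports of surrounding terminals. Counts and the threshold are assumptions.

```python
# Illustrative heuristic (not the paper's algorithm): flag a cell whose
# mentions in neighbour-cell-list (NCL) reports drop far below its baseline.
# Report counts and the threshold are assumptions.
import numpy as np

rng = np.random.default_rng(9)
cells = list(range(10))
baseline = {c: 200 + int(rng.integers(-20, 20)) for c in cells}   # typical NCL mentions per hour

# Current hour: cell 7 has gone silent.
current = {c: baseline[c] + int(rng.integers(-30, 30)) for c in cells}
current[7] = 0

for c in cells:
    ratio = current[c] / baseline[c]
    if ratio < 0.1:                         # assumed drop threshold
        print(f"cell {c}: NCL mentions dropped to {ratio:.0%} of baseline -> possible outage")
```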