Computer Vision Technique to Detect Accidents

Abstract

Detecting accidents in smart cities is a challenging day-to-day task. It is hard for traffic police to control: the police cannot be available 24 × 7, and as a result many accidents pass by unnoticed. Many people lose their lives for lack of timely first-aid support from a hospital, and it takes at least 5 min to pass accident information to the hospital. To overcome this problem, we use a computer vision technique to identify an accident at a specific location, and messages are passed automatically to the nearby hospital. When an accident is detected, the local hospital and patrol are intimated by Gmail, or else by SMS, through SMTP so that they can take the necessary action. Using deep learning techniques, we are able to achieve a promising solution to this problem.

Keywords: Deep learning · CCTV surveillance footage · Faster R-CNN · Alarm system · ZF · VGG-16
Smart Innovation, Systems and Technologies
Volume 315
Series Editors
Robert J. Howlett, Bournemouth University and KES International,
Shoreham-by-Sea, UK
Lakhmi C. Jain, KES International, Shoreham-by-Sea, UK
The Smart Innovation, Systems and Technologies book series encompasses the topics
of knowledge, intelligence, innovation and sustainability. The aim of the series is to
make available a platform for the publication of books on all aspects of single and
multi-disciplinary research on these themes in order to make the latest results available in a readily accessible form. Volumes on interdisciplinary research combining two or more of these areas are particularly sought.
The series covers systems and paradigms that employ knowledge and intelligence
in a broad sense. Its scope is systems having embedded knowledge and intelligence,
which may be applied to the solution of world problems in industry, the environment
and the community. It also focusses on the knowledge-transfer methodologies and
innovation strategies employed to make this happen effectively. The combination
of intelligent systems tools and a broad range of applications introduces a need
for a synergy of disciplines from science, technology, business and the humanities.
The series will include conference proceedings, edited collections, monographs,
handbooks, reference books, and other relevant types of book in areas of science and
technology where smart systems and technologies can offer innovative solutions.
High quality content is an essential feature for all book proposals accepted for the
series. It is expected that editors of all accepted volumes will ensure that contributions
are subjected to an appropriate level of reviewing process and adhere to KES quality
principles.
Indexed by SCOPUS, EI Compendex, INSPEC, WTI Frankfurt eG, zbMATH,
Japanese Science and Technology Agency (JST), SCImago, DBLP.
All books published in the series are submitted for consideration in Web of Science.
B. Narendra Kumar Rao · R. Balasubramanian · Shiuh-Jeng Wang · Richi Nayak
Editors
Intelligent Computing
and Applications
Proceedings of ICDIC 2020
Editors
B. Narendra Kumar Rao
Department of Computer Science
and Engineering
Sree Vidyanikethan Engineering College
Tirupati, India
Shiuh-Jeng Wang
Department of Information Management
Central Police University
Taoyuan, Taiwan
R. Balasubramanian
Department of Computer Science
and Engineering
Indian Institute of Technology Roorkee
Roorkee, India
Richi Nayak
School of Electrical Engineering
and Computer Science
Queensland University of Technology
Brisbane, QLD, Australia
ISSN 2190-3018 ISSN 2190-3026 (electronic)
Smart Innovation, Systems and Technologies
ISBN 978-981-19-4161-0 ISBN 978-981-19-4162-7 (eBook)
https://doi.org/10.1007/978-981-19-4162-7
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature
Singapore Pte Ltd. 2023
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether
the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse
of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and
transmission or information storage and retrieval, electronic adaptation, computer software, or by similar
or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors, and the editors are safe to assume that the advice and information in this book
are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or
the editors give a warranty, expressed or implied, with respect to the material contained herein or for any
errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional
claims in published maps and institutional affiliations.
This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd.
The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721,
Singapore
Conference Committee
International Conference on Data Analytics, Intelligent Computing, and Cyber
Security (ICDIC 2020)
Department of Computer Science and Engineering
Sree Vidyanikethan Engineering College
(Affiliated to JNTUA, Anantapuramu)
Sree Sainath Nagar, Tirupathi-517102, India
Chief Patrons
Dr. M. Mohan Babu, Chairman, SVET
Mr. Vishnu Manchu, CEO, SVET
Patrons
Dr. L. Venu Gopal Reddy, Advisor and Director, SVET
Dr. P. Giridhara Reddy, Director, Academics and Research, SVEC
Dr. B. M. Satish, Principal, SVEC
Advisors
Dr. T. Nageswara Prasad, Vice Principal, SVEC
Dr. P. V. Ramana, Dean Academics, SVEC
Dr. B. Raveendra Babu, Dean, CSE, SVEC
International Technical Committee
Dr. Mathew Dailey, Asian Institute of Technology, Thailand
Dr. Lakshmi C. Jain, University of Technology, Sydney, Australia
Dr. Farook Hussain, University of Technology, Sydney, Australia
Dr. Basim Alhadidi, Al-Balqa’ Applied University, Amman, Jordan
Dr. Habibollah Haron, Qaiwan International University, Iraq
Dr. Sam Goundar, Victoria University of Wellington, New Zealand
Dr. Marjan Kuchaki Rafsanjan, Shahid Bahonar University of Kerman, Kerman, Iran
Dr. M. S. Mekala, Yeungnam University, Gyeongsan, Korea
Dr. T. V. Ramana, TVET, Addis Ababa 1000, Ethiopia
Dr. Srinivas Nowduri, USA
Dr. Siva Ram Rajeyyagari, Shaqra University, Shaqra
National Technical Committee
Dr. R. Balasubramanian, IIT Roorkee, India (Chairman)
Dr. R. B. V. Subramanyam, NIT Warangal
Dr. V. K. Govindham, NIT Calicut
Dr. P. Viswanadh, IIIT Sricity
Dr. Devender Gurjat, NIT Silchar
Dr. K. Jairam Naik, NIT Raipur
Dr. G. C. Nandi, IIIT, Allahabad, Jhalwa, Uttar Pradesh
Dr. Deepak Garg, Bennett University, Greater Noida, Uttar Pradesh
Dr. R. Jagadish Kannan, VIT, Chennai
Dr. A. Rama Mohan Reddy, SV University, Tirupati
Dr. A. Vinaya Babu, JNTUH, Hyderabad
Dr. S. Viswanadha Raju, JNTUHCEJ, Jagityal, Karimnagar
Dr. B. Eswar Reddy, JNTUA College of Engineering, Kalikiri
Dr. P. Chenna Reddy, JNTUA, Ananthapuramu
Dr. C. Shoba Bindhu, JNTUA, Ananthapuramu
Dr. V. Valli Kumari, Andhra University, Vishakhapatnam
Dr. B. K. Tripathy, VIT, Vellore
Dr. S. Jyothi, SPMVV, Tirupati
Dr. Praveen Shukla, BBD University, Lucknow
Dr. K. Srujan Raju, CMR Technical Campus, Hyderabad
Dr. B. Balamurugan, Galgotias University, Greater Noida, Uttar Pradesh
Dr. K. Thirunavukkarasu, Karnavati University, Uvarsad, Gandhinagar, Gujarat
Dr. M. Rajasekhar Babu, VIT University, Vellore
Dr. G. Vara Prasad, BMS College of Engineering, Bengaluru
Dr. K. Suneetha, Jain-deemed-to-be University, Bengaluru
Dr. Praveen Shukla, BBD University, Lucknow
Dr. P. Nagendra, Vishnu Institute of Technology, Bheemavaram
Dr. E. S. Madhan, SRM University, Chennai
Honorary Chair
Dr. Suresh Chandra Satapathy, KIIT, Bhubaneswar, India
Conference Chair
Dr. B. Narendra Kumar Rao, SVEC
Convener
Dr. K. Reddy Madhavi, SVEC
Organizing Committee
Dr. A. V. Sri Harsha, SVEC
Dr. G. Sunitha, SVEC
Dr. J. Avanija, SVEC
Dr. K. Suresh, SVEC
Dr. S. Sreenivasa Chakravarthi, SVEC
Dr. D. Ganesh, SVEC
Preface
The maiden International Conference on Data Analytics, Intelligent Computing, and
Cyber Security (ICDIC 2020), organized by the Department of Computer Science and
Engineering, Sree Vidyanikethan Engineering College, Tirupati, Andhra Pradesh,
India, was held during December 29–30, 2021. Sree Vidyanikethan Engineering
College (Autonomous) was established in 1996 by Sree Vidyanikethan Educational
Trust under the stewardship of Dr. M. Mohan Babu, renowned Film Artiste and
Former Member of Parliament (Rajya Sabha). The College was established in the
backward region of Rayalaseema to serve the cause of technical education with
an initial intake of 180. The intake has since grown to 2,382 as of the academic year 2021–22. The College now offers 15 B.Tech. programs; four M.Tech. programs; MCA programs; and three Doctoral Programs. AICTE has also accorded permission for a second-shift polytechnic from the academic year 2009–10, and presently five diploma courses are offered. Today, Sree Vidyanikethan Engineering College is one of the largest, most admired, and most sought-after institutions in
Andhra Pradesh. The College is approved by AICTE and affiliated with JNTUA,
Ananthapuramu. The College has been accorded Autonomous Status by the UGC,
New Delhi, in 2010–2011. The College is known for its quality initiatives which are
amply reflected in accreditations by the National Board of Accreditation (NBA) for
UG and PG programs and Accredited by the National Assessment and Accreditation
Council (NAAC) with an A Grade. The College has been accorded ‘UGC-Colleges
with Potential for Excellence’ status under CPE Scheme by UGC, New Delhi. It
also has been accorded ‘PLATINUM’ category by CII-AICTE Survey and was
conferred with A-Grade’ by the Department of Higher Education, Andhra Pradesh.
The College participated in the National Institutional Ranking Framework (NIRF) 2020 and was awarded a rank of 184. SIEMENS and APSSDC have established six
state-of-the-art laboratories in the institution.
ICDIC 2020 aims at providing a platform for scientists, scholars, engineers,
and students to present theoretical research and practical advancements at national
and international levels in the fields of data analytics, intelligent computing, cyber-
security, and its allied areas. Experts from different parts of the globe are involved in
the interaction on respective fields of the conference theme. Out of 176 submissions
from all over the world, only 48 papers were selected after thorough reviewing, for
publication in the Springer Book Series on Smart Innovation, Systems and Technolo-
gies (SIST). International Conference on Data Analytics, Intelligent Computing and
Cyber Security, ICDIC-20, comprises the comprehensive state-of-the-art technical
contributions in the areas of data analytics, intelligent computing, cyber-security, and
emerging technologies. Selected papers were divided into five tracks, well balanced in
the content, and created enough discussion space for trending concepts. The purpose
of the conference has been served satisfactorily through international and national
speakers, 48 oral presentations by delegates, exchanging and sharing research knowl-
edge among peers. This conference created an ample opportunity for discussions,
debate, and exchange of ideas and information among participants. We are very grateful to the international and national advisory committees, session chairs, and peer reviewers, who provided critical reviews in selecting quality papers, and to the organizing committee members, student volunteers, and faculty of the Department of Computer Science and Engineering, who contributed to the success of this conference. We are also thankful to all the authors who submitted quality papers and communicated with their peers through the presentation of their work, which led to the grand success of the conference.
We are very much thankful to the management of Sree Vidyanikethan Engineering College for their support at every step of the journey toward the success of this conference, which inspired the organizers and motivated many others.
Tirupati, India
Roorkee, India
Taoyuan, Taiwan
Brisbane, QLD, Australia
B. Narendra Kumar Rao
R. Balasubramanian
Shiuh-Jeng Wang
Richi Nayak
Contents
1 Prediction of Depression-Related Posts in Instagram Social
Media Platform ............................................... 1
M. Harini and B. Sivakumar
2 Classification of Credit Card Frauds Using Autoencoded
Features ...................................................... 9
Kerenalli Sudarshana, C. MylaraReddy, and Zameer Ahmed Adhoni
3 BIVFN: Blockchain-Enabled Intelligent Vehicular Fog
Networks ..................................................... 19
Priyanka Gaba and Ram Shringar Raw
4 Deep Learning Approach for Pedestrian Detection, Tracking,
and Suspicious Activity Recognition in Academic Environment .... 29
Kamal Hajari, Ujwalla Gawande, and Yogesh Golhar
5 Data-Driven Approach to Deflate Consumption in Delay
Tolerant Networks ............................................ 39
C. Venkata Subbaiah and K. Govinda
6 Code-Level Self-adaptive Approach for Building Reusable
Software Components ......................................... 49
Sampath Korra, V. Biksham, Kotte Vinaykumar, and T. Bhaskar
7 Design of a Deep Network Model for Weed Classification ......... 59
M. Vaidhehi and C. Malathy
8 E-Voting System Using U-Net Architecture with Blockchain
Technology ................................................... 69
Nuthalapati Sudha and A. Brahmananda Reddy
9 Multi-layered Architecture to Monitor and Control
the Energy Management in Smart Cities ........................ 81
A. K. Damodaram, S. Sreenivasa Chakravarthi,
L. Venkateswara Reddy, and K. Reddy Madhavi
10 Bio-Inspired Firefly Algorithm for Polygonal Approximation
on Various Shapes ............................................. 95
L. Venkateswara Reddy, Ganesh Davanam, T. Pavan Kumar,
M. Sunil Kumar, and Mekala Narendar
11 An Efficient IoT Security Solution Using Deep Learning
Mechanisms .................................................. 109
Maganti Venkatesh, Marni Srinu, Vijaya Kumar Gudivada,
Bibhuti Bhusan Dash, and Rabinarayan Satpathy
12 Intelligent Disease Analysis Using Machine Learning ............. 119
Nagendra Panini Challa, J. S. Shyam Mohan,
M. Naga Badra Kali, and P. Venkata Rama Raju
13 Automated Detection of Skin Lesions Using Back Propagation
Neural Network ............................................... 127
Nagendra Panini Challa, A. Mohan, Narendra Kumar Rao,
Bhaskar Kumar Rao, J. S. Shyam Mohan, and B. Balaji Bhanu
14 Detection of COVID-19 Using CNN and ML Algorithms .......... 135
M. Raghav Srinivaas, Khanjan Shah, B. Abhishek,
R. Jagadeesh Kannan, and A. Balasundaram
15 Prioritization of Watersheds Using GIS and Fuzzy Analytical
Hierarchy (FAHP) Method ..................................... 149
K. Anil, S. Sivaprakasam, and P. Sridhar
16 A Narrative Framework with Ensemble Learning for Face
Emotion Recognition .......................................... 159
S. Naveen Kumar Polisetty, T. Sivaprakasam, and S. Indraneel
17 Modified Cloud-Based Malware Identification Technique
Using Machine Learning Approach ............................. 169
Gavini Sreelatha, Aishwarya Govindkar, and Sarukolla Ushaswini
18 Design and Deployment of the Road Safety System
in Vehicular Network Based on a Distance and Speed ............. 179
Thalakola Syamsundararao, Badugu Samatha,
Nagarjuna Karyemsetty, Subbarao Gogulamudi, and V. Deepak
19 Diagnosis of COVID-19 Using Artificial Intelligence
Techniques ................................................... 189
Pattan Afrid Ahmed, Prabhu Gantayat, Sarika Jay,
Venkata Sai Satvik, Jagadeesh Kannan Raju, and A. Balasundaram
20 Location Tracking via Bluetooth ................................ 203
Jasthi Siva Sai, Mukkamala Namitha, Routhu Ramya Dedeepya,
Mulugu Suma Anusha, Angadi Lakshmi, and Mukesh Chinta
21 Shrimp Surfacing Recognition System in the Pond Using
Deep Computer Vision ......................................... 217
Gadhiraju Tej Varma and Sri Krishna Adusumalli
22 Sign Language Recognition for Needy People Using Machine
Learning Model ............................................... 227
Pavan Kumar Vadrevu, M. R. M. Veeramanickam,
Sri Krishna Adusumalli, and Sasi Kumar Bunga
23 Efficient Usage of Spectrum by Using Joint Optimization
Channel Allocation Method .................................... 235
Padyala Venkata Vara Prasad, K. V. D. Kiran,
Rajasekhar Kommaraju, and N. Gayathri
24 An Intelligent Energy-Efficient Routing Protocol for Wearable
Body Area Networks .......................................... 249
Muniraju Naidu Vadlamudi and Md. Asdaque Hussian
25 Enhanced Video Classification System with Convolutional
Neural Networks Using Representative Frames as Input Data ..... 259
K. Jayasree and Sumam Mary Idicula
26 Text Recognition from Images Using Deep Learning
Techniques ................................................... 265
B. Narendra Kumar Rao, Kondra Pranitha, Ranjana,
C. V. Krishnaveni, and Midhun Chakkaravarthy
27 Early Detection and Diagnosis of Oral Cancer Using Fusioned
Deep Neural Network .......................................... 281
Sree T. Sucharitha, I. Kannan, and K. A. Varun Kumar
28 Fine-tuning for Transfer Learning of ResNet152 for Disease
Identification in Tomato Leaves ................................. 295
Lakshmi Ramani Burra, Janakiramaiah Bonam,
Praveen Tumuluru, and B Narendra Kumar Rao
29 AI-Based Mental Fatigue Recognition and Responsive
Recommendation System ...................................... 303
Korupalli V. Rajesh Kumar, B. Rupa Devi, M. Sudhakara,
Gabbireddy Keerthi, and K. Reddy Madhavi
30 Multiple Slotted Triple-Band PIFA Antenna for Wearable
Medical Applications at 2.5–9 GHz ............................. 315
T. V. S. Divakar and G. Anantha Rao
31 Fish Classification System Using Customized Deep Residual
Neural Networks on Small-Scale Underwater Images ............. 327
M. Sudhakara, Y. Vijaya Shambhavi, R. Obulakonda Reddy,
N. Badrinath, and K. Reddy Madhavi
32 Multiple Face Recognition System Using OpenFace ............... 339
Janakiramaiah Bonam, Lakshmi Ramani Burra,
Roopasri Sai Varshitha Godavarthi, Divya Jagabattula,
Sowmya Eda, and Soumya Gogulamudi
33 EDAARP-Efficient and Data-Aggregative Authentic Routing
Protocol for Wireless Sensor Networks .......................... 351
Kurakula Arun Kumar and Karthikeyan Jayaraman
34 Mobile-Based Selfie Sign Language Recognition System
(SSLRS) Using Statistical Features and ANN Classifier ........... 361
G. Anantha Rao, K. Syamala, and T. V. S. Divakar
35 An Effective Model for Malware Detection ...................... 377
V. Valli Kumari and Shaik Jani
36 An Efficient Approach to Retrieve Information for Desktop
Search Engine ................................................ 387
S. A. Karthik, G. Lalitha, Y. Md. Riyazuddin, and R. Venkataramana
37 Baggage Recognition and Collection at Airports .................. 397
Aviral Pulast and S. Asha
38 Computer Vision Technique to Detect Accidents ................. 407
A. Jafflet Trinishia and S. Asha
39 An Efficient Machine Learning Approach for Apple Leaf
Disease Detection .............................................. 419
K. R. Bhavya, S. Pravinth Raja, B. Sunil Kumar, S. A. Karthik,
and Subhash Chavadaki
40 Precipitation Estimation Using Deep Learning ................... 431
Mohammad Gouse Galety, Fanar Fareed Hanna Rofoo,
and Rebaz Maaroof
41 The Adaptive Strategies Improving Digital Twin Using
the Internet of Things ......................................... 439
N. Venkateswarulu, P. Sunil Kumar Reddy, O. Obulesu,
and K. Suresh
42 Deep Learning for Breast Cancer Diagnosis Using
Histopathological Images ...................................... 447
Mohammad Gouse Galety, Firas Husham Almukhtar,
Rebaz Jamal Maaroof, and Fanar Fareed Hanna Rofoo
43 Implementation of 12 Band Integer Filter-Bank for Digital
Hearing Aid .................................................. 455
K. Ayyappa Swamy and Zachariah C. Alex
44 Comparative Analysis on Heart Disease Prediction
Using Convolutional Neural Network with Adapted
Backpropagation .............................................. 465
K. Suneetha, Kamala Challa, J. Avanija, Yaswanth Raparthi,
and Suresh Kallam
45 Applying Machine Learning to Enhance COVID-19
Prediction and Diagnosis of COVID-19 Treatment Using
Convalescent Plasma .......................................... 479
Lavanya Kongala, Thoutireddy Shilpa, K. Reddy Madhavi,
Pradeep Ghantasala, and Suresh Kallam
46 Analysis of Disaster Tweets Using Natural Language
Processing .................................................... 491
Thulasi Bikku, Pathakamuri Chandrika, Anuhya Kanyadari,
Vuyyuru Prathima, and Borra Bhavana Sai
Author Index ...................................................... 503
About the Editors
Dr. B. Narendra Kumar Rao is currently Professor and Head of the Department
of Computer Science and Engineering at Sree Vidyanikethan Engineering College,
Tirupati, Andhra Pradesh, India. He has published in reputed journals and has authored books. He is part of the Intelligent Computing Research Centre at Sree
Vidyanikethan Engineering College, Tirupati. His areas of research include Software
Engineering and Deep Learning.
R. Balasubramanian is currently with the Department of Computer Science and
Engineering, Indian Institute of Technology, Roorkee, India. His areas of interest
include computer vision, image processing, machine learning and other allied areas.
Being an active researcher he is part of several reputed research labs such as SeSaMe
Research Centre, National University of Singapore, Singapore, School of Computer
Sciences, Universiti Sains Malaysia, Pulau Pinang, Malaysia, Faculty of Computing
and Informatics, Multimedia University, Cyberjaya Campus, Malaysia, and Depart-
ment of Computer Science, University at Albany-State University of New York
(SUNY), NY, USA. So far, he has published more than 130 international journal papers, 125 conference papers, seven book chapters, and a technical report. He is the
recipient of BOYSCAST fellowship (awarded by DST, India). He is also the recipient
of “Outstanding Teacher Award 2010” awarded by IIT Roorkee.
Prof. Shiuh-Jeng Wang is currently with the Department of Information Manage-
ment at Central Police University, Taoyuan, Taiwan, where he directs the Informa-
tion Cryptology and Construction Laboratory (ICCL). He was a recipient of the 5th
Acer Long-Tung Master Thesis Award and the 10th Acer Long-Tung Ph.D. Disser-
tation Award in 1991 and 1996, respectively. He served the editor-in-chief of the
journal of Communications of the CCISA in Taiwan from 2000 to 2006. He authored eight books: Information Security, Cryptography and Network Security, State of the Art on Internet Security and Digital Forensics, Eyes of Privacy–Information Security and Computer Forensics, Information Multimedia Security, Computer Forensics and Digital Evidence, Computer Forensics and Security Systems, and Computer and Network Security in Practice, published between 2003 and 2009.
Assoc. Prof. Richi Nayak is with the School of Electrical Engineering and Computer
Science, Queensland University of Technology, Brisbane, Australia. She is Head of
the Data Science Discipline in EECS. She is an internationally recognised expert in
data mining, text mining, and web intelligence. She has combined knowledge in these
areas very successfully with diverse disciplines such as Social Science, Science, and
Engineering for technology transfer to real-world problems to change their practices
and methodologies. Her particular research interests are machine learning, and in
recent years, she has concentrated her work on text mining, personalization, automa-
tion, and social network analysis. She has published high-quality conference and
journal articles and highly cited in her research field. She has received a number of
awards and nominations for teaching, research, and service activities.
Chapter 1
Prediction of Depression-Related Posts
in Instagram Social Media Platform
M. Harini and B. Sivakumar
Abstract Depression is a vital issue: it is a leading cause of disability worldwide and a major contributor to serious medical illness, which may even lead to suicide. Depression causes feelings of sadness and loss of interest in activities once enjoyed, and it seriously affects the way a person thinks, feels, and acts. Millions of people suffer from depression, but only a few of them undergo proper and adequate treatment. Since most of us are closely connected to social media today, we decided to explore depression-related behavior in the posts users make there. Depression is usually caused by a person's day-to-day activities, such as work, relationship issues, and studies, and it is a serious challenge in our everyday lives. Nowadays, people spend much time on social media forums, so detecting depression-related posts is important to avoid negative posts being shared in the community and to spread positivity. Determining depression levels and a person's negative responses is important because it reveals the underlying negativity; ML classifier techniques are used, and negative posts are determined automatically. The proposed system can also help children see only positive posts on social media forums.
1.1 Introduction
Nowadays, Instagram has become the most well-liked social media platform: its user interface attracts people to use it for a long time, there is no end to the content, and one can keep on scrolling. People keep sharing their day-to-day activities on the platform, and users share all kinds of emotions and activities in the form of images and texts. Depression is increasing vigorously and is often treated dismissively; it is a mental disorder marked by an unremitting feeling of hopelessness and worthlessness. It affects the way users think in their normal lives, and they often find it very difficult to interact with people.
M. Harini · B. Sivakumar (B)
Computer Science and Engineering Department SRMIST, Kattankulathur, India
e-mail: sivakumb2@srmist.edu.in
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
B. N. K. Rao et al. (eds.), Intelligent Computing and Applications, Smart Innovation,
Systems and Technologies 315, https://doi.org/10.1007/978-981-19-4162-7_1
We use sentiment analysis [15], a thriving topic that has been researched for a long time, with the goal of determining the character of a text and categorizing its polarity. In today's information age, a wealth of data is available for sentiment classification of both images and text via social networking sites [68]; however, using data from some social networking sites raises privacy concerns. Instagram is thus a good social networking site for gathering enough data while also avoiding conflicts with privacy laws.
For post classification, a variety of machine learning algorithms and natural language processing (NLP) approaches are applicable, including Naive Bayes, SVM, random forest, PCA, LIWC, and LDA. This work aims to use ML and NLP [2, 3] approaches on text data from social media in order to understand the emotions of users, with a particular focus on depression, together with feature extraction techniques for image classification. In this article, we largely deal with social media postings, which are brief messages with a character restriction. Users express their thoughts and feelings regarding current events in their lives and the world around them in this short form.
1.2 Literature Survey
A "A Computational Approach to Include Extraction for Recognizable Proof of Self-Destructive Ideation in Tweets" by T. Deepa and M. Kiran Mayee (2020) depicts the self-destructive, depressive behavior of a person as reflected in the tweets they post on Twitter, one of the trending social media platforms. The paper mainly focuses on linguistic inquiry and word count (LIWC) to procure the preferred outcomes [9].
B "Detection of Depression-Related Posts in Reddit Social Media Forum" by Michael M. Tadesse, Hongfei Lin, Bo Xu, and Liang Yang (2019) illustrates contemporary depression-related behavior in social media. The paper mainly focuses on NLP techniques such as N-grams, LIWC, and LDA to obtain the desired outputs.
C "Data Mining Approach to the Detection of Suicide in Social Media: A Case Study of Singapore" by Jane H. K. Seah and Kyong Jin Shim (2018) delineates the analysis of posts and comments related to depression and suicide in social media. The paper mainly focuses on using the LDA NLP technique to obtain results.
D "Predicting Depression Levels Using Social Media Posts" by Maryam Mohammed Aldarwish and Hafiz Farooq Ahmed (2017) presents the association between SNS users' activities and mental health illness in social media. The paper mainly focuses on classifying texts in different social media platforms.
E "Nonparametric Discovery of Online Mental Health-Related Communities" by Bo Dao, Thin Nguyen, Svetha Venkatesh, and Dinh Phung (2015) presents a distribution-free discovery of online mental health-related communities. The paper mainly focuses on classifying texts in different social media platforms.
1.3 Proposed Work
As depicted in Fig. 1.1, in the initial phase the dataset (or metadata) is obtained from the Instagram API and public repositories, and filters are applied while collecting the posts. The collected dataset contains seven different kinds of emotions and is divided into two categories: training and testing datasets. The training set is used to train and create the model, while the testing set is used to test the model and check the output data. Around 80% of the data is used for training, and the remaining 20% for testing. Figure 1.1 depicts the steps involved in emotion identification from images.
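The 80/20 split described above can be sketched as follows (a minimal illustration; the post identifiers, labels, and random seed are hypothetical, not taken from the chapter's dataset):

```python
import random

def train_test_split(samples, train_frac=0.8, seed=42):
    """Shuffle the collected posts and split them ~80/20 into train and test sets."""
    rng = random.Random(seed)      # fixed seed so the split is reproducible
    shuffled = samples[:]          # copy, so the original dataset order is untouched
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_frac)
    return shuffled[:cut], shuffled[cut:]

# Hypothetical labeled posts: (post_id, emotion_label) pairs over seven emotions.
posts = [(i, i % 7) for i in range(100)]
train, test = train_test_split(posts)
```

With 100 posts this yields 80 training and 20 testing samples, and no post appears in both sets.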
A Preprocessing: In preprocessing, the input image taken from the dataset is enhanced by magnifying it and removing the existing distortions; removing noise also improves its features for the next stage of processing. The details of the input image are preserved while the redundancy that causes noise is removed. Overall, this stage reads the input image and rescales it through filtration and normalization to remove noise. The rotated, uniformly sized image is then sent as input to the segmentation stage.
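A minimal sketch of the filtration-and-normalization step on a plain grayscale pixel grid (the 3 × 3 image and the target range are hypothetical; a real pipeline would operate on full CCTV or Instagram frames):

```python
def normalize(image, new_min=0.0, new_max=1.0):
    """Min-max normalize a grayscale image (a list of rows of pixel intensities)."""
    flat = [p for row in image for p in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1                      # avoid division by zero on flat images
    scale = (new_max - new_min) / span
    return [[new_min + (p - lo) * scale for p in row] for row in image]

def mean_filter(image):
    """3x3 mean filter to suppress noise; border pixels are left unchanged."""
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = sum(image[y + dy][x + dx]
                            for dy in (-1, 0, 1) for dx in (-1, 0, 1)) / 9.0
    return out

img = [[10, 10, 10], [10, 250, 10], [10, 10, 10]]   # one bright noisy pixel
smoothed = mean_filter(img)
norm = normalize(smoothed)
```

The mean filter pulls the noisy center pixel toward its neighborhood average, and the normalization rescales the result into a uniform intensity range for the next stage.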
B Segmentation: Segmentation partitions the digital image received from the preprocessing stage into multiple fragments. The input image is divided into compatible, analogous, and coherent regions corresponding to the different entities in the image, on the basis of disposition, boundary, and potency. The image obtained from the segmentation stage is taken as input to the feature extraction stage.

Fig. 1.1 Workflow of proposed method for image
C Feature extraction: In this stage, the variable dimensions of the input image are compressed. For the objective of facial emotion recognition, in order to achieve real-time performance and reduce time complexity, only the eye region and the mouth region of the image are considered. The combination of these two features is sufficient to identify the emotion accurately. To obtain this combination, the prominent points of the characteristic regions of the image are located using a point detection algorithm (PDA).
In a similar way, preprocessing of text is the first step applied to the input text dataset. This stage includes removal of hashtags, taggings (mentions), stop words, and redundant words from the input text.
The output of the preprocessing stage is then subjected to feature extraction. Feature extraction is used to convert the text into a matrix or vector form. There are many ways to do feature extraction; here we use a bag of words. This representation can be combined with various machine learning algorithms and extracts the data in many ways. It is a simple way to transform tokens into features: it does not care about the order of the words, only about which words from the vocabulary are present (and how often). The output of this stage is then fed to the RNN classifier, which determines the categorization of emotion the text originally belongs to. Figure 1.2 depicts the series of steps involved in detecting the type of emotion from texts.
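The text preprocessing and bag-of-words steps above can be sketched as follows; the stop-word list and the sample post are illustrative assumptions, not the actual lists used in this work:

```python
import re
from collections import Counter

STOP_WORDS = {"the", "a", "an", "is", "am", "so", "i"}  # illustrative subset

def preprocess_text(post):
    """Remove hashtags, @-mentions (taggings), and stop words from a raw post."""
    post = re.sub(r"[#@]\w+", " ", post.lower())   # strip hashtags and taggings
    tokens = re.findall(r"[a-z']+", post)
    return [t for t in tokens if t not in STOP_WORDS]

def bag_of_words(tokens, vocabulary):
    """Map a token list onto a fixed vocabulary as a count vector."""
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]

post = "I am so sad today #depressed @friend sad and alone"
tokens = preprocess_text(post)
vocab = sorted(set(tokens))
vector = bag_of_words(tokens, vocab)
print(vocab)   # ['alone', 'and', 'sad', 'today']
print(vector)  # [1, 1, 2, 1]
```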
Fig. 1.2 Workflow of proposed method for text
1.4 Implementation
A Eye extraction: The eye region of the image is the most crucial segment for recognition due to the visible white space around the iris. The region containing strong vertical edges is to be detected. Therefore, the Sobel filter (edge detector) is applied to the input image to detect the prominent points of vertical edges. It evaluates the gradient in both the x-direction and y-direction in order to improve accuracy.
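The Sobel gradient computation described above can be sketched as below; the synthetic test image with a single vertical edge is an assumption made for illustration:

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T

def filter2d(image, kernel):
    """Valid-mode 2-D filtering (cross-correlation) with a 3x3 kernel."""
    h, w = image.shape
    out = np.zeros((h - 2, w - 2))
    for y in range(h - 2):
        for x in range(w - 2):
            out[y, x] = np.sum(image[y:y + 3, x:x + 3] * kernel)
    return out

def sobel_gradients(image):
    """Gradients in the x- and y-directions, as used for vertical-edge detection."""
    return filter2d(image, SOBEL_X), filter2d(image, SOBEL_Y)

# Synthetic image with a sharp vertical edge down the middle.
img = np.zeros((5, 6))
img[:, 3:] = 1.0
gx, gy = sobel_gradients(img)
print(np.abs(gx).max() > np.abs(gy).max())  # vertical edge -> strong x-gradient
```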
B Eyebrow extraction: The eyebrow region of the input image needs to be detected and precisely segmented in order to perform eyebrow region analysis. The two elliptical regions of characteristic boundaries in the image that lie directly above each eye fragment are selected as the eyebrow regions. The boundary fragments of these two regions are extracted for further filtering. Sobel detection is again employed to obtain the edge image, as it can recognize more boundaries than Roberts' method. These extracted edge images are then dilated, and the holes are filled. The resulting boundary images are used in filtering the eyebrow portions.
C Mouth extraction: The mouth region of the input image needs to be detected and strictly segmented, and its features extracted, in order to perform mouth region analysis. The top (upper jaw), bottom (lower jaw), rightmost, and leftmost positions of the mouth are fetched, and the centroid of the mouth is computed. Graphs of the classification accuracy (Y-axis: accuracy in percentage) and of the computational time (Y-axis: time in hours) required for the Xception model in CNN are depicted below.
D RNN classifier: Classification tells us which category the text comes under [10–13]; here we use an RNN classifier for both the text [14, 15] and the emotion of user posts. The RNN classifier works in three stages: in the first stage, it moves through the hidden layer and makes its prediction; in the second stage, it compares the prediction with the data present; and in the final stage, it calculates the gradient for each node. This model is used because each layer provides short-term memory, which lets us predict the next step easily; it is mainly used in sentiment analysis, speech tagging, etc. Figure 1.3 shows the amount of each emotion expressed in the image, and Fig. 1.4 shows the emotion detected from the image along with the proportion of other emotions that are possible.
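The forward pass of such an RNN (the first two stages; the gradient computation of the third stage is omitted) can be sketched as follows. The layer sizes, random weights, and three-class output are illustrative assumptions, not the configuration used in this work:

```python
import numpy as np

def rnn_forward(inputs, Wxh, Whh, Why, bh, by):
    """Run a simple RNN over a sequence: each step updates a hidden state
    (short-term memory); the final state is projected onto the classes."""
    h = np.zeros(Whh.shape[0])
    for x in inputs:                       # stage 1: move through hidden layer
        h = np.tanh(Wxh @ x + Whh @ h + bh)
    logits = Why @ h + by                  # stage 2: score each emotion class
    exp = np.exp(logits - logits.max())    # numerically stable softmax
    return exp / exp.sum()

rng = np.random.default_rng(0)
n_in, n_hidden, n_classes = 4, 8, 3        # hypothetical sizes
params = (rng.standard_normal((n_hidden, n_in)) * 0.1,
          rng.standard_normal((n_hidden, n_hidden)) * 0.1,
          rng.standard_normal((n_classes, n_hidden)) * 0.1,
          np.zeros(n_hidden), np.zeros(n_classes))
sequence = [rng.standard_normal(n_in) for _ in range(5)]
probs = rnn_forward(sequence, *params)
print(probs.shape, abs(probs.sum() - 1.0) < 1e-9)
```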
1.5 Results and Discussions
The main objective and outcome of our work is to analyze the emotions of users from the kind of activities they do, using posts extracted from the Instagram forum, by applying the PDA (CNN) classification technique to the image dataset and the RNN classifier to the text dataset [3, 15–17]. The image emotion detected in posts is classified into seven categories, and the text emotion is classified into three categories of nature.
Fig. 1.3 Illustration of proportion of each emotion in Fig. 1.4 using CNN
Fig. 1.4 Emotion depicted in image of Instagram post using PDA
References
1. Goel, A., Gautam, J., & Kumar, S. (2016). Real time sentiment analysis of tweets using
Naive Bayes. In 2nd International Conference on Next Generation Computing Technologies,
Dehradun.
2. Praveen, P., Sudheer, P., & Sudheer Kumar, K. (2018). Public sentiment analysis on movie
reviews. In Computer Science and Engineering Department.
3. Bouazizi, M., & Ohtsuki, T. (2016). Sentiment analysis in Twitter: From classification to
quantification of sentiments within tweets. In Proceedings IEEE GLOBECOM (pp. 1–6).
4. Gao, W., & Sebastiani, F. (2015). Tweet sentiment: From classification to quantification. In
Qatar Computing Research Institute Hamad bin Khalifa University PO Box 5825, Doha, Qatar
IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining.
5. Hadeghatta, U. R. (2013). Sentiment analysis of Hollywood movies on Twitter. In Proceedings
of the IEEE/ACM ASONAM (pp. 1401-1).
6. Kumar, A., Sharma, A., & Arora, A. (2019). Anxious depression prediction in real-time
social data. In International Conference on Advances in Engineering Science Management &
Technology (ICAESMT)-2019, Uttaranchal University, Dehradun, India.
1 Prediction of Depression-Related Posts 7
7. Smys, S., & Raj, J. S. (2021). Analysis of deep learning techniques for early detection of
depression on social media network—A comparative study. Journal of trends in Computer
Science and Smart technology (TCSST), 3(01), 24–39.
8. Hossain, M. T., Talukder, M. A. R., & Jahan, N. (2021). Social networking sites data analysis
using NLP and ML to predict depression. In 12th International Conference on Computing
Communication and Networking Technologies (ICCCNT) (pp. 1–5). IEEE.
9. Chiu, C. Y., Lane, H. Y., Koh, J. L., & Chen, A. L. (2021). Multimodal depression detection
on Instagram considering time interval of posts. Journal of Intelligent Information Systems,
56(1), 25–47.
10. Nanomi Arachchige, I. A., Sandanapitchai, P., & Weerasinghe, R. (2021). Investigating machine learning and natural language processing techniques applied for predicting depression disorder from online support forums: A systematic literature review. Information, 12(11), 444.
11. Bouazizi, M., & Ohtsuki, T. (2017). A pattern-based approach for multi-class sentiment analysis
in Twitter. In Proceedings of the IEEE Access (pp. 20617–20639).
12. Mohammad, S. M., & Kiritchenko, S. (2015). Using hashtags to capture fine emotion categories
from tweets. In Computational Intelligence (Vol. 31, No. 2, pp. 301–326).
13. Plank, B., & Hovy, D. (2015). Personality traits on Twitter or how to get 1500 personality tests
in a week. In Proceedings of the 6th Workshop on Computational Approaches to Subjectivity,
Sentiment and Social Media Analysis (pp. 92–98).
14. Agarwal, A., Xie, B., Vovsha, I., Rambow, O., & Passonneau, R. (2016). Sentiment analysis
of Twitter data. In Department of Computer Science Columbia University New York.
15. Ferragina, P., Piccinno, F., & Santoro, R. (2016). On analyzing hashtags in Twitter. In Dipartimento di Informatica, University of Pisa, Proceedings of the Ninth International AAAI Conference on Web and Social Media.
16. Garimella, A., & Mihalcea, R. (2016). Zooming in on gender differences in social media. In
Proceedings of the Workshop on Computational Modeling of People’s Opinions, Personality,
and Emotions in Social Media.
17. Bamman, D., & Smith, N. A. (2015). Contextualized sarcasm detection on Twitter. In
Proceedings of the 9th International AAAI Conference on Web and Social Media Citeseer
(pp. 574–577).
Chapter 2
Classification of Credit Card Frauds
Using Autoencoded Features
Kerenalli Sudarshana, C. MylaraReddy, and Zameer Ahmed Adhoni
Abstract With the advent of online payment and business systems, safety of trans-
actions has become an essential factor. In 2020, Internet Crime Complaint Center
(IC3) had received 2,211,396 complaints culminating in a $13.3 billion loss which
is approximately equal to cumulative loss from 2016–19. Credit card fraud detection
is hindered by concept drift, uneven class distribution, identifying a smaller set of
features, and detecting real-time frauds. To overcome the above issues, in this article,
we propose an autoencoder-based classification scheme to extract the characteristics
such as credit card type, credit grade, credit line, book balance, and other features
from a European credit card dataset. Also, the performance of different machine learning algorithms is compared for classification consistency using the encoded features. The results obtained are as follows: 99.95% accuracy, 97.45% precision, 88.26% recall, and 92.36% F1-score.
2.1 Introduction
With the introduction of online payment systems and the migration of businesses to
the Internet, a secure cyber-transaction has become a critical component of payment
security. In 2020, the Internet Crime Complaint Center (IC3) published the number
of complaints and its supporting data (Fig. 2.1). IC3 received 2,211,396 complaints
within that period, resulting in a $13.3 billion loss [4]. A stolen, misplaced, or cloned
credit card can result in fraud. Furthermore, the rise of online purchasing has escalated
the number of cases of card-not-present fraud or the use of a credit card number in an
e-commerce transaction.
K. Sudarshana (B) · C. MylaraReddy · Z. A. Adhoni
Department of CSE, School of Technology, GITAM, Bengaluru, India
e-mail: kerenalli@gmail.com
C. MylaraReddy
e-mail: mchinnai@gitam.edu
Z. A. Adhoni
URL: https://www.gitam.edu
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
B. N. K. Rao et al. (eds.), Intelligent Computing and Applications, Smart Innovation, Systems and Technologies 315, https://doi.org/10.1007/978-981-19-4162-7_2
Fig. 2.1 Top five cybercrimes and corresponding losses from 2016–20. Source https://www.fbi.gov
Application fraud, account takeover fraud, social engineering fraud, and skimming fraud are some of the most frequent categories of payment
card fraud. The number of relevant research initiatives in the literature has increased
because of the significant socioeconomic implications of recognizing credit card
frauds. Recent approaches for identifying fraudulent credit card transactions include
classification [3,14,17], clustering [8], anomaly detection, and association analy-
sis [13]. Credit card fraud detection is challenged by concept drift, imbalanced class
distribution, discovering a smaller collection of attributes, and detecting real-time
frauds [1]. New fraud issues are also arising as mobile devices become more widely
utilized [6]. Finding a representative set of attributes for a classification task is diffi-
cult. Some of the most common ways for reducing the attribute set include sampling,
aggregation, linear algebra-based principal component analysis, and discrete wavelet
analysis.
In this proposed method, a nonlinear principal component analysis methodology, the “autoencoder,” is used to extract representative features by decreasing the number of
attributes. The proposed approach uses the European credit card dataset to explore the
effectiveness of multiple classifiers that use this set of characteristics. The classifier’s
accuracy, precision, recall, and F1-score assess its performance.
The paper is organized as follows: In Sect. 2.2, literature survey is given.
Section 2.3 outlines the autoencoder mathematical description. Section 2.4 provides a
brief description of the proposed technique. In Sect. 2.5, the outcomes are examined,
and a brief report on the experimentation results is provided. Finally, in Sect. 2.6,we
conclude the article with a scope for further investigation.
2 Classification of Credit Card Frauds Using Autoencoded Features 11
Table 2.1 Performance summary of recent methods
Author | Year | Method | Algorithms | Results
Sudha and Akila [14] | 2021 | MVE | WMSP, RF, SVM | F1-score (87%)
Visalakshi and others [16] | 2021 | Classification | RF, SVM, GNBL | Accuracy (99.78%)
Janvitha and others [8] | 2021 | Clustering | HMM | AUC (RP = 69%)
Asha and Suresh [2] | 2021 | Classification | SVM, k-NN, ANN | F1-score (78.58%)
Jaiswal and others [7] | 2021 | IFLOF analysis | IFLOF | Accuracy (97%)
Taha and Malebary [15] | 2020 | Classification | OLGBM | F1-score (56.95%)
Zhang and others [18] | 2019 | Deep learning | HOBA | F1-score (48.80%)
Puh and Brkić [12] | 2019 | Classification | LR, SVM, RF | Accuracy (91.07%)
Navanshu and others [11] | 2018 | Data mining | LR, DT, SVM, RF | Accuracy (98.6%)
2.2 Literature Review
This section examines several notable studies for credit card fraud detection. The
authors have used several computational methods for detecting such a fraud trans-
actions in their respective work. The number of relevant research initiatives in the
literature has increased because of the significant socioeconomic implications of
recognizing credit card frauds.
Homogeneity-oriented behavior analysis (HOBA) and deep learning were employed in [18]. SMOTE was used to balance the European cardholders dataset, and different machine learning approaches were applied in [12].
A generative adversarial network (GAN) model was used to produce fake instances and address the imbalanced-class dataset problem [5]. The goal was to increase the effectiveness of fraud detection methods. An undercomplete autoencoder network was presented using a supervised learning technique [9] for dimensionality reduction and fraud detection. Table 2.1 presents the various approaches that have been used recently and the outcomes acquired.
2.3 Autoencoders
In this section, we discuss the autoencoders and their mathematical model.
2.3.1 Basic Architecture
Autoencoder is an unsupervised, nonlinear approach for detecting and eliminating
nonlinear correlations in data. The autoencoder is used to reduce dimensionality by
Fig. 2.2 Autoencoder architecture. Source www.wikipedia.org
removing redundant data. A feed-forward, non-recurrent architecture is the simplest architecture of an autoencoder. It consists of two components, encoder and decoder, as shown in Fig. 2.2. The number of neurons in the input and output layers remains equal.
2.3.2 Mathematical Model for Autoencoder
Let X and F represent the input space and the feature space. Let φ and ψ be the encoder and decoder functions, such that:
φ : X → F
and
ψ : F → X
The input x of set X is mapped to a latent vector h of set F as in Eq. 2.1, where σ is an activation function, W is the set of weights, and b is the set of biases:
h = σ(Wx + b)    (2.1)
The decoder network then constructs the approximation x′ of x as given in Eq. 2.2, where σ′ is an activation function, W′ is the set of weights, and b′ is the set of biases:
x′ = σ′(W′h + b′)    (2.2)
The overall network then tries to minimize the reconstruction loss L(x, x′) as given in Eq. 2.3:
L(x, x′) = ||x − x′||²    (2.3)
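Equations 2.1–2.3 can be sketched numerically as below; the sigmoid activation, the random weights, and the 30-to-15 layer sizes (matching the feature-extraction phase) are assumptions for illustration:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def encode(x, W, b):
    """Eq. 2.1: h = sigma(W x + b)."""
    return sigmoid(W @ x + b)

def decode(h, W_p, b_p):
    """Eq. 2.2: x' = sigma'(W' h + b')."""
    return sigmoid(W_p @ h + b_p)

def reconstruction_loss(x, x_prime):
    """Eq. 2.3: L(x, x') = ||x - x'||^2."""
    return float(np.sum((x - x_prime) ** 2))

rng = np.random.default_rng(1)
n_in, n_latent = 30, 15                      # 30 attributes -> 15 latent features
W, b = rng.standard_normal((n_latent, n_in)) * 0.1, np.zeros(n_latent)
W_p, b_p = rng.standard_normal((n_in, n_latent)) * 0.1, np.zeros(n_in)

x = rng.standard_normal(n_in)
h = encode(x, W, b)
loss = reconstruction_loss(x, decode(h, W_p, b_p))
print(h.shape, loss >= 0.0)
```

Training would adjust W, b, W′, b′ by gradient descent to minimize this loss over the dataset.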
Fig. 2.3 Model for feature extraction
Fig. 2.4 Model for classification phase
2.4 Proposed Method
The proposed method consists of the following steps:
1. Feature Extraction Phase
2. Classification Phase.
2.4.1 Feature Extraction Phase
The model's architecture for feature extraction has an input layer, an encoder, a bottleneck, a decoder, and an output layer. Each input record x has 30 features; hence, the input layer has 30 neurons. They are connected to the neurons in the encoder. The encoder and decoder each comprise three layers of a feedforward network. The output of each layer is forwarded to the next layer. The output of the last encoder layer is input to the neurons of the bottleneck layer, whose function is to reduce the total number of attributes to 15. These attributes are used as input to the decoder. The decoder network reconstructs the approximation x′ of the original input. This is depicted in Fig. 2.3.
2.4.2 Classification Phase
In the proposed method, the latent feature vector represents the encoded version of
the input vector. The small set of fraud class instances are encoded efficiently by the
encoder network during the training phase. Thereby, it eliminates the skewed data
distribution and feature engineering problems. Once the network learns the better
approximation, decoder is discarded, and the classifier model is connected to the end
of the bottleneck layer. It is as shown in Fig. 2.4.
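The classification phase, in which the decoder is discarded and a classifier is trained on the latent features, can be sketched as follows. A simple logistic-regression head and synthetic latent features stand in for the actual classifier and encoder output; all names and values here are illustrative:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_on_encoded(H, y, lr=0.5, epochs=500, seed=0):
    """Train a logistic-regression head on latent features H (decoder discarded)."""
    rng = np.random.default_rng(seed)
    w, b = rng.standard_normal(H.shape[1]) * 0.01, 0.0
    for _ in range(epochs):
        p = sigmoid(H @ w + b)
        grad = p - y                       # gradient of the log loss
        w -= lr * (H.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b

# Hypothetical 15-dimensional latent features: fraud class shifted from genuine.
rng = np.random.default_rng(0)
H = np.vstack([rng.normal(0, 1, (50, 15)), rng.normal(2, 1, (50, 15))])
y = np.array([0] * 50 + [1] * 50)
w, b = train_on_encoded(H, y)
preds = (sigmoid(H @ w + b) > 0.5).astype(int)
print((preds == y).mean() > 0.9)
```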
2.5 Results and Discussion
This section discusses the dataset used, experimental setup, evaluation metrics con-
sidered, and a brief description of the experimental results.
2.5.1 Dataset
Credit card features [1] are used to figure out account holders’ buying habits, which
are strongly linked to their attributes, such as income and age. The details on credit
card purchases can be found at [10]. There are 284,807 instances; 284,315 are valid
transactions and 492 fraudulent. Each record has 30 attributes. Fraudulent transac-
tions are negative, and legal ones are positive [9].
2.5.2 Experimental Setup
An Intel Core i3-7000 processor running at 3.90 GHz with 8 GB of RAM was used to execute the proposed credit card fraud detection experiment. The tests were carried out on a 64-bit Windows operating system. The TensorFlow
framework was used to create the autoencoder method, and the sci-kit learn package
was used to implement and evaluate the machine learning approaches.
2.5.3 Evaluation Metrics
Let T be the number of examples, TP the number of true positives, TN the number of true negatives, FP the number of false positives, and FN the number of false negatives. Following are some of the assessment measures evaluated in this study:
Accuracy = Acc. = (TP + TN) / (TP + TN + FP + FN)    (2.4)
Precision = Pre. = TP / (TP + FP)    (2.5)
Recall = Rec. = TP / (TP + FN)    (2.6)
F1-score = F1 = (2 × Precision × Recall) / (Precision + Recall) = 2TP / (2TP + FP + FN)    (2.7)
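These measures can be computed directly from the confusion-matrix counts; the counts below are hypothetical, chosen only to mimic a heavily imbalanced fraud dataset:

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, precision, recall, and F1-score (Eqs. 2.4-2.7)."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * tp / (2 * tp + fp + fn)   # equal to 2PR / (P + R)
    return accuracy, precision, recall, f1

# Hypothetical counts: 100 fraud cases among 10,000 transactions.
acc, pre, rec, f1 = classification_metrics(tp=90, tn=9890, fp=10, fn=10)
print(round(acc, 4), round(pre, 2), round(rec, 2), round(f1, 2))
```

Note how accuracy stays near 1 even for mediocre fraud detectors on such imbalanced data, which is why precision, recall, and F1-score are reported alongside it.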
Table 2.2 Performance of proposed method against the existing techniques
Classifier | Precision | Recall | F1-score | Accuracy
ANN [2] | 81.13 | 76.19 | NA | 99.92
HOBA [18] | 36.18 | 75.00 | 48.80 | 96.51
OLGBM [15] | 97.34 | 40.59 | 56.95 | 98.40
MVE method [14] | 92.50 | 81.50 | 87.00 | 98.50
Proposed method | 97.45 | 88.26 | 92.36 | 99.95
Fig. 2.5 Comparison of different ML algorithms
2.5.4 Results and Discussion
Our experiments yielded the findings shown in Table 2.2. Random forest and CatBoost outperformed the other algorithms, while logistic regression (LR) and light gradient boosting machines (LGBM) performed the worst. Out of 98 fraud transactions, the random forest algorithm correctly classified the fraud samples 88.26% of the time; missed (false-negative) predictions accounted for just 11.64% of the fraudulent transactions. Furthermore, the precision of 97.45% indicates that just 2.55% of genuine transactions were misclassified as fraud. However, the noticeable drawback of this approach is that it required a lot of computational resources and time to execute.
The CatBoost algorithm correctly classified the true class examples 99.95% of the time, and the fraud cases were correctly classified 86.20% of the time; missed fraud cases accounted for just 13.80% of the fraudulent transactions. Furthermore, the precision of 97.70% indicates that just 2.30% of genuine transactions were incorrectly labeled as fraud (Fig. 2.5).
When the proposed method's outcomes were compared to the benchmark results, we achieved a precision of 97.45%, recall of 88.26%, F1-score of 92.36%, and accuracy of 99.95%. It has 2.69 times improved specificity, 84.94% higher recall, 1.89 times stronger F1-score, and 96.55% better accuracy than the HOBA technique [18].
2.6 Conclusion
Increased credit card usage demands the detection of credit card fraud. Due to the
technological complexity and continued financial and commercial losses, developing
an efficient system for identifying fraudulent credit card transactions is required. This
paper proposes an effective technique for detecting fraud in credit card transactions
by using an autoencoder for feature selection. The proposed method’s performance
was determined by comparing it to other findings and state-of-the-art techniques.
The proposed strategy outperformed the other approaches in accuracy, precision, and F1-score according to the experiments. The results highlight the importance of a
practical representative feature in strengthening the prediction performance of the
proposed strategy.
References
1. Abdallah, A., Maarof, M. A., & Zainal, A. (2016). Fraud detection system: A survey. Journal
of Network and Computer Applications, 68, 90–113.
2. Asha, R., & Suresh Kumar, K. R. (2021). Credit card fraud detection using artificial neural
network. Global Transitions Proceedings, 2(1), 35–41.
3. Dhankhad, S., Mohammed, E., & Far, B. (2018). Supervised machine learning algorithms for
credit card fraudulent transaction detection: A comparative study. In 2018 IEEE International
Conference on Information Reuse and Integration (IRI) (pp. 122–125). IEEE.
4. FBI. FBI releases the internet crime complaint center 2020 internet crime report, including
covid-19 scam statistics. https://www.fbi.gov/news/pressrel/pressreleases/fbireleasestheintern
etcrimecomplaintcenter2020internetcrimereportincludingcovid19scamstatistics
5. Fiore, U., De Santis, A., Perla, F., Zanetti, P., & Palmieri, F. (2019). Using generative adversarial
networks for improving classification effectiveness in credit card fraud detection. Information
Sciences, 479, 448–455.
6. Gullapalli, V., & Sireeshkumar Kalli, A. V. Cards and payments. https://www.capgemini.com/
in-en/service/cards-and-payments/
7. Jaiswal, S., Brindha, R., & Lakhotia, S. (2021). Credit card fraud detection using isolation
forest and local outlier factor. Annals of the Romanian Society for Cell Biology, 4391–4396.
8. Janvitha, K., Vasavi, C. R. S., Sruthi, A., Praharshitha, K., & Anguraj, D.K. (2021). Survey
on detection of credit card frauds using HMM and various clustering approaches. In 2021
6th International Conference on Inventive Computation Technologies (ICICT) (pp. 101–107).
IEEE.
9. Misra, S., Thakur, S., Ghosh, M., & Saha, S. K. (2020). An autoencoder based model for
detecting fraudulent credit card transaction. Procedia Computer Science, 167, 254–262.
10. mlg ulb: Credit card fraud detection. https://www.kaggle.com/mlg-ulb/creditcardfraud
11. NavanshuKhare, S. Y. S. (2018). Credit card fraud detection using machine learning techniques.
International Journal of Pure and Applied Mathematics, 557, 825–837.
12. Puh, M., & Brkić, L. (2019). Detecting credit card fraud using selected machine learning algorithms. In 2019 42nd International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO) (pp. 1250–1255). IEEE.
13. Seeja, K., & Zareapoor, M. (2014). Fraudminer: A novel credit card fraud detection model
based on frequent itemset mining. The Scientific World Journal, 2014.
14. Sudha, C., & Akila, D. (2021). Majority vote ensemble classifier for accurate detection of credit
card frauds. Materials Today: Proceedings.
15. Taha, A. A., & Malebary, S. J. (2020). An intelligent approach to credit card fraud detection
using an optimized light gradient boosting machine. IEEE Access, 8, 25579–25587.
16. Visalakshi, P., Madhuvani, K., et al. (2021). Detecting credit card frauds using different machine learning algorithms. Annals of the Romanian Society for Cell Biology, 4681–4688.
17. Xuan, S., Liu, G., Li, Z., Zheng, L., Wang, S., & Jiang, C. (2018). Random forest for credit
card fraud detection. In: 2018 IEEE 15th International Conference on Networking, Sensing
and Control (ICNSC) (pp. 1–6). IEEE.
18. Zhang, X., Han, Y., Xu, W., & Wang, Q. (2021). HOBA: A novel feature engineering methodology
for credit card fraud detection with a deep learning architecture. Information Sciences, 557,
302–316.
Chapter 3
BIVFN: Blockchain-Enabled Intelligent
Vehicular Fog Networks
Priyanka Gaba and Ram Shringar Raw
Abstract Vehicular ad-hoc network (VANET) is a network that facilitates inter-vehicle communication regarding road safety or entertainment. The popularity of VANET is due to its increasing demand and its blend with the latest technologies like cloud, IoT, AI, and ML. The growth in vehicles on the road and in their communication leads to the need to use fog rather than cloud to provide low latency to VANET, a real-time system. Fog computing together with VANET is termed a vehicular fog network (VFN); although it provides various advantages, it can also compromise the network's security. To improve the security and privacy of the VANET system, blockchain, an immutable, peer-to-peer, decentralized, and distributed ledger-based technology, seems suitable for making VFN a safer and more reliable system. The combination of fog computing and blockchain technology with VANET is termed a blockchain-enabled intelligent vehicular fog network (BIVFN). This paper discusses the VFN, and the architecture and phases of BIVFN are explored in detail. BIVFN can attain the security and privacy requirements well because of blockchain and fog computing.
3.1 Introduction
Vehicular ad-hoc network (VANET) is a special type of mobile ad-hoc network
(MANET), which is a self-organized wireless communication network that lets the
vehicles transmit information between vehicles and road-side units (RSU) in real
time [1]. VANET provides value-added services and a well-organized road to create a
more efficient and safe traffic environment for vehicles [2]. The connections between
P. Gaba (B)
Department of Computer Science and Engineering, School of Information, Communication and Technology, Guru Gobind Singh Indraprastha University, Dwarka, New Delhi, India
e-mail: priyanka.gaba2202@gmail.com
R. S. Raw
Department of Computer Science and Engineering, Netaji Subhas University of Technology, East Campus, Delhi, India
e-mail: rsrao@aiactr.ac.in
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
B. N. K. Rao et al. (eds.), Intelligent Computing and Applications, Smart Innovation, Systems and Technologies 315, https://doi.org/10.1007/978-981-19-4162-7_3
vehicle and infrastructure are utilized for communicating vehicles regarding road
conditions for safety-related and non-safety-related applications. Connected vehicles
provide a platform for cloud or fog computing to share safe and secure information
and evolve smart transportation systems for future VANET. Privacy and security are
the key challenges of connected vehicles in VANET. Any person connected with a vehicle, like a car owner, mechanic, or any official personnel involved, could breach vehicle data security and cause harm easily. The possible security threats caused by attackers concern the security of devices and linkages, data validation, access control, and the data privacy of drivers and vehicles [3]. Therefore, developing security and privacy solutions for connected vehicles in VANET is a highly challenging task.
Security of the VANET with connected vehicles includes the protection of safety
control and infotainment systems, hardware security, software maintenance, inter-
facing security, pedestrian communications with vehicles, and RSU through smart-
phones, etc. We focus here on securing the connected vehicle platforms, where we identify a novel method by which all stakeholders come together and communicate with each other about threats, hacking, and attacks [4]. Researchers in this field have done much research; still, some issues exist in the current system.
Cloud computing helps in providing services like storage, infrastructure, and better
computing power to connected vehicles and charging them as per their requirements
[5]. Cloud computing suffers from various threats like data breaches, data loss, weak
identity, access management, denial of service, and many more. Fog computing
brings the functionality of cloud computing at the edge of the network by utilizing
the devices capable of offering its features as a fog device to the required vehicle
[6]. Thus, the network integrated with vehicles, RSU, and fog devices is known as
vehicular fog network (VFN).
One significant problem with VFN is that data is stored on a single centralized
cloud server, which causes major security issues; if one entity is hacked, the whole
system gets compromised. Blockchain is a decentralized and distributed technology,
removes this drawback of one entity being hacked. It deals with many connected vehi-
cles involved in the process of writing and verifying transactional data and carrying
all verified transactions in a block [7].
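The idea of carrying verified transactions in hash-linked blocks can be sketched as follows; the block fields and transaction contents here are illustrative assumptions, not the actual BIVFN block format:

```python
import hashlib
import json
import time

def make_block(transactions, prev_hash):
    """A minimal block: verified transactions plus a hash link to the
    previous block, so that tampering with any block breaks the chain."""
    block = {
        "timestamp": time.time(),
        "transactions": transactions,
        "prev_hash": prev_hash,
    }
    # Hash the block contents deterministically (sorted keys) with SHA-256.
    block["hash"] = hashlib.sha256(
        json.dumps(block, sort_keys=True).encode()).hexdigest()
    return block

# Hypothetical vehicular events recorded as transactions.
genesis = make_block([{"event": "registration", "vehicle": "V1"}], "0" * 64)
nxt = make_block([{"event": "accident-report", "vehicle": "V2"}], genesis["hash"])
print(nxt["prev_hash"] == genesis["hash"])  # blocks are chained by hash
```

Because each block embeds the previous block's hash, altering any stored transaction would change that block's hash and invalidate every later block, which is the immutability property the paper relies on.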
The paper’s organization is as follows: Sect. 3.2 introduces the vehicular fog
network. The complete architecture and phases of BIVFN are discussed in Sect. 3.3.
Section 3.4 discusses the security requirements attainments by BIVFN and also its
challenges. Section 3.5 summarizes the paper.
3.2 Vehicular Fog Networking
The term fog computing was coined by Cisco. Unlike the cloud, a centralized server, fog provides decentralized, distributed computing features at the network's edge. Using this feature, fog offers a better solution to the limitations of cloud computing [8]. Fog functionality can be provided by any device, known as a fog device, that is capable enough to share resources on rent. Fog computing is
Fig. 3.1 Vehicular fog networking (VFN)
best applicable for those applications which are time-sensitive and require a quick
response.
IoV is one of the substantial applications of fog computing, and this integration is known as the vehicular fog network (VFN) [9]. VFN gives the advantages of low latency, less network bandwidth requirement, security, and more reliability, as vehicles need not communicate data to the cloud. Apart from this, fog is also involved in segregating data, forwarding it, or making real-time decisions for vehicular communication [10]. Instead of sending complete data to the cloud, fog sends only the data required for future analysis. VFN deals with the mobility management of vehicles between a number of fog servers to maintain service quality and provide essential solutions in the network.
Components of VFN, with their functionalities and connections, are shown in Fig. 3.1. Various authors have proposed architectures, algorithms, and ideas for VFN to make the system efficient. VFN still faces a security challenge, which could be handled by applying blockchain to it.
3.3 BIVFN: Blockchain-Enabled Intelligent Vehicular Fog
Network for Connected Vehicles
This section presents BIVFN, a novel vehicular network architecture that combines a blockchain security framework with vehicular fog computing. The vehicular network architecture combined with fog computing, referred to as VFN, provides the features of cloud computing at the edge of the network, making blockchain security transactions faster. To make VFN more secure, the blockchain concept is applied to it, storing the reward points and trustworthiness of vehicles in the traffic environment. Together with fog computing, blockchain can resolve the major security concerns in the IoV environment. We have therefore integrated blockchain concepts with VFN and propose a new network called the blockchain-enabled intelligent vehicular fog network for connected vehicles (BIVFN).
22 P. Gaba and R. S. Raw
3.3.1 Architecture
The architecture of BIVFN comprises four layers: the VANET layer, the fog layer, the blockchain layer, and the cloud layer, as shown in Fig. 3.2. Each layer has its distinct components and roles. The VANET layer consists of vehicles on the road, which are responsible for inter-vehicle communication. Each vehicle can take part in the system in the following ways. First, every new vehicle must register with the system to obtain certificates and keys; this registration helps the system identify and track each vehicle. Second, a vehicle can initiate an event that happened on the road and report it to the system. Third, for any initiated event, nearby vehicles act as validating nodes. Fourth, vehicles receive messages about validated events that have just occurred near their location and may impact their journey.
The fog layer contains fog devices that are responsible for the following functions. First, fog provides cloud features such as IaaS, SaaS, and PaaS to vehicles on the road that need them [11]. Second, fog devices are involved in the registration and authentication of vehicles to assist the blockchain network. Third, a fog node also performs the role of an RSU by carrying out the initial verification of a vehicle's identity and location. Fourth, fog devices identify events near vehicles and provide them to the blockchain network so that the nearby vehicles can be utilized as validating nodes. Fifth, fog devices also act as nodes of the blockchain network, each keeping a replica of the ledger.

Fig. 3.2 Architecture of BIVFN (cloud layer, blockchain nodes 1–5, and the chain of blocks n − 1, n, n + 1)
The blockchain layer consists of all the components that are involved in the
complete functioning of the system. Blockchain can be implemented with any of
the different available platforms depending on one’s need and suitability.
The cloud layer is responsible for supporting the fog devices for carrying out the
task. It also stores the data of registered users, map-related data, and traffic or other
data that may be required for future analysis.
The BIVFN system demands some security measures from the entities of the network to keep the system reliable and secure. One requirement is that every user keep their keys safe, using encryption, secret sharing, or physical locks. Another is that the system must have a key-compromise policy to reduce the risk in case a key is compromised.
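The secret-sharing option mentioned above can be sketched as a toy two-of-two XOR split: either share alone is indistinguishable from random noise, and only both shares together recover the key. A real deployment would use an established scheme such as Shamir's secret sharing; this is only an illustration:

```python
import secrets

def split_key(key: bytes) -> tuple[bytes, bytes]:
    """Split a key into two XOR shares; each share alone reveals nothing."""
    share1 = secrets.token_bytes(len(key))
    share2 = bytes(a ^ b for a, b in zip(key, share1))
    return share1, share2

def recover_key(share1: bytes, share2: bytes) -> bytes:
    """XOR the shares back together to reconstruct the original key."""
    return bytes(a ^ b for a, b in zip(share1, share2))

vehicle_key = secrets.token_bytes(32)   # stand-in for a vehicle's private key
s1, s2 = split_key(vehicle_key)         # store s1 and s2 in separate places
assert recover_key(s1, s2) == vehicle_key
```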
3.3.2 Phases of BIVFN
BIVFN, a blockchain-enabled vehicular fog network, will perform the tasks corre-
sponding to VANET in the blockchain network using a fog node. The tasks related
to VANET include sharing road status like traffic information, red light failure, jam
information; borrowing services or infrastructure from fog node; registration of a new
vehicle in the network; authentication of vehicles; vehicle’s score updates according
to involvement; and accessing scores of vehicles. All these tasks act as transactions of
the blockchain, which are combined to create a block. The phases carried out in this
complete process include registration, transaction creation, and block creation. The
entities involved in these phases involve vehicles, fog nodes, application interfaces,
and blockchain networks. The steps followed in each phase are shown in Fig. 3.3 and discussed below.
1. Registration phase
The vehicle enters its basic and owner details on the application interface for
registration in blockchain. The application interface then forwards these details
to the fog node. After that, the fog node forwards the details to blockchain after
performing the initial verification of the vehicle. In blockchain, the new vehicle
details are stored as transactions, and transactions are added to the unverified
transaction pool. Blockchain will generate the certificate and key pair for the
vehicle for further communication in the network. The generated certificate and
key pair are forwarded to the vehicle through the application interface. Then, the
application interface informs the vehicle about successful registration.
2. Transaction creation
Any transaction from the above list can be initiated by either a vehicle or a fog node and reported on the application interface. The application interface forwards the details of a new transaction to the fog node if that transaction is requested by
any vehicle. The fog node also performs the functions of an RSU, so it evaluates the event for initial verification. The fog node checks the basic details as first-level verification and performs one of the following actions. If the transaction fails the initial verification performed by fog, the event is rejected and its status is sent back to the vehicle through the application interface. If the transaction passes the initial verification or is initiated by the fog node, it is forwarded to the blockchain. A list of probable nearby vehicles, which act as validating nodes, is attached to the transaction. The blockchain network then validates the transaction by involving the nearby vehicles suggested by the fog node.

Fig. 3.3 Phases of BIVFN among the vehicle, application interface, fog device, and blockchain network: phase 1, registration (steps 1.1–1.6: registration request, certificate and key generation, registration response); phase 2, transaction creation (steps 2.1–2.5: transaction initiation, initial verification, rejection or forwarding for validation); phase 3, block creation (steps 3.1–3.4: block creation, broadcasting, validation, and blockchain update)
3. Block creation
All leftover transactions from the unverified transaction pool are collected
and ordered, and a block is created out of those transactions. The newly created
block is broadcasted to the vehicles and fog nodes of the network. The vehicles
and fog node will validate the block. After validation, the newly created block is
updated to the ledger of each node, and thus blockchain is updated.
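The three phases can be condensed into a minimal sketch: pooled transactions are ordered into a block, each block links to its predecessor via a SHA-256 hash, and every node re-validates the chain before updating its ledger. This is an illustration of the hashing and validation ideas only; it omits the certificates, signatures, and consensus that BIVFN would require:

```python
import hashlib
import json

def block_hash(block: dict) -> str:
    """Hash every field of the block except its stored hash."""
    payload = json.dumps({k: v for k, v in block.items() if k != "hash"},
                         sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

def create_block(pool: list, chain: list) -> dict:
    """Phase 3: order the pooled transactions and link to the previous block."""
    block = {"index": len(chain),
             "prev_hash": chain[-1]["hash"] if chain else "0" * 64,
             "transactions": sorted(pool)}
    block["hash"] = block_hash(block)
    return block

def validate_chain(chain: list) -> bool:
    """Each node re-hashes every block and checks the links (integrity)."""
    for i, block in enumerate(chain):
        if block["hash"] != block_hash(block):
            return False
        if i > 0 and block["prev_hash"] != chain[i - 1]["hash"]:
            return False
    return True

chain = []
chain.append(create_block(["register:vehicle-42",
                           "event:red-light-failure@junction-7"], chain))
chain.append(create_block(["event:jam-reported"], chain))
assert validate_chain(chain)
chain[0]["transactions"][0] = "register:attacker"   # tamper with the ledger
assert not validate_chain(chain)                    # integrity check fails
```

Tampering with any stored transaction changes its block's hash and breaks the link to the next block, which is exactly the integrity property relied on in Sect. 3.4.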
3.4 Security Requirements Attainment in BIVFN
BIVFN, a blockchain- and fog computing-based VANET, can overcome VANET's issues and fulfill the expected requirements. Blockchain can attain VANET's requirements through features such as an immutable, cryptography-based, distributed, and decentralized ledger. Fog computing provides cloud-like features at the edge, which helps satisfy real-time constraint requirements. How the distinct features of fog and blockchain attain VANET's security requirements is listed below, which also illustrates the efficiency of our proposed model compared with conventional models.
Authentication: It is ensured by verifying that every involved vehicle is already a
registered user of the system having a well-defined identity.
Non-repudiation: The vehicle’s identity is recorded during communication, and
hence, the vehicle cannot deny its involvement in the system.
Confidentiality: The blockchain network encrypts the identity of the involved
vehicles, and only the public key is visible. The attacker is not able to identify the
actual vehicle behind any transaction.
Integrity: The concept of hashing is applied on transactions and complete blocks,
which does not allow any alteration and ensures the network’s integrity.
Access control: Authorized users are granted only the access appropriate to their roles in the network, which ensures access control.
Privacy: Unauthorized users cannot access the details of the registered vehicles in the network, as the data is stored entirely in hashed form.
Data verification: Data transmitted in the network acts as a transaction and is validated before being added to the blockchain; it is added only if it proves valid.
Real-time constraints: Vehicles move at high speed and thus need fast notifications and responses to any event, which fog computing can achieve well.
3.5 Conclusion
This paper presents a novel architecture of blockchain-enabled intelligent vehicular
fog network for connected vehicles (BIVFN). BIVFN is a VANET incorporating blockchain technology and fog computing, which together provide the following features. (i) Fog
computing provides cloud services, memory, and infrastructure on edge, enabling it
to work in real time and give fast responses. (ii) Blockchain being immutable keeps
the data secure and system reliable. (iii) Blockchain’s certificate and key generation
process ensures the authentication and access control in the VANET system. (iv)
The concept of cryptography ensures the integrity of the system. The phases of
BIVFN, covering the complete process, are also discussed. The proposed system is reliable and fits well with today's market demands.
References
1. Pandey, K., Raina, S. K., & Raw, R. S. (2016). Distance and direction-based location aided
multi-hop routing protocol for vehicular ad-hoc networks. International Journal of Commu-
nication Networks and Distributed Systems, 16(1), 71–98. https://doi.org/10.1504/IJCNDS.
2016.073410
2. Raw, R. S., & Lobiyal, D. K. (2011). E-DIR: A directional routing protocol for VANETs in a
city traffic environment. International Journal of Information and Communication Technology,
3(3), 242–257. https://doi.org/10.1504/IJICT.2011.041927
3. Woo, S., Jo, H. J., & Lee, D. H. (2015). A practical wireless attack on the connected car
and security protocol for in-vehicle CAN. IEEE Transactions on Intelligent Transportation
Systems, 16(2), 993–1006. https://doi.org/10.1109/TITS.2014.2351612
4. Raw, R. S., Kumar, M., & Singh, N. (2013). Security challenges, issues and their solutions for
VANET. 5(5), 95–105.
5. Raw, R. S., Loveleen, Kumar, A., Kadam, A., & Singh, N. (2016). Analysis of message propa-
gation for intelligent disaster management through vehicular cloud network. In ACM Interna-
tional Conference on Proceeding Series (Vol. 04–05-Marc). https://doi.org/10.1145/2905055.
2905252
6. Fan, K., Wang, J., Wang, X., Li, H., & Yang, Y. (2018). Secure, efficient and revocable data
sharing scheme for vehicular fogs. Peer-to-Peer Networking and Applications, 11(4), 766–777.
https://doi.org/10.1007/s12083-017-0562-8
7. Priyanka, & Raw, R. S. (2020). The amalgamation of blockchain with smart and connected
vehicles: Requirements, attacks, and possible solution. In Proceedings—IEEE 2020 2nd Inter-
national Conference on Advances in Computing, Communication Control and Networking,
ICACCCN (pp. 896–902). https://doi.org/10.1109/ICACCCN51052.2020.9362906
8. Siddiqui, S. A., & Mahmood, A. (2018). Towards fog-based next generation internet of vehicles
architecture. In Proceedings ACM Conference on Computer and Communications Security
(pp. 15–21). https://doi.org/10.1145/3267195.3267200
9. Zhang, L., et al. (2019). Blockchain based secure data sharing system for Internet of vehicles: A
position paper. Vehicular Communications, 16, 85–93. https://doi.org/10.1016/j.vehcom.2019.
03.003
10. Gaba, P., & Raw, R. S. (2020). Vehicular cloud and fog computing architecture, applications,
services, and challenges. In R. S. Rao, V. Jain, O. Kaiwartya & S. Nanhay (Eds.), IoT and
Cloud Computing Advancements in Vehicular Ad-Hoc Networks (pp. 268–296). IGI Global.
11. Aliyu, A., et al. (2018). Cloud computing in VANETs: Architecture, taxonomy, and challenges.
IETE Technical Review, 35(5), 523–547. https://doi.org/10.1080/02564602.2017.1342572
Chapter 4
Deep Learning Approach for Pedestrian
Detection, Tracking, and Suspicious
Activity Recognition in Academic
Environment
Kamal Hajari, Ujwalla Gawande, and Yogesh Golhar
Abstract Pedestrian detection, tracking, and suspicious activity recognition have
grown increasingly significant in computer vision applications in recent years as
security threats have increased. Continuous monitoring of private and public areas in
high-density areas is very difficult, so active video surveillance that can track pedes-
trian behavior in real time is required. We present an innovative and robust deep
learning system as well as a unique pedestrian dataset that includes student behavior
like as test cheating, laboratory equipment theft, student disputes, and danger situ-
ations in institutions. It is the first of its kind to provide pedestrians with a unified
and stable ID annotation. Again, we also presented a comparative analysis of result
achieved by the recent deep learning approach of pedestrian detection, tracking,
and suspicious activity recognition methods on a recent benchmark dataset. Our
investigation will provide new research directions in vision-based surveillance for
practitioners and research scholars.
4.1 Introduction
Video surveillance is now installed everywhere to track and monitor pedestrians or criminals in streets, airports, banks, prisons, laboratories, shopping centers, etc. [1].
The surveillance system is based on a closed-circuit television (CCTV) system.
Recently, pan-tilt-zoom (PTZ) cameras have gained many advantages over traditional CCTV cameras. The main advantage of a PTZ camera is that it allows users to view more content than a fixed camera. The features of a PTZ camera include: (1) the user can pan left and right and tilt up and down to obtain a complete 180° view in either direction. If installed and positioned correctly,
K. Hajari (B)·U. Gawande
Department of Information Technology, Yeshwantrao Chavan College of Engineering, Nagpur,
Maharashtra 441110, India
e-mail: kamalhajari123@gmail.com
Y. G o l h a r
Department of Computer Engineering, St. Vincent Palloti College of Engineering and
Technology, Nagpur 441108, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
B. N. K. Rao et al. (eds.), Intelligent Computing and Applications, Smart Innovation,
Systems and Technologies 315, https://doi.org/10.1007/978-981-19-4162-7_4
advanced PTZ cameras can provide a complete 360° field of view. Therefore, a single pan/tilt camera can replace two or even three fixed-view cameras and can eliminate most of the blind spots that fixed-angle cameras leave. (2) A PTZ camera can be programmed to rotate automatically in multiple directions to cover different views of an area. Researchers are currently working on creating video surveillance systems that can analyze pedestrian behavior in real time [2].
Identifying pedestrians in crowded environments becomes extremely challenging in real time in the presence of low-resolution images, motion blur, contrast and illumination changes, changes in pedestrian scale or size, and entirely or partially obscured outlines. Figure 4.1 illustrates the motivation for the proposed effort. In the
Caltech [1], INRIA [2], MS COCO [3], ETH [4], and KITTI [5] datasets, pedestrian
instances are typically small. Localizing these small instances in the presence of illumination change and occlusion is challenging due to issues such as (1) hazy appearance, (2) blurred and unclear boundaries, (3) overlapping pedestrian instances, and (4) small and large instances having different characteristics. Advanced research on pedestrian analysis is conducted on publicly available benchmark datasets. These datasets have several limitations: (1) a limited variety of pedestrian stances captured in a controlled context; (2) a short time interval between each unique ID's succeeding frames; and (3) recordings made only in urban areas, specifically on city roads, in car parking, and in public and private places. No student behavior dataset is available. A robust, novel deep learning model and a student academic environment dataset are proposed in this paper. For each sequence of frames in the video, human experts annotated the behavior of student pedestrians. We provide data in three categories: (1) bounding boxes to locate pedestrians, (2) full labels, and (3) unique IDs used as class categories for annotated pedestrians.
The contributions of this paper are outlined as follows:
1. To solve existing state-of-the-art database concerns such as size and illumination
variance in pedestrian images, we present the unique enhanced mask R-CNN
deep learning architecture.
2. We propose a student activity dataset in which we have recorded students' normal and suspicious activities.
3. Within the framework of the proposed pedestrian dataset for academic settings,
we conduct a comprehensive review of previous work and compare state-of-the-
art methods.
The remainder of this paper is organized as follows: Sect. 4.2 covers the most significant studies on pedestrian datasets, as well as concerns and challenges in the academic context. The deep learning architecture is described in Sect. 4.3. The outcomes of the empirical examination are discussed in Sect. 4.4. Conclusions are presented in Sect. 4.5, along with future research directions.
Fig. 4.1 Issues and challenges of the ETH [2] and Caltech [3] datasets. a Illumination variation: a pedestrian's appearance changes significantly as the illumination changes. b Pedestrian size variation: pedestrian scale or size changes significantly across images. c Pedestrian occlusion affects the detection and tracking results. d Object occlusion: occlusion by other road objects affects detection accuracy. e Pedestrian clothing variation affects detection algorithm accuracy. f Multi-camera variation: different capture directions present different visual appearances
4.2 Literature Survey
This section describes the most relevant and recent pedestrian datasets. In addition,
we discuss the advanced deep learning approaches of pedestrian detection, tracking,
and suspicious activity recognition, along with their limitations.
4.2.1 Pedestrian Dataset
In this section, we describe the ten commonly used pedestrian datasets by researchers
for pedestrian detection, tracking, and suspicious activity recognition. First, the
Caltech dataset contains 2300 unique pedestrians and 350,000 annotated bounding
boxes representing these pedestrians. The dataset was created on city roads using a camera mounted on a vehicle [3]. Second, the MIT dataset is the first
pedestrian dataset, consisting of high-quality images of 709 unique pedestrians. The range of poses, captured in city streets in front or back view, is relatively limited [4]. Third, the Daimler dataset captures people walking on the street, using cameras installed on vehicles in an urban environment during the day. The dataset includes pedestrian tracking attributes, annotated labeled bounding boxes, ground truth images, and floating-point disparity map files.
The training set contains 15,560 pedestrian images and 6744 annotated pedestrian
images. The test set contains 21,790 pedestrian images and 56,492 annotated images
[5]. The ATCI dataset is a pedestrian database acquired by a normal car’s rear-view
camera, and it is used to test pedestrian recognition in indoor and outdoor parking
lots, city streets, and private lanes. The dataset contains 250 video clips totaling 76 minutes and 200,000 marked pedestrian bounding boxes, captured in day and night scenes under different weather conditions [6]. The ETH dataset is used to observe traffic scenes from a moving platform: the behavior of pedestrians is recorded using a stereo rig mounted on a stroller. In an urban setting, the dataset can be used for pedestrian recognition and tracking from mobile platforms. Different traffic agents, such as cars and pedestrians, are included in the dataset [7].
TUD-Brussels dataset was created using a mobile platform in an urban environment.
Crowded urban street behavior was recorded using a camera mounted on the front
side of the vehicle. It can be used in car safety scenarios in urban environments [8].
One of the most often used static pedestrian detection datasets is the INRIA dataset.
It incorporates human behavior, as well as a mobile camera and complex background
scenes, with various variations in posture, appearance, dress, background, lighting,
contrast, etc. [9]. The PASCAL Visual Object Classes (VOC) 2012 and 2007 collections contain static objects in an urban setting with various viewpoints and positions. This dataset was created with the goal of recognizing visual object classes in real-world scenarios. Animals, trees, road signs, vehicles, and people are among the 20 categories in this collection [10]. The MS COCO 2018 dataset [11] covers common objects in context (COCO); it was recently utilized to recognize distinct objects in context while focusing on salient object detection.
The annotations include different examples of things connected to 80 different
object categories and 91 different human segmentation categories. For pedestrian
instances, there are keypoint annotations and five captions per sample image. The COCO 2018 dataset challenges include (1) real-scene object detection with segmentation masks, (2) panoptic segmentation, (3) pedestrian keypoint evaluation, and (4) DensePose estimation in congested scenes [12]. For street image segmentation, the
Mapillary Vistas research dataset is employed [13]. Pedestrian and non-living categories are handled using panoptic segmentation, which successfully merges the concepts of semantic and instance segmentation. A comparison of pedestrian databases and their video surveillance purposes is shown in Table 4.1; it also includes our proposed dataset, which is introduced in the next section. The comparison is based on each dataset's use, size, environment, labels, and annotations. These details are used to verify the accuracy of object detection and tracking algorithms.
Table 4.1 Comparison of benchmark pedestrian datasets for pedestrian detection and tracking

| Dataset | Dataset size | Annotation | Environment | Year | Ref. | Issues and challenges |
|---|---|---|---|---|---|---|
| Caltech | 250,000 frames | 2300 unique pedestrians | City street | 2012 | [3] | Only urban roads are captured |
| MIT | 709 unique pedestrians | No annotated pedestrians | Daylight scenario | 2000, 2005 | [4] | Missing annotations do not allow users to verify different techniques |
| Daimler | 15,560 unique pedestrians | Ground truth with bounding boxes | City street | 2016 | [5] | Only urban roads are captured |
| GM-ATCI | 250 video clips | 200 K annotated pedestrian bounding boxes | Day; complex weather and lighting | 2015 | [6] | Only urban roads are captured; side view of road not captured |
| ETH | Videos | Annotated cars and pedestrians | City street | 2010 | [7] | Small dataset; limited scenarios covered |
| TUD-Brussels | 1092 frames | 1776 pedestrian annotations | City street | 2009 | [8] | Only urban roads are captured |
| INRIA | 498 images | Manual annotations | City street | 2005 | [9] | Only urban roads are captured |
| PASCAL VOC 2012 | 11,530 images; 20 object classes | 27,450 annotated ROIs | City street | 2012 | [10] | Only urban roads are captured |
| MS COCO 2017 | 328,124 images | Segmented people objects | City street | 2017 | [11] | Only urban roads are captured |
| MS COCO 2015 | 328,124 images | Segmented people objects | City street | 2015 | [12] | Only urban roads are captured |
| Mapillary Vistas 2017 | 25,000 images; 152 object categories | Pixel-accurate instance pedestrian annotations | City street | 2017 | [13] | Only urban roads are captured; side view of road not captured |
4.2.2 Proposed Deep Learning Architecture and Academic
Environment Pedestrian Dataset
In this section, we describe the proposed academic environment dataset, captured from different perspectives by a high-quality DSLR camera. The proposed video acquisition framework records video at 30 frames/s at 3840 × 2160 resolution, and the dataset size is 100 GB. Sample student behavior frames are shown in Fig. 4.2. The camera orientation is in the range of 45° to 90°. The dataset records the academic activity of students at Yeshwantrao Chavan College of Engineering (YCCE), Nagpur. The students' ages range from 22 to 27, with 65% male and 35% female. The academic environment dataset covers different behaviors such as lab activities, exam hall and classroom scenes, student cheating behavior, disputes, and the theft of mobile phones and lab electronic devices [14].
At the frame level, domain experts annotate the pedestrian video sequences. The labeling stage contains three phases: (1) human identification, (2) tracking, and (3) detection of suspicious activities. First, the Mask R-CNN [12] method is used to determine the location of each pedestrian in the frame, followed by manual validation and correction of the data. Next, the Deep SORT [15] model is used to extract tracking information. With these two basic operations, we obtain a rectangular bounding box around each pedestrian that defines the ROI for each human. The last stage of the labeling process is performed manually by a human expert familiar with the academic environment. For each pedestrian instance in the frame, the label specifies height, age, bounding box ID, feet, frame, body volume, haircut, hair color, head accessories, apparel, mustache/beard, activity, and accessories.
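Deep SORT itself combines a Kalman filter with appearance features; as a much-simplified stand-in, the sketch below keeps stable pedestrian IDs across frames by greedily matching detections to existing tracks using bounding-box IoU alone. The function names and the 0.3 threshold are illustrative assumptions, not part of the actual pipeline:

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def assign_ids(tracks, detections, next_id, thresh=0.3):
    """Greedily match each detection to the best-overlapping existing track,
    otherwise start a new stable ID (simplified stand-in for Deep SORT)."""
    ids, used = [], set()
    for det in detections:
        best_id, best_iou = None, thresh
        for tid, box in tracks.items():
            if tid not in used and iou(box, det) > best_iou:
                best_id, best_iou = tid, iou(box, det)
        if best_id is None:
            best_id, next_id = next_id, next_id + 1
        used.add(best_id)
        tracks[best_id] = det     # update the track with the newest box
        ids.append(best_id)
    return ids, next_id

tracks = {}
ids, nxt = assign_ids(tracks, [(0, 0, 10, 20), (50, 50, 60, 70)], next_id=1)
print(ids)   # [1, 2]: two new pedestrians
ids, nxt = assign_ids(tracks, [(1, 1, 11, 21)], next_id=nxt)
print(ids)   # [1]: overlaps track 1, so it keeps its stable ID
```

Each returned ID can then serve as the stable class category attached to the annotated pedestrian.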
Fig. 4.2 Examples from the designed database. The first row depicts a lab fight between two girls. The second row depicts a phone-snatching scenario. The third row depicts a student making threats. The fourth row depicts the same critical situation. The fifth row depicts students stealing lab material. The sixth row depicts exam cheating in an examination hall
4.3 Recent Deep Learning Architecture
The current deep learning-based pedestrian detection, tracking, and suspicious activity recognition systems are not as accurate and fast as human vision [2]. Pedestrian detection, tracking, and activity recognition approaches fall into two categories: conventional handcrafted-feature methods and deep learning methods. The approach of Dollar et al. [3] has been used for pedestrian detection, and the HOG [16] approach is also used for this task. The deformable part model (DPM) [4] and the multi-scale histogram of oriented gradients [6] are further examples of conventional approaches. These methods are computationally intensive and time-consuming, and they require human involvement. CNN-based deep learning techniques have grown in prominence as a result of their accuracy in pedestrian identification [7, 8]. R-CNN [9] was the first deep learning model for object detection. Deep learning detectors include two-stage detectors such as R-CNN [9], SPPNet [10], Fast R-CNN [11], Faster R-CNN [12], and Mask R-CNN [13], and single-stage detectors such as SSD [15] and YOLO [17]. Two-stage detectors are generally too slow for real-time pedestrian detection. The YOLO network [17], a regression-based object detection architecture, was therefore introduced to increase detection speed and accuracy. Later, to improve both accuracy and speed when detecting smaller and densely distributed pedestrians, researchers proposed different variants of YOLO (v1, v2, v3, v4, and v5) [4]. The proposed improved YOLOv5 method effectively detects small and densely packed pedestrians.
4.4 Experimental Results
In this section, we describe the results of three tasks, pedestrian detection, tracking, and suspicious activity recognition, performed using state-of-the-art methods. We present not only the results obtained with these methods on the academic environment database but also our baseline results using the same techniques on well-known datasets. Pedestrian detection is the first task. The R-FCN [17], RetinaNet [18], and Mask R-CNN [19] deep learning frameworks have excelled in the PASCAL VOC [19] challenges, particularly in pedestrian identification. The results of these approaches on the proposed dataset were compared with their results on PASCAL VOC 2007 and 2012. RetinaNet uses a feature pyramid network (FPN) on top of a ResNet baseline [20, 21]. To capture changes in position, R-FCN [17] uses position-sensitive convolutional layers, with a ROI pooling layer in place of a fully connected layer. Mask R-CNN was also used to evaluate the proposed dataset [22, 23]. The dataset was divided into three parts: 60% for training, 20% for real-time queries, and 20% for testing. The standard evaluation measure is average precision (AP) at an intersection-over-union (IoU) threshold of 0.5 (AP at IoU = 0.5). For pedestrian tracking, the Tracktor CV [2] and V-IOU [16] approaches provide the state of the art [24, 25], for two reasons: (1) they are the highest performers in the MOT challenge, and (2) they are open-source, pre-built frameworks. The Tracktor
CV approach comprises two stages: (1) a regression component that uses the detector step's input to update the bounding box's current position, and (2) a detector that keeps track of the bounding boxes for future frames [26, 27]. We noticed that the failure cases of both methodologies were associated with crowds, in two problematic situations: (1) scenes where trajectories constantly intersect due to dense pedestrian traffic, and (2) scenes with significant occlusions of human silhouettes. For suspicious activity recognition, it is harder to identify or track specific pedestrians within a crowd, and the aforementioned techniques are not suitable for such situations. Abnormal activities in a spatial context are detected from a sequence of events with variations in appearance, scale, lighting, and pose. Other recent academic work has focused on the use of motion in this direction [28, 29]. Finally, collecting regional spatial cuboids from optical flow has been attempted for crowd behavior analysis [30–32]. Although the approaches listed above have been shown to be effective in studies, most of them are limited to detecting anomalous behaviors in local or global domains [33, 34]. Furthermore, we suggest that the coupled consideration of motion flow patterns, variable object sizes, and interactions between neighboring objects in a frame may be used to describe pedestrian behaviors in a high-density scene, resulting in improved anomalous activity detection performance.
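The AP at IoU = 0.5 criterion used above can be made concrete: a detection counts as a true positive only if its overlap with some unmatched ground-truth box reaches 0.5. This is a minimal sketch with hypothetical helper names; full AP evaluation additionally sorts detections by confidence and integrates the precision-recall curve:

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def match_detections(dets, gts, thresh=0.5):
    """Count true positives under the IoU >= 0.5 rule; each ground-truth
    box may be matched at most once."""
    matched, tp = set(), 0
    for det in dets:
        best_j, best = None, thresh
        for j, gt in enumerate(gts):
            if j not in matched and iou(det, gt) >= best:
                best_j, best = j, iou(det, gt)
        if best_j is not None:
            matched.add(best_j)
            tp += 1
    return tp

gts = [(0, 0, 100, 200)]
print(match_detections([(10, 20, 100, 200)], gts))  # 1: IoU is about 0.81, above 0.5
print(match_detections([(60, 0, 160, 200)], gts))   # 0: IoU is 0.25, below 0.5
```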
4.5 Conclusions
We propose the academic environment database in this paper, which comprises video
sequences of pedestrians in indoor academic environments that are annotated at the
frame level. The pedestrian database contains the behavior of students in the institution.
This is the first dataset of its kind that provides a unified and stable pedestrian
ID annotation, making it suitable for pedestrian detection, tracking, and behavior
detection. We have also proposed an improved Mask R-CNN architecture for pedestrian
detection, tracking, and suspicious activity recognition on recent benchmark
databases. This well-organized comparison helps to identify problems and challenges
in this domain. In the future, more experimentation is required for pose estimation
and pedestrian trajectory identification and detection.
References
1. Ahmed, M., Jahangir, M., & Afzal, H. (2015). Using crowd-source based features from social
media and conventional features to predict the movies popularity. In IEEE International
Conference on Smart Cities, Social Communication and Sustained Communication, China
(pp. 273–278).
2. Bergmann, P., Meinhardt, T., & Taixe, L. (2019). Tracking without bells and whistles. In IEEE
ICCV, Seoul, Korea (pp. 1–16).
4 Deep Learning Approach for Pedestrian Detection, Tracking… 37
3. Dollar, P., Wojek, C., Schiele, B., & Perona, P. (2012). Pedestrian detection: An evaluation of
the state of the art. IEEE TPAMI, 34(4), 743–761.
4. Samsi, S., Weiss, M. L., Bestor, D., Li, D., Jones, M., Reuther, A., Edelman, D., Arcand, W., &
Byun, C. (2021). The MIT supercloud dataset. arXiv preprint arXiv:2108.02037.
5. Silberstein, S., Levi, D., Kogan, V., & Gazit, R. (2014). Vision-based pedestrian detection for
rear-view cameras. In IEEE Intelligent Vehicles Symposium Proceedings, Dearborn, MI, USA
(pp. 853–860).
6. Alom, M. Z., & Taha, T. M. (2017). Robust multi-view pedestrian tracking using neural
networks. In IEEE National Aerospace and Electronics Conference (NAECON), Dayton, OH,
USA (pp. 17–22).
7. Zhang, X., Park, S., Beeler, T., Bradley, D., Tang, S., & Hilliges, O. (2020). ETH-XGaze: A
large scale dataset for gaze estimation under extreme head pose and gaze variation. In ECCV.
Lecture Notes in Computer Science (Vol. 12350). Springer.
8. Wojek, C., Walk, S., & Schiele, B. (2009). Multi-cue onboard pedestrian detection. In IEEE CVPR, Miami, Florida, USA.
9. Nguyen, T., Kim, S., & Na, I. (2013). Fast pedestrian detection using histogram of oriented
gradients and principal components analysis. International Journal of Contents.
10. Everingham, M., Van Gool, L., & Williams, C. K. I. (2010). The Pascal visual object classes
(VOC) challenge. International Journal of Computer Vision, 88, 303–338.
11. Lin, T. Y., et al. (2014). Microsoft COCO: Common objects in context. In ECCV. Lecture Notes
in Computer Science (Vol. 8693, pp. 740–755). Springer.
12. Nicolai, W., Bewley, A., & Dietrich, P. (2017). Simple online and real-time tracking with a
deep association metric. In IEEE ICIP (pp. 3645–3649).
13. Dai, J., Li, Y., He, K., & Sun, J. (2016). R-FCN: Object detection via region-based fully
convolutional networks. In NIPS (pp. 1–11).
14. Hajari, K., Gawande, U., & Golhar, Y. (2021). Deep learning approach to pedestrian detection:
An evaluation of the state of the art. In Computing technologies and applications paving
path towards society 5.0 (1st ed.,). Routledge and CRC Press, Taylor & Francis Group.
ISBN:9780367763701.
15. Everingham, M., Eslami, S., Gool, V., Williams, C., & Winn, J. (2015). The PASCAL VOC
challenge: A retrospective. IJCV, 111, 98–136.
16. Dalal, N., & Triggs, B. (2005). Histograms of oriented gradients for human detection. In IEEE
Computer Society Conference on CVPR, San Diego, CA, USA (pp. 886–893).
17. Kaiming, H., Georgia, G., Dollar, P., & Girshick, R. (2020). Mask R-CNN. IEEE TPAMI, 42(2),
386–397.
18. Viola, P., & Jones, M. (2001). Rapid object detection using a boosted cascade of simple features.
In Proceedings of the IEEE Computer Society Conference on CVPR, HI, USA (pp. I-I).
19. Felzenszwalb, P., Girshick, R., McAllester, D., & Ramanan, D. (2010). Object detection with
discriminatively trained part-based models. TPAMI, 32(9), 1627–1645.
20. He, K., Gkioxari, G., Dollar, P., & Girshick, R. (2017). Mask R-CNN. In IEEE International
Conference on Computer Vision, Venice, Italy (pp. 2980–2988).
21. Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., et al. (2016). SSD: Single shot multibox
detector. In European Conference on Computer Vision (pp. 21–37). Springer.
22. Redmon, J., Girshick, R., & Farhadi, A. (2016). You only look once: Unified, real-time object
detection. In IEEE Conference on CVPR, Las Vegas, NV, USA (pp. 779–788).
23. Bochinski, E., Senst, T., & Sikora, T. (2018). Extending IOU based multi-object tracking by
visual information. In IEEE International Conference on Advanced Video and Signal Based
Surveillance, Auckland, New Zealand.
24. Bernardin, K., & Stiefelhagen, R. (2008). Evaluating multiple object tracking performance:
The CLEAR MOT metrics. EURASIP Journal on Image and Video Processing, 2008, 246309.
25. Kratz, L., & Nishino, K. (2012). Tracking pedestrians using local spatio-temporal motion
patterns in extremely crowded scenes. IEEE TPAMI, 34(5), 987–1002.
26. Wang, S., & Miao, Z. (2010). Anomaly detection in crowd scene. In 10th IEEE International
Conference on Signal Processing, Beijing, China (pp. 1220–1223).
27. Wang, S., & Miao, Z. (2010). Anomaly detection in crowd scene using historical information. In
IEEE International Symposium on Intelligent Signal Processing and Communication Systems,
Chengdu, China (pp. 1–4).
28. Muhammad, G., Hossain, M., & Kumar, N. (2021). EEG-based pathology detection for home
health monitoring. IEEE Journal on Selected Areas in Communications, 39(2), 603–610.
29. Muhammad, G., Alhamid, M. F., & Long, X. (2019). Computing and processing on the edge:
Smart pathology detection for connected healthcare. IEEE Network, 33, 44–49.
30. He, K., Zhang, X., Ren, S., & Sun, J. (2015). Spatial pyramid pooling in deep convolutional
networks for visual recognition. TPAMI, 37(9), 1904–1916.
31. Muhammad, N., Hussain, M., Muhammad, G., & Bebis, G. (2011). Copy-move forgery detec-
tion using dyadic wavelet transform. In Eighth International Conference on CGIV, Singapore
(pp. 103–108).
32. Girshick, R., Donahue, J., Darrell, T., & Malik, J. (2014). Rich feature hierarchies for accurate
object detection and semantic segmentation. In CVPR, Columbus, OH, USA (pp. 580–587).
33. Girshick, R. (2015). Fast R-CNN. In IEEE International Conference on CV, Santiago, Chile
(pp. 1440–1448).
34. Ren, S., He, K., Girshick, R., & Sun, J. (2016). Faster R-CNN: Towards real-time object
detection with region proposal networks. IEEE TPAMI, 39(6), 1137–1149.
Chapter 5
Data-Driven Approach to Deflate
Consumption in Delay Tolerant Networks
C. Venkata Subbaiah and K. Govinda
Abstract Delay-tolerant networking (DTN) is an approach to computer network
design that attempts to deal with the specific issues of heterogeneous networks
that may lack continuous connectivity. The ability to move, or route, information
from a transmitting end to a receiving end is an essential capability that all data-exchanging
systems must have. Delay-tolerant networks (DTNs) are characterized by
their lack of connectivity, leading to temporary loss of association between nodes.
Under such conditions, well-known ad-hoc routing protocols such as DSR and AODV
fail to provide communication paths between the routes. We propose a data-driven
approach to avoid unnecessary consumption of resources by selfish or malicious
nodes while evaluating a node's trust dynamically in response to changes in
environmental and node conditions. We additionally propose a distributed provenance-based
trust management protocol in which each node is assumed to be able to monitor
its neighboring nodes with known probabilities of false alarms and misses in
detecting attack behaviors or energy levels. We first determine the trust of the
destination through the message carrier and then send the message. To reduce energy
consumption during transmission, we use a clustering method that splits the whole
message into packets.
5.1 Introduction
Although transmission through networks has become very easy at present, it is still
very difficult to communicate data in some networks that are frequently subject
to delay and disruption. Because of changes in network structure and harsher
environments, route delay and network interruption are common. To address this
concern, several researchers have proposed solutions over the last ten years.
However, such methods all
C. Venkata Subbaiah ·K. Govinda (B)
Scope, Vit University, Vellore, Tamilnadu, India
e-mail: kgovinda@vit.ac.in
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
B. N. K. Rao et al. (eds.), Intelligent Computing and Applications, Smart Innovation,
Systems and Technologies 315, https://doi.org/10.1007/978-981-19-4162-7_5
40 C. Venkata Subbaiah and K. Govinda
Fig. 5.1 Delay tolerant network
attempt to resolve the issue based on traditional network protocols, so these
schemes are not viable in some particular cases, which gave rise to the idea of the DTN.
The DTN evolved from the mobile ad-hoc network (MANET). It is a sparsely and
intermittently connected network in which reliable end-to-end connectivity is not
available for message transmission. DTN is intended to work effectively over very
long distances, for instance in space communication at interplanetary scale. In such
settings, long latency, measured in hours or days, is unavoidable. Sensor-based
networks, wireless networks, terrestrial wireless networks, and underwater acoustic
networks with delays are examples of DTNs (Fig. 5.1).
5.2 Related Work
Delay-tolerant networks (DTNs) are often found in emerging applications such as
emergency response, special operations, smart environments, habitat monitoring,
and vehicular ad-hoc networks, where numerous nodes join in group communication
to achieve a common task. The core characteristic of DTNs is that there is no
assurance of end-to-end connectivity, which creates high delay or disruption due to
inherent characteristics or deliberately misbehaving nodes. Managing trust
efficiently and effectively is critical to enabling cooperation or collaboration
and decision-making tasks in DTNs while meeting system goals such as trustworthiness,
availability, quality of service (QoS), and scalability. Accurate trust assessment
is particularly difficult in
5 Data-Driven Approach to Deflate Consumption 41
DTN conditions since nodes are sparsely distributed and do not frequently encounter
one another. Thus, encounter-based evidence exchange among nodes may not always
be possible.
Quality of service (QoS) routing plays a significant role in QoS provisioning
in mobile ad-hoc networks. Bidirectional query was utilized in Shin and Lee (2018)
to achieve efficient QoS routing in the presence of multiple constraints.
With this scheme, QoS-satisfying paths were arrived at with efficient resource usage through
a shortest-path algorithm and a cost-efficient routing algorithm. However,
with multiple constraints and requirements, not all QoS requirements can
be satisfied. In Lal et al. (2016), a heuristic neighbor selection strategy with
a geographic routing procedure was considered to improve network lifetime. Another
hybrid algorithm enhancing Cuckoo search was introduced in Deepa and
Suguna (2017), with a novel position-update model based on a differential evolution
algorithm, with the goal of determining the optimal route. To guarantee energy-aware
routing, an enhanced link-state routing protocol was designed in Deepa and Suguna
(2017) using a time-series technique based on an autoregressive model. With
this, efficient energy conservation was expected to be achieved. However,
given the nature of wireless transmission, high levels of inter-path interference are
said to exist.
5.2.1 DTN Features
Compared with the conventional Internet, ad-hoc topologies, and WLANs, DTN has the
following characteristics.
5.2.2 Intermittent Connectivity
DTN frequently becomes disconnected because of limitations in the mobility and capability of
nodes, leading to frequent changes in the topology. This means that DTN operates
with intermittent connectivity and incomplete links, so paths between routes
cannot be guaranteed.
5.2.3 Low Speed, Inefficiency, and High Queuing Delay
The overall node-to-node delay is obtained by adding the delay of each hop on the
route. The delay is a combination of propagation time, transmission time, and waiting
time. Because of the intermittent topology of DTN, the per-hop delay
may be extremely high, resulting in a very low data rate and
asymmetric data-rate characteristics. Along with this, traffic delay plays
a vital role in node-to-node delay, and frequent variations in the DTN create high
queuing delay.
5.2.4 Limited Resource
A node's processing and handling capacity, communication capability, and storage
space are weaker than those of a normal PC because of price, volume, and energy
constraints. Moreover, the limited storage results in a higher packet
loss rate.
5.2.5 Node Lifetime
Nodes generally run on battery energy, and in certain conditions the topologies
are deployed in harsh environments, which reduces the lifetime of a node.
When battery power is lost, there is no assurance that the node will
work; message transmission is impossible during the loss
of energy.
5.2.6 Dynamic Topology
The DTN topology changes dynamically for several reasons, including
environmental changes, energy exhaustion, or various failures, which result
in nodes dropping out of the network. Nodes joining the
DTN also change the topology.
5.2.7 Performance Issues
Protection procedures within a DTN impose bandwidth costs on the DTN
links and computational costs at the DTN nodes.
In our current work, we have considered routing-protocol security and traffic
assessment. Delay- or disruption-tolerant networks (DTNs) are regularly
found in emerging applications such as emergency response, special operations, smart environments,
habitat monitoring, and vehicular ad-hoc networks, where several nodes take part
in group communication to achieve a common task. The core characteristic of
DTNs is that there is no assurance of end-to-end connectivity, thereby incurring
high delay or disruption due to inherent traits (e.g., wireless medium,
resource constraints, or high mobility) or deliberately misbehaving nodes (e.g., malicious
or selfish). Managing trust efficiently and effectively is critical to
enabling cooperation or collaboration and decision-making tasks in DTNs while
meeting system goals such as reliability, availability, quality
of service, and/or scalability. Accurate trust assessment is especially challenging
in DTN conditions since nodes are sparsely dispersed and do not frequently
encounter one another. Therefore, encounter-based
evidence exchange among nodes may not always be feasible.
We essentially refine the previous trust model by considering the following
enhancements:
(a) minimizing trust bias and (b) minimizing the communication cost caused by
trust assessment, and improving QoS by minimizing message delivery delay and energy
consumption.
The very low rate of direct encounters in DTN environments delays continuous evidence
collection and may produce inaccurate trust assessment, leading to
poor application performance. A critical challenge for a provenance-based
scheme is that it must be secured against attackers who may manipulate or drop
messages such as provenance records or spread false information.
Delay-tolerant networking (DTN) is an approach to computer network
design that aims to cope with the specific challenges of heterogeneous systems
that may lack constant network connectivity. The capacity to transport, or route,
information from a source to a destination is a fundamental ability that every
communication system must have. Delay- and disruption-tolerant networks (DTNs) are
characterized by their lack of connectivity, resulting in the absence of
instantaneous end-to-end paths. In these circumstances, well-known ad-hoc
routing protocols such as AODV and DSR fail to establish routes. When
instantaneous node-to-node paths are difficult or infeasible to set up, routing
protocols must resort to a "store and forward" approach, in which data are
incrementally moved and stored throughout the network in the expectation that
they will eventually reach their destination. A standard strategy used
to improve the likelihood of a message being successfully delivered is to replicate
multiple copies of the message in the hope that one will reach
its destination.
5.3 Proposed Method
In the present study, we suggest a distributed provenance-based trust management
protocol. Each node is assumed to have the capacity to monitor its neighboring
nodes with known probabilities of false alarms and misses in detecting attack
behaviors or energy levels.
The provenance of a node is propagated to all its neighbor nodes. When a source
node picks its destination and sends packets, the node that forwards the packets
acts as the packet modifier; it may then present itself as a normal node to its
neighbors and forward the packets. Direct evidence is obtained upon every encounter
with another node, while indirect evidence is gathered when a destination node
receives a message enclosing provenance information.
In our work, we first assess the trust of the node with the help of the message
carrier and then send the message. To reduce energy consumption during
transmission, we use a clustering technique that splits the entire
message into packets. In a cluster, n nodes (n1, n2, n3, n4, …)
are available, and the sensor node with the greatest energy is
considered the cluster head; it transmits first. A router service provider
can see the node details, the routing path, and the time delay. The router
receives the file from the service provider; the cluster head is chosen first,
and its energy is reduced in proportion to the file size, so the next time we send a
file, another node becomes the cluster head. The cluster head thus rotates among
the nodes based on the highest remaining energy. The time delay is determined
based on the routing delay.
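The rotation just described can be sketched as follows. The cost model (energy reduced by one unit per character sent) and the function names are assumptions made for illustration, not details from this chapter:

```python
def send_message(node_energy, message, packet_size=4):
    """Split a message into packets; for each packet, the node with the
    highest remaining energy acts as cluster head (CH) and transmits it,
    its energy dropping by the packet size."""
    packets = [message[i:i + packet_size] for i in range(0, len(message), packet_size)]
    log = []
    for packet in packets:
        head = max(node_energy, key=node_energy.get)  # highest-energy node becomes CH
        node_energy[head] -= len(packet)              # CH pays the transmission cost
        log.append((head, packet))
    return log
```

With node_energy = {'n1': 10, 'n2': 8} and an eight-character message, 'n1' sends the first packet and, its energy now lower, 'n2' sends the second, so the CH role rotates as the text describes.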
Smart forwarding is a reasonable and suitable way to achieve a tradeoff
between network performance and energy usage, so as to obtain
maximum energy efficiency.
We use distance-based energy-efficient opportunistic forwarding (DEEOF). In this
scheme, if any relay node exists within the maximum broadcast
range of the source node and its value is less than the threshold,
then the source node forwards the packet to that closest node without
replicating the information. This computation is performed at the source at each contact
time.
We also consider a node's traffic requirements along with its energy levels
when making the CH selection. The CH selection relies on the CH role-rotation approach:
a node becomes CH in the current round r if the random number
picked by the node falls below the threshold T(i, r).
Fig. 5.2 System architecture
The threshold is

T(i, r) = P_i(r) / (1 − P_i(r) · (r mod (1/P_i(r))))  if node i ∈ G(r), and T(i, r) = 0 otherwise,

where P_i(r) is the CH selection probability for node i during round r and G(r) is a set of
eligible nodes for round r; a node that has served as CH becomes eligible again
after 1/P_i(r) rounds (Fig. 5.2).
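Read as the usual LEACH-style rotation (an interpretation; the chapter gives only the formula), the threshold and the per-round election can be sketched as:

```python
import random

def ch_threshold(p, r):
    """T(i, r) for a node with CH selection probability p in round r; after
    serving as CH, a node stays ineligible for the next 1/p rounds."""
    period = round(1 / p)
    return p / (1 - p * (r % period))

def elect_cluster_heads(eligible_ids, p, r, rng=random.random):
    """Each eligible node draws a uniform number in [0, 1) and becomes a
    cluster head for round r if the draw falls below T(i, r)."""
    t = ch_threshold(p, r)
    return [i for i in eligible_ids if rng() < t]
```

With p = 0.2 the threshold grows from 0.2 in round 0 to 1.0 in round 4, so every node still eligible is certain to be elected by the end of each five-round period.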
5.4 Algorithm
Distance-based energy-efficient opportunistic forwarding
1. Set the threshold value.
2. Check whether any relay node exists within the maximum transmission range of
the source. If so, estimate the distance of the relay node
without replicating the packet.
3. Calculate the forwarding equivalent energy-efficiency distance (FEED, which
characterizes the relationship between two nodes and the delay for
equal energy efficiency) and P (the likelihood that the distance to the source
node is smaller by at least one relay).
4. If P is smaller than the threshold value, then forward a copy of the packet
to the nearest relay node. Else there is no need to forward the packet copy.
5. Reset the threshold value and repeat at the next contact time.
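A sketch of the forwarding decision in steps 2–4 follows. How P is derived from FEED is not fully specified above, so here P is estimated as the fraction of in-range relays that sit closer to the destination than the source does; that estimate, and the geometric helpers, are assumptions:

```python
import math

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def deeof_forward(source, destination, relays, max_range, p_threshold):
    """Return the relay that should receive a packet copy, or None if the
    source should store and carry the packet until the next contact time."""
    in_range = [r for r in relays if dist(source, r) <= max_range]  # step 2
    closer = [r for r in in_range if dist(r, destination) < dist(source, destination)]
    if not in_range or not closer:
        return None
    p = len(closer) / len(in_range)                                 # step 3 (assumed estimate)
    if p < p_threshold:                                             # step 4
        return min(closer, key=lambda r: dist(r, destination))      # nearest relay to the destination
    return None
```

Returning None corresponds to the store-and-forward behavior described in Sect. 5.2: the source keeps the packet rather than spend energy on a copy unlikely to help.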
5.5 Conclusion
In this study, we proposed a data-driven strategy to reduce the consumption
of resources in the presence of selfish or malicious nodes while assessing
a node's trust dynamically in light of changes in environmental and
node conditions. We also proposed a distributed provenance-based trust
management protocol in which every node is assumed to have the ability to monitor
its neighboring nodes with known probabilities of false alarms and misses in detecting
attack behaviors or energy levels.
We first assess the trust of the node with the help of the message carrier and then
send the message. To reduce energy consumption during transmission,
we use a clustering method that splits the whole message into packets.
We also considered node traffic requirements along with energy levels when
making the CH selection, which relies on the CH role-rotation approach.
References
1. Spyropoulos, T., Rais, R., Turletti, T., Obraczka, K., & Vasilakos, A. (2010). Routing for
disruption tolerant networks: Taxonomy and design. Wireless Networks, 16(8), 2349–2370.
2. Chen, I.-R., Bao, F., Chang, M., & Cho, J.-H. (2010). Trust management for encounter-based
routing in delay tolerant networks. In IEEE Global Telecommunications Conference, Miami,
FL (pp. 1–6).
3. Chen, I.-R., Bao, F., Chang, M., & Cho, J.-H. (2014). Dynamic trust management for delay tolerant networks and its application to secure
routing. IEEE Transactions on Parallel and Distributed Systems, 25(5), 1200–1210.
4. Buneman, P., Khanna, S., & Tan, W. (2001). Why and where: A characterization of data
provenance. In Proceedings of International Conference on Database Theory (pp. 316–330).
Springer-Verlag.
5. Safaei, F., Boustead, P., Nguyen, C. D., Brun, J., & Dowlatshahi, M. (2005). Latency-
driven distribution: Infrastructure needs of participatory entertainment applications. IEEE
Communications Magazine, 43(5), 106–112.
6. Mauve, M., Vogel, J., Hilt, V., & Effelsberg, W. (2004). Local-lag and timewarp: Providing
consistency for replicated continuous applications. IEEE Transactions on Multimedia, 6(1),
47–57.
7. Valancius, V., Laoutaris, N., Massoulie, L., Diot, C., & Rodriguez, P. (2009). Greening the
internet with nano data centers. In Proceedings of the ACM 5th International Conference on
Emerging Networking Experiments and Technologies (pp. 37–48).
8. Choy, S., Wong, B., Simon, G., & Rosenberg, C. (2012). The brewing storm in cloud gaming:
A measurement study on cloud to end-user latency. In Proceedings of the ACM 11th Annual
Workshop on Network and Systems Support for Games (pp. 1–6).
9. Delaney, D., Ward, T., & McLoone, S. (2006). On consistency and network latency in distributed
interactive applications: A survey part I. Presence: Teleoperators and Virtual Environment,
15(2), 218–234.
10. Freire, J., Koop, D., Santos, E., & Silva, C. (2008). Provenance for computational tasks: A
survey. IEEE Computing in Science and Engineering, 10(3), 11–21.
5 Data-Driven Approach to Deflate Consumption 47
11. Somula, R., & Sasikala, R. (2019). A honey bee inspired cloudlet selection for resource
allocation. In Smart Intelligent Computing and Applications (pp. 335–343). Springer.
12. Nalluri, S., Ramasubbareddy, S., & Kannayaram, G. (2019). Weather prediction using clus-
tering strategies in machine learning. Journal of Computational and Theoretical Nanoscience,
16(5–6), 1977–1981.
13. Sahoo, K. S., Tiwary, M., Mishra, P., Reddy, S. R. S., Balusamy, B., & Gandomi, A. H. (2019).
Improving end-users utility in software-defined wide area network systems. IEEE Transactions
on Network and Service Management.
14. Sahoo, K. S., Tiwary, M., Sahoo, B., Mishra, B. K., RamaSubbaReddy, S., & Luhach, A. K.
(2019). RTSM: Response time optimisation during switch migration in software-defined wide
area network. IET Wireless Sensor Systems.
15. Somula, R., Kumar, K. D., Aravindharamanan, S., & Govinda, K. (2020). Twitter senti-
ment analysis based on US presidential election 2016. In Smart Intelligent Computing and
Applications (pp. 363–373). Springer.
16. Sai, K. B. K., Subbareddy, S. R., & Luhach, A. K. (2019). IOT based air quality monitoring
system using MQ135 and MQ7 with machine learning analysis. Scalable Computing: Practice
and Experience, 20(4), 599–606.
17. Somula, R., Narayana, Y., Nalluri, S., Chunduru, A., & Sree, K. V. (2019). POUPR:
Properly utilizing user-provided recourses for energy saving in mobile cloud computing.
In Proceedings of the 2nd International Conference on Data Engineering and Communication
Technology (pp. 585–595). Springer.
18. Vaishali, R., Sasikala, R., Ramasubbareddy, S., Remya, S., & Nalluri, S. (2017). Genetic
algorithm based feature selection and MOE Fuzzy classification algorithm on Pima Indians
Diabetes dataset. In International Conference on Computing Networking and Informatics
(ICCNI) (pp. 1–5). IEEE.
19. Somula, R., & Sasikala, R. (2019). A research review on energy consumption of different
frameworks in mobile cloud computing. In Innovations in Computer Science and Engineering
(pp. 129–142). Springer, Singapore.
20. Kumar, I. P., Sambangi, S., Somukoa, R., Nalluri, S., & Govinda, K. (2020). Server security
in cloud computing using block-chaining technique. In Data Engineering and Communication
Technology (pp. 913–920). Springer.
21. Kumar, I. P., Gopal, V. H., Ramasubbareddy, S., Nalluri, S., & Govinda, K. (2020). Dominant
color palette extraction by K-means clustering algorithm and reconstruction of image. In Data
Engineering and Communication Technology (pp. 921–929). Springer.
Chapter 6
Code-Level Self-adaptive Approach
for Building Reusable Software
Components
Sampath Korra, V. Biksham, Kotte Vinaykumar, and T. Bhaskar
Abstract The code-level framework is based on the notion that separate process
units can be well defined within the broader process of coding, compatible
with the life-cycle model. A code-centric framework is chosen to meet the
needs of domain-specific organizations and special projects. The framework is also
independent of any life-cycle model such as the waterfall model or spiral model. We
are able to create adaptive components that show the form and the content of each
type of artifact related to the software, including documentation, code, and the test-plan
specifications of the component. This article presents the building of adaptive code-level
reusable software components. A set of related components can be adapted
to create or modify a sample work product. Within the work product, each product
component is adaptive in nature.
S. Korra (B)
Department of CSE, Sri Indu College of Engineering and Technology, Sheriguda(V), Hyderabad,
India
e-mail: sampath_korra@yahoo.co.in
V. Biksham
Department of CSE, Sreyas Institute of Engineering and Technology, Hyderabad, India
e-mail: vbm2k2@gmail.com
K. Vinaykumar
Department of CSE, KITSW, Warangal, Telangana, India
e-mail: kvk.cse@kitsw.ac.in
T. Bhaskar
Department of CSE, CMRCET, Hyderabad, India
e-mail: bhalu7cs@gmail.com
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
B. N. K. Rao et al. (eds.), Intelligent Computing and Applications, Smart Innovation,
Systems and Technologies 315, https://doi.org/10.1007/978-981-19-4162-7_6
50 S. Korra et al.
6.1 Introduction
The software development life cycle (SDLC) is a series of steps that provide a
common understanding of the software development process: how to develop software
by capturing business needs and translating those ideas and requirements
into features and functionality, until the software is used and supported to meet
the business needs. Software engineers must have sufficient knowledge to select an SDLC
model based on the project environment and business requirements. The
software engineering life cycle has a series of phases, and these phases are broadly the
same: requirements, analysis, design, implementation, testing, and debugging.
The phases are the same across models, although not necessarily with the same steps, emphasis,
or context. Part of the software product is available to be changed in order to develop
new products as well as to test and fix bugs. Before a test can be applied, the
design and/or implementation must be completed, and it is necessary to finish the
requirements before beginning the design phase [1]. In domain-specific engineering
(DsE), the component is the basic construction unit used to develop adaptive software
components. Components are combined to form work products, and these work
products are aggregated to develop adaptive components.
Adaptive components represent a uniform group of similar parts or components,
each of which can be used for the same purpose. The aim of adaptive components is
to make components easy to access and quick to use, especially for reuse in new products or
for modifying existing products to meet changing needs [2].
6.2 Literature Survey
6.2.1 Adaptive Reuse
The concept of adaptability is widely used in electronic circuit
modeling, automatic control, and other areas of system construction. Adaptive software
systems can quickly adapt to external changes. Adapting to external changes during life-cycle
maintenance requires little operator intervention, so such systems can
also greatly reduce maintenance costs and time [3].
6.2.2 Adaptable Software System Versus Traditional Software
System
The adaptive software system has many advantages over conventional software
systems. The main advantage is that adaptive system software accommodates external
changes, whereas a traditional system is hard to adapt: changed external conditions
require reprogramming, changes to the overall design and layout tests, and possibly
even a thorough re-analysis. Such changes over the system life cycle require highly
skilled software engineers, company operators, programmers, designers, analysts,
and staff [4].
6 Code-Level Self-adaptive Approach for Building Reusable 51
6.2.3 Adaptive Methods to Improve the Software System
Adaptive methods aim to improve the adaptability of system software. To adapt to
changing business needs and external environments, the adaptability of system
software can be improved in the following ways [5].
6.2.3.1 Improve Interoperability of the Software
Software design is based on business requirements, which must account not only for
the present but also for anticipated demands. Possible changes should therefore be
considered during the design phase of the system, so that when external conditions
change, the system can adapt and continue to work stably [6].
6.2.3.2 Improve the Description of the Software
The system should have a certain ability to describe itself, with changes in external
state expressed as far as possible in the form of parameters. If the conditions
change, only the corresponding parameters need to be adjusted, and there is no need
to make extensive changes to the software. Parts of the software that remain
essentially unchanged, often packaged as modules, can be treated as tools for
solving problems in the design phase, increasing the flexibility and efficiency of
the system [7].
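As an illustrative sketch of this parameterization idea (Python is used here for illustration; the names `RetryPolicy`, `fetch_with_policy`, and `flaky` are hypothetical, not from the chapter), a component's behavior can be driven entirely by a parameter object, so adapting to changed external conditions means adjusting parameter values rather than rewriting code:

```python
from dataclasses import dataclass

@dataclass
class RetryPolicy:
    """External conditions captured as adjustable parameters."""
    max_attempts: int = 3

def fetch_with_policy(fetch, policy: RetryPolicy):
    """Run `fetch` under the given policy. Adapting to a harsher
    environment only requires a new RetryPolicy, not a code change."""
    last_error = None
    for _ in range(policy.max_attempts):
        try:
            return fetch()
        except Exception as err:  # illustrative only; real code would be stricter
            last_error = err
    raise last_error

# A flaky operation that succeeds on its third call.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient failure")
    return "ok"

print(fetch_with_policy(flaky, RetryPolicy(max_attempts=5)))  # prints: ok
```

The same component thus serves both a stable and an unstable environment; only the parameter values differ.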
6.3 Proposed Work
6.3.1 Software Engineering with Reusable Components
Reuse builds the solution for new products and processes from earlier software
development projects. By reusing the solution of a particular development effort,
the formation of new products and new technologies becomes possible [8].
52 S. Korra et al.
6.3.2 Strategy Development Reuse
A strategy is needed to create and manage the reuse of both code and processes. The
activities required to define the strategy depend on the organization and on how
the company markets its reusable components and systems. Based on an application
framework developed for the given domain, reuse supports the development of a
particular system and the maintenance of its source code [9].
6.3.3 The Incorporation Process to Reuse Project
Incorporating the reuse process into a project tailors a general, organization-wide
program to the particular project. The parent organization's common reuse program
is adapted to create and improve the production process and to manage code in the
specific situation [10].
6.3.4 Process Measurement and Evolution
The reuse process is measured and evaluated using captured data as input, in order
to create code and to manage the use of reuse processes and products [11].
6.3.5 Management Reuse
A component is the basic unit of software creation and an independent part of the
software that interacts with other components to accomplish a given task. Reusable
software components are designed to leverage both the software and the development
process, and each component has its own properties [12].
In the field of software reuse, the goal is a good technique for analyzing a domain
and the best classification schemes to use. Extending the principles of software
reuse to other areas that support reusability and management leads to a domain and
technical-field analysis model for concrete (non-abstract) components. The
different stages of the reuse life cycle are as follows [13, 14]:
(i) Understand: Understand the problem and determine the structure of a solution
based on the predefined components.
(ii) Reconfiguration: Reconfigure the solution structure to increase the possibility
of using predefined parts in the next stage.
(iii) Recovery: Acquire, evaluate, and instantiate the predefined components.
(iv) Search: Download exemplary and predefined parts.
(v) Compatibility: Modify and adapt the components [15].
(vi) Integration: Integrate the pieces into the product in this phase.
(vii) Assessment: Evaluate the reusability prospects of the components; components
obtained by editing predefined components should be added to the collection of
predefined components.
The different component metrics are as follows [16].
Component Efficiency Metrics (CEM): This metric measures the time and resources of
the development process based on the behavior of the component. Time may include
the business-component development time and the integration time.
Component Semantic Efficiency Measurement (CSEM): This metric measures the time and
resources spent on the semantic aspect of the component's behavior.
Component Reliability Metrics (CRM): This metric estimates the probability that the
component operates without defects for longer than a specified duration. Developers
can use the same techniques as in traditional systems to obtain such metrics,
taking into account factors such as portability, fault tolerance, and
recoverability.
Component Functional Metrics (CFM): This metric measures functional criteria such
as the precision of components, their interoperability, and their acceptance of the
needs of component-based development (CBD).
Component Customer Satisfaction Measurement (CCSM): This metric measures customer
satisfaction and expectations, i.e., how well the component software accords with
the customer's needs. Such estimates guide the decision to reuse a component and
can help in developing a test plan that accurately reflects the use of the product.
Component Cost Metrics (CCM): This metric measures the overall costs incurred
during component-based development, including component acquisition, component
integration, and improving the quality of the system.
Proposed Algorithm
The final goal is to develop an adaptive component so that it can be reused directly
in other software development:

$$S = \bigcup_{i=1}^{n} s_i$$

where $s_i$ and $S$ denote any software component and the complete software
component set, respectively.
6.4 Results and Discussions
In Table 6.1, we allow the values for component compatibility test metrics randomly
by assuming a range of values to get the adaptable components. In the first result,
we provided the values out of boundary level. Hence, none of the components is
compatible with adaptation. CSEM metric and CCSM metrics are an upper bound.
CRM and CCM are lower bound. CEM and CFM metrics are within the range. Table
6.1 is describing the range of software metrics values.
Figure 6.1 presents various out-of-range values for the different criteria; because
the selected metric values did not satisfy the requirements, none of the components
is selected as adaptive.

Table 6.1 Software metrics with range values

Software metrics   Range of values
CEM                1.5–2.0
CSEM               1.0–9.0
CRM                2.0–11.0
CFM                1.0–10.0
CCSM               1.0–15.0
CCM                3.0–11.0

Fig. 6.1 None of the components is adapted at code-based self-adaptive reuse
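The compatibility test behind Table 6.1 and Figs. 6.1 and 6.2 can be sketched as a simple range check: a component is selected as adaptive only if every metric value lies within its bounds. The following is an illustrative Python sketch (the per-component values are invented; only the ranges come from Table 6.1):

```python
# Metric ranges from Table 6.1: (lower bound, upper bound).
RANGES = {
    "CEM": (1.5, 2.0), "CSEM": (1.0, 9.0), "CRM": (2.0, 11.0),
    "CFM": (1.0, 10.0), "CCSM": (1.0, 15.0), "CCM": (3.0, 11.0),
}

def is_adaptive(metrics):
    """A component is adaptable only if every metric is within its range."""
    return all(lo <= metrics[name] <= hi for name, (lo, hi) in RANGES.items())

components = {
    # comp1 violates bounds (CSEM and CCSM too high, CRM and CCM too low).
    "comp1": {"CEM": 1.8, "CSEM": 12.0, "CRM": 1.0,
              "CFM": 5.0, "CCSM": 20.0, "CCM": 2.0},
    # comp2 and comp3 lie within all ranges.
    "comp2": {"CEM": 1.6, "CSEM": 4.0, "CRM": 6.0,
              "CFM": 7.0, "CCSM": 9.0, "CCM": 5.0},
    "comp3": {"CEM": 1.9, "CSEM": 8.0, "CRM": 10.0,
              "CFM": 2.0, "CCSM": 3.0, "CCM": 4.0},
}

adaptive = [name for name, m in components.items() if is_adaptive(m)]
print(adaptive)  # prints: ['comp2', 'comp3']
```

With out-of-range values the list is empty (as in Fig. 6.1); with in-range values, components such as comp2 and comp3 are selected (as in Fig. 6.2).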
We then set the values for the component compatibility test metrics again. The
second result depicts the successful identification of adaptive components among
the different components: here, we provided values within the range, so the
components are compatible for adaptation. In Fig. 6.2, different values were given
for the different metrics, and the two components comp2 and comp3 are found to be
code-level components that are adaptive in nature.
We supplied the metric values as actual values1 (result1) and actual values2
(result2) for comparison of the results. In the second result, the supplied values
lie within the range; hence, the components are compatible for adaptation in
result2.
Fig. 6.2 Two components adapted at code-based self-adaptive reuse
6.5 Conclusion and Future Work
Software reuse is practiced in almost all engineering disciplines, and reusing the
source code of an existing software system reduces maintenance costs. Reuse is
possible at a range of levels, from simple functions up to framework-level
integrated sets of software artifacts. Software reuse product lines require upfront
investment whose value is not realized immediately, so the plan must commit to
starting the reuse program. The reuse processes and programs should be incorporated
into the existing software development process, and the parts must be designed
specifically for reuse. Using different software metrics, we discussed adaptive
software reuse for different technologies; it helps to develop adaptive software.
Domain engineering leverages the inherent similarities of systems within a domain.
The domain model establishes the definition of the basic structure of current and
future needs. The right software processes can bring significant benefits to almost
every aspect of software development. Code-level compliance shows how software
reuse can be used to solve problems, which must be understood in any area of
software. In order to capture the essence of the technology developed for software
reuse and apply it to other, non-application areas, we need to identify processes
and products for future utilization.
References
1. Prieto-Diaz, R. (1990). Domain analysis: An introduction. ACM SIGSOFT Software Engi-
neering Notes, 15(2).
2. Kelly, T. P., & Whittle, B. R. (1995). Applying lessons learnt from software reuse to other
domains. In The Seventh Annual Workshop on Software Reuse, St. Charles, Illinois, USA
(pp. 28–30).
3. Ionital, M. T., Hammer, D. K., & Obbink, H. (2003). Scenario based software architecture
evaluation methods: An overview. Technical University.
4. Collier, R. W. (2002). Agent factory: A framework for the engineering of agent-oriented
applications.
5. Taylor, R. N., Tracz, W., & Coglianese, L. (1995). Software development using domain-specific
software architecture. ACM.
6. Grabow, P. C. (1984). Reusable software implementation technology reviews. Hughes Aircraft
Company.
7. Fonseca, S. P. (2002). An internal agent architecture for dynamic composition of reusable
agent subsystems—Part 1: Problem analysis and decomposition framework. Hewlett Packard
Company.
8. Sommerville, I. (2015). Software engineering (10th ed.).
9. Sametinger, J. (1997). Software engineering with reusable components. Springer Science &
Business Media.
10. Tracz, W. (1995). DSSA (domain-specific software architecture) pedagogical example. ACM.
11. Korra, S., Babu, A. V., & Raju, S. V. (2014). The adaptive approach to software reuse. In
International Conference on Contemporary Computing and Informatics (IC3I). IEEE.
12. Sedigh-Ali, S., Ghafoor, A., & Paul, R. A. (2001). Software engineering metrics for COTS-
based systems. IEEE Computer (pp. 44–50).
13. Hu, X., & Tian, Z. (2003). Research on software metrics. Information Technology, 27(5), 58–60.
14. Reddy, K. V., & Korra, S. (2018). Object-oriented analysis and design using UML. BS
Publications.
15. Korra, S., Vasumathi, D., & Vinayababu, A. (2018). An approach for cognitive software reuse
framework. In Second International Conference on Intelligent Computing and Control Systems
(ICICCS) (pp. 1–6). IEEE.
16. Gill, N. S., Lycett, M., & deCesare, S. (2002). Measurement of component-based software:
Some important issues. In 7th Annual UKAIS Conference Proceedings, Leeds, UK (pp. 373–
381).
Chapter 7
Design of a Deep Network Model
for Weed Classification
M. Vaidhehi and C. Malathy
Abstract Deep learning has attained superior performance in various aspects of
human life and the agricultural field. The primary target of deep learning for farming
is to predict the precise crop and weed location over the farmland. Here, a novel dense
convolutional neural networks (DCNN)-based network is proposed to differentiate
weeds from crops. The overall architecture of crop and weed prediction is extensive,
with enormous parameters requiring a longer training time. To handle certain limita-
tions, a novel idea of a training network is proposed to acquire a coarse–fine predic-
tion model, which is merged to achieve outcomes. The evaluation of the anticipated
model and comparison with other prevailing network models is performed using a
real-time dataset collected during the stage-by-stage growth of paddy. The experimental
outcomes and comparison proclaim that the anticipated network model
outperforms prevailing approaches such as Inception-V4, ResNet, LSTM-RNN, etc.
The simulation is done in MATLAB 2020a simulation environment and evaluated
with metrics like accuracy, precision, F1-score, and recall.
7.1 Introduction
Deep learning (DL) was depicted as a subset of machine learning (ML) as early as
the 1990s, when threshold reasoning was utilized to provision a computerized model
resembling biological pathways. This subject of study is still developing; it
may be split into two time periods: 1943–2006 and 2012 to the present [1]. Several
advancements were made during the first phase, including training algorithms, the
chain rule, the neocognitron, handwritten text recognition (the LeNet structure),
and overcoming the learning problem. Cutting-edge algorithms and architectures were
created for various applications in the second phase, including self-driving vehicles, medical,
M. Vaidhehi (B)·C. Malathy
SRM Institute of Science and Technology, SRM Nagar, Kattankulathur 603203, India
e-mail: vaidhehm@srmist.edu.in
C. Malathy
e-mail: malathyc@srmist.edu.in
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
B. N. K. Rao et al. (eds.), Intelligent Computing and Applications, Smart Innovation,
Systems and Technologies 315, https://doi.org/10.1007/978-981-19-4162-7_7
60 M. Vaidhehi and C. Malathy
language processing, earthquake forecasting, branding, finance, and image recognition.
AlexNet, which won the ImageNet object-identification competition known as the
ILSVRC, is regarded as a milestone in the area of DL [2].
As deep learning architectures evolved, researchers applied them to image prediction
and classification. These designs have also been used for a variety of agricultural
purposes. In [3], leaves were categorized among the available species using a CNN
and an RF encoder, with a CA score of 97.3%; it was, however, less effective in
detecting obstructed objects. Furthermore, the DL method is used to recognize
various plants. Sarker and Kim [4] utilized a user-modified CNN architecture,
whereas another author used the AlexNet architecture. The DL method for plant
identification was investigated and found to have a success rate of 91.78%. DL
methods are also utilized for essential tasks such as plant disease identification,
which is the subject of this paper. Some previous survey papers have summarized
DL's agricultural research (including plant disease identification), but they lack
some aspects, i.e., visualization techniques used in conjunction with DL and
stacked versions of diverse DL models adopted for plant disease prediction. The
research contributions are as follows:
(1) The input is collected in a real-time environment based on crop growth stages,
i.e., a real-time dataset with a large number of samples, well suited to the DL
classification process.
(2) A novel dense CNN (DCNN) is designed to perform the classification and generate
the outcomes. The model acts as a predictor approach to help the physicians
during the time of complexity.
(3) The approach is trained to handle over-fitting issues and is evaluated with
various metrics, namely prediction accuracy, precision, recall, and F1-score.
The work is structured as follows: Sect. 7.2 provides an extensive review of the
various existing approaches used for weed prediction, along with their pros and
cons. Section 7.3 gives a detailed description of the anticipated DCNN model for
weed prediction and its structure. In Sect. 7.4, the numerical results attained
with the proposed model are provided, and a detailed analysis against various
existing approaches shows the model's superiority. In Sect. 7.5, the research
summary is given, with certain research constraints and ideas for future research
enhancements.
7.2 Related Works
The detection of plant density and distribution is a crucial topic of research. We can
assess weed coverage and dispersion over a field using automatic pattern recognition
and image processing approaches. These estimates can then be used to treat weeds
appropriately. Developing a sprayer that can autonomously treat weedy regions in
crops is an intriguing application [5]. Unexpected field circumstances, changing
weather conditions might alter dataset, and differences in lighting conditions are all
critical obstacles in this study domain [6]. With each growth stage, plant morphology,
7 Design of a Deep Network Model for Weed Classification 61
texture, and the color of crops and weeds change dramatically. The majority of
computer vision research aims at detecting distinct weed kinds in diverse images,
such as segmenting narrow- and broad-leaf images. Using the wavelet transform,
Dyrmann et al. [7] categorized weeds as narrow or wide. After that, background
objects are removed using the difference between two channels. Weed classification
is based on tightness, momentum, curvature, and Fourier coefficients, and wheat
images are utilized for training. To categorize the images, both supervised and
unsupervised clustering are employed. The plant border, the interior plant region,
and the breadth and height with the shortest distance of the plant are the
parameters utilized for classification in [8], which suggested weed detection for
maize plants. Image analysis accuracy is measured for both static and moving
robotic images, and weeds are identified and classified using an artificial neural
network (ANN).
Another study in this field looked at the identification of maize plants and weeds
in various environments. Using the db1 wavelet [8], two-level wavelet decomposition
is performed, and the power of the wavelet coefficients at the two levels is
determined. A backpropagation network is then used to classify seven
characteristics, along with the frequency and fractal dimensions of the wavelet
coefficients. Potena et al. [9] developed a weed control spraying system that uses
features derived from GLCM, FFT, and the scale-invariant feature transform (SIFT)
to identify weeds as broad or narrow leaf. In the two-dimensional FFT, maximum and
minimum coefficients are used to categorize weeds into two kinds; SIFT is based on
the image's difference of Gaussians. In terms of accuracy, SIFT surpassed the other
two methods. Simonyan and Zisserman [10] developed a method for distinguishing
grass weeds from rice using UAV data, which used a deep convolutional network to
classify images into weed patches, rice patches, and other ground, among other
things. Another study used an SVM- and ANN-based feature set of invariant moments
and Fourier descriptors to detect weed in sugar beet. Mubeen et al. [2] modeled an
ML approach for detecting weed patches in images collected from a height of 50 m
using a drone. The classifiers are trained using nine texture features: Haralick's
and GLCM descriptors. For weeds that resemble rice seedlings, such as grasses, this
approach is inaccurate. The closest study, by dos Santos Ferreira et al. [11],
concentrates on vegetation categorization in soybean using LBP features followed by
GLCM. It uses grass density to classify images and identify those that pose a fire
danger along roadways. As a voting-majority predictor, they utilized a mixture of
three classifiers: ANN [13], k-NN [14], and linear SVM [15]. Broadleaf weed density
categorization into three wheat crop classes is another effort similar to our
approach [16, 17]. Another approach is to assess weed density in a recorded video
frame or image and spray based on density rather than the precise weed location
[18–20]. There have been studies on weed coverage, but none have specifically
addressed grass density categorization in rice fields.
7.3 Methodology
This section involves two major phases: (1) dataset acquisition and (2) classification
with dense CNN model. The simulation is done in the MATLAB 2020a environment,
and metrics like accuracy, precision, F1-score, and recall are evaluated and compared
with other approaches to depict the significance of the anticipated model.
7.3.1 Dataset Description
The dataset is built from real-time samples collected over a period of time. The
images are captured using a Canon SX730HS 20.3 MP digital camera, and 1000 samples
are collected. Here, 700 samples are used for training and 300 samples for testing.
The label counts for the corresponding dataset are 81 for paddy and 320 for weed,
respectively. Each input image is resized to 200 rows and 200 columns.
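The data preparation above can be sketched as follows (illustrative Python, although the chapter's experiments are run in MATLAB; the file names are placeholders): the 1000 samples are shuffled and divided into 700 training and 300 testing samples, each to be resized to 200 × 200.

```python
import random

# Placeholder identifiers standing in for the 1000 captured images.
samples = [f"img_{i:04d}.jpg" for i in range(1000)]

random.seed(42)            # fixed seed so the split is reproducible
random.shuffle(samples)

train, test = samples[:700], samples[700:]  # 700 training / 300 testing
TARGET_SIZE = (200, 200)   # each input image resized to 200 rows x 200 columns

print(len(train), len(test))  # prints: 700 300
```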
7.3.2 Dense Convolutional Neural Networks (DCNN)
Multiple fully connected layers (FCLs) are adopted to adapt the network to the
target domain, a step generally known as fine-tuning. It is considered an essential
step for transfer learning with pre-trained networks. Figure 7.1 depicts the
experimental flow of the anticipated model.
The activation features of the initial convolution layers respond to pixels with
specific attributes such as color, edges, and lines within the filter window. The
edge-based features pass through the intermediate CNN layers, where they are
combined by many filters whose weights are initialized randomly and updated through
backpropagation training.
The intermediate layers tend to identify the activated image parts, while the final
layers learn the discriminative features (pattern and shape) between the target
regions. When training reaches convergence, the weights no longer change and the
training accuracy reaches its maximum value, thereby terminating the training
process. Consequently, the proposed CNN model is trained and acts as a generic
feature extractor, analogous to the conventional way of generating features. Here,
multiple 2D filters produce 4D outputs in every connected layer, i.e., one feature
map per filter. Convolving the features with the provided filters predicts the
occurrence of the features over the image. The numbers of filters are 64, 128, and
256. The convolution window moves according to the stride size. The network layer
convolves the input by sliding the filters horizontally and vertically: it
evaluates the weighted dot product of the filter and the given input and
subsequently adds the bias term. As the filter moves along the input, it uses the
same weight set and bias for each convolution, thereby forming a feature map.
Fig. 7.1 Experimentation workflow
During the training process, the gradient of the error is backpropagated through
the transformations to evaluate the gradients of parameters such as the batch
normalization transformation. The initial layers, i.e., convolution, batch
normalization, max-pooling, and ReLU, are defined based on the optimal layers with
an input of 64 × 64 (2D scan). The final feature size after convolution is
[2 × 2 × 2] for all 64 filters, which shows that the filter kernels possess two
pixels for all filters. Therefore, this work does not consider more than five
convolution layers. Training and validation are provided to project how the
proposed 2D CNN influences the training process and to assist in a better
understanding of the convergence process for all CNNs. To better understand the
feature extraction of the provided convolutional layers, an image from every domain
is passed through the network and every layer is monitored. The observations help
to distinguish the patterns, intensities, lines, and edges. The layers are
visualized for the entire testing set, which therefore helps to segregate the
features from the provided framework in a superior manner. Finally, the outcomes
for the various hyper-parameter settings are provided.
The filter size depicts the scanning window during the convolution process, and a
stride of two increases the window size for consecutive layers. Therefore, features
are extracted sequentially at the lower, intermediate, and higher levels. Here,
lower-level features are extracted with a 3 × 3 × 3 filter window and maximum
pooling over 2 × 2 × 2 windows with the stride of the convolutional layers. The
filter size increases based on the step size; however, the number of filters in
every layer is kept at 64 so as to preserve the channel size for the given
64 × 64 × 64 input. Thus, the variations are captured more straightforwardly. As
the layers go deeper, features are accumulated by increasing the window size of
each layer. The max-pooling strides are also increased to diminish feature
redundancy. The kernel size is uniform at 3 × 3 × 3 for all convolutional layers.
7.4 Numerical Results and Discussion
Here, four metrics are utilized for quantitative comparison and evaluation, including
accuracy, precision, recall, and F1-score. In the equations given below, True
Positive (TP), True Negative (TN), False Positive (FP), and False Negative (FN)
counts are used for evaluation, expressed as in Eqs. (7.1)–(7.4):

$$\text{Accuracy} = \frac{TP + TN}{TP + FP + FN + TN} \tag{7.1}$$

$$\text{Precision} = \frac{TP}{TP + FP} \tag{7.2}$$

$$\text{Recall} = \frac{TP}{TP + FN} \tag{7.3}$$

$$\text{F1-score} = \frac{2\,TP}{2\,TP + FP + FN} \tag{7.4}$$
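Equations (7.1)–(7.4) can be computed directly from the confusion-matrix counts. The sketch below evaluates them for invented counts (these numbers are illustrative only, not the chapter's results):

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall, and F1-score per Eqs. (7.1)-(7.4)."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)   # Eq. (7.1)
    precision = tp / (tp + fp)                   # Eq. (7.2)
    recall = tp / (tp + fn)                      # Eq. (7.3)
    f1 = 2 * tp / (2 * tp + fp + fn)             # Eq. (7.4)
    return accuracy, precision, recall, f1

# Illustrative confusion-matrix counts.
acc, prec, rec, f1 = classification_metrics(tp=85, tn=90, fp=10, fn=15)
print(f"acc={acc:.3f} prec={prec:.3f} rec={rec:.3f} f1={f1:.3f}")
# prints: acc=0.875 prec=0.895 rec=0.850 f1=0.872
```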
Table 7.1 depicts the DCNN performance against the existing LSTM-RNN, SAE-DNN,
2DCNN, and standard CNN models; metrics including accuracy, precision, recall, and
F1-score are evaluated and compared among these models. Table 7.2 depicts the DCNN
performance against the existing Inception-v4, ResNet, and CNN models, with the
same metrics evaluated and compared. Figure 7.2 represents the training and
validation accuracy, whereas Fig. 7.3 represents the training and validation loss.
Figure 7.4 shows the accurate classification result between paddy and weed.

Table 7.1 Performance metrics evaluation
Metrics LSTM-RNN (%) SAE-DNN (%) 2DCNN (%) CNN (%) DCNN (%)
Accuracy 78 77 83 76 89.47
Precision 68 73 84 76 86.66
Recall 76 77 83 75 89.47
F1-score 72 75 82 64 90.41
DL is considered an excellent alternative for dealing with such constrained data.
Even though hybrid approaches have achieved relatively superior outcomes, they do
not benefit from DL, which extracts features automatically from a large amount of
imaging data. The most commonly adopted DL approaches in computer vision studies
are CNNs, which specialize in extracting image characteristics. The DCNN model
proposed in this work shows superior performance for weed classification.

Table 7.2 Performance evaluation of proposed versus existing
Metrics Inception-v4 (%) ResNet (%) CNN (%) DCNN (%)
Accuracy 75 82 83 89.47
Precision 67 68 84 86.66
Recall 75 82 83 89.47
F1-score 71 75 82 90.41

Fig. 7.2 Representation of training and validation accuracy
Fig. 7.3 Representation of training and validation loss
Fig. 7.4 Classified paddy and weed
7.5 Conclusion
In this research, a novel deep CNN-based network model is proposed to differentiate
weeds from crops. To handle certain limitations of the general approaches, a novel
idea of a training network is proposed to acquire a coarse–fine prediction model,
which is merged to achieve the outcomes. Some research limitations are the
complexity of testing specific weed growth stages and weed (grass) types. Also,
these network models do not target the prediction of other kinds of weeds such as
sedges, broadleaf weeds, etc.
In the future, the detection of these kinds of grasses at multiple growth stages
will be considered. Further research directions are to predict weeds in
multiple-weed images, i.e., images with various weeds, and to evaluate the density
over a specific region for predicting weed density. Next, the concatenation of
diverse optimization approaches is highly solicited to acquire global outcomes.
References
1. Hinton, G. E., & Salakhutdinov, R. R. (2006). Reducing the dimensionality of data with neural
networks. Science, 313(5786), 504–507.
2. Mubeen, K., Nadeem, M. A., Tanveer, A., & Jhala, A. J. (2014). Effects of seeding time and
weed control methods in direct seeded rice (Oryza sativa L.). Journal of Animal and Plant
Science, 24(2), 534–542.
3. Rao, A. N., & Ladha, J. K. (2013). Economic weed management approaches for rice in Asia.
In The Role of Weed Science in Supporting Food Security by 2020. Proceedings of the 24th
Asian-Pacific Weed Science Society Conference, Bandung, Indonesia, October 22–25, 2013
(pp. 500–509). Weed Science Society of Indonesia.
4. Sarker, M. I., & Kim, H. (2019). Farm land weed detection with region-based deep convolutional
neural networks. arXiv preprint arXiv:1906.01885.
5. Wendel, A., & Underwood, J. (2016, May). Self-supervised weed detection in vegetable crops
using ground based hyperspectral imaging. In 2016 IEEE International Conference on Robotics
and Automation (ICRA) (pp. 5128–5135). IEEE.
6. Garcia-Ruiz, F. J., Wulfsohn, D., & Rasmussen, J. (2015). Sugar beet (Beta vulgaris L.)
and thistle (Cirsium arvensis L.) discrimination based on field spectral data. Biosystems
Engineering, 139, 1–15.
7. Dyrmann, M., Jørgensen, R. N., & Midtiby, H. S. (2017). RoboWeedSupport—Detection
of weed locations in leaf occluded cereal crops using a fully convolutional neural network.
Advances in Animal Biosciences, 8(2), 842–847.
8. Lottes, P., Behley, J., Milioto, A., & Stachniss, C. (2018). Fully convolutional networks with
sequential information for robust crop and weed detection in precision farming. IEEE Robotics
and Automation Letters, 3(4), 2870–2877.
9. Potena, C., Nardi, D., & Pretto, A. (2016, July). Fast and accurate crop and weed identification
with summarized train sets for precision agriculture. In International Conference on Intelligent
Autonomous Systems (pp. 105–121). Springer.
10. Simonyan, K., & Zisserman, A. (2014). Very deep convolutional networks for large-scale image
recognition. arXiv preprint arXiv:1409.1556.
11. dos Santos Ferreira, A., Freitas, D. M., da Silva, G. G., Pistori, H., & Folhes, M. T. (2017).
Weed detection in soybean crops using ConvNets. Computers and Electronics in Agriculture,
143, 314–324.
12. Myers, D., Ross, C. M., & Liu, B. (2015). A review of unmanned aircraft system (UAS)
applications for agriculture. In 2015 ASABE Annual International Meeting (p. 1). American
Society of Agricultural and Biological Engineers.
13. Ronneberger, O., Fischer, P., & Brox, T. (2015, October). U-net: Convolutional networks for
biomedical image segmentation. In International Conference on Medical Image Computing
and Computer-Assisted Intervention (pp. 234–241). Springer.
14. Ramprakash, T., Madhavi, M., & Yakadri, M. (2013). Influence of bispyribac sodium on soil
properties and persistence in soil, plant and grain in transplanted rice. Progressive Research,
8(1), 16–20.
15. Rao, A. N., & Chauhan, B. S. (2015). Weeds and weed management in India—A review.
16. Rashid, M. H., Alam, M. M., Rao, A. N., & Ladha, J. K. (2012). Comparative efficacy of
pretilachlor and hand weeding in managing weeds and improving the productivity and net
income of wet-seeded rice in Bangladesh. Field Crops Research, 128, 17–26.
17. Bakhshipour, A., Jafari, A., Nassiri, S. M., & Zare, D. (2017). Weed segmentation using texture
features extracted from wavelet sub-images. Biosystems Engineering, 157, 1–12.
18. Lavania, S., & Matey, P. S. (2015, February). Novel method for weed classification in
maize field using Otsu and PCA implementation. In 2015 IEEE International Conference
on Computational Intelligence & Communication Technology (pp. 534–537). IEEE.
19. Rumpf, T., Römer, C., Weis, M., Sökefeld, M., Gerhards, R., & Plümer, L. (2012). Sequential
support vector machine classification for small-grain weed species discrimination with special
regard to Cirsium arvense and Galium aparine.Computers and Electronics in Agriculture, 80,
89–96.
20. Yu, J., Sharpe, S. M., Schumann, A. W., & Boyd, N. S. (2019). Detection of broadleaf weeds
growing in turfgrass with convolutional neural networks. Pest Management Science, 75(8),
2211–2218.
Chapter 8
E-Voting System Using U-Net
Architecture with Blockchain Technology
Nuthalapati Sudha and A. Brahmananda Reddy
Abstract In order to give power to the right person, any democracy must have a fair voting system that meets the needs of the people. In India, there are two types of voting system: paper ballots and electronic voting machines (EVMs), both of which have similar drawbacks. Providing security in a digital voting system has long been a major concern. The proposed work implements an interface in which the user first registers and is then authenticated and validated through several phases, such as OTP and IRIS recognition, to establish whether they are a valid or fraudulent voter. The election process is conducted on blockchain technology, which provides transparency, security and reliability. Therefore, the voting percentage should increase, because each and every vote counts and there is no delay in counting ballots.
8.1 Introduction
In 1989, the Election Commission of India collaborated with Bharat Electronics Limited and Electronics Corporation of India Limited to create the Indian electronic voting machine (EVM). The EVMs were designed by faculty members at IIT Bombay's Industrial Design Centre. Traditional voting methods such as paper voting, punch-card voting and ballot systems face practical problems; during covid-19, for example, the voting percentage in the recent GHMC elections decreased drastically because of the pandemic. Conventional voting also suffers from time consumption, mountains of paperwork and limited mobility; officials play no direct role once a machine is damaged, and there are no updates. An online voting system (OVS) can overcome these weak points by offering a better voting method in terms of correctness, convenience, flexibility, confidentiality, reliability, validity and portability, and a voter can cast his/her vote from anywhere through the OVS.
N. Sudha ·A. Brahmananda Reddy (B)
VNR VJIET, Bachupally, Hyderabad, India
e-mail: brahmanandareddy_a@vnrvjiet.in
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
B. N. K. Rao et al. (eds.), Intelligent Computing and Applications, Smart Innovation,
Systems and Technologies 315, https://doi.org/10.1007/978-981-19-4162-7_8
Electronic voting (e-voting) has been one of the blockchain’s (BC) newer appli-
cations, with academics trying to take advantage of benefits including integrity,
anonymity and non-repudiation, all of which are vital in a voting application. A
block is a collection of data. Mining is the process of gathering and processing data so that it fits into a block. Each block can be identified by a cryptographic hash (also known as a digital fingerprint). Every block contains the hash of the previous block, so the blocks form a chain running back to the first block ever created (known as the Genesis block); in this way, all of the data are linked together in a linked-list structure. In 2008, a person (or group of persons)
going by the name Satoshi Nakamoto created the BC to serve as the public transac-
tion record for the cryptocurrency bitcoin. Satoshi Nakamoto’s true identity is still
unknown. With the creation of the BC for bitcoin, it became the first digital currency
to solve the double-spending problem without the need for a trusted authority or
central server. Every user on the BC network has the ability to connect, send new
transactions, validate transactions and generate new blocks. A cryptographic hash is
assigned to each block, which remains true as long as the information in the block
is not changed. The voter's transaction sends a voting token to the preferred candidate's address as the receiver's address. If one of the assigned miners confirms this transaction into a block, it becomes part of the main BC. This may help to increase voter participation. The key advantages of developing decentralised applications are that there is no single point of failure, so our programme will continue to function even if some machines fail. These applications are more secure, and they are transparent: any participant can check that everyone else is following the rules and that no one is tampering with the data. Ethereum is a flexible platform that can be used to create both private and public decentralised applications. The basic concepts of an e-voting system using blockchain technology and the system architecture were introduced in the research article titled "E-Voting system using Deep Learning and Blockchain technology."
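The chain of hash-linked blocks described above can be sketched in a few lines of Python. This is a minimal illustration of the linking mechanism only; the block fields and the vote strings are hypothetical, not the chapter's actual data format.

```python
import hashlib
import json

def block_hash(block):
    # Hash the block's contents deterministically (sorted keys).
    payload = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()

def make_block(data, prev_hash):
    # Each block stores the hash of its predecessor, forming the chain.
    return {"data": data, "prev_hash": prev_hash}

# Genesis block: the first block, with no real predecessor.
genesis = make_block("genesis", prev_hash="0" * 64)
chain = [genesis]

# Append two illustrative vote transactions; each new block links back
# to the hash of the block before it.
for tx in ["voter_1 -> candidate_A", "voter_2 -> candidate_B"]:
    chain.append(make_block(tx, prev_hash=block_hash(chain[-1])))

def chain_is_valid(chain):
    # Tampering with any block changes its hash and breaks every later link.
    return all(chain[i]["prev_hash"] == block_hash(chain[i - 1])
               for i in range(1, len(chain)))
```

Because each block's hash covers its data and its predecessor's hash, altering any stored vote invalidates every subsequent link, which is what makes tampering detectable.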
8.2 Literature Survey
Gomathi and Veena Priyadarshini [1] provide an interface implemented with three security measures: a magnetic-coated card scan, fingerprint recognition and password validation. These measures secure the interface so that no fraudulent voter can be entertained, and a voter cannot vote more than once. However, the system requires an additional device for fingerprint recognition, and more advanced or enhanced operations need to be implemented in order to safeguard the system from malicious threats.
Yu et al. [2] introduce Hyperledger as the BC framework, implement consensus using PBFT, and perform authentication with a ring-signature method, which increases the number of voters that can be handled. However, transactions take more time, and the framework does not support transactional parallelism.
Hjálmarsson et al. [3] describe a system comprising district nodes that manage the boot node's smart contract. Exonum, Quorum and Geth are the three frameworks recommended. Exonum is a premium system that can be used with cryptocurrency, making it prohibitively expensive for larger implementations; other open-source frameworks are available that can give better performance. Quorum and Geth are Ethereum-based frameworks that do not allow parallel transaction execution, limiting scalability and speed.
Zhang and Romero [4] use Paillier homomorphic encryption, together with other cryptographic methods such as blind signatures and zero-knowledge proofs, in their e-voting system. The technique also meets the requirements of a generic voting system, such as eligibility, accuracy, simplicity, privacy and robustness. It is written in C++ and uses GMP, a multiple-precision arithmetic library, and the "Paillier" library for the computations required for encrypting, decrypting, vote validation and tallying. To increase overall system performance, some parts of the system, such as certifying the sealed ballot, verifying legitimate votes and counting the votes, could be made to run in parallel.
Zagórski et al. [5] describe a system in which the voter registers for the election and receives a QR code from the voting assistant; this QR code is scanned at the time of the election. The voter can then cast a vote for his/her favourite candidate. The vote is encrypted and is decrypted only when the results are tallied.
Kshetri and Voas [6] work with BC technology in which each voter holds a "wallet" with user credentials. Each voter receives a single coin in his wallet that represents a vote, and he can spend the coin only once; voting moves the coin of the elector into the pocket of a nominee. Voters can change their vote from one candidate to another before a pre-set deadline, and because a voter may vote for the wrong candidate by mistake, a confirmation is requested before the vote is submitted. Eligible electors vote anonymously on a machine or mobile, and the blockchain-enabled voting uses an encrypted personal identification key, which limits each voter to one vote and prevents voting fraud.
Ismail and Kintu [7] deal with fingerprint recognition, where fingerprints are collected traditionally using ink or electronically using screen sensors such as a fingerprint reader/scanner. The fingerprints are used for authentication, which requires a user's verification and identification. Verification matches a captured fingerprint against biometric fingerprints already in the database. The system involves four distinct components: a matching module with a decision module, in which the user is checked against the claimed identity; a device storage module; and a database in which data collected during fingerprint registration are stored for later retrieval. Matching against the corresponding user's record is 1:1 verification, whereas comparing against the fingerprints of all registered users is 1:N identification. The system does not run on various devices, for example, mobile phones.
Arshad et al. [8] provide a detailed description of how their system prevents double voting (voting by the same person in multiple areas). The voter must first register and be authenticated using fingerprint recognition; after the authentication process is completed, the person can vote. This information is then used to generate a cryptographic hash, which contributes to the creation of a transaction ID that is transmitted to the voter's email address. Fingerprint recognition requires the use of another device and direct contact, which is an undesirable safety risk at this moment of covid, as cases are spreading widely.
8.3 Methodology
Registration Phase: The user has to register within the application by providing a few details such as name, Aadhaar number and phone number. Later, the voter will be validated as to whether he/she is a valid voter or not.
Slogan: Vote for the best candidate and bring shine into your life.
Authentication Phase
Authentication Phase 1: The UIDAI ("Authority") issues a 12-digit number to Indian citizens who have completed the Government's identity verification process. Everyone residing in India, irrespective of age, is eligible to apply for an Aadhaar number (Fig. 8.1).
Authentication Phase 2: In the second phase, the voter receives a code on their mobile number; this checks whether the information submitted by the voter is correct or not. By delivering a one-time code to a mobile phone number, we can also prevent voter fraud. SMS validation, also known as SMS two-factor authentication (2FA) or SMS one-time passcode (OTP), allows users to confirm their identities through a code texted to them. It is a type of 2FA that functions as a second verifier for users attempting to access a system, device or app, and it is a solid initial step toward added protection. Following sign-in, the user receives a text message containing an SMS authentication token. To gain access, the user simply inputs the code on the interface, and the voter's code is validated (Fig. 8.2).
Fig. 8.1 Aadhaar card
Fig. 8.2 Process of authentication
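The OTP step can be illustrated with a minimal sketch. This is a generic illustration using only Python's standard library, not the API of the 2Factor service used in the actual implementation; the server-side key handling is an assumption.

```python
import hmac
import hashlib
import secrets

def generate_otp(digits=6):
    # A uniformly random numeric one-time code, as delivered over SMS.
    return str(secrets.randbelow(10 ** digits)).zfill(digits)

def store_otp(otp, server_key):
    # The server keeps only an HMAC of the code, never the code itself.
    return hmac.new(server_key, otp.encode(), hashlib.sha256).hexdigest()

def verify_otp(submitted, stored_digest, server_key):
    candidate = hmac.new(server_key, submitted.encode(),
                         hashlib.sha256).hexdigest()
    # Constant-time comparison avoids timing side channels.
    return hmac.compare_digest(candidate, stored_digest)

key = secrets.token_bytes(32)   # per-session server secret (illustrative)
otp = generate_otp()            # this value would be texted to the voter
digest = store_otp(otp, key)
```

In practice the stored digest would also carry an expiry timestamp so that a code cannot be replayed later.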
Authentication Phase 3: IRIS recognition using convolutional neural networks
(Fig. 8.3).
Pre-processing an image: In pre-processing, the images are first converted from BGR into grayscale; then, using thresholding and Canny edge detection, we localise the iris region of the image. Secondly, normalisation is a critical step that ensures each input parameter (each pixel, in this case) has a similar data distribution. This speeds up convergence whilst training the network (Fig. 8.4).
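The grayscale conversion and normalisation steps can be sketched in plain Python. A real pipeline would use a library such as OpenCV, which also supplies the thresholding and Canny edge detection; the luminance weights below are the conventional BT.601 values, an assumption not stated in the chapter.

```python
from statistics import mean, pstdev

def bgr_to_gray(pixel):
    # Conventional luminance weights; note the OpenCV-style channel
    # order is B, G, R.
    b, g, r = pixel
    return 0.114 * b + 0.587 * g + 0.299 * r

def preprocess(image):
    # image: rows of (B, G, R) tuples with values in 0..255.
    gray = [[bgr_to_gray(px) for px in row] for row in image]
    flat = [v for row in gray for v in row]
    mu, sigma = mean(flat), pstdev(flat) or 1.0
    # Normalise to zero mean / unit variance so every input pixel has a
    # similar data distribution, which speeds up training convergence.
    return [[(v - mu) / sigma for v in row] for row in gray]
```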
Segmentation: Image segmentation is the process of dividing an image into parts with similar attributes, based on their features and properties. Its primary goal is to simplify the image so that it can be analysed more easily; it is a subset of digital image processing.
Train and Test: The data set is split 60/40: 60% of the images are used for training and 40% for testing (Fig. 8.5).
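A reproducible 60/40 split can be sketched as follows; the file names and seed are illustrative, not the chapter's actual data set.

```python
import random

def train_test_split(samples, train_frac=0.6, seed=42):
    # Shuffle a copy so the split is random but reproducible.
    items = list(samples)
    random.Random(seed).shuffle(items)
    cut = int(len(items) * train_frac)
    return items[:cut], items[cut:]

# Hypothetical image file names standing in for the iris data set.
images = [f"iris_{i:03d}.png" for i in range(100)]
train, test = train_test_split(images)
```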
Fig. 8.3 Flow process of IRIS recognition
Fig. 8.4 Input image: IRIS pattern
Architecture (model): The U-Net model is a widely used deep convolutional network (CNN) architecture for biomedical and other image translation tasks. Its advantage is its capacity to produce relatively accurate models from (very) tiny data sets, which is a prevalent issue in data-constrained image processing applications like iris recognition. U-Net uses an encoder–decoder architecture composed of deep neural encoding and decoding levels: the encoding path is on the left side of the network, and the decoding path is on the right. The encoder adopts a conventional CNN architecture popularised by the visual geometry group (VGG) network, which comprises two 3 * 3 convolutions followed by a rectified linear unit (ReLU) activation layer with maximum pooling. Every encoder down-sampling step doubles the number of feature channels whilst halving the image resolution. The decoder part then up-samples the lower layer's feature maps and concatenates and crops the encoder part's output of the same depth. This ensures that data are propagated at all scales from the encoder to the decoder and that no information is lost during the encoder's down-sampling on the way to the U-Net model's output. The network's final layer is a 1 × 1 convolutional layer that integrates the output signals from the previous layer and generates a segmentation (into two classes: iris vs. non-iris) that represents the U-Net model's result (Fig. 8.6).
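The channel-doubling and resolution-halving bookkeeping described above can be traced with a small helper. The base channel count (64) and depth (4) follow the classic U-Net of Ronneberger et al. [13] and are assumptions, since the chapter does not state its exact configuration.

```python
def unet_feature_shapes(resolution=256, base_channels=64, depth=4):
    """Track (resolution, channels) through a U-Net encoder and decoder.

    Each encoder step halves the spatial resolution and doubles the
    number of feature channels; the decoder mirrors this, up-sampling
    and concatenating the encoder map of the same depth (skip link).
    """
    encoder = []
    res, ch = resolution, base_channels
    for _ in range(depth):
        encoder.append((res, ch))
        res, ch = res // 2, ch * 2
    bottleneck = (res, ch)
    decoder = []
    for enc_res, enc_ch in reversed(encoder):
        # After up-sampling, the skip concatenation momentarily doubles
        # the channels before convolutions reduce them back to enc_ch.
        decoder.append((enc_res, enc_ch))
    # Final 1x1 convolution maps to 2 classes: iris vs. non-iris.
    return encoder, bottleneck, decoder, (resolution, 2)
```

Calling `unet_feature_shapes()` makes the symmetry of the "U" explicit: the decoder visits exactly the resolutions and channel counts of the encoder, in reverse.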
Fig. 8.5 U-Net model
Fig. 8.6 Flow process of model
Election process:
The voter has to select and cast a vote for the best candidate, the one who deserves it or who can handle the government.
After the registered voter has been validated, the ballot paper becomes visible to the voter.
The voter submits the vote and then receives an acknowledgement (alert message) that the vote has been cast for the particular candidate (example: Id: 234,561 your vote has been casted to candidate1 and thank you for voting).
After the election is over, the votes are automatically displayed to the user.
Fig. 8.7 Chain of blocks in blockchain [1]
Vote transaction: When a person votes, he or she interacts with a smart contract called a ballot. The BC interacts with this smart contract, which adds the vote to the BC. After the voter submits his or her preferred candidate, the votes are held in a mempool. Miners then pick the transaction up from the mempool and add it to a block in the BC. Each vote is saved as a transaction on the BC, and each voter is given the transaction ID for their vote. On the BC, each transaction has information about who voted for which candidate. The data in the BC are stored using a Merkle tree, which requires very little memory. Each vote is appended to the BC via the relevant ballot smart contract (Fig. 8.7).
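The Merkle-tree storage of vote transactions can be sketched as follows. The vote strings are illustrative; a single root hash then commits to every vote in the block, so any tampering with a stored vote changes the root.

```python
import hashlib

def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(transactions):
    """Compute the Merkle root (hex) of a list of vote transactions."""
    if not transactions:
        return _h(b"").hex()
    # Hash each vote transaction to form the leaf level.
    level = [_h(tx.encode()) for tx in transactions]
    while len(level) > 1:
        if len(level) % 2:            # duplicate the last node on odd levels
            level.append(level[-1])
        # Pairwise-hash adjacent nodes up to the next level.
        level = [_h(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0].hex()

votes = ["Id: 234561 -> candidate1", "Id: 234562 -> candidate2",
         "Id: 234563 -> candidate1"]
root = merkle_root(votes)
```

Only the 32-byte root needs to be kept in the block header, which is why the chapter notes that Merkle-tree storage requires very little memory.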
Announce the results: The election commission will publish the results in the
website once the voting period has ended.
8.4 Results and Discussion
An OTP is sent to the user using the 2Factor service (Fig. 8.8).
IRIS Recognition: The model achieved an overall accuracy of 95%, which is significantly higher than traditional methods (Figs. 8.9, 8.10 and 8.11).
Conditions to be followed
According to Article 58 of the Constitution, no one may be elected President unless he is a citizen of India.
Every individual voter should be at least 18 years of age at the time of registering, so that they are eligible to vote according to the Election Commission of India.
Fig. 8.8 OTP has been sent to a user
Fig. 8.9 IRIS pattern
8.5 Conclusion
The three levels of authentication (Aadhaar ID verification, OTP sent to the registered mobile number and finally IRIS recognition) are used, and a CNN technique is applied to detect fraudulent voters. Once the election process is done, the system also helps in declaring accurate results.
Fig. 8.10 Accuracy
Fig. 8.11 Election process
References
1. Gomathi, B., & Veena Priyadarshini, S. (2013). Modernized voting machine using finger print
recognition. IJSER, 4(5). ISSN: 2229-5518.
2. Yu, B., Liu, J. K., Sakzad, A., Nepal, S., Steinfeld, R., Rimba, P., & Au, M. H. (2018). Platform
independent secure blockchain-based voting system (pp. 369–386). Springer. https://doi.org/10.
1007/978-3-319-99136-8_20
3. Hjálmarsson, F., Hreiðarsson, G. K., Hamdaqa, M., & Hjálmtýsson, G. (2018). Blockchain-based
e-voting system (pp. 983–986). IEEE.
4. Zhang, M., & Romero, S. (2020). Design and implementation of an e-voting system based on
Paillier encryption (pp. 815–831). Springer.
5. Gaweł, D., Kosarzecki, M., Vora, P. L., Wu, H., & Zagórski, F. (2017). Apollo—End-to-end
verifiable internet voting with recovery from vote manipulation. Springer.
6. Kshetri, N., & Voas, J. (2018). Blockchain-enabled E-voting. IEEE Software, 35, 95–99.
7. Ismail, M., & Kintu, N. B. (2018). A secure e-voting system using biometric fingerprint and
crypt-watermark methodology. IEEE.
8. Khan, K. M., Arshad, J., & Khan, M. M. (2018). Secure digital voting system based on blockchain
technology. International Journal of Electronic Government Research (IJEGR), 14, 53–62.
Chapter 9
Multi-layered Architecture to Monitor
and Control the Energy Management
in Smart Cities
A. K. Damodaram, S. Sreenivasa Chakravarthi, L. Venkateswara Reddy,
and K. Reddy Madhavi
Abstract Energy meter monitoring and control management in real time still have
several limitations in smart city establishments. One of the essential components of
smart city is energy meter monitoring and control in real time. Since it is a herculean
task for the energy companies to monitor and control energy meter in collecting the
data on energy consumption, billing, payments and other managerial aspects in smart
city environment, a suitable wireless sensor network architecture that encompasses
emerging technologies like smart devices, Cloud–Fog computing, IoT and drones
may be an approach to provide a solution in this regard. Present-day communication technologies have revolutionized the way smart phones and smart gadgets/devices are deployed to access, acquire, store and share data. The Internet of Things (IoT) has, in the recent past, facilitated several applications and solutions that make our lives 'smart', leading to 'smart city' establishments. The present article proposes a novel 'three-layered' wireless sensor network architecture for energy meter monitoring and control management in real time.
9.1 Introduction
In a conventional metering system, the energy-providing authority sends personnel to acquire the data manually and record the energy consumption [1], so that billing can be done accordingly. Manual reading systems have disadvantages: they require a large number of personnel to read the meters and are prone to erroneous readings, tampering of the readings, etc. This calls for the adoption of emerging technologies like
A. K. Damodaram (B)
Department of ME, Sree Vidyanikethan Engineering College (Autonomous), Tirupati, AP, India
e-mail: akdamodaram@vidyanikethan.edu
S. Sreenivasa Chakravarthi ·K. Reddy Madhavi
Department of CSE, Sree Vidyanikethan Engineering College (Autonomous), Tirupati, AP, India
e-mail: sschakravarthi@ieee.org
L. Venkateswara Reddy
Department of CSE, KG Reddy College of Engineering, Hyderabad, AP, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
B. N. K. Rao et al. (eds.), Intelligent Computing and Applications, Smart Innovation,
Systems and Technologies 315, https://doi.org/10.1007/978-981-19-4162-7_9
IoT to provide a comprehensive solution to the problem [2]. Energy consumption monitoring using IoT is an emerging technology today. An IoT-based system transforms the way people and organizations use and control power consumption. Internet of Things (IoT) products have recently become an essential part of any home, in conjunction with the great advancements in Internet speeds and services [3]. The system consists of smart devices, hardware and software, and allows real-time monitoring of power-consumption telemetry and predictive calculation of consumption. The smart meter can automatically send its reading to the energy supplier for accurate billing, removing estimation-based billing [4].
9.1.1 Proposed Questions
The following questions were raised in relation to the development of an architecture for energy meter monitoring and control management for smart cities:
1. What are the architectural components of the wireless sensor network required for energy meter monitoring?
2. What is a suitable architecture framework for the smart city environment?
3. How are computing and communicating technologies distributed in the wireless sensor network architecture for smart cities?
4. What is the 'threat landscape' of the to-be-proposed architecture?
9.1.2 Methodology
The present article assumes a generic four-stage process which describes the data flow from the IoT devices and sensors through the Cloud or Fog layers. The to-be-proposed architecture shall fulfil the following:
a. All data acquisition and collection happen in real time at the set time intervals
b. Acquired data on energy consumption from devices and sensors are conveyed to the Cloud, the Fog or both, with the objective of reducing or avoiding the burden on the system of sending large raw data to the core computing capabilities
c. Provide gateways for easy access between the separated layers for effective utilization of the computing resources
d. Allow preprocessing of the data to reduce large unstructured data
e. Allow core computing facilities on the cloud to deliver better performance.
Based on the above-mentioned aspects, a novel first-hand architecture was developed from an understanding of the information sourced from books, journals, whitepapers and technical reports available on the Internet.
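Requirement (d), preprocessing at the Fog layer to reduce large raw data before it reaches the core Cloud computing capabilities, can be sketched as a simple aggregation step; the meter IDs and readings below are illustrative.

```python
from statistics import mean

def fog_preprocess(raw_readings):
    """Aggregate raw smart-meter samples at the Fog layer.

    raw_readings: (meter_id, kwh) samples from one acquisition interval.
    Returns one averaged record per meter, shrinking the payload that
    must travel on to the core (Cloud) computing layer.
    """
    by_meter = {}
    for meter_id, kwh in raw_readings:
        by_meter.setdefault(meter_id, []).append(kwh)
    return {m: round(mean(vals), 3) for m, vals in by_meter.items()}

# Four raw samples collapse into two per-meter records.
raw = [("M1", 0.50), ("M1", 0.54), ("M2", 1.20), ("M2", 1.16)]
summary = fog_preprocess(raw)
```

Pushing this reduction to the Fog node is what keeps the large unstructured raw stream off the Cloud link, as stated in objective (b).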
9.2 Literature
9.2.1 Smart Meters
A smart meter is an electronic device intended to record consumption of energy, current, voltage, power factor, etc., and to communicate this information in real time. The functional aspects of smart meters make them best suited for automating data acquisition in an IoT environment. In IoT, everything is configured with an Internet protocol address and can be monitored, controlled and accessed remotely in accordance with Web technology [5]. However, the battery capacity of the meters limits the quantity and frequency of data sent to the nodes and modules of the monitoring system. Moreover, the location of meters often limits real-time signal transmission [6]. Benefits of using smart meters in an IoT environment include:
Readings can be automatically sent to the electricity department.
No need to prepare estimated bills.
No room for human errors/mistakes.
9.2.2 Smart City
A smart city is a developed and modernized urban area characterized by the inte-
gration of urban infrastructural facilities with the information and communication
technologies (ICT) through Internet. Smart city establishments permit the city offi-
cials to interact directly with community in providing the urban infrastructure like
utilities, energy, water supply, waste disposal and other civic amenities. Smart city
establishment allows the use of wide range of electronic and digital technologies
to connect urban infrastructure with a proper technology architecture. It also embeds information and communication technologies (ICT) into government systems.
Smart city energy sector module involves collection of data on energy consumption,
processing, billing, control management, etc. Energy providers see opportunities for
information and communication technology (ICT) enabled smart energy applications
[7].
9.2.3 Internet of Things (IoT)
The concept of the Internet of Things first became popular in 1999, through the
Auto-ID Center at MIT and related market-analysis publications [8,9]. IoT is an
information technology foundation which enables communication between elec-
tronic devices and sensors with the assistance of Internet. The Internet of Things
(IoT) is emerging as a significant development in information technology, with the
potential to increase convenience and efficiency in daily life [10]. The technical concept of the IoT is to enable these different physical objects to sense information using sensors and send this information to a server [11].
9.2.4 Cloud–Fog-IoT Environment
In Fog environments, nodes are not always computationally active: a node is put in operation mode only when required, can be turned OFF when there is no data, and can be turned ON again when required. The Fog environment can also be made scalable on each communication link among the nodes, and connectivity features can be applied for data acquisition; thus, consistent real-time data transfer can be ensured. Fog and Cloud platforms differ from each other in resource capacity and capability; to facilitate deployment of a Cloud platform in a Fog environment, a cluster-based Fog system architecture is commonly used. A three-tier IoT system architecture consists of smart devices in tier 1, an edge gateway in tier 2 and the Cloud in tier 3. The networked things, such as sensors and actuators, use protocols such as Modbus, Bluetooth, Zigbee or other proprietary protocols to connect to an edge gateway [12]. The Cloud tier in most Cloud-based IoT platforms facilitates the event queuing and messaging that transpires across all tiers. To build an Internet of Things, the Web of Things acts as an architecture for the application layer. The IoT also includes devices such as Nest thermostats, security cameras, electric appliances, voice assistants such as Google Home or Alexa, and sensors [13]. Programming and controlling the flow of information in the IoT environment needs an architectural design and environment with a blend of traditional process mining and special automation capabilities (Fig. 9.1).
9.2.5 A Network Architecture
The Internet of Things (IoT) requires huge scalability in the network space to handle the surge of devices [1]. Fog computing is a viable alternative that prevents a large amount of data from flowing through the Internet [14].
9.2.6 Drones for Smart Cities
The use of drones is being witnessed across all human technological interventions in day-to-day life. Drones provide flexibility in adapting to various aspects of smart city life, including safety and security, policing, delivery of small consignments, infrastructure planning, traffic control and monitoring, smart lighting, smart metering, smart banking, smart governance, etc. The capability of drones to carry small to medium loads is a high priority in research and development.
Fig. 9.1 Cloud–Fog-IoT environment
In the context of the present work, if drones are mounted with the necessary sensory devices to collect data in real time from smart meters in a given locality, we can eliminate unnecessary investment in computing and communication technologies at the terminal-perception layer interfaces. A sensor network embedding drones and IoT can be a valuable option in the smart city environment. Generally, IoT is most abundant in manufacturing, transportation and utility organizations, which make use of sensors and other IoT devices [15].
9.2.7 Network Architectures for Smart Cities
The sensor network architecture for a smart city needs to provide connectivity to the IoT devices and the computing and communicating technologies [16]. Where connectivity is required, a zoned architecture is adopted, with firewalls and/or demilitarized zones used to protect the core control system components [17]. The following technologies need to be incorporated into the architecture:
a. SCADA—Supervisory Control and Data Acquisition
b. WPAN—Wireless Personal Area Network
c. LPWAN—Low-Power Wide Area Network.
The architecture for a wireless sensor network for smart city energy meter monitoring and control management should facilitate the sharing of information across the different IoT devices (smart meters), applications and the computing and communicating layers. It is also expected to consolidate the network, Cloud, Fog and IoT devices in the smart city environment with the following capabilities:
i. IoT device provisioning with access controls
ii. Network node provisioning in an automated approach.
Based on the application domain, these solutions can be classified under five different categories: (1) smart wearable, (2) smart home, (3) smart city, (4) smart environment and (5) smart enterprise [18].
9.2.8 Objectives of Energy Meter Monitoring Device System in Smart Cities
An energy consumption monitoring device system in the smart city has the following functions to fulfil:
(a) Web-based system for remotely tracking energy consumption
(b) Real-time data acquisition through sensor-based IoT devices
(c) Monitor the data from the grid to compute energy losses
(d) Quick and reliable connectivity to the entire network through the IoT platform
(e) Enable real-time reading and billing
(f) Enable access through Android/iOS apps.
There are multiple benefits of using IoT for power monitoring and management, including real-time monitoring with total control over the acquired data. Data from domestic and industrial establishments pertaining to energy consumption would be enormous and hence need consolidation using suitable database management tools. These data may be used further to establish trends in energy usage and as a check for power distribution losses. Other benefits are predictive cost calculation, if required, and automated energy consumption administration. Despite its intricate connectivity and loaded functionality, the energy monitoring system is easy to install and configure using a smart device.
The proposed systems have the following features:
Real-time data processing for energy consumption and management for the
domestic and industrial establishments
Electronic device recognition based on smart IoT devices platform
Detection of energy consumption trends at distribution level and grid level
Automated system with error-free energy consumption monitoring
Automation of data collection, analytics, visualization and reporting on energy
consumption
9 Multi-layered Architecture to Monitor and Control… 87
Detection of misuse of energy and tracing of faulty or tampered energy meters
Scalability depending on requirements.
9.3 A Novel Approach to Data Acquisition for Energy
Monitoring System
IoT provides a three-layer network through which sensors can share data with one
another. The Internet of Things belongs to the class of application programs for
collecting data in real time from multiple devices installed at remote locations and
for controlling those devices by setting conditions. The major requirement for
communication among these devices is the Internet. A suitable wireless sensor network
architecture is a prerequisite to encompass all the capabilities in the cloud environ-
ment through Fog. IoT facilitates connecting and networking a wide variety of
energy monitoring devices in commercial and residential use. IoT offers
sophisticated methods and techniques to analyze and optimize the use of energy
utilization monitoring devices in the entire distribution system. IoT can provide
suitable models, algorithms and architectures for energy consumption monitoring
and management. IoT can also provide a scope to discover faulty smart meters
and erroneous consumption from outdated appliances, damaged/under-performing
appliances and faulty system components. Energy consumption should be monitored
effectively to avoid misuses, losses at the user side and to ensure proper auditing of
energy both at generation and distribution sides. Automated meter reading systems
make visits by the reader unnecessary and also allow energy companies to bill based
on real-time data instead of estimates over time. If the automated meter reading
devices are modified to smart devices with a capability to connect to the network
modules like Wi-Fi, IoT-Drone adapter and Arduino through Internet, it is possible
to establish a sound system of energy meter control and management effectively.
In this present work, we propose a novel three layers architecture for acquiring
energy consumption data which uses real-time interfacing technology-based smart
devices/smart meters, Wi-Fi and drones.
9.3.1 Proposed Architecture for Cloud-Fog-IoT
Environment—Framework
In the proposed approach, the device interfaces enable 'real-time' operation across
the three layers. The interfacing technologies for smart devices and 'cryptography'
will enable real-time accessibility between the Cloud and Fog layers (Fig. 9.2).
A three-layer architecture for Cloud–Fog-IoT deployment.
A three-layered architecture for a 'Cloud–Fog-IoT'-based system is proposed in the
present work. This architecture comprises a terminal layer, a perception layer and a
network layer (Fig. 9.2 shows a typical Cloud–Fog-IoT environment). The terminal
layer contains IoT devices deployed in home/office
establishments to read the energy meters and acquire the data. The IoT devices include
smart meters, remote terminal units and information collection devices. This layer
acquires the data and transmits it to the perception layer. The perception layer
collects the data from all the devices in the terminal layer through 'data analytics'.
The data thus acquired is subjected to analysis and interpretation regarding the type
of use and the consumption of energy, and is converted into standard datasets for
every home/business establishment according to the norms set by the energy supplying
authority. Data validation shall also be incorporated at this perception layer, with
suitable software and hardware, for detecting faulty energy meters, metering errors,
etc. The data is then transferred to the network layer for storage, retrieval, analysis
and interpretation at the central level. Hence, the proposed three-layer network model
enables data acquisition through the IoT tools/devices deployed across the layers
(Fig. 9.3).
Fig. 9.3 Proposed Cloud–Fog-IoT environment
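The terminal → perception → network data flow described above can be sketched in code. This is an illustrative Python sketch only; the class and function names, and the simple range check used for validation at the perception layer, are our own assumptions, not part of the proposed system:

```python
from dataclasses import dataclass

@dataclass
class MeterReading:
    meter_id: str
    kwh: float

# Terminal layer: IoT devices acquire raw readings from smart meters.
def terminal_layer(raw):
    return [MeterReading(mid, kwh) for mid, kwh in raw]

# Perception layer (Fog): validate readings and flag suspect meters
# before forwarding, reducing the load on the network layer.
def perception_layer(readings, max_kwh=100.0):
    valid, faulty = [], []
    for r in readings:
        (valid if 0.0 <= r.kwh <= max_kwh else faulty).append(r)
    return valid, faulty

# Network layer (Cloud): store consolidated data for trend analysis.
def network_layer(valid, store):
    for r in valid:
        store.setdefault(r.meter_id, []).append(r.kwh)
    return store

store = {}
valid, faulty = perception_layer(terminal_layer([("m1", 12.5), ("m2", -3.0)]))
network_layer(valid, store)
print(store)                          # {'m1': [12.5]}
print([r.meter_id for r in faulty])   # ['m2']
```

The point of the sketch is the separation of concerns: the faulty reading never reaches the network-layer store, mirroring the validation role assigned to the perception layer.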
9.3.2 The Activities in Three-Layer Network
The following are the activities envisaged in the proposed 'three-layer architecture'
for smart meter monitoring and control management in real time for the smart city
environment (Table 9.1).
IoT-Fog–Cloud layers in the proposed architecture (Fig. 9.4):
The information thus collected is conveyed to the data processing capabilities
in the Fog computing infrastructure. The perception layer is the intermediary in the
architecture that reduces the data processing burden on the network layer, which
handles energy meter usage trends, storage of structured data pertaining to energy
consumption, and interpretation of the data by a suitable application to support
decision-making regarding control management and comprehensive energy
consumption monitoring authorizations.
Table 9.1 Three-layer architecture proposed activities

S. No.  Layer             Activities
1       Terminal layer    Data collection—raw data on energy consumption
2       Perception layer  Data processing—data consolidation to resource description
                          framework (RDF) using semantic Web technologies; also, data
                          integration and reasoning
3       Network layer     Data storage and interpretation—device controls, alerts and
                          overall control management of the energy consumption trends
Fig. 9.4 IoT-Fog–Cloud layers (proposed): the physical world (energy consumption at smart
meter-equipped home/office/industry establishments) connects through smart meter IoT nodes;
enablers (communication protocols, drone-based data collection, SDN) handle data collection,
filtering and consolidation; data integration feeds the application layer (energy metering trends,
storage, interpretation, monitoring and control modules)
Fig. 9.5 Block diagram of proposed system for IoT environment with Wi-Fi and drone modules
9.3.3 Arduino-Based Energy Meter Control
and Management
Figure 9.5 shows the Arduino-based energy meter control and management modules.
The figure elaborates various components and modules and their connectivity logic.
In addition to the usual features, the proposed model is expected to provide the
following:
a. Tamper proof
b. Low/High voltage protection
c. IoT adaptability.
9.3.4 Components of IoT-Based Energy Meter Control
and Management
9.3.4.1 Smart Energy Meter Module
To record and display the energy consumption in kWh at the user's premises, it is
required to measure the frequency and power factor along with the current and voltage [19].
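From these measurements, real power follows as P = V·I·pf, and energy as power integrated over time. A minimal Python sketch of the arithmetic (the function names are ours, for illustration only):

```python
def real_power_kw(v_rms: float, i_rms: float, power_factor: float) -> float:
    """Real power P = V_rms * I_rms * pf, converted from W to kW."""
    return v_rms * i_rms * power_factor / 1000.0

def energy_kwh(power_kw: float, hours: float) -> float:
    """Energy consumed over a period, in kWh (what the meter records)."""
    return power_kw * hours

p = real_power_kw(230.0, 10.0, 0.9)                 # 2.07 kW
print(round(p, 2), round(energy_kwh(p, 5.0), 2))    # 2.07 10.35
```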
9.3.4.2 Arduino
With the required uplink/downlink capabilities, it has to send information to the
Wi-Fi module or the IoT-Drone adapter module. The selected Arduino shall have low
power consumption, shall be controlled via AT commands and needs to be embedded
with TCP/UDP protocols. It is also required to support communication protocols
like TTL. Proper Arduino code needs to be developed to facilitate interfacing,
Internet testing and sending and receiving data from the different modules of
the energy meter control and management system.
9.3.4.3 Wi-Fi Module
To provide seamless wireless connectivity to IoT three-layer platforms with Internet
connectivity.
9.3.4.4 IoT to Drone Adapter
The drone adapter shall be equipped with interfacing modules to provide seamless
connectivity to the energy meter control and management system as and when required.
9.3.4.5 LCD Display
The LCD module has to display parameters like voltage, current, real-time power,
energy consumption, power factor and the consolidated energy bill in real
time.
9.3.4.6 Relays and Regulators
Relays and regulators are required to connect and disconnect the smart energy meter
system from the grid. Rectifiers, filter capacitors and voltage regulators, along with
a step-down transformer, are also required for the system.
In the proposed system, the smart energy meter provides information about energy
usage to the Arduino, which in turn displays the same for the consumer. The reading
shall be sent either to the Wi-Fi module or to the IoT-Drone adapter module and on
to the proposed three layers.
In the case of far-reach and scattered user points, a drone shall be equipped with
the receiver module to receive the information from the IoT-Drone board in real time.
In the case of users in regular domestic establishments like gated communities, open
communities and town/city layouts, the Wi-Fi module shall convey the information about
the energy meter reading through the Arduino.
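The uplink choice just described reduces to a simple dispatch rule. The following hedged Python sketch illustrates one way to express it; the category labels and function name are our own illustrative assumptions:

```python
def choose_uplink(establishment: str) -> str:
    """Pick the uplink for a meter reading: scattered/far-reach user points
    go via the IoT-Drone adapter; regular establishments (gated communities,
    open communities, town/city layouts) go via the Wi-Fi module."""
    scattered = {"remote-farm", "scattered-settlement"}
    return "iot_drone_adapter" if establishment in scattered else "wifi_module"

print(choose_uplink("gated-community"))   # wifi_module
print(choose_uplink("remote-farm"))       # iot_drone_adapter
```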
9.3.5 Features of Proposed 'Three-Layered Architecture'
for Smart Meter Monitoring and Management
with Integration of Cloud, Fog, IoT and Wireless Sensor
Network Model
The features of the proposed model, as shown in Fig. 9.3, are as follows:
a. Smart meters and devices connectivity through terminal layer with the help of
interfacing device/meters/technologies.
b. Consolidation and data analytics through Fog in perception layer.
c. Storage of data after consolidation and analytics in the form of trends and
databases in Cloud in network layer.
The proposed architecture includes three platforms (Fog, Cloud and the IoT environ-
ment) to facilitate data collection, data analytics and data storage at three different
layers of the architecture. Domestic establishments consist of several premises like
homes, campuses and buildings, while office and industry establishments consist of
factory space, offices, departments, labs and rooms. Each establishment in the smart
city environment has different energy needs and utilization that are significant to
take into account when managing energy consumption as a single unit. Even in a single
home, each family member has his/her own energy consumption requirements which
need to be considered. Hence, this framework has a tree-like control plane structure,
as presented in Fig. 9.3, comprising a 'three-layer network' for energy consumption
monitoring and management from single-unit-level consumption up to smart city
level. Today's IoT systems include event-driven smart applications (apps) that
interact with sensors and actuators [20].
9.3.6 Threat Landscape in Three-Layer Architecture
Smart city IoT establishments are susceptible to cyber-attacks. An intruder may
breach the architecture's communication technologies and may tamper with, manip-
ulate or misuse the data pertaining to energy consumption and usage trends.
Among the security threats in smart city establishments, data manipulation threats are
crucial. Data manipulation in the smart city may lead to tampering with the data,
corrupting the data, misusing the data and disrupting decision-making.
Data tampering: Data tampering may result in faulty smart meter readings
conveyed to the perception layer.
Data corruption: Data corruption may result in faulty meter readings conveyed to
the network layer.
Data misuse: Data misuse may result in fraudulent transactions related to energy
consumption. And finally,
Disruptions: Disruption may affect decision-making regarding energy usage trends
in the city's individual establishments like homes, buildings, offices and factory
premises, and may influence the structured data stored in the network layer for further
decision-making.
Since the intruder first attacks the smart devices and smart meters in the terminal
layer, it is essential to deploy security measures at the perception layer with
the help of anti-virus software, firewalls, etc. Hence, the smart meters and devices
should be enabled with protection against tampering, manipulation and reprogramming.
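One standard counter-measure against data tampering in transit is a message authentication code: each meter signs its readings with a provisioned shared key, and the perception layer verifies the signature before accepting the data. A minimal Python sketch using only the standard library; the key, field names and message format are illustrative assumptions, not part of the proposed system:

```python
import hashlib
import hmac
import json

SECRET = b"meter-shared-key"  # assumed to be provisioned per meter

def sign_reading(reading: dict) -> dict:
    """Attach an HMAC-SHA256 tag computed over the canonical JSON payload."""
    payload = json.dumps(reading, sort_keys=True).encode()
    return {"reading": reading,
            "mac": hmac.new(SECRET, payload, hashlib.sha256).hexdigest()}

def verify(msg: dict) -> bool:
    """Recompute the tag at the perception layer and compare in constant time."""
    payload = json.dumps(msg["reading"], sort_keys=True).encode()
    expected = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, msg["mac"])

msg = sign_reading({"meter_id": "m1", "kwh": 12.5})
print(verify(msg))                 # True
msg["reading"]["kwh"] = 99.9       # reading tampered with in transit
print(verify(msg))                 # False
```

Any modification of the reading after signing invalidates the tag, so tampered data can be rejected before it reaches the network layer.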
9.4 Conclusions
The proposed framework for energy consumption control and management through a
three-layered Cloud–Fog-IoT deployment architecture ensures decentralization of the
functions of a conventional central IoT platform. In general, Cloud architecture has
some limitations due to its geographically centralized and multi-hop nature with
respect to the IoT devices that are the data sources; hence the inclusion of the Fog
layer in the proposed framework. This inclusion shall address those limitations by
endorsing decentralization of devices, applications and platforms. The
decentralization shall relieve the total data burden on the central system by
distributing it through the terminal and perception layers, where much of the data
analytics is done at the initial stages of data acquisition. The proposed framework
is developed for an energy monitoring system in which the limitations of conventional
energy monitoring are addressed and eliminated. In this work, the motivation to
propose a Cloud–Fog-IoT-based architecture framework for energy consumption
monitoring is the flexibility of three different layers with three different
approaches, with the inclusion of drones. The present work gives scope for further
work on performance evaluation of the proposed architecture by considering parameters
like network delay, latency, cost, etc., through simulation toolkits to demonstrate
its feasibility.
References
1. Pal, A. (2015). Internet of Things: Making the hype a reality (PDF). IT Pro, 17(3), 2–4. https://
doi.org/10.1109/MITP.2015.36
2. Rashdi, A., Malik, R., Rashid, S., Ajmal, A., & Sadiq, S. (2012). Remote energy monitoring,
profiling and control through GSM network. In 2012 International Conference on Innovations
in Information Technology (IIT).
3. Alfandi, O., Hasan, M., & Balbahaith, Z. (2019). Assessment and hardening of IoT development
boards. Lecture notes in computer science (pp. 27–39). Springer. https://doi.org/10.1007/978-
3-030-30523-9_3. ISBN: 978-3-030-30522-2
4. Chauhuri, A. (2018). Internet of things, for things, and by things. CRC Press.
ISBN: 9781138710443.
5. Das, H., & Saikia, L. C. (2015). GSM enabled smart energy meter and automation of home
appliances. IEEE. ISBN: 978-1-4678-6503-1.
6. Lloret, J., Tomas, J., Canovas, A., & Parra, L. (2016). An integrated IoT architecture for smart
metering. IEEE Communications Magazine, 54(12), 50–57.
7. Solaimani, S., Keijzer-Broers, W., & Bouwman, H. (2015). What we do—And don’t—Know
about the Smart Home: An analysis of the Smart Home literature. Indoor and Built Environ-
ment, 24(3), 370–383. https://doi.org/10.1177/1420326X13516350. ISSN: 1420-326X. S2CID
59443602.
8. Why edge computing is an IIoT requirement: How edge computing is poised to jump-start the
next industrial revolution. Retrieved June 3, 2019 from iotworldtoday.com
9. Staff, Investopedia. (2011). Cloud computing. Investopedia. Retrieved October 8, 2018.
10. Hsu, C.-L., & Lin, J.C.-C. (2016). An empirical examination of consumer adoption of Internet
of Things services: Network externalities and concern for information privacy perspectives.
Computers in Human Behavior, 62, 516–527.
11. Alsulami, M. M., & Akkari, N. (2018). The role of 5G wireless networks in the internet-
of-things (IoT). In 2018 1st International Conference on Computer Applications Information
Security (ICCAIS) (pp. 1–8). https://doi.org/10.1109/CAIS.2018.8471687. ISBN: 978-1-5386-
4427-0
12. Hassan, Q., Khan, A., & Madani, S. (2018). Internet of things: Challenges, advances, and
applications (p. 198). CRC Press. ISBN: 9781498778510.
13. Hamilton, E. (2019). What is edge computing: The network edge explained. Retrieved June
14, 2019 from cloudwards.net
14. Reza Arkian, H. (2017). MIST: Fog-based data analytics scheme with cost-efficient resource
provisioning for IoT crowdsensing applications. Journal of Network and Computer Applica-
tions, 82, 152–165. https://doi.org/10.1016/j.jnca.2017.01.012
15. Rouse, M. (2019). Internet of things (IoT). IOT Agenda. Retrieved August 14, 2019.
16. Raji, R. S. (1994). Smart networks for control. IEEE Spectrum, 31(6), 49–55. https://doi.org/
10.1109/6.284793. S2CID: 42364553.
17. Boyes, H., Hallaq, B., Cunningham, J., & Watson, T. (2018). The industrial internet of things
(IIoT): An analysis framework. Computers in Industry, 101, 1–12.
18. Perera, C., Liu, C. H., & Jayawardena, S. (2015). The emerging internet of things marketplace
from an industrial perspective: A survey. IEEE Transactions on Emerging Topics in Computing,
3(4), 585–598.
19. Kabir, Y., Mohsin, Y. M., & Khan, M. M. (2017). Automated power factor correction and
energy monitoring system. IEEE
20. Nguyen, D. T., Song, C., Qian, Z., Krishnamurthy, S. V., Colbert, E. J., & McDaniel, P.
(2018). IoTSan: Fortifying the safety of IoT systems. In Proceedings of the 14th International
Conference on Emerging Networking Experiments and Technologies (CoNEXT '18),
Heraklion, Greece. arXiv:1810.09551. https://doi.org/10.1145/3281411.3281440
Chapter 10
Bio-Inspired Firefly Algorithm
for Polygonal Approximation on Various
Shapes
L. Venkateswara Reddy, Ganesh Davanam, T. Pavan Kumar,
M. Sunil Kumar, and Mekala Narendar
Abstract Polygonal approximation (PA) is a challenging problem in the representation
of images in computer vision, pattern recognition and image analysis. This paper
proposes a stochastic firefly algorithm (FA)-based technique for PA. The technique
customizes a kind of randomization by searching over a set of solutions; in contrast,
PA requires many combinations of approximations to find an optimal solution. The algo-
rithm involves several steps to produce better results. The attractiveness and bright-
ness of the firefly are used efficiently to solve the approximation problem.
Compared to other similar algorithms, FA is independent of velocities, which
is an advantage of this algorithm. Furthermore, the multi-swarm
nature of FA allows finding multiple optimal solutions concurrently. The technique
achieves the main goal of PA, that is, a minimum error value with a smaller number
of dominant points. The experimental results show that the proposed algorithm
generates better solutions than other algorithms.
10.1 Introduction
Extraction of boundary points from a shape and representation of the same shape
with a smaller number of boundary points is a state-of-the-art problem. This technique
is termed polygonal approximation (PA) for closed curves. It is applied in various
L. Venkateswara Reddy
Department of Computer Science and Engineering, KG Reddy College of Engineering and
Technology, Chilikuru (Village), Moinabad (M), R R District, Hyderabad, Telangana, India
G. Davanam (B)·M. Sunil Kumar
Department of Computer Science and Engineering, Sree Vidyanikethan Engineering College
(Autonomous), Tirupati, Andhra Pradesh, India
e-mail: dgani05@gmail.com
T. Pavan Kumar
Department of Computer Science and Engineering, Koneru Lakshmaiah Educational Foundation,
Vaddeswaram, Andhra Pradesh, India
M. Narendar
Jayamukhi Institute of Technological Sciences, Hanmkonda, Telangana, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
B. N. K. Rao et al. (eds.), Intelligent Computing and Applications, Smart Innovation,
Systems and Technologies 315, https://doi.org/10.1007/978-981-19-4162-7_10
96 L. Venkateswara Reddy et al.
applications such as image analysis, computer vision and pattern recognition. The
accuracy depends on how closely the approximated polygon matches the original
polygon shape. Depending on the compactness, the error values differ and produce
different approximations. The points which are considered for the approximation are
named dominant points (DP) of the approximation. PA involves approximation of a
digital planar curve using line segments that are categorized by curvature maxima
and minima points. The technique involves determining an object boundary
by tracing straight lines and approximating its outline in the best possible manner.
However, the time and space complexity of the algorithm increases with the number
of DP, which usually cannot be manipulated in real time. This paper attempts to create
a novel mechanism that enables one to decide the accuracy required for tracing an
object outline using the least possible number of dominant points. The paper begins by
outlining the background behind this research, discusses various related works
pioneered by other researchers and then goes on to propose a sub-optimal algorithm.
The authors subsequently explain their derivation. The paper concludes with the
advantages and limitations of the algorithm as well as provides an overview of the
various avenues where this algorithm might find its usage. Works of the past few
decades have used split, merge, split-and-merge and sequential techniques for solving
the polygonal approximation problem [1–10].
The polygonal approximation follows one of these criteria:
min-ε: fix the value of M; search for the optimal M vertices that approximate the
polygon with the minimum total distortion error.
min-#: fix ε; identify the minimum number of vertices that approximate the polygon
such that the approximation error does not exceed ε.
Ramer [9] used a split-and-merge approach and proposed an iterative method
taking the initial boundary points as input. The segments are split in each iteration
at the farthest point until the approximation error no longer exceeds the
specified error value.
Masood [11] proposed an algorithm based on dominant point deletion for polyg-
onal approximation. The initial points are detected in a pre-processing step, and
points are then eliminated one by one depending on the error associated with each DP.
Once a point is deleted, the neighbouring points' error values are updated. The
algorithm does not guarantee optimal results.
Marji [12] proposed a nonparametric approach for dominant point detection.
The 8-connected chain code is used efficiently for PA based on the strength of
the point (left support region + right support region), the length of the support
region and the centroid.
Existing meta-heuristic approaches to this problem include iPSO, GA, DPSO-EDA,
DPSO, PSO, mPSO, etc. In this paper, Sect. 10.2 summarizes the problem formulation,
and Sect. 10.2.1 gives an overview of Freeman's chain code computation and explains
the rules for computing the chain code of a digital shape with illustrations.
Section 10.3 narrates the significance of the firefly algorithm (FA), and
Sect. 10.3.1 suggests the proposed firefly algorithm for polygonal approximation.
Section 10.4 explains the experimental results of the proposed
10 Bio-Inspired Firefly Algorithm for Polygonal 97
algorithm in comparison with similar existing algorithms. Finally, Sect. 10.5 presents
the conclusion of the paper.
10.2 Problem Formulation
A closed digital curve CD with N points, a clockwise-ordered sequence of contour
points, is represented as $C = \{C_i(X_i, Y_i) \mid i = 1, 2, 3, \ldots, N\}$, where
the ith contour point $C_i$ holds $(X_i, Y_i)$ and $C_{N+1} = C_1$. Let
$B_{ij} = \{C_i, C_{i+1}, \ldots, C_j\}$ represent a bend from $C_i$ to $C_j$.
$LS_{ij}$ represents the line segment between $C_i$ and $C_j$, i.e. the chord of
$B_{ij}$. The distortion error associated with approximating the bend $B_{ij}$ using
its chord $LS_{ij}$ is determined using Eq. (10.1):

$$e(LS_{ij}, B_{ij}) = \sum_{k=i}^{j} d(C_k, LS_{ij}) \quad (10.1)$$

where $d(C_k, LS_{ij})$ is the perpendicular distance from the line segment $LS_{ij}$
to the contour point $C_k$. The approximated polygon retains the shape with M contour
points and is represented as

$$NC = \{nc_i(x_i, y_i) \mid i = 1, 2, 3, \ldots, M\}$$

where M is always less than N.
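The distortion error of Eq. (10.1) can be computed directly. A minimal Python sketch (the function names are ours, for illustration) summing the perpendicular distances of the bend's points to its chord:

```python
import math

def perp_distance(p, a, b):
    """Perpendicular distance d(C_k, LS_ij) from point p to the line through a, b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    num = abs((by - ay) * px - (bx - ax) * py + bx * ay - by * ax)
    return num / math.hypot(bx - ax, by - ay)

def segment_error(curve, i, j):
    """Eq. (10.1): distortion error of approximating bend B_ij by its chord LS_ij.
    The endpoints C_i and C_j lie on the chord, so only interior points contribute."""
    return sum(perp_distance(curve[k], curve[i], curve[j])
               for k in range(i + 1, j))

curve = [(0, 0), (1, 1), (2, 0)]
print(segment_error(curve, 0, 2))   # 1.0: the point (1, 1) lies 1 unit off the chord
```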
Figure 10.1 shows a sample approximation of the shape apple-5 from
MPEG7_CE-Shape-1_Part_B. Figure 10.1a shows the original image with N = 881,
and Fig. 10.1b shows the sample approximation with M = 55. The approximated
image retains the same shape as the original image with a smaller number
of contour points. The points which retain the shape are designated dominant
points; the points which are suppressed are designated weak points of the shape.
Many techniques are used for the approximation of the shape with different criteria
and error metrics. Most of the techniques are split, merge, split-and-merge, sequential
or iterative in nature [9, 10]. Most of the methods involve obtaining an initial
set of candidates using Freeman chain code assignment and then either computing the
redundant break point deviation from the curve or creating pseudo points and
attempting to calculate the deviation of each point with respect to the median value.
10.2.1 Basic Pre-processing
The basic pre-processing, used to suppress collinear points and to detect the initial
break points in most of the algorithms, was proposed by Marji et al. [12] and
Masood [11]. These break points are detected using Freeman's chain code
Fig. 10.1 Sample approximation. a Original contour N = 881 (apple-5) and b approximated
contour M = 55 (apple-5)
using eight different directions, each separated by 45°, as shown in Fig. 10.2a.
A closed digital curve CD with N points, a clockwise-ordered sequence of contour
points, is represented as $C = \{C_i(x_i, y_i) \mid i = 1, 2, 3, \ldots, N\}$, where
the ith contour point $C_i$ holds $(x_i, y_i)$ and $C_{N+1} = C_1$. The chain code
generated using this approach is shown in Fig. 10.2b, and the extracted break points
for the chromosome shape are shown in Fig. 10.2c.
The following condition should be satisfied when computing the break points, as
given in Eq. (10.2):

$$CC_i \neq CC_{i-1} \quad (10.2)$$

where $CC_i$ represents the chain code of the ith point; it is compared with
$CC_{i-1}$, and if the two are not equal, the point is treated as a break point of
the shape.
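The eight-direction chain code and the break point rule of Eq. (10.2) can be sketched together as follows. This is an illustrative Python implementation; the direction numbering below is one common convention (code 0 = east, increasing counter-clockwise by 45°) and may differ from the figure:

```python
# Freeman 8-direction codes: (dx, dy) -> code, directions 45 degrees apart.
DIRS = {(1, 0): 0, (1, 1): 1, (0, 1): 2, (-1, 1): 3,
        (-1, 0): 4, (-1, -1): 5, (0, -1): 6, (1, -1): 7}

def chain_code(contour):
    """Chain code of a closed 8-connected contour (wraps around: C_{N+1} = C_1)."""
    n = len(contour)
    codes = []
    for i in range(n):
        (x1, y1), (x2, y2) = contour[i], contour[(i + 1) % n]
        codes.append(DIRS[(x2 - x1, y2 - y1)])
    return codes

def break_points(contour):
    """Eq. (10.2): C_i is a break point when CC_i != CC_{i-1} (direction changes)."""
    cc = chain_code(contour)
    return [contour[i] for i in range(len(contour)) if cc[i] != cc[i - 1]]

square = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1)]
print(break_points(square))   # [(0, 0), (2, 0), (2, 2), (0, 2)]
```

On the 8-point square contour, only the four corners survive: the collinear midpoints are exactly the points Eq. (10.2) suppresses.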
10.3 Firefly Algorithm
The proposed algorithm for polygonal approximation is a nature-inspired algorithm
suitable for producing a better approximation. The firefly algorithm is one such
nature-inspired algorithm.
In hot and moderate regions, fireflies' flashing light is a wonderful sight. There
are thousands of fireflies which produce periodic, short flashes. The flashing
Fig. 10.2 Chain code illustration: a eight directions of Freeman's chain code, b chain code for
the chromosome shape and c extracted break points using Freeman's chain code
light of fireflies is produced by bioluminescence and serves as a signalling system.
The flashing is quite rhythmic. These flashes attract mating partners and potential
prey. The light intensity L of the flash decreases as the distance d increases; it
obeys L ∝ 1/d². As the distance increases, the intensity of the flash becomes weaker.
For these reasons a firefly is visible only to a limited distance, which is typically
sufficient for fireflies to communicate with each other. The objective function to be
optimized is formulated and associated with the flashing light, which leads to the
articulation of a new optimization algorithm.
In describing the new firefly algorithm (FA), the following three rules are
idealized:
1. All fireflies are unisex, so one firefly will be attracted to other fireflies
irrespective of their sex.
2. Attractiveness is proportionate to brightness; a less bright firefly is
attracted to the brighter one. Both attractiveness and brightness decrease
as the distance between fireflies increases. If no brighter firefly is available,
a firefly moves randomly.
3. A firefly's brightness is determined by the landscape of the objective function;
the objective function defines the brightness factor.
Algorithm for the firefly algorithm (FA)
Input: Firefly population P = (p_1, …, p_N), objective function OF(x)
Output: The best solution p_best and its value OF_min = min(OF(p_best))
Generate the initial population P = (p_1, …, p_N)
Evaluate OF(p_i) for every firefly and set the light intensities
k = 0
while (k < Max_Gen)
  for i = 1 : N
    for j = 1 : i
      if (OF(p_j) < OF(p_i)) then
        move firefly i towards firefly j; attractiveness varies with distance d
        calculate the new solution and upgrade the light intensity
      end if
    end for j
  end for i
  Rank the fireflies and identify the current best
  k = k + 1
end while
The firefly algorithm is another population-based algorithm; every member
represents a candidate solution of the problem to be solved and thus signifies a point
in the search space. A solution is denoted as in Eq. (10.3):

$$C_i = (C_{i1}, \ldots, C_{iD}) \quad \text{for } i = 1, \ldots, N \quad (10.3)$$

where N represents the size of the population and D represents the problem's
dimensionality.
The initial solution is given by Eq. (10.4):

$$C^{(0)}_{ij} = U(0, 1) \cdot (ub_j - lb_j) + lb_j, \quad \text{for } i = 1, \ldots, N \quad (10.4)$$

where U(0, 1) denotes a uniformly distributed random number in the interval [0, 1],
and $lb_j$ and $ub_j$ refer to the lower and upper limits of the jth problem
variable.
The variation operator works on the light intensity, $L \propto e^{-\gamma r^2}$,
which can be formulated as Eq. (10.5):

$$L(r) = L_0 e^{-\gamma r^2} \quad (10.5)$$

where $L_0$ denotes the source light intensity and $\gamma$ is the fixed light
absorption coefficient. Usually a firefly gets attracted when it meets a brighter
firefly in its environment. Similar to the intensity of light, the attractiveness
$\beta$ is influenced by the distance r. It can be computed by Eq. (10.6):

$$\beta(r) = \beta_0 e^{-\gamma r^m}, \quad m \geq 1 \quad (10.6)$$

where $\beta_0$ indicates the attractiveness at distance r = 0.
The Euclidean distance is used to represent the distance between any two fireflies
and is given by Eq. (10.7):

$$r_{ij} = \|x_i - x_j\| = \sqrt{\sum_{k=1}^{D} (x_{ik} - x_{jk})^2} \quad (10.7)$$

where $x_i$ and $x_j$ represent any two fireflies, $x_{ik}$ and $x_{jk}$ represent
the kth element of the ith and jth firefly positions in the search space, and D
represents the problem's dimensionality. Firefly i gets attracted to firefly j, and
its movement is represented by Eq. (10.8):

$$x_i = x_i + \beta_0 e^{-\gamma r_{ij}^2}(x_j - x_i) + \alpha \epsilon_i \quad (10.8)$$

where the step-size scaling factor is denoted by $\alpha$ and the randomization
parameter by $\epsilon_i$. Equation (10.8) is the sum of the position of the ith
firefly, the social component of the ith firefly's attraction and movement towards
the jth firefly, and the ith firefly's randomized move within the search space.
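Equations (10.4) and (10.6)–(10.8) combine into the following compact Python sketch of the algorithm, shown here minimizing the sphere function. The parameter values (β₀ = 1, γ = 1, α = 0.2) and the function names are illustrative choices of ours, not the paper's settings:

```python
import math
import random

def move_firefly(xi, xj, rng, beta0=1.0, gamma=1.0, alpha=0.2):
    """One application of Eq. (10.8): firefly i moves towards brighter firefly j."""
    r2 = sum((a - b) ** 2 for a, b in zip(xi, xj))   # squared distance, Eq. (10.7)
    beta = beta0 * math.exp(-gamma * r2)             # attractiveness, Eq. (10.6)
    return [a + beta * (b - a) + alpha * (rng.random() - 0.5)
            for a, b in zip(xi, xj)]

def firefly_minimize(f, dim, n=15, lb=-5.0, ub=5.0, max_gen=100, seed=1):
    rng = random.Random(seed)
    # Eq. (10.4): uniformly random initial population in [lb, ub]^dim.
    pop = [[rng.uniform(lb, ub) for _ in range(dim)] for _ in range(n)]
    for _ in range(max_gen):
        for i in range(n):
            for j in range(n):
                if f(pop[j]) < f(pop[i]):   # j is brighter (lower objective)
                    pop[i] = move_firefly(pop[i], pop[j], rng)
    return min(pop, key=f)

# Sphere function: global minimum at the origin.
best = firefly_minimize(lambda x: sum(v * v for v in x), dim=2)
print("best objective:", sum(v * v for v in best))
```

Note that the current best firefly never moves (no firefly is brighter), so the population minimum is non-increasing across generations, which is the convergence property the multi-swarm discussion below relies on.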
10.3.1 Firefly for Polygonal Approximation
The firefly algorithm has advantages which include automatic subdivision and the
ability to deal with multimodal problems.
1. The firefly algorithm works on the concept of attractiveness. Attraction decreases
as distance increases, which automatically subdivides the whole population
into subgroups. Within every subgroup, the members swarm around a mode or
local optimum. The best global solution is identified among all these modes.
Similarly, polygonal approximation also considers local optima, which lead to the
global optimum solution. The perpendicular distance is calculated for three
consecutive points in the shape, and the points with higher distance are considered
for approximation. If the perpendicular distance is small, the points are suppressed
during the approximation.
2. This subdivision permits the fireflies to detect all optima at the same time if the
population size is adequately greater than the number of modes. Split and merge
combined with an iterative process are used to identify the dominant
points. The algorithm works for the multi-objective problem in polygonal approxi-
mation, namely minimizing the total distortion while the number of points decreases.
Thus the approximation is achieved.
Algorithm for polygonal approximation using FA
Input: The contour points P = (p_1, …, p_N) with coordinates
{(x_1, y_1), …, (x_N, y_N)}, objective function f(p)
Output: The best solution p_best and its value f_min = min(f(p_best))
Generate the initial population from the contour points
Evaluate f(p_i) for every point and upgrade
k = 0
while (k < Max_Gen)
  for i = 1 : N
    for j = 1 : N
      if (f(p_j) < f(p_i)) then
        move towards the better point; attractiveness varies with distance d
        calculate the new solution and upgrade
      end if
    end for j
  end for i
  Rank the points and identify the current best
  k = k + 1
end while
The original contour points are treated as the input for the algorithm, and the
output is the best optimal solution.
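The loop structure of the pseudocode above can be sketched as a generic minimization skeleton. This is not the chapter's implementation: the objective f, the initial population, and the neighbor (movement) rule are placeholders supplied by the caller. For polygonal approximation, a candidate would be a set of dominant-point indices and f the ISE.

```python
import random

def firefly_search(f, init_pop, neighbor, max_gen=100, seed=0):
    """Skeleton of the FA loop: each firefly moves towards every brighter
    firefly (lower objective value), with a perturbation supplied by `neighbor`.

    f        : objective function over a candidate solution (to be minimized).
    init_pop : list of initial candidate solutions.
    neighbor : function (xi, xj, rng) -> candidate between/near xi and xj.
    """
    rng = random.Random(seed)
    pop = list(init_pop)
    fit = [f(p) for p in pop]
    for _ in range(max_gen):
        for i in range(len(pop)):
            for j in range(len(pop)):
                if fit[j] < fit[i]:            # firefly j is brighter
                    cand = neighbor(pop[i], pop[j], rng)
                    fc = f(cand)
                    if fc < fit[i]:            # keep only improving moves
                        pop[i], fit[i] = cand, fc
    best = min(range(len(pop)), key=fit.__getitem__)
    return pop[best], fit[best]
```

As a toy usage, minimizing f(x) = x² with a midpoint-plus-noise movement rule converges towards zero.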
The squared perpendicular distance is treated as the associated error value (AEV);
points where this value is high are treated as dominant points. The sum of these
associated error values is called the integral square error (ISE), which measures
the total deviation of the approximated digital curve from the original digital curve.
Initialize the firefly structure and the population array, and create the initial
population along with the best solution found so far. Each iteration compares the
objective-function values of the solutions with each other; the minimum value found
is used to drive the search towards the optimal solution. Different solutions are
generated from different combinations of the original population over the iterations.
The minimum value obtained by the objective function is updated with the most recent
minimum and finally compared against the best solution modelled so far.
The best solution obtained is treated as the optimal solution. Figure 10.3
explains the detailed flow of the entire procedure, in which the objective function is
used to evaluate solutions drawn from a wide range of random candidates, which in turn
helps in finalizing the best solution.
10.4 Experimental Results and Discussion
The experiments have been conducted on two digital curves: a curve containing
four semicircles with 102 points, as given in Fig. 10.4a, and the leaf-shaped
10 Bio-Inspired Firefly Algorithm for Polygonal 103
Fig. 10.3 Algorithm flow
curve of 120 points as in Fig. 10.5a. These curve shapes are benchmark shapes
for testing PA algorithms, and the performance of the proposed algorithm is compared
with other similar algorithms on those shapes.
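The chain codes shown below in Figs. 10.4b and 10.5b follow Freeman's 8-direction scheme. A minimal sketch of how such a code can be computed from an 8-connected sequence of (x, y) points (the function and table names are illustrative, not from the chapter):

```python
# Freeman 8-direction codes: 0 = E, 1 = NE, 2 = N, 3 = NW,
#                            4 = W, 5 = SW, 6 = S, 7 = SE
DIR = {(1, 0): 0, (1, 1): 1, (0, 1): 2, (-1, 1): 3,
       (-1, 0): 4, (-1, -1): 5, (0, -1): 6, (1, -1): 7}

def chain_code(points, closed=True):
    """Freeman chain code of an 8-connected digital curve given as (x, y) points."""
    pairs = list(zip(points, points[1:] + (points[:1] if closed else [])))
    return [DIR[(x2 - x1, y2 - y1)] for (x1, y1), (x2, y2) in pairs]
```

For a closed unit square traversed counter-clockwise, the code cycles through the four axis directions.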
The synthesized shapes and their computed chain codes are shown in Figs. 10.4a, b
and 10.5a, b. Figure 10.4a shows the digital curve with the semicircle shape before
approximation with 102 points, and Fig. 10.4b shows the chain code of the semicircle-
shaped curve computed based on Freeman's technique. Figure 10.5a shows the
leaf shape before approximation with 120 points, and Fig. 10.5b shows the chain
code of the leaf shape computed based on Freeman's technique. Similarly, Table 10.1
shows the comparison of results based on the number of dominant points (DP), M, and
the ISE value. The ISE value is a performance measure used here to quantify the
associated error. The objective of our approach is to retain the shape of the digital
5 4 5 4 4 3 4 2
3 2 2 1 2 1 3 2
2 2 2 2 2 2 2 1
2 2 1 1 1 1 1 1
0 0 1 0 0 0 0 0
0 0 0 7 0 0 7 7
7 7 7 7 6 6 7 6
6 6 6 6 6 6 6 5
7 6 7 6 6 5 6 4
5 4 4 3 4 3 6 6
(a) (b)
Fig. 10.4 a Semicircle shape with 102 points and b chain code of semicircle shape
7 6 6 6 1 1 1 1
1 6 6 6 5 6 6 5
5 0 0 0 1 0 0 5
6 6 5 6 5 5 0 0
1 1 0 6 6 5 6 5
6 5 5 5 5 5 6 6
6 7 6 6 6 6 6 6
6 6 6 4 2 2 2 2
2 2 2 2 2 2 2 3
2 2 4 4 3 4 3 3
(a) (b)
Fig. 10.5 a Leaf shape with 120 points and b chain code of leaf shape
curve with a minimum number of DP as well as minimum error. The results achieved
by the proposed FA-based PA technique are given in Table 10.1 for reference.
Usually, the quality of PA is assessed by how well the shape of the digital curve
is retained with a minimum number of DP. ISE is one of the most significant measures
of PA quality and is widely used in different PA algorithms.
The ISE refers to the error caused during the approximation of a polygonal shape.
ISE is computed as given in Eq. (10.10).

ISE = Σ_{k=1}^{N} E_k (10.10)
Table 10.1 Comparison of the results of semicircle and leaf obtained by the bPSO, mPSO, GA,
iPSO and the proposed (FA)-based PA algorithm

Shape of digital curve   Algorithm   M    ISE
Semicircle               bPSO        21   9.11
                         mPSO [13]   20   9.57
                         GA [14]     20   9.68
                         iPSO [15]   20   9.2
                         FA          20   9.93
                         bPSO        13   27.12
                         mPSO [13]   12   28.78
                         GA [14]     12   27.87
                         iPSO [15]   12   26
                         FA          12   28.17
Leaf                     bPSO        24   9.62
                         mPSO [13]   24   9.56
                         GA [14]     24   9.48
                         iPSO [15]   23   9.46
                         FA          23   9.46
                         bPSO        16   27.6
                         mPSO [13]   16   27.56
                         GA [14]     16   27.56
                         iPSO [15]   16   26.6
                         FA          16   27.26
where E_k is the squared perpendicular distance of the kth point of the digital curve
from the approximating polygon. A higher ISE value corresponds to a higher compression
ratio, while a lower ISE value corresponds to a lower compression ratio.
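A direct way to compute Eq. (10.10) is to sum the squared perpendicular distances of the curve points to the approximating segments. The sketch below assumes an open curve given as (x, y) tuples and dominant points given as indices into it; these assumptions are illustrative, not from the chapter.

```python
def ise(curve, dp):
    """Integral square error (Eq. 10.10): sum of squared perpendicular
    distances of every curve point to the segment it falls between."""
    dp = sorted(dp)
    total = 0.0
    for a_idx, b_idx in zip(dp, dp[1:]):
        (x1, y1), (x2, y2) = curve[a_idx], curve[b_idx]
        dx, dy = x2 - x1, y2 - y1
        L2 = dx * dx + dy * dy or 1     # guard against duplicate points
        for k in range(a_idx + 1, b_idx):
            xk, yk = curve[k]
            # squared distance of (xk, yk) from the chord via the cross product
            cross = dx * (yk - y1) - dy * (xk - x1)
            total += cross * cross / L2
    return total
```

A collinear curve approximated by its endpoints yields ISE = 0, which is a convenient sanity check.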
The results for the semicircle shape after PA are shown in Fig. 10.6a, b with M = 20
and M = 12 dominant points. The results for the leaf shape after PA with M = 23 and
M = 16 dominant points are given in Fig. 10.7a, b.
10.5 Conclusion
The firefly algorithm is well suited for the approximation problem and approximates
the shape with a small number of DP. The algorithm detects the DP using the
attractiveness and brightness concepts that form the basis of FA. FA takes
the original contour points as the input and generates the best optimal solution as
the output. A set of random contour points is selected from the original popu-
lation and passed to the objective function to find the minimum value. FA-based PA
a. M = 20 b. M = 12
Fig. 10.6 PA of semicircle a with 20 DP and b with 12 DP
a. M = 23 b. M = 16
Fig. 10.7 PA of leaf a with 23 DP and b with 16 DP
successfully achieved the objective of retaining the shape of the digital curve with
a minimum error value and a minimum number of DP. The present FA-based PA has
presented a comparison for different shapes based on the ISE error criterion, since
ISE is a significant measure in establishing the quality of PA. The error values
obtained for the leaf and semicircle shapes are considerably low, and the method
generates a better approximation.
References
1. Kalaivani, S., & Ray, B. K. (2019). A heuristic method for initial dominant point detection for
polygonal approximations. Soft Computing, 23(18), 8435–8452.
2. Jung, J.-W., So, B.-C., Kang, J.-G., Lim, D.-W., & Son, Y. (2019). Expanded Douglas-Peucker
polygonal approximation and opposite angle-based exact cell decomposition for path planning
with curvilinear obstacles. Applied Sciences, 9, 638.
3. Madrid-Cuevas, F. J., Aguilera-Aguilera, E. J., Carmona-Poyato, A., Muñoz-Salinas, R.,
Medina-Carnicer, R., & Fernández-García, N. L. (2016). An efficient unsupervised method
for obtaining polygonal approximations of closed digital planar curves. Journal of Visual
Communication and Image Representation, 39, 152–163.
4. Fernández-García, N. L., Del-Moral Martínez, L., Carmona-Poyato, A., Madrid-Cuevas, F.
J., & Medina-Carnicer, R. (2016) A new thresholding approach for automatic generation of
polygonal approximations. Journal of Visual Communication and Image Representation, 35,
155–168.
5. Masood, A. (2008). Optimized polygonal approximation by dominant point deletion. Pattern
Recognition, 41(1), 227–239.
6. Marji, M., & Siy, P. (2003). A new algorithm for dominant points detection and polygonization
of digital curves. Pattern Recognition, 36(10), 2239–2251.
7. Kolesnikov, A., & Fränti, P. (2003). Reduced-search dynamic programming for approximation
of polygonal curves. Pattern Recognition Letters, 24(14), 2243–2254.
8. Salotti, M. (2001). An efficient algorithm for the optimal polygonal approximation of digitized
curves. Pattern Recognition Letters, 22(2), 215–221.
9. Pikaz, A., & Dinstein, I. (1995). An algorithm for polygonal approximation based on iterative
point elimination. Pattern Recognition Letters, 16(6), 557–563.
10. Ray, B. K., & Ray, K. S. (1995). A new split-and-merge technique for polygonal approximation
of chain coded curves. Pattern Recognition Letters, 16(2), 161–169.
11. Masood, A. (2008). Dominant point detection by reverse polygonization of digital curves.
Image and Vision Computing, 26(5), 702–715.
12. Marji, M., & Siy, P. (2004). Polygonal representation of digital planar curves through dominant
point detection—A nonparametric algorithm. Pattern Recognition, 37(11), 2113–2130.
13. Wang, B., Shu, H.-Z., Li, B.-S., & Niu, Z.-M. (2008). A mutation-particle swarm algorithm
for error-bounded polygonal approximation of digital curves. In Lecture Notes in Computer
Science (pp. 1149–1155).
14. Wang, B., Shu, H., & Luo, L. (2009). A genetic algorithm with chromosome-repairing for
min-# and min-ε polygonal approximation of digital curves. Journal of Visual Communication
and Image Representation, 20(1), 45–56.
15. Wang, B., Brown, D., Zhang, X., Li, H., Gao, Y., & Cao, J. (2014). Polygonal approximation
using integer particle swarm optimization. Information Sciences, 278, 311–326.
Chapter 11
An Efficient IoT Security Solution Using
Deep Learning Mechanisms
Maganti Venkatesh, Marni Srinu, Vijaya Kumar Gudivada,
Bibhuti Bhusan Dash, and Rabinarayan Satpathy
Abstract The Internet has become an inextricable element of human life, and the
number of Internet-connected gadgets is rapidly growing. Internet of Things (IoT)
gadgets, in particular, have become an integral component of modern life. IoT network
participants are generally resource constrained, rendering them vulnerable to cyber-
threats. Classic cryptographic techniques have been extensively used to deal with
the safety and confidentiality problems in IoT systems in this regard. Due to the
exclusive qualities of IoT nodes, available results are unable to cover the complete
defense spectrum of IoT networks. However, some difficulties are becoming more
prevalent, and their remedies are unclear. The IoT is posing an increasing number of
issues in terms of technological security. The Internet of Things, on the other hand,
has been shown to be prone to security breaches. To address security concerns, it is
critical to establish effective solutions through the progress of the latest technologies
or the integration of obtainable technology. Deep learning, a division of machine
learning, has previously demonstrated potential for finding security vulnerabilities.
IoT devices also generate a lot of data with a lot of variety and veracity. As a result,
by using big data technologies, it is possible to achieve enhanced speed and data
management. In this research, we examine the safety necessities, assault vectors,
and existing safety resolutions for IoT systems and offer a ground-breaking deep
learning strategy for IoT security.
M. Venkatesh ·M. Srinu
Department of Computer Science and Engineering, Aditya Engineering College Surampalem,
East Godavari, AP, India
V. K. Gudivada
Department of Information Technology, Nawab Shah Alam Khan College of Engineering and
Technology, Hyderabad, TS, India
B. B. Dash
School of Computer Applications, KIIT Deemed to be University, KOEL Campus, Patia,
Bhubaneswar, Odisha, India
R. Satpathy (B)
Professor CSE (FET) and Director of the Office of the VC, Sri Sri University, Cuttack, Odisha,
India
e-mail: rabinarayan.satpathy@gmail.com
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
B. N. K. Rao et al. (eds.), Intelligent Computing and Applications, Smart Innovation,
Systems and Technologies 315, https://doi.org/10.1007/978-981-19-4162-7_11
11.1 Introduction
In view of the rapid development of emerging technologies such as sensors, smart-
phones, 5G communication, and augmented reality, innovative applications such as
connected industries, smart cities, and smart energy are being created. These
applications include connected businesses, connected cars, connected agriculture,
connected manufacturing complexes, connected health care, smart retail outlets,
and smart supply chains, all of which are contributing to the accumulation of massive
amounts of data. It is estimated that 50.1 billion Internet of Things (IoT) devices
were to be connected to the Internet by 2020, according to a survey conducted by the
National Cable and Telecommunications Association (NCTA). The security of Internet
of Things devices has become a source of controversy as the number of devices has
increased [1, 2].
As of January 1, 2018, virtually, every industry has been impacted by an onslaught
of cyberattacks and data breaches, according to McAfee Security. Aside from that, a
large number of these attacks were directed at Internet of Things (IoT) devices. As the
use of Internet of Things devices continues to rise, cybercriminals are increasingly
targeting them. Furthermore, because of the possibility of linkage, Internet of Things
devices are vulnerable [3]. According to VDC Research Group Inc., a study was
conducted to analyze the difficulties associated with developing connected devices.
Approximately 60% of the issues associated with developing connected devices are
related to security requirements [4], according to the report. According to Kaspersky
Lab data, the number of malware samples targeting IoT devices surged substantially
from 3219 samples in 2016 to 121,588 samples in 2018 [5]. There is no denying that
there are a number of vulnerabilities in Internet of Things devices.
Network-based risk monitoring is a challenge for many firms, according to [6, 7],
particularly for the government, energy, and healthcare sectors as well as for banks
and research institutes. These sectors also invest in security-monitoring systems to
defend and lock down their infrastructure.
defend and lock their transportation. Massive amounts of information were created by
IoT devices, as previously stated, and this data travel via networks. Network attacks
can compromise data flow across a network. According to the research, current tech-
nologies and approaches are unable to detect hackers’ innovative attacks because of
the volume, speed, variety, and veracity of available data. Because of this, a weekly
or monthly security analytics report will fall short when dealing with significant
amounts of data. Another benefit of big data technology is that it can deal with data
volume, velocity, variety, and validity difficulties.
It is possible that data transmissions through a network will be subject to network
assaults. In the research, it is said that current methodologies and approaches are
insufficient for detecting novel assaults conducted by cybercriminals because of the
quantity of data, the rate at which it is generated, the variety of data. A weekly
or monthly security analytics report is also insufficient when dealing with massive
amounts of data in order to detect and prevent risks. According to the paper, big data
technology will also be able to deal with challenges such as data volume, velocity,
variety, and validity.
11.2 Related Work
Deep learning and big data technologies may be utilized to enhance the security of IoT
systems, according to current studies. Feature engineering, unsupervised pre-training,
and robustness in deep learning have lately gained appeal, making it a hot
topic right now. Deep learning may be employed even in networks with minimal
resources because of these properties. With the ability to learn on its own, produce
highly accurate findings, and handle data more quickly, deep learning has become
quite popular. A resource-constrained system may run into further concerns, such as
out-of-memory access and unsafe programming languages [8]. This is crucial. Many
studies look at only one aspect of IoT security, such as deep learning, big data, or big
data analytics. Deep learning [9, 10] or big data has been studied for IoT protection.
To the best of our knowledge, no prior study comprehensively examines the viability
of merging these two techniques in the context of IoT security.
IoT-based monitoring of COVID-19 patients in home isolation was presented in [11],
and smart-city traffic management using IoT in [12]. An AI-based framework for 5G
integrated spectrum selection and spectrum access in IoT-based sensor networks was
presented in [13].
11.2.1 Safety in IoT Operation
Two of the most important considerations in the commercialization of IoT services
and applications are security and privacy. Security attacks range from basic hacks to
finely coordinated enterprise-level security violations that have impacted various
industries, including health care and manufacturing, on today's Internet, which
makes IoT an attractive target for attackers. The limitations of IoT devices, together
with the surroundings in which they function, contribute to security challenges that
affect both applications and devices. Security and privacy issues in the Internet of
Things (IoT) have been studied from a variety of angles, for example communication
security, information security, confidentiality, and architectural security.
11.2.2 Gaps in the Presented Security Resolutions for IoT
Networks
To make effective use of the Internet of Things, it is critical to understand where
security and privacy concerns originated. Furthermore, because the moniker "IoT"
grew out of past technologies, it is crucial to determine whether security risks in IoT
are novel or merely a continuation of the legacy of previous technologies. IoT and traditional
IT devices were compared and contrasted by Fernandes et al. Their attention was
drawn to the issue of privacy, as well. Hardware, software, networks, and applications
all play a role in the debate over the similarities and distinctions between them.
According to these classifications, the security concerns in traditional IT and IoT are
basically comparable. While the IoT’s key concern is resource limitations, existing
advanced security solutions will have a tough time adapting to IoT networks. To
tackle the safety and confidentiality issue posed by the Internet of Things, a multi-
layered architecture and enhanced algorithms are necessary. To handle safety and
confidentiality, IoT devices may, for example, require a new generation of efficient
cryptography and other algorithms due to computational limitations. The sheer number
of IoT devices also poses significant challenges for security processes.
The majority of security concerns are complex, and discrete solutions are not
possible. False positives can occur when dealing with security threats like DDoS
or infiltration, rendering the solutions ineffective. Furthermore, consumer confi-
dence will be eroded, decreasing the efficacy of these remedies. The creation of
new intelligent, robust, evolutionary, and scalable techniques to deal with IoT safety
concerns will be part of a holistic strategy for IoT security and privacy that includes
contributions from existing security systems.
11.2.3 Machine Learning: A Solution to IoT Security
Challenges
Intelligent ways of optimizing performance criteria by learning from examples or
previous experience are referred to as machine learning (ML). Algorithms based on
machine learning construct behavior models from vast datasets by utilizing mathe-
matical principles. With ML, smart devices may also learn on their own, without any
assistance from a human programmer. These models are then utilized to produce
forecasts on newly added data. A few fields, such as artificial intelligence,
optimization theory, and cognitive science, have
had a significant impact on machine learning. In situations where human knowledge is
either unavailable or ineffectual, such as when operating in a hostile environment,
machine learning is used. It is also used where individuals cannot apply their
expertise, as in robotics and speech recognition. This method can also be employed
when the answer to a specific problem varies over time. Aside from that, machine
learning is utilized in real-world smart systems; for
example, Google utilizes machine learning to identify risks associated with Android
endpoints and applications. It can also be used to detect and remove malware from
infected phones. As another example, Amazon has developed Macie, a tool
that uses machine learning to organize and categorize data stored on the company's
cloud storage platform. Machine learning techniques are successful in a variety of
fields, but there is a risk of false positives and false negatives when applying them.
Hence, ML methods necessitate supervision and model correction in the event that
imprecise predictions are generated. A new type of machine learning, deep learning
(DL), on the other hand, empowers models by allowing them to independently deter-
mine the correctness of their predictions. In particular, because of their self-learning
nature, deep learning models are better suited for classification and prediction
tasks in new IoT services that provide context-aware and tailored support.
However, despite the widespread use of traditional approaches for various aspects
of the Internet of Things (such as application and service development, architec-
tural design and protocol development, information aggregation, resource allocation,
and clustering) as well as security, the huge-scale adoption of the IoT calls for the
development of intelligent, robust, and reliable methods.
11.3 Methodology
Deep learning (DL), an artificial neural network (ANN)-based computing system, is
among the most effective and proficient branches of machine learning today. DL is
a subset of machine learning inspired by the natural brain and able to learn from a
large number of training examples. Various domains have made use of DL, and it is
well known for its capability to extract the best features from raw data by applying
a series of nonlinear transformations, each of which gains in complexity and
abstraction. Deep learning is classified into three types: supervised, unsupervised,
and semi-supervised (a combination of the two).
The earliest neural network proposed is the feed-forward neural network (FNN), or
deep FNN (DFNN). The layers of a FNN include an input layer, one or more hidden
layers, and a final output layer. Every neuron in a layer is connected to all neurons
of the following layer without forming a cycle back; in other words, the connections
are non-recursive. Each connection indicates the relationship between one neuron
and another, and in a neural network the weight coefficient indicates how important
a particular connection is (Fig. 11.1).
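The forward pass of such a feed-forward network can be sketched in a few lines. This is a minimal illustration: the tanh activation and the layer shapes are assumptions, not taken from the chapter.

```python
import numpy as np

def forward(x, layers):
    """Forward pass of a feed-forward network.

    Each layer is a (W, b) pair; every neuron connects to all neurons of the
    next layer, and the connection weights in W scale each connection.
    """
    a = np.asarray(x, float)
    for W, b in layers:
        a = np.tanh(W @ a + b)   # weighted sum followed by a nonlinearity
    return a
```

A network with all-zero weights maps any input to zero, since tanh(0) = 0, which makes the wiring easy to verify.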
In 1998, LeCun built on Fukushima's work from 1980 to create the "Convolutional
Neural Network" (CNN) deep learning model, an extension of the MLP neural network.
CNN has had a lot of success in computer vision applications over the
previous two decades. It is an advanced version of the DFNN, with additional layers
for convolution, activation, and pooling. A filter (a matrix of numbers) is used in the
convolutional layer to transform the data (usually pictures) according to the values of
the filter. The convolution kernel is used in this layer (see Fig. 11.2). The input
data are represented by f, the kernel by h, and the row and column indices of
the output matrix by m and n, respectively; these are used to construct the feature-map
values. Each kernel value is multiplied pairwise by the corresponding
data value, and the resulting sum is saved in the output feature
map. The pooling layer reduces the size of subsequent layers by using max or
average pooling, which helps prevent overfitting. Max pooling partitions the input into
non-overlapping groups and selects the greatest value of each group from the previous
layer.
Fig. 11.1 Feed-forward neural network layers
G[m, n] = (f ∗ h)[m, n] = Σ_j Σ_k h[j, k] · f[m − j, n − k]
A flattened input is used in the fully connected layer, where it is connected to every
neuron. The activation function of a node determines its output from a
set of inputs. All volume elements are activated using ReLU, the
rectified linear unit, whose objective is to increase the network's nonlinearity.
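The convolution, ReLU, and max-pooling operations described above can be sketched with small arrays. Note that, as in most deep-learning frameworks, the "convolution" below is implemented as cross-correlation (h[j, k] · f[m + j, n + k]) rather than the flipped-kernel form of the equation; the function names are illustrative.

```python
import numpy as np

def conv2d(f, h):
    """Valid 2-D cross-correlation producing the feature map G[m, n]."""
    H, W = h.shape
    M, N = f.shape[0] - H + 1, f.shape[1] - W + 1
    G = np.empty((M, N))
    for m in range(M):
        for n in range(N):
            # pairwise multiply kernel values with the data patch and sum
            G[m, n] = np.sum(h * f[m:m + H, n:n + W])
    return G

def relu(x):
    """Rectified linear unit: zero out negative activations."""
    return np.maximum(x, 0)

def max_pool(x, s=2):
    """Non-overlapping s x s max pooling over the previous layer."""
    M, N = (x.shape[0] // s) * s, (x.shape[1] // s) * s
    return x[:M, :N].reshape(M // s, s, N // s, s).max(axis=(1, 3))
```

For example, a 2×2 all-ones kernel over a 3×3 all-ones input yields a 2×2 map of fours, and 2×2 max pooling keeps the largest value of each block.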
CNN, in contrast to standard feature selection algorithms, can automatically learn
new features and classify traffic. Since it shares the same convolution matrix across
the input, it can classify better and discover more features from traffic data,
which reduces the number of parameters and training computations considerably.
Fig. 11.2 CNN layers
11.4 Results and Discussion
11.4.1 Dataset Used
Datasets such as KDDCUP99 and its expanded version NSL-KDD, ISCXIDS2012,
and CSE-CIC-IDS 2018 are widely used for network IDS, but, as previously stated,
these datasets are not appropriate for the Internet of Things (IoT). The UNSW-NB15
dataset appears to be a promising dataset, and the Bot-IoT dataset was generated by
building a realistic IoT network environment with a mix of normal and botnet
traffic. These two datasets were identified during our investigation.
11.4.2 Training and Test Data
Training machine learning algorithms is accomplished through the use of statistical
features derived from innocuous network traffic data. Capturing raw network traffic
data via port mirroring on a network switch is an effective method of gathering data.
As soon as the device was linked to the network, the IoT network traffic was captured
and analyzed. An overview of the network traffic gathering is given below.
(1) The network traffic came from a well-known IP address.
(2) All network sources got their MAC and IP addresses from the same place.
(3) Between the source and destination IP addresses, known data are sent.
(4) Information about the destination TCP/UDP/IP port is gathered.
A total of 115 features were extracted over five time windows of 100 ms, 500 ms,
1.5 s, 10 s, and 1 min. These features can be computed quickly and incrementally,
allowing rogue packets to be identified in real time. The statistical range of normal
network traffic feature values is also displayed across the maximum sample collection
time window (1 min). These features are extremely useful for catching source IP
spoofing and other typical malware behaviors. When a hacked IoT device spoofs an
IP address, the features aggregated by source MAC/IP (feature variable MI) and
IP/channel (feature variable HP) will quickly flag a high anomaly score because of
the hacked device's behavior.
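One way to compute such window statistics incrementally, in the spirit of the per-source features over decayed time windows described above, is sketched below. The decay values and the keying by source address are illustrative assumptions, not the chapter's feature extractor.

```python
from collections import defaultdict

class IncStat:
    """Damped incremental statistics (weight, mean, variance) so features can
    be updated per packet without storing the window's raw traffic."""
    def __init__(self, decay):
        self.decay = decay                 # decay rate tied to the window length
        self.w = self.s1 = self.s2 = 0.0
        self.t_last = None

    def update(self, t, x):
        if self.t_last is not None:        # fade old mass by elapsed time
            d = 2.0 ** (-self.decay * (t - self.t_last))
            self.w, self.s1, self.s2 = d * self.w, d * self.s1, d * self.s2
        self.t_last = t
        self.w += 1.0
        self.s1 += x
        self.s2 += x * x

    def stats(self):
        mean = self.s1 / self.w
        var = max(self.s2 / self.w - mean * mean, 0.0)
        return self.w, mean, var

# One IncStat per (source, window); the five windows of 100 ms ... 1 min map
# to decay rates (assumed values for illustration).
WINDOWS = {"100ms": 5.0, "500ms": 3.0, "1.5s": 1.0, "10s": 0.1, "1min": 0.01}
tracker = defaultdict(lambda: {name: IncStat(d) for name, d in WINDOWS.items()})

def observe(src, t, size):
    """Feed one packet (source key, timestamp, payload size) into all windows."""
    for st in tracker[src].values():
        st.update(t, size)
```

With a decay of zero the statistics reduce to the plain running mean and variance, which makes the update rule easy to verify.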
Understanding the data can aid in determining whether a machine learning model
would be effective for categorizing data using a classifier approach based on the
information included in it. Figure 11.3 shows the data characteristics of both ordinary
traffic and malicious attack traffic placed on top of one another. In this case, there is
no direct relationship between the data attributes and the other data properties. It is
not possible to determine the data for IoT traffic merely by examining the correlations
between normal and attack traffic. The types of attacks, as well as the characteristics
of the network, are critical indicators for protecting against malware attacks.
Fig. 11.3 Comparison of feature data against attacks
11.5 Conclusion
The growing number of IoT devices has prompted researchers to explore the security
threats they pose. Due to recent large-scale attacks such as the Carna and Mirai
botnets, IoT devices have been demonstrated to be vulnerable. Furthermore, IoT
devices generate a tremendous amount, speed, and diversity of information. Existing
methods become less competent as a result, necessitating the use of modern
alternatives. Deep learning has achieved widespread acceptance among researchers
and organizations as a result of its high accuracy, its capability to learn deep
features, and its lack of need for human supervision. Based on the findings of our inquiry, we
can infer that much effort has been put into studying IoT security in recent years. In
fact, deep learning was used in a diversity of research areas, including cybersecurity
and intrusion detection systems (IDSs), where cybersecurity has yielded a number of
promising discoveries, opening the way for more powerful security in IoT contexts.
The study focused on datasets for training models; nevertheless, the findings should
be applicable to autonomous IDSs in real-world IoT contexts by acquiring fresh infor-
mation and utilizing deep reinforcement learning algorithms to produce powerful and
efficient models.
References
1. Mohan, N., & Kangasharju, J. (2016). Edge-fog cloud: A distributed cloud for internet of things
computations. In Proceedings of cloudification of the internet of things (CIoT) (pp. 1–6). https://
doi.org/10.1109/CIOT.2016.7872914
2. Habeeb, R. A. A., Nasaruddin, F., Gani, A., Hashem, I. A. T., Ahmed, E., & Imran, M.
(2019). Real-time big data processing for anomaly detection: A survey. International Journal
of Information Management, 45, 289–307.
3. Davis, G., & Davis, G. (2018). Trending: IoT malware attacks of 2018. https://securingt
omorrow.mcafee.com/consumer/mobile-and-iot-security/top-trending-iot-malware-attacks-
of-2018. Accessed on May 10, 2019.
4. Wong, W. G. (2015). Developers discuss IoT security and platforms trends. https://www.electr
onicdesign.com/embedded/developers-discuss-iot-security-and-platforms-trends. Accessed
on May 1, 2019.
5. New trends in the world of iot threats. https://securelist.com/new-trends-in-the-world-of-iot-
threats/87991/. Accessed on May 10, 2019
6. Katal, A., Wazid, M., & Goudar, R. H. (2013). Big data: Issues, challenges, tools and good
practices. In: 2013 6th International Conference on Contemporary Computing (IC3), IEEE.
https://doi.org/10.1109/ic3.2013.6612229
7. Cardenas, A. A., Manadhata, P. K., & Rajan, S. P. (2013). Big data analytics for security. IEEE
Security & Privacy, 11(6), 74–76. https://doi.org/10.1109/msp.2013.138
8. McDermott, C. D., Majdani, F., & Petrovski, A. V. (2018). Botnet detection in the internet
of things using deep learning approaches. In: 2018 International Joint Conference on Neural
Networks (IJCNN) (pp. 1–8). IEEE
9. Aly, M., Khomh, F., Haoues, M., Quintero, A., & Yacout, S. (2019). Enforcing security in
internet of things frameworks: A systematic literature review. Internet of Things, 100050.
10. Pan, J., & Yang, Z. (2018). Cybersecurity challenges and opportunities in the new edge
computing+ IoT world. In: Proceedings of the 2018 ACM International Workshop on Security
in Software Defined Networks & Network Function Virtualization (pp. 29–32). ACM
11. Reddy Madhavi, K., Vijaya Sambhavi, Y., Sudhakara, M., & Srujan Raju, K. (2021). Covid-
19 isolation monitoring System, Springer series—Lecture Notes on Data Engineering and
Communication Technology. https://doi.org/10.1007/978-981-16-0081-4_60
12. Rizwan, P., Suresh, K., & Babu, M. R. (2016). Real-time smart traffic management system
for smart cities by using Internet of Things and big data. In 2016 International Conference on
Emerging Technological Trends (ICETT). IEEE.
13. Sekaran, R., Goddumarri, S. N., Kallam, S., Patan, R., Ramachandran, M., Al-Turjman, F.
(2021). 5G integrated spectrum selection and spectrum access using AI-based framework for
IoT-based sensor networks. Computer Networks, 186.
Chapter 12
Intelligent Disease Analysis Using
Machine Learning
Nagendra Panini Challa , J. S. Shyam Mohan, M. Naga Badra Kali,
and P. Venkata Rama Raju
Abstract Heart diseases involve disordered functioning of the heart, and lives can be
saved through early diagnosis. Diagnosis takes considerable time to assess the patient
accurately for treatment. Technical advancements are a boon to the
healthcare domain for analyzing the huge amounts of data generated by various hospitals.
These data can be further preprocessed and filtered for disease analysis.
In this paper, logistic regression (LR) and support vector machine (SVM) models are
incorporated for effective prediction of heart disease. Accuracy of 93% is
achieved on the datasets with the SVM model.
12.1 Introduction
The heart is the most essential organ of any individual; it involuntarily pumps blood
to all parts of the body, from the brain to the smallest tissue. Any minor problem
within the heart can cause many disruptions throughout the whole body [1]. The symptoms
of heart disease vary from person to person depending on related attributes such as
heartbeat, chest pain, and their associated symptoms [2]. Heart diseases cause a large
share of deaths every year, affecting more than 16 million people [3]. With technical
advancements, attribute-based disease prediction has become the most popular technique
for providing accurate diagnosis of chronic heart diseases at an early stage.
N. P. Challa (B)
School of Computer Science and Engineering (SCOPE), VIT-AP University, Amaravati, India
e-mail: paninichalla123@gmail.com; nagendra.challa@vitap.ac.in
J. S. Shyam Mohan
Department of Computer Science and Engineering, Sri Chandrasekharendra Saraswati Viswa
Mahavidyalaya, Kanchipuram 631561, India
M. Naga Badra Kali · P. Venkata Rama Raju
Department of Computer Science and Engineering, Shri Vishnu Engineering College for Women,
Bhimavaram 534202, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
B. N. K. Rao et al. (eds.), Intelligent Computing and Applications, Smart Innovation,
Systems and Technologies 315, https://doi.org/10.1007/978-981-19-4162-7_12

P. Shaji and P. Mamatha Alex (2019) developed a model for heart disease
prediction which uses different machine learning algorithms such as support vector
machine, artificial neural network, K-nearest neighbors, and random forest. The
best accuracy was produced by artificial neural networks with respect to their dataset
[4]. Vikas Deep et al. worked on heart disease prediction using machine learning,
adopting a rule-learning approach. Their research greatly reduces manual effort, as
the data can be retrieved directly from equipped electronic gadgets. They built strong
association rules and performed data mining on a dataset of patients' medical histories,
which reduced the required services and showed that many rules together enable the most
accurate prediction of cardiovascular disease [5].
12.2 Literature Survey
B. Gomathy et al. have developed a model that applies big data techniques
powered by the MapReduce algorithm; the linear scaling technique in turn
increased the accuracy of the model [6]. Sai Deepak Ravikanti et al. proposed
a model based on the cryptographic AES algorithm for safe transfer of the data and
Naive Bayesian techniques for classification of data and prediction of cardiovascular
disease [7] (Table 12.1).
Thomas and Theresa Princy R. J. together implemented various machine learning
models for predicting heart disease accurately to find out the accuracy based on
Table 12.1 Literature survey

[1] Heart disease prediction using new age computing techniques: different new-age computing approaches such as machine learning and deep learning are applied to the UCI dataset for result analysis.
[2] Prediction of coronary heart disease using risk factor categories: examines the association of JNC-V blood pressure and NCEP cholesterol categories with coronary heart disease (CHD) risk.
[3] Heart disease prediction using machine learning techniques: the prediction is carried out using various supervised learning techniques such as Naive Bayes, decision tree, and so on.
[4] Improving the accuracy of prediction of heart disease risk based on ensemble classification techniques: a novel algorithm is proposed on a medical dataset which predicts heart disease at an early stage of diagnosis.
[5] Heart disease prediction using exploratory data analysis: the attributes are collected from various risk factors and predicted using the k-means algorithm with a publicly available dataset.
various attributes [8]. C. Chetan et al. applied machine learning classification and
experimented on predicting heart disease, using mean absolute error (MAE), root mean
squared error, and sum of squares error as performance measures [9]. This research
identified the support vector machine as the most accurate algorithm; SVM gave better
accuracy than Naive Bayes [10]. After incorporating the given attributes related to
cardiovascular diseases, a deep learning neural network model was developed, with
nearly 120 hidden layers administered by an output perceptron model [11]. Fahd Saleh
developed a machine learning model comparing the performance of various algorithms,
using a rapid mining tool that led to higher accuracy than other tools such as Weka [12].
The five algorithms used were support vector machine, random forest, Naive Bayes,
decision tree, and logistic regression; among these, the decision tree classifier had
the highest accuracy [13]. Another heart disease prediction used different algorithms
such as K-nearest neighbors, logistic regression, and random forest classifier, where
each algorithm was observed to have its own objectives and strengths [14]. A. M.
Karandikar et al. worked on cardiovascular disease prediction by applying a couple of
machine learning algorithms [15]: Naive Bayes and decision tree. They obtained higher
accuracy for the decision tree than for Naive Bayes [16].
12.3 Proposed Work
The main objective of this work is to efficiently predict whether a person is suffering
from heart disease or not. After the input values (the attribute values of a person's
health report) are entered, the output (whether that patient has heart disease or not)
is obtained [17, 18]. For this work, we used the UCI Cleveland dataset for the analysis;
the source of this dataset is Kaggle [19]. The dataset has 76 factors in total, of which
we have used 14. The dataset on Kaggle is already preprocessed and contains no null
values. The attributes used in this work are shown below [20] (Table 12.2).
This analysis is done using various APIs in the Python programming language, which
facilitate data collection in the initial stage. For this research, the UCI repository
from the Kaggle website is chosen. The next step is data preprocessing and splitting
the data: all 13 attributes except the target variable go into X, and the target
attribute into Y (Fig. 12.1).
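A minimal sketch of this X/Y separation, assuming each preprocessed record is a list whose last element is the target; the two sample rows and column order below are illustrative, not taken from the actual dataset:

```python
# Hypothetical preprocessed rows: 13 predictor attributes followed by the target.
# Column order (illustrative): age, sex, cp, trestbps, chol, fbs, restecg,
# thalach, exang, oldpeak, slope, ca, thal, target
rows = [
    [63, 1, 3, 145, 233, 1, 0, 150, 0, 2.3, 0, 0, 1, 1],
    [57, 0, 0, 140, 241, 0, 1, 123, 1, 0.2, 1, 0, 3, 0],
]

X = [row[:-1] for row in rows]   # the 13 independent attributes
y = [row[-1] for row in rows]    # the target attribute (0 = no disease, 1 = disease)

print(len(X[0]), y)  # each feature vector has 13 entries
```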
Now the data is passed into the algorithms used in this research, namely support vector
machine and logistic regression. Finally, the models are evaluated based on different
performance metrics like accuracy, F-score, etc.
First, logistic regression handles classification problems with two different
classes. Here, we implemented a logistic function to get an output value between
0 and 1. In this study, there are 13 independent attributes and one target attribute.
When there are multiple attributes and the result should lie between the values 0 and
Table 12.2 Dataset attributes

1. Age (age of the person): numerical value
2. Gender (sex of the person): 0 [f], 1 [m]
3. Chest pain (intensity of chest pain of a person): values from 0 to 3
4. Rest blood pressure (the person's blood pressure value): numerical value
5. Cholesterol (the cholesterol level of a person): numerical value
6. Fasting blood sugar (the fasting blood sugar of the person): 1 or 0
7. Electrocardiogram (the ECG result of a person): 0 to 2
8. Heartbeat raise (the maximum heartbeat of a person): numerical value
9. Exercise angina (whether the person has exercise-induced angina): 0 or 1
10. Old peak (the depression level of a person): numerical value
11. Slope (the heartbeat fluctuations of a person during exercise): values between 1 and 3
12. CA (the fluoroscopy results of a person): values between 0 and 3
13. Thallium test result: values between 0 and 3
14. Target: 0 = No, 1 = Yes
Fig. 12.1 Proposed system architecture
1, logistic regression could be the best fit.
Sigmoid/logistic function: Y = 1 / (1 + e^(−z))
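The logistic function above can be written directly in a few lines; this is a generic illustration of the formula, not code from the authors:

```python
import math

def sigmoid(z: float) -> float:
    """Logistic function mapping any real z into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

print(sigmoid(0.0))  # 0.5: the point of maximum uncertainty (decision boundary)
print(sigmoid(4.0))  # close to 1, so the model would predict class 1
```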
Second, SVM identifies hyperplanes and obtains the most accurate decision
boundary. This algorithm divides the whole n-dimensional space into
classes (two classes for binary classification) based on the boundary, so new
data can be put into the correct category based on the decision boundary.
maximize f(c_1, …, c_n) = Σ_{i=1..n} c_i − (1/2) Σ_{i=1..n} Σ_{j=1..n} y_i c_i (φ(x_i) · φ(x_j)) y_j c_j
                        = Σ_{i=1..n} c_i − (1/2) Σ_{i=1..n} Σ_{j=1..n} y_i c_i k(x_i, x_j) y_j c_j
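The kernel k(x_i, x_j) in the second line replaces the explicit feature map φ. One common choice is the RBF (Gaussian) kernel; the sketch below is a generic illustration, and the gamma value is an assumption rather than a parameter reported in this chapter:

```python
import math

def rbf_kernel(xi, xj, gamma=0.5):
    """RBF (Gaussian) kernel: k(xi, xj) = exp(-gamma * ||xi - xj||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(xi, xj))
    return math.exp(-gamma * sq_dist)

# The kernel is symmetric and equals 1 when both points coincide.
print(rbf_kernel([1.0, 2.0], [1.0, 2.0]))  # 1.0
print(rbf_kernel([0.0, 0.0], [1.0, 1.0]))  # exp(-1.0), about 0.3679
```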
12.4 Results and Analysis
The results of this study after applying the machine learning algorithms are
shown below. Performance is analyzed using accuracy, recall, precision, and
F-measure. Accuracy is computed as the share of correct predictions among all
predictions made; it is the most widely used performance metric for evaluating
machine learning models. Precision is the ratio of the relevant records to all
the instances retrieved. Recall shows how efficiently the model identifies the
given data; it is also referred to as sensitivity or the true positive rate.
True Positives [TP]: the person has the heart disease, and the prediction correctly says so.
False Positives [FP]: the person does not have the heart disease, but the model predicts
that they do.
True Negatives [TN]: the person does not have the disease, and the prediction correctly
says so.
False Negatives [FN]: the person has the disease, but the model predicts that they do not
(Fig. 12.2).
Accuracy (A) = [Total number of correct predictions] / [Total number of predictions made]
Recall (R) = [True Positives] / [True Positives + False Negatives]
Precision (P) = [True Positives] / [True Positives + False Positives]
F-measure (Fs) = [2 × Precision × Recall] / [Precision + Recall]
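These four formulas can be checked with a small generic helper; the TP/FP/FN/TN counts used below are the logistic regression values from Table 12.3:

```python
def metrics(tp, fp, fn, tn):
    """Derive accuracy, precision, recall, and F-measure from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f_measure

# Logistic regression counts from Table 12.3: TP=80, FP=17, FN=11, TN=140
a, p, r, f = metrics(80, 17, 11, 140)
print(round(a, 3), round(p, 3), round(r, 3), round(f, 3))  # 0.887 0.825 0.879 0.851
```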
Fig. 12.2 Results comparison
Table 12.3 Confusion matrix
Algorithm TP FP FN TN
Logistic regression 80 17 11 140
Support vector machine 89 8 6 105
Table 12.4 Performance analysis
Algorithm P R Fs A(%)
Logistic regression 0.87 0.83 0.85 87
Support vector machine 0.86 0.88 0.87 93.40
The above-mentioned performance metrics (accuracy, F-score, recall, and precision)
can be derived from the confusion matrix, which helps us assess the performance of
the model. The confusion matrix values obtained for logistic regression and support
vector machine are given in Table 12.3, and the resulting metrics in Table 12.4.
12.5 Conclusion
Heart-related problems constitute the majority of the most widely occurring deaths
around the globe. This research applies machine learning algorithms, namely support
vector machine and logistic regression, for prediction of cardiovascular diseases.
The results clearly show that the support vector machine is the best algorithm when
the models are compared, with a prediction accuracy of 93.40%. In
the future, the research may be extended to a web-based application or basic GUI
built on the support vector machine, perhaps with a larger dataset. This may
produce more accurate results and efficiently help doctors.
References
1. Pavan Kumar, T., & Avinash, G. (2019). Heart disease prediction using effective machine
learning techniques. International Journal of Recent Technology and Engineering, 8, 944–950
2. Chetty, N., Vaisla, K. S., & Patil, N. (2015). An improved method for disease prediction using
fuzzy approach. ACCE 2015.
3. Chilukuri, S. K., Challa, N. P., Mohan, J. S. S., Gokulakrishnan, S., Mehta, R. V. K., & Suchita,
A. P. (2021). Effective predictive maintenance to overcome system failures—A machine
learning approach. In H. Sharma, M. Saraswat, A. Yadav, J. H. Kim & J. C. Bansal (Eds.),
Congress on intelligent systems. CIS 2020. Advances in intelligent systems and computing (vol.
1334). Springer. https://doi.org/10.1007/978-981-33-6981-8_28
4. Mamatha Alex, P., & Shaji, S. P. (2019). Prediction and diagnosis of heart disease patients using
data mining technique. In International Conference on Communication and Signal Processing
2019.
5. Chauhan, A., Jain, A., Sharma, P., Deep, V. (2018). Heart disease prediction using evolutionary
rule learning. In International Conference on “Computational Intelligence and Communication
Technology” (CICT 2018).
6. Nagamani, T., Logeswari, S., & Gomathy, B. (2019). Heart disease prediction using data mining
with MapReduce algorithm. International Journal of Innovative Technology and Exploring
Engineering (IJITEE), 8(3). ISSN: 2278-3075.
7. Repaka, A. N., Ravikanti, S. D., Franklin, R. G. (2019). Design and implementation of heart
disease prediction using Naives Bayesian. In International Conference on Trends in Electronics
and Information (ICOEI 2019).
8. Theresa Princy R., & Thomas, J. (2016). Human heart disease prediction system using data
mining techniques. In International Conference on CircuitPower and Computing Technologies,
Bangalore.
9. Lutimath, N. M., Chethan, C., & Pol, B. S. (2019). Prediction of heart disease using machine
learning. International Journal of Recent Technology and Engineering, 8, 474–477.
10. Kiyasu, J. Y. (1982). U.S. Patent No. 4,338,396. Washington, DC: U.S. Patent and Trademark
Office.
11. Alotaibi, F. S. (2019). Implementation of a machine learning model to predict heart failure
disease (IJACSA). International Journal of Advanced Computer Science and Applications,
10(6).
12. Ganna, A., Magnusson, P. K., Pedersen, N. L., de Faire, U., Reilly, M., Ärnlöv, J., & Ingelsson,
E. (2013). Multilocus genetic risk scores for coronary heart disease prediction. Arteriosclerosis,
thrombosis, and vascular biology, 33(9), 2267–2272.
13. Nikhar,S., & Karandikar, A. M. (2016). Prediction of heart disease using machine learning algo-
rithms. International Journal of Advanced Engineering, Management and Science (IJAEMS)
Infogain Publication, 2(6); Jacobs, I. S., & Bean, C. P. (1963). Fine particles, thin films and
exchange anisotropy. In G. T. Rado & H. Suhl (Eds.), Magnetism (vol. III, pp. 271–350).
Academic.
14. Piller, L. B., Davis, B. R., Cutler, J. A., Cushman, W. C., Wright, J. T., Williamson, J. D., &
Haywood, L. J. (2002). Validation of heart failure events in the Antihypertensive and lipid
lowering treatment to prevent heart attack trial (ALLHAT) participants assigned to doxazosin
and chlorthalidone. Current Controlled Trials in Cardiovascular Medicine, 3(1), 10.
15. Folsom, A. R., Prineas, R. J., Kaye, S. A., & Soler, J. T. (1989). Body fat distribution and
self-reported prevalence of hypertension, heart attack, and other heart disease in older women.
International Journal of Epidemiology, 18(2), 361–367.
16. Zhang, Y., Fogoros, R., Thompson, J., Kenknight, B. H., Pederson, M. J., Patangay, A., &
Mazar, S. T. (2011). U.S. Patent No. 8,014,863. U.S. Patent and Trademark Office.
17. Lee, I., et al. (2012). Challenges and research directions in medical cyber–physical systems.
Proceedings of the IEEE, 100, 75–90.
18. Rajkumar, R. (2012). A cyber–physical future. Proceedings of the IEEE, 100, 1309–1312.
19. Shyam Mohan, J. S., Vedantham, H., Vanam, V., Challa, N. P. (2021). Product recommendation
systems based on customer reviews using machine learning techniques. In I. Jeena Jacob, S.
Kolandapalayam Shanmugam, S. Piramuthu, P. Falkowski-Gilski (Eds.), Data Intelligence and
Cognitive Informatics. Algorithms for Intelligent Systems. Springer. https://doi.org/10.1007/
978-981-15-8530-2_21
20. Zhou, J., Cao, Z., Dong, X., & Vasilakos, A. V. (2017). Security and privacy for cloud-based
IoT: Challenges. IEEE Communications Magazine, 55(1), 26–33.
Chapter 13
Automated Detection of Skin Lesions
Using Back Propagation Neural Network
Nagendra Panini Challa , A. Mohan, Narendra Kumar Rao,
Bhaskar Kumar Rao, J. S. Shyam Mohan, and B. Balaji Bhanu
Abstract Many skin diseases impact our human body in a drastic way. Many skin-related
problems exist, but this paper focuses on skin lesions, which show abnormal
growth compared to the surrounding skin. The diagnosis is analyzed using a
neural network (NN) model, where data preprocessing, skin texture identification,
and classification are performed using the support vector machine (SVM) method on a
skin disease dataset. The results obtained show that a good accuracy of 93% is achieved
by reducing the data loss.
13.1 Introduction
Skin-related problems like cancer occur due to the abnormal growth of unwanted or
harmful skin cells, which have the ability to spread to other places of the human body
when not addressed properly [1]. This occurs when the human skin is exposed abnormally
to the sun at different durations of the day. Generally, skin cancers are classified
into two groups, namely non-melanoma (NMSC) and melanoma (MSC) skin cancer
[2]. NMSC is the most common type of cancer and develops on the upper layer of the
actual skin.

N. P. Challa (B)
School of Computer Science and Engineering (SCOPE), VIT-AP University, Amaravati, India
e-mail: paninichalla123@gmail.com; nagendra.challa123@gmail.com
A. Mohan
Department of Computer Science and Engineering, Lovely Professional University, Punjab, India
N. K. Rao · B. K. Rao
Department of Computer Science and Engineering, Sree Vidyaniketan Engineering College,
Tirupathi, India
J. S. Shyam Mohan
Department of Computer Science and Engineering, Sri Chandrasekharendra Saraswati Viswa
Mahavidyalaya, Kanchipuram 631561, India
B. Balaji Bhanu
Department of Electronics, Andhra Loyola College, Vijayawada 520010, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
B. N. K. Rao et al. (eds.), Intelligent Computing and Applications, Smart Innovation,
Systems and Technologies 315, https://doi.org/10.1007/978-981-19-4162-7_13

There are two main types of cancer found in NMSC: squamous-cell (SCC)
skin cancer and basal-cell (BCC) skin cancer. MSC develops various harmful skin
cells and can be more serious than NMSC [3]. These skin-related
problems occur due to overexposure to ultraviolet (UV) radiation from the sun and other
indoor light sources. DNA gets damaged on exposure, which triggers the
basal cells in the epidermis and causes abnormal growth in the skin [4]. SCC occurs due
to skin exposure to the sun, which makes the skin red and very itchy; it is considered a
dangerous cancer type when compared with MSC.
Most MSC cells consist of various color shades, red and pink, which tend to be very
harmful [5]. They change in shape and size abnormally depending on the skin tissue.
Skin exposure to UV radiation from the sun causes 90% of skin-related cases in our
country. Another radiation source is tanning beds; exposure of the human body to them
during childhood is harmful and enormously raises MSC and NMSC cells [6]. There is
also a chance for MSC to affect the human skin through moles on the body. Direct DNA
damage results from the exposure of human skin to UV rays, and indirect DNA alteration
occurs through reactive oxygen species [7].
Skin cancer can be identified by various symptoms such as ulcering of the skin,
skin that does not heal, changes in moles, discolored skin, and many more. The profession
most vulnerable to skin cancer is farming, where farmers are exposed to the sun at
different intervals of time. Other risk factors include smoking, HPV infections,
immune problems, and many more. Further, many partial solutions have been proposed for
early diagnosis of skin cancer, including different surgical procedures and
chemotherapy [8]. The most common solution for skin problems is surgical excision;
these surgeries mainly focus on skin reconstruction according to the size and location
of the skin defect. The mortality rate is very low in these skin-related problems, as
there are many approaches for their cure and containment.
13.2 Literature Survey and Proposed Work
A convolution neural network (CNN) is very similar to the neurons of the human brain,
responding to the input dataset based on the relevant filters necessary for the spatial
and temporal dependencies of the image [9]. CNN fits the given input dataset well,
involving various parameters and weight reusability. Classification of the
correct image is successful only when the image pixels are near the intensity
values of the input image [10] (Fig. 13.1).
13.2.1 Image Segmentation
This refers to splitting of the images into various segments where the required region
is identified for analysis [11,12]. Popular Otsu algorithm is used for segmentation
Fig. 13.1 Convolution neural network [9]
Table 13.1 Literature survey

1. An introduction to ROC data analysis: ROC graphs play a major role in identifying accurate classifiers for performance visualization.
2. Automatic segmentation of clustered breast cancer cells using watershed and concave vertex graph: automatic detection and analysis of quantum dots (DQ) are performed using a fuzzy approach.
3. Brain tumor segmentation and its area calculation in brain MR images using k-mean clustering and fuzzy c-mean algorithm: a novel approach for shape detection of tumors in MR brain images.
4. Brain tumor detection using color-based k-means clustering segmentation: a novel algorithm based on color-based segmentation using the k-means clustering technique for classifying tumor objects.
which is well suited for image thresholding, an alternative to the region-based
segmentation method [13, 14] (Fig. 13.2).
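Otsu's method can be sketched compactly in pure Python; this is a generic illustration of the algorithm, not the MATLAB toolbox code used later in this chapter. It searches for the gray level that maximizes the between-class variance:

```python
def otsu_threshold(pixels):
    """Return the gray level (0-255) maximizing between-class variance."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * hist[i] for i in range(256))

    best_t, best_var = 0, -1.0
    w_b = 0       # background (below-threshold) pixel count
    sum_b = 0.0   # background intensity sum
    for t in range(256):
        w_b += hist[t]
        if w_b == 0:
            continue
        w_f = total - w_b
        if w_f == 0:
            break
        sum_b += t * hist[t]
        mean_b = sum_b / w_b
        mean_f = (sum_all - sum_b) / w_f
        between_var = w_b * w_f * (mean_b - mean_f) ** 2
        if between_var > best_var:
            best_var, best_t = between_var, t
    return best_t

# Two well-separated intensity clusters: the threshold lands between them.
dark, bright = [10, 12, 14, 16], [200, 202, 204, 206]
print(otsu_threshold(dark + bright))  # 16: the largest dark intensity
```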
In this work, the RGB image is converted into grayscale, and a ConvNet is utilized
to process the image without losing the features necessary for good prediction
on our scalable input dataset [15]. The main objective of utilizing this function
is to extract the high-level features from the input image, while the low-level
features are identified by the ConvLayer function. The pooling layer in a CNN plays a
major role in reducing the spatial size of the input. This work focuses on various skin
lesion patterns based on a convolution neural network [16]. The model is a type of
deep learning algorithm that takes an input image with its associated weights. The
inbuilt functions in many tools preprocess the data using the ConvNet function, which
learns various features from the input dataset. The data is preprocessed effectively
with a dimensionality reduction technique for the input images, where the pooling process
Fig. 13.2 4×4×3 RGB input image
will be continued until the necessary features are achieved with higher computation
power [17].
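The RGB-to-grayscale conversion step can be sketched with the standard luminance weights (0.299, 0.587, 0.114); these weights are the common ITU-R BT.601 convention, assumed here rather than stated in the chapter:

```python
def to_grayscale(rgb_image):
    """Convert an H x W nested-list image of (R, G, B) tuples to grayscale."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
            for row in rgb_image]

# A 1 x 3 image: pure red, pure green, pure blue pixels.
img = [[(255, 0, 0), (0, 255, 0), (0, 0, 255)]]
print(to_grayscale(img))  # [[76, 150, 29]]
```

Note how green contributes the most to perceived brightness, which is why its weight is the largest.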
13.2.2 Morphological Image Processing
This refers to a combination of nonlinear operations that depend on the relative
ordering of image pixel values and are generally suitable for binary images [18, 19].
The structuring element in the image is identified, positioned, and compared
with the nearest/neighboring pixels for easy classification of the input image. The most
important operations, namely erosion and dilation, are applied to the input dataset
for the addition and removal of boundary objects of the input images [20].
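Erosion and dilation can be sketched in pure Python with a 3-pixel structuring element on a 1-D binary signal; this is a generic illustration of the two operations (with a simplified boundary rule that only considers in-range neighbors), not the chapter's implementation:

```python
def dilate(bits):
    """3-element dilation: a pixel becomes 1 if it or any neighbor is 1 (grows objects)."""
    n = len(bits)
    return [1 if any(bits[j] for j in range(max(0, i - 1), min(n, i + 2))) else 0
            for i in range(n)]

def erode(bits):
    """3-element erosion: a pixel stays 1 only if it and all in-range neighbors are 1."""
    n = len(bits)
    return [1 if all(bits[j] for j in range(max(0, i - 1), min(n, i + 2))) else 0
            for i in range(n)]

signal = [0, 1, 1, 1, 0, 0, 1, 0]
print(dilate(signal))  # [1, 1, 1, 1, 1, 1, 1, 1]: boundaries expand
print(erode(signal))   # [0, 0, 1, 0, 0, 0, 0, 0]: the lone pixel disappears
```

Applying erosion followed by dilation (an "opening") removes small noise objects while roughly preserving larger ones, which is the typical use on segmented lesion masks.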
13.3 Results
This work was implemented using the MATLAB 2018 tool, where the image processing
toolbox is readily available and provides different machine learning algorithms
[21–23]. This helps to classify the input dataset easily and detect the skin lesions
accurately (Figs. 13.3, 13.4 and 13.5).
Fig. 13.3 Training progress of the datasets
Fig. 13.4 Segmented image of detected object
Fig. 13.5 Cancer area detection in input image
13.4 Conclusion and Future Enhancement
This work mainly focuses on identifying skin disease through a computer-based
technique, which reduces the disease identification time for any individual
(dermatologist). A suitable segmentation technique is applied to the input set of
images, which are classified using the CNN technique. The skin disease is accurately
identified based on the efficient classification algorithm incorporated in this work.
This CNN-based work can be further extended to other medical images for efficient
disease analysis.
References
1. Fawcett, T. (2006). An introduction to ROC analysis. Pattern Recognition Letters, 27(8),
861–874.
2. Mouelhi, A., Sayadi, M., & Fnaiech, F. (2011). Automatic segmentation of clustered breast
cancer cells using watershed and concave vertex graph. In International Conference on
Communications, Computing and Control Applications (CCCA).
3. Sivakumar, J., Lakshmi, A., & Arivoli, T. (2012). Brain tumor segmentation and its area calcu-
lation in brain mr images using k-mean clustering and fuzzy c-mean algorithm. In International
Conference on Advances in Engineering, Science and Management (ICAESM).
4. Wu, M.-N., Lin, C.-C., & Chang, C.-C. (2007). Brain tumor detection using color-based
k-means clustering segmentation. In Third International Conference on Intelligent Information
Hiding and Multimedia Signal Processing (IIH-MSP).
5. Zhou, H., Rehg, J. M., & Chen, M. (2010). Exemplar-based segmentation of pigmented skin
lesions from dermoscopy images. In IEEE International Symposium on Biomedical Imaging:
From Nano to Macro.
6. Jones, T. D., & Plassmann, P. (2000). An active contour model for measuring the area of leg
ulcers. IEEE Transactions on Medical Imaging, 19(12), 1202–1210.
7. Tsap, L. V., Goldgof, D. B., Sarkar, S., & Powers, P. S. (1998). A vision-based technique for
objective assessment of burn scars. IEEE Transactions on Medical Imaging, 17 (4), 620–633.
8. Keke, S., Peng, Z., & Guohui, L. (2010). Study on skin color image segmentation used by fuzzy-
c-means arithmetic. In Seventh International Conference on Fuzzy Systems and Knowledge
Discovery.
9. Sarrafzade, O., Baygi, M. H. M., & Ghassemi, P. (2010). Skin lesion detection in dermoscopy
images using wavelet transform and morphology operations. In 17th Iranian Conference of
Biomedical Engineering(ICBME).
10. Challa, N. P., Gokulakrishnan, et al. (2020). A method of an artificially intelligent build
repository management system. Indian Patent Application Number: 202041026135, July 2020.
11. Cula, O. G., & Dana, K. J. (2002). Image-based skin analysis. In Proceedings of Texture
(pp. 35–40). Copenhagen.
12. Kundin, J. I. (1989). A new way to size up wounds. AJN The American Journal of Nursing,
89(2), 206–207.
13. Lucas, C., Classen, J., Harrison, D., & De, H. (2002). Pressure ulcer surface area measurement
using instant full-scale photography and transparency tracings. Advances in Skin and Wound
Care, 15(1), 17–23.
14. Keas, D. H., & Keith, C. (2004). Measure: A proposed assessment framework for developing
best practice recommendations for wound assessment. Wound Repair and Regeneration, 12(3),
1–17.
15. Shen, S., Sandham, W. A., Granat, M. H., Dempsey, M. F. & Patterson, J. (2003). Fuzzy
clustering-based application to medical image segmentation. In Proceedings of the 25th Annual
International Conference of the IEEE EMBS (pp. 747–750).
16. Otsu, N. (1979). A threshold selection method from gray-level histograms. IEEE Transactions
on Systems, Man, and Cybernetics, 9(1), 62–66.
17. Xu, R., & Wunsch, D. (2005). Survey of clustering algorithms. IEEE Transactions on Neural
Networks, 16(3), 645–678.
18. Chan, T. F., & Vese, L. A. (2001). Active contours without edges. IEEE Transactions on image
processing, 10(2), 266–277.
19. Challa, N. P., Rao, N. K., et al. (2019). Predictive maintenance for monitoring heritage buildings
and digitization of structural information. International Journal of Innovative Technology and
Exploring Engineering, 8(8), 1463–1468.
20. Yun, T., Zhou, M.-q., Wu, Z.-k., & Wang, X.-c. (2009). A region based active contour model for
image segmentation. In International Conference on Computational Intelligence and Security.
21. Dawod, A. Y., Abdullah, J., & Alam, M. J. (2010). A new method for hand segmentation using
free-form skin color model. In 3rd International Conference on Advanced Computer Theory
and Engineering (ICACTE).
22. Bezdek, J. C. (1981). Pattern recognition with fuzzy objective function algorithms. Plenum
press.
23. Abolfazl, K., Hadi, S., & Ali, A. (2011). A modified fcm algorithm for MRI brain image
segmentation. In 7th Iranian Conference on Machine Vision and Image Processing.
Chapter 14
Detection of COVID-19 Using CNN
and ML Algorithms
M. Raghav Srinivaas, Khanjan Shah, B. Abhishek, R. Jagadeesh Kannan,
and A. Balasundaram
Abstract Coronavirus causes a very dangerous disease, and identifying it in a person's
body is not easy. During identification there are many false positive cases, where the
person does not have corona and yet the prediction says they do, and also false negative
cases, where the person has corona but it does not get detected. Due to this problem, we
come up with two approaches, compare them, and decide which one is better at analyzing
the disease in the body. We use CNN to scan the chest X-ray dataset and ML algorithms
for the tabular dataset, as it also contains much text information. In this project, we
explain in detail what CNN is, what ML is, how to implement CNN and ML algorithms on a
particular dataset, and what output we get as a comparison.
14.1 Introduction
COVID-19 is one of the worst pandemics seen around the world. Many people were
affected by it, and many lost their lives to it. COVID-19 started around December
2019 in Wuhan, China, and began to spread around the world by the first quarter of
2020. From March 2020, lockdowns were imposed in India and lasted about 6 months.
Vaccines were developed in early 2021, and now almost 30% of the population is
vaccinated. During the early stages of the pandemic, detection of COVID-19 was a
tedious job: it took about 2 to 3 days to get the test results, and by that time the
condition of the people affected by the disease worsened.
In this paper, we are proposing a system to detect COVID-19 in an easy manner.
Two datasets are taken for training and testing purpose. The first one is an image
dataset consisting of chest X-ray images. The second one is a text dataset.

M. Raghav Srinivaas · K. Shah · B. Abhishek · R. Jagadeesh Kannan · A. Balasundaram (B)
School of Computer Science and Engineering, Vellore Institute of Technology, Chennai, India
e-mail: balasundaram.a@vit.ac.in
R. Jagadeesh Kannan
e-mail: jagadeeshkannan.r@vit.ac.in
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
B. N. K. Rao et al. (eds.), Intelligent Computing and Applications, Smart Innovation,
Systems and Technologies 315, https://doi.org/10.1007/978-981-19-4162-7_14

The image
dataset is used with convolution neural network, and the text dataset is used with
recurrent neural network. Using this we are able to create a system that can detect
COVID-19 in an easier as well as faster manner.
14.2 Literature Survey
14.2.1 Pneumonia and COVID-19 Detection Using
Convolution Neural Network
A system to detect COVID-19 and pneumonia using neural networks is proposed.
CNN and VGG-16 were used because deep learning methods based on deep CNNs
applied to chest X-ray images are gaining recognition and have produced encouraging
outcomes in diverse applications. The Kaggle chest X-ray database containing 15,798
chest X-ray images of normal (healthy), viral, and bacterial pneumonia cases was
taken. A CNN is constructed containing a convolution layer followed by a batch
normalization layer and then a max pooling layer with an activation layer. These
layers are repeated multiple times, with a flatten and a dense layer at the end. The
Adam optimizer with a learning rate of 0.0001 and a categorical cross-entropy loss
were used. A training accuracy of 95% was achieved. During testing, an accuracy of
92% was achieved for pneumonia and 93% for COVID-19.
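The architecture described above (repeated convolution, batch normalization, max pooling, and activation blocks, ending in flatten and dense layers, compiled with Adam at a 0.0001 learning rate and categorical cross-entropy) can be sketched in Keras as follows. The filter counts, input size, and number of classes are illustrative assumptions, not values from the surveyed paper.

```python
import tensorflow as tf
from tensorflow.keras import layers


def build_cnn(input_shape=(64, 64, 1), num_classes=3):
    """Sketch of the repeated conv/batch-norm/max-pool/activation block."""
    inputs = tf.keras.Input(shape=input_shape)
    x = inputs
    # The conv -> batch norm -> max pool -> activation pattern, repeated
    for filters in (32, 64, 128):  # filter sizes are assumed, not from the paper
        x = layers.Conv2D(filters, (3, 3), padding="same")(x)
        x = layers.BatchNormalization()(x)
        x = layers.MaxPooling2D((2, 2))(x)
        x = layers.Activation("relu")(x)
    x = layers.Flatten()(x)
    outputs = layers.Dense(num_classes, activation="softmax")(x)
    model = tf.keras.Model(inputs, outputs)
    # Adam with learning rate 0.0001 and categorical cross-entropy, as reported
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
                  loss="categorical_crossentropy", metrics=["accuracy"])
    return model


model = build_cnn()
```

Training would then proceed with `model.fit` on the prepared X-ray batches.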
14.2.2 COVID-19 Detection Using Recurrent Neural
Network
A system to detect COVID-19 using recurrent neural networks is proposed. A speech
corpus consisting of 60 healthy speakers and 20 COVID-19 patients was considered.
Three different types of sounds were recorded: cough sounds, breathing sounds, and
voice. Mel-frequency cepstral coefficients were used to extract features. A long
short-term memory (LSTM) network was used for detection. 70% of the data was used
for training and 30% for testing. Accuracies of 98% for breathing sounds, 97% for
cough sounds, and 88% for voice were achieved. The end result was that breathing
and cough sounds alone were enough for detecting COVID-19.
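The pipeline above (MFCC feature sequences fed to an LSTM, with a 70/30 train/test split) can be sketched as below. The sequence length, feature count, layer sizes, and the synthetic placeholder data are all assumptions for illustration; the surveyed paper's actual corpus and hyperparameters are not reproduced here.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers


def build_lstm(num_frames=100, num_mfcc=13):
    """Binary classifier over a sequence of MFCC feature vectors."""
    inputs = tf.keras.Input(shape=(num_frames, num_mfcc))
    x = layers.LSTM(64)(inputs)                  # summarize the audio sequence
    x = layers.Dense(32, activation="relu")(x)
    outputs = layers.Dense(1, activation="sigmoid")(x)  # COVID vs. healthy
    model = tf.keras.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model


model = build_lstm()

# 70/30 train/test split as in the paper (synthetic placeholder data here)
X = np.random.rand(80, 100, 13).astype("float32")
y = np.random.randint(0, 2, size=(80, 1))
split = int(0.7 * len(X))
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]
```

In practice the MFCC matrices would be extracted from the recorded cough, breathing, and voice audio before being fed to the network.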
14 Detection of COVID-19 Using CNN and ML Algorithms 137
14.2.3 Transfer Learning to Detect COVID-19 Automatically
from X-ray Images Using Convolutional Neural
Networks
A system that uses transfer learning to detect COVID-19 automatically from X-ray
images using convolutional neural networks is designed. An X-ray image dataset
was downloaded from Kaggle. This dataset consists of chest X-ray images from
1200 individuals with COVID-19, 1341 images from healthy individuals, and 1345
images from individuals with other types of viral pneumonia. The transfer learning
technique was applied using ImageNet data. Three more layers were added on top
of each model, namely a fully connected layer (FC2) with an output size of 512,
a dropout layer, and another fully connected layer (FC1) with a softmax classifier.
The dropout layer was added to prevent overfitting. The network was trained with the
softmax classifier for 15 epochs using an RMSprop optimizer, with a learning rate
of 0.00001 and a batch size of 32. An accuracy of 96% was achieved.
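A hedged sketch of the head described above: a frozen pretrained backbone, a 512-unit fully connected layer, dropout, and a softmax classifier, compiled with RMSprop at a 0.00001 learning rate. VGG16 is used here only as a stand-in backbone (the paper applies the head to several models), and `weights=None` avoids the ImageNet weight download; in practice one would pass `weights="imagenet"`.

```python
import tensorflow as tf
from tensorflow.keras import layers


def build_transfer_model(num_classes=3):
    """Frozen backbone + FC(512) + dropout + softmax transfer-learning head."""
    # weights=None is only for illustration; use weights="imagenet" in practice
    base = tf.keras.applications.VGG16(include_top=False, weights=None,
                                       input_shape=(224, 224, 3))
    base.trainable = False                       # freeze the pretrained features
    x = layers.Flatten()(base.output)
    x = layers.Dense(512, activation="relu")(x)  # fully connected, output 512
    x = layers.Dropout(0.5)(x)                   # dropout to limit overfitting
    outputs = layers.Dense(num_classes, activation="softmax")(x)
    model = tf.keras.Model(base.input, outputs)
    model.compile(optimizer=tf.keras.optimizers.RMSprop(learning_rate=1e-5),
                  loss="categorical_crossentropy", metrics=["accuracy"])
    return model


model = build_transfer_model()
# model.fit(train_ds, epochs=15, batch_size=32)  # schedule as in the paper
```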
14.2.4 Automatic Detection of Coronavirus Disease
(COVID-19) Using X-ray Images and Deep
Convolutional Neural Networks
A system for automatic detection of coronavirus disease (COVID-19) using X-ray
images and deep convolutional neural networks is designed. Chest X-ray images
of 341 COVID-19 patients were obtained from the open-source GitHub repository
shared by Dr. Joseph Cohen. This repository consists of chest X-ray/computed
tomography (CT) images of patients mainly with acute respiratory distress syndrome
(ARDS), COVID-19, Middle East respiratory syndrome (MERS), pneumonia, and
severe acute respiratory syndrome (SARS). 2800 normal (healthy) chest X-ray
images were selected from the “ChestX-ray8” database, and 2772 bacterial and 1493
viral pneumonia chest X-ray images were used from the Kaggle repository “chest
X-ray images (pneumonia)”. Data augmentation is performed, after which pretrained
models such as ResNet and InceptionV3 are loaded. The output is then passed through
a global average pooling layer and a fully connected layer. An accuracy of 95.4%
was achieved using the InceptionV3 model, and an accuracy of 96.1% was achieved
using the ResNet model.
14.2.5 Use of Fuzzy Soft Set in Decision Making
of COVID-19 Risks
To make human-like decisions, fuzzy inference systems with fuzzy reasoning are
used; this works only with the help of professionals who make decisions on complex
issues through their continuous hard work and knowledge. This type of fuzzy system
is useful when human information is involved, because such information requires
processing through natural language. The problem is that we are often unable to
feed in the whole information, or we do not have sufficient information to feed into
a conventional mathematical model. The most commonly used fuzzy inference
method is the Mamdani model. The Mamdani fuzzy inference technique is executed
in four consecutive stages: fuzzification, rule evaluation, rule output aggregation, and
defuzzification. The fuzzification module maps the crisp input value into degrees
of membership of fuzzy sets by applying fuzzification membership functions. A
membership function returns a value between 0 (for non-membership) and 1 (for full
membership). The knowledge base includes the IF-THEN rules, which are provided
by domain experts.
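The four Mamdani stages described above can be illustrated with a minimal pure-Python sketch. The membership functions, the fever-to-risk rules, and all numeric ranges are invented for illustration; they are not taken from the surveyed work.

```python
def tri(x, a, b, c):
    """Triangular membership function returning a degree in [0, 1]."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)


def mamdani_risk(temp):
    """Toy Mamdani inference: body temperature -> COVID risk in [0, 1]."""
    # 1. Fuzzification: map the crisp temperature into membership degrees
    low_fever = tri(temp, 35.0, 36.5, 38.0)
    high_fever = tri(temp, 37.0, 39.5, 42.0)
    # 2. Rule evaluation (example IF-THEN rules a domain expert might give):
    #    IF fever is low  THEN risk is low
    #    IF fever is high THEN risk is high
    xs = [i / 100.0 for i in range(101)]
    # 3. Aggregation: clip each output set by its rule strength, take the max
    agg = [max(min(low_fever, tri(x, 0.0, 0.2, 0.5)),
               min(high_fever, tri(x, 0.5, 0.8, 1.0))) for x in xs]
    # 4. Defuzzification: centroid of the aggregated fuzzy set
    num = sum(x * m for x, m in zip(xs, agg))
    den = sum(agg)
    return num / den if den else 0.0


r_hot = mamdani_risk(40.0)    # strongly activates the high-fever rule
r_cool = mamdani_risk(36.5)   # strongly activates the low-fever rule
```

A real system would use many inputs and rules, but the four stages remain the same.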
14.2.6 CNN-LSTM Combination for COVID-19 Prediction
Time-series prediction is a forecasting approach that analyzes historical data to
capture the relationships and tendencies of a random variable. It is then applied
to forecast the value of that random variable in the future. This technique is
particularly useful if the underlying data-generating distribution or process is
unknown, or if there is no explanatory model capable of precisely linking the
prediction variable with other explanatory variables. A great deal of effort and
research output has gone into the development and improvement of time-series
forecasting techniques over the last several decades. The next paragraph summarizes
many fruitful studies that demonstrate several models for forecasting COVID-19
cases. Alternatively, algorithms based on artificial intelligence (AI) learn from
historical data to forecast future results. Machine-learning and deep-learning
algorithms are types of AI algorithms; this is a discipline focused on computer
algorithms that learn and improve on their own. Machine-learning-based forecasting
regressors adjust their parameters to fit their forecasts to the actual data. Some
related studies that used machine-learning algorithms to forecast the spread of the
COVID-19 disease are discussed in the following paragraphs.
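A common first step for the learning-based forecasting approaches surveyed here is to frame the historical case counts as a supervised problem with a sliding window: the previous `window` values predict the next value. The window size and the toy case series below are illustrative assumptions.

```python
import numpy as np


def make_windows(series, window=7):
    """Return (X, y): each row of X holds `window` past values, and y holds
    the value that immediately follows that window."""
    series = np.asarray(series, dtype=float)
    X = np.array([series[i:i + window] for i in range(len(series) - window)])
    y = series[window:]
    return X, y


# Hypothetical daily case counts, for illustration only
daily_cases = [10, 12, 15, 20, 26, 33, 41, 50, 62, 75]
X, y = make_windows(daily_cases, window=3)
# X[0] = [10, 12, 15] is paired with the next observation y[0] = 20
```

The resulting `(X, y)` pairs can be fed to any regressor, from classical machine-learning models to a CNN-LSTM network.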
14.2.7 Statistical Techniques for Combating COVID-19
Computational intelligence was formally defined by Bezdek in 1994: a system is
called “computationally intelligent” if it deals with data at a fundamental level
(such as the pixels of an image), contains a pattern-recognition module, and does
not utilize prior knowledge in the sense of artificial intelligence. According to
Bezdek’s definition, computational intelligence is one branch of artificial
intelligence. Actually, the goals of both artificial intelligence and computational
intelligence are the same, which is to realize general intelligence. Marks clarified
the distinction between artificial intelligence and computational intelligence by
claiming that the former is built from hard-computing technologies, while the latter
is built from soft-computing technologies. Therefore, we can presume that two types
of machine intelligence exist:
Artificial intelligence involves using advanced concepts of computing to impart
intelligence to a system [1]. Compared to hard-computing-based artificial
intelligence, computational intelligence can adapt to many different situations
through the benefits of soft computing. Hard-computing strategies are designed
using Boolean logic based only on true or false values, on which data engineering
relies. One crucial problem with Boolean logic is that Boolean values are unable to
interpret natural language easily.
14.2.8 Computational Intelligence Techniques for Combating
COVID-19: A Survey
In the disciplines of health sciences and epidemiology, computational intelligence has
been employed in a variety of applications [15]. Because of the fast and widespread
spread of COVID-19, many researchers all over the world have been working hard
to create computational intelligence methods and systems to battle the pandemic.
Despite the fact that over 200,000 scientific publications have been published on
COVID-19, SARS-CoV-2, and other coronaviruses [610], none of them properly
addressed the essential challenges of using computational intelligence to battle
COVID-19. To close this gap, this survey categorizes and evaluates the present state
of computational intelligence in the battle against this deadly illness. The authors
compile and synthesize the most recent discoveries and ideas in computational
intelligence techniques, such as machine learning, evolutionary computation, soft
computing, and big data analytics, as turned into practical applications against
COVID-19. They also look into some possible computational intelligence research
topics for combating the epidemic. According to the findings, the majority of
research papers addressing the challenge of characterizing viral infection symptoms
are based on neural networks. These achieved the best results (97.62% true-positive
rate),
indicating that deep neural networks for detecting symptoms from CT images are
well-developed.
14.2.9 Toward Using Recurrent Neural Networks
for Predicting Influenza-Like Illness: Case Study
of COVID-19 in Morocco
Influenza does not just damage people’s health; it is also a major concern for
governments and health-care facilities. The most effective control technique for flu
outbreaks is early analysis, forecast, and reaction. Scientists working on artificial
intelligence (AI) are attempting to analyze epidemics and create supervised and
unsupervised models. In this work, they described the most commonly used machine
learning (ML) and deep learning (DL) models for analyzing time-series data to
better understand COVID-19’s behavior. Among many algorithms of ML, recurrent
neural network (RNN) was chosen for tracking this pandemic and forecasting its
future emergence. Since the first case of COVID-19 was reported in Morocco, the
total number of documented infectious cases has continued to rise, albeit the number
varies by area of the country. In addition, they offer an analysis and prediction model
for the influenza-like disease COVID-19 based on geographical distribution in this
study. The use of machine learning and deep learning models for epidemic prediction
and analysis was proposed in this study, with the LSTM model being used to fore-
cast COVID-19 pandemic expansion in Morocco. This prediction model can assist
in making critical decisions that will result in a quicker response and management
of the issue.
14.2.10 The Role of Chest Imaging in Patient Management
During the COVID-19 Pandemic: A Multinational
Consensus Statement from the Fleischner Society
The coronavirus disease 2019 (COVID-19) pandemic has arisen as an extraordi-
nary health-care catastrophe, with over 900,000 confirmed cases globally and almost
50,000 deaths in the first three months of 2020. COVID-19 has spread in a variety of
ways, resulting in sporadic transmission and a small number of hospitalized COVID-
19 patients in some areas, and community transmission in others, resulting in a large
number of severe cases. Critical resource restrictions in diagnostic tests, hospital
beds, ventilators, and health-care professionals who have been ill as a result of the
virus have interrupted and impaired health-care delivery in these locations, which has
been worsened by a lack of personal protective equipment. Although initial instances
resemble normal upper respiratory virus infections, as the disease progresses, respi-
ratory dysfunction becomes the leading cause of morbidity and death. Chest radiog-
raphy and computed tomography (CT) are important diagnostic and treatment tech-
niques for pulmonary illness. The findings were compiled into five primary and
three supplementary suggestions to assist medical professionals in the use of chest
radiography and CT in the treatment of COVID-19.
14.3 Proposed Methodology
14.3.1 Detecting COVID-19 Using CNN
14.3.1.1 Dataset Info
Dataset: COVID-19 Radiography Database
A team of researchers from Qatar University, Doha, Qatar, and the University
of Dhaka, Bangladesh along with their collaborators from Pakistan and Malaysia in
collaboration with medical doctors have created a database of chest X-ray images
for COVID-19 positive cases along with normal and viral pneumonia images.
There are a total of 21,165 samples, which are classified into four categories:
1. COVID-19
2. Lung opacity
3. Normal
4. Viral pneumonia
The photos are all in the Portable Network Graphics (PNG) file type and are
299 ×299 pixels in size. The database presently contains 3616 COVID-19 positive
cases, 10,192 normal, 6012 lung opacity (non-COVID lung infection), and 1345 viral
pneumonia pictures, according to the most recent update.
We will only train our model on two of these four classes, namely the “normal”
and “COVID” classes.
14.3.1.2 Training Phase
1. The image dataset is used as an input to CNN.
2. The data is either chest X-ray or CT scan of the chest.
3. Once the data is provided as input, the CNN starts to extract the features of the
data.
4. The image is passed through a set of convolution and max pooling layers to
reduce the features and focus on the point of infection.
Fig. 14.1 Pneumonia detection using convolutional neural network (CNN)
5. After the features are extracted, the image is flattened from 2D to 1D.
6. In this way, we first train the model.
7. This model is used for detection of pneumonia as well as COVID-19 (Fig. 14.1).
14.3.1.3 Testing Phase
1. The testing phase starts with passing the input image.
2. The image is passed through a set of convolution and max pooling layers to
reduce the features and focus on the point of infection.
3. After the features are extracted, the image is flattened from 2D to 1D.
4. Now the image is classified as positive if infection is present and negative if
it is not.
5. Various types of CNN can be used, such as AlexNet, ResNet, or region-based CNN.
6. We can also use the transfer learning approach, where we take an already
trained model.
14.3.1.4 Exploratory Data Analysis
In the next section, we present an exploratory data analysis that deals with the
dimensions of the images in the respective classes and the number of images in each
class. Here, we found a large imbalance in the data between classes, for which
we applied the concept of “focal loss”. After that, we preprocessed the images
and tried to examine a clearer image with the help of computer vision techniques.
The graph for each technique was plotted below the same image, and this step was
repeated for the COVID image. Then, we applied data augmentation and split the data
for training and testing. After this step, we built the model from scratch and applied
various concepts of deep learning such as pooling, dropouts, and activation functions,
the details of which are given in the model summary (model.summary()). We defined
a custom loss function and compiled the model. After that, we trained our model for
25 epochs, which gave an accuracy of 89.29%. This model was saved under the name
“my_model_1.h5”. We plotted a graph of training and validation with epoch and
accuracy as the grid values; the same step was executed to plot the training and
validation loss.
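The custom focal loss used to handle the class imbalance can be sketched as follows. Focal loss down-weights well-classified examples so training concentrates on hard ones; with `gamma = 0` it reduces to ordinary weighted cross-entropy. This NumPy version and its parameter values (`gamma = 2.0`, `alpha = 0.25`) are a common formulation assumed for illustration, not the exact function defined in the project notebook.

```python
import numpy as np


def focal_loss(y_true, y_prob, gamma=2.0, alpha=0.25, eps=1e-7):
    """Binary focal loss, averaged over samples."""
    y_prob = np.clip(y_prob, eps, 1 - eps)
    # p_t is the predicted probability assigned to the true class
    p_t = np.where(y_true == 1, y_prob, 1 - y_prob)
    a_t = np.where(y_true == 1, alpha, 1 - alpha)
    # (1 - p_t)^gamma shrinks the loss for confident, correct predictions
    return float(np.mean(-a_t * (1 - p_t) ** gamma * np.log(p_t)))


y_true = np.array([1, 0, 1, 0])
y_prob = np.array([0.9, 0.1, 0.6, 0.4])
loss_focal = focal_loss(y_true, y_prob)             # gamma = 2
loss_plain = focal_loss(y_true, y_prob, gamma=0.0)  # reduces to cross-entropy
```

In training, the same formula would be expressed with TensorFlow ops and passed to `model.compile(loss=...)`.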
Then, we trained the model using “InceptionResNetV2”. This model was trained
for 12 epochs, achieving an accuracy of 95.47%, and was saved under the name
“my_model_2.h5”. We plotted a graph to observe the accuracy and loss at different
points (Fig. 14.2).
After that, we fine-tuned the model using the concept of focal loss, and the same
step was executed for model 2 above. This time, the model ran for 20 epochs,
attaining an accuracy of 99.24% (Fig. 14.3).
Fig. 14.2 InceptionResNetV2 accuracy and loss of model
Fig. 14.3 Accuracy and loss of model after focal loss
14.3.2 Detecting COVID-19 Using Random Forest
14.3.2.1 Dataset Information for ML: Covid_Symptoms Checker
This data helps identify whether a person has coronavirus disease based on some
pre-defined standard symptoms. These symptoms are based on guidelines given by the
World Health Organization (WHO, who.int) and the Ministry of Health and Family
Welfare, India.
The dataset contains seven major variables that have an impact on whether someone
has coronavirus disease: country, age, symptoms, any other experienced symptoms,
severity, and contact. With all these categorical variables, a combination for each
label in each variable is generated; in total, 316,800 combinations are created.
14.3.2.2 Implementation Using Random Forest
1. The text dataset is used as an input to the random forest.
2. The data is either a report containing a sequence of text or labeled CSV files.
3. We initially set a threshold for each feature value, such as oxygen level, SATs,
and BP.
4. Once the data is provided as input, the random forest starts to extract the features.
5. After the features are extracted, they are compared with the threshold values to
find whether the case is positive or negative.
This approach is a form of supervised learning: we have the labeled data as well as
the outcome. We perform random forest to extract the features, compare them with
the threshold values, and classify the case as positive or negative. Then, we compare
the result with the known output to check whether we identify cases properly.
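The steps above can be sketched with scikit-learn. The symptom features, the labeling rule, and all data below are invented placeholders standing in for the Kaggle symptom-checker dataset; only the overall fit/score pattern reflects the described pipeline.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 400
# Binary symptom indicators (e.g., fever, dry cough, breathing difficulty,
# contact with a patient) -- hypothetical features, not the real columns
X = rng.integers(0, 2, size=(n, 4))
# Toy labeling rule: positive when at least two symptoms are present
y = (X.sum(axis=1) >= 2).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)            # learn from the labeled combinations
accuracy = clf.score(X_test, y_test) # compare predictions with known outputs
```

With the real 316,800-combination dataset, the categorical variables would first be encoded (e.g., one-hot) before fitting the forest.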
14.3.2.3 Exploratory Data Analysis of COVID-19 Symptoms Data
See Figs. 14.4, 14.5, 14.6 and 14.7.
14.4 Experimental Results
Using CNN, we created two models and tested them with different loss functions.
Initially, a model was developed from scratch, and while testing it we obtained an
accuracy of 89.29%. Then, we trained the model using InceptionResNetV2 and
achieved an accuracy of 95.47%. Later, the same model was trained with the focal
loss function, which gave a better accuracy of 99.24%. When we used the random
forest model, a lower accuracy of 75.09% was achieved. From this, we can infer
that convolutional neural networks perform better compared to machine learning
algorithms.
Fig. 14.4 Symptoms counts
Fig. 14.5 Percentage of symptoms heavily occurred
Fig. 14.6 Type of symptoms checker
Fig. 14.7 Random forest results
14.5 Conclusion
Two models were created to detect COVID symptoms using X-ray images. The
performance was evaluated in terms of classification accuracy, and it was observed
that the CNN model using InceptionResNetV2 provided a greater accuracy. Future
work will be toward using this model to enhance the accuracy for other datasets as
well.
References
1. Militante, S. V., Dionisio, N. V., & Sibbaluca, B. G. (2020). Pneumonia and COVID-19 detec-
tion using convolutional neural networks. In International Conference on Vocational Education
and Electrical Engineering (ICVEE).
2. Hasan, A., Shahin, I., & Alzabek, M. B. (2020). COVID-19 detection using recurrent
neural networks. IEEE.
3. Taresh, M. M., Zhu, N., Ali, T. A., Hameed, A. S., & Mutar, M. L. (2021). Transfer learning
to detect COVID-19 automatically from X-ray images using convolutional neural networks.
International Journal of Biomedical Imaging, Article ID 8828404.
4. Narin, A., Kaya, C., & Pamuk, Z. (2021). Automatic detection of coronavirus disease
(COVID-19) using X-ray images and deep convolutional neural networks. Pattern Analysis
and Application.
5. Awasthi, A., & Srivastava, S. K. (2021). A fuzzy soft set theoretic approach in decision making
of covid-19 risk in different regions. Communications in Mathematics and Applications, 12(2),
285–294. ISSN: 0975-8607.
6. Islam, M. Z., Islam, M. M., & Asraf, A. (2020). A combined deep CNN-LSTM network for
the detection of novel coronavirus (COVID-19) using X-ray images. Informatics in
Medicine Unlocked.
7. Statistical techniques for combating COVID-19. (2020). IEEE Computational
Intelligence Magazine. https://doi.org/10.1109/MCI.2020.3019873
8. Tseng, V. S., Ching Ying, J. J., Wong, S. T. C., Cook, D. J., & Lui, J. (2020). Computational
intelligence techniques for combating COVID-19: A survey. IEEE Computational
Intelligence Magazine.
9. Taj, R. M., El Mouden, Z. A., Jakimi, A., & Hajar, M. (2020). Towards using recurrent neural
networks for predicting influenza-like illness: Case study of covid-19 in Morocco. International
Journal of Advanced Trends in Computer Science and Engineering, 9(5).
10. Rubin G. D., et al. (2020). The role of chest imaging in patient management during the COVID-
19 pandemic: A multinational consensus statement from the Fleischner society. Radiology,
296(1), 172–180.
Chapter 15
Prioritization of Watersheds Using GIS
and Fuzzy Analytical Hierarchy (FAHP)
Method
K. Anil, S. Sivaprakasam, and P. Sridhar
Abstract Evaluating watershed characteristics with GIS tools and the FAHP method
to prioritize them based on erodability is crucial for researchers. In the present study,
the Kaddam watershed, part of the Godavari basin, has been selected for watershed
prioritization. The study area has a watershed divide with watershed codes 4E3C4
(Sikkumanu River), classified into eleven watersheds (4E3C4a to 4E3C4k), and 4E3C5
(Kaddam River), classified into seven watersheds (4E3C5a to 4E3C5g). The geospatial
thematic layer of the drainage network was extracted from Survey of India topographical
maps, and the watershed boundaries were delineated from the Watershed Atlas of India
(WAI) using ArcGIS software. The GIS software provides users a database of the
drainage network’s physical shape and length and the watershed areas. This data is
helpful for analyzing watershed characteristics computed through morphometric
parameter equations. The results showed seven watersheds as very severe, five as
severe, and six as moderate; these were checked and compared with the soil map
properties of the study area. FAHP simulations matched the erodability of the soil
map with 85–87% accuracy. Hence, the study showed that FAHP simulation is beneficial
for prioritizing watersheds with erodability as the deciding factor.
15.1 Introduction
Prioritization of watersheds based on hydrological parameters such as runoff,
sediment yield, or erodability has gained importance in watershed management
practices [1]. A watershed is defined as a drained area or natural hydrological boundary
that constitutes a closed polygon, with an exit point at the lower elevation that collects
K. Anil (B)·S. Sivaprakasam
Department of Civil Engineering, Annamalai University, Chidambaram, India
e-mail: anilkodimela@gmail.com
K. Anil
Department of Civil Engineering, Bapatla Engineering College, Bapatla, India
P. Sridhar
Shri Vishnu Engineering College for Women, Bhimavaram, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
B. N. K. Rao et al. (eds.), Intelligent Computing and Applications, Smart Innovation,
Systems and Technologies 315, https://doi.org/10.1007/978-981-19-4162-7_15
the rainwater, melting snow, or ice cover, which meets another water body [2].
Delineation of the watershed boundary depends on the size of the streams and how
they join the mainstream. The lengths of the streams and the areas of the watersheds,
extracted from topographical maps, satellite images, or aerial photographs with the
help of Geographical Information System (GIS) software, are treated as morphometric
parameters. The several morphometric parameters, expressed as mathematical
equations, measure the geometrical shape of these earth features and describe the
characteristics of the watershed. Assessing watershed characteristics by employing
advanced technologies, such as medium- to high-resolution remote sensing satellite
images, digital elevation models (DEM), and GIS software, has proved significant in
the last few decades [3]. GIS tools help figure out the nature of the topography, the
hydrological behavior, and the causes of erodability, which are helpful for prioritizing
the watershed [4]. The use of erodability as the decision-making parameter for
prioritizing watersheds, in order to identify soil and water conservation measures,
has increased in recent times with morphometric and fuzzy analytical hierarchy
process (FAHP) methods [3, 5]. Apart from this, several researchers have earlier
investigated watershed prioritization on different terrains by applying morphometry
with the Sediment Yield Index (SYI), LULC combinations, or the compound parameter
method [7].
Nevertheless, many studies have recently focused on the FAHP method to prioritize
watersheds based on erodability values, which are FAHP outcomes. The FAHP
method evaluates the fuzzy behavioral process and identifies, with better accuracy,
which factors show higher or lower influence on a given problem [8]. Therefore,
the present research attempts to prioritize watersheds using a unique approach based
on an examination of the natural drainage system, applying the FAHP method to
avoid the complex information that comes with it and improving recognition and
prioritization accuracy with varied morphological parameters [9].
15.2 Relevant Study
The Kaddam watershed is a part of the Godavari river basin and is located on the
northern side of Telangana state, in the present districts of Adilabad and Nirmal. The
coordinates of the study area are 19°5′ to 19°35′ N latitude and 78°10′ to 78°55′ E
longitude. The study area covers 64% clayey soils, 4.6% cracking clay soils, 11.47%
gravelly clay soils, 1.6% gravelly loam soils, and 18.58% loamy soils. The mean
annual rainfall of the study area for the period 1996–2020 was 1031.75 mm, with a
minimum temperature of 9 °C in January and 18 to 32 °C in May, according to the
Bureau of Economics and Statistics department, Hyderabad. The study area map is
represented in Fig. 15.1. The datasets were collected from Survey of India
topographical maps (toposheet numbers 56 I/3, I/6, I/7, I/8, I/10, I/11, I/12, I/14,
I/15, and 56 I/16) at a scale of 1:50,000. The software utilized here is ArcGIS 10.3
and Microsoft Excel.
15 Prioritization of Watersheds Using GIS and FAHP... 151
Fig. 15.1 Study Area
15.3 Methodology
15.3.1 Topographical Maps
Survey of India (SOI) topographical maps of the study area have been collected,
scanned, and added into the ArcMap environment. All the maps were geo-referenced,
rectified, and projected from a three-dimensional spherical coordinate system to a
two-dimensional system to get the real-world distances and areas of the physical
objects in the ArcMap environment [12]. A geospatial database was created, and the
linear features of the drainage or stream network in the study area were extracted
from the topographical maps. Strahler stream ordering was attributed to the drainage
network as follows: first order is attributed to streams that do not have any connected
streams; when two first-order streams meet or join at a point, the stream is second
order from that intersection onwards; likewise, when two second-order streams join,
the stream ahead of the intersection point is third order, and so on.
15.3.2 Watershed Atlas of India (WAI)
The watershed divide, delineation, and attribution of mini watersheds of the study
area are referred from sheet no. 4 of the WAI [12].
The morphometry equations were used to compute the characteristics of the
watershed listed in Table 15.1.
Fig. 15.2 Proposed flow chart of methodology
Table 15.1 Morphometric parameters

S.No.  Basic criterion           Technique                      Mentioned by
1      Basin length              Lb = 1.312 × A^0.568           Nookaratnam (11)
2      Compactness coefficient   Cc = 0.2821 × P / A^0.5        Horton (6)
3      Bifurcation ratio         Rb = Nu / Nu+1                 Schumm (17)
4      Elongation ratio          Re = (2 / Lb) × (A / π)^0.5    Schumm (17)
5      Drainage density          Dd = Lu / A                    Horton (6)
6      Drainage texture          T = Nu / P                     Horton (6)
7      Form factor               Rf = A / Lb^2                  Horton (6)
8      Circularity ratio         Rc = 4 × π × A / P^2           Miller (10)
9      Stream frequency          Fs = N / A                     Horton (6)

Nu total number of streams of all orders, Lb basin length (km), Lu total stream length
of all orders, N total number of streams, Nu+1 number of stream segments of the next
higher order, A area of the basin (km^2), and P perimeter of the basin (km)
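The formulas in Table 15.1 can be applied directly; the sketch below computes a few of them for one example watershed. The area, perimeter, and stream-length values are illustrative placeholders, not measurements from the study area.

```python
import math


def basin_length(A):
    """Lb = 1.312 * A^0.568 (Nookaratnam)."""
    return 1.312 * A ** 0.568


def compactness_coefficient(P, A):
    """Cc = 0.2821 * P / A^0.5 (Horton)."""
    return 0.2821 * P / A ** 0.5


def elongation_ratio(Lb, A):
    """Re = (2 / Lb) * (A / pi)^0.5 (Schumm)."""
    return (2.0 / Lb) * math.sqrt(A / math.pi)


def drainage_density(Lu, A):
    """Dd = Lu / A (Horton), in km/km^2."""
    return Lu / A


def circularity_ratio(A, P):
    """Rc = 4 * pi * A / P^2 (Miller)."""
    return 4.0 * math.pi * A / P ** 2


# Example watershed measurements (km^2, km, km) -- assumed values
A, P, Lu = 120.0, 55.0, 310.0
Lb = basin_length(A)
Dd = drainage_density(Lu, A)
Rc = circularity_ratio(A, P)
```

In the study, the inputs A, P, Lu, and the stream counts would come from the GIS database extracted in Sect. 15.3.1.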
15.3.3 Fuzzy Analytical Hierarchy Process (FAHP)
Several researchers have utilized statistical models such as AHP and MCDM
techniques to prioritize watersheds based on morphometric parameters, with outputs
such as erosion driving the decision-making process, as found in recent and older
literature. To assess and prioritize the watersheds of the study area, the morphometric
outputs are framed in a matrix with nine rows and columns [13]. The nine parameters
of the matrix are basin length (Lb), circularity ratio (Rc), stream frequency (Fs),
drainage density (Dd), drainage texture (T), elongation ratio (Re), bifurcation ratio
(Rb), form factor (Rf), and compactness coefficient (Cc) [14].
The matrix parameters are fixed with numeric values, and their reciprocal values are
derived from linguistic terms as follows: very severe for a value of nine, severe for
eight, and moderate for seven in the FAHP analysis. These values then provide the
weightage of each morphometric parameter. The weightage value of each parameter
is multiplied by the corresponding morphometric parameter value, and the summation
of these outcomes provides the normalized weightage value of each watershed, from
which the prioritization ranks of the watersheds are derived [15]. The FAHP analysis
requires ensuring consistency in the matrix simulations using Saaty’s proposed
consistency index (CI) and consistency ratio (CR). A consistency ratio (CR) of less
than 10% indicates that the decision is consistent [16]. The consistency ratio and
index formulas are listed below.
CR = (CI / RI) × 100

Consistency Index (CI) = (λmax − n) / (n − 1)
where n is the number of factors used in the matrix; for the present study area, a
9 × 9 matrix was adopted for the watershed prioritization.
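Saaty's consistency check can be computed as follows: λmax is the principal eigenvalue of the pairwise comparison matrix, CI = (λmax − n)/(n − 1), and CR = (CI/RI) × 100 as in the text. The 3 × 3 example matrix is illustrative; the study uses a 9 × 9 matrix. The RI values are Saaty's standard random-index table.

```python
import numpy as np

# Saaty's random index (RI) for matrix sizes 1..9
RI = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
      6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}


def consistency_ratio(M):
    """Return (CI, CR%) for a pairwise comparison matrix M (n >= 3)."""
    n = M.shape[0]
    # lambda_max: the principal (largest real) eigenvalue of M
    lam_max = max(np.linalg.eigvals(M).real)
    ci = (lam_max - n) / (n - 1)
    return ci, (ci / RI[n]) * 100.0


# A perfectly consistent pairwise matrix (every a_ik == a_ij * a_jk),
# built from weights (4, 2, 1); its CI and CR should be ~0
M = np.array([[1.0, 2.0, 4.0],
              [0.5, 1.0, 2.0],
              [0.25, 0.5, 1.0]])
ci, cr = consistency_ratio(M)
```

A CR below 10% would confirm that the judgments encoded in the matrix are acceptably consistent.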
15.4 Results and Discussion
15.4.1 Morphometric Analysis
Each watershed characteristic is calculated using Table 15.1 formulas, and the results
are represented in Table 15.2.
The minimum and maximum bifurcation ratios are found in the 4E3C5e and 4E3C5a watersheds, respectively. The bifurcation ratio measures flooding proneness in the watershed: the higher the bifurcation ratio, the greater the flooding probability. Bifurcation ratios typically range between 2 and 5, but the present study area has values greater than five (Table 15.2), which means severe floods have occurred in the respective watersheds. The drainage density expresses the potential of a hydrological parameter such as runoff in the watershed or basin. Presently, the highest drainage density is found in 4E3C4d at 3.51 km/km2 and the lowest in 4E3C4i at 2.1 km/km2 (Table 15.2). The elongation ratio lies between 0 and 1; a value near 1 indicates that the basin has less control from geomorphological parameters. The relief, infiltration capacity, and permeability of the watershed or basin depend upon the stream frequency. The highest stream frequency, 7.2, was found in the 4E3C4c watershed and the lowest, 3.6, in 4E3C4i (Table 15.2). The runoff degree of
dissection is directly related to stream frequency and in direct proportion to the mean annual rainfall of the study area [18].
154 K. Anil et al.
Table 15.2 Characteristics of morphometric parameters

Code     Lb     Cc    Dd    Fs     Rc    Rf    Re    Rb    T
4E3C4a   41.65  2.2   2.82  4.6    0.2   0.25  0.56  5.35  12.4
4E3C4b   24     1.4   2.8   5.1    0.4   0.29  0.6   6.4   13.1
4E3C4c   25.11  1.7   3.4   7.2    0.33  0.28  0.6   4.73  15.8
4E3C4d   12.88  1.2   3.51  6.036  0.6   0.33  0.65  4.47  10.2
4E3C4e   16.48  2     2.9   4.215  0.2   0.31  0.63  5.99  5.39
4E3C4f   10.64  1.5   3.06  5.593  0.41  0.35  0.67  10.7  6.43
4E3C4g   8.54   1.5   3.07  5.468  0.4   0.37  0.68  6.5   5.09
4E3C4h   10.8   1.3   3.4   6.086  0.5   0.35  0.66  6.55  8.33
4E3C4i   17.98  1.8   2.1   3.627  0.28  0.31  0.62  6.04  5.43
4E3C4j   9.6    1.2   2.43  3.693  0.6   0.36  0.67  6.66  4.7
4E3C4k   25.96  1.3   3.14  5.098  0.57  0.28  0.6   8     15.1
4E3C5a   44.46  2.4   2.82  5.251  0.16  0.25  0.56  11.9  13.4
4E3C5b   19.24  1.5   2.4   3.7    0.43  0.3   0.62  4.38  7.3
4E3C5c   22.43  1.8   2.9   4.6    0.3   0.29  0.61  4.8   8.7
4E3C5d   9.78   1.7   3.25  5      0.31  0.35  0.67  5.36  4.6
4E3C5e   26.24  1.2   3.04  5      0.6   0.2   0.6   2     15.5
4E3C5f   15.81  1.4   2.94  4.7    0.46  0.32  0.63  4.2   8.1
4E3C5g   28.65  1.4   2.6   4.9    0.49  0.2   0.59  4.19  14.7

Very high runoff in the watersheds leads to
severe erosion problems, which can hamper reservoir deposition. The texture ratio is related to the infiltration rate: a lower texture value means high infiltration and less susceptibility to erosion, whereas a higher texture value indicates poorer infiltration and increased exposure to erosion. In the study area, the lowest texture value, 4.6, is found in the 4E3C5d watershed and the highest, 15.8, in the 4E3C4c watershed (Table 15.2) [19].
15.4.2 Watershed Prioritization Utilizing the Fuzzy
Analytical Hierarchy Process (FAHP) Technique
The pairwise comparison matrix and the weightage values computed for the morphometric parameters basin length (Lb), circularity ratio (Rc), drainage density (Dd), stream frequency (Fs), elongation ratio (Re), form factor (Rf), drainage texture (T), bifurcation ratio (Rb), and compactness constant (Cc) are represented in Table 15.3. Field experience and experts' suggestions help to fill the matrix between two pairs, which determines the degree of importance in
the FAHP method [20]. Initially, the method needs to set the linguistic terms with values such as very severe (9), severe (8), and moderate (7), and the reciprocals of the same importance, to calculate each morphometric parameter and procure the relative weights shown in Table 15.3.
15 Prioritization of Watersheds Using GIS and FAHP... 155
Table 15.3 Pairwise comparison matrix values and weightage

Parameters  Lb      Cc      Dd      Fs      Rc      Rf      Re      Rb     T
Weights     0.1012  0.1157  0.0613  0.1765  0.1269  0.1561  0.0393  0.142  0.0804
The relative weights are multiplied with the morphometric parameters to provide the normalized values. The normalized values of each parameter are summed to obtain the final value that decides the watershed's priority. The priority values are not unique for any study area; different researchers or users may obtain other values for priority ranking. However, for the final normalized values obtained in the study area, values between 0 and 4 were taken as moderate priority, values between 4 and 5 as severe, and values above 5 as very severe erodability chances in the watersheds.
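The weighting-and-summation step can be illustrated as follows. The weights are those of Table 15.3, while the single watershed's normalized parameter values are invented for illustration, not taken from the study's data.

```python
# Sketch of the FAHP prioritization step: each watershed's (normalized)
# morphometric values are multiplied by the FAHP weights (Table 15.3)
# and summed; the total decides the erodability class. The watershed
# values below are hypothetical.

WEIGHTS = {"Lb": 0.1012, "Cc": 0.1157, "Dd": 0.0613, "Fs": 0.1765,
           "Rc": 0.1269, "Rf": 0.1561, "Re": 0.0393, "Rb": 0.1420,
           "T": 0.0804}

def priority_class(values):
    score = sum(WEIGHTS[p] * values[p] for p in WEIGHTS)
    if score > 5:
        return score, "very severe"
    if score > 4:
        return score, "severe"
    return score, "moderate"

# Hypothetical normalized parameter values for one watershed
watershed = {"Lb": 6.0, "Cc": 4.5, "Dd": 5.0, "Fs": 7.0,
             "Rc": 3.0, "Rf": 4.0, "Re": 2.5, "Rb": 6.5, "T": 5.5}
score, label = priority_class(watershed)
print(round(score, 4), label)
```

The thresholds (0 to 4 moderate, 4 to 5 severe, above 5 very severe) follow the classification stated above.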
The watersheds with very severe erodability chances are 4E3C4a, 4E3C4b, 4E3C4c, 4E3C4k, 4E3C5a, 4E3C5c, 4E3C5e, and 4E3C5g; the severe watersheds are 4E3C4d, 4E3C4e, 4E3C4f, 4E3C4h, 4E3C4i, 4E3C5b, and 4E3C5f; and the moderate watersheds are 4E3C4g, 4E3C4j, and 4E3C5d. The FAHP simulations were validated with the soil map and taxonomy; from the results, 85–87% of the erodability nature of the watersheds matched accurately.
The various maps of the study area are represented in Fig. 15.3.
Fig. 15.3 Description of various study area maps
15.5 Conclusion
GIS tools are very effective in extracting information about physical objects at exact geospatial locations from various sources, which provides valuable input to the morphometric equations. The morphometric outcomes make the hydrological behavior of the watersheds over time easy to understand spatially for experts, analysts, or working professionals in the field. A statistical model like FAHP requires expertise and an understanding of each parameter to frame the comparison matrix, which is its only disadvantage; the remaining computations in the model can be carried out easily by any person. However, integrating the simulations with the FAHP method to resolve fuzziness or vagueness when deciding on a particular theme of erodability and watershed prioritization is not an easy task for planners. The present study showed that the morphometric and FAHP methods are prominent tools to prioritize watersheds when erodability is the decision-making parameter. Finally, the spatial distribution map of prioritization, which is the outcome of FAHP, is helpful information for watershed planners, enabling them to work on a priority basis across very severe, severe, and moderate erodability watersheds. These maps reduce valuable time and cost for the planners. Hence, the methods are recommended and encouraged for researchers and planners in watershed management planning activities to achieve better results effectively.
References
1. Ahmed, R., Sajjad, H., & Husain, I. (2018). Morphometric parameters-based prioritization
of sub-watersheds using fuzzy analytical hierarchy process: A case study of lower Barpani
watershed, India. Natural Resources Research, 27(1), 67–75. https://doi.org/10.1007/s11053-
017-9337-4
2. Bali, R., Agarwal, K. K., Nawaz Ali, S., Rastogi, S. K., & Krishna, K. (2012). Drainage
morphometry of Himalayan Glacio-fluvial basin, India: Hydrologic and neotectonic implica-
tions. Environmental Earth Sciences, 66(4), 1163–1174. https://doi.org/10.1007/s12665-011-
1324-1
3. Banerjee, A., Singh, P., & Pratap, K. (2017). Morphometric evaluation of Swarnrekha water-
shed, Madhya Pradesh, India: An integrated GIS-based approach. Applied Water Science, 7 (4),
1807–1815. https://doi.org/10.1007/s13201-015-0354-3
4. Bhattacharya, R. K., Chatterjee, N. D., & Das, K. (2020). Sub-basin prioritization for assess-
ment of soil erosion susceptibility in Kangsabati, a plateau basin: A comparison between
MCDM and SWAT models. Science of the Total Environment, 734, 139474. https://doi.org/10.
1016/j.scitotenv.2020.139474
5. Hembram, T. K., & Saha, S. (2020). Prioritization of sub-watersheds for soil erosion based on
morphometric attributes using fuzzy AHP and compound factor in Jainti Riverbasin, Jharkhand,
Eastern India. Environment, Development, and Sustainability, 22(2), 1241–1268. https://doi.
org/10.1007/s10668-018-0247-3
6. Horton, R.E. (1945) Erosional development of streams and their drainage basins: Hydro-
physical approach to quantitative morphology. GSA Bulletin, 56:275–370. https://doi.org/10.
1130/0016-7606(1945)56[275:EDOSAT]2.0.CO;2
7. Jhariya, D. C., Kumar, T., & Pandey, H. K. (2020). Watershed prioritization based on soil and
water hazard model using remote sensing, geographical information system, and multi-criteria
decision analysis approach. Geocarto International, 35(2), 188–208. https://doi.org/10.1080/
10106049.2018.1510039
8. Magesh, N. S., Jitheshlal, K. V., Chandrasekar, N., & Jini, K. V. (2013). Geographical informa-
tion system-based morphometric analysis of Bharathapuzha river basin, Kerala, India. Applied
Water Science, 3(2), 467–477. https://doi.org/10.1007/s13201-013-0095-0
9. Meshram, S. G., Alvandi, E., Singh, V. P., & Meshram, C. (2019). Comparison of AHP and
fuzzy AHP models for prioritization of watersheds. Soft Computing, 23(24), 13615–13625.
https://doi.org/10.1007/s00500-019-03900-z
10. Miller, V.C. (1953) A quantitative geomorphologic study of drainage basin characteristics
in the clinch mountain area, Virginia and Tennessee. Columbia University, Department of
Geology, Technical Report, No. 3, Contract N6 ONR. 271–300
11. Nooka Ratnam, N., Srivastava, Y. K., Rao, V. V., Amminedu, E., & Murthy, K. S. R. (2005). Check dam positioning by prioritization of micro-watersheds using SYI model and morphometric analysis—remote sensing and GIS perspective. Journal of the Indian Society of Remote Sensing, 33(1), 25–38. https://doi.org/10.1007/BF02989988
12. Parupalli, S., Padma Kumari, K., & Ganapuram, S. (2019). Assessment and planning for inte-
grated river basin management using remote sensing, SWAT model, and morphometric analysis
(case study: Kaddam river basin, India). Geocarto International, 34(12), 1332–1362. https://
doi.org/10.1080/10106049.2018.1489420
13. Rahaman, S. A., Ajeez, S. A., Aruchamy, S., & Jegankumar, R. (2015). Prioritization of sub
watershed based on morphometric characteristics using fuzzy analytical hierarchy process
and geographical information system—A study of Kallar Watershed, Tamil Nadu. Aquatic
Procedia, 4(Icwrcoe), 1322–1330. https://doi.org/10.1016/j.aqpro.2015.02.172
14. Rai, P. K., Mohan, K., Mishra, S., Ahmad, A., & Mishra, V. N. (2017). A GIS-based approach
in drainage morphometric analysis of Kanhar River Basin, India. Applied Water Science, 7 (1),
217–232. https://doi.org/10.1007/s13201-014-0238-y
15. Sangma, F., & Guru, B. (2020). Watersheds characteristics and prioritization using morpho-
metric parameters and fuzzy analytical hierarchal process (FAHP): A part of Lower Subansiri
Sub-Basin. Journal of the Indian Society of Remote Sensing, 48(3), 473–496. https://doi.org/
10.1007/s12524-019-01091-6
16. Sarma, S., & Saikia, T. (2012). Prioritization of Sub-watersheds in Khanapara-Bornihat Area
of Assam-Meghalaya (India) based on land use and slope analysis using remote sensing and
GIS. Journal of the Indian Society of Remote Sensing, 40(3), 435–446. https://doi.org/10.1007/
s12524-011-0163-6
17. Schumm, S.A. (1956) Evolution of drainage systems and slopes in badlands at perth amboy,
New Jersey. GSA Bulletin 67(5):597–646. https://doi.org/10.1130/0016-7606(1956)67[597:
EODSAS]2.0.CO;2
18. Singh, P., Thakur, J. K., & Singh, U. C. (2013). Morphometric analysis of Morar River
Basin, Madhya Pradesh, India, using remote sensing and GIS techniques. Environmental Earth
Sciences, 68(7), 1967–1977. https://doi.org/10.1007/s12665-012-1884-8
19. Sridhar, P., & Ganapuram, S. (2021). Morphometric analysis using fuzzy analytical hierarchy
process (FAHP) and geographic information systems (GIS) for the prioritization of watersheds.
Arabian Journal of Geosciences, 14(4), 236. https://doi.org/10.1007/s12517-021-06539-z
20. Thomas, J., Joseph, S., Thrivikramji, K. P., Abe, G., & Kannan, N. (2012). Morphometrical anal-
ysis of two tropical mountain river basins of contrasting environmental settings, the southern
Western Ghats, India. Environmental Earth Sciences, 66(8), 2353–2366. https://doi.org/10.
1007/s12665-011-1457-2
Chapter 16
A Narrative Framework with Ensemble
Learning for Face Emotion Recognition
S. Naveen Kumar Polisetty, T. Sivaprakasam, and S. Indraneel
Abstract Face recognition plays an essential role in many different sectors. For the past four decades, face recognition (FR) features have been extracted for security, entertainment, employment, and so on. The most auspicious part of research across the world is its focus on living beings and nature. Our ancient myths state that the mind is the origin of all thoughts and that it can be expressed through the eyes. Artificial intelligence plays a vital role in a living being's life, for instance in education, research, medicine, and the forecasting of many kinds of natural disasters. In this paper, we focus on the deep relation between mind and matter and on how the mind is emotionally connected with the eyes. We extract the features of the mind and its correlated activities, because the mind acts as the creator of any event and the face is the output of the mind. Through these features, we plan to support security and employment areas, such as the candidate hiring process in various industry sectors. We propose ensemble learning techniques to model mind functions and how they are imposed on eye vision.
16.1 Introduction
The mind is directly proportional to the body, which is its outward manifestation, so the body simply acts as an actuator of the mind. In particular, the face is the ultimate platform of the mind and states its different actions. The body always follows the mind: if the mind thinks of moving upward, the physical body automatically prepares itself and reflects the external signals as a sign.
S. Naveen Kumar Polisetty (B)
Research Scholar, Department of Computer Science and Engineering, Annamalai University,
Chidambaram, Tamil Nadu, India
e-mail: naveenmtech28@gmail.com
T. Sivaprakasam
Department of Computer Science and Engineering, Annamalai University, Chidambaram, Tamil
Nadu, India
S. Indraneel
St. Anns College of Engineering and Technology, Chirala, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
B. N. K. Rao et al. (eds.), Intelligent Computing and Applications, Smart Innovation,
Systems and Technologies 315, https://doi.org/10.1007/978-981-19-4162-7_16
160 S. Naveen Kumar Polisetty et al.
Facial Recognition System
A facial recognition system is one of the vital techniques used to check the identity of a subject from input sources such as video or image frames. A biomedical-enabled artificial intelligence system incorporates multiple image processing techniques with optimum result criteria.
Emotional Intelligence
Emotion is a distinguishing characteristic of human beings and exposes their inner thoughts. It may be discussed in terms of four different parameters: self-management, awareness, social impacts, and relativities. This is one of the arts of study that elucidates managing skills against relativities and other long-lasting objects. In fact, emotion recognition plays a pivotal role in the experience of empathy [1].
Cat and Rat Theory
Considering the figures below, Fig. 16.1 shows how the rat is affected by the cat: the rat feels that it is not in a safe zone. Figure 16.2 shows a kitten in the cat's mouth, yet the kitten feels that it is in a comfortable place. Even though the two animals are in the same place, their feelings are different. Here, mind analysis is calculated between the animals. Our myths speak about the mind in different aspects, and the mind is the origin of all kinds of matter like God, planets, evilness and so on. Of our five major organs, the eye, nose, ear, mouth, and body, the eye is the primary agent of the mind.
Based on this analogy, we deeply analyzed the mind and eye relativity concerns. In the past three decades, artificial intelligence methodologies have had a very big impact and rule the world with different machine learning techniques.
Fig. 16.1 Cat catch rat
16 A Narrative Framework with Ensemble Learning for Face Emotion 161
Fig. 16.2 Cat catch kitten
16.2 Related Work
Nurulhuda Ismail et al. [1] discussed a few different face recognition algorithms and reviewed their results. Principal component analysis (PCA) is a method for simplifying the problem by reducing the dimension of the representation space while retaining the original information of the data. Linear discriminant analysis (LDA) is a useful algorithm for feature extraction in face images; this approach requires several training images for each face. The skin color-based algorithm is used for extracting features of human faces: skin pixels are the parameter for distinguishing whether a pixel is skin-colored or not, and the model is constructed with a Gaussian probability density. In the wavelet-based algorithm, each face image is described by a subset of band-filtered images containing wavelet coefficients. The artificial neural network-based algorithm is commonly used at all industry levels and in commercial applications. It is applied once a face has been detected, to identify and recognize who the person is by calculating the weights of the facial information. The ANN is considered the foundation of AI and refers to the pieces of the processing system that computers use to perform human-like, intelligent problem solving.
Alvappillai and Barrina [2] implemented a facial recognition system using a global approach to feature extraction based on the histogram of oriented gradients. They extracted feature vectors from different facial images in the AT&T and Yale data repositories. They used a support vector machine (SVM) for learning the image set. The SVM supports regression and classification and comes from the supervised learning sector. An SVM is mostly used for finding a hyperplane in the feature space of the data points. Hyperplanes are decision boundaries that aid in data classification: different classes can be assigned to data points on either side of the hyperplane. The hyperplane's dimension is determined by the number of features. Support vectors are the data points that are closest to the hyperplane and influence the hyperplane's position and orientation. The classifier's margin is increased by using these support vectors; the hyperplane's position will be altered if the support vectors are deleted.
Bah and Ming [3] demonstrated an attendance monitoring system for a real-life scenario by implementing an algorithm with local binary pattern (LBP) inputs, incorporating contrast adjustment and bilateral filtering into the face recognition system. Their experiment consists of two interconnected parts. First, an LBP cascade classifier was implemented, with images captured by digital sources such as a camera or JPEG pictures; a Haar cascade classifier algorithm was implemented for face detection accuracy and for minimizing the number of false positives and false negatives. In the second phase, they applied an image unification technique on their trained image datasets (Fig. 16.3).
They concluded that using LBP with contrast adjustment, bilateral filtering, and histogram equalization is evolving into a sophisticated imaging technique and is functional for the training image processing technique. The K2 sectors have two subsets; the LBP code is calculated for every pixel in a region of the input face image by comparing the center with the surrounding pixels.
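The LBP computation described here can be sketched for a single 3 × 3 patch; the pixel values and the clockwise neighbor ordering below are illustrative choices.

```python
# Sketch of the LBP step: each of the 8 neighbors is compared with the
# center pixel, and the resulting bits form one 8-bit code per pixel.
def lbp_code(patch):
    """patch: 3x3 list of gray values; returns the 8-bit LBP code."""
    center = patch[1][1]
    # clockwise neighbor order starting at the top-left corner
    coords = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for bit, (r, c) in enumerate(coords):
        if patch[r][c] >= center:      # neighbor at least as bright: bit = 1
            code |= 1 << bit
    return code

patch = [[90, 120, 60],
         [200, 100, 40],
         [150, 110, 130]]
print(lbp_code(patch))                 # 242: bits 1, 4, 5, 6, 7 are set
```

A histogram of these codes over a face region is what the LBP cascade classifier actually consumes.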
Narmatha C. and Manimegalai P. implemented modified steganography for images (MSI) with encoding and decoding techniques for image protection during transmission. The MSI technique emphasizes stegano-image creation using an encode process and reconstruction of the secret data in a decode process.
The 'stegano' is the cover image, and the hidden content cannot be identified from the 'stegano' image Σ_{i,j=0}^{N,n} ST(i, j), as shown in Fig. 16.4. A major feature of the research is the shuffling of the cover image. They tested with 1000 gray-scale images with cover and secret parameters: 512 × 512 pixels were fixed for cover images and 256 × 256 pixels for secret images, which were taken as input and applied to MSI processing. Processing all the data takes only 1–1.865 s, and they improved complexity strength with testing software [4].
Fig. 16.4 Cover and encoded images
16.3 Proposed Work
This architecture depicts the framework of face recognition: we detect the face with different trained parameters and analyze the eye impression as input to the machine. We already have the mind parameters of the particular person and synchronize them with ensemble learning. The human brain has 10 billion neurons and can be incorporated with the different duties of a human. Based on the eye impression, the decision support or expert system produces the output as the expected hypothesis. If the hypotheses do not nearly meet the requirement, we apply the booster function in the ensemble learning part and then train the network for a stipulated time.
Ensemble Learning
Ensemble learning is one of the supervised learning methods: instead of learning from a single hypothesis, it learns multiple hypotheses and combines their predictions. When combining multiple independent and diverse decisions, each of which is at least more accurate than random guessing, random errors cancel each other out and correct decisions are reinforced. Regarding the hypothesis space, it is important to note that the possibility or impossibility of finding a simple, consistent hypothesis depends strongly on the hypothesis space chosen. It is always good to prefer the simplest hypothesis which is consistent with the data, but consistency is not always achievable.
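A minimal sketch of this idea combines the predictions of several hypotheses by majority vote; the three rule-based "classifiers" and the toy eye-feature thresholds are invented assumptions, not the chapter's trained models.

```python
# Minimal ensemble-learning sketch: several weak hypotheses each predict a
# label, and independent errors tend to cancel under majority voting.
from collections import Counter

def h1(x): return "happy" if x["eye_brow"] < 0.5 else "angry"
def h2(x): return "happy" if x["cheek"] > 0.3 else "angry"
def h3(x): return "happy" if x["eye_ball"] < 0.4 else "angry"

def ensemble_predict(x, hypotheses=(h1, h2, h3)):
    votes = Counter(h(x) for h in hypotheses)
    return votes.most_common(1)[0][0]   # label with the most votes wins

sample = {"eye_brow": 0.3, "cheek": 0.2, "eye_ball": 0.1}
print(ensemble_predict(sample))         # two of three hypotheses vote "happy"
```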
Figure 16.5 depicts the detailed processing of our learning model; we train the face images for the expected outcome, which comes from his/her mind. We have taken as input parameters the eyebrow, eyeball rotation, skin color, and the intensity of the eyes and cheek. From the combination of all our trained hypotheses, the model combiner is generated. If it meets our expectation, we conclude with the statement; otherwise, we use the AdaBoost algorithm to improve the efficiency.
Fig. 16.5 Ensemble learning with mind parameters
16.4 Results and Discussion
In this scenario, three classifiers have been analyzed: decision tree, nearest neighbor, and logistic regression. The data for each classifier are divided into two parts: one consists of the independent variables of the training data, and the other incorporates the target variable for the training data.
Boost Algorithm
1. For each learner, start with uniform weights
2. Fit and train the learner with the current weights
3. All data must be trained
4. Identify the errors and update the weights
5. Combine into the ensemble value; end for
Boost Features:
1. It is a sequential algorithm
2. It concentrates on misclassified data
3. It is a very popular algorithm
4. It is used to improve accuracy over the prior tree
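The boosting steps above can be sketched as a minimal AdaBoost with decision stumps; the one-dimensional toy data and thresholds are illustrative assumptions, not the chapter's image features.

```python
import math

# Toy 1-D dataset: feature value -> label in {-1, +1}
X = [0.1, 0.2, 0.3, 0.6, 0.7, 0.8, 0.45, 0.55]
y = [-1, -1, -1, +1, +1, +1, +1, -1]

def stump_predict(thresh, sign, x):
    """A one-split weak learner: predict `sign` when x > thresh."""
    return sign if x > thresh else -sign

def best_stump(weights):
    """Pick the stump with the lowest weighted error (step 2)."""
    best = None
    for thresh in X:
        for sign in (+1, -1):
            err = sum(w for xi, yi, w in zip(X, y, weights)
                      if stump_predict(thresh, sign, xi) != yi)
            if best is None or err < best[0]:
                best = (err, thresh, sign)
    return best

def adaboost(rounds=5):
    n = len(X)
    w = [1.0 / n] * n                        # step 1: uniform weights
    learners = []
    for _ in range(rounds):
        err, thresh, sign = best_stump(w)
        err = max(err, 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)
        # step 4: boost the weights of misclassified points
        w = [wi * math.exp(-alpha * yi * stump_predict(thresh, sign, xi))
             for wi, xi, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
        learners.append((alpha, thresh, sign))
    return learners

def predict(learners, x):                    # step 5: weighted ensemble vote
    score = sum(a * stump_predict(t, s, x) for a, t, s in learners)
    return 1 if score > 0 else -1

model = adaboost()
print([predict(model, xi) for xi in X])
```

No single stump separates this data, but a few boosting rounds combine stumps into a classifier that fits all the training labels.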
We used the Keras tool with Python to simulate our research work. In this part, a tree has been constructed with all image parameters. X is the actual image accelerations, and Y is our expectation. A collection of X is also considered as a hypothesis; n such hypotheses are grouped together to confine the statement. If X does not support it, we can boost the algorithm for X and get a value nearer to Y.
Figure 16.6 is trained by our network to predict the optimum result. In psychology, decision making is the major cognitive process, drawing on past memories and real expectations or depending on the current scenario. Our aim is to analyze the mind values through face recognition, so we train 1200 different images with different actions, which are stored in the database for future reference. Input images may be given in a recorded format like JPEG or video, or through a live video stream to the framework; then feature extraction is enabled on the given input. For instance, when black-and-white images are given, skin color features are difficult to identify, so the image is split into 512 × 512 pixels to analyze the features.
How it works:
Figure 16.7 shows the reaction of deep looking or staring at something, but this picture is not in the preferred pixel size. We have to implement the boosting algorithm to achieve the result; for this image, the parameters color, eyeball view, and eyeball position have low pixel values. Hence, our trained model suggests that 17% brightness and 21% contrast be added to the image to confine the result.
Fig. 16.6 Face—different actions on eye view (panels: peace, angry, expected, comfort, question to others, not happy, not able to answer, stare)
Fig. 16.7 Stare image
Our narrative framework defines how the mind flows from the eye; therefore, we identified the parameters eyeball, eyebrow, cheek, color, and intensity of the eyes, which are called the instances of training data. Each classifier is able to identify or recognize the test instances from its trained set.
Strictly following the feature extraction, there are n similarities in the set and no non-similarities in the given set. In the next phase, all the n similarity probabilities
Table 16.1 Parameterized analysis

Trained parameters  Eye ball rotation  Eye brow  Cheek  Intensity of eyes  Skin color
Image 1             0.0                0.05      0.14   0.32               0.21
Image 10            0.15               0.32      0.21   0.42               0.25
Image 50            0.22               0.12      0.11   0.31               0.12
Image 80            0.24               0.14      0.16   0.32               0.12
are combined in the region index for further inference:

y = arg max_{c_j ∈ C} Σ_{h_i ∈ H} P(c_j | h_i) P(T | h_i) P(h_i)
Table 16.1 depicts the images for decision making from ensemble learning. Initially, we analyzed one image with different properties and ten images of two persons, then proceeded with ten and then sixteen persons within our narrative framework. Eventually, we evaluate with Y as the expected value, where C is the set of possible classes, H is the hypothesis set, P identifies the probability, and T is the training data.
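A toy illustration of this combiner follows: each hypothesis h carries a prior P(h), a data likelihood P(T|h), and class probabilities P(c|h), and the predicted class maximizes the weighted sum. All the probability values below are invented for the sketch.

```python
# Toy sketch of the Bayes-weighted combiner: score each class c by
# sum over hypotheses h of P(c|h) * P(T|h) * P(h), then take the argmax.
hypotheses = {
    "h1": {"prior": 0.5, "likelihood": 0.6, "P_class": {"calm": 0.7, "angry": 0.3}},
    "h2": {"prior": 0.3, "likelihood": 0.8, "P_class": {"calm": 0.2, "angry": 0.8}},
    "h3": {"prior": 0.2, "likelihood": 0.5, "P_class": {"calm": 0.4, "angry": 0.6}},
}
classes = ["calm", "angry"]

def combine(hyps, classes):
    scores = {c: sum(h["P_class"][c] * h["likelihood"] * h["prior"]
                     for h in hyps.values())
              for c in classes}
    return max(scores, key=scores.get)

print(combine(hypotheses, classes))   # "angry": 0.342 beats "calm": 0.298
```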
16.5 Conclusion
We conclude that this research can be used for employability selection processes, security surveillance, smart home applications, etc. For instance, through face detection we may support decision making for face-to-face interviews. Even though many researchers have contributed effective ideas about face recognition, we contribute the relation of the mind with the eye in different aspects. Decision making is the toughest task when analyzing a single hypothesis from face recognition, because the human mind varies in every aspect; hence we narrate this framework with more than one parameter for the face recognition system. We achieved 94% accuracy on the given trained set, and it will be very useful for security-related applications and other humanoid applications.
References
1. Ismail, N., & Sabri, M. I. (2018). Review of existing algorithms for face detection and
recognition. Recent Advances in Computational Intelligence, Man-Machine Systems and
Cybernetics.
2. Alvappillai, A., & Barrina, P. N. Face recognition using machine learning. UCSD.
3. Bah, S. M., & Ming, F. (2020). An improved face recognition algorithm and its application in
attendance management system. Elsevier, 2590-0056.
4. Li, Y., Face Recognition System Thesis.
Chapter 17
Modified Cloud-Based Malware
Identification Technique Using Machine
Learning Approach
Gavini Sreelatha, Aishwarya Govindkar, and Sarukolla Ushaswini
Abstract There has been a drastic improvement in wireless signals, which is associated with the IoT environment and has resulted in the development of mobile devices. As the impact of mobile devices grows, threat developers are also active in spreading malware day by day to weaken users' data privacy and integrity, which are essential needs for mobile users. So, there is a need for an effective framework to identify the malware that exists in smartphone Android applications and to analyze the devices dynamically from time to time. We aim to develop a machine learning-based web framework that is able to identify malware on mobile devices. For this framework to be effective for real-time applications, it needs salient features, gained through feature selection. For the analysis, we take many samples of various Android applications and evaluate the framework with parameters like F-measure and accuracy. For feature selection, we consider chi-square, gain ratio, information gain, logistic regression analysis, One R, and PCA.
17.1 Introduction
In recent days, the wireless network has grown rapidly due to the increase in the usage of smartphones, which have become a daily element of human life. Humans are physically and mentally dependent on mobile phones, and various applications are needed to process different activities such as bank transactions, chat
G. Sreelatha (B)·A. Govindkar ·S. Ushaswini
Department of Information Technology, Stanley College of Engineering and Technology for
Women, Hyderabad, India
e-mail: sreelathaprince13@gmail.com
A. Govindkar
e-mail: aishwaryagovindkar@gmail.com
S. Ushaswini
e-mail: sushaswini@gmail.com
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
B. N. K. Rao et al. (eds.), Intelligent Computing and Applications, Smart Innovation,
Systems and Technologies 315, https://doi.org/10.1007/978-981-19-4162-7_17
170 G. Sreelatha et al.
interaction, and social networking. To utilize these mobile applications, vast Internet access and huge data storage are needed with the mobile phones, where the cloud plays a significant role. With the installation of various applications, malware detection has a major role, as malware extracts useful information, and this happens day to day [1]. Statistical surveys forecast that a few thousand new threats penetrate via mobile applications. So, a significant challenge faced by the mobile user is that new intruder threats are a major problem.
Based on the mobile applications, threat detection and identification play a significant role and can be processed based on feature selection. Feature selection can be performed by training the datasets through learning techniques. There are two broad classes of feature selection: feature ranking and the subset selection approach. Regarding feature ranking:
Gain Ratio: It is determined from the essential information of a decision learning tree. It helps to reduce the predisposition toward multi-valued attributes, and attribute choice is based on the size and count of branches [2].
Chi-Squared Test: Considering a classification problem, categorical data are taken as the input variables, and the output variables are generated based on this statistical test. The test determines whether the output is dependent on or independent of the input attributes.
Information Gain: It measures the reduction in entropy obtained when the data are split on a given attribute, subject to certain limitations. It determines how consistently an attribute separates the data and thus identifies the attributes most related to the problem [3].
One R: It follows the classification of data, obtaining data predictions using a one-rule method. One rule is applied per attribute to determine the total error rate, and prediction is performed based on the created rule. For constructing the prediction, a frequency table is built against the data target [4].
Principal Component Analysis (PCA): It is a statistical method that converts a number of correlated variables into a number of uncorrelated variables using a transform method. It is mainly used for data analysis and various prediction models.
Logistic Regression Analysis: It refers to logical regression; the inputs are the data weights joined with their coefficient values. These input values are used to predict the output value, which is obtained as a binary value of 0 or 1 [5].
For the data subset features:
Feature Selection Correlation: It is based on a statistical test analyzed using various machine learning techniques, and it is a feature selection method that makes execution faster [6]. This process helps in eliminating unwanted and unused data in order to improve performance with the machine learning algorithm.
17 Modified Cloud-Based Malware Identification Technique 171
Rough Set Analysis: Here, analysis is performed on data uncertainty through rough datasets. The analysis helps to calculate the object attributes and to construct both the upper and lower approximation sets. For real applications, the size and complexity of data vary while performing data analysis and managing execution; the method [7] reduces the size of the data while maintaining data consistency.
Consistency Subset Evaluation Approach: A feature selection approach that performs effectively with reduced dimensionality. Data classification determines the feature selection in an optimal way, with good accuracy and reduced data size [8]. This feature selection comprises two methods: candidate subset selection and feature space search.
Filtered Subset Evaluation: A statistical filter approach that analyses the association between the input data and the output data generated at the destination [9]. During filtering, a score computed on the input data is used to choose the filter.
The above machine learning techniques can be used to identify and analyse malware.
In this paper, a modified cloud-based malware identification technique is proposed, which identifies and analyses malware applications more effectively and optimally using cloud concepts. It operates through the application process interface. In general, the proposed technique executes the machine learning algorithm on the application, supporting both Android and iOS. The technique supports supervised, unsupervised, and hybrid machine learning approaches. In the performance analysis, the proposed technique attains high accuracy in detecting malware on the large datasets considered.
The proposed cloud-based malware identification technique identifies and analyses malware applications, attaining high accuracy in malware detection on the large datasets considered. Three phases are associated with the proposed work, as mentioned below. Input is taken from various repositories, where all the malware-based applications are stored. Mobile application files are scanned with the malware scanner for detection. Features associated with application permissions and API calls are extracted; in this feature extraction, feature selection based on machine learning helps to perform any of the tests mentioned in Sect. 17.1. The malware detection model identifies the significant features, and the analysis is performed using machine learning algorithms. To authenticate malware, real applications are analysed on various parameters such as accuracy and F-measure, based on P value and t value. Based on the datasets, deployment, detection rate, and data availability are evaluated (Fig. 17.1).
172 G. Sreelatha et al.
Fig. 17.1 Cloud malware detection technique
17.2 Literature Survey
Here, existing malware detection techniques are discussed and various research gaps are highlighted; these gaps are then addressed by the proposed work. In [10], a malware detection model associated with cloud computing and based on network packets is proposed. The packets taken as input are processed with data mining techniques to reduce the packet knowledge, which helps to validate whether malware is detected or not. While data mining analyses the extracted data, the learning algorithm learns from the input dataset; thus, the SMMDS-based malware detection model follows machine learning techniques. Then, [11] proposes an approach to detect malware-based applications during the installation period; the approach carries out a set of instructions in mobile applications. The framework extracts features during installation of the mobile application and executes them. Using machine learning algorithms, the features are then classified, which helps to identify malware. The limitations of this approach are its large system resource requirements and data loading time. The work in [12] constructs a detection model for malware when anomalous behaviour occurs in mobile applications. Machine learning algorithms such as naïve Bayes and logistic regression operate on the feature information to calculate the accuracy rate of the data. The limitation of this approach is that certain parameters, such as CPU utilization, data memory, battery, and trained data, are not considered.
In [13], a malware-aware detection model based on a Gaussian mixture is proposed, performing feature selection with machine learning algorithms. Features such as CPU utilization, battery, and data memory are collected and extracted. The limitation of this model is that it requires enabling a remote (cloud) server. Until [14], mobile behaviour under malware infection, i.e. threats to mobile phones, had not been represented; that work proposes a model which manages the behaviour of the battery life when the mobile phone is infected with malware threats. Its limitation is that more effective models are still needed to detect the malware. The detection model discussed in [15] uses the cloud to make malware detection and power saving more effective and optimal. Here, the machine-learning-based detection model outperforms the cloud-based detection model on power-saving parameters; the limitation is that real-time applications are not considered with respect to the power-saving parameter. Deep-learning-based intrusion detection in cloud services for resilience management was discussed in [16]: deep learning combined with machine learning algorithms based on the Knowledge Decision Database (KDD), along with a discussion of information security while exchanging information between mobile devices. The parameters used to analyse the model are F-measure and data accuracy across various machine learning classifiers.
(i) Problem Formulation:
There has been drastic improvement in wireless signals associated with the IoT environment, resulting in the development of mobile devices. As the impact of mobile devices grows, malware developers are also actively spreading malware day by day, weakening users' data privacy and integrity, which mobile users ultimately need. Certain limitations must be considered while managing the mobile device physically, such as CPU utilization, phone battery, data memory, data accuracy based on F-measure, and other machine-learning-related tests.
(ii) Research Objectives:
The proposed cloud-based malware identification technique identifies and analyses malware applications, attaining high accuracy in malware detection on the large datasets considered. Three phases are associated with the proposed work, as mentioned below.
Input is taken from various repositories, where all the malware-based applications are stored.
Mobile application files are scanned with the malware scanner for detection.
Features associated with application permissions and API calls are extracted; in this feature extraction, feature selection based on machine learning helps to perform any of the tests mentioned in Sect. 17.1.
The malware detection model identifies the significant features, and the analysis is performed using machine learning algorithms.
To authenticate malware, real applications are analysed on various parameters such as accuracy and F-measure, based on P value and t value. Based on the datasets, deployment, detection rate, and data availability are evaluated.
17.3 Proposed Methodology
The proposed cloud-based malware identification technique identifies and analyses malware applications, attaining high accuracy in malware detection on the large datasets considered. Input is taken from various repositories, where all the malware-based applications are stored. Mobile application files are scanned with the malware scanner for detection. Features associated with application permissions and API calls are extracted; in this feature extraction, feature selection based on machine learning helps to perform any of the tests mentioned in Sect. 17.1.
Algorithm 1: Dataset Formulation Based on Machine Learning
Input: Dataset files
Output: Extracted feature data
// Collection of data files
Collect categorical datasets (.apk files)
Remove duplicate files from the datasets
Consider the normal files and perform data generalization
Train the datasets; validate the data; test the data
Data generalization outcome
// Extracting the features from the information
Collect the unique data samples
Extract the file permissions and API calls
Revoke the permission to use the resource
Execute the collected data and extract the file
If (API calls) return true else return false
Extracted feature data
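The deduplication and feature-extraction steps of Algorithm 1 can be sketched as follows (a hypothetical illustration with invented record fields, not the authors' implementation):

```python
def formulate_dataset(records):
    """Deduplicate collected .apk records (here keyed by a file hash) and
    extract the permission and API-call features from each unique sample."""
    seen, features = set(), []
    for rec in records:
        if rec["hash"] in seen:          # remove duplicate files
            continue
        seen.add(rec["hash"])
        features.append({
            "name": rec["name"],
            "permissions": sorted(rec["permissions"]),
            "uses_api_calls": bool(rec["api_calls"]),  # true if any API calls found
        })
    return features

# Hypothetical collected records (field names are assumptions)
records = [
    {"name": "a.apk", "hash": "h1", "permissions": ["SEND_SMS"], "api_calls": ["sendTextMessage"]},
    {"name": "a.apk", "hash": "h1", "permissions": ["SEND_SMS"], "api_calls": ["sendTextMessage"]},
    {"name": "b.apk", "hash": "h2", "permissions": ["INTERNET"], "api_calls": []},
]
print(formulate_dataset(records))
```

The resulting feature records would then be split into training, validation, and test sets as the algorithm describes.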
The malware detection model identifies the significant features, and the analysis is performed using machine learning algorithms. To authenticate malware, real applications are analysed on various parameters such as accuracy and F-measure, based on P value and t value. Based on the datasets, deployment, detection rate, and data availability are evaluated.
Algorithm 2: Feature Selection
Input: Extracted feature data
Output: Malware detection model
Identify the effective datasets based on feature extraction
Categorize classified and unclassified errors based on the datasets
// Feature ranking on training datasets
Apply filtering techniques to perform ranking: Mutual Information, Relief-F
Combine the two filtering techniques to rank the features
// Feature subset selection
Optimize the feature selection based on training the dataset
If (feature indices are identified)
Select the feature set based on the training set
Else
Classify the data with a machine learning algorithm
Predict and validate whether malware is present or not
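The ranking-combination step of Algorithm 2 can be sketched as follows (a hypothetical illustration: the feature names and the two filter rankings below are invented, and the fusion rule, average rank position, is one simple choice among several):

```python
def combine_rankings(rank_a, rank_b):
    """Fuse two filter rankings (e.g. Mutual Information and Relief-F) by
    summed rank position; a lower combined rank means a more important feature.
    Assumes both rankings cover the same feature set."""
    pos_a = {f: i for i, f in enumerate(rank_a)}
    pos_b = {f: i for i, f in enumerate(rank_b)}
    return sorted(pos_a, key=lambda f: pos_a[f] + pos_b[f])

# Hypothetical rankings produced by the two filters
mi_rank     = ["perm_sms", "api_net", "file_size", "perm_gps"]
relief_rank = ["api_net", "perm_sms", "perm_gps", "file_size"]
print(combine_rankings(mi_rank, relief_rank))
```

The top of the combined ranking would then feed the feature subset search before the final classifier is trained.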
Limitations:
Certain limitations of the existing work are mentioned below:
Limited datasets
Detection rate efficiency
Effective computation rate (Figs. 17.2 and 17.3).
Expected parameters:
Feature ranking by F-measure and accuracy.
Supervised machine learning techniques.
Unsupervised machine learning techniques.
Semi-supervised machine learning techniques.
Hybrid machine learning techniques.
Feature selection techniques based on significance and insignificance difference.
Data accuracy with respect to detection rate.
Malware families’ detection with respect to various machine learning techniques.
17.4 Conclusion
A malware detection approach is constructed by selecting featured datasets to identify whether an application is malware or not. The feature selection approach minimizes the featured datasets and performs better. The selected feature set is then able to identify malware, with misclassification-based errors, at better accuracy. Machine learning algorithms are then applied to provide better detection and computation rates.
The same analysis process is applied to an image file that is malware-free, and its result is below 70%, as expected. From the various file results above, the groovemonitor.exe file is blocked, while Image File1.bin and the other files pass to the second process.
Fig. 17.2 Feature extraction process
Fig. 17.3 Feature ranking and feature subset
References
1. Singh, K. U., Gupta, P. K., & Ghrera, S. P. (2015). Performance evaluation of AOMDV routing
algorithm with local repair for wireless mesh networks. CSI Trans ICT, 2(4), 253–260.
2. Novakovic, J. (2010). The impact of feature selection on the accuracy of Naïve Bayes classifier.
In 18th Telecommunications forum TELFOR (vol. 2, pp. 1113–1116).
3. Plackett, R. L. (1983). Karl Pearson and the chi-squared test. International statistical
review/revue internationale de statistique, 51(1), 59–72; Wang, W., Wang, X., Feng, D., Liu,
J., Han, Z., & Zhang, X. (2014). Exploring permission-induced risk in android applications
for malicious application detection. IEEE Transactions on Information Forensics and Security,
9(11), 1869–1882.
4. Cruz, C., Erika, A., & Ochimizu, K. (2009). Towards logistic regression models for predicting
fault-prone code across software projects. In Proceedings of the 2009 3rd International Sympo-
sium on Empirical Software Engineering and Measurement (pp. 460–463). IEEE Computer
Society
5. Hall, M. A. (1999). Correlation-based feature selection for machine learning (Doctoral disserta-
tion, The University of Waikato, Department of Computer Science); Pawlak, Z. (1982). Rough
sets. International Journal of Computer and Information Sciences, 11(5), 341–356.
6. Dash, M., & Liu, H. (2003). Consistency-based search in feature selection. Artificial Intel-
ligence, 151(1–2), 155–176; Kohavi, R., & John, G. H. (1997). Wrappers for feature subset
selection. Artificial Intelligence, 97(1–2), 273–324.
7. Arp, D., Michael, S., Malte, H., Hugo, G., Konrad, R., & Siemens, C. E. R. T. (2014). Drebin:
Effective and explainable detection of android malware in your pocket. NDSS, 14, 23–26.
8. Cui, B., Jin, H., Carullo, G., & Liu, Z. (2015). Service-oriented mobile malware detection
system based on mining strategies. Pervasive and Mobile Computing, 24, 101–116.
9. Enck, W., Ongtang, M., & McDaniel, P. (2009). On lightweight mobile phone application
certification. In Proceedings of the 16th ACM Conference on Computer and Communications
Security (pp. 235–245). ACM.
10. Narudin, F. A., Ali, F., Nor, B. A., & Abdullah, G. (2016). Evaluation of machine learning
classifiers for mobile malware detection. Soft Computing, 20(1), 343–357.
11. Wei, T.-E., Mao, C.-H., Jeng, A. B., Lee, H.-M., Wang, H.-T., & Wu, D.-J. (2012). Android
malware detection via a latent network behavior analysis. In 2012 IEEE 11th International
Conference on Trust, Security and Privacy in Computing and Communications (pp. 1251–
1258). IEEE.
12. El Attar, A., Khatoun, R., & Lemercier, M. (2014). A Gaussian mixture model for dynamic
detection of abnormal behavior in smartphone applications. In: 2014 Global Information
Infrastructure and Networking Symposium (GIIS) (pp. 1–6). IEEE.
13. Dixon, B., & Mishra, S. (2013). Power based malicious code detection techniques for smart-
phones. In 2013 12th IEEE International Conference on Trust, Security and Privacy in
Computing and Communications (pp. 142–149). IEEE.
14. Suarez-Tangil, G., Tapiador, J. E., Peris-Lopez, P., & Pastrana, S. (2015). Power-aware anomaly detection in smartphones: An analysis of on-platform versus externalized operation. Pervasive and Mobile Computing, 18, 137–151.
15. Chen, P. S., Lin, S.-C., & Sun, C.-H. (2015). Simple and effective method for detecting abnormal
internet behaviors of mobile devices. Information Sciences, 321, 193–204.
16. Chakravarthi, S. S., Kannan, R. J., Natarajan, V. A., & Gao, X. (2022). Deep learning based
intrusion detection in cloud services for resilience management. CMC-Computers, Materials &
Continua, 71(3), 5117–5133.
Chapter 18
Design and Deployment of the Road
Safety System in Vehicular Network
Based on a Distance and Speed
Thalakola Syamsundararao, Badugu Samatha, Nagarjuna Karyemsetty,
Subbarao Gogulamudi, and V. Deepak
Abstract In the age of intelligent communication technology, intelligent mobile
phones play a critical role in road accidents. Their effect on driving, resulting in
vehicle crashes over the last two decades, has become a significant risk. Speed was
identified as an important risk factor for road traffic accidents. By monitoring vehicle
speed and position, one can help prevent accidents by sending out alert messages and
limiting the consequences for unprotected road users such as pedestrians and cyclists.
To manage and control road accidents, the position and speed of the vehicle were
communicated to nearby cars, and the great circle method was used to calculate the distance between the leading vehicle and the trailing vehicle. This method is based
on the zero point of the earth’s equator and GPS. The application was tested in this
paper using mobile devices. The experiment used various smartphone modules, such
as GPS receivers, digital road maps, and communication systems. A prototype was
developed and evaluated using mobile phones in highway and city scenarios with
varying speeds and network sizes. As a result of the experiment, location and speed
accuracy were determined, and alert messages were generated when the distance
between vehicles fell below the standard or government-specified length. The inves-
tigation could be expanded further by connecting to the Internet, storing data in the
cloud, performing analytics, and involving insurance agents, relatives, and nearby
hospitals.
T. Syamsundararao
Department of CSE, Kallam Haranadhareddy Institute of Technology (A), Chowdavaram, Guntur,
India
e-mail: syamsundar@khitguntur.ac.in
B. Samatha
Department of CSE, Vignan’s Foundation for Science, Technology, and Research, Guntur, India
e-mail: drbs_cse@vignan.ac.in
N. Karyemsetty (B)·S. Gogulamudi ·V. Deepak
Department of CSE, Koneru Lakshmaiah Education Foundation, Vaddeswaram, Andhra Pradesh,
India
e-mail: nagarjunak@kluniversity.in
V. Deepak
e-mail: v.d@live.in
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
B. N. K. Rao et al. (eds.), Intelligent Computing and Applications, Smart Innovation,
Systems and Technologies 315, https://doi.org/10.1007/978-981-19-4162-7_18
180 T. Syamsundararao et al.
18.1 Introduction
In transportation traffic, safety becomes one of the significant problems, and speed
is acknowledged as a high risk for causing road traffic accidents. As vehicles are
growing in number, levels of traffic accidents have significantly increased [1,2]. The
cars and vulnerable road users like pedestrians and cyclists also face traffic accidents.
According to the World Health Organization (WHO), road fatalities will increase to
become one of the major causes of death by 2030 [3]. More than 50% of road
injuries and accidents are caused by unsafe driving behavior and attending phone calls. Moreover, driving conditions are difficult for vulnerable road users such as pedestrians and cyclists, as reported by researchers in the United States [4]. Over the past two decades, millions of people have died from road injuries every year, losing their valuable lives. Various companies and vehicle manufacturers have developed solutions for continuously watching vehicle and driver behavior; the problem with these solutions is that they are costly to buy [3,5]. We developed a real-time Android mobile application using Java as the frontend and a Firebase database as the backend. This application tracks the speed of the vehicle. When the vehicle exceeds a certain speed, the user's mobile automatically switches to silent mode. If the user gets a call at that moment, an alert message is sent to the caller to avoid talking while driving. The background communication network among vehicles' on-board units (OBUs) is shown in Fig. 18.1.
Fig. 18.1 Safety message communication among OBUs in VANET
18 Design and Deployment of the Road Safety System 181
18.2 Literature Review
A comprehensive summary of previous research on mobile phones [2,3,5] for driving, speed tracking, and smartphone sensing is provided in this section. In addition, the literature survey categorizes information according to different applications [4,6]. The data are:
Driver information.
Traffic information.
Vehicle information.
Environmental information.
The survey of recent projects is discussed in the remainder of this section. Ali et al. proposed Crowd-ITS [1], whose server provides a web interface to aggregated traffic information [6]. In Signal Guru [7,8], various traffic signal detection, data filtering, and traffic signal scheduling schemes have been suggested.
JamEyes [9], a traffic jam awareness and observation system (TJAS), provides critical information such as the length of the traffic queue and the amount of time spent in the bottleneck. Verma [10] integrated the OBD-2 system with smartphone- and web-based data gathering technologies to create a hybrid electric car CAN-bus and data monitoring system, then used the data from the servers for remote monitoring.
Accelerometers and audio sensors attached to smartphones immediately forward information to the dispatch server in the event of an accident on a freeway [11,12]. The accident photos are sent directly using GPS coordinates and VOIP channels. Unlike the Zaldivar et al. system, the White et al. system incorporates data recording detection technology: accident detection uses accelerometer values from smartphones rather than electronic control unit (ECU) values.
Mednis et al. [13] designed a smartphone-based accelerometer application to help correct bad driving habits. The study by Eriksson et al. proposed the first road condition monitoring system that uses an accelerometer and GPS to detect and map road anomalies such as potholes. Ghose and colleagues and Mednis and colleagues have developed road monitoring applications that detect potholes and collect data via sensors, which are then sent to remote servers from which alerts are issued to drivers.
Mohan et al. developed a mobile phone system based on multiple sensors to identify bumps, pits, braking, and honking while driving, and to localize the phone to save energy. Castignani et al. proposed a mobile app to monitor and track a fleet using GPS and GSM modules, interacting quickly and efficiently to automate real-time fleet management. Based on the literature survey, this paper proposes the Safe Drive mobile application, which addresses some limitations of present systems and tries to advance road safety by controlling speed and avoiding talking on phones.
18.3 Proposed System
Based on the limitations of the existing systems, we have found a solution for avoiding speeding and talking on mobiles while driving at high speed. We developed the Safe Drive mobile-based Android application, which tracks the vehicle's speed in real time using GPS to locate the vehicle's position. When the vehicle exceeds a specific speed, the user's mobile automatically switches to silent mode. If the user gets a call, an alert message is sent to the caller. If the user receives calls more than three times from the same caller, the mobile phone automatically switches to vibrate or general mode to handle that emergency.
The speed-tracking mobile application was developed as the 'Safe Drive' app. The platform used to design the app is Android Studio version 2.4, with Java and XML as the front end and Firebase as the (mobile) back end.
'Widgets' are the group of options in Android that can be dragged and dropped easily. The XML code is not viewed directly in the running app, but the layout it defines is exposed while appearing on the screen. To define the behavior of the application, we go to 'MainActivity.java.' The tabs under application > Java hold the code and layout design needed to run the application. Initially, we test the application by running it on an Android emulator, which facilitates a virtual run time. After successful testing, we test the application on a real-time instrument. The overall process is shown in diagrammatic form in Fig. 18.1. In the project window, click the application and then execute the toolbar module in Android Studio. In the deployment target window, select the device and click the 'OK' button. Android Studio installs the program, and all connected devices begin to function. The Safe Drive app created on the smartphone runs in real time. It uses the great circle distance (GCD) method to determine the distance between two OBUs (Fig. 18.2).
18.3.1 Experimental Methodology
Each OBU consists of the following modules:
1. GPS module associated with the digital road map
2. RSU interface
3. Wireless communication modules like IEEE 802.11a & 4G
4. Location tracking module
5. Safety alert module.
Location accuracy depends on the accuracy of the GPS receiver; in real time, military GPS or dual-band GPS can be used. The wireless communication modules receive and transmit the speed and location data of nearby OBUs. The GCD method is used to calculate the distance between two OBUs. Equations 18.2 and 18.3 calculate the polar angles at vehicle A and vehicle B, as shown in Fig. 18.3.
Fig. 18.2 Proposed architecture for speed monitoring
Fig. 18.3 Great circle distance method
cos(AB) = cos(PA) cos(PB) + sin(PA) sin(PB) cos(P)   (18.1)
PB = lat_P - lat_B   (18.2)
PA = lat_P - lat_A   (18.3)
Depending on the distance between the two vehicles, warning alerts are triggered for the driver. A standard distance of 50 m is considered when generating alert messages to prevent accidents.
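Equations 18.1–18.3 can be sketched as follows (assuming angles in degrees, the pole P at latitude 90°, the pole angle P as the longitude difference, and a mean earth radius of 6371 km; the coordinates below are hypothetical OBU positions):

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean earth radius (assumption)

def great_circle_distance(lat_a, lon_a, lat_b, lon_b):
    """Spherical law of cosines (Eq. 18.1): the polar angles PA and PB are
    measured from the pole P (Eqs. 18.2, 18.3), and the angle P at the pole
    is the longitude difference between the two vehicles."""
    pa = math.radians(90.0 - lat_a)          # PA = lat_P - lat_A
    pb = math.radians(90.0 - lat_b)          # PB = lat_P - lat_B
    p = math.radians(lon_b - lon_a)          # angle at the pole
    cos_ab = (math.cos(pa) * math.cos(pb)
              + math.sin(pa) * math.sin(pb) * math.cos(p))
    ab = math.acos(max(-1.0, min(1.0, cos_ab)))  # clamp for rounding safety
    return EARTH_RADIUS_M * ab               # arc length in metres

d = great_circle_distance(16.3067, 80.4365, 16.3070, 80.4368)  # two nearby OBUs
print(round(d), "m")
if d < 50:  # standard alert distance from the text
    print("ALERT: vehicles closer than 50 m")
```

For the very short distances between OBUs, a numerically sturdier formula (e.g. haversine) is often preferred, but the law-of-cosines form above matches Eq. 18.1 directly.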
18.3.2 Flow of Execution of an Application
The main advantages are:
The application gives an 'alert message.'
The call will be activated after specific rings.
The phone automatically switches to 'silent mode' when vehicle speed exceeds 40 km/h.
The proposed application has been tested on the road, giving users safe and comfortable driving (Fig. 18.4).
To create a new application, the developer must choose the project template as 'empty activity,' click on 'Next,' accept the default activity name (main activity), and select 'finish.' The Android package files then have to be uploaded to the Google Play store.
Fig. 18.4 Results of speed tracking
18.3.3 Modules of the Application
Installation of application
We need to install the 'Safe Drive' mobile app for smartphone speed tracking; the app is then installed on the mobile.
Device permissions
The user is asked for permission to access 'do not disturb' mode while the app is running, and for permission to access the device location to track the vehicle's speed. In addition, the screen provides access to send and view messages, so that an alert message can be sent back to the mobile number from which a call is received. The app then asks for permission to access phone calls so that it can send messages to that specific number, and, if the same number repeats more than three times, the mobile switches back to general mode.
Registration, Log in, and Home pages
If the user is a first-time user, they have to register before proceeding, entering their full name, email ID, mobile number, and password, and then clicking the sign-in button. Next, the app navigates to the log-in screen. If the user forgets the password, an email with a password update link is sent to the registered email address. If the user has already registered, they can log in by giving their mobile number and password and tapping 'sign-in.' On the home page, there are options to select the vehicle mode, user profile, and history. Here, the user can choose the vehicle type; after that, a green color indication shows that the app has started. To see their profile, the user can click on the screen's profile option.
Speedometer screen
Next is the speedometer screen, which shows the vehicle's current speed, initially zero ('0'). When the vehicle exceeds the speed limit of 40 km/h, the app switches the phone to silent mode; if the vehicle is below the speed limit, the phone returns to normal mode. When a caller tries to reach a person who is driving, the user can attend the call while the vehicle's speed is within the limit. If the user exceeds that speed, the alert message 'I am driving, call me later' is generated automatically, so no miscommunication arises between the user and the caller. If the same caller calls more than three times, the mobile switches from silent mode to general mode so that an emergency call can be attempted in such situations. Figure 18.5 shows the trace of two OBUs in the mobile application.
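The call-handling rules above can be sketched as the following decision logic (a hypothetical illustration, not the app's Java source; the 40 km/h threshold and three-call limit come from the text, while the function and message strings are assumptions):

```python
SPEED_LIMIT_KMPH = 40   # silent-mode threshold from the text
MAX_REPEAT_CALLS = 3    # attempts by the same caller before ringing through

def handle_call(speed_kmph, caller, call_counts):
    """Decide how the phone reacts to an incoming call while driving: below the
    limit the call rings normally; above it the phone stays silent and replies
    with an alert SMS, unless the same caller has already tried more than three
    times (treated as an emergency, switching back to general mode)."""
    if speed_kmph <= SPEED_LIMIT_KMPH:
        return "ring"
    call_counts[caller] = call_counts.get(caller, 0) + 1
    if call_counts[caller] > MAX_REPEAT_CALLS:
        return "ring"   # emergency: general mode
    return "silent + SMS: I am driving, call me later"

counts = {}
print(handle_call(30, "caller-1", counts))      # within the limit: rings
for _ in range(3):
    print(handle_call(60, "caller-1", counts))  # speeding: silent + alert SMS
print(handle_call(60, "caller-1", counts))      # fourth attempt: rings through
```

In the real app this decision would run inside the call-state listener, with the SMS sent via the messaging permission granted earlier.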
The Safe Drive mobile application does not need any embedded system. The application works automatically with the Global Positioning System (GPS) to find the user's location with the help of latitudes and longitudes. The application checks whether GPS is on; if it is not, it asks the user to switch GPS on. The last screen is the stop screen: when the user reaches the destination and stops the app, the mobile application turns to a red color indication, implying that the app has stopped working. Figure 18.6 shows the sharing of location and speed between OBUs in different cases.
18.4 Conclusion
The risk of a crash increases significantly when traveling above the speed limit. Adequate speed can be imposed on traffic via vehicle design features that restrict the vehicle's speed; specific safe-drive characteristics are proposed for the application. The proposed system utilizes mobile GPS to determine the vehicle's location and speed, and the great circle distance method to determine the distance between two cars. Based on the calculated distance, warning and safety alert messages were sent to drivers, and the alerts were recorded for further analysis. This application can significantly improve quality of life, and such applications also help drivers feel more secure on the road. We can save the lives of numerous humans, such as pedestrians and cyclists, by utilizing these real-time applications, and we can avoid talking on our phones while driving, thus avoiding accidents. Appropriate investment will ensure the journey's safety, as fulfilling the social responsibility of these apps will benefit us in the future. We anticipate that smartphone-based vehicle and driver tracking will be a critical component of the intelligent transportation system domain in emerging countries such as India.
References
1. Ali, K., Al Yaseen, D., Ejaz, A., Javed, T., & Hassanein, H. S. (2012). CrowdITS: Crowd-
sourcing in intelligent transportation systems. In Wireless Communications and Networking
Conference (WCNC) (pp. 3307–3311). IEEE.
2. Koukoumidis, E., Peh, L. S., & Martonosi, M. R. (2011). Signalguru: Leveraging mobile
phones for collaborative traffic signal schedule advisory. In Proceedings of the 9th International
Conference on Mobile Systems, Applications, and Services (pp. 127–140). ACM.
3. Zhang, X., Gong, H., Xu, Z., Tang, J., & Liu, B. (2012). Jam eyes: A traffic jam awareness and observation system using mobile phones. International Journal of Distributed Sensor Networks.
4. Yang, Y., Chen, B., Su, L., & Qin, D. (2013). Research and development of hybrid electric vehicles CAN-bus data monitor and diagnostic system through OBD-II and Android-based smartphones. Advances in Mechanical Engineering.
5. Koukoumidis, E., Martonosi, M., & Peh, L. S. (2012). Leveraging smartphone cameras for
collaborative road advisories. IEEE Transactions on Mobile Computing, 11(5), 707–723.
6. White, J., Thompson, C., Turner, H., Dougherty, B., & Schmidt, D. C. (2011). WreckWatch:
Automatic traffic accident detection and notification with smartphones. Mobile Networks and
Applications, 16(3), 285–303.
7. Zaldivar, J., Calafate, C. T., Cano, J. C., & Manzoni, P. (2011). Providing accident detec-
tion in vehicular networks through OBDII devices and Android-based smartphones. In 36th
Conference on Local Computer Networks (LCN) (pp. 813–819). IEEE.
8. Magaña, V. C., & Organero, M. M. Artemisa: Using an Android device as an eco-driving
assistant. Cyber Journals: Multidisciplinary Journals in Science and Technology: Journal of
Selected Areas in Mechatronics (JMTC).
9. Castignani, G., Derrmann, T., Frank, R., & Engel, T. (2015). Driver behavior profiling using
smartphones: A low-cost platform for driver monitoring. Intelligent Transportation Systems
Magazine, IEEE, 7(1), 91–102. https://doi.org/10.1109/MITS.2014.2328673
10. Verma, N. (2018). Development of native mobile application using android studio for cabs and
some glimpse of cross-platform apps. International Journal of Applied Engineering Research,
13(16), 12527–12530. ISSN 0973-4562. © Research India Publications. http://www.ripublica
tion.com
11. Eriksson, J., Girod, L., Hull, B., Newton, R., Madden, S., & Balakrishnan, H. The pothole
patrol: Using a mobile sensor network for road surface monitoring. In Proceedings of the 6th
International Conference on Mobile Systems, Applications, Street-Safety.
188 T. Syamsundararao et al.
12. Ghose, A., Biswas, P., Bhaumik, C., Sharma, M., Pal, A., & Jha, A. (2012). Road condition
monitoring and alert application: Using-vehicle smartphone as an internet-connected sensor.
In 10th International Conference on Pervasive Computing and Communications Workshops
(PerComWorkshops) (pp. 489–491). IEEE.
13. Mednis, A., Strazdins, G., Zviedris, R., Kanonirs, G., & Selavo, L. (2011). Real-time pothole
detection using android smartphones with accelerometers. In International Conference on
Distributed Computing in Sensor Systems and Workshops (DCOSS) (pp. 1–6). IEEE.
14. Mohan, P., Padmanabhan, V. N., & Ramjee, R. (2008). Nericell: Rich monitoring of road
and track conditions using mobile smartphones. In Proceedings of the 6th ACM conference
Embedded Network Sensor Systems (pp. 323–336). ACM.
Chapter 19
Diagnosis of COVID-19 Using Artificial
Intelligence Techniques
Pattan Afrid Ahmed, Prabhu Gantayat, Sarika Jay, Venkata Sai Satvik,
Jagadeesh Kannan Raju, and A. Balasundaram
Abstract Artificial intelligence is being used in a variety of ways by those trying to address variants and to manage data. AI not only uses historical data; it also draws inferences about the data without applying a predefined set of rules, which allows the software to learn and adapt to information patterns in near real time. The numerous sources of medical images (e.g., X-ray, CT, and MRI) make deep learning a powerful technique for combating the COVID-19 outbreak, and motivated by this fact, a large number of research works have been proposed and developed. Chest CT is an emergency diagnostic tool for identifying lung disease, and artificial intelligence (AI) provides significant assistance in the rapid analysis of CT scans to differentiate variants of COVID-19 findings. This work focuses on the recent advances of COVID-19 drug and vaccine development using artificial intelligence and the potential of intelligent training for the discovery of COVID-19 therapeutics.
19.1 Introduction
Coronavirus disease (COVID-19) is a respiratory illness caused by infection with the severe acute respiratory syndrome coronavirus 2, also known as SARS-CoV-2. COVID-19 has directly infected more than 16.6 crore individuals worldwide, causing more than 34.3 lakh deaths. A significant issue in the diagnosis of COVID-19 is the unreliability and shortage of clinical tests. Accordingly, several efforts have been devoted to seeking alternative techniques for determining COVID-19's occurrence. Computed tomography (CT) examinations are seen as promising for the identification of patients with suspected COVID-19 infection. CT shows clear radiological findings in patients with COVID-19, serving as a more effective and accessible test strategy. The principal issue with this strategy is that it relies upon an expert to examine the CT images; the process is time-consuming and tiring for the expert because of the number of images to be examined, causing fatigue, which can lead to errors in diagnosis [1].
P. A. Ahmed ·P. Gantayat ·S. Jay ·V. S. Satvik ·J. K. Raju ·A. Balasundaram (B)
School of Computer Science and Engineering, Vellore Institute of Technology, Chennai, India
e-mail: balasundaram.a@vit.ac.in
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
B. N. K. Rao et al. (eds.), Intelligent Computing and Applications, Smart Innovation,
Systems and Technologies 315, https://doi.org/10.1007/978-981-19-4162-7_19
Even though rapid point-of-care COVID-19 tests are expected to be used in clinical settings eventually, for the present the time taken for COVID-19 test results typically ranges from 3 h to 3 days, and most likely not all countries will have access to test kits that give results quickly [1]. To manage these issues, CT scan-based strategies have been proposed in various recent works, showing great promise for the use of deep learning and machine learning-based methodologies for effective recognition of the disease from chest CT images. Motivated by the need for an extensive assessment of such AI-based diagnostic methodologies, in this work, deep CNN-based strategies are investigated and evaluated experimentally to gauge the usefulness of these methodologies in the current emergency.
Artificial intelligence instruments have produced consistent and accurate outcomes in applications that utilize either image-based or other sorts of information. Apostolopoulos and Mpesiana performed one of the principal assessments of COVID-19 identification using X-ray images. In their investigation, they considered transfer learning using pre-trained networks such as Xception, Inception-ResNet-V2, VGG19, MobileNet-V2, and Inception, which are among the most frequently used pre-trained models. Several evaluation metrics were utilized to assess the outcomes obtained from two distinct datasets. The final conclusion was drawn by the authors using the obtained confusion matrices rather than the accuracy results, due to the imbalanced data [1].
Overall, neural networks that attempt to replicate human visual processing on computers once required pre-processing of images or data before feeding them to the network. When the ConvNet was first developed, however, it was characterized as a neural network that requires minimal pre-processing of images before feeding them to the network, and as a structure capable of extracting features from images to improve the learning performance of the neural network.
The ConvNet includes both feature extraction and classification stages in a single network. A conventional ConvNet involves three kinds of layers: convolution, pooling, and fully connected (FC) layers. Feature extraction is performed in the convolutional layer by applying a mask (filter), which is the process of dividing images into regions of a predefined size and using filters to extract features from the image. Then, a feature map, which is the projection of features onto a 2D grid, is produced by applying an activation function to the values obtained by the mask. The activation functions select the most responsive neurons in a nonlinear way and reduce the computational cost of the neural network.
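The masking-and-filtering step described above — sliding a small kernel over the image and applying an activation to produce a feature map — can be sketched in NumPy (a toy illustration with a hypothetical edge-detecting kernel; real frameworks add padding, strides, and many filters):

```python
import numpy as np

def conv2d_valid(image, kernel):
    """'Valid' 2-D cross-correlation: slide the kernel over the image, no padding."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    """Rectified linear activation: keeps only the most responsive outputs."""
    return np.maximum(x, 0.0)

image = np.array([[1., 2., 0., 1.],
                  [0., 1., 3., 1.],
                  [2., 0., 1., 0.],
                  [1., 1., 0., 2.]])
edge_kernel = np.array([[1., -1.],
                        [1., -1.]])  # crude vertical-edge detector
feature_map = relu(conv2d_valid(image, edge_kernel))  # 3x3 feature map
```

A 4 × 4 input with a 2 × 2 kernel yields a 3 × 3 feature map, and the ReLU zeroes out all negative responses.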
The discipline of medical imaging has seen revolutionary changes in the recent past, owing to advances in the fields of deep learning and computer vision, for several contagious diseases such as pneumonia, MERS, SARS, and ARDS [2–14]. In pandemic situations such as this one, it has become far more important for such deep learning-based approaches to be applied in practice. Effective utilization of such deep learning-based methodologies can be of high utility for people, particularly concerning rapid testing and recognition of the illness, that is, quick diagnosis and prediction of COVID-19.
Several activation functions are available in CNNs, and the rectified linear unit (ReLU) is the most commonly used; it does not activate all the neurons at the same time and therefore gives faster convergence as the weights find the optimal values during training. A pooling operation is performed on the produced feature map to reduce the dimensions of the images.
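The pooling operation can likewise be sketched as a non-overlapping 2 × 2 max pool, which halves each spatial dimension of the feature map (toy NumPy illustration):

```python
import numpy as np

def max_pool_2x2(fmap):
    """Non-overlapping 2x2 max pooling: halves each spatial dimension."""
    h, w = fmap.shape
    h2, w2 = h // 2, w // 2
    trimmed = fmap[:h2 * 2, :w2 * 2]        # drop odd trailing rows/columns
    blocks = trimmed.reshape(h2, 2, w2, 2)  # group into 2x2 tiles
    return blocks.max(axis=(1, 3))          # keep the strongest response per tile

fmap = np.array([[1., 3., 2., 0.],
                 [4., 2., 1., 1.],
                 [0., 1., 5., 2.],
                 [2., 2., 0., 3.]])
pooled = max_pool_2x2(fmap)  # -> [[4., 2.], [2., 5.]]
```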
Finally, the feature map is flattened into a vector and sent to the fully connected layer. The assembly of the neural network and the classification of the input patterns are performed in the fully connected layer, and its training relies on error backpropagation to update the weights within this layer [15]. Each trained neural network receives data for the particular task that is considered. While the fundamental principle of artificial neural networks is to simulate human behavior and perception, transfer learning in artificial neural networks is used to apply the knowledge stored for a specific task to another related task. Deep learning for image recognition applications is capable of learning from a great many images, and several large models have been trained with various architectures. These pre-trained models have been made freely available so that all researchers can utilize the stored knowledge. The state-of-the-art pre-trained, openly accessible networks, namely VGG16, Inception-V3, VGG19, MobileNet-V2, ResNet50, and DenseNet121, were considered in the comparison [16].
This paper aims to survey methodologies currently implemented in this arena, covering both vanilla custom CNN-based learning and transfer learning, in order to understand the implementations and bring out the challenges that were faced. This work also explores the dataset used for constructing the model, namely the COVID CT dataset. Another layer is added by implementing a model which distinguishes COVID-positive patients from pneumonia-positive patients. This step is essential because there is a high chance that pneumonia-positive patients can be diagnosed as COVID-positive, as the CT scans of patients affected by these diseases are quite similar. So, we add an extra model which differentiates COVID positive from pneumonia positive as a final step, applied if the patient is detected as COVID positive.
Also, there has been work on preprocessing the CT scan images using the OpenCV library [17]. Histogram equalization, adjusted log, and rank equalization are applied to the images to get a clear picture of all the abnormalities in the lung CT scan and to obtain more accurate results while training the deep learning models.
Fig. 19.1 CT scan of lungs of COVID affected patients
19.2 Materials and Frameworks
19.2.1 Materials and Datasets
The COVID CT scan dataset consists of 370 corona-positive cases and 398 corona-non-positive chest CT images. These CT pictures come in various sizes with respect to height (average = 491, maximum = 1853, and minimum = 153) and width (average = 383, maximum = 1485, and minimum = 124). The COVID-positive images were gathered from GitHub. The average age for the COVID-19 group was 45–65 years, and the group included 130 male patients and 65 female patients affected by COVID-19. A few patients' information is missing; this is because the data collection used in this examination does not come with complete metadata, since it is the very first freely available COVID-19 CT scan image collection and was created in a limited time. Some examples of corona-positive and corona-negative chest scan images are shown in Fig. 19.1. For pneumonia classification, we use pneumonia-affected patients' CT scans, which we procured from Kaggle. It is a dataset of 400 pneumonia-positive cases, which are compared head-to-head with COVID-19-positive cases' CT scans. Here too we had a properly balanced dataset with a binary classification model. To organize our final data collection for experiments, all the photos were converted into ".png".
19.2.2 Frameworks
Recent advances in the field of deep learning, particularly in the medical imaging area, demonstrate the potential use of different deep convolutional neural network architectures. Initially, the models trained upon were standard models, including VGG16, VGG19, Inception-V3, ResNet50, etc.; in this research work, we went with a model that contained a hybrid of multiple models working together to produce a more robust output that gives stronger results when working on real-time data. On top of the ReLU-activated layers, a single-node prediction layer with sigmoid activation is added. Apart from this baseline model, a decision fusion-based methodology is additionally considered (Fig. 19.2).
The main notion of this decision fusion method is that the individual models can be combined by consolidating their individual predictions through a majority voting technique, which can potentially enhance the overall effectiveness of the standard models (Fig. 19.3).
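A minimal sketch of this majority-voting decision fusion, combining binary predictions from several models (the per-model predictions below are made up for illustration, not the chapter's actual outputs):

```python
import numpy as np

def majority_vote(predictions):
    """Fuse per-model binary predictions (models x samples) by strict majority vote."""
    predictions = np.asarray(predictions)
    votes = predictions.sum(axis=0)                        # models voting "1" per sample
    return (votes * 2 > predictions.shape[0]).astype(int)  # 1 iff more than half agree

# Three models' binary predictions (1 = COVID positive) for five scans
p_vgg16 =    [1, 0, 1, 1, 0]
p_resnet18 = [1, 1, 1, 0, 0]
p_custom =   [0, 0, 1, 1, 1]
fused = majority_vote([p_vgg16, p_resnet18, p_custom])  # -> [1, 0, 1, 1, 0]
```

Each scan's fused label is the one at least two of the three models agree on, which is why a single model's mistake can be outvoted.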
The models we used in our research for transfer learning, with which we obtained good results, are introduced below, with a basic description of each model's working, architecture, and network flow.
VGG16 Architecture. VGG-Net is by far one of the most popularly known deep convolutional neural network models; it secured the first and second positions in the ILSVRC 2014 localization and classification tasks. In this architecture, the principal idea was that increasing the depth of the convolutional neural network and replacing large kernels with multiple smaller kernels was potentially more accurate for carrying out computer vision [16] tasks. VGG-Net variants are still widely utilized for many computer vision undertakings, for extracting deep image features and for further training, particularly in the medical imaging field.
Fig. 19.2 Transfer learning models
Fig. 19.3 Layer breakdown
Inception-V3 Architecture. In the Inception designs, the principal notion is that handling the extreme variability in the salient parts of images is feasible by permitting the network to include different sizes of kernels at the same level, which essentially "widens" the network. This possibility of multiple kernels at the same level is realized by modules called Inception modules. Later, the Inception-V2 and Inception-V3 structures were proposed; they improve over the Inception-V1 design by addressing representational bottlenecks and auxiliary classifiers, by employing factorized convolutions, and by adding batch normalization to the auxiliary classifiers. The Inception-V3 design was the first runner-up in the ILSVRC 2015 image classification task.
ResNet18 Architecture. The main thought in ResNet designs is that stacking layers, namely conv2D and pool2D layers, one on another can cause the network performance to degrade as a result of the vanishing gradient problem; to deal with this, identity shortcut connections can be used, which skip one or more layers. These arrangements of layers that contain identity connections are known as residual blocks. Adding skip connections eliminates the high training error that is ordinarily seen in a very deep design. ResNet18 is one of the variants of the ResNet design and contains 18 layers.
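The identity shortcut at the heart of a residual block can be sketched numerically: the output is the transformed input plus the untouched input, so the signal (and its gradient) can bypass the transformation. A toy NumPy illustration, not a full convolutional block:

```python
import numpy as np

def residual_block(x, transform):
    """y = ReLU(F(x) + x): the skip connection adds the input back unchanged."""
    return np.maximum(transform(x) + x, 0.0)

x = np.array([1.0, -2.0, 3.0])
# If the learned transform collapses to zero, the block reduces to ReLU(x):
identity_out = residual_block(x, lambda v: np.zeros_like(v))  # -> [1., 0., 3.]
# A non-trivial transform perturbs rather than replaces the signal:
scaled_out = residual_block(x, lambda v: 0.1 * v)             # -> [1.1, 0., 3.3]
```

Because the block can fall back to the identity mapping, adding more such blocks cannot easily make training error worse, which is the intuition behind very deep ResNets.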
DenseNet Architecture. The DenseNet architecture, proposed by Huang et al. [14], enhances the ResNet architecture by adding dense connections, which connect every layer to every other layer. It is a densely connected model in which each layer receives the feature maps of every preceding layer and passes its own feature map to every subsequent layer. The role of this design is to enable feature reuse while keeping the number of parameters low. There are numerous widely used variants of the DenseNet design, of which the DenseNet-201 architecture was also used in this research.
VGG19 Architecture. VGG19 is a variant of the VGG model which, in short, comprises a total of 19 weight layers: 16 convolution layers and 3 fully connected layers, along with 5 MaxPool layers and 1 SoftMax layer. VGG19 has approximately 19 billion FLOPs. VGG-Net variants are still widely utilized for many computer vision undertakings, for extracting deep image features and for further training, particularly in the medical imaging field.
19.3 Approach
The planned strategy was discussed previously; the approach comprises: (i) acquisition of CT images, (ii) extraction of features using a convolutional neural network, (iii) classification of images using XGBoost, and (iv) validation of results using metrics commonly utilized in CAD frameworks [16].
Image Acquisition. The coronavirus chest scans dataset comes from GitHub, where it is an open-source project to which people from different parts of the world can contribute scans of COVID-positive patients, and the maintainers make sure to always keep the dataset balanced with COVID-negative scans. There are two class groupings in the COVID-19 dataset. The dataset involves 745 images, of which 349 are COVID-19-positive cases and 397 are COVID-19-negative cases. Example images from these corona chest scans will be used for our research purposes.
This work has used the above dataset to obtain more precise results (Fig. 19.4). The quality of the images has been preserved. After extracting the data from the structure of the images, the captions related to the images were identified. The selection of CT images was done manually. Then, the caption or text related to every CT image was read for classification into COVID-19 and non-COVID-19 [1] (Fig. 19.5).
Feature Extraction. Feature extraction is the main step in the development of an automatic image classification framework [18]. The performance of the classification can be affected by the quality of the extracted information, possibly leading to lost performance by the framework. Lately, deep pre-trained models have been suggested for feature extraction. A convolutional neural network is a deep learning model that has hierarchical structures of learned features of high quality in its layers. Convolutional neural networks can diminish network complexity and parameter counts through local receptive fields, operation sharing, and weight sharing.
Fig. 19.4 Architectural flow design
Fig. 19.5 CNN-SVM model layer architecture
The adjustment of the convolution kernels is done by backpropagation [15], which depends on the stochastic gradient descent algorithm, used to lessen the gap between the network output and the training labels.
A convolutional neural network comprises alternating layers of convolution and subsampling, changing into fully connected layers when moving toward the output layer. Figure 19.3 shows how a convolutional neural network is designed. To exploit a convolutional neural network as a feature extractor, the remaining fully connected layers in the network are removed or simply ignored, and the final output of the new model, which we trained with our own data, is used as the features that describe the input image. All the transfer learning models maintain this standard architecture; it involves replacing the pre-trained network part for the different models. Figure 19.6 represents this.
Convolutional neural networks can extract commonly helpful information features, and can distinguish and eliminate input redundancies, saving just the fundamental parts of the information in powerful and discriminative representations [19]. Their semi-connected and fully connected layers give a sensible environment for advancing the training and learning process [20]. In this way, the convolution layers serve as a productive feature extractor, specialized in diminishing the size of the information and delivering a less-redundant data collection.
Preprocessing with OpenCV. The lung CT scan is transformed using a stack of OpenCV techniques, including CLAHE, adjusted log, and rank equalization, for better detection of diseases by our model. Our models, with the help of the pre-trained models, will try to classify the CT scans into the COVID-positive class or the COVID-negative class (Fig. 19.7).
Fig. 19.6 TL model layer architecture
Fig. 19.7 OpenCV processing
CLAHE gives the image some extra contrast, making blacks blacker and whites whiter. The adjusted log helps enhance the image with a little extra brightness, bringing out the suppressed parts of the image that were compressed after we applied CLAHE.
Finally, the rank equalizer gives the input image some final touches, as it helps bring out all the small patterns or light structures in the image and enhances them so that those small formations are easily caught by our deep learning model.
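The chapter applies these steps through OpenCV; to make the operations concrete, here is a plain-NumPy sketch of two of them — an adjusted-log transform that lifts dark regions, and global histogram equalization that spreads the intensity distribution (CLAHE adds tiling and contrast clipping on top of the latter). The synthetic `scan` array is a stand-in for a real CT slice:

```python
import numpy as np

def adjusted_log(img):
    """Map intensities with s = c * log(1 + r), scaled back into 0..255."""
    img = img.astype(np.float64)
    c = 255.0 / np.log(1.0 + img.max())
    return (c * np.log1p(img)).astype(np.uint8)

def hist_equalize(img):
    """Global histogram equalization via the cumulative distribution function."""
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf = (cdf - cdf.min()) * 255.0 / (cdf.max() - cdf.min())
    return cdf[img].astype(np.uint8)

scan = np.tile(np.arange(0, 256, 4, dtype=np.uint8), (8, 1))  # stand-in "CT slice"
enhanced = hist_equalize(adjusted_log(scan))
```

After equalization the output stretches over the full 0–255 range, which is what makes faint lung structures easier for the downstream model to pick up.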
Adding Regularization to Images. During our research, we found that the models were heavily overfitting due to limited data; we also observed that the models were memorizing the image patterns and formations, so we had to use some form of image regularization. Well-known regularization methods include L1 (Lasso) regularization, L2 (ridge) regularization, and hybrids of both. But when it comes to images, it is better to use random image patching, or random image hiding, where we hide some part of the image and then send it as the input to our model. Every time, a random patch is generated and a random position is patched. This helps our model avoid memorizing parts and structures, forcing it to become more generalized, handle real-time data well, and not fail on unseen data. In this way we add a little bias to the model and reduce the already high variance of the overfitting model (Fig. 19.8).
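The random patching described above, closely related to the cutout/random-erasing augmentations, can be sketched as follows; the patch size and fill value are illustrative choices:

```python
import numpy as np

def random_patch_mask(img, patch=8, fill=0.0, rng=None):
    """Hide one randomly positioned patch x patch square so the model cannot
    memorize any single local structure; returns a masked copy."""
    rng = np.random.default_rng() if rng is None else rng
    h, w = img.shape[:2]
    top = rng.integers(0, h - patch + 1)
    left = rng.integers(0, w - patch + 1)
    out = img.copy()
    out[top:top + patch, left:left + patch] = fill
    return out

scan = np.ones((32, 32), dtype=np.float32)
masked = random_patch_mask(scan, patch=8, rng=np.random.default_rng(0))
```

Because a fresh position is drawn for every training pass, no single region of the scan is reliably visible, which is exactly the effect used here to curb memorization.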
Classification. Classification consists of recognizing to which of a set of categories a new observation belongs, given previous training on a data set containing observations whose categories are known [21]. In AI, tree development is a highly successful and widely utilized technique. Our model, with the help of the pre-trained models, will try to classify the CT scans into the COVID-positive class or the COVID-negative class. The images that gave the best outcomes in the ConvNet experiments and statistical measurement tests, which were the original images, were compared against the pre-trained networks referenced in the previous section.
Fig. 19.8 Random masking
Validation and Results. To validate the model, we use statistical assessment metrics commonly utilized in the literature. These metrics are computed from the confusion matrix; given the numbers of true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN), they can be expressed numerically as follows:
Accuracy = (TP + TN) / (TP + TN + FP + FN)   (19.1)
Precision = TP / (TP + FP)   (19.2)
Recall = TP / (TP + FN)   (19.3)
F-Score = (2 × Recall × Precision) / (Recall + Precision)   (19.4)
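Equations (19.1)–(19.4) translate directly into code; the following small sketch computes all four metrics from confusion-matrix counts (the counts are made-up numbers, not the chapter's results):

```python
def confusion_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall, and F-score from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_score = 2 * recall * precision / (recall + precision)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f_score": f_score}

m = confusion_metrics(tp=90, tn=85, fp=10, fn=15)
# accuracy = 175/200 = 0.875, precision = 0.9, recall = 90/105 ≈ 0.857
```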
The results show accuracy for the suggested models of 97.24% for VGG16, 95.89% for VGG19, 93.15% for Inception-V3, 94.52% for SeResNet18, 87.67% for MobileNet, and 90.42% for ResNet18, while the best custom model gave 94.52%.
Also, our pneumonia-versus-COVID classifier gave a 95.65% validation score. We notice that the transfer learning models perform better and give higher accuracy than plain CNN models. VGG16 has the best performance with nearly 98% accuracy, closely followed by VGG19 at 96% and then our custom CNN model at 94%. Also, our pneumonia classification model gave 95.65% accuracy on the validation dataset. We took the pneumonia dataset from Kaggle and used it against our COVID-positive cases dataset for the classification part. Both datasets were balanced, and the models were kept with similar hyperparameter tuning (Figs. 19.9 and 19.10).
Fig. 19.9 Genome data visualization
Fig. 19.10 Vaccination forecasting
19.4 Conclusion and Future Work
In this paper, AI-based deep learning models have been worked upon, and pre-trained models show better and markedly more efficient performance and accuracy than handmade CNN models. Preprocessing techniques using OpenCV have been used for the better working of our model. Also, pneumonia cases are distinguished from usual COVID-19 cases, which was a serious concern in all previous research papers. Various modeling approaches are applied to the case studies, and experimentation has produced good performance and precise outcomes. Future work emerges from these examinations: deep convolutional neural networks along with pre-trained models show better and markedly more efficient model performance and accuracy.
Additionally, a decision fusion-based methodology is likewise proposed, which consolidates the predictions of each of the individual deep convolutional neural network models to improve predictive performance. Coronavirus is a worldwide issue; it not only hugely affects the health of residents but also the worldwide economy, and we show how to recognize it using lung CT scan images from deliberately chosen data of lung CT scans of COVID-19-infected patients from around the world.
Besides, even though the proposed technique shows extraordinary promise, there is still much room for potentially improving the predictive performance of the approach. Ideas such as image augmentation and transfer learning can be incorporated as part of future enhancements; these ideas need to be investigated as part of future work [16].
References
1. Carvalho, E. D., Carvalho, E. D., de Carvalho Filho, A. O., de Araújo, F. H. D., & Andrade Lira Rabêlo, R. D. (2020). Diagnosis of COVID-19 in CT image using CNN and XGBoost. In IEEE Symposium on Computers and Communications (ISCC), Rennes, France (pp. 1–6). https://doi.org/10.1109/ISCC50000.2020.9219726
2. Sekeroglu, B., & Ozsahin, I. (2020). Detection of COVID-19 from chest X-ray images using convolutional neural networks. SLAS Technology: Translating Life Sciences Innovation. https://doi.org/10.1177/2472630320958376
3. Shuja, J., Alanazi, E., Alasmary, W., et al. (2020). COVID-19 open source data sets: A comprehensive survey. Applied Intelligence. https://doi.org/10.1007/s10489-020-01862-6
4. Carvalho, E. D., de Carvalho Filho, A. O., de Sousa, A. D., Silva, A. C., & Gattass, M. (2018).
Method of differentiation of benign and malignant masses in digital mammograms using texture
analysis based on phylogenetic diversity. Computers Electrical Engineering, 67, 210–222.
5. de Carvalho, A. S. V., Jr., Carvalho, E. D., de Carvalho Filho, A. O., de Sousa, A. D., Silva,
A. C., & Gattass, M. (2018). Automatic methods for diagnosis of glaucoma using texture
descriptors based on phylogenetic diversity. Computers Electrical Engineering, 71, 102–114.
6. He, X., Yang, X., Zhang, S., Zhao, J., Zhang, Y., Xing, E., & Xie, P. (2020). Sample-efficient deep learning for COVID-19 diagnosis based on CT scans. medRxiv.
7. Abbas, A., Abdelsamea, M., & Gaber, M. (2020). Classification of COVID-19 in chest X-ray images using DeTraC deep convolutional neural network. medRxiv.
8. Zhao, J., Zhang, Y., He, X., & Xie, P. (2020). COVID-CT-dataset: A CT scan dataset about COVID-19.
9. Narin, A., Kaya, C., & Pamuk, Z. (2020). Automatic detection of Coronavirus disease (COVID-19) using X-ray images and deep convolutional neural networks.
10. Carvalho, E. D., Filho, A. O., Silva, R. R., Araujo, F. H., Diniz, J. O., Silva, A. C., Paiva, A. C., & Gattass, M. (2020). Breast cancer diagnosis from histopathological images using textural features and CBIR. Artificial Intelligence in Medicine, 105, 101845.
11. Simonyan, K., & Zisserman, A. (2014). Very deep convolutional networks for large-scale image
recognition. https://arxiv.org/abs/1409.1556
12. He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 770–
778). IEEE.
13. Szegedy, C., Liu, W., Jia, Y., et al. (2015). Going deeper with convolutions. In Proceedings of
the IEEE Conference on Computer Vision and Pattern Recognition.
14. Huang, G., Liu, Z., Van Der Maaten, L., & Weinberger, K. Q. (2017). Densely connected
convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and
Pattern Recognition (pp. 4700–4708). IEEE.
15. Zipser, D. & Andersen, R. (1988). A back-propagation programmed network that simulates
response properties of a subset of posterior parietal neurons. Nature, 331(6158), 679–684.
[Online]. https://doi.org/10.1038/331679a0
16. Mishra, A. K., Das, S. K., Roy, P., & Bandyopadhyay, S. (2020). Identifying COVID-19 from chest CT images: A deep convolutional neural networks based approach. Journal of Healthcare Engineering, 2020, 8843664. https://doi.org/10.1155/2020/8843664. PMID: 32832047; PMCID: PMC7424536
17. Ripley, B. D. (1996). Pattern recognition and neural networks. Cambridge University Press.
18. Ren, X., Guo, H., Li, S., Wang, S., & Li, J. (2017). A novel image classification method with
CNN-XGboost model (pp. 378–390).
19. Kothandaraman, D., Balasundaram, A., Dhanalakshmi, R., Sivaraman, A. K., Ashokkumar,
S., et al. (2022). Energy and bandwidth based link stability routing algorithm for IoT. CMC-
Computers, Materials & Continua, 70(2), 3875–3890.
20. Balasundaram, A., Dilip, G., Manickam, M., Sivaraman, A. K., Gurunathan, K., et al. (2022).
Abnormality Identification in video surveillance system using DCT. Intelligent Automation &
Soft Computing, 32(2), 693–704.
21. Arunachalam, P., Janakiraman, N., Sivaraman, A. K., Balasundaram, A., Vincent, R., et al.
(2022). Synovial sarcoma classification technique using support vector machine and structure
features. Intelligent Automation & Soft Computing, 32(2), 1241–1259.
22. Masci, J., Meier, U., Cire¸san, D., & Schmidhuber, J. (2011). Stacked convolutional auto-
encoders for hierarchical feature extraction. In T. Honkela, W. Duch, M. Girolami, & S. Kaski,
(Eds.), Artificial neural networks and machine learning—ICANN (pp. 52–59). Springer Berlin
Heidelberg.
23. Liu, Y. (2018). Feature extraction and image recognition with convolutional neural networks.
Journal of Physics: Conference Series, 1087, 062032.
Chapter 20
Location Tracking via Bluetooth
Jasthi Siva Sai, Mukkamala Namitha, Routhu Ramya Dedeepya,
Mulugu Suma Anusha, Angadi Lakshmi, and Mukesh Chinta
Abstract The inclination to forget and misplace things gradually increases with
aging, which makes it difficult for elderly people to remember and locate objects
without assistance. In the present era, it is quite common for people of all age groups
to misplace frequently used objects, for instance, vehicle keys, glasses, wallets, and
so forth. Finding misplaced objects is usually time-consuming and stressful; if the
situation is prolonged, it can cause an individual to miss deadlines and struggle to
manage daily activities. The proposed application reduces the effort of finding
misplaced objects and supports elderly people in managing their activities. Our
application, titled I AM HERE, works using Bluetooth technology and an iTag, and
can track the location of objects using just a mobile phone. The I AM HERE
application mainly focuses on identifying the location of misplaced objects. An iTag
is attached to objects that are used regularly and are commonly misplaced. The iTags
are identified individually by the I AM HERE application using Bluetooth technology.
The mobile application is developed on the Kodular platform, which is an open-source
tool. The application mainly targets elderly people who rely on others for everyday
activities, enabling them to identify the location of their objects by using a smart
mobile phone.
20.1 Introduction
The pandemic has had a huge impact on people's lives. The way of living has been
altered to a great extent; in order to stay healthy, people are advised to follow
social distancing protocols. This situation has created turbulence in people's
lifestyles, especially for the elderly, as they mostly depend on supporters to
perform their day-to-day activities. As the tendency to misplace and forget objects is
usually high in elderly people, it would be difficult for them to cope with the present
scenario. Our main motive is to reduce the dependency on people and to support
J. S. Sai (B)·M. Namitha ·R. Ramya Dedeepya ·M. S. Anusha ·A. Lakshmi ·M. Chinta
VR Siddhartha Engineering College, Vijayawada, India
e-mail: jasthi.sivasai@gmail.com
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
B. N. K. Rao et al. (eds.), Intelligent Computing and Applications, Smart Innovation,
Systems and Technologies 315, https://doi.org/10.1007/978-981-19-4162-7_20
people of different age groups. Our application I AM HERE can track various
objects that are heavily used in daily activities and are usually misplaced.
Bluetooth technology is prominently used for connecting devices within a small
range. Using Bluetooth technology and an iTag, our application is developed on the
Kodular platform. The I AM HERE application can be used to find the approximate
location of objects within a certain range. In today's era, mobile applications are
utilized to a significant extent. According to a recent study from the Pew Research
Center, among seniors aged 65 and above, about 85% own a cell phone, and of those,
46% use a smartphone. As smartphone usage among seniors is considerable, the mobile
application interacts with them and assists in finding the location of a specific
object. Bluetooth tags are highly advantageous, as iTags can withstand harsh
environments and are of low cost. Hence, iTags can be utilized for a long period
and require little maintenance.
20.2 Related Work
Kanyanee Phutcharoen, Monchai Chamchoy, and Pichaya Supanakoon proposed a
study on improving the accuracy of indoor positioning by utilizing Bluetooth
low-energy beacons. These beacons were placed inside an indoor environment and
connected to electronic devices running a beacon analyzer application to measure
the strength of the RSS signals from each device. The position of the user
equipment is obtained by utilizing the fingerprinting technique with least-RMS-error
matching. The accuracy is obtained from each measurement, and five such
measurements were taken into consideration. The cumulative distribution function of
the distance error is found, and the average of the considered measurements is
inferred. From the obtained results, it is concluded that averaging the measurements
reduces the error in positioning objects in an indoor environment [1]. The work of
Hannah, Francis, and Tan Shunhua involved designing an Arduino-based device that
can detect and track, in real time, any items (premium/important for the user)
that a user is most likely to keep close by, i.e., not more than 10 m away at any
time. Bluetooth technology is used to maintain a connection between the user's
smartphone and the device at all times. When the system detects a disconnection
between the device and the phone, the user's device receives a call along with a
warning message including the current location, alerting the user of the
disconnection. From then on, newly updated coordinates are received on the phone
every 1000 ms, which helps keep track of the device even if its location has changed
[2]. Lu Bai and Fabio Ciravegna proposed an indoor positioning system based on
Bluetooth low energy (BLE) to monitor the daily lifestyle of senior citizens or people
with disabilities. The proposed detection system consists of several sensors
established at different positions in a domestic area. With BLE beacons attached, and
based on the captured raw received signal strength indicator, the user's specific
location within the building is captured. For determining the indoor location, a
trilateration-based and a fingerprint-based method are suggested. To verify and test
the system's performance, simulations were done in different environments. Based on
the results, it has been observed that accurate information on the user's location
can be obtained, allowing the user's lifestyle to be tracked, through which the
user's health condition can be monitored and judged. The work also demonstrated
that the placement location of the BLE beacons and the quality of the beacons do
not have a huge impact on accuracy [3].
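The least-RMS-error fingerprint matching used in [1] can be sketched in a few lines. This is only an illustrative sketch; the reference points, beacon count, and RSS values below are invented for illustration and do not come from the cited work.

```python
import math

def rms_error(measured, fingerprint):
    """Root-mean-square difference between a measured RSS vector
    and a stored fingerprint (one RSS value per beacon)."""
    return math.sqrt(
        sum((m - f) ** 2 for m, f in zip(measured, fingerprint)) / len(measured)
    )

def locate(measured, fingerprint_db):
    """Return the reference point whose stored fingerprint has the
    least RMS error against the measured RSS vector."""
    return min(fingerprint_db, key=lambda p: rms_error(measured, fingerprint_db[p]))

# Hypothetical radio map: RSS (dBm) from three beacons at two reference points.
radio_map = {
    "hallway": [-60, -75, -80],
    "kitchen": [-82, -58, -70],
}
print(locate([-61, -74, -79], radio_map))  # closest to the "hallway" fingerprint
```

Averaging several such measurements before matching, as done in [1], further reduces the positioning error.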
20.3 Methodology
This section presents the proposed system, which consists of the following stages.
Initialization of the mobile application.
Navigation to the second screen.
Initializing the iTags.
Connection of the required iTag.
Finding the required object.
A. Initialization of the mobile application
The name of the mobile application we developed is "I AM HERE". To
initialize the mobile application, simply open it; the initial screen of the
application is displayed. The user also needs to enable Bluetooth on their
smartphone (Fig. 20.1).
B. Navigation to the second screen
In order to navigate to the second screen, the user clicks the "go" button
widget. Since the entire activity takes place on the second screen, the user needs
to navigate to it.
C. Initializing the iTags
Now, the iTags are attached to the frequently lost or misplaced objects
and initialized by giving them a long press of approximately three seconds.
D. Connection of the required iTag
All the iTags that fall within the smartphone's range of reachability,
that is, all the iTags that can be detected by the smartphone, are displayed
on the screen. The user selects the particular iTag he or she intends to find
by simply clicking on the detected iTag. After this step, a connection is
established between the selected iTag and the mobile application.
E. Finding the required object
Now, in order to find the required object, the user clicks the "alarm" button
widget. This results in the generation of a beep sound from the iTag attached to the
object. The user then reaches the object by moving in the direction of the sound.
The alarm button widget is clicked again to stop the beep sound (Fig. 20.2).
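The connect/alarm flow of steps D and E can be summarized as a small state machine. This is only an illustrative Python sketch with hypothetical names; the actual app is assembled from Kodular blocks rather than written in code.

```python
class ITagController:
    """Illustrative sketch of the second-screen logic described above
    (hypothetical class and method names)."""

    def __init__(self):
        self.connected_tag = None
        self.alarm_on = False

    def connect(self, tag_mac):
        # In the app, the user taps a detected iTag and then "connect".
        self.connected_tag = tag_mac
        return "connected"          # connect button turns green

    def toggle_alarm(self):
        if self.connected_tag is None:
            raise RuntimeError("no iTag connected")
        self.alarm_on = not self.alarm_on
        # Red while the iTag beeps, blue once the object is found.
        return "red/beeping" if self.alarm_on else "blue/silent"

ctl = ITagController()
print(ctl.connect("AA:BB:CC:DD:EE:FF"))  # connected
print(ctl.toggle_alarm())                # red/beeping
print(ctl.toggle_alarm())                # blue/silent
```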
Fig. 20.1 Flowchart representing the proposed methodology
20.4 Implementation
20.4.1 System Components
Different components are used in the application: Kodular, Bluetooth
technology, and the iTag.
Kodular Kodular allows the development of Android applications by
providing modules that require no coding skills. Kodular offers features as compo-
nents and blocks through which interactive applications can be developed. Material
Design UI is a Kodular module that enhances the user interface. In addition,
Fig. 20.2 Proposed system diagram
Kodular enables sharing and distribution of applications on a free online app store
that contains apps developed using Kodular. The IDE is a service that permits users
to create extensions and distribute apps. The status page specifies which services
are available for usage and which are under maintenance. In the Kodular community,
users interact, generate, and refine ideas. Projects developed on Kodular can be
viewed and stored using My Kodular, the control panel of a Kodular account. The
"I AM HERE" application is developed on Kodular.
Bluetooth Bluetooth is a prominent wireless technology that enables the transfer
of information between devices. Bluetooth uses UHF radio waves in the ISM bands,
ranging from 2.402 to 2.48 GHz, and can be attained by embedding low-cost
transceivers within devices. The Bluetooth connectivity range is about 30 ft.
Bluetooth consumes low power and is cost-effective. Bluetooth sends and receives
signals on 79 different channels centered around 2.4 GHz. Bluetooth devices connect
automatically upon detection and allow a minimum of two and a maximum of eight
devices to communicate at once, of which one device acts as the master and the
others as slaves. The master device initiates the communication and controls the
traffic between itself and the connected slave devices; the slave devices respond
to the master. The device roles can be switched between master and slave. The master
device's Bluetooth device address (BD_ADDR) determines the frequency-hopping
sequence.
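As an illustration of address-seeded hopping, the sketch below derives a deterministic pseudorandom sequence over the 79 channels from a BD_ADDR string. Note this is a conceptual sketch only: real Bluetooth derives the hop sequence from the master's address and clock via an algorithm defined in the specification, not a general-purpose PRNG.

```python
import random

def hop_sequence(bd_addr: str, hops: int):
    """Illustrative frequency-hopping sketch: the same master address
    always yields the same deterministic sequence of channels 0..78."""
    rng = random.Random(bd_addr)  # seed with the master's BD_ADDR
    return [rng.randrange(79) for _ in range(hops)]

seq = hop_sequence("00:1A:7D:DA:71:13", 5)
print(seq)
# The sequence is reproducible from the same address:
assert seq == hop_sequence("00:1A:7D:DA:71:13", 5)
```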
iTag The iTag is a Bluetooth 4.0, low-energy product that functions through the
"I AM HERE" application. An iTag can be attached to the things we usually lose or
tend to misplace, and it works with a smartphone to prevent loss. In addition, an
iTag can provide a last-seen pin drop on a map to assist in recovering items. It is
compact, consumes little energy, supports Bluetooth 4.0, and has an effective range
of up to 25 m. It functions like an anti-lost alarm (Fig. 20.3).
Brand and model: iTag.
Compatible mobile phones: any iPhone from the 4th generation onward and other
Apple handheld devices (touch, Air, iPad); any Android device with Bluetooth 4.0
(Android 4.3 and upgraded versions).
Fig. 20.3 iTag
Battery: CR2032 lithium coin battery.
Standby time: six months.
20.4.2 Application Implementation
The home page is displayed when the application is initialized. The go button is
used to navigate to the other screen, where the actual activity takes place, as
shown in Fig. 20.4.
The detected iTags are displayed on the screen along with their MAC addresses, as
shown in Fig. 20.5.
To establish a connection with an iTag, the required iTag is selected, and then
the connect button is clicked. After that, a message "connecting..." is displayed
on the screen, as shown in Fig. 20.6.
Once the iTag has been connected successfully, the connect button changes its color
to green and its text to "connected", indicating that the connection has been
established and the iTag is ready to be utilized, as shown in Fig. 20.7.
To find the object's location, the alarm widget is clicked. When the alarm widget
is clicked, its color changes to red, and the iTag begins to produce a sound
continuously. This helps the user find the object, as shown in Fig. 20.8.
The beep sound is produced to track the location of the object; when the object is
found, the user can tap the alarm widget again, after which the iTag stops producing
the sound and the color of the alarm widget changes to blue, indicating that the
required object has been found, as shown in Fig. 20.9.
Fig. 20.4 Application home screen
20.5 Conclusion
The application mainly focuses on identifying lost items. The proposed appli-
cation helps in tracking misplaced objects through Bluetooth technology via a
mobile app, with the help of iTags attached to these items. The functionality of
the application can be extended as per the requirements of the client. The tags and
the app were donated to one of the old age homes, and we received satisfactory
feedback from our clients. The beep sound is clearly audible, making it quite easy
to locate the objects. Further, this application can be improved by including the
location of the object, so that the object's location is displayed on a map, making
it much easier to find. Voice recognition features can also be implemented to
connect the iTags by recognizing their names through user instructions. This is
specifically helpful for those who are not comfortable using a keypad or a touch
interface.
Fig. 20.5 Application detected iTag
Fig. 20.6 Connecting an iTag
Fig. 20.7 iTag connected
Fig. 20.8 Selecting alarm button
Fig. 20.9 Sound produced
References
1. Phutcharoen, K., Chamchoy, M., & Supanakoon, P. (2020). Accuracy study of indoor posi-
tioning with Bluetooth low energy beacons. In 2020 Joint International Conference on Digital
Arts, Media and Technology with ECTI Northern Section Conference on Electrical, Electronics,
Computer and Telecommunications Engineering (ECTI DAMT & NCON) (pp. 24–27).
2. Adjei, H. A. S., Oduro-Gyimah, F. K., Shunhua, T., Agordzo, G. K., & Musariri, M. (2020).
Developing a Bluetooth based tracking system for tracking devices using Arduino. In 2020
5th International Conference on Computing, Communication and Security (ICCCS) (pp. 1–5).
3. Bai, L., Ciravegna, F., Bond, R., & Mulvenna, M. (2020). A low cost indoor positioning system
using Bluetooth low energy. IEEE Access, 8, 136858–136871.
4. Bapat, A. C., & Nimbhorkar, S. U. (2016). Designing RFID based object tracking system by
applying multilevel security. In International Conference on Wireless Communications, Signal
Processing and Networking (WiSPNET).
5. Samual, J. (2015). Implementation of GPS based object location and route tracking on Android
device. International Journal of Information System and Engineering, 3(2). ISSN: 2289-7615.
6. Gaikwad, P. V., & Kalshetty, Y. R. (2017). Bluetooth based smart automation system using
Android. International Journal of New Innovations in Engineering and Technology.
7. Ahmad, S., Rouyu, L., & Hussain, M. J. (2014). Never lose! Smart phone based personal
tracking via Bluetooth. International Journal of Academic Research in Business and Social
Sciences, 4(3). ISSN: 2222-6990.
8. Bisio, I., Sciarrone, A., & Zappatore, S. (2015). Asset tracking architecture with Bluetooth low
energy tags and ad hoc smartphone applications. In European Conference on Networks and
Communications (EuCNC).
9. Tiwari, M., & Singhai, R. (2017). A review of detection and tracking of object from image
and video sequences. International Journal of Computational Intelligence Research, 13(5),
745–765. ISSN: 0973-1873.
10. Coustasse, A., Tomblin, S., & Slack, C. (2013). Impact of radio-frequency identification
(RFID) technologies on the hospital supply chain: A literature review. Perspectives in Health
Information Management, 10, 1d.
11. Friesen, M. R., & McLeod, R. D. (2015). International Journal of Intelligent Transportation
Systems Research, 13(3), 143–153.
12. Kalaiselvi, K., & Karunya, S. (2017). Tracking system—A proposed model on literature
review. In 2017 International Conference on Inventive Computing and Informatics (ICICI)
(pp. 766–770).
Chapter 21
Shrimp Surfacing Recognition System
in the Pond Using Deep Computer Vision
Gadhiraju Tej Varma and Sri Krishna Adusumalli
Abstract Shrimp productivity has greatly increased and has high economic value,
along with its impact on the GDP of our country. Considering this impact on
farmers, this paper proposes a convolutional neural network model that assists the
farmer in understanding the symptoms that shrimp exhibit during critical
conditions such as viral infections or a decrease in dissolved oxygen levels in the
water. The proposed system also synthesizes a set of images using image data
augmentation techniques to obtain a considerably sized image set. These images
are used as the training images for the model. The custom shrimp detection system
uses the faster_rcnn_inception_v2_coco model, which effectively detects shrimp and
marks them with bounding boxes. Whenever surfacing-like symptoms are
exhibited, the farmer is alerted to identify the risk factor and take counter-
measures.
21.1 Introduction
Indian aquaculture and fisheries are not only an important source of nutritional nour-
ishment for the people but also contribute to the economic and financial well-being
of around 28 million people. India is also the third largest producer and contributes
7.7% of global fish products. An area of 2.36 million ha of tanks and ponds supports
culture-based fisheries that contribute to total fish production. The current produc-
tion is around 8.5 million MT and is targeted to reach 13.5 million MT. The National
Fisheries Policy 2020 states the goal of optimizing the capture and culture
fisheries potential of our country, which in turn contributes to the economic well-
being of the country as well as the farmer [1]. More than 50 different varieties of
G. Tej Varma (B)
Department of Computer Science and Engineering, Centurion University of Technology and
Management [CUTM], Vizianagaram, Andhra Pradesh 535003, India
e-mail: tejvarma.varma@gmail.com
S. K. Adusumalli
Department of AI, Shri Vishnu Engineering College for Women (A), Bhimavaram, West
Godavari, Andhra Pradesh 534202, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
B. N. K. Rao et al. (eds.), Intelligent Computing and Applications, Smart Innovation,
Systems and Technologies 315, https://doi.org/10.1007/978-981-19-4162-7_21
shrimp and fish are cultivated and exported to different countries, prominently
including tiger shrimp (Penaeus monodon), Pacific whiteleg shrimp (Penaeus
vannamei), Indian white shrimp (Penaeus indicus), etc. During the year 2020–21, the
export of frozen shrimp and fish was worth Rs. 36,946.48 crores in value. In addition
to these numbers, a target of 5 million tonnes of productivity has been set for the
year 2022 [2]. In spite of these remarkable numbers, the gray area is the influence
of disease and its spread, which causes huge losses to the farmer.
One of the major products of aquaculture is shrimp, and the cultivation dura-
tion for each crop ranges from 3 to 4 months, depending entirely on the
geographical location and the soil where the shrimp is grown. A lot of effort
and investment is involved, ranging from 5 to 10 lakhs per acre. The water
parameters that influence shrimp health and productivity are the dissolved
oxygen, pH, ammonia, and minerals present in the water. Apart from these parame-
ters, there are various other influences that affect the health of the shrimp,
including virus attacks. The most common early symptom of such virus attacks or
a lack of dissolved oxygen is that the shrimp reach the surface of the water.
The table below gives a brief overview of different viral, bacterial, and fungal
infections of shrimp and the clinical signs or symptoms that the shrimp exhibit
during the affected period [3] (Table 21.1).
These symptoms require close monitoring by the farmer to understand the
condition of the shrimp, which becomes a tedious job even when CCTV cameras are
fitted around the pond. In this context, we have developed a new solution for
monitoring the pond with the help of CCTV cameras and an image recognition model.
Table 21.1 Various viral or bacterial infections and symptoms for the shrimp

Names of viral or bacterial infections: Clinical signs or symptoms
Hepatopancreatic parvo-like virus (HPV): reduced feed, poor growth, body surfacing
Yellow head disease (YHD): appears swimming slowly near the surface of the pond
Luminous vibriosis: vertical swimming behavior
Vibriosis: corkscrew swimming behavior; appears at the edge of the pond
Soft shell syndrome: appears toward the edge of the ponds
Decrease in dissolved oxygen (4–10 ppm): appears on the surface of the water when DO is less than 4 ppm
21.2 Literature Survey
There is a lot of research taking place in deep learning and image
processing along with IoT. Dah-Jye Lee, Guangming Xiong, Robert M. Lace, and
Dong Zhang describe grading and shrimp quality detection in the shrimp shipping
process, which also detects whether a whole shrimp body or a broken part of a shrimp
is present [4]. Dmitry A. Konovalov, Alzayat Saleh, Dina B. Efremova, Jose A.
Domingos, and Dean R. Jerry describe a system that can automate the weighing of
harvested fish, trained with 100 whole fish and 200 defective fish without tails or
fins [5]. Zihao Liu, Fang Cheng, and Wei Zhang used a deep learning technique to
identify soft shells in shrimps as they pass along a conveyor belt; they
experimented with images of 200 shrimps with soft shells and 200 shrimps in good
condition and developed a neural network model for automatically identifying
shrimps with soft or loose shells [6]. Weber, V. A. M., Weber, F. L., Gomes, R. C.,
Oliveira Junior, A. S., Menezes, G. V., Abreu, U. G. P., Belete, N. A. S., and
Pistori, H. proposed a model for identifying the body weight of cattle by
experimenting on 34 cattle; using the dorsum image and lateral body area, features
such as hip width, rib height, and other parameters are used to identify the weight
of the cattle [7].
Wu-Chih Hu, Hsin-Te Wu, Yi Fan Zhang, Shu Huan Zhang, and Chun Hung Lo
describe a two convolutional neural network architecture with fully connected
layers, which can successfully detect six different types of shrimp with 95.48%
accuracy [8]. Takehiro Morimoto, Thi Thi Zin, and Toshiaki Itami discuss the
detection of infected shrimp through various abnormal behaviors such as low food
intake, appearing in shallow waters, or making sudden movements inside the water,
but this may not be effective for large ponds with high shrimp density [9]. Joseph
Lemley, Shabab Bazrafkan, and Peter Corcoran propose a new smart augmentation
technique that showed a positive sign of increasing accuracy and reducing network
losses by creating a network that learns while creating augmented data [10]. Ryo
Takahashi, Takashi Matsubara, and Kuniaki Uehara propose a new image data
augmentation method for deep convolutional neural networks called random image
cropping and patching (RICAP), in which a new image is created by taking portions
of several images [11].
21.3 Proposed Work
The figure below gives an architectural overview of the proposed system. The
deep learning architecture model is broadly divided into two modules: one module
synthesizes the required image data set through image data augmentation, and the
other module uses convolutional neural networks to train the machine to detect
shrimp on the surface of the water (Fig. 21.1).
Fig. 21.1 Block diagram of proposed system
21.3.1 Image Data Augmentation
As mentioned earlier, the first module creates the image data set. It is a complex
job to capture images of shrimp on the surface of the water, so we managed to
collect a few images from a nearby pond while the shrimp were exhibiting such symp-
toms. Unfortunately, it is not possible to train the machine using convolutional
neural networks with such a small data set. In order to amplify the number of
images, we made use of data augmentation techniques. Data augmentation is a
technique used to artificially create new training data from an existing small set
of training data. Image data augmentation is the most renowned data augmentation
approach; it applies various transformations to an image to create a new artificial
data set. The figure above depicts the various image augmentation techniques. We
make use of geometric transformations, which are among the basic image manipula-
tion techniques. The original captured image has dimensions of 4032 × 3024, and
the synthesized image generated is also 4032 × 3024.
Fig. 21.2 Before (left side) and after (right side) shift augmentation
21.3.2 Horizontal and Vertical Shift Augmentation
While keeping the image dimensions as same, this shifts the pixel of the image in a
either horizontal or vertical direction. A part of the image is been clipped, and the
new values of the replaced pixels are same the last pixel value in the row or column,
respectively. The left part of the image is the original, and right is after transformation
(Fig. 21.2).
21.3.3 Horizontal and Vertical Flip Augmentation
In this case of image augmentation, the pixels of the image are flipped with
respect to the horizontal or vertical axis (Fig. 21.3).
Fig. 21.3 Before (left side) and after (right side) flip augmentation
Fig. 21.4 Before (left side) and after (right side) random rotation augmentation
21.3.4 Random Rotation Augmentation
The image transformation undergoes a rotation of pixels with some rotation angle
which is ranging from 0 to 360 ° (Fig. 21.4).
In each of the augmentation techniques, the left image indicates the original image,
and right image indicates an example of an image with augmentation techniques. In
total from the available set of 20 original images, each image is further synthesized
to 50 images, and we all put together which brings up the image count to a 1000
images.
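The shift, flip, and rotation transformations described above can be illustrated on a tiny pixel grid. This pure-Python sketch mirrors the clipping behavior described for shift augmentation (replaced pixels take the last pixel value in the row); it is an illustration only, not the library code used in the actual pipeline.

```python
def hshift(img, dx):
    """Horizontal shift: clipped pixels are replaced by each row's last value."""
    return [row[dx:] + [row[-1]] * dx for row in img]

def hflip(img):
    """Horizontal flip: reverse each row of pixels."""
    return [row[::-1] for row in img]

def rot90(img):
    """Rotate the pixel grid 90 degrees clockwise."""
    return [list(col) for col in zip(*img[::-1])]

img = [[1, 2, 3],
       [4, 5, 6]]
print(hshift(img, 1))  # [[2, 3, 3], [5, 6, 6]]
print(hflip(img))      # [[3, 2, 1], [6, 5, 4]]
print(rot90(img))      # [[4, 1], [5, 2], [6, 3]]
```

In practice, each 4032 × 3024 image is processed the same way, just with far more pixels; applying several such transformations per image is how 20 originals grow to 1000 training images.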
Module 2: Custom object detection for shrimp detection.
The second module is a customized object detection model using deep convo-
lutional neural network techniques, in which we use image processing along with
deep learning to train the system for shrimp detection. The first step involves the
installation of libraries in Anaconda and the dependent model from GitHub.
Following the installation process, it is important to annotate the images produced
by image data augmentation with the help of the labelImg package, which automates
the process and generates a .xml file for each annotated image. A green rectangular
box marks each annotated shrimp surfacing on the water.
Once the .xml files have been generated and the annotation is complete, it is impor-
tant to partition the images into a training set and a test set. To train and create
the model, we use the training set, and to evaluate the developed model, we test it
with the remaining images, known as the test set. The images are partitioned in a
90–10% ratio, meaning 90% for training the model and 10% for evaluating it. The
ratio can also be taken as 80–20%. Apart from these partitions, a label map is also
created, which indicates the desired object in the annotations. In the proposed
system, we are training the model to identify only shrimp in the images. The next
step involves the creation of the TensorFlow records from the obtained .xml
annotation files. Initially, the .xml files are converted into two .csv files based
on the distribution of the training and the test sets. These
Table 21.2 List of various convolutional neural networks versus speed of respective models

Name of RCNN model (speed in ms):
Faster_rcnn_inception_v2_coco: 58
Faster_rcnn_resnet50_coco: 89
Faster_rcnn_resnet50_lowproposals_coco: 64
Faster_rcnn_inception_resnet_v2_atrous_coco: 620
Faster_rcnn_resnet101_coco: 106
Mask_rcnn_inception_v2_coco: 79
Mask_rcnn_resnet50_atrous_coco: 343
CSVs are then taken as inputs and converted into the TFRecord format, producing
the TensorFlow records. For the creation of the training pipeline model, we
consider the faster_rcnn_inception_v2_coco model, which is one of the most powerful
R-CNN models available. The table above gives a glimpse of the various R-CNN models
available on GitHub for creating the training model. From the numbers given,
it is evident that the faster_rcnn_inception_v2_coco model is the fastest of
these convolutional neural networks (Table 21.2).
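The annotation-to-CSV step can be sketched as follows. The XML snippet imitates a labelImg (Pascal VOC style) file with made-up coordinates, and xml_to_rows is a hypothetical helper, not the actual conversion script used in the pipeline.

```python
import xml.etree.ElementTree as ET

# Hypothetical labelImg output (Pascal VOC style) for one annotated frame.
VOC_XML = """
<annotation>
  <filename>pond_001.jpg</filename>
  <size><width>4032</width><height>3024</height></size>
  <object>
    <name>shrimp</name>
    <bndbox><xmin>120</xmin><ymin>340</ymin><xmax>260</xmax><ymax>480</ymax></bndbox>
  </object>
</annotation>
"""

def xml_to_rows(xml_text):
    """Flatten one annotation file into CSV-style rows, one per bounding box,
    as done before converting the CSVs into TFRecords."""
    root = ET.fromstring(xml_text)
    fname = root.findtext("filename")
    w = int(root.findtext("size/width"))
    h = int(root.findtext("size/height"))
    rows = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        rows.append((fname, w, h, obj.findtext("name"),
                     int(box.findtext("xmin")), int(box.findtext("ymin")),
                     int(box.findtext("xmax")), int(box.findtext("ymax"))))
    return rows

print(xml_to_rows(VOC_XML))
# [('pond_001.jpg', 4032, 3024, 'shrimp', 120, 340, 260, 480)]
```

One row is emitted per bounding box, so an image with several surfacing shrimp contributes several rows to the training or test CSV.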
Upon downloading the required model from GitHub along with the .config
file, we customize the file with the modifications necessary to run the
model on the image data set that has already been labeled using the image
annotations. Running the model takes a large amount of time depending on the
hardware available. A GPU-based system takes considerably less time compared with
CPU systems; CPU systems may take from a few hours to days, depending also on the
model and the number of steps involved. For a stronger model, a large number of
steps is typically required. At regular intervals, the model creates a checkpoint
named model_<number of steps>.ckpt. Various graphs are generated for each step
based on the precision, learning rate, loss, etc. It is evident from the graphs
that the total number of steps involved in training the model is 42 thousand. In
order to develop a stronger model, we can continue training with more steps. The
loss function for identifying the shrimp in an image can be written as
L(\{p_i\},\{t_i\}) = \frac{1}{N_{cls}} \sum_i L_{cls}(p_i, p_i^*) + \lambda \frac{1}{N_{reg}} \sum_i p_i^* L_{reg}(t_i, t_i^*)
where i indicates the index of an annotation in the image and p_i is the probability of annotation i being an object. p_i^* is the ground-truth label, denoted by 0 if the annotation is negative and 1 if it is positive; N_cls and N_reg are normalization terms, and λ balances the two loss components. There are also various tables which represent the learning curve of the model. The final step of the custom object detection for the shrimp is to evaluate the developed model. This makes use of the 10% of evaluation images which are partitioned prior to training. Intersection-over-union (IoU) values can also be evaluated using the ground-truth bounding boxes and the predicted bounding boxes.
224 G. Tej Varma and S. K. Adusumalli
IoU = Overlapping Area / Combined Area
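Given the corner-format boxes from the annotations (Xmin, Ymin, Xmax, Ymax), the ratio above can be computed directly; a minimal sketch:

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (xmin, ymin, xmax, ymax)."""
    xa, ya = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    xb, yb = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    overlap = max(0, xb - xa) * max(0, yb - ya)  # zero when boxes are disjoint
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return overlap / float(area_a + area_b - overlap)
```

Identical boxes give 1.0, disjoint boxes give 0.0, and partial overlaps fall in between.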
21.4 Results and Analysis
For the development of this work, we made use of a MacBook Air with a 1.6 GHz Intel Core i5 processor, 8 GB 1600 MHz DDR3 RAM, and Intel HD Graphics 600 with 1536 MB; the original images were collected with an Apple iPhone 7 with RGB color space and Display P3 profile. As mentioned, the dimensions of all captured images are 4032 × 3024. Annotation is very useful in object classification of an image; bounding boxes are used in the annotation process and also for training the model. The screenshot below shows the annotation, done by marking green boxes on the image. Each image may contain more than one box representing the presence of a shrimp. To mark a rectangular box, we require Xmin, Ymin, Xmax, and Ymax; in the same way, an image may have more than one rectangular box to be represented. This annotation process has to be done for all the partitioned training and test images. An .xml file is generated from each image after annotation with the help of the labeling package (Fig. 21.5).
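The .xml files written by the labeling tool follow the Pascal VOC layout, with one <object> element per box holding a <bndbox> with xmin/ymin/xmax/ymax. Assuming that layout, reading the boxes back is a few lines:

```python
import xml.etree.ElementTree as ET

def read_voc_boxes(xml_string):
    """Extract (name, xmin, ymin, xmax, ymax) tuples from a Pascal VOC
    annotation string, the format written by common labeling tools."""
    root = ET.fromstring(xml_string)
    boxes = []
    for obj in root.iter("object"):
        bb = obj.find("bndbox")
        boxes.append((
            obj.findtext("name"),
            int(bb.findtext("xmin")), int(bb.findtext("ymin")),
            int(bb.findtext("xmax")), int(bb.findtext("ymax")),
        ))
    return boxes
```

This is the same parsing the CSV-conversion step performs before the records are generated.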
After annotation and partitioning into the training and test sets, creating the label map is the step where the item classes present in the images are defined. In the model we are creating, we have a single entity, the shrimp, so we had only one item and id. As mentioned, for creating the training model, the
Fig. 21.5 Shrimp detection using proposed model
21 Shrimp Surfacing Recognition System in the Pond Using 225
faster_rcnn_inception_v2_coco model is used, and on the system we used it took around 2 days to complete training for 42,000 steps. The first two graphs show the total mean average precision and mAP (large); the second set of graphs shows the various losses, such as box classifier loss and localization loss, from the training model; and the third set indicates the consistent learning rate and loss rate throughout training (Fig. 21.6).
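For reference, a single-class label map of the kind described above, in the .pbtxt format the TensorFlow Object Detection API expects (the class name shown is the one the text implies):

```
item {
  id: 1
  name: 'shrimp'
}
```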
After the training model is created, it has to be evaluated using the test set partitioned just after annotating the images. The images below are from the test set and show all the shrimps with their bounding boxes (Fig. 21.7). We also obtain the bounding-box values, represented as Ymin, Xmin, Ymax, and Xmax, respectively, in the image below.
Fig. 21.6 Performance of annotation for image partition
Fig. 21.7 Shrimp detection using training data
21.5 Conclusion
The proposed model uses a convolutional neural network-based deep learning architecture for the shrimp surfacing detection system, in order to assist farmers, considering the economic impact and the difficulty of surveilling the pond throughout the lifetime of the crop. The proposed model produces good results using faster_rcnn_inception_v2_coco to identify shrimp with bounding boxes on images. In this work we studied shrimp detection in shallow water as well as surfacing. We would like to enhance the model by developing an IoT device that alerts the farmer, and the system, which currently works on captured images, can be enhanced to detect shrimp in live video streams so that it can be integrated directly with CCTV cameras.
Chapter 22
Sign Language Recognition for Needy
People Using Machine Learning Model
Pavan Kumar Vadrevu, M. R. M. Veeramanickam, Sri Krishna Adusumalli,
and Sasi Kumar Bunga
Abstract In this digital era, gesture identification is one of the revolutions to serve specially-abled (deaf and mute) people, and it has been investigated for several decades. Unfortunately, existing research studies have their restrictions and are not ready for commercial use. A few studies have been recognized as effective for gesture identification, but they require a high cost to be marketed. Currently, researchers are giving more consideration to developing gesture identification systems that can be used commercially. Scholars conduct their studies in innumerable ways, starting from the data acquisition approach. Data acquisition approaches differ because a good device is costly, whereas a low-cost device is needed for a hand gesture identification system to be made saleable. The approaches used to implement such systems are diverse between studies; each approach has its strong points compared to the others, and research is still using different approaches to develop gesture identification, with each approach also having setbacks compared to others. This manuscript intends to review gesture identification systems for needy people, so that other studies can get more information about the approaches used and develop better applications in the future.
22.1 Introduction
The problem statement of this manuscript is that mute people cannot communicate with other people directly; they can communicate only with the help of sign language. As most people do not know the sign language that specially-abled people communicate with, the key objective of the proposed model is to predict the given sign, so that it helps us communicate with them easily. Even though we get the sign as written
P. K. Vadrevu (B) · M. R. M. Veeramanickam · S. K. Bunga
Department of Information Technology, Shri Vishnu Engineering College for Women
(Autonomous), Bhimavaram, India
e-mail: vadrevu.pavan@svecw.edu.in
S. K. Adusumalli
Department of AI, Shri Vishnu Engineering College for Women (Autonomous), Bhimavaram,
India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
B. N. K. Rao et al. (eds.), Intelligent Computing and Applications, Smart Innovation,
Systems and Technologies 315, https://doi.org/10.1007/978-981-19-4162-7_22
228 P. K. Vadrevu et al.
text, people who do not know how to read will still have a problem. If we give it only in speech format, some people cannot understand any language other than their native one (for example, Telugu). Considering this problem statement, the authors experimented with a sign detector that detects the sign and displays it as both English and Telugu text and also as Telugu speech. The resulting system is very simple to understand and interpret.
22.2 Relevant Study
In the existing systems we have sign detectors, but some people cannot understand any language other than their native one; a few systems convert the sign into text as well as speech, but not into Telugu. The existing hand gesture recognition systems recognize hand gestures statically: every hand gesture has to be captured manually by a person and sent into the system to identify the gesture [1]. People also have to control all the operations and conditions in the environment, which increases manual work, and the existing systems need special requirements and working conditions. Another existing system uses distinct data gloves. A data glove is an interactive device, resembling a glove worn on the hand, which enables tactile sensing and fine-motion control in robotics and virtual reality [2]. Data gloves are one of several types of electromechanical devices used in haptic applications. Motion control involves the use of a sensor to detect the movement of the user's hand and fingers, and the translation of these motions into signals that can be used by a virtual hand (for example, in gaming) or a robotic hand (for example, in remote-control surgery). This method uses devices (mechanical or optical) attached to a glove that transduce finger flexion into an electrical signal for determining the hand position [2].
22.2.1 Pros and Cons of the Existing and Planned Models
The existing state-of-the-art systems are not helpful for users who know only their own native language: it is difficult to get the sign in Telugu speech and text formats, and it is tough to communicate with specially-abled people. It is not always possible to capture gestures with the state-of-the-art systems; every model needs a plain, uniform background and is not portable, and system functionality drops as the distance between the user and the camera increases. In the proposed system, we develop the model to reduce these drawbacks. We created a sign detector so that the detected sign is given in both Telugu speech and Telugu text. If no hand is placed for detection, the system shows a warning that the hand is not placed. If a valid sign is given, the sign is predicted; if the sign is not valid, it is not recognized. The system detects the hand gesture and identifies it as a sign. This
22 Sign Language Recognition for Needy People 229
sign will be converted into both Telugu text and Telugu speech. The advantages of this model are improved accuracy and easier maintenance compared with the state-of-the-art models. It helps people understand sign language better, and it helps users who cannot read or who do not understand languages other than Telugu.
22.3 Proposed Method
22.3.1 Data Preparation
We use a total of six different hand gestures. We create six different folders, one per class, in the directory, with each folder name representing the number of fingers in the hand gesture, i.e., from 0 to 5. During setup, we generate our data set by capturing the live camera feed with the system's camera [3]. While the camera captures continuously, we press the corresponding folder name on the keyboard, which saves the current frame into the corresponding directory. Algorithm:
Step 1: The camera input frame is converted into a grayscale image.
Step 2: This image is then thresholded and converted into a binary image.
Step 3: The resulting image has just one channel, which makes the CNN learn more easily.
Step 4: Image augmentation is applied to all the collected data to increase the size of the training data set.
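Steps 1–3 can be sketched without OpenCV using plain NumPy; the luminance weights and threshold value here are illustrative assumptions, not values from the paper:

```python
import numpy as np

def preprocess(frame, thresh=127):
    """Convert an RGB frame to grayscale, then threshold it into a
    single-channel binary image (steps 1-3 of the algorithm)."""
    # Standard RGB -> luminance weights (assumed; any grayscale map works)
    gray = frame @ np.array([0.299, 0.587, 0.114])
    binary = (gray > thresh).astype(np.uint8) * 255
    return binary[..., np.newaxis]  # keep an explicit single channel
```

The single-channel output is what would be fed to the CNN in step 3.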
CNNs are part of deep learning techniques. A neural network is a layered processing algorithm containing input, output, and other intermediate layers; the intermediate layers include processing layers such as convolution, pooling, recurrent, dropout, noise, normalization, etc. [4]. This model consists of layers that process images quickly and learn features, depending on how long training runs and how many samples are trained. "Keras" is the built-in neural network library that is imported to make the CNN work on the system [4]. The architecture below gives the complete details of the model, which is used to convert the given text into voice (Fig. 22.1).
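The convolution and pooling layers named above can be illustrated in a few lines of NumPy. This is a didactic sketch of the two operations, not the Keras model itself:

```python
import numpy as np

def conv2d(img, kernel):
    """Valid-mode 2-D correlation over a single-channel image (the
    convolution used in CNNs, which skips the kernel flip)."""
    kh, kw = kernel.shape
    h, w = img.shape[0] - kh + 1, img.shape[1] - kw + 1
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = (img[i:i + kh, j:j + kw] * kernel).sum()
    return out

def max_pool(img, size=2):
    """Non-overlapping max pooling, shrinking each spatial dimension."""
    h, w = img.shape[0] // size, img.shape[1] // size
    return img[:h * size, :w * size].reshape(h, size, w, size).max(axis=(1, 3))
```

Stacking such layers (with nonlinearities) is exactly what the Keras model assembles internally.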
Hand Segmentation from the Background: A histogram method is used to separate the hand from the background image. Background elimination techniques are used to obtain the finest result (Fig. 22.2).
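A minimal background-elimination sketch (the histogram refinement is omitted, and the difference threshold is an assumed value):

```python
import numpy as np

def segment_hand(frame, background, diff_thresh=30):
    """Simple background subtraction: pixels that differ from a captured
    background frame by more than diff_thresh are kept as the hand mask."""
    diff = np.abs(frame.astype(int) - background.astype(int))
    return (diff > diff_thresh).astype(np.uint8) * 255
```

In practice a background frame is captured once before the hand enters the scene, and the mask is then cleaned up morphologically.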
Fig. 22.1 Proposed model
Fig. 22.2 System architecture
22.4 Results and Discussion
Gesture Labeling: The identified hand is then processed and represented by finding contours and a convex hull to identify finger and palm positions and dimensions. Finally, a gesture object is formed from the recognized pattern, which is compared against a defined gesture dictionary.
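The convex-hull step can be illustrated with Andrew's monotone-chain algorithm over 2-D contour points, a stand-in for the OpenCV routine typically used here:

```python
def convex_hull(points):
    """Andrew's monotone-chain convex hull over (x, y) tuples: the kind of
    geometric step used to outline the hand before counting fingers."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for seq, out in ((pts, lower), (reversed(pts), upper)):
        for p in seq:
            # pop points that would make a clockwise (non-left) turn
            while len(out) >= 2 and cross(out[-2], out[-1], p) <= 0:
                out.pop()
            out.append(p)
    return lower[:-1] + upper[:-1]
```

Fingertips then appear as hull vertices far from the palm centroid, and the gaps between fingers as convexity defects.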
The experimentation is conducted by collecting the data set from different Internet sources. The collected data set undergoes training with the model, and this
Fig. 22.3 Test case specification
can be tested using a ten-fold cross-validation technique to assess the model's efficiency and accuracy [5]. The system is implemented in a Python environment on a Windows-based machine, as Python has a rich set of built-in libraries for obtaining accurate results. Based on the obtained results, this implementation gives more accurate results compared with a few existing models. 80% of the data set is used for training and 20% for testing the model. The figure below gives the test case specification for gesture recognition with five valid test cases; the experimental results are also shown (Figs. 22.3 and 22.4).
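The ten-fold cross-validation mentioned above partitions the sample indices as sketched below (a dependency-free stand-in for scikit-learn's KFold):

```python
import random

def kfold_indices(n, k=10, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test
```

Each of the k rounds trains on k−1 folds and evaluates on the held-out fold; averaging the k accuracies gives the cross-validated estimate.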
22.5 Conclusion
This paper deals with hand gesture recognition for HCI automation. User-friendly gestures improve communication between specially-abled people and other people, and controlling things by hand is more natural, easier, more flexible, and cheaper; there is no need to fix problems caused by hardware devices, since none are required. The CNN algorithm has given an accuracy of about 94%, which indicates that it performs accurately on the given hand gestures [6–8]. The implemented system serves as an extensible foundation for future research, and
Fig. 22.4 Experimental results
extensions to the current system have been proposed. As an extension of our existing work, with the help of a high-range camera we can capture depth data, segment the hand, and then classify it using a chamfer matching method to measure the similarities between the candidate hand image and hand templates in the database [6–8]. With infrared high-resolution cameras, there is a chance of controlling appliances even during the night. A comparison study against other models needs to be conducted and can be presented in a forthcoming paper.
References
1. Wang, H., Chai, X., Zhou, Y., & Chen, X. (2015). Fast sign language recognition benefited
from low rank approximation. In: 2015 11th IEEE International Conference and Workshops
on Automatic Face and Gesture Recognition, FG 2015.
2. Tewari, D., & Srivastava, S. (2012). A visual recognition of static hand gestures in Indian
sign language based on Kohonen self-organizing map algorithm. International Journal of
Engineering and Advanced Technology (IJEAT), 2(2), 165–170.
3. Huang, J., Zhou, W., Li, H., Li, W. (2015). Sign language recognition using 3D convolutional
neural networks. In 2015 IEEE International Conference on Multimedia and Expo (ICME),
(pp. 1–6). IEEE.
4. Suryapriya, A. K., Sumam, S., & Idicula M. (2009). Design and development of a frame
based MT system for english-to-ISL. In World Congress on Nature and Biologically Inspired
Computing (pp. 1382–1387).
5. Georghiades, A. S., Belhumeur, P. N., & Kriegman, D. J. (2001). From few to many: Illumination cone models for face recognition under variable lighting and pose. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(6), 643–660.
6. Vadrevu, P. K., Adusumalli, S. K., & Mangalapall, V. K. (2020) Motion detection to preserve
personal privacy from surveillance data using contrary motion. International Journal of Recent
Technology and Engineering (IJRTE), 8(6).
7. Vadrevu, P. K., Adusumalli, S. K., & Mangalapall, V. K. (2019) A survey on personal
privacy preserving data publication in IoT. International Journal of Innovative Technology
and Exploring Engineering (IJITEE), 8(6C2). ISSN: 2278–3075.
8. Vadrevu, P. K., Adusumalli, S. K., & Mangalapalli, V. K., Survey: privacy preserving data
publication in the age of big data in IoT Era. International Journal of Engineering, Science
and Mathematics, 6(8), 938–944.
9. Ong, S., & Ranganath, S. (2005). Automatic sign language analysis: A survey and the future beyond lexical meaning. IEEE Transactions on Pattern Analysis and Machine Intelligence, 27(6), 873–891.
10. In Human Friendly Mechatronics: Selected Papers of the International Conference on Machine Automation (ICMA 2000), September 27–29, 2000, Osaka, Japan (2001), pp. 11–16.
11. Bantupalli K., & Xie, Y. (2018). American sign language recognition using deep learning and
computer vision. In 2018 IEEE International Conference on Big Data (Big Data), Seattle, WA,
USA, 2018, (pp. 4896–4899). https://doi.org/10.1109/BigData.2018.8622141.
12. Cabrera, M., Bogado, J., Fermãn, L., Acuna, R., & Ralev, D. (2012). Glove-based gesture
recognition system. https://doi.org/10.1142/9789814415958_0095.
13. He, S. (2019). Research of a sign language translation system based on deep learning. 392–396.
https://doi.org/10.1109/AIAM48774.2019.00083.
14. Herath, H. C. M., Kumari, W. A. L. V., Senevirathne, W. A. P. B., & Dissanayake, M. (2013).
Image based sign language recognition system for Sinhala sign language
15. Geetha, M., & Manjusha, U. C. (2012). A vision based recognition of Indian sign language
alphabets and numerals using b-spline approximation. International Journal on Computer
Science and Engineering (IJCSE), 4(3), 406–415.
16. Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., & Fei-Fei, L. (2009). ImageNet: a large-scale
hierarchical image database. In IEEE Conference on Computer Vision and Pattern Recognition,
CVPR 2009, (pp. 248–255). IEEE. Miami, FL, USA.
17. Feng, Z., & Jiang, Y. (2013). The review of the gesture recognition research. In Proceeding of
the Jinan University: Natural Sciences column (vol. 4, pp. 336–341).
18. Zafrulla, Z., Brashear, H., Starner, T.,et al. (2011). American sign language recognition with the
Kinect. In Proceedings of the 13th international conference on multimodal interfaces (pp. 279–
286).
Chapter 23
Efficient Usage of Spectrum by Using
Joint Optimization Channel Allocation
Method
Padyala Venkata Vara Prasad, K. V. D. Kiran, Rajasekhar Kommaraju,
and N. Gayathri
Abstract Traditionally, cognitive radio can be accessed by a secondary user only when the primary user is absent, and the secondary user must vacate the idle spectrum when the presence of the primary user is detected; hence, the bandwidth is reduced in the traditional scheme. To overcome this problem, non-orthogonal multiple access (NOMA) is used to increase spectrum efficiency in 5G communications. NOMA allows the secondary user to access the spectrum whether or not the primary user is occupying the channel. A primary-user decoding technique and a secondary-user decoding technique are introduced to decode the non-orthogonal signals. Through these decoding techniques, secondary user throughput can be achieved; to increase the primary user throughput, the sub-channel transmission power must be kept within a limit. However, due to the interference caused by the primary user, the secondary throughput may decrease. Oriented toward primary-user-first decoding and secondary-user-first decoding, we come up with two optimization problems to enhance the throughput of both the primary and the secondary user. This is done by jointly optimizing the spectrum resources (Gayatri et al. in Mobile Netw Appl 6:435–441 [1]): how much sub-channel transmission power is used and how many sub-channels are allocated. A joint optimization algorithm is introduced to solve these problems. This is achieved by receiving the signals and calculating the time needed to sense the spectrum and whether the primary user is occupying the channel, so as to decode the data sent while the primary user is absent (Ghasemi and Sousa in IEEE Commun Mag 46:32–39 [2]). The simulation outcomes show the superior transmission efficiency of NOMA-based cognitive radio.
P. V. V. Prasad (B) · K. V. D. Kiran · N. Gayathri
Department of Computer Science and Engineering, Koneru Lakshmaiah Education Foundation,
Vaddeswaram, Andhra Pradesh, India
e-mail: padyalaprasad@gmail.com
R. Kommaraju
IT Department, LakiReddy BaliReddy College of Engineering, Mylavaram, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
B. N. K. Rao et al. (eds.), Intelligent Computing and Applications, Smart Innovation,
Systems and Technologies 315, https://doi.org/10.1007/978-981-19-4162-7_23
236 P. V. V. Prasad et al.
23.1 Introduction
In contrast to past generations of mobile communication systems, the evolution from 1G to 2G, 3G, and 4G is not comparable to the new cutting-edge 5G technology, which provides a new methodology offering ubiquitous connectivity. The 5G architecture is remarkably precise. Past generations were driven mainly by which parts of the new technology could be realized, whereas diverse industry applications have driven 5G development. For applications as varied as vehicular communications, remote control with haptic feedback, massive video downloads, as well as very low data rate applications such as hard-to-reach sensors (known as the IoT, the Web of Things), 5G mobile communications are driven by the need to provide ubiquitous connectivity. 5G can supply significantly greater versatility and can support a considerably broader range of applications [3], from low data rate Internet of Things requirements through to very high data rate and extremely low latency applications.
An ideal joint sensing threshold and sub-band power allocation is proposed for multichannel cognitive radio (CR) by formulating a mixed-variable optimization problem that maximizes the total throughput of the cognitive signal while meeting all constraints: the interference limit to the PU, the total power of the CR, and the false alarm probability on each sub-channel. Based on the bi-level optimization strategy, the designed optimization problem is divided into two single-variable sub-problems: the upper level for optimizing the threshold and the lower level for optimizing power. The simulations support the observation that joint optimization over the individual sub-channels achieves an attractive improvement in throughput. Some of the existing procedures utilized in cognitive radio include spectrum sensing, spectrum databases, and pilot channels. These strategies are either so complex that they require high computational power to identify unused spectrum, or they fail to capitalize on spectrum space in real time. Cognitive radio (CR) is a form of wireless communication in which the transceiver can intelligently discern which transmission channels are in use. It immediately moves into empty channels while avoiding occupied ones, and it does not cause any interference to the licensed user. Figure 23.1 shows a way of spectrum sharing by applying the joint optimization process in channel allocation.
Cognitive radio is a concept introduced to attack the coming spectrum crunch problem. Cognitive radio users are unlicensed users who dynamically discover unused licensed spectrum for their own use without causing any interference to licensed users [4]. As noted above, existing procedures such as spectrum sensing, spectrum databases, and pilot channels are either computationally demanding or fail to exploit spectrum space in real time, which is exactly where a CR transceiver
23 Efficient Usage of Spectrum by Using Joint Optimization 237
Fig. 23.1 Flowchart for joint optimization
that intelligently discerns which transmission channels are in use can help. The consolidated approach highlighted in this paper can help create simpler, cost-effective arrangements. It can drastically reduce operators' investment, as CR uses unlicensed spectrum. As the IoT phenomenon grows, tens of billions of devices will have to communicate with each other in real time. The highlighted CR approach will help operators cater to enormous spectrum requirements and help build a connected world.
Traditionally, cognitive radio can be accessed by a secondary user only when the primary user is absent, and the secondary user must vacate the idle spectrum when the presence of the primary user is detected; hence, the bandwidth is reduced in the traditional scheme. To overcome this problem, non-orthogonal multiple access (NOMA) is used to increase spectrum efficiency in 5G communications: it allows the secondary user to access the spectrum whether or not the primary user is occupying the channel. Primary-user and secondary-user decoding techniques are introduced to decode the non-orthogonal signals. Through these decoding techniques, secondary user throughput can be achieved; to raise the primary user throughput, the sub-channel transmission power must be kept within a limit. However, due to the interference caused by the primary user, the secondary throughput may decrease [5]. Oriented toward primary-user-first decoding and secondary-user-first decoding, we come up with two optimization problems to enhance the throughput of both the primary and the secondary user. This is done by jointly optimizing the spectrum resources: how much sub-channel transmission power is used and how many sub-channels are allocated. A joint optimization algorithm is introduced to solve these problems. This is achieved by receiving the signals and measuring the time needed to sense the spectrum and whether the primary user is occupying the channel, so as to decode the data sent while the primary user is absent. The simulation outcomes show the superior transmission efficiency of NOMA-based cognitive radio. The article is organized as follows: Sect. 23.1 introduces channel allocation in wireless communication; Sect. 23.2 is a literature survey covering the calculation formulas used while allocating channels to PU and SU users; Sect. 23.3 presents the proposed joint optimization method for channel allocation; Sect. 23.4 gives experimental results; Sect. 23.5 compares the traditional and the proposed joint channel allocation processes; and Sect. 23.6 concludes and outlines future work.
23.2 Literature Survey
A spectrum-sharing mechanism improves the current spectrum utilization of cognitive radio by permitting a secondary user to access idle spectrum allotted to the primary user. However, the secondary user is not permitted to disturb the normal communications of the primary user. The secondary user senses the licensed spectrum by performing spectrum sensing, which can be recognized as a binary hypothesis detection problem. In the conventional schemes, the secondary user can access only the idle channel; to detect the existence of the primary user, energy detection is used, comparing the received energy of the primary signal against a threshold. The primary user is identified as absent if the power falls below the threshold. The performance of energy detection is reflected by the false alarm probability and the detection probability.
x = cos(2π · 1000t)  (23.1)
a = ammod(x, F, Fs)  (23.2)
Pxx = periodogram(a)  (23.3)
where F is the frequency of the user and Fs is the sampling frequency of the spectrum.
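Equations (23.1)–(23.3) are MATLAB-style; the underlying energy-detection decision they feed can be sketched in Python as follows (the threshold and sampling rate are illustrative assumptions):

```python
import numpy as np

def energy_detect(signal, threshold):
    """Energy detection for spectrum sensing: declare the primary user
    present when the average received energy exceeds the threshold."""
    energy = np.mean(np.abs(signal) ** 2)
    return bool(energy > threshold)

# A 1 kHz tone as in Eq. (23.1), sampled at an assumed Fs of 10 kHz
Fs = 10_000
t = np.arange(0, 0.01, 1 / Fs)
x = np.cos(2 * np.pi * 1000 * t)
```

For the pure tone the average energy is 0.5, so any threshold below that declares the band occupied, while an empty band falls below it.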
If the energy value is below the threshold, the PU is determined not to be present. Energy detector effectiveness is characterized by the probability of false alarm and the probability of detection: a lower false alarm probability offers highly reliable idle-channel recognition, while a higher detection probability means that the existence of the primary user is identified very accurately. Earlier listen-before-talk spectrum schemes divided each frame into a sensing slot and a transmission slot, with the secondary user entering the transmission slot once an idle channel is recognized during sensing. A sensing-throughput trade-off was then studied to find the best sensing time that maximizes the secondary user's yield. Multichannel CR schemes have been suggested that improve the secondary user's yield by allowing the secondary user to access several inactive sub-channels at the same time [6]. To maximize the throughput of the secondary user in the absence of the PU, capacity optimization of multichannel CR was investigated. Nevertheless, if the primary user is present, the secondary user does not use the spectrum in the above schemes, and the throughput therefore falls. Non-orthogonal multiple access (NOMA) has been adopted to improve 5G spectrum efficiency and can assist spectrum access of multiple users in the power domain, multiplexing different users on the same sub-channel by adding superposition coding at the transmitter and successive interference cancellation at the receiver. Therefore, even when the PU is present, the secondary user may also access the spectrum to boost the throughput using NOMA.
Hpsd = dspdata.psd(Pxx, Fs, Fs) (23.4)
c = Pxx(25) × 10,000 (23.5)
Cognitive radio clients are unlicensed users that opportunistically discover unused licensed spectrum for their own use without causing interference to licensed users. Existing techniques used in cognitive radio include spectrum sensing, spectrum databases, and pilot channels. These methods are either complex, requiring high computational power to identify unused spectrum, or fail to capitalize on spectrum holes as they appear in real time. A cognitive radio (CR) is a form of wireless transceiver that can intelligently discern which transmission channels are in use; it immediately moves into vacant channels while avoiding occupied ones.
aa = ammod(x, F, Fs) (23.6)
In earlier methods, the throughput of the primary user is reduced by interference from the secondary user when both users access the channel at the
240 P. V. V. Prasad et al.
same time; this interference decreases the primary user's throughput. To overcome the problem, we use non-orthogonal multiple access to enhance the primary user's throughput. Hence, in contrast with the traditional scheme, in our algorithm the secondary user accesses the channel even while the primary user is occupying it, and the throughput of both users is maintained.
23.3 Proposed Method
Targeting primary-user-first (PF) decoding and secondary-user-first (SF) decoding, we formulate two optimization problems: first, accessing the channel whether or not the primary user is occupying it; second, enhancing the throughput of both the primary user and the secondary user through non-orthogonal multiple access. A joint optimization algorithm is introduced that lets the secondary user enter the sub-channel whether the primary user is present or absent, while improving throughput.
This is achieved by receiving the signals, computing the time needed to sense the spectrum, and determining whether the primary user occupies the channel, so that the secondary user can decrypt the data sent during the primary user's absence. The simulation results will show the superior transmission throughput of NOMA-based cognitive radio.
Through these decoding techniques the secondary user's throughput can be achieved; to increase the primary user's throughput, the sub-channel transmit energy must stay within a limit. However, interference from the primary user may reduce the secondary user's throughput. Oriented toward primary-user-first decoding and secondary-user-first decoding, we pose two enhancement problems to improve the throughput of both the primary user and the secondary user by jointly optimizing the spectrum resources. The optimization constrains how much sub-channel transmission power is used and determines the number of sub-channels allocated. A joint optimization algorithm is introduced to eliminate the limitations of the existing schemes.
Algorithm for Sub-channel Power Optimization
Initialize: time = 0, var = 0 and Freq > 0;
1. with given t, calculate {x} using (pi);
2. with {t, pi}, update 'x' through (23.1);
3. set F = 0, Fs = max;
4. with given F, Fs, calculate {a} using (x);
5. with {F, Fs, x}, update 'a' through (23.2);
6. repeat steps (2) to (5) until F = 5.
Output: graph{frequencies}
In the algorithm, we assign the frequencies of the different users, initialize time to 0, compute the lambda values, and use the ammod function to compute the corresponding spectrum from the frequency of the user and the sampling frequency of the spectrum [7]. In Fig. 23.2, the secondary user enters the channel while the primary user is occupying it. The throughput, however, is reduced by the interference the secondary user causes in the spectrum while the primary user is active.
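A Python stand-in for the sub-channel power optimization loop of the algorithm above; the per-user carrier grid and signal duration are illustrative assumptions, and ammod is mirrored as DSB amplitude modulation:

```python
import numpy as np

Fs = 10_000                        # spectrum sampling frequency (assumed)
t = np.arange(0, 0.05, 1 / Fs)
x = np.cos(2 * np.pi * 1000 * t)   # baseband signal of Eq. (23.1)

modulated = {}
F = 0
while F < 5:                       # steps 2-6: repeat until F = 5
    F += 1
    Fc = 1000 * F                  # carrier assigned to user F (assumed grid)
    # Eq. (23.2): ammod(x, Fc, Fs) as DSB amplitude modulation
    modulated[F] = x * np.cos(2 * np.pi * Fc * t)
# modulated now holds one waveform per user frequency, ready for PSD estimation
```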
In the algorithm, we specify the number of primary users and then compute the primary user's detection threshold. If the measured value lies within the given range, the presence or absence of the primary user is concluded. If the primary user is not occupying the channel, the secondary user can access it to send its encrypted data. When the primary user is occupying the channel, the secondary user can receive the data that was sent for it during the primary user's absence and decrypt it. Using non-orthogonal multiple access, the throughput of both the primary user and the secondary user is improved while the frequencies of both are maintained. A graph is plotted to show the throughput and
Fig. 23.2 Primary user is present
frequencies of both users, illustrating the outputs and the interaction between the primary and secondary users. In the old method, the throughput of the primary user is reduced by the interference the secondary user causes when both users access the channel at the same time [8]. To overcome this, we use non-orthogonal multiple access to enhance the primary user's throughput. In contrast with the traditional scheme, in our algorithm the secondary user accesses the channel even while the primary user is occupying it, and the throughput of both users is maintained. The secondary user sends its data while the primary user is absent and, while the primary user is present, retrieves its data by decrypting it from the channel. The throughput of both users is still somewhat degraded by the mutual interference between the primary and secondary users.
Algorithm for Jointly Optimization
Initialize: gamma = 8000 and Freq > 0;
1. with given F, calculate {a} using Algorithm 1;
2. with {a}, update 'Pxx' through (23.3);
3. with {Fs}, update 'Hpsd' through (23.4);
4. with {Pxx, F}, update 'c' through (23.5);
5. if c < gamma:
6. with given {F}, calculate {aa} using (x);
7. with {x, F, Fs}, update 'aa' through (23.6);
8. repeat step (3).
Output: graph{frequencies}
In this algorithm, the secondary user accesses the spectrum whether or not the primary user is occupying the channel. Figure 23.3 shows that the throughput of the primary user is achieved using non-orthogonal multiple access, and that the throughput of both the primary user and the secondary user is maintained. The efficiency can be computed from the frequency difference; since the primary and secondary users operate on different frequencies, throughput is enhanced.
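A Python sketch of the joint-optimization sensing decision: the PSD bin index 25, the scale factor 10,000, and gamma = 8000 follow the algorithm's initialization, while the signal amplitudes and sampling setup are illustrative assumptions:

```python
import numpy as np
from scipy.signal import periodogram

GAMMA = 8000.0      # threshold from the algorithm's initialization

def secondary_action(received, fs, bin_index=25, scale=10_000):
    """Sense one PSD bin (Eqs. 23.3-23.5) and decide the secondary user's action."""
    _, Pxx = periodogram(received, fs)   # Eq. (23.3): PSD estimate
    c = Pxx[bin_index] * scale           # Eq. (23.5): scaled bin energy
    if c < GAMMA:                        # primary user absent
        return "transmit"                # secondary sends its encrypted data
    return "decode"                      # primary present: recover data via NOMA decoding

Fs = 10_000
t = np.arange(0, 0.1, 1 / Fs)            # 1000 samples, so PSD bin 25 is 250 Hz
rng = np.random.default_rng(0)
idle_channel = 1e-4 * rng.standard_normal(t.size)   # noise-only channel
primary_signal = 5 * np.cos(2 * np.pi * 250 * t)    # strong primary tone at bin 25
```

On the idle channel the bin energy stays far below gamma and the secondary user transmits; with the strong primary tone the scaled bin energy exceeds gamma and the secondary user switches to decoding.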
Fig. 23.3 Usage of channel by primary user and secondary user
In Fig. 23.3, the joint optimization algorithm removes this limitation: the secondary user accesses the channel even while the primary user is occupying it, by jointly optimizing the spectrum resources, the sub-channel transmission power, and the number of sub-channels [9]. This is achieved by receiving the signals, computing the time needed to sense the spectrum, determining whether the primary user occupies the channel, and decrypting the data sent during the primary user's absence. The simulation results will show the superior transmission throughput of NOMA-based cognitive radio.
23.4 Experimental Results
The algorithm takes the number of primary users and computes the primary user's detection threshold; if the measured value lies within the given range, the presence or absence of the primary user is concluded. When the primary user is absent, the secondary user accesses the channel to send its encrypted data; when the primary user is present, the secondary user receives and decrypts the data that was sent for it during the primary user's absence. With non-orthogonal multiple access, the throughput of both the primary user and the secondary user is improved while their frequencies are maintained, in contrast with the old method, where interference from the secondary user reduced the primary user's throughput whenever both users accessed the channel simultaneously. A graph is plotted to show the throughput and frequencies of the users.
23.5 Comparison of Results
In the traditional method, the secondary user can access the primary user's channel only when the primary user is not occupying it. But in the modern scheme,
the secondary user can access the primary user's channel even while the primary user is occupying it. In this algorithm, therefore, the secondary user can access the primary channel whether or not the primary user is present. First, the channel is sensed to detect the primary user: if the sensed statistic lies within the gamma threshold range, the secondary user knows whether the primary user is occupying the channel. In Fig. 23.4, the results show the secondary user accessing the primary user's sub-channel with very effective power usage. If the primary user is not occupying the channel, the secondary user sends its encrypted data; whenever the secondary user wants to receive data while the primary user is active, the data is decrypted and delivered to the secondary user. In this way, both the primary and secondary users can share the sub-channel whether or not the primary user is present. The drawback is that when both users occupy the sub-channel, their throughput decreases, so the joint optimization technique is used to enhance it. Using MATLAB, we showed the difference between the old approach and ours: frequencies are first assigned to the variables, the ammod function computes the output, and the outcome is plotted. Figure 23.4 shows the throughput comparison between the traditional and joint optimization schemes in terms of received power and bits transmitted in the allocated channel.
Fig. 23.4 Throughput comparison between traditional and joint optimization
Here, the secondary user senses the channel, identifies the presence or absence of the primary user, and decides accordingly whether to enter the channel or send data on it. Since sensing alone fails to improve throughput, we introduced the joint optimization algorithm, in which non-orthogonal multiple access eliminates the interference caused by the secondary user and enhances the primary user's throughput.
23.6 Conclusion and Future Scope
In the traditional method, the secondary user can access the primary user's channel only when the primary user is not occupying it; in the proposed scheme, the secondary user can access the channel whether or not the primary user is present. The channel is first sensed, and the gamma threshold range tells the secondary user whether the primary user is active. If the primary user is absent, the secondary user sends its encrypted data; if the primary user is present, the requested data is decrypted and delivered to the secondary user. In this way, both users share the sub-channel at all times.
The drawback is that simultaneous access reduces the users' throughput, so the joint optimization technique is applied to enhance it. Using MATLAB, we showed the difference between the old approach and ours: frequencies are assigned to the variables, the ammod function computes the output, and the outcome is plotted. The secondary user senses the channel, identifies the presence or absence of the primary user, and decides whether to enter the channel or send data. Non-orthogonal multiple access then eliminates the interference caused by the secondary user and enhances the primary user's throughput.
References
1. Gayatri, T., Sharma, V. K., & Anveshkuar, N. (2019). A survey on conceptualization of cogni-
tive radio and dynamic spectrum access for next generation wireless communication. Mobile
Networks and Applications, 6(5), 435–441.
2. Ghasemi, A., & Sousa, E. S. (2018). Spectrum sensing in cognitive radio networks: Require-
ments, challenges and design tradeoffs. IEEE Communications Magazine, 46(4), 32–39.
3. Liu, X., Jia, M., Gu, X., & Tan, X. (2017). Optimal periodic cooperative spectrum sensing based
on weight fusion in cognitive radio networks. Sensors, 13(4), 5251–5272.
4. Shen, J., Liu, S., Wang, Y., Xie, G., Rashvand, H. F., & Liu, Y. (2019). Robust energy detection
in cognitive radio. IET Communications, 3(6), 1016–1023.
5. Yang, L., Han, Z., Huang, Z., & Ma, J. (2018). A remotely keyed file encryption scheme under
mobile cloud computing. Journal of Network and Computer Applications, 106, 90–99.
6. Unde, A. S., & Deepthi, P. P. (2020, January). Design and analysis of compressive sensing based lightweight encryption scheme for multimedia IoT. IEEE Transactions on Circuits and Systems II: Express Briefs, 67(1), 167–171.
7. Liu, X., & Jia, M. (2017). Joint optimal fair cooperative spectrum sensing and transmission in
cognitive radio. Physical Communication, 25, 445–453.
8. Liu, X., Li, F., & Na, Z. (2017). Optimal resource allocation in simultaneous cooperative
spectrum sensing and energy harvesting for multichannel cognitive radio. IEEE Access, 5,
3801–3812.
9. Muhammad, K., Hamza, R., Ahmad, J., Lloret, J., Wang, H., & Baik, S. W. (2018). Secure surveillance framework for IoT systems using probabilistic image encryption. IEEE Transactions on Industrial Informatics, 14(8), 3679–3689.
Chapter 24
An Intelligent Energy-Efficient Routing
Protocol for Wearable Body Area
Networks
Muniraju Naidu Vadlamudi and Md. Asdaque Hussian
Abstract The wireless body area network (WBAN) is a wireless sensor network that measures physiological parameters for specialized applications using wireless sensor nodes in and around the human body. Routing protocols used in WBANs must account for energy, topology, temperature, location, sensor radio choice, and the required quality of service at the sensor nodes. Achieving energy efficiency in WBANs is one of the central challenges, as energy efficiency ultimately determines the durability of the network. Most current applications of the wireless body sensor network (WBSN) require efficient transmission schemes for remote monitoring of wearable-system data on demand and in a timely manner. This paper proposes a reliable, power-efficient, and highly stable routing protocol for wireless body area sensor networks. Simulations have been carried out, and the findings are consistent with the design goals.
24.1 Introduction
The WSN consists of nodes deployed in a specific geographical area to sense or monitor parameters such as temperature, humidity, pressure, noise level, and the movement of vehicles or people. The WBAN IEEE standard was issued as IEEE 802.15.6, 'a communication standard optimized for low-power devices and operation on, in or around the human body (but not limited to humans) to serve a variety of applications including medical, consumer electronics/personal entertainment, and others' [1].
Information aggregation procedures are classified as follows:
M. N. Vadlamudi (B)
Assistant Professor, Department of Computer Science and Engineering, School of Engineering,
Malla Reddy University, Hyderabad, Telangana, India
e-mail: munirajunaidu.v@gmail.com
M. N. Vadlamudi ·Md. A. Hussian
Associate Professor, Faculty of Computer Studies, Arab Open University, Manama, Bahrain
e-mail: m.asdaque@aou.org.bh
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
B. N. K. Rao et al. (eds.), Intelligent Computing and Applications, Smart Innovation,
Systems and Technologies 315, https://doi.org/10.1007/978-981-19-4162-7_24
(a) A prearranged method, in which analyzing and arranging the network topology drains the battery faster and leads to network failure, and
(b) A topology-free method, in which power is not wasted on setting up the network topology [2]. Information aggregation was made possible for large body area networks by reducing the set of active nodes to lower network usage [3]. For heterogeneous networks, data aggregation trades off information quality against power [4]. Tree-based topology aggregation is made dependable by supplying time slots for forwarding data and a predefined transmission power to improve network lifetime [5].
24.2 Issues in Designing Routing Protocols
Because of its specific conditions, such as architecture, node density, and data rate, WBAN cannot directly reuse the protocols of WSNs even though it is a subset of WSN [6, 7]. Factors that make choosing a WBAN routing protocol difficult include A. mobility, B. efficient communication choice, C. service quality, and D. safety and confidentiality (Fig. 24.1).
Fig. 24.1 Deployment of nodes
24.2.1 Existing Energy-Efficient Routing Protocols in WBAN
Tang et al. [8] proposed an energy-proficient and thermally aware routing protocol for WBANs that reduces node temperatures and the delay of critical data. Tauqir et al. [9] propose distance aware relaying energy efficient (DARE) routing for monitoring the patient's biological status. Ahmed et al. [10] proposed LAEEBA, in which the route with the fewest nodes is chosen for broadcast. Nadeem et al. [11] were the first to propose a WBAN routing model that was both energy and power efficient.
24.3 Proposed Approach
24.3.1 Intelligent Energy-Efficient Routing Protocol
24.3.1.1 Power Consumption Analysis
The sensors need to sense and process the sensed information and transmit it to the sink. Equation 24.1 is a mathematical representation of the energy consumed during transmission.
FTX = (Famp + Felec) · s · d² (24.1)
In Eq. 24.1, d is the transmission distance and s is the packet size in bits; sensors consume more transmission energy as the distance between them increases. Equation 24.2 gives the amount of energy used by the WBASN sensor node to transmit data.
Fnode = Ftx + Fretx + Fack + Facc (24.2)
Here Fnode is the total energy, Ftx is the transmission energy, Fretx is the retransmission energy, Fack is the energy used to send an acknowledgment (ACK) packet, and Facc is the energy consumed by the channel access procedures. In this study, the Nordic nRF2401A transceiver was used because of its very low power consumption. It communicates in the 2.4 GHz ISM (industrial, scientific, and medical) band and, when powered by AA or AAA batteries, has a battery life of several months to years. Its parameters are listed in Table 24.1.
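As a concrete illustration, the per-packet energy model of Eqs. (24.1)–(24.2), taken as written in the text, can be sketched in Python. The constants mirror Table 24.1; the packet size and distance are assumed values:

```python
# Energy model of Eqs. (24.1)-(24.2); constants follow Table 24.1
# (nRF2401-style values), while s and d below are assumptions.
F_ELEC = 14.6e-9    # transmitter electronics energy, J/bit (Ftx-elec)
F_AMP = 2.36e-10    # amplifier energy coefficient, J/bit/m^2 (Eamp)

def tx_energy(s_bits: int, d_m: float) -> float:
    """Eq. (24.1): FTX = (Famp + Felec) * s * d^2."""
    return (F_AMP + F_ELEC) * s_bits * d_m ** 2

def node_energy(f_tx: float, f_retx: float, f_ack: float, f_acc: float) -> float:
    """Eq. (24.2): total energy spent by a node to transmit data."""
    return f_tx + f_retx + f_ack + f_acc

# a 250-byte packet (2000 bits) sent over 0.5 m (both assumed)
e_tx = tx_energy(2000, 0.5)
```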
24.3.1.2 Proposed Algorithm
A total of eight sensors will be used in the proposed scheme. The coordinates of all the proposed scheme's sensor nodes are shown in Table 24.2. Figure 24.2
Table 24.1 Nordic parameters

Parameter                Value
Tx DC current            12.5 mA
Rx DC current            19 mA
Minimum voltage supply   1.6 V
Ftx-elec                 14.6 nJ/bit
Frx-elec                 33.6 nJ/bit
Eamp                     2.36e-10 J/bit
depicts the sensor deployment on the human body. The sensors are represented by circular dots, while the base station is represented by a rectangular box. The sensor nodes have equal power and computation capabilities, with an initial energy of 1.3 J. The threshold energy level is set to 1.1 J; a sensor is considered dead once its energy falls below this threshold. The sink node is located at the center of the body. Multi-hopping is used to save energy: as the distance between far-apart nodes is reduced, data is sent to the sink through a forwarder node, which collects the information from the other sensors and relays it to the sink. A new forwarder is chosen whenever a round ends. The sink knows the identity of the sensor nodes as well as their energy status. The forwarder node is chosen carefully, based on the cost function [8], to determine the next hop on the route and to decide which sensors can act as a forwarder: the forwarder is the node that minimizes the ratio of distance to the sink over residual energy. The cost function can be represented as
CFi = Ei / FRi (24.3)
Table 24.2 Sensor coordinates

Sensor number   X coordinate   Y coordinate
S1              0.35           0.2
S2              0.6            0.3
S3              0.15           0.4
S4              0.6            0.5
S5              0.5            0.65
S6              0.35           0.75
S7              0.7            0.9
S8              0.3            0.85
Fig. 24.2 Sensor deployment
where Ei denotes sensor i's distance from the sink and FRi denotes sensor i's residual energy.
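A minimal Python sketch of this forwarder selection; the sensor tuples and their values are illustrative assumptions, and the 1.1 J dead threshold follows the text:

```python
DEAD_THRESHOLD = 1.1  # J; nodes at or below this are considered dead (from the text)

def choose_forwarder(sensors):
    """Return the index of the forwarder minimizing CF_i = E_i / FR_i (Eq. 24.3),
    where E_i is the distance to the sink and FR_i the residual energy.
    Dead nodes are skipped; None is returned if no node is alive."""
    candidates = [(dist / energy, i)
                  for i, (dist, energy) in enumerate(sensors)
                  if energy > DEAD_THRESHOLD]
    return min(candidates)[1] if candidates else None

# (distance_to_sink, residual_energy) per sensor; values are illustrative
sensors = [(0.40, 1.30), (0.25, 1.25), (0.30, 1.05)]
forwarder = choose_forwarder(sensors)  # node 1: close to the sink, energy above threshold
```

Node 2 is skipped because its residual energy (1.05 J) is below the dead threshold, so the nearest sufficiently charged node wins.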
Algorithm
Intelligent Energy-Efficient Routing Protocol
Inputs: Neighbor list N, energy F, threshold th, sensor S
Output: Efficient energy management scheme
1. Start
2. Initialize neighboring nodes
3. Initialize F, sensors S
4. Initialize consistency to true
5. For each sensor S
6. Calculate the energy consumed FTX
7. FTX = computeFTX(n)
8. End For
9. maxftx = findMaxFTX(pmap)
10. IF maxftx < th THEN
11. Reassign the sensor energy
12. Amount of energy used by the WBASN sensor node to transmit data
13. Assign the value to Fnode
14. Consider both the transmission and retransmission energies
15. Both are less than the threshold level ‘th’
16. Find the resultant sensor nodes available in the limit
17. Calculate cost function ‘CF’ of the nodes
Fig. 24.3 Evaluation of energy consumption
18. IF CF <= FTX THEN
19. Notify (‘node is energy efficient and useful for transmission’)
20. Else
21. Notify (‘new node will be considered’);
22. END IF
23. END FOR
24. Stop
Network stability refers to the time until the first node dies, while network lifetime refers to the time until all of the sensor nodes have died (Fig. 24.3).
24.4 Simulation Setup
We compare the performance of intelligent energy efficient routing scheme (IEERS)
with distance aware relaying energy efficient (DARE) [9] and energy efficient scheme
for body area network (LAEEBA) [10]. Network simulator (NS2) is used to evaluate
our IEERS. The evaluation is carried out on a network with a size of 200 m * 200 m,
a packet size of 250 bytes, and a number of packets transmitted per session of 500.
The total available spectrum (BW) is set to range from 10 to 20 MHz. The bandwidth
available to nodes is limited to 2, 4, and 6 MHz. The traffic is constant bit rate, and the nodes are distributed uniformly at random. The parameters α, β, a, b, γ, θ are set to the values 2, 3, 0.5, 0.5, 0.5. The duration of the simulation is set to 900 s. The Tw, Tm, and Tassoc parameters are set to 5, 1, and 10.
Energy consumption: Each node consumes a certain amount of energy to complete its work. The average energy is measured as
E = (1/Ns) Σ e[n] (24.4)
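Eq. (24.4) is simply the sample mean of the per-node energy readings; a one-line Python sketch (the sample vector below is illustrative):

```python
def average_energy(e):
    """Eq. (24.4): E = (1/Ns) * sum of e[n] over n = 1..Ns."""
    return sum(e) / len(e)

samples = [2.8, 4.2, 6.3]   # illustrative sampled energy values e[n]
avg = average_energy(samples)
```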
Table 24.3 Simulation results for energy consumption

Number of WBAN nodes   Energy consumption (kJ)
                       IEERS   DARE    LAEEBA
25                     2.8     3.7     4.6
50                     4.2     5.7     6.8
75                     6.3     7.5     8.2
100                    7.5     11.5    14.5
125                    8.6     12.3    14.2
150                    9.3     13.3    16.2
175                    9.8     14.5    18.2
200                    11.3    16.5    19.3
225                    12.4    17.6    20.36
250                    13.6    18.3    22.36
From the above equation, the energy E is measured from the sampled energy vector e[n], where n = 1, 2, 3, ..., Ns.
Simulation Results: The energy consumption calculated using the three methods is shown in Table 24.3. The energy use of IEERS is thereby reduced by 35% compared with DARE and 51% compared with LAEEBA (Fig. 24.3).
Packet Delivery Ratio: The packet delivery ratio (PDR) is defined as the ratio of the total packets received at the destination nodes to the total packets transmitted by the source nodes; performance improves as the packet delivery ratio increases. It is mathematically denoted by the equation
Packet Delivery Ratio = (Total packets received by all destination nodes) / (Total packets sent by all source nodes)
Therefore, the IEERS packet delivery ratio is improved by 23% compared with DARE and 41% compared to LAEEBA (Fig. 24.4).
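The PDR definition above reduces to a single ratio; a Python sketch (the packet counts are assumed, echoing the 500-packets-per-session setup):

```python
def packet_delivery_ratio(received: int, sent: int) -> float:
    """PDR (%) = total packets received at destinations / total packets sent."""
    return 100.0 * received / sent

# e.g. 490 of the 500 packets per session delivered (counts assumed)
pdr = packet_delivery_ratio(490, 500)
```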
24.5 Conclusion
The wireless body area network is a common form of sensor network in a number of applications, including medical services, entertainment, gaming, fire-fighting, and military applications. One method of achieving energy efficiency is energy-efficient routing. This paper proposes an intelligent WBASN routing protocol for energy efficiency. The cost function of each sensor is first calculated to select the forwarder node, which collects data from the other sensors and relays it to the sink.
No. of WBAN nodes   Packet delivery ratio (%)
                    IEERS   DARE   LAEEBA
25                  98      96     93
50                  90      88     86
75                  84      80     76
100                 79      71     62
125                 70      62     58
150                 68      60     51
175                 62      53     42
Fig. 24.4 Evaluation of packet delivery ratio
After a round completes successfully, a forwarder node is chosen based on the least distance from the sink and the highest residual energy among all sensor nodes. As the simulation results show, the proposed scheme achieves a better energy consumption and delivery ratio.
References
1. Jain, K. L., & Mohapatra, S. (2019). Grid base energy efficient coverage aware routing protocol
for wireless sensor network. In Proceedings of the 2nd International Conference on Software
Engineering and Information Management, ACM (pp. 49–53).
2. Merzoug, M. A., Boukerche, A., Mostefaoui, A., & Chouali, S. (2019). Spreading aggrega-
tion: A distributed collision-free approach for data aggregation in large-scale wireless sensor
networks. Journal of Parallel and Distributed Computing, 125, 121–134.
3. Soltani, M., Hempel, M., & Sharif, H. (2014). Data fusion utilization for optimizing large-scale
wireless sensor networks. In 2014 IEEE International Conference on Communications, ICC
(pp. 367–372). IEEE.
4. Baskar, S., Periyanayagi, S., Shakeel, P. M., & Dhulipala, V. S. (2019). An energy persis-
tent range-dependent regulated transmission communication model for vehicular network
applications. Computer Networks. https://doi.org/10.1016/j.comnet.2019.01.027
5. Gong, D., & Yang, Y. (2014). Low-latency SINR-based data gathering in wireless sensor
networks. IEEE Transactions on Wireless Communications, 13(6), 3207–3221.
6. Khan, Z., Sivakumar, S., Philips, W., & Robertson, B. (2013). A QoS-aware routing protocol
for reliability sensitive data in hospital body area networks. In The 4th International Conference
on Ambient Systems, Networks and Technologies, Procedia Computer Science.
7. Macwan, S., Gondaliya, N., & Raja, N. (2016). Survey on wireless body area network.
International Journal of Advanced Research in Computer and Communication Engineering.
8. Tang, Q., Tummala, N., Gupta, & Schwiebert, L. (2005). TARA: thermal aware routing algo-
rithm for implanted sensor networks. In International Conference on Distributed Computing
in Sensor Systems (pp. 206–217). Springer Berlin Heidelberg.
9. Tauqir, A., Javaid, N., Akram, S., Rao, A., & Mohammad, S. (2013). Distance aware
relaying energy-efficient: Dare to monitor patients in multi-hop body area sensor networks.
In Broadband and Wireless Computing, Communication and Applications (BWCCA), Eighth
International Conference (pp. 206–213). IEEE
24 An Intelligent Energy-Efficient Routing Protocol 257
10. Ahmed, S., Javaid, N., Akbar, M., Iqbal, A., Khan, Z., & Qasim, U. (2014). Laeeba: Link aware
and energy efficient scheme for body area networks. In IEEE 28th International Conference
on Advanced Information Networking and Applications (pp. 435–440). IEEE.
11. Nadeem, Q., Javaid, N., Mohammad, S., Khan, M., Sarfraz, S., & Gull, M. (2013). Simple:
stable increased-throughput multi-hop protocol for link efficiency in wireless body area
networks. In 2013 Eighth International Conference on Broadband and Wireless Computing,
Communication and Applications (BWCCA) (pp. 221–226). IEEE.
Chapter 25
Enhanced Video Classification System
with Convolutional Neural Networks
Using Representative Frames as Input
Data
K. Jayasree and Sumam Mary Idicula
Abstract Convolutional neural networks are extensively used in video classification
systems. This work uses a video action recognition data set consisting of 13,320
videos grouped into 101 classes. The proposed method offers a promising way of
speeding up training by using representative frames from each video in place of
the entire video data set. The advantage is that the training time can be drastically
decreased with the compact input video data. The Inceptionv3 pretrained model
is used for classification, where the weights and architecture can be reused for
similar types of input. Instead of giving all the video frames to the classifier, the
proposed method selects representative frames only, still producing comparable
results. On the action recognition data set of 101 classes and 13,320 videos, full
frames give an F1-score of 0.717, while representative frames yield a comparable
0.704. When all the frames of the videos are given as input to the pretrained model,
the performance is marginally higher, but at a direct cost in training time. The main
objective of this research work is to enhance the computational efficiency of the
video classifier by using representative frames to speed up the process of training
on the data set.
25.1 Introduction
Earlier classification methods primarily involve two major stages: feature
extraction and classification. Feature extraction includes the extraction of
handcrafted features such as motion vectors, edge histograms, and group-of-frames
features. For classification, different classifiers like the support vector machine, HMM, Bayesian
K. Jayasree (B)
Department of Computer Engineering, Model Engineering College, Ernakulam, Kerala 682021,
India
e-mail: jayasreek@mec.ac.in
S. M. Idicula
Department of Computer Science, Muthoot Institute of Technology and Science, Ernakulam,
Kerala 682308, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
B. N. K. Rao et al. (eds.), Intelligent Computing and Applications, Smart Innovation,
Systems and Technologies 315, https://doi.org/10.1007/978-981-19-4162-7_25
classifier, and neural networks are used. Nowadays, convolutional neural networks
(CNNs), which use the concepts of deep learning, are extensively applied to image
recognition problems. Hence, they can also be extended to video classification
problems, as videos comprise sequences of images.
The network acquires knowledge from the data in the form of its weights. These
learned values can be transferred to another neural network model instead of
training it from scratch. Training with transfer learning entails initializing all the
pre-final layer weights to the pretrained values and training just the last few layers
at a relatively low learning rate. Training a neural network from scratch takes a lot
of time; a pre-constructed network structure with pretrained weights makes the
process much faster. Thus, a large-scale training data set is not required once the
learning outcomes are transferred. Hence, pretrained convolutional neural
networks are commonly used for classifying video data.
The pretrained model is obtained by training on a large data set. The weights and
architecture of the pretrained model can be reused for further classification with
the same or a different data set. This process is known as transfer learning. The
selection of the pretrained model must be done with utmost care. Many pretrained
architectures are available in the Keras library. The ImageNet data set is
comfortably large enough for creating generalized models and is therefore widely
used for building various architectures. Keras is a widely used API that runs on top
of the TensorFlow platform and the Theano library. Keras applications are deep
learning models which can be used for prediction, feature extraction, and
fine-tuning. These models come with pretrained weights, which are downloaded
automatically when a model is instantiated.
Inceptionv3 is a pretrained deep convolutional neural network trained on the
ImageNet data set. In the proposed method, Inceptionv3 is used for classifying
the video data. Usually, all the video frames are given to the Inceptionv3 pretrained
model for training and testing. In our proposed method, instead of the full set of
frames, selected frames that best represent the data are given for training and
testing; the results are found to be very close. The representative frames are
created using a block-based adaptive threshold method. The proposed method,
which processes fewer frames, still produces results comparable to using the entire
set of frames as input to the classifier.
25.2 Related Works
In the past few years, one of the most crucial problems in the video analysis and
multimedia database area has been automatic content-based video classification.
The usual requirement for querying a video database is that the user must provide
an example clip so as to get similar clips as outputs. Searching for similar clips
across the entire database can be burdensome. To make searching more efficient,
the video data must be classified into different categories, so methods need to be
developed for categorizing video data into one of a set of predefined classes.
Ferman and Tekalp [1] have used a probabilistic framework
for the construction of descriptors based on the location, object, and events. In order
to categorize content domain and for extracting the relevant semantic information,
Bayesian belief networks and hidden Markov models (HMMs) are used at multiple
levels. A video indexing technique using decision tree is considered in [2] to classify
video in categories such as commercials, music, and sports. For efficient video
classification at large scale, a convolutional neural network is introduced in [3],
where the run-time performance is improved by a CNN architecture that
incorporates input at two spatial resolutions. For action recognition, Simonyan and
Zisserman [4] make use of a two-stream architecture for video classification that
helps in encoding static frames
and optical flow frames. In [5], compute-efficient video classification models are
built by processing fewer frames, thereby complementing the research on
memory-efficient video classification. In [6], a recurrent convolutional neural
network architecture extracts local features from image frames, and temporal
features between consecutive frames are used for video classification. In [7],
classification includes the extraction of features like frames and audio from video
data, accomplished using a convolutional recurrent neural network.
In the proposed system, we give representative frames extracted from the whole
set of frames as input to the convolutional neural network and compare the
performance with a system using all the frames of the video data for classification.
Our proposed method achieves a classification accuracy using only representative
frames from the UCF101 data set that is competitive with state-of-the-art methods.
25.3 Methodology
Convolutional neural networks are a widely accepted class of models for image
recognition tasks and can be extended to video classification problems. Nowadays,
pretrained convolutional neural network models are used for classifying video
data. Inceptionv3 is a deep convolutional neural network pretrained on data from
the ImageNet Large Scale Visual Recognition Challenge 2012, and it can
discriminate among 1000 different classes.
Usually, for video classification, all frames are extracted from each video, given to
the pretrained model, and the performance is evaluated. This paper proposes a
novel method to effectively classify video data into predefined classes. Here,
instead of using all images, only the frames that best represent the video are
taken into consideration, and only these representative frames are given to the
CNN. This provides performance comparable to that obtained when all frames are
used. The representative frames are created using the 'two-pass block-based
adaptive threshold technique' [8]. The schematic diagram for the proposed method
is given in Fig. 25.1.
The block-based adaptive threshold algorithm consists of two passes. During the
first pass, each frame of the video is segmented into four corner blocks and a
middle block of size 60 × 60 pixels. Each of these blocks is named BottomLeft [BL],
BottomRight [BR], TopLeft [TL], TopRight [TR], and Middle [MID] in accordance
Fig. 25.1 Schematic representation of video classification
with its position in the frame. For each block, we create a quantized 64-bin RGB
colour histogram. From the list of frames, every two consecutive frames are taken,
and the accumulated histogram-based dissimilarity S(f_m, f_n) is determined for all
frame pairs using the following formula.
S(f_m, f_n) = Σ_{i=1}^{r} B_i · S_p(f_m, f_n, i)    (25.1)
In this equation, B_i is the weighting factor predetermined for each block,
S_p(f_m, f_n, i) is the partial match obtained through the histogram matching
method, f_m and f_n are the consecutive frames, and r is the number of blocks.
The first pass yields a sequence of dissimilarity measures. In the second pass,
these computed values are used to detect the representative frames with a single
adaptive threshold based on the Dugad model [9]. To set the threshold adaptively,
the dissimilarity measures from consecutive frames are used: the procedure slides
a window of a given size over the sequence and considers the dissimilarity
measures within this window. The means and standard deviations are estimated
from the samples on the left side and on the right side of the middle sample of
the window. The
threshold is set as given in Eq. (25.2). If both of the conditions given below are
satisfied, the middle sample m_r marks a representative frame.
1. m_r is the maximum value in the window.
2. m_r satisfies the condition given in the following equation:
m_r > max(μ_left + σ_left, μ_right + σ_right)    (25.2)
where m_r is the dissimilarity value calculated for the two consecutive frames,
μ_left is the mean of the samples on the left side of the middle sample within the
window, μ_right is the mean of the samples on the right side, and σ_left and
σ_right are the corresponding standard deviations.
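The second pass can be sketched as follows, under the assumption that the window size is odd and the dissimilarity sequence is given as a list (the function and variable names are ours):

```python
from statistics import mean, stdev

def representative_indices(diss, window=9):
    """Dugad-style adaptive threshold: the middle sample m_r of a sliding
    window marks a representative frame when (1) it is the maximum in the
    window and (2) m_r > max(mu_left + sigma_left, mu_right + sigma_right)."""
    half = window // 2
    keeps = []
    for c in range(half, len(diss) - half):
        left, right = diss[c - half:c], diss[c + 1:c + half + 1]
        m_r = diss[c]
        if m_r == max(diss[c - half:c + half + 1]) and m_r > max(
                mean(left) + stdev(left), mean(right) + stdev(right)):
            keeps.append(c)
    return keeps
```

An isolated spike in the dissimilarity sequence is kept; a flat sequence yields no representative frames.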
Video classification is done using Inceptionv3 pretrained convolutional neural
network model. UCF101 action recognition data set is used as input to the classifier.
25.4 Experiments and Results
Experiments are conducted on UCF101, a video action recognition data set
containing 101 classes and 13,320 videos. To the Inceptionv3 model architecture,
we add a fully connected last layer of 101 nodes for the 101 classes. The last layer
is SoftMax-activated, and we use the ADAM optimizer to train this deep model.
We first trained the model for 50 epochs; in this stage, only the last-layer weights
are updated, keeping all the other layers frozen. After this, we fine-tuned the
model for 250 epochs with all layers trainable. We obtained a best accuracy of
72.14% with representative frames as input.
The standard UCF101 data set gave an F1-score of 0.717 when classified on 101
classes of 13,320 videos with full frames, and 0.704 when classified with
representative frames. From the results obtained, it is evident that using
representative frames as input to the classifier gives a result almost the same as
that obtained using all frames of all videos, while saving considerable training
time.
Table 25.1 gives the comparison of the F1-scores obtained for the different models
with all frames taken into consideration for classification. Here, with full frames,
the Inceptionv3 model gives an F1-score of 0.717.
Table 25.2 gives the comparison of the F1-scores obtained for the different models
with representative frames taken into consideration for classification. Here, with
representative frames, the Inceptionv3 model gives an F1-score of 0.704. The
difference is small, giving approximately the same performance.
Table 25.1 Comparison of various classification models without representative frames

No.  Model        Recall  Precision  F1-score  No. of parameters  Model size  Videos per min
1    ResNet 50    58.47   67.14      0.625     42,548,524         175.54 MB   15
2    MobileNet    57.15   66.68      0.615     4,294,462          17.85 MB    41
3    VGG 16       62.49   57.37      0.598     15,145,245         60.45 MB    23
4    VGG 19       65.99   63.71      0.648     20,448,235         85.41 MB    19
5    Inceptionv3  72.85   70.63      0.717     23,784,238         94.78 MB    21
Table 25.2 Comparison of various classification models—with representative frames

No.  Model        Recall  Precision  F1-score  No. of parameters  Model size  Videos per min
1    ResNet 50    54.18   59.47      0.567     42,548,524         175.54 MB   22
2    MobileNet    52.12   58.15      0.549     4,294,462          17.85 MB    53
3    VGG 16       61.49   54.28      0.576     15,145,245         60.45 MB    29
4    VGG 19       64.91   61.58      0.632     20,448,235         85.41 MB    26
5    Inceptionv3  71.45   69.43      0.704     23,784,238         94.78 MB    32
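The reported F1-scores can be checked against the precision and recall columns of Tables 25.1 and 25.2, since F1 is the harmonic mean of the two:

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; inputs in percent,
    result expressed as a ratio, as in Tables 25.1 and 25.2."""
    return 2 * precision * recall / (precision + recall) / 100

# Inceptionv3 rows: full frames and representative frames.
full = round(f1(70.63, 72.85), 3)   # 0.717
rep = round(f1(69.43, 71.45), 3)    # 0.704
```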
25.5 Conclusion
Convolutional neural networks are a prominent class of models for video
classification tasks. The performance of the pretrained model taking all the frames
of the videos as input was comparable to that of the pretrained model taking
representative frames alone as input. Our proposed method of using representative
frames as input to the classifier performs well with a significant reduction in
computational time and hence expedites the training process.
References
1. Ferman, A. M., & Tekalp, A. M. (1999). Probabilistic analysis and extraction of video content.
In Proceedings of ICIP (Vol. 2, pp. 91–95).
2. Yuan, Y., Song, Q.-B., & Shen, J.-Y. (2002). Automatic video classification using decision tree
method. In Proceedings of Machine Learning and Cybernetics (pp. 1153–1157).
3. Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., & Fei-Fei, L. (2014). Large-
scale video classification with convolutional neural networks. In Proceedings of International
Computer Vision and Pattern Recognition (CVPR 2014). IEEE.
4. Simonyan, K., & Zisserman, A. (2014). Two stream convolutional networks for action
recognition in videos. CoRR, abs/1406.2199, 1–8.
5. Bhardwaj, S., Srinivasan, M., & Khapra, M. M. (2019). Efficient video classification using
fewer frames. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern
Recognition.
6. Xu, Z., Hu, J., & Deng, W. (2016). Recurrent convolutional neural network for video
classification. In 2016 IEEE International Conference on Multimedia and Expo (ICME). IEEE.
7. Prasanna Lakshmi, K., Solanki, M., Jyothi, & Bhargav, A. (2020). Video genre classification
using convolutional recurrent neural networks. International Journal of Advanced Computer
Science and Applications, 11(3).
8. Gomathi, V. (2005). Content based video indexing and retrieval. M.Tech thesis, Dept. of
Computer Science and Engineering, Indian Institute of Technology Madras.
9. Yusoff, Y., Christmas, W., & Kittler, J. (2000). Video shot cut detection using adaptive thresh-
olding. In Proceedings of the British Machine Vision Conference 2000, British Machine Vision
Association and Society for Pattern Recognition, Bristol, UK, 11–14 Sept 2000 (p. 37).
Chapter 26
Text Recognition from Images Using
Deep Learning Techniques
B. Narendra Kumar Rao, Kondra Pranitha, Ranjana, C. V. Krishnaveni,
and Midhun Chakkaravarthy
Abstract One of the most significant methods utilized in the deep learning approach
is text recognition. Text recognition is now a very significant activity that is utilized in
many applications of current gadgets to recognize images in a detailed manner. Auto-
matic number plate recognition, for example, is an image processing approach that
detects the vehicle’s number (license) plate. The automatic number plate recognition
system (ANPR) is a key feature that is used to manage traffic congestion. The goal of
ANPR is to devise a method for automatically identifying permitted vehicles using
vehicle numbers. Automatic number plate recognition (ANPR) is utilized in a variety
of applications, including traffic control, vehicle tracking, and automatic payment
of tolls on roads and bridges, as well as monitoring systems, parking management
systems, and toll collecting stations. The established approach first detects the
vehicle and captures a picture of it. After that, the number plate region of the car
is localized using a neural network, and the image is segmented. Using a character
recognition approach, the characters are retrieved from the plate. The results,
together with the time stamp, are then saved in the database. The system is
implemented in Python, and the results are tested on real pictures.
B. Narendra Kumar Rao (B)
Computer Science and Engineering, Sree Vidyanikethan Engineering College, Tirupati, India
e-mail: narendrakumarraob@gmail.com
K. Pranitha
Computer Science and Engineering, B V Raju Institute of Technology, Narasapur, Medak,
Telangana, India
Ranjana
Department of IT, Sairam Institute of Technology, Chennai, Tamil Nadu, India
C. V. Krishnaveni
SKR & SKR GCW(A), Kadapa, India
M. Chakkaravarthy
Faculty of Computer Science and Multimedia, Lincoln University College, Kota Bharu, Malaysia
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
B. N. K. Rao et al. (eds.), Intelligent Computing and Applications, Smart Innovation,
Systems and Technologies 315, https://doi.org/10.1007/978-981-19-4162-7_26
26.1 Introduction
Text recognition is one of the important methods used in deep learning.
Nowadays, text recognition performs a most important task in many applications
of modern devices, where it is used to identify the contents of an image in a
detailed manner. For example, an automatic number plate recognition system is
used to recognize the number plate of a vehicle. The method presented here gives
good results in recognizing text from images with many lines, images of roads,
and car plate numbers.
To eliminate inadequacies in CCTV camera surveillance, an automatic number
plate recognition system is implemented [13]. The ANPR system is widely utilized
in developed nations such as the United States, the United Kingdom, and Germany.
It has been in use for a long time, but became particularly essential in the 1990s
due to growth in the number of cars. The data acquired from the license plate
are generally utilized by law enforcement agencies for traffic monitoring, parking,
motorway road tolling, access control, border control, travel time measurement for
toll booths, etc. The recognition problem is generally subdivided into five parts:
(1) image capturing, i.e., capturing the image of the license plate; (2) preprocessing
the image; (3) localizing the license plate; (4) character segmentation, i.e.,
identifying the characters on the plate; (5) optical character recognition. It helps
to fine-tune the system with parameters such as the number of characters in the
license plate and the text luminance level. The problem can be simplified based
on the application in a specific country. For example, in India the standard is to
print the license plate number in black on a white background for private vehicles,
while a yellow background is used for commercial vehicles (Fig. 26.1).
The image binarization mechanism is used to convert an image to black and white.
Here, a threshold is used to classify certain pixels as white and others as black. The
issue is finding the exact threshold value for a particular image; sometimes it is
impossible to select a single optimal threshold. To overcome this problem, an
adaptive threshold is used, and this mechanism of selecting the threshold is called
automatic threshold selection.
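A minimal sketch of mean-based adaptive thresholding on a grayscale image stored as a list of rows; the neighbourhood size and offset parameters are illustrative choices, not taken from the chapter:

```python
def adaptive_binarize(image, block=3, offset=0):
    """Adaptive threshold: compare each pixel against the mean of its
    (2*block+1) x (2*block+1) neighbourhood instead of one global value,
    so uneven lighting does not swamp dark characters."""
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [image[j][i]
                    for j in range(max(0, y - block), min(h, y + block + 1))
                    for i in range(max(0, x - block), min(w, x + block + 1))]
            t = sum(vals) / len(vals) - offset
            out[y][x] = 255 if image[y][x] > t else 0
    return out
```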
Many operational algorithms are used for edge detection [4], among them Canny,
Canny–Deriche, differential, Sobel, and Prewitt. The Hough transform is a feature
extraction technique initially used for line detection; the method can also find the
position of parametric shapes such as circles or ellipses.
Blob detection is a method used to detect points or regions that differ in
brightness or color from their surroundings [5]. This method is
Fig. 26.1 Block diagram of
ANPR model
used to find complementary regions that cannot be detected by edge or corner
detection algorithms.
After the number plate is detected, the selected characters are tested in the
further processing stages. For plate segmentation, many methods are used to
recognize the characters; currently, image binarization and connected component
analysis (CCA) are among the methods used for character segmentation. Characters
are isolated using character segmentation, and it is an important prerequisite for
performing character recognition with good accuracy: errors in character
segmentation can make character recognition impossible. Vertical and horizontal
projections give good results for segmentation.
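Vertical-projection segmentation can be sketched as follows, assuming a binarized plate image with ink pixels marked 1; characters are split wherever a column contains no ink (the function name is ours):

```python
def segment_columns(binary):
    """Vertical-projection character segmentation: sum each column of a
    binary plate image and cut at empty columns, returning the
    [start, end) column range of each character."""
    w = len(binary[0])
    proj = [sum(row[x] for row in binary) for x in range(w)]
    segments, start = [], None
    for x, v in enumerate(proj):
        if v and start is None:
            start = x
        elif not v and start is not None:
            segments.append((start, x))
            start = None
    if start is not None:
        segments.append((start, w))
    return segments
```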
Character recognition converts the image text into editable text. It is a process of
transforming data from a bitmap representation into a form of representation apt
for computers. Here, character recognition should be invariant toward the user's
font type or deformations caused by skew.
An artificial neural network model [6–9] is used to differentiate the characters. It
contains three layers: input, hidden, and output. The input layer receives the data
for decision-making; the hidden layer computes more complex relations, and the
output layer provides the final result. A neural network is mathematically termed
an artificial neural network (ANN), and many training algorithms exist; here, the
ANN is trained using the feed-forward backpropagation (BP) algorithm, which is
suitable where a processing time of 0.06 s is essential.
Template matching is used to recognize fixed-size letters and to find objects in
face detection and medical image processing [10]. There are two types of template
matching in this context: feature-based matching and template-based matching.
When the template picture includes prominent characteristics, feature-based
matching is advantageous; otherwise, template-based matching is preferable. To
achieve an 85% character recognition rate, statistical feature extraction is used
here [11]. To adjust characters of varying size, we use a linear normalization
algorithm.
The objectives include:
To review other algorithms which recognize the number plates of vehicles.
To propose a method that solves the automated number plate recognition problem.
To propose an algorithm used to evaluate and test the process, and to present the
evaluation results.
An automatic number plate recognition system is used to overcome the drawbacks
and deficiencies of CCTV camera surveillance [12, 13]. Automatic number plate
identification does well in recognizing text from images with several lines, images
with road names, and automobile plate numbers.
The text recognition problem is generally subdivided into five parts:
Image acquisition, i.e., capturing the image of the license plate [14].
Preprocessing the image, i.e., normalization, adjusting the brightness and contrast
of the image.
Localizing the license plate.
Character segmentation, i.e., identifying and locating the individual symbol images [15].
Optical character recognition.
Recognizing the model for an image analysis procedure helps us understand how
images are represented, including optical, analog, and digital images. For image
segmentation, many image types are used, such as binary images, grayscale
images, and color and multi-spectral images.
The fast expansion of urban and national road networks in recent years has
necessitated effective road traffic monitoring [16] and management. The increased
usage of vehicles produces societal concerns such as accidents, traffic congestion,
and, as a result, traffic pollution. The process of detecting and recognizing the
vehicle number plate or license plate is known as number plate recognition [17].
We employ image processing techniques here to extract the car license plate from
digital photographs. Number plate recognition is made up of two parts: a camera
that captures vehicle number plate photographs, and software that extracts the
numbers from the license plate using a character recognition tool that converts
pixels into readable letters and numerical data. Vehicle tracking, traffic monitoring,
automatic payment of tolls on roads and bridges, surveillance systems, toll
collection stations, and parking management systems are just a few of the uses.
The algorithm under consideration is divided into four steps:
i. Vehicle image acquisition
ii. Number plate extraction
iii. Character segmentation and
iv. Character recognition.
We employ a few procedures to recognize a text number plate. The first stage is to
capture a picture of the car. Capturing an image is not an easy problem:
photographing a vehicle moving in real time in such a way that the number plate
is not missed is a difficult task. The fourth step's success is determined by how well
the second and third steps locate the vehicle number plate and separate each
character.
26.2 Related Work
The detection of a number plate region is the initial stage in this procedure. This is
accomplished by incorporating algorithms capable of detecting a rectangular region
of the number plate in an original image. There are four key phases in the detection
and recognition process.
i. Preprocessing
ii. Localization
iii. Segmentation
iv. Recognition.
The system acquires pictures of a variety of vehicles, which are then fed into the
computer code, which converts them to grayscale images. To extract the number
plate and its characters, the contrast, brightness, and gamma are adjusted to their
optimal values. The result is recorded in the software, which verifies on each loop
whether it has all the digits of the number plate. When the results fulfill the
specified requirements, the computer displays the number and ends the program
execution so that the next image may be analyzed. The following steps are
involved in developing the system.
The input image includes a lot of colors; therefore, it is preprocessed to improve
its quality and get it ready for the following steps. Because the image contains
varied colors, the system uses the NTSC standard to transform the RGB images to
grayscale images.
Gray = 0.3 Red + 0.58 Green + 0.11 Blue
In the following part, the gray picture is filtered with a median filter to reduce
noise while maintaining image clarity. We utilize a nonlinear filter that replaces
each pixel with the median of the pixel values in its neighbourhood. The image is
then categorized into groups by

Total number of groups = Height / Candidate region extraction
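The grayscale conversion and the median filtering step can be sketched together; the weights follow the chapter's equation (close to the NTSC luma weights 0.299/0.587/0.114), and the 3 × 3 window size is an illustrative choice:

```python
def to_gray(rgb):
    """Weighted RGB-to-gray conversion with the chapter's coefficients."""
    return [[0.3 * r + 0.58 * g + 0.11 * b for r, g, b in row] for row in rgb]

def median3(gray):
    """3x3 median filter: replace each interior pixel by the median of its
    neighbourhood, suppressing isolated noise while keeping edges."""
    h, w = len(gray), len(gray[0])
    out = [row[:] for row in gray]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            vals = sorted(gray[j][i] for j in (y - 1, y, y + 1)
                          for i in (x - 1, x, x + 1))
            out[y][x] = vals[4]  # median of 9 values
    return out
```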
26.2.1 Localization
It is used to recognize the plate region in an image. The main goal is to locate
the car plate region using pictures of the vehicle captured by the camera or video.
The image’s quality is a crucial element of this approach, and preparing the image
helps to improve it. In general, number plates appear to have a large amount of
free space within the image. Because the numerals and letters are in the same row,
there are frequent variations in intensity horizontally. Because the rows that will hold
the number plate are expected to display significant fluctuations, this allows for the
detection of changes in horizontal intensity. The difference between the letters and
the backdrop is the cause of this extreme fluctuation. The Hough transform is now
applied to the binary pictures that have been scaled. The pictures are subjected to
edge detection before being input into the Hough transforms program. Edges help
to describe boundaries and are hence a crucial element to consider while processing
a picture. Big O notation can be used to represent the complexity of a Canny edge
operation:
O_cN = O_N · O_M · O_C^(C+1)
270 B. Narendra Kumar Rao et al.
where O_N, O_M, and O_C denote the complexities of the first, second, and third
loops, respectively, inside the code. Edges in an image are the places where there
is a rapid change in intensity from one pixel to the next.
Detecting edges lowers the quantity of data in a picture, which aids in filtering out
unnecessary data while maintaining the image’s structural characteristics. The Hough
transformation is a common image analysis method that allows us to detect global
patterns in images by recognizing local patterns in a revised parameter space.
ρ = x cos θ + y sin θ
It is most beneficial when looking for patterns in regions that are sparsely
digitized and have "holes", and when the images are noisy, especially when a
straight line within the license plate is to be discovered. The main goal of this
method is to find curves that can be parameterized as straight lines within a
reasonable parameter range. The vehicle area is identified using the Hough
transform [18].
The following is a depiction of the license plate region:
C_region = { 0, if the pixel is a black pixel in the plate region (row × column hit); 255, otherwise }
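The Hough line parameterization ρ = x cos θ + y sin θ described above can be sketched as a minimal accumulator-voting procedure. This is an illustrative NumPy sketch, not the chapter's exact implementation; the resolution of 180 angle bins is an assumed choice.

```python
import numpy as np

def hough_lines(binary, n_theta=180):
    """Vote in (rho, theta) space for every foreground pixel, using
    rho = x*cos(theta) + y*sin(theta). Peaks in the accumulator mark lines."""
    h, w = binary.shape
    diag = int(np.ceil(np.hypot(h, w)))          # maximum possible |rho|
    thetas = np.linspace(0, np.pi, n_theta, endpoint=False)
    acc = np.zeros((2 * diag + 1, n_theta), dtype=np.int32)
    ys, xs = np.nonzero(binary)
    for x, y in zip(xs, ys):
        rhos = np.round(x * np.cos(thetas) + y * np.sin(thetas)).astype(int)
        acc[rhos + diag, np.arange(n_theta)] += 1  # shift rho to a non-negative index
    return acc, thetas, diag
```

A horizontal edge row produces a single strong peak at θ = π/2, which is exactly the kind of horizontal-intensity structure the plate localization step looks for.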
26.2.2 Segmentation
After identifying the number plate region in an image/video, this is the following
step to segment characters. It is one of the most essential procedures in automatic
number plate identification, when all phases are taken into account. If segmentation
fails, a character may be incorrectly united or separated into two halves. If only one-
row plates are assumed, segmentation can be accomplished by identifying character
boundaries.
The acquired segments are improved in the second part of the segmentation. The
segment phase of a plate comprises not just characters, but also unwanted elements
such as dots and superfluous space on the character’s borders. These elements must
be removed, leaving only the character. By identifying the spaces in its horizontal
projection, we can partition it. We regularly apply an adaptive threshold filter to
enhance the plate region before the segmentation phase. With non-uniform lighting,
the adaptive threshold helps distinguish dark foreground from light background.
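A minimal sketch of such an adaptive (local-mean) threshold, assuming a NumPy grayscale array. The block size and offset constant `c` are illustrative choices, not values from the chapter:

```python
import numpy as np

def adaptive_threshold(gray, block=15, c=5):
    """Binarize with a per-pixel threshold: a pixel becomes foreground (0) when it
    is darker than the mean of its local block minus c, else background (255)."""
    pad = block // 2
    padded = np.pad(gray.astype(float), pad, mode="edge")
    out = np.empty_like(gray, dtype=np.uint8)
    h, w = gray.shape
    for i in range(h):
        for j in range(w):
            local_mean = padded[i:i + block, j:j + block].mean()
            out[i, j] = 0 if gray[i, j] < local_mean - c else 255
    return out
```

Because the threshold follows the local mean, a dark character stays foreground even when the background brightness drifts across the plate.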
26 Text Recognition from Images Using Deep Learning Techniques 271
26.2.3 Recognition
The output is compared and identified against databases with completely distinct
algorithmic rules. On the acquired image, grayscale segmentation is performed. We
had to conduct some image preprocessing before creating the model. Following that,
take the following steps:
a. Binarization.
b. Inversion of the character intensities.
Register the character's connected components, as well as the smallest bounding
rectangle comprising these connected components. The picture size is normalized to
15 × 15 pixels. For each character, the intensity values are stored using the algorithm
rule listed below. We then compute the segmented characters' matching score against
the stored character templates using the algorithmic rule that follows. We compare
the pixel values of the segmented character matrix with the template matrix, adding
1 to the matching score for each match and subtracting 1 for each mismatch, applied
to all 225 pixels. For each template, a match score is generated, and the best score
yields the recognized character. For recognition, character sets are employed.
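The ±1 matching-score rule over the 15 × 15 (225-pixel) normalized character can be sketched as follows; the template dictionary here is a hypothetical stand-in for the stored character templates:

```python
import numpy as np

def match_score(segment, template):
    """+1 for each matching pixel and -1 for each mismatch over all 225 pixels."""
    assert segment.shape == template.shape == (15, 15)
    matches = np.count_nonzero(segment == template)
    return matches - (segment.size - matches)

def recognize(segment, templates):
    """Return the character whose template gives the best matching score."""
    return max(templates, key=lambda ch: match_score(segment, templates[ch]))
```

A perfect match scores +225; a fully inverted character scores −225, so the best score directly identifies the recognized character.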
26.2.4 Datasets
The GTI dataset was created on roads in Madrid, Brussels, and Turin. The
data were gathered from video sequences captured by a camera mounted on the
vehicle’s front. On one hand, there are 3425 positive rear-view pictures, which contain
the vehicles’ number plates and are captured from a range of viewpoints [19]. On
the other hand, there are 3900 negative photos (taken from road sequences) that
contain no vehicles and hence no number plates. To bring the totals to 4000 positive
and 4000 negative images, a limited number of photographs from the Caltech and
TU Graz-02 databases were added.
The Markus Weber Cars dataset was taken at a parking lot of the California Institute
of Technology by Markus Weber. It is not a large dataset, as it contains only 126
pictures with a resolution of 896 × 592 pixels, all of which are stored in JPG format
[20]. This dataset contains only pictures taken from the back, and it includes only
saloon vehicles, not trucks or buses. All of this dataset’s pictures were captured under
the same conditions, namely sunny days. Images shot at night, in low illumination,
in rain, or in shadow are not included in the collection. In addition to all of these
limitations, the dataset has no tilt, rotation, or significant translation.
26.3 Proposed Work
Image acquisition, license plate extraction, character segmentation, and character
identification are the four processes of a typical ANPR system.
26.3.1 Image Acquisition
The first stage in this system is image acquisition, which involves taking a picture
using a digital camera connected to a computer. Because the pictures were recorded
in RGB format, they could be processed further for number plate extraction. The
database system stores the car owner’s personal information as well as a few plate
vehicle pictures, abbreviations, and acronyms.
26.3.2 Image Processing
The RGB picture is captured. Many factors impact the acquired image, such as optical
system distortion, system noise, underexposure, or excessive relative motion of the
camera or vehicle. The consequence is a degraded captured
vehicle image and an unfavorable influence on subsequent image processing. As a
result, prior to the primary image processing, the acquired image must be prepro-
cessed, which includes converting RGB to gray scale, noise removal, and border
enhancement for brightness.
26.3.3 Plate Localization
The primary purpose of vehicle number plate localization is to determine the plate
region. Number plates are roughly rectangular, and the region props function is
provided by the MATLAB toolbox. It computes a collection of attributes for each
labeled region in the matrix. In this case, we used the bounding box to estimate
the picture region’s attributes. After the labeling of the connected components, the
plate area is extracted from the input picture, and the localization of the number plate
is complete.
In an ANPR system [21], number plate segmentation is critical. After the region has
been grown, the most important thing to establish is the criteria for a good region.
The following procedure is used to determine which picture pixels fulfill the criteria:
at every location where such a pixel is found, its neighbors are evaluated, and
if any of them meet the criteria, both pixels are considered to be in the same
region. We extract individual characters and numbers from the picture using vertical
and horizontal scanning techniques.
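The vertical-scanning idea above can be sketched with a projection profile: columns of the binarized plate that contain no foreground pixels mark the gaps between characters. This is an illustrative NumPy sketch, assuming a binary image with foreground pixels set to 1:

```python
import numpy as np

def segment_characters(binary):
    """Split a binarized plate into character sub-images using the vertical
    projection: zero-sum columns are treated as gaps between characters."""
    col_sums = binary.sum(axis=0)
    chars, start = [], None
    for j, s in enumerate(col_sums):
        if s > 0 and start is None:
            start = j                      # character begins
        elif s == 0 and start is not None:
            chars.append(binary[:, start:j])  # character ends at a gap
            start = None
    if start is not None:                  # character touching the right edge
        chars.append(binary[:, start:])
    return chars
```

The same idea applied along rows (horizontal scanning) isolates the text line before the per-character split.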
26.3.4 Character Recognition
This is the most crucial and fundamental step of the ANPR system. It demonstrates the
procedures needed to sort and then interpret the individual characters. The retrieved
characteristics are used to classify the data. By using statistical, syntactic, or neural
methodologies, features are arranged. For recognition of letters and characters in
the plate, we use distinct strategies. The identification process is completed by
calculating the similarity of features. A second identification is made for similar
characters using a feature-point matching approach. In another approach, once the
line is extracted from the plate image, individual characters can be separated
column-wise using the line separation process; individual characters are then split
and kept in separate variables. The extracted characters from the number plate must
then be matched against the character database. The next phase is template matching. Template
matching is a powerful character recognition method. The image of the character is
compared to our database, and the best resemblance is chosen. Another approach for
character identification is optical character recognition (OCR), which compares each
individual character to the whole alphanumeric database. To match individual characters,
the OCR uses a correlation mechanism, and the number is eventually recognized
and saved in string format in a variable. After that, the string is compared against
the vehicle authorization database, and the resulting indicators are presented. Every
character, A–Z and 0–9, will have its own
template, as shown in the diagram (Fig. 26.2).
Fig. 26.2 Database of
templates
The suggested technique is depicted in the block diagram. In this technique, the
input picture is converted to gray scale, and morphological scanning is applied to
the grayscale image, which checks the image and identifies the number plate section
of the vehicle. The extracted number plate is then passed to split segmentation,
which divides every single character and compares each character to the current
dataset. The maximum correlation is identified, and the matched characters are
combined to form the final output.
26.4 Proposed Method
In the previous section, we discussed an overview of the model. In this section, we
present the complete architecture that we are using for the project. Here, we use a
convolutional neural network.
26.4.1 Convolution Neural Network
In deep learning, a convolutional neural network (CNN/ConvNet) is a type of deep
neural network used to analyze visual imagery. It is built around the convolution
technique. Convolution is a mathematical operation that takes two functions and
produces a third function that shows how the shape of one is modified by the other (Fig. 26.3).
Let’s go through the basics of what an image is and how it is portrayed before
we get into how CNN works. A grayscale image is nothing more than a single-plane
matrix of pixel values, whereas an RGB image is a three-plane matrix of pixel values.
To learn more, take a look at this diagram (Fig. 26.4).
Convolutional neural networks are made up of many layers of artificial neurons.
Here, the image is converted to gray scale before being processed by the convolutional
neural network (Fig. 26.5).
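The convolution operation described above can be illustrated with a direct (valid-mode) sliding-window implementation in NumPy; CNN frameworks use optimized versions of this same idea:

```python
import numpy as np

def conv2d(image, kernel):
    """Valid-mode 2D cross-correlation of a single-channel image with a kernel:
    slide the kernel over the image and sum the element-wise products."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out
```

Note that without padding, a k × k kernel shrinks each spatial dimension by k − 1.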
Fig. 26.3 Convolution neural network
Fig. 26.4 Multiple layers of
artificial neurons
Fig. 26.5 2D convolution
for pooling
26.4.2 Pooling
The pooling layer, like the convolutional layer, is responsible for shrinking the
convolved feature’s spatial size. By decreasing the size, the processing power required
to process the data is reduced. Average pooling and maximum pooling are the two
forms of pooling. So far, I have just worked with Max Pooling and haven’t run across
any issues.
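Max pooling over non-overlapping windows, as described above, can be sketched as follows (a 2 × 2 window, the common default, halves each spatial dimension):

```python
import numpy as np

def max_pool(feature_map, size=2):
    """Non-overlapping max pooling: keep the maximum of each size x size block,
    shrinking each spatial dimension by the pooling factor."""
    h, w = feature_map.shape
    trimmed = feature_map[:h - h % size, :w - w % size]  # drop ragged edges
    return trimmed.reshape(h // size, size, w // size, size).max(axis=(1, 3))
```

Average pooling is obtained by replacing `max` with `mean` in the final reduction.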
26.4.3 Padding
Padding is a term used in convolutional neural networks to refer to the number of
pixels added around an image during CNN kernel processing. With zero padding,
every added pixel has the value 0; for example, if the zero padding is set to one, the
image gets a one-pixel border with a pixel value of zero (Fig. 26.6).
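Zero padding of p pixels, as described, simply surrounds the image with a border of zeros:

```python
import numpy as np

def zero_pad(image, p=1):
    """Add a border of p pixels with value 0 around the image."""
    return np.pad(image, p, mode="constant", constant_values=0)
```

With padding p, kernel size k, and stride 1, the output width is n + 2p − k + 1, so choosing p = (k − 1) / 2 for odd k preserves the input size ("same" padding).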
Fig. 26.6 Padding = same
26.5 Implementation
26.5.1 Models of CNN
Training CNNs is used to evaluate picture-classification performance; several
alternative CNN-based classification models are generated. I will be using the Keras
framework to create our model. Some of the models used are as follows:
i. CNN with 1 convolutional layer
ii. CNN with 3 convolutional layers
iii. CNN with 4 convolutional layers.
To optimize the classifier, the original training data (60,000 images) are divided
into 80% training (48,000 images) and 20% validation (12,000 images), while the
test data (10,000 images) are kept aside to finally evaluate the model’s accuracy on
data it has never seen. This allows me to determine whether I am over-fitting on the
training data: whether I should drop the learning rate and train for more epochs if
validation accuracy is greater than training accuracy, or whether I should stop
training if validation accuracy falls below training accuracy.
26.5.2 OCR Using OpenCV and CNN
Our strategy will not be to recognize everything in an image in one go, but rather to
segment the image into characters, send these segmented characters to CNN to be
recognized, and then arrange the detected characters to duplicate the text shown in
the image (Fig. 26.7).
The method is illustrated in the diagram.
Fig. 26.7 OCR using OpenCV and CNN
26.6 Conclusion
We have successfully developed a fully working number plate recognition system
that uses a convolutional neural network in association with character recognition
and character segmentation, with license plate detection used to locate the number
plate. The majority of the number plate recognition system’s components have been
successfully deployed. Our proposed approach works in general situations where the
distance between the camera and the vehicle is unrestricted and weather conditions
are unfavorable. However, when the distance between the camera and the vehicle
remains constant, the performance of our system improves. To enhance the segmentation
stage, we collect successfully trained data. Other prominent methods, such
as artificial neural networks, can help enhance optical character recognition. I intend
to create an automatic number plate recognition system with its own database, user
interface, and authorization system based on number plate identification.
References
1. Cheng, C., Koschan, A., Chen, C. –H., Page, D. L., Abidi, M. A. (2012). Outdoor scene image
segmentation based on background recognition and perceptual organization. IEEE Transactions
on Image Processing, 21(3), 1007–1019
2. Mehmood, S., Cagnoni, S., Mordonini, M., & Khan, S. A. (2012). An embedded architecture
for real-time object detection in digital images based on niching particle swarm optimization.
Journal of Real-Time Image Processing, 1–15
3. Kurugollu, F., Sankur, B., & Harmanci, A. E. (2002). Image segmentation by relaxation using
constraint satisfaction neural network. Image and Vision Computing, 20(7), 483–497
4. Sarfraz, M. S., et al. (2011). Real-Time automatic license plate recognition for CCTV forensic
applications. Journal of Real-Time Image Processing.
5. Zhuang, D., & Zang, W. (2010) Content-Based image retrieval based on integrating region
segmentation and relevance feedback. In International Conference on Multimedia Technology
(ICMT)
6. Ong, S. H., Yeo, N. C., Lee, K. H., Venkatesh, Y. V., & Cao, D. M. (2002). Segmentation of
color images using a two stage self-organizing network. Image and Vision Computing, 20(4),
279–289.
7. Navon, E., Miller, O., & Averbuch, A. (2005). Color Image segmentation based on adaptive
local thresholds. Image and Vision Computing, 23(1), 69–85.
8. Zhuge, Y., Udupa, J. K., & Saha, P. K. (2006). Vector scale-based fuzzy-connected image
segmentation. Computer Vision and Image Understanding, 110(2), 177–193.
9. Crevier, D. (2008). Image segmentation algorithm development using ground truth image data
sets. Computer Vision and Image Understanding, 112(2), 143–159.
10. Lalimi, M. A., Ghofrani, S., & McLernon, D. (2012). A vehicle license plate detection method
using region and edge based methods. Computers & Electrical Engineering
11. Wang, S.-Z., & Lee, H.-J. (2009). A cascade framework for real-time statistical plate recognition
system. IEEE Transactions on Information Forensics Security, 2(2), 267–282. Kulkarni, P.,
Khatri, A., Banga, P., & Shah, K. (2009). Automatic number plate recognition (ANPR). In
Radioelektronika, 19th International Conference.
12. Naito, T., Tsukada, T., Kozuka, K., & Yamamoto, S. Robust license-plate recognition method
for passing vehicles under outside environment. IEEE Transactions on Vehicular Technology,
49,(6).
13. Janakiramaiah, B., Kalyani, G., & Jayalakshmi, A. (2021). Automatic alert generation in a
surveillance systems for smart city environment using deep learning algorithm. Evolutionary
Intelligence, 14, 635–642.
14. Lum, C., Hibdon, J., Cave, B., Koper, C. S., & Merola, L. (2011). License plate reader (LPR)
police patrols in crime hot spots: An experimental evaluation in two adjacent jurisdictions.
Journal of Experimental Criminology, 321–345.
15. Chen, J. -J., Su, C. -R., Grimson, W. E. L., Liu, J. L., & Shiue, D. -H. (2012). Object segmentation
of database images by dual multiscale morphological reconstructions and retrieval applications.
IEEE Transactions on Image Processing, 21(2), 828–843.
16. Suresh, K. V., Mahesh Kumar, G., Rajagopalan, A. N. (2007) Super resolution of license plates
in real traffic videos. IEEE Transactions on Intelligent Transportation System, 8(2), 321–331.
17. Sengur, A., & Guo, Y. (2011). Color texture image segmentation based on neutrosophic
set and wavelet transformation. Computer Vision and Image Understanding, 115(8),
1134–1144.
18. Ballard, D. H. (1981). Generalizing the Hough transform to detect arbitrary shapes. Pattern
Recognition, 13(2), 111–122.
19. Peng, B., Zhang, L., & Zhang, D. (2011). Automatic image segmentation by dynamic region
merging. IEEE Transactions on Image Processing, 20(12), 3592–3605
20. Tian, Y., Yap, K. -H., & He, Y. (2012). Vehicle license plate super-resolution using soft learning
prior. Multimedia Tools and Applications, 519–535
21. Wu, H., & Li, B. (2011). License plate recognition system. In International Conference on
Multimedia Technology (ICMT) (pp. 5425–5427).
22. Chen, R., & Luo, Y. (2012). An improved license plate location method based on edge detection.
Physics Procedia, 24, 1350–1356.
Chapter 27
Early Detection and Diagnosis of Oral
Cancer Using Fusioned Deep Neural
Network
Sree T. Sucharitha, I. Kannan, and K. A. Varun Kumar
Abstract In recent days, oral cancer cases are increasing significantly due to rising
tobacco consumption, in combination with alcohol consumption, poor oral hygiene,
and human papilloma virus (HPV) infection. Early detection of this kind of cancer is
preventive; otherwise, it may lead to premature death. 50% of cases are detected in
advanced stages. For the above reasons, it is important to develop a new model to
detect oral cavity cancer at an early stage from digital data and image processing
techniques. Research in the detection of oral cancer has been highly active since the
twentieth century. In this paper, the detection of oral cancer with a fusion model of
CNN + RNN is proposed. The proposed model outperforms state-of-the-art techniques
in the detection of oral cancer with 82% accuracy. The obtained results are analyzed
with a systematic approach, and we aim to ensure reliable diagnosis of oral cancer in
the near future. The intention of the proposed method is to improve the detection
accuracy in the early diagnosis of oral cavity cancers.
27.1 Introduction
Oral cancers are part of a group of diseases usually referred to as head and neck
cancers, and of all head and neck cancers, they comprise around 85% of that
category. Brain cancer is a disease class unto itself and is excluded from the head
and neck cancer group. Historically, the death rate is connected with these cancers
because they are hard to detect at an early stage and remain complex to diagnose.
Even in 2021, cancer cell detection and diagnosis remain complex in the medical
sector. Oral cancer is more dangerous than others because detecting the origin of
the cancer is difficult; even when treated at an early stage, it may
S. T. Sucharitha ·I. Kannan
Department of Community Medicine, Tagore Medical College and Hospital, Chennai, India
K. A. Varun Kumar (B)
School of Computing, SRM Institute of Science and Technology, Kattankulathur, India
e-mail: varun.kumar300@gmail.com
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
B. N. K. Rao et al. (eds.), Intelligent Computing and Applications, Smart Innovation,
Systems and Technologies 315, https://doi.org/10.1007/978-981-19-4162-7_27
281
282 S. T. Sucharitha et al.
significantly regrow even after the person has completed treatment. Oral cancer may
also affect the lungs and liver, because of its localization in the intra-oral area, in later
stages. Oral cancer is particularly dangerous because in its starting stages it may not
be noticed by the patient, as it can often thrive without producing pain or symptoms
the patient would readily see, and because it has a high risk of producing second
primary tumors. This means that patients who survive a first occurrence of the disease
have a substantially higher risk of developing a second malignancy. This elevated risk
can last for 5–10 years after the first occurrence. There are several types of oral
cancers, but around 90% are squamous cell carcinomas. The other, far rarer, oral
malignancies are the ACC and MEC cancers, which by comparison are mostly
uncommon yet particularly dangerous, as the depth of knowledge about them is not
nearly that of SCC. It is speculated that the aging of diagnosed patients may show a
time component in the biochemical or biophysical processes of aging cells that
permits malignant transformation, or perhaps immune system capability diminishes
with age. Notably, recent data lead us to think that the fastest-growing segment of the
oral cancer population comprises non-smokers under the age of fifty, which would
indicate a paradigm shift in the cause of the disease and in the areas of the oral
environment where it most frequently occurs. Cancers of the anterior of the mouth,
related to tobacco and alcohol, have declined along with a corresponding decrease in
smoking, while cancers of the posterior oral cavity sites associated with the HPV16
viral cause are increasing. So, while speaking in generalities to the public, many refer
to these two distinctly different diseases (oral and oropharyngeal) as “oral cancer,”
and while technically not precise, this is viewed as normal in general public messaging.
Prolonged exposure to sunlight is a causative agent in cancers of the lip, as well as
other skin cancers. Cancer of the lip is one oral cancer whose numbers have declined
over the last couple of decades. This is likely due to the increased awareness of the
damaging effects of prolonged exposure to sunlight and the use of sunscreens for
protection. Another physical factor is exposure to x-rays. Radiographs routinely
taken during examinations at the dental office are safe, but remember that radiation
exposure is cumulative over a lifetime. It has been implicated in several head and
neck tumors.
Various Stages of Oral Cancers
Stage-0:
Stage 0 is the beginning stage of the cancer, which is medically called “carcinoma
in situ.” It means abnormal tumor cells are lining the lips and oral cavity of the
person, which may lead to oral cancer.
Stage-1:
Stage 1 is the early stage of oral cancer; the tumor cells grow to less than 2
centimeters in the oral cavity or lips. In this stage, the cancer cells have not reached
regional clusters.
27 Early Detection and Diagnosis of Oral Cancer Using 283
Stage-2:
Stage 2 describes cancer cells growing to more than 2 cm but not greater than 4
cm. In this stage, the cancer has not reached the regional clusters.
Stage-3:
Stage 3 describes cancer cells growing to more than 4 cm and reaching the regional
clusters in the neck.
Stage-4:
Stage 4 describes the most advanced stage of oral cancer. The tumor may be of any
size, but it is in a spreading stage. The spread proceeds as follows:
1. To nearby tissue such as the jaw and other parts of the oral cavity.
2. To one large regional cluster growing on one side of the lips or oral cavity, which
may also affect the other side.
3. To distant parts of the body beyond the mouth, such as the lungs and liver.
Stage 3 and Stage 4 cancers can recur even after treatment at an earlier stage.
Convolutional Neural Network
CNN is a deep learning algorithm that extracts features from an image. The algorithm
uses the image as input and extracts features according to user-given parameters. From
those parameters, it extracts the features from the image and passes the data to further
steps. The CNN core architecture is like the neural system of the human body. The CNN
system takes the image input in matrix form to process the feature extraction
techniques. In this way, the system can be trained to understand the progression of
the image better. In previous systems, authors tested the system with few samples, which
may lead to inaccuracies in the results; to overcome this issue, we use a CNN model to
utilize the resources effectively. It also helps automate the processing of the dataset
in the system.
27.2 Literature Survey
Wang et al. [1] proposed a portable oral cancer detection system. The authors use a
micro-electromechanical system (MEMS) micromirror with a field-of-view technique
to identify oral cancer from confocal images over a large field of view. They tested with
multiple clinical samples, including neoplastic lesion tissues. Pandey and Gupta [2]
proposed a stage determination model to detect oral cancer. They use a neuro-fuzzy
inference system to determine the stage of oral cancer by if-then rules. With this
approach, they reduce the tolerance level of the detection system. Swetha et al. [3]
proposed a cancer detection model using a neural network. The authors use a CNN
model to detect oral cancer. The created system automatically monitors the attributes
of temperature, saliva pH value, and CO2 for diagnosing oral cancer. Aier et al. [4]
proposed an oral cancer detection model to identify the mutation of the cancer
cells. The authors explore the ELF4 transcription factor to detect oral/mouth
cancer with high-throughput sequencing data. They focus on the quality of oral
cancer detection, and also on the potential targets of oral cancer for throughput
investigations.
Sami et al. [5] proposed a model to detect oral borderline malignancy. The
authors use an image processing methodology to develop a computer-aided model
to detect oral cancer. This model uses clinical data for processing and detecting
the oral cancer of humans. Rekha et al. [6] proposed a model to identify the blood
plasma of oral cancer patients. The authors use Raman spectroscopic characterization
for versatile profiling of the blood plasma. This method can achieve results that help
cure cancer patients, identifying oral cancer cell blood plasma with a high accuracy
of recovery rate. Amulya and Jayakumar [7] presented a study on melanoma skin
cancer detection techniques. The authors analyze the various parameters of skin
cancer to diagnose patients and detect the cancer cells in the early stage. Nezhadian
and Rashidi [8] proposed a model to detect melanoma skin cancer using
color and texture features. The authors develop an algorithm to detect skin cancer from
the color and texture of the cancer cells. This algorithm extracts features from the
dataset and classifies them with an SVM to improve the detection accuracy.
Udrea and Mitra [9] proposed a model to detect skin cancer using clinical
images. The authors propose an adversarial neural network algorithm to detect the
pigmented and non-pigmented skin cancers. This algorithm uses clinical images
to extract the cancer cell features and automatically detects skin cancer with an accuracy
of 92%. Caorsi and Lenzi [10] proposed removal techniques for breast
cancer cells. The authors use an ANN algorithm to train and test the dataset to detect
the cancer cells. This model also uses real cleaning techniques to detect the cancer
with improved accuracy. Jiang et al. [11] proposed an oral cancer detection model
to detect cancer cells from fluorescent images. The authors use fluorescent
images as input and process the data with an image fusion algorithm to detect the
cancer-susceptible portion of the oral cavity. This improves the detection
rate and efficiency of the system. Shu-Fan [12] proposed an oral cancer diagnosis
model to identify cancer tissue with an improved accuracy rate. The authors use
probe images in the system to reconstruct the structure of the tissue inside the oral
cavity for diagnosis of scanned precancer cells.
Shalu and Kamboj [13] proposed a model to detect skin cancer using a
color-based method. In this model, the authors implement an algorithm to extract the
color features from digital images. From that classification, the system uses
the MED-NODE dataset for testing. This model uses machine learning algorithms
to improve the detection accuracy up to 82.35% [14]. Arik et al. [15] proposed a
deep learning-based skin cancer detection model. The authors use a deep learning
algorithm to develop a model that detects skin cancer in the early stage of the cancer
cells from clinical images. Demir et al. [16] proposed a deep learning architecture to
detect skin cancer in the early stage. They use the ResNet-101 and Inception-v3 deep
learning architectures to detect skin cancer. This model recognizes cancer
cells from CT images with an accuracy of 87.42%.
Khokhar et al. [17] proposed a model to detect skin cancer using
millimeter waves. The authors develop a model to detect skin cancer by passing
electromagnetic waves over a small part of the skin to detect the cancer cells. This
model improves accuracy in early skin cancer detection. Sujatha et al. [14]
proposed a model to identify therapeutic inhibitors of oral cancer. The
authors describe the inhibitor treatment technique for oral cancer and propose a
computational approach to validate the in vivo and in vitro studies to prove the efficiency
of the system and protect humans from oral cancer. Rajaguru and
Kumar Prabhakar [18] proposed a hybrid classification model for oral cancer. The
authors use a Bayesian LDA algorithm in combination with an artificial bee
colony optimization algorithm to classify the cancer image sets from the input
system. The tested environment gave a classification accuracy of 83.13%.
Mansutti et al. [19] proposed a millimeter-wave near-field tool to detect skin
cancer. The authors develop a promising tool based on the substrate-integrated
waveguide (SIW) to overcome the fabrication issues of previous cancer detection
techniques [11]. This tool is placed near the skin and passes an electric wave to
detect skin cancer.
Bumrungkun et al. [20] proposed a model to detect skin cancer using an
SVM and the snake algorithm. The authors analyze the various parameters of skin
cancer; features are extracted from the image, and the image is processed by image
segmentation to detect skin cancer. These segmentation parameters are taken as
input to the SVM to find the accuracy of the cancer detection parameters [4]. Zhao et al.
[21] proposed a model to detect skin cancer using Raman spectroscopy.
The authors use the Raman scattering effect and build its effects into the model to
detect skin cancer at a preliminary stage. This model takes the clinical results of the
skin cancer testing and analyzes the sensitivity of the skin to detect skin cancer.
Aydinalp et al. [22] proposed a model to detect skin cancer using
microwaves. The authors use rat skin tissue to test the model. This model collects
the dielectric properties of the skin tissue and passes the properties to a tool to detect
skin cancer [2]. Setiawan et al. [23] proposed a model to detect skin cancer in the
early stage. The authors use a CNN algorithm to process the image segmentation.
This model analyzes skin color changes in humans from the image. This model
detects only the early stages of skin cancer, to overcome medical complexity and
support early diagnosis to prevent skin cancer [1]. Hoshyar et al. [24]
presented a review of early skin cancer detection methods. The authors analyze the
various automatic skin cancer detection methods and provide an overview of those
methods. This review helps patients detect skin cancer in the early stages and get
diagnosed to reduce the affected ratio.
27.3 Methodology
The details of the proposed methodology are elaborated in this section. The input images, taken from migrants from the northern part of India, are preprocessed for quality enhancement. Feature extraction is then performed for classification. Figure 27.1 shows sample input images, and Fig. 27.2 demonstrates the proposed methodology with deep learning techniques.
286 S. T. Sucharitha et al.
Dataset and Preprocessing
The input images were collected from various dental clinics all over India. Of the 1600 images collected, 1224 were extracted from a histopathological repository. The repository contains two sets of images at two resolutions: the first set consists of more than 90 histopathological images and more than 400 OSCC images, while the second set consists of more than 200 images of normal epithelium and more than 450 OSCC images at good magnification. The images were collected from more than 500 patients with a high-quality HD camera under standard lighting conditions. Convolutional neural networks outperform state-of-the-art deep learning algorithms on certain specific problems. The layers extracted from the CNN consist of softmax, maxpooling, and fully connected layers; a CNN normally comprises these layers in a sequential stack, and an RNN is used to train the input model through transfer learning. The sample images after preprocessing are shown in Fig. 27.3, and the detection of edges in the sample input images is shown in Fig. 27.4.
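The edge-detection step illustrated in Fig. 27.4 can be sketched without any framework. The following is a minimal Sobel-operator example in plain Python on a tiny synthetic image; it is an illustrative stand-in, not the chapter's actual preprocessing code.

```python
# Minimal Sobel edge detector on a grayscale image given as a 2-D list.
# Illustrative sketch only; the chapter's real preprocessing pipeline
# is not specified at this level of detail.

def sobel_edges(img):
    """Return gradient magnitudes for the interior pixels of a 2-D grayscale image."""
    gx_k = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # horizontal-gradient kernel
    gy_k = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # vertical-gradient kernel
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = sum(gx_k[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(gy_k[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out

# A vertical step edge: left half dark (0), right half bright (255).
img = [[0, 0, 255, 255] for _ in range(4)]
edges = sobel_edges(img)
# The strongest responses sit on the dark/bright boundary columns.
print(edges[1][1], edges[1][2])
```

On the step edge above, both interior pixels adjacent to the boundary receive the same large gradient magnitude, while flat regions give zero response.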
Feature Selection
There are two types of feature selection methods: scalar and vector. The scalar method considers each feature individually and is simpler to apply than the vector method. The vector method, in contrast, selects features based on mutual subsets and the relationships between them, and is more complex than the scalar method. Here,
Fig. 27.1 Sample images
collection
27 Early Detection and Diagnosis of Oral Cancer Using 287
Fig. 27.2 Architecture of proposed model
Fig. 27.3 Preprocessing of input images
the different color intensities are considered as the features to select. In our model, the vector method is used to select the best features from the input images.
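The contrast between the two methods can be sketched on a toy example. The scoring functions below (`score_single`, `score_subset`) are illustrative assumptions, not the chapter's actual criteria; they show how features that look equally good in isolation (scalar view) can differ sharply when evaluated jointly as subsets (vector view).

```python
from itertools import combinations

def score_single(feature_values, labels):
    """Scalar view: score one feature by how often its sign agrees with the label."""
    return sum(1 for v, y in zip(feature_values, labels)
               if (v > 0) == (y == 1)) / len(labels)

def score_subset(features, subset, labels):
    """Vector view: score a subset by majority vote of its members (captures interaction)."""
    hits = 0
    for row, y in zip(zip(*[features[i] for i in subset]), labels):
        vote = sum(1 for v in row if v > 0) > len(row) / 2
        hits += int(vote == (y == 1))
    return hits / len(labels)

labels = [1, 1, 0, 0]
features = [
    [2, -3, -1, -2],  # feature 0: moderately informative alone
    [1, 1, -1, 1],    # feature 1: moderately informative alone
    [1, 1, 1, -1],    # feature 2: moderately informative alone
]

# Scalar method: rank features one by one (all tie here at 0.75).
scalar_rank = sorted(range(3), key=lambda i: -score_single(features[i], labels))

# Vector method: evaluate feature subsets jointly; features 1 and 2
# complement each other and classify the toy data perfectly.
best_subset = max((s for k in (1, 2, 3) for s in combinations(range(3), k)),
                  key=lambda s: score_subset(features, s, labels))
print(scalar_rank, best_subset)
```

The scalar scores cannot distinguish the three features, but the subset search finds that the pair (1, 2) is jointly perfect, which is exactly why the chapter prefers the vector method despite its cost.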
Fully Connected Layer
After feature extraction, the data are transferred to the final fully connected layer. The training of the neural parameters happens in this layer with the help of the pooling and softmax layers; finally, it produces the features to be trained for the prediction of cancerous regions.
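The classification head described above can be sketched as a dense layer followed by softmax. The weights below are illustrative placeholders, not trained values.

```python
import math

def dense(x, weights, bias):
    """Fully connected layer: y = W x + b."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def softmax(logits):
    """Numerically stable softmax turning logits into class probabilities."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

features = [0.5, -1.2, 2.0]                    # pooled features entering the head
weights = [[0.1, 0.4, 0.3], [0.2, -0.1, 0.5]]  # two output classes (normal / cancerous)
bias = [0.0, 0.1]
probs = softmax(dense(features, weights, bias))
print(probs)  # two probabilities summing to 1
```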
27.4 Result Analysis
The normal epithelium images used as input are shown in Fig. 27.5. These images are passed through the convolutional layers, whose outputs feed maxpooling, softmax, and dense layers, and the resulting data are given to a recurrent neural network (RNN) using transfer learning; the same process is repeated for the cancer-affected images. The network is thus trained to differentiate normal epithelial images from cancerous images, awaiting validation in the testing process. A sample cancer-affected histopathology image is shown in Fig. 27.6. The layer-by-layer convolution process is shown in Table 27.1: it starts with Conv_1, whose output shape carries 896 parameters, followed by Conv_2 with 9248 parameters, and continues through maxpooling to the dense layers with different parameter counts and output shapes (Fig. 27.7 and Table 27.2).
Fig. 27.4 Edge detection using input images
The accuracy of the classifiers is defined by the following formula:
Accuracy = Number of correctly classified samples / Total number of samples
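Applied to label lists, the formula reads as follows; the labels here are hypothetical (0 = normal epithelium, 1 = cancerous).

```python
def accuracy(y_true, y_pred):
    """Correctly classified samples divided by the total number of samples."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

# Hypothetical ground truth and predictions for ten test images.
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0, 1, 1]
print(accuracy(y_true, y_pred))
```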
The accuracy rate for the combined CNN + RNN offers an enhancement of 1.2% over plain CNN. The experimental results in Fig. 27.8 show that the FPR starts at a markedly worse value for both algorithms, whereas the ADR gains improvements ranging from 1 to 2.2%. Figure 27.9 shows that the proposed algorithm outperforms the current algorithms, as clearly depicted by the red line in the graph, which performs with a limitation score of 1.8 over a range from 4 to 2.
Fig. 27.5 Epithelium
normal images
Fig. 27.6 Cancerous affected images
27.5 Conclusion
More than 80,000 fresh oral cancer cases reported every year with a high mortality
rate owing to the delay in detection. Presently, the cancer identification relies majorly
on oral examinations using torchlight to detect early stage cancers of the oral cavity.
Mostly, oral cancer affects the people from the lower income category and people in
rural area, due to the consumption of tobacco (either smokeless or smoking), alcohol,
etc. Earlier detection of oral cancer offers the best chance for long-term survival and
Table 27.1 Layer conversions with parameters
Layer type | Output shape | Parameters
Conv_1 | (128, 128, 32) | 896
Conv_2 | (128, 128, 32) | 9,248
Maxpooling | (64, 64, 32) | Null
Dropout | (64, 64, 32) | Null
Conv_3 | (64, 64, 64) | 18,496
Conv_4 | (64, 64, 64) | 36,928
Maxpooling_2 | (32, 32, 64) | Null
Dropout2 | (32, 32, 64) | Null
Conv_5 | (32, 32, 64) | 36,928
Conv_6 | (32, 32, 64) | 36,928
Maxpooling3 | (16, 16, 64) | Null
Dropout3 | (16, 16, 64) | Null
Flatten1 | 16,384 | Null
Dense1 | 512 | 8,389,120
Dense2 | 7 | 3,591
Fig. 27.7 Representation of
results with confusion matrix
Table 27.2 Result obtained with fusion of CNN + RNN
Parameters | Precision | Recall | F1 score | Support
1 | 0.52 | 0.82 | 0.52 | 578
2 | 0.80 | 0.54 | 0.53 | 975
Micro-average | 0.62 | 0.47 | 0.60 | 1158
Macro-average | 0.67 | 0.53 | 0.60 | 1158
Weighted average | 0.58 | 0.61 | 0.60 | 1158
Fig. 27.8 RoC curve
achieved through obtained
result
Fig. 27.9 Comparison of
CNN and RNN
has the potential to improve treatment outcomes. The proposed method outperforms the current state-of-the-art techniques with a detection accuracy of 82%. A patient tracking system and a Nano-oral kit are to be developed and deployed in the near future to diagnose the early stages of cancer, giving support to patients in the low-income category.
References
1. Wang, Y., Raj, M., McGuff, H. S., Shen, T., & Zhang, X. (2011).Portable oral cancer detection
using miniature confocal imaging probe with large field of view. In 2011 16th International
Solid-State Sensors, Actuators and Microsystems Conference (pp. 1821–1824).https://doi.org/
10.1109/TRANSDUCERS.2011.5969765
2. Pandey & Gupta N. K. (2014). Stage determination of oral cancer using neurofuzzy inference
system. In 2014 IEEE Students’ Conference on Electrical, Electronics and Computer Science
(pp. 1–5). https://doi.org/10.1109/SCEECS.2014.6804517
3. Swetha, S., Kamali, P., Swathi, B., Vanithamani, R., & Karolinekersin, E. (2020). Oral disease
detection using neural network. In 2020 9th International Conference System Modeling and
Advancement in Research Trends (SMART) (pp. 339–342). https://doi.org/10.1109/SMART5
0582.2020.9337094
4. Aier, & Khan, S. M. (2018). Exploring the effect of wild type and mutant ELF4 transcriptional
factor on oral cancer using high-throughput sequencing data. In 2018 International Conference
on Bioinformatics and Systems Biology (BSB) (pp. 207–211). https://doi.org/10.1109/BSB.
2018.8770545
5. Sami, M. M., Saito, M., Kikuchi, H., & Saku, T. (2009). A computer-aided distinction of border-
line grades of oral cancer. In 2009 16th IEEE International Conference on Image Processing
(ICIP), pp. 4205–4208. https://doi.org/10.1109/ICIP.2009.5413534
6. Rekha, P., et al. (2013). Raman spectroscopic characterization of blood plasma of oral cancer.
In 2013 IEEE 4th International Conference on Photonics (ICP), pp. 135–137. https://doi.org/
10.1109/ICP.2013.6687092
7. Amulya, P. M., & Jayakumar, T. V. (2017). A study on melanoma skin cancer detection
techniques. In 2017 International Conference on Intelligent Sustainable Systems (ICISS)
(pp. 764–766). https://doi.org/10.1109/ISS1.2017.8389278
8. Nezhadian, F. K., & Rashidi, S. (2017). Breast cancer detection without removal pectoral
muscle by extraction turn counts feature. In 2017 Artificial Intelligence and Signal Processing
Conference (AISP) (pp. 6–10). https://doi.org/10.1109/AISP.2017.8324112
9. Udrea, A., & Mitra, G. D. (2017). Generative adversarial neural networks for pigmented and
non-pigmented skin lesions detection in clinical images. In 2017 21st International Conference
on Control Systems and Computer Science (CSCS) (pp. 364–368). https://doi.org/10.1109/
CSCS.2017.56
10. Caorsi, S., & Lenzi, C. (2015). Skin removal techniques for breast cancer radar detection based
on artificial neural networks. In 2015 IEEE 15th Mediterranean Microwave Symposium (MMS)
(pp. 1–4). https://doi.org/10.1109/MMS.2015.7375418
11. Jiang, C. F., Wang, C. Y., & Chiang, C. P. (2004). Oral cancer detection in fluorescent image
by color image fusion. In The 26th Annual International Conference of the IEEE Engineering
in Medicine and Biology Society (pp. 1260–1262). https://doi.org/10.1109/IEMBS.2004.140
3399
12. Chen, S. -F., Lu, C. -W., Tsai, M. -T., Wang, Y. -M., Yang, C. C., & Chiang, C. -P. (2005). Oral
cancer diagnosis with optical coherence tomography. In 2005 IEEE Engineering in Medicine
and Biology 27th Annual Conference (pp. 7227–7229). https://doi.org/10.1109/IEMBS.2005.
1616178
13. Shalu, & Kamboj, A. (2018). A color-based approach for melanoma skin cancer detection.
In 2018 First International Conference on Secure Cyber Computing and Communication
(ICSCCC) (pp. 508–513). https://doi.org/10.1109/ICSCCC.2018.8703309
14. Sujatha, P. L. et al. (2016). Identification of therapeutic inhibitors for the treatment of oral cancer
using herbal bio-active principles by means of computational approach. In: 2016 2nd Inter-
national Conference on Advances in Electrical, Electronics, Information, Communication and
Bio-Informatics (AEEICB) (pp. 549–553). https://doi.org/10.1109/AEEICB.2016.7538351
15. Arik, A., Golcuk, M., & Karslıgil, E. M. (2017). Deep learning based skin cancer diagnosis. In
2017 25th Signal Processing and Communications Applications Conference (SIU) (pp. 1–4).
https://doi.org/10.1109/SIU.2017.7960452
16. Demir, İ., et al. (2019). SkelNetOn 2019: dataset and challenge on deep learning for
geometric shape understanding. In 2019 IEEE/CVF Conference on Computer Vision and
Pattern Recognition Workshops (CVPRW) (pp. 1143–1151). https://doi.org/10.1109/CVPRW.
2019.00149
17. Khokhar, U., Naqvi, S. A. R., Al-Badri, N., Bialkowski, K., & Abbosh, A. (2017). Near-field
tapered waveguide probe operating at millimeter waves for skin cancer detection. In 2017
IEEE International Symposium on Antennas and Propagation & USNC/URSI National Radio
Science Meeting (pp. 795–796). https://doi.org/10.1109/APUSNCURSINRSM.2017.8072440
18. Rajaguru, H., & Kumar Prabhakar, S. (2017). Oral cancer classification from hybrid ABC-PSO
and Bayesian LDA. In 2017 2nd International Conference on Communication and Electronics
Systems (ICCES) (pp. 230–233). https://doi.org/10.1109/CESYS.2017.8321271
19. Mansutti, G., Mobashsher, A. T., & Abbosh, A. M. (2018). Millimeter-wave substrate integrated
waveguideprobe for near-field skin cancer detection. In 2018 Australian Microwave Symposium
(AMS) (pp. 81–82). https://doi.org/10.1109/AUSMS.2018.8346992
20. Bumrungkun, P., Chamnongthai, K., & Patchoo, W. (2018). Detection skin cancer using SVM
and snake model. In 2018 International Workshop on Advanced Image Technology (IWAIT)
(pp. 1–4).https://doi.org/10.1109/IWAIT.2018.8369708
21. Zhao, J., Lui, H., McLean, D. I., & Zeng, H. (2008). Real-time raman spectroscopy for non-
invasive skin cancer detection - preliminary results. In 2008 30th Annual International Confer-
ence of the IEEE Engineering in Medicine and Biology Society (pp. 3107–3109). https://doi.
org/10.1109/IEMBS.2008.4649861
22. Aydinalp, C., Joof, S., Yilmaz, T., Özsobaci, N. P., Alkan, F. A., & Akduman, I. (2019). In vitro
dielectric properties of rat skin tissue for microwaveskin cancer detection. In 2019 International
Applied Computational Electromagnetics Society Symposium (ACES) (pp. 1–2)
23. Setiawan, A. W. (2020). Effect of color enhancement on early detection of skin cancer using
convolutional neural network. In 2020 IEEE International Conference on Informatics, IoT, and
Enabling Technologies (ICIoT) (pp. 100–103). https://doi.org/10.1109/ICIoT48696.2020.908
9631
24. Hoshyar, A.N., Al-Jumaily, A., & Sulaiman, R. (2011). Review on automatic early skin cancer
detection. In 2011 International Conference on Computer Science and Service System (CSSS)
(pp. 4036–4039). https://doi.org/10.1109/CSSS.2011.5974581
Chapter 28
Fine-tuning for Transfer Learning
of ResNet152 for Disease Identification
in Tomato Leaves
Lakshmi Ramani Burra, Janakiramaiah Bonam, Praveen Tumuluru,
and B Narendra Kumar Rao
Abstract Plants provide a significant portion of the world’s food supply. Tomato
is the most popular plant which is cultivated worldwide. The tomato leaf disease is
the primary factor in productivity loss but can be avoided by monitoring regularly.
Detection of tomato leaf diseases using pre-trained deep learning models can help
to reduce the severity of the disease identification. However, instead of using a pre-
trained model directly, there is an optional step to fine-tune the model in transfer
learning, which improves the model performance. The examination of fine-tuning
the model with four various scenarios of transfer learning and the art of employing
pre-trained models were suggested in this work. Experiments were done using the
pre-trained model ResNet152 on tomato leaf disease identification.
28.1 Introduction
Agriculture is a significant measure of economic growth. Tomato is a global commer-
cial crop and the only vegetable consumed as a fruit, with a nutritional value
exceeding that of fruit. The production of tomato crops has made India the world’s
third-largest producer. Currently, disasters caused by tomato leaf diseases and plant
growth disorders are limiting agricultural output [1]. This condition significantly
impacts tomato quality and productivity and results in large economic losses. Pests
L. R. Burra (B)·J. Bonam
Prasad V. Potluri Siddhartha Institute of Technology, Vijayawada, Andhra Pradesh, India
e-mail: blramani@pvpsiddhartha.ac.in
J. Bonam
e-mail: janakiramaiah@pvpsiddhartha.ac.in
P. Tumuluru
Koneru Lakshmaiah Education Foundation, Vaddeswaram, Andhra Pradesh, India
e-mail: praveenluru@gmail.com
B. Narendra Kumar Rao
Sree Vidyanikethan Engineering College, Tirupati, India
e-mail: narendrakumarraob@gmail.com
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
B. N. K. Rao et al. (eds.), Intelligent Computing and Applications, Smart Innovation,
Systems and Technologies 315, https://doi.org/10.1007/978-981-19-4162-7_28
and diseases that affect tomato crops can be identified in advance, and proper culti-
vation and pest management can be supplied to satisfy crop growth needs, reducing
economic losses.
Traditional methods for identifying tomato leaf diseases rely solely on the culti-
vator’s skill or expert assistance. This technique is not only slow, but it is also inef-
fective, inaccurate, costly, and time-consuming. A team of researchers has proposed
a methodology that accurately identifies the tomato crop’s diseases. Deep learning
models [2] have successfully provided new methods and ideas for identifying plant
diseases and insect pests. This is the underlying motivation for the identification of
leaf disease. This work suggested the analysis of fine-tuning the model with four
different scenarios of transfer learning.
The work is organized as follows: Sect. 28.2 “Literature Survey” surveys interre-
lated works and techniques for tomato leaf disease identification; Sect. 28.3 “Mate-
rials and Methods” discusses the fine-tune models; Sect. 28.4 “Experimental Results”
depicts the experiments and results; and finally, Sect. 28.5 “Conclusion” concludes
the work.
28.2 Literature Survey
Deep learning and image processing techniques have been widely used for the detec-
tion of plant leaf diseases. Several works related to plant leaf diseases are observed
in literature and introduced some baseline approaches on deep learning for object
detection. In this section, we describe a recent survey related to this field.
Liu and Wang [3] suggested a recognition method with low memory consumption, high recognition accuracy, and speed, offering a new solution for predicting tomato leaf spot early and a new concept for diagnosing it intelligently.
A deep convolutional neural network with an attention mechanism, which adapts to the detection of a number of tomato leaf diseases, was proposed by Zhao et al. [4]. Residual blocks and attention extraction modules make up the majority of the network structure. The model can accurately extract complex aspects of numerous diseases.
To characterize mango leaves affected by anthracnose and powdery mildew infections, Janakiramaiah et al. [5] presented a variation of CapsNet called Multilevel CapsNet. Infections in mango trees are caused by a variety of climatic and fungal factors, which has resulted in a decrease in mango quality and quantity.
Picon et al. [6] employed a deep residual neural network-based technique to
diagnose a variety of plant disorders under real-world acquisition situations, with a
few adaptations to detect disease early.
Using densely connected convolutional neural networks (CNNs), Subramanian et al. [7] used corn leaf images, obtained from Web sites, with three classes of diseases and one healthy class. To fine-tune and minimize the training period of the suggested models, Bayesian optimization is utilized to determine optimal hyperparameter values, and transfer learning is investigated.
Table 28.1 Dataset classes
Classes of tomato leaf | Train dataset | Test dataset
Healthy | 1000 | 100
Leaf mold | 1000 | 100
Bacterial spot | 1000 | 100
Mosaic virus | 1000 | 100
Early blight | 1000 | 100
Late blight | 1000 | 100
Spider mites two-spotted spider mite | 1000 | 100
Septoria leaf spot | 1000 | 100
Target spot | 1000 | 100
YL_Curl virus | 1000 | 100
Total | 10,000 | 1000
Fuentes et al. [8] suggested a system for detecting and locating plant anomalies that produces diagnostic results, displays anomalous areas, and describes symptoms in sentences as output. On the newly produced tomato plant anomaly dataset, it achieves an average accuracy of 92.5%.
28.3 Materials and Methods
28.3.1 Dataset
The dataset contains 10,000 images belonging to 10 classes of training data and
1000 images belonging to 10 classes of test data, as shown in Table 28.1. The dataset
classes are Healthy, Bacterial spot, Early blight, Late blight, Leaf Mold, Mosaic virus,
Spider mites Two-spotted spider mite, Septoria leaf spot, Target Spot, YL_Curl Virus
and are shown in Fig. 28.1.
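The class-balanced split in Table 28.1 can be summarized programmatically; a small sketch:

```python
# Sketch of the class-balanced split in Table 28.1: 10 classes,
# 1000 training and 100 test images per class.
classes = [
    "Healthy", "Leaf mold", "Bacterial spot", "Mosaic virus", "Early blight",
    "Late blight", "Spider mites two-spotted spider mite", "Septoria leaf spot",
    "Target spot", "YL_Curl virus",
]
split = {c: {"train": 1000, "test": 100} for c in classes}
total_train = sum(v["train"] for v in split.values())
total_test = sum(v["test"] for v in split.values())
print(total_train, total_test)  # 10000 and 1000, matching the table totals
```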
28.3.2 Transfer Learning-Based CNN Model
Transfer learning is commonly applied in computer vision through pre-trained models and deep convolutional neural networks (CNNs). CNNs' excellent performance and ease of training are the two key aspects that have driven their popularity over the years. The three significant ways to transform a pre-trained model in the transfer learning process are prediction, feature extraction, and fine-tuning.
Fig. 28.1 Dataset classes
1. Prediction: If the problem is within the scope of a pre-trained model, a common approach is to use the model directly to predict the labels for the images. This delivers accurate results only for data similar to that used to train the model.
2. Feature extraction: A pre-trained model can be employed as a feature extractor, and the complete model does not need to be (re)trained. By dropping the output layer (the one that gives the probabilities for being in each of the 1000 classes), we may use the entire network as a fixed feature extractor for the new dataset.
3. Fine-tune model: Unfreeze a few of the top layers of a frozen model base and train both the new classifier layers and the base model's final layers simultaneously. This allows "fine-tuning" the higher-order feature representations of the underlying model to make them more relevant for the task at hand, retraining on the new data with a very low learning rate.
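The freeze-then-unfreeze logic of fine-tuning can be illustrated framework-free. The layer names below mimic ResNet-style blocks and are illustrative, not the actual ResNet152 graph.

```python
# Framework-free illustration of fine-tuning: freeze the pre-trained base,
# attach a new classifier head, then unfreeze only the last few base layers.
# Layer names are illustrative stand-ins, not the real ResNet152 layers.

class Layer:
    def __init__(self, name):
        self.name = name
        self.trainable = True

base = [Layer(f"block{i}") for i in range(1, 6)]   # pre-trained backbone
head = [Layer("dense_new"), Layer("softmax_new")]  # new classifier layers

# Step 1: freeze the whole backbone and train only the new head.
for layer in base:
    layer.trainable = False

# Step 2 (fine-tuning): unfreeze the top-most backbone layers and retrain
# everything that is trainable with a very low learning rate.
for layer in base[-2:]:
    layer.trainable = True

trainable_names = [l.name for l in base + head if l.trainable]
print(trainable_names)
```

In a real framework, the same idea is expressed by toggling each layer's `trainable` flag before recompiling the model; the early layers keep their generic pre-trained filters while the top layers adapt to the new task.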
In general, for fine-tuning the model, transfer learning in CNNs can be applied in four scenarios.
Scenario 1—New dataset is small and similar to the original dataset—As the dataset is small, fine-tuning may lead to over-fitting the model. So, use the ConvNet as a fixed feature detector, add a linear classifier (the dense layers) on top, and train that classifier only.
Scenario 2—New dataset is large and similar to the original dataset—As the dataset is large, use the fine-tuning approach to save the time of training the network from scratch. When fine-tuning over the whole network, we can be more confident that the model will not over-fit.
Scenario 3—New dataset is small but very different from the original dataset—If the dataset is significantly different from the original, fine-tuning the ConvNet is recommended, but make sure not to go too deep into the network; only adjust the weights of a few layers.
Scenario 4—New dataset is large and very different from the original dataset—Because the dataset is large and different from the ImageNet data, one possibility is to train the ConvNet from scratch, but this takes huge training time. In practice, it is useful to begin with pre-trained model weights.
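The four scenarios can be encoded as a small decision helper; this is a sketch of the rules of thumb above, not a library API.

```python
# The four transfer-learning scenarios, encoded as a decision helper.

def transfer_strategy(dataset_is_large, similar_to_original):
    if not dataset_is_large and similar_to_original:        # Scenario 1
        return "freeze ConvNet, train a new linear classifier only"
    if dataset_is_large and similar_to_original:            # Scenario 2
        return "fine-tune the full network"
    if not dataset_is_large and not similar_to_original:    # Scenario 3
        return "fine-tune only a few top layers"
    return "train from scratch, initialized with pre-trained weights"  # Scenario 4

# The tomato leaf dataset is large and differs from ImageNet (Scenario 4).
print(transfer_strategy(dataset_is_large=True, similar_to_original=False))
```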
28.3.3 Performance Evaluation
The performance indicators recall, precision, accuracy, and F1-score were used to evaluate the training and testing datasets. The indicators are as follows:
Accuracy: the ratio of correctly classified images to the total number of images.
Accuracy = (TP + TN) / (TP + TN + FP + FN)
Precision: the proportion of correctly predicted positive observations to the total number of positive predictions.
Precision = TP / (TP + FP)
Recall: the proportion of correctly predicted observations to all observations in that class.
Recall = TP / (TP + FN)
F1-Score: the harmonic mean of precision and recall.
F1 = 2 × (Precision × Recall) / (Precision + Recall)
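The four indicators can be computed directly from confusion-matrix counts; the counts below are hypothetical, chosen only to exercise the formulas.

```python
# The four indicators above, computed from confusion-matrix counts.

def metrics(tp, tn, fp, fn):
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

# Hypothetical counts for one disease class.
acc, prec, rec, f1 = metrics(tp=80, tn=90, fp=20, fn=10)
print(acc, prec, rec, f1)
```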
28.4 Experimentation
In this work, the pre-trained ResNet152 model is experimented with on a tomato leaf disease identification dataset for analysis in different stages. The parameters used for this experimentation are an image size of 224 × 224, SGD as the optimizer, a learning rate of 0.0001, a batch size of 10, cross-entropy loss, 50 epochs, and 58.14M trainable parameters, as shown in Table 28.2.
Table 28.2 Summary of pre-trained ResNet152
Parameters | ResNet152 architecture
Image size | 224 × 224
Layers | 152
Learning rate | 1e-3
Batch size | 32
Optimizer | SGD
Loss | Cross-entropy
Epochs | 50
Trainable parameters | 58.14M
1. Prediction: The ResNet152 model is used for prediction on the tomato leaf disease identification test dataset. This results in wrong predictions because the test dataset images are different from the original ImageNet dataset. As a result, it is unsuitable for prediction.
2. Feature Extraction: Using the ResNet152 model, experimentation with the tomato leaf disease identification test dataset provides fixed features. It is suitable only for extracting the features.
3. Fine-tune Model: In this approach, fine-tuning is applied to a deep learning model that has been trained beforehand on a different dataset. The experimentation is done with four different paradigms using the ResNet152 model.
a. New dataset is small and similar to the original dataset: Fine-tune the ResNet152 model, pre-trained on the ImageNet dataset, on a small dataset; any linear model can then train the classifier. But here, the tomato leaf disease detection dataset is large, so this paradigm is not suitable.
b. New dataset is large and similar to the original dataset: Since the tomato leaf disease identification dataset has more data and no over-fitting occurs, the entire network would have to be fine-tuned. So, this approach is not preferable for this application.
c. New dataset is small but different from the original dataset: To fine-tune the ResNet152 model in this approach, only the weights of a few early layers would be adjusted; the tomato leaf disease detection dataset has more data, so this approach is not suitable for this application.
d. New dataset is large and different from the original dataset: Since the tomato leaf disease identification dataset is large and different from the original ImageNet dataset, it is suitable to fine-tune the ResNet152 model from scratch, and it is beneficial to initialize training with weights from a pre-trained model.
28.5 Results and Discussion
This work employs the ResNet152 pre-trained model on tomato leaf disease identification for analysis of transfer learning and the use of pre-trained models in deep learning. The classification report of the model on the test dataset is shown in Table 28.3. The average precision over the 10 classes of tomato leaf disease identification is 94%, recall is 94.2%, F1-score is 94%, and the accuracy is 94.2%.
Similarly, the classification report on the train dataset for the ResNet152 model is shown in Table 28.4. The average precision is 96%, recall is 96.75%, F1-score is 96%, and the achieved accuracy is 96.75%.
Table 28.3 ResNet152 classification report on test dataset
Classes Precision (%) Recall (%) F1-score (%) Support
Healthy 93 93 93 100
Leaf mold 95 97.93 96.44 100
Bacterial spot 95 95.96 95.47 100
Mosaic virus 94 91.26 92.6 100
Early blight 96 93.2 94.57 100
Late blight 95 95 95 100
Spider mites two-spotted spider mite 93 94.89 93.93 100
Septoria leaf spot 94 91.26 92.6 100
Target spot 93 97.89 95.38 100
YL_Curl virus 94 92.15 93.06 100
Accuracy 94.2 1000
Table 28.4 ResNet152 classification report on train dataset
Classes Precision (%) Recall (%) F1-score (%) Support
Healthy 95.3 96.94 96.11 1000
Leaf mold 97.5 96.91 97.2 1000
Bacterial spot 97.2 96.81 97 1000
Mosaic virus 96.1 96.58 96.33 1000
Early blight 97.7 96.54 97.11 1000
Late blight 97.2 96.90 97.04 1000
Spider mites two-spotted spider mite 96.6 96.40 96.49 1000
Septoria leaf spot 97.8 96.64 97.21 1000
Target spot 95.9 97.16 96.52 1000
YL_Curl virus 96.2 96.58 96.38 1000
Accuracy 96.75 10,000
28.6 Conclusion
In the realm of image recognition, deep learning architectures have gained a lot of popularity. To identify diseased plant leaves, various pre-trained models effectively analyze large collections of image data and identify them with minimal error. Fine-tuning allows the capabilities of state-of-the-art models to be extended to other domains. This work performs fine-tuning of the model and analyzes it under the four paradigms of transfer learning on tomato leaf disease identification. The ResNet152 architecture is employed, and training the entire network is suggested for this target dataset.
References
1. Wang, X., Liu, J., & Zhu, X. (2021). Early real-time detection algorithm of tomato diseases and
pests in the natural environment. Plant Methods, 17(1), 1–17.
2. Ramani, B. L., Poosapati, P., Tumuluru, P., Saibaba, C. H. M. H., Radha, M., & Prasuna,
K. (2019). Deep learning and fuzzy rule-based hybrid fusion model for data classification.
International Journal of Recent Technology and Engineering, 8(2), 3205–3213.
3. Liu, J., & Wang, X. (2020). Early recognition of tomato gray leaf spot disease based on
MobileNetv2-YOLOv3 model. Plant Methods, 16(1), 1–16. Barbedo
4. Zhao, S., Peng, Y., Liu, J., & Wu, S. (2021). Tomato leaf disease diagnosis based on improved
convolution neural network by attention module. Agriculture, 11(7), 651.
5. Janakiramaiah, B., Kalyani, G., Prasad, L. V., Karuna, A., & Krishna, M. (2021). Intelligent
system for leaf disease detection using capsule networks for horticulture. Journal of Intelligent &
Fuzzy Systems, (Preprint), 1–17.
6. Picon, A., Alvarez-Gila, A., Seitz, M., Ortiz-Barredo, A., Echazarra, J., & Johannes, A. (2019).
Deep convolutional neural networks for mobile capture device-based crop disease classification
in the wild. Computers and Electronics in Agriculture, 1(161), 280–290.
7. Subramanian, M., Lv, N. P., & VE, S. (2021). Hyperparameter optimization for transfer learning
of VGG16 for disease identification in corn leaves using Bayesian optimization. Big Data.
8. Fuentes, A. F., Yoon, S., & Park, D. S. (2019). Deep learning-based phenotyping system with
glocal description of plant anomalies and symptoms. Frontiers in Plant Science, 10, 1321.
Chapter 29
AI-Based Mental Fatigue Recognition
and Responsive Recommendation System
Korupalli V. Rajesh Kumar, B. Rupa Devi, M. Sudhakara,
Gabbireddy Keerthi, and K. Reddy Madhavi
Abstract Overall comfort is considered one of the foremost concerns in the present scenario. Organizations are giving more consideration to upgrading workplace comfort, and workers likewise pursue many approaches to enhance their ongoing comfort status. However, comfort is generally related to everyday activity and conduct, mainly at worksites, where it influences both stress levels and mood, resulting in mental fatigue. Specifically, a person's comfort is affected by their conduct at the worksite. We propose a mental fatigue identification system that embraces deep learning techniques to supply non-intrusive monitoring. The comfort level is classified based on two surveyed factors: pressure and mood. This preparatory study trains both generalized and personalized classification models. The personalized model is a step toward a personalized health-resolution support system that elevates users' awareness and motivates them to upgrade their conduct, eventually contributing to the best possible comfort. We attained an accuracy of 87% with the generic model and 94% with the personalized model.
K. V. R. Kumar (B)
School of Business, (AI & ML), Woxsen University, Hyderabad, Telangana, India
e-mail: rajesh.kumar@woxsen.edu.in
B. R. Devi
Annamacharya Institute of Technology and Sciences, Tirupati, AP, India
e-mail: rupadevi.aitt@annamacharyagroup.org
M. Sudhakara (B)
School of Computing and Information Technology, Reva University, Bangalore, India
e-mail: malla.sudhakara@reva.edu.in
G. Keerthi
CSE, Vignan’s Foundation for Science, Technology and Research, Guntur, A.P, India
K. R. Madhavi
Sree Vidyanikethan Engineering College, Tirupati, AP, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
B. N. K. Rao et al. (eds.), Intelligent Computing and Applications, Smart Innovation,
Systems and Technologies 315, https://doi.org/10.1007/978-981-19-4162-7_29
29.1 Introduction
Overall comfort is considered one of the foremost concerns. Research has shown that work-related comfort can affect basic comfort in the long term, including depressive symptoms. Other research proposed that low levels of health and comfort lead to major consequences for the worker and the organization; the outcomes can incorporate lawsuits against the worksite related to disease, the cost of lost efficiency, and worksite harassment. Research proposed that lessening stress upon the worker is one of the effective ways to upgrade the worker's comfort. Conversely, appreciation at work also has a powerful effect on work performance. One study pointed out that a positive mood dynamically promotes pro-social conduct [1]; this calls for a friendlier worksite to attain elevated work performance.
Regarding health, research has investigated the consequences of stress and mood, which can cause other symptoms and, indirectly, actions driven by extreme stress. One study found that sedentary working behavior leads to reduced productivity and a higher likelihood of anxiety and additional physical ailments [2]. The same research recommended movement during working hours, alternating between standing, sitting, and walking, to reduce the risk of associated ailments. Overall, mood and stress play a vital role in every individual's everyday comfort. This study presents preliminary research on an everyday comfort identification system based on mood status, using deep learning techniques in an office environment. We use a combination of behavioral data, namely hand and body movements, together with climate conditions as inputs to the system [3, 4]. The approach was designed to be as unobtrusive to the worker as possible while retaining the ability to capture the important details about the worker. Hence, we used a webcam to observe the subjects in this investigation. With this setup, the framework does not require workers to wear additional devices and remains non-intrusive to workers' daily routines. Because comfort involves numerous concepts, we applied fuzzy clustering to group the stress and mood attributes into comfort levels. We conducted a preliminary investigation to develop the comfort detection design and demonstrate the framework. The scheme also accounts for the fact that every individual behaves differently at the worksite; consequently, we developed a customized classification model. We also developed a general classification model based on the data from all available subjects. The main objective is to give real-time feedback to the user. In this preliminary research, we developed the classification model on a daily basis from the gathered information. The suggested system is planned as a decision-support system for raising awareness of a person's behavior and comfort. Figure 29.1 shows daily life scenarios: stress and mood conditions affecting energy levels and resulting in mental fatigue.
29 AI-Based Mental Fatigue Recognition
Fig. 29.1 Human life—daily routine—stress-related aspects
29.2 Related Literature on Responsive System
In this section, we discuss how technologies have contributed to resolving comfort problems. The section is divided into two parts: stress-based mental fatigue and personal mood-based mental fatigue. Physical fatigue is also one cause of mental fatigue.
29.2.1 Stress-Based Mental Fatigue—Responsive System
One study discussed a way to recognize stress for a sustainable life. It used smartphone sensors to extract phone-usage data, including the call log, Bluetooth interactions, and SMS log. Rather than using phone-usage data, other research suggested a less invasive solution: a smartphone was used to observe physical activity instead of mobile data, and the user's stress level was determined through a self-assessment process. Reviews of this work have examined stress detection systems in depth. Stress detection frameworks may use several kinds of data: behavioral responses, physiological signals, contextual events, and multimodal methods that combine several data types. These analyses point out key requirements for a stress detection framework: ubiquity, unobtrusiveness, and non-invasiveness. They propose that these factors play a vital role in users' satisfaction and acceptance [5, 6].
29.2.2 Personal Mood-Based Mental Fatigue
One application of mood detection targeted bipolar disorder, covering normal-state mood detection, movement detection, and upper-body posture. Another study estimated the effect of music on subjects' moods, with the moods assessed through an examination; the posture of every subject was analyzed, and the study attained 81.30% precision in recognizing happy and sad moods. Facial recognition is another way to recognize mood: one facial-detection framework determined mood with up to 90.43% accuracy, applying deep learning techniques to obtain a highly precise detection model. Surveys of mood recognition methods cover audio, physiological, and video attributes. These works propose numerous approaches for recognizing personalized mood and stress levels, and studies of both mood and stress relate them to the overall comfort of an individual. Generally, however, each study considers only one of the two: mood or stress. In this research, we suggest a way to consider both important attributes, applying fuzzy clustering to decide the person's comfort level [7–9].
29.3 Methods
29.3.1 Technological Approach—IoT Devices to Acquire Data
The crucial elements of the suggested approach are based on an M2M and IoT architecture. Figure 29.2 outlines the system and its components. In this research, we used web cameras as the sensory system, with one device for every server area attached to a Raspberry Pi board. The Raspberry Pi acts as a gateway to the system; Fig. 29.2 shows the IoT device, a Raspberry Pi unit integrated with a camera, monitoring the condition. The gateway is a crucial component for reducing the workload on the server, as it executes lightweight data-preprocessing tasks. The server holds the cloud-service components, which comprise data storage, everyday behavior extraction by deep learning techniques, and classification using the trained models. The classification outcomes are dispatched to user terminals. This feedback acts as part of the decision-support framework for the user, as it helps raise awareness of their working behavior and comfort. The ultimate aim of the concept is to issue real-time feedback to the user about their behavior. The system could issue feedback based on a summary of the tracking information; however, in this research we conducted a preliminary experiment using a daily summary of the information [10].
Fig. 29.2 IoT device integrated with camera to monitor condition—on worksite—proposed model
29.3.2 Research Procedure
Figure 29.3 shows the whole experimental procedure of this research. For data collection, we used a web camera with a 120° viewing angle at HD resolution. Before the experiment, the subjects were briefed on its purpose, and consent forms were provided and signed by all subjects. The surveys were then discussed, and the subjects were instructed on how to complete them. Monitoring ran for 25 days; we used the data only on days when the subjects were present and completed the survey at the workstation. Afterwards, we carried out behavior extraction using hand and body detectors trained with a deep learning algorithm. The procedure for training the image detectors is described below. Finally, we trained the classification models on the extracted behavior data, using the algorithms also described below.
Fig. 29.3 Survey modeling—data analysis using AI learning framework
Surveys
This research used data from surveys conducted in two forms to classify the everyday data into different comfort levels. The subjects' raw scores were not used directly for classification; instead, we applied fuzzy clustering to the survey scores to split them into two groups, each group representing either a lower or a higher level of comfort. The techniques are explained in detail below, along with the surveys used in this research.
29.3.3 Mood
The PNC (positive and negative conditions) is a two-dimensional mood survey. We use the popular PNC tool to assess mood, as it has proven reliable, especially for short-term assessment. The survey efficiently assesses the mood for "today" or "now"; the directions must state clearly that "now" refers to the feelings of the current moment and "today" to the feelings of the entire day. This research used the daily ("today") form of the survey. Each survey comprises 15 words, each tied to either a positive or a negative affect. A higher score on the positive items indicates positive affect (PA), and a higher score on the negative items indicates negative affect (NA) in the subject's mood.
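The PA/NA scoring described above can be sketched as follows. The item words below are illustrative placeholders, not the actual PNC instrument, and the 1–5 rating scale is an assumption; only the split of 15 items into positive and negative lists follows the text.

```python
# Hypothetical item lists: 8 positive-affect and 7 negative-affect words,
# 15 in total as in Sect. 29.3.3. These words are illustrative only.
PA_ITEMS = {"interested", "excited", "enthusiastic", "proud", "inspired",
            "determined", "attentive", "active"}
NA_ITEMS = {"distressed", "upset", "irritable", "nervous", "jittery",
            "afraid", "ashamed"}

def score_pnc(ratings):
    """ratings: dict mapping item word -> rating (assumed 1-5).
    Returns (PA, NA) sums: a higher PA sum indicates positive affect,
    a higher NA sum indicates negative affect."""
    pa = sum(v for k, v in ratings.items() if k in PA_ITEMS)
    na = sum(v for k, v in ratings.items() if k in NA_ITEMS)
    return pa, na

# A day where every positive word is rated 4 and every negative word 2.
ratings = {w: 4 for w in PA_ITEMS} | {w: 2 for w in NA_ITEMS}
pa, na = score_pnc(ratings)
```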
29.3.4 Stress
This research documents perceived stress levels on a 100 mm visual analog scale (VAS), where "0" represents "stress free" and "100" represents "fully stressed". This type of survey is used to create a daily record of mood and stress; a list of everyday stress assessments is presented to the subjects as instructions for documenting their stress level.
29.3.5 Methods—Machine Learning Models
In this research, we used supervised learning to build the image recognition models with a convolutional neural network (CNN), training the CNN to detect hands and bodies and thereby avoiding a potential processing bottleneck. We recorded 25 days of video, about 1 h per day outside the experiment duration, and used it as the dataset for training the model. We extracted 800 frames of random content from the video and labeled the region boundaries for the hands and body. The data was then split into train and test sets in a 75:25 proportion. The CNN model attained an accuracy of 0.92 for body detection and 0.76 for hand detection. Figure 29.5 shows a sample image from the hand and body detectors.
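The 75:25 split of the 800 labeled frames can be sketched as follows; the frame identifiers and seed are placeholders, and the chapter does not say how the split was randomized.

```python
import random

def train_test_split(items, test_fraction=0.25, seed=0):
    """Shuffle labeled items and split them into train and test sets,
    75:25 by default, as done for the 800 annotated frames."""
    rng = random.Random(seed)           # fixed seed for reproducibility
    shuffled = items[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

frames = list(range(800))               # 800 labeled frame IDs (placeholders)
train, test = train_test_split(frames)  # 600 training, 200 test frames
```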
Classification Algorithms
The following algorithms were used to train the comfort classification model:
Support vector machines (SVM)
Decision trees (DT)
K-nearest neighbors (KNN)
KNN is a widely used classification algorithm; in this research, KNN was run with K = 5. SVM is widely used for binary classification problems. Since the clustered survey outcomes form two classes for comfort classification, representing high and low comfort levels, SVM suits the situation. We trained SVM models with both Gaussian and linear kernel functions; the Gaussian kernel performed better, so we report its outcomes specifically. Decision trees were used as an algorithm that does not depend on distance, providing another approach to training the model [11, 12]. Related work includes ranking of program execution sequences using improved precision in regression testing [13] and implementation of electronic medical health records [14].
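Of the three algorithms, KNN with K = 5 is the simplest to show end to end. The sketch below is a minimal hand-rolled version on toy data, not the authors' implementation; the feature values and labels are invented for illustration.

```python
import math
from collections import Counter

def knn_predict(train, query, k=5):
    """Minimal k-nearest-neighbors classifier (K = 5, as in the chapter).
    train: list of (feature_vector, label) pairs; query: feature vector.
    Returns the majority label among the k closest training points."""
    nearest = sorted(train, key=lambda p: math.dist(p[0], query))
    votes = Counter(label for _, label in nearest[:k])
    return votes.most_common(1)[0][0]

# Toy comfort data: two behavior features, labels "high"/"low" comfort.
train = [((0.10, 0.20), "low"),  ((0.20, 0.10), "low"),
         ((0.15, 0.25), "low"),  ((0.30, 0.20), "low"),
         ((0.90, 0.80), "high"), ((0.80, 0.90), "high"),
         ((0.85, 0.95), "high")]
label = knn_predict(train, (0.88, 0.85))  # 3 of 5 nearest neighbors are "high"
```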
29.4 Evaluation and Results
Data Collection
We experimented on two subjects, one female and one male, aged 25 and 26, respectively. Both subjects work at their own worksite, so no travel between locations was needed. The web camera captured the working behavior from a top view throughout the daytime. Each subject completed the survey at the end of every day, and the record was analyzed daily for mood and stress levels. The complete study ran for 25 days, and the data was used only on days when the subjects were present at the worksite. The system extracted the behavior from these days using the trained hand and body detectors.
29.4.1 Survey Results and Clustering
Table 29.1 presents simple descriptive statistics of the survey outcomes for both subjects, where "Overall" denotes the statistics of the two subjects' outcomes merged. Note that the two subjects show an observable difference in stress and PA scores. This might result from the overall workload, as subject 1 has longer working hours than subject 2, resulting in greater stress. All data was normalized before clustering. The extracted results are given in Table 29.1.

Table 29.1 Subjects—PA and NA—stress score (on scale 100)

                 Min   Max
Sub1 PA          21    45
Sub1 NA          34    53
Sub1 Stress      43    85
Sub2 PA          27    48
Sub2 NA          37    55
Sub2 Stress      46    87
Overall PA       23    47
Overall NA       35    54
Overall Stress   44    86
29.4.2 Clustering
We used the fuzzy C-means (FCM) clustering algorithm to categorize the survey outcomes into clusters representing lower and higher comfort levels. The benefit of FCM is that it uses membership probabilities during computation rather than a hard distance basis. In this scenario a survey score can be ambiguous: a score on a given day may lie close to the mean and would be assigned to an incorrect class by a hard method. For each subject and for the merged results, we obtained two clusters. Subject A's first cluster shows a higher level of comfort: its PA mean is greater than, and both its stress and NA means are lower than, subject A's overall means. Subject A's second cluster shows a lower level of comfort: a lower PA, with both stress and NA values greater than subject A's overall means. The same pattern also appears in subject B's outcomes and clusters; the mean of every attribute differs slightly between the datasets of subjects A and B. We used subject A's clustering outcomes to train the customized comfort identification model using only subject A's data (Fig. 29.4).
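A compact version of the FCM step can be sketched as follows. This is a generic textbook implementation, not the authors' code, and the (PA, NA, stress) values below are invented normalized scores for illustration.

```python
import numpy as np

def fuzzy_cmeans(X, c=2, m=2.0, n_iter=100, seed=0):
    """Minimal fuzzy C-means: X is (n_samples, n_features). Returns
    (centers, U) where U[i, j] is the membership degree of sample i
    in cluster j, so each day gets a soft rather than hard assignment."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)                 # rows sum to 1
    for _ in range(n_iter):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]  # weighted means
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)                         # avoid divide-by-zero
        U = 1.0 / d ** (2 / (m - 1))                  # membership update
        U /= U.sum(axis=1, keepdims=True)
    return centers, U

# Toy normalized (PA, NA, stress) scores: three "high comfort" days
# (high PA, low NA/stress) and three "low comfort" days.
X = np.array([[0.80, 0.20, 0.10], [0.90, 0.10, 0.20], [0.85, 0.15, 0.15],
              [0.20, 0.80, 0.90], [0.10, 0.90, 0.80], [0.15, 0.85, 0.85]])
centers, U = fuzzy_cmeans(X)
labels = U.argmax(axis=1)   # hard labels taken from the soft memberships
```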
29.4.3 Classification Models
Behavior Information Extraction
This concept relies mainly on image processing and computer-vision techniques. The focus is on detecting the hands and body in frames extracted from the video; in this research we analyzed 1 frame/s of every video. The detected hands and body are used to compute the following behavior attributes: mean body motion, mean movement of hand 1 and hand 2, total time present at the worksite, mean time per worksite activity, number of activity changes, and total time away from the worksite. Hand movements were observed using motion detection, as shown in Fig. 29.5.

Fig. 29.4 Clustering scores

Fig. 29.5 Daily behavior extraction—based on hand movements—stress conditions—mental fatigue
The window size for these computations equals the subject's total working time in a day. In all, seven attributes are obtained from the behavior extraction, and they were extracted on a daily basis, so every data point represents the behavior of a particular day. The data points are labeled according to the survey outcome score of that day; as mentioned, we used FCM to separate the outcomes into two classes.
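The seven daily attributes can be computed from per-frame detections roughly as follows. The frame dictionary keys and the demo data are hypothetical; the chapter does not specify the detector output format.

```python
from statistics import mean

def daily_behavior_features(frames):
    """Sketch of the seven daily behavior attributes. Each frame is one
    second of video, given as a dict with hypothetical keys: 'present'
    (body detected), 'body_motion', 'hand1_motion', 'hand2_motion',
    and 'activity' (a coarse activity label). Assumes at least one
    frame with the subject present."""
    present = [f for f in frames if f["present"]]
    activities = [f["activity"] for f in present]
    changes = sum(1 for a, b in zip(activities, activities[1:]) if a != b)
    segments = changes + 1  # contiguous activity runs while present
    return {
        "mean_body_motion": mean(f["body_motion"] for f in present),
        "mean_hand1_motion": mean(f["hand1_motion"] for f in present),
        "mean_hand2_motion": mean(f["hand2_motion"] for f in present),
        "time_present_s": len(present),
        "mean_time_per_activity_s": len(present) / segments,
        "activity_changes": changes,
        "time_absent_s": len(frames) - len(present),
    }

# A toy day: 60 s typing, 30 s away from the desk, 60 s writing.
frames = (
    [{"present": True, "body_motion": 1.0, "hand1_motion": 0.5,
      "hand2_motion": 0.2, "activity": "typing"}] * 60
    + [{"present": False, "body_motion": 0.0, "hand1_motion": 0.0,
        "hand2_motion": 0.0, "activity": None}] * 30
    + [{"present": True, "body_motion": 2.0, "hand1_motion": 1.0,
        "hand2_motion": 0.4, "activity": "writing"}] * 60
)
feats = daily_behavior_features(frames)
```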
29.4.4 Classification Model Training
Besides the everyday behavior data, we used the daily weather as a key attribute, since this sort of contextual data has been shown to affect a person's mood. In general, temperature has a major effect on a person's NA, although no single rule applies to everyone: preferences and personality must be taken into account, as low temperature might decrease NA in one person and increase it in another. We documented the highest and lowest temperatures every day and appended them to the daily behavior data, giving seven attributes from the behavior extraction and two from the weather data. We trained the models on the assembled dataset, with 75% used for training and 25% for testing, using the three algorithms mentioned earlier; the data was normalized before training. We trained models under two settings. First, a generic model was trained with both subjects' data, using 21 days of information from subjects A and B. Second, a customized model was trained with only subject A's data; this illustrates the possibility of training the model for an individual comfort framework. Note that we did not train a personalized model for subject B, as only a small dataset was available for that subject.
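The normalization step mentioned above can be sketched as follows. The chapter does not state which normalization was used, so z-score (standard) scaling is assumed here, and the feature names and values are placeholders.

```python
from statistics import mean, pstdev

def zscore_normalize(columns):
    """Normalize each feature column to zero mean and unit variance
    (one common choice; the chapter does not say which scheme was used).
    columns: dict mapping feature name -> list of raw daily values."""
    out = {}
    for name, values in columns.items():
        mu, sigma = mean(values), pstdev(values)
        # Constant columns get all zeros instead of dividing by zero.
        out[name] = [(v - mu) / sigma if sigma else 0.0 for v in values]
    return out

# Two of the nine features (seven behavioral + daily min/max temperature).
raw = {"temp_max": [28.0, 30.0, 32.0, 34.0], "activity_changes": [3, 5, 7, 9]}
norm = zscore_normalize(raw)
```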
29.4.5 Results
For the general model, the SVM algorithm attained the highest precision at 87%, while the D-tree algorithm attained the lowest at 65%. For the personalized model based on subject A's data, the precisions of SVM, KNN, and D-tree were 94%, 79%, and 71%, respectively, as shown in Fig. 29.6. Overall, SVM performed best among the models, which may be because the task is a binary classification problem, while the D-tree (decision tree) was the least accurate predictor under both settings. Finally, the precision of the personalized model is higher than that of the general model, which suggests good potential for personalized development.
Fig. 29.6 Machine learning algorithms' response on recommendation system—mental fatigue (bar chart: precision 0–100 for SVM, KNN, and D-tree; series NA and PA)
29.5 Conclusion and Discussion
In this research, we proposed a comfort identification framework based on mood and stress status and presented preliminary outcomes using machine learning and deep learning techniques for daily information gathering and extraction in a closed environment. The research reports the precision of both the generic and the personalized models of this non-intrusive system. We applied fuzzy clustering to the three survey attributes, stress and mood related, and used the clustering outcomes to label the everyday behavior data; this approach supports comfort recognition because it considers numerous aspects. We trained the classification model with three algorithms, which achieved a satisfactory level of precision: both the personalized and the generic models reached 87% or more with SVM. The classification model in this research demonstrates the feasibility of developing a personalized comfort information system.
As future work on this preliminary research, incorporating factors outside the office, such as personal traits, physical activity, and diet, could help the system detect a person's comfort better while keeping the system more personalized. With additional techniques, longer data-collection periods, and more sensors, we anticipate achieving a more precise model for a real-time comfort system. On the whole, this research has proposed a comfort recognition system with both personalized and generic recognition models, using cutting-edge technology such as deep learning while considering numerous attributes of comfort.
References
1. Vaskari, R. G., & Sugumaran, V. B. (2020). Prevalence of stress among software professionals
in Hyderabad, Telangana State, India. Central African Journal of Public Health, 6(4), 207.
2. Matsumoto, T., Egawa, M., Kimura, T., & Hayashi, T. (2019). A potential relation between premenstrual symptoms and subjective perception of health and stress among college students: A cross-sectional study. BioPsychoSocial Medicine, 13(1), 1–9.
3. Umematsu, T., Sano, A., Taylor, S., & Picard, R. W. (2019). Improving students’ daily life
stress forecasting using LSTM neural networks. In 2019 IEEE EMBS international conference
on biomedical & health informatics (BHI) (pp. 1–4).
4. Kumar, K. V. R., & Elias, S. Use case to simulation: Muscular fatigue modeling and analysis
using opensim. Turkish Journal of Physiotherapy and Rehabilitation, 32(2).
5. Martin, K., Meeusen, R., Thompson, K. G., Keegan, R., & Rattray, B. (2018). Mental fatigue
impairs endurance performance: A physiological explanation. Sports Medicine, 48(9), 2041–
2051.
6. Pageaux, B., & Lepers, R. (2018). The effects of mental fatigue on sport-related performance.
Progress in Brain Research, 240, 291–315.
7. McCormick, M. P., Hsueh, J., Merrilees, C., Chou, P., & Mark, C. E. (2017). Moods, stressors,
and severity of marital conflict: A daily diary study of low-income families. Family Relations,
66(3), 425–440.
8. Sudarma, M., & Harsemadi, I. G. (2017). Design and analysis system of KNN and ID3 algorithm
for music classification based on mood feature extraction. International Journal of Electrical
and Computer Engineering, 7(1), 486.
9. Taylor, S., Jaques, N., Nosakhare, E., Sano, A., & Picard, R. (2017). Personalized multitask
learning for predicting tomorrow’s mood, stress, and health. IEEE Transactions on Affective
Computing, 11(2), 200–213.
10. Kumar, K. V. R., Kumar, K. D., Poluru, R. K., Basha, S. M., & Reddy, M. P. K. (2020). Internet
of things and fog computing applications in intelligent transportation systems. In Architecture
and security issues in fog computing applications (pp. 131–150). IGI Global.
11. Bhogaraju, S. D., & Korupalli, V. R. K. (2020). Design of smart roads—A vision on Indian smart
infrastructure development. In 2020 International conference on communication systems &
networks (COMSNETS) (pp. 773–778).
12. Bhogaraju, S. D., Kumar, K. V. R., Anjaiah, P., Shaik, J. H., & Reddy Madhavi. (2021).
Advanced predictive analytics for control of industrial automation process. In Innovations in
the industrial internet of things (IIoT) and smart factory (pp. 33–49). IGI Global.
13. Narendra Kumar Rao, B., & Bhaskar Kumar Rao, B. (2019). Clustering based test suite selection for ranking of program execution sequence using improved precision in regression testing. International Journal of Innovative Technology and Exploring Engineering, 8(7).
14. Narendra Kumar Rao, B., & Bhaskar Kumar Rao, B. (2019). Blockchain based implementation of electronic medical health record. International Journal of Innovative Technology and Exploring Engineering, 8(8).
Chapter 30
Multiple Slotted Triple-Band PIFA Antenna for Wearable Medical Applications at 2.5–9 GHz
T. V. S. Divakar and G. Anantha Rao
Abstract This paper discusses a small-sized planar inverted-F antenna (PIFA) for triple-band operation. The recommended antenna contains a stepped rectangular radiating component near the edge of the PIFA, together with a ground plane fully coated with copper. The dimensions of this antenna are 26.7 × 16 × 1 mm³, and the feeding technique is a microstrip line. The simulated S11 values, voltage standing wave ratio (VSWR), and gain agree fairly well with each other. The proposed antenna structure operates in the 2.5–8.9 GHz frequency range with a return loss of more than 15 dB. The antenna structure is suitable for wearable medical applications.
30.1 Introduction
In recent years, wearable antennas have attracted predominant interest in antenna design due to many applications in well-known areas such as health monitoring, physical training, and tracking. Since this type of antenna sits close to the human body and undergoes many bends, the antenna design must account for losses caused by the human body as well as bending losses, besides achieving optimum performance. Such antenna systems must therefore have a low radiation effect, which is measured by the specific absorption rate (SAR). The designed antenna should also be flexible enough to attach to the human body, to increase comfort while worn. Since fabric materials are light in weight and easy to wear, they can be used to build this type of antenna. Besides the advantages of PIFA antennas, such as size and cost, there are some disadvantages, such as "poor impedance matching, poor proficiency, and excitation of surface waves that could bring down the radiation effectiveness." A small-sized, low-profile fabric electromagnetic bandgap-based (EBG) antenna for wearable health applications which operates at
T. V. S. Divakar (B) · G. Anantha Rao
Department of Electronics and Communication Engineering, GMR Institute of Technology, Rajam, AP, India
e-mail: divakar.tvs@gmrit.edu.in
G. Anantha Rao
e-mail: anantharao.g@gmrit.edu.in
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
B. N. K. Rao et al. (eds.), Intelligent Computing and Applications, Smart Innovation,
Systems and Technologies 315, https://doi.org/10.1007/978-981-19-4162-7_30
2.1 GHz is explored in [1]; with the help of the EBG structure, a reduction in back radiation and an improvement in the front-to-back ratio (FBR) were achieved, along with a SAR of 0.0368 W/kg for dimensions of 46 × 46 × 2.4 mm³. Textile antennas with an artificial magnetic conductor are investigated in [2, 3]. A dual-resonant-mode wearable textile PIFA for 5 GHz WLAN applications is designed in [4]; in this design, using hollow copper rivets and wool felt, a SAR of 0.9307 was achieved. A small-sized dual-band fabric PIFA for the 432 MHz/2.4 GHz ISM bands was proposed, made of 6 mm thick felt and a 0.17 mm thick conductive textile sheet at the back of the PIFA [5–7]. A supple fractal EBG for mm-wave wearable antennas was projected in [8].
This paper presents a triple-band miniaturized PIFA antenna with edge slots at the end of the PIFA, viewed from the feed element, to obtain a good impedance match and triple-band operation at 2.5 and 8.7 GHz. The projected antenna is small compared with other antennas in the literature [9].
The organization of this manuscript is as follows: Sect. 30.2 describes the antenna design with all its dimensions, and Sect. 30.3 presents the results and discussion, followed by the conclusions.
30.2 Antenna Design
The complete size of the proposed design is 26.7 × 16 × 1 mm³, as shown in Fig. 30.1. It comprises a PIFA structure that can be divided into two vertical patches of 13.54 × 2.72 mm² and 14.079 × 2.72 mm², viewed from the microstrip feeding end. Two more rectangular patches of 11.57 × 1.38 mm² and 12.10 × 1.38 mm² were placed to complete the PIFA structure. One square slot of 1 × 1 mm² placed near the two patches decreases the return loss and provides good impedance matching for the structure. The ground plane is completely shielded with a copper coating. Simulations were carried out using HFSS software, and all dimensions were tuned by trial and error while keeping the primary values constant to obtain the preferred characteristics. The thickness, dielectric constant, and loss tangent of the FR4 substrate are 1.6 mm, 4.4, and 0.02, respectively. The antenna resonates at three frequencies between 2.5 and 8.9 GHz. The dimensions of the recommended antenna are given in Fig. 30.1.
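As background for these dimensions, the fundamental resonance of a PIFA is commonly estimated from the quarter-wavelength condition; this is a textbook approximation, not a formula given in the paper:

```latex
f_r \approx \frac{c}{4\,(L + W)\,\sqrt{\varepsilon_{\mathrm{eff}}}}
```

where $c$ is the speed of light, $L$ and $W$ are the length and width of the radiating element, and $\varepsilon_{\mathrm{eff}}$ is the effective permittivity of the substrate (between 1 and the FR4 value of 4.4 here). Slots lengthen the current path and shift or add resonances, which is how the edge slots above produce the additional bands.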
30.3 Results and Discussion
The reflection-coefficient characteristics are shown in Figs. 30.2 and 30.4; with the edge slot, the antenna resonates more strongly at the higher frequency than at the lower frequency, compared with the structure without the slot. Figures 30.3 and 30.5 show the corresponding VSWR characteristics, which agree well with the reflection coefficient at the desired frequencies. The recommended antenna works in a double band of frequencies at 2.24 and 8.8 GHz. Because of the edge slot, there is a notable change in the reflection coefficient, and it was observed that changing the dimensions of the air box and the operating frequency causes minor variations in the operating frequency.

Fig. 30.1 Top view of antenna

Figure 30.6 shows the reflection coefficient versus frequency with two slots on the structure; a return loss of 20 dB is observed at 2.5 GHz and 30 dB at 8.7 GHz. Figure 30.7 shows the VSWR versus frequency with two slots on the structure; a VSWR of 1.96 and 1.02 at the desired frequencies shows that the antenna resonates at the operating frequencies.
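Figures 30.6–30.15 report return loss and VSWR side by side; the textbook relations linking them can be sketched in a few lines (standard formulas, not code from the paper):

```python
def gamma_from_return_loss(rl_db):
    """Reflection coefficient magnitude |Γ| from return loss in dB,
    using RL = -20 * log10(|Γ|)."""
    return 10 ** (-rl_db / 20)

def vswr(gamma_mag):
    """VSWR = (1 + |Γ|) / (1 - |Γ|); always >= 1 for a passive load."""
    return (1 + gamma_mag) / (1 - gamma_mag)

# A 20 dB return loss corresponds to |Γ| = 0.1, i.e. a VSWR of about 1.22,
# which is why deep S11 dips coincide with VSWR minima in the figures.
g = gamma_from_return_loss(20.0)
ratio = vswr(g)
```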
Figure 30.8 shows the reflection coefficient versus frequency with three slots on the edge of the structure; a return loss of 21 dB is observed at 2.5 GHz and 29 dB at 8.7 GHz. Figure 30.9 shows the corresponding VSWR versus frequency; a VSWR of 1.66 and 0.67 at the desired frequencies shows that the antenna resonates at the operating frequencies and performs better than the two-slot structure.

Fig. 30.2 Antenna having edge and middle slots

Fig. 30.3 Antenna having three edge slots

Fig. 30.4 Antenna having one edge slot with three bands
Figure 30.10 shows the reflection coefficient versus frequency with a single slot on the edge of the structure; a return loss of 20 dB is observed at 2.5 GHz, 32.4 dB at 6.7 GHz, and 21.4 dB at 8.9 GHz. Figure 30.11 shows the corresponding VSWR versus frequency; a VSWR of 1.9, 0.45, and 1.51 at the desired frequencies shows that the antenna resonates at the operating frequencies and that triple-band operation was achieved.

Fig. 30.5 Antenna having multiple edge slots with three bands

Fig. 30.6 Reflection coefficient versus frequency with two slots

Figure 30.12 shows the directivity plot with three edge slots on the structure; the side lobes of the antenna are small. Figure 30.13 shows the reflection coefficient versus frequency with three slots on the edge of the structure; triple-band operation is also possible with three slots if the slot positions are changed.
Figure 30.14 shows the VSWR versus frequency with three slots on the edge of the structure; a VSWR of 1.01, 0.39, and 2.71 at the desired frequencies shows that the antenna resonates at the operating frequencies and that triple-band operation was achieved.
Figure 30.15 shows the directivity plot with three edge slots on the structure; the side lobes of the antenna are small.
Figure 30.16 shows the fabricated antenna with the edge slot. An SMA connector is used to feed power to the antenna.
Fig. 30.7 VSWR versus frequency with two slots
Fig. 30.8 Reflection coefficient versus frequency with three edge slots
30.4 Conclusion
A PIFA antenna with an edge slot, of overall size 26.7 × 16 × 1 mm³ and with a ground structure fully covered with copper, has been simulated and fabricated. A microstrip line feed is used to feed the structure. Because of its small size, it is useful for medical and wearable applications. This antenna meets the requirements of 2.5 and 8.9 GHz applications. The gain of the PIFA antenna is found to be 3.5 dB. The simulated reflection coefficient and VSWR properties are in good agreement.
Fig. 30.9 VSWR versus frequency with three slots
Fig. 30.10 Reflection coefficient versus frequency with single edge slot
Fig. 30.11 VSWR versus frequency with single edge slot
Fig. 30.12 Directivity plot
with single edge slot
Fig. 30.13 Reflection coefficient versus frequency with three edge slots
Fig. 30.14 VSWR versus frequency with three edge slots
Fig. 30.15 Directivity plot
with three edge slots
Fig. 30.16 Fabricated
antenna with single slot
Chapter 31
Fish Classification System Using
Customized Deep Residual Neural
Networks on Small-Scale Underwater
Images
M. Sudhakara, Y. Vijaya Shambhavi, R. Obulakonda Reddy, N. Badrinath,
and K. Reddy Madhavi
Abstract Recent improvements in marine science research have increased the
importance of underwater fish species identification. Using technology to automate
fish species identification would positively impact marine biology. Since the advent
of deep learning techniques, image classification problems have become increasingly
popular. Wild natural habitats make it harder to identify fish species because of
the complex backgrounds and noise in the raw images. Some of the most advanced
approaches for categorizing fish species in their natural habitats have been developed
in the previous decade. This paper demonstrates an automated approach for
classifying fish species based on deep residual networks. Existing transfer learning
models work well on large datasets but are less effective on smaller ones. A novel
RESNET model (SmallerRESNET) is developed to reduce the overfitting generated
by the standard pre-trained RESNET model. Convolutional and fully connected layers
are used in this more straightforward form of the RESNET model. We evaluated and
compared six different versions of the RESNET model. In addition to the number
of convolutional and fully connected layers, the number of iterations required to
achieve 80.56% accuracy on the training data, the batch size, and the dropout layer are
examined. Compared to the original RESNET model, the proposed modified RESNET
model with fewer layers obtained 90.26% testing accuracy with a validation loss of
0.0916 on an untrained benchmark fish dataset. The inclusion of a dropout layer
M. Sudhakara (B)
School of C & IT, Reva University, Bangalore, India
e-mail: malla.sudhakara@reva.edu.in
Y. Vijaya Shambhavi
E.E.E, Annamacharya Institute of Technology and Sciences, Tirupati, AP, India
R. Obulakonda Reddy
C.S.E, Institute of Aeronautical Engineering, Dundigal, Hyderabad, India
N. Badrinath
C.S.E, Annamacharya Institute of Technology and Sciences, Tirupati, AP, India
K. Reddy Madhavi
C.S.E, Sree Vidyanikethan Engineering College, Tirupati, AP, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
B. N. K. Rao et al. (eds.), Intelligent Computing and Applications, Smart Innovation,
Systems and Technologies 315, https://doi.org/10.1007/978-981-19-4162-7_31
327
enhanced our proposed model's overall performance. It is more efficient, with less
memory, fewer training photos, and lower computing complexity than its predecessor.
31.1 Introduction
A fish detection and recognition system that can recognize, localize, and categorize
fish and fish species in underwater photographs would be helpful in a variety of
marine applications. Monitoring duties could include, among other things, population
counting, identifying species present in a specific area, and surveying fish movement
patterns. This could happen in dams, streams, lakes, fish farms, and even the ocean. In
the context of fishing operations, the application could be used to parameterize a fish
school to estimate the distribution of quantities of various species more efficiently.
As a result, the expected bycatch and discard rates can be calculated. Commercial
fisheries can use this information to investigate an identified school of fish before
spending quotas fishing it.
Its significance in oceanography and marine research is that detecting fish is very
important in underwater object detection. Classification of fish kinds is helpful to
researchers, ocean scientists, and biologists [1]. It is also helpful to specify biomass
levels and the geological processes in the oceans. Because fish species recognition is
so important, various computer vision approaches for reliably categorizing different
fish species have been proposed. Fish species can be classified as follows:
1. Identifying fish species from dead fish carcasses.
2. Recognizing various fish species in a simulated habitat.
3. Identifying fish species existing in their local territory.
Finding underwater fish and their species is complex. Fish recognition research
has been conducted on both dead fish and fish that have been removed from the water
or kept in artificial tanks [2]. However, no serious research into underwater habitats
has been conducted. This is due primarily to the natural environment of the sea,
which contributes to its abundance. The underwater films are difficult to watch due
to their poor quality, complicated backgrounds, and low luminance. By visualizing a
fish species’ movements and activities, we could better evaluate the species’ overall
movement and activity. As a result of the introduction of deep learning algorithms,
image classification has become a rapidly growing research topic. This is primarily
because deep learning approaches such as CNNs require no separate feature
extraction before model training. Most existing object recognition frameworks rely
on photographs taken from the ground or satellites.
31.2 Related Works
The rich backgrounds and noise of underwater images make it difficult to identify
fish species. Many academics have proposed advanced approaches to categorizing
fish species in natural habitats. There have recently been many advances in ML
methods for underwater species categorization. Early algorithms used shape and
texture features to classify dead fish samples. General fish classification is complicated
by variable luminance, background noise between the reef and aquatic vegetation,
and water turbidity. Proper classification is difficult because many fish species have
similar shapes, textures, and colors. Shafait et al. [3] used abundance and biomass
to identify fish species in wild habitats. Hernández-Serna and Jiménez-Segura used
ANN on morphology, texture, and geometry features [4]. The authors in [3] used
hierarchical categorization to identify active fish in the open sea. Sun et al. identified
fish kinds using poor-resolution photos.
Hsiao et al. [5] classified 25 fish types with an accuracy of about 81.8% using S.R.C.
and P.C.A. On a dataset with 15 fish types and approximately 24,000 photos, Huang
et al. [6] used Gaussian mixture models and SVMs to achieve an identification rate of
74.8%. The neural network was introduced in the late twentieth century but was
unpopular because it required extensive supervised training and could not deal with
difficult or complex
situations. Convolution is a widely used operation in signal processing and computer
vision. Convolutional neural networks are now widely used in computer vision to
reduce noise and identify edges. Convolutional neural networks (CNNs) are like
artificial neural networks (ANNs), gaining popularity in AI and machine learning
applications. Face and object recognition are two applications of CNNs and their
modifications. The introduction of powerful GPUs has facilitated the training of
deep and complex neural networks [7].
Rathi et al. [8] recommended a deep learning and image processing system. Preprocessing
techniques included Otsu's thresholding, erosion, and dilation. A CNN uses
the resultant picture and the actual picture to categorize fish types. The Fish4Knowledge
dataset was used, with a 96.29% accuracy rate. Another approach proposed using a
high-resolution mono-image to super-resolve a low-resolution picture. A linear SVM
and two deep learning approaches, PCANet and NIN, classify the species. The dataset
used was FishCLEF2015, with PCANet achieving 77.27% precision and NIN achieving
69.84%. Qin et al. [9] are among those who have contributed to this work.
Salman et al. [10] suggested using a CNN in a hierarchical feature combination
system to analyze species-dependent properties. SVM and KNN are applied for
feature extraction and classification. The FISHCLEF 2014 and FISHCLEF 2015
datasets were used, and the framework was 97.41% accurate. Mohammed et al.
introduced an SVM architecture and strategies to extract features for the classification
of Nile Tilapia. The SIFT and SURF algorithms extracted image features, yielding
promising results. In these frameworks, machine learning was used to classify fish
species. Other frameworks employ more traditional methods. Moniruzzaman et al. [11]
provide an overview of classification strategies for underwater fish species. Deep
learning, for example, has achieved outstanding results in visual recognition and
detection. Previously,
researchers had difficulty obtaining satisfactory results. It was difficult for various
reasons, including an insufficient sampling of fish species and sparse datasets. The
recognition accuracy varies depending on the situation. Many scientists are working
on complex models to learn and extract complex features using popular deep learning
models such as ResNet [12]. This study proposes an efficient and instant classification
method for low-quality image datasets.
31.3 Deep Residual Network (DRN)
Deep networks have made significant advances in image classification. However,
there is a degradation problem when considering the convergence of deeper networks,
i.e., as the depth increases, the accuracy saturates and then degrades. Overfitting is
not the cause, since the error rises on both the training and test sets. The research
team examined numerous deep learning models and discovered
that RESNET outperformed other models in classification performance and that
increasing the depth improves accuracy. The main goal of the RESNET model is to
overcome neural network degradation risk, in which the training error rate increases
with depth [13]. As a solution, the team suggested a residual structure: the network
layers are reformulated to learn a residual function of their input. In mathematics,
the residual is the difference between the actual observation and the estimated value.
RESNET residual components are chosen based on project requirements, and Fig. 31.1
depicts a sample RESNET-20 residual component. Two convolution layers with a
3 × 3 kernel size are used in the residual component, so that its input and output
dimensions match and can be added immediately. When the step size is one, the
padding layer is the original input layer; when the step size is two, the RESNET input
performs the same function twice, followed by average pooling to generate the filler
layer. Eventually, the output of the filling layer plus the residual component equals
the input of the output layer.
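The identity shortcut described above can be sketched with a toy dense-layer residual block. This is purely illustrative (the actual RESNET component uses two 3 × 3 convolution layers, not dense layers):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    """Toy dense-layer residual component: compute F(x) with two weighted
    layers, then add the identity shortcut, i.e. y = ReLU(F(x) + x)."""
    out = relu(x @ w1)        # first layer with ReLU
    out = out @ w2            # second layer, no activation yet
    return relu(out + x)      # identity shortcut added before the final ReLU

rng = np.random.default_rng(0)
x = rng.standard_normal((1, 8))
w1 = rng.standard_normal((8, 8)) * 0.01
w2 = rng.standard_normal((8, 8)) * 0.01
print(residual_block(x, w1, w2).shape)  # (1, 8): dimensions are preserved
```

Note that with zero weights the block reduces to ReLU(x), which is why residual layers are easy to optimize: the shortcut carries the input through unchanged.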
CNNs can act as feature extractors or as classifiers. A CNN is a set of convolutional
filters that extract feature vectors from images using a kernel. The pooling layer
subsamples the kernel's pixel values by taking their maximum (max pooling) or
average (average pooling). Fully connected layers map all incoming feature vectors
to the next layer. The framework for identifying fish species is depicted in Fig. 31.2.
Three convolution layers are used in the system, each followed by a max-pooling
layer. Conv 1, Conv 2, and Conv 3 denote the convolutional layers, and Max P
denotes the max-pooling operation. Furthermore,
Fig. 31.1 RESNET-20 residual component (two 3 × 3, 64-filter convolution layers with ReLU)
Fig. 31.2 Conventional structure of DRN (Conv 1, Conv 2 and Conv 3, each followed by a Max P layer)
two additional layers are used to classify fish species. The primary advantage of
max-pooling is that it retains prominent features such as edges. The identification of
fish species depends on determining the fish border, so image sharpening is applied
to the fish photos. Average pooling may be insufficient because it averages the kernel
pixels. The first convolution layer contains thirty-two 3 × 3 filters, the next contains
sixty-four 3 × 3 filters, and the third contains 128 3 × 3 filters. Each convolution
layer is followed by a 2 × 2 max-pooling layer. A dropout layer is placed before a
fully connected layer to prevent overfitting. The rectified linear unit (ReLU) is utilized
in all convolution layers, and softmax is utilized as the activation function of the final
layer. However, because the original models are trained on large benchmark datasets
with fixed parameters, their accuracy is degraded on this task. Further, CNNs give
better results on high-quality image datasets but perform poorly on small-scale and
low-quality datasets.
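Assuming stride-1, 'same'-padded convolutions and the 224 × 224 input used later in Sect. 31.5 (the chapter does not state the padding or stride, so these are assumptions), the feature-map sizes through the three conv/max-pool stages can be traced as follows:

```python
def trace_shapes(size=224, filters=(32, 64, 128), pool=2):
    """Trace the spatial size and channel depth through the conv/max-pool
    stack, assuming stride-1 'same'-padded 3x3 convolutions."""
    shapes = []
    for f in filters:
        # a 'same' convolution keeps the spatial size; 2x2 pooling halves it
        size = size // pool
        shapes.append((size, size, f))
    return shapes

print(trace_shapes())  # [(112, 112, 32), (56, 56, 64), (28, 28, 128)]
```

The final 28 × 28 × 128 feature map is what the dropout and fully connected layers then consume.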
31.4 Proposed Method
Preprocessing the sample images is an essential step in recognizing underwater fish
species. This is significant because the sample images are blurry due to the uncon-
trolled environment. As a result, the classifier may fail to learn species-specific
features accurately. Figure 31.3a depicts the proposed work’s customized architec-
ture, whereas Fig. 31.3b depicts the proposed work’s dropouts between the input
and output layers. In the architecture, three convolutional layers are followed by
two dense layers. Batch normalization and dropouts are used to train a more proba-
bilistic model, reducing overfitting caused by the underlying RESNET model. During
testing, real-time image classification is possible. The mini-batch mean and standard
deviation are subtracted and divided to normalize layer inputs in batch normalization.
A mini-batch, for example, contains only a subset of the total training data.
p̂ = (p − E[p]) / √(Var[p]) (31.1)
As a result of normalization, the input distribution to each neuron is the same,
removing the problem of internal covariate shift and enabling regularization. But, the
network’s representative strength has been substantially weakened. Normalization
Fig. 31.3 Proposed SmallerRESNET. a Architecture used to train small-scale underwater images
(input, stacked CONV-BN-MP-DROPOUT blocks, a DENSE layer with batch normalization and
dropout, and a final DENSE + softmax output). b Regularization using dropouts on the hidden layer
of each layer loses certain nonlinear correlations and weight changes produced by
the previous layer, which can result in suboptimal weighting. This is fixed by applying
trainable gamma and beta parameters to the normalized value.
z = γ · p̂ + β (31.2)
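A minimal numpy sketch of Eqs. (31.1) and (31.2), with a small epsilon added to the variance for numerical stability (a standard implementation detail not shown in the equations):

```python
import numpy as np

def batch_norm(p, gamma=1.0, beta=0.0, eps=1e-5):
    """Eq. (31.1): normalize each feature with the mini-batch mean and
    variance, then Eq. (31.2): rescale with the trainable gamma and beta."""
    p_hat = (p - p.mean(axis=0)) / np.sqrt(p.var(axis=0) + eps)
    return gamma * p_hat + beta

# A toy mini-batch of three samples with two features on very different scales
batch = np.array([[1.0, 200.0], [3.0, 400.0], [5.0, 600.0]])
z = batch_norm(batch)
print(z.mean(axis=0))  # approximately [0, 0]: each feature is now zero-mean
```

With the default gamma = 1 and beta = 0 the output is simply the normalized p̂; in training, both are learned per feature.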
Another feature of CNNs is the dropout layer. The inputs of some neurons are
dropped while others remain unaffected. A dropout layer can be applied to an input
vector to remove some of its features, or to a hidden layer to remove some of its hidden
neurons. When training CNNs, dropout layers prevent overfitting on the training
data. Without dropout, the first batch of training data would have a disproportionate
impact on learning, preventing traits from being learned from subsequent
samples or batches.
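A common way to realize this is inverted dropout, which rescales the surviving activations so that no change is needed at test time (one possible implementation, not necessarily the exact one used in the chapter):

```python
import numpy as np

def dropout(x, rate=0.5, rng=None):
    """Inverted dropout: zero a random subset of activations during
    training and rescale the survivors so the expected sum is unchanged."""
    rng = np.random.default_rng(0) if rng is None else rng
    mask = rng.random(x.shape) >= rate   # keep each unit with probability 1 - rate
    return x * mask / (1.0 - rate)

x = np.ones((2, 6))
print(dropout(x, rate=0.5))  # entries are either 0.0 (dropped) or 2.0 (rescaled)
```

At inference time the dropout layer is simply skipped, and the scaling above keeps the expected activations consistent between training and testing.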
31.5 Results and Discussion
For the experimental analysis, the study is carried out on an NVIDIA GPU with
4 GB of memory and a 9th-generation Core i5 CPU. The study's primary objective is
to test RESNET's transfer learning models on a challenging dataset of underwater
photos.
The Croatian dataset contains 794 images of 12 different species. Six hundred
twenty-six training images and 168 testing images were considered for the training
model [14]. Despite their low resolution, the images are rescaled to 224 × 224 to
obtain more generalized results at the expense of image quality. To accomplish this,
six different versions of RESNET models are investigated, with the models trained on
the Croatian dataset. For each trained model, the accuracy and loss during training
and the accuracy and loss during validation are calculated. The loss is calculated
using categorical cross-entropy with the Adam
optimizer. Basic augmentation techniques such as scaling, shearing, zooming, and
flipping are applied to expand the proportion of samples in each class.
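As a minimal illustration of such augmentation, horizontal flipping alone already doubles the sample count (scaling, shearing and zooming would be applied similarly; this sketch assumes channel-less arrays shaped (N, H, W)):

```python
import numpy as np

def augment_flips(images):
    """Double the sample count by appending a horizontally flipped copy
    of every image. Assumes arrays shaped (N, H, W)."""
    flipped = images[..., ::-1]                 # reverse the width axis
    return np.concatenate([images, flipped], axis=0)

batch = np.arange(2 * 4 * 4).reshape(2, 4, 4)   # two toy 4x4 "images"
print(augment_flips(batch).shape)  # (4, 4, 4)
```

In practice, a framework augmentation pipeline would apply such transforms randomly on the fly rather than materializing the enlarged dataset.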
The RESNET50, RESNET101, and RESNET152 models have lower training and
validation accuracy and higher training and validation loss than the other models.
Figure 31.4 depicts the version 1 models' training and testing accuracy, as well as
training and testing loss (RESNET50, RESNET101, and RESNET152). The second
version models (RESNET50V2, RESNET101V2, and RESNET152V2) outperform
the first version models in training and validation accuracy. When version2 models
are compared to their predecessors in terms of training loss, the version2 models have
a lower training loss but a significantly higher validation loss. The models appear to
be becoming overfit for the provided dataset. Figures 31.5 and 31.6 depict a statistical
comparison of the outcomes of the models under consideration. The original RESNET
model is modified into the proposed SmallerRESNET by including batch
normalization and dropout layers. Table 31.1 lists the six trained models along with
the proposed model. The results of the proposed architecture are
shown in Fig. 31.7. In addition to accuracy, the proposed framework considers the
metrics F1-score, precision, and recall.
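These metrics follow directly from per-class confusion counts; a small sketch with hypothetical counts (not taken from the chapter's experiments):

```python
def precision_recall_f1(tp, fp, fn):
    """Per-class precision, recall, and F1 (the harmonic mean of the two)
    from true-positive, false-positive, and false-negative counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical confusion counts for a single fish class:
print(precision_recall_f1(tp=45, fp=5, fn=10))
```

For a multi-class problem like the 12-species Croatian dataset, these are computed per class and then averaged.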
31.6 Conclusion
In their natural habitat, fish are difficult to distinguish. For underwater fish classification,
we proposed a deep CNN-based technique that will give researchers a
better understanding of underwater animals' life cycles and habitats and will aid
fish shippers and fishery managers in saving fish. It is made up of three convolutional
layers and one fully connected layer. Our proposed method significantly outperforms
the existing RESNET model in classification. The original RESNET model
employed massive deep neural networks trained on millions of images, whereas our
model was trained on about 750 photos; including a dropout layer before the softmax classifier improves the model's
Fig. 31.4 RESNET50, RESNET101, RESNET152 models during training for 100 epochs.
Top: RESNET50, middle: RESNET101, bottom: RESNET152
performance. On average, this model outperforms RESNET. It surpasses the original
RESNET model with an accuracy of 90.26%. The model performs well with fewer
training images and consumes less memory. Recognition of undersea organisms can,
however, be hampered by turbid water and background noise.
Fig. 31.5 Comparison of training accuracy and validation accuracy of the six residual models
Fig. 31.6 Comparison of training loss and validation loss of the six residual models
Table 31.1 Metrics of six transfer learning models along with the proposed architecture
Architecture Training accuracy Training loss Test accuracy Test loss
RESNET50 41.10 2.1482 37.50 2.0473
RESNET101 39.41 2.0354 38.69 2.3148
RESNET152 40.68 2.0057 37.50 2.5220
RESNET50V2 99.58 0.0388 77.98 6.1305
RESNET101V2 98.94 0.0844 81.55 5.9097
RESNET152V2 99.79 0.0073 77.38 5.8286
SmallerRESNET (proposed) 80.56 0.5526 90.26 0.0916
Fig. 31.7 Visualized results of SmallerRESNET architecture during training for 100 epochs
References
1. Sudhakara, M., & Meena, M. J. (2021). Multi-scale fusion for underwater image enhancement
using multi-layer perceptron. IAES International Journal of Artificial Intelligence, 10(2), 389.
2. Shu, L., Ludwig, A., & Peng, Z. (2021). Environmental DNA metabarcoding primers for
freshwater fish detection and quantification: In silico and in tanks. Ecology and Evolution,
11(12), 8281–8294.
3. Shafait, F., Mian, A., Shortis, M., Ghanem, B., Culverhouse, P. F., Edgington, D., Cline, D.,
Ravanbakhsh, M., Seager, J., & Harvey, E. S. (2016). Fish identification from videos captured in
uncontrolled underwater environments. ICES Journal of Marine Science, 73(10), 2737–2746.
4. Hernández-Serna, A., & Jiménez-Segura, L. F. (2014). Automatic identification of species with
neural networks. PeerJ, 2, e563.
5. Hsiao, Y., Chen, C., Lin, S., & Lin, F. (2014). Real-world underwater fish recognition and
identification using sparse representation. Ecological Informatics, 23, 13–21.
6. Jin, L., & Liang, H. (2017). Deep learning for underwater image recognition in small sample
size situations. In OCEANS 2017-Aberdeen, 2017 (pp. 1–4).
7. Rudra Kumar, M., & Kumar Gunjan, V. (2020). Review of machine learning models for credit
scoring analysis. Revista Ingeniería Solidaria, 16(1).
8. Rathi, D., Jain, S., & Indu, S. (2017). Underwater fish species classification using convolu-
tional neural network and deep learning. In International Conference of Advances in Pattern
Recognition.
9. Qin, H., Li, X., Liang, J., Peng, Y., & Zhang, C. (2016). Deepfish: Accurate underwater live
fish recognition with a deep architecture. Neurocomputing, 187, 49–58.
10. Salman, A., Harvey, E., Jalal, A., Shafait, F., Mian, A., Shortis, M., & Seager, J. (2016).
Fish species classification in unconstrained underwater environments based on deep learning.
Limnology and Oceanography, Methods, 14, 570–585.
11. Moniruzzaman, M., Islam, S., Bennamoun, M., & Lavery, P. (2017). Deep learning on under-
water marine object detection: A survey. In International Conference on Advanced Concepts
for Intelligent Vision Systems (pp. 150–160).
12. He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 770–
778). https://doi.org/10.1109/cvpr.2016.90
13. Zhang, X., Zhou, X., Lin, M., & Sun, J. (2018). ShuffleNet: An extremely efficient convolutional
neural network for mobile devices. In Conference on Computer Vision and Pattern Recognition
(pp. 6848–6856).
14. Jäger, J., Simon, M., Denzler, J., Wolff, V., Fricke-Neuderth, K., & Kruschel, C. (2015). Croatian
fish dataset: Fine-grained classification of fish species in their natural habitat. In BMVC Workshops, Swansea.
Chapter 32
Multiple Face Recognition System Using
OpenFace
Janakiramaiah Bonam, Lakshmi Ramani Burra,
Roopasri Sai Varshitha Godavarthi, Divya Jagabattula, Sowmya Eda,
and Soumya Gogulamudi
Abstract The digitalization of human work has been an ever-evolving process.
Students' and employees' attendance systems have been automated by using fingerprint
biometrics. In particular, the COVID situation created the need for a touchless attendance
system. Many institutions have already implemented a face detection-based attendance
system. However, the major problem in designing face-recognising biometric
applications is scalability and the accuracy with which multiple
faces can be differentiated in time from a single clip or image. This paper used the OpenFace model for face recognition
and developed a multi-face recognition model. The Torch and Python deployment
module of deep neural network-based face recognition was used, and it
produced accurate predictions in time.
32.1 Introduction
Authentication is one of the most significant challenges in society, and human face
recognition is one of the most well-known technologies for authentication. The solution
to this problem has substantially improved since the FaceNet article was
published, and the majority of subsequent research builds on this foundation. The perspective,
dimensions and luminance of the face all impact the performance. The OpenFace
model is restricted to images with one face, for which it predicts the individual's name;
it is not suitable for a group image and fails to complete the
task shown in Fig. 32.1. The challenge is to make the OpenFace algorithm work for
all the faces in an image, hence bringing multi-face recognition into play; the
block diagram is shown in Fig. 32.2.
FaceNet produces significant face mapping from images using deep learning
models like ZF-Net and Inception. FaceNet obtained cutting-edge outcomes in
J. Bonam (B)·L. R. Burra ·R. S. V. Godavarthi ·D. Jagabattula ·S. Eda ·S. Gogulamudi
Prasad V. Potluri Siddhartha Institute of Technology, Vijayawada, Andhra Pradesh, India
e-mail: janakiramaiah@pvpsiddhartha.ac.in
L. R. Burra
e-mail: blramani@pvpsiddhartha.ac.in
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
B. N. K. Rao et al. (eds.), Intelligent Computing and Applications, Smart Innovation,
Systems and Technologies 315, https://doi.org/10.1007/978-981-19-4162-7_32
339
Fig. 32.1 A group photograph with multiple faces
Fig. 32.2 Block diagram of OpenFace algorithm
various standard face recognition datasets, including Labeled Faces in the Wild (LFW)
and the YouTube Faces database. It then trained this framework using a loss function
known as the triplet loss.
The idea behind the triplet loss in this architecture is to force the framework to impose
a margin between the faces of different identities. The primary objective of this loss
function is to minimise the squared distance between the embeddings of two images
of the same identity, irrespective of imaging conditions and pose, while maximising
the squared distance between the embeddings of two images of different identities.
Apart from accessible facial recognition, OpenFace's modelling approach concentrates
on real-time face recognition on portable devices, allowing a high-precision
model to be trained with less data.
This study presents a multi-face recognition model using the OpenFace algo-
rithm based on deep learning networks. The transfer learning is used to modify
the OpenFace algorithm so that it can be used for multiple faces in a single
image/photograph.
32.2 Related Work
There are various techniques for face recognition, beginning with one of the most
widely used, Eigenfaces [1], up to recent approaches utilising deep learning, e.g.
DeepFace [2]. Principal component analysis (PCA) is used in Eigenfaces to reduce
the dimensionality of a series of facial photographs. This approach was developed
for a face categorisation challenge and is largely viewed as the father of facial
recognition technology [3]. In this procedure, each picture is converted into a vector
in which each value represents the importance of a single image pixel, and the
eigenvectors are created from the covariance matrix. Other techniques, such as
linear discriminant analysis (LDA) [4] and support vector machines (SVM) [5–8],
attempt to compensate for the inadequacies of these earlier approaches.
FaceNet: The training images are resized, converted and firmly trimmed around
the facial area. FaceNet's loss function is also an essential characteristic: it employs
the triplet loss function. FaceNet differs from other methodologies in that it learns
the mapping from images and generates embeddings directly, instead of relying on
any bottleneck layer for identification or verification tasks. Once the embeddings
are produced, all other tasks such as verification, identification and so on can be
conducted using conventional methods in that domain, with the newly produced
embeddings serving as the feature vector. For instance, we can utilise k-NN to recognise
faces using the embeddings as the feature vector, and we can group faces together with
any clustering method by defining a threshold value for verification.
Triplet Loss Function in FaceNet: The idea behind the triplet loss function is that the
anchor image (an image of a particular individual A) should be nearer to the positive
images (all of person A's images) than to the negative images (all the other images),
i.e. we require the distances between the anchor image's embedding and the embeddings
of our positive images to be smaller than the distances between the anchor
image's embedding and the embeddings of our negative images (Fig. 32.3).
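This condition can be written as the standard triplet loss, max(0, ‖a − p‖² − ‖a − n‖² + margin); a small numpy sketch with an illustrative margin of 0.2 (the margin value here is for demonstration, not quoted from the chapter):

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """max(0, ||a - p||^2 - ||a - n||^2 + margin): pull the positive
    embedding toward the anchor, push the negative at least `margin` away."""
    d_pos = np.sum((anchor - positive) ** 2)
    d_neg = np.sum((anchor - negative) ** 2)
    return max(0.0, d_pos - d_neg + margin)

a = np.array([0.0, 0.0])   # anchor embedding
p = np.array([0.1, 0.0])   # same identity: close to the anchor
n = np.array([1.0, 1.0])   # different identity: far from the anchor
print(triplet_loss(a, p, n))  # 0.0: the margin is already satisfied
```

When the loss is zero the triplet is "easy"; training focuses on triplets where the negative is still closer than the positive plus the margin.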
Fig. 32.3 Original image sample considered in this study
Fig. 32.4 Original image
sample
Face Detection: Our input images should have properties similar to those used to
train the CNN model to achieve the best results. As a result, the project's first step
is to extract all of the faces present and apply proper transformations so that the
model input meets the following requirements: the image should be 96 × 96 pixels
in size; only one face is permitted in the image; and the face must be aligned so that
the line connecting the two eyes is horizontal and 20 pixels from the top of the image.
OpenFace AlignDlib Utility: There are some pre-trained libraries available; in
this particular instance, we are using the AlignDlib functionality from the OpenFace
project, which detects and aligns the face using pre-trained landmarks. We could also
use OpenCV's Haar cascade frontal-face detector, which has already been trained.
This alternative, nevertheless, does not include the alignment task. Figures 32.4,
32.5, 32.6 and 32.7 depict different phases of a sample image processed using the
AlignDlib utility. The final detected and aligned face is shown in Fig. 32.7.
32.3 Proposed Methodology
The aim of this work is to develop a model that can detect and recognise all people
whose faces are present in a picture or real-time captured video; the logical flow
of OpenFace is shown in Fig. 32.11.
Data Preprocessing: As discussed earlier, we have a better chance of training the
model reliably if we have more data. On the other hand, data is expensive, so we
must make the most of what we have. By changing the original image’s orientation,
brightness and contrast, we will be able to produce more data for the database.
When it comes to the actual job, this will also help reduce the effect of the photo’s
32 Multiple Face Recognition System Using OpenFace 343
Fig. 32.5 Face detection
performed
Fig. 32.6 Detected face
from the image
orientation and lighting. Preprocessing the data at hand will bring better outcomes. The applied preprocessing techniques are shown in Figs. 32.8, 32.9 and 32.10.
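A minimal augmentation sketch of the kind described, changing orientation, brightness, and contrast, might look as follows; the rotation choices and intensity parameters are illustrative defaults, not values from the paper:

```python
import numpy as np

def augment(image, rotations=(0, 1, 2, 3), brightness=30, contrast=1.2):
    """Generate extra training samples from one HxWxC uint8 image by
    rotating it and re-scaling its intensity (illustrative parameters)."""
    out = []
    for k in rotations:                        # 90-degree orientation changes
        rotated = np.rot90(image, k)
        # brightness shift: add a constant, clip back to the valid range
        bright = np.clip(rotated.astype(np.int16) + brightness,
                         0, 255).astype(np.uint8)
        # contrast stretch: scale distances from the mid-grey level
        contr = np.clip((rotated.astype(np.float32) - 128) * contrast + 128,
                        0, 255).astype(np.uint8)
        out.extend([rotated, bright, contr])
    return out
```

Each input image thus yields twelve variants, tripling the three intensity versions across four orientations.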
OpenFace CNN for Multi-face Detection: OpenFace is a lightweight model. The OpenFace model expects 96 × 96 RGB images as input and produces a 128-dimensional output embedding. It is developed on the Inception ResNet v1 framework. The model is more complex but has fewer parameters compared to other models. Similar to FaceNet and
Fig. 32.7 Detected face
using AlignDilib
Fig. 32.8 Image with
changed orientation
Fig. 32.9 Changing contrast
of detected
Fig. 32.10 Changing
gamma of detected face
Fig. 32.11 Logical flow of OpenFace
the VGGNet algorithms, it applies one-shot learning. The model is based on the condition that different photos of the same person must lie at a small distance from one another, whereas photos of different people should lie at a larger distance. The distance considered can be the Euclidean distance (Fig. 32.11).
OpenFace detects the facial region in a picture using dlib [2] and generates a box
over each face that may be in various positions. This might be a problem if utilised
as input to the network right away, thus it has to be preprocessed. OpenFace uses 2D
affine transformation as a preprocessing approach, which resizes and crops photos to
the boundaries of the landmarks created by the dlib face detector, bringing the nose
and eye corners closer to their mean locations. The outcome of this transformation is a normalised 96 × 96 pixel picture. Rather than training the neural network to identify individuals one by one, the primary concept is to teach it to determine whether two images belong to the same individual.
The training stage will use three input images to do this: an anchor image of the test subject [9], a positive image of the same person, and a negative image of a different person. The network then uses a triplet-loss function to try to reduce the distance between the anchor and positive embeddings while maximising the distance between the anchor and negative embeddings. Although the model structure is complex, to use it we only need to know that the input dimension is 3 (channels) × 96 × 96 (pixels). After numerous convolutional layers, one of which is an inception network, it provides an output with a dimension of 1 × 1 × 128, which is the image embedding. If the model is correct, images of the same person should have a short embedding distance, whereas images of different people should have a large embedding distance.
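The triplet objective described above can be sketched numerically as follows; the margin value 0.2 follows the FaceNet paper and is an assumption here, not necessarily the OpenFace training setting:

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Triplet loss on 128-d embedding vectors.

    Encourages ||a - p||^2 + margin <= ||a - n||^2, i.e. the anchor sits
    closer to the positive than to the negative by at least the margin.
    """
    d_pos = np.sum((anchor - positive) ** 2)   # squared distance, same person
    d_neg = np.sum((anchor - negative) ** 2)   # squared distance, other person
    return max(d_pos - d_neg + margin, 0.0)    # zero once the gap is wide enough
```

When the negative is already farther than the positive by the margin, the loss is zero and the triplet contributes nothing to training.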
Extraction and Alignment Setup: In Figs. 32.12 and 32.13, the embedding distance from the input image to a positive image and to a negative image is the same, which is not expected. This happens because the model is contrasting the whole picture, i.e. the face together with the body and background. In order to compare only faces, the images must undergo extraction and alignment. Once this step is completed, we generate a database by storing the embedding values in a dictionary for the comparison task. Our proposed framework focuses on the above-discussed criteria, utilising them to train a multi-face detection model.
Multi-face Detection Model: Training is a vital task that feeds the input to the model. We train our model in such a way that it supports real-time
Fig. 32.12 Embedding distance between the input image and positive image
Fig. 32.13 Embedding distance between the input image and negative image
multi-face recognition. Real-time multi-face detection enables us to achieve a multi-face-recognising biometric attendance system. The following steps help us achieve a multiple face recognition model:
1. Detect, extract and align all faces present in the picture.
2. Pass all the output faces to the CNN, which returns the corresponding embedding values.
3. For each face, calculate its embedding distance to all the values stored in the database and return the identity of the person with the smallest distance.
4. Attach the label to the corresponding face in the original image.
The above steps give us a multi-face recognition system based on OpenFace. Since OpenFace is the core algorithm and we have only replicated its use across the multiple faces in a single image, the final accuracy and loss of the model are the same as those of the OpenFace model for single-face recognition.
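The identification step (step 3 above), matching each face embedding against the dictionary database by smallest Euclidean distance, can be sketched as below; the rejection threshold is a hypothetical tuning parameter, not a value from the paper:

```python
import numpy as np

def identify(face_embedding, database, threshold=1.0):
    """Return the identity whose stored embedding is closest to the query.

    `database` maps name -> 128-d embedding vector; `threshold` rejects
    unknown faces (an assumed cut-off, to be tuned on validation data).
    """
    best_name, best_dist = None, float("inf")
    for name, emb in database.items():
        dist = np.linalg.norm(face_embedding - emb)  # Euclidean embedding distance
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name if best_dist < threshold else None
```

Running this per detected face, and attaching the returned name to the face's bounding box, yields the multi-face attendance behaviour described above.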
32.4 Results
We can adapt our model to work on real-time live feeds by simply adding a further implementation step that collects each frame, preprocesses it, and passes it to the trained model as a stream of single images. Figure 32.14 shows the model recognising multiple faces in a group photo, demonstrating scalability and timely accuracy in differentiating between multiple faces in a single clip/image.
Fig. 32.14 Multiple face recognition from a group photo by the model
32.5 Conclusion
This paper has elaborated a use case of the OpenFace facial recognition package. It shows that the embedding distance can be calculated for multiple faces in a single image by replicating the OpenFace algorithm for every face in the image, leading to a multi-face recognition model for biometric attendance systems that saves attendees' time. The paper thus demonstrates a broader utilisation and performance of the OpenFace algorithm for multi-face detection in a single image.
References
1. Santoso, K., & Kusuma, G. P. (2018). Face recognition using modified OpenFace. In 3rd International Conference on Computer Science and Computational Intelligence.
2. Zulfiqar, M., Syed, F., Khan, M. J., & Khurshid, K. (2019). Deep face recognition for biometric
authentication. In 2019 International Conference on Electrical, Communication, and Computer
Engineering (ICECCE).
3. Cao, Q., Shen, L., Xie, W., Parkhi, O. M., & Zisserman, A. (2017). VGGFace2: A dataset for
recognising faces across pose and age. arXiv preprint arXiv:1710.08092
4. Xu, M., Cheng, W., Zhao, Q., Ma, L., & Xu, F. (2015). Facial expression recognition based on transfer learning from deep convolutional networks (pp. 702–708). IEEE.
5. Schroff, F., Kalenichenko, D., & Philbin, J. (2015). Facenet: A unified embedding for face
recognition and clustering. In Proceedings of the IEEE Conference on Computer Vision and
Pattern Recognition (pp. 815–823).
6. Janakiramaiah, B., Kalyani, G., Karuna, A., et al. (2021). Military object detection in defense using multi-level capsule networks. Soft Computing. https://doi.org/10.1007/s00500-021-05912-0
7. Janakiramaiah, B., Kalyani, G., & Jayalakshmi, A. (2021). Automatic alert generation in a
surveillance systems for smart city environment using deep learning algorithm. Evolutionary
Intelligence, 14, 635–642. https://doi.org/10.1007/s12065-020-00353-4
8. Rao, B. N. K., & Rao, B. B. K. (2019). Defect detection in printed board circuit using image
processing. International Journal of Innovative Technology and Exploring Engineering, 9(2).
9. Wolf, L., Hassner, T., & Maoz, I. (2011). Face recognition in unconstrained videos with matched
background similarity. In 2011 IEEE Conference on Computer Vision and Pattern Recognition
(CVPR) (pp. 529–534). IEEE.
Chapter 33
EDAARP-Efficient and Data-Aggregative
Authentic Routing Protocol for Wireless
Sensor Networks
Kurakula Arun Kumar and Karthikeyan Jayaraman
Abstract Wireless sensor networks (WSN) are quickly gaining a lot of research
attention, and several communications are required for data sensing in WSN. The
sensor nodes collect data from multiple locations and relay it to the central control
unit. However, few of the nodes have insufficient resources. Despite the fact that
existing routing algorithms support cluster-based routing, these techniques do not
consider all of the resource constraints associated with the nodes. The goal is to
discuss the design and implementation of a cluster-based network strategy that uses
Presumptive Data Gathering (PDG) and Selective Information Standards (SIS) algorithms. The strategy aims to improve energy efficiency by creating a fully connected dedicated node that connects all sub-clusters via the best processes of its associated controlled set; where nodes fall out of range, relay nodes are used. The Efficient Data-Aggregative Authentic Routing Protocol (EDAARP) is proposed in this work to preserve energy and ensure consistent transmission of sensed data.
33.1 Introduction
In WSN, the energy that is being used by the nodes is predominantly related to the
amount of data that is being transmitted among the nodes. The maximum amount
of energy provided for the communication is consumed for data transmission [1]. In
WSN, data transmission possesses a major significance as they are networks that are
data-dependent. In WSN, through multiple hops or single-hop transmission proto-
cols, the data is routed to the sink nodes. During this state, routing has a crucial role
in the process of data accumulation. In WSNs, the algorithm designed for routing
offers higher importance toward the energy utilization of each of the sensor nodes.
K. Arun Kumar (B)·K. Jayaraman
SITE, VIT, Vellore, Tamil Nadu, India
e-mail: arun.aitsr@gmail.com
K. Jayaraman
e-mail: karthikeyan.jk@vit.ac.in
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
B. N. K. Rao et al. (eds.), Intelligent Computing and Applications, Smart Innovation,
Systems and Technologies 315, https://doi.org/10.1007/978-981-19-4162-7_33
352 K. Arun Kumar and K. Jayaraman
From [1], it can be determined that the flat network protocol [2] lessens the network throughput while increasing the network lifetime. In hierarchical protocols, the system's computational intricacy rises to a higher level. This can be mitigated by introducing geographical-information-based protocols, which need a further setup such as Global Positioning System (GPS) devices mounted on the sensor nodes.
Assuring the data delivery during the failure of the nodes and interference in the
communication is the major challenge associated with routing in sensor networks. In
WSN, the most commonly used design principles are clustering and data aggregation
[3]. Clustering in WSN is very common and the main aim of clustering is to lessen
the number of inter-cluster links and even minimize the network traffic. Further,
clustering in WSN helps in attaining network stability. Unlike the other traditional
protocols, the data aggregation protocols [1,3,4] concentrated on the communication
overhead and reacted to the network's response. Better data-gathering networks are attained with data aggregation protocols because their receiver nodes wait to receive data from the nearest node instead of forwarding it to other nodes as soon as it arrives. The data aggregation method eliminates redundant data from various sources. With this, the bandwidth and power consumption
are preserved. In [5, 6], the main innovation of the PLAPS protocols is the establishment of node-to-node connectivity, which enables successful data transmission. Utilizing the corresponding functional reputations, the required data is added in PLAPS, and credible data aggregation processes are applied as well. The sender node directs the packet toward its dominator node to
initiate the communication among the nodes that belong to various clusters. Once
the receiver node is discovered, the data is forwarded to its subsequent nodes by
the dominator node. Upon receiving the data, the packets are propagated by the
terminal nodes to neighboring cluster nodes within their range. Relay nodes operate
as boosters when surrounding cluster nodes are out of range of the terminal node.
With this, the neighboring cluster nodes are taken into the range of transmission.
The major assumptions in the projected model are as follows:
- Sensor nodes are distributed unevenly in predefined locations.
- The base station (BS) is located outside the network.
- Static clusters can be seen in the sensor field.
- Each sensor node and cluster head has its own identification.
- After deployment, sensors cannot be moved.
- All the nodes possess equal communication and computation capabilities; therefore, the network is homogeneous.
- Every sensor node can work as an information-serving node.
- All nodes have the capacity to become cluster heads.
When the receiver is not in range and a data packet is received, the subsequent routing of data packets from one node to another is shown in Fig. 33.1, and Fig. 33.2 shows how this procedure is utilized to avoid overuse of smart nodes for communication and to ensure network reliability.
33 EDAARP-Efficient and Data-Aggregative Authentic Routing 353
Fig. 33.1 Relay node
communication
Fig. 33.2 Maximum
number of independent sets
33.2 Efficient Data-Aggregative Authentic Routing
Protocol (EDAARP)
Clustering and aggregation techniques can be merged by using the suggested algorithm, which allows the creation of an ideal routing hierarchy with the maximum number of constructive paths (Misra and Thomasinous [3]) that associates all sensor nodes with the base station (BS) while maximizing the use of WSN resources. In wireless sensor networks, the suggested protocol is known as the Efficient Data-Aggregative Authentic Routing Protocol (EDAARP). A simple framework of relay nodes and healing is represented in Figs. 33.3, 33.4 and 33.5.
Fig. 33.3 Relay node for
MCDS
Fig. 33.4 Refined PLAPS
Fig. 33.5 Healing
The purpose of this scheme’s design is to provide continuous clustering with
precise data aggregation and suitable additional communication. This ensures that
information is kept private [7,8]. The EDAARP algorithm is divided into four distinct
parts or modules. In module 1, the deployed nodes use their energy levels and even
their neighborhood information to form diverse clusters. In module 1, for effective
data forwarding, a cluster structure is formed and this structure is formed using the
SIS algorithm. The main focus of module 2 is on data management [9]. However,
PDG scheme (Presumptive Data Gathering) [8,9] provides accurate data aggregated
results. This helps in maximizing throughput and reducing the consumption of energy.
However, module 3 concentrates on network connectivity. PLAPS [5] not only offers
the utmost network connectivity for efficient data routing, but it also ensures the
saving of the data packets. At last, module 4 helps in setting a reliable route among the
sensor nodes, the base station (BS) and the sink node. This ensures improved security features that involve public-key cryptosystems [2]. For better resource management in WSN, SIS is used, which is an efficient clustering approach. In SIS, every node associates with its adjacent node to form a network. The dominant node and the member nodes maintain the SIS and sub-clusters, respectively. Utilizing the SIS scheme helps in organizing the networks and ensuring communication among the nodes. Thus, clusters are established successfully and cluster node management is also ensured. Algorithm 2 describes the steps involved in the EDAARP path discovery method.
The main aim of the PDG protocol [4, 5] is to provide an effortless approach to routing as well as management [10, 11]. This further helps in reducing the routing table size [11]. In this protocol, very little bandwidth is utilized, as the mutual transfer of messages among the nodes is minimal. The sensor topology is considered to have low maintenance and ease of use. The credibility value of each sensor node is considered in the PDG protocol during aggregation [9] for identifying data redundancy. This helps in increasing the aggregation reliability [10–12]. Meaningful data is provided to the near end by the supporting nodes. Network integrity is an important feature of this protocol, and it is ensured by sending random information to the sink nodes or the nearest end nodes instead of the actual value. Within a cluster, a compromised or supporting node is identified using the presumptive data aggregation process: the values of non-supporting nodes deviate more than those of the supporting nodes. The presumptive value is derived by multiplying the aging factor by the old reliability value, together with the counts of supporting and non-supporting nodes. With the PDG implementation, the compromised nodes in the network can be easily detected, and the accuracy of data aggregation [8, 9] can be attained. Algorithms 1 and 2 present the steps involved in PDG and in forwarding the data packets of EDAARP, respectively.
Algorithm 1. Steps for Presumptive Data Gathering
Input: Data Gathering (Xi), presumptive value of sensor node (Vi),
Output: Aggregated Data
Step 1: The aggregator requests each member node Ni for its credibility value Vi.
Step 2: Each sensor node Ni reports its presumptive value Vi to the aggregator.
Step 3: The aggregator uses the received presumptive values to determine which nodes are supporting and which are not.
Step 4: The aggregator collects data from the supporting nodes and generates Dagg.
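A toy sketch of Algorithm 1 under stated assumptions: the credibility cut-off and the use of a plain mean for Dagg are illustrative choices made here, since the text does not fix either:

```python
def presumptive_aggregate(readings, credibility, min_credibility=0.5):
    """Sketch of Presumptive Data Gathering at the aggregator.

    `readings` maps node id -> sensed value; `credibility` maps node id -> its
    presumptive value V_i. Nodes below `min_credibility` (an assumed cut-off)
    are treated as non-supporting and excluded from the aggregate D_agg.
    """
    supporting = [n for n in readings
                  if credibility.get(n, 0.0) >= min_credibility]
    if not supporting:
        return None
    # D_agg: plain average over supporting nodes (illustrative choice)
    return sum(readings[n] for n in supporting) / len(supporting)
```

A node reporting a strongly deviating value would, over time, see its credibility drop below the cut-off and stop influencing Dagg, matching the compromised-node detection described above.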
Algorithm 2: EDAARP: A Step-By-Step Procedure
Input: Sensor nodes, neighbor-information-based clustering, relay node, dominator node, connector node, and each node's actions.
Output: Data is delivered to the base station after it has been processed.
Begin
  List the packets based on their length and type.
  if the packet is in good condition
    SIS is used to find a neighbor node.
    if the packet arrived at the aggregator node via a neighbor node
      the neighbor node performs the aggregation.
    else
      using multi-hop communication, try to reach the aggregator node via multiple neighbor nodes.
    end
    if a packet is relayed from one aggregator node to another aggregator node
      connect to the other aggregator node via a relay node.
    end
    if the connectivity is at its finest
      use the processor and aggregator nodes to send data to the base station.
    else
      establish a connection to forward data using the relay node, connector node, and dominator node.
    end
    Path is generated; packet sent.
  end
End
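The branching in Algorithm 2 can be mirrored by a small decision function; the boolean inputs and route labels below are illustrative abstractions of the network-state checks, not part of the protocol specification:

```python
def forward_packet(packet_ok, neighbor_is_aggregator, connectivity_fine):
    """Toy mirror of the EDAARP forwarding decisions in Algorithm 2.

    Returns the route chosen for a packet; each boolean abstracts one of the
    checks the pseudocode performs on the current network state.
    """
    if not packet_ok:
        return "drop"                      # malformed packets go no further
    if neighbor_is_aggregator:
        route = "direct-to-aggregator"     # neighbor performs the aggregation
    else:
        route = "multi-hop-to-aggregator"  # reach it via multiple neighbors
    if connectivity_fine:
        return route + "/processor-aggregator-to-BS"
    # degraded connectivity: fall back to relay, connector and dominator nodes
    return route + "/relay-connector-dominator-to-BS"
```

The fallback branch corresponds to the relay-node "healing" of Figs. 33.3-33.5, which restores a path when direct connectivity to the base station is unavailable.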
33.3 Simulation Study and Performance Analysis
The NS2 simulator [13] is used to simulate EDAARP. This analysis helps in evaluating the performance of the EDAARP approach; various aspects of EDAARP are compared with InFRA, DRINA, and SPT [4, 10–12]. The analysis also measured the packet delivery ratio, lifetime, and throughput. The simulations are carried out using various node topologies, and the results observed in this research are presented in Figs. 33.6, 33.7, 33.8 and 33.9.
33.4 Conclusion
WSNs operate in a difficult and sensitive environment. As a result, they are frequently prone to network connection damage: node disconnectivity in a certain region splits the network into separate segments. In this research, the relay
Fig. 33.6 Lifetime versus
network size
Fig. 33.7 Packets processed
versus loss
Fig. 33.8 Throughput
versus network size
Fig. 33.9 Packet delivery
versus loss probability
nodes and the MCDS ensured optimum network connectivity. EDAARP was also proposed and implemented in the NS2 simulator, where it effectively established a route between the source node and the BS. With the PDG methodology, an energy-efficient architecture is designed; utilizing the PLAPS mechanism and the relay nodes, maximum network connectivity is established. As part of the performance analysis, assessment factors such as the packet delivery ratio, throughput, and lifetime are studied, and these EDAARP factors are compared and contrasted with those of DRINA, InFRA, and SPT.
References
1. Ozdemir, S. (2007). Secure and reliable data aggregation for wireless sensor networks. In H.
Ichikawa et al. (Eds.), LNCS (Vol. 4836, pp. 102–109).
2. Ozdemir, S. (2008). Secure data aggregation in wireless sensor networks via homomorphic encryption. Journal of the Faculty of Engineering and Architecture of Gazi University, 23(2), 365–373. ISSN: 1304-4915.
3. Misra, S., & Thomasinous, P. D. (2010). A simple, least-time, and energy-efficient routing
protocol with one-level data aggregation for wireless sensor networks. The Journal of Systems
and Software, 83, 852–860.
4. Villas, L. A., Boukerche, A., Ramos, H. S., de Oliveira, H. A. B. F., de Araujo, R. B., &
Loureiro, A. A. F. (2013). DRINA: A lightweight and reliable routing approach for in-network
aggregation in wireless sensor networks. IEEE Transactions on Computers, 62(4).
5. Thai, M. T., Wang, F., Zhu, S., & Zhu, S. (2007). Connected dominating sets in wireless
networks with different transmission ranges. IEEE Transactions on Mobile Computing, 6(7).
6. Rai, M., Verma, S., & Tapaswi, S. (2009). A power-aware minimum connected dominating set
for wireless sensor networks. Journal of Networks, 4(6).
7. Ji, S., He, J. S., Pana, Y., & Li, Y. (2013). Continuous data aggregation capacity in probabilistic
wireless sensor networks. Journal of Parallel and Distributed Computing, 73, 729–745.
8. Mantri, D. S., Prasad, N. R., & Prasad, R. (2014). Bandwidth efficient cluster-based data
aggregation for wireless sensor network. Computers and Electrical Engineering.
9. Rout, R. R., & Ghosh, S. K. (2014). Adaptive data aggregation and energy efficiency using
network coding in a clustered wireless sensor network: An analytical approach. Computer
Communications, 40, 65–75.
10. Amgoth, T., & Jana, P. K. (2014). Energy-aware routing algorithm for wireless sensor networks.
Computers and Electrical Engineering.
11. Zin, S. M., Anuar, N. B., Kiah, M. L. M., & Pathan, A.-S. K. (2014). Routing protocol design for secure WSN: Review and open research issues. Journal of Network and Computer Applications, 41, 517–530.
12. Wu, X., Yan, X., Huang, W., Shen, H., & Li, M. (2013). An efficient compressive data gath-
ering routing scheme for large-scale wireless sensor networks. Computers and Electrical
Engineering, 39, 1935–1946.
13. The NS-2 simulator. http://www.isi.edu/nsman/ns2
Chapter 34
Mobile-Based Selfie Sign Language
Recognition System (SSLRS) Using
Statistical Features and ANN Classifier
G. Anantha Rao, K. Syamala, and T. V. S. Divakar
Abstract This work brings a mobile-based sign language recognition system into real time. Selfie sign videos are captured with the smartphone front camera. Morphological gradients along with Sobel edge operators are used to extract the hand contour from each sign video frame. The discrete cosine transform (DCT) of the hand contour is optimized by principal component analysis (PCA) to reduce the execution time. Four statistical features, namely the mean, skewness, standard deviation, and kurtosis, are calculated for the optimized hand contour DCT. The feature vector formed with these four statistical features is used for sign classification using an artificial neural network (ANN) classifier. The performance of the SSLRS is evaluated with the word matching score (WMS).
34.1 Introduction
International health organizations estimate that 5% of the total world population is hearing impaired. The hearing impaired cannot communicate with others using acoustic words; instead, they can use sign language. Signs are formed with hand movements and facial expressions and are either static or dynamic: a static sign is performed with a single movement of the hand, whereas a dynamic sign is performed with more than one movement of the hands [1].
In our previous papers, we proposed an SSLRS with the hand contour DCT optimized by PCA as the feature vector and different sign classifiers such as distance classifiers, the Adaboost classifier, and ANN. In this paper, the SSLRS is proposed with statistical features and an ANN classifier. The hand contours of the signer in each frame of the sign videos are obtained and represented with an energy-compact representation
G. A. Rao (B) · T. V. S. Divakar
Department of ECE, GMRIT, Rajam, India
e-mail: anantharao.g@gmrit.edu.in
K. Syamala
Department of ECE, Avanthi Institute of Engineering and Technology, Cherukupally, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
B. N. K. Rao et al. (eds.), Intelligent Computing and Applications, Smart Innovation,
Systems and Technologies 315, https://doi.org/10.1007/978-981-19-4162-7_34
362 G. A. Rao et al.
using DCT. The hand contour DCT is treated with PCA, and four statistical features of the PCA-treated DCT are generated. The statistical feature vector (4 × 1) is used for sign classification using ANN.
This paper presents the latest literature relevant to SLR in Sect. 34.2, the mathematical models to extract the feature vector and sign classification with ANN in Sect. 34.3, the results and performance of the proposed SSLRS in Sect. 34.4, and the conclusion in Sect. 34.5.
34.2 Literature Review
Video segmentation methods using wavelets are proposed to detect hand and head
shape and positions [2]. Tanibata et al. [3] proposed gesture features with orienta-
tion, area, flatness of hand portion. Parul et al. [4] presented features with height,
centroid of hand portion, and distance of centroid from origin of the frame. Rao
et al. proposed SSLR with compact energy features and ANN classifiers [5] and also tested the performance of SSLR with shape energy features [6]. Better classification rates were achieved with compact energy features for Indian sign language
with linear discriminant analysis (LDA) for American sign language. The optimized
features using LDA were used for sign classification using KNN and SVM. The static
signs and finger spells were categorized as manual, while face expressions were classified as the non-manual category [8]. Lee et al. [9] proposed techniques for capturing the hand features using Kinect device sensors. SVM is used for training and classification of signs based on hand direction, position, and shape. The achieved results
were reasonably good in statistics. Holden et al. [10] presented hidden Markov model
(HMM) and SIFT features along with signs speed. Words and sentences were clas-
sified with accuracy 99% and 97%, respectively. Body and the hand position were
used for language recognition with an independent signer [11]. The smallest successive frames with no overlap are chosen to avoid overlapping of signs. HMM is used for sign classification. The performance of recurrent neural networks (partially and
fully connected) was evaluated for Arabic sign language recognition. Fully connected
networks showed better accuracy [12]. Discrete wavelet transform (DWT)-based features were extracted and fed to an ANN for sign classification; the authors used a database of 32 signs with 640 images. Khan et al. [13] proposed DWT features and a backpropagation ANN classifier. Global features containing region and boundary
information were used for sign recognition [14]. The seven Hu moments are used
for region information, while Fourier descriptors are used for boundary informa-
tion. Classification is done with SVM sign classifier. Kausar et al. [15] proposed
fuzzy-based sign recognition system for which the joint positions and finger tips are
extracted through the color gloves.
34 Mobile-Based Selfie Sign Language Recognition System 363
34.3 Proposed SSLR
The signs performed with one hand are captured using smart phone front camera by
holding the smart phone with selfie stick on the other hand. A sign video database
of 18 Indian words by 10 different signers is created and maintained for SSLRS.
34.3.1 Feature Extraction
Figure 34.1 elaborates the flowchart of the SSLR system. The capture noise is removed using zero-mean Gaussian filters with probability density function

$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x-m)^2}{2\sigma^2}}$$

where $m$ is the mean and $\sigma$ is the standard deviation. Gaussian filters with different $\sigma$ values (0.01, 0.1, and 0.15) are used for noise removal.
The information regarding the data change in every frame, in the direction of maximum change, is obtained by applying the Sobel edge operator (a 2D gradient). For every frame, the gradients $g_x$ and $g_y$ are calculated in the $x$ and $y$ directions:

$$g_x = \sum_{m=1}^{N} f(x_m, y) * g(k), \qquad g_y = \sum_{m=1}^{N} f(x_m, y) * g^{T}(k)$$

where the gradient operator $g$ is given by $[+1, -1]$.
Fig. 34.1 SSLR flowchart
The block thresholding is used to generate the binary image, for which the 2D Sobel convolution masks are used. The block-thresholded binary image is given by

$$B_X = \frac{\sum_{x=1}^{M}\left[(S_x * f_x)^2 + (S_y * f_y)^2\right]}{\sum_{i=1}^{s}\sum_{x=1}^{M}\left[(S_x * f_x)^2 + (S_y * f_y)^2\right]}$$

In the above equation, $S_x$ and $S_y$ are the convolution masks given by

$$S_x = \begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{bmatrix}, \qquad S_y = S_x^{T}$$

and $s$ is the block size.
Background variations are reduced automatically with the block variational thresholding. Figures 34.2 and 34.3 show the block-threshold and global-threshold (with threshold 0.2) binary images.
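A sketch of the gradient computation with the Sobel masks above (using SciPy's convolution; border handling is an illustrative choice):

```python
import numpy as np
from scipy.ndimage import convolve

def sobel_magnitude(frame):
    """Per-pixel gradient magnitude of a 2-D frame using the Sobel masks."""
    sx = np.array([[-1, -2, -1],
                   [ 0,  0,  0],
                   [ 1,  2,  1]], dtype=float)   # responds to horizontal edges
    sy = sx.T                                    # responds to vertical edges
    gx = convolve(frame.astype(float), sx, mode="nearest")
    gy = convolve(frame.astype(float), sy, mode="nearest")
    return np.sqrt(gx ** 2 + gy ** 2)            # combined edge strength
```

Thresholding this magnitude image per block, rather than with one global value, yields the block-threshold binary image that is robust to background variation.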
The hand and head contours are generated from the binary image by morphological gradients with connected component analysis, as given below:

$$h_C(x) = \{ z \mid (M_{3H})_z \cap B_X \neq \varnothing \} \setminus \{ z \mid (M_{3H})_z \subseteq B_X \}$$
$$h_C(y) = \{ z \mid (M_{3V})_z \cap B_X \neq \varnothing \} \setminus \{ z \mid (M_{3V})_z \subseteq B_X \}$$
$$h_C(x, y) = h_C(x) \cup h_C(y)$$
Fig. 34.2 Block threshold
binary image
Fig. 34.3 Global threshold
binary image
where $M_{3V}$ and $M_{3H}$ are the vertical and horizontal line masks. The hand and head contours $h_C(x, y)_{hand}$ and $h_C(x, y)_{head}$ are separated using a four-neighborhood pixel operation, as shown in Figs. 34.4, 34.5, 34.6, 34.7 and 34.8.
The DCT of the hand contour $h_C(x, y)_{hand}$ is given by

$$F_{uv} = \frac{1}{4}\, C_u C_v \sum_{x=1}^{M} \sum_{y=1}^{M} h_C(x, y)_{hand} \cos\!\left(\frac{u\pi(2x+1)}{2L}\right) \cos\!\left(\frac{v\pi(2y+1)}{2L}\right)$$

where $C_u = C_v = \frac{1}{\sqrt{2}}$ for $(u, v) = 0$, and $1$ elsewhere.
Figure 34.9 shows the color-coded DCT of the hand contour. The maximum energy of a video frame is concentrated in the first 50 × 50 matrix because of the hand movement. In the proposed work, the hand portion is segmented, the contour of the hand shape is obtained, and its DCT is calculated.
Fig. 34.4 Frame no. 74
Fig. 34.5 Binary image
Fig. 34.6 Head and hand
contours
Fig. 34.7 Hand contour
Fig. 34.8 Head contour
The first 50 × 50 samples are considered for sign classification. The execution time for these 2500 samples is high; to minimize it, the 50 × 50 matrix is treated with PCA to obtain a 50 × 1 vector [16]. The four statistical features, mean, skewness, standard deviation, and kurtosis, are calculated for the 50 × 1 vector, generating a 4 × 1 feature vector which is fed to the ANN for sign classification.
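The DCT-then-reduce pipeline can be sketched as below. The paper does not specify how PCA produces the 50 × 1 vector, so projecting the 50 × 50 block onto its first principal component is an assumption made here for illustration:

```python
import numpy as np
from scipy.fft import dct

def contour_features(contour_img, keep=50):
    """2-D DCT of a hand-contour image, cropped to the energy-compact
    top-left keep x keep block, then reduced to a keep x 1 vector."""
    c = contour_img.astype(float)
    # separable 2-D DCT-II with orthonormal scaling
    coeffs = dct(dct(c, axis=0, norm="ortho"), axis=1, norm="ortho")
    block = coeffs[:keep, :keep]                  # low-frequency corner
    # PCA stand-in: project rows onto the dominant principal direction
    centered = block - block.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return block @ vt[0]                          # keep x 1 reduced vector
```

The resulting 50 × 1 vector is what the four statistical features below are computed from.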
Fig. 34.9 Color-coded hand contour energy (DCT) of different frames
34.3.1.1 Mean
The mean of the 50 × 1 vector is calculated as

$$\mu = \frac{1}{50}\sum_{k=1}^{50} v_k$$

where $v_k$ is the $k$th sample in the 50 × 1 vector.
34.3.1.2 Standard Deviation
The standard deviation of the 50 × 1 vector is calculated as

$$\sigma = \sqrt{\frac{1}{50}\sum_{k=1}^{50} (v_k - \mu)^2}$$

where $v_k$ is the $k$th sample in the 50 × 1 vector and $\mu$ is its mean.
34.3.1.3 Skewness
The skewness of the 50 × 1 vector is calculated as

$$S = \frac{1}{50}\sum_{k=1}^{50} (v_k - \mu)^3.$$
34.3.1.4 Kurtosis
The kurtosis of the 50 ×1 vector is calculated as given in the following
K=50
k=1(vkμ)4
50 .
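Taken together, the four statistics of Sects. 34.3.1.1–34.3.1.4 map the 50 × 1 vector to the 4 × 1 feature vector. A direct NumPy sketch (note that skewness and kurtosis follow the unnormalized central-moment forms given above, without division by powers of σ; the sample vector is illustrative):

```python
import numpy as np

def statistical_features(v):
    """4 x 1 feature vector [mean, std, skewness, kurtosis] of a 1-D vector,
    using the central-moment forms of Sects. 34.3.1.1-34.3.1.4."""
    n = v.size
    mu = v.sum() / n
    sigma = np.sqrt(((v - mu) ** 2).sum() / n)
    skew = ((v - mu) ** 3).sum() / n        # third central moment
    kurt = ((v - mu) ** 4).sum() / n        # fourth central moment
    return np.array([mu, sigma, skew, kurt])

features = statistical_features(np.array([1.0, 2.0, 3.0, 4.0]))
```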
34.3.2 Classification
In the proposed SSLR system, the single hidden layer and multi-hidden layer ANNs shown in Figs. 34.10 and 34.11 are used for sign classification. The extracted 4 × 1 feature vector for every frame of the sign video is applied as input to train and test the ANN with the activation function s(net) = 1/(1 + e^(−net)), i.e., the sigmoid. The performance of the SSLR system is verified with the sign recognition rates.
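A minimal NumPy sketch of the single-hidden-layer classifier of Fig. 34.10 with the sigmoid activation: the 78-neuron hidden layer and 18 output classes follow the chapter, while the random weights stand in for trained ones.

```python
import numpy as np

def sigmoid(net):
    """s(net) = 1 / (1 + e^(-net))"""
    return 1.0 / (1.0 + np.exp(-net))

def forward(x, W1, b1, W2, b2):
    """Single-hidden-layer forward pass as in Fig. 34.10."""
    h = sigmoid(W1 @ x + b1)       # hidden layer (78 neurons)
    return sigmoid(W2 @ h + b2)    # one sigmoid score per sign class

rng = np.random.default_rng(1)
x = rng.normal(size=4)                         # the 4 x 1 feature vector
W1, b1 = 0.1 * rng.normal(size=(78, 4)), np.zeros(78)
W2, b2 = 0.1 * rng.normal(size=(18, 78)), np.zeros(18)
scores = forward(x, W1, b1, W2, b2)            # 18 sign classes
```

The multi-hidden-layer variant of Fig. 34.11 simply chains additional sigmoid layers between `h` and the output.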
Fig. 34.10 Single hidden
layer ANN model for sign
classification
Fig. 34.11 Multiple hidden layer ANN model for sign classification
The signs performed with one hand are captured using the smartphone front camera by holding the smartphone with a selfie stick in the other hand. A sign video database of 18 Indian words by 10 different signers is created and maintained for the SSLRS.
34.4 Result Analysis
The database of selfie videos is created for “hai, good, morning, I, am, D, H, R, U, V,
A, nice, to, see, you, thank, you, bye." These eighteen sign words are kept in sequence for training and in a different order for testing. Figure 34.12 shows the sample frames, Fig. 34.13 shows the hand and head portions, and Fig. 34.14 shows the hand and head contours. The segmented hand portion and its contour are shown in Figs. 34.15 and 34.16.
In the next section, the performance of SSLR system is evaluated with WMS.
Fig. 34.12 RGB frames
Fig. 34.13 Hand and head portions
Fig. 34.14 Hand, head contours
Fig. 34.15 Hand segment
Fig. 34.16 Hand contour
34.4.1 Performance Evaluation: Word Matching Score
(WMS)
The DCT of the hand contour is generated, in which the first 50 ×50 samples
are considered for sign classification. These 2500 samples are treated with PCA to
generate a 50 ×1 sample vector. The statistical parameters mean, skewness, standard
deviation, and kurtosis are calculated to extract the feature vector of size 4 ×1. The
mean, skewness, standard deviation, and kurtosis of the 50 ×1 vector for the 60th
and 90th frames of 3rd sign (morning) are shown in Table 34.1.
The word matching score is defined as the ratio of the number of correct classifications to the total number of classifications:

%M = (Number of correct classifications / Total number of classifications) × 100.
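The score is straightforward to compute; a minimal sketch with illustrative labels:

```python
def word_matching_score(predicted, actual):
    """WMS as a percentage of correctly classified signs."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * correct / len(actual)

wms = word_matching_score([1, 2, 3, 3], [1, 2, 3, 4])   # 3 of 4 correct
```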
Table 34.1 Statistical features of 60th and 90th frames of 3rd sign (morning)

60th frame
50 × 1 vector: [0.6135, 0.6135, 1.1269, 1.1269, 1.1114, 1.1146, 0.3621, 0.2183, 1.4986, 1.5882, 2.0728, 1.7498, 1.8040, 1.4834, 2.1938, 1.0954, 0.3230, 2.0883, 0.6633, 1.8864, 0.4939, 0.9714, 1.5619, 1.2418, 1.515, 1.7857, 0.5680, 0.9373, 1.5394, 1.4593, 1.9025, 0.1703, 0.8057, 0.5604, 1.2177, 0.7542, 0.2496, 1.4241, 1.6085, 0.1429, 1.8646, 1.6085, 0.9456, 1.0082, 0.1630, 0.7175, 1.3761, 1.7430, 0.2666, 0.2375]
Mean: 0.1195; Standard deviation: 1.2664; Skewness: 0.0553; Kurtosis: 1.7005

90th frame
50 × 1 vector: [1.8493, 1.8493, 1.3810, 1.3810, 1.2292, 1.3162, 1.7321, 2.5969, 2.6743, 2.1905, 2.1356, 2.8661, 2.4568, 0.8828, 2.1926, 1.5707, 0.7369, 0.9115, 1.2881, 0.03119, 0.9812, 2.64401, 0.08503, 0.8926, 0.8662, 0.5935, 1.8733, 1.2805, 2.0614, 1.7324, 2.4475, 2.1250, 0.5854, 1.0626, 0.9244, 1.8336, 0.0101, 0.4198, 0.5409, 1.1290, 1.1319, 0.2586, 0.1026, 1.8913, 1.2702, 0.6647, 0.5898, 1.1216, 0.4245, 0.7854]
Mean: 0.2043; Standard deviation: 1.5186; Skewness: 0.1510; Kurtosis: 1.9557
Table 34.2 WMS with single hidden layer (78 neurons) ANN trained with 10 sets each with 18 signs and tested with 6 sets each with 18 signs

Training: 10 sets, each with 18 signs. Testing: 6 sets, each with 18 signs. Network architecture and output confusion matrix: given as figures. WMS: 85.5%.
18 signs of 10 different signers (180 signs) are used to evaluate the performance of the SSLR system. The average WMS achieved is 85.5% when a single hidden layer ANN is trained with 10 sets, each with 18 signs, and tested with 6 sets, each with 18 signs, as shown in Table 34.2. The average WMS achieved is 91% when a multiple (4) hidden layer ANN is trained with 10 sets, each with 18 signs, and tested with 6 sets, each with 18 signs, as shown in Table 34.3. The average WMS achieved is 86.4% when a single hidden layer ANN is trained with 15 sets, each with 18 signs, and tested with 10 sets, each with 18 signs, as shown in Table 34.4. The average WMS achieved is 96.9% when a multiple (4) hidden layer ANN is trained with 15 sets, each with 18 signs, and tested with 10 sets, each with 18 signs, as shown in Table 34.5. It is observed that the recognition time with a single hidden layer ANN having 78 neurons is 0.272 s for a total of 72 epochs, and with a multi-hidden (4) layer ANN having 78 neurons per layer it is 0.872 s for a total of 42 epochs. It is also observed that the recognition time with a single hidden layer ANN having 150 neurons is 0.389 s for a total of 61 epochs, and with a multi-hidden (4) layer ANN having 150 neurons per layer it is 1.543 s for a total of 28 epochs.
34.5 Conclusion and Future Work
A video database of 18 Indian signs for 10 different signers is created to simulate and test the performance of the SLR system using a mobile phone. The hand contour of the signer is extracted from every frame of the sign video. The DCT of the hand contour is calculated and treated with PCA to obtain the optimized feature vector. The statistical features are calculated for the PCA-treated feature vector, which is input to the ANN for sign classification. The performance of the SSLR system is evaluated with WMS. The word matching score with a single hidden layer (78 neurons) is about 85%, which is improved with the multiple hidden
Table 34.3 WMS with multiple (4) hidden layer (each with 78 neurons) ANN trained with 10 sets each with 18 signs and tested with 6 sets each with 18 signs

Training: 10 sets, each with 18 signs. Testing: 6 sets, each with 18 signs. Network architecture and output confusion matrix: given as figures. WMS: 91%.
Table 34.4 WMS with single hidden layer (150 neurons) ANN trained with 15 sets each with 18 signs and tested with 10 sets each with 18 signs

Training: 15 sets, each with 18 signs. Testing: 10 sets, each with 18 signs. Network architecture and output confusion matrix: given as figures. WMS: 86.4%.
Table 34.5 WMS with multiple (4) hidden layer (each with 78 neurons) ANN trained with 15 sets each with 18 signs and tested with 10 sets each with 18 signs

Training: 15 sets, each with 18 signs. Testing: 10 sets, each with 18 signs. Network architecture and output confusion matrix: given as figures. WMS: 96.9%.
layers or with an increased number of neurons. The feature models and sign classifiers need to be improved in future work.
References
1. Anantha Rao, G., & Kishore, P. V. V. (2016, October). Sign Language recognition system
simulated for video captured with smart phone front camera. International Journal of Electrical
and Computer Engineering (IJECE),6(5), 2176–2187.
2. Kishore, P. V. V., & Rajesh Kumar, P. (2012). A video based Indian sign language recognition
system (INSLR) using wavelet transform and fuzzy logic. In IACSIT international journal of
engineering and technology (Vol. 4, No. 5, pp. 537–542).
3. Tanibata, N., Shimada, N., & Shirai, Y. (2002). Extraction of hand features for recognition of
sign language words. In Proceedings of the international conference on vision interface.
4. Parul, H. (2014). Neural network based static sign gesture recognition system. International
Journal of Innovative Research in Computer And Communication Engineering (Ijircce), 2(2),
3066–3072.
5. Rao, G. A., & Kishore, P. V. V. (2018). Selfie video based continuous Indian sign language
recognition system. Ain Shams Engineering Journal, 9(4), 1929–1939.
6. Rao G. A., Syamala, K., & Divakar, T. V. S. (2020). Selfie sign language recognition with
shape energy features and ANN classifier. International Journal of Science Technology and
Research, 9(04), 1936–1939.
7. Tharwat, A., Gaber, T., Hassanien, A. E., Shahin, M. K., & Refaat, B. (2015). SIFT-based
Arabic sign language recognition system. In Proceedings of the Afro-European conference for
industrial advancement (pp. 359–370). Springer Cham.
8. Sulman, D. N., & Zuberi, S. (2000). Pakistan sign language—a synopsis. Pakistan, Jun 2000.
[Online]. Available: www.academia.edu/2708088/
9. Lee, G. C., Yeh, F.-H., & Hsiao, Y.-H. (2016). Kinect-based Taiwanese sign- language
recognition system. Multimedia Tools and Applications, 75(1), 261–279.
10. Holden, E.-J., Lee, G., & Owens, R. (2005). Australian sign language recognition. Machine
Vision and Applications, 16(5), 312.
11. Zieren, J., & Kraiss, K.-F. (2005). Robust person-independent visual sign language recognition.
In Iberian conference on pattern recognition and image analysis 2005 (pp. 335–355).
12. Maraqa, M., & Abu-Zaiter, R. (2008). Recognition of Arabic sign language (ArSL) using recur-
rent neural networks. In Proceedings of the 1st international conference on the applications of
digital information and web technologies (ICADIWT), Aug 2008 (pp. 478–481).
13. Khan, N., Shahzada, A., Ata, S., Abid, A., Khan, Y., & Shoaib Farooq, M. (2014). A vision based approach for Pakistan sign language alphabets recognition. Pensee, 76(3), 274–285.
14. Ahmed, H., Gilani, S. O., Jamil, M., Ayaz, Y., & Shah, S. I. A. (2016). Monocular vision-based
signer-independent Pakistani sign language recognition system using supervised learning.
Indian Journal of Science and Technology, 9(25), 1–16.
15. Kausar, S., Javed, M. Y., & Sohail, S. (2008). Recognition of gestures in Pakistani sign language
using fuzzy classifier. In Proceedings of the 8th conference on signal processing,computational
geometry and artificial vision (pp. 101–105).
16. Anil Kumar, D., Kishore, P. V. V., Sastry, A. S. C. S., & Reddy Gurunatha Swamy, P. (2016).
Selfie continuous sign language recognition using neural network. In 2016 IEEE annual India
conference (INDICON).
Chapter 35
An Effective Model for Malware
Detection
V. Valli Kumari and Shaik Jani
Abstract Malware detection and identification are important to protect an organization's data and enable end-to-end monitoring of resources accessible by multiple users through the Internet. Malicious users and intruders usually try various methods to gain unauthorized access to data from remote locations. This paper proposes a model that helps in finding malware characteristics by extracting features from the data provided. The model is also tested on unknown malware files generated using various available tools. This paper discusses the steps used in building an effective model, the Model for Malware Detection (MMD), using the EMBER dataset and Keras. The results obtained, with a model accuracy of 97.2%, are presented.
35.1 Introduction
Malware can be defined as a set of malicious files or programs and may take many
forms like Root kit, Spyware, Botnet, Trojan, Ransomware and gain unauthorized or
unprivileged access to files in victim systems or servers. A malware can affect devices
like desktops, laptops, mobile phones, health care devices, enterprise servers, clients
and network devices. Malware detection means finding the presence of malware in
a given host or detecting the malicious behavior of a program. Email, chat clients,
phone conversations, SMS messages, and even postal mail are used to communicate
with other systems or software.
Unskilled users execute the malicious code of attackers, allowing them to penetrate the network. Many antivirus products rely solely on signature-based methods and techniques, which often fail to identify malware whose signatures are not yet available. High-volume malware is evolving at a fast pace, and it is becoming increasingly difficult to evaluate the massive amounts of data generated by network transactions. In general, malware files of this type are recognized and identified using datasets derived from traffic data collected from registered and enterprise-based networks. To evade identification, malware authors use the most
V. V. Kumari (B) · S. Jani
Andhra University College of Engineering (A), Visakhapatnam, AP, India
e-mail: vallikumari@gmail.com
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
B. N. K. Rao et al. (eds.), Intelligent Computing and Applications, Smart Innovation,
Systems and Technologies 315, https://doi.org/10.1007/978-981-19-4162-7_35
378 V. V. Kumari and S. Jani
complex and reliable obfuscation techniques, like code amalgamation and reordering of subroutines, which make identification of malware hard. Several predefined malware detection methods identify malware with a high false positive rate.
The work in this paper uses the Elastic Malware Benchmark for Empowering Researchers (EMBER) dataset, which contains over 1.1 million Portable Executable (PE) files [1]. To analyze this data, the Model for Malware Detection (MMD) is proposed, which extracts features and then classifies the malware. The MMD model gives 97.2% accuracy and helps in the detection and prediction of malware. The work
in this paper contributes the following: (a) using the EMBER-2018 dataset to extract the features and class labels, which are used to detect malware; (b) the model can be used for decision making through multimodal approaches; and (c) finally, the model is tested by providing new data or malware generated from different websites or created using msfvenom, feeding it to the proposed model to find whether it is malicious or not.
The next sections of the paper are as follows. The relevant literature is summarized in Sect. 35.2. Section 35.3 discusses the features of the dataset used for experimentation. Section 35.4 covers the Model for Malware Detection (MMD) proposed for the detection and prediction of malware. Section 35.5 explains the results, and Sect. 35.6 summarizes the contributions and presents points for further work on this topic.
35.2 Literature Survey
Much of the recently published research work is focused on building automatic malware detection mechanisms that use statistical methods rather than deterministic rules. Many works have also proposed machine learning-based approaches for cyber security-related problems. Many of these operations can be automated to detect cyber-attacks in real time and prevent damage. It was found in the survey that most researchers are interested in the detection of malware of different types.
A model is proposed in [2] using three datasets for training, testing and scaling up. The paper discusses the use of a cascade one-sided perceptron first and a cascade kernelized one-sided perceptron later to identify malware files and reduce false positives. The work in [3] considers building a decision model with data processing, decision making and malware detection to classify and detect any suspicious malware. The malware analysis system is ML-based and uses the shared nearest neighbor technique.
In [4], information was collected from 2510 APK files, of which 1260 were malicious apps. The paper proposed a machine learning-based framework for detecting malicious apps using information from API calls and permissions. In [5], Microsoft Office files related to malicious macros were analyzed using machine learning methods. In [6], data is collected from packets instead of port numbers and protocols, and automated malware detection was proposed using convolutional neural networks
35 An Effective Model for Malware Detection 379
and other machine learning methods. In [7], malware files are executed in the Cuckoo Sandbox and malware process behavior is traced to determine generated and injected processes; a recurrent neural network (RNN) is used for feature extraction, and later a CNN is trained to classify. In [8], a model based on a deep auto-encoder and CNN was proposed; the data was collected from 23,000 apps and processed for use in deep learning models to identify malware in Android apps. In [9], data was collected from VMs by running various malware such as rootkits and Trojan horses, and both 2D and 3D CNNs were used to improve detection accuracy. In [10], a binary file was classified as malicious or benign; the model was tested on a dataset with 11,130 binaries. In [11], data augmentation was used to generate variants of images obtained from malware samples. In [12], a novel ensemble CNN-based architecture was used for detection of both packed and unpacked malware. In [13], static and dynamic analysis tools were used to extract features from 7000 malware and 3000 benign files, and a classification model was built. In [14], a classification model is built by extracting requested permissions, vulnerable API calls, and an application's key information, such as dynamic code, reflection code, native code, cryptographic code, and database, from applications.
35.3 Data Preprocessing
The Elastic Malware Benchmark for Empowering Researchers (EMBER) dataset [1], which includes over one million Portable Executable (PE) files, is used. The PE file format [15] is given in Fig. 35.1. The dataset provides features and a repository useful for training and testing machine learning and deep learning models; it includes SHA-256 hashes and covers many attacks. The EMBER dataset is built using JSON. The work in this paper initially starts with data analysis and preprocessing. The steps involved in populating the variables used to build the model include: (i) creating a vectorized set of features, (ii) loading the train and test vectorized features, and (iii) scaling the dataset to zero mean and unit variance. The data, split into training and testing sets, is given as input to the MMD model.
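Step (iii) can be sketched as below; steps (i) and (ii) are provided by the ember package's own helpers [1], so only the scaling is shown, with random matrices standing in for the real 2381-feature vectors.

```python
import numpy as np

def standardize(X_train, X_test):
    """Scale both splits to zero mean and unit variance, using statistics
    computed on the training split only to avoid test-set leakage."""
    mu = X_train.mean(axis=0)
    sigma = X_train.std(axis=0)
    sigma[sigma == 0] = 1.0           # guard against constant features
    return (X_train - mu) / sigma, (X_test - mu) / sigma

rng = np.random.default_rng(0)
X_train = rng.normal(loc=5.0, scale=3.0, size=(1000, 2381))  # 2381 EMBER features
X_test = rng.normal(loc=5.0, scale=3.0, size=(200, 2381))
X_train_s, X_test_s = standardize(X_train, X_test)
```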
35.4 Model for Malware Detection (MMD)
35.4.1 Overview
In this section, the overall architecture of the MMD model is discussed. The model is based on a deep neural network. Initially, the features are extracted from the EMBER dataset [1] and the data is pre-processed. A set of 1,000,000 samples is divided into training and testing datasets (see Fig. 35.2). There are 2381 features in total. The labels are 0 and 1 for benign and malicious, with counts of 182,524 and 17,476, respectively.
Fig. 35.1 PE file format [15], CC BY 4.0, https://commons.wikimedia.org/w/index.php?curid=51026079
Fig. 35.2 Malware labels distribution in the dataset
Later, the MMD model is built to classify malware. Using this model, undetected attacks can be easily spotted, and malware can be detected at an early stage.
35.4.2 MMD Architecture
The proposed architecture helps in extracting features from the datasets and training
the neural network to build a model. Figure 35.3 shows the proposed MMD model.
Inputs are first pre-processed using the ember packages provided [1]. The set of layers in Fig. 35.3 was arrived at after several combinations of layers and fine tuning. The model has dense and dropout layers; each dropout rate was set to 0.25, and the Adam optimizer was used. As the pre-processed dataset has only two labels, 0 and 1, either sigmoid or softmax can be used in the last layer. Sigmoid is used in the output layer to obtain binary classification; softmax can be used depending on how the data is labeled into different classes in the dataset.
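A minimal NumPy sketch of one Dense → Dropout(0.25) → Dense(sigmoid) forward pass of the kind the MMD model stacks; the hidden width of 64 is an assumption for illustration, since the exact layer widths appear only in Fig. 35.4, and the random weights stand in for trained ones.

```python
import numpy as np

rng = np.random.default_rng(42)

def dense(x, W, b, relu=True):
    z = x @ W + b
    return np.maximum(z, 0.0) if relu else z

def dropout(x, rate=0.25, train=True):
    """Inverted dropout with the chapter's rate of 0.25."""
    if not train:
        return x
    mask = rng.random(x.shape) >= rate
    return x * mask / (1.0 - rate)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Forward pass: 2381 EMBER features -> Dense + Dropout -> sigmoid output
x = rng.normal(size=(8, 2381))                       # batch of 8 samples
W1 = 0.01 * rng.normal(size=(2381, 64))              # hidden width 64 (illustrative)
h = dropout(dense(x, W1, np.zeros(64)))
W2 = 0.01 * rng.normal(size=(64, 1))
p = sigmoid(dense(h, W2, np.zeros(1), relu=False))   # P(malicious) per sample
```

At inference time dropout is disabled (`train=False`), matching the usual Keras behavior.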
35.5 Experimental Results
Figure 35.4 depicts the model layers, and Fig. 35.5 presents the validation accuracy in malware detection. 1,000,000 samples from the EMBER dataset are considered: 800,000 are used for training the model and 200,000 for testing. The output label depicts whether a sample is malicious or not. The accuracy obtained is 97.2%, as shown in Fig. 35.5, for the layers shown in Fig. 35.4. The precision, recall and F1 scores indicated the model quality to be high.
Fig. 35.3 MMD model
architecture
Fig. 35.4 MMD model layers
Fig. 35.5 Epoch-wise training and validation: a accuracy, b loss
35.6 Testing MMD Model with Unknown Malware
This section discusses the generation of a customized malware sample and testing whether the MMD model classifies it as malware. The Kali Linux OS includes the Metasploit Framework, whose msfvenom tool is used to create the malware. The malware file is pre-processed with PEFeatureExtractor() defined in the ember modules [1], and the pre-processed data is submitted to the MMD model for prediction; it was detected as malicious.
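The flow above can be sketched as a small pipeline; note that `fake_extractor` and `fake_model` below are toy stand-ins for ember's PEFeatureExtractor and the trained MMD network, so the example runs without either being installed.

```python
import numpy as np

THRESHOLD = 0.5

def classify_pe(bytez, extractor, model):
    """Extract features from raw PE bytes, score them with the trained
    model, and threshold the score into a verdict."""
    features = np.asarray(extractor(bytez), dtype=np.float32)
    score = float(model(features))
    return "malicious" if score > THRESHOLD else "benign"

# Toy stand-ins so the pipeline is runnable end to end
fake_extractor = lambda b: [len(b) % 7, sum(b) % 11]
fake_model = lambda f: 0.9 if f[0] > 3 else 0.1
```

In the real pipeline, `extractor` would wrap the ember feature extraction and `model` the trained Keras network's predict call.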
35.7 Conclusion
Malware can be classified using deep learning techniques. This paper discusses a deep learning model, MMD, for malware detection. The paper has extensively surveyed several works published in the literature. The MMD model was trained and tested with the EMBER dataset and was found to work with 97.2% accuracy, which is a good improvement over [1]. Later, the model was also tested with unknown malware generated through msfvenom, and it was correctly classified. In future, the model will be fine-tuned and optimized for better accuracy. Detection in real time is another area to be explored.
References
1. Anderson, H. S., & Roth, P. (2018). EMBER: An open dataset for training static PE malware
machine learning models.
2. Gavrilu¸t, D., Cimpoe¸su, M., Anton, D., & Ciortuz, L. (2009, October). Malware detection
using machine learning. In 2009 IEEE multiconference on computer science and information
technology (pp. 735–741). IEEE.
3. Liu, L., Wang, B. S., Yu, B., & Zhong, Q. X. (2017). Automatic malware classification and new
malware detection using machine learning. Frontiers of Information Technology & Electronic
Engineering, 18(9), 1336–1347.
4. Peiravian, N., & Zhu, X. (2013, November). Machine learning for android malware detection
using permission and api calls. In 2013 IEEE 25th international conference on tools with
artificial intelligence (pp. 300–305). IEEE.
5. Bearden, R., & Lo, D. C. T. (2017, December). Automated Microsoft office macro malware
detection using machine learning. In 2017 IEEE international conference on big data (Big
Data) (pp. 4448–4452). IEEE.
6. Yeo, M., Koo, Y., Yoon, Y., Hwang, T., Ryu, J., Song, J., & Park, C. (2018, January). Flow-
based malware detection using convolutional neural network. In 2018 International conference
on information networking (pp. 910–913).
7. Tobiyama, S., Yamaguchi, Y., Shimada, H., Ikuse, T., & Yagi, T. (2016, June). Malware detec-
tion with deep neural network using process behavior. In 2016 IEEE 40th annual computer
software and applications conference (COMPSAC) (Vol. 2, pp. 577–582). IEEE.
8. Wang, W., Zhao, M., & Wang, J. (2019). Effective android malware detection with a hybrid
model based on deep auto encoder and convolutional neural network. Journal of Ambient
Intelligence and Humanized Computing, 10(8), 3035–3043.
9. Abdelsalam, M., Krishnan, R., Huang, Y., & Sandhu, R. (2018, July). Malware detection in
cloud infrastructures using convolutional neural networks. In 2018 IEEE 11th international
conference on cloud computing (CLOUD) (pp. 162–169). IEEE.
10. Sharma, A., Malacaria, P., & Khouzani, M. H. R. (2019, June). Malware detection using 1-
dimensional convolutional neural networks. In 2019 IEEE European symposium on security
and privacy workshops (EuroS&PW) (pp. 247–256). IEEE.
11. Catak, F. O., Ahmed, J., Sahinbas, K., & Khand, Z. H. (2021). Data augmentation-based
malware detection using convolutional neural networks. PeerJ Computer Science, 7, e346.
12. Vasan, D., Alazab, M., Wassan, S., Safaei, B., & Zheng, Q. (2020). Image-based malware
classification using ensemble of CNN architectures (IMCEC). Computers & Security.
13. Jerlin, M. A., & Marimuthu, K. (2018). A new malwaredetection system using machine learning
techniques for API call sequences. Journal of Applied Security Research, 13(1), 45–62.
14. Koli, J. D. (2018, March). RanDroid: Android malware detection using random machine
learning classifiers. In 2018 Technologies for smart-city energy security and power (ICSESP)
(pp. 1–6). IEEE.
15. Wikipedia. https://commons.wikimedia.org/wiki/File:Portable_Executable_32_bit_Structure_in_SVG_fixed.svg
Chapter 36
An Efficient Approach to Retrieve
Information for Desktop Search Engine
S. A. Karthik, G. Lalitha, Y. Md. Riyazuddin, and R. Venkataramana
Abstract The Internet may not be the only source of information. Despite not remembering the relevant terms, individuals understand the need to access documents. Entity disambiguation is a technique for deciphering the text in processed documents and queries. In this work, an outline of a desktop search engine with entity disambiguation at its foundation is presented. Terms/entities are disambiguated using a Naive Bayes probabilistic model created to comprehend a keyword depending on the set it is part of, inspired by the probabilistic taxonomy Probase. There are three parts to our implementation: text extraction, conceptualization of obtained items, and index updating/matching. Experimental results obtained by implementing the methodology demonstrate the truthfulness of the approach.
36.1 Introduction
Search and retrieval have been the core features of any system that stores data as fact files of information. Many techniques have been developed to make search efficient on a personal computer. The most popular, File Explorer, is the default file browser in the Microsoft OS [1]. It indexes file details such as creation and modification dates, location, file type, file size, and even file contents. N-gram tokens are used as part of the index. While this is already convenient enough, we often encounter situations
S. A. Karthik (B)
BMS Institute of Technology and Management, Bengaluru, India
e-mail: karthiksa1990@gmail.com
G. Lalitha ·Y. Md. Riyazuddin
Gitam Deemed to Be University, Hyderabad, India
e-mail: lggidalu@gitam.edu
Y. Md. Riyazuddin
e-mail: rymd@gitam.edu
R. Venkataramana
SV College of Engineering, Tirupati, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
B. N. K. Rao et al. (eds.), Intelligent Computing and Applications, Smart Innovation,
Systems and Technologies 315, https://doi.org/10.1007/978-981-19-4162-7_36
388 S. A. Karthik et al.
where we have forgotten the file details and contents but still wish to retrieve them. A search engine is an answer to the above scenario.
Concept clusters are a good alternative to standard ontologies [2], for they are flexible to maintain and apply in text understanding. A term can be either an entity that holds an isA relationship to a concept cluster or an attribute that describes the characteristics of the concept cluster.
The machine can assign a similarity score to every content item while performing the search. Whenever a group of items has been recognized and graded, those items are prioritized and accessed by users via the interface based on their scores. Following the release of the result set to the user, certain systems allow users to further refine their search by reviewing the responses, marking the items in the result set that are regarded as significant, and resending them to the system via a relevance feedback process. A client creates a query, mostly in the form of a list of questions in simple language, as shown in the figure above. The IR system will then reply by gathering the appropriate records containing the essential content.
The software subsequently generates a new result and presents it to the viewer. An IR system's main job is to anticipate which items are useful and which are not, based on the client's specifications. It is now well acknowledged that IR plays a critical role in workstation and Internet applications [3–5]. Stop-word removal, stemming, and many other operations dependent on the software's characteristics, such as phrase generation, syllable canonicalization, and term elongation, are used to recognize and extract phrases from text files before they are evaluated.
The probabilistic taxonomy Probase opens opportunities for entity disambiguation, which tremendously contributes to natural text understanding of both file contents and queries. Probase is a taxonomy of isA relationships between entities that is easy to maintain and actively updated by the community. Many works compare Probase with other taxonomies, such as Freebase and WikiTaxonomy, and show that the entities of Probase enrich text understanding to greater levels.
A Naive Bayes probabilistic model is used to disambiguate text entities by retrieving the concept clusters to which the entities belong; the model is explained in the later sections of this paper. A simple inverted index is used in conjunction with the BM25F information retrieval model, including PageRank, to retrieve the desired documents. The proposed work is compared based on its performance on entity disambiguation on the TREC datasets, for which many cutting-edge methodologies exist.
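The Naive Bayes step can be illustrated as scoring each concept cluster c by P(c) · Π_t P(t | c) over the query terms, with add-one smoothing; the cluster statistics below are invented for illustration and are not drawn from Probase.

```python
import math

# Toy term counts per concept cluster (invented; real counts come from Probase)
CLUSTERS = {
    "fruit":   {"apple": 30, "mango": 25, "banana": 20},
    "company": {"apple": 40, "microsoft": 35, "google": 30},
}

def disambiguate(terms):
    """Pick argmax_c P(c) * prod_t P(t | c) with add-one smoothing,
    computed in log space for numerical stability."""
    total = sum(sum(c.values()) for c in CLUSTERS.values())
    best, best_score = None, -math.inf
    for name, counts in CLUSTERS.items():
        n = sum(counts.values())
        score = math.log(n / total)                       # prior P(c)
        for t in terms:
            score += math.log((counts.get(t, 0) + 1) / (n + len(counts)))
        if score > best_score:
            best, best_score = name, score
    return best
```

The surrounding terms thus decide the reading: "apple" next to "mango" resolves to the fruit cluster, while "apple" next to "microsoft" resolves to the company cluster.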
The remainder of the article is arranged as follows: recent work on search engine principles is detailed in Sect. 36.2. The suggested Probase-based technique is described in Sects. 36.3 and 36.4, accompanied by a summary of findings to demonstrate the feasibility of the proposed approach. Finally, an outlook and guidelines for future improvements are offered.
36 An Efficient Approach to Retrieve Information 389
36.2 Related Work
In this section, recent work on desktop search engines using concepts is presented. Most information retrieval work focuses on ontology-based retrieval. MS File Explorer is a file browser that indexes the details of a document, such as creation and modification dates, type of file, title, and size, along with its contents. However, there is no text understanding or intent mining that would have enriched the index for retrieval of documents. From the survey of the literature, it is clear that a reasonable quantity of research has been conducted on information retrieval.
Parsath et al. [1] describe entity extraction as a preparatory technique that uses rules of the language or n-grams to retrieve words or phrases [2, 3]. For feature extraction, algorithms based on language rules to flag parts of utterances are prevalent. These rules are handwritten and modified according to punctuation and grammar norms. Kuzi et al. [2] note that individuals and their related classes are key properties that impact retrieval models, whether directly annotated [2, 4] or categorized by an application. The ability to use word representations to include counterparts, hyponyms, and abbreviations enriches data samples by covering background while keeping the intent the system infers. Seok et al. [3] state that the objective of query conceptualization is to map occurrences in a query to concepts defined in a certain ontology or knowledge base; queries ordinarily do not follow the sentence structure of a written language, nor do they contain sufficient signals for statistical inference. Agrawal et al. [4] begin by mining an assortment of relations among terms from a huge web corpus and mapping them to related concepts employing a probabilistic knowledge base. Then, for a given query, the terms within the query are conceptualized employing a random walk based iterative calculation.
Andrzejewski and Buttler [5] substantiate the concept that the collection of maxima is well integrated and contrasts with the keywords in the recipient search. In order to compare the similarity across maps, a way to obtain an approximate subdivision graph is described by Ganguly et al. [6]; the use of vertical translation between both graphs ensures that the most subordinate graph is scored. Rich, intuitive, and immersive applications, such as e-readers for electronic books, have sprung up as a result of the fast spread of hand-held gadgets. Potthast et al. [7] describe how such applications spur retrieval frameworks that, by leveraging the context of the user's activities, can implicitly fulfill any data request of the reader. The queries delivered using the context are frequently complicated objects, which distinguish such retrieval frameworks from standard search. Zuccon et al. [8] explain that, as a consequence of the rapid expansion of palm devices, sophisticated, dynamic, and comprehensive solutions such as e-readers for digital journals have sprouted up. These implementations encourage fetching mechanisms that can effectively supply any lead to the reader by exploiting the context of the user's activity. The queries generated by the context-aware visual search are frequently complex items, which
390 S. A. Karthik et al.
Table 36.1 Summary of the techniques and their impact on retrieval

Sl. No. | Technique | Technique measure/score | Remarks
1 | Ontology-based topic classification of the recognized entities [10] | Keywords: 0.46; pattern matching: 0.53; NB: 0.65; baseline VSM: 0.47 | 14.145% improvement
2 | Concept graph-based query generation [11, 12] | Random: 0.1 | Using CG similarity is much more efficient in document retrieval
3 | Semantic network [13, 14] | Bayesian analysis: 0.84; LDA co-occurrence and Probase: 0.83 | Random walk seems to cover much more of the user's intent under this technique
4 | GOV2 model of word embeddings [14] | RI score of RM3: 0.392 | Including synonyms is exhaustive
5 | Generalized language model [14] | Recall score of LDA: 0.58 | Vector word embedding is exhaustive
separates them from traditional search engines. Wang et al. [9] give a theoretical
explanation of their approach and conduct a thorough experimental validation to
demonstrate its utility in enhancing electronic documents with high-quality videos
from the web. Table 36.1 depicts a summary of the work carried out in the area of
search engine optimization.
From the survey of prior work in the said area, it can be observed that little scope
has been given to the desktop IR domain, even though it is one of the popular
areas of research [11-13]. Domain-specific repositories are being developed with both
estimates and figures, which calls for effective search and retrieval of the archives.
Organizing the information on a personal computer becomes tougher and tougher due to
the variety of types of information that are hard to remember or prioritize. Our proposed
article aims to concentrate on the following objectives: first, to develop an efficient
search algorithm that helps to retrieve the desired documents within the desktop;
next, to implement word-phased inverted indexing and a suitable corpus to navigate
the desktop.
36.3 Proposed Methodology
Our proposed approach concentrates on clustering Probase, POS tagging and
segmentation, conceptualization, and inverted indexing of web documents, which
goes a long way toward extracting only the noun and verb phrases of the sentences. If
feasible, using the lemma forms of the extracted terms would improve the accuracy. In
this work, a Naïve Bayes probabilistic model [14] for conceptualizing the extracted
entities that need to be disambiguated has been proposed. Probase, a probabilistic
taxonomy, comes into the picture to provide the unique IsA relationships and their
36 An Efficient Approach to Retrieve Information 391
frequency of occurrence between the entity and the concepts. Documents and queries
are subjected to the same text extraction and understanding techniques. The entities
and their concepts of the documents are arranged into an inverted index that can
be called upon to perform a BM25F retrieval of documents upon matching with the
query's terms. Bayes' theorem gives the insight that P(C|E), the probability of C
being the concept given the occurrence of entity E, can be
deduced when the following are given: P(E|C), the probability of entity E belonging
to concept C; P(C), the probability of concept C occurring in the dataset; and P(E),
the probability of entity E occurring in the dataset. Then P(C|E) is given by Eq. (36.1):
P(C|E) = P(E|C) P(C) / P(E)    (36.1)
The above is the implementation of the Naïve Bayes probabilistic model, which gives
the probabilities from which the maximum-scoring concept can be chosen to represent a
set of terms. To deal with a set of terms, where E1, E2, E3, …, En are entities,
by the assumption of independence of the entities:
P(C|E1, E2, E3, …, En) = P(E1|C) P(E2|C) P(E3|C) … P(En|C) P(C) / P(E1, E2, E3, …, En)    (36.2)
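Equations (36.1) and (36.2) can be sketched in a few lines of code. The counts below are made-up stand-ins for Probase IsA frequencies, and Python is used here purely for illustration (the actual system runs on Node.js); the function names are our own.

```python
from collections import Counter

# Hypothetical IsA counts n(entity, concept): stand-ins for Probase frequencies.
pair_counts = Counter({
    ("python", "language"): 80, ("python", "animal"): 20,
    ("java", "language"): 90, ("java", "island"): 10,
})

def conceptualize(entities, concepts):
    """Pick the concept C maximizing P(E1..En|C) P(C) under naive independence.

    P(Ei|C) = n(Ei, C) / n(C) and P(C) = n(C) / total. The shared denominator
    P(E1..En) in Eq. (36.2) is constant across concepts, so it can be dropped.
    """
    total = sum(pair_counts.values())
    n_c = {c: sum(v for (e, cc), v in pair_counts.items() if cc == c)
           for c in concepts}
    best, best_score = None, 0.0
    for c in concepts:
        score = n_c[c] / total                                   # P(C)
        for e in entities:
            score *= pair_counts[(e, c)] / n_c[c] if n_c[c] else 0.0  # P(Ei|C)
        if score > best_score:
            best, best_score = c, score
    return best

print(conceptualize(["python", "java"], ["language", "animal", "island"]))
# "language": both entities co-occur strongly with it, so it scores highest.
```

Dropping the common denominator is the usual Naïve Bayes trick: it leaves the ranking of candidate concepts unchanged while avoiding an extra estimate.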
Figure 36.1 shows the architecture of the proposed system, in which the user
enters a query. The entered query undergoes entity disambiguation, after POS
tagging, with respect to Probase. The results are compared against the inverted index,
from which the list of documents is drawn up. The list is ranked by the most likely
document and presented as the result.
The content from the most recently updated files should be gathered and analyzed
in order to derive the entities and construct their concepts. All documents that have
been altered since the last processing time are initially gathered as a list. Probase is a
knowledge base comprised of IsA relationships between entities. It was created with the
help of a web crawler that analyzed over 1.2 billion online documents and extracted
entities into IsA relationships using inference rules, so each relationship's incidence
is also part of the knowledge base. Probase was acquired by Microsoft Research
teams as part of a deal; Microsoft Concept Graph is the official name for Probase.
36.3.1 Clustering Probase
Probase is a collection of more than 120 million IsA pairs of entity-concept relation-
ships. Processing this on a personal computer is a challenge in itself, for there will
always be a shortage of heap memory. The workaround is to draw a sample from
Probase that still best represents it.

Fig. 36.1 Overview of proposed methodology
In this work, manual clustering has been incorporated, based on how typical a
concept is given an entity as well as how typical an entity is given a concept.
Typicality is a score measured as follows, where n(e, c) is the frequency of the occurring
relationship, n(e) is the frequency of the term as an entity, n(c) is the frequency of the term
as a concept, T(c) is the typicality of a concept c given the entity e, and T(e) is the
typicality of an entity e given the concept c:
T(e) = n(e, c) / n(c)    (36.3)

T(c) = n(e, c) / n(e)    (36.4)
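Equations (36.3) and (36.4) translate directly into code. The toy counts below are illustrative, not real Probase statistics, and the function names are our own:

```python
# Toy frequencies: n(e, c) for entity-concept pairs, n(e) for entities, n(c) for concepts.
n_pair = {("apple", "fruit"): 60, ("apple", "company"): 40}
n_entity = {"apple": 100}
n_concept = {"fruit": 300, "company": 500}

def typicality_of_entity(e, c):
    """T(e) = n(e, c) / n(c): how typical entity e is of concept c (Eq. 36.3)."""
    return n_pair.get((e, c), 0) / n_concept[c]

def typicality_of_concept(e, c):
    """T(c) = n(e, c) / n(e): how typical concept c is for entity e (Eq. 36.4)."""
    return n_pair.get((e, c), 0) / n_entity[e]

print(typicality_of_concept("apple", "fruit"))   # 60/100 = 0.6
print(typicality_of_entity("apple", "fruit"))    # 60/300 = 0.2
```

Keeping only pairs whose typicality clears a threshold in both directions gives the representative sample of Probase that fits in heap memory.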
36.3.2 POS Tagging and Segmentation
Part-of-speech tagging is the process of identifying syntactically significant
sections of phrases. It is accomplished using machine learning algorithms that
have been trained to recognize grammatical links among words in tagged phrases.
Wink-pos-tagger is a tool for annotating parts of speech in English sentences. It
is based on Eric Brill's transformation-based learning (TBL) technique. On the
standard WSJ 22-24 test set, it POS-tags and lemmatizes more than 525,000 tokens per
second with a precision of 93.2%; this was measured using the tagRawTokens() API on a
2.2 GHz Intel Core i3 computer with 8 GB RAM.
The practice of splitting written material into recognizable components, such as
keywords, paragraphs, or themes, is known as text segmentation. The information
content of the documents is extracted; however, it must be divided into phrases to
determine the semantics of the entities in each phrase, rather than expressing the data
files with only a few notions.
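The segmentation step described above can be approximated with a simple sentence splitter. The real pipeline uses wink-pos-tagger on Node.js, so this Python regex version is only a rough illustration under assumed punctuation rules:

```python
import re

def segment(text):
    """Split raw document text into sentences on ., !, ? followed by whitespace.

    A crude approximation of the segmentation stage; a production pipeline
    would also handle abbreviations, decimals, and quoted speech.
    """
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

doc = "Entities live in phrases. Segmentation finds them! Does it work?"
print(segment(doc))
# ['Entities live in phrases.', 'Segmentation finds them!', 'Does it work?']
```

Each returned phrase would then be POS-tagged so that only its noun and verb phrases pass to conceptualization.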
36.3.3 Conceptualization
Individuals can correlate qualities or characteristics of an object with those of objects
they are already familiar with; this is known as conceptualization. It is essentially a
map of common knowledge. The references underpinning an entity can be extended
through an understanding of its concept.
36.3.4 Inverted Index
An inverted index is a data structure that stores a mapping from content, such
as words or numbers, to its locations in tables, documents, or series of articles. An inverted
index allows faster full-text searching at the cost of additional processing
when content is uploaded to the database. It is the most frequently used data
structure in information retrieval systems, such as search engines.
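A minimal inverted index over tokenized documents might look as follows. The documents are invented examples, and the field weighting and BM25F scoring used in the retrieval step are omitted for brevity:

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def search(index, query):
    """Return ids of documents containing every query term (boolean AND)."""
    sets = [index.get(t, set()) for t in query.lower().split()]
    return set.intersection(*sets) if sets else set()

docs = {1: "desktop search engine", 2: "search the web", 3: "desktop files"}
idx = build_inverted_index(docs)
print(sorted(search(idx, "desktop search")))   # [1]
```

In the proposed system the posting lists would hold entities and their Probase concepts rather than raw tokens, so a query's disambiguated concepts match documents even when the surface words differ.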
36.4 Experimentation Results
In this section, an analysis of the results of the proposed work is presented to judge the
efficiency of the proposed method. The Node.js runtime has been used to create the
application, including its web servers. Official support for Node.js is available for
well-known operating systems, with prototype support for FreeBSD. Node.js enables
the creation of fast servers in JavaScript by bringing event-driven programming
to web servers. To judge whether the said methodology is efficient, a few parameters are
considered. Micro-averaging is used whenever an instance can belong to two or more
categories simultaneously. The parameters used in this work are micro-averaged recall
(miR), micro-averaged precision (miP), micro-averaged F1 score (miF1), and macro-averaged
F1 score (macF1), computed using the formulae below. TP, FP, TN, and FN denote true
positive, false positive, true negative, and false negative, respectively. The range of
all the parameters mentioned here is [0, 1].

Table 36.2 Performance comparison of the proposed versus existing methodology

Method                | miR    | miP    | miF1   | macF1  | Error
SVM                   | 0.812  | 0.9137 | 0.8599 | 0.5251 | 0.00365
KNN                   | 0.8339 | 0.8807 | 0.8567 | 0.5242 | 0.00385
LSF                   | 0.8507 | 0.8489 | 0.8498 | 0.5008 | 0.00414
NNet                  | 0.7842 | 0.8785 | 0.8287 | 0.3765 | 0.0047
NB                    | 0.7688 | 0.8245 | 0.7956 | 0.3886 | 0.00544
Probase NB (Proposed) | 0.8992 | 0.9263 | 0.9125 |        |
miR = Σ_{i=1}^{n} TP_i / Σ_{i=1}^{n} (TP_i + FN_i)    (36.5)

miP = Σ_{i=1}^{n} TP_i / Σ_{i=1}^{n} (TP_i + FP_i)    (36.6)

miF1 = (2 × miP × miR) / (miP + miR)    (36.7)

macF1 = (1/n) Σ_{i=1}^{n} F1_i    (36.8)
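Equations (36.5)-(36.8) can be checked with a short script. The per-class counts below are arbitrary examples, not the paper's data:

```python
def micro_macro(counts):
    """counts: list of (TP, FP, FN) per class. Returns (miR, miP, miF1, macF1)."""
    tp = sum(c[0] for c in counts)
    fp = sum(c[1] for c in counts)
    fn = sum(c[2] for c in counts)
    mir = tp / (tp + fn)                        # Eq. (36.5)
    mip = tp / (tp + fp)                        # Eq. (36.6)
    mif1 = 2 * mip * mir / (mip + mir)          # Eq. (36.7)
    f1s = [2 * t / (2 * t + p + n) if (2 * t + p + n) else 0.0
           for t, p, n in counts]               # per-class F1
    macf1 = sum(f1s) / len(counts)              # Eq. (36.8)
    return mir, mip, mif1, macf1

# Two classes with arbitrary counts: (TP, FP, FN).
print(micro_macro([(8, 2, 1), (3, 1, 4)]))
```

Note that micro-averaging pools the raw counts before computing the ratio, so frequent classes dominate miR and miP, while macF1 weights every class equally; a large gap between miF1 and macF1 (as in Table 36.2) usually signals weak performance on rare classes.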
The performance comparison of the proposed versus existing methodology is given in
Table 36.2. The reason for the good result is the implementation of the inverted
index and the segmentation of the phrases.
From the plot, it is clear that Probase NB performs relatively better than the other work
presented in the literature under these metrics and is a step toward integrating
NLU into PCs (Fig. 36.2).
36.5 Conclusion
The largest challenge was the lack of heap memory needed to process
Probase completely. Entity disambiguation has been achieved and leads to a textual
understanding of the file contents. Different alternatives for document representation
were discussed, and the vector space model representation was chosen; every
distinct extracted phrase becomes a dimension in the vector space model notation.
Tags are deleted, phrases are stemmed using Porter's stemming algorithm, and
dimensionality is decreased using the eigenvalue technique for semantic similarity
indexing. Probase NB performs better than plain NB under these metrics and still is a step toward
Fig. 36.2 Line plot of obtained results (miR, miP, miF1, macF1, and error scores for SVM, KNN, LSF, NNet, NB, and the proposed Probase NB)
integrating NLU into PCs. The addition of unannotated data is easily handled by
Probase NB. The heap memory limitation remains an obstacle.
References
1. Prasath, R., Kumar, V., & Sarkar, S. (2015). Assisting web document retrieval with topic identification in tourism domain. In Web intelligence (Vol. 13, No. 1, pp. 31–41). IOS Press. https://doi.org/10.3233/web-150308
2. Kuzi, S., Shtok, A., & Kurland, O. (2016). Query expansion using word embeddings. In Proceedings of the 25th ACM international conference on information and knowledge management—CIKM ’16. https://doi.org/10.1145/2983323.2983876
3. Seok, M., Song, H.-J., Park, C., Kim, J.-D., & Kim, Y.-S. (2016). Named entity recognition using word embedding as a feature. International Journal of Software Engineering and Its Applications, 10.
4. Agrawal, R., Gollapudi, S., Kannan, A., & Kenthapadi, K. (2014). Similarity search using concept graphs. In Proceedings of the 23rd ACM international conference on information and knowledge management—CIKM ’14. https://doi.org/10.1145/2661829.2661995
5. Andrzejewski, D., & Buttler, D. (2011). Latent topic feedback for information retrieval. In Proceedings of the 17th ACM SIGKDD international conference on knowledge discovery and data mining—KDD ’11. https://doi.org/10.1145/2020408.2020503
6. Ganguly, D., Roy, D., Mitra, M., & Jones, G. J. F. (2015). Word embedding based generalized language model for information retrieval. In Proceedings of the 38th international ACM SIGIR conference on research and development in information retrieval—SIGIR ’15. https://doi.org/10.1145/2766462.2767780
7. Potthast, M., Hagen, M., Stein, B., Graßegger, J., Michel, M., Tippmann, M., & Welsch, C. (2012). ChatNoir: A search engine for the ClueWeb09 corpus. In Proceedings of the 35th international ACM SIGIR conference on research and development in information retrieval—SIGIR ’12. https://doi.org/10.1145/2348283.2348429
8. Zuccon, G., Koopman, B., Bruza, P., & Azzopardi, L. (2015). Integrating and evaluating neural word embeddings in information retrieval. In Proceedings of the 20th Australasian document computing symposium—ADCS ’15. https://doi.org/10.1145/2838931.2838936
9. Wang, Z., Zhao, K., Meng, X., & Wen, J.-R. Query understanding through knowledge-based conceptualization. In Proceedings of the twenty-fourth international joint conference on artificial intelligence.
10. Ordonez-Salinas, S., & Gelbukh, A. (2010). Information retrieval with a simplified conceptual graph-like representation. In G. Sidorov, A. Hernandez Aguirre, & C. A. Reyes Garcia (Eds.), Advances in artificial intelligence, lecture notes in computer science (Vol. 6437). Springer.
11. Koopman, B., Zuccon, G., Bruza, P., Sitbon, L., & Lawley, M. (2015). Information retrieval as semantic inference: A graph inference model applied to medical search. Information Retrieval Journal, 19(1–2), 6–37. https://doi.org/10.1007/s10791-015-9268-9
12. Karthik, S. A., & Manjunath, S. S. (2020). Microarray spot partitioning by autonomously organizing maps through contour model. International Journal of Electrical and Computer Engineering (IJECE).
13. Karthik, S. A. (2019). A systematic examination of microarray segmentation algorithms. International Journal of Innovative Technology and Exploring Engineering (IJITEE).
14. Karthik, S. A. (2018). An enhanced approach for spot segmentation of microarray images. In International conference on computational intelligence and data science (ICCIDS 2018). Elsevier.
Chapter 37
Baggage Recognition and Collection
at Airports
Aviral Pulast and S. Asha
Abstract This paper proposes a solution to the baggage recognition and collection
process at airports. The security of passengers' luggage becomes a major question
at airports during deplaning and while standing near the belts to collect the luggage. As
nobody likes to stand and check each bag that is similar or identical to theirs, this
paper works on finding a solution to this problem. In this paper, the exact real-life
scenarios are examined, a comparison between the conventional and the proposed
methods is made, the advantages of the new method are outlined, and a proper
implementation is carried out with promising performance.
37.1 Introduction
The Internet of Things has changed the way we look at the world around us: from
manual operation of a process, to automated working, and then to wireless, seamless,
and effortless processing of information and data. These days, solutions can be found
to almost any problem we face by using IoT. For example, consider our houses: IoT
plays a major role here, making everything in our home automatic; we call this a smart
home automation system. We can see that various big multinational companies are
coming forward and showing their interest in developing new technologies to make our
lives easier and smoother. In our daily lives, we face many problems, be it in our city,
our house, or our neighborhood, which we would like to remove, or, if not remove,
then find an easy solution to.
A. Pulast ·S. Asha (B)
School of Computer Science and Engineering, Vellore Institute of Technology, Chennai 600127,
India
e-mail: asha.s@vit.ac.in
A. Pulast
e-mail: aviral.pulast2019@vitstudent.ac.in
S. Asha
Centre for Cyber Physical Systems, Vellore Institute of Technology, Chennai 600127, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
B. N. K. Rao et al. (eds.), Intelligent Computing and Applications, Smart Innovation,
Systems and Technologies 315, https://doi.org/10.1007/978-981-19-4162-7_37
When we talk about the Internet of Things and consider communication, we talk
about various sensors, Bluetooth devices, and newer technologies such as Zigbee,
among many more. The one thing common to all of these is the use of the Internet,
and as an extension we have sensors that help us implement and enjoy IoT
services. We also require cloud computing to manage our data with proper
safety and security; there are many providers, such as Google, Microsoft, and Amazon,
with highly safe and secure databases to store our customers' information.

One such problem we generally face is at airports: recognizing our baggage
when there may be many other identical bags or belongings, which creates the
possibility of picking up the wrong ones. The Internet of Things and wireless
sensor networks make it easier to implement a solution to the above-mentioned
problem using various modules, sensors, the Internet, and calling services. This paper
is organized into sections explaining the need, the proposal, and the implementation for
this major problem. The advantages and disadvantages of the new and previous
methods, respectively, are also discussed in this paper.
37.2 Methodology
In this section, the procedure or workflow is explained with proper operational
steps and the reasons behind them. This section also compares the process followed
to date with the proposed process and all the new technologies implemented.
37.2.1 Conventional Method
At airports, we usually see that several steps must be followed before we actually
get our baggage whenever we deboard an aircraft. The process contains two to three
steps before our bags reach us. Figure 37.1 depicts the conventional procedure of
baggage collection at airports.

In the conventional method, there is minimal or no use of IoT devices, and even
where there is usage, it is either at the end or at the start. Thus, there is no provision
for crowd management or for the effortless functioning of the pickup belt. This method
also does not ensure the proper security and safety of your baggage: your bag might
get exchanged with another person's if the bags are identical, which is precisely why
we present a solution in this paper.
Fig. 37.1 Conventional method
37.2.2 Proposed Solution
The above-discussed method is lacking at many points, so to optimize it, a new
method is proposed in which IoT devices are used, making the process easier. The
process is shown in Fig. 37.2 and includes the following steps.

The above steps give an outline of the solution. It can be noticed from Fig. 37.2
that two extra steps are included: RFID scans, which check your baggage and display
the output on a screen to alert the passenger about their luggage.

It is not just about scanning RFID tags; a GSM module is also used, which
transfers the scanned information to the cloud, after which an intimation is sent
to the passenger's phone number as an SMS, making the process completely
smooth and easy. A detailed explanation of the inner processes is given in the
following sections.
Fig. 37.2 Proposed method
37.3 Implementation
This section provides the implementation of the idea in both schematic and circuit
diagrams, which helps in understanding the functioning and the procedure of
the newly proposed method.
37.3.1 Schematic Diagram
Figure 37.3 gives the detailed schematic diagram of the proposed method. The
detailed steps of the entire implementation procedure are explained in the following
sections.
Fig. 37.3 Detailed schematic diagram of the proposed solution
37.3.1.1 1st RFID Scan
The first RFID scan is done to divide the luggage into slots (10-20 bags each). This
step is essential, as there can be hundreds of passengers on a flight; after the luggage
is deplaned, it is divided into slots to maintain smooth functioning, since it is easier
to handle ten bags at a time rather than a hundred of them all together. After
scanning, if the scan result is true, the bags are passed further along the belt;
otherwise, they are held back and sent to the last slot or to a different category
slot (unidentified).
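The routing logic of this first scan can be sketched as follows. The slot size and the registry of known tag IDs are invented for illustration, and the real system runs on an Arduino rather than in Python:

```python
SLOT_SIZE = 10   # assumed bags per slot (the text suggests 10-20)

def route_bags(scanned_ids, known_ids):
    """Split valid bags into slots of SLOT_SIZE; unknown tags go to 'unidentified'."""
    valid = [t for t in scanned_ids if t in known_ids]
    slots = [valid[i:i + SLOT_SIZE] for i in range(0, len(valid), SLOT_SIZE)]
    unidentified = [t for t in scanned_ids if t not in known_ids]
    return slots, unidentified

# Hypothetical manifest of registered 12-character tag IDs.
known = {f"AB{n:09d}A" for n in range(25)}
scanned = [f"AB{n:09d}A" for n in range(12)] + ["CD123456789D"]
slots, rejected = route_bags(scanned, known)
print(len(slots), rejected)   # 2 slots (10 + 2 bags); the unknown tag is held back
```

Capping each slot keeps the number of bags on the belt, and the passengers called to it, small at any one time.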
37.3.1.2 GSM Module
In continuation of the previous step, if the RFID tags match and the luggage
proceeds to the belt, the data associated with the luggage is sent to the Internet,
and through that, an SMS notification is sent to the passenger's phone.
This alerts the passengers about their baggage, and all notified passengers may
come and stand near the belt to collect it.
37.3.1.3 2nd RFID Scan
When the baggage is on the belt and ready for pickup, another RFID scan is
done, ensuring that the scanned bag belongs to a single, unique passenger. This scan
eliminates the confusion caused by multiple identical bags: since each bag is given a
separate RFID tag, that tag carries unique information associated with the bag. As a
result, when the tag is scanned and matches with the RFID reader, the information is
displayed on the screen and the data is sent to the database, from where another SMS
is sent to the passenger asking them to come and collect their bag from the belt.
37.3.2 Circuit Implementation
Figure 37.4 shows the circuit diagram of the implementation. All of the implemen-
tation is done in Proteus 8 Professional software. The following sections discuss the
apparatus used in this proposed method along with their applications.
37.3.2.1 Arduino Uno
Fig. 37.4 Circuit implementation of the idea [1]

This is the microcontroller for this project, and all the functions are performed
by this controller only. The Arduino is backed by project-specific code, and the
whole project works on that. RFID works according to the character array associated
with each scanner tag: a 12-character array acts as the unique ID of an RFID tag.
This ID is read by the receiver; if the ID matches, the bag is passed further,
otherwise it is not.
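The matching step the Arduino performs can be mimicked in a few lines. The 12-character IDs follow the format shown in the results section, but the registry and the phone-number mapping are illustrative assumptions, and the real logic runs as Arduino C code:

```python
# Hypothetical registry: tag ID -> passenger phone number (both made up).
REGISTERED = {"AB123456789A": "+91-9000000000"}

def check_tag(tag_id):
    """Return (valid, message) for a scanned 12-character RFID tag ID."""
    if len(tag_id) != 12:
        return False, f"{tag_id} MALFORMED TAG"
    if tag_id in REGISTERED:
        # In the real circuit this lights the LED and triggers the GSM SMS.
        return True, f"{tag_id} VALID TAG"
    return False, f"{tag_id} INVALID TAG"

print(check_tag("AB123456789A"))   # (True, 'AB123456789A VALID TAG')
print(check_tag("CD123456789D"))   # (False, 'CD123456789D INVALID TAG')
```

The two messages mirror the simulated outputs shown later in Figs. 37.5 and 37.6.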
37.3.2.2 RFID Tags
RFID technology consists of two parts: one is the sender and the other is the receiver. In the
circuit diagram in Fig. 37.4, two devices are marked with circles. The device
marked with a green circle is the sender, i.e., the RFID tag which is placed on each
bag, while the red-marked RFID device is the receiver: whenever a sender carrying a
unique array is scanned, the receiver checks that ID and returns a Boolean value
(true or false). Further working of the proposed method is explained in the results
section.
37.3.2.3 LED
Here, the LED is lit when the RFID matches correctly, indicating the correct
bag and the correct ID. When the LED is lit, it means the data associated with the
bag has been sent to the database, from where the message will be sent to the passenger.
Since there is no provision to show the message transfer on mobile phones in the
simulation, this LED shows the acknowledgement. The simulated output for the given
circuit is presented in the following sections.
37.4 Result and Discussion
After proceeding with the implementation, we simulated the circuit and got the
desired outputs. We intentionally hard-coded the expected ID to show a
match and, on the other hand, gave a false ID to show a mismatch.

In Fig. 37.5 given below, the output screen shows two terminal-like windows,
where the upper window is the sender and the lower one is the output window. In the
upper window, we entered the correct ID "AB123456789A" (also given in the code);
hence, we get the output "AB123456789A VALID TAG". At the airport, this case
would mean that our bag has been identified and the passenger is getting an
intimation about their luggage. In Fig. 37.6, another output, "CD123456789D
INVALID TAG", is seen. This means that this time we entered the tag/ID
"CD123456789D", which is wrong because the code specifies "AB123456789A"
as the correct ID.
Fig. 37.5 Output (valid tag)
Fig. 37.6 Output (invalid tag)
Hence, we can see that this implementation works correctly in every possible
case: for a wrong ID, we get the expected (invalid) output, and for a correct ID,
we get a valid output.
37.5 Conclusion
In this era of the 4th industrial revolution (Industry 4.0), with the aid of the Internet of
Things (IoT), things have become a lot easier when it comes to remote data gathering
and processing with smart sensors and cheaper cloud data storage. The Internet of
Things is growing at an exponential rate: wherever we look and wherever we go, be it
malls or universities, we see IoT applications. In colleges and universities, we find IoT
applications in ID cards; for attendance and for entry and exit, students scan their
ID cards to register their attendance. Similarly, in smart city concepts, smart toll
booths and smart dustbins make things smoother for the government and for the
residents as well. Taking inspiration from various IoT usages and implementations,
we conducted a study to proceed with this project.

In this paper, we have studied and evaluated different modules and application
programming interfaces (APIs) to simulate and solve a real-world problem: ensuring
proper identification of baggage with proper security and eliminating the gathering
and cramming of passengers in long queues at the airport, so as to control the flow
of work. This can be further improved with the help of more resources, better data
queueing and analysis algorithms, and more cloud-based processing that can handle
larger amounts of data for industrial usage. We have discussed our approach to this
idea and its implementation above. Our idea follows a minimalistic and simplistic
approach, making it easier to implement in the real world and also cost-efficient,
because the approach keeps the use of sensors low, hence achieving efficient results
with low inputs. We are sure to see more future applications and future investments
in the field of IoT.
References
1. RFID security and privacy: A research survey. https://ieeexplore.ieee.org/abstract/document/1589116
2. An introduction to RFID technology. https://ieeexplore.ieee.org/abstract/document/1593568
3. RFID technologies: Supply-chain applications and implementation issues. https://www.tandfonline.com/doi/abs/10.1201/1078/44912.22.1.20051201/85739.7?journalCode=uism20
4. RFID systems and security and privacy implications. https://link.springer.com/chapter/10.1007/3-540-36400-5_33
5. Mapping and localization with RFID technology. https://ieeexplore.ieee.org/abstract/document/1307283
6. Smart sensors: Analysis of different types of IoT sensors. https://ieeexplore.ieee.org/abstract/document/8862778
7. Analysis of three IoT-based wireless sensors for environmental monitoring. https://ieeexplore.ieee.org/abstract/document/7887698
8. General application research on GSM module. https://ieeexplore.ieee.org/abstract/document/6063315
9. Anti-theft system based on GSM and GPS module. https://ieeexplore.ieee.org/abstract/document/6376520
10. Implementation of wireless sensor network (WSN) on garbage transport warning information system using GSM module. https://iopscience.iop.org/article/10.1088/1742-6596/1175/1/012054/meta
11. Attendance generating system using RFID and GSM. https://ieeexplore.ieee.org/abstract/document/7494157
12. Design of smart home controlling system based on GSM SMS. https://en.cnki.com.cn/Article_en/CJFDTotal-DZCL201306030.htm
13. Microcontroller based digital meter with alert system using GSM. https://ieeexplore.ieee.org/abstract/document/7856033
14. Microcontroller based attendance system using RFID and GSM. https://www.ijeter.everscience.org/Manuscripts/Volume-5/Issue-8/Vol-5-issue-8-M-21.pdf
Chapter 38
Computer Vision Technique to Detect
Accidents
A. Jafflet Trinishia and S. Asha
Abstract Detecting accidents in smart cities is a challenging task in day-to-day life.
It is hard for the traffic police to control, as the police cannot be available 24 × 7.
Due to this, many accidents pass by unnoticed, and many humans lose their lives due
to the lack of first-aid support from a hospital. It takes at least 5 min to pass the
accident information to the hospital; hence, to overcome this problem, we have used a
computer vision technique to identify an accident at a specific location, and messages
are passed automatically to the nearby hospital. When an accident is detected, the
local hospital and patrol are intimated by Gmail, or else by SMS through SMTP, to
take the necessary action. Using deep learning techniques, we are able to achieve a
promising solution to this problem.
38.1 Introduction
Traffic accidents can occasionally lead to unexpected damage to society, injuries to
humans and the environment, and even fatalities. According to NHTSA yearly reports
on traffic since 1988, probably more than 5,000,000 car crashes happen in the
United States each year, and around 30% of them result in death or injuries. After several
studies and multiple discussions, it is widely believed that reductions in accidents
can be achieved via high-quality accident detection methods and corresponding
response strategies. As we are dealing with traffic incidents and the accidents
corresponding to them, the traffic management result should be accurate; speedy
detection of traffic accidents is vital, and it should be done within a minimal amount
of response time. Traditional techniques normally provide the correct location and
time of an incident, but common detection methods relying solely on traffic data
nevertheless face certain challenges.
A. J. Trinishia
School of Computer Science and Engineering, VIT University, Chennai, India
e-mail: jafflettrinishia.a2020@vitstudent.ac.in
S. Asha (B)
Centre for Cyber Physical Systems, VIT University, Chennai, India
e-mail: asha.s@vit.ac.in
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
B. N. K. Rao et al. (eds.), Intelligent Computing and Applications, Smart Innovation,
Systems and Technologies 315, https://doi.org/10.1007/978-981-19-4162-7_38
Initially, most researchers made use of field information to observe traffic accidents,
built on the implicit assumption that the records are reliable, despite the fact that
detector failures and communication errors are persistent difficulties in traffic
operations. The issue of failed sensors can cause even more difficulties in accident
detection over large districts. The uncertain nature of traffic patterns and non-recurrent
events can likewise undermine the usefulness of traffic measurements in explaining the
accidents. Besides auto-collisions, daily traffic operations may also suffer breakdowns
caused by unusual factors such as processions, road construction, running races, and
so forth. Hence, measurements such as traffic flow and occupancy inherently act as
indirect evidence for auto-collisions rather than direct proof. The existing research
has comparatively low recall and higher localization error compared with Faster
R-CNN. It also struggles to find close objects, because each grid cell can propose
only two bounding boxes, and the existing algorithms struggle to detect small objects.
This research addresses these difficulties: the Faster R-CNN procedure is utilized to
detect accidents through computer vision, which is a part of deep learning. This paper
is organized as follows: Sect. 38.1 gives an introduction to the automatic detection
of road accidents. Section 38.2 discusses the existing systems. Section 38.3 gives the
details of the proposed system. Section 38.4 shows the implementation and results.
Section 38.5 gives the conclusion of this research.
38.2 Existing System
An adaptive traffic light control algorithm that alters the sequence and length of traffic
lights in real time based on the detected traffic is discussed in [1]. The intended result
is obtained using an automatic car accident detection method based on cooperative
vehicle infrastructure systems (CVIS) and machine vision in [2]. It gives 90.02%
average precision (AP). In [3], a feedback control method is proposed to optimize
signal timing based on delays estimated through re-identification technology. Nellore
and Hancke [4] present a taxonomy of several traffic control techniques for preventing
congestion. Traffic optimization with an IoT platform, which allocates green time to
each traffic signal based on the number of vehicles on the road, is achieved in [5]. In
article [6], an image processing technique counts the number of cars using a camera
that captures the road traffic; if the number of cars exceeds a traffic threshold, a
warning signal is invoked automatically. In paper [7], by contrast, the traffic is
measured based on the density of the vehicles within a particular longitude and
latitude. IR sensors are used to evaluate the traffic density
in [8]. The road traffic density is dynamic. This is addressed in paper [9], with the
help of RFID for vehicle-to-vehicle communication to avoid congestion. Shaaban
et al. [10] proposed an evaluation strategy of the existing traffic control mechanisms.
The use of a clustering algorithm on VANETs, based on thresholds for selecting
green light timing at crossroads in a real-time environment, is experimented with in [11].
A survey on various sensors, their strengths, and their weaknesses is discussed in [12].
A vehicle-logo location recognition system is discussed in [13]. Deep learning models
are used for an object detection and tracking system (ODTS) in [14].
All the existing models have both advantages and disadvantages, but traffic accidents
are not addressed. The proposed method uses a computer vision technique to detect
road accidents and alert responders.
38.3 Proposed System
A traditional traffic monitoring device is designed only to observe or control traffic;
it does not offer any solution to reduce the number of fatal injuries that occur due
to lack of timely medical aid. Consider a situation where an accident takes place but
no one is there to report it: the victim is critical and every second counts, and any
delay can result in disability or death. We cannot eliminate accidents completely,
but we can improve by providing post-accident care just in time. There are many
sensor-based systems available in the market as well, but they require vehicle owners
to install sensors in their vehicles. These systems work by sensing an impact through
the installed sensors; the sensor signals then trigger a unit that alerts nearby medical
help or an emergency contact number. But what if the accident involves a vehicle
that is not equipped with such a sensor-based system? For example, consider Fig. 38.1,
which shows an accident scene on a highway. Such accidents cannot be addressed
immediately due to the lack of technology. We need an advanced artificial
intelligence-based surveillance system which can not only detect the occurrence of
an accident but also alert nearby hospitals, ambulances, or traffic police in real time.
The proposed architecture affords a robust procedure to achieve a high detection rate
and a low false alarm rate on common street traffic CCTV surveillance footage. This
framework was evaluated under various conditions such as bright sunlight, low
visibility, rain, hail, and snow using the proposed dataset. The framework was found
effective and contributes to the development of general-purpose real-time vehicular
accident detection algorithms. When an accident is detected, the local hospital and
patrol are notified via Gmail and SMS through SMTP. The severity of the accident is
also analyzed to determine whether it is major or minor, and the SMTP protocol plays
the role of alerting the hospital. The proposed method focuses on vehicle detection,
accident identification, accident classification into major and minor types, and
sending an alert message to the nearest control centre. The architecture of the
proposed model is given in Fig. 38.2. It begins by acquiring the input from the CCTV camera,
Fig. 38.1 Real-time road accident scene
processes the data, and sends an alert to the nearest hospital about the accident
within a few seconds.
(a) Vehicle Detection
Fig. 38.2 Architecture of the proposed model
Vehicle detection and tracking are necessary in self-driving technologies to drive
the vehicle safely. They can be carried out through the following tasks: perform
histogram of oriented gradients (HOG) feature extraction on a labeled training set
of images and train a linear SVM classifier; implement a sliding window approach
and use the trained classifier to search for vehicles in images; run the pipeline on
a video stream and create a heat map of recurring detections frame by frame to
reject outliers and follow detected vehicles; and estimate a bounding box for each
detected vehicle. Vehicle and non-vehicle images are loaded as NumPy arrays into
separate lists using a helper function. A variety of feature extraction strategies have
been used to train the classifier to detect the vehicles efficiently.
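The HOG-plus-linear-SVM pipeline described above can be sketched as follows. This is a minimal illustration using scikit-image and scikit-learn on synthetic 64 × 64 patches standing in for the labeled vehicle/non-vehicle training images, which are not available here; the function name and data shapes are assumptions, not the chapter's actual code.

```python
import numpy as np
from skimage.feature import hog
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)

def extract_hog(image):
    # HOG feature extraction on a single-channel 64x64 patch
    return hog(image, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2), block_norm="L2-Hys")

# Synthetic stand-ins for vehicle / non-vehicle patches
# (real data would be cropped CCTV frames)
vehicles = [rng.random((64, 64)) for _ in range(20)]
non_vehicles = [rng.random((64, 64)) for _ in range(20)]

X = np.array([extract_hog(img) for img in vehicles + non_vehicles])
y = np.array([1] * 20 + [0] * 20)          # 1 = vehicle, 0 = non-vehicle

clf = LinearSVC(C=1.0).fit(X, y)           # linear SVM on the HOG vectors
```

A sliding window would then call `extract_hog` on each candidate patch and score it with `clf.decision_function` to build the heat map.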
(b) Accident Identification
Recently, automated incident detection has attracted much interest in highway
control systems to minimize traffic delay and to strengthen road safety, capacity,
and real-time traffic control, because when freeway and arterial incidents occur they
cause congestion and mobility loss; if they are not cleared immediately, they can
cause secondary traffic accidents.
(c) Accident Classification Major or Minor
The main purpose of this stage is to analyze road accidents and determine the
severity of an accident by applying advanced machine learning techniques. Many
well-developed machine learning strategies exist for studying this domain. Traffic
accident analysis is carried out using four advanced and widely used supervised
learning techniques, chosen for their proven accuracy in this sector: decision tree,
K-nearest neighbors (KNN), Naïve Bayes, and adaptive boosting (AdaBoost).
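A minimal sketch of comparing the four severity classifiers with scikit-learn. The features and labels below are synthetic stand-ins (the Czech police accident dataset mentioned later in the chapter is not available here), so the numbers they produce are illustrative only.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
# Hypothetical severity features (e.g. relative speed, impact type, occupancy);
# labels: 0 = minor, 1 = major
X = rng.random((200, 4))
y = (X[:, 0] + 0.5 * X[:, 1] > 0.8).astype(int)   # toy rule standing in for real labels

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=42)

models = {
    "decision tree": DecisionTreeClassifier(max_depth=4),
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "naive Bayes": GaussianNB(),
    "AdaBoost": AdaBoostClassifier(n_estimators=50),
}
# Fit each model and record its held-out accuracy
scores = {name: m.fit(X_tr, y_tr).score(X_te, y_te) for name, m in models.items()}
```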
(d) Alerting System
If a basic peer-to-peer email server were used to take on large bulk emailing jobs,
delivery rates would suffer; it would inevitably clog the bandwidth and probably
delay sending and receiving peer-to-peer emails. SMTP is therefore used to send an
alert mail to the mail id of the hospital nearest to the accident location.
(e) SMTP Gateway
Python provides the smtplib module, which defines an SMTP client session object
that can be used to send mail to any Internet machine running an SMTP or ESMTP
listener daemon; the host parameter specifies the machine running the SMTP server.
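A hedged sketch of the alert gateway using Python's standard smtplib and email modules. All addresses, the host, and the credentials below are placeholders; `send_alert` is defined but not invoked here, since it needs live SMTP credentials (for Gmail, an app password over STARTTLS).

```python
import smtplib
from email.message import EmailMessage

def build_alert(severity, location, hospital_addr):
    """Compose the accident alert mail (severity: 'major' or 'minor')."""
    msg = EmailMessage()
    msg["Subject"] = f"ACCIDENT ALERT ({severity}) at {location}"
    msg["From"] = "alerts@example.com"            # placeholder sender
    msg["To"] = hospital_addr
    msg.set_content(
        f"A {severity} accident was detected at {location}. "
        "Please dispatch an ambulance immediately."
    )
    return msg

def send_alert(msg, host="smtp.gmail.com", port=587, user=None, password=None):
    # Gmail's SMTP gateway requires STARTTLS and authentication
    with smtplib.SMTP(host, port) as server:
        server.starttls()
        server.login(user, password)
        server.send_message(msg)

alert = build_alert("major", "NH-48 km 212", "er@cityhospital.example")
```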
Figure 38.3 gives the block diagram of the proposed system. It decides whether
the accident is minor or major, and based on it, alert messages will be communicated.
Fig. 38.3 Block diagram
It uses faster R-CNN to achieve the expected result. The following sections give
the details of faster R-CNN and the RPN.
38.3.1 Faster R-CNN
Our object detection framework, referred to as faster R-CNN, is composed of two
modules. The first module is a deep fully convolutional network that proposes
regions, and the second module is the fast R-CNN detector that uses the proposed
regions. The entire framework is a single, unified network for object detection.
Using the now-popular terminology of neural networks with 'attention' mechanisms,
the RPN module tells the fast R-CNN module where to look. We present the design
and properties of the network for region proposal, and describe algorithms for
training both modules with shared features.
38.3.2 Region Proposal Networks
A region proposal network (RPN) takes an image of any size as input and outputs a
set of rectangular object proposals, each with an objectness score. This process is
modeled with a fully convolutional network, as described in this section. Because
our goal is to share computation with a fast R-CNN object detection network, we
assume both nets share a common set of convolutional layers. To generate region
proposals, we slide a small network over the convolutional feature map output by
the last shared convolutional layer. This small network takes as input an n × n
spatial window of the input convolutional feature map. Each sliding window is
mapped to a lower-dimensional feature (256-d for ZF and 512-d for VGG, with
ReLU following). This feature is fed into two sibling fully connected layers: a
box-regression layer (reg) and a box-classification layer (cls). We use n = 3 in this
study, noting that the effective receptive field on the input image is large (171 and
228 pixels for ZF and VGG, respectively). The mini-network operates in a
sliding-window fashion, with the fully connected layers shared across all spatial
locations. This architecture is naturally implemented with an n × n convolutional
layer followed by two sibling 1 × 1 convolutional layers for reg and cls,
respectively (Fig. 38.4).
Table 38.1 compares the fast R-CNN and faster R-CNN algorithms, their features,
and their limitations. Table 38.2 gives the efficiency in terms of time. Based on
these, it is understood that faster R-CNN is recommended and can be applied.
38.4 System Implementation and Results
Even after filtering by thresholding on the class scores, we still wind up with many
overlapping boxes, so non-maximum suppression (NMS) is applied as a second
filtering stage to get rid of them. NMS is a computer vision approach used in a
variety of tasks: a set of methods for selecting a single box per object. Figure 38.5
depicts how NMS is used to choose a single entity.
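Greedy NMS as described above can be implemented in a few lines of NumPy. This is an illustrative version, not the chapter's code: it repeatedly keeps the highest-scoring box and discards any remaining box whose IoU with it exceeds the threshold.

```python
import numpy as np

def nms(boxes, scores, iou_threshold=0.5):
    """Greedy non-maximum suppression; boxes are (x1, y1, x2, y2)."""
    order = scores.argsort()[::-1]          # highest score first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        # IoU of the top box with all remaining boxes
        xx1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        yy1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
        xx2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
        yy2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
        inter = np.maximum(0, xx2 - xx1) * np.maximum(0, yy2 - yy1)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        areas = (boxes[order[1:], 2] - boxes[order[1:], 0]) * \
                (boxes[order[1:], 3] - boxes[order[1:], 1])
        iou = inter / (area_i + areas - inter)
        # Drop boxes that overlap the kept box too much
        order = order[1:][iou <= iou_threshold]
    return keep

boxes = np.array([[10, 10, 50, 50], [12, 12, 52, 52], [100, 100, 150, 150]], float)
scores = np.array([0.9, 0.8, 0.7])
```

Here the second box overlaps the first heavily (IoU ≈ 0.82) and is suppressed, while the third, disjoint box survives.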
Fig. 38.4 Region proposal network. Source Chengjun Xie (2018) [15]
Table 38.1 Comparison of the fast R-CNN and faster R-CNN algorithms

Algorithm: Fast R-CNN
Features: The CNN receives each image only once, and feature maps are extracted. On these maps, selective search is employed to create predictions. All three R-CNN stages are combined in this model.
Prediction time/image: 2 s
Limitations: Because selective search is slow, computation time remains high.

Algorithm: Faster R-CNN
Features: Replaces the selective search method with a region proposal network, resulting in a substantially faster process.
Prediction time/image: 0.2 s
Limitations: Object proposal takes time, and because there are multiple systems working in parallel, each system's performance is influenced by the prior system's performance.
Table 38.2 Efficiency of faster R-CNN

Comparison                          R-CNN    Fast R-CNN
Time taken to test per image (s)    50       2
Speed-up                            1×       25×
mAP                                 66.0     66.9
Fig. 38.5 Non-maximum suppression. Source Jatin Prakash (June 2, 2021)
38.4.1 Anchors
At each sliding-window location, we simultaneously predict multiple region
proposals, where the maximum number of possible proposals for each location is
denoted as k. The reg layer therefore has 4k outputs encoding the coordinates of the
k boxes, and the cls layer outputs 2k scores that estimate the probability of object
or not object for each proposal. The k proposals are parameterized relative to k
reference boxes, which we call anchors. An anchor is centred at the sliding window
in question and is associated with a scale and an aspect ratio. By default, we use
three scales and three aspect ratios, yielding k = 9 anchors at each sliding position.
For a convolutional feature map of size W × H (typically about 2,400 positions),
there are W H k anchors in total.
38.4.1.1 Anchor Boxes
In faster R-CNN, anchor boxes are one of the most significant concepts. They
provide a predetermined collection of bounding boxes of various sizes and ratios
that are used as a reference when the RPN first predicts object positions. These
boxes are usually chosen based on object sizes in the training dataset, to capture
the scale and aspect ratio of the specific object classes to be detected, and are
placed at the centre of the sliding window.
The original approach employs three scales and three aspect ratios, resulting in
k = 9. If the final feature map from the feature extraction layer has width W and
height H, the total number of anchors generated will be W * H * k.
The major reason for using anchor boxes is to analyze all object predictions at the
same time. They aid the speed and efficiency of the detection portion of a deep
learning framework. Anchor boxes also make it easier to recognize many items,
objects of varying scales, and overlapping objects without having to scan an image
with a sliding window that makes a separate prediction for each possible position.
This allows for real-time object detection.
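The W * H * k anchor layout can be sketched in NumPy as follows, assuming the usual faster R-CNN defaults (feature stride 16, scales 128/256/512, ratios 0.5/1/2); the function name and exact parameterization are illustrative, not the paper's implementation.

```python
import numpy as np

def generate_anchors(feat_w, feat_h, stride=16,
                     scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """All W*H*k anchors as (x1, y1, x2, y2) for a feat_w x feat_h feature map."""
    base = []
    for s in scales:
        for r in ratios:
            w = s * np.sqrt(r)              # width for this scale/aspect ratio
            h = s / np.sqrt(r)              # height keeps the area ~ scale^2
            base.append([-w / 2, -h / 2, w / 2, h / 2])
    base = np.array(base)                   # k = len(scales) * len(ratios) = 9

    # One anchor centre per feature-map cell, spaced by the network stride
    xs = (np.arange(feat_w) + 0.5) * stride
    ys = (np.arange(feat_h) + 0.5) * stride
    cx, cy = np.meshgrid(xs, ys)
    centres = np.stack([cx, cy, cx, cy], axis=-1).reshape(-1, 1, 4)
    return (centres + base).reshape(-1, 4)  # (W*H*k, 4)

anchors = generate_anchors(60, 40)          # 60 x 40 = 2,400 positions, 9 anchors each
```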
Object detection is a fascinating area of computer vision, and it moves to a whole
new level when we are dealing with video data: the complexity rises a notch, but so
do the rewards. We can perform highly valuable tasks such as surveillance, traffic
management, and crime fighting using object detection algorithms. In this vehicle
detection, candidate boxes that may contain a vehicle are generated by a sliding
window and scored by a support vector machine classifier to create a heat map.
The heat map records are then used to filter out false positives before vehicles are
identified by drawing a bounding box around them.
Two guiding principles were set for defining the classification parameters of an
accident that will truly describe its probability and severity: reduce the set of
parameters characterizing the accident, while keeping the valuable information
about accident severity. As input, the road accident dataset maintained by the
Czech road traffic police branch was used. The results, which are no longer covered
in this article, have yielded interesting information. One observation in particular:
a large amount of the records is extremely vague and only loosely describes the
technical background of an accident and its consequences. The accident data is
frequently based mainly on the legal penalties of an accident. However, every
vehicle equipped with a built-in safety device has information about its current
state, which cannot be stored in the dataset. A combination of the road accident
dataset assessment and a theoretical approach from the vehicle-capabilities
perspective produced the following parameters that can truly characterize the
probability of an accident and the extent of damage:
1. weight of the colliding objects;
2. contact cross-section of the colliding objects in the common direction of the collision;
3. number of subsequent collisions;
4. relative vehicle speed before the expected accident;
5. type of impact;
6. vehicle motion during the accident;
7. vehicle occupancy;
8. reaction space.
The above parameters can truly describe the probable severity of an accident. To
obtain information about the possibility of impact, the reaction-space parameter is
added. Figure 38.6 gives the details of the dataset used in this implementation.
This framework relies entirely on local aspects such as trajectory intersection, speed
estimation, and their anomalies. All the experiments carried out on this framework
validate the efficiency and effectiveness of the proposal and thereby confirm that
the system can deliver timely, valuable information to the concerned authorities.
The integration of multiple parameters to assess the likelihood of an accident
enhances the reliability of our framework. Since we focus on a precise region of
motion around the detected, faster vehicles, we can limit the accident events
considered.
Fig. 38.6 Accident image analysis dataset
38.5 Conclusion and Future Work
The proposed system is able to recognize accidents efficiently, with a 71% detection
rate and a 0.53% false alarm rate on accident footage captured under a range of
ambient conditions such as daylight, night, and snow. The experimental results are
encouraging and show the capability of the proposed system. Nevertheless, one of
the limitations of this work is its inapplicability to very high-density traffic, because
of errors in vehicle detection and tracking; this will be addressed in future work. In
addition, large obstacles blocking the field of view of the cameras may also affect
the tracking of vehicles and in turn the collision detection. Faster R-CNN is a
tweaked variant of fast R-CNN. Fast R-CNN uses selective search to locate the
regions of interest, which is a slow and time-consuming approach: detecting objects
takes about 2 s per image, which is substantially faster than R-CNN, but when
dealing with big real-world datasets even fast R-CNN appears to be slow. Faster
R-CNN replaces the selective search method with a region proposal network,
resulting in a substantially faster process.
References
1. Zhou, B., Cao, J., Zeng, X., & Wu, H. (2010, September). Adaptive traffic light control in
wireless sensor network-based intelligent transportation system. In 2010 IEEE 72nd Vehicular
Technology Conference-Fall (pp. 1–5). IEEE.
2. Tian, D., Zhang, C., Duan, X., & Wang, X. (2019). An automatic car accident detection method
based on cooperative vehicle infrastructure systems. IEEE Access, 7, 127453–127463.
3. Henry, X. L., Oh, J. S., Oh, S., Chau, L., & Recker, W. (2001). On-line traffic signal control
scheme with real-time delay estimation technology. In California partners for advanced transit
and highways (PATH). Working papers: Paper UCB-ITS-PWP-2001.
4. Nellore, K., & Hancke, G. P. (2016). A survey on urban traffic management system using
wireless sensor networks. Sensors, 16(2), 157.
5. Janahan, S. K., Veeramanickam, M. R. M., Arun, S., Narayanan, K., Anandan, R., & Parvez, S.
J. (2018). IoT based smart traffic signal monitoring system using vehicles counts. International
Journal of Engineering & Technology,7(2.21), 309–312.
6. Jadhav, P., Kelkar, P., Patil, K., & Thorat, S. (2016). Smart traffic control system using image
processing. International Research Journal of Engineering and Technology (IRJET), 3(3),
2395-0056.
7. Amaresh, A. M., Bhat, K. S., Ashwini, G., Bhagyashree, J., & Aishwarya, P. (2019, May).
Density based smart traffic control system for congregating traffic information. In 2019 Inter-
national Conference on Intelligent Computing and Control Systems (ICCS) (pp. 760–763).
IEEE.
8. Ghazal, B., ElKhatib, K., Chahine, K., & Kherfan, M. (2016, April). Smart traffic light
control system. In 2016 Third International Conference on Electrical, Electronics, Computer
Engineering and Their Applications (EECEA) (pp. 140–145). IEEE.
9. Atta, A., Abbas, S., Khan, M. A., Ahmed, G., & Farooq, U. (2020). An adaptive approach: Smart
traffic congestion control system. Journal of King Saud University-Computer and Information
Sciences, 32(9), 1012–1019.
10. Shaaban, K., Khan, M. A., Hamila, R., & Ghanim, M. (2019). A strategy for emergency
vehicle preemption and route selection. Arabian Journal for Science and Engineering, 44(10),
8905–8913.
11. Jindal, V., Bedi, P., Dhankani, H., & Garg, R. ATSOT: Adaptive traffic light system using mOtes
based on threshold.
12. Padmavathi, G., Shanmugapriya, D., & Kalaivani, M. (2010). A study on vehicle detection and
tracking using wireless sensor networks. Wireless Sensor Network, 2(02), 173.
13. Liu, Y., & Li, S. (2011, July). A vehicle-logo location approach based on edge detection and
projection. In Proceedings of 2011 IEEE International Conference on Vehicular Electronics
and Safety (pp. 165–168). IEEE.
14. Lee, K. B., & Shin, H. S. (2019, August). An application of a deep learning algorithm for auto-
matic detection of unexpected accidents under bad CCTV monitoring conditions in tunnels.
In 2019 International Conference on Deep Learning and Machine Learning in Emerging
Applications (Deep-ML) (pp. 7–11). IEEE.
15. Ren, S., He, K., Girshick, R., & Sun, J. (2015). Faster r-cnn: Towards real-time object detection
with region proposal networks. Advances in neural information processing systems, 28.
Chapter 39
An Efficient Machine Learning Approach
for Apple Leaf Disease Detection
K. R. Bhavya, S. Pravinth Raja, B. Sunil Kumar, S. A. Karthik,
and Subhash Chavadaki
Abstract Apples are one of the most popular agricultural products. Despite being
one of the most widely grown commodities, apple demand is on the rise. As a result,
this crop, which was formerly only grown in temperate climates, is now being grown
in tropical climates. Pest and disease infestations are a major issue that affects apple
output each year. In this paper, an approach has been made which combines machine
learning and image processing concepts to identify diseases from infected apple
leaves. This method effectively differentiates between diseased and non-diseased
apple leaves. Pre-processing of the image is done using grab cut segmentation which
is the primary stage in the disease identification process. The infected type from the
original leaf image is recognized by 96% using the segmentation of the diseased
portion, and multiclass SVM detects the infected type from 500 images using the
feature extraction.
39.1 Introduction
One of the most widely produced crops is the apple. According to the Food and
Agriculture Organization, the world produced 125,377 kilotons of fresh apples in
2019. Apples help protect against a wide range of ailments, including diseases such
as cancer and diabetes. They are high in vitamins and fibre, which are good for
health and help maintain the health of our bodies and brains, as well as the
development of a robust immune system. Apple cultivation makes a substantial
contribution to the country's economy.
K. R. Bhavya (B)·S. Pravinth Raja
Department of CSE, Presidency University, Bengaluru, India
e-mail: bhavyacs08@gmail.com
B. Sunil Kumar
Department of CSE, Annamacharya Institute of Technology and Science, Tirupati, India
S. A. Karthik
Department of C&IT, REVA University, Bengaluru, India
S. Chavadaki
Department of Mechanical Engineering, GITAM University, Visakhapatnam, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
B. N. K. Rao et al. (eds.), Intelligent Computing and Applications, Smart Innovation,
Systems and Technologies 315, https://doi.org/10.1007/978-981-19-4162-7_39
Apple production is impacted by diseases such as cedar apple rust and black rot.
As a result, many farmers are facing significant losses, and most of them are having
financial problems because of their inability to repay their loans on time. Early
discovery of these diseases can therefore help limit the number of apples lost. Most
of the time, individuals try to diagnose ailments by looking at them. Because apple
fruits and leaves come in a variety of forms and colours, people may occasionally
make the wrong choice; plant diseases themselves come in a variety of forms. As a
result, inaccurate results may be obtained, slowing down performance, and one may
not receive the required result on time [1]. Consequently, increased production
rates, plant monitoring by farmers, and automation are required [2, 3].
Most leaf defect detection techniques entail the use of expensive, high-performance
devices, and it is also difficult to achieve excellent precision. Furthermore, additional
human resources are needed to diagnose these disorders from leaves with the naked
eye. As a result, this study provides an approach based on current technological
advancements. The disease can cause a variety of symptoms in leaves, which can
be identified by applying image processing techniques.
The entire article is organized as follows: Sect. 39.2 briefly highlights related work
carried out in the domain of apple leaf disease detection. Section 39.3 describes the
detection of apple leaf disease using the multiclass SVM approach. Results analysis
to determine the effectiveness of the proposed methodology is presented in
Sect. 39.4. Finally, a conclusion is drawn based on the results obtained by the said
approach.
39.2 Related Work
Detecting diseases from plant leaves has been performed using a variety of
techniques. Padol and Yadav [4] used support vector machine classifiers to
efficiently identify categories of grape leaf diseases. The authors used nine colour
and texture features for the SVM after segmenting with K-means clustering, and
achieved an accuracy of 88.89% across 137 images. Gavhale et al. [5] utilized the
same method to diagnose anthracnose and canker disease, achieving 95% accuracy
with 200 and 100 images. Hossain et al. [6] utilized a support vector machine to
identify three varieties of leaf diseases, obtaining an accuracy of 90%; their
approach could automatically choose the optimal cluster. Using superior
pre-processing techniques and the GLCM matrix, the suggested methodology has
increased efficiency. On 30,000 images, Sladojevic et al. [7, 8] utilized a
convolutional neural network classifier to recognize 13 categories of damaged
leaves; their technique has a 96.3% accuracy rate in detecting diseases. From
87,848 images, their proposed approach can identify 58 diseases in 25 plants with
99.53% accuracy, and they employed fine-tuning to significantly improve overall
accuracy. Using 54,306 images as the data set, Mohanty et al. [9] identified 26
types of diseases amongst 14 different crop varieties with 99.35% accuracy. The
time required to classify them is longer than that required by the proposed method,
and the system's accuracy drops when the test data is somewhat changed, for
example when several leaves appear in one image or the leaves are light in colour.
Sannakki et al. [10] also used neural networks to investigate grape leaf diseases.
They employed K-means clustering to segment the data, and thresholding was
applied to remove the green pixels. Because they used the backpropagation
approach, they achieved greater accuracy, although their data set was restricted.
All of these systems use fewer than 10 leaf image characteristics, compared to the
methods described in Sect. 39.3. Furthermore, using only NN classifiers, it is
challenging to tune the neurons on a limited sample.
Artificial intelligence has become more popular in analysing physical features,
particularly in agriculture, where physical state or quality is critical [11, 12]. Fuzzy
logic, for example, was employed to divide coconut fruits into three phases of
maturity: malauhog, malakanin, and malakatad [13]. There is currently no standard
procedure in place for determining the maturity of coconuts [14]. The VGGNet16
model was determined to be the most accurate of the three models tested, with
accuracies of 95.75%, 95.9%, and 98.75%, respectively. To determine whether a
tomato leaf is healthy or infected with phoma rot, leaf miner, or target spot, a deep
convolutional neural network was developed [15]. Plant leaf disease recognition
accuracy was 91.67%, and 95.75% for the model using transfer learning.
The major goal of our research is to develop a system that can detect the quality of
an apple (Malus domestica) leaf after colour and texture feature extraction; the
model must be able to distinguish between a healthy and an infected leaf [16–18].
If the leaf is found to be infected, the system will classify it according to the disease
kind. The system will have to distinguish between three forms of apple leaf disease:
Venturia inaequalis, Botryosphaeria obtusa, and Gymnosporangium juniperi-virginianae.
39.3 Proposed Method
Our proposed method comprises five steps, considering parameters such as colour
and texture, with the training procedure driven by a multiclass SVM classifier. The
proposed methodology is illustrated in Fig. 39.1.
Fig. 39.1 Disease detection model (pipeline: apple leaf images from Kaggle → pre-processing of apple leaves → segmentation → texture and colour extraction of the segmented leaf → disease detection)
39.3.1 Data Acquisition
The detection and classification models were trained and tested using 500 single-apple (Malus domestica) leaf images from Kaggle: 200 images are labelled as healthy leaves, while the other 300 are labelled as infected. For the detection model, each image data set of healthy and infected leaves is randomly split into two parts: 80% for training the system and 20% for testing its accuracy. The infected-leaf data set was further divided into three subgroups, each reflecting one of three types of apple disease: apple scab, black rot, and cedar apple rust. Some images were taken in the laboratory, while others were taken outdoors, and all of the images were drawn from eight distinct apple species. Each sub-data set likewise had an allocation of 80% for model training and 20% for accuracy testing. The following classes are identified by our model:
Black rot (Botryosphaeria obtusa).
Cedar apple (Gymnosporangium juniperi-virginianae).
Apple scab (Venturia inaequalis).
Healthy apple leaves.
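The 80/20 split described above can be sketched as follows (a minimal illustration; image loading and file handling are omitted, and the label lists stand in for the actual image files):

```python
import random

def split_dataset(items, train_frac=0.8, seed=42):
    """Randomly split a list of samples into train/test subsets."""
    rng = random.Random(seed)
    shuffled = items[:]              # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_frac)
    return shuffled[:cut], shuffled[cut:]

# 200 healthy + 300 infected labels, as in the data set described above
labels = ["healthy"] * 200 + ["infected"] * 300
train, test = split_dataset(labels)
print(len(train), len(test))   # 400 train, 100 test
```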
39.3.2 Image Segmentation
Pixels in apple leaf were separated from the background pixels using graph cut
segmentation. CIE 1976 L*a*b* values were assigned to the raw RGB images. The ranges for L*a*b* are then configured as follows: 0–100 for L*, −86.1827 to 98.2343 for a*, and −107.8602 to 94.4780 for b*. As illustrated in Fig. 39.2,
scribbles were created in the foreground and background of the image, which was then transformed to the CIELab colour space using a procedure known as lazy snapping. After lazy snapping, the RGB vegetation region, where the apple leaf pixels are located, was created. An RGB vegetation surface is required for colour and texture extraction. In binary mode, the marked region now matches the apple leaf pixels. This method guarantees that the affected surfaces are extracted together with the healthy ones.
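In practice a library routine (e.g. scikit-image's `rgb2lab`) would perform this conversion; as a self-contained sketch, the standard sRGB → XYZ → CIELAB transform under a D65 white point is:

```python
import numpy as np

def rgb_to_lab(rgb):
    """Convert sRGB values in [0, 1] to CIE 1976 L*a*b* (D65 white point)."""
    rgb = np.asarray(rgb, dtype=float)
    # Undo the sRGB gamma curve -> linear RGB
    lin = np.where(rgb > 0.04045, ((rgb + 0.055) / 1.055) ** 2.4, rgb / 12.92)
    # Linear RGB -> XYZ (sRGB matrix, D65)
    m = np.array([[0.4124564, 0.3575761, 0.1804375],
                  [0.2126729, 0.7151522, 0.0721750],
                  [0.0193339, 0.1191920, 0.9503041]])
    xyz = lin @ m.T
    # Normalise by the D65 reference white
    xyz /= np.array([0.95047, 1.0, 1.08883])
    f = np.where(xyz > (6 / 29) ** 3,
                 np.cbrt(xyz),
                 xyz / (3 * (6 / 29) ** 2) + 4 / 29)
    L = 116 * f[..., 1] - 16
    a = 500 * (f[..., 0] - f[..., 1])
    b = 200 * (f[..., 1] - f[..., 2])
    return np.stack([L, a, b], axis=-1)

print(rgb_to_lab([1.0, 1.0, 1.0]))  # white -> approx [100, 0, 0]
```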
39.3.3 Feature Extraction
The image data sets were subjected to colour and texture feature extraction [19–25]. Twelve data points were collected for the colour feature extraction: R, G, B, H, S, V, L*, a*, b*, Y, Cb, and Cr. Five data points were collected for the texture feature extraction: contrast, correlation, energy, homogeneity, and entropy, as given in Table 39.1.
39 An Efficient Machine Learning Approach for Apple Leaf 423
Fig. 39.2 Image segmentation: (Left) Lazysnapping, (Right) annotated vegetation region
Table 39.1 Feature extraction parameters

Features      Equations                                                  Description
Entropy       −Σ_{a,b=0}^{K−1} P_d(a,b) log P_d(a,b)                     It is a measure of texture in the fruit image
Energy        Σ_{a,b=0}^{K−1} P_d(a,b)²                                  It is a measure of matching pixel degree repetition
Contrast      Σ_{a,b=0}^{K−1} (a − b)² P_d(a,b)                          It is a measure of intensity difference between a pixel and its neighbour
Homogeneity   Σ_{a,b=0}^{K−1} P_d(a,b) / (1 + |a − b|)                   Closeness
Correlation   Σ_{a,b=0}^{K−1} P_d(a,b)(a − μ_a)(b − μ_b) / (σ_a σ_b)     Joint probability
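The GLCM statistics of Table 39.1 can be computed directly from a normalised co-occurrence matrix P_d; a NumPy sketch (the toy counts below are illustrative, not from the data set):

```python
import numpy as np

def glcm_features(P):
    """Texture features of Table 39.1 from a co-occurrence matrix P."""
    P = np.asarray(P, dtype=float)
    P = P / P.sum()                       # make P a joint probability, sum = 1
    K = P.shape[0]
    a, b = np.indices((K, K))             # grey-level index grids
    nz = P > 0                            # avoid log(0) in the entropy term
    mu_a, mu_b = (a * P).sum(), (b * P).sum()
    sd_a = np.sqrt(((a - mu_a) ** 2 * P).sum())
    sd_b = np.sqrt(((b - mu_b) ** 2 * P).sum())
    return {
        "entropy": -(P[nz] * np.log(P[nz])).sum(),
        "energy": (P ** 2).sum(),
        "contrast": ((a - b) ** 2 * P).sum(),
        "homogeneity": (P / (1 + np.abs(a - b))).sum(),
        "correlation": ((a - mu_a) * (b - mu_b) * P).sum() / (sd_a * sd_b),
    }

# Toy 4x4 co-occurrence counts (e.g. from a quantised leaf patch)
counts = np.array([[4, 2, 0, 0],
                   [2, 4, 1, 0],
                   [0, 1, 6, 1],
                   [0, 0, 1, 2]])
print(glcm_features(counts))
```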
Feature selection was applied to the 17 data points, or predictors, extracted from the colour and texture features. Feature selection is the process of limiting the input variables, or predictors, to enhance the performance of a statistical model. This study employed the neighbourhood component analysis (NCA) feature selection method. Only three predictors, R, V, and b*, were found to have a significant impact in detecting whether an apple leaf is infected or healthy, as shown in Fig. 39.3.
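The study uses MATLAB's NCA for feature selection, which has no one-line NumPy equivalent; as a hedged stand-in (not NCA itself), a simple Fisher-score filter illustrates the same idea of ranking predictors by class-discriminative power:

```python
import numpy as np

def fisher_scores(X, y):
    """Rank predictors by between-class over within-class variance."""
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    overall = X.mean(axis=0)
    num = np.zeros(X.shape[1])
    den = np.zeros(X.shape[1])
    for c in np.unique(y):
        Xc = X[y == c]
        num += len(Xc) * (Xc.mean(axis=0) - overall) ** 2
        den += len(Xc) * Xc.var(axis=0)
    return num / np.maximum(den, 1e-12)

# Toy data: 3 predictors, only the first separates the two classes
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = (rng.random(100) > 0.5).astype(int)
X[:, 0] += 5 * y                      # make predictor 0 informative
scores = fisher_scores(X, y)
print(scores.argmax())                # predictor 0 ranks highest
```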
39.3.4 Disease Detection
To develop and compare the accuracy of the trained models, the Classification Learner app in MATLAB is utilized. The data set is split into many folds and the accuracy estimated with k-fold cross-validation; the number of folds is set to ten to prevent overfitting.
Fig. 39.3 Predictors 1, 6, and 9, which correspond to R, V, and b*, have important effects in defining apple health
The accuracies of the training and testing stages were used to compare six SVM models: linear, quadratic, cubic, fine Gaussian, medium Gaussian, and coarse Gaussian. The type of kernel function used is the only difference between these models; the fine, medium, and coarse Gaussian models, in turn, differ in the kernel scale chosen. Kernel scaling is critical for every SVM model to improve performance. The fine, medium, and coarse Gaussian SVM kernel scale values were 0.43, 1.7, and 6.9, respectively. Hyperparameter optimization was turned off, and the multiclass method was set to one-vs-one.
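Kernel scaling here follows MATLAB's convention K(x, y) = exp(−‖x − y‖²/s²), where s is the kernel scale; a small sketch shows how the fine (0.43), medium (1.7), and coarse (6.9) scales change the kernel value for two points one unit apart:

```python
import numpy as np

def rbf_kernel(X, Y, kernel_scale):
    """Gaussian kernel K(x, y) = exp(-||x - y||^2 / s^2), s = kernel scale."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / kernel_scale ** 2)

X = np.array([[0.0, 0.0], [1.0, 0.0]])   # two points one unit apart
for name, s in [("fine", 0.43), ("medium", 1.7), ("coarse", 6.9)]:
    K = rbf_kernel(X, X, s)
    # smaller kernel scale -> similarity decays faster with distance
    print(f"{name:7s} scale={s:>4}: K(x1, x2) = {K[0, 1]:.4f}")
```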
39.4 Results and Discussion
In our proposed method, we included 600 images of apple leaves from Kaggle [26]. All of the images were 512 × 512 pixels in size, and a Windows 10 machine with 16 GB of RAM was used to check the efficacy of the said methodology. There are 271 healthy leaves and 329 unhealthy leaves in our sample. In the experimental study, 65% of the data set was utilized for training, i.e. 325 images were used for training and 175 for testing. Using this split, we obtain good results.
Three examples of leaf images, two infected and one healthy, are displayed in Fig. 39.4 as final outputs of our system. The performance of the entire system is reviewed below.
Performance Evaluation
We analysed the working performance using a confusion matrix and a receiver operating characteristic curve. We also calculated the F1-score to evaluate accuracy.
Fig. 39.4 Input and output of the proposed method
Table 39.2 gives the validation results for the test data, as may be seen in the confusion matrix, which covers Category 1, Category 2, and Category 3. For this project, we utilized 500 images. For Category 3, the model identifies 169 healthy leaf images out of 171. The results of the other two detections are likewise satisfactory.
The F1-score was calculated at a 65/35% split of training and validation data, and we attain 96% accuracy with our system. Table 39.3 lists the performance measures, and the ROC curve is shown in Fig. 39.5. Accuracy becomes more important when
Table 39.2 Confusion matrix of proposed model

                               Category 1   Category 2         Category 3
                               Black rot    Cedar apple rust   Non-diseased
Category 1 Black rot               92            8                  3
Category 2 Cedar apple rust         7          219                  0
Category 3 Non-diseased             0            2                169
Table 39.3 Performance measures

Category                      Precision (%)   Recall (%)   F1-score (%)
Category 1 Black rot               96             94            95
Category 2 Cedar apple rust        94             98            97
Category 3 Non-diseased            98             96            96
Average                            98             97            96
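Recomputing the metrics from Table 39.2 (assuming rows are actual classes and columns predicted classes) reproduces the 96% overall accuracy reported in the text; per-class figures may differ slightly from Table 39.3 depending on rounding and matrix orientation:

```python
import numpy as np

# Confusion matrix of Table 39.2 (assumed rows = actual, columns = predicted)
cm = np.array([[92,   8,   3],    # Category 1: black rot
               [7,  219,   0],    # Category 2: cedar apple rust
               [0,    2, 169]])   # Category 3: non-diseased

accuracy = np.trace(cm) / cm.sum()            # correct predictions / total
precision = np.diag(cm) / cm.sum(axis=0)      # per class, over predicted counts
recall = np.diag(cm) / cm.sum(axis=1)         # per class, over actual counts
f1 = 2 * precision * recall / (precision + recall)
print(f"overall accuracy = {accuracy:.2%}")   # 480 / 500 = 96.00%
```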
Fig. 39.5 Receiver operating characteristic curve
the curve moves closer to the top-left corner [26]. The figure also includes a micro-average ROC curve, showing the area under the curve for the several classes. The ROC curve reflects a 96% accuracy rate, which is an excellent result.
Indeed, the proposed technique performed very well on these data sets. It not only identifies diseases but also calculates the overall percentage of affected regions on each leaf. Our system takes no more than 3 s to return a result. In addition, the suggested system's performance is compared to relevant works in Table 39.4, which also shows the algorithm that supports each of the works. We can observe that the approach proposed in this study has a 96% accuracy rate, which is greater than that of any of the previous research.
Table 39.4 Comparison of different leaf infection detection methods

References            Method                                               Data set   Accuracy (%)
Hu et al. [24]        Hyperspectral imaging                                Potato     95
Prakash et al. [25]   Image processing and support vector machines         Citrus     90
Asfarian et al. [26]  Texture analysis and probabilistic neural network    Rice       83
Xu et al.             k-nearest neighbour                                  Tomato     82
Proposed work         Graph cut segmentation and support vector machine    Apple      96
39.5 Conclusion
Novel techniques of farming the crop are being developed and tested each year to meet the expanding demand for apples. The growing of apples in tropical climates is one of them. The spread of pests and diseases is a factor that has an impact on apple output each year. This work proposed a hybrid model for detecting, through its leaf, whether an apple tree is healthy or infected and for classifying the type of disease present. Colour and texture data sets are extracted from each apple leaf image. The R, V, and b* values were shown to have substantial implications for detection and classification in this investigation; hence, feature selection was performed. As R, V, and b* are helpful for disease detection and classification, future study can focus on extracting one colour space and then obtaining the required values from other colour spaces to save time.
The objective of this paper is to differentiate between two forms of apple disease, black rot and cedar apple rust, and to detect healthy apple leaves, using image segmentation as a pre-processing stage. Our data is trained and tested using a multiclass SVM. This small contribution will assist farmers in detecting these two severe diseases, and the method proves to be a significant step forward in increasing apple production.
References
1. Kellerhals, M., Tschopp, D., & Roth, M. Challenges in apple breeding (pp. 12–18).
2. Behera, S. K., Rath, A. K., & Sethy, P. K. (2021). Maturity status classification of papaya
fruits based on machine learning and transfer learning approach. Information Processing in
Agriculture, 8(2), 244–250.
3. Syazwani, R., & Nurazwin, W., et al. (2021). Automated image identification, detection and
fruit counting of top-view pineapple crown using machine learning. Alexandria Engineering
Journal.
4. Khan, N., et al. (2021). Oil palm and machine learning: Reviewing one decade of ideas,
innovations, applications, and gaps. Agriculture, 11(9), 832.
5. Patil, P. U., et al. (2021). Grading and sorting technique of dragon fruits using machine learning
algorithms. Journal of Agriculture and Food Research, 4, 100118.
6. Munera, S., et al. (2021). Discrimination of common defects in loquat fruit cv. Algerie’ using
hyperspectral imaging and machine learning techniques. Postharvest Biology and Technology,
171, 111356.
7. Tripathi, M. K., & Maktedar, D. D. (2021). Detection of various categories of fruits and vegeta-
bles through various descriptors using machine learning techniques. International Journal of
Computational Intelligence Studies, 10(1), 36–73.
8. Koyama, K., et al. (2021). Predicting sensory evaluation of spinach freshness using machine
learning model and digital images. PLoS ONE, 16(3), e0248769.
9. Brighty, S., Sahaya, P., Shri Harini, G., & Vishal, N. (2021). Detection of adulteration in fruits
using machine learning. In 2021 Sixth International Conference on Wireless Communications,
Signal Processing and Networking (WiSPNET). IEEE.
10. Rodrigues, B., et al. (2021). Ripe-unripe: Machine learning based ripeness classification. In
2021 5th International Conference on Intelligent Computing and Control Systems (ICICCS).
IEEE.
11. Naz, F., Irshad, G., & Abbasi, N. A. (2018). Surveillance and characterization of
Botryosphaeria obtusa causing frogeye leaf spot of Apple in District Quetta Abstract (Vol.
16, pp. 111–115).
12. Strickland, D., Carroll, J., & Cox, K. (2020). Cedar apple rust.
13. Riffle, J. W., & Peterson, G. W. (1986). Diseases of trees in the great plains (General Technical
Reports—U.S. Department of Agriculture, Forest Service, no. RM-129). https://doi.org/10.
5962/bhl.title.99571
14. Rigor, D. B., Oryan, C., Ochasan, J. M., Boncato, T., Pedroche, N., & Amoy, M. (1997).
Evaluation of temperate zone fruits in the highlands of Northern Luzon, Philippines. Acta
Hortic, 441, 59–66. https://doi.org/10.17660/ActaHortic.1997.441.5
15. Concepcion, R. S., Loresco, P. J. M., Bedruz, R. A. R., Dadios, E. P., Lauguico, S. C., &
Sybingco, E. (2020). Trophic state assessment using hybrid classification tree-artificial neural
network. International Journal of Advances in Intelligent Informatics, 6(1), 46–59. https://doi.
org/10.26555/ijain.v6i1.408
16. Concepcion, R., Lauguico, S., Alejandrino, J., Dadios, E. P., & Sybingco, E. (2018). Lettuce
canopy area measurement using static supervised neural networks based on numerical
image textural feature analysis of Haralick and gray level co-occurrence matrixs. Journal
of Agricultural Science, 156(1), 1. https://doi.org/10.1017/S0021859618000163
17. Javel, I. M., Bandala, A. A., Salvador, R. C., Bedruz, R. A. R., Dadios, E. P., & Vicerra,
R. R. P. (2019). Coconut fruit maturity classification using fuzzy logic. In 2018 IEEE 10th
International Conference on Humanoid, Nanotechnology, Information Technology, Communi-
cation and Control, Environment and Management (HNICEM 2018).https://doi.org/10.1109/
HNICEM.2018.8666231
18. De Luna, R. G., Dadios, E. P., Bandala, A. A., & Vicerra, R. R. P. (2019). Tomato fruit image
dataset for deep transfer learning-based defect detection. In Proceedings of the IEEE 2019 9th
International Conference on Robotics, Automation and Mechatronics (RAM) (pp. 356–361).
https://doi.org/10.1109/CIS-RAM47153.2019.9095778
19. De Luna, R. G., Dadios, E. P., & Bandala, A. A. (2019). Automated image capturing system
for deep learning-based tomato plant leaf disease detection and recognition. In IEEE Region 10
Annual International Conference, Proceedings/TENCON (Vol. 2018, pp. 1414–1419). https://
doi.org/10.1109/TENCON.2018.8650088
20. Chien, C.-L., & Tseng, D.-C. Color image enhancement with exact HSI color model. International Journal of Innovative Computing, Information and Control, 7(12), 6691–6710.
21. Yu, C., Dian-ren, C., Yang, L., & Lei, C. (2010). Otsu’s thresholding method based on gray level-
gradient two-dimensional histogram. In 2010 2nd International Asia Conference on Informatics
in Control, Automation and Robotics (CAR 2010), vol. 3. IEEE, 2010, pp. 282–285.
22. Ehsanirad, A., & Sharath Kumar, Y. H. (2010). Leaf recognition for plant classification using
GLCM and PCA methods. Oriental Journal of Computer Science and Technology, 3(1), 31–36.
23. Zweig, M. H., & Campbell, G. (1993). Receiver-operating characteristic (ROC) plots: A
fundamental evaluation tool in clinical medicine. Clinical Chemistry, 39(4), 561–577.
24. Hu, Y., Ping, X., Xu, M., Shan, W., & He, Y. (2016). Detection of late blight disease on potato
leaves using hyperspectral imaging technique. Guang Pu Xue Yu Guang Pu Fen Xi = Guang
Pu, 36(2), 515–519.
25. Prakash, R. M., Saraswathy, G., Ramalakshmi, G., Mangaleswari, K., & Kaviya, T. (2017).
Detection of leaf diseases and classification using digital image processing. In 2017 Inter-
national Conference on Innovations in Information, Embedded and Communication Systems
(ICIIECS) (pp. 1–4). IEEE.
26. Asfarian, A., Herdiyeni, Y., Rauf, A., & Mutaqin, K. H. (2013). Paddy diseases identification
with texture analysis using fractal descriptors based on Fourier spectrum. In 2013 International
Conference on Computer, Control, Informatics and Its Applications (IC3INA) (pp. 77–81).
IEEE.
Chapter 40
Precipitation Estimation Using Deep
Learning
Mohammad Gouse Galety, Fanar Fareed Hanna Rofoo, and Rebaz Maaroof
Abstract In the new computational era, CNNs rule all over; they have proved to be the best models for learning the characteristics and features of extensive-dimension data. The network architecture is defined meticulously to ease out the problem-specific data. Precipitation estimation is the objective of the CNN discussed in this paper. The problem statement is the critical element of estimating precipitation in a typical atmospheric model with all meteorological support of scalar data. Statistical downscaling in the numerical model of the atmosphere, with snapshot data collected on the geo-grid with all dynamical elements, is the input for the CNN used for precipitation estimation. A new model for numerical precipitation estimation, as a data-driven method within a CNN framework, is the proposal of this paper.
40.1 Introduction
Numerical modeling of the atmosphere is a fundamental task for mathematicians. Several primitive dynamical equations are composed in the numerical model of the atmosphere. There are primitive parameters, such as thermodynamics, kinematics, humidity, heat, radiation, diffusion, and surface water, which guide environmental properties. The notions characterized by the parameters are discretized and eased out for many kinds of analyses using computing technologies, of which precipitation analysis is the most elementary. The properties of particle coalescence, density imbalance, phase change, and the phenomenal distribution of clouds are simulated with thermal equilibrium and other physical properties during the study of precipitation estimation. A unique faculty of physical science called cloud physics contributes well to the prognosis of atmospheric conditions. Prognostic variables are the primary variables of the atmospheric system. Certain differential properties are
M. G. Galety (B) · F. F. H. Rofoo · R. Maaroof
Department of Information Technology, College of Computer Science and Information
Technology, Catholic University in Erbil, Erbil, Kurdistan 44001, Iraq
e-mail: m.galety@cue.edu.krd
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
B. N. K. Rao et al. (eds.), Intelligent Computing and Applications, Smart Innovation,
Systems and Technologies 315, https://doi.org/10.1007/978-981-19-4162-7_40
432 M. G. Galety et al.
evolved from the atmospheric studies that are mapped to specific grid and sub-grid scales during the analysis of predictive variables. Consensus collected on the prognostics and variables is applied with various heuristics, and large averages are computed manually with low predictability. The properties are collected from the atmosphere, and heuristics are used during the discovery of the atmosphere's status. The atmospheric properties are deduced by performing downscaling operations on a sub-grid scale for a particular demographic zone. Scale resolution eliminates discrepancies between the simulated models and real-time applications during downscaling in the atmospheric experiments.
The data-driven model of statistical downscaling is applied to improve accuracy in atmospheric parameters. Statistical parameters of the atmosphere are dynamically down-scaled; a numerical model applies statistical downscaling of the atmospheric parameters, calibrating and customizing the benchmark specifications. The numerical model is ranked by its outperformance in undertaking various schemes. Many unresolved processes become resolved during the processes incurred in the numerical model. Therefore, the phenomenon is a practical implication that demands relevant parameterization techniques and enhances prediction probability.
While fostering the weather forecast requirements, the tasks of precipitation prediction and statistical downscaling to improve accuracy levels draw strong support from machine learning and deep learning, which cater sufficient frameworks to enrich the computational background for the numeric model.
40.2 Related Works
Most of the studies on atmospheric modeling are associated with statistical down-
scaling. Deep neural networks are discussed here to support a brief review of the
statistical downscaling.
40.2.1 Statistical Downscaling
Downscaling is the mechanism of providing higher-resolution predictions. Statistical downscaling in weather forecasting is applied to narrow climatic projections from global climate models down to local models. Statistical downscaling is often contrasted with dynamical downscaling, in which climate models are run at high resolution, also noted as high-resolution climate models (HRCMs). Three approaches are in vogue for applying statistical downscaling in atmospheric modeling: perfect prognosis, weather generators, and model output statistics. Perfect prognosis and model output statistics are approaches chosen depending on the objectives and requirements and on their deterministic nature. The principal weather components, such as precipitation moisture, wind field, corresponding pressure, and specific raw variables, contribute to the composition of predictors in building a linear regression model.
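As a rough illustration of such a regression-based downscaling step, the sketch below fits ordinary least squares from synthetic large-scale predictors to a synthetic precipitation predictand (all predictor names and data here are hypothetical, not from the chapter):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 365                                   # one year of daily samples (synthetic)
# Hypothetical standardised predictors: moisture, wind field, pressure
X = rng.normal(size=(n, 3))
true_beta = np.array([1.2, 0.5, -0.8])
precip = X @ true_beta + 0.1 * rng.normal(size=n)   # synthetic predictand

# Ordinary least squares with an intercept column
A = np.column_stack([np.ones(n), X])
beta, *_ = np.linalg.lstsq(A, precip, rcond=None)
print(np.round(beta[1:], 2))              # recovers coefficients close to true_beta
```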
40 Precipitation Estimation Using Deep Learning 433
The model predictors are instantaneously biased and numerical, so model output statistics will determine the model's prediction accuracy. The consistency of precipitation, and the prediction based on the consistency biases, are the antecedents for ascertaining the validity of precipitation prediction in the 'model output statistics'. Though the model is successful, the data specifications and sources are not stable and guaranteed.
40.2.2 Applications
Learning in DNNs is accomplished as in machine learning. The scope of DNN applications in statistical modeling and prediction is to catalyze the operations of feedback and feed-forward methodologies as computer-aided modeling and prediction. However, deep learning has a different application framework compared with classical machine learning.
Deep neural networks underpin deep learning; statistical downscaling employs deep neural networks to accomplish an operable end-to-end workflow model. The modeling process entreats the feature extraction processes, enabling the subject's pre-engineered features to learn and develop customized features. During learning, the objective of all deep neural networks is to understand the features, which are further customized into multiple levels based on the data representation. The higher the abstraction, the higher the level of learning. The latest features give only an outline of the data rather than details. In statistical downscaling, all resolution levels of the data have to be processed for learning in the deep neural networks. Particularly in precipitation prediction, a local-scale value forms a conglomerate of parameters and defines abstraction in learning.
CNNs are a particular category of deep neural networks that undertake a wholesome building of 'workflow-based frameworks'. Convolutional neural networks contrast with conventional neural networks in their essential capacity for dealing with complex computations through stratified computations on extensive-dimension data. Understanding the model comprises learning the abstract forms of the data; later, for conceptualizing the results and decision-making parameters, a detailed version of the data is incorporated in the process. Thus, the CNN first reduces structural redundancy to foster effective information representation.
Voluminous data is mounted into the numerical simulations of remote sensing observations, as geophysical data possesses intrinsic structure in its time and space collections. Such data poses reliable candidacy for experiments with deep neural networks. Distant weather conditions can be determined using CNNs at feasible rates. The most trending experiments with weather data concern nowcasting, which is possible with CNNs [1–3].
Long-standing deep learning research proposals include convolutional neural networks that suit work on super-resolution data. Parameters of atmospheric data for estimating precipitation using statistical downscaling methods possess and work on high-resolution data sets. Therefore, research has increased on weather nowcasting using convolutional neural networks on super-resolution weather data. A brief comprehension of the atmospheric data is fetched from the experiments on weather data using convolutional neural networks, with a detailed exploration into parameterizing uncertain data, modeling geo-fluids, and dynamic modeling of atmospheric data. Deep neural networks have answered the questions of comprehensive applications on extensive-dimension data, empirically notified observations, and numerical simulations in the improvised precipitation estimation framework.
40.3 Methodology
Many real-time scenarios are based on the formulation of the precipitation estimation problem. During the formulation, we analogize the established models that describe the situations of the circulation-precipitation connection. A windward slope, meridional temperature gradients during winters, and particularly storms define winter precipitation as an extra-tropical process or primarily as a cyclone. Similar situations in mountainous regions are defined as an orographic effect. Using conventional mechanisms, researchers and meteorological scientists lost critical observations for some geographic areas where precipitation penetrates or travels across several landmasses while determining the accurate precipitation estimation. A super-resolution case needs to be employed when the grid- and sub-grid-level precipitation estimations can be prepared from the data.
Energy plays a vital role in the atmosphere: atmospheric energy converts from potential energy to kinetic energy, encompassing baroclinic disturbances, notably as baroclinic waves. The parameters that support the build-up of super-resolution data seem unstable for drawing cyclonic effects, though they provide grid and sub-grid levels of data. Moisture convection and precipitation formation can be inferred from the disturbances dealing with the distinct precipitation distribution pattern.
To understand the characteristics of the atmospheric disturbances, an empirical
precipitation estimation is evolved during the numeric weather prediction.
The circulation design for precipitation estimation is represented with an empirical mapping as follows:

E(P | X, C) = f_c(X; β) (40.1)

In the cited formula, E, P, X, and C represent the expectation operator, the precipitation-based estimate, the predictors, and the local climate conditions, respectively. The empirical mapping for the climate-based condition C is denoted f_c, and the major parameters defining f_c are represented by β.

The precipitation observed daily is referred to by P, representing drawings significant to the target grid. Predicting P from the above formulation requires an essential clarification of f_c, which is obtained through the corresponding training and validation at the chosen spatio-temporal resolutions.
Ultimately, to derive a deterministic precipitation prediction from the cited formula, X is computed with significant clarifications (from realistic values), exhibiting its utilization in the iterations of analysis of the products yielded into observations.
The essential capacities for complex computation on extensive-dimension data are wholly vested in the construction of neural networks; however, since conventional neural networks impose constraints, a deep learning network using convolutions is suitable, as it reduces the structural redundancy of the model while fostering the parameters required for effective information extraction.
The statistical downscaling model projects a geo-grid with a minimum characteristic scale. A stipulated horizontal velocity of the atmosphere is set up for the model's profile. Essential information on the circulation context and the dynamical downscaling on the geo-grid is assumed for the deliverables to suit the atmosphere's dynamics. Particularly in the application of precipitation estimation, a feature vector is introduced to estimate the transformation as the initial activity for a sample four-dimensional physical field, which also represents the circulation and moisture profile. It is also implicit that the relationship between predictors and predictands via the feature vector comprises the predictors' characteristics.
Intervening precipitation estimates are composed of the circulation field at a significant climate scale with respect to the local climate conditions, enabling the features adopted for statistical downscaling in precipitation prediction. A cyclogenesis instance is considered, where precipitation is directly proportional to the local climate conditions that impose cyclones. Where the parameter sets differ globally from the local climate, comprehensive information for the precipitation estimation is not available.
Encoded local values extract the contingencies of circulation and precipitation estimation. Parameters related to circulation and moisture from the extensive-dimension data are fed into a typical CNN architecture, with the support of sensible features, to comprehensively estimate the climate with precipitation at the candidate geo-grid. As CNNs work with pooling, pooling operations direct the parameters with convolutions to generate the expected patterns of parameters in the atmosphere. The higher layers in the network promote the computation of the precipitation variables drawn from the extensive-dimension data, which are classified and filtered with geo-potential height to overcome the circulation constraint.
The moisture constraint related to the expected precipitation is defined as precipitable water. It is known spatially as a coverage constant for the dynamic fields of the geo-grid, whose dimensions are represented as x and y.
Now, the predictors are computed with the circulation and moisture constraints, where c is determined as the predictor's category; the convolutional layer applies a tensor of shape c × c × m × n, with a minimum predefined stride of 3 × 3, as input. The dot product is drawn element-wise from the tensor and the various patches in the input, and is further noted as a C × x × y array. Convolution is performed as the computation of this dot product, while a relative tensor c × c × m × n defines the kernel for the convolution-based layer. In the kernel tensor, c and m × n are the channel numbers and the receptive field of the output.
The results obtained in the computation are further treated with a nonlinear transformation f, imposed as a ReLU function, f(x) = max(0, x). The convolution-based operation c utilizes filters to convert the input toward the essential objectives of learning. The geo-grid's critical local dynamic design is the resource for learning at the activated filters.
Hence, local features contribute to learning at an abstract level on the geo-grid's receptive field at the higher layers of the convolution framework. The maximum of a local patch is computed from a feature map of the convolutional neural network in the pooling layer. The nature of the convolutional neural network is complex, but non-linearity at the pooling stages suffices for the stacked parameter problems with the features. Subsequent incidence of the input on the dense base layers will succeed in extracting all known attributes for the benefit of the target being learned.
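The convolution, ReLU, and max-pooling operations described above can be sketched in plain NumPy (a toy single-channel illustration, not the authors' actual architecture; the 8 × 8 field and averaging kernel are made up):

```python
import numpy as np

def conv2d(x, k, stride=1):
    """Valid 2-D convolution: dot product of the kernel with each patch."""
    kh, kw = k.shape
    oh = (x.shape[0] - kh) // stride + 1
    ow = (x.shape[1] - kw) // stride + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            patch = x[i * stride:i * stride + kh, j * stride:j * stride + kw]
            out[i, j] = (patch * k).sum()
    return out

def relu(x):
    """f(x) = max(0, x), applied element-wise."""
    return np.maximum(0.0, x)

def max_pool(x, size=2):
    """Maximum over non-overlapping size x size patches."""
    oh, ow = x.shape[0] // size, x.shape[1] // size
    return x[:oh * size, :ow * size].reshape(oh, size, ow, size).max(axis=(1, 3))

field = np.arange(64.0).reshape(8, 8)     # toy 8x8 dynamical-field snapshot
kernel = np.ones((3, 3)) / 9.0            # 3x3 averaging filter
feat = max_pool(relu(conv2d(field, kernel)))
print(feat.shape)                          # (3, 3)
```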
A temporal collection of snapshots of the dynamical field on the selected geo-grid is recorded and averaged to enable the convolution to provide the total daily precipitation and, finally, to figure out the patterns as a unique picture of precipitation estimation. A CNN model can thus map the dynamic fields of the geo-grid, as a temporally collected snapshot, to a daily precipitation estimate.
Compared to conventional neural networks, more layers define the complex computations on a geo-grid's complex-structured, large-dimension dynamical data. Traditional machine learning and probabilistic networks could only project the latent values. Hence, the training and validation processes in the CNN model thrust on predicting the data exceptionally well. Regularization is the process of avoiding overfitting: dropout and batch normalization can be accomplished as modules after employing the CNN, which ensures the improvement of the model. Limitations in the training and validation process adapt continuous requirements to the input layers while refining the learning process.
Further, backpropagation shall also be employed in the network models to improve on the extensive-dimension data with dynamic climate parameters drawn for a geo-grid. The tuning or adjustment of parameters is decided by the stride size, which shall lead to improvement of the learning rate, using the gradient descent method. For every known loss, a function is computed in the gradient, which attributes to the temporal circulation snapshot of the geo-grid's dynamical field, where the outputs are summed up.
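The gradient-descent update described above can be sketched as follows (a generic illustration on a toy quadratic loss, not the chapter's actual training loop):

```python
import numpy as np

def gradient_descent(grad, w0, lr=0.1, steps=100):
    """Plain gradient descent: repeatedly apply w <- w - lr * grad(w)."""
    w = np.asarray(w0, dtype=float)
    for _ in range(steps):
        w = w - lr * grad(w)
    return w

# Quadratic loss L(w) = ||w - target||^2 with gradient 2 * (w - target)
target = np.array([3.0, -1.0])
w = gradient_descent(lambda w: 2 * (w - target), w0=[0.0, 0.0])
print(np.round(w, 4))                 # converges to ~[3, -1]
```

A learning rate that is too large makes the iterates diverge, while one that is too small slows convergence; this is the tuning trade-off the text alludes to.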
40.4 Discussion
Among the computational theories, the methods converge and connect entirely to the hypothesis of the problems and to the experimentation with the highest propensity of proving the methods. CNNs can deal with extensive-dimension data with coarse, uncertain, and formalizable properties. The CNN can relentlessly perform computations, yielding local and global climate modeling decision-making patterns from relevant temporally collected snapshots. In precipitation estimation using a numerical model,
40 Precipitation Estimation Using Deep Learning 437
the CNN is prudent with strengths of receptive meadow and super-resolution dynam-
ical data of the geo-grid. A single geo-grid may be input for an elementary test to
achieve the best descriptive results. As more features are vectorized, more learning
parameters must be deduced to work with the CNN; the overall complexity of the
CNN is the multiply-and-stride work of the deciding kernels, which is almost
negligible compared with the tasks of the numerical model in statistical
downscaling for precipitation estimation.
40.5 Conclusion
Prediction of precipitation alleviates catastrophic problems and aids the meteoro-
logical balance of the atmosphere. Thrifty usage and management of natural resources
can maintain the atmospheric balance; computations of a numerical atmosphere model
are needed in times of untoward imbalance. The CNN introduces models that overcome the
dimensionality of the data and improve performance compared with conventional
precipitation estimation by numerical weather forecasting. The statistical down-
scaling model with convolutional neural networks is implemented to estimate the
atmosphere's precipitation abundance using temporally collected snapshots.
References
1. Shen, C. (2018). A transdisciplinary review of deep learning research and its relevance for
water resources scientists. Water Resources Research, 54(11), 8558–8593.
2. Miikkulainen, R., et al. (2019). Evolving deep neural networks. In Artificial intelligence in the
age of neural networks and brain computing (pp. 293–312). Academic Press.
3. Kamilaris, A., & Prenafeta-Boldú, F. X. (2018). Deep learning in agriculture: A survey.
Computers and Electronics in Agriculture, 147, 70–90.
4. Pinardi, N., et al. (2017). From weather to ocean predictions: a historical viewpoint. Journal
of Marine Research, 75(3), 103–159.
5. Kalnay, E. (2019). Historical perspective: Earlier ensembles and forecasting forecast skill.
Quarterly Journal of the Royal Meteorological Society, 145, 25–34.
6. Zarekarizi, M., Rana, A., & Moradkhani, H. (2018). Precipitation extremes and their relation
to climatic indices in the Pacific Northwest USA. Climate Dynamics, 50(11–12), 4519–4537.
7. Akatsuka, S., Susaki, J., & Takagi, M. (2018). Estimation of precipitable water using numerical
prediction data. Engineering Journal, 22(3), 257–268.
8. Galety, M., Al Mukthar, F. H., Maaroof, R. J., & Rofoo, F. (2021). Deep neural network concepts
for classification using convolutional neural network: A systematic review and evaluation.
Technium: Romanian Journal of Applied Sciences and Technology,3(8), 58–70. https://doi.
org/10.47577/technium.v3i8.4554
9. Gouse, G. M., Haji, C. M., & Saravanan. (2018). Improved reconfigurable based lightweight crypto
algorithms for IoT based applications. Journal of Advanced Research in Dynamical & Control
Systems, 10(12), 186–193.
10. Reshma, G., Al-Atroshi, C., Nassa, V. K., Geetha, B., Sunitha, G., Galety, M. G., &
Neelakandan, S. (2022). Deep learning-based skin lesion diagnosis model using dermoscopic
images. Intelligent Automation and Soft Computing, 31, 621–634.
Chapter 41
The Adaptive Strategies Improving
Digital Twin Using the Internet of Things
N. Venkateswarulu, P. Sunil Kumar Reddy, O. Obulesu, and K. Suresh
Abstract Digital twins for factories and processes are becoming more prevalent
and more valuable as a result of recent technological breakthroughs and the rise of
smart manufacturing. There is also more potential for closed-loop analytics with
digital twins, given the rise of connectivity, data storage, and the Industrial
Internet of Things (IIoT). Some factories have employed discrete event simulations
(DES) to construct digital twins that are connected to the manufacturing floor and
can be monitored in real time. However, it is difficult to quantify the advantages
of a digital twin that is linked to the real world. With the emergence of the new
generation of mobile network (5G), Tactile Internet, as well as the deployment of
Industry 4.0 and Health 4.0, multimedia systems are moving towards immersed
haptic-enabled human–machine interaction systems such as the digital twin (DT).
Specifically, Industry 4.0 will be using DT and robots on a large scale, increasing
human–machine interaction to a great extent. Multimodal communications, especially
haptics, will be used to interact with digital twins and robots.
Hence, the Tactile Internet will replace today's conventional Internet. In fact, a DT
system can also be extended to the Health 4.0 domain to act as a COVID-19
early warning system. When a person's temperature and other symptom data are
tracked in real time, it may be determined whether or not it is time to see a doctor or
undergo a COVID examination. In conjunction with a COVID tracing programme,
the digital twin may be able to provide further information about the virus in relation
to the individual. Since there are currently no well-recognized models to evaluate the
performance of these systems, to address this research lacuna we propose a Quality
of Experience (QoE) model for DT systems containing multiple levels of subjec-
tive, objective, and physiopsychological influencing factors. The model is itemized
through a fully detailed taxonomy that deduces the perceived user's emotional and
physical states during and after consuming spatial, temporal, proximal, and abstracted
multi-modality media between humans and machines.
N. Venkateswarulu · O. Obulesu (B)
Department of CSE, GNITS, Hyderabad 500104, India
e-mail: Obulesh196@gmail.com
P. Sunil Kumar Reddy
Tibco Software India Ltd, Hyderabad, India
K. Suresh
Department of CSE, Sree Vidyanikethan Engineering College, Tirupati, Andhra Pradesh, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
B. N. K. Rao et al. (eds.), Intelligent Computing and Applications, Smart Innovation,
Systems and Technologies 315, https://doi.org/10.1007/978-981-19-4162-7_41
41.1 Introduction
In recent years, research into the concept of “Industry 4.0” has exploded; robotics,
visual systems, autonomous control, and closed-loop feedback are fundamentally
changing current production practices in the process [1]. New production technolo-
gies give rise to Industry 4.0, and there are plenty of options for research and
development on increasing quality control through it [2]. This change has occurred
as a result of a wide range of factors, including the tremendous amount of data
now available to owners and operators and the recent advances in technology that
have been integrated into factories.
“Internet of Things” (IoT) is an umbrella term that refers to the expansion of the
Internet into the physical world through the widespread deployment of devices that
have embedded identification and sensing capabilities. According to ITU-T Y.2060,
“The Internet of Things (IoT) is an infrastructure that enables improved services
by networking both physical and virtual things, based on existing and evolving
interoperable information and communication technologies”. The fourth industrial
revolution originates from the advent of the IoT. Through the advancement of information capture and
sharing, processing and communication, systems may interact to deliver services not
formerly attainable. Tangible objects, the things, exist in the physical world and are
capable of being sensed and attached to others in their network. These things often
have a virtual controller that enables interaction and data exchange
with sensory things in the environment. This infrastructure manifests itself in the
domain of Industry 4.0 as a technology known as cyber-physical systems (CPS).
Khaitan and McCalley [3] define CPS as “systems in which physical and software
components are closely coupled, each working on a different spatial and temporal
scale, exhibiting diverse and distinct behaviours, and interacting with one another in a
multitude of ways that change with context”. Cybernetic capabilities in every compo-
nent, high levels of automation, different scales of networking, and multiple levels of
integration across time and space are all necessary CPS aspects. A modular dynamic
that allows for reorganization and reconfiguration is another essential CPS element.
Next-generation innovation requires haptic contact with audiovisual feedback, as
well as technology solutions that allow not only visual engagement but also robotic
systems to be controlled and guided with an imperceptible time delay. The Tactile
Internet (TI) will be created as a result of this new wave of innovation. According
to the perspective of El Saddik [1], a digital twin
is an exact digital reproduction of a live or nonliving physical thing. Data can be sent
between the physical and virtual worlds without interruption, allowing the virtual
and physical worlds to coexist in real time. In order to monitor, comprehend, and
optimize the physical entity, a digital twin is an integrated multimedia system that
enriches the pathways and gives continuous input to improve quality of life and
well-being. Artificial intelligence (AI), mixed reality (MR), haptics, IoT, cyber-
security, the Tactile Internet (TI), industry, and health are some of the technologies
that make up a digital twin, in addition to the quality of experience in particular
(Fig. 41.1).
Fig. 41.1 Digital twin technology
A key component of the digital twin vision is the optimization of multi-modal
communication/interaction from the user's perspective, which has so far received
little attention from industrial and academic stakeholders. As a result, the demand
for extensive research and development in this domain is still growing and remains
at an infancy stage.
41.2 Relevant Work
Due to the COVID-19 outbreak, most daily activities such as work, research,
and education take place online rather than in person. According to Feldmann
et al. [4], since March 2020 traffic to applications for teleworking and online
education, such as VPN and video conferencing, has increased by more than
200 per cent on an annual basis. Because of this, there has been a significant increase
in Internet traffic. As a matter of fact, according to a recent CNN news piece, popular
video providers such as Netflix and YouTube are slowing down their operations in
North America and Europe in order to prevent the Internet from crashing completely.
The COVID-19 pandemic has altered how we live, breathe, and do business. It has
brought the global economy to a halt, posing insurmountable continuity problems
to every industry. With that said, there is an increasing demand to enable
digital twin technology, ultimately over the Tactile Internet.
When stepping into the era of 5G, the emergence of the Tactile Internet (TI) can
help haptic communication enter the new world for digital twins’ applications. The
most directly related is people’s health and well-being in life. Up to now, people
442 N. Venkateswarulu et al.
Fig. 41.2 Industry 4.0 using digital twin (DT) architecture over the Tactile Internet
who live in some suburban areas still suffer from not getting high-quality and timely
treatment. With the help of superb network conditions and the support from the haptic
technical system, the patients whose physical conditions are not allowed to travel
in long distance to seek medical treatment can be diagnosed and treated at home or
local hospital by some experienced doctors at remote places all over the world. One
of its well-known applications is telesurgery, as illustrated in Fig. 41.2.
41.3 Proposed Work
The spectrum follows the predicted decay pattern, and the same pattern was observed
in all of the participants studied. Looking at Fig. 41.2, it can clearly be seen that the
median frequency patterns of muscles such as the deltoid diminish over time as a
sign of muscle exhaustion. Because the EMG MDF signal fits the second-order
polynomial regression model, we could utilize its slope as a good predictor of muscle
fatigue in that respondent.
Algorithm 1 MDF fatigue measurement
1: procedure SUBJECT'S MUSCLE
2:   for each subject do
3:     for each muscle do
4:       GET EMG signal
5:       GET Accy signal
6:       do FFT of sEMG(ω)
7:       Calculate MDF
8:       V = second-order polynomial regression fit of MDF
9:     end for
10:    FI_MDF = d²V/dt²
11:  end for
12: end procedure
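Algorithm 1 can be sketched in Python roughly as follows (an illustrative sketch: the sampling rate, window length, and signal names are assumptions, not values given in the chapter):

```python
import numpy as np

def median_frequency(emg_window, fs):
    """Median frequency (MDF) of one EMG window: the frequency that
    splits the power spectrum into two halves of equal power."""
    spectrum = np.abs(np.fft.rfft(emg_window)) ** 2
    freqs = np.fft.rfftfreq(len(emg_window), d=1.0 / fs)
    cumulative = np.cumsum(spectrum)
    idx = np.searchsorted(cumulative, cumulative[-1] / 2.0)
    return freqs[idx]

def fatigue_index(emg, fs, window=256):
    """Fit MDF-over-time with a second-order polynomial V(t) and return
    the curvature d2V/dt2 = 2*a as the fatigue index FI_MDF."""
    n_windows = len(emg) // window
    mdf = [median_frequency(emg[i*window:(i+1)*window], fs)
           for i in range(n_windows)]
    t = np.arange(n_windows, dtype=float)
    a, b, c = np.polyfit(t, mdf, 2)   # V(t) = a t^2 + b t + c
    return 2.0 * a                    # second derivative of V
```

A healthy (non-fatiguing) signal with a stable spectrum yields a fatigue index near zero; a decaying MDF trend yields a nonzero curvature.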
Algorithm 1 is based solely on the high-level reported subjective parameters of
the taxonomy. Both models outperform the others, with a Pearson correlation of
87.4% for the SVM and 86.9% for the normal and generalized regression, in terms of
how the reported QoE varies from its model-predicted level. The error rate for the
SVM was the minimum, with er = 6.5%. This means that the SVM can predict QoE for
the Mukers HVR game with results that deviate only 6.5 points on a scale of 100 from
the reported subjective QoE values. The linear regression model also performs well
and was capable of quantifying the predicted QoE with Eq. 41.1:
QoE_P = α1·CQ + α2·HWQ + α3·NWQ + α4·UX + α5·US + β    (41.1)
where αi represents the weighting factor for each influencing quality and β is the
intercept. With a coefficient of determination of 91.0%, Eq. 41.1 can be realized with a
95% confidence interval as Eq. 41.2:
QoE_P = 5.751·CQ + 0.1848·HWQ + 0.3766·NWQ + 0.1723·UX + 0.0372·US − 11.29    (41.2)
As can be noticed from Eqs. 41.1 and 41.2, CQ carries the most significant factor
weight. This is due to the fact that the users were not aware of the simulated QoS
disturbance in the controlled scenarios, and most of them reported a noticeable
latency that impacted the responsiveness of the game; the physics and the force
feedback from the haptic devices were also interrupted during the HVR interaction. Keep in
mind that Mukers is a dynamic telehaptic VR environment that demands more
communication with the server in order to update the objects' 3D models and their haptic
and locomotion parameters.
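Eq. 41.2 can be evaluated directly as a weighted sum. The sketch below assumes the printed constant 11.29 is a negative intercept, and the factor values passed in are hypothetical inputs on their respective rating scales:

```python
# Coefficients from Eq. 41.2; the intercept sign is assumed negative.
COEFFS = {"CQ": 5.751, "HWQ": 0.1848, "NWQ": 0.3766, "UX": 0.1723, "US": 0.0372}
INTERCEPT = -11.29

def predict_qoe(factors):
    """Linear QoE prediction: weighted sum of influencing factors
    (content, hardware, network, UX, usability) plus the intercept."""
    return sum(COEFFS[name] * value for name, value in factors.items()) + INTERCEPT
```
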
41.4 Results and Discussion
We first consider QoE prediction based on biosignals only. As can be seen in Fig. 41.3, the
SVM again outperforms the other ML algorithms. It should be noted that even with
only biosignals as the input features for the ML algorithms, we achieved an
acceptable prediction level, with the largest relative error, for the Chi-squared Automatic
Interaction Detection (CHAID) algorithm, equal to 16.64%. It is obvious that incor-
porating both subjective and physiopsychological inputs improves the performance
of all six ML algorithms, reaching almost a 92% correlation rate in the case of the SVM.
More statistics about the performance of the six ML algorithms are depicted in Fig. 41.3.
Lastly, we also feed all the attributes to the ML regression pipeline.
Fig. 41.3 Evaluation performance for ML QoE based on biometrics only (red) and subjective +
biometrics; each column is one of the six ML algorithms (the SVM column first; CHAID, with the largest error, last):
Correlation, biometric only: 0.834, 0.833, 0.828, 0.827, 0.833, 0.793
Correlation, subjective + biometric: 0.919, 0.917, 0.915, 0.9145, 0.899, 0.866
Relative error, biometric only: 0.077, 0.094, 0.0968, 0.1017, 0.1029, 0.1664
Relative error, subjective + biometric: 0.054, 0.0569, 0.0601, 0.0637, 0.0974, 0.1147
41.5 Conclusion
We have presented the technology employed in the experiment as well as the research
methodologies used to realize our DT QoE taxonomy, elaborating the reasoning
and justifications that drove the implementation of this experiment. We have also
commented on the variables that were important for this QoE modelling and why
they were selected, and we built multiple QoE prediction models based on machine learning.
These models can automatically estimate the QoE based on explicit users’ subjective
and implicit biomedical features. The models were benchmarked and assessed using
multiple metrics. In future work, we will emphasize the QoE of DT tele-
operation on a large scale, and we will model an important physiological
attribute, i.e., fatigue, when designing a DT system. We compared our technique with
the existing technique in terms of precision and response time.
References
1. El Saddik, A. (2018). Digital twins: The convergence of multimedia technologies. IEEE
Multimedia, 25(2), 87–92.
2. Steinbach, E., Strese, M., Eid, M., Liu, X., Bhardwaj, A., Liu, Q., Al-Ja’afreh, M., Mahmoodi,
T., Hassen, R., El Saddik, A., et al. (2018). Haptic codecs for the tactile internet. Proceedings
of the IEEE, 107(2), 447–470.
3. Khaitan, S. K., & McCalley, J. D. (2014). Design techniques and applications of cyberphysical
systems: A survey. IEEE Systems Journal, 9(2), 350–365.
4. Feldmann, A., Gasser, O., Lichtblau, F., Pujol, E., Poese, I., Dietzel, C., Wagner, D., Wichtl-
huber, M., Tapiador, J., Vallina-Rodriguez, N., et al. (2020). The lockdown effect: Implications
of the covid-19 pandemic on internet traffic. In Proceedings of the ACM Internet Measurement
Conference (pp. 1–18).
5. Arima, R., Sithu, M., Ishibashi, Y., et al. (2017). QoE assessment of fairness between players in
networked virtual 3D objects identification game using haptic, olfactory, and auditory senses.
International Journal of Communications, Network and System Sciences, 10(07), 129.
6. Narumi, T., Kajinami, T., Nishizaka, S., Tanikawa, T., & Hirose, M. (2011). Pseudo-gustatory
display system based on cross-modal integration of vision, olfaction and gustation. In 2011
IEEE Virtual Reality Conference (VR) (pp. 127–130). IEEE.
7. Hassen, R., & Steinbach, E. (2018). Hssim: An objective haptic quality assessment measure
for force-feedback signals. In 2018 Tenth International Conference on Quality of Multimedia
Experience (QoMEX) (pp. 1–6). IEEE.
8. 3D Systems. (2021). Touch haptic device. Online; accessed May 6, 2021.
9. Hoda, M., Hafidh, B., & El Saddik, A. (2015). Haptic glove for finger rehabilitation. In 2015
IEEE International Conference on Multimedia & Expo Workshops (ICMEW) (pp. 1–6). IEEE.
10. Holland, O., Steinbach, E., Venkatesha Prasad, R., Liu, Q., Dawy, Z., Aijaz, A., Pappas, N.,
Chandra, K., Rao, V.S., Oteafy, S., et al. (2019). The IEEE 1918.1 “tactile internet” standards
working group and its standards. Proceedings of the IEEE, 107(2), 256–279.
11. ITU-T Study Group 13. (2012). Recommendation ITU-T Y.2060: Overview of the Internet of
Things.
12. Fettweis, G. P. (2014). The tactile internet: Applications and challenges. IEEE Vehicular
Technology Magazine, 9(1), 64–70.
13. International Telecommunication Union ITU-T. (2014). The tactile internet. ITU-T Technology
Watch Report.
14. Rank, M., Shi, Z., Müller, H. J., & Hirche, S. (2010). Perception of delay in haptic telepresence
systems. Presence: Teleoperators and Virtual Environments, 19(5), 389–399.
15. Ache, B. W., & Young, J. M. (2005). Olfaction: Diverse species, conserved principles. Neuron,
48(3), 417–430.
16. El Saddik, A., Orozco, M., Eid, M., & Cha, J. (2011). Haptics technologies: Bringing touch to
multimedia. Springer Series on Touch and Haptic Systems. Springer.
17. Yuan, Z., Ghinea, G., & Muntean, G.-M. (2014). Quality of experience study for multiple senso-
rial media delivery. In 2014 International Wireless Communications and Mobile Computing
Conference (IWCMC) (pp. 1142–1146). IEEE.
18. Yuan, Z., Chen, S., Ghinea, G., & Muntean, G.-M. (2014). User quality of experience of
mulsemedia applications. ACM Transactions on Multimedia Computing, Communications,
and Applications, 11(1s), 15:1–15:19.
19. Eid, M., & El Saddik, A. (2012). Admux communication protocol for real-time multi-
modal interaction. In Proceedings of the 2012 IEEE/ACM 16th International Symposium on
Distributed Simulation and Real Time Applications, DS-RT ’12 (pp. 118–123), Washington,
DC, USA. IEEE Computer Society.
20. Robles-De-La-Torre, G. (2006). The importance of the sense of touch in virtual and real
environments. IEEE Multimedia, 13(3), 24–30.
Chapter 42
Deep Learning for Breast Cancer
Diagnosis Using Histopathological
Images
Mohammad Gouse Galety , Firas Husham Almukhtar,
Rebaz Jamal Maaroof, and Fanar Fareed Hanna Rofoo
Abstract Deep learning with convolutional neural networks has broad capabilities
for achieving strong learning results across all kinds of medical imaging methods.
The algorithmic mechanism applied to datasets is prudent enough to qualify their efficiency.
Many random textures and structures are found in the histopathological images of
breast cancer, which deal with multi-color and multi-structure components. Most of
the experiments performed in wet labs derive results conventionally, but when
assisted with computational learning models, the accuracy, reliability, and speci-
ficity of the results are boosted empirically. Employing computational methods
using convolutional neural networks in parallel with the conventional classification
experiments for diagnosing malignant breast cancer images attains
satisfactory results for effective decision-support.
42.1 Introduction
Despite the alarming rise in breast cancer counts, many developed countries have
built preventive healthcare systems to reduce the death rate in recent years. A
boon of technologies and awareness across countries are the significant landmarks
that prevent falling prey to this cancerous epidemic. Medical imaging technologies
are deployed in almost all corners of the world for early detection and screening for
effective prevention. However, medical and health care still work with a varied set of
diagnostic tests and mechanisms in parallel with technologies of high sensitivity
and specificity converging on the classification as ‘cancer’ or ‘no cancer.’
Research and literary consensus demonstrate work at multiple levels of resolution
on the histopathological images, and many time-hard analyses in this research feed
into valuable academic theses. Curated datasets are the golden collection of
benchmarks from which experts, with minimal diagnostic risks in computer-
aided diagnosis (CAD), have derived many sophisticated decision-making methods.
M. G. Galety (B) · F. H. Almukhtar · R. J. Maaroof · F. F. H. Rofoo
Department of Information Technology, College of Computer Science and Information
Technology, Catholic University in Erbil, Erbil, Kurdistan Region 44001, Iraq
e-mail: m.galety@cue.edu.krd
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
B. N. K. Rao et al. (eds.), Intelligent Computing and Applications, Smart Innovation,
Systems and Technologies 315, https://doi.org/10.1007/978-981-19-4162-7_42
Experts and the research community annotate data for prospective decision-making,
building automated systems with the promising features of reducing risk and
improving efficiency in diagnosis.
42.2 Literature Review
Classification is the process of categorizing and identifying the types of breast cancer
from high-resolution histopathological images, here performed with deep learning
using convolutional neural networks. Classifying benign and malignant tissues is the
first order of work in the prognostic evaluation [1]. Carcinoma, hostility in the tissues,
is studied with many morphological properties to determine the various levels of
severity [2]. Hematoxylin and Eosin (H&E) staining is the mechanism conventionally
used to trace the primary stains on the specimens. From rule-based machine learning
methods to the convolutional neural networks of deep learning, many digital compositions
of histopathological analysis are employed to achieve an automated end-to-end process [3, 4].
Gradient descent and gradient boosting algorithms were prioritized during the
iterations of a CNN model pre-trained on ImageNet. Collections of H&E-stained
histopathological breast cancer images are processed with the pre-trained ImageNet
model. A fully automated computer-aided diagnostic system effectively uses DL and ML to
diagnose breast cancer [5]. Detecting and extracting patches containing masses, and
classifying them, form the core of the CNN frameworks. External interventions in early
screening, detection, and preprocessing, such as removal of the pectoral muscle and
mass segments, need not be encouraged if CAD can perform them effectively.
Many inherent features and relationships in the histopathological images are
discovered using machine learning. The performance of machine learning is further
extended with convolutional neural networks, which stand as a hallmark of intelligent
decision-making. Learning-based critical problems are solved meticulously using
convolutional neural networks, particularly on temporal and spatial datasets [6].
Inter-pathologist variability is an essential consideration in traditional
systems while conducting diagnostic experiments, which are prone to errors and
biases. Using quantitative analyses in the clinical setup, such variations are well miti-
gated [7]. Still, H&E introduces variability in the stained areas, which opens
avenues of research for new contributions toward proving accuracy in image analysis.
Resolution, format, and structure of the images influence the development of
imaging devices. Large images occupy much storage space, and the format and
the structure of the image play a significant role. Whole-slide digital pathology
images (WSI) are examples of the detailed imagery of histopathological breast cancer.
These images are difficult to slice into the hundreds of smaller tiles needed during
the preprocessing and selection stage; hence, the lowest detailed magnification
of the imagery is considered for classification and segmentation tasks. Metadata
is essential for mapping sliced tiles to their WSI positions, and must be integrated
with the image information. The WSI experiments break the barriers of handling errors during
the mapping of sliced images to the whole-image format, subsequently providing
input to the convolutional neural networks to alleviate the challenges of uncertainty,
size, and format. As problems of tumor identification are well dealt with by machine
learning frameworks, AI is the carrier for mitigating the imaging problems and their
sizes without compromising the quality of the identification and description
of the detected tumor, as quoted by Robertson et al. [2, 8].
42.3 Proposed Method
Computer-aided diagnosis (CAD), software that supports medical diagnosis, shall use
datasets, preferably benchmark datasets that are proven for the experimentation
of deep learning and conventional models. Benchmark datasets like BreakHis provide
a clinically relevant public breast cancer histopathology dataset, with different kinds of
trade-offs for practitioners, and constitute a significant study to date; internal clinical
data, however, is worth comparing against the benchmark data.
Pathological specimens from the experiments are thoroughly expedited during
automated examination, a just-in-time process that adopts economy in analyses [9–11]
(Fig. 42.1).
Various resolutions of images and WSI have been considered for the experiments;
based on the degree of magnification of the image, the images are collected
into four major categories under the benign and malignant classes. The preprocessing stage of
breast cancer diagnosis using convolutional neural networks begins with differenti-
ating the images as benign or malignant; they are then sub-categorized as shown
above. For effective classification and sub-classification of the images, a binary classifier
is employed to decide benign versus malignant, overcoming the problem
of iterative re-work.
The convolutional neural network employed on the datasets depends on the
classification requirements. Binary classification can ideally work
under magnification-specific (MS) training scenarios, while multi-category classifica-
tion works on magnification-independent (MI) training strategies. MS
employs various models with different sets of specifically magnified image
datasets, per the WHO identification of benign and malignant cancers. Data abstraction
is considered during the MI implementation, as all magnification levels are
considered, unlike in the MS method.
Fig. 42.1 WHO-recommended classification and pathological categorization of breast cancer
Feature Extraction
Recent research methods are proposed to extract image features, which comprise
constituent blocks such as edges, corners, blobs, clouds, and ridges. Any of these
feature properties can be considered for the analyses of the breast cancer images.
These properties are reflected as pixels representing the color attributes and the
pathological attributes viewed in the physical image. Matrix-based methods,
binary-pattern methods, and color-histogram methods can be used in deep learning
to identify and extract features from the images systematically. These methods, with
complex computations run in parallel with pathological studies, will undoubtedly
compete with wet-lab experimentation.
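As an illustration of the color-histogram family of methods mentioned above (a minimal sketch, not the chapter's implementation; the bin count is an assumption):

```python
import numpy as np

def color_histogram_features(image, bins=8):
    """Concatenated per-channel color histogram of an RGB image
    (H x W x 3, values 0-255), normalized to sum to 1 per channel.
    A simple global feature vector for histopathology patches."""
    feats = []
    for c in range(image.shape[2]):
        hist, _ = np.histogram(image[..., c], bins=bins, range=(0, 256))
        feats.append(hist / max(hist.sum(), 1))
    return np.concatenate(feats)
```

The resulting fixed-length vector can feed a conventional classifier or serve as a baseline against learned CNN features.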
A floating image containing the tumor/cancer is correlated with a healthy reference
image, and selected areas of the images are examined using similarity computations
and statistics. Statistical methods such as cross-correlation, phase correlation,
image ratio uniformity, and difference of squares are used; other methods based on
mutual information over the selected areas of the populated features involve
complex computations. The similarity process examines the mapping of feature points
of the healthy image to those of the floating image. Specific parameters identified
from the healthy-to-tumor mapping are treated as conversion or transformation
parameters, with bifurcation and cross-over points, which are studied algorithmically
(Fig. 42.2).
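The cross-correlation and difference-of-squares similarity computations mentioned above can be sketched as follows (a minimal illustration over equally sized patches, not the chapter's implementation):

```python
import numpy as np

def normalized_cross_correlation(ref, flt):
    """Pearson-style normalized cross-correlation between two equally
    sized patches; 1.0 means identical up to brightness/contrast,
    values near 0 mean unrelated."""
    ref = ref.astype(float) - ref.mean()
    flt = flt.astype(float) - flt.mean()
    denom = np.sqrt((ref ** 2).sum() * (flt ** 2).sum())
    if denom == 0:
        return 0.0
    return float((ref * flt).sum() / denom)

def sum_squared_difference(ref, flt):
    """Difference-of-squares similarity: lower is more similar."""
    return float(((ref.astype(float) - flt.astype(float)) ** 2).sum())
```

Note that normalized cross-correlation is invariant to linear brightness changes, while the difference of squares is not; this is why several similarity measures are typically combined.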
When identifying tumors or cancer, specific essential components are not visible
in the raw images, so grayscale aberrations must be eliminated. Tissue-staining
methods like H&E are applied before visualization. The staining process highlights
areas of the components and their morphological features, which can then be seen
with a high-definition microscope.
Fig. 42.2 Basic image analysis framework
Fig. 42.3 Samples of mammography mass lesions: a benign; b malignant
Deep Learning Methodology
From the image corpora, the candidate dataset for the classification process is selected
by scaling the original images to different sizes. The scaling and phasing of the
images influence the learning time, keeping it as low as possible, and irrelevant
portions of the images can be eliminated from the learning process. Grayscale
images are useful but can introduce aberrations and conflicting brightness, which
intrude on the process of identifying and detecting the tumor parts, where the shape
of the tumor is the main indication of its benign or malignant nature. Further, the
convolutional neural network framework is applied to the collected benign and
malignant images (Fig. 42.3).
42.4 Experimental Results
Many essential aspects of malignant nature, proved pathologically from the breast cancer
images, are revealed using the convolutional neural network. After phasing and
scaling, the normalized breast cancer images, sized 220 × 220, yield the primary
tensors of size 192 × 192. The first convolution in the configured CNN is
initiated with a 3 × 3 × 2 kernel filter with a stride of 1 × 1, and 24 filters
are applied. A max-pool with a 2 × 2 pooling layer is produced from the first
convolution, reducing the size to 96 × 96. A ReLU is applied to the resulting output of
the first convolution, which is sent with nonlinearity through the subsequent convolution
and into the successive layers. For the second convolution operation, kernel filters
of size 3 × 3 × 24 with 48 filtrations are applied, and the input size is reduced to
48 × 48 after max-pooling with a stride of 2 × 2; the output is further scaled. Once
again, nonlinearity is added to the output of the previous convolution layer. For
the third convolution operation, kernel filters of size 3 × 3 × 96 with 96 filtrations
are applied, and the input size is reduced to 24 × 24 after max-pooling with a stride
of 2 × 2; the output is further scaled. Nonlinearity is imbibed in the activation stage
using the ReLU function, leading to the fourth convolution with 192 filters of
3 × 3 × 192 as the kernel size.
To clear the anomaly of activations by filling the space during reduction of the
output, the image is max-pooled to the size 12 × 12. Further convolutions process
the results of all the pre-configured layers of the network, including ReLU and 240
filtrations. The 6 × 6 × 240 tensor resulting from the convolutions is then linearized
and flattened to shape the feature vector. The values of the features within the
neurons reflect the symptoms of the malign tissues.
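As an illustrative sketch of the layer stack described above (assuming 'same'-padded convolutions, since the text reports only the pooled sizes, and using a hypothetical function name), the tensor sizes can be traced in plain Python:

```python
def trace_shapes(size, filter_counts):
    """Trace (height, width, channels) through the described stages,
    assuming 'same'-padded 3x3 convolutions, so only each 2x2 max-pool
    halves the spatial dimensions."""
    shapes = []
    for filters in filter_counts:
        size //= 2                     # 2x2 max-pool halves the spatial size
        shapes.append((size, size, filters))
    return shapes

# filter counts from the text: 24, 48, 96, 192 and 240 on a 192 x 192 input
print(trace_shapes(192, [24, 48, 96, 192, 240]))
# the last stage, (6, 6, 240), is the tensor flattened into the feature vector
```

This reproduces the sequence 96, 48, 24, 12 and 6 stated in the text.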
Convolutional neural networks face the problems of underfitting and overfitting. To
overcome overfitting, specific dropout values are used in the dropout layer, where the
feature can be defined in a realistic format. Fewer neurons are used to determine the
class of the datasets to minimize the ambiguity from the fully connected layers. The
fully connected layer at the final stage of convolution brings out a tensor with a
limited number of neurons; in the experiment, it is observed that 48 neurons are
converted into classes under malign and benign. A significant reduction in loss,
with improving accuracy, occurred during the experiment's training and validation,
as depicted in the following graphs (Figs. 42.4 and 42.5).
Fig. 42.4 Misclassified histopathological breast cancer images and the loss incurred during training
and validation toward the accuracy of benign nature of images
Fig. 42.5 Misclassified histopathological breast cancer images and the loss incurred during training
and validation toward the accuracy of malign nature of images
Fig. 42.6 ROC curve demonstrating the AUC for benign and malign breast cancer images
The generalization of the proposed model is based on the selection of the images,
where grading inaccuracies will affect the interest of the deep learning methodology.
From the observations of the experiments conducted, the classification of breast
cancer images into malign categories with their sub-categories, based on the selection
of the datasets and the propensity of the proposed method, is referenced by the AUC
of the ROC, drawn for the malign and benign classes. For the benign class, the ROC
factors obtained are Accuracy (ACC) = 0.7866, Sensitivity (TPR) = 0.7921,
Specificity (TNR) = 0.7837, False Positive Ratio (FPR) = 0.2163, Positive Predictive
Value (PPV) = 0.6597, and Negative Predictive Value (NPV) = 0.8769. For the
malign class, they are Accuracy (ACC) = 0.7849, Sensitivity (TPR) = 0.788,
Specificity (TNR) = 0.7832, False Positive Ratio (FPR) = 0.2168, Positive Predictive
Value (PPV) = 0.673, and Negative Predictive Value (NPV) = 0.8671 (Fig. 42.6).
Therefore, the proposed CNN learning model has achieved measurable results on the
scaled, phased, and normalized histopathological images, with different dimensions
and resolutions, classifying the images as malign. Different kinds of CNN
architectures may be proposed, such as AlexNet and other ImageNet-style networks,
where the malign images of 240 × 240 are given as input to the CNN's convolution,
pooling, and ReLU layers. The proposed model is prudent and flexible in deriving
the desired results. The implementation uses Python with TensorFlow Keras in
Anaconda Navigator, with snippets of code in Jupyter Notebook; specific
activation-function experiments were performed in Google Colab.
42.5 Conclusion
A CNN model for the classification of breast cancer images has been proposed in
this paper, which proves that a simple CNN with a sequential model can be
implemented for image classification. All possible structures and textures of the features of breast
cancer can be thoroughly examined by overcoming various kinds of color-scale
aberrations. The proposed model has achieved the best classification of breast cancer
images as malign. The proposed model can compete with wet-lab experiments
and has promising quantitative and qualitative analysis features. More qualitative,
higher-resolution images can be applied to obtain better image classification
results.
References
1. Elston, C. W., & Ellis, I. O. (1991). Pathological prognostic factors in breast cancer. I. The
value of histological grade in breast cancer: Experience from a large study with long-term
follow-up. Histopathology, 19(5), 403–410.
2. Robertson, S., et al. (2018). Digital image analysis in breast pathology—From image processing
techniques to artificial intelligence. Translational Research, 194, 19–35.
3. Rakhlin, et al. (2018). Deep convolutional neural networks for breast cancer histology image
analysis. In International Conference Image Analysis and Recognition (pp. 737–744). Springer.
4. Spanhol, et al. (2015). A dataset for breast cancer histopathological image classification. IEEE
Transactions on Biomedical Engineering, 63(7), 1455–1462.
5. Shayma’a, et al. (2020). Breast cancer masses classification using deep convolutional neural
networks and transfer learning. Multimedia Tools and Applications, 79(41), 30735–30768.
6. Khan, et al. (2020). A survey of the recent architectures of deep convolutional neural networks.
Artificial Intelligence Review, 53(8), 5455–5516.
7. Srinidhi, et al. (2020). Deep neural network models for computational histopathology: A survey.
Medical Image Analysis, 101813.
8. Reshma, G., Al-Atroshi, C., Nassa, V. K., Geetha, B., Sunitha, G., Galety, M.G., &
Neelakandan, S. (2022). Deep learning-based skin lesion diagnosis model using dermoscopic
images. Intelligent Automation and Soft Computing, 31, 621–634.
9. Galety, M., Al Mukthar, F. H., Maaroof, R. J., & Rofoo, F. (2021). Deep neural network concepts
for classification using convolutional neural network: A systematic review and evaluation.
Technium: Romanian Journal of Applied Sciences and Technology, 3(8), 58–70. http://doi.org/
10.47577/technium.v3i8.4554
10. Sahu, B., Gouse, M., Pattnaik, C. R., & Mohanty, S. N. (2021). MMFA-SVM: New bio-marker
gene discovery algorithms for cancer gene expression. Materials Today: Proceedings. ISSN
2214-7853. http://doi.org/10.1016/j.matpr.2020.11.617
11. Gouse, D. G. M., Haji, C. M., & Saravanan, D. (2018). Improved reconfigurable based
lightweight crypto algorithms for IoT based applications. Journal of Advanced Research in
Dynamical & Control Systems, 10(12), 186–193.
Chapter 43
Implementation of 12 Band Integer
Filter-Bank for Digital Hearing Aid
K. Ayyappa Swamy and Zachariah C. Alex
Abstract To compensate for sensorineural hearing loss, audio signals should be
amplified selectively and very sensitively. For this, the signal can be passed through
a filter-bank structure followed by a gain adjustment block. In this research, we
propose the design of a 12 band integer filter-bank that can produce filtered and
amplified audio signals without the need for additional amplification. The integer
filter-bank also reduces latency compared to a fractional filter-bank. The proposed
method is examined with four audiograms having different kinds of hearing losses.
The proposed 12 band integer filter-bank provides minimum matching error with
acceptable latency.
43.1 Introduction
A digital hearing aid is a device that can compensate for hearing losses of different
kinds. Conductive hearing loss can be compensated using a simple amplifier with
constant gain. In mixed and sensorineural hearing loss cases, gain values may be
different at each frequency. To compensate for mixed and sensorineural hearing
loss, the audio signal is passed through a set of filters known as a filter-bank. The
output of each filter is fed to an amplifier whose gain depends on the level of hearing
loss in that band of frequencies. The block diagram representation of a digital
hearing aid with the filter-bank technique is shown in Fig. 43.1 [1].
K. Ayyappa Swamy ·Z. C. Alex (B)
School of Electronics Engineering, Vellore Institute of Technology, Vellore, Tamil Nadu 632014,
India
e-mail: zachariahcalex@vit.ac.in
K. Ayyappa Swamy
Department of ECE, Aditya Engineering College, Surampalem, Andhra Pradesh 533437, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
B. N. K. Rao et al. (eds.), Intelligent Computing and Applications, Smart Innovation,
Systems and Technologies 315, https://doi.org/10.1007/978-981-19-4162-7_43
Fig. 43.1 Block diagram for digital hearing aid with filter-bank structure [1]
For the past three to four decades, researchers have been working on developing
better algorithms for hearing aids using filter-bank structures. Current research
focuses on reducing complexity as well as latency. The most commonly used
filter-bank is the uniform filter-bank [2], in which the entire range of the audio
signal is divided into equal bands.
In this paper, we have implemented an efficient 12 band non-uniform integer
filter-bank to compensate for hearing loss. We introduce a new integer filter-bank
designed by converting the FIR filter coefficients in the filter-bank to integer
values. The integer filter-bank performs better in terms of speed when compared
with a fractional filter-bank [3]. The proposed integer filter-bank performs gain
adjustment internally, without the need for any additional gain adjustment block.
The performance of the proposed algorithm was tested on two different types of
audiograms.
The following is a breakdown of the paper’s structure. Implementation of integer
filter-bank and how it amplifies without multipliers are discussed in Sect. 43.2.
Design examples are given in Sect. 43.3. In Sect. 43.4, the test results are examined.
Section 43.5 draws the conclusion.
43.2 Integer Filter-Bank with Internal Gain Adjustment
43.2.1 Integer Filter-Bank
The proposed filter-bank, shown in Fig. 43.2, consists of 12 non-uniform FIR filters;
it is a combination of low-pass, high-pass and band-pass filters. In this paper, the
integer filter-bank is implemented by converting fractional filter coefficients into
integer values: the fractional filter coefficients are multiplied by 32,768 (2^15) and
rounded to the nearest integer. Fractional input samples are also converted into
integers by multiplying by 32,768 (2^15), so that integer multiplications and additions
can be performed, which reduces the delay when compared to fractional operations.
After filtering,
Fig. 43.2 Magnitude spectrum for FIR filter-bank designed using a Taylor window
the filtered outputs, which are in the form of integers, are converted back to
fractional values by dividing by 2^30. To reduce the computational complexity, all
multiplications and divisions required in this process are performed by shift
operations, as the factors are powers of 2.
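The conversion scheme above can be sketched in plain Python (function names are illustrative, and a direct-form convolution stands in for the actual filter-bank implementation):

```python
def to_integer_coeffs(frac_coeffs, scale_bits=15):
    """Scale fractional FIR coefficients by 2**15 and round to integers."""
    return [round(c * (1 << scale_bits)) for c in frac_coeffs]

def integer_filter(samples, frac_coeffs, scale_bits=15):
    """FIR filtering carried out entirely in integer arithmetic.

    Inputs and coefficients each carry a 2**15 scale factor, so every
    product is weighted by 2**30; the final division by 2**30 (a right
    shift in a real implementation) restores fractional values."""
    coeffs = to_integer_coeffs(frac_coeffs, scale_bits)
    int_samples = [round(s * (1 << scale_bits)) for s in samples]
    out = []
    for n in range(len(int_samples)):
        acc = 0
        for k, c in enumerate(coeffs):
            if n - k >= 0:
                acc += c * int_samples[n - k]  # pure integer multiply-accumulate
        out.append(acc / (1 << 2 * scale_bits))  # back to a fractional value
    return out
```

The result agrees with ordinary floating-point filtering to within the quantization error introduced by the 2^15 rounding.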
In the filter-bank, each FIR filter was designed using a Taylor window [4] with
Nbar = 4, a side-lobe level of 60 dB, order 70, and a 16,000 Hz sampling frequency.
It is observed that the Taylor window gives better stop-band attenuation for minimum
filter order [5]. For hearing aid applications, filters should have at least 60 dB
attenuation, which is helpful to amplify up to 60 dB [6, 7] in cases of moderately
severe hearing loss.
In this research, we considered a 12 band non-uniform FIR filter-bank. The 12 band
filter-bank is implemented by combining a few filters of the 16 band non-uniform
filter-bank proposed in [1]. Figure 43.2 shows the magnitude spectrum of the 12 band
filter-bank, with each FIR filter designed using a Taylor window with the
specifications mentioned above. As shown in the figure, the designed filter-bank
provides 60 dB stop-band attenuation, which is best in comparison with other
filter-bank types. The filter-banks proposed in [8, 9] provide only up to 50 dB
stop-band attenuation.
43.2.2 Gain Adjustment
In sensorineural or mixed hearing loss, gain values vary with frequency; different
ranges of frequencies may have different gain values. So, in a hearing aid algorithm,
gain adjustment should be performed selectively and very sensitively. The output of
each filter needs to be multiplied by a predefined gain value given by the audiogram
to compensate for the hearing loss in the particular frequency range. The gain value is
calculated based on the hearing threshold at that frequency range which was recorded
using pure tone audiometry.
In the proposed algorithm, there is no need for an additional multiplier in the gain
adjustment block; this is performed internally in the filtering process. Gain values
are converted into a sum of powers of two, 2^m, and each m is subtracted from the
30 used to recover the fractional value in the integer filter-bank explained in the
section above. For example, if the gain value is 20, it is converted into the sum
2^4 + 2^2 = 20. That means that instead of shifting right by 30, the accumulator is
shifted right by 26 (30 − 4) and by 28 (30 − 2), and both results are added to obtain
the amplified output with a gain of 20.
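This shift-and-add gain scheme can be sketched as follows (hypothetical helper names; Python integers stand in for the fixed-point accumulator):

```python
def power_of_two_terms(gain):
    """Decompose an integer gain into its powers of two: 20 -> [2, 4] (2**2 + 2**4)."""
    return [m for m in range(gain.bit_length()) if (gain >> m) & 1]

def apply_gain(acc, gain, frac_bits=30):
    """Scale a 2**30-weighted accumulator by `gain` using only shifts and adds.

    Instead of one right shift by 30, shift right by (30 - m) for every
    power 2**m in the gain and sum the partial results."""
    return sum(acc >> (frac_bits - m) for m in power_of_two_terms(gain))

acc = 7 << 30               # accumulator holding the value 7 in Q30 format
print(apply_gain(acc, 20))  # 140 == 7 * 20, with no multiplier used
```

For a gain of 20, the shifts used are 30 − 4 = 26 and 30 − 2 = 28, exactly as in the example above.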
43.3 Design Example and Performance Evaluation
The proposed 12 band FIR filter-bank is implemented using the audio signal
processing toolbox in MATLAB. Audiograms with various kinds of hearing
impairments are used to assess the efficiency of the proposed non-uniform integer
filter-bank structure. The algorithm proposed in this paper is evaluated with
audiograms having mild conductive hearing loss and moderate loss at mid
frequencies.
Example 1: Mild conductive hearing loss
The result of audiometry for such a case appears in Fig. 43.3. The hearing
thresholds of the right ear, symbolized with 'O', were taken into consideration for
gain adjustment. As per the audiogram, the gain values 25, 25, 25, 35, 25 and 30 are
given to the proposed filter-bank.
Example 2: Middle frequency moderate hearing loss
An audiogram with moderate hearing loss at mid frequencies is considered for
assessing the proposed model in this example. As per the audiogram, the hearing
loss thresholds are 10, 20, 40, 50, 20 and 10, respectively. From the outcomes, it is
clear that the proposed filter-bank gives a 4.25 dB maximum matching error (MME)
at 4000 Hz.
43.4 Results and Discussions
Figure 43.4 depicts the filter-bank's response after gain adjustment. Figure 43.5
shows the matching error. Tables 43.1 and 43.2 make clear that the latency of the
designed hearing aid algorithm is lower compared with other hearing aid algorithms.
Table 43.1 is for the mild conductive hearing loss discussed in Example 1, and Table
43.2 is for the moderate hearing loss at mid frequencies discussed in Example 2 of
the design examples section.
Fig. 43.3 Audiogram for moderately severe conductive hearing loss
Fig. 43.4 Magnitude response of 12 band filter-bank with gain adjustment
Fig. 43.5 Matching curve and matching error
Table 43.1 Proposed filter-bank results comparison with other filter-banks for example 1
Type of filter-bank MME (dB) Delay of the filter-bank (ms)
Fixed uniform (direct design) 6.39 4.3
Fixed non-uniform [10]  9.61  15.7
Reconfigurable [8]  4.82  29
Reconfigurable [9]  5.63  12.1
Reconfigurable [7]  3.33  5.55
Proposed algorithm 4.25 6.56
Table 43.2 Proposed filter-bank results comparison with other filter-banks for example 2
Type of filter-bank MME (dB) Delay of the filter-bank (ms)
Fixed uniform (direct design) 5.86 4.3
Fixed non-uniform [10]  3.67  15.7
Reconfigurable [8]  2.67  25
Reconfigurable [9]  1.84  12.1
Reconfigurable [7]  4.1  5.55
Proposed algorithm 2.7 6.56
43.5 Conclusion
In this paper, a new non-uniform filter-bank structure with integer filter coefficients
is presented. The speed of the filter-bank has been improved in three ways: (1) by
replacing fractional filters with integer filters, (2) by replacing multiplications with
shifting operations in gain adjustment and (3) by using symmetrical filters in the
filter-bank. According to the design examples, the suggested structure outperforms
the existing algorithms in terms of MME and latency (Figs. 43.6, 43.7 and 43.8).
Fig. 43.6 Audiogram for moderate hearing loss at mid frequencies
Fig. 43.7 Magnitude response of 12 band filter-bank with gain adjustment
Fig. 43.8 Matching curve and matching error
Acknowledgements Work reported in this publication was supported by TIDE, DST, Government
of India, under grant reference No. #SEED/TIDE/015/2017/G.
References
1. Wei, Y., & Lian, Y. (2006). A 16-band non-uniform FIR digital filter bank for hearing aid. In
IEEE Biomedical Circuits and Systems Conference (pp. 186–189).
2. Brennan, R., & Schneider, T. (2001). An ultra low-power DSP system with a flexible filterbank.
In Conference Record of Thirty-Fifth Asilomar Conference on Signals, Systems and Computers
(Vol. 1, pp. 809–813).
3. Chrysafis, A. P., & Lansdowne, S. (1988). Fractional and integer arithmetic using the DSP56000
family of general-purpose digital signal processors. Motorola.
4. Prabhu, K. M. (2014). Window functions and their applications in signal processing. Taylor &
Francis.
5. Gabr, R. H., & Kadah, Y. M. (2008). Digital color Doppler signal processing. In 2008 Cairo
International Biomedical Engineering Conference (pp. 1–5).
6. Tiwari, N., & Pandey, P. C. (2019). Sliding-band dynamic range compression for use in hearing
aids. International Journal of Speech Technology, 22(4), 911–926.
7. Swamy, K. A., & Alex, Z. C. (2021). Efficient low delay reconfigurable filter bank using parallel
structure for hearing aid applications with IoT. Personal and Ubiquitous Computing, 1–14.
8. Wei, Y., & Liu, D. (2013). A reconfigurable digital filterbank for hearing-aid systems with a
variety of sound wave decomposition plans. IEEE Transactions on Biomedical Engineering,
60(6).
9. Wei, Y., & Wang, Y. (2015). Design of low complexity adjustable filter bank for personalized
hearing aid solutions. IEEE/ACM Transactions on Audio, Speech, and Language Processing,
23(5), 923–931.
10. Lian, Y., & Wei, Y. (2005). A computationally efficient nonuniform FIR digital filter bank for
hearing aids. IEEE Transactions on Circuits and Systems I: Regular Papers, 52(12), 2754–2762.
Chapter 44
Comparative Analysis on Heart Disease
Prediction Using Convolutional Neural
Network with Adapted Backpropagation
K. Suneetha, Kamala Challa, J. Avanija, Yaswanth Raparthi,
and Suresh Kallam
Abstract According to global medical records, the deadliest disease in the world is
heart disease. In the current generation, most of the population suffers from heart
disease due to a lack of awareness regarding healthy food habits and fitness
maintenance. The death rate has increased, so early diagnosis is essential, and
regular health checkups are necessary. The problems arise due to insufficient blood
flow to the heart, shortness of breath, tiredness, or fatigue. Traditional methods are
not adequate to diagnose heart disease. There is a critical need for a medical system
that detects and predicts heart disease early and provides a more accurate analysis.
Previously, the health industry generated massive unstructured data; with the change
in the modern world, the health industry now generates structured,
machine-understandable data. Nowadays, artificial intelligence methodologies are
becoming more popular. This study proposes a convolutional neural network (CNN)
method to detect heart disease early. It takes 13 medical features as input, and the
CNN is trained by modified backpropagation. The proposed model is compared with
existing models such as the gradient boosted tree, voting, Naïve Bayes, and the
hybrid random forest linear model. The results show that the CNN gives a 96%
accurate outcome in predicting heart disease during testing.
K. Suneetha (B)
CS and IT, Jain (Deemed-to-be University), Bangalore, India
e-mail: keerthisuni.k@gmail.com
K. Challa
Department of Information Technology, VNR VJIET, Hyderabad, Telangana, India
J. Avanija ·S. Kallam
Department of CSE, Sree Vidyanikethan Engineering College, Tirupati, India
Y. Raparthi
School of Computer Science and Engineering, Vellore Institute of Technology, Vellore, Tamil
Nadu, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
B. N. K. Rao et al. (eds.), Intelligent Computing and Applications, Smart Innovation,
Systems and Technologies 315, https://doi.org/10.1007/978-981-19-4162-7_44
44.1 Introduction
World Health Organization (WHO) statistics show that heart disease is the leading
cause of death around the world. The incidence of heart disease has the highest rate
in many countries, including India, where 2.6 million people suffer from heart
disease. After the first or second treatment, half of them lose their lives. The death
rates are increasing because the disease is not recognized at the starting stage. The
physician could give effective medicines and treatments to the patient if the disease
were predicted early, saving many lives. Developing a system that offers low
operational cost and high accuracy is the primary point. In medical analysis,
artificial intelligence works to find the hidden patterns of the data. To identify heart
disease early and provide solutions, risk factors such as hypertension, diabetes,
gender, and age are considered. Here, data collection is the foremost task, and the
medical domain contains vast databases that store all types of patient records.
Patients' data is stored in a raw form that machines cannot interpret, as shown in
Fig. 44.1. Data wrangling is the process of transforming raw data into machine-
readable information. The physician selects and filters the raw data in order to
choose the information necessary for analysis. The training algorithm discovers the
hidden patterns and rules in the filtered data, and the test algorithm evaluates the
model's performance by determining its accuracy. If the model's accuracy is
adequate after training and testing, the data will be put to use. Deployment is the
result of integrating optimization with operational activities. For heart disease
prediction in automated health diagnosis, this study proposes a convolutional neural
network model. The convolutional neural network model is used to predict heart
disease early and is trained using a modified backpropagation method. The
prediction is done by performing the analysis on a heart disease dataset, and the
results were compared with the gradient boosted tree, voting, Naïve Bayes, and
hybrid random forest linear model.
The paper is organized as follows: Sect. 44.2 covers related works; Sect. 44.3
describes the data collection in detail; Sect. 44.4 gives a detailed explanation of the
convolutional neural network; Sect. 44.5 describes the proposed CNN with adapted
backpropagation; Sect. 44.6 presents the performance metrics for the proposed
model on heart disease prediction and the experimental results; and finally,
Sect. 44.7 discusses the conclusion and future work.
44.2 Literature Survey
Many researchers have rigorously pursued research work on heart disease
prediction. Waris and Koteeswaran [1] used an improved k-means neighbor classifier
technique to predict heart disease and illustrated it with an ER diagram and
architecture. Sinha [2] implemented MRI techniques that include conventional lymphangiograms
Fig. 44.1 Architecture of the machine learning technique for data analytics
and dynamic contrast-enhanced magnetic resonance imaging to predict disease in
the heart.
Doppala [3] worked on a genetic algorithm using feature selection with a radial
basis function to detect coronary illness; the proposed study gave better accuracy
after the features were reduced to nine during data preprocessing. Rani [4] proposed
a hybrid decision system and genetic algorithm to predict heart disease. Bazoukis
and Stavrakis [5] explained the conventional clinical methods of heart disease. The
study illustrates the search strategy, data extraction, and statistical analysis for
predicting the output of the left ventricular assist device.
Patel et al. [6] implemented an IoT and MQTT-based machine learning framework,
which estimates precision, disposition, and affectability for cardiovascular disease.
Al-Yarimi et al. [7] worked on n-gram attribute optimization by distinct attribute
correlation weight, and the experiments gave significant results. Machine learning
techniques including Naive Bayes, support vector machine, decision tree, random
forest, and K-nearest neighbor were used by El Hamdaoui [8] to predict cardiac
disease. With an accuracy rate of 84.17% in training and 84.28% in testing, the
Naive Bayes algorithm was the most accurate method.
Kamalapurkar and Samyama Gunjal [9] worked on a web-based system using
machine learning. It provides more accurate results compared with two artificial
intelligence techniques, the support vector machine and random forest. Spencer [10]
predicted heart disease using four different datasets (the Cleveland, Long-Beach-VA,
Hungarian, and Switzerland datasets). Filter and data extraction methods such as PC
features, Chi-squared, Relief, and SU were used, together with classification methods
such as logistic regression and AdaBoost.
Mohan [11] used cybersecurity for the information-centric Internet of Things, with
machine learning algorithms applied to different features to predict heart disease.
Einarson et al. [12] estimated cardiovascular disease in type 2 diabetes and reviewed
papers published over ten years (2007–2017). Abiko [13] measured the serum
concentration of hs-cTnT to predict heart disease. Shanmugasundaram [14] used
machine learning approaches such as Naïve Bayes, decision tree, and K-nearest
neighbor to predict heart disease. Related work includes brain tumor classification
of MRI images using a deep convolutional neural network [15] and cardiac
arrhythmia detection using the dual-tree wavelet transform and a convolutional
neural network [16].
Based on the above literature survey, many machine learning techniques have been
developed in the medical domain to analyze the patient data generated from medical
databases. The time to analyze medical data is a significant concern because patient
data are complex and hard to diagnose, so it is difficult to predict disease based on
the patient data stored in the medical database. Due to unpredictability in massive
datasets, the false positive rate increases and precision is reduced in diagnosis.
Therefore, an efficient learning algorithm must lower computation time and improve
diagnostic accuracy using medical data analytics.
44.3 Methodology
44.3.1 Data Collecting
The Heart Disease dataset is available in the UCI Machine Learning Repository and
is used extensively by researchers for evaluation and analysis. The heart disease
dataset contains 303 instances and 13 features, shown in Table 44.1. In the dataset,
the age value ranges from 34 to 77 years; 0 represents female patients and 1
represents male patients. Chest pain is categorized into four types: (i) atypical
angina, where the heart does not receive the required amount of blood, resulting in
narrowing of the coronary arteries; (ii) typical angina, where the heart likewise does
not receive the necessary amount of blood, resulting in narrowing of the coronary
arteries, the difference being the mental stress and emotional feeling accompanying
the chest pain; (iii) asymptomatic, chest pains that arise from various other reasons
and are not considered heart disease; and (iv) non-anginal pain, symptoms which do
not reflect heart disease. The trestbps feature signifies the resting blood pressure,
and the level of cholesterol is labeled chol. Exercise-induced angina, with feature
code exang, is 1 when the chest pain is caused by a lack of oxygen supply to the
heart and 0 when there is no chest pain. FBS means fasting blood sugar. The
maximum heart rate achieved is thalach. Ca is the number of major vessels colored
by fluoroscopy. ST-depression induced by exercise relative to rest gives oldpeak. A
normal resting electrocardiographic result is indicated with 0, an ST-T wave
abnormality in restecg is indicated with 1, and left ventricular hypertrophy with 2.
The slope of the peak reflects the increment in heart rate, where 1
represents up sloping, 2 represents flat, and 3 represents down sloping. A thallium
scan checks how much blood reaches all body parts using a radioactive tracer.
There are missing values in the dataset; they are replaced with interpolated values.
Next, the dataset is split into 85% training data with 258 instances and 13 attributes,
and 15% testing data with 45 instances and 13 attributes. Another attribute is
considered as the label, which is the output parameter for the neural network, where
1 represents the presence of heart disease and 0 the absence of heart disease.
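The split and the missing-value handling described above can be sketched as follows (illustrative helpers, not the authors' code; the interpolation here is a simple nearest-neighbour average standing in for whatever interpolation the study used):

```python
def split_dataset(n_instances, train_frac=0.85):
    """Split instance counts 85% / 15%, as described for the 303-instance set."""
    n_train = round(n_instances * train_frac)
    return n_train, n_instances - n_train

def interpolate_missing(values):
    """Replace missing entries (None) with the average of the nearest
    non-missing neighbours -- a simple stand-in for the interpolation
    mentioned in the text."""
    out = list(values)
    for i, v in enumerate(out):
        if v is None:
            prev = next((out[j] for j in range(i - 1, -1, -1) if out[j] is not None), None)
            nxt = next((out[j] for j in range(i + 1, len(out)) if out[j] is not None), None)
            if prev is not None and nxt is not None:
                out[i] = (prev + nxt) / 2
            else:
                out[i] = prev if prev is not None else nxt
    return out

print(split_dataset(303))                     # (258, 45), matching the text
print(interpolate_missing([1.0, None, 3.0]))  # [1.0, 2.0, 3.0]
```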
Table 44.1 Details of the dataset

Features | Codes | Description
Age | Age | Age in years
Chest pain type | Cp | 1 = typical angina; 2 = atypical angina; 3 = non-anginal pain; 4 = asymptomatic
Exercise-induced angina | Exang | 0 = no; 1 = yes
Fasting blood sugar | Fbs | 0 = false; 1 = true
Maximum heart rate achieved | Thalach | Displays maximum heart rate
Number of major vessels | Ca | Displays values as integers or floats
Oldpeak | Oldpeak | Displays values as integers or floats
Resting blood pressure | Trestbps | Displays values in mmHg
Resting electrocardiographic results | Restecg | 0 = normal; 1 = ST-T wave abnormality; 2 = left ventricular hypertrophy
Serum cholesterol | Chol | Displays values in mg/dl
Sex | Sex | 0 = female; 1 = male
Slope of the peak | Slope | 1 = up sloping; 2 = flat; 3 = down sloping
Thallium scan | Thal | 3 = normal; 6 = fixed defect; 7 = reversible defect
44.3.2 Classification Modeling
44.3.2.1 Gradient Boosted Tree
Gradient boosted trees are constructed based on the entropy of the training samples
of the data, following a top-down, recursive, divide-and-conquer approach. Pruning
of the gradient boosted tree is performed to optimize the error over the dataset.
$\text{Entropy} = -\sum_{j=1}^{m} x_{ij} \log_2 x_{ij}$  (44.1)
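Eq. (44.1) can be evaluated with a small helper (illustrative; the inputs are assumed to be the class proportions at a tree node):

```python
import math

def entropy(probabilities):
    """Shannon entropy in bits, following Eq. (44.1): -sum p * log2(p)."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

print(entropy([0.5, 0.5]))   # 1.0 bit for a perfectly balanced binary split
```

A pure node (all instances in one class) gives an entropy of zero, so splits that reduce entropy are preferred during tree construction.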
44.3.2.2 Voting
Voting is an aggregation technique combining multiple classifiers. The training
dataset is divided into smaller, equal subsets, and a classifier is constructed for each
subset of the data. The final decision is based on the maximum number of votes a
class achieves, obtained by summing all the votes and choosing the class with the
highest aggregate value.
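A minimal sketch of this majority-vote aggregation (hypothetical function name):

```python
from collections import Counter

def majority_vote(predictions):
    """Aggregate the per-subset classifiers' outputs: the class with the most votes wins."""
    return Counter(predictions).most_common(1)[0][0]

# three classifiers, each trained on its own subset, vote on one patient
print(majority_vote([1, 0, 1]))   # 1 -> heart disease predicted by majority
```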
44.3.2.3 Naïve Bayes
The Naïve Bayes model applies the Bayes rule under the assumption of independent
attributes. An instance of data is allocated to the class with the highest posterior
probability. The model is trained through the Gaussian likelihood function with a
prior probability for each class:

$P(x_{f_1}, x_{f_2}, \ldots, x_{f_m} \mid c) = \prod_{i=1}^{m} P(x_{f_i} \mid c)$  (44.2)

$P(x_f \mid c_i) = \frac{P(c_i \mid x_f)\, P(x_f)}{P(c_i)}$  (44.3)
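A sketch of the Gaussian Naïve Bayes scoring in Eqs. (44.2)–(44.3) (illustrative names; in practice the class priors, means, and variances would be estimated from the training data):

```python
import math

def gaussian_pdf(x, mean, var):
    """Gaussian likelihood P(x_fi | c) for a single feature."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def naive_bayes_score(x, priors, means, variances):
    """Pick the class with the highest unnormalised posterior:
    prior * product of per-feature Gaussian likelihoods (Eq. 44.2)."""
    scores = {}
    for c in priors:
        likelihood = 1.0
        for i, xi in enumerate(x):
            likelihood *= gaussian_pdf(xi, means[c][i], variances[c][i])
        scores[c] = priors[c] * likelihood
    return max(scores, key=scores.get)

# one feature, two classes: an instance near the class-1 mean is assigned class 1
print(naive_bayes_score([4.5], {0: 0.5, 1: 0.5},
                        {0: [0.0], 1: [5.0]}, {0: [1.0], 1: [1.0]}))
```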
44.3.2.4 HRFLM
HRFLM is the combination of random forest and linear model. In this method, prepro-
cessing data followed by feature selection based on the classification performance
evaluation, entropy, and outcome accuracy is achieved. The feature selection model
repeats for various combinations of attributes. The HRFLM model’s performance
44 Comparative Analysis on Heart Disease Prediction Using 471
based on the thirteen attributes and the machine learning technique used for repe-
tition and performance is recorded. A linear model was developed in statistics and
studied as a model for understanding the relationship between input and output vari-
ables. The ensemble classifier constructs several random decision trees and combines
them to get the best outcome. It mainly applies bagging or bootstrap aggregating. In
the preprocessing data stage, the missing values removed from the data.
For the given data, x=x1,x2,....,xnwith y=x1,x2,....,xnwhich repeats
bagging from n=1toN. The average prediction for an individual tree is:
f̂(x) = (1/N) Σ_{b=1}^{N} f_b(x)    (44.4)
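A hedged sketch of an HRFLM-style hybrid: the chapter combines a random forest with a linear model, and one simple reading, used here, averages their predicted probabilities. The published method's exact combination rule may differ; data is synthetic stand-in data.

```python
# Hybrid of bagged trees (random forest) and a linear model: blend the
# two probability estimates, then threshold at 0.5. Illustrative only.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=300, n_features=13, random_state=2)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=2)

rf = RandomForestClassifier(random_state=2).fit(X_tr, y_tr)  # bagged trees
lm = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)       # linear model

p = (rf.predict_proba(X_te)[:, 1] + lm.predict_proba(X_te)[:, 1]) / 2
pred = (p >= 0.5).astype(int)
print("hybrid test accuracy:", np.mean(pred == y_te))
```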
44.4 Convolutional Neural Network
Among deep neural networks, convolutional neural networks are advanced and practical models for prediction. They have many layers with which to learn both low- and high-level features. A convolutional neural network contains convolution layers, pooling layers, fully connected layers, and a softmax layer. The convolution layer learns pixel-level features by splitting the image into small boxes of pixels; in this layer, the CNN performs kernel and filtering operations on the data, where the input is the output of the previous layer. The pooling layers drop unused parameters and thereby reduce the dimensions of the feature maps: (i) the max-pooling layer takes the maximum element of the input data in each feature-map region; (ii) average pooling calculates the average of the input data over the feature-map window; (iii) global pooling reduces each feature map to a single value. The fully connected layers take the transformed vector: the feature map is flattened into a one-dimensional feature vector and fed into the neural network, with each layer connected to an activation unit, and a softmax activation function performs the classification, as shown in Fig. 44.2.
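The layer operations described above (convolution, max pooling, flatten, softmax) can be sketched minimally in NumPy. Shapes and values are illustrative, not the chapter's actual network.

```python
# NumPy sketch of one conv + ReLU + max-pool + softmax pass.
import numpy as np

def conv2d(img, kernel):
    # Valid-mode 2-D cross-correlation (the "kernel and filtering" step)
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(fmap, size=2):
    # Keep the maximum element in each size x size region
    h, w = fmap.shape
    return fmap[:h - h % size, :w - w % size] \
        .reshape(h // size, size, w // size, size).max(axis=(1, 3))

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

img = np.random.default_rng(0).random((8, 8))
fmap = np.maximum(conv2d(img, np.ones((3, 3)) / 9), 0)  # conv + ReLU
pooled = max_pool(fmap)                                  # 6x6 -> 3x3
logits = pooled.flatten() @ np.random.default_rng(1).random((9, 2))
print(softmax(logits))  # two-class probabilities summing to 1
```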
44.5 CNN Modified Backpropagation
A convolutional neural network is a set of connected input and output elements, each with an associated weight. In the learning stage, the network adjusts the weights to predict the correct class label of the input tuple. Backpropagation is a fast, matrix-based algorithm for computing the output of a network; it fine-tunes the weights based on the error rate of the previous epoch and calculates the gradient of the loss function with respect to all the network weights. The CNN-based prediction model here was trained using a modified backpropagation method. The equations below compute the input and output deviations. A low deviation indicates values close to the mean (expected value), while a high deviation indicates values spread over a wide range. Weights and biases are essential elements of a neural network: during transmission between neurons, the weight is applied to the inputs, which are passed into the activation function to find patterns that predict heart disease.
Fig. 44.2 Convolutional neural network architecture
Input deviation of the nth neuron:

dI^0_n = (y_n − t_n) φ′(ϑ_n) = φ′(ϑ_n) dO^0_n    (44.5)

Outcome deviation of the nth neuron:

dO^0_n = (y_n − t_n)    (44.6)
Bias and weight variation of the nth neuron:

ΔBias^0_n = dI^0_n    (44.7)

ΔWeight^0_{n,p} = dI^0_n · y_{n,p}    (44.8)
Input deviation of the nth neuron in hidden layer L:

dI^L_n = φ′(ϑ_n) dO^L_n    (44.9)
Outcome deviation of the nth neuron in hidden layer L:

dO^L_n = Σ_i dI^0_i W_{i,n}    (44.10)
Bias and weight difference in row p, column q of the kth feature pattern, for the layer in front of the n neurons in hidden layer L:

ΔWeight^{L,n}_{k,p,q} = dI^L_n · y^k_{p,q}    (44.11)

ΔBias^L_n = dI^L_n    (44.12)
Input deviation of row p, column q in the subsampling layer s and kth feature pattern:

dI^{s,k}_{p,q} = φ′(ϑ_n) dO^{s,k}_{p,q}    (44.13)
Output deviation of row p, column q in the subsampling layer s and kth feature pattern:

dO^{s,k}_{p,q} = Σ_n dI^L_n W^{L,n}_{k,p,q}    (44.14)
Bias and weight difference in row p, column q in the subsampling layer s and kth feature pattern:

ΔWeight^{s,k} = Σ_{p=0}^{f_h} Σ_{q=0}^{f_w} dI^{s,k}_{⌊p/2⌋,⌊q/2⌋} O^{c,k}_{p,q}    (44.15)
where c denotes a convolution layer.

ΔBias^{s,k} = Σ_{p=0}^{f_h} Σ_{q=0}^{f_w} dO^{c,k}_{p,q}    (44.16)
Input deviation of row p and column q in the nth feature pattern and convolutional layer c:

dI^{c,n}_{p,q} = φ′(ϑ_n) dO^{c,n}_{p,q}    (44.17)
Output deviation of row p and column q in the kth feature pattern and convolutional layer c:

dO^{c,k}_{p,q} = dI^{s,k}_{⌊p/2⌋,⌊q/2⌋} W^n    (44.18)
Fig. 44.3 CNN flow chart
Weight variation of row r, column c of the kth core, corresponding to the nth feature pattern in the convolutional layer:

ΔWeight^{n,k}_{r,c} = Σ_{p=0}^{f_h} Σ_{q=0}^{f_w} dI^{c,n}_{p,q} O^{i−1,k}_{p+r,q+c}    (44.19)
The complete bias variation of the convolutional layer core:

ΔBias^{c,n} = Σ_{p=0}^{f_h} Σ_{q=0}^{f_w} dI^{c,n}_{p,q}    (44.20)
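As a concrete numeric check of the idea behind Eqs. (44.5)-(44.8) for a single sigmoid output neuron: the output deviation (y − t) is scaled by the activation derivative to get the input deviation, which directly yields the bias and weight gradients. Inputs, weights, and the target below are arbitrary illustrative values.

```python
# Verify dI = (y - t) * phi'(v) against a finite-difference gradient of
# the squared-error/2 loss.
import numpy as np

def sigmoid(v):
    return 1 / (1 + np.exp(-v))

x, t = np.array([0.5, -1.0]), 1.0       # input and target (illustrative)
w, b = np.array([0.3, 0.7]), 0.1        # weight and bias

v = w @ x + b
y = sigmoid(v)
dO = y - t                               # Eq. (44.6)
dI = dO * y * (1 - y)                    # Eq. (44.5); phi'(v) for sigmoid
grad_b, grad_w = dI, dI * x              # Eqs. (44.7)-(44.8)

eps = 1e-6
num = (0.5 * (sigmoid(w @ x + b + eps) - t) ** 2
       - 0.5 * (sigmoid(w @ x + b - eps) - t) ** 2) / (2 * eps)
print(np.isclose(grad_b, num))  # True: analytic and numeric match
```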
The convolutional neural network is beneficial because of its high discriminative power. Figure 44.3 represents the flow chart of the proposed method:
(a) In the data collection stage, the dataset is given as input to the model.
(b) The data preprocessing stage focuses on noise reduction and feature selection.
(c) In the data mining stage, the convolutional neural network is applied to the dataset for processing.
(d) In the pattern evaluation stage, performance analysis is executed to calculate the outcomes.
(e) In the knowledge discovery stage, the level of heart disease is discovered.
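The five stages above can be sketched as a small scikit-learn pipeline on stand-in data. The chapter's CNN is replaced here by a generic classifier purely to show the flow; all names and parameters are illustrative.

```python
# Pipeline mirroring stages (a)-(e): data, preprocessing + feature
# selection, mining (classification), and evaluation.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=300, n_features=13, random_state=3)  # (a)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=3)

pipe = Pipeline([
    ("scale", StandardScaler()),                  # (b) preprocessing
    ("select", SelectKBest(k=8)),                 # (b) feature selection
    ("clf", LogisticRegression(max_iter=1000)),   # (c) mining stage
])
pipe.fit(X_tr, y_tr)
pred = pipe.predict(X_te)                         # (e) predicted status
print("accuracy:", accuracy_score(y_te, pred))    # (d) pattern evaluation
```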
44.6 Results and Discussion
The experimental results of the CNN and the existing models are used to evaluate model performance. Several standard metrics, namely accuracy, precision, recall, and F1-score, have been measured and compared graphically (Fig. 44.4). Accuracy is the fraction of true positives and true negatives over the total number of classifications. Precision shows the closeness between the actual values and the measurements. The F1-score is the mean of precision and recall. Recall shows the model's performance across the classification models and is the fraction of relevant instances retrieved. The metrics are built from true positives (TP, correctly predicted occurrences), true negatives (TN, correctly predicted non-occurrences), false positives (FP, incorrectly predicted occurrences), and false negatives (FN, incorrectly missed occurrences). These are needed to measure the accuracy outcomes; with this metric evaluation, heart disease prediction can be achieved accurately.
Accuracy = (TP + TN) / (TP + FP + TN + FN)    (44.21)

Precision = TP / (TP + FP)    (44.22)

Recall = TP / (TP + FN)    (44.23)

F1-score = 2 · Precision · Recall / (Precision + Recall)    (44.24)
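Eqs. (44.21)-(44.24) are direct arithmetic over the confusion-matrix counts; the counts below are hypothetical, chosen only to show the computation.

```python
# Evaluation metrics from a hypothetical confusion matrix.
TP, TN, FP, FN = 90, 85, 10, 15

accuracy = (TP + TN) / (TP + FP + TN + FN)
precision = TP / (TP + FP)
recall = TP / (TP + FN)
f1 = 2 * precision * recall / (precision + recall)

print(round(accuracy, 3), round(precision, 3),
      round(recall, 3), round(f1, 3))  # 0.875 0.9 0.857 0.878
```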
[Bar chart comparing accuracy, precision, recall, and F1-score for the Gradient Boosted Tree, Voting, Naïve Bayes, HRFLM, and CNN models]
Fig. 44.4 Result and analysis
Machine learning builds a mathematical model based on sample data. It discovers new knowledge from the dataset and develops a system that automatically adapts and customizes itself to individual users. Finding the best-performing model in comparison with existing models is the primary emphasis of machine learning approaches. The convolutional neural network was offered as a method for predicting heart disease because of its high accuracy and low classification error.
Figure 44.4 compares all the experimental results with the existing models; CNN achieved the highest accuracy. The convolutional neural network is trained with four fully connected layers and 91 feature maps; the fully connected layers contain hidden layers of 1024 units each. System performance is calculated by changing the number of convolutional layers: the convolutional layers are increased from 2 to 5 with similar feature maps to note their impact, with no change made to the fully connected layers. The best result, 96% accuracy, is accomplished with four convolutional layers.
44.7 Conclusion
In this study, a convolutional neural network algorithm is used to predict the presence of heart disease. With this neural network, heart disease can be detected at minimal cost and with low time consumption. The model is evaluated on a heart disease dataset containing thirteen features and one class label. A modified CNN backpropagation algorithm is used, which outputs the patient's health condition (heart disease present or not). Although most prior research used various machine learning algorithms, the convolutional neural network gave the best results compared with the Gradient Boosted Tree, Voting, Naïve Bayes, and hybrid random forest linear model. Future research may combine different machine learning techniques and new feature selection methods, and add more diseases, to predict the risk of patients suffering from a heart attack.
References
1. Waris, S. F., & Koteeswaran, S. (2021). Heart disease early prediction using a novel machine
learning method called improved K-means neighbor classifier in python. Materials Today
Proceedings
2. Sinha, S., Lee, E. W., Dori, Y., & Katsuhide, M. (2021). Advances in lymphatic imaging and
interventions in patients with congenital heart disease. Progress in Pediatric Cardiology.
3. Doppala, B. P., Bhattacharyya, D., Chakkravarthy, M., & Kim, T. H. (2021). Hybrid machine
learning approach to identify coronary diseases using feature selection mechanism on heart
disease dataset. Distributed and Parallel Databases.
4. Rani, P., Kumar, R., Ahmed, N. M. O., & Jain, A. (2021). A decision support system for heart
disease prediction based upon machine learning. Journal of Reliable Intelligent Environments.
5. Bazoukis, G., Stavrakis, S., Zhou, J., et al. (2020). Machine learning versus conventional clinical
methods in guiding management of heart failure patients—A systematic review. International
Journal of Recent Advances in Multidisciplinary Topics.
6. Patel, W. D., Vala, B., & Parekh, H. (2021). An advanced cognitive approach for heart disease
prediction based on machine learning and internet of medical things (IoMT). In Proceedings of
the Second International Conference on Information Management and Machine Intelligence.
7. Al-Yarimi, F. A. M., Munassar, N. M. A., et al. (2020). Feature optimization by discrete weights
for heart disease prediction using supervised learning. Methodologies and Application.
8. El Hamdaoui, H., Boujraf, S., El Houda Chaoui, N., & Maaroufi, M. (2020). A clinical
support system for prediction of heart disease using machine learning techniques. In 2020
5th International Conference on Advanced Technologies for Signal and Image Processing
(ATSIP).
9. Kamalapurkar, S., & Samyama Gunjal, G. H. (2020). Online portal for prediction of heart
disease using machine learning ensemble method (PrHD-ML). In 2020 IEEE Bangalore
Humanitarian Technology Conference (B-HTC).
10. Spencer, R., Thabtah, F., Abdelhamid, N., & Thompson, M. (2020). Exploring feature selection
and classification methods for predicting heart disease.
11. Mohan, S., Thirumalai, C., & Srivastava, G. (2019). Effective heart disease prediction using
hybrid machine learning techniques. In Smart Caching, Communications, Computing and
Cybersecurity for Information-Centric Internet of Things.
12. Einarson, T. R., Acs, A., Ludwig, C., & Panton, U. H. (2018). Prevalence of cardiovascular
disease in type 2 diabetes: A systematic literature review of scientific evidence from across the
world in 2007–2017. Cardiovascular Diabetology, 17 (1), 1–19.
13. Abiko, M., Inai, K., et al. (2018). The prognostic value of high sensitivity cardiac troponin T
in patients with congenital heart disease. Journal of Cardiology.
14. Shanmugasundaram, G., Malar Selvam, V., Saravanan, R., & Balaji, S. (2018). An investigation
of heart disease prediction techniques. In 2018 IEEE International Conference on System,
Computation, Automation and Networking (ICSCA).
15. Kuraparthi, S., & Reddy Madhavi, K. (2021). Brain tumor classification of MRI images using
deep convolutional neural network. Traitement du Signal, 38(4), 1171–1179. https://doi.org/
10.18280/ts.380428
16. Reddy Madhavi, K., et al. (2021). Cardiac arrhythmia detection using dual-tree wavelet
transform and convolutional neural network. Soft Computing.
Chapter 45
Applying Machine Learning to Enhance
COVID-19 Prediction and Diagnosis
of COVID-19 Treatment Using
Convalescent Plasma
Lavanya Kongala, Thoutireddy Shilpa, K. Reddy Madhavi,
Pradeep Ghantasala, and Suresh Kallam
Abstract Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the source of coronavirus disease (COVID-19), has sparked a universal health catastrophe. So far, there are no approved prophylaxis solutions for those who have been exposed to SARS-CoV-2, nor therapies for individuals who have acquired COVID-19. Immune, i.e., "convalescent," plasma refers to plasma obtained from persons after infection resolution and antibody production. Passive administration of antibodies through convalescent plasma transfusion can offer a short-term strategy for conferring instantaneous resistance on liable participants. Convalescent plasma was also used in the COVID-19 epidemic; limited records from China suggest clinical benefit, together with radiological improvement, viral load diminution, and enhanced survival. Internationally, blood centers have a robust infrastructure for storing and building a catalog of convalescent plasma to meet increasing demand.
45.1 Introduction
Viruses of the Coronaviridae family [1] have a positive-sense, single-stranded RNA genome of 26-32 kilobases. Coronaviruses have been identified in countless hosts, including mammals such as mice, camels, dogs, cats, and bats, and, much more recently, pangolins.
L. Kongala (B)
Department of Computer Science, Vignan Nirula Institute of Technology and Science for Women,
Guntur, Andhra Pradesh, India
e-mail: sailavanya45@gmail.com
T. Sh i l pa
Department of CSE, B V Raju Institute of Technology, Narsapur, Medak, Telangana, India
K. Reddy Madhavi ·S. Kallam
Department of CSE, Sree Vidyanikethan Engineering College, Tirupati, India
P. Ghantasala
Chitkara University Institute of Engineering and Technology, Chandigarh, Punjab, India
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
B. N. K. Rao et al. (eds.), Intelligent Computing and Applications, Smart Innovation,
Systems and Technologies 315, https://doi.org/10.1007/978-981-19-4162-7_45
479
Some coronaviruses are pathogenic to humans but cause mild or asymptomatic symptoms. On the other hand, two lethal diseases have emerged within this family in the last two decades: severe acute respiratory syndrome (SARS) coronavirus [2], and Middle East respiratory syndrome (MERS) coronavirus. They present extreme fever (89%), unproductive cough (75%), myalgia (55%), and dyspnea (60%), with a high rate of admission to an intensive care unit. A novel member of the Coronaviridae family linked to severe pneumonia was found in December 2019 in Wuhan, China. Patients had clinical results similar to SARS-CoV and MERS-CoV, with high fever, dyspnea, and chest X-rays showing insidious multilobed lesions. Originally named the 2019 novel coronavirus (2019-nCoV), the virus is currently known as SARS-CoV-2, the cause of coronavirus disease 2019 [3].
COVID-19 mortality rates [4] are higher than seasonal influenza mortality rates, which are generally below 0.1%. Patients with any comorbidity or of older age have poorer medical outcomes than younger or non-comorbid patients, and as the number of comorbidities increases, the course of the disease worsens. Smoking, chronic obstructive pulmonary disease, hypertension, diabetes, cardiovascular disease, and malignancy are stated to be risk factors for extreme COVID-19 and increased mortality. While medical trials have been performed concerning therapeutics and vaccine production, there are currently no approved COVID-19 vaccines or treatments. Finding a cure within a limited period is significant, given the elevated mortality rate and the lofty number of severe cases.
Passive antibody treatment entails administering antibodies against a specific agent to a susceptible person to prevent or treat an infectious disorder related to this agent. Active vaccination, on the contrary, necessitates stimulation of an immune reaction that takes time to develop. Therefore, passive administration of antibodies is the only means of conferring immediate immunity on susceptible individuals. Passive antibody treatment has a tradition dating to the 1890s and, until antimicrobial therapy evolved around the 1940s, was the solitary means of treating several infectious diseases.
Today, disease management is difficult, and the lack of clinical records with antiviral agents is the norm. Lopinavir/ritonavir [5] treatment approaches have failed to show a decrease in overall mortality. The latest randomized medical study of hydroxychloroquine demonstrated a decline in body temperature and cough remission relative to controls in the intervention community. Nevertheless, the limited sample size and the brief follow-up duration prevent conclusions about its efficiency. Other work indicates azithromycin plus hydroxychloroquine might decrease epidemiologic load, but the medical response linked to this method has not been calculated and remains to be established. Recently this mixture was related to inferior results when administering hydroxychloroquine at high doses. There is thus no efficient or safe medication for COVID-19 management.
Recognizing the deficiency in support for the classic and historical diagnosis of COVID-19, and with vaccines still in development, passive methods have been considered as possibilities for disease prevention. That is the case with convalescent plasma, a passive inoculation procedure used since the early twentieth century in the prevention and management of contagious diseases. Convalescent plasma (CP) [6] is retrieved through apheresis from survivors of previous pathogen-caused infections, in whom antibodies to the disease's causative agent have been produced.
The critical goal of CP therapy is to neutralize the pathogen [7]. Because of its speedy acquisition, CP has been considered a disaster response in many pandemics in recent years, including SARS-CoV, Spanish flu, Ebola, and West Nile virus [8]. CP administered early after the onset of symptoms showed a decrease in the death toll, in contrast to placebo or no treatment, for severe acute viral respiratory infections such as influenza and SARS-CoV. Still, a comparable response was not observed in Ebola virus disease.
Specific proteins, for example anti-inflammatory cytokines, clotting factors, natural defensins, antibodies, and other as-yet-unidentified proteins, are acquired from donors during apheresis [9], in addition to neutralizing antibodies. In that respect, CP passage to infected patients can afford additional advantages, including immune modulation through attenuation of an extreme inflammatory response. The latter may be the case with COVID-19 [10], where a systemic "cytokine storm," or hyperinflammation induced by IL-1β, IL-2, IL-6, IL-8, IL-17, CCL2, and TNF-α, results in over-activation of the immune system. This inflammatory reaction can exacerbate fibrosis and damage pulmonary capacity. Here we present the possible profitable pathways for administration of CP to COVID-19 patients and review the evidence for that strategy in today's pandemic.
Secretory IgA, the dominant immunoglobulin isotype on mucosal surfaces, is a crucial player in controlling respiratory viral infections. It is made up of two IgA molecules, a joining protein (J chain), and a secretory component. IgM and IgA are actively transported across the epithelium via the polymeric Ig receptor or the neonatal Fc receptor, although IgG can passively migrate into alveolar fluids. The lung needs specific antiviral IgG2a for defense in bronchial fluids and terminal alveoli. Because of the COVID-19 pandemic emergency [11], this study summarizes historical application situations and analyzes existing know-how for the collection, manufacture, pathogen inactivation, and funding of convalescent blood products, with a particular emphasis on potential COVID-19 applications.
45.2 Convalescent Plasma as a Potential COVID-19 Therapy
Convalescent plasma (CP) therapy encompasses administering the immunoglobulin-containing plasma of a freshly recovered patient from a particular infection, such as SARS-CoV-2, to a person prone to or infected with a severe disease like COVID-19, for prophylaxis and treatment purposes. Immune plasma works by binding directly to a specific pathogen such as SARS-CoV-2 and inducing its neutralization, ultimately eliminating it from the peripheral bloodstream, whereas further antibody-mediated pathways, like the complement mechanism, antibody-dependent phagocytosis, and cytotoxicity, may also contribute to the therapeutic effects attained (Fig. 45.1).
Fig. 45.1 The antiviral reactions
In the absence of any approved medication or treatment, convalescent plasma has historically been used in Machupo virus eruptions, Lassa fever, Junín virus, and a few others. Previously, convalescent plasma treatment was used successfully in healing outbreaks of the SARS, MERS [12], and Ebola viruses. Several readings indicate convalescent plasma therapy has been operative in the treatment of influenza H5N1, avian flu, and H1N1. The usage of pooled plasma or immune globulins derived from convalesced West Nile encephalitis patients has confirmed a protective result in infected mice as well as a clinical advantage in patients.
In a further meta-analysis reading of ten patients, seven reported cases of COVID-19 and three patients with viral load not observable but with symptoms were treated with transfusion of about 200 mL of immune plasma with 1:640 neutralizing antibody titers, besides antiviral products and methylprednisolone; post-transfusion tests revealed full symptom resolution, with titers reaching 1:700.
Of these, nine patients were treated with Arbidol monotherapy or Remdesivir, Peramivir, or Ribavirin combination therapy, while one patient received Ribavirin monotherapy. Six of these also received intravenous methylprednisolone. In the same analysis, comparative results were reported for ten further patients who did not undergo plasma therapy in conjunction with the corticosteroid methylprednisolone and antiviral medicines, in which it was found that three patients passed away. In contrast, six others remained in stable condition, and one case in the control group showed symptom persistence, resulting in a higher mortality rate of around 30%.
45.3 Convalescent Plasma in the Healing of COVID-19
In the latest global outbreak, CP was used to treat patients with COVID-19 in China. In studies carried out by Shen et al., five critically ill COVID-19 patients who were refractory to antiviral and hormone therapy received 400 mL of CP from five separate donors. All donors had an antibody titer of about 1:1000 by SARS-CoV-2-specific ELISA and a neutralizing antibody titer over 40. In four (80%) of the five patients, after CP transfusion, body temperature stabilized within three days; the Sequential Organ Failure Assessment score decreased; PaO2/FiO2 increased within 12 days (ranges 172-276 and 284-366); viral load decreased and turned negative within 12 days; and ELISA and neutralizing antibody titers increased.
After 13 days, severe acute respiratory distress syndrome improved in five individuals (85%); two weeks later, four patients were extubated; three of the five patients (60%) were discharged from the hospital, and the other two patients remained stable 37 days later. In another report, researchers treated ten critically ill COVID-19 patients on antiviral and steroid treatment with a 200 mL dose of CP, and investigators prospectively compared signs and laboratory results four days after infusion with CP. All patients tolerated CP well. It considerably increased neutralizing antibodies [13]; viremia vanished in 7 days; clinical symptoms rapidly resolved within three days. There was an increase in the number of lymphocytes and in SaO2; on radiological inspection, lung lesions were confirmed to have improved considerably within days. Although these experiments include a limited number of patients, available evidence indicates that CP administration is safe and diminishes the viral load (Fig. 45.2).
Fig. 45.2 Chest images of various patients recovering from COVID-19
Fig. 45.3 Immunomodulatory effects
45.3.1 Convalescent Plasma: The Principle of Action
The exact CP mechanism in COVID-19 has not yet been established. Past research, however, showed that in other viral infections, like Ebola virus disease and respiratory syncytial virus [14], CP's action mechanisms are predominantly virus-neutralizing. Other pathways are antibody-induced cellular cytotoxicity, complement activation, and phagocytosis. The neutralizing antibodies supplied with CP provide viral load regulation; additionally, non-neutralizing antibodies help prophylaxis [15] and improve recovery (Fig. 45.3).
45.3.2 Collection Steps of Convalescent Plasma
It is imperative to control each particular phase of CP collection. From donor consideration to patient administration of CP, all measures should be cautiously coordinated and carried out by professional health employees.
(a) Eligibility for a Donor
(b) Donor Pre-donation Appraisal
(c) Recruiting donors
(d) Convalescent plasma processing at aphaeresis centers.
(a) Eligibility for a Donor: The eligibility requirements for CP donors can differ across countries. According to the FDA, individuals who meet the following criteria may be CP donors:
(i) Passed blood donor checks, recovered from COVID-19, and is appropriate for donation;
(ii) COVID-19 confirmed by a laboratory inspection via a diagnostic examination, for instance a nasopharyngeal swab while sick, or, after recovery, a positive serological check for SARS-CoV-2 antibodies if diagnostic testing was not carried out at the time COVID-19 was suspected;
(iii) Full symptom resolution no later than 14 days before donation;
(iv) A male donor, or a female donor who has never been pregnant, or a female donor who has been screened for HLA antibodies since her latest pregnancy with negative results;
(v) Neutralizing antibody titers of at least 1:190 are recommended when measurement of neutralizing antibody titers is available. A 1:90 titer can be considered appropriate if no alternative matched unit is available.
(b) Donor Pre-donation Appraisal: Real-time reverse transcriptase PCR [17] is currently the preferred examination for coronavirus detection. RNA detectability, however, was imperfect in samples collected before day 6 and on days 15-40. For this cause, the pre-donation investigation for coronavirus must be accompanied by antibody detection checking. Any donor must meet the eligibility requirements, and all the tests needed for regular blood donation must be assessed. Women donors who have ever been pregnant should be monitored for HLA antibodies to reduce the danger of severe transfusion-related lung damage.
(c) Recruiting Donors: Blood centers, in conjunction with local hospitals, will play a role in attracting contributors. In Turkey, therapeutic apheresis centers approved by the Ministry of Health and the Turkish Red Crescent are carrying out donor actions to receive CP.
(d) Convalescent Plasma Processing by Apheresis Centers: Providers who complete the pre-donation evaluation successfully are admitted to the apheresis centers. To collect larger volumes in short periods, CP should be acquired through apheresis. Depending on the donor's total blood volume, circa 200-600 mL of plasma can be achieved through apheresis appliances; in each procedure, the collected plasma volume excluding anticoagulant solution does not exceed 750 mL. With the donor's endorsement, a further plasma donation session might be scheduled; the time between contributions will fluctuate between nations. CP can be processed for freezing, or applied without freezing within 6 h; a freeze will begin within the first 6 h after completion of the apheresis cycle. For traceability, plasma ingredients should be marked with the ISBT 128 encoding scheme. The gathered products should be tagged as individually separated constituents of about 200 mL, each specified as 1 unit. Barcoded goods should be kept in a separate storage cabinet at or below −18/−25 °C.
45.4 The Convalescent Plasma Dose
CP dosing is highly variable in the preceding studies. One plasma unit (200 mL) was designed for use in prophylaxis in clinical trials, and one or two units were prepared for treatment. The period of effectiveness of the antibodies is uncertain, but it is predicted to last from weeks to several months. In prior use of CP therapy in SARS, 5 mL per kg of plasma at a 1:160 titer was used. A part or parts of the therapeutic dose was used in earlier trials for prophylactic purposes.
By normal equivalence, 3250 mL/kg of plasma employing a titer of > 1:65 would provide an immune globulin level equivalent to a quarter of 5 mL/kg of plasma at a 1:160 titer. Dosing by body weight should be used in pediatric transfusions [18]. In the pediatric age group, COVID-19 is seldom symptomatic; therefore, each method in this age group should be carried out in conjunction with state and global health authorities within the framework of clinical research (Fig. 45.4).
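If one assumes, as a simplification, that the delivered antibody amount scales with volume × titer, the dose-equivalence arithmetic behind these figures can be sketched as follows. The numbers are illustrative, not a clinical recommendation.

```python
# Hedged sketch of titer-based dose equivalence: a higher-titer unit
# supplies the same antibody amount at a smaller volume per kg.
def equivalent_volume(ref_ml_per_kg, ref_titer, new_titer):
    """Volume per kg of the new unit giving the same antibody amount."""
    return ref_ml_per_kg * ref_titer / new_titer

# 5 mL/kg at a 1:160 titer matches 1.25 mL/kg at a 1:640 titer
print(equivalent_volume(5, 160, 640))  # 1.25
```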
Patient Selection: Numerous current medical examinations have very disparate admissibility requirements, spanning the severely affected and individuals post exposure. The selection of participants will diverge across nations. The FDA [19] permitted using CP in patients meeting the requirements below:
Laboratory confirmation of COVID-19 [6].
Severe or immediately life-threatening COVID-19.
Serious disease is defined as one or more of the following:
Dyspnea,
Tachypnea ≥ 30/min,
Blood oxygen saturation ≤ 93%,
PaO2/FiO2 < 300,
Pulmonary infiltrates > 50% within 24-48 h.
Fig. 45.4 Convalescent plasma
Life-threatening illness is defined as one or more of the following:
Respiratory failure,
Septic shock,
Dysfunction of multiple organs (Fig. 45.5).
Fig. 45.5 Work flow of potential regenerative blood products (CBP)
(a) Possible Risks
Transfusion of human plasma is a normal, routine activity in modern hospitals. Anti-SARS-CoV-2 human plasma differs from normal plasma only in that antibodies to SARS-CoV-2 are present. Donors must meet all blood donation criteria based on the eligibility of voluntary contributors under federal and state regulations, and plasma will be obtained at FDA-approved blood centers.
The transfusion threats to beneficiaries will therefore be no different from those to typical plasma recipients. The menace of transfusion-transmitted infection is very low in the USA and other high-income nations; approximations usually alluded to are fewer than one infection with HIV, hepatitis B, or hepatitis C per two million donations. Non-infectious transfusion hazards are also present, for example allergic transfusion reactions, transfusion-associated circulatory overload (TACO), and transfusion-related acute lung injury (TRALI). While the probability of TRALI is typically less than one for every 5000 transfused units, TRALI [20] is of scrupulous concern in extreme COVID-19 because of possible pulmonary endothelium priming.
Nonetheless, regular donor screening involves HLA antibody screening of female donors with a pregnancy background to reduce the menace of TRALI. Remember that TACO risk factors, e.g., cardiovascular [21] illness, old age, and kidney failure, are shared with those at risk of severe COVID-19, illustrating the requirement for close control of the amount of fluid.
(b) Plasma Preparation Process and Quality Control
Apheresis [22] was performed using a Baxter CS-300 cell separator. Convalescent
[23] plasma was harvested from 40 donors for the study. The median age
was 42.0 years (IQR 32.5–49 years). Based on age and body weight, a 250–450-mL
ABO-compatible plasma sample was collected from each donor, and every sample was
split and stored at 4 °C as 200-mL aliquots without detergent or heat treatment.
The CP was then treated in a therapeutic plasma pathogen [24] inactivation cabinet for
30 min with methylene blue and light treatment.
45.5 Conclusion
The chances of infection with COVID-19 are significant. Plasma from recovered,
screened COVID-19 individuals is expected to be both a safe and a possibly
useful remedy for treatment, and for prophylaxis upon exposure. Considerable
proof of benefit from previous use against viral illness gives such an approach
a clear precedent. Nonetheless, well-controlled clinical trials are critically
essential to validate the effectiveness and thus inform evidence-based guidance.
Although plasma transfusions improve the clinical condition of critically ill patients
and decrease mortality rates, auxiliary research and controlled clinical trials
are still needed to evaluate their effectiveness and precise role in the treatment
of the novel coronavirus.
References
1. Lu, H. (2020). Drug treatment options for the 2019-new corona virus (2019-nCoV). Bioscience
Trends, 14, 69–71.
2. Cheng, Y., et al. (2005). Use of convalescent plasma therapy in SARS patients in Hong Kong.
European Journal of Clinical Microbiology and Infectious Diseases, 24, 44–46.
3. Ko, J. H., et al. (2018). Challenges of convalescent plasma infusion therapy in Middle East
respiratory corona virus infection: A single centre experience. Antiviral Therapy, 23, 617–622.
4. World Health Organization. (2019). Novel-corona virus.
5. Wang, M., Cao, R., Zhang, L., Yang, X., Liu, J., et al. (2020). Remdesivir and chloroquine effec-
tively inhibit the recently emerged novel corona virus (2019-nCoV) in-vitro. Cell Research,
30, 269–271.
6. Shen, C., Wang, Z., Zhao, F., Yang, Y., et al. (2020). Treatment of 5 critically ill patients with
COVID-19 with convalescent plasma. JAMA, 323(16), 1582–1589.
7. Hoenen, T., Groseth, A., & Feldmann, H. (2019). Therapeutic strategies to target the Ebola
virus life cycle. Nature Reviews Microbiology, 17, 593–606.
8. Luke, T. C., Kilbane, E. M., Jackson, J. L., & Hoffman, S. L. (2006). Meta-analysis: Conva-
lescent blood products for Spanish influenza pneumonia: A future H5N1 treatment? Annals of
Internal Medicine, 145, 599–609.
9. Wong, H. K., Lee, C. K., Hung, I. F., Leung, J. N., Hong, J., Yuen, K. Y., & Lin, C. K.
(2010). Practical limitations of convalescent plasma collection: A case scenario in pandemic
preparation for influenza A (H1N1) infection. Transfusion, 50, 1967–1971.
10. Su, S., Wong, G., Shi, W., Liu, J., Lai, A. C. K., Zhou, J., et al. (2016). Epidemiology, genetic
recombination, and pathogenesis of corona viruses. Trends in Microbiology, 24, 490–502.
https://doi.org/10.1016/j.tim.2016.03.00
11. Cavanagh, D. (2007). Corona virus avian infectious bronchitis virus. Veterinary Research, 38,
281–297. https://doi.org/10.1051/vetres:2006055
12. Sherer, Y., Levy, Y., & Shoenfeld, Y. (2002). IVIG in autoimmunity and cancer–efficacy versus
safety. Expert Opinion on Drug Safety, 1, 153–158. https://doi.org/10.1517/14740338.1
13. Katz, U., Achiron, A., Sherer, Y., & Shoenfeld, Y. (2007). Safety of intravenous immunoglob-
ulin (IVIG) therapy. Autoimmunity Reviews, 6, 257–259. https://doi.org/10.1016/j.autrev.2006.
08.011
14. Bloch, E. M., Shoham, S., Casadevall, A., Sachais, B. S., Shaz, B., Winters, J. L., et al. (2020).
Deployment of convalescent plasma for the prevention and treatment of COVID-19. The Journal
of Clinical Investigation. https://doi.org/10.1172/JCI138745
15. Satya Sree, K. P. N. V., Bikku, T., Mounika, S., Ravinder, N., Kumar, M. L., & Prasad, C. (2021).
EMG Controlled Bionic Robotic Arm using Artificial Intelligence and Machine Learning. In
2021 Fifth International Conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud)
(I-SMAC) (pp. 548–554). https://doi.org/10.1109/I-SMAC52330.2021.9640623
16. Johns Hopkins Bloomberg School of Public Health, Department of Molecular Microbiology
and Immunology, Baltimore, MD.
17. Holshue, M. L., et al. (2020). Washington state 2019-nCoV case investigation team, first case of
2019 novel corona virus in the United States. New England Journal of Medicine, 382, 929–936.
18. Toomula, S., Paulraj, D., Bose, J., Bikku, T., & Sivabalaselvamani, D. (2022). IoT and wearables
for detection of COVID-19 diagnosis using fusion-based feature extraction with multikernel
extreme learning machine. In Wearable Telemedicine Technology for the Healthcare Industry
(pp. 137–152). Academic Press.
19. Chen, L., Xiong, J., Bao, L., & Shi, Y. (2020). Convalescent plasma as a potential therapy for
COVID-19. The Lancet Infectious Diseases, 20, 398–400.
20. Bikku, T., Satya Sree, K. P. N. V., Jarugula, J., & Sunkara, M. (2022). A Novel Integrated IoT
Framework with Classification Approach for Medical Data Analysis. In 2022 9th International
Conference on Computing for Sustainable Global Development (INDIACom) (pp. 710–715).
https://doi.org/10.23919/INDIACom54597.2022.9763297
21. Duan, K., Liu, B., Li, C., Zhang, H., Yu, T., et al. (2020). The feasibility of convalescent plasma
therapy in severe COVID-19 patients: A pilot study. BMJ Yale.
22. Cao, W. C., Liu, W., Zhang, P. H., Zhang, F., & Richards, J. H. (2007). Disappearance of
antibodies to SARS-associated corona virus after recovery. New England Journal of Medicine,
357, 1162–1163.
23. Kong, L. K., & Zhou, B. P. (2006). Successful treatment of avian influenza with convalescent
plasma. Hong Kong Medical Journal, 12, 489.
24. Wong, S. S. Y., & Yuen, K.-Y. (2008). The management of corona virus infections with partic-
ular reference to SARS. Journal of Antimicrobial Chemotherapy, 62, 437–441. https://doi.org/
10.1093/jac/dkn243
Chapter 46
Analysis of Disaster Tweets Using
Natural Language Processing
Thulasi Bikku, Pathakamuri Chandrika, Anuhya Kanyadari,
Vuyyuru Prathima, and Borra Bhavana Sai
Abstract Nowadays, social media has become a crucial part of life. Twitter is a
social networking site on which people post and interact with messages known
as tweets. Officially registered users can tweet, like, and re-tweet messages. During
the emergence of a disaster or crisis, social media becomes a significant means of
communication. The widespread use of mobile phones and other forms of commu-
nication allows individuals to report and alert others about real-life disasters. Such
knowledge relating to disasters, spread over the media, could save thousands of indi-
viduals by warning others and allowing them to take the required actions. Many firms
are working on analyzing tweets and identifying tweets relating to disasters
and emergencies programmatically. Such efforts may be useful to the many people
using the internet. However, this effort faces problems such as detecting and
distinguishing disaster tweets from non-disaster tweets. The data available
on Twitter is often not structured, so processing must be done to classify the data as
'disaster' or 'non-disaster'. This paper deals with developing a model that can tell whether
a user is sharing data about a disaster. The data set used includes 10,000 tweets along
with class labels. The proposed Optimized SVM model pre-processes the data using Natural
Language Processing (NLP) and then builds the classifier model that gives maximum
accuracy.
46.1 Introduction
Twitter came into existence on July 15, 2006; in the beginning it was a messaging
service used among a small group. It expanded into a micro-blogging,
social networking service that enabled users to post writings, images, and messages
known as 'tweets'. Registered members can post, like, and re-tweet messages,
whereas unregistered users can only access and view tweets. Twitter has attained a lot
of fame since its beginning in 2006. According to data produced in 2019, on
T. Bikku (B) · P. Chandrika · A. Kanyadari · V. Prathima · B. B. Sai
Department of Computer Science and Engineering, Vignan’s Nirula Institute of Technology and
Science for Women, Guntur, India
e-mail: thulasi.bikku@gmail.com
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
B. N. K. Rao et al. (eds.), Intelligent Computing and Applications, Smart Innovation,
Systems and Technologies 315, https://doi.org/10.1007/978-981-19-4162-7_46
492 T. Bikku et al.
average, around 6,000 tweets are generated every second internationally. This
amounts to huge data every year; world-wide the total is estimated to be approximately
200 billion tweets.
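A quick arithmetic check connects the two figures: 6,000 tweets per second, sustained over a year, lands in the same ballpark as the 200 billion estimate.

```python
tweets_per_second = 6_000
seconds_per_year = 60 * 60 * 24 * 365      # 31,536,000 seconds

tweets_per_year = tweets_per_second * seconds_per_year
print(f"{tweets_per_year:,}")              # 189,216,000,000, roughly 200 billion
```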
The data generated is unstructured, as it is not in any standardized format. Therein
lies the exact challenge in analyzing the data available through Twitter, so we
make use of Natural Language Processing (NLP). NLP is used to extract keywords
[1] that imply that a tweet describes a disaster. Before applying NLP, the data must be processed:
tweets are divided into words called tokens using tokenization, and analysis
is then done on the tokenized data [2, 3].
Several methods are used to pre-process the data in order to clean
it: removal of punctuation, tokenization, removal of stop-words,
and stemming. Removal of punctuation strips punctuation marks from the data.
Tokenization splits the text into units such as characters,
words, or sentences. Stop-words are words that carry little significance, and they
are removed. Removal of prefix and suffix variants is accomplished by stemming. The
objective is to classify each tweet as belonging either to the disaster class or to the
non-disaster class. The question is a yes-or-no question: does the tweet belong
to the disaster class or not? We later check the accuracy of the system
using a confusion matrix.
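The four cleaning steps above can be sketched as a minimal pipeline. This version uses only the standard library; the tiny stop-word list and the crude suffix-stripper are deliberate simplifications standing in for the full NLTK stop-word list and Porter stemmer a real implementation would use.

```python
import re
import string

# Small illustrative stop-word list; NLTK's full English list would normally be used.
STOP_WORDS = {"a", "an", "the", "is", "in", "on", "of", "and", "to", "at"}

def remove_punctuation(text: str) -> str:
    """Removal of punctuation: strip every punctuation mark from the tweet."""
    return text.translate(str.maketrans("", "", string.punctuation))

def tokenize(text: str) -> list:
    """Tokenization: split the text into lower-cased word-level tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def remove_stop_words(tokens: list) -> list:
    """Stop-word removal: drop tokens that carry little significance."""
    return [t for t in tokens if t not in STOP_WORDS]

def stem(token: str) -> str:
    """Crude suffix stripping standing in for a real stemmer (e.g. Porter)."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token

def preprocess(tweet: str) -> list:
    tokens = tokenize(remove_punctuation(tweet))
    return [stem(t) for t in remove_stop_words(tokens)]

print(preprocess("Forest fire is spreading near the village, evacuations ongoing!"))
# ['forest', 'fire', 'spread', 'near', 'village', 'evacuation', 'ongo']
```

The last token shows why a crude stemmer is only a placeholder: "ongoing" is reduced to the non-word "ongo", which a proper Porter stemmer handles more gracefully.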
46.2 Literature Survey
Natural language processing is the most crucial part of text mining, and there are
many remarkable and exemplary research works in this field. The proposed
work on natural language processing highlights the process involved in preparing
tweets for analysis [4]. Sentiment analysis of Twitter is
another well-known field that has contributed to disaster-tweet models,
and there are many works relating to it. One such paper
developed a predictive model to analyze the sentiment of tweets [5].
Lee [6] gives an overview of performing classification using the K-Neighbors classifier.
Hasan et al. [7] classified tweets about politics, movies, fake
news, fashion, humanity, and justice into three main categories (positive, negative,
and neutral); they collected tweets in Pakistan and analyzed them using classification
algorithms such as Naïve Bayes and SVM. Algorithms such as KNN, Naïve Bayes, and
Modified K-Means have also been used to perform sentiment analysis on Twitter data, classifying
it into positive, negative, and neutral classes [8], and the accuracy
results of all these algorithms were compared. Feature extraction
involves cleaning the data using tokenization, stop-word removal, and stemming;
feature selection [9-11] is then performed, followed by a comparative study of
SVM, KNN, and Naïve Bayes [1]. Text mining, feature selection, and
feature transformation for analyzing text data are performed in our model
[12, 13].
46 Analysis of Disaster Tweets Using Natural Language Processing 493
46.3 Proposed Model
Classification algorithms are used to classify the tweets into either a disaster or a
non-disaster. After running various classification algorithms [14] like MPP, KNN,
Random Forest, BPNN, SVM, K-means, SVM gave the most accurate results. Clas-
sification using optimized SVM algorithm is implemented in our proposed model as
shown in Fig. 46.1.
We utilized the SVM model in the scikit-learn package to develop the Support Vector
Machine (SVM) technique. The svm.py file in the project's repository contains the
implementation of the SVM algorithm. Simply run svm.py with Python 3 and no
command-line arguments to execute the file. There are two methods in this file: main()
and SVM(). When the file is executed, the main() method pulls in the training and testing data,
pre-processes the data, and invokes the SVM() function to run the model on the project's data.
The SVM() function is a standalone implementation of the SVM
model that accepts training and testing data as input parameters. This function also
produces a confusion matrix with a heat map, as well as the performance metrics
precision, recall, F1-score, and overall accuracy. The SVM model allows
many settings to be changed; however, we found that keeping everything at the defaults
yielded the best results in our tests. The performance metrics, and how they compare
to other implementations, can be found in the results section.
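The svm.py flow described above can be sketched as follows. The main()/SVM() split mirrors the text, but the toy data, variable names, and exact calls are assumptions rather than the project's actual code, and the heat-map plotting step is omitted to keep the sketch self-contained.

```python
# Sketch of the described svm.py flow, not the project's actual file.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

def SVM(X_train, X_test, y_train, y_test):
    """Fit an SVC with default settings (the configuration that tested best)
    and report a confusion matrix plus precision/recall/F1/accuracy."""
    model = SVC()                      # defaults: kernel='rbf', C=1.0
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    return confusion_matrix(y_test, y_pred), classification_report(y_test, y_pred)

def main():
    # Stand-in for loading and pre-processing the 10,000-tweet data set.
    tweets = ["forest fire near the village", "earthquake damages bridge",
              "i love this sunny day", "new song on fire, great album",
              "flood warning issued downtown", "pizza night with friends"] * 10
    labels = [1, 1, 0, 0, 1, 0] * 10   # 1 = disaster, 0 = non-disaster
    X = TfidfVectorizer().fit_transform(tweets)
    X_tr, X_te, y_tr, y_te = train_test_split(X, labels, test_size=0.3,
                                              random_state=42)
    cm, report = SVM(X_tr, X_te, y_tr, y_te)
    print(cm)
    print(report)

if __name__ == "__main__":
    main()
```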
Figure 46.1 explains the architecture of the proposed system. Here we initially collect
the data, and after collecting the data we analyze it. During the data analysis
phase we inspect, clean, transform, and model the data; we then pre-process the data,
turning raw data into an understandable format. After pre-processing the data, we
structure the model. After structuring, testing is done on the model. After testing, if
Fig. 46.1 Architecture of proposed system
we attain an efficient score, we finally implement it to produce the output; if we attain
a poor score, we pre-process the data again until we get an efficient score.
46.4 Algorithm
Input:
S_in (total number of input vectors)
S_sv (number of support vectors)
S_ft (total number of attributes in a support vector)
SV[S_sv] (support vector array)
IN[S_in] (input vector array), b*
Output:
H (output of the decision function)
for f = 1 to S_in, by 1 do
    H = 0
    for q = 1 to S_sv, by 1 do
        distance = 0
        for s = 1 to S_ft, by 1 do
            distance += (SV[q].feature[s] - IN[f].feature[s])^2
        end
        v = exp(-γ × distance)
        H += SV[q].α* × v
    end
    H = H + b*
end
The Support Vector Machine (SVM) is a supervised machine learning algorithm that
may be used for both classification and regression. However, it is best suited to
classification problems. The purpose of SVM
is to find an N-dimensional hyperplane that separates the data points
distinctly. The number of features determines the hyperplane's dimension.
The hyperplane is a line when the number of input features is
two. The hyperplane becomes a two-dimensional plane when the number of
input features is three; it becomes harder to visualize when the number
of features exceeds three.
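The decision function in the pseudocode above can be sketched in plain Python: for each input, sum the RBF kernel values against every support vector, weighted by the signed dual coefficients, and add the bias. The support vectors, coefficients, and γ below are made-up illustrations, not parameters from the paper's model.

```python
import math

def rbf_decision(x, support_vectors, alphas, b, gamma):
    """RBF-kernel SVM decision function; the nested loops mirror the
    pseudocode: squared distance over features, kernel value, weighted sum."""
    h = 0.0
    for sv, alpha in zip(support_vectors, alphas):
        distance = sum((s - f) ** 2 for s, f in zip(sv, x))  # ||sv - x||^2
        h += alpha * math.exp(-gamma * distance)  # alpha folds in the label y_i
    return h + b

# Hypothetical two-feature model with two support vectors.
svs = [(1.0, 1.0), (-1.0, -1.0)]
alphas = [1.0, -1.0]                  # signed dual coefficients
print(rbf_decision((0.9, 1.1), svs, alphas, b=0.0, gamma=0.5))   # positive side
print(rbf_decision((-0.9, -1.1), svs, alphas, b=0.0, gamma=0.5))  # negative side
```

The sign of the returned value determines the predicted class, which is exactly how the hyperplane separation described above is evaluated at a point.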
46.5 Results
As shown in Fig. 46.2, a receiver operating characteristic curve (ROC curve) is a
graph that illustrates how well a classification model performs across all classification
thresholds. The true positive rate is plotted against the false positive rate on this curve.
This figure plots the characteristic curve for the SVM algorithm. AUC is the area under the
ROC curve; for this algorithm, it is 0.86. AUC is a composite measure of
performance that takes all potential classification thresholds into account.
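The ROC curve and AUC reported in Fig. 46.2 are typically computed with scikit-learn as below. The labels and scores here are invented for illustration; in the paper's setting, the scores would come from the fitted SVM's decision function on the test tweets.

```python
from sklearn.metrics import roc_auc_score, roc_curve

# Toy ground-truth labels and classifier scores (hypothetical values).
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]

fpr, tpr, thresholds = roc_curve(y_true, y_score)  # one (FPR, TPR) point per threshold
print(roc_auc_score(y_true, y_score))              # 0.75 for this toy data
```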
As shown in Fig. 46.3, a receiver operating characteristic curve, often known
as a ROC curve, is a graph that depicts how a binary classifier's diagnostic
performance changes as its discrimination threshold is varied. This figure plots the
characteristic curve for the BPNN algorithm, which gives 78.4% accuracy on the
overall data, less than the SVM algorithm. The AUC of the BPNN algorithm is 0.85.
Figures 46.4 and 46.5 show the ROC curves of the Random Forest and KNN algo-
rithms; the AUC of the Random Forest classifier is 0.85, whereas the AUC of the
K-Neighbors classifier is 0.84.
Figures 46.6, 46.7, 46.8 and 46.9 show the confusion matrices of the various algorithms
used in our model. A confusion matrix is an N × N matrix used to evaluate the perfor-
mance of a classification model, where N is the number of target classes. The matrix
compares the actual target values with the machine learning model's predictions. This
provides us with a comprehensive picture of how well our classification model is
working and the types of errors it makes. For a binary classification task, we use a
2 × 2 matrix with four values, as seen in the figures. There are two possible values for
the target variable: positive or negative. The target variable's actual values are shown
in the columns, and the rows indicate the predicted values.
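A small sketch of this 2 × 2 layout with scikit-learn, using invented labels. Note that `confusion_matrix` itself returns rows as actual values and columns as predictions, so a transpose yields the orientation described above.

```python
from sklearn.metrics import confusion_matrix

# Hypothetical actual and predicted labels for a binary task.
y_actual =    [1, 1, 1, 1, 0, 0, 0, 0]
y_predicted = [1, 1, 1, 0, 0, 0, 1, 1]

cm = confusion_matrix(y_actual, y_predicted)  # rows = actual, columns = predicted
print(cm)     # [[TN, FP], [FN, TP]] = [[2, 2], [1, 3]]
print(cm.T)   # transposed: columns = actual, rows = predicted, as in the figures
```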
The true positive value for the MPP algorithm, as derived from the confusion matrix
of Fig. 46.6, is 1538, and the true negative value is 985. The false positive and
false negative values are 323 and 417, respectively. Figure 46.7 is the confusion matrix of the KNN
Fig. 46.2 ROC curve of SVM
Fig. 46.3 ROC curve of BPNN
Fig. 46.4 ROC curve of random forest
algorithm, the true positive and true negative values are 1681 and 859, respectively,
and the false positive and false negative values are 180 and 543, respectively.
The true positive value for the BPNN algorithm, as derived from the confusion matrix
of Fig. 46.8, is 1525, and the true negative value is 1018. The false positive and
false negative values are 336 and 384, respectively.
Figure 46.9 is the confusion matrix of Random Forest algorithm, the true positive
and true negative values are 1663 and 929, respectively, and the false positive and
false negative values are 198 and 473, respectively.
Figure 46.10 shows the confusion matrix derived from the SVM algorithm. The true
positive value is 1699 and the true negative value is 918. The false positive and false
negative values are 162 and 484, respectively. Figure 46.11 shows the overall accuracy
Fig. 46.5 ROC curve of KNN
Fig. 46.6 Confusion matrix for MPP
Fig. 46.7 Confusion matrix for KNN
Fig. 46.8 Confusion matrix for BPNN
Fig. 46.9 Confusion matrix for random forest
graph, which is plotted from the accuracies of the various algorithms used. The
graph reaches its maximum at 80.2%.
Table 46.1 shows the performance metrics (precision, recall, F1 score, and the
overall accuracy) of the various algorithms used to develop the model. Precision is
one measure of a machine learning model's performance: the accuracy of its
positive predictions. Precision is the number of true positives divided by the total number of
positive predictions (i.e., the number of true positives plus the
number of false positives). The precision values of the algorithms used are listed in
the table, and the highest precision was shown by the optimized SVM algorithm (85.0%),
followed by KNN with 82.7%.
Recall is a statistic that measures how many of all actual positives were
correctly predicted. All the recall values of the algorithms
are listed in Table 46.1. The highest recall value was achieved by the BPNN algorithm. The F1 score
Fig. 46.10 Confusion matrix for SVM
Fig. 46.11 Overall accuracy graph
Table 46.1 Performance metrics of the algorithms used
Algorithm used    Precision   Recall   F1 score   Overall accuracy
MPP               75.3        70.3     72.7       77.3
KNN               82.7        61.3     70.4       77.8
Random Forest     81.4        66.4     73.1       79
Optimized SVM     85.0        65.5     74.0       80.2
K-means           30.4        4.2      7.4        54.7
BPNN              76.7        71.7     74.1       78.5
is calculated from precision and recall. The highest F1 score is for the BPNN algorithm,
which means that algorithm has the best balance between precision and recall.
The overall accuracy of all the algorithms can be seen in Table 46.1. The highest
accuracy was achieved by the optimized SVM algorithm with 80.2%.
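As a check on Table 46.1, the optimized SVM row can be recomputed from the counts reported with Fig. 46.10, provided the 918-count cell is treated as true positives and the 162 and 484 cells as false positives and false negatives, respectively (i.e., the table's metrics are reported with respect to that class):

```python
# Counts from Fig. 46.10, with the 918-count class taken as positive.
tp, fn = 918, 484
fp, tn = 162, 1699

precision = tp / (tp + fp)                    # TP / all positive predictions
recall = tp / (tp + fn)                       # TP / all actual positives
f1 = 2 * precision * recall / (precision + recall)
accuracy = (tp + tn) / (tp + tn + fp + fn)

print(round(100 * precision, 1))   # 85.0
print(round(100 * recall, 1))      # 65.5
print(round(100 * f1, 1))          # 74.0
print(round(100 * accuracy, 1))    # 80.2
```

All four values reproduce the Optimized SVM row of Table 46.1.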
46.6 Conclusion
This research paper performed analysis and text mining on Twitter datasets. It
classified the Twitter data into two categories: disaster or non-disaster. From
the experimental findings of this paper, it can be concluded that tweets can be
classified safely with good accuracy. The highest accuracy was achieved using the
optimized SVM on the reduced data set, where the accuracy was 80.2%.
There are certain drawbacks: the model's performance was slower than that of models
with lower accuracy. So there is room for improvement; the accuracy
can be improved by improving the fusion testing, and more tweaking is also necessary.
There is a good chance of future improvement in this particular field. This paper
used SVM as the classification algorithm, as it gave the best results. Other algorithms,
such as decision trees, K-Nearest Neighbors, random forest, and fusion approaches,
could be used instead to increase accuracy; these aspects could be studied in
the future.
References
1. Khan, A., Baharudin, B., Lee, L. H., & Khan, K. (2010). A review of machine learning algo-
rithms for text-documents classification. Journal of Advances in Information Technology, 1(1),
4–20.
2. Chowdhury, G. G. (2003). Natural language processing. Annual Review of Information Science
and Technology, 37(1), 51–89.
3. Indurkhya, N., & Damerau, F. J. (Eds.). (2010). Handbook of natural language processing
(Vol. 2). CRC Press.
4. Shetty, B. (2018). Natural language processing (NLP) for machine learning. Retrieved
November 24, 2018.
5. Bagheri, H., & Islam, M. J. (2017). Sentiment analysis of twitter data. arXiv preprint arXiv:
1711.10377
6. Lee, E. (2019). Using k-nearest-neighbors (knn) machine learning technique to classify
archived helicopter wear debris data. In AIAC18: 18th Australian International Aerospace
Congress (2019): HUMS-11th Defence Science and Technology (DST) International Confer-
ence on Health and Usage Monitoring (HUMS 2019): ISSFD-27th International Symposium
on Space Flight Dynamics (ISSFD) (p. 816). Engineers Australia, Royal Aeronautical Society.
7. Hasan, A., Moin, S., Karim, A., & Shamshirband, S. (2018). Machine learning-based sentiment
analysis for Twitter accounts. Mathematical and Computational Applications, 23(1), 11.
8. Bharti, O., & Malhotra, M. (2016). Sentiment analysis on Twitter data. International Journal
of Computer Science and Mobile Computing, 5(6), 601–609.
9. Bikku, T., & Priya, A. P. A novel algorithm for clustering and feature selection of high
dimensional datasets.
10. Deepa Lakshmi, S., & Velmurugan, T. (2016). Empirical study of feature selection methods
for high dimensional data. Indian Journal of Science and Technology, 9(39), 1–6.
11. Dayanika, J., Archana, G., SivaKumar, K., & Pavani, N. (2020). Early detection of cyber attacks
based on feature selection algorithm. Journal of Computational and Theoretical Nanoscience,
17(9–10), 4648–4653.
12. Goswami, S., & Raychaudhuri, D. (2020). Identification of disaster-related tweets using natural
language processing. In International Conference on Recent Trends in Artificial Intelligence,
IOT, Smart Cities & Applications (ICAISC-2020), May 26, 2020.
13. Bikku, T., Nandam, S. R., & Akepogu, A. R. (2018). A contemporary feature selection and
classification framework for imbalanced biomedical datasets. Egyptian Informatics Journal,
19(3), 191–198.
14. Sen, P. C., Hajra, M., & Ghosh, M. (2020). Supervised classification algorithms in machine
learning: A survey and review. In Emerging technology in modelling and graphics (pp. 99–111).
Springer.
Author Index
A
Abhishek, B., 135
Aishwarya Govindkar, 169
Almukhtar, Firas Husham, 447
Anantha Rao, G., 315,361
Angadi Lakshmi, 203
Anil, K., 149
Anuhya Kanyadari, 491
Asha, S., 397,407
Avanija, J., 465
Aviral Pulast, 397
Ayyappa Swamy, K., 455
B
Badrinath, N., 327
Badugu Samatha, 179
Balaji Bhanu, B., 127
Balasundaram, A., 135,189
Bhaskar Kumar Rao, 127
Bhaskar, T., 49
Bhavya, K. R., 419
Bibhuti Bhusan Dash, 109
Biksham, V., 49
Borra Bhavana Sai, 491
Brahmananda Reddy, A., 69
D
Damodaram, A. K., 81
Deepak, V., 179
Divakar, T. V. S., 315,361
Divya Jagabattula, 339
G
Gabbireddy Keerthi, 303
Gadhiraju Tej Varma, 217
Galety, Mohammad Gouse, 431,447
Ganesh Davanam, 95
Gavini Sreelatha, 169
Gayathri, N., 235
Govinda, K., 39
H
Harini, M., 1
Hussian, Md. Asdaque, 249
I
Indraneel, S., 159
J
Jafflet Trinishia, A., 407
Jagadeesh Kannan Raju, 135,189
Janakiramaiah Bonam, 295,339
Jasthi Siva Sai, 203
Jayasree, K., 259
K
Kamala Challa, 465
Kamal Hajari, 29
Kannan, I., 281
Karthikeyan Jayaraman, 351
Karthik, S. A., 387,419
Kerenalli Sudarshana, 9
Khanjan Shah, 135
Kiran, K. V. D., 235
Kondra Pranitha, 265
Korupalli V. Rajesh Kumar, 303
Kotte Vinaykumar, 49
Krishnaveni, C. V., 265
Kurakula Arun Kumar, 351
L
Lakshmi Ramani Burra, 295,339
Lalitha, G., 387
Lavanya Kongala, 479
M
Maaroof, Rebaz Jamal, 431,447
Maganti Venkatesh, 109
Malathy, C., 59
Marni Srinu, 109
Mekala Narendar, 95
Midhun Chakkaravarthy, 265
Mohan, A., 127
Mukesh Chinta, 203
Mukkamala Namitha, 203
Mulugu Suma Anusha, 203
Muniraju Naidu Vadlamudi, 249
MylaraReddy, C., 9
N
Naga Badra Kali, M., 119
Nagarjuna Karyemsetty, 179
Nagendra Panini Challa, 119,127
Narendra Kumar Rao, B, 127,265,295
Naveen Kumar Polisetty, S., 159
Nuthalapati Sudha, 69
O
Obulakonda Reddy, R., 327
Obulesu, O., 439
P
Padyala Venkata Vara Prasad, 235
Pathakamuri Chandrika, 491
Pattan Afrid Ahmed, 189
Pavan Kumar Vadrevu, 227
Pavan Kumar, T., 95
Prabhu Gantayat, 189
Pradeep Ghantasala, 479
Praveen Tumuluru, 295
Pravinth Raja, S., 419
Priyanka Gaba, 19
R
Rabinarayan Satpathy, 109
Raghav Srinivaas, M., 135
Rajasekhar Kommaraju, 235
Ram Shringar Raw, 19
Ranjana, 265
Reddy Madhavi, K., 81,303,327,479
Riyazuddin, Y. Md., 387
Rofoo, Fanar Fareed Hanna, 431,447
Roopasri Sai Varshitha Godavarthi, 339
Routhu Ramya Dedeepya, 203
Rupa Devi, B., 303
S
Sampath Korra, 49
Sarika Jay, 189
Sarukolla Ushaswini, 169
Sasi Kumar Bunga, 227
Shaik Jani, 377
Shyam Mohan, J. S., 119,127
Sivakumar, B., 1
Sivaprakasam, S., 149
Sivaprakasam, T., 159
Soumya Gogulamudi, 339
Sowmya Eda, 339
Sreenivasa Chakravarthi, S., 81
Sree T. Sucharitha, 281
Sridhar, P., 149
Sri Krishna Adusumalli, 217,227
Subbarao Gogulamudi, 179
Subhash Chavadaki, 419
Sudhakara, M., 303,327
Sumam Mary Idicula, 259
Suneetha, K., 465
Sunil Kumar Reddy, P., 439
Sunil Kumar, B., 419
Sunil Kumar, M., 95
Suresh, K., 439
Suresh Kallam, 465,479
Syamala, K., 361
T
Thalakola Syamsundararao, 179
Thoutireddy Shilpa, 479
Thulasi Bikku, 491
U
Ujwalla Gawande, 29
V
Vaidhehi, M., 59
Valli Kumari, V., 377
Varun Kumar, K. A., 281
Veeramanickam, M. R. M., 227
Venkataramana, R., 387
Venkata Rama Raju, P., 119
Venkata Sai Satvik, 189
Venkata Subbaiah, C., 39
Venkateswara Reddy, L., 81,95
Venkateswarulu, N., 439
Vijaya Shambhavi, Y., 327
Vijaya Kumar Gudivada, 109
Vuyyuru Prathima, 491
Y
Yaswanth Raparthi, 465
Yogesh Golhar, 29
Z
Zachariah C. Alex, 455
Zameer Ahmed Adhoni, 9
ResearchGate has not been able to resolve any citations for this publication.
Article
Full-text available
span id="docs-internal-guid-54b35aa6-7fff-0992-ed4c-aca4d05cfcfa"> Underwater image enhancement (UIE) is an imperative computer vision activity with many applications and different strategies proposed in recent years. Underwater images are firmly low in quality by a mixture of noise, wavelength dependency, and light attenuation. This paper depicts an effective strategy to improve the quality of degraded underwater images. Existing methods for dehazing in the literature considering dark channel prior utilize two separate phases for evaluating the transmission map (i.e., transmission estimation and transmission refinement). Accurate restoration is not possible with these methods and takes more computational time. A proposed three-step method is an imaging approach that does not need particular hardware or underwater conditions. First, we utilize the multi-layer perceptron (MLP) to comprehensively evaluate transmission maps by base channel, followed by contrast enhancement. Furthermore, a gamma-adjusted version of the MLP recovered image is derived. Finally, the multi-scale fusion method was applied to two attained images. The standardized weight is computed for the two images with three different weights in the fusion process. The quantitative results show that significantly our approach gives the better result with the difference of 0.536, 2.185, and 1.272 for PCQI, UCIQE, and UIQM metrics, respectively, on a single underwater image benchmark dataset. The qualitative results also give better results compared with the state-of-the-art techniques. </span
Presentation
Full-text available
The Internet of things (IoT) describes physical objects (or groups of such objects) with sensors, processing ability, software and other technologies that connect and exchange data with other devices and systems over the Internet or other communications networks. Internet of things has been considered a misnomer because devices do not need to be connected to the public internet, they only need to be connected to a network and be individually addressable.
Article
Full-text available
Electronic medical records (EMRs) square measure vital, sensitive personal data in aid, and wish to be often shared between peers. Blockchain Technologyfacilitates a shared, im-mutable and history of all the transactions creatingsoftwareof trust, responsibility and transparency. This provides a novel chance to implement a secure and reliable EMR knowledge management and sharing, system victimization. In this paper, we gift our views on blockchain primarily based aid knowledge management, specially, for EMR knowledge sharing between aid suppliers and for analysis studies. we have a tendency to propose a framework for managing EMR knowledge for cancer patient care. together with an Hospital, we have a tendency to enforced our framework in an exceedingly image that ensures privacy, security, convenience, and fine-grained access management over EMR knowledge. The planned paper will considerably scale back the turnaround for EMR sharing, improve higher cognitive process for medical aid, and scale back the value. Confidentiality in health industry refers to the "obligation of professionals" , World Health Organization canhave access to patient records or exchange information to carry that data in confidence. Managing electronic health data presents distinctive challenges for restrictive compliance, for moral concerns and ultimately for quality of care. As the meaningful use of Electronic Health record system expands from the health devices, its aiding organizations grow. All World Health Organization work with health data-health information processing and management professionals, doctors, researchers, business directors have responsibility to accept that data. And as patients, we've privacy rights with rele-vancy our own health data Associate in Nursing an expectation that our data be control in confidence and guarded. Confidentiality of patient medical records is of utmost importance. 
Access to patient medical records in hospital software should be restricted to the treating/admitting practitioner and their team; it must not be granted to everyone on the hospital network. One way to address this confidentiality issue is blockchain technology. Using digital signatures on blockchain-based data allows access for multiple people while regulating availability and maintaining the security of health records. Additionally, a community of individuals, including stakeholders of the healthcare industry, could join the blockchain, which can reduce fraud in payments.
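The access-control idea sketched in this abstract can be illustrated with a toy hash-chained ledger. The sketch below is an assumption for illustration, not the authors' implementation: HMAC stands in for the digital signatures mentioned above (a real deployment would use asymmetric signatures such as ECDSA), and each block links to the previous one by hash so that tampering with a recorded access grant is detectable.

```python
import hashlib
import hmac
import json

SECRET = b"clinician-key"  # assumed shared key for this demo only

def sign(record: dict) -> str:
    # Deterministic serialization so the signature is reproducible
    payload = json.dumps(record, sort_keys=True).encode()
    return hmac.new(SECRET, payload, hashlib.sha256).hexdigest()

def make_block(record: dict, prev_hash: str) -> dict:
    block = {"record": record, "prev": prev_hash, "sig": sign(record)}
    block["hash"] = hashlib.sha256(
        json.dumps(block, sort_keys=True).encode()
    ).hexdigest()
    return block

def chain_valid(chain) -> bool:
    prev = "0" * 64
    for block in chain:
        # Each block must link to its predecessor and carry a valid signature
        if block["prev"] != prev or block["sig"] != sign(block["record"]):
            return False
        prev = block["hash"]
    return True

chain = []
prev = "0" * 64
for rec in ({"patient": "P1", "grant": "Dr.A"},
            {"patient": "P1", "grant": "oncology-team"}):
    block = make_block(rec, prev)
    chain.append(block)
    prev = block["hash"]

print(chain_valid(chain))            # True
chain[0]["record"]["grant"] = "Eve"  # tampering breaks the signature
print(chain_valid(chain))            # False
```

The append-only linkage is what lets multiple parties share the ledger while any retroactive change to an access grant is immediately visible.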
Article
Full-text available
The non-stationary ECG signal is a key tool in screening for coronary diseases. An ECG recording aggregates the activity of millions of cardiac cells, with depolarization and repolarization conducted in a synchronized manner: the P wave occurs first, followed by the QRS complex and the T wave, repeating in each beat. The signal is altered within a cardiac beat period under different heart conditions, and this change can be observed to diagnose the patient's heart status. Simple naked-eye diagnosis can mislead detection; at that point, computer-assisted diagnosis (CAD) is required. In this paper, the dual-tree wavelet transform is used as a feature extraction technique along with a deep learning (DL)-based convolutional neural network (CNN) to detect abnormal heartbeats. The findings of this research and associated studies do not depend on any cumbersome artificial environments. This work investigates the viability of using deep learning-based architectures for heartbeat classification, and the results suggest that it is feasible to train a deep learning architecture on 2D images for this task. The CNN produced the highest overall accuracy of around 99%. The proposed CAD method has high generalizability; it can help doctors efficiently identify diseases and decrease misdiagnosis.
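The CNN this abstract relies on is built from a few standard operations. The following is a minimal sketch of those building blocks (2D convolution, ReLU, and 2x2 max pooling) in pure Python; the toy "beat image" and the edge kernel are illustrative assumptions, not the paper's architecture.

```python
def conv2d(image, kernel):
    """Valid (no-padding) 2D cross-correlation over nested lists."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(len(image) - kh + 1):
        row = []
        for j in range(len(image[0]) - kw + 1):
            row.append(sum(image[i + a][j + b] * kernel[a][b]
                           for a in range(kh) for b in range(kw)))
        out.append(row)
    return out

def relu(fmap):
    # Element-wise rectification: negatives are clipped to zero
    return [[max(0, v) for v in row] for row in fmap]

def maxpool2(fmap):
    # Non-overlapping 2x2 max pooling
    return [[max(fmap[i][j], fmap[i][j + 1],
                 fmap[i + 1][j], fmap[i + 1][j + 1])
             for j in range(0, len(fmap[0]) - 1, 2)]
            for i in range(0, len(fmap) - 1, 2)]

# Toy 4x4 "beat image" with a vertical edge, and a matching edge kernel
image = [[0, 0, 9, 9],
         [0, 0, 9, 9],
         [0, 0, 9, 9],
         [0, 0, 9, 9]]
kernel = [[-1, 1],
          [-1, 1]]
features = maxpool2(relu(conv2d(image, kernel)))
print(features)  # [[18]]: the edge produces a strong pooled response
```

Stacking many such convolution/pooling stages, followed by fully connected layers, is what lets a CNN turn a 2D beat image into a class score.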
Article
Full-text available
Data mining is a promising field that has attracted industries managing huge volumes of data, and one of its most effective and challenging techniques is data classification. The main intention of this research is to design and develop a data classification strategy based on a hybrid fusion model using a deep learning approach, the Adaptive Lion Fuzzy System (ALFS), and the Robust Grey Wolf based Sine Cosine Algorithm based Fuzzy System (RGSCA-FS). The hybrid model consists of three phases. In the first phase, the data is classified using ALFS, and the rule base of the fuzzy system is updated by optimally generating rules from the training data using adaptive lion optimization (ALA). The second step is the fuzzification process, which converts the scalar values in the training data into fuzzy values with the help of a membership function based on the Adaptive Genetic Fuzzy System (AGFS). Finally, the classified score of data instances is determined using a defuzzification process, which converts the linguistic variables into a fuzzy score. In the second phase, the data is classified using RGSCA-FS, which selects the optimal fuzzy rules. In the third phase, the data is classified using deep learning networks. The outputs from the three phases are fused together using the hybrid fusion model, for which weighted fusion is employed. The performance of the system is validated using three different datasets from the UCI machine learning repository. The proposed hybrid model outperforms existing methods with a sensitivity of 0.99, a specificity of 0.9350, and an accuracy of 0.9411.
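The fuzzification and defuzzification steps mentioned in this abstract can be sketched concretely. The example below is a hedged illustration, not the ALFS/RGSCA-FS rule base of the paper: triangular membership functions and weighted-peak (centroid-style) defuzzification are common textbook choices assumed here.

```python
def tri(x, a, b, c):
    """Triangular membership function: rises on [a, b], falls on [b, c]."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

# Fuzzification: map a scalar feature to linguistic terms
sets = {"low": (0.0, 0.25, 0.5),
        "medium": (0.25, 0.5, 0.75),
        "high": (0.5, 0.75, 1.0)}
x = 0.6
memberships = {name: tri(x, *params) for name, params in sets.items()}
# x = 0.6 is partly "medium" (0.6) and partly "high" (0.4)

# Defuzzification: centroid over the set peaks, weighted by membership,
# yields the crisp classification score
num = sum(mu * sets[name][1] for name, mu in memberships.items())
den = sum(memberships.values())
score = num / den
print(round(score, 3))  # 0.6
```

In the paper's pipeline, scores like this one from each of the three phases would then be combined by the weighted fusion step.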
Chapter
Full-text available
Presently, wearables are a vital part of the healthcare sector, able to offer exclusive insights into a person's health condition. In contrast to traditional diagnosis in a hospital environment, wearables can give unrestricted access to real-time physiological data. The COVID-19 epidemic is spreading at a fast rate while test kits remain limited; hence, it becomes essential to develop a novel COVID-19 diagnostic model. Numerous studies have applied artificial intelligence techniques to radiological images to precisely identify the disease. This chapter presents an efficient fusion-based feature extraction with multikernel extreme learning machine (FFE-MKELM) for COVID-19 diagnosis using the Internet of Things (IoT) and wearables. Primarily, the wearables and IoT are used to capture the radiological images of the patient. The presented FFE-MKELM model incorporates Gaussian filtering based preprocessing for removing the noise that exists in the radiological image. Besides, directional local extreme patterns with deep features based on the Inception v4 model are applied for the FFE process. In addition, the MKELM model is utilized as a classifier to determine the appropriate class label of the input radiological images. Moreover, the monarch butterfly optimization algorithm is applied to fine-tune the parameters involved in the MKELM model. Experimental validation of the FFE-MKELM model is performed against a benchmark dataset, and the outcomes are inspected under different measures. The resultant simulation outcome confirmed the superiority of the FFE-MKELM method by demonstrating an increased sensitivity of 97.34%, specificity of 97.26%, accuracy of 97.14%, and F-measure of 97.01%.
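The Gaussian-filtering preprocessing step this chapter applies before feature extraction can be sketched as follows. The kernel size, sigma, and the tiny test image are assumptions for illustration, written in pure Python rather than an image library.

```python
import math

def gaussian_kernel(size=3, sigma=1.0):
    """Normalized 2D Gaussian kernel as nested lists."""
    half = size // 2
    k = [[math.exp(-(x * x + y * y) / (2 * sigma * sigma))
          for x in range(-half, half + 1)]
         for y in range(-half, half + 1)]
    total = sum(sum(row) for row in k)
    return [[v / total for v in row] for row in k]

def gaussian_blur(image, kernel):
    """Convolve with replicate padding at the borders."""
    half = len(kernel) // 2
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for a in range(-half, half + 1):
                for b in range(-half, half + 1):
                    # Clamp indices so border pixels reuse edge values
                    ii = min(max(i + a, 0), h - 1)
                    jj = min(max(j + b, 0), w - 1)
                    acc += image[ii][jj] * kernel[a + half][b + half]
            out[i][j] = acc
    return out

# A single noisy spike gets spread toward its neighbours
image = [[0, 0, 0],
         [0, 9, 0],
         [0, 0, 0]]
smoothed = gaussian_blur(image, gaussian_kernel())
print(smoothed[1][1] < 9)  # True: the spike is attenuated
```

Smoothing speckle-like noise this way is a common first step before texture descriptors (such as the directional local extreme patterns above) are computed.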
Chapter
Natural language processing (NLP) is a growing field of artificial intelligence (AI) that combines machine learning and linguistics to enable computers to understand and generate human language. Applications of NLP range from voice assistants like Apple’s Siri and Amazon’s Alexa to text summarization, machine translation, and spam filtering. NLP is particularly challenging given the complexity and hierarchical nature of human language; at the most basic level, individual words can take on subtle meanings. Fortunately, rapidly improving computing power, new tools and avenues of mass data collection, and recent improvements in NLP algorithms (large language models) have all made it possible to train computers to understand human language more efficiently and more accurately.