Table 1. Structure of the HBase table for image feature storage

Source publication
Conference Paper
Full-text available
Most medical images are now digitized and stored in large image databases, and retrieving the desired images becomes a challenge. In this paper, we address the challenge of building a content-based image retrieval system by applying the MapReduce distributed computing model and the HDFS storage model. Two methods are used to characterize the content of images: th...

Context in source publication

Context 1
... based image retrieval (CBIR) is composed of two phases: 1) an offline phase and 2) an online phase. In the offline phase, a signature vector is computed for each image in the database and stored. In the online phase, the query is constructed by computing the signature vector of the input image; the query signature is then compared with the signatures of the images in the database. MapReduce is known for its ability to handle large amounts of data. In this work, we use the open-source distributed cloud computing framework Hadoop and its implementation of the MapReduce model to extract the feature vectors of images. The implementation of distributed feature extraction and image storage is given in figure 4.

Storage is the base of a CBIR system. Given the amount of image data produced daily by medical services, retrieving and processing these images requires significant computation time, so parallel processing is necessary. For this reason, we adopt the MapReduce computing model to extract the visual features of images and then write the features and image files into HBase. HBase partitions the key space; each partition is called a table, and each table declares one or more column families, which define the storage properties for an arbitrary set of columns [6]. The table given in figure 5 shows the structure of our HBase table: the row key is assigned the ID of the image, and the column families are "file" and "features". The labels "source" and "class" are added under the family "file", representing the source image and the class of the image, respectively (the DDSM database is classified into three diagnosis levels: 'normal', 'benign', and 'cancer'). Under the family "features", the labels "feature BEMD-GGD Alpha", "feature BEMD-GGD Beta", "feature BEMD-HHT mean", "feature BEMD-HHT standard deviation", "feature BEMD-HHT phase", and "feature BEMD-residue histogram" are added, representing the features extracted using the BEMD-GGD and BEMD-HHT methods.

In the figure given below, we describe the online retrieval phase, which is divided into 7 steps:
1) The user sends a query image to SCL; the image is stored temporarily in HDFS.
2) A MapReduce job is run to extract the features of the query image.
3) The image features are stored in HDFS.
4) The similarity/distance between the feature vector of the query image in HDFS and those of the target images in HBase is computed.
5) A reduce step collects and combines the results from all the map functions.
6) The reducer stores the result in HDFS.
7) The result is sent to the user.

IV. RESULT
The method is tested on the DDSM database (see II-A). We conducted experiments on mean precision at 20, which is the ratio between the number of relevant images retrieved and the total number of images retrieved. We give below the principle of our retrieval ...
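To make the offline phase described in the excerpt concrete, below is a minimal sketch (not the authors' code) of a Hadoop map task that computes a feature vector for an image and writes it, together with the image file, into the HBase table described above. It assumes the input arrives as (image ID, image bytes) pairs, e.g. from a SequenceFile; the feature extractor is a hypothetical placeholder for the BEMD-GGD/BEMD-HHT computation, and the job would be wired to HBase with the standard TableOutputFormat.

    import java.io.IOException;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.util.Bytes;
    import org.apache.hadoop.io.BytesWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    // Map task of the offline phase: one HBase Put per image.
    public class FeatureExtractionMapper
            extends Mapper<Text, BytesWritable, ImmutableBytesWritable, Put> {

        @Override
        protected void map(Text imageId, BytesWritable image, Context context)
                throws IOException, InterruptedException {
            byte[] rowKey = Bytes.toBytes(imageId.toString()); // row key = image ID
            Put put = new Put(rowKey);

            // Family "file": the source image (the "class" qualifier would be
            // filled from the DDSM ground truth: 'normal', 'benign' or 'cancer').
            put.addColumn(Bytes.toBytes("file"), Bytes.toBytes("source"),
                    image.copyBytes());

            // Family "features": one qualifier per feature vector; only the
            // BEMD-GGD Alpha vector is shown, the other five are analogous.
            put.addColumn(Bytes.toBytes("features"),
                    Bytes.toBytes("feature BEMD-GGD Alpha"),
                    toBytes(extractBemdGgdAlpha(image.copyBytes())));

            context.write(new ImmutableBytesWritable(rowKey), put);
        }

        // Hypothetical placeholder for the BEMD-GGD feature computation.
        private double[] extractBemdGgdAlpha(byte[] imageBytes) {
            throw new UnsupportedOperationException("feature extraction goes here");
        }

        // Serialize a feature vector into an HBase cell value.
        private static byte[] toBytes(double[] v) {
            byte[] out = new byte[8 * v.length];
            for (int i = 0; i < v.length; i++) {
                Bytes.putDouble(out, 8 * i, v[i]);
            }
            return out;
        }
    }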

Similar publications

Article
Full-text available
Due to the rapid growth in multimedia data and Cloud Computing (CC), Secure Image Archival and Retrieval System (SIARS) on the cloud has gained more interest in recent times. Content-based image retrieval (CBIR) systems generally retrieve the images relevant to the query image (QI) from massive databases. However, the secure image retrieval process...
Article
Full-text available
Our country's economic prospects lie mainly in the agricultural sector. Although there has been much advancement in technology, the chances of predicting diseases in plants remain vague. In this paper, a technical solution for farmers to detect and diagnose the right disease affecting their plants is discussed. The Content Based Image Retrieval (CBIR) te...
Article
Full-text available
With the recent evolution of technology, the number of image archives has increased exponentially. In Content-Based Image Retrieval (CBIR), high-level visual information is represented in the form of low-level features. The semantic gap between the low-level features and the high-level image concepts is an open research problem. In this paper, we p...
Article
Full-text available
Content-based image retrieval (CBIR) systems have become a hot topic in recent years. A CBIR system retrieves images based on visual features, and a CBIR system based on a single feature has low performance. Therefore, in this paper a new content-based image retrieval method using color and texture features is proposed to improve performance. In thi...
Article
Full-text available
In modern times, there has been a great advance in technology, and with this comes the problem of the enormous data it generates. Smartphones, laptops, and even televisions are connected to the internet, constantly generating data. With this said, there has been a huge push towards the digitization of information. With the data processed...

Citations

... A whole system consisting of all the nodes acts as a single computer to its users. Every node in the distributed system has its own processor and its own memory, and harmony is achieved through synchronization and coordination [8]. Processes are autonomous and execute tasks concurrently. ...
Article
Full-text available
With the exponential growth of data, it is difficult to efficiently store and retrieve data using traditional methods. There is a need to optimize storage and to efficiently retrieve relevant data matching the user query; traditional methods provide neither. To overcome these limitations, in this project we propose a distributed architecture framework to optimize memory usage and to effectively retrieve relevant data using Content-Based Image Retrieval (CBIR). The experimental results show that the proposed model enhances storage performance and retrieval time by 20%.
... When compared to the findings without homomorphic filtering, the evaluation results showed a lower error rate. Similarly, Jai-Andaloussi et al. (2013) addressed the issues of content-based image retrieval systems using the MapReduce processing architecture and HDFS storage model. They performed testing on mammography datasets and achieved good results, demonstrating that the MapReduce approach may be utilized efficiently for content-based medical image retrieval. ...
Article
Full-text available
The healthcare industry is different from other industries: patient data are sensitive, their storage needs to be handled with care and in compliance with regulations, and prediction accuracy needs to be high. The fast expansion in medical image modalities and data collection leads to the generation of so-called "Big Data", which is time-consuming for medical experts to analyze. This paper provides an insight into Big Data from the aspect of its role in multiscale modelling. Special attention is paid to the workflow, starting from medical image processing all the way to the creation of personalized models and their analysis. A review of the literature regarding Big Data in healthcare is provided, and two proposed solutions are described: carotid artery ultrasound image processing and 3D reconstruction, and drug testing on personalized heart models. For the carotid artery ultrasound image processing, the starting point is ultrasound images, which are segmented using the convolutional neural network U-net; the segmented masks are further used in 3D reconstruction of the geometry. For drug testing on a personalized heart model, a similar approach is proposed: images are used to create a personalized 3D geometrical model that is used in computational modelling to determine the pressure in the left ventricle before and after drug testing. All the aforementioned methodologies are complex, include Big Data analysis, and should be performed using servers or high-performance computing. Future development of Big Data applications in healthcare domains offers a lot of potential due to new data standards, the rapid development of research and technology, as well as strong government incentives.
... Jai-Andaloussi et al. [81] employed MapReduce for computation and HDFS for storage in content-based image retrieval systems. They used a mammography image database and applied the Bi-dimensional Empirical Mode Decomposition with Generalized Gaussian Density functions (BEMD-GGD) method and the Bi-dimensional Empirical Mode Decomposition with Huang-Hilbert Transform (BEMD-HHT) method. ...
Article
Full-text available
Clinical decisions are more promising and evidence-based; hence, big data analytics to assist clinical decision-making has been explored for a variety of clinical fields. Due to the sheer size and availability of healthcare data, big data analytics has revolutionized this industry and promises us a world of opportunities. It promises us the power of early detection, prediction, and prevention, and helps us to improve the quality of life. Researchers and clinicians are working to enable big data to have a positive impact on health in the future. Different tools and techniques are being used to analyze, process, accumulate, assimilate, and manage large amounts of healthcare data in either structured or unstructured form. In this review, we address the need for big data analytics in healthcare: why and how can it help to improve life? We present the emerging landscape of big data and analytical techniques in the five sub-disciplines of healthcare, i.e., medical image analysis and imaging informatics, bioinformatics, clinical informatics, public health informatics, and medical signal analytics. We present different architectures, advantages, and repositories of each discipline, drawing an integrated depiction of how distinct healthcare activities are accomplished in the pipeline to facilitate individual patients from multiple perspectives. Finally, the paper ends with the notable applications and challenges in the adoption of big data analytics in healthcare.
... Jai-Andaloussi et al. [133] employed MapReduce for computation and HDFS for storage in content-based image retrieval systems. They used a mammography image database and applied the Bi-dimensional Empirical Mode Decomposition with Generalized Gaussian Density functions (BEMD-GGD) method and the Bi-dimensional Empirical Mode Decomposition with Huang-Hilbert Transform (BEMD-HHT) method. ...
... Fraud Detection: 'Suspect, detect and protect'. Fraud, waste, and abuse have caused significant cost, ranging from honest mistakes that result in erroneous billings and inefficiencies that may result in wasteful diagnostic tests to over-payments due to false claims. Personal data is extremely sensitive due to its profitable value on black markets; thus, the healthcare industry is 200% more likely to experience data breaches than any other. ...
Preprint
Clinicians' decisions are becoming more and more evidence-based, meaning that in no other field is big data analytics as promising as in healthcare. Due to the sheer size and availability of healthcare data, big data analytics has revolutionized this industry and promises us a world of opportunities. It promises us the power of early detection, prediction, and prevention, and helps us to improve the quality of life. Researchers and clinicians are working to enable big data to have a positive impact on health in the future. Different tools and techniques are being used to analyze, process, accumulate, assimilate, and manage large amounts of healthcare data in either structured or unstructured form. In this paper, we would like to address the need for big data analytics in healthcare: why and how can it help to improve life? We present the emerging landscape of big data and analytical techniques in the five sub-disciplines of healthcare, i.e., medical image analysis and imaging informatics, bioinformatics, clinical informatics, public health informatics, and medical signal analytics. We present different architectures, advantages, and repositories of each discipline, drawing an integrated depiction of how distinct healthcare activities are accomplished in the pipeline to facilitate individual patients from multiple perspectives. Finally, the paper ends with the notable applications and challenges in the adoption of big data analytics in healthcare.
... They used the DDSM mammography database. [14] Wang et al. (2011) proposed an efficient and cost-effective parallel system for analyzing digital pathology imaging data called Hadoop-GIS. It helps in querying and analyzing spatially oriented scientific data, which is becoming increasingly important for many applications. ...
Conference Paper
Full-text available
Deep Learning and Big Data Analytics are the two high-focus areas in medical image analysis in recent times. Owing to the great volume of imaging data in databases, a lot of research has focused on medical image analysis involving big data tools and techniques. Also, due to the saturation of considerable advances in shallow reasoning-based machine learning algorithms, complex reasoning-based algorithms like deep learning are employed to address the issues of image data in the biomedical field. This paper discusses the challenges of traditional medical image analysis and reviews some of the latest research in the areas of medical image analysis involving deep learning and employing big data platforms.
... At that point, the HDFS (Hadoop Distributed File System) stores the image information, followed by the execution of MapReduce. The evaluation results showed a reduced error rate in images compared with the outcome without homomorphic filtering [26,33,34]. These were used to evaluate the comprehensive evaluation value for the synthesized image [35,36]. ...
Chapter
Full-text available
The data related to human health and medicine can be stored, searched, shared, analysed, and presented in ingenious ways and the scale of this medical big data is continuously growing with advancements in medical technology and hospital information. However, there are predicaments and problems that remain to be overcome in its current stage of inception especially on how to analyze this data in a reliable manner. In this chapter, how data mining technology is more convenient for integrating this medical data for a variety of applications such as disease diagnosis, prevention, hospital administration has been discussed. In this chapter, the practicality of big data analytics, methodological and technical issues such as data quality, inconsistency and instability, analytical and legal issues and lastly, the issue of integration of big data analytics with clinical practice and clinical utility have been analysed. It is important to overcome these challenges to secure the application of big data technology in medical field and to thus improve patient outcome and more essentially to reduce resource wastage in medical field, which should be the real aim of big data studies. This chapter also aims at exploring methods to overcome these obstacles using big data tools and understanding the potential of Hadoop, which is an open-source distributed data storage and analysis application, in managing healthcare data. An analysis and examination of possible future work for these areas is also done with a translational approach of using data from all levels of human existence.
... Jones and Shao [5] tried to combine several techniques, such as vocabulary-guided and spatio-temporal pyramid matching, Bag-of-Words for action representation, and SVMs/ABRS-SVMs for relevance feedback, using realistic action datasets such as UCF Sports, UCF YouTube, and HOHA2. Jai-Andaloussi et al. [6] had already suggested Content-Based Image Retrieval (CBIR) using a distributed computing system to reduce computation time. ...
Chapter
Full-text available
The paper studies the influence on similarity of extracting and using m out of n frames of a video; the purpose is to evaluate the proportion of similarity between them and to propose a new Content-Based Video Retrieval (CBVR) system. The proposed system uses a Bounded Coordinate of Motion Histogram (BCMH) [1] to characterize videos, which are represented by spatio-temporal features (e.g., motion vectors), and the Fast and Adaptive Bidimensional Empirical Mode Decomposition (FABEMD). A global representation of a video is compared pairwise with those of all the videos in the Hollywood2 dataset using the k-nearest neighbors (KNN) algorithm. Moreover, this approach is adaptive: a training procedure is presented, and an accuracy of 58.1% is accomplished in comparison with the state-of-the-art approaches on the dataset of 1707 movie clips.
... This metadata includes a list of "stored filenames, corresponding blocks of each file, and Datanodes containing these blocks". For this reason, when a client reads a file, it first communicates with the Namenode to get the locations of the data blocks that make up the file, and the Namenode directs the client to the Datanodes hosting the requested file. The client then communicates directly with the Datanodes to perform file operations [12][13]. ...
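As a small, hedged illustration of this read path (the file path is illustrative), the standard Hadoop FileSystem API hides the two-step protocol just described: open() asks the Namenode for the block locations, and the returned stream reads the blocks directly from the Datanodes.

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class HdfsReadExample {
        public static void main(String[] args) throws IOException {
            // fs.defaultFS in the configuration points the client at the Namenode.
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);

            // open() contacts the Namenode for the block locations of the file;
            // the stream then fetches the bytes directly from the Datanodes.
            try (FSDataInputStream in = fs.open(new Path("/data/images/sample.png"))) {
                byte[] buffer = new byte[4096];
                int read;
                while ((read = in.read(buffer)) > 0) {
                    // process `read` bytes of the file here
                }
            }
        }
    }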
Article
Full-text available
Large amounts of data are produced daily in various fields such as science, economics, engineering, and health. The main challenge of pervasive computing is to store and analyze this data, which has led to the need for usable and scalable data applications and storage clusters. In this article, we examine the Hadoop architecture developed to deal with these problems. The Hadoop architecture consists of the Hadoop Distributed File System (HDFS) and the MapReduce programming model, which enable storage and computation on a set of commodity computers. In this study, a Hadoop cluster consisting of four nodes was created. Pi and Grep MapReduce applications were run to show the effect of different data sizes and numbers of nodes in the cluster, and their results were examined.
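For readers unfamiliar with the Grep benchmark mentioned in this abstract, the following is a minimal, hedged sketch of a Grep-style MapReduce job (not the code from the article): the map phase emits a count for each input line matching a regular expression, and the reduce phase sums the counts. All class, path, and configuration names are illustrative.

    import java.io.IOException;
    import java.util.regex.Pattern;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class GrepCount {
        // Map phase: emit (pattern, 1) for every input line that matches.
        public static class MatchMapper
                extends Mapper<LongWritable, Text, Text, IntWritable> {
            private static final IntWritable ONE = new IntWritable(1);
            private Pattern pattern;

            @Override
            protected void setup(Context context) {
                pattern = Pattern.compile(context.getConfiguration().get("grep.pattern"));
            }

            @Override
            protected void map(LongWritable offset, Text line, Context context)
                    throws IOException, InterruptedException {
                if (pattern.matcher(line.toString()).find()) {
                    context.write(new Text(pattern.pattern()), ONE);
                }
            }
        }

        // Reduce phase (also used as combiner): sum the match counts.
        public static class SumReducer
                extends Reducer<Text, IntWritable, Text, IntWritable> {
            @Override
            protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                    throws IOException, InterruptedException {
                int sum = 0;
                for (IntWritable v : values) sum += v.get();
                context.write(key, new IntWritable(sum));
            }
        }

        public static void main(String[] args) throws Exception {
            // args: <input dir> <output dir> <regex>
            Configuration conf = new Configuration();
            conf.set("grep.pattern", args[2]);
            Job job = Job.getInstance(conf, "grep-count");
            job.setJarByClass(GrepCount.class);
            job.setMapperClass(MatchMapper.class);
            job.setCombinerClass(SumReducer.class);
            job.setReducerClass(SumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }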
... ImageTerrier (Hare et al. 2012) used the largest collection of these, indexing 10.9 million images using BoW features based on about 10 billion SIFT feature vectors (Lowe 2004). Such systems have also seen some use in the medical image retrieval domain, again with relatively small collections (Grace et al. 2014; Jai-Andaloussi et al. 2013; Yao et al. 2014). While only k-means is described in detail, the library contains multiple algorithms for descriptor creation, image retrieval, and result processing. ...
Article
The world has experienced phenomenal growth in data production and storage in recent years, much of which has taken the form of media files. At the same time, computing power has become abundant with multi-core machines, grids, and clouds. Yet it remains a challenge to harness the available power and move toward gracefully searching and retrieving from web-scale media collections. Several researchers have experimented with using automatically distributed computing frameworks, notably Hadoop and Spark, for processing multimedia material, but mostly using small collections on small computing clusters. In this article, we describe a prototype of a (near) web-scale throughput-oriented MM retrieval service using the Spark framework running on the AWS cloud service. We present retrieval results using up to 43 billion SIFT feature vectors from the public YFCC 100M collection, making this the largest high-dimensional feature vector collection reported in the literature. We also present a publicly available demonstration retrieval system, running on our own servers, where the implementation of the Spark pipelines can be observed in practice using standard image benchmarks, and downloaded for research purposes. Finally, we describe a method to evaluate retrieval quality of the ever-growing high-dimensional index of the prototype, without actually indexing a web-scale media collection.
... Moreover, in terms of sensors, it can be expressed in different time units such as seconds, milliseconds, or microseconds [2]. Today, existing big data technologies such as MapReduce [3], Hadoop [4], STORM [5], and NoSQL (Not Only SQL) [6], together with cloud computing technology, are used to address the scalability problem of healthcare data and to increase the performance of healthcare informatics systems [7,5]. ...