Soft Computing and Machine Intelligence Journal, Vol (1), Issue (1), 2021
Using Machine Learning Algorithms for Housing Price Prediction: The Case of Islamabad Housing Data
Imran 1,*, Umar Zaman 2, Muhammad Waqar 1 and Atif Zaman 1
1 Department of Computer Science, Bahria University Islamabad, Pakistan
2 Department of Computer Science, Iqra University Islamabad, Pakistan
* Correspondence: imranjejunu@gmail.com; Tel.: +82-1093369498
Abstract:
Predicting house prices supports a significant financial decision for professionals working in the housing market as well as for potential buyers. Whether investing or buying a house to live in, a person entering the housing market is interested in the potential gain. This paper applies machine learning algorithms to develop intelligent regression models for house price prediction. The proposed research methodology consists of four stages: data collection; preprocessing the collected data and transforming it into a suitable format; developing intelligent models using machine learning algorithms; and training, testing, and validating the models on house prices from the housing market of the capital, Islamabad. The data used for model validation and testing are asking prices from online property stores, which provide a reasonable estimate of the city's housing market. The prediction models can assist in predicting future housing prices in Pakistan. The regression results are encouraging and give promising directions for future prediction work on the collected dataset.
Keywords:
machine learning for regression; housing dataset; property stores; house price prediction; housing property value; real estate market
1. Introduction
The real estate market in Pakistan is a widespread trade, and with projects like CPEC, the property dynamics are changing quickly. Investors, as well as individuals, want to invest money in the housing sector. Buyers and owners observe real estate trends, particularly in the housing market; these trends also reflect the economic situation and social sentiment of a developing country. House price estimation informs a significant financial decision for individuals working in the housing market and for potential buyers. Whether investing or buying a house to live in, a person entering the housing market is interested in the potential gain. To provide background for this study, we first give an overview of the housing market of Pakistan and then an overview of the dataset used in this study. Many factors determine house prices. For real estate in general, a rise in the market is often explained by a rise in the income of the inhabitants of a particular area. However, careful analysis suggests that such demand-oriented variables explain rising prices only temporarily, so the relevant factors can change over time. House prices depend on the income of the inhabitants of the area, the supply of housing stock, and the payment system, that is, whether the seller accepts installments or requires cash payment.
Other essential variables include affordability, the unemployment rate, and demographics, but house prices can broadly be explained as a function of income. In this study, we do not consider all the possible variables that can be used to predict housing prices. We use only the housing data available from property websites and predict housing prices from recent trends. Pakistan's economy is slowly recovering. The unemployment rate is trending downwards, and consumer spending is going up. Nevertheless, the growth rate is still struggling, which indicates that the Pakistan economy has a long way to go before it is fully up and running.
In Europe and other advanced economies, real estate companies run challenges to develop algorithms that forecast property prices more accurately. Researchers use some well-known housing datasets, e.g., the Boston and King County, USA datasets. One of the gaps for Pakistan is the absence of a comprehensive housing dataset. Some real estate property sites in Pakistan provide a reasonable estimate of the Pakistan housing market, but currently they do not use house price forecasting tools. Websites like Zillow¹, a US real estate marketplace, organize competitions on Kaggle² to encourage researchers to come up with accurate house price forecasting algorithms. Such challenges are not yet part of the Pakistan housing market, which makes it very difficult for a research scholar to develop forecasting algorithms for Pakistan real estate housing data; the only sources of housing data are the online property stores. In the Pakistan real estate market, there are currently no machine-based forecasting tools used to estimate the price of houses or any other real estate properties. There are some blogs and magazines where human real estate market experts offer forecasts for Pakistan real estate. The lack of scientific research competitions for forecasting Pakistan's housing prices, and hence the lack of a housing dataset, makes housing price prediction for Pakistan real estate a challenging task.
Figure 1. Properties count based on locations
Figure 1 displays the property counts of the dataset by sector of the capital, Islamabad. We collected the dataset for this study from the leading online property websites based in Pakistan. These websites contain details of property listings from various cities of Pakistan; the dataset for this study covers Islamabad. The dataset is in tabular textual format, consisting of 23 columns and 44647 rows collected over a period of one year.
¹ www.zillow.com
² www.kaggle.com
2. Related Work
In the literature, the approaches used for house price prediction can be classified as regression models, machine learning models, and hybrid models, and a variety of research work has been done to estimate housing prices. A Gaussian Process (GP) regression model has been used to exploit the spatial structure of the London housing dataset; for this purpose, smaller local models are developed that work independently of each other. Once the local models are trained, the overall predictions are obtained by recombining the predictions from the local models. To deliver visualizations to clients on mobile devices [1], the model is trained on the server side, and predictions are generated for the user via a mobile app.
Linear regression and gradient boosting methods have been used to predict Zillow's property value estimates. Zillow offers a competition on Kaggle to develop the most accurate property value forecasting algorithm. The authors used property data to train linear regression and gradient boosting models, with which they make predictions about other properties. For the gradient boosting models, they use grid search to fine-tune the hyperparameters. Oladunni et al. [2] reduce errors in the hedonic housing regression model by investigating the spatial dependency and substitutability of submarket and geospatial attributes. The model is trained using best-subset linear regression and regression tree algorithms. The Bayesian information criterion and residual mean deviance are used as performance metrics.
Ahmed et al. [3] design a neural network-based model for predicting housing market performance. The model is trained on a historical market performance dataset to predict future performance. Model testing and validation show that the prediction error of the neural network ranges between -2 and +2 percent. To predict the Singapore housing market, Lim et al. [4] design neural networks. They used two algorithms for prediction, the multilayer perceptron and the autoregressive integrated moving average. The model with the higher accuracy score is used for prediction, and the lower mean squared error (MSE) of the ANN models shows that ANN outperforms the other predictive tools. Chica et al. [5] use cokriging, a multivariate spatial method, to predict housing location prices. The method estimates correlated spatial variables and produces interpolated maps of house prices, providing information about house location prices to appraisers and real estate agents. In the experiment, the housing location price is estimated using two methods, isotopic data cokriging and heterotopic data cokriging. The results of both methods are compared, and the prediction from the better method is selected.
Bahia et al. [6] applied a data mining model based on artificial neural networks to the real estate market. Two network models were developed during the study, FFB and CFBP. Both models were trained on the Boston dataset, and the performance metric was the regression value. The CFBP prediction results are the best, with a regression value of 0.964; the study suggests that the CFBP prediction accuracy is 96 percent. Stevens et al. [7] used text mining to predict housing prices. Their predictions involve pricing indicators, e.g., selling price, asking price, and price fluctuation. The study shows that the SGD classifier performed best for all pricing indicators. It uses stemmed n-grams for classification and regression, and the R² value for regression is 0.303. The study suggests that these results are good given the complex nature of the task.
Nissan et al. [8] used various algorithms to predict real estate property prices in Montreal. The study proposes a prediction model that predicts asking and selling prices based on features such as location, area, number of rooms, and proximity to the nearest police station and fire station. The regression methods used include linear regression, SVR, kNN, regression trees, and random forest regression. The proposed models predict the asking price with an error of 0.0985 and the selling price with an error of 0.023.
Nghiep et al. [9] compared multiple regression analysis (MRA) to artificial neural networks (ANN) using three different-sized training sets of single-family houses. The prediction model uses features such as area, number of bathrooms and bedrooms, the build year (which indicates the age of the property in years), number of quarters, selling status, and whether or not the property has a garage or carport. The researchers found that MRA performs best on smaller training sets, while ANN outperforms it as the dataset size increases. Byeonghwa and Jae [10] applied various prediction techniques to predict house prices in Fairfax County, VA. They built models on 5359 townhouses using RIPPER, Naive Bayesian, and AdaBoost, evaluated and compared them, and reported that RIPPER outperforms the other prediction models for housing price prediction. In the literature, descriptive analysis has been applied to the effective management of waste data [11], healthcare and thermal comfort applications [12,13], and the energy optimization domain [14,15]. Predictive analysis has proven helpful for forecasting the outcome of a given situation in the near future. It has recently been applied to recommendation systems [16], safety applications [17,18], policy-making [19], and convergence applications [20].
3. Materials and Methods
We present the experimental design in three stages: the first presents data collection, the second presents the preprocessing steps, and the third presents the regression models for house price prediction.
[Figure 2 diagram: Data Collection Layer (house feature variables); Pre-Processing Layer (handling missing values; simple or normalized data); Prediction Layer (two neural networks with input, hidden, and output layer sizes I=4, H=10, O=1 and I=8, H=20, O=1; moving average smoothing; batch normalization); Performance Evaluation Layer (Root Mean Square Error, Mean Absolute Error, Mean Absolute Percentage Error).]
Figure 2. Experimental Design.
3.1. Data Collection
Data is collected using scraping software that gathers data from the internet in a format that a machine learning model can use. The parsed output is meant to be interpreted by a machine and is not easy for a human to read. Data scraping is also referred to as data extraction. Scraping is very useful because manual data collection from the internet is error-prone, whereas machines transfer data between programs in the form of data structures, which preserves the integrity of the data. However, a scraping script is written for a pre-determined format, and the data is not always in that format, so the data may have consistency and correctness issues. Therefore, data scraping only collects raw data, which requires extensive preprocessing and, in some situations, human involvement. The scraping activity depends primarily on the internet sources from which data is collected and cannot be fully automated. For example, when scraping data from a website, the most convenient case is one in which the developer has assigned an ID attribute to each unique HTML element and a class attribute to each item of the same group. This makes it possible to write a scraping script in almost any programming language. A comparative study [21] of open-source scraping tools suggests that Scrapy is the best open-source tool for scraping, so in this study we use the Python Scrapy library to create our crawler.
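To illustrate why ID and class attributes make extraction straightforward, the following is a minimal sketch that uses Python's standard `html.parser` module rather than Scrapy, which the study itself used. The sample markup, the class names `listing`, `price`, and `location`, and the helper `parse_listings` are all hypothetical and do not reflect any real property website.

```python
from html.parser import HTMLParser

# Hypothetical markup: one id per listing, one shared class per field.
SAMPLE_HTML = """
<div class="listing" id="listing-101">
  <span class="price">95 Lakh</span>
  <span class="location">F-11</span>
</div>
<div class="listing" id="listing-102">
  <span class="price">1.2 Crore</span>
  <span class="location">G-13</span>
</div>
"""


class ListingParser(HTMLParser):
    """Collect one dict per element whose class is "listing"."""

    FIELDS = ("price", "location")

    def __init__(self):
        super().__init__()
        self.listings = []
        self._field = None  # name of the field whose text is being read

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        css_class = attrs.get("class", "")
        if css_class == "listing":
            # A new listing starts; its id uniquely identifies it.
            self.listings.append({"id": attrs.get("id")})
        elif css_class in self.FIELDS and self.listings:
            self._field = css_class

    def handle_data(self, data):
        if self._field is not None and data.strip():
            self.listings[-1][self._field] = data.strip()

    def handle_endtag(self, tag):
        # Leaving any tag ends the field currently being read.
        self._field = None


def parse_listings(html):
    parser = ListingParser()
    parser.feed(html)
    return parser.listings


rows = parse_listings(SAMPLE_HTML)
```

Each entry of `rows` is a dictionary holding the listing's `id` together with the raw `price` and `location` strings, which still need the preprocessing described in Section 3.2.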
Note, however, that scraping the websites of well-established companies is not trivial, as these companies use defensive algorithms and software to prevent unwanted access to their websites. Most of them disallow scripts of any type in their robots.txt. The idea is therefore to write a script that scrapes data the way a human would browse, which is achieved by automating human behavior when browsing a website. For example, if each request is delayed by 5 or 10 seconds, the system may not flag the data extraction and could consider it regular activity. Data scraping is done on data that is publicly available via a browser, either without login or after authenticating to the website. By contrast, using SQL injection to break into a database is a serious internet offense. Web search engines, e.g., Google, Yahoo, Bing, and others, play an essential role in reaching a website. For example, we type a keyword, and the search engine returns results based on that query. This helps users find a data host, which benefits both the data host and the person who is scraping. Search engines use the same mechanism as web scraping, but they are not blamed for data scraping because the data is used for the user's convenience.
Consider Google: its search engine has two parts. The first is Googlebot, a software bot that crawls billions of web pages on the internet and stores them in Google's data hosts. The second is an algorithm that answers user queries based on the crawled data and displays results to the user with the help of a ranking algorithm. As to whether web scraping is legal or illegal, Michael Mahoney observes [22] that legal action has been taken against airline price aggregators such as Orbitz, Kayak, and Expedia. Another example in this regard is Facebook, which has a history of suing third-party applications that have accessed and republished Facebook user data [23].
Another interesting example is Craigslist [24], which accused services such as PadMapper and 3Taps of improperly gathering its listings and reposting them in a map interface that plots the user-generated ads by location. The author states there is "no direct legal protection for databases. However, data hosts can file a case against scraper if they can prove the scraper has harmed them in any way". One such example is the case of Intel v. Hamidi, which ruled that server inconveniences do not constitute an actionable harm [25]. Scraping may consume the bandwidth of websites and, in extreme cases, crash a website or server. The legality of scraping is summarized in [26]: multiple instances of data hosts pairing up with scrapers show that data hosts should seek ways to embrace scrapers that seek to improve their services. Further, scrapers should review their business model. If a data host considers a scraper parasitic, it can sue the scraper.
Table 1 and Table 2 present the physical, geographical, and other features of the collected dataset.
Name of the attribute    | Description                                | Data type
-------------------------|--------------------------------------------|----------
Area                     | Living area in square feet                 | Numeric
Bedrooms                 | Number of bedrooms                         | Numeric
Bathrooms                | Number of bathrooms                        | Numeric
Dining Room              | Dining room? (yes/no)                      | Binary
Drawing Room             | Drawing room? (yes/no)                     | Binary
Laundry Room             | Laundry room? (yes/no)                     | Binary
Lounge                   | Lounge? (yes/no)                           | Binary
Garden                   | Garden? (yes/no)                           | Binary
Flooring                 | Flooring? (yes/no)                         | Binary
Study Room               | Study room? (yes/no)                       | Binary
Swimming Pool            | Swimming pool? (yes/no)                    | Binary
Central Air Conditioning | Central air conditioning system? (yes/no)  | Binary
Build                    | House build type? (old/new)                | Binary
Table 1. List of physical features selected for the dataset.
Name of the attribute    | Description                                | Data type
-------------------------|--------------------------------------------|----------
Location                 | Sector name of the location                | Nominal
Nearby Hospitals         | Nearby hospital? (yes/no)                  | Binary
Nearby Schools           | Nearby school? (yes/no)                    | Binary
Nearby Shopping Malls    | Nearby shopping malls? (yes/no)            | Binary
Maintenance Staff        | Maintenance staff? (yes/no)                | Binary
Security Staff           | Security staff? (yes/no)                   | Binary
Nearby Airport           | Nearby airport? (yes/no)                   | Binary
View                     | House view? (good/best/normal)             | Nominal
Parking Spaces           | Parking spaces? (yes/no)                   | Binary
Price                    | Price of the house in PKR                  | Numeric
Table 2. List of geographic and environmental features.
3.2. Preprocessing
Data preprocessing transforms the raw dataset into a clean dataset on which machine learning models perform better. Preprocessing techniques are applied to data in raw format, which is not suitable for analysis. In our case, the data is collected from different property websites where property agents entered it, so there are missing values, data in various formats, and incorrect data. We performed data integration to combine the data from the various sectors of the capital into an integrated dataset. Data transformation methods were then applied to convert the data records into a format suitable for machine learning analysis.
To allow iterative analysis of the data, we cleaned the dataset of missing and incorrect values. Data wrangling and data munging are equivalent terms used in the data science community for techniques that convert raw data into a format suitable for use. In our case, we converted textual values such as yes and no to binary variables. Locations, views, and other nominal variables were encoded as numbers for better analysis results.
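The encoding step can be sketched as follows. This is an illustration only: the helper names `encode_binary` and `encode_nominal` are hypothetical, and assigning integer codes in order of first appearance is an assumed scheme, since the exact encoding used in the study is not stated.

```python
def encode_binary(value):
    """Map a yes/no string to 1/0 (case-insensitive)."""
    text = value.strip().lower()
    if text not in ("yes", "no"):
        raise ValueError(f"expected yes/no, got {value!r}")
    return 1 if text == "yes" else 0


def encode_nominal(values):
    """Give each distinct category an integer code, in order of first appearance."""
    codes = {}
    encoded = []
    for value in values:
        if value not in codes:
            codes[value] = len(codes)
        encoded.append(codes[value])
    return encoded, codes


# Hypothetical sample values for a binary and a nominal column.
gardens = [encode_binary(v) for v in ["yes", "No", "YES"]]
sectors, sector_codes = encode_nominal(["F-11", "G-13", "F-11", "E-7"])
```

Returning the `codes` mapping alongside the encoded column keeps the encoding reversible, so predictions can later be reported against the original sector names.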
We computed the binary variable Build from the year of construction of the house. The asking prices were listed in various currencies and units, e.g., thousands, lakhs, and crores. We converted all prices to lakhs of PKR. Machine learning algorithms such as neural networks perform best on values in the range 0 to 1, so we scaled our dataset values to between 0 and 1 using the Min-Max scaling algorithm. Later, for performance evaluation, the values are scaled back up to their original range. Equation 1 shows how values are scaled to between 0 and 1.
$$x' = \frac{x - \min(x)}{\max(x) - \min(x)} \qquad (1)$$
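A minimal sketch of Equation 1 and of its inverse, which is needed to scale predictions back to the original price range before evaluation, is given below. The function names and the sample prices are illustrative assumptions, not values from the dataset.

```python
def min_max_scale(values):
    """Scale values to [0, 1] as in Equation 1; also return min and max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot scale a constant column")
    scaled = [(v - lo) / (hi - lo) for v in values]
    return scaled, lo, hi


def min_max_inverse(scaled, lo, hi):
    """Map scaled values back to the original range."""
    return [s * (hi - lo) + lo for s in scaled]


prices = [50.0, 100.0, 150.0, 250.0]  # hypothetical prices in lakhs of PKR
scaled, lo, hi = min_max_scale(prices)
restored = min_max_inverse(scaled, lo, hi)
```

The minimum and maximum should be taken from the training data and reused for the validation and test subsets, so that all subsets share one scale.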
Figure 3 shows the correlation between housing features, which measures the strength of the relationship between housing features, e.g., Bedrooms, Bathrooms, Build, and Dining Room, and the price feature. The correlation coefficients of Bedrooms and Bathrooms with respect to price are higher than those of the other features, which shows that price has a strong relationship with these two features; hence they will contribute more than the other features to house price prediction. It is clear from Figure 3 that all the features except Area, Central Air Conditioning, Location, and View have some relationship with the price feature. In this study, we used the Pearson correlation coefficient to measure the strength of the relationship between feature variables. The Pearson correlation coefficient is calculated using Equation 2.
$$\rho = \frac{\operatorname{cov}(X, Y)}{\sigma_X \sigma_Y} \qquad (2)$$
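Equation 2 can be computed directly, as in the following sketch. The function name `pearson` and the sample bedroom counts and prices are made up for illustration; population (divide by n) covariance and standard deviations are used, which gives the same coefficient as the sample versions because the factor cancels.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient: cov(X, Y) / (sigma_X * sigma_Y)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return cov / (sd_x * sd_y)


bedrooms = [2, 3, 4, 5, 6]                  # hypothetical feature values
price = [60.0, 85.0, 120.0, 160.0, 210.0]   # hypothetical prices in lakhs
r = pearson(bedrooms, price)
```

A coefficient near +1 or -1 indicates a strong linear relationship with price, while a value near 0 indicates little linear relationship.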
Figure 3. Correlation between features
After preprocessing the dataset, we partition it into training, validation, and testing subsets. Each of these subsets is further divided into the independent variables X (the housing features) and the dependent variable Y (the price).
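The partitioning can be sketched as below. The 70/15/15 split, the fixed shuffle seed, the helper names, and the sample rows are assumptions made for illustration; the split ratios used in the study are not stated.

```python
import random


def partition(rows, train_pct=70, val_pct=15, seed=0):
    """Shuffle rows and split them into training, validation, and test lists.

    The percentages are assumed values; the remainder goes to the test set.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * train_pct // 100
    n_val = len(shuffled) * val_pct // 100
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


def split_xy(rows, target="price"):
    """Separate independent variables X from the dependent variable Y."""
    X = [[v for k, v in row.items() if k != target] for row in rows]
    Y = [row[target] for row in rows]
    return X, Y


# Hypothetical records with two features and a price.
rows = [{"area": 1000 + 50 * i, "bedrooms": 2 + i % 4, "price": 80.0 + 5 * i}
        for i in range(20)]
train_rows, val_rows, test_rows = partition(rows)
X_train, Y_train = split_xy(train_rows)
```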
3.3. Analysis procedure
This study developed various regression models, including intelligent machine learning-based models, and applied them to our dataset. The models were developed using the Anaconda distribution with the Spyder IDE. We now discuss the ten machine learning regression procedures applied to our dataset.
3.3.1. Machine Learning Regression Methods
Linear regression (LR) [27] is widely used because it is simple and easy to understand. It is one of the most basic and popular algorithms in machine learning. In this study, we build a multivariate LR model to predict housing prices. The LR model finds the line that best fits the training set and then predicts the unseen house prices of the test set. We also applied Support Vector Regression (SVR) [28] to the same housing dataset. SVR differs slightly from the well-known Support Vector Machine (SVM) algorithm: SVM is used for classification, whereas SVR is used for regression problems. In SVM, a hyperplane separates the classes. In SVR, the hyperplane is used to predict a continuous value, here the housing price. Other concepts, i.e., the boundary lines and support vectors, are the same in SVM and SVR.
We also modeled the housing price prediction problem with a probabilistic machine learning model called Bayesian Ridge Regression (BRR) [29]. We assume the house prices to be Gaussian distributed around a linear function of the independent housing features. The main advantages of using BRR for house price prediction or other regression problems are, first, that it adapts to the data at hand and, second, that regularization parameters can be included in the estimation procedure.
LassoLars regression [30] is a simple technique for reducing model complexity and preventing the over-fitting that can result from simple linear regression. Lasso regression helps both in reducing over-fitting and in feature selection. As with Ridge regression, the regularization parameters can be tuned for better estimation of the housing prices. Elastic Net [31] emerged from a critique of lasso regression, whose variable selection can be too dependent on the data and thus unstable. Its solution is to combine the penalties of ridge regression and lasso to get the best of both worlds, and the resulting penalized loss function is then minimized.
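To make the relationship between the three penalties concrete, the objectives can be written as follows. The notation is an assumption introduced here and does not come from the paper: $w$ is the coefficient vector, $x_i$ the feature vector and $y_i$ the price of house $i$, and $\lambda_1, \lambda_2 \ge 0$ are the regularization weights. Software libraries may scale these terms differently.

```latex
\begin{aligned}
\text{Ridge:} \quad & \min_{w} \sum_{i=1}^{n} \left( y_i - w^\top x_i \right)^2 + \lambda_2 \lVert w \rVert_2^2 \\
\text{Lasso:} \quad & \min_{w} \sum_{i=1}^{n} \left( y_i - w^\top x_i \right)^2 + \lambda_1 \lVert w \rVert_1 \\
\text{Elastic Net:} \quad & \min_{w} \sum_{i=1}^{n} \left( y_i - w^\top x_i \right)^2 + \lambda_1 \lVert w \rVert_1 + \lambda_2 \lVert w \rVert_2^2
\end{aligned}
```

Setting $\lambda_1 = 0$ in the Elastic Net objective recovers ridge regression, and setting $\lambda_2 = 0$ recovers the lasso.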
Gradient boosting regression (GBR) [32] is a machine learning technique that builds a prediction model for regression problems such as house price prediction in the form of an ensemble of weak prediction models. GBR repeatedly fits new weak models to the patterns in the residuals, strengthening the overall housing price prediction model step by step. The aim is to minimize the loss function such that the test loss reaches its minimum. Random Forest (RF) [33] is an ensemble technique capable of performing both regression and classification tasks using multiple decision trees and a technique called bootstrap aggregation, commonly known as bagging. Stochastic gradient descent (SGD) [34] is a variation of gradient descent. It is an iterative method for optimizing an objective function and is mostly used as a black-box optimizer. SGD can be regarded as a stochastic approximation of gradient descent optimization. Passive-Aggressive algorithms [35] are a family of online learning algorithms. We use the Passive-Aggressive regression (PAR) model for the house price prediction problem. The idea is simple, and the house price estimates of this regression model are better than those of many alternative methods. The Theil-Sen estimator is a method for simple linear regression that chooses the median of the slopes of all lines through pairs of points.
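The Theil-Sen idea for a single feature can be sketched in a few lines. The slope follows the description above; taking the intercept as the median of the residual offsets is a common convention assumed here, not something stated in the paper, and the sample data are made up to show the effect of an outlier.

```python
from itertools import combinations
from statistics import median


def theil_sen(x, y):
    """Fit y = slope * x + intercept using the median of pairwise slopes."""
    slopes = [
        (y2 - y1) / (x2 - x1)
        for (x1, y1), (x2, y2) in combinations(zip(x, y), 2)
        if x2 != x1  # skip vertical pairs, whose slope is undefined
    ]
    if not slopes:
        raise ValueError("need at least two points with distinct x values")
    slope = median(slopes)
    # Assumed convention: intercept is the median of y - slope * x.
    intercept = median(yi - slope * xi for xi, yi in zip(x, y))
    return slope, intercept


area = [1, 2, 3, 4, 5]                    # hypothetical feature values
price = [12.0, 14.0, 16.0, 18.0, 100.0]   # the last price is an outlier
slope, intercept = theil_sen(area, price)
```

Because the median ignores the few extreme slopes produced by the outlier, the fitted line follows the four consistent points, which is why the estimator is considered robust to mislabeled or unusual listings.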
4. Results
This section presents the experimental results of the machine learning models used in the study for house price prediction. Figure 4 visualizes the comparison between the original listing prices of the houses and the prices predicted by the various machine learning algorithms.
(a) Bayesian Regression (b) Linear Regression
(c) Support Vector Regression (d) Stochastic gradient descent
(e) ElasticNet Regression (f) Gradient Boosting Regression
(g) Passive Aggressive Regression (h) Theil-Sen Regression
Figure 4. Predictive analysis of house price prediction with ML models.
In each subfigure, the listing price is represented by a dashed blue line, whereas the price predicted by the machine learning algorithm is represented by a solid orange line. The horizontal axis of each chart represents the housing property instance, and the vertical axis represents the price values.
4.1. Performance metrics
The performance metrics used to evaluate the regression models are the mean absolute percentage error (MAPE), root mean squared error (RMSE), and mean absolute error (MAE). Table 3 presents a comparison of the performance of the regression methods on the prepared housing dataset.
4.1.1. Mean absolute percentage error
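This performance measure computes the average relative deviation of the predicted house prices from the actual listing prices. MAPE is calculated by dividing each absolute difference between the actual house price and the price predicted by the machine learning algorithm by the actual price, summing these ratios, and dividing by the total number of price values, n; the result is expressed as a percentage. Here y_t is the actual listing price of item t and e_t is the difference between y_t and its predicted price.
$$\mathrm{MAPE} = \frac{100\%}{n} \sum_{t=1}^{n} \left| \frac{e_t}{y_t} \right| \qquad (3)$$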
4.1.2. Root Mean Squared Error
MSE squares each error, which inflates large errors and makes the actual error amount difficult to interpret. This problem is resolved by the RMSE measure, which is obtained by simply taking the square root of MSE.
$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( \frac{dif_i}{\sigma_i} \right)^{2}} \qquad (4)$$
4.1.3. Mean absolute error
The mean absolute error measures the difference between two continuous variables. In our case, these continuous variables are the listing price and the predicted price of the house property, and e_t is the difference between them for item t.
$$\mathrm{MAE} = \frac{1}{n} \sum_{t=1}^{n} |e_t| \qquad (5)$$
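The three measures can be sketched as follows. The function names and the sample prices are illustrative assumptions. RMSE is computed here in its usual unnormalized form, which corresponds to taking σ_i = 1 in Equation 4; this is an assumption about how the measure was applied.

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (Equation 3)."""
    return 100.0 / len(actual) * sum(
        abs((a - p) / a) for a, p in zip(actual, predicted)
    )


def rmse(actual, predicted):
    """Root mean squared error: square root of the mean squared difference."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


def mae(actual, predicted):
    """Mean absolute error (Equation 5)."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


actual = [100.0, 200.0, 400.0]      # hypothetical listing prices
predicted = [110.0, 180.0, 400.0]   # hypothetical model outputs
scores = {
    "MAPE": mape(actual, predicted),
    "RMSE": rmse(actual, predicted),
    "MAE": mae(actual, predicted),
}
```

MAPE is undefined when an actual price is zero, so listings with missing or zero prices must be removed during preprocessing before this measure is computed.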
Method     | MAPE       | MAE        | RMSE
-----------|------------|------------|-----------
LR         | 5627.9369  | 10928.2603 | 16658.4158
BRR        | 7383.9969  | 10930.6388 | 16661.3350
SVR        | 1918.4957  | 8595.6057  | 18209.5558
SGDR       | 10698.1442 | 13139.1928 | 17345.1444
ElasticNet | 7388.1547  | 10927.7181 | 16658.2267
GBR        | 5267.4830  | 9563.4324  | 16772.3870
LassoLars  | 7382.6600  | 10938.5807 | 16670.3489
RF         | 7371.0746  | 10902.9762 | 17105.2596
PAR        | 2133.8370  | 8621.9391  | 18069.2298
Theil-Sen  | 6031.6336  | 10151.4884 | 16754.2930
Table 3. Comparison of regression methods performance
5. Conclusions
In this study, we explored eleven machine learning algorithms for developing housing price prediction models that estimate future house prices in the capital, Islamabad. One of our contributions is collecting housing data and developing the first scientific housing dataset for the Pakistan housing market. Machine learning algorithms such as Passive-Aggressive Regression, Support Vector Regression, and a deep learning network can estimate prices very close to the listing price. The results show that SVR performs better than the rest of the machine learning algorithms, achieving the lowest MAPE and MAE in Table 3. We compared the performance of various machine learning regression models to find the best model for housing price prediction. To the best of our knowledge, no machine learning or other house price forecasting tools are currently in use. We strongly believe that machine learning house price prediction models will help those who work in the real estate market and potential buyers make good house purchasing decisions. In the future, this work can serve as a base for several types of studies, including real estate market analysis, stock price prediction, and oil and petroleum price forecasting. The textual tabular dataset can also be combined with visual features of the houses, such as images of their interiors and exteriors, to build more robust and novel house price prediction models. Lastly, the housing market can be influenced by macro-economic variables such as the price of gold, the stock price index, property tax, and the appraised value of a property; considering these can help develop models that estimate house prices more accurately.
Acknowledgments:
We are thankful to Dr. Muhammad Muzammal, Associate Professor, Bahria University, for his supervision and valuable suggestions during this research study.
References
1.
Ng, Aaron and Deisenroth, Marc. Peer-to-peer energy trading mechanism based on blockchain and machine
learning for sustainable electrical power supply in smart grid. Imperial College London 2015, , 142–149.
2.
Sangani, Darshan and Erickson, Kelby and Al Hasan, Mohammad. Predicting zillow estimation error using
linear regression and gradient boosting. IEEE 14th International Conference on Mobile Ad Hoc and Sensor Systems
(MASS). 2017,IEEE,530–534.
3.
Khalafallah, Ahmed et al. Neural network based model for predicting housing market performance. Tsinghua
Science and Technology. 2008,TUP,13,S1,325–328.
4.
Lim, Wan Teng and Wang, Lipo and Wang, Yaoli and Chang, Qing. Housing price prediction using neural
networks. 2016 12th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery
(ICNC-FSKD). 2016,IEEE,518–522.
5.
Chica-Olmo, Jorge et al. Prediction of housing location price by a multivariate spatial method: Cokriging. Taylor
& Francis. 2007, Taylor & Francis,29,1,91–114.
6.
Bahia, Itedal Sabri Hashim and others. A Data Mining Model by Using ANN for Predicting Real Estate Market:
Comparative Study. International Journal of Intelligence Science. 2013, Scientific Research Publishing,03,04,162.
7.
Stevens, Dick and Wubben, S and van Zaanen, MM. Predicting real estate price using text mining. Department of
Communication and Information Sciences. 2014, Tilburg University
8.
Pow, Nissan and Janulewicz, Emil.Prediction of real estate property prices in Montreal. Repéré à urlhttp://rl. cs.
mcgill. ca/comp598/fall2014/comp598_submission_99. pdf. 2014,
9.
Nghiep, Nguyen and Al, Cripps. Predicting housing value: A comparison of multiple regression analysis and
artificial neural networks. Journal of real estate research. 2001, Taylor & Francis,22,3,313–336.
10.
Park, Byeonghwa and Bae, Jae Kwon. Using machine learning algorithms for housing price prediction: The
case of Fairfax County, Virginia housing data. Expert systems with applications. 2015, Elsevier,42,6,2928–2934.
11.
Imran, S. Ahmad and D. H. Kim, "Quantum GIS Based Descriptive and Predictive Data Analysis
for Effective Planning of Waste Management," in IEEE Access, vol. 8, pp. 46193-46205, 2020, doi:
10.1109/ACCESS.2020.2979015.
Page: 11 of 12
Vol (1), Issue (1), 2021
12.
Imran; Ahmad, S.; Kim, D. Design and Implementation of Thermal Comfort System based on Tasks Allocation
Mechanism in Smart Homes. Sustainability 2019, 11, 5849. https://doi.org/10.3390/su11205849
13.
Imran; Iqbal, N.; Ahmad, S.; Kim, D.H. Health Monitoring System for Elderly Patients Using
Intelligent Task Mapping Mechanism in Closed Loop Healthcare Environment. Symmetry 2021, 13, 357.
https://doi.org/10.3390/sym13020357
14.
Jamil, Faisal and Iqbal, Naeem ,Imran and Ahmad, Shabir and Kim, Dohyeun and others. Peer-to-peer energy
trading mechanism based on blockchain and machine learning for sustainable electrical power supply in smart
grid. IEEE Access 2021,IEEE, 9,39193–39217.
15.
Wahid, F.; Fayaz, M.; Aljarbouh, A.; Mir, M.; Aamir, M.; Imran. Energy Consumption Optimization and User
Comfort Maximization in Smart Buildings Using a Hybrid of the Firefly and Genetic Algorithms. Energies 2020,
13, 4363. https://doi.org/10.3390/en13174363
16.
S. Ahmad, Imran, F. Jamil, N. Iqbal and D. Kim, "Optimal Route Recommendation for Waste Carrier Vehicles for
Efficient Waste Collection: A Step Forward Towards Sustainable Cities," in IEEE Access, vol. 8, pp. 77875-77887,
2020, doi: 10.1109/ACCESS.2020.2988173.
17.
Imran; Iqbal, N.; Ahmad, S.; Kim, D.H. Towards Mountain Fire Safety Using Fire Spread Predictive
Analytics and Mountain Fire Containment in IoT Environment. Sustainability 2021, 13, 2461.
https://doi.org/10.3390/su13052461
18.
Imran;Ahmad, Shabir and Kim, Do Hyeun.A task orchestration approach for efficient mountain fire detection
based on microservice and predictive analysis In IoT environment. Journal of Intelligent & Fuzzy Systems
2021
,
IOS Press,1–16.
19.
S. Ahmad, Imran, N. Iqbal, F. Jamil and D. Kim, "Optimal Policy-Making for Municipal Waste Management
Based on Predictive Model Optimization," in IEEE Access, vol. 8, pp. 218458-218469, 2020, doi:
10.1109/ACCESS.2020.3042598.
20.
Imran; Ghaffar, Z.; Alshahrani, A.; Fayaz, M.; Alghamdi, A.M.; Gwak, J. A Topical Review on Machine
Learning, Software Defined Networking, Internet of Things Applications: Research Limitations and Challenges.
Electronics 2021, 10, 880. https://doi.org/10.3390/electronics10080880
21.
Yadav, M., Goyal, N. (2015). Comparison of Open Source Crawlers-A Review. International Journal of Scientific
Engineering Research, 6(9), 1544-1551.
22. Michael Mahoney. Orbitz Sued by Southwest Airlines. 2001, E-Commerce Times.
23. Inc v. Power Ventures. 2012, Facebook, Inc v. Power Ventures,844 F.Supp.2d 1025 (E.D. Cal).
24. Daniel J. Gervais. The Protection of Databases. 2007, 92 CHI.-Kent L. Rev. 1109.
25. Report. Intel Corp. v. Hamidi. 2003, 71 P.3d 296.
26.
Hirschey, Jeffrey et al. Symbiotic Relationships: Pragmatic Acceptance of Data Scraping. SSRN Electronic
Journal. 2014, SSRN.
27. Weisberg, Sanford. Applied linear regression. 2005,John Wiley & Sons.
28.
Panigrahi, S. S., Mantri, J. K. (2015, October). Epsilon-SVR and decision tree for stock market forecasting. In
2015 International Conference on Green Computing and Internet of Things (ICGCIoT) (pp. 761-766). IEEE.
29.
Vinod, H. D. (1978). A survey of ridge regression and related techniques for improvements over ordinary least
squares. The Review of Economics and Statistics, 121-131.
30.
Gluhovsky, I. (2011). Multinomial least angle regression. IEEE transactions on neural networks and learning
systems, 23(1), 169-174.
31. Li, Q., Lin, N. (2010). The Bayesian elastic net. Bayesian analysis, 5(1), 151-170.
32.
Efron, Zemel, R. S., & Pitassi, T. (2001). A gradient-based boosting algorithm for regression problems. In
Advances in neural information processing systems (pp. 696-702).
33. Breiman, L. (2001). Random forests. Machine learning, 45(1), 5-32.
34. Ruder, S. (2016). An overview of gradient descent optimization algorithms. arXiv preprint arXiv:1609.04747.
35.
Nishikawa, H., Arita, K., Tanaka, K., Hirao, T., Makino, T., Matsuo, Y. (2014, August). Learning to generate
coherent summary with discriminative hidden semi-markov model. In Proceedings of COLING 2014, the 25th
International Conference on Computational Linguistics: Technical Papers (pp. 1648-1659).
Page: 12 of 12