Customer Segmentation via Data Mining Techniques: State-of-the-Art Review

January 2022

DOI:10.1007/978-981-16-9447-9_38

In book: Computational Intelligence in Data Mining (pp.489-507)

Authors:

Das Saumendra

GIET University Gunupur

Saumendra Das and Janmenjoy Nayak

Abstract Customers are increasingly vigilant, informed, and dynamic. They change their preferences and habits according to their needs. Knowing those needs is an important part of marketing, where a company must identify its loyal customers within a heterogeneous market. Dividing a heterogeneous market into homogeneous groups is termed customer segmentation. Customer segmentation is an integral part of marketing: it lets companies organize large volumes of customer data and develop relationships with customers more easily. Uncovering the hidden knowledge in customer data is a task for computational analysis, which extracts accurate information about the tastes and preferences of the customer. This type of computational analysis is termed data mining. This paper presents a systematic review of customer segmentation via data mining techniques, covering the supervised, unsupervised and other data mining techniques used in segmentation.

Keywords Customer segmentation · Data mining · Supervised · Unsupervised

1 Introduction

Understanding consumer behaviour is a resourceful idea in marketing that makes customers profitable. Manufacturers aim to provide high-quality goods or services that fulfil customers’ needs and wants, and doing so requires adequate knowledge. The needs and wants of customers are observed closely through their habits and preferences, so knowledge is an important asset for companies that want to make customers loyal. A marketer should assemble this information seamlessly and provide customized services at each point of delivery to avoid a negative reaction from consumers [1]. Over the years, consumer behaviour has changed continuously. Consumers are now more volatile than before and often change their habits and preferences. It is therefore practically impossible for a seller or manufacturer to identify consumers’ needs and wants in mass markets. Dividing the market into groups or sub-groups is known as segmentation. Different experts have justified and explained the concept of segmentation as a way to identify the needs and wants of customers rationally. This strategic application of market targeting helps to anticipate consumer reactions, because consumers may have varied preferences for goods or services according to their profile [2]. Nevertheless, the selection of a segmentation technique depends on the input variables, such as the geographic, demographic, behavioural, or psychological profile of consumers, analysed with statistical or non-statistical approaches.

S. Das
Department of MBA, Aditya Institute of Technology and Management (AITAM), Tekkali 532201, India

J. Nayak (corresponding author)
Department of Computer Science, Maharaja Sriram Chandra BhanjaDeo (MSCB) University, Baripada, Odisha 757003, India
e-mail: jnayak@ieee.org

J. Nayak et al. (eds.), Computational Intelligence in Data Mining, Smart Innovation, Systems and Technologies 281, https://doi.org/10.1007/978-981-16-9447-9_38

According to Smith [3], segmentation is a distinctive marketing strategy closely associated with product differentiation and homogeneity. Customers can obtain a variety of alternatives from manufacturers, and in this diversified market structure manufacturers may be unsure how to select or retain customers. To attract and retain customers, marketers often rely on selective techniques such as advertising or sales promotion rather than trying to understand the customer’s motives. When the mass market is treated as a whole, it is difficult to identify customers’ needs and wants through promotional techniques alone. Customer segmentation therefore offers the marketer a way to provide preferred goods or services to the customer. The basic idea of customer segmentation is to cluster (group) customers in order to identify, understand and target their needs. The concept was introduced by Smith in 1956 as an unconventional technique for product differentiation strategy. A segment or group can be described as a set of customers who have similar demographic, psychological, and behavioural profiles [4]. In this information and communication age, the selection of segmentation techniques has become a sophisticated area of research, particularly in data mining (DM) and database management systems (DBMS). With very large data sets, traditional market forecasting techniques are losing their usefulness, and several statistical techniques, such as multivariate analysis and time series, also fail to perform satisfactory clustering or segmentation. In this connection, knowledge management technologies based on soft computing and hard computing, such as data mining, machine learning, and artificial intelligence, are expected to help solve market-related problems [5].

In today’s competitive world, most sellers want to know the needs and preferences of their customers, and they work to maintain good relationships with customers at every stage of business operations. Maintaining a good relationship with the customer is known as customer relationship management (CRM), which is becoming an integral part of marketing strategy. With the proliferation of the Internet, relationship management has become popular because of several computational approaches. Companies and customers can interact with and understand each other by learning the hidden knowledge in enormous quantities of data. Understanding and analysing this hidden knowledge is the task of data mining. Data mining is a computational analysis process that discovers consumers’ tastes and preferences through customer segmentation, dividing huge sets of data [6]. The data mining approach is also useful for manufacturers whose products lose quality as they decay. In this case, the recency, frequency, and monetary (RFM) form of segmentation failed to quantify exact preferences, unlike other methods such as the Fuzzy Analytic Network Process (FANP) [7]. Data mining techniques are useful for profiling the customer base, targeting, aligning the right channels, cross-selling products, enhancing customer relationships and providing value to the customer [8]. Prioritising customers within the existing customer base is also an important data mining task. Importance-performance analysis (IPA) is likewise used within data mining to improve service quality and product effectiveness [9]. Customer segments are highly volatile; they may change with customer preferences, which creates uncertainty about when the segments should be re-computed. These uncertainties call for processing the data as a stream, which data mining can then cluster, so that customer segmentation is performed continuously [10]. Data mining techniques also predict future probabilities and behaviours, allowing businesses to be more practical and knowledge-driven [11], and they support customer segmentation functions [12]. Data mining has also been used to classify blogs with supervised and unsupervised learning models for extracting knowledge from voice over Internet protocol (VoIP) [13].

After a meticulous screening of 550 academic publications, 57 research articles and 17 conference papers were selected for this review. This paper discusses customer segmentation via data mining techniques from a review perspective and systematically investigates supervised, unsupervised and other data mining techniques. Supervised approaches, such as neural networks, naive Bayes, linear regression, logistic regression, support vector machines (SVM), K-nearest neighbour, boosting, decision trees (DT), hidden Markov models (HMM), and random forests, have contributed enormously to object detection and classification. Unsupervised approaches place more emphasis on the complex classification of data and on the identification and processing of variables, through K-means clustering, K-nearest neighbours (KNN), hierarchical clustering, anomaly detection, neural networks, principal component analysis, independent component analysis, the apriori algorithm, etc. Some research articles on other data mining techniques, such as the chi-square automatic interaction detector (CHAID), RFM, genetic algorithms (GA), and logistic regression, address classification and relationship management. The paper is organised into five sections. Section 2 presents various issues involved with customer segmentation. Section 3 explains various segmentation techniques. Section 4 presents a critical investigation. Section 5 provides the discussion and conclusion.

2 Various Issues Involved with Customer Segmentation

Consumers have different needs and expectations according to their characteristics. The consumer behaviour literature reports several segmentation variables, such as demographic, geographic, psychographic, decision-making, behavioural, purchase behaviour, personality, lifestyle, and situational factors. From a broader perspective, researchers have classified customer segmentation into four major areas: geographic characteristics, demographic profile, psychographic profile, and behavioural aspects. Other researchers have classified the variables into two distinct forms, observed and unobserved. Among the observed variables, geographic, demographic, and socio-economic variables are general, whereas purchase frequency and customer loyalty are product-specific or brand-specific. Among the unobserved variables, lifestyle and psychographics are general, whereas product benefits, intention, preference, etc. are product-specific [5]. Customer segmentation is thus an emerging area of research with several open issues in reflecting consumer behaviour towards a product or brand. In the decision-making process, customer segmentation is an integral part of marketing strategy: it builds customer relationships, separates customers into different groups, and provides different facilities in niche markets. For mobile users in particular, VIP customer segmentation can readily identify their needs and facilitate the service [14].

The rapid development of computer technologies across the globe has changed the tastes of telecom subscribers. Telecom companies now need to understand the characteristics of their consumers in order to provide distinct services. Segmentation is viewed as the way to cluster customers on different bases and to provide services that attract and retain them [15]. Customer segmentation is also important in the retail sector. With a huge quantity of customer data, a retail firm may not be able to keep every customer informed. Data mining techniques can help to mine the data on lost customers and help the retailer to build customer relationships [16]. In this regard, customer segmentation provides a wealth of information about customers. It is a strategic resource for an enterprise to gain competitive advantage and make customers profitable [17]. Segmentation is important for providing customer lifetime value (LTV), but this notion has become vague because of intense competition. Therefore, customer values such as current value, potential value, and customer loyalty are important assets for any marketer seeking to understand the customer better [18]. Customer value has also been classified via the RFM model and rough set theory (RST) to understand customers and maintain the relationship [19]. According to the previous literature, segmentation involves several critical issues: problem recognition, research design, data collection, data analysis, and implementation [5]. Table 1 depicts the major issues related to customer segmentation.

Table 1 Issues related to customer segmentation

Issues of customer segmentation | Major consideration

Recognize the problem
• Segmentation concept
• Information related to customer
• Classification of the variables
• Customer segmentation base selection
• Finance and other limitations

Design the research
• Collection of data
• Instrument validity
• Objectives of segmentation
• Stability of variables

Data collection
• Source of data

Data analysis
• Data analysis and classification of segmentation
• Clustering data sets
• Reliability and validity of information

Implementation
• Implement on target customer
• Select segments

Customer segmentation offers tactical decision support for services and profitability. It supports all kinds of business decisions for financial growth and development. A good customer segmentation method is therefore a systematic way of defining the tools that help the business to grow and develop, and selecting the right tools requires cross-functional effort aligned with the business goal. Customer segmentation has pros and cons when classifying customers into different profiles. It can help to acquire, retain, and attract customers, and it clusters customers according to market demand. However, it succeeds only when data interpretation, knowledge discovery, and information dissemination are done accurately; with inexact information it is often ineffective. The manual process of segmentation is time-consuming, unscalable, and not agile, and therefore does not support one-to-one marketing. With recent technologies such as data mining, artificial intelligence, and machine learning, accurate segmentation becomes possible and makes the customer profitable.

3 Segmentation Techniques

In general, customer segmentation involves a broad variety of techniques, including cluster analysis [10], cluster-wise regression, AID/CHAID, multiple regression, discriminant analysis, latent class structure, inductive learning techniques, soft computing techniques [5], and data mining (detailed in the following subsections), which are used in different market conditions. However, it is difficult to classify groups of customers according to their attributes, so the classical method has to be considered. In the classical theory, some researchers emphasize a data preparation framework and a data analysis framework, the latter comprising supervised, unsupervised and other data mining approaches (Fig. 1). Techniques related to artificial neural networks (ANNs), fuzzy logic (FL), machine learning (ML), RST and evolutionary methods (EM) such as GA are the main data mining tools, and they have been widely used in both data preparation and data analysis. Choosing the right technique or algorithm is a challenging task for modern marketing professionals, because each algorithm has significant advantages and disadvantages. To narrow the choice, researchers should first decide between a supervised and an unsupervised approach. The supervised approach learns a mapping from inputs to known outputs, as in classification. Common supervised algorithms include support vector machines, logistic regression, artificial neural networks, naive Bayes, and random forests. These approaches follow a hierarchical process that relates the input data to the output data. Unsupervised approaches cluster data according to its inherent structure. Familiar algorithms include k-means clustering, principal component analysis, and autoencoders. Since no labels are provided, there is no standard way to compare model performance in most unsupervised approaches. In this connection, DM techniques using neural networks, decision trees, genetic algorithms, fuzzy logic, and K-nearest neighbour can predict, comprehend, and cluster customers [20]. Besides the non-traditional methods, some traditional techniques such as self-organizing maps (SOM) can also be used for segmentation. In this approach, a set of initial cluster prototypes is built before K-means is applied to obtain the final clusters, which can then be inspected visually. Some researchers report that the U-matrix is also one of the best options for clustering the data and for analysing the results by time of hits.

Fig. 1 Customer segmentation techniques. Segmentation techniques divide into a data preparation framework and a data analysis framework; the data analysis framework comprises the supervised approach, the unsupervised approach, and other data-mining approaches.
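The clustering step described above can be illustrated with a minimal k-means sketch. It is not taken from the reviewed studies: the two features (annual spend, visits per month), the toy data, and the function name `kmeans` are assumptions for illustration, and the initial centroids are simply the first k points rather than SOM prototypes.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain k-means; the first k points serve as initial centroids (an assumption)."""
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return labels, centroids

# Hypothetical customers: (annual spend in thousands, visits per month)
customers = [(2.0, 1.0), (9.5, 8.0), (2.5, 1.5), (10.0, 9.0), (1.5, 0.5), (9.0, 8.5)]
labels, centroids = kmeans(customers, k=2)
```

On this toy data the low-spend and high-spend customers end up in separate clusters. Because no labels exist, judging whether two clusters is the right number would need an external criterion, which is the evaluation difficulty noted in the text.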

3.1 Data Preparation Framework

Data preparation is the systematic transformation of raw data into a clean form suitable for predictive analysis, with errors and mistakes removed. It is a challenging step on which proper prediction depends. Automatic searches, such as grid search and random search, can be used to find a suitable configuration of preparation steps. Often it is difficult to gather a variety of data. For example, the data for classification or regression might be stored in a CSV file consisting of rows, columns, and values, which any data preparation method must handle. Most authors state that data preparation relies on statistical and non-statistical techniques. Statistical techniques such as exploratory factor analysis and correspondence analysis, and computational techniques such as soft computing tools (e.g. RST or GA), are typically used in data preparation [5]. Exploratory factor analysis (EFA) is a common multivariate statistical method for uncovering the underlying structure of a relatively large set of variables. Researchers most often use it for scaling data collected through questionnaires. EFA is more reliable when each factor is represented by multiple measured variables. It is based on common factors, unique factors, and errors of measurement, and with the EFA model the common factors and the related manifest variables can be identified. Correspondence analysis (CA) is an extension of principal component analysis suited to discovering relationships among qualitative variables (categorical data). Like principal component analysis, it summarizes and visualizes the data in two-dimensional plots. CA is a geometric approach for visualizing the rows and columns of a two-way contingency table, and its main aim is to provide a global view of the data for easy interpretation. However, these statistical techniques are increasingly being replaced by soft computing for segmenting or classifying data. Soft computing (SC) is an improvement over conventional systems and is used alongside hard computing. It has many intelligent and user-friendly features. Soft computing consists of FL, ANNs, RST, and EM. Its principal role is to handle the uncertainty and vagueness of data through fuzzy tools, while EM is involved in the optimization and search process. ANNs and RST address classification and rule generation problems. Recently, soft computing technologies have been used to resolve data mining problems and are widely used for the analysis and interpretation of data. RST is a mathematical, granular approximation method, widely used in soft computing, that discovers hidden patterns in uncertain environments. Soft computing is therefore a computational method that is useful for data preparation.
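As a concrete illustration of the CSV case mentioned above, the following sketch reads customer records, drops rows with missing values, and rescales each numeric column to [0, 1]. The column names, the sample data, and the helper `prepare` are hypothetical; the review does not prescribe a specific preparation pipeline.

```python
import csv
import io

# Hypothetical raw customer data; customer c2 has a missing age.
RAW = """customer_id,age,annual_spend
c1,25,1200
c2,,800
c3,40,3000
c4,55,2100
"""

def prepare(csv_text, numeric_columns):
    """Drop incomplete rows, then min-max scale the numeric columns to [0, 1]."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    # Remove rows with a missing value in any numeric column.
    complete = [r for r in rows if all(r[c].strip() for c in numeric_columns)]
    for col in numeric_columns:
        values = [float(r[col]) for r in complete]
        low, high = min(values), max(values)
        span = (high - low) or 1.0  # avoid division by zero for constant columns
        for r, v in zip(complete, values):
            r[col] = (v - low) / span
    return complete

clean = prepare(RAW, ["age", "annual_spend"])
```

Dropping incomplete rows is only one option; imputing missing values is a common alternative when data are scarce. Rescaling matters for distance-based segmentation methods, where a feature with a large numeric range would otherwise dominate.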

3.2 Data Analysis Framework

Segmenting customers on bases such as geography, demography, psychographics, and behaviour is a straightforward way of classifying customer data to analyse customers’ needs and expectations. The various approaches to classifying the market into groups are popularly known as cluster analysis. Calantone and Johar [21] proposed that cluster analysis can classify customer data explicitly. They proposed that the benefits sought by customers should be analysed properly in the tourism industry, where marketing strategy formulation, such as understanding customers, product positioning, advertising copy testing, and new market development, helps to establish the market. Statistical analyses such as factor analysis may extend the resulting output. In this context, computational approaches, namely supervised, unsupervised and other data mining approaches, are widely used for data analysis.

3.2.1 Supervised Approach

A supervised approach is an application of artificial intelligence (AI) in which a computer algorithm is trained on input data paired with known outputs. It uses data that have been labelled according to the specific question being asked. The supervised approach is one of the most widely used learning approaches in machine learning, useful for forecasting financial results, identifying fraud, recognizing objects in images, and evaluating risk. Because the input and output data are known in advance, the model can be trained for better prediction with the appropriate classification. Object detection is an important application of the supervised approach in computer vision. Classical object detection approaches, such as background subtraction and saliency detection, do not require manual collection and labelling of samples, and they are not trained on labelled data as supervised approaches are. However, they are strongly affected by noise, such as changes in luminance and cluttered backgrounds. Supervised approaches such as support vector machines, boosting and decision trees perform well in object detection, but they need substantial human effort to label the training data. In this connection, Wang et al. [22] developed a model that avoids manual labelling for object detection in videos: an extension of the boosting algorithm (soft-label boosting) trains on samples with soft (probabilistic) labels in place of hard (binary) labels. Tracking emotions in images or video clips is another important application of the supervised approach.

Malandrakis et al. [23] proposed an emotion tracking system for movies in which the valence-arousal scale was tracked using a continuously annotated database. Their supervised approach uses hidden Markov models (HMMs), one for each dimension, to predict arousal and valence in the movie. They reported that emotions could be captured and detected at a fine-grained (microscopic) level. Evaluation of the supervised approach is also important for image segmentation, where a proper algorithm must be chosen [24]. Sentiment analysis (SA) is a recently emerged research topic that opens new opportunities for business people, writers, and bloggers. It is a computational approach for estimating the degree of product acceptance and rejection, on which businesses can build strategies to improve product performance. In this regard, opinion mining with supervised machine learning models makes it possible to infer the customer’s intention [25]. The supervised approach has also been used to detect musical boundaries between verse and chorus segments, where perceptual aspects of music such as timbre, harmony, melody, and rhythm are combined through boosting [26]. Graph-based spectral algorithms are a recent research topic; they detect image objects by clustering the image into meaningful larger structures [27]. The fault diagnosis system (FDS) is another application of supervised learning, using a support vector machine for decision-making [28]. The decomposition of nuclear waste objects through robotics is a matter of concern, where RGBD-based detection and categorization is performed by a deep convolutional neural network (DCNN) trained from unlabelled RGBD videos. This helps to build an object detection benchmark for recognizing waste objects [29]. Supervised learning is thus a leading family of algorithms for identifying, clustering and recognizing data in order to perceive individual customer expectations. This type of segmentation helps researchers and business leaders to improve product quality and meet the needs of customers.
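To make the input-output mapping of the supervised approach concrete in a customer setting, here is a minimal k-nearest-neighbour classifier, one of the supervised methods listed in the introduction. The features, the segment labels ("loyal", "occasional"), and the function `knn_predict` are illustrative assumptions, not data or methods from the cited studies.

```python
import math
from collections import Counter

# Hypothetical labelled training data:
# (purchases per year, average basket value) -> segment label
TRAIN = [
    ((24, 80.0), "loyal"),
    ((30, 95.0), "loyal"),
    ((20, 70.0), "loyal"),
    ((3, 20.0), "occasional"),
    ((5, 35.0), "occasional"),
    ((2, 15.0), "occasional"),
]

def knn_predict(train, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

print(knn_predict(TRAIN, (22, 75.0)))  # a frequent, high-value shopper
print(knn_predict(TRAIN, (4, 25.0)))   # an infrequent, low-value shopper
```

The labels are what make this supervised: someone had to assign a segment to each training customer beforehand. In practice the features would usually be rescaled first so that neither dominates the distance.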

3.2.2 Unsupervised Approach

An unsupervised approach is a type of algorithm that learns patterns from unlabelled data. In particular, it captures patterns as neural predictions or probability densities, and it can generate new content from its internal representation of the data. Unlike the supervised approach, it requires no human labelling; instead, the data are segmented by neural network and probabilistic methods. It finds interesting patterns in unlabelled data, for example sensor data, without prior information. One popular use of the unsupervised approach in data mining is the activity recognition task. Although it involves no human interaction, it can classify complex data effectively in the customer segmentation process through pattern recognition. Data sets with many features and few instances are a relatively challenging task for machine learning. With so many features, a data set may contain irrelevant or redundant information that harms accuracy or training time. To deal with these situations, feature selection (FS) and feature discretization (FD) methods are helpful. In the pre-processing stage, some classification algorithms deal only with discrete features, and the FD technique finds a discrete representation of each feature. FS, on the other hand, aims at dropping features to counter the curse of dimensionality, often allowing learning algorithms to produce better-performing classifiers. Feature discretization-based algorithms can therefore reduce redundancy and classify the data set [30]. In an unsupervised approach, fuzzy-based clustering is evaluated through the fuzzy joint points (FJP) method, where the data set is classified in hierarchical order [30].
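The two pre-processing ideas above can be sketched as follows: equal-width binning as a simple unsupervised feature discretization, and a variance threshold as a simple unsupervised feature selection. Both are generic textbook techniques chosen for illustration; they are not the specific algorithms of [30], and the data, names, and threshold are assumptions.

```python
from statistics import pvariance

def discretize(values, bins):
    """Equal-width binning: map each value to a bin index in 0..bins-1."""
    low, high = min(values), max(values)
    width = (high - low) / bins or 1.0  # constant feature: everything falls in bin 0
    return [min(int((v - low) / width), bins - 1) for v in values]

def select_features(table, threshold):
    """Keep features whose population variance exceeds the threshold."""
    return [name for name, column in table.items() if pvariance(column) > threshold]

# Hypothetical customer features (one list per feature, one entry per customer)
features = {
    "age": [20, 35, 50, 65, 80],
    "country_code": [1, 1, 1, 1, 1],  # constant, so it carries no information
    "visits": [2, 9, 4, 7, 3],
}

age_bins = discretize(features["age"], bins=3)
kept = select_features(features, threshold=0.0)
```

Neither step looks at class labels, which is what makes them unsupervised. The constant `country_code` column is dropped because it cannot help separate customers.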

DNA array analysis measures the expression of multiple genes and can be carried out with an unsupervised approach. Just as in supervised learning, a two-way clustering framework can identify gene patterns and perform cluster discovery on samples, revealing connectivity among groups of genes [31]. Speech recognition and the grouping of voices through co-channel (two-talker) speech separation is also part of the unsupervised learning approach. For voice segregation and speech segmentation, an algorithm such as the tandem algorithm is used to separate the unvoiced speech [32]. The unsupervised approach is also applied to the summarization of opinions, where a state-of-the-art algorithm produces summaries that are informative and readable [33]. It is also used for human activity recognition from raw wearable-sensor data [34]. Segmentation of the data is possible through multidimensional time series using a hidden Markov model, which predicts human activity. Automatic document summarization is a recent development in which algorithms break the document into words, sentences, and phrases, process them, and account for the relevance, redundancy, and length of the document while summarizing it [35]. Researchers have used the unsupervised learning approach for many purposes, such as facial landmark detection, protocol feature word extraction, product attribute extraction, clustering of image pixels, and so on.

3.2.3 Other Data Mining Approaches

In recent years, customer segmentation in direct marketing has become more effective with the development of database marketing techniques. These data mining approaches help direct marketers segment customers more precisely, so that each segment can be served with a different marketing strategy. Approaches such as CHAID, RFM, GA, and logistic regression were used as analytical tools for direct marketing segmentation on two types of data sets. Among these approaches, RFM was reported to perform best, although CHAID was also a strong solution for splitting the data sequentially. An empirically based RFM approach could therefore replace both CHAID and logistic regression in database marketing systems [36]. Several studies likewise show that RFM has been widely used to segment customers and to access information about them. Marketing representatives of commercial banks can segment customers through k-means clustering to identify potential customers. To obtain useful background information about credit card holders, four data mining methods can be helpful: neural networks, C5.0, classification and regression trees, and the chi-squared automatic interaction detector [37,38]. Market segmentation plays a key role in sustaining the relationship with loyal customers, which requires a close connection between the retailer and the customer. Using the divisive cluster analysis technique of data mining, the retailer can extract a wide range of information from the customer database [39].
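As a rough illustration of the RFM idea discussed above, the following Python sketch scores each customer on recency, frequency, and monetary value by rank. The transaction layout, the 1 to 3 scoring scale, and the function name rfm_scores are assumptions made for this illustration; they are not taken from the studies cited.

```python
from datetime import date


def rfm_scores(transactions, today, bins=3):
    """Return {customer: (R, F, M)}, each score in 1..bins (higher is better).

    transactions: iterable of (customer_id, purchase_date, amount) tuples.
    This tuple layout is a hypothetical format chosen for the sketch.
    """
    # Collapse transactions to (last purchase date, purchase count, total spend)
    summary = {}
    for cust, when, amount in transactions:
        last, freq, money = summary.get(cust, (None, 0, 0.0))
        last = when if last is None or when > last else last
        summary[cust] = (last, freq + 1, money + amount)

    def rank_score(pairs, reverse=False):
        # Score by rank position; tied values keep their input order,
        # so ties may receive different scores in this simple sketch.
        ordered = sorted(pairs, key=lambda kv: kv[1], reverse=reverse)
        n = len(ordered)
        return {cust: 1 + (i * bins) // n for i, (cust, _) in enumerate(ordered)}

    recency = [(c, (today - s[0]).days) for c, s in summary.items()]
    frequency = [(c, s[1]) for c, s in summary.items()]
    monetary = [(c, s[2]) for c, s in summary.items()]

    r = rank_score(recency, reverse=True)  # fewer days since last purchase scores higher
    f = rank_score(frequency)
    m = rank_score(monetary)
    return {c: (r[c], f[c], m[c]) for c in summary}
```

Customers with a high score on all three dimensions would typically be treated as the most valuable segment; how the three scores are combined into segments is a design choice left open here.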

The advent of technology for data optimization and screening has given data mining an important means of mining vast data sets and classifying the market accordingly. In particular, ANN and particle swarm optimization (PSO) methods are recent developments for market decision strategy. By integrating statistical analysis with particle swarm optimization, redundant data can be reduced and the market segmented properly [40]. Data mining techniques have become indispensable in market segmentation. The classification of large data sets from databases is a recent form of market research in which intelligent solutions, such as neural networks, evolutionary algorithms (EA), fuzzy theory, RFM, hierarchical clustering, K-means, bagged clustering, kernel methods, the Taguchi method, multidimensional scaling, model-based clustering, rough sets, and others, can be effective and timely [41]. Clustering is therefore an important feature of data mining, in which latent class analysis (LCA), prior clustering, and similarity or distance measures are used to segment large groups of customers according to individual expectations [42]. From the research articles reviewed, we can confirm that data mining is widely used for the exploration and prediction of expected outcomes in heterogeneous markets. Data mining is used for classification, clustering, association, and sequential analysis. In this regard, statistical applications such as regression, time series, association, and sequential analysis are beneficial for mining large data sets [43].
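K-means recurs throughout the techniques listed above, so a minimal sketch of it may help make the clustering step concrete. The version below is plain Lloyd's algorithm in pure Python; seeding the centroids with the first k points is an assumption made only to keep the sketch deterministic, and the function name kmeans and the two-feature customer data are hypothetical.

```python
import math


def kmeans(points, k, iterations=100):
    """Lloyd's algorithm on a list of equal-length numeric tuples.

    Initial centroids are the first k points (an assumption for this
    sketch; random or k-means++ seeding is more usual in practice).
    """
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centroids.append(centroids[j])  # leave an empty cluster in place
        if new_labels == labels and new_centroids == centroids:
            break  # nothing changed, so the algorithm has converged
        labels, centroids = new_labels, new_centroids
    return labels, centroids


# Hypothetical customers described by (annual spend, visits per month)
customers = [(1.0, 1.0), (9.0, 8.0), (1.5, 2.0), (8.0, 9.0), (1.0, 1.5), (9.0, 9.5)]
segments, centres = kmeans(customers, k=2)
```

In real segmentation work the features would normally be standardized first, since Euclidean distance is sensitive to the scale of each variable.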

4 Critical Investigation

Customer segmentation is an integral approach to targeting the customer and positioning the brand in the customer's mind. Several approaches, including supervised, unsupervised, and other data mining techniques, have been used by various researchers since 1990, and classifying and clustering large data sets has become part of customer segmentation. In this paper, we extracted articles on customer segmentation from various online bibliographic sources of academic articles, such as the ABI/INFORM database, ScienceDirect, Emerald, IEEE Transactions, JSTOR, Springer, Google Scholar, and Wiley Online Library. The articles were searched using keywords such as customer segmentation, market segmentation, and customer segmentation and data mining. Out of 550 articles, the relevant literature on customer segmentation using data mining techniques was considered for this state-of-the-art review. In this paper, we considered almost 57 articles and 17 conference papers. After detailed observation of the literature, it was found that data mining techniques like K-means, RFM, GA, and other algorithms are used in research to classify large data sets in order to target customers and create meaningful marketing strategies.


4.1 Impact of Segmentation Variables

Consumers have an extensive variety of characteristics. Based on these characteristics, we can find several segmentation variables, such as geographic, demographic, firmographic, decision-making processes, situational factors, personality, profitability, benefits sought, and so on. According to Kotler and Keller [44], segmentation variables are classified into four important areas: demographic features, geographic characteristics, psychographic variables, and behavioural variables. On the other hand, several authors have articulated levels of variables, e.g. general, domain-based, and brand-specific variables, as well as their objective or subjective nature. The large number of variables developed over different periods poses a massive challenge: too many have been proposed for the marketer to compare them all empirically when trying to segment a market. In this regard, the classification can be broadly divided into general observed variables (e.g. geographic features, demographic profile, socio-economic variables), general unobserved variables (e.g. lifestyle and psychographics), product-oriented observed variables (e.g. usage frequency and loyalty), and product-oriented unobserved variables (e.g. benefits, preferences, and intentions) [45]. Therefore, the selection of the proper segmentation variable is a significant point to consider. Wind [46] noted that most segmentation studies concerned consumer goods; however, the process of segmentation is also applicable in the industrial market. So, before selecting an appropriate segmentation method, we must think about the problems and prospects of segmentation. To select the proper method, the a priori segmentation design and the cluster-based design are the most essential. In a priori segmentation designs, the marketer segments by product purchase, loyalty, and type of customer, whereas in cluster-based designs the segments are determined by the benefits, needs, and attitudes of customers. Further, the advantages and disadvantages of each segmentation approach also need to be weighed. After reviewing the academic literature, we found that the various segmentation models receive roughly equal emphasis. However, we must be careful to select the segmentation method based on management's specific objectives and on current trends in the consumer market (Table 2; Fig. 2).

Table 2 Impact study of segmentation variables

| Types of segmentation | Focus area | References |
| General observable variables | Demographics | [47–50] |
| | Socio-economic | [51,52] |
| | Behavioural | [53,54] |
| | Cultural | [55,56] |
| General unobservable variables | Lifestyle | [50,57–60] |
| | Psychographic | [61–64] |
| Product-specific variables | Usage frequency | [65] |
| | Loyalty | [66–68] |
| Product-specific unobservable variables | Benefits | [69,70] |
| | Attitude | [71,72] |

Fig. 2 Types of customer segmentation variables (chart legend: general observed variable, general unobserved variable, product-oriented observed variable, product-oriented unobserved variable)

4.2 Model Reliability in Segmentation

Despite the importance of segmentation analysis on different data sets, little attention has been paid to checking reliability and validity. Some variables, like demographics (age, gender, income, religion, etc.), are more reliable than behavioural or psychological characteristics. In the case of an attitude survey in particular, proper care should be taken and the reliability of the data should be tested. To check the reliability of data, statistical measures like factor analysis, conjoint analysis, correlation, component matrix, etc. are beneficial. However, these traditional methods may not provide accuracy because of several exceptions related to the number of items. In this connection, perceptual studies, such as a generalization of data, could provide better analytical results. Therefore, there is a need for instrument development in data reliability [46]. Commonly, there are two potential approaches to measuring reliability: degree of consistency and cross-validation [73]. The former can be executed through repeated clustering or classification of the data sets, which requires verification multiple times. The latter can be performed by dividing the data into two parts and performing the analysis on each to check the reliability of the sample parts. When clustering is used, the latter method can be modified by obtaining the cluster centroids from the first part and using them to describe clusters in the second part. Cross-validation is a more generalized approach than the first. For cross-validation, the discriminant measures Wilks' lambda (λ) and the kappa index are the most widely applied methods in marketing research [74]. Before examining the empirical task, we should first consider whether any type of reliability has been taken into account. The distance between the clusters should be measured through the sums of squares within and between the clusters, a scatter matrix of data points, and indexes. Further, different indexes can be employed to determine the number of fuzzy clusters in the data sets, and some indexes also compare the clusters. Hence, the data sets should be checked and rechecked through a proper method to test their reliability.
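The split-sample check described above can be sketched in a few lines of Python. The sketch assumes that the first half of the data has already been clustered (giving centroids_first) and that the second half has been clustered independently with its cluster numbers aligned to the first half; both inputs, and the function names, are hypothetical. Agreement between the two labellings is summarized with Cohen's kappa, used here as one common form of the kappa index.

```python
import math
from collections import Counter


def assign_to_centroids(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
        for p in points
    ]


def cohen_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two labellings of the same items."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(
        count_a[c] * count_b[c] for c in set(count_a) | set(count_b)
    ) / (n * n)
    if expected == 1.0:
        return 1.0  # both labellings use one identical label; treated as full agreement
    return (observed - expected) / (1 - expected)


# Hypothetical inputs: centroids from part one, points and labels from part two
centroids_first = [(1.0, 1.0), (9.0, 9.0)]
second_half = [(1.2, 0.8), (8.5, 9.2), (2.0, 1.5), (9.5, 8.0)]
labels_second = [0, 1, 0, 1]  # assumed already aligned with centroids_first

labels_from_first = assign_to_centroids(second_half, centroids_first)
agreement = cohen_kappa(labels_from_first, labels_second)
```

A kappa close to 1 suggests that the segment structure found in one part of the data reappears in the other, while a value near 0 indicates agreement no better than chance.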

4.3 Selection of Proper Data Mining Model

Data mining is the procedure of analysing large volumes of data to gain business insight, which helps companies resolve problems, mitigate risks, and grasp new opportunities. This branch of data science takes its name from the similarity between searching for important information in a large database and mining a mountain for ore: both processes require sifting through enormous amounts of material to find hidden value. Data mining can answer many kinds of business questions that traditionally took more time to resolve manually. Using a wide range of statistical techniques to analyse data from different perspectives, users can identify patterns, trends, and relationships. Customer segmentation is a central concern of market analysis, where proper data classification is important. Although several statistical techniques are applied to customer databases, data mining techniques can help predict, analyse, and profile the customer in a significant way. Much of the academic literature has stressed the importance of various data mining techniques, including supervised, unsupervised, and other approaches, but it can be difficult for researchers to identify the right technique for their study. Researchers should therefore have domain knowledge of the business, of the techniques, and of a fitness model. Here, we propose a data mining model (Fig. 3) based on the suitability of customer needs and expectations.

Fig. 3 Proposed model for selection of data mining techniques

Criteria of selection: computational complexity, optimization, flexibility, scalability, interpretability, encoding the problem, accessibility

Data mining techniques: ANN, GA, RFM, SVM, EA, CHAID, K-means, bagged clustering, kernel methods, multidimensional scaling, Taguchi method, model-based clustering, rough set

Data mining tasks: classification, prediction, association, cluster analysis, time series, regression

5 Discussion and Conclusion

Customer segmentation using data mining is a recent field of study in which most of the academic literature focuses on the classification of data, while some studies also emphasize different clustering methods. However, the selection of segmentation techniques is a challenging task for a business. In selecting a segmentation approach, we must think about two important aspects: the objective of management and the recent trends in the market. Classical methods like factor analysis, regression, conjoint analysis, or coefficients of determination may not provide accurate predictions. In this review, we therefore observed that computational algorithms can support business practitioners in analysis and prediction. Most businesses are expanding their products or services into different markets and are also searching for a better customer portfolio in which they can target customers and position their brands. In this connection, we highlighted four types of segmentation variables: general observable variables, general unobservable variables, product-specific observable variables, and product-specific unobservable variables. In the first case, the variables are geographic, demographic, socio-economic, and cultural; in the second case, they include lifestyle, psychographics, attitude, and emotions; in the third case, the variables are frequency of purchase and loyalty; and finally, in the fourth case, the variables are benefits, preference, and intention. Therefore, data mining techniques like K-means, RFM, GA, ANN, kernel methods, and PSO could help the marketer segment customers properly.

In the future, marketing strategy will rely on these customer segmentation techniques applied to large data banks. For example, a credit card service provider collects all kinds of customer information from the bank in order to offer a credit card, and an insurance company collects information on prospective customers to sell its services. Even though the data sets are large, a little human interaction is still necessary for prediction. Data mining techniques use algorithms that learn from labelled and unlabelled training inputs to produce a valid output, so supervised and unsupervised approaches together can deliver adequate output. With the use of data mining techniques, businesses can grow with a rigorous marketing strategy to expand and diversify their products or services.

References

1. G. Lefait, T. Kechadi, Customer segmentation architecture based on clustering techniques, in 2010 Fourth International Conference on Digital Society (IEEE, 2010). https://doi.org/10.1109/ICDS.2010.47


2. P.Q. Brito et al., Customer segmentation in a large database of an online customized fashion business. Robot. Comput.-Integr. Manuf. 36, 93–100 (2015). https://doi.org/10.1016/j.rcim.2014.12.014
3. W.R. Smith, Product differentiation and market segmentation as alternative marketing strategies. J. Mark. 21(1), 3–8 (1956). https://doi.org/10.1177/002224295602100102
4. A. Nairn, P. Berthon, Creating the customer: the influence of advertising on consumer market segments - evidence and ethics. J. Bus. Ethics 42(1), 83–100 (2003). https://doi.org/10.1023/A:1021620825950
5. A. Hiziroglu, Soft computing applications in customer segmentation: state-of-art review and critique. Expert Syst. Appl. 40(16), 6491–6507 (2013). https://doi.org/10.1016/j.eswa.2013.05.052
6. A. Hajiha, R. Radfar, S.S. Malayeri, Data mining application for customer segmentation based on loyalty: an Iranian food industry case study, in 2011 IEEE International Conference on Industrial Engineering and Engineering Management (IEEE, 2011). https://doi.org/10.1109/IEEM.2011.6117968
7. V. Golmah, G. Mirhashemi, Implementing a data mining solution to customer segmentation for decayable products - a case study for a textile firm. Int. J. Database Theory Appl. 5(3), 73–90 (2012)
8. M.M.T.M. Hassan, M. Tabasum, Customer profiling and segmentation in retail banks using data mining techniques. Int. J. Adv. Res. Comput. Sci. 9(4), 24–29 (2018)
9. S.Y. Hosseini, A.Z. Bideh, A data mining approach for segmentation-based importance-performance analysis (SOM–BPNN–IPA): a new framework for developing customer retention strategies. Serv. Bus. 8(2), 295–312 (2014). https://doi.org/10.1007/s11628-013-0197-7
10. M. Carnein, H. Trautmann, Customer segmentation based on transactional data using stream clustering, in Pacific-Asia Conference on Knowledge Discovery and Data Mining (Springer, Cham, 2019). https://doi.org/10.1007/978-3-030-16148-4_22
11. W. Wang, S. Fan, Application of data mining technique in customer segmentation of shipping enterprises, in 2010 2nd International Workshop on Database Technology and Applications (IEEE, 2010). https://doi.org/10.1109/DBTA.2010.5659081
12. J. Ranjan, R. Agarwal, Application of segmentation in customer relationship management: a data mining perspective. Int. J. Electron. Custom. Relat. Manag. 3(4), 402–414 (2009). https://doi.org/10.1504/IJECRM.2009.029298
13. L.-S. Chen, C.-C. Hsu, M.-C. Chen, Customer segmentation and classification from blogs by using data mining: an example of VOIP phone. Cybern. Syst. Int. J. 40(7), 608–632 (2009). https://doi.org/10.1080/01969720903152593
14. Z. Yihua, VIP customer segmentation based on data mining in mobile-communications industry, in 2010 5th International Conference on Computer Science & Education (IEEE, 2010). https://doi.org/10.1109/ICCSE.2010.5593669
15. C. Qiuru et al., Telecom customer segmentation based on cluster analysis, in 2012 International Conference on Computer Science and Information Processing (CSIP) (IEEE, 2012). https://doi.org/10.1109/CSIP.2012.6309069
16. H. Gong, Q. Xia, Study on application of customer segmentation based on data mining technology, in 2009 ETP International Conference on Future Computer and Communication (IEEE, 2009). https://doi.org/10.1109/FCC.2009.66
17. X. Lai, Segmentation study on enterprise customers based on data mining technology, in 2009 First International Workshop on Database Technology and Applications (IEEE, 2009). https://doi.org/10.1109/DBTA.2009.96
18. H. Hwang, T. Jung, E. Suh, An LTV model and customer segmentation based on customer value: a case study on the wireless telecommunication industry. Expert Syst. Appl. 26(2), 181–188 (2004). https://doi.org/10.1016/S0957-4174(03)00133-7
19. C.-H. Cheng, Y.-S. Chen, Classifying the segmentation of customer value via RFM model and RS theory. Expert Syst. Appl. 36(3), 4176–4184 (2009). https://doi.org/10.1016/j.eswa.2008.04.003


20. S. Kelly, Mining data to discover customer segments. Interact. Mark. 4(3), 235–242 (2003). https://doi.org/10.1057/palgrave.im.4340185
21. R.J. Calantone, J.S. Johar, Seasonal segmentation of the tourism market using a benefit segmentation framework. J. Travel Res. 23(2), 14–24 (1984). https://doi.org/10.1177/004728758402300203
22. W. Wang et al., A weakly supervised approach for object detection based on soft-label boosting, in 2013 IEEE Workshop on Applications of Computer Vision (WACV) (IEEE, 2013). https://doi.org/10.1109/WACV.2013.6475037
23. N. Malandrakis et al., A supervised approach to movie emotion tracking, in 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (IEEE, 2011). https://doi.org/10.1109/ICASSP.2011.5946961
24. L. Yang et al., A supervised approach to the evaluation of image segmentation methods, in International Conference on Computer Analysis of Images and Patterns (Springer, Berlin, Heidelberg, 1995). https://doi.org/10.1007/3-540-60268-2_377
25. Md.S. Islam et al., Supervised approach of sentimentality extraction from Bengali Facebook status, in 2016 19th International Conference on Computer and Information Technology (ICCIT) (IEEE, 2016). https://doi.org/10.1109/ICCITECHN.2016.7860228
26. D. Turnbull et al., A supervised approach for detecting boundaries in music using difference features and boosting, in ISMIR (2007)
27. L. Yang et al., A supervised approach to the evaluation of image segmentation methods, in International Conference on Computer Analysis of Images and Patterns (Springer, Berlin, Heidelberg, 1995). https://doi.org/10.1016/j.neucom.2011.09.002
28. I. Monroy et al., A semi-supervised approach to fault diagnosis for chemical processes. Comput. Chem. Eng. 34(5), 631–642 (2010). https://doi.org/10.1016/j.compchemeng.2009.12.008
29. L. Sun et al., A novel weakly-supervised approach for RGB-D-based nuclear waste object detection. IEEE Sens. J. 19(9), 3487–3500 (2018). https://doi.org/10.1109/JSEN.2018.2888815
30. A.J. Ferreira, M.A.T. Figueiredo, An unsupervised approach to feature discretization and selection. Pattern Recogn. 45(9), 3048–3060 (2012). https://doi.org/10.1016/j.patcog.2011.12.008
31. E.N. Nasibov, G. Ulutagay, A new unsupervised approach for fuzzy clustering. Fuzzy Sets Syst. 158(19), 2118–2133 (2007). https://doi.org/10.1016/j.fss.2007.02.019
32. Ke. Hu, D.L. Wang, An unsupervised approach to cochannel speech separation. IEEE Trans. Audio Speech Lang. Process. 21(1), 122–131 (2012). https://doi.org/10.1109/TASL.2012.2215591
33. K. Ganesan, C.X. Zhai, E. Viegas, Micropinion generation: an unsupervised approach to generating ultra-concise summaries of opinions, in Proceedings of the 21st International Conference on World Wide Web (2012)
34. D. Trabelsi et al., An unsupervised approach for automatic activity recognition based on hidden Markov model regression. IEEE Trans. Autom. Sci. Eng. 10(3), 829–835 (2013). https://doi.org/10.1109/TASE.2013.2256349
35. R.M. Alguliyev, R.M. Aliguliyev, N.R. Isazade, An unsupervised approach to generating generic summaries of documents. Appl. Soft Comput. 34, 236–250 (2015). https://doi.org/10.1016/j.asoc.2015.04.050
36. J.A. McCarty, M. Hastak, Segmentation approaches in data-mining: a comparison of RFM, CHAID, and logistic regression. J. Bus. Res. 60(6), 656–662 (2007). https://doi.org/10.1016/j.jbusres.2006.06.015
37. W. Li et al., Credit card customer segmentation and target marketing based on data mining, in 2010 International Conference on Computational Intelligence and Security (IEEE, 2010). https://doi.org/10.1109/CIS.2010.23
38. Z. Lu et al., Customer segmentation algorithm based on data mining for electric vehicles, in 2019 IEEE 4th International Conference on Cloud Computing and Big Data Analysis (ICCCBDA) (IEEE, 2019). https://doi.org/10.1109/ICCCBDA.2019.8725737


39. V.L. Miguéis, A.S. Camanho, J. Falcão e Cunha, Customer data mining for lifestyle segmentation. Expert Syst. Appl. 39(10), 9359–9366 (2012). https://doi.org/10.1016/j.eswa.2012.02.133
40. C.-Y. Chiu et al., An intelligent market segmentation system using k-means and particle swarm optimization. Expert Syst. Appl. 36(3), 4558–4565 (2009). https://doi.org/10.1016/j.eswa.2008.05.029
41. S. Dutta, S. Bhattacharya, K.K. Guin, Data mining in market segmentation: a literature review and suggestions, in Proceedings of Fourth International Conference on Soft Computing for Problem Solving (Springer, New Delhi, 2015). https://doi.org/10.1007/978-81-322-2217-0_8
42. E.R. Swenson, N.D. Bastian, H.B. Nembhard, Healthcare market segmentation and data mining: a systematic review. Health Mark. Q. 35(3), 186–208 (2018). https://doi.org/10.1080/07359683.2018.1514734
43. S. Mckechnie, Integrating intelligent systems into marketing to support market segmentation decisions. Intell. Syst. Account. Finance Manag. Int. J. 14(3), 117–127 (2006). https://doi.org/10.1002/isaf.280
44. P. Kotler, K.L. Keller, Marketing Management, ed. by W. Lassar, international 11th edn. (Prentice Hall, New Jersey, 2003)
45. M. Wedel, W.A. Kamakura, Market Segmentation: Conceptual and Methodological Foundations, vol. 8 (Springer Science & Business Media, 2012)
46. Y. Wind, Issues and advances in segmentation research. J. Mark. Res. 15(3), 317–337 (1978). https://doi.org/10.1177/002224377801500302
47. L. Alfansi, A. Sargeant, Market segmentation in the Indonesian banking sector: the relationship between demographics and desired customer benefits. Int. J. Bank Mark. (2000). https://doi.org/10.1108/02652320010322976
48. D.G. Tonks, Validity and the design of market segments. J. Mark. Manag. 25(3–4), 341–356 (2009). https://doi.org/10.1362/026725709X429782
49. M. Taks, J. Scheerder, Youth sports participation styles and market segmentation profiles: evidence and applications. Eur. Sport Manag. Q. 6(2), 85–121 (2006). https://doi.org/10.1080/16184740600954080
50. J. Bruwer, E. Li, Wine-related lifestyle (WRL) market segmentation: demographic and behavioural factors. J. Wine Res. 18(1), 19–34 (2007). https://doi.org/10.1080/09571260701526865
51. P. Vyncke, Lifestyle segmentation: from attitudes, interests and opinions, to values, aesthetic styles, life visions and media preferences. Eur. J. Commun. 17(4), 445–463 (2002). https://doi.org/10.1177/02673231020170040301
52. A. Vellido, P.J.G. Lisboa, K. Meehan, Segmentation of the on-line shopping market using neural networks. Expert Syst. Appl. 17(4), 303–314 (1999). https://doi.org/10.1016/S0957-4174(99)00042-1
53. J. Swait, A structural equation model of latent segmentation and product choice for cross-sectional revealed preference choice data. J. Retail. Consum. Serv. 1(2), 77–89 (1994). https://doi.org/10.1016/0969-6989(94)90002-7
54. T. Teichert, E. Shehu, I. von Wartburg, Customer segmentation revisited: the case of the airline industry. Transp. Res. Part A Policy Pract. 42(1), 227–242 (2008). https://doi.org/10.1016/j.tra.2007.08.003
55. A. Lindridge, S. Dibb, Is ‘culture’ a justifiable variable for market segmentation? A cross-cultural example. J. Consum. Behav. Int. Res. Rev. 2(3), 269–286 (2003). https://doi.org/10.1002/cb.106
56. F. Casarin, A. Moretti, An international review of cultural consumption research. SSRN Electron. J. Department of Management, Università Ca’ Foscari Venezia working paper 12 (2011)
57. A.M. Gonzalez, L. Bello, The construct “lifestyle” in market segmentation: the behaviour of tourist consumers. Eur. J. Mark. (2002). https://doi.org/10.1108/03090560210412700
58. D.B. Valentine, T.L. Powers, Generation Y values and lifestyle segments. J. Consum. Mark. (2013). https://doi.org/10.1108/JCM-07-2013-0650


59. U.R. Orth et al., Promoting brand benefits: the role of consumer psychographics and lifestyle. J. Consum. Mark. (2004). https://doi.org/10.1108/07363760410525669
60. C.-S. Yu, Construction and validation of an e-lifestyle instrument. Internet Res. (2011). https://doi.org/10.1108/10662241111139282
61. A.M. Thompson, P.F. Kaminski, Psychographic and lifestyle antecedents of service quality expectations: a segmentation approach. J. Serv. Mark. (1993). https://doi.org/10.1108/08876049310047742
62. J.L.M. Tam, S.H.C. Tai, Research note: the psychographic segmentation of the female market in Greater China. Int. Mark. Rev. (1998). https://doi.org/10.1108/02651339810205258
63. T.F. Srihadi, D. Sukandar, A.W. Soehadi, Segmentation of the tourism market for Jakarta: classification of foreign visitors’ lifestyle typologies. Tour. Manag. Perspect. 19, 32–39 (2016). https://doi.org/10.1016/j.tmp.2016.03.005
64. B. Oates, L. Shufeldt, B. Vaught, A psychographic study of the elderly and retail store attributes. J. Consum. Mark. (1996). https://doi.org/10.1108/07363769610152572
65. T.M.M. Verhallen, R.T. Frambach, J. Prabhu, Strategy-based segmentation of industrial markets. Ind. Mark. Manag. 27(4), 305–313 (1998). https://doi.org/10.1016/S0019-8501(97)00064-3
66. E.J. Cheron, R. McTavish, J. Perrien, Segmentation of bank commercial markets. Int. J. Bank Mark. (1989). https://doi.org/10.1108/EUM0000000001458
67. S.W. Clopton, J.E. Stoddard, D. Dave, Event preferences among arts patrons: implications for market segmentation and arts management. Int. J. Arts Manag. 48–59 (2006)
68. A. Buratto, L. Grosset, B. Viscolani, Advertising a new product in a segmented market. Eur. J. Oper. Res. 175(2), 1262–1267 (2006)
69. R. Sánchez-Fernández, M. Ángeles Iniesta-Bonillo, A. Cervera-Taulet, Exploring the concept of perceived sustainability at tourist destinations: a market segmentation approach. J. Travel Tour. Mark. 36(2), 176–190 (2019)
70. K. Bijak, L.C. Thomas, Does segmentation always improve model performance in credit scoring? Expert Syst. Appl. 39(3), 2433–2442 (2012). https://doi.org/10.1016/j.eswa.2011.08.093
71. A. Sell, P. Walden, Segmentation bases in the mobile services market: attitudes in, demographics out, in 2012 45th Hawaii International Conference on System Sciences (IEEE, 2012)
72. A. Sell, J. Mezei, P. Walden, An attitude-based latent class segmentation analysis of mobile phone users. Telemat. Inform. 31(2), 209–219 (2014)
73. D.J. Ketchen, C.L. Shook, The application of cluster analysis in strategic management research: an analysis and critique. Strateg. Manag. J. 17(6), 441–458 (1996)
74. G. Punj, D.W. Stewart, Cluster analysis in marketing research: review and suggestions for application. J. Mark. Res. 20(2), 134–148 (1983)

Unveiling IoT Customer Behaviour: Segmentation and Insights for Enhanced IoT-CRM Strategies: A Real Case Study

Preprint

Full-text available

Dec 2023

In today’s competitive landscape, achieving customer-centricity is paramount for the sustainable growth and success of organisations. This research is dedicated to understanding customer preferences in the context of the Internet of Things (IoT) and employs a two-part modeling ap-proach tailored in this digital era. In the first phase, we leverage the power of the Self-Organizing Map (SOM) algorithm to segment IoT customers based on their connected device usage patterns. This segmentation approach reveals three distinct customer clusters, with the second cluster demonstrating the highest propensity for IoT device adoption and usage. In the second phase, we introduce a robust Decision Tree methodology designed to prioritize various factors influencing customer satisfaction in the IoT ecosystem. We employ the Classification and Regression Tree (CART) technique to analyze 17 key questions that assess the significance of factors impacting IoT device purchase decisions. By aligning these factors with the identified IoT customer clusters, we gain profound insights into customer behaviour and preferences in the rapidly evolving world of connected devices. This comprehensive analysis delves into the factors contributing to customer retention in the IoT space, with a strong emphasis on crafting logical marketing strategies, en-hancing customer satisfaction, and fostering customer loyalty in the digital realm. Our research methodology involves surveys and questionnaires distributed to 207 IoT users, categorizing them into three distinct IoT customer groups. Leveraging analytical statistical methods, regression analysis, and IoT-specific tools and software, this study rigorously evaluate the factors influencing IoT device purchases. Importantly, this approach not only effectively clusters the IoT Customer Relationship Management (IoT-CRM) dataset but also provides valuable visualizations that are essential for understanding the complex dynamics of the IoT customer landscape. 
Our findings underscore the critical role of logical marketing strategies, customer satisfaction, and customer loyalty in enhancing customer retention in the IoT era. This research makes a significant contri-bution to businesses seeking to optimize their IoT -CRM strategies and capitalize on the oppor-tunities presented by the IoT ecosystem.

Data Mining for Dynamic Customer Segmentation: Unraveling Insights

Article

Full-text available

May 2024

Effective decision-making is essential for every firm to earn high income. These days, there is intense rivalry, and every company is advancing using a unique set of techniques. We ought to make an informed choice based on evidence. Since each client is unique, we have no idea what they enjoy or what they purchase. But by using a variety of algorithms on the dataset, one may use machine learning techniques to filter through the data and identify the target group. In the absence of this, identifying a group of individuals with like interests and personalities within a sizable dataset will be exceedingly challenging and no better methods exist. The use of K-Mean clustering for customer segmentation aids in grouping data with comparable characteristics, which benefits the firm the most. We will use the elbow approach to determine the number of clusters, and then we will visualize the results.

Customer segmentation application based on K-Means

Article

Full-text available

Mar 2024

Jiaqi Zhao

Customer segmentation(CS) is a crucial aspect of customer relationship management, widely utilized by industries, banks, and consulting companies. However, the intricate data relationship between individuals presents significant challenges in customer segmentation research. Fortunately, machine learning has made remarkable progress in processing big data, and its exceptional performance has captivated the attention of business analytics researchers. Based on this, numerous customer segmentation methods based on machine learning have been proposed. This paper aims to review the papers published after 2010 on customer segmentation, and summarize the current status and importance of customer segmentation in implementing marketing strategies. Additionally, it introduces two primary types of customer segmentation scenarios, and summarizes the common combination of analysis models and machine learning algorithms in customer segmentation. Finally, the paper introduces a customer segmentation method based on k-means and provides a perspective on the future development of customer segmentation.

DAO-LGBM: dual annealing optimization with light gradient boosting machine for advocates prediction in online customer engagement

Article

Jan 2024
CLUSTER COMPUT

Social networks have modernized the way people communicate, share information, and consume content. The widespread use of social media platforms has resulted in the creation of vast amounts of user-generated content, which can be analyzed to gain valuable insights into customer behaviour, emotions, preferences, and trends. Previous studies on online customer engagement have mainly focused on brand perspective and its socially significant elements, such as brand personality, image, reputation, and loyalty. These studies have explored how these elements influence the behavioural engagement of customers, such as their purchase intentions, word-of-mouth recommendations, and repeat purchases. However, more recent research has started to shift towards a more customer-centric perspective, which acknowledges that customer engagement is a two-way process, involving both the brand and the customer. This approach considers the role of customer experiences, emotions, topics of interest, and motivations in shaping their social engagement with the brand. This paper contributes to these endeavours by developing a consolidated framework that incorporates various facets of the customer's emotional and behavioural social content. In particular, features of online customers have been extracted using various sophisticated modules that incorporate natural language inference, topic modelling, sentiment analysis, emotion detection, and the Big-Five Personality Traits. Further, a heuristic-based feature selection (FS) strategy, Dual Annealing Optimisation (DAO), is integrated with Light Gradient Boosting Machine (LGBM) to furnish a consolidated machine learning module (DAO-LGBM) that is implemented and examined to detect advocates in online customer engagement. A thorough examination of a proposed model and its utility for detecting advocates using rigorous evaluation metrics is undertaken, reported, and discussed. 
These findings have substantial implications for both academic research and practical applications in social media analytics.

Examination of the Criticality of Customer Segmentation Using Unsupervised Learning Methods

Article

Jan 2024

Everything in the world revolves around selling and buying, whether to obtain something or to earn a living. A seller needs customers in order to sell, and customers come to a seller once the seller has approached them. As the marketing paradigm unfolds, long-term relationships with customers become more and more important. Because a corporation has finite resources, it must segment its market in order to predict the customer–seller relationship, analyze customer satisfaction, and efficiently identify and serve its customers across multiple variables. In market research, clustering is a useful and popular method for market segmentation, identifying the intended market and customer groupings. This study demonstrates how to segment mall customers using machine learning methods. This is an unsupervised clustering problem, and three well-known algorithms (K-means, affinity propagation, and DBSCAN) are discussed and contrasted. The primary goal of the study is to go through the fundamentals of clustering techniques while also touching on some more complicated ideas. The study also found that there are more female than male customers, with women making up 56% of all customers. Males have a greater mean income than females ($62.2k vs. $59.2k), and their median income ($62.5k) is also higher than that of female customers ($60k). The standard deviations of the two groups are comparable. One male customer stands out, with an annual income of roughly $140k.
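
One practical difference between the algorithms contrasted in this abstract is that DBSCAN can leave isolated customers unassigned, whereas K-means forces every point into a cluster. The sketch below is a plain-Python DBSCAN written for illustration only. The data points, the `eps` and `min_pts` values, and the function names are assumptions, not taken from the study; the single high-income point merely echoes the outlier the abstract mentions.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if sq_dist(points[i], q) <= eps * eps]

def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters and -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE          # may later be claimed as a border point
            continue
        cluster += 1                   # points[i] is a core point: start a cluster
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster    # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)   # j is also a core point: keep expanding
    return labels

# Hypothetical (annual income in $k, spending score) pairs.
CUSTOMERS = [
    (58, 48), (60, 50), (61, 52), (59, 51), (62, 49),   # mid income, mid spending
    (24, 79), (25, 81), (26, 80), (27, 78), (25, 77),   # low income, high spending
    (140, 20),                                          # isolated high earner
]

if __name__ == "__main__":
    print(dbscan(CUSTOMERS, eps=5, min_pts=3))
```

With these assumed settings the two dense groups become clusters 0 and 1 and the isolated customer is labelled -1 (noise). The result is sensitive to `eps` and `min_pts`, which is one reason such comparisons usually try several settings.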

Customer profiling, segmentation, and sales prediction using AI in direct marketing

Article

Dec 2023
NEURAL COMPUT APPL

In the current business environment, where the customer is the primary focus, effective communication between marketing and senior management is vital for success. Effective customer profiling is a cornerstone of strategic decision-making for digital start-ups seeking sustainable growth and customer satisfaction. This research investigates the clustering of customers based on recency, frequency, and monetary (RFM) analysis and employs validation metrics to derive optimal clusters. The K-means clustering algorithm, coupled with the Elbow method, Silhouette coefficient, and Gap Statistics method, facilitates the identification of distinct customer segments. The study unveils three primary clusters with unique characteristics: new customers (Cluster A), best customers (Cluster B), and intermittent customers (Cluster C). For platform-based Edutech start-ups, Cluster A underscores the importance of tailored learning content and support, Cluster B emphasizes personalized incentives, and Cluster C suggests re-engagement strategies. By understanding and addressing the diverse needs of these clusters, digital start-ups can forge enduring connections, optimize customer engagement, and fuel sustainable business growth.
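
The RFM features that this abstract clusters on can be derived from a raw transaction log in a few lines. The sketch below is an illustration under stated assumptions: the transaction tuples, the snapshot date, and the function names are hypothetical, and min-max scaling is one common preprocessing choice before K-means rather than a step the abstract describes.

```python
from collections import defaultdict
from datetime import date

def rfm_table(transactions, snapshot):
    """Map customer id -> (recency in days, frequency, monetary total).

    transactions: iterable of (customer_id, purchase_date, amount).
    snapshot: the date against which recency is measured.
    """
    last_purchase = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    for customer, day, amount in transactions:
        last_purchase[customer] = max(last_purchase.get(customer, day), day)
        frequency[customer] += 1
        monetary[customer] += amount
    return {c: ((snapshot - last_purchase[c]).days, frequency[c], monetary[c])
            for c in last_purchase}

def min_max_scale(table):
    """Rescale each RFM column to [0, 1] so no feature dominates the distance."""
    ids = list(table)
    columns = list(zip(*(table[c] for c in ids)))
    scaled_columns = []
    for col in columns:
        lo, hi = min(col), max(col)
        span = hi - lo
        # A constant column carries no information; map it to 0.0.
        scaled_columns.append([(v - lo) / span if span else 0.0 for v in col])
    return {c: tuple(col[i] for col in scaled_columns) for i, c in enumerate(ids)}

# Hypothetical transaction log: (customer id, purchase date, amount).
TRANSACTIONS = [
    ("a", date(2023, 11, 1), 20.0),
    ("a", date(2023, 11, 20), 35.0),
    ("b", date(2023, 6, 5), 500.0),
    ("c", date(2023, 11, 28), 15.0),
]
SNAPSHOT = date(2023, 12, 1)

if __name__ == "__main__":
    rfm = rfm_table(TRANSACTIONS, SNAPSHOT)
    for customer, scaled in min_max_scale(rfm).items():
        print(customer, rfm[customer], scaled)
```

The scaled rows can then be passed to a clustering routine, and the candidate cluster counts compared with validation measures such as those the abstract names.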

Unlocking hidden market segments: A data-driven approach exemplified by the electric vehicle market

Article

May 2024
EXPERT SYST APPL

Insights and Future Prospects for ChatGPT: A Productive Computational Intelligence Approach on the Administration of Human Resources

Chapter

May 2024

Using generative artificial intelligence (AI) models, ChatGPT and its variants have quickly gained attention in scientific and public debate about their possible advantages and disadvantages for the economy, democracy, society, and the environment. It is unclear whether these advancements will create new jobs or eliminate existing ones, or whether they will redistribute human labour by producing additional knowledge and choices that may be insignificant or practically unimportant. In light of the swift progress in generative AI and its emerging consequences for work processes worldwide, and for HR management in particular, this HRMJ perspectives piece brings together a variety of opinions on how HRM academic discourse might be improved. Its main goals are to give a synopsis of the most recent advances in the discipline and to set out a collection of possibilities for study. By drawing on tangible evidence, we hope to advance the understanding of artificial intelligence and push beyond the borders of what is currently known.

Implementation of Machine Learning and Deep Learning in Finance

Chapter

Apr 2024

Artificial intelligence, machine learning, and deep learning are powerful and intelligent technologies that have prevalent applications in the finance domain. These technologies enable financial institutions to develop advanced systems such as fraud detection, portfolio management, market segmentation, stock price prediction, and security anomaly detection. Recent decades have shown a great deal of research applications of AI in various areas of finance. This paper presents the state of ML and DL technologies, their implementation areas in finance, future trends and challenges.

The Use of Chatbot as an Artificial Intelligence Tool to Improve Intelligent Digital Marketing

Chapter

Oct 2023

The digital era has changed the way work is done and made it easier for all industries, including marketing. The emergence of artificial intelligence and machine learning technologies has taken marketing beyond digital marketing to intelligent digital marketing. In intelligent digital marketing, a company can classify customers based on their preferences, which results in different customer segments. This research discusses the chatbot as a tool for providing customer service. Chatbots that rely on artificial intelligence, machine learning, and natural language processing can provide 24/7 service to customers. Besides helping customers, a chatbot helps the company understand customer needs and target the right customer segment for a specific service or product. Chatbots also have some limitations and face user resistance, which the researcher believes will shrink over time. Digital marketing is a huge industry that affects both the customer and the company. Focusing on digital marketing with the use of artificial intelligence will create a new form of competitiveness and a data market. This paper highlights both digital marketing and chatbots as artificial intelligence tools that support the customer in digital marketing, and covers many facts and theories about the topic.

A Novel Weakly-supervised approach for RGB-D-based Nuclear Waste Object Detection and Categorization

Article

Dec 2018

This paper addresses the problem of RGBD-based detection and categorization of waste objects for nuclear decommissioning. To enable autonomous robotic manipulation for nuclear decommissioning, nuclear waste objects must be detected and categorized. However, as this is a novel industrial application, large amounts of annotated waste object data are currently unavailable. To overcome this problem, we propose a weakly-supervised learning approach that is able to learn a deep convolutional neural network (DCNN) from unlabelled RGBD videos while requiring very few annotations. The proposed method also has the potential to be applied to other household or industrial applications. We evaluate our approach on the Washington RGB-D object recognition benchmark, achieving state-of-the-art performance among semi-supervised methods. More importantly, we introduce a novel dataset, the Birmingham nuclear waste simulants dataset, and evaluate our proposed approach on this novel industrial object recognition challenge. We further propose a complete real-time pipeline for RGBD-based detection and categorization of nuclear waste simulants. Our weakly-supervised approach has been demonstrated to be highly effective in solving a novel RGB-D object detection and recognition application with limited human annotations. Index Terms: nuclear waste detection and categorization, nuclear waste decommissioning, autonomous waste sorting and segregation.

CUSTOMER PROFILING AND SEGMENTATION IN RETAIL BANKS USING DATA MINING TECHNIQUES

Article

Aug 2018

Malik Mubasher Hassan

Supervised Approach of Sentimentality Extraction from Bengali Facebook Status

Conference Paper

Dec 2016

Sentiment is the only thing that separates humans from machines. To simulate feelings in machines, many researchers have tried to create methods and automate the process of extracting opinions about a particular news item, product, or entity. Sentiment Analysis (SA) deals with the combination of opinions, emotions, and subjectivity in a text. SA is currently among the most in-demand tasks in Natural Language Processing. Social networking sites such as Facebook are widely used to express opinions about a particular entity. Newspapers publish news about a particular event, and users express their feedback in the comments. Online product feedback is increasing day by day, so review and opinion mining plays a very important role in understanding people's satisfaction. Such opinion mining has potential for knowledge discovery. The main target of SA is to find opinions in text, extract sentiments from them, and define their polarity, i.e., positive or negative. In this domain, most models have been designed for the English language. This paper describes a novel approach using a Naïve Bayes classification model for the Bengali language. A supervised classification method is used together with language rules to detect sentiment in Bengali Facebook statuses.

Customer Segmentation Algorithm Based on Data Mining for Electric Vehicles

Conference Paper

Apr 2019

Customer Segmentation Based on Transactional Data Using Stream Clustering

Conference Paper

Mar 2019

Customer segmentation aims to identify groups of customers that share similar interests or behaviour. It is an essential tool in marketing and can be used to target customer segments with tailored marketing strategies. Customer segmentation is often based on clustering techniques. This analysis is typically performed as a snapshot analysis in which segments are identified at a specific point in time. However, this ignores the fact that customer segments are highly volatile and change over time. Once segments change, the entire analysis needs to be repeated and strategies adapted. In this paper we explore stream clustering as a tool to alleviate this problem. We propose a new stream clustering algorithm that allows customer segments to be identified and tracked over time. The biggest challenge is that customer segmentation often relies on the transaction history of a customer. Since this data changes over time, it is necessary to update customers that have already been incorporated into the clustering. We show how to perform this step incrementally, without the need for periodic re-computations. As a result, customer segmentation can be performed continuously and faster, and is more scalable. We demonstrate the performance of our algorithm using a large real-life case study.

Cluster Analysis in Marketing Research: Review and Suggestions for Application

Article

May 1983

Applications of cluster analysis to marketing problems are reviewed. Alternative methods of cluster analysis are presented and evaluated in terms of recent empirical work on their performance characteristics. A two-stage cluster analysis methodology is recommended: preliminary identification of clusters via Ward's minimum variance method or simple average linkage, followed by cluster refinement by an iterative partitioning procedure. Issues and problems related to the use and validation of cluster analytic methods are discussed.

Issues and Advances in Segmentation Research

Article

Aug 1978

Yoram Wind

The author reviews the current status and recent advances in segmentation research, covering segmentation problem definition, research design considerations, data collection approaches, data analysis procedures, and data interpretation and implementation. Areas for future research are identified.

Healthcare market segmentation and data mining: A systematic review

Article

Nov 2018

Providing insight into healthcare consumers’ behaviors and attitudes is critical information in an environment where healthcare delivery is moving rapidly towards patient-centered care that is premised upon individuals becoming more active participants in managing their health. A systematic review of the literature concerning healthcare market segmentation and data mining identified several areas for future health marketing research. Common themes included: (a) reliance on survey data, (b) clustering methods, (c) limited classification modeling after clustering, and (d) detailed analysis of clusters by demographic data. Opportunities exist to expand health-marketing research to leverage patient level data with advanced data mining methods.

Product Differentiation and Market Segmentation as Alternative Marketing Strategies

Article

Jul 1956
J MARKETING

Wendell R. Smith

Exploring the concept of perceived sustainability at tourist destinations: a market segmentation approach

Article

Aug 2018

The concept of sustainability as perceived by tourists has rarely been studied and much less considered as a basis for segmentation. This article provides a conceptual framework based on tourists’ perception of sustainability policies at destinations and a multidimensional measure for this construct. An empirical analysis at five Mediterranean destinations validated the conceptual proposal and provided empirical evidence for the potential use of perceived sustainability in segmentation studies. Our findings show the discriminating power of the construct, identifying four latent clusters. Perceived sustainability as a tool for segmentation can help analyze the effectiveness of sustainability strategies and action taken.
