GBNRS: A Novel Rough Set Algorithm for Fast Adaptive Attribute Reduction in Classification

GBNRS: A Novel Rough Set Algorithm for Fast
Adaptive Attribute Reduction in Classification
Shuyin Xia, Hao Zhang, Wenhua Li, Guoyin Wang, Elisabeth Giem, Zizhong Chen
Abstract—Feature reduction is an important aspect of Big Data analytics on today’s ever-larger datasets. Rough sets are a classical
method widely applied to attribute reduction. Most rough sets algorithms process continuous attributes by using a membership
function, which is set using a priori knowledge of the dataset. Neighborhood rough sets (NRS) replaces the membership function with
the concept of neighborhoods, allowing NRS to handle scenarios where no a priori knowledge is available. However, the
neighborhood radius of each object in NRS is fixed, and optimization of the radius depends on grid searching. This diminishes both the
efficiency and effectiveness, leading to a time complexity of not lower than O(N2). To resolve these limitations, we propose granular
ball neighborhood rough sets (GBNRS), a novel NRS method with time complexity O(N). GBNRS adaptively generates a different
neighborhood for each object, resulting in greater generality and flexibility in comparison to standard NRS methods. We compare
GBNRS with the current state-of-the-art NRS method, FARNeMF, and find that GBNRS obtains both higher performance and higher
classification accuracy on public benchmark datasets. All code has been released in the open source GBNRS library at
http://www.cquptshuyinxia.com/GBNRS.html.
Index Terms—Rough sets, Neighborhood rough sets, Granular ball computing, Fuzzy rough sets.
1 INTRODUCTION
DATABASE storage technology and data collection have grown to the point where massive datasets are the norm. These large datasets often have redundant or irrelevant features which increase the computational complexity of classifiers and lead to poor performance, both in accuracy and execution time [19, 33, 39]. Feature or attribute reduction is essential to mitigate this issue. The core idea of attribute reduction is to delete irrelevant or unimportant attributes under the condition that the classification accuracy of a classifier is either unchanged or increased. The result of this reduction is twofold: as few attributes as possible carry as much data information as possible, and reducing redundant attributes increases the generalizability of classifiers.
Polish scientist Zdzisław Pawlak first proposed the theory of rough sets in 1982. Rough sets are a mathematical data mining tool that can effectively deal with uncertain, inaccurate, and incomplete data [24, 25]. Rough sets have been applied to machine learning, data mining, decision support and analysis, soft computing, and other fields [2, 4, 27, 45], but one of their most important applications is attribute reduction [11, 21, 26, 29, 39]. Rough sets use an equivalence relation to partition the universe, separating an information system into an upper approximation set and a lower approximation set. The upper approximation set contains the maximum set of elements that may belong to a class, and the lower approximation set refers to the minimum set of all elements that can be accurately classified. Rough sets were initially developed for discrete data, and the original algorithms do not have the most desirable time complexities; as a consequence, improving the efficiency of the algorithms and the ability to process continuous data are two of the cardinal directions of research in rough sets.

S. Xia, H. Zhang, W. Li, G. Wang & Q. Zhang are with the Chongqing Key Laboratory of Computational Intelligence, Chongqing University of Posts and Telecommunications, 400065, Chongqing, China. E-mail: xiasy@cqupt.edu.cn, 1025476698@qq.com, 846659545@qq.com, wanggy@cqupt.edu.cn
Z. Chen and E. Giem are with the Department of Computer Science and Engineering, University of California Riverside, Riverside, CA, 92521. E-mail: zizhongchen@gmail.com, gieme01@ucr.edu
Granulation and approximation are two basic problems in rough sets and granular computing. The theory of rough sets proposed by Pawlak partitions the universe according to an equivalence relation, which results in the granulation of the universe. However, the values of objects are continuous in real number space, and therefore the classical equivalence relation cannot be directly applied. Fuzzy rough sets can process real number space, but require prior knowledge of the dataset in order to exactly or approximately set a membership function and membership degree for each object in advance. This knowledge cannot be provided a priori in many scenarios. In order to overcome this obstacle, Hu et al. introduced the neighborhood model of rough sets in place of the membership function [14], forming a new theory of fuzzy rough sets for processing continuous data: neighborhood rough sets (NRS). The fuzzy degree in NRS is characterized by the size of the neighborhood instead of the membership degree. Just as changing the membership function changes the positive region in fuzzy rough sets, different neighborhood sizes lead to changes in the positive region. The key difference is that the size of the neighborhood is influenced by the distribution of the dataset and does not rely on any a priori knowledge. This size is a parameter that is typically optimized by grid search. NRS can perform better than other fuzzy rough sets on numerical datasets in the absence of prior knowledge.
We build on NRS by introducing the idea of granular computing into neighborhood rough sets, creating a new NRS method that is entirely parameter-free in processing continuous data. We require no membership function, no membership degree, and no fixed radius parameter optimized by grid search, in direct contrast to all other current methods. The main contributions of this paper are as follows:
• We establish the novel idea of granular ball neighborhood rough sets by introducing granular ball computing into neighborhood rough sets. GBNRS has a time complexity of O(N), rendering it much more efficient than existing NRS methods.
• Granular ball neighborhood rough sets is the first rough sets algorithm that is parameter-free in processing continuous data; that is, it does not need to set any membership function or optimize any extraneous parameters for processing continuous data. To our knowledge, no other rough sets algorithm has this desirable property.
• GBNRS generates optimal neighborhood radii adaptively, resulting in greater flexibility than existing NRS algorithms. This adaptive nature of GBNRS leads to better generalizability in comparison with classical NRS.
• We additionally propose multiple granular ball neighborhood rough sets (MGBNRS), an improvement on GBNRS which more fully uses label information in GBC, giving even higher efficiency and accuracy than GBNRS.
The rest of the paper is organized as follows: we introduce related works in Section 2, and then detail the background theory of neighborhood rough sets and granular ball computing in Section 3. Section 4 presents the design and analysis of our novel granular ball neighborhood rough sets scheme. Evaluation results are given in Section 5. We present our conclusions in Section 6.
2 RELATED WORK
Considering the effectiveness of the rough sets method, improving the efficiency of rough sets algorithms is an important direction of research. However, attribute reduction in rough sets is an NP-hard problem, making this a fertile area for new ideas. Hoa et al. form a new condition attribute set by comparing relative positive regions and take the final output condition attribute set as the attribute reduction [12]. To improve performance in the specific case where there are many important attributes in the decision table, Skowron et al. propose attribute reduction based on a discernibility matrix [29]; in Skowron's method, a minimal reduction is found. To further alleviate the problem of combinatorial explosion [29, 34, 43], Miao et al. introduce information theory into attribute reduction and propose the MIBARK algorithm [23, 38]. MIBARK is based on the idea that information entropy can reduce the search space during the reduction process and improve efficiency, although it is possible the attribute reduction of the information system may not be found in some cases. Dai et al. propose the concept of discernible pairs based on rough sets theory and construct a unified measurement method to measure attributes in both supervised and unsupervised frameworks [9]. Dai et al. also propose a fast feature selection algorithm based on the neighbor inconsistent pair, which can pare down the time needed to find a reduction [11]. In recent years, incremental attribute reduction algorithms based on rough sets have been proposed for processing large dynamic datasets. Liang et al. propose a group incremental rough feature selection algorithm based on information entropy [19], which can find new feature subsets in a shorter time when multiple objects are added to the decision table. Considering the high time and space complexity of Liang's algorithm, Yang et al. propose an incremental rough sets attribute reduction algorithm based on both an active sample selection process and an attribute reduction process [40]. Some datasets collected from various applications change over time; in particular, when new objects are added, new properties may appear. In order to solve this dynamic maintenance problem, Chen et al. propose a method that dynamically maintains approximations when objects and attributes change at the same time [6, 7]. These algorithms all improve the efficiency of rough sets methods in different scenarios. However, few of them can substantially decrease the time complexity; most are O(N²) or higher. This limits the applicability of these rough sets methods on large-scale datasets.
Classical rough sets adopt the concepts of equivalence partitions and equivalence classes to calculate granularity. However, this processing method is only applicable to discrete data, while the data in practical applications is usually numerical. Therefore, improving the ability of rough sets to process continuous attributes is another productive area of research. Both fuzzy sets and rough sets are capable of dealing with uncertain information [17]. Dubois and Prade combine these theories to produce fuzzy rough sets (FRS) theory [25], which provides an effective method for discretizing continuous data. FRS can be directly applied to the reduction of continuous attributes. A membership function and membership degree are used to describe the fuzzy degree in fuzzy rough sets, and a different membership function can lead to changes in the positive region. Liu and Pedrycz et al. redefine the concept of fuzzy rough sets on the basis of axiomatic fuzzy set theory, which provides higher flexibility and effectiveness [20]. Due to the advantages of fuzzy rough sets theory, in recent years fuzzy rough sets have been extended to many applications [3, 5, 10, 16, 18, 33, 38, 41]. However, as there is often significant overlap between different categories in a dataset, it is easy for samples to be misclassified. Wang et al. propose a new fuzzy rough sets model for those datasets with significant overlap between different categories [33]. Aggarwal et al. propose probabilistic variable precision fuzzy rough sets to address the imprecision problem [1]. Aiming at the problem of large-scale multimodality fuzzy classification, Hu et al. extract fuzzy similarity by using a combination of kernels based on rough sets theory [14]. In order to solve the problem of real-valued noisy features, Maji et al. propose an IT2 fuzzy-rough feature selection method that combines the advantages of IT2 fuzzy sets, rough sets, and the MRMS criterion [22]. This method is more effective when the exact membership function is not known.
All of these methods have greatly advanced the development of rough sets. However, they are all limited to cases where the membership degree and membership function can be exactly or approximately set in advance using a priori knowledge; they do not work in cases where this knowledge is missing. To enable FRS to handle these cases, Hu et al. propose neighborhood rough sets (NRS). A new type of fuzzy rough sets, NRS describes the fuzzy degree with the size of a neighborhood instead of with a membership degree. Like the membership function in classic fuzzy rough sets, a different size of neighborhood will lead to a change in the positive region. However, the size of the neighborhood can be optimized according to the distribution of the dataset instead of prior knowledge, mitigating the need to set a membership function and membership degree when processing continuous data [15]. Therefore, on numerical datasets lacking prior knowledge, NRS can out-perform classical fuzzy rough sets. To improve the efficiency of NRS, Hu et al. further propose forward attribute reduction based on neighborhood rough sets and fast search (FARNeMF), with attribute significance as heuristic information [13]. FARNeMF reduces attributes by selecting the condition attribute with the maximum attribute significance. As newly added attributes are only effective for distinguishing boundary samples, only samples in the original negative region and boundary region need to be evaluated to see if they will belong to the new positive region when the dependence degree of the decision attributes is calculated on the newly added condition attributes. This avoids repeated evaluations of the objects in the positive region and reduces the amount of calculation. To our knowledge, FARNeMF is the state-of-the-art NRS algorithm and is therefore selected for comparison in this paper.
Neighborhood rough sets have shown good performance in many applications [8, 32, 35]. However, NRS searches the neighbors of each object within a given unified range on a condition attribute set, and those objects whose neighbors have the same labels are determined to belong to the positive region of the current condition attribute set. This gives NRS a time complexity of O(N²). In addition, the unified range also needs to either be artificially set or optimized by grid searching. We turn to a new, lower time-complexity theory in this paper and combine it with NRS. Granular computing is a scalable, efficient, and robust method that is similar to the way a human brain thinks [42]. It uses simple, low-cost, satisfactory approximate solutions rather than exact solutions to achieve tractable, robust, and cheap intelligent systems that can describe the real world better [31]. In Science [28], Rodriguez pointed out that granular computing is an effective method for finding knowledge in big data, and it has been combined with various learning methods, such as rough sets [24], computing with words as proposed by Zadeh [44], and label noise detection [36]. We discuss the theory of granular computing further in the next section.
3 BACKGROUND MODELS AND DEFINITIONS
GBNRS relies heavily on two previous theories, granular ball computing and neighborhood rough sets, whose mathematical underpinnings we introduce here.
3.1 Neighborhood rough sets
We have introduced Pawlak rough sets, fuzzy rough sets,
and neighborhood rough sets somewhat loosely. We now
drill down into the details, defining the basic spaces we are
operating in, the neighborhoods we are working with in
neighborhood rough sets, and the positive region we have
mentioned, which is key to the operation of these methods.
Definition 1. Let $\Delta: \Omega \times \Omega \rightarrow \mathbb{R}$ be a function generated on a set $\Omega$. $\langle \Omega, \Delta \rangle$ is known as a metric space if $\Delta$ satisfies:
(1) $\Delta(x_1, x_2) \geq 0$, and $\Delta(x_1, x_2) = 0$ iff $x_1 = x_2$, $\forall x_1, x_2 \in \Omega$;
(2) $\Delta(x_1, x_2) = \Delta(x_2, x_1)$, $\forall x_1, x_2 \in \Omega$;
(3) $\Delta(x_1, x_3) \leq \Delta(x_1, x_2) + \Delta(x_2, x_3)$, $\forall x_1, x_2, x_3 \in \Omega$.
In this case, $\Delta$ is known as a metric.
Definition 2. Let the quadruple $\langle U, A, V, f \rangle$ represent an information system, where:
$U = \{x_1, x_2, ..., x_n\}$ denotes a non-empty finite set of objects; $U$ is called the universe;
$A = \{a_1, a_2, ..., a_m\}$ denotes a non-empty finite set of attributes;
$V = \bigcup_{a \in A} V_a$ denotes the set of all attribute values, where $V_a$ denotes the value range of attribute $a$;
$f: U \times A \rightarrow V$ denotes a mapping function: $\forall x_i \in U, \forall a \in A$, $f(x_i, a) \in V_a$.
This information system is called a decision system $\langle U, C, D \rangle$ if the set of attributes in the information system above satisfies $A = C \cup D$, $C \cap D = \emptyset$, and $D \neq \emptyset$, where $C$ is the condition attribute set and $D$ is the decision attribute set.
Definition 3. Let $U = \{x_1, x_2, ..., x_n\}$ be a non-empty finite set in real space. $\forall x_i \in U$, the $\delta$-neighborhood of $x_i$ is defined as:
$$\delta(x_i) = \{x \mid x \in U, \Delta(x, x_i) \leq \delta\}, \qquad (1)$$
where $\delta \geq 0$.
Definition 4. Let $\langle U, C, D \rangle$ be a neighborhood decision system. The decision attribute set $D$ divides $U$ into $L$ equivalence classes: $X_1, X_2, ..., X_L$. $\forall B \subseteq C$, the lower approximation and the upper approximation of the decision attribute set $D$ with respect to the condition attribute set $B$ are respectively defined as:
$$\underline{N_B}D = \bigcup_{i=1}^{L} \underline{N_B}X_i, \qquad (2)$$
$$\overline{N_B}D = \bigcup_{i=1}^{L} \overline{N_B}X_i, \qquad (3)$$
where $\underline{N_B}X_i = \{x_k \mid \delta_B(x_k) \subseteq X_i, x_k \in U\}$ and $\overline{N_B}X_i = \{x_k \mid \delta_B(x_k) \cap X_i \neq \emptyset, x_k \in U\}$. The positive region and boundary region are respectively defined as $POS_B(D) = \underline{N_B}D$ and $BN(D) = \overline{N_B}D - \underline{N_B}D$.
NRS needs to search the neighbors of each object within a given unified range on a condition attribute set, and those objects whose neighbors have the same labels as the queried object form the positive region of the condition attribute set. NRS therefore has a time complexity of O(N²), which is not efficient. In addition, the unified range either needs to be set artificially (as we do in our experiments to give this method the best possible running times in comparison to our own methods), which decreases its effectiveness, or optimized by grid searching (which we also present in our experiments to give this method the best possible accuracy, without considering the excessive time required to obtain it), which decreases its efficiency. We now present the theory of granular ball computing, an idea which allows us to adaptively generate the range for each object and decrease the time complexity of NRS to O(N).
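To make the cost concrete, the following is a minimal Python sketch (not the released GBNRS library code) of the fixed-radius positive region of Definition 4; the per-object distance scan over all N objects is what yields the O(N²) behavior. The function name and the NumPy-based representation are our own illustrative choices.

```python
import numpy as np

def nrs_positive_region(X, y, delta):
    """Indices of objects whose delta-neighborhood is label-pure (Def. 4).

    X: (n, m) numeric data on the current condition attribute set.
    y: (n,) integer class labels.  delta: fixed neighborhood radius.
    """
    n = X.shape[0]
    positive = []
    for i in range(n):
        # One O(n) distance pass per object -> O(n^2) in total.
        dist = np.linalg.norm(X - X[i], axis=1)
        neighbors = dist <= delta          # the delta-neighborhood of x_i, Eq. (1)
        if np.all(y[neighbors] == y[i]):
            positive.append(i)             # neighborhood is pure -> positive region
    return np.array(positive)
```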
3.2 Granular ball computing
In our own previous work [37], we introduced granular computing into existing classifier models. We generate granular balls (GBs) that cover a dataset, and propose granular ball computing (GBC) classifiers by replacing the input points with a completely symmetric structure, the granular ball. The overall label of a GB is defined as the label with the most appearances in the GB, that is, the majority label in the GB. The quality of a granular ball is measured by the purity of the ball, which is defined as the percentage of majority samples in the granular ball. We show the detailed mathematical definitions below [37]:
Definition 5. Let $U \subseteq R^N$ be a dataset and $U' \subseteq U$ ($U' \neq \emptyset$). We generate a granular ball $GB$ on $U'$ with $C$ as its center and $r$ as its radius, where $C$ denotes the center of gravity of all sample points in $U'$ and $r$ denotes the average distance from all points in $U'$ to $C$. Specifically, for the points $O_i \in U'$ ($i = 1, 2, ..., M$), where $M$ denotes the number of points in $U'$, we have:
$$C = \frac{1}{M} \sum_{i=1}^{M} O_i, \qquad r = \frac{1}{M} \sum_{i=1}^{M} \|O_i - C\|. \qquad (4)$$
Definition 6. Let $U \subseteq R^N$ be a dataset and $U' \subseteq U$ ($U' \neq \emptyset$). Let $GB$ be the granular ball generated on $U'$ with $C$ as its center and $r$ as its radius. The overall label of $GB$ is defined as the label which appears the most times in $GB$; that is, the majority label of $GB$.
As each granular ball covers many points but itself consists of only two pieces of data, the center and the radius, the dataset is greatly reduced. In addition, enough granular balls can fit any decision boundary, allowing granular ball classifiers to have good generalizability [37]. In the generation process of GBs, we iteratively implement 2-means clustering on each class until the purity of each granular ball reaches a given threshold. Figure 1 shows the fitting process of GBC with a purity threshold of 1. To provide visualizations of multi-class data, we assign the points in the upper right corner of the dataset fourclass the third label, according to its distribution characteristics. We observe from Fig. 1 that granular balls become more pure after partitioning; that is, the purity of each granular ball increases as the number of granular balls becomes large. As seen in Fig. 1(f), when the purity of the granular balls reaches 1, the decision curve of the positive and negative granular balls is very consistent with that in the original dataset. GBC can not only fit any decision boundary but also has a low time complexity of O(N) [37]. As seen in Fig. 1(e), because the radius of a granular ball is equal to the average distance of all objects in the granular ball from its center, some samples may not be covered by any granular ball. However, this does not affect the boundary consistency between the original dataset and the granular balls [37].
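For concreteness, below is a minimal Python sketch of this splitting loop, assuming non-negative integer labels and a purity threshold of 1; it illustrates the procedure described above and is not the released library code, and the helper names are our own.

```python
import numpy as np
from sklearn.cluster import KMeans

def purity(y):
    """Fraction of the majority label among labels y (Section 3.2)."""
    _, counts = np.unique(y, return_counts=True)
    return counts.max() / counts.sum()

def generate_granular_balls(X, y, threshold=1.0):
    """Split (X, y) by 2-means until every ball meets the purity threshold."""
    queue, balls = [(X, y)], []
    while queue:
        Xb, yb = queue.pop()
        if len(Xb) > 1 and purity(yb) < threshold:
            km = KMeans(n_clusters=2, n_init=1).fit(Xb)   # 2-means split
            if len(set(km.labels_)) == 2:                 # both children non-empty
                for k in (0, 1):
                    mask = km.labels_ == k
                    queue.append((Xb[mask], yb[mask]))
                continue
        center = Xb.mean(axis=0)                              # Eq. (4): centroid
        radius = np.linalg.norm(Xb - center, axis=1).mean()   # Eq. (4): mean distance
        label = int(np.bincount(yb).argmax())                 # majority label (Def. 6)
        balls.append((center, radius, label, purity(yb)))
    return balls
```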
Motivated by GBC and NRS, we propose granular ball neighborhood rough sets. We show that this procedure adaptively generates a different radius for each object and results in a much more efficient method.
Fig. 1. Visualizations of both granular ball generation and the generation
positive region using GBNRS on the dataset fourclass with the purity
degree threshold set at 1. (a)-(f) The generation results of six iterations.
(g) The final granular balls. (h) The generation positive region, which
coincides with the centers of the granular balls. The red points and red
granular balls are labeled +1, and the black points and black granular
balls are labeled -1.
4 GRANULAR BALL NEIGHBORHOOD ROUGH SETS
4.1 Theory and mathematical models
We have already shown the main processes of granular ball neighborhood rough sets in Figure 1. We use granular ball computing to adaptively generate many granular balls with different radii. The purity of the balls in the GBC step is set to 1, requiring the samples in each granular ball to all have the same label. The centers of the granular balls can then be used as the generation positive region for the following three reasons: 1) the centers are completely within the corresponding class and will not affect the decision boundary of the dataset; 2) the process of granular ball generation has a certain randomness since k-means is used in GBC, but the centers of the granular balls are relatively stably located inside of the corresponding class of objects and therefore serve to generate a stable positive region; 3) considering the center of a granular ball to belong to the positive region is completely consistent with the definition of positive regions in NRS [14, 15], that is, that an object's neighbors in an appropriate range all have the same labels. We see this exactly by considering the center of a granular ball as an object, the range as the granular ball's radius, and the neighbors of the center as the objects in the granular ball within the radius (or range); the neighbors then all have the same label as the center. Since the purity of each granular ball is 1, objects near the boundary and heterogeneous objects will not be covered in any granular ball and have no chance to contaminate the positive region. Thus, considering the centers of the granular balls to be positive regions is strictly consistent with the definition of positive regions in NRS theory.
The mathematical models of our granular ball neighborhood rough sets (GBNRS) are described in the following definitions.
Definition 7. Let $U = \{x_1, x_2, ..., x_n\}$ be a non-empty finite set, and let GBC generate granular balls that cover $U$. The $j$-th granular ball is $GB_j$, its center is $C_j$, and its radius is $r_j$. For $x_i \in GB_j$, we define the neighborhood of $x_i$ as:
$$\sigma(x_i) = \{x \mid \forall x \in GB_j, \mathrm{dis}(x, C_j) \leq r_j\}, \qquad (5)$$
where $\mathrm{dis}(x, C_j)$ is the distance from $x$ to $C_j$.
There is some randomness in the process of granular ball
generation due to the randomness of the initial centroid
selection in k-means. The objects near the boundary of a
granular ball belong to its positive region with a greater
degree of randomness than the objects near its center. We
therefore introduce the generation lower approximation in
Definitions 8 and 9 and the generation positive region in
Definition 10. The generation positive region is the core
indicator measuring the importance of attributes in GBNRS.
Definition 8. Let $\langle U, A, V, f \rangle$ be an information system. For $\forall x_i \in U$ and $\forall a \in A$, $f(x_i, a) \in V_a$. Let GBC generate granular balls that cover the entire sample set, the $j$-th granular ball being $GB_j$. For $P \subseteq A$ and $X \subseteq U$, the upper approximation set, lower approximation set, and generation lower approximation set of $X$ with respect to an attribute set $P$ are defined as:
$$\overline{P}X = \{x_i \in U, x_i \in GB_j(P) \mid \sigma(x_i) \cap X \neq \emptyset\}, \qquad (6)$$
$$\underline{P}X = \{x_i \in U, x_i \in GB_j(P) \mid \sigma(x_i) \subseteq X\}, \qquad (7)$$
$$\underline{P}X' = \left\{ x = \frac{1}{l_j} \sum_{i=1}^{l_j} x_i \;\middle|\; x_i \in GB_j(P), \sigma(x) \subseteq X \right\}, \qquad (8)$$
where $GB_j(P)$ denotes the $j$-th granular ball under the condition attribute set $P$, and $l_j$ represents the number of objects in the $j$-th granular ball.
Definition 9 follows from Definition 8 in the special case
of a decision system.
Definition 9. Let $\langle U, C, D \rangle$ be a decision system. $D$ divides $U$ into $L$ equivalence classes: $X_1, X_2, ..., X_L$. Let GBC generate granular balls that cover the entire sample set, the $j$-th granular ball being $GB_j$. $\forall B \subseteq C$, we define the upper approximation, lower approximation, and generation lower approximation of the decision attribute set $D$ with respect to the condition attribute set $B$ as follows:
$$\overline{B}D = \bigcup_{i=1}^{L} \overline{B}X_i, \qquad (9)$$
$$\underline{B}D = \bigcup_{i=1}^{L} \underline{B}X_i, \qquad (10)$$
$$\underline{B}D' = \bigcup_{i=1}^{L} \underline{B}X'_i, \qquad (11)$$
where $\overline{B}X_i = \{x_k \in U \mid x_k \in GB_j(B), \sigma(x_k) \cap X_i \neq \emptyset\}$, $\underline{B}X_i = \{x_k \in U \mid x_k \in GB_j(B), \sigma(x_k) \subseteq X_i\}$, and $\underline{B}X'_i = \left\{ x = \frac{1}{l_j} \sum_{k=1}^{l_j} x_k \;\middle|\; x_k \in GB_j(B), \sigma(x_k) \subseteq X_i \right\}$.
As Definition 9 describes, the lower approximation of D
consists of those objects in granular balls with purities of 1.
As a reminder, the purity of a granular ball is the percentage
of the majority sample in the granular ball. The generation
lower approximation consists of the centers of those granular
balls with purities of 1 in the lower approximations.
Definition 10. Let $\langle U, C, D \rangle$ be a decision system. Let GBC generate granular balls that cover the entire sample set. $\forall B \subseteq C$, the generation positive region is defined as
$$GPos_B(D) = \underline{B}D', \qquad (12)$$
where $\underline{B}D'$ represents the generation lower approximation of the decision attribute set $D$ with respect to the condition attribute set $B$.
The objects contained in a granular ball with a purity of 1 compose a positive region that allows for the adaptive generation of both granular balls and positive regions. However, the objects near the boundary of a granular ball in GBNRS are not stable due to the 2-means implemented in GBC. Therefore, the generation lower approximation in Definition 9 and the generation positive region in Definition 10 are proposed. These two definitions, in contrast to current NRS methods, imply that the generation lower approximation of GBNRS consists of the centers of granular balls, which probably do not exist in U. That is to say, the objects in the generation lower approximation are generated rather than selected from U.
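As a minimal illustration (reusing the hypothetical generate_granular_balls helper sketched in Section 3.2), the generation positive region can be read off directly from the purity-1 balls; note that its elements are generated centers, not members of U:

```python
def generation_positive_region(balls):
    """Centers (with labels) of purity-1 balls: GPos_B(D) of Definition 10."""
    return [(center, label)
            for center, radius, label, p in balls if p == 1.0]
```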
Definition 11. Let $\langle U, C, D \rangle$ be a decision system. For a condition attribute set $B \subseteq C$, $B$ is a relative reduction of $C$ if $B$ satisfies:
$$GPos_B(D) = GPos_C(D), \qquad (13)$$
$$GPos_B(D) \neq GPos_{B - \{a\}}(D), \quad \forall a \in B. \qquad (14)$$
As described in Definition 10, the generation positive
region is completely consistent with the traditional positive
region in NRS theory, because the neighbors of an object
belonging to the generation positive region all have the
same label as the object. These regions can be adaptively
and efficiently generated. Therefore, the generation positive
region can replace the positive region to stably determine
whether an attribute should be deleted. We describe this in
Definition 11.
Figure 2 shows the comparison between classical rough sets, NRS, and GBNRS.

Fig. 2. A comparison of three rough sets methods. (a) Classical rough sets in discrete space; (b) Neighborhood rough sets in real space; (c) Granular ball neighborhood rough sets in the same space. In (b)-(c) the samples marked with "·" are the first type of samples, and those marked with "+" are the second type of samples. In (c) the rectangles represent the objects in the generation positive regions.

In Fig. 2(a), the equivalence classes marked in red belong to $X$ completely and form the lower approximation of $X$. In Fig. 2(b), the samples in the circular neighborhood of sample $x_1$ are all from class 1; therefore, $x_1$ belongs to the lower approximation of class 1. The neighborhood samples of $x_3$ are from class 2, so $x_3$ belongs to the lower approximation of class 2. In contrast, in the neighborhood samples of $x_2$ we find samples belonging to class 1 and samples belonging to class 2; therefore $x_2$ is a boundary sample. In Fig. 2(c), each object in the generation positive region has a different and adaptive range in which to search its neighbors, which NRS does not have. The generation positive region consists of the centers of the granular balls. If the purity threshold is 1, the outlier objects not contained in any granular ball compose the boundary region; if the purity of a granular ball is less than 1, the objects contained within the ball also belong to the boundary region according to Definition 9.
It should be noted that the generation positive region
does not belong to the original set Uand is not added into
U. As seen in Definition 11, it is only used as a much more
efficient measurement to replace the traditional positive
region and judge if a condition attribute set is a relative
reduction.
4.2 Time complexity
A small number of granular balls can fit the decision boundary well, resulting in a GBC time complexity of O(N) [37]. We see that the production of the generation positive region in GBNRS is therefore much more efficient than obtaining the positive region in NRS. This can be intuitively explained as follows: NRS needs to search the neighbors of all objects. In contrast, GBNRS only needs to generate a small number of clusters using a fast algorithm (to wit, k-means), the centers of which will constitute the generation positive region. A granular ball inside its corresponding class is likely to contain more objects than those relatively close to the decision boundary. This results in a small number of granular balls; that is, the number of objects belonging to the generation positive region in GBNRS is much smaller than the number of objects belonging to the positive region in NRS. In spite of the small number of granular balls, GBNRS is more effective because the adaptively-sized granular balls can fit the decision boundary of datasets with various distributions well [37]. This efficiency will be demonstrated in our experimental section.
4.3 An improved method for computing granular balls
The method for granular ball computing we outlined does not use any label information in the process of implementing 2-means. For a given granular ball GB containing M classes of samples, we can improve GBC in two ways: first, 2-means is replaced with M-means, and second, a sample is selected from each class to form the M initial centroids. As a result, the shape of the initial clusters will be closer to the distribution of the different classes in the original data, and label information is fully incorporated into the k-means process. Consequently, the number of granular balls in this improved GBC method is much smaller than in our original GBC step. A smaller number of granular balls leads to higher efficiency and a larger average granular ball size. Larger granular balls are desirable because they generate more stable generation positive regions, which can result in higher classification accuracy. We call the NRS based on this improved GBC multiple granular ball neighborhood rough sets (MGBNRS). Figure 3 shows a visualization of MGBNRS using a BPNN on the same data from Figure 1. In subplots (a)-(c), the first three iterations show that the boundary described by the granular balls computed using this method corresponds much more closely to the original data than the granular balls in Figure 1. This improved GBC converged in the 8th iteration and generated 34 granular balls; in contrast, our original GBC converged in the 10th iteration and generated 43 granular balls. Clearly, our improved GBC is more efficient than the original GBC. A class-wise split is sketched below.

Fig. 3. Visualizations of both granular ball generation and the generation positive region using MGBNRS on the dataset fourclass with the purity degree threshold set at 1. (a)-(c) The first three iterations. (d)-(f) Middle iterations. (g) The final granular balls. (h) The generation positive region, which coincides with the centers of the granular balls found during improved GBC. The red points and red granular balls are labeled +1, and the black points and black granular balls are labeled -1.
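The following minimal Python sketch shows one such class-informed split; the function name is our own, and we assume numeric features and integer labels. One initial centroid is drawn from each class before running M-means:

```python
import numpy as np
from sklearn.cluster import KMeans

def mgb_split(Xb, yb, seed=0):
    """Split one ball by M-means, seeding one initial centroid per class."""
    rng = np.random.default_rng(seed)
    classes = np.unique(yb)
    # One sample from each class serves as an initial centroid, so the
    # initial clusters roughly follow the class distribution in the ball.
    init = np.vstack([Xb[rng.choice(np.flatnonzero(yb == c))] for c in classes])
    km = KMeans(n_clusters=len(classes), init=init, n_init=1).fit(Xb)
    return [(Xb[km.labels_ == k], yb[km.labels_ == k])
            for k in range(len(classes))]
```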
4.4 Algorithm design
The GBNRS algorithm can be divided into two parts: first, the generation positive region is obtained, and second, some attributes are reduced according to their importance as judged by the generation positive region. In the first stage, GBC is implemented with a purity threshold of 1, and the initial generation positive region consists of the centers of all granular balls. Then, k-means clustering with the centers initialized at the centers of the granular balls is used to globally fine-tune the centers and the granular balls. Any resulting granular balls with purities unequal to 1 are removed, and the generation positive region now consists of the centers of the remaining granular balls.

In the second stage, a partition of points in the generation positive region is produced after an attribute is removed. Similar in approach to one iteration of k-means clustering, each object is partitioned into its nearest cluster (that is, granular ball), the center of which belongs to the generation positive region from the first stage. This partition will generate new granular balls. If there is a granular ball with a purity less than 1, its center will not belong to the generation positive region. This indicates that the generation positive region is changed if the attribute is deleted, and therefore the attribute should be retained. Otherwise, if the purity of each granular ball is still 1, the generation positive region has not changed, so the attribute should be removed. Because attribute deletion may slightly affect the distribution of the dataset, k-means is used again to fine-tune the generation positive region. This algorithm does not optimize any extra parameters at all, rendering it completely adaptive. The two stages are repeated until no further attributes can be reduced. The specific algorithm is outlined in Algorithm 1. Clusters before and after an attribute is deleted often have great similarities, so the cluster centers of the k-means algorithm in each iteration of step 11 are initialized to the centers of the clusters before the attribute in question is deleted.
Algorithm 1: Granular Ball Neighborhood Rough Sets
Input: The dataset data; the condition attribute set C = (c1, c2, ..., cm);
Output: A reduced attribute set C′;
// Stage 1: Obtain the generation positive region on the current condition attribute set C′.
1: C′ is initialized to C;
2: Implement GBC on the data with the purity threshold at 1 on C′; remove any outlier objects that are not contained in any granular ball, as these outlier objects cannot possibly belong to the generation positive region;
3: k-means clustering is implemented on the centers of the resulting granular balls to globally fine-tune each granular ball;
4: The generation positive region now consists of the centers of the granular balls whose purity is 1 after the previous step; all other balls are discarded.
// Stage 2: Determine whether a condition attribute should be reduced by comparing the generation positive region before and after the attribute is removed.
5: Remove a condition attribute ci from C′;
6: Generate the partition based on the centers in step 4 by partitioning each object into the nearest granular ball (that is, a cluster);
7: Compute the purity of the new granular balls from step 6;
8: if the purity of each granular ball is equal to 1 then
9:   // This indicates that the generation positive region is unchanged and that the attribute should be removed.
10:  C′ = C′ − {ci};
11:  Re-run k-means clustering on the current centers of the granular balls to generate new granular balls and split them until the purity of each ball is 1;
12:  Go to step 4;
13: else
14:  ci should be retained;
15:  if all attributes in C′ have been checked then
16:    Terminate;
17:  else
18:    Remove a new attribute from C′ and go to step 6;
19:  end if
20: end if
Therefore, even though the k-means clustering algorithm is iteratively implemented in step 11, the algorithm converges quickly because the initial centers are close to the convergent solution.
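As an illustration of the Stage-2 test, here is a minimal Python sketch (our own simplification, omitting the k-means fine-tuning of steps 3 and 11): it drops one attribute, assigns every object to its nearest remaining center, and accepts the deletion only if every resulting cluster stays label-pure.

```python
import numpy as np

def attribute_is_removable(Xc, y, centers, a):
    """Stage-2 check of Algorithm 1 for one candidate attribute.

    Xc:      (n, m) data restricted to the current attribute set C'.
    centers: (g, m) purity-1 ball centers from Stage 1, same columns as Xc.
    a:       column index of the candidate attribute to delete.
    """
    Xr = np.delete(Xc, a, axis=1)          # drop the attribute from the data
    Cr = np.delete(centers, a, axis=1)     # ... and from the ball centers
    # One assignment pass: nearest center per object (step 6).
    dists = np.linalg.norm(Xr[:, None, :] - Cr[None, :, :], axis=2)
    nearest = dists.argmin(axis=1)
    # The attribute may be removed iff every new ball is still pure (steps 7-8).
    return all(len(np.unique(y[nearest == j])) <= 1
               for j in np.unique(nearest))
```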
This is the process for GBNRS. Our second algorithm, MGBNRS, differs only in the ball creation process. GBNRS uses 2-means clustering to iteratively partition each granular ball into two balls until the purity of each granular ball is 1. In contrast, MGBNRS uses k-means clustering to partition a granular ball, where k is equal to the number of classes in the granular ball. Additionally, the initial centroids of the k-means algorithm are selected from different classes.
5 EXPERIMENTS
In this section, we demonstrate the effectiveness and efficiency of GBNRS on widely-used benchmark datasets in comparison with classical NRS and the current state-of-the-art NRS algorithm, Forward Attribute Reduction Based on Neighborhood Rough Sets and Fast Search (FARNeMF) [13]. FARNeMF is an accelerated NRS method that finds a new positive region only in the boundary region during the process of forward attribute reduction. To our knowledge, it is currently the most efficient and effective NRS method. We design our experiments along the lines of Tsang [30] and Chen [5]: as the quality of the reduced attribute set is not related to the testing classifier used, we use only a common testing classifier, the nearest neighbor algorithm, to verify the quality of the reduced attribute set. Ten-fold cross-validation is used, and the classification accuracies with variances are presented; a sketch of this protocol follows the table below. We select a suite of benchmark datasets widely used in feature selection [13, 14], the UCI benchmark datasets (http://archive.ics.uci.edu/ml/datasets.html). These are described in detail in Table 1; we have extracted all usable attributes from the datasets. For example, in the Horse dataset, there are 27 total attributes; of these, 23 are conditional attributes (categorized in Table 1) and 1 is a decision attribute. The remaining 3 attributes are descriptions of lesion types which cannot be used as conditional attributes.

TABLE 1
The Information of the Datasets

No.  Dataset        Samples  Numerical Cond. Attrs.  Categorical Cond. Attrs.  Classes
1    anneal           798       6                      32                        5
2    credit           690       6                       9                        2
3    german          1000       7                      12                        2
4    heart1           270       7                       6                        2
5    heart2           303       6                       7                        5
6    hepatitis        155       6                      13                        2
7    horse            368       7                      16                        2
8    iono             351      34                       0                        2
9    wdbc             569      30                       0                        2
10   wine             178      13                       0                        3
11   lymphography     148       0                      18                        4
12   zoo              101       0                      16                        7
13   abalone         4177       7                       1                       29
14   electrical     10000      13                       0                        2
15   htru2          17898       8                       0                        2
16   mushroom        8124       0                      22                        2
17   letter         20000      16                       0                       26
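Below is a minimal sketch of this evaluation protocol, assuming scikit-learn is available; the helper name and the reduced column-index list are our own illustrative choices:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

def evaluate_reduction(X, y, reduced):
    """Score a reduced attribute set with 1-NN and 10-fold cross-validation."""
    scores = cross_val_score(KNeighborsClassifier(n_neighbors=1),
                             X[:, reduced], y, cv=10)
    # Mean accuracy and its spread, as reported (mean +/- variance) in Table 2.
    return scores.mean(), scores.std()
```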
5.1 Effectiveness
Our experimental results are shown in Tables 2 and 3 and in supplementary Tables 1 to 11. In classical NRS and FARNeMF, in order to find a good solution, the neighborhood radius is increased from 0.01 to 0.5 in intervals of 0.01. The results are shown in Table 3 and in supplementary Tables 1 to 11. As GBNRS inherently displays some instability due to the random initialization of the centroids of the granular balls during k-means clustering, both GBNRS and MGBNRS are implemented ten times, and the best solution is selected. In the tables, NA denotes the number of reduced attributes, and CA denotes the classification accuracy. RN denotes the neighborhood radius in NRS and FARNeMF.

TABLE 2
Classification Accuracy of Different NRS Methods

Data  original      NRS           FARNeMF       GBNRS         MGBNRS
1     0.954±0.032   0.944±0.035   0.944±0.035   0.971±0.023   0.975±0.022
2     0.871±0.042   0.865±0.042   0.865±0.042   0.867±0.028   0.873±0.033
3     0.709±0.036   0.737±0.048   0.737±0.048   0.734±0.023   0.731±0376
4     0.819±0.051   0.819±0.062   0.819±0.062   0.809±0.063   0.801±0.103
5     0.551±0.069   0.598±0.049   0.598±0.049   0.591±0.058   0.613±0.050
6     0.800±0.112   0.846±0.063   0.846±0.063   0.851±0.077   0.876±0.090
7     0.771±0.081   0.796±0.082   0.796±0.082   0.825±0.051   0.834±0.042
8     0.848±0.069   0.871±0.044   0.871±0.044   0.883±0.030   0.883±0.030
9     0.967±0.025   0.972±0.023   0.972±0.023   0.972±0.019   0.974±0.020
10    0.943±0.060   0.955±0.036   0.955±0.036   0.966±0.029   0.978±0.126
11    0.806±0.137   0.814±0.096   0.814±0.096   0.861±0.088   0.863±0.085
12    0.941±0.067   0.914±0.102   0.914±0.102   0.959±0.054   0.953±0.092
13    0.525±0.050   0.529±0.034   0.529±0.034   0.527±0.057   0.527±0.057
14    0.914±0.006   0.914±0.006   0.914±0.006   0.916±0.009   0.916±0.009
15    0.977±0.004   0.914±0.046   0.914±0.046   0.978±0.004   0.978±0.004
16    0.950±0.109   0.961±0.034   0.961±0.034   0.945±0.118   0.983±0.055
17    0.979±0.004   0.979±0.004   0.979±0.004   0.979±0.004   0.979±0.004
It can be seen from Table 3 that when the neighborhood radius is small, the number of remaining attributes is small, and the corresponding classification accuracy is also low. As the neighborhood radius increases, both the number of attributes and the corresponding classification accuracy gradually increase. However, when the neighborhood radius exceeds a certain value, the number of attributes and the corresponding classification accuracy gradually decrease again. These results are explained as follows. When the neighborhood radius is very small, the neighborhood of a sample point contains very few other sample points. Therefore most sample points trivially belong to the positive region, so many attributes are judged redundant and removed, resulting in a low classification accuracy. As the neighborhood radius increases, whether a sample belongs to the positive region is more affected by the points near it; the number of removed attributes gradually decreases, and the classification accuracy gradually increases. When the neighborhood radius is large enough, the neighborhood of each sample point contains many other sample points that belong to different categories, resulting in many samples being classified into the boundary region. Therefore, both the number of remaining attributes and the corresponding classification accuracy are small. Although the overall trend shows this pattern, as demonstrated in the tables, the optimal solution corresponding to the highest classification accuracy cannot be found exactly. Therefore, following previous work with NRS [15], we search for the optimal solution by increasing the neighborhood radius from 0.01 to 0.5 in intervals of 0.01 (see the sketch below), and present the experimental results in Table 2.
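A minimal sketch of this grid search, assuming a hypothetical reduce_with_radius helper that runs NRS/FARNeMF attribute reduction at one fixed radius, and the evaluate_reduction scorer sketched earlier:

```python
import numpy as np

def grid_search_radius(X, y, reduce_with_radius):
    """Try every radius in [0.01, 0.5] (step 0.01) and keep the best one."""
    best_delta, best_acc, best_reduced = None, -1.0, None
    for delta in np.arange(0.01, 0.5 + 1e-9, 0.01):
        reduced = reduce_with_radius(X, y, delta)     # hypothetical reducer
        acc, _ = evaluate_reduction(X, y, reduced)    # 1-NN, 10-fold CV
        if acc > best_acc:
            best_delta, best_acc, best_reduced = delta, acc, reduced
    return best_delta, best_acc, best_reduced
```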
As redundant attributes are deleted, the representative ability of the remaining attributes is improved. As shown in Table 2, almost all of the highest classification accuracies (marked in bold) appear in the NRS, FARNeMF, GBNRS, and MGBNRS columns; that is, these methods all produce an attribute set on which the test classifier can achieve a higher accuracy than on the original dataset, because the removal of redundant features improves generalizability.

TABLE 3
Attribute Reduction on the Dataset WDBC

RN         Reduction Results                                                                          NA  CA
0.01       [0, 1, 8, 27]                                                                               4  0.9493±0.0368
0.02       [14, 21, 27, 28, 29]                                                                        5  0.9226±0.0416
0.03       [1, 4, 5, 9, 14, 27, 28]                                                                    7  0.9332±0.0284
0.04       [0, 7, 8, 9, 11, 18, 21, 24, 27]                                                            9  0.9616±0.0278
0.05       [1, 4, 7, 8, 9, 11, 15, 18, 21, 22, 24, 29]                                                12  0.9617±0.0379
0.06       [0, 1, 4, 5, 6, 8, 9, 11, 18, 20, 21, 24, 25, 27]                                          14  0.9650±0.0282
0.07       [0, 1, 4, 5, 6, 8, 9, 10, 11, 15, 18, 20, 21, 24, 25, 26, 27, 28, 29]                      19  0.9668±0.0275
0.08       [0, 1, 4, 5, 6, 8, 9, 10, 11, 14, 15, 18, 20, 21, 24, 26, 27, 28, 29]                      19  0.9650±0.0215
0.09       [0, 1, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 15, 18, 19, 20, 21, 24, 25, 26, 27, 28, 29]       23  0.9684±0.0213
0.1        [1, 2, 4, 5, 6, 7, 8, 9, 10, 11, 14, 15, 17, 18, 19, 20, 21, 23, 24, 25, 26, 27, 28, 29]   24  0.9702±0.0232
0.11       [0, 1, 4, 5, 6, 8, 9, 10, 11, 14, 15, 17, 18, 19, 20, 21, 23, 24, 25, 26, 27, 28, 29]      23  0.9702±0.0216
0.12       [0, 1, 3, 4, 5, 6, 7, 8, 9, 11, 12, 14, 15, 17, 18, 20, 21, 23, 24, 25, 26, 27, 28, 29]    24  0.9720±0.0248
0.13       [0, 1, 2, 3, 4, 5, 7, 8, 9, 11, 12, 14, 15, 17, 18, 20, 21, 24, 25, 26, 27, 28, 29]        23  0.9685±0.0228
0.14       [0, 1, 3, 4, 5, 6, 7, 8, 9, 11, 12, 14, 15, 17, 18, 20, 21, 24, 25, 26, 27, 28, 29]        23  0.9703±0.0231
0.15       [1, 3, 4, 5, 6, 7, 8, 9, 11, 12, 14, 15, 17, 18, 20, 21, 24, 25, 26, 27, 28, 29]           22  0.9667±0.0190
0.16       [0, 1, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 15, 16, 17, 18, 21, 22, 23, 24, 25, 26, 27, 28, 29]  25  0.9719±0.0234
0.17–0.18  [0, 1, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 15, 16, 17, 18, 20, 21, 24, 25, 26, 27, 28, 29]   24  0.9702±0.0216
0.19       [0, 1, 5, 7, 8, 9, 11, 12, 14, 15, 16, 17, 18, 21, 23, 24, 25, 26, 27, 28, 29]             21  0.9615±0.0254
0.2        [0, 1, 4, 5, 6, 7, 8, 9, 10, 11, 14, 15, 16, 17, 18, 19, 21, 23, 24, 25, 26, 27, 28, 29]   24  0.9703±0.0199
0.21       [0, 1, 5, 6, 8, 9, 10, 11, 14, 15, 16, 17, 18, 20, 21, 23, 24, 25, 26, 27, 28, 29]         22  0.9668±0.0250
0.22       [0, 1, 5, 6, 8, 9, 10, 11, 14, 15, 16, 17, 18, 20, 21, 23, 24, 25, 26, 29]                 20  0.9632±0.0263
0.23       [0, 1, 5, 6, 7, 8, 9, 10, 11, 14, 16, 17, 18, 21, 23, 24, 25, 26, 28, 29]                  20  0.9633±0.0319
0.24       [0, 1, 5, 6, 7, 8, 9, 10, 11, 14, 17, 18, 21, 23, 24, 25, 26, 28, 29]                      19  0.9633±0.0319
0.25       [0, 1, 5, 6, 7, 8, 9, 10, 11, 14, 15, 16, 18, 21, 23, 24, 25, 26, 28, 29]                  20  0.9668±0.0299
0.26       [0, 1, 5, 6, 8, 9, 10, 11, 14, 15, 16, 17, 18, 21, 22, 23, 24, 25, 26, 29]                 20  0.9615±0.0254
0.27       [0, 1, 4, 5, 7, 8, 9, 10, 11, 12, 14, 15, 16, 17, 18, 19, 21, 24, 25, 26, 28, 29]          22  0.9633±0.0298
0.28       [0, 1, 4, 8, 9, 10, 11, 14, 16, 17, 18, 19, 26, 29]                                        14  0.9580±0.0274
0.29       [0, 1, 4, 8, 9, 10, 11, 14, 16, 18, 19, 27, 28, 29]                                        14  0.9579±0.0248
0.3        [0, 1, 4, 8, 9, 10, 11, 14, 16, 18, 19, 27, 29]                                            13  0.9527±0.0317
0.31       [0, 1, 4, 8, 9, 10, 11, 12, 14, 16, 18, 19, 24, 25, 27, 29]                                16  0.9510±0.0290
0.32       [11, 16]                                                                                    2  0.6786±0.0421
0.33       [0, 4, 9, 10, 14, 16, 17, 18, 29]                                                           9  0.9385±0.0343
0.34       [4, 9, 13, 14, 16, 17, 18, 29]                                                              8  0.9070±0.0383
0.35–0.37  [14, 16, 17]                                                                                3  0.7418±0.0900
0.38–0.5   []                                                                                          0  0

Although we search for the optimal solution in NRS and FARNeMF by increasing the neighborhood radius from 0.01 to 0.5 in intervals of 0.01, the interval 0.01 may not be small enough, and the neighborhood radius found by our grid search may not be truly optimal. In contrast, the neighborhood radii in GBNRS and MGBNRS are generated adaptively according to the granular balls' cohesion with the decision boundary of the original dataset, allowing GBNRS and MGBNRS to achieve higher average classification accuracy than classical NRS and FARNeMF. Because the granular ball centers in MGBNRS are more stable than the granular ball centers in GBNRS, MGBNRS can achieve higher classification accuracy than GBNRS.

As shown in Table 2, GBNRS and MGBNRS did not give a higher accuracy than the other methods on three datasets. In both methods, some granular balls are quite small and their centers are not stable enough to be considered part of the generation positive region. In addition, although the generation of granular balls in MGBNRS more fully utilizes label information in the dataset than the generation of granular balls in GBNRS, both have an element of randomness due to the k-means selection process, though generation positive regions in MGBNRS have a higher probability of being stable than those in GBNRS. This is a known issue and something we plan to address in future work.

Obviously, grid searching for the optimal neighborhood radius considerably deteriorates the performance of NRS. Our experiments in the next section demonstrate that GBNRS and MGBNRS are much more efficient than NRS even when grid searching is not used.
5.2 Efficiency
We demonstrate the efficiency of GBNRS and MGBNRS in comparison with NRS and FARNeMF on six large benchmark datasets selected from the previous benchmarks. We randomly select ten percent of each dataset to begin with and implement the experiment on this ten percent of the data; we then gradually increase the size of our working dataset by feeding in the remaining data in increments of ten percent. Because NRS and FARNeMF are very slow for large-scale datasets, the neighborhood radius in this experiment is fixed at 0.02 and no grid searching is performed. Even though the time spent in grid searching for the neighborhood radius is not counted for either NRS or FARNeMF, we can see from Figure 4 that the efficiency advantage of GBNRS and MGBNRS is considerable. Except in a few cases on smaller working datasets (that is, 40% or less of the original dataset) in Fig. 4(a) and Fig. 4(b), the GBNRS execution time curve is below those of NRS and FARNeMF. The FARNeMF curve is always below that of NRS because some points in the positive region in FARNeMF do not affect the computation. When the size of the dataset increases, the execution time of GBNRS is always below that of both NRS and FARNeMF. For example, on the largest experimental dataset, letter in Fig. 4(f), NRS and FARNeMF take more than 6000 and 1000 seconds respectively; in contrast, GBNRS takes only 80 seconds, a more than 90% improvement. Moreover, on all datasets, the completion time curve for MGBNRS is the absolute lowest of all methods, including on those datasets on which GBNRS was out-performed at low sample sizes by the current state-of-the-art FARNeMF method. This indicates that MGBNRS is more efficient than GBNRS; this is because label information is more fully involved in the improved GBC in MGBNRS, and therefore the number of generation granular balls in MGBNRS is much smaller than in GBNRS. We can also see the at-least-O(N²) growth of NRS and FARNeMF reflected in these graphs; if grid searching time were included instead of a fixed radius, this time would only increase. In contrast, GBNRS and MGBNRS both reflect their actual O(N) running times, as seen in these graphs. An unexpected result of the experiment in Fig. 4(a) and Fig. 4(b) is that increasing the size of the small working dataset sometimes led to a decrease in the running time. This is caused by a small amount of randomness in both the compared algorithms and the running processes of the computer.

Fig. 4. Efficiency comparison. (a) On the dataset german. (b) On the dataset abalone. (c) On the dataset mushroom. (d) On the dataset electrical. (e) On the dataset htru2. (f) On the dataset letter.
6 CONCLUSION
We propose GBNRS, a novel rough set method. This is the first parameter-free rough set algorithm for processing continuous data; it requires no membership function and no optimization of any mid-computation parameters for processing continuous data. We demonstrably out-perform the current state-of-the-art NRS algorithm with a time complexity of O(N). Our adaptive method of selecting the neighborhood radius improves the quality of attribute reduction. On benchmark datasets widely used for feature selection, GBNRS obtains higher classification accuracy than both classical neighborhood rough sets and the current best NRS algorithm, FARNeMF. We show that efficiency is improved by more than 90% on the relatively large benchmark dataset letter. We also improve granular ball computing and propose MGBNRS to achieve even higher efficiency than GBNRS.

In our future work we would like to further improve the stability of GBNRS and MGBNRS. Our method for computing the generation positive region already improves the stability of GBNRS, but there is still some inherent instability in the process of partitioning the granular balls during attribute reduction, due to the random initialization of the centers of the granular balls in each new iteration. We also seek to improve the accuracy of GBNRS and MGBNRS: we were unable to achieve the highest accuracy on 3 of our 20 test datasets. This could either be because some granular balls are so small that their centers are too unstable to be considered part of the generation positive region (which we want to remedy in itself), or because the generation of granular balls in both GBNRS and MGBNRS has some inherent randomness because k-means is used, even though label information is more fully used in MGBNRS than in GBNRS. Having already reduced execution time to O(N), our next target is accuracy, which is currently held back by this randomness and instability. These stability issues are of great interest to us and provide strong motivation for future work.
7 ACKNOWLEDGMENTS
The authors greatly thank the handling associate editor and all anonymous reviewers for their valuable comments. This work was supported in part by the National Natural Science Foundation of China under Grant Nos. 61806030, 61876027, 61876201, and 61772096, the National Key Research and Development Program of China (2016QY01W0200), the Natural Science Foundation of Chongqing (cstc2019jcyj-msxmX0485), and NICE: NRT for Integrated Computational Entomology, US NSF award 1631776.
REFERENCES
[1] Aggarwal, Manish. "Probabilistic Variable Precision Fuzzy Rough Sets." IEEE Transactions on Fuzzy Systems (2015): 1-1.
[2] Aggarwal, Manish. "Rough Information Set and Its Applications in Decision Making." IEEE Transactions on Fuzzy Systems 25.2 (2017): 265-276.
[3] Albanese, Alessia , S. K. Pal , and A. Petrosino . ”Rough Sets, Kernel
Set, and Spatiotemporal Outlier Detection.” IEEE Transactions on
Knowledge and Data Engineering 26.1(2014):194-207.
[4] An, Shuang , et al. ”Data-Distribution-Aware Fuzzy Rough Set
Model and its Application to Robust Classification.” IEEE Trans
Cybern 46.12(2015):3073-3085.
[5] Chen, Degang , and Y. Yang . ”Attribute Reduction for Het-
erogeneous Data Based on the Combination of Classical and
Fuzzy Rough Set Models.” IEEE Transactions on Fuzzy Systems
22.5(2014):1325-1334.
[6] Chen, Hongmei , et al. ”A decision-theoretic rough set approach
for dynamic data mining.” IEEE Transactions on Fuzzy Systems
23.6(2015):1-1.
[7] Chen, Hongmei , et al. ”A Rough Set-Based Method for Up-
dating Decision Rules on Attribute Values’ Coarsening and Re-
fining.” Knowledge & Data Engineering IEEE Transactions on
26.12(2014):2886-2899.
[8] Chen, Yumin , et al. ”Measures of uncertainty for neighborhood
rough sets.” Knowledge-Based Systems 120(2017):226-235.
[9] Dai, Jianhua , et al. ”Attribute Selection for Partially Labeled
Categorical Data By Rough Set Approach.” IEEE Transactions on
Cybernetics (2016):1-12.
[10] Dai, Jianhua , et al. ”Maximal Discernibility Pairs based Approach
to Attribute Reduction in Fuzzy Rough Sets.” IEEE Transactions on
Fuzzy Systems (2017):1-1.
[11] Dai, Jianhua, et al. ”Neighbor inconsistent pair selection for at-
tribute reduction by rough set approach.” IEEE Transactions on
Fuzzy Systems 26.2 (2018): 937-950.
[12] Hoa, Nguyen Sinh, and Nguyen Hung Son. ”Some efficient algo-
rithms for rough set methods.” Proceedings IPMU. Vol. 96. 1996.
[13] Hu, Qinghua , et al. ”Efficient Symbolic and Numerical At-
tribute Reduction with Neighborhood Rough Sets.” PR & AI
21.6(2008):730-738.
[14] Hu, Qinghua , et al. ”Large-Scale Multi-Modality Attribute Re-
duction with Multi-Kernel Fuzzy Rough Sets.” IEEE Transactions
on Fuzzy Systems (2017):1-1.
[15] Hu, Qinghua , et al. ”Numerical Attribute Reduction Based on
Neighborhood Granulation and Rough Approximation.” Journal of
Software 19.3(2008):640-649.
[16] Jensen, Richard , and Q. Shen . ”Fuzzyrough attribute reduction
with application to web categorization.” Fuzzy Sets and Systems
141.3(2004):469-485.
[17] Juang, Chia Feng , and C. T. Lin . ”An online self-constructing
neural fuzzy inference network and its applications.” IEEE Trans-
actions on Fuzzy Systems 6.1(1998):12-32.
[18] Liang, Decui , Z. Xu , and D. Liu . ”A New Aggregation Method-
based Error Analysis for Decision-theoretic Rough Sets and Its
Application in Hesitant Fuzzy Information Systems.” IEEE Trans-
actions on Fuzzy Systems 25.6(2017):1685-1697.
[19] Liang, Jiye , et al. ”A Group Incremental Approach to Feature
Selection Applying Rough Set Technique.” IEEE Transactions on
Knowledge and Data Engineering 26.2(2014):294-308.
[20] Liu, Xiaodong , et al. ”The Development of Fuzzy Rough Sets with
the Use of Structures and Algebras of Axiomatic Fuzzy Sets.” IEEE
Transactions on Knowledge and Data Engineering 21.3(2009):443-
462.
[21] Maji, Pradipta. ”A rough hypercuboid approach for feature selec-
tion in approximation spaces.” IEEE Transactions on Knowledge
and Data Engineering 26.1 (2014): 16-29.
[22] Maji, Pradipta , and P. Garai . ”IT2 Fuzzy-Rough Sets and Max
Relevance-Max Significance Criterion for Attribute Selection.” IEEE
Transactions on Cybernetics 45.8(2014):1657-1668.
[23] Miao, Duoqian , et al. ”A Heuristic Algorithm for Reduction
of Knowledge.” Journal of Computer Research & Development
36.6(1999):681-684.
[24] Pawlak, Zdzisław. ”Rough sets.” International Journal of Comput-
er & Information Sciences 11.5(1982):341-356.
[25] Pawlak Z, Skowron A. Rudiments of rough sets[J]. Information
Sciences, 2006, 177(1):3-27.
[26] Qian, Yuhua , et al. ”Positive approximation: An accelerator for
attribute reduction in rough set theory.” Artificial Intelligence 174.9-
10(2010):597-618.
[27] Rehman, Noor, et al. ”SDMGRS: Soft dominance based multi
granulation rough sets and their applications in conflict analysis
problems.” IEEE Access 6 (2018): 31399-31416.
[28] Rodriguez, Alex, and Alessandro Laio. ”Clustering by fast search
and find of density peaks.” Science 344.6191 (2014): 1492-1496.
[29] Skowron, A. , and C. Rauszer . ”The Discernibility Matrices and
Functions in Information Systems.” (1992).
[30] Tsang, Eric CC, et al. ”Attributes reduction using fuzzy rough
sets.” IEEE Transactions on Fuzzy systems 16.5 (2008): 1130-1141.
[31] University, Tsinghua, et al. ”Theory of Fuzzy Quotient Space
(Methods of Fuzzy Granular Computing).” Journal of Software
14.4(2003):770-776.
[32] Wang, Changzhong, et al. ”Attribute reduction based on k-nearest
neighborhood rough sets.” International Journal of Approximate
Reasoning 106 (2019): 18-31.
[33] Wang, Changzhong , et al. ”A Fitting Model for Feature Selection
With Fuzzy Rough Sets.” IEEE Transactions on Fuzzy Systems
25.4(2017):741-753.
[34] Wang, Jue , and J. Wang . ”Reduction algorithms based on discerni-
bility matrix: The ordered attributes method.” Journal of Computer
Science and Technology 16.6(2001):489-504.
[35] Wang, Qi, et al. ”Local neighborhood rough set.” Knowledge-
Based Systems 153 (2018): 53-64.
[36] Xia, Shuyin, et al. ”Complete Random Forest based Class Noise
Filtering Learning for Improving the Generalizability of Classifier-
s.” IEEE Transactions on Knowledge and Data Engineering (2018).
[37] Xia, Shuyin, et al. ”Granular ball computing classifiers for efficient,
scalable and robust learning.” Information Sciences 483 (2019): 136-
152.
[38] Xu, Feifei , et al. ”Mutual Information-Based Algorithm for Fuzzy-
Rough Attribute Reduction.” Journal of Electronics & Information
Technology 30.6(2008):1372-1375.
[39] Yang, and Yanyan. ”Incremental perspective for feature selection
based on fuzzy rough sets.” IEEE Transactions on Fuzzy Systems
(2017):1-1.
[40] Yang, Yanyan , D. Chen , and H. Wang . ”Active Sample Selection
Based Incremental Algorithm for Attribute Reduction With Rough
Sets.” IEEE Transactions on Fuzzy Systems 25.4(2017):825-838.
[41] Yao, Jing Tao , and N. Azam . ”Web-Based Medical Decision
Support Systems for Three-Way Medical Decision Making With
Game-Theoretic Rough Sets.” IEEE Transactions on Fuzzy Systems
23.1(2015):3-15.
[42] Yao, Yiyu. ”Granular computing for data mining.” Defense &
Security Symposium 2006.
[43] Yao, Yiyu , and Y. Zhao . ”Discernibility matrix simplifica-
tion for constructing attribute reducts.” Information Sciences
179.7(2009):867-882.
[44] ZADEH, and A. L. . ”Toward a theory of fuzzy information
granulation and its centrality in human reasoning and fuzzy logic.”
Fuzzy Sets & Systems 90.90(1997):111-127.
[45] Zhao, Suyun , et al. ”A Novel Approach to Building a Robust
Fuzzy Rough Classifier.” IEEE Transactions on Fuzzy Systems
23.4(2015):769-786.
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING 12
Shuyin Xia received his B.S. degree in 2008
and his M.S. degree in 2012, both in computer
science and both from the Chongqing University
of Technology in China. He received his Ph.D.
degree from the College of Computer Science
at Chongqing University in China. He is an as-
sociate professor and a Ph.D. supervisor at the
Chongqing University of Posts and Telecommu-
nications in Chongqing, China, a position he has
held since 2015. His research interests include
data mining, granular computing, fuzzy rough
sets, classifiers, and label noise detection. His research has been
published in many prestigious journals and conferences, such as NIPS,
IEEE-TKDE, and IS. He is the current executive deputy director of the
Chongqing Municipal Public Security Bureau Qihoo 360 Big Data and
Network Security Joint Lab. He is also the director of the Chongqing
Artificial Intelligence Association and an IEEE Member.
Hao Zhang received his B.S. degree in internet
of things engineering from the Chongqing Uni-
versity of Science and Technology in Chongqing,
China, in 2014. He is currently pursuing an M.S.
degree in computer technology at the Chongqing
University of Posts and Telecommunications in
Chongqing, China. His research interests in-
clude rough sets, granular computing, and data
mining.
Wenhua Li received her B.S. degree in in-
formation and computation science from the
Chongqing University of Posts and Telecommu-
nications in Chongqing, China, in 2014. She is
currently pursuing an M.S. degree in computer
science and technology at the Chongqing U-
niversity of Posts and Telecommunications in
Chongqing, China. Her research interests in-
clude rough sets, granular computing, and data
mining.
Guoyin Wang received a B.E. degree in com-
puter software in 1992, an M.S. degree in com-
puter software in 1994, and a Ph.D. degree in
computer organization and architecture in 1996,
all from Xi’an Jiaotong University in Xi’an, Chi-
na. His research interests include data mining,
machine learning, rough sets, granular comput-
ing, cognitive computing, and so forth. He has
worked at the University of North Texas, USA,
and the University of Regina, Canada, as a Vis-
iting Scholar. Since 1996, he has been work-
ing at the Chongqing University of Posts and Telecommunications in
Chongqing, China, where he is currently a Professor and a Ph.D. super-
visor, the Director of the Chongqing Key Laboratory of Computational
Intelligence, and the Dean of the Graduate School. He is the Steering
Committee Chair of the International Rough Set Society (IRSS), a Vice-
President of the Chinese Association for Artificial Intelligence (CAAI),
and a council member of the China Computer Federation (CCF).
Elisabeth Giem received two bachelor’s de-
grees, one in pure mathematics and one in
music (concentration in performance) from the
University of California, Riverside (UCR). She
received a master’s degree in computational and
applied mathematics from Rice University, and a
master’s degree in pure mathematics from UCR.
She joined Zizhong Chen’s SuperLab in 2018
as a computer science Ph.D. student, and has
been awarded the NSF NRT in Computational
Entomology Fellowship. Her research interests
include but are not limited to high-performance computing, parallel and
distributed systems, big data analytics, computational entomology, and
numerical linear algebra algorithms and software.
Zizhong Chen received his bachelor’s degree
in mathematics from Beijing Normal Universi-
ty, master’s degree in economics from Renmin
University of China, and Ph.D. degree in com-
puter science from the University of Tennessee,
Knoxville. He is a professor of computer sci-
ence at the University of California, Riverside.
His research interests include high performance
computing, parallel and distributed systems, big
data analytics, cluster and cloud computing,
algorithm-based fault tolerance, power and en-
ergy efficient computing, numerical algorithms and software, and large-
scale computer simulations. His research has been supported by the
US National Science Foundation, US Department of Energy, Nvidia,
and Microsoft Corporation. He received a CAREER Award from the US
National Science Foundation and Best Paper Awards from the Interna-
tional Supercomputing Conference and IEEE International Conference
on Cluster Computing. He is a senior member of the IEEE and a life
member of the ACM. He currently serves as a subject area editor for
Elsevier Parallel Computing journal and an associate editor for IEEE
Transactions on Parallel and Distributed Systems.