GBNRS: A Novel Rough Set Algorithm for Fast Adaptive Attribute Reduction in Classification

GBNRS: A Novel Rough Set Algorithm for Fast
Adaptive Attribute Reduction in Classification
Shuyin Xia, Hao Zhang, Wenhua Li, Guoyin Wang, Elisabeth Giem, Zizhong Chen
Abstract—Feature reduction is an important aspect of Big Data analytics on today’s ever-larger datasets. Rough sets are a classical
method widely applied to attribute reduction. Most rough sets algorithms process continuous attributes by using a membership
function, which is set using a priori knowledge of the dataset. Neighborhood rough sets (NRS) replaces the membership function with
the concept of neighborhoods, allowing NRS to handle scenarios where no a priori knowledge is available. However, the
neighborhood radius of each object in NRS is fixed, and optimization of the radius depends on grid searching. This diminishes both the
efficiency and effectiveness, leading to a time complexity of not lower than O(N2). To resolve these limitations, we propose granular
ball neighborhood rough sets (GBNRS), a novel NRS method with time complexity O(N). GBNRS adaptively generates a different
neighborhood for each object, resulting in greater generality and flexibility in comparison to standard NRS methods. We compare
GBNRS with the current state-of-the-art NRS method, FARNeMF, and find that GBNRS obtains both higher performance and higher
classification accuracy on public benchmark datasets. All code has been released in the open source GBNRS library at
http://www.cquptshuyinxia.com/GBNRS.html.
Index Terms—Rough sets, Neighborhood rough sets, Granular ball computing, Fuzzy rough sets.
1 INTRODUCTION
DATABASE storage technology and data collection have grown to the point where massive datasets are the norm. These large datasets often have redundant or irrelevant features which increase the computational complexity of classifiers and lead to poor performance, both in accuracy and execution time [19, 33, 39]. Feature or attribute reduction is essential to mitigate this issue. The core idea of attribute reduction is to delete irrelevant or unimportant attributes under the condition that the classification accuracy of a classifier is either unchanged or increased. The result of this reduction is twofold: as few attributes as possible carry as much data information as possible, and reducing redundant attributes increases the generalizability of classifiers.
Polish scientist Zdzisław Pawlak first proposed the theory of rough sets in 1982. Rough sets are a mathematical data mining tool that can effectively deal with uncertain, inaccurate, and incomplete data [24, 25]. Rough sets have been applied to machine learning, data mining, decision support and analysis, soft computing, and other fields [2, 4, 27, 45], but one of their most important applications is attribute reduction [11, 21, 26, 29, 39]. Rough sets use an equivalence relation to partition the universe, separating an information system into an upper approximation set and a lower approximation set. The upper approximation set contains the maximum set of elements that may belong to a class, and the lower approximation set refers to the minimum set of all elements that can be accurately classified. Rough sets were initially developed for discrete data, and the original algorithms do not have the most desirable time complexities; as a consequence, improving the efficiency of the algorithms and the ability to process continuous data are two of the cardinal directions of research in rough sets.

S. Xia, H. Zhang, W. Li, G. Wang & Q. Zhang are with the Chongqing Key Laboratory of Computational Intelligence, Chongqing University of Posts and Telecommunications, 400065, Chongqing, China. E-mail: xiasy@cqupt.edu.cn, 1025476698@qq.com, 846659545@qq.com, wanggy@cqupt.edu.cn
Z. Chen and E. Giem are with the Department of Computer Science and Engineering, University of California Riverside, Riverside, CA, 92521. E-mail: zizhongchen@gmail.com, gieme01@ucr.edu
Granulation and approximation are two basic problems in rough sets and granular computing. The theory of rough sets proposed by Pawlak partitions the universe according to an equivalence relation, which results in the granulation of the universe. However, the values of objects are continuous in real number space, and therefore the classical equivalence relation cannot be directly applied. Fuzzy rough sets can process real number space, but require prior knowledge of the dataset in order to exactly or approximately set a membership function and membership degree for each object in advance. This knowledge cannot be provided a priori in many scenarios. In order to overcome this obstacle, Hu et al. introduced the neighborhood model of rough sets in place of the membership function [14], forming a new theory of fuzzy rough sets for processing continuous data: neighborhood rough sets (NRS). The fuzzy degree in NRS is characterized by the size of the neighborhood instead of the membership degree. Just as changing the membership function changes the positive region in fuzzy rough sets, different neighborhood sizes lead to changes in the positive region. The key difference is that the size of the neighborhood is influenced by the distribution of the dataset and does not rely on any a priori knowledge. This size is a parameter that is typically optimized by grid search. NRS can perform better than other fuzzy rough sets on numerical datasets in the absence of prior knowledge.
We build on NRS by introducing the idea of granular computing into neighborhood rough sets, creating a new NRS method that is entirely parameter-free in processing continuous data. We require no membership function, no membership degree, and no fixed radius parameter optimized by grid search, in direct contrast to all other current methods. The main contributions of this paper are as follows:
• We establish the novel idea of granular ball neighborhood rough sets by introducing granular ball computing into neighborhood rough sets. GBNRS has a time complexity of O(N), rendering it much more efficient than existing NRS methods.
• Granular ball neighborhood rough sets is the first rough sets algorithm that is parameter-free in processing continuous data; that is, it does not need to set any membership function or optimize any extraneous parameters for processing continuous data. To our knowledge, no other rough sets algorithm has this desirable property.
• GBNRS generates optimal neighborhood radii adaptively, resulting in greater flexibility than existing NRS algorithms. This adaptive nature of GBNRS leads to better generalizability in comparison with classical NRS.
• We additionally propose multiple granular ball neighborhood rough sets (MGBNRS), an improvement on GBNRS which more fully uses label information in GBC, giving even higher efficiency and accuracy than GBNRS.
The rest of the paper is organized as follows: we introduce related works in Section 2, and then detail the background theory of neighborhood rough sets and granular ball computing in Section 3. Section 4 presents the design and analysis of our novel granular ball neighborhood rough sets scheme. Evaluation results are given in Section 5. We present our conclusions in Section 6.
2 RELATED WORK
Considering the effectiveness of the rough sets method, improving the efficiency of rough sets algorithms is an important direction of research. However, attribute reduction in rough sets is an NP-hard problem, making this a fertile area for new ideas. Hoa et al. form a new condition attribute set by comparing relative positive regions and take the final output condition attribute set as the attribute reduction [12]. To improve performance in the specific case where there are many important attributes in the decision table, Skowron et al. propose attribute reduction based on a discernibility matrix [29]; in Skowron's method, a minimal reduction is found. To further alleviate the problem of combinatorial explosion [29, 34, 43], Miao et al. introduce information theory into attribute reduction and propose the MIBARK algorithm [23, 38]. MIBARK is based on the idea that information entropy can reduce the search space during the reduction process and improve efficiency, although it is possible the attribute reduction of the information system may not be found in some cases. Dai et al. propose the concept of discernible pairs based on rough sets theory and construct a unified measurement method to measure attributes in both supervised and unsupervised frameworks [9]. Dai et al. also propose a fast feature selection algorithm based on the neighbor inconsistent pair, which can pare down the time needed to find a reduction [11]. In recent years, incremental attribute reduction algorithms based on rough sets have been proposed for processing large dynamic datasets. Liang et al. propose a group incremental rough feature selection algorithm based on information entropy [19], which can find new feature subsets in a shorter time when multiple objects are added to the decision table. Considering the high time and space complexity of Liang's algorithm, Yang et al. propose an incremental rough sets attribute reduction algorithm based on both an active sample selection process and an attribute reduction process [40]. Some datasets collected from various applications change over time; in particular, when new objects are added, new properties may appear. In order to solve this dynamic maintenance problem, Chen et al. propose a method that dynamically maintains approximations when objects and attributes change at the same time [6, 7]. These algorithms all improve the efficiency of rough sets methods in different scenarios. However, few of them can substantially decrease the time complexity; most are O(N²) or higher. This limits the applicability of these rough sets methods on large-scale datasets.
Classical rough sets adopt the concepts of equivalence partitions and equivalence classes to calculate granularity. However, this processing method is only applicable to discrete data, while the data in practical applications is usually numerical. Therefore, improving the ability of rough sets to process continuous attributes is another productive area of research. Both fuzzy sets and rough sets are capable of dealing with uncertain information [17]. Dubois and Prade combine these theories to produce fuzzy rough sets (FRS) theory [25], which provides an effective method for discretizing continuous data. FRS can be directly applied to the reduction of continuous attributes. A membership function and membership degree are used to describe the fuzzy degree in fuzzy rough sets, and a different membership function can lead to changes in the positive region. Liu and Pedrycz et al. redefine the concept of fuzzy rough sets on the basis of axiomatic fuzzy set theory, which provides higher flexibility and effectiveness [20]. Due to the advantages of fuzzy rough sets theory, in recent years fuzzy rough sets have been extended to many applications [3, 5, 10, 16, 18, 33, 38, 41]. However, as there is often significant overlap between different categories in a dataset, it is easy for samples to be misclassified. Wang et al. propose a new fuzzy rough sets model for those datasets with significant overlap between different categories [33]. Aggarwal et al. propose probabilistic variable precision fuzzy rough sets to address the imprecision problem [1]. Aiming at the problem of large-scale multimodality fuzzy classification, Hu et al. extract fuzzy similarity by using a combination of kernels based on rough sets theory [14]. In order to solve the problem of real-valued noisy features, Maji et al. propose an IT2 fuzzy-rough feature selection method that combines the advantages of IT2 fuzzy sets, rough sets, and the MRMS criterion [22]. This method is more effective when the exact membership function is not known.
All of these methods have greatly advanced the development of rough sets. However, they are all limited to cases where the membership degree and membership function can be exactly or approximately set in advance using a priori knowledge; they do not work in cases where this knowledge is missing. To enable FRS to handle these cases, Hu et al. propose neighborhood rough sets (NRS). A new type of fuzzy rough sets, NRS describes the fuzzy degree with the size of a neighborhood instead of with a membership degree. Like the membership function in classic fuzzy rough sets, a different size of neighborhood will lead to a change in the positive region. However, the size of the neighborhood can be optimized according to the distribution of the dataset instead of prior knowledge, mitigating the need to set a membership function and membership degree when processing continuous data [15]. Therefore, on numerical datasets lacking prior knowledge, NRS can out-perform classical fuzzy rough sets. To improve the efficiency of NRS, Hu et al. further propose forward attribute reduction based on neighborhood rough sets and fast search (FARNeMF), with attribute significance as heuristic information [13]. FARNeMF reduces attributes by selecting the condition attribute with the maximum attribute significance. As newly added attributes are only effective for distinguishing boundary samples, only samples in the original negative region and boundary region need to be evaluated to see if they will belong to the new positive region when the dependence degree of the decision attributes is calculated on the newly added condition attributes. This avoids repeated evaluations of the objects in the positive region and reduces the amount of calculation. To our knowledge, FARNeMF is the state-of-the-art NRS algorithm and is therefore selected for comparison in this paper.
Neighborhood rough sets have shown good performance in many applications [8, 32, 35]. However, NRS searches the neighbors of each object within a given unified range on a condition attribute set, and those objects whose neighbors have the same labels are determined to belong to the positive region of the current condition attribute set. This gives NRS a time complexity of O(N²). In addition, the unified range also needs to either be artificially set or optimized by grid searching. We turn to a new, lower time-complexity theory in this paper and combine it with NRS. Granular computing is a scalable, efficient, and robust method that is similar to the way a human brain thinks [42]. It uses simple, low-cost, satisfactory approximate solutions rather than exact solutions to achieve tractable, robust, and cheap intelligent systems that can describe the real world better [31]. In Science [28], Rodriguez pointed out that granular computing is an effective method for finding knowledge in big data, and it has been combined with various learning methods, such as rough sets [24], computing with words as proposed by Zadeh [44], and label noise detection [36]. We discuss the theory of granular computing further in the next section.
3 BACKGROUND MODELS AND DEFINITIONS
GBNRS relies heavily on two previous theories, granular ball computing and neighborhood rough sets, whose mathematical underpinnings we introduce here.
3.1 Neighborhood rough sets
We have introduced Pawlak rough sets, fuzzy rough sets,
and neighborhood rough sets somewhat loosely. We now
drill down into the details, defining the basic spaces we are
operating in, the neighborhoods we are working with in
neighborhood rough sets, and the positive region we have
mentioned, which is key to the operation of these methods.
Definition 1. Let $\Delta: \Omega \times \Omega \rightarrow \mathbb{R}$ be a function generated on a set $\Omega$. $\langle \Omega, \Delta \rangle$ is known as a metric space if $\Delta$ satisfies:
(1) $\Delta(x_1, x_2) \geq 0$, and $\Delta(x_1, x_2) = 0$ iff $x_1 = x_2$, $\forall x_1, x_2 \in \Omega$;
(2) $\Delta(x_1, x_2) = \Delta(x_2, x_1)$, $\forall x_1, x_2 \in \Omega$;
(3) $\Delta(x_1, x_3) \leq \Delta(x_1, x_2) + \Delta(x_2, x_3)$, $\forall x_1, x_2, x_3 \in \Omega$.
In this case, $\Delta$ is known as a metric.
Definition 2. Let the quadruple $\langle U, A, V, f \rangle$ represent an information system, where:
$U = \{x_1, x_2, ..., x_n\}$ denotes a non-empty finite set of objects; $U$ is called the universe;
$A = \{a_1, a_2, ..., a_m\}$ denotes a non-empty finite set of attributes;
$V = \bigcup_{a \in A} V_a$ denotes the set of all attribute values, where $V_a$ denotes the value range of attribute $a$;
$f: U \times A \rightarrow V$ denotes a mapping function: $\forall x_i \in U, \forall a \in A$, $f(x_i, a) \in V_a$.
This information system is called a decision system $\langle U, C, D \rangle$ if the set of attributes in the information system above satisfies $A = C \cup D$, $C \cap D = \emptyset$, and $D \neq \emptyset$, where $C$ is the condition attribute set and $D$ is the decision attribute set.
Definition 3. Let $U = \{x_1, x_2, ..., x_n\}$ be a non-empty finite set in real space. $\forall x_i \in U$, the $\delta$-neighborhood of $x_i$ is defined as:
$$\delta(x_i) = \{x \mid x \in U, \Delta(x, x_i) \leq \delta\}, \qquad (1)$$
where $\delta \geq 0$.
Definition 4. Let $\langle U, C, D \rangle$ be a neighborhood decision system. The decision attribute set $D$ divides $U$ into $L$ equivalence classes: $X_1, X_2, ..., X_L$. $\forall B \subseteq C$, the lower approximation and the upper approximation of the decision attribute set $D$ with respect to the condition attribute set $B$ are respectively defined as:
$$\underline{N_B}D = \bigcup_{i=1}^{L} \underline{N_B}X_i, \qquad (2)$$
$$\overline{N_B}D = \bigcup_{i=1}^{L} \overline{N_B}X_i, \qquad (3)$$
where $\underline{N_B}X_i = \{x_k \mid \delta_B(x_k) \subseteq X_i, x_k \in U\}$ and $\overline{N_B}X_i = \{x_k \mid \delta_B(x_k) \cap X_i \neq \emptyset, x_k \in U\}$. The positive region and boundary region are respectively defined as $POS_B(D) = \underline{N_B}D$ and $BN(D) = \overline{N_B}D - \underline{N_B}D$.
NRS needs to search the neighbors of each object within a given unified range on a condition attribute set, and those objects whose neighbors have the same labels as the queried object form the positive region of the condition attribute set. NRS therefore has a time complexity of O(N²), which is not efficient. In addition, the unified range either needs to be set artificially (as we do in our experiments to give this method the best possible running times in comparison to our own methods), which decreases its effectiveness, or optimized by grid searching (which we also present in our experiments to give this method the best possible accuracy, without considering the excessive time required to obtain it), which decreases its efficiency. We now present the theory of granular ball computing, an idea which allows us to adaptively generate the range for each object and decrease the time complexity of NRS to O(N).
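To make the cost concrete, the following is a minimal Python sketch (not the released GBNRS library code) of the fixed-radius positive region of Definition 4; the per-object distance scan over all N objects is what yields the O(N²) behavior. The function name and the NumPy-based representation are our own illustrative choices.

```python
import numpy as np

def nrs_positive_region(X, y, delta):
    """Indices of objects whose delta-neighborhood is label-pure (Def. 4).

    X: (n, m) numeric data on the current condition attribute set.
    y: (n,) integer class labels.  delta: fixed neighborhood radius.
    """
    n = X.shape[0]
    positive = []
    for i in range(n):
        # One O(n) distance pass per object -> O(n^2) in total.
        dist = np.linalg.norm(X - X[i], axis=1)
        neighbors = dist <= delta          # the delta-neighborhood of x_i, Eq. (1)
        if np.all(y[neighbors] == y[i]):
            positive.append(i)             # neighborhood is pure -> positive region
    return np.array(positive)
```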
3.2 Granular ball computing
In our own previous work [37], we introduced granular computing into existing classifier models. We generate granular balls (GBs) that cover a dataset, and propose granular ball computing (GBC) classifiers by replacing the input points with a completely symmetric structure, the granular ball. The overall label of a GB is defined as the label with the most appearances in the GB, that is, the majority label in the GB. The quality of a granular ball is measured by the purity of the ball, which is defined as the percentage of majority samples in the granular ball. We show the detailed mathematical definitions below [37]:
Definition 5. Let $U \subseteq R^N$ be a dataset and $U' \subseteq U$ ($U' \neq \emptyset$). We generate a granular ball $GB$ on $U'$ with $C$ as its center and $r$ as its radius, where $C$ denotes the center of gravity of all sample points in $U'$ and $r$ denotes the average distance from all points in $U'$ to $C$. Specifically, for the points $O_i \in U'$ ($i = 1, 2, ..., M$), where $M$ denotes the number of points in $U'$, we have:
$$C = \frac{1}{M} \sum_{i=1}^{M} O_i, \qquad r = \frac{1}{M} \sum_{i=1}^{M} \|O_i - C\|. \qquad (4)$$
Definition 6. Let $U \subseteq R^N$ be a dataset and $U' \subseteq U$ ($U' \neq \emptyset$). Let $GB$ be the granular ball generated on $U'$ with $C$ as its center and $r$ as its radius. The overall label of $GB$ is defined as the label which appears the most times in $GB$; that is, the majority label of $GB$.
As each granular ball covers many points but itself consists of only two pieces of data, the center and the radius, the dataset is greatly reduced. In addition, enough granular balls can fit any decision boundary, allowing granular ball classifiers to have good generalizability [37]. In the generation process of GBs, we iteratively implement 2-means clustering on each class until the purity of each granular ball reaches a given threshold. Figure 1 shows the fitting process of GBC with a purity threshold of 1. To provide visualizations of multi-class data, we assign the points in the upper right corner of the dataset fourclass the third label, according to its distribution characteristics. We observe from Fig. 1 that granular balls become more pure after partitioning; that is, the purity of each granular ball increases as the number of granular balls becomes large. As seen in Fig. 1(f), when the purity of the granular balls reaches 1, the decision curve of the positive and negative granular balls is very consistent with that in the original dataset. GBC can not only fit any decision boundary but also has a low time complexity of O(N) [37]. As seen in Fig. 1(e), because the radius of a granular ball is equal to the average distance of all objects in the granular ball from its center, some samples may not be covered by any granular ball. However, this does not affect the boundary consistency between the original dataset and the granular balls [37].
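For concreteness, below is a minimal Python sketch of this splitting loop, assuming non-negative integer labels and a purity threshold of 1; it illustrates the procedure described above and is not the released library code, and the helper names are our own.

```python
import numpy as np
from sklearn.cluster import KMeans

def purity(y):
    """Fraction of the majority label among labels y (Section 3.2)."""
    _, counts = np.unique(y, return_counts=True)
    return counts.max() / counts.sum()

def generate_granular_balls(X, y, threshold=1.0):
    """Split (X, y) by 2-means until every ball meets the purity threshold."""
    queue, balls = [(X, y)], []
    while queue:
        Xb, yb = queue.pop()
        if len(Xb) > 1 and purity(yb) < threshold:
            km = KMeans(n_clusters=2, n_init=1).fit(Xb)   # 2-means split
            if len(set(km.labels_)) == 2:                 # both children non-empty
                for k in (0, 1):
                    mask = km.labels_ == k
                    queue.append((Xb[mask], yb[mask]))
                continue
        center = Xb.mean(axis=0)                              # Eq. (4): centroid
        radius = np.linalg.norm(Xb - center, axis=1).mean()   # Eq. (4): mean distance
        label = int(np.bincount(yb).argmax())                 # majority label (Def. 6)
        balls.append((center, radius, label, purity(yb)))
    return balls
```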
Motivated by GBC and NRS, we propose granular ball neighborhood rough sets. We show that this procedure adaptively generates a different radius for each object and results in a much more efficient method.
Fig. 1. Visualizations of both granular ball generation and the generation
positive region using GBNRS on the dataset fourclass with the purity
degree threshold set at 1. (a)-(f) The generation results of six iterations.
(g) The final granular balls. (h) The generation positive region, which
coincides with the centers of the granular balls. The red points and red
granular balls are labeled +1, and the black points and black granular
balls are labeled -1.
4 GRANULAR BALL NEIGHBORHOOD ROUGH SETS
4.1 Theory and mathematical models
We have already shown the main processes of granular ball neighborhood rough sets in Figure 1. We use granular ball computing to adaptively generate many granular balls with different radii. The purity of the balls in the GBC step is set to 1, requiring the samples in each granular ball to all have the same label. The centers of the granular balls can then be used as the generation positive region for the following three reasons: 1) the centers are completely within the corresponding class and will not affect the decision boundary of the dataset; 2) the process of granular ball generation has a certain randomness since k-means is used in GBC, but the centers of the granular balls are relatively stably located inside of the corresponding class of objects and therefore serve to generate a stable positive region; 3) considering the center of a granular ball to belong to the positive region is completely consistent with the definition of positive regions in NRS [14, 15], that is, that an object's neighbors in an appropriate range all have the same labels. We see this exactly by considering the center of a granular ball as an object, the range as the granular ball's radius, and the neighbors of the center as the objects in the granular ball within the radius (or range); the neighbors then all have the same label as the center. Since the purity of each granular ball is 1, objects near the boundary and heterogeneous objects will not be covered in any granular ball and have no chance to contaminate the positive region. Thus, considering the centers of the granular balls to be positive regions is strictly consistent with the definition of positive regions in NRS theory.
The mathematical models of our granular ball neighborhood rough sets (GBNRS) are described in the following definitions.
Definition 7. Let $U = \{x_1, x_2, ..., x_n\}$ be a non-empty finite set, and let GBC generate granular balls that cover $U$. The $j$-th granular ball is $GB_j$, its center is $C_j$, and its radius is $r_j$. For $x_i \in GB_j$, we define the neighborhood of $x_i$ as:
$$\sigma(x_i) = \{x \mid \forall x \in GB_j, \mathrm{dis}(x, C_j) \leq r_j\}, \qquad (5)$$
where $\mathrm{dis}(x, C_j)$ is the distance from $x$ to $C_j$.
There is some randomness in the process of granular ball
generation due to the randomness of the initial centroid
selection in k-means. The objects near the boundary of a
granular ball belong to its positive region with a greater
degree of randomness than the objects near its center. We
therefore introduce the generation lower approximation in
Definitions 8 and 9 and the generation positive region in
Definition 10. The generation positive region is the core
indicator measuring the importance of attributes in GBNRS.
Definition 8. Let $\langle U, A, V, f \rangle$ be an information system. For $\forall x_i \in U$ and $\forall a \in A$, $f(x_i, a) \in V_a$. Let GBC generate granular balls that cover the entire sample set, the $j$-th granular ball being $GB_j$. For $P \subseteq A$ and $X \subseteq U$, the upper approximation set, lower approximation set, and generation lower approximation set of $X$ with respect to an attribute set $P$ are defined as:
$$\overline{P}X = \{x_i \in U, x_i \in GB_j(P) \mid \sigma(x_i) \cap X \neq \emptyset\}, \qquad (6)$$
$$\underline{P}X = \{x_i \in U, x_i \in GB_j(P) \mid \sigma(x_i) \subseteq X\}, \qquad (7)$$
$$\underline{P}X' = \left\{ x = \frac{1}{l_j} \sum_{i=1}^{l_j} x_i \;\middle|\; x_i \in GB_j(P), \sigma(x) \subseteq X \right\}, \qquad (8)$$
where $GB_j(P)$ denotes the $j$-th granular ball under the condition attribute set $P$, and $l_j$ represents the number of objects in the $j$-th granular ball.
Definition 9 follows from Definition 8 in the special case
of a decision system.
Definition 9. Let $\langle U, C, D \rangle$ be a decision system. $D$ divides $U$ into $L$ equivalence classes: $X_1, X_2, ..., X_L$. Let GBC generate granular balls that cover the entire sample set, the $j$-th granular ball being $GB_j$. $\forall B \subseteq C$, we define the upper approximation, lower approximation, and generation lower approximation of the decision attribute set $D$ with respect to the condition attribute set $B$ as follows:
$$\overline{B}D = \bigcup_{i=1}^{L} \overline{B}X_i, \qquad (9)$$
$$\underline{B}D = \bigcup_{i=1}^{L} \underline{B}X_i, \qquad (10)$$
$$\underline{B}D' = \bigcup_{i=1}^{L} \underline{B}X'_i, \qquad (11)$$
where $\overline{B}X_i = \{x_k \in U \mid x_k \in GB_j(B), \sigma(x_k) \cap X_i \neq \emptyset\}$, $\underline{B}X_i = \{x_k \in U \mid x_k \in GB_j(B), \sigma(x_k) \subseteq X_i\}$, and $\underline{B}X'_i = \left\{ x = \frac{1}{l_j} \sum_{k=1}^{l_j} x_k \;\middle|\; x_k \in GB_j(B), \sigma(x_k) \subseteq X_i \right\}$.
As Definition 9 describes, the lower approximation of D
consists of those objects in granular balls with purities of 1.
As a reminder, the purity of a granular ball is the percentage
of the majority sample in the granular ball. The generation
lower approximation consists of the centers of those granular
balls with purities of 1 in the lower approximations.
Definition 10. Let $\langle U, C, D \rangle$ be a decision system. Let GBC generate granular balls that cover the entire sample set. $\forall B \subseteq C$, the generation positive region is defined as
$$GPos_B(D) = \underline{B}D', \qquad (12)$$
where $\underline{B}D'$ represents the generation lower approximation of the decision attribute set $D$ with respect to the condition attribute set $B$.
The objects contained in a granular ball with a purity of 1 compose a positive region that allows for the adaptive generation of both granular balls and positive regions. However, the objects near the boundary of a granular ball in GBNRS are not stable due to the 2-means implemented in GBC. Therefore, the generation lower approximation in Definition 9 and the generation positive region in Definition 10 are proposed. These two definitions, in contrast to current NRS methods, imply that the generation lower approximation of GBNRS consists of the centers of granular balls, which probably do not exist in U. That is to say, the objects in the generation lower approximation are generated rather than selected from U.
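As a minimal illustration (reusing the hypothetical generate_granular_balls helper sketched in Section 3.2), the generation positive region can be read off directly from the purity-1 balls; note that its elements are generated centers, not members of U:

```python
def generation_positive_region(balls):
    """Centers (with labels) of purity-1 balls: GPos_B(D) of Definition 10."""
    return [(center, label)
            for center, radius, label, p in balls if p == 1.0]
```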
Definition 11. Let $\langle U, C, D \rangle$ be a decision system. For a condition attribute set $B \subseteq C$, $B$ is a relative reduction of $C$ if $B$ satisfies:
$$GPos_B(D) = GPos_C(D), \qquad (13)$$
$$GPos_B(D) \neq GPos_{B - \{a\}}(D), \quad \forall a \in B. \qquad (14)$$
As described in Definition 10, the generation positive
region is completely consistent with the traditional positive
region in NRS theory, because the neighbors of an object
belonging to the generation positive region all have the
same label as the object. These regions can be adaptively
and efficiently generated. Therefore, the generation positive
region can replace the positive region to stably determine
whether an attribute should be deleted. We describe this in
Definition 11.
Figure 2 shows the comparison between classical rough sets, NRS, and GBNRS.

Fig. 2. A comparison of three rough sets methods. (a) Classical rough sets in discrete space; (b) Neighborhood rough sets in real space; (c) Granular ball neighborhood rough sets in the same space. In (b)-(c) the samples marked with "·" are the first type of samples, and those marked with "+" are the second type of samples. In (c) the rectangles represent the objects in the generation positive regions.

In Fig. 2(a), the equivalence classes marked in red belong to $X$ completely and form the lower approximation of $X$. In Fig. 2(b), the samples in the circular neighborhood of sample $x_1$ are all from class 1; therefore, $x_1$ belongs to the lower approximation of class 1. The neighborhood samples of $x_3$ are from class 2, so $x_3$ belongs to the lower approximation of class 2. In contrast, in the neighborhood samples of $x_2$ we find samples belonging to class 1 and samples belonging to class 2; therefore $x_2$ is a boundary sample. In Fig. 2(c), each object in the generation positive region has a different and adaptive range in which to search its neighbors, which NRS does not have. The generation positive region consists of the centers of the granular balls. If the purity threshold is 1, the outlier objects not contained in any granular ball compose the boundary region; if the purity of a granular ball is less than 1, the objects contained within the ball also belong to the boundary region according to Definition 9.
It should be noted that the generation positive region
does not belong to the original set Uand is not added into
U. As seen in Definition 11, it is only used as a much more
efficient measurement to replace the traditional positive
region and judge if a condition attribute set is a relative
reduction.
4.2 Time complexity
A small number of granular balls can fit the decision boundary well, resulting in a GBC time complexity of O(N) [37]. We see that the production of the generation positive region in GBNRS is therefore much more efficient than obtaining the positive region in NRS. This can be intuitively explained as follows: NRS needs to search the neighbors of all objects. In contrast, GBNRS only needs to generate a small number of clusters using a fast algorithm (to wit, k-means), the centers of which will constitute the generation positive region. A granular ball inside its corresponding class is likely to contain more objects than those relatively close to the decision boundary. This results in a small number of granular balls; that is, the number of objects belonging to the generation positive region in GBNRS is much smaller than the number of objects belonging to the positive region in NRS. In spite of the small number of granular balls, GBNRS is more effective because the adaptively-sized granular balls can fit the decision boundary of datasets with various distributions well [37]. This efficiency will be demonstrated in our experimental section.
4.3 An improved method for computing granular balls
The method for granular ball computing we outlined does not use any label information in the process of implementing 2-means. For a given granular ball GB containing M classes of samples, we can improve GBC in two ways: first, 2-means is replaced with M-means, and second, a sample is selected from each class to form the M initial centroids. As a result, the shape of the initial clusters will be closer to the distribution of the different classes in the original data, and label information is fully incorporated into the k-means process. Consequently, the number of granular balls in this improved GBC method is much smaller than in our original GBC step. A smaller number of granular balls leads to higher efficiency and a larger average granular ball size. Larger granular balls are desirable because they generate more stable generation positive regions, which can result in higher classification accuracy. We call the NRS based on this improved GBC multiple granular ball neighborhood rough sets (MGBNRS). Figure 3 shows a visualization of MGBNRS using a BPNN on the same data from Figure 1. In subplots (a)-(c), the first three iterations show that the boundary described by the granular balls computed using this method corresponds much more closely to the original data than the granular balls in Figure 1. This improved GBC converged in the 8th iteration and generated 34 granular balls; in contrast, our original GBC converged in the 10th iteration and generated 43 granular balls. Clearly, our improved GBC is more efficient than the original GBC. A class-wise split is sketched below.

Fig. 3. Visualizations of both granular ball generation and the generation positive region using MGBNRS on the dataset fourclass with the purity degree threshold set at 1. (a)-(c) The first three iterations. (d)-(f) Middle iterations. (g) The final granular balls. (h) The generation positive region, which coincides with the centers of the granular balls found during improved GBC. The red points and red granular balls are labeled +1, and the black points and black granular balls are labeled -1.
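The following minimal Python sketch shows one such class-informed split; the function name is our own, and we assume numeric features and integer labels. One initial centroid is drawn from each class before running M-means:

```python
import numpy as np
from sklearn.cluster import KMeans

def mgb_split(Xb, yb, seed=0):
    """Split one ball by M-means, seeding one initial centroid per class."""
    rng = np.random.default_rng(seed)
    classes = np.unique(yb)
    # One sample from each class serves as an initial centroid, so the
    # initial clusters roughly follow the class distribution in the ball.
    init = np.vstack([Xb[rng.choice(np.flatnonzero(yb == c))] for c in classes])
    km = KMeans(n_clusters=len(classes), init=init, n_init=1).fit(Xb)
    return [(Xb[km.labels_ == k], yb[km.labels_ == k])
            for k in range(len(classes))]
```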
4.4 Algorithm design
The GBNRS algorithm can be divided into two parts: first, the generation positive region is obtained, and second, some attributes are reduced according to their importance as judged by the generation positive region. In the first stage, GBC is implemented with a purity threshold of 1, and the initial generation positive region consists of the centers of all granular balls. Then, k-means clustering with the centers initialized at the centers of the granular balls is used to globally fine-tune the centers and the granular balls. Any resulting granular balls with purities unequal to 1 are removed, and the generation positive region now consists of the centers of the remaining granular balls.

In the second stage, a partition of points in the generation positive region is produced after an attribute is removed. Similar in approach to one iteration of k-means clustering, each object is partitioned into its nearest cluster (that is, granular ball), the center of which belongs to the generation positive region from the first stage. This partition will generate new granular balls. If there is a granular ball with a purity less than 1, its center will not belong to the generation positive region. This indicates that the generation positive region is changed if the attribute is deleted, and therefore the attribute should be retained. Otherwise, if the purity of each granular ball is still 1, the generation positive region has not changed, so the attribute should be removed. Because attribute deletion may slightly affect the distribution of the dataset, k-means is used again to fine-tune the generation positive region. This algorithm does not optimize any extra parameters at all, rendering it completely adaptive. The two stages are repeated until no further attributes can be reduced. The specific algorithm is outlined in Algorithm 1. Clusters before and after an attribute is deleted often have great similarities, so the cluster centers of the k-means algorithm in each iteration of step 11 are initialized to the centers of the clusters before the attribute in question is deleted.
Algorithm 1: Granular Ball Neighborhood Rough Sets
Input: The dataset data; the condition attribute set C = (c1, c2, ..., cm);
Output: A reduced attribute set C′;
// Stage 1: Obtain the generation positive region on the current condition attribute set C′.
1: C′ is initialized to C;
2: Implement GBC on the data with the purity threshold at 1 on C′; remove any outlier objects that are not contained in any granular ball, as these outlier objects cannot possibly belong to the generation positive region;
3: k-means clustering is implemented on the centers of the resulting granular balls to globally fine-tune each granular ball;
4: The generation positive region now consists of the centers of the granular balls whose purity is 1 after the previous step; all other balls are discarded.
// Stage 2: Determine whether a condition attribute should be reduced by comparing the generation positive region before and after the attribute is removed.
5: Remove a condition attribute ci from C′;
6: Generate the partition based on the centers in step 4 by partitioning each object into the nearest granular ball (that is, a cluster);
7: Compute the purity of the new granular balls from step 6;
8: if the purity of each granular ball is equal to 1 then
9:   // This indicates that the generation positive region is unchanged and that the attribute should be removed.
10:  C′ = C′ − {ci};
11:  Re-run k-means clustering on the current centers of the granular balls to generate new granular balls and split them until the purity of each ball is 1;
12:  Go to step 4;
13: else
14:  ci should be retained;
15:  if all attributes in C′ have been checked then
16:    Terminate;
17:  else
18:    Remove a new attribute from C′ and go to step 6;
19:  end if
20: end if
Therefore, even though the k-means clustering algorithm is iteratively implemented in step 11, the algorithm converges quickly because the initial centers are close to the convergent solution.
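As an illustration of the Stage-2 test, here is a minimal Python sketch (our own simplification, omitting the k-means fine-tuning of steps 3 and 11): it drops one attribute, assigns every object to its nearest remaining center, and accepts the deletion only if every resulting cluster stays label-pure.

```python
import numpy as np

def attribute_is_removable(Xc, y, centers, a):
    """Stage-2 check of Algorithm 1 for one candidate attribute.

    Xc:      (n, m) data restricted to the current attribute set C'.
    centers: (g, m) purity-1 ball centers from Stage 1, same columns as Xc.
    a:       column index of the candidate attribute to delete.
    """
    Xr = np.delete(Xc, a, axis=1)          # drop the attribute from the data
    Cr = np.delete(centers, a, axis=1)     # ... and from the ball centers
    # One assignment pass: nearest center per object (step 6).
    dists = np.linalg.norm(Xr[:, None, :] - Cr[None, :, :], axis=2)
    nearest = dists.argmin(axis=1)
    # The attribute may be removed iff every new ball is still pure (steps 7-8).
    return all(len(np.unique(y[nearest == j])) <= 1
               for j in np.unique(nearest))
```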
This is the process for GBNRS. Our second algorithm, MGBNRS, differs only in the ball creation process. GBNRS uses 2-means clustering to iteratively partition each granular ball into two balls until the purity of each granular ball is 1. In contrast, MGBNRS uses k-means clustering to partition a granular ball, where k is equal to the number of classes in the granular ball. Additionally, the initial centroids of the k-means algorithm are selected from different classes.
5 EXPERIMENTS
In this section, we demonstrate the effectiveness and efficiency of GBNRS on widely-used benchmark datasets in comparison with classical NRS and the current state-of-the-art NRS algorithm, Forward Attribute Reduction Based on Neighborhood Rough Sets and Fast Search (FARNeMF) [13]. FARNeMF is an accelerated NRS method that finds a new positive region only in the boundary region during the process of forward attribute reduction. To our knowledge, it is currently the most efficient and effective NRS method. We design our experiments along the lines of Tsang [30] and Chen [5]: as the quality of the reduced attribute set is not related to the testing classifier used, we use only a common testing classifier, the nearest neighbor algorithm, to verify the quality of the reduced attribute set. Ten-fold cross-validation is used, and the classification accuracies with variances are presented; a sketch of this protocol follows the table below. We select a suite of benchmark datasets widely used in feature selection [13, 14], the UCI benchmark datasets (http://archive.ics.uci.edu/ml/datasets.html). These are described in detail in Table 1; we have extracted all usable attributes from the datasets. For example, in the Horse dataset, there are 27 total attributes; of these, 23 are conditional attributes (categorized in Table 1) and 1 is a decision attribute. The remaining 3 attributes are descriptions of lesion types which cannot be used as conditional attributes.

TABLE 1
The Information of the Datasets

No.  Dataset        Samples  Numerical Cond. Attrs.  Categorical Cond. Attrs.  Classes
1    anneal           798       6                      32                        5
2    credit           690       6                       9                        2
3    german          1000       7                      12                        2
4    heart1           270       7                       6                        2
5    heart2           303       6                       7                        5
6    hepatitis        155       6                      13                        2
7    horse            368       7                      16                        2
8    iono             351      34                       0                        2
9    wdbc             569      30                       0                        2
10   wine             178      13                       0                        3
11   lymphography     148       0                      18                        4
12   zoo              101       0                      16                        7
13   abalone         4177       7                       1                       29
14   electrical     10000      13                       0                        2
15   htru2          17898       8                       0                        2
16   mushroom        8124       0                      22                        2
17   letter         20000      16                       0                       26
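Below is a minimal sketch of this evaluation protocol, assuming scikit-learn is available; the helper name and the reduced column-index list are our own illustrative choices:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

def evaluate_reduction(X, y, reduced):
    """Score a reduced attribute set with 1-NN and 10-fold cross-validation."""
    scores = cross_val_score(KNeighborsClassifier(n_neighbors=1),
                             X[:, reduced], y, cv=10)
    # Mean accuracy and its spread, as reported (mean +/- variance) in Table 2.
    return scores.mean(), scores.std()
```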
5.1 Effectiveness
Our experimental results are shown in Tables 2 and 3 and in supplementary Tables 1 to 11. In classical NRS and FARNeMF, in order to find a good solution, the neighborhood radius is increased from 0.01 to 0.5 in intervals of 0.01. The results are shown in Table 3 and in supplementary Tables 1 to 11. As GBNRS inherently displays some instability due to the random initialization of the centroids of the granular balls during k-means clustering, both GBNRS and MGBNRS are implemented ten times, and the best solution is selected. In the tables, NA denotes the number of reduced attributes, and CA denotes the classification accuracy. RN denotes the neighborhood radius in NRS and FARNeMF.

TABLE 2
Classification Accuracy of Different NRS Methods

Data  original      NRS           FARNeMF       GBNRS         MGBNRS
1     0.954±0.032   0.944±0.035   0.944±0.035   0.971±0.023   0.975±0.022
2     0.871±0.042   0.865±0.042   0.865±0.042   0.867±0.028   0.873±0.033
3     0.709±0.036   0.737±0.048   0.737±0.048   0.734±0.023   0.731±0376
4     0.819±0.051   0.819±0.062   0.819±0.062   0.809±0.063   0.801±0.103
5     0.551±0.069   0.598±0.049   0.598±0.049   0.591±0.058   0.613±0.050
6     0.800±0.112   0.846±0.063   0.846±0.063   0.851±0.077   0.876±0.090
7     0.771±0.081   0.796±0.082   0.796±0.082   0.825±0.051   0.834±0.042
8     0.848±0.069   0.871±0.044   0.871±0.044   0.883±0.030   0.883±0.030
9     0.967±0.025   0.972±0.023   0.972±0.023   0.972±0.019   0.974±0.020
10    0.943±0.060   0.955±0.036   0.955±0.036   0.966±0.029   0.978±0.126
11    0.806±0.137   0.814±0.096   0.814±0.096   0.861±0.088   0.863±0.085
12    0.941±0.067   0.914±0.102   0.914±0.102   0.959±0.054   0.953±0.092
13    0.525±0.050   0.529±0.034   0.529±0.034   0.527±0.057   0.527±0.057
14    0.914±0.006   0.914±0.006   0.914±0.006   0.916±0.009   0.916±0.009
15    0.977±0.004   0.914±0.046   0.914±0.046   0.978±0.004   0.978±0.004
16    0.950±0.109   0.961±0.034   0.961±0.034   0.945±0.118   0.983±0.055
17    0.979±0.004   0.979±0.004   0.979±0.004   0.979±0.004   0.979±0.004
It can be seen from Table 3 that when the neighborhood radius is small, the number of remaining attributes is small, and the corresponding classification accuracy is also low. As the neighborhood radius increases, both the number of attributes and the corresponding classification accuracy gradually increase. However, when the neighborhood radius exceeds a certain value, the number of attributes and the corresponding classification accuracy gradually decrease again. These results are explained as follows. When the neighborhood radius is very small, the neighborhood of a sample point contains very few other sample points. Therefore most sample points trivially belong to the positive region, so many attributes are judged redundant and removed, resulting in a low classification accuracy. As the neighborhood radius increases, whether a sample belongs to the positive region is more affected by the points near it; the number of removed attributes gradually decreases, and the classification accuracy gradually increases. When the neighborhood radius is large enough, the neighborhood of each sample point contains many other sample points that belong to different categories, resulting in many samples being classified into the boundary region. Therefore, both the number of remaining attributes and the corresponding classification accuracy are small. Although the overall trend shows this pattern, as demonstrated in the tables, the optimal solution corresponding to the highest classification accuracy cannot be found exactly. Therefore, following previous work with NRS [15], we search for the optimal solution by increasing the neighborhood radius from 0.01 to 0.5 in intervals of 0.01 (see the sketch below), and present the experimental results in Table 2.
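A minimal sketch of this grid search, assuming a hypothetical reduce_with_radius helper that runs NRS/FARNeMF attribute reduction at one fixed radius, and the evaluate_reduction scorer sketched earlier:

```python
import numpy as np

def grid_search_radius(X, y, reduce_with_radius):
    """Try every radius in [0.01, 0.5] (step 0.01) and keep the best one."""
    best_delta, best_acc, best_reduced = None, -1.0, None
    for delta in np.arange(0.01, 0.5 + 1e-9, 0.01):
        reduced = reduce_with_radius(X, y, delta)     # hypothetical reducer
        acc, _ = evaluate_reduction(X, y, reduced)    # 1-NN, 10-fold CV
        if acc > best_acc:
            best_delta, best_acc, best_reduced = delta, acc, reduced
    return best_delta, best_acc, best_reduced
```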
As redundant attributes are deleted, the representative ability of the remaining attributes is improved. As shown in Table 2, almost all of the highest classification accuracies (marked in bold) appear in the NRS, FARNeMF, GBNRS, and MGBNRS columns; that is, these methods all produce an attribute set on which the test classifier can achieve a higher accuracy than on the original dataset, because the removal of redundant features improves generalizability.

TABLE 3
Attribute Reduction on the Dataset WDBC

RN         Reduction Results                                                                          NA  CA
0.01       [0, 1, 8, 27]                                                                               4  0.9493±0.0368
0.02       [14, 21, 27, 28, 29]                                                                        5  0.9226±0.0416
0.03       [1, 4, 5, 9, 14, 27, 28]                                                                    7  0.9332±0.0284
0.04       [0, 7, 8, 9, 11, 18, 21, 24, 27]                                                            9  0.9616±0.0278
0.05       [1, 4, 7, 8, 9, 11, 15, 18, 21, 22, 24, 29]                                                12  0.9617±0.0379
0.06       [0, 1, 4, 5, 6, 8, 9, 11, 18, 20, 21, 24, 25, 27]                                          14  0.9650±0.0282
0.07       [0, 1, 4, 5, 6, 8, 9, 10, 11, 15, 18, 20, 21, 24, 25, 26, 27, 28, 29]                      19  0.9668±0.0275
0.08       [0, 1, 4, 5, 6, 8, 9, 10, 11, 14, 15, 18, 20, 21, 24, 26, 27, 28, 29]                      19  0.9650±0.0215
0.09       [0, 1, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 15, 18, 19, 20, 21, 24, 25, 26, 27, 28, 29]       23  0.9684±0.0213
0.1        [1, 2, 4, 5, 6, 7, 8, 9, 10, 11, 14, 15, 17, 18, 19, 20, 21, 23, 24, 25, 26, 27, 28, 29]   24  0.9702±0.0232
0.11       [0, 1, 4, 5, 6, 8, 9, 10, 11, 14, 15, 17, 18, 19, 20, 21, 23, 24, 25, 26, 27, 28, 29]      23  0.9702±0.0216
0.12       [0, 1, 3, 4, 5, 6, 7, 8, 9, 11, 12, 14, 15, 17, 18, 20, 21, 23, 24, 25, 26, 27, 28, 29]    24  0.9720±0.0248
0.13       [0, 1, 2, 3, 4, 5, 7, 8, 9, 11, 12, 14, 15, 17, 18, 20, 21, 24, 25, 26, 27, 28, 29]        23  0.9685±0.0228
0.14       [0, 1, 3, 4, 5, 6, 7, 8, 9, 11, 12, 14, 15, 17, 18, 20, 21, 24, 25, 26, 27, 28, 29]        23  0.9703±0.0231
0.15       [1, 3, 4, 5, 6, 7, 8, 9, 11, 12, 14, 15, 17, 18, 20, 21, 24, 25, 26, 27, 28, 29]           22  0.9667±0.0190
0.16       [0, 1, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 15, 16, 17, 18, 21, 22, 23, 24, 25, 26, 27, 28, 29]  25  0.9719±0.0234
0.17–0.18  [0, 1, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 15, 16, 17, 18, 20, 21, 24, 25, 26, 27, 28, 29]   24  0.9702±0.0216
0.19       [0, 1, 5, 7, 8, 9, 11, 12, 14, 15, 16, 17, 18, 21, 23, 24, 25, 26, 27, 28, 29]             21  0.9615±0.0254
0.2        [0, 1, 4, 5, 6, 7, 8, 9, 10, 11, 14, 15, 16, 17, 18, 19, 21, 23, 24, 25, 26, 27, 28, 29]   24  0.9703±0.0199
0.21       [0, 1, 5, 6, 8, 9, 10, 11, 14, 15, 16, 17, 18, 20, 21, 23, 24, 25, 26, 27, 28, 29]         22  0.9668±0.0250
0.22       [0, 1, 5, 6, 8, 9, 10, 11, 14, 15, 16, 17, 18, 20, 21, 23, 24, 25, 26, 29]                 20  0.9632±0.0263
0.23       [0, 1, 5, 6, 7, 8, 9, 10, 11, 14, 16, 17, 18, 21, 23, 24, 25, 26, 28, 29]                  20  0.9633±0.0319
0.24       [0, 1, 5, 6, 7, 8, 9, 10, 11, 14, 17, 18, 21, 23, 24, 25, 26, 28, 29]                      19  0.9633±0.0319
0.25       [0, 1, 5, 6, 7, 8, 9, 10, 11, 14, 15, 16, 18, 21, 23, 24, 25, 26, 28, 29]                  20  0.9668±0.0299
0.26       [0, 1, 5, 6, 8, 9, 10, 11, 14, 15, 16, 17, 18, 21, 22, 23, 24, 25, 26, 29]                 20  0.9615±0.0254
0.27       [0, 1, 4, 5, 7, 8, 9, 10, 11, 12, 14, 15, 16, 17, 18, 19, 21, 24, 25, 26, 28, 29]          22  0.9633±0.0298
0.28       [0, 1, 4, 8, 9, 10, 11, 14, 16, 17, 18, 19, 26, 29]                                        14  0.9580±0.0274
0.29       [0, 1, 4, 8, 9, 10, 11, 14, 16, 18, 19, 27, 28, 29]                                        14  0.9579±0.0248
0.3        [0, 1, 4, 8, 9, 10, 11, 14, 16, 18, 19, 27, 29]                                            13  0.9527±0.0317
0.31       [0, 1, 4, 8, 9, 10, 11, 12, 14, 16, 18, 19, 24, 25, 27, 29]                                16  0.9510±0.0290
0.32       [11, 16]                                                                                    2  0.6786±0.0421
0.33       [0, 4, 9, 10, 14, 16, 17, 18, 29]                                                           9  0.9385±0.0343
0.34       [4, 9, 13, 14, 16, 17, 18, 29]                                                              8  0.9070±0.0383
0.35–0.37  [14, 16, 17]                                                                                3  0.7418±0.0900
0.38–0.5   []                                                                                          0  0

Although we search for the optimal solution in NRS and FARNeMF by increasing the neighborhood radius from 0.01 to 0.5 in intervals of 0.01, the interval 0.01 may not be small enough, and the neighborhood radius found by our grid search may not be truly optimal. In contrast, the neighborhood radii in GBNRS and MGBNRS are generated adaptively according to the granular balls' cohesion with the decision boundary of the original dataset, allowing GBNRS and MGBNRS to achieve higher average classification accuracy than classical NRS and FARNeMF. Because the granular ball centers in MGBNRS are more stable than the granular ball centers in GBNRS, MGBNRS can achieve higher classification accuracy than GBNRS.

As shown in Table 2, GBNRS and MGBNRS did not give a higher accuracy than the other methods on three datasets. In both methods, some granular balls are quite small and their centers are not stable enough to be considered part of the generation positive region. In addition, although the generation of granular balls in MGBNRS more fully utilizes label information in the dataset than the generation of granular balls in GBNRS, both have an element of randomness due to the k-means selection process, though generation positive regions in MGBNRS have a higher probability of being stable than those in GBNRS. This is a known issue and something we plan to address in future work.

Obviously, grid searching for the optimal neighborhood radius considerably deteriorates the performance of NRS. Our experiments in the next section demonstrate that GBNRS and MGBNRS are much more efficient than NRS even when grid searching is not used.
5.2 Efficiency
We demonstrate the efficiency of GBNRS and MGBNRS in comparison with NRS and FARNeMF on six large benchmark datasets selected from the previous benchmarks. We randomly select ten percent of each dataset to begin with and implement the experiment on this ten percent of the data; we then gradually increase the size of our working dataset by feeding in the remaining data in increments of ten percent. Because NRS and FARNeMF are very slow for large-scale datasets, the neighborhood radius in this experiment is fixed at 0.02 and no grid searching is performed. Even though the time spent in grid searching for the neighborhood radius is not counted for either NRS or FARNeMF, we can see from Figure 4 that the efficiency advantage of GBNRS and MGBNRS is considerable. Except in a few cases on smaller working datasets (that is, 40% or less of the original dataset) in Fig. 4(a) and Fig. 4(b), the GBNRS execution time curve is below those of NRS and FARNeMF. The FARNeMF curve is always below that of NRS because some points in the positive region in FARNeMF do not affect the computation. When the size of the dataset increases, the execution time of GBNRS is always below that of both NRS and FARNeMF. For example, on the largest experimental dataset, letter in Fig. 4(f), NRS and FARNeMF take more than 6000 and 1000 seconds respectively; in contrast, GBNRS takes only 80 seconds, a more than 90% improvement. Moreover, on all datasets, the completion time curve for MGBNRS is the absolute lowest of all methods, including on those datasets on which GBNRS was out-performed at low sample sizes by the current state-of-the-art FARNeMF method. This indicates that MGBNRS is more efficient than GBNRS; this is because label information is more fully involved in the improved GBC in MGBNRS, and therefore the number of generation granular balls in MGBNRS is much smaller than in GBNRS. We can also see the at-least-O(N²) growth of NRS and FARNeMF reflected in these graphs; if grid searching time were included instead of a fixed radius, this time would only increase. In contrast, GBNRS and MGBNRS both reflect their actual O(N) running times, as seen in these graphs. An unexpected result of the experiment in Fig. 4(a) and Fig. 4(b) is that increasing the size of the small working dataset sometimes led to a decrease in the running time. This is caused by a small amount of randomness in both the compared algorithms and the running processes of the computer.

Fig. 4. Efficiency comparison. (a) On the dataset german. (b) On the dataset abalone. (c) On the dataset mushroom. (d) On the dataset electrical. (e) On the dataset htru2. (f) On the dataset letter.
6 CONCLUSION
We propose GBNRS, a novel rough set method. This is the first parameter-free rough set algorithm for processing continuous data; it requires no membership function and no optimization of any mid-computation parameters for processing continuous data. We demonstrably out-perform the current state-of-the-art NRS algorithm with a time complexity of O(N). Our adaptive method of selecting the neighborhood radius improves the quality of attribute reduction. On benchmark datasets widely used for feature selection, GBNRS obtains higher classification accuracy than both classical neighborhood rough sets and the current best NRS algorithm, FARNeMF. We show that efficiency is improved by more than 90% on the relatively large benchmark dataset letter. We also improve granular ball computing and propose MGBNRS to achieve even higher efficiency than GBNRS.

In our future work we would like to further improve the stability of GBNRS and MGBNRS. Our method for computing the generation positive region already improves the stability of GBNRS, but there is still some inherent instability in the process of partitioning the granular balls during attribute reduction, due to the random initialization of the centers of the granular balls in each new iteration. We also seek to improve the accuracy of GBNRS and MGBNRS: we were unable to achieve the highest accuracy on 3 of our 20 test datasets. This could either be because some granular balls are so small that their centers are too unstable to be considered part of the generation positive region (which we want to remedy in itself), or because the generation of granular balls in both GBNRS and MGBNRS has some inherent randomness because k-means is used, even though label information is more fully used in MGBNRS than in GBNRS. Having already reduced execution time to O(N), our next target is accuracy, which is currently held back by this randomness and instability. These stability issues are of great interest to us and provide strong motivation for future work.
7 ACKNOWLEDGMENTS
The authors greatly thank the handling associate editor and all anonymous reviewers for their valuable comments. This work was supported in part by the National Natural Science Foundation of China under Grant Nos. 61806030, 61876027, 61876201, and 61772096, the National Key Research and Development Program of China (2016QY01W0200), the Natural Science Foundation of Chongqing (cstc2019jcyj-msxmX0485), and NICE: NRT for Integrated Computational Entomology, US NSF award 1631776.
REFERENCES
[1] Aggarwal, Manish. "Probabilistic Variable Precision Fuzzy Rough Sets." IEEE Transactions on Fuzzy Systems (2015): 1-1.
[2] Aggarwal, Manish. "Rough Information Set and Its Applications in Decision Making." IEEE Transactions on Fuzzy Systems 25.2 (2017): 265-276.
[3] Albanese, Alessia , S. K. Pal , and A. Petrosino . ”Rough Sets, Kernel
Set, and Spatiotemporal Outlier Detection.” IEEE Transactions on
Knowledge and Data Engineering 26.1(2014):194-207.
[4] An, Shuang , et al. ”Data-Distribution-Aware Fuzzy Rough Set
Model and its Application to Robust Classification.” IEEE Trans
Cybern 46.12(2015):3073-3085.
[5] Chen, Degang , and Y. Yang . ”Attribute Reduction for Het-
erogeneous Data Based on the Combination of Classical and
Fuzzy Rough Set Models.” IEEE Transactions on Fuzzy Systems
22.5(2014):1325-1334.
[6] Chen, Hongmei , et al. ”A decision-theoretic rough set approach
for dynamic data mining.” IEEE Transactions on Fuzzy Systems
23.6(2015):1-1.
[7] Chen, Hongmei , et al. ”A Rough Set-Based Method for Up-
dating Decision Rules on Attribute Values’ Coarsening and Re-
fining.” Knowledge & Data Engineering IEEE Transactions on
26.12(2014):2886-2899.
[8] Chen, Yumin , et al. ”Measures of uncertainty for neighborhood
rough sets.” Knowledge-Based Systems 120(2017):226-235.
[9] Dai, Jianhua , et al. ”Attribute Selection for Partially Labeled
Categorical Data By Rough Set Approach.” IEEE Transactions on
Cybernetics (2016):1-12.
[10] Dai, Jianhua , et al. ”Maximal Discernibility Pairs based Approach
to Attribute Reduction in Fuzzy Rough Sets.” IEEE Transactions on
Fuzzy Systems (2017):1-1.
[11] Dai, Jianhua, et al. ”Neighbor inconsistent pair selection for at-
tribute reduction by rough set approach.” IEEE Transactions on
Fuzzy Systems 26.2 (2018): 937-950.
[12] Hoa, Nguyen Sinh, and Nguyen Hung Son. ”Some efficient algo-
rithms for rough set methods.” Proceedings IPMU. Vol. 96. 1996.
[13] Hu, Qinghua , et al. ”Efficient Symbolic and Numerical At-
tribute Reduction with Neighborhood Rough Sets.” PR & AI
21.6(2008):730-738.
[14] Hu, Qinghua , et al. ”Large-Scale Multi-Modality Attribute Re-
duction with Multi-Kernel Fuzzy Rough Sets.” IEEE Transactions
on Fuzzy Systems (2017):1-1.
[15] Hu, Qinghua , et al. ”Numerical Attribute Reduction Based on
Neighborhood Granulation and Rough Approximation.” Journal of
Software 19.3(2008):640-649.
[16] Jensen, Richard , and Q. Shen . ”Fuzzyrough attribute reduction
with application to web categorization.” Fuzzy Sets and Systems
141.3(2004):469-485.
[17] Juang, Chia Feng , and C. T. Lin . ”An online self-constructing
neural fuzzy inference network and its applications.” IEEE Trans-
actions on Fuzzy Systems 6.1(1998):12-32.
[18] Liang, Decui , Z. Xu , and D. Liu . ”A New Aggregation Method-
based Error Analysis for Decision-theoretic Rough Sets and Its
Application in Hesitant Fuzzy Information Systems.” IEEE Trans-
actions on Fuzzy Systems 25.6(2017):1685-1697.
[19] Liang, Jiye , et al. ”A Group Incremental Approach to Feature
Selection Applying Rough Set Technique.” IEEE Transactions on
Knowledge and Data Engineering 26.2(2014):294-308.
[20] Liu, Xiaodong , et al. ”The Development of Fuzzy Rough Sets with
the Use of Structures and Algebras of Axiomatic Fuzzy Sets.” IEEE
Transactions on Knowledge and Data Engineering 21.3(2009):443-
462.
[21] Maji, Pradipta. ”A rough hypercuboid approach for feature selec-
tion in approximation spaces.” IEEE Transactions on Knowledge
and Data Engineering 26.1 (2014): 16-29.
[22] Maji, Pradipta , and P. Garai . ”IT2 Fuzzy-Rough Sets and Max
Relevance-Max Significance Criterion for Attribute Selection.” IEEE
Transactions on Cybernetics 45.8(2014):1657-1668.
[23] Miao, Duoqian , et al. ”A Heuristic Algorithm for Reduction
of Knowledge.” Journal of Computer Research & Development
36.6(1999):681-684.
[24] Pawlak, Zdzisław. ”Rough sets.” International Journal of Comput-
er & Information Sciences 11.5(1982):341-356.
[25] Pawlak Z, Skowron A. Rudiments of rough sets[J]. Information
Sciences, 2006, 177(1):3-27.
[26] Qian, Yuhua , et al. ”Positive approximation: An accelerator for
attribute reduction in rough set theory.” Artificial Intelligence 174.9-
10(2010):597-618.
[27] Rehman, Noor, et al. ”SDMGRS: Soft dominance based multi
granulation rough sets and their applications in conflict analysis
problems.” IEEE Access 6 (2018): 31399-31416.
[28] Rodriguez, Alex, and Alessandro Laio. ”Clustering by fast search
and find of density peaks.” Science 344.6191 (2014): 1492-1496.
[29] Skowron, A. , and C. Rauszer . ”The Discernibility Matrices and
Functions in Information Systems.” (1992).
[30] Tsang, Eric CC, et al. ”Attributes reduction using fuzzy rough
sets.” IEEE Transactions on Fuzzy systems 16.5 (2008): 1130-1141.
[31] University, Tsinghua, et al. ”Theory of Fuzzy Quotient Space
(Methods of Fuzzy Granular Computing).” Journal of Software
14.4(2003):770-776.
[32] Wang, Changzhong, et al. ”Attribute reduction based on k-nearest
neighborhood rough sets.” International Journal of Approximate
Reasoning 106 (2019): 18-31.
[33] Wang, Changzhong , et al. ”A Fitting Model for Feature Selection
With Fuzzy Rough Sets.” IEEE Transactions on Fuzzy Systems
25.4(2017):741-753.
[34] Wang, Jue , and J. Wang . ”Reduction algorithms based on discerni-
bility matrix: The ordered attributes method.” Journal of Computer
Science and Technology 16.6(2001):489-504.
[35] Wang, Qi, et al. ”Local neighborhood rough set.” Knowledge-
Based Systems 153 (2018): 53-64.
[36] Xia, Shuyin, et al. ”Complete Random Forest based Class Noise
Filtering Learning for Improving the Generalizability of Classifier-
s.” IEEE Transactions on Knowledge and Data Engineering (2018).
[37] Xia, Shuyin, et al. ”Granular ball computing classifiers for efficient,
scalable and robust learning.” Information Sciences 483 (2019): 136-
152.
[38] Xu, Feifei , et al. ”Mutual Information-Based Algorithm for Fuzzy-
Rough Attribute Reduction.” Journal of Electronics & Information
Technology 30.6(2008):1372-1375.
[39] Yang, and Yanyan. ”Incremental perspective for feature selection
based on fuzzy rough sets.” IEEE Transactions on Fuzzy Systems
(2017):1-1.
[40] Yang, Yanyan , D. Chen , and H. Wang . ”Active Sample Selection
Based Incremental Algorithm for Attribute Reduction With Rough
Sets.” IEEE Transactions on Fuzzy Systems 25.4(2017):825-838.
[41] Yao, Jing Tao , and N. Azam . ”Web-Based Medical Decision
Support Systems for Three-Way Medical Decision Making With
Game-Theoretic Rough Sets.” IEEE Transactions on Fuzzy Systems
23.1(2015):3-15.
[42] Yao, Yiyu. ”Granular computing for data mining.” Defense &
Security Symposium 2006.
[43] Yao, Yiyu , and Y. Zhao . ”Discernibility matrix simplifica-
tion for constructing attribute reducts.” Information Sciences
179.7(2009):867-882.
[44] ZADEH, and A. L. . ”Toward a theory of fuzzy information
granulation and its centrality in human reasoning and fuzzy logic.”
Fuzzy Sets & Systems 90.90(1997):111-127.
[45] Zhao, Suyun , et al. ”A Novel Approach to Building a Robust
Fuzzy Rough Classifier.” IEEE Transactions on Fuzzy Systems
23.4(2015):769-786.
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING 12
Shuyin Xia received his B.S. degree in 2008
and his M.S. degree in 2012, both in computer
science and both from the Chongqing University
of Technology in China. He received his Ph.D.
degree from the College of Computer Science
at Chongqing University in China. He is an as-
sociate professor and a Ph.D. supervisor at the
Chongqing University of Posts and Telecommu-
nications in Chongqing, China, a position he has
held since 2015. His research interests include
data mining, granular computing, fuzzy rough
sets, classifiers, and label noise detection. His research has been
published in many prestigious journals and conferences, such as NIPS,
IEEE-TKDE, and IS. He is the current executive deputy director of the
Chongqing Municipal Public Security Bureau Qihoo 360 Big Data and
Network Security Joint Lab. He is also the director of the Chongqing
Artificial Intelligence Association and an IEEE Member.
Hao Zhang received his B.S. degree in internet
of things engineering from the Chongqing Uni-
versity of Science and Technology in Chongqing,
China, in 2014. He is currently pursuing an M.S.
degree in computer technology at the Chongqing
University of Posts and Telecommunications in
Chongqing, China. His research interests in-
clude rough sets, granular computing, and data
mining.
Wenhua Li received her B.S. degree in in-
formation and computation science from the
Chongqing University of Posts and Telecommu-
nications in Chongqing, China, in 2014. She is
currently pursuing an M.S. degree in computer
science and technology at the Chongqing U-
niversity of Posts and Telecommunications in
Chongqing, China. Her research interests in-
clude rough sets, granular computing, and data
mining.
Guoyin Wang received a B.E. degree in com-
puter software in 1992, an M.S. degree in com-
puter software in 1994, and a Ph.D. degree in
computer organization and architecture in 1996,
all from Xi’an Jiaotong University in Xi’an, Chi-
na. His research interests include data mining,
machine learning, rough sets, granular comput-
ing, cognitive computing, and so forth. He has
worked at the University of North Texas, USA,
and the University of Regina, Canada, as a Vis-
iting Scholar. Since 1996, he has been work-
ing at the Chongqing University of Posts and Telecommunications in
Chongqing, China, where he is currently a Professor and a Ph.D. super-
visor, the Director of the Chongqing Key Laboratory of Computational
Intelligence, and the Dean of the Graduate School. He is the Steering
Committee Chair of the International Rough Set Society (IRSS), a Vice-
President of the Chinese Association for Artificial Intelligence (CAAI),
and a council member of the China Computer Federation (CCF).
Elisabeth Giem received two bachelor’s de-
grees, one in pure mathematics and one in
music (concentration in performance) from the
University of California, Riverside (UCR). She
received a master’s degree in computational and
applied mathematics from Rice University, and a
master’s degree in pure mathematics from UCR.
She joined Zizhong Chen’s SuperLab in 2018
as a computer science Ph.D. student, and has
been awarded the NSF NRT in Computational
Entomology Fellowship. Her research interests
include but are not limited to high-performance computing, parallel and
distributed systems, big data analytics, computational entomology, and
numerical linear algebra algorithms and software.
Zizhong Chen received his bachelor’s degree
in mathematics from Beijing Normal Universi-
ty, master’s degree in economics from Renmin
University of China, and Ph.D. degree in com-
puter science from the University of Tennessee,
Knoxville. He is a professor of computer sci-
ence at the University of California, Riverside.
His research interests include high performance
computing, parallel and distributed systems, big
data analytics, cluster and cloud computing,
algorithm-based fault tolerance, power and en-
ergy efficient computing, numerical algorithms and software, and large-
scale computer simulations. His research has been supported by the
US National Science Foundation, US Department of Energy, Nvidia,
and Microsoft Corporation. He received a CAREER Award from the US
National Science Foundation and Best Paper Awards from the Interna-
tional Supercomputing Conference and IEEE International Conference
on Cluster Computing. He is a senior member of the IEEE and a life
member of the ACM. He currently serves as a subject area editor for
Elsevier Parallel Computing journal and an associate editor for IEEE
Transactions on Parallel and Distributed Systems.