Towards an Efficient Rule-based Framework for Legal
Reasoning
Qing Liu (corresponding author), Mohammad Badiul Islam, Guido Governatori
Software & Computational Systems, Data61, CSIRO, Australia
{Q.Liu, Badiul.Islam, Guido.Governatori}@data61.csiro.au
Abstract
A rule-based knowledge system consists of three main components: a set of
rules, facts to be fed to the reasoning corresponding to the data of a case,
and an inference engine. In general, facts are stored in (relational)
databases that represent knowledge in a first-order based formalism. However,
legal knowledge is represented in defeasible deontic logic due to its
particular features, which cannot be supported by first-order logic. In this
work, we present a unified framework that supports efficient legal reasoning.
In the framework, a novel inference engine is proposed in which the Semantic
Rule Index can identify candidate rules together with their corresponding
semantic rules, if any, and an inference controller is able to guide the
executions of queries and reasoning. It can eliminate rules that cannot be
fired, avoiding unnecessary computations in early stages. The experiments
demonstrate the effectiveness and efficiency of the proposed framework.
Keywords: Rule-based Legal Reasoning, Query Processing, Index,
Integration
1. Introduction
Normative systems can be understood as a set of norms, where each norm can be
represented in an "IF precondition THEN conclusion" structure: the IF part
represents the precondition of applicability of the norm and the THEN part
corresponds to the normative conclusion of the norm [1, 2]. Accordingly,
rule-based systems provide an adequate framework for the representation of
norms, normative systems and legal knowledge (see, for example, [3, 4, 5] for
some rule-based frameworks for legal reasoning).
Typically, a rule-based knowledge system consists of three main components: a
set of rules (encoding the norms and principles to be used to perform the
required reasoning), facts to be fed to the reasoning corresponding to the
data of a case, and an inference engine. In general, for many applications,
the data needed to run the application is stored in (relational) databases.
Relational databases essentially represent knowledge in a first-order based
formalism, and query languages mostly exploit first-order logic features.
However, legal knowledge has some particular features that make first-order
logic not fully suitable to represent it [6]. In general, the proper
representation of norms and legal knowledge requires:

• defeasible reasoning [7, 8], and

• reasoning about and with deontic concepts [9].
We use rules converted from the New South Wales mandatory reporter guidelines
[10] as an example.

Example 1.1. Consider the provision prescribing to report to Community
Services (CS): if there is a situation where a child has been sexually
abused, the initiating person has continuing or imminent contact with the
victim, and there was some coercion or the victim is in a situation of
inferiority, then the situation has to be reported immediately to Community
Services (CS). Otherwise, the normal procedure is to file a formal report to
CS. In case of a problematic behaviour without further conditions, the
educator has to continue to monitor the situation. The above legal
requirement can be formally represented by the following rules [3]:
r1: SexualBehaviourVsOther ∧ CoercionOrInferior ∧ ContactWithVictim
    ⇒ [OANP]reportToCSImmediately
r2: SexualBehaviourVsOther ∧ CoercionOrInferior
    ∧ ¬[OANP]reportToCSImmediately ⇒ [OANP]reportToCS
r3: PersistentSexualBehaviourVsOther ∧ CoercionOrInferior
    ∧ ¬[OANP]reportToCSImmediately ∧ ¬[OANP]reportToCS
    ⇒ [OANP]consultWithCWU ⊗ [OM]monitor

where ⇒ indicates that the rule is a defeasible rule that can be defeated by
contrary evidence, [OANP] and [OM] are deontic operators, and the operator ⊗
in a conclusion is used for expressing a preference ordering (as in this
case) for alternatives, or a reparation chain in which, if an obligation
(e.g. consultWithCWU in r3) is violated, the violation can be compensated by
the next obligation (e.g. monitor).
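To make the structure of such rules concrete, the following Java sketch (our
illustration, not the authors' implementation; all type and field names are
assumptions) shows one possible in-memory representation, with an antecedent
of plain or modal literals and a reparation-chain conclusion:

import java.util.List;

// A minimal sketch of a defeasible deontic rule. "OANP", "OM", etc. are the
// deontic modalities from the text; a null modality denotes a plain literal.
record Literal(String name, boolean negated, String modality) {}

record Rule(String id,
            List<Literal> antecedent,      // conjunction of (modal) literals
            List<Literal> reparationChain  // [O1]c1 ⊗ [O2]c2 ⊗ ...
) {}

class RuleExample {
    // r3 from Example 1.1, transcribed into the sketch above.
    static final Rule R3 = new Rule("r3",
        List.of(new Literal("PersistentSexualBehaviourVsOther", false, null),
                new Literal("CoercionOrInferior", false, null),
                new Literal("reportToCSImmediately", true, "OANP"),
                new Literal("reportToCS", true, "OANP")),
        List.of(new Literal("consultWithCWU", false, "OANP"),
                new Literal("monitor", false, "OM")));
}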
By the above example, we can see that rule-based legal knowledge systems have
their own features compared with other rule-based knowledge systems:

• Logic: Defeasible logic with deontic operators is used to model legal
rules. Most rule-based systems consider strict rules only, meaning that
whenever a precondition is indisputable, then so is the conclusion. A legal
knowledge-based system, however, involves not only strict rules but also
defeasible rules and defeaters (see Section 2 for details). This indicates
that the semantic relations among rules are more complex in a rule-based
legal system than in a general rule-based system.

• Precondition: A set of predicates forms the precondition of a rule. Each
predicate name represents a legal term in a legal system, and the predicate
corresponds to a simple or complex database query that may involve multiple
selections, projections and joins across tables and databases. Hence, a
precondition can be represented by a set of DB queries. In general rule-based
systems, by contrast, each predicate corresponds to a database attribute, so
a precondition represents only one DB query in a "Select-From-Where" form,
not a set of queries with richer semantics to describe preconditions as legal
rules do.

• Conclusion: Given the same precondition, the conclusion of a rule could be
a reparation chain in which, if an obligation is violated, the violation can
be compensated by the next obligation and so forth using the ⊗ operator
(e.g. r3 in Example 1.1), whereas in general rule-based systems a conclusion
is indisputable.
The process of determining which rules should be applied and how they should
be interpreted is often referred to as legal reasoning. To allow
knowledge-based systems to support large-scale legal reasoning for decision
making, and even to be able to explain or legally justify the conclusions
reached, it is critical to understand how reasoning is achieved. In general,
there are two main strategies of rule-based reasoning: (a) forward chaining,
which starts with existing facts and applies rules to derive all possible
facts, and (b) backward chaining, which starts with the desired conclusion
and works backwards to find supporting facts. There are some challenges that
must be overcome for large-scale legal reasoning due to its own
characteristics:

• Complex semantic relations: Rules in the legal domain have not only
dependency relationships but also defeater relationships. Together with the
reparation chain situation, they introduce more dynamics during the reasoning
process, which essentially leads to a performance challenge.

• Inference efficiency: Rule matching determines how to match facts stored in
databases with rules. It is a crucial issue that influences the efficiency of
reasoning. A well-defined index may mean the difference between hours and a
few seconds in rule matching [11]. In many rule-based systems, the
precondition of a rule is constructed from predicates that directly
correspond to attributes in relational databases. Indexes are then built over
those attributes to facilitate fast rule matching. However, given a large
number of legal rules and complex predicates, it is impossible for the
existing methods to build an index by decomposing each predicate to the
attribute level, due to huge memory consumption and the complex relations
among rules. Moreover, the whole index may need to be updated if a database
relation changes.

• Recursive problem: When modelling real-world applications, some rules
depend on other rules. In Example 1.1, r3 is dependent on r1 and r2 because
two of its predicates depend on the conclusions of r1 and r2 respectively,
and r2 is in turn dependent on r1. Agrawal et al. show that recursive rules
can be converted into solving the transitive closure on the database
relations [12]. Due to the limitations of the attribute-based approach
mentioned above, new methods that do not rely on attributes and database
relations are needed to address the recursive problem for legal reasoning.

• Reactive inference: In the current reasoning process, the inference engine
looks for rules which match facts stored in the working memory or provided by
users. One rule is selected from the "conflict set" and executed to generate
a new fact. The inference engine then continues the reasoning based on the
new fact together with the previously given facts. We call this reactive
inference because the inference engine only reasons over what is given. In a
Big Data environment, it is impossible for users to know all the facts in
advance.

• Database query efficiency: Most of the existing work focuses on reducing
the response time for rule matching but neglects the query execution time and
the reasoning time. [13] is the only work that studied the query efficiency
problem between databases and rule-based legal systems. Its empirical
experiments suggest that query executions are the most expensive processes in
rule-based legal decision making. However, how to map queries to the
corresponding rules is not studied in that work.
In this paper, we study the integration problem between database systems and
rule-based legal systems. Given a set of queries/predicates that users are
interested in, we present a unified framework with the aim of minimizing the
overall response time for legal reasoning. In the framework, a novel
inference engine is proposed in which the Semantic Rule Index can identify
candidate rules together with their corresponding semantic rules, if any, and
an inference controller is able to guide the executions of queries and
reasoning. It can eliminate rules that cannot be fired, avoiding unnecessary
computations in early stages. To the best of our knowledge, this is the first
work that provides a seamless integration between the inference engine and
databases for rule-based legal reasoning. The contributions are summarised as
follows:

• We formally define the "Rule Containment Query" problem: given a set of
predicates, it returns both the strict rules and the defeasible rules that
can be fired, as well as their semantic rules, which explain why those rules
are fired.

• A novel inference engine for legal reasoning is proposed. It includes:

  – The two-layer Semantic Rule Index, which is able to identify candidate
  rules efficiently. It also establishes the semantic relationships among
  rules to solve the recursive problem.

  – An inference controller that guides reasoning and queries to avoid
  unnecessary query and rule computations.

• Database caching techniques are adopted in the unified framework to improve
the overall database query performance, which further reduces query response
times.

• The experiments conducted demonstrate the effectiveness and efficiency of
the proposed framework.
The rest of the paper is organized as follows. We introduce Defeasible Logic
and Defeasible Deontic Logic in Section 2. In Section 3, we formally define
the problem. The framework with the novel inference engine and database
caching techniques is presented in Section 4. Evaluation results are shown in
Section 5. We review related work in Section 6. Finally, Section 7 concludes
the work.
2. Defeasible Logic
Legal reasoning can be viewed as rule-guided² activities and processes
involving series of actions leading to a legal decision [14]. Indeed,
Defeasible Logic (DL) is a formalism that has been successfully used for
legal reasoning to generate a legal decision (and it has been proved that
other formalisms successful in legal reasoning correspond to variants of DL
[15]). Defeasible Deontic Logic has been successfully used for applications
in legal reasoning [16, 17, 18, 19], and it has been shown that it does not
suffer from the problems affecting other logics used for reasoning about
norms and compliance [20, 19]. Thus Defeasible Deontic Logic is a
conceptually sound approach for the representation of regulations and, at the
same time, it offers a computationally feasible environment to reason about
them ([21] proved that the logic is computationally feasible, since we can
compute the extension of a theory in linear time).

Defeasible Logic, a "skeptical" nonmonotonic logic (meaning that it does not
support contradictory conclusions), was originally proposed by Donald Nute
[22]. Since then it has been widely used in the legal domain or closely
related areas, such as modelling regulations [23], e-contracting [17, 24],
business process compliance [25, 26] and automatic negotiation systems [27].
The modelling of regulations in DL also offers support for decision support,
explanation, anomaly detection, hypothetical reasoning and debugging tasks.
Decision support is used to infer a correct answer from given rules and
regulations. DL is one of the possible solutions, since regulations may
contradict one another: defeasible rules are not necessarily in force;
instead, they may be blocked by other rules with contrary conclusions [23].

² Rules are encoded according to the legal requirements described in legal
documents.
A defeasible theory D (a knowledge base in defeasible logic, or a defeasible
logic program [28]) consists of five different kinds of knowledge: facts,
strict rules, defeasible rules, defeaters, and a superiority relation. D is a
triple (F, R, >) where F and R are finite sets of facts and rules
respectively, and > is a superiority relation on R.

The language of DL consists of a finite set of literals, where a literal is
either an atomic proposition or its negation. Given a literal l, ∼l denotes
its complement. That is, if l = p then ∼l = ¬p, and if l = ¬p then ∼l = p.

Facts are logical statements describing indisputable facts, represented
either in the form of states of affairs (a literal or a modal literal, see
Section 2.1) or actions that have been performed, and are considered to be
always true. For example, "John is a human" is represented by: human(John).
A rule r, on the other hand, describes the relations between a set of
literals (the precondition A(r), which can be empty) and a literal (the
conclusion C(r)). We can specify the strength of the rule relation using the
three kinds of rules supported by DL, namely: strict, defeasible, and
defeater.

Strict rules are rules in the classical sense: whenever the premises are
indisputable (e.g. a fact) then so is the conclusion. For example,

human(X) → mammal(X)

which means "Every human is a mammal".

It is worth mentioning that strict rules with an empty precondition can be
interpreted the same way as facts. However, in practice, facts are more
likely to be used to describe contextual information, while rules are more
likely to be used to represent the reasoning underlying the context.

Defeasible rules are rules that can be defeated by contrary evidence. For
example, typically mammals cannot fly, written formally:

mammal(X) ⇒ ¬flies(X)
The idea is that if we know that X is a mammal, then we may conclude that it
cannot fly, unless there is other, not defeated, evidence suggesting that it
may fly (for example, that the mammal is a bat). A defeasible rule with an
empty precondition can be considered as a presumption.

Defeaters are rules that cannot be used, on their own, to draw any
conclusions. Their only use is to prevent some conclusions, i.e., to defeat
some defeasible rules by producing evidence to the contrary. For example, the
rule:

heavy(X) ⇝ ¬flies(X)

states that the fact that an animal is heavy is not sufficient to conclude
that it does not fly. It is only evidence against the conclusion that a heavy
animal flies. In other words, we do not wish to conclude ¬flies if heavy; we
simply want to prevent a conclusion flies.
A full definition of the proof theory can be found in [16, 29]. Roughly, the
rules with conclusion p form a team that competes with the team consisting of
the rules with conclusion ¬p. If the former team wins, p is defeasibly
provable, whereas if the opposing team wins, p is non-provable. To conclude,
let us consider D as a theory in DL (as described above). A conclusion of D
is a tagged literal and can have one of the following four forms: +Δq,
meaning that q is definitely provable in D (i.e. using only facts and strict
rules); −Δq, meaning that we have proved that q is not definitely provable in
D; +∂q, meaning that q is defeasibly provable in D; and −∂q, meaning that we
have proved that q is not defeasibly provable in D.

Strict derivations are obtained by forward chaining of strict rules, while a
defeasible conclusion p can be derived if there is a rule whose conclusion is
p, whose prerequisites (precondition) have either already been proved or are
given in the case at hand (i.e., facts), and any stronger rule whose
conclusion is ¬p has a precondition that fails to be derived. In other words,
a conclusion p is (defeasibly) derivable when: p is a fact, or there is an
applicable strict or defeasible rule for p, and either all the rules for ¬p
are discarded (i.e., not applicable) or every rule for ¬p is weaker than an
applicable rule for p.
2.1. Defeasible Deontic Logic

It has been argued that legal reasoning requires two types of rules:
constitutive rules and prescriptive rules. Constitutive rules are used to
model the definitions of terms and parameters specific to legal documents,
for example, the definitions of terms in an act, whereas prescriptive rules
are applied for encoding the obligations, prohibitions, permissions, . . . ,
and the conditions under which they enter into force according to a specific
legal document. To correctly model the provisions corresponding to
prescriptive norms, we have to supplement the language with deontic
operators. In this respect we follow the classification proposed by [30, 31].
In addition, the logic has mechanisms to terminate and remove obligations
(see [26] for full details). For obligations and permissions we use the
following notation:

• [P]p: p is permitted;

• [OM]p: there is a maintenance obligation for p;³

• [OAPP]p: there is an achievement preemptive and perduring obligation for p;

• [OAPNP]p: there is an achievement preemptive and non-perduring obligation
for p;

• [OANPP]p: there is an achievement non-preemptive and perduring obligation
for p;

• [OANPNP]p: there is an achievement non-preemptive and non-perduring
obligation for p.
Compensations are implemented based on the notion of 'reparation chain' [32].
A reparation chain is an expression

[O1]c1 ⊗ [O2]c2 ⊗ · · · ⊗ [On]cn,

where each [Oi] is an obligation, and each ci is the content of the
obligation (modelled by a literal). The meaning of a reparation chain is that
c1 is obligatory, but if the obligation of c1 is violated, i.e., we have ¬c1,
then the violation is compensated by c2 (which is then obligatory). But if
even [O2]c2 is violated, then this violation is compensated by c3 which,
after the violation of c2, becomes obligatory, and so on.

³ Prohibitions can be expressed as maintenance obligations with a negated
content, i.e., [OM]¬p.
Defeasible Deontic Logic allows deontic expressions (but not reparation
chains) to appear in the body of rules. Thus we can have rules like:

restaurant, [P]sellAlcohol ⇒ [OM]showLicense ⊗ [OANPP]payFine

The rule above means that if a restaurant has a license to sell alcohol
(i.e., it is permitted to sell it, [P]sellAlcohol), then it has a maintenance
obligation to expose the license ([OM]showLicense); if it does not, then it
has to pay a fine ([OANPP]payFine). The obligation to pay the fine is
non-preemptive (meaning that it cannot be paid before the violation). The
logic is equipped with a binary relation over rules, called the superiority
relation, that allows us to handle rules with conflicting conclusions: for
example, a rule r setting a general prohibition and a second rule s that
derogates the prohibition, permitting the conclusion. This type of situation
is common in legal reasoning and can be modelled by saying that s is
"stronger" than r, in symbols s > r. If both rules apply, we say that s
defeats r. For example, continuing the restaurant example above, we can have
the rules

r1: restaurant ⇒ [OM]¬sellAlcohol
r2: restaurant, license ⇒ [P]sellAlcohol
r2 > r1

The first rule (r1) prescribes the general prohibition for a restaurant to
sell alcohol, and the second rule (r2), in conjunction with the superiority
relation, derogates the prohibition, permitting the sale if the restaurant
has a license to sell alcohol.

For a full description of the logic and its features, see [17, 26, 3].
The reasoning to determine what obligations, prohibitions, and permissions
are derivable from a set of facts and a set of rules is as follows.

An obligation [O]p (where [O], [Ox] and [Dy], in the description below, are
placeholders for the obligations described above) is derivable if:

1. [O]p is given as one of the facts, or

2. there is a rule

r: a1, . . . , an ⇒ [O1]p1 ⊗ . . . ⊗ [Om]pm ⊗ [O]p ⊗ . . .

such that

(a) for all 1 ≤ i ≤ n, ai is provable, and

(b) for all 1 ≤ j ≤ m, [Oj]pj and ¬pj are provable, and

(c) for all rules

s: b1, . . . , bk ⇒ [D1]q1 ⊗ . . . ⊗ [Dl]ql ⊗ [D]p′

such that p′ is the negation of p, either

i. there exists 1 ≤ i ≤ k such that bi is not provable, or

ii. there exists 1 ≤ j ≤ l such that either [Dj]qj or ¬qj is not provable, or

iii. r defeats s.

The idea is that there must be a rule that fires: all the elements in the
antecedent are provable (a), and, in case the conclusion is an obligation for
a reparation, all the obligations before it have to be violated. Thus, the
violated obligations were in force (the obligations were provable) and we
have evidence that they were violated (the negation of the content of each
violated obligation is provable) (b). Also, we have to ensure that no rules
for the opposite fire (c), and if they do, these rules are weaker than the
rule for the obligation we want to conclude.

For permission, we have the same conditions, but we use [P]p instead of [O]p;
also, we conclude [P]p if we can conclude [O]p. Due to space reasons, readers
interested in the semantics, deontic operator conversions, conflict
detection, conflict resolution, and the algorithm implementing this
rule-based system are referred to [33, 21, 26, 3] for details.
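As a concrete reading of conditions (a)–(c), the following Java sketch is our
simplification, not the paper's algorithm: it assumes literals are plain
strings, that a set `provable` holds everything already established
(including modal literals written as "[O]p"), and it collapses conditions i.
and ii. into a single body-provability test.

import java.util.*;

class DeonticDerivation {
    // chain: the contents c1..cn of the rule's reparation chain, in order.
    record Rule(String id, List<String> body, List<String> chain) {}

    static boolean obligationDerivable(String p, List<Rule> rules,
                                       Set<String> provable,
                                       Map<String, Set<String>> defeats) {
        if (provable.contains("[O]" + p)) return true;      // condition 1: given as a fact
        for (Rule r : rules) {                              // condition 2
            int k = r.chain().indexOf(p);
            if (k < 0 || !provable.containsAll(r.body())) continue;  // (a)
            boolean priorViolated = true;                   // (b): earlier obligations violated
            for (int j = 0; j < k && priorViolated; j++) {
                String cj = r.chain().get(j);
                priorViolated = provable.contains("[O]" + cj) && provable.contains(neg(cj));
            }
            if (!priorViolated) continue;
            boolean opposersFail = true;                    // (c): rules for ¬p discarded or defeated
            for (Rule s : rules) {
                if (!s.chain().contains(neg(p))) continue;
                boolean sFires = provable.containsAll(s.body());  // i./ii. collapsed in this sketch
                boolean rDefeatsS = defeats.getOrDefault(r.id(), Set.of()).contains(s.id()); // iii.
                if (sFires && !rDefeatsS) { opposersFail = false; break; }
            }
            if (opposersFail) return true;
        }
        return false;
    }

    static String neg(String l) { return l.startsWith("¬") ? l.substring(1) : "¬" + l; }
}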
3. Problem Statement
Throughout the discussion in the previous sections, we can see that
predicates and literals have the same meaning. In the rest of the article, we
refer to them interchangeably based on the context.

A rule can be fired only if all the literals in its antecedent are provable.
We classify literals into two types based on how they are derived:

Definition 3.1 (Fact Literal). A literal in a rule's antecedent that provides
indisputable facts through an SQL query statement is a fact literal.

Definition 3.2 (Dependent Literal). A literal in a rule's antecedent that is
dependent on another rule's conclusion is a dependent literal.

The semantic relationships between two rules can be defined as:

Definition 3.3 (Support Rule). Rule rj is rule ri's support rule if ri has a
dependent literal on rj.

Definition 3.4 (Defeat Rule). Rule rj is rule ri's defeat rule if ri < rj.

Example 3.1. Figure 1 shows an example of a rule set. Among the literals of
all the rules' antecedents, pi (i ∈ 1..7) and ¬p4 are fact literals. ¬d1, ¬d2
and d5 are dependent literals because they are dependent on the conclusions
of rules r2, r3 and r5 respectively.
r1: p1 ∧ [O]¬d1 ∧ [O]¬d2 ⇒ [O]d3
r2: p3 ∧ p4 ∧ p5 ⇒ [O]¬d1
r3: p3 ∧ p5 ∧ [O]¬d1 ⇒ [O]¬d2 ⊗ [O]d7
r4: p1 ∧ p2 ∧ [O]¬d1 ⇒ [O]d4
r5: p5 ⇒ [O]d5
r6: p5 ∧ p7 ∧ ¬p4 ∧ [O]d5 ⇒ [O]¬d7
r7: p3 ∧ p6 ⇒ [O]d6
r3 < r6

Figure 1: A Rule Set Example
Example 3.2. Rule r1 in Figure 1 cannot be fired unless fact literal p1 is
true and ¬d1 and ¬d2 are provable. Since ¬d1 and ¬d2 are dependent literals,
we have to examine r1's support rules r2 and r3. At the same time, r2 is also
the support rule of r3. This is an example of the recursive rule problem
mentioned above. Furthermore, since r6 is a defeat rule of r3, r6 needs to be
reasoned about as well, to decide whether ¬d7 is derivable, which would
violate d7 in r3 in case ¬d2 in r3 is violated.
Next, we formally define the problems.

Definition 3.5 (Rule Containment Query (RCQ)). Given a query set
Q = {q1, q2, ..., qx} and a rule set R = {r1, r2, ..., rm}, where
Fr = {p1, p2, ..., pn} is the fact literal set of r (r ∈ R), the Rule
Containment Query returns all the rules RQ = {r | Fr ⊆ Q ∧ r ∈ R} that can be
fired, as well as their corresponding support rules and/or defeat rules.

By the above definition, users can apply an RCQ to a set of legal literals of
interest instead of using a set of all pre-known facts. We need to address
the recursive rule problem to decide whether a rule can be fired given a
query set. The sets of support rules and defeat rules provide an explanation
of the reasons why the rules are fired. Furthermore, an RCQ could involve
both backward chaining and forward chaining, since we want to identify all
the rules that can be fired.

Based on users' interests, we also define the Rule Intersection Query problem
as follows. It has a more relaxed constraint on rules compared with the Rule
Containment Query.

Definition 3.6 (Rule Intersection Query). Given a query set
Q = {q1, q2, ..., qx} and a rule set R = {r1, r2, ..., rm}, where
Fr = {p1, p2, ..., pn} is the fact literal set of r (r ∈ R), the Rule
Intersection Query returns all the rules RQ = {r | ∃p ∈ Fr, p ∈ Q ∧ r ∈ R}
that can be fired, with their corresponding support rules and/or defeat
rules.

In this paper, we focus on the containment query. The principles can be
adapted to address the intersection query easily.
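Before introducing the index, note that a naive baseline for the containment
test (our sketch, for illustration only; the types are assumptions) simply
scans every rule and checks Fr ⊆ Q. The index in Section 4 exists precisely
to avoid this linear scan and the subsequent recursive chasing of support and
defeat rules:

import java.util.*;

class NaiveRcq {
    // factLiterals: rule ID -> F_r; returns the rules with F_r ⊆ Q.
    // Support and defeat rules would still have to be chased recursively afterwards.
    static <L> List<String> containedRules(Map<String, Set<L>> factLiterals, Set<L> q) {
        List<String> fired = new ArrayList<>();
        for (Map.Entry<String, Set<L>> e : factLiterals.entrySet())
            if (q.containsAll(e.getValue()))   // the containment test F_r ⊆ Q
                fired.add(e.getKey());
        return fired;
    }
}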
4. System Framework
For a rule-based system to support decision-making activities drawn from
existing records stored in various databases and relevant legal requirements,
it is important that the two systems, database systems and rule systems, can
interact with each other but also work independently. In this section, we
present a novel unified framework that integrates the two systems seamlessly
for legal reasoning and answers the Rule Containment Query efficiently.

Given a query set Q that represents users' interests and a rule set R, the
overall response time for users receiving reasoning results is influenced by
three components: the time for identifying relevant rules, the query
execution time to compute facts, and the reasoning time. A straightforward
method is, for every rule r ∈ R: (a) compute whether Fr is contained in Q,
(b) execute all the fact literals p (p ∈ Fr) represented by SQL statements to
generate facts, (c) identify its support rules to decide whether its
dependent literals are provable, which is a recursive process, (d) send the
rule, the computed fact literals and the dependent literals to an inference
engine to conduct reasoning, and (e) check whether its defeat rules, if any,
are provable, which may itself again involve the recursive problem. All the
steps, apart from step (d) which is reasonably fast, require expensive
processing.
We propose a novel framework for rule-based legal reasoning that has the
following advantages. First, by incorporating a two-layer Semantic Rule Index
in the inference engine, we are able to search efficiently for candidate
rules based on the literals provided by users, as well as for their
corresponding support rules and defeat rules. Second, the Inference
Controller designed for the inference engine can use intermediate query
results and reasoning results to remove rules that cannot be fired in early
stages. It avoids unnecessary query computations and reasoning, which further
improves the performance significantly. Third, the framework allows for the
adoption of database caching strategies to reduce the response time.

Figure 2 shows the overall rule-based framework. The three main components
are: the Semantic Rule Index (SRI), the Inference Controller and the Database
Query.
[Figure 2 depicts the framework: a Rule Containment Query enters the
Inference Engine, whose Semantic Rule Index (Fact Literal Trie and Semantic
Rule Graph) and Inference Controller interact with the Rule System, the Rule
Interpreter, the Working Memory and the Database Query component (Predicates,
Cache and remote DBs), returning all triggered rules and explanations.]

Figure 2: The Rule-based Framework for Legal Reasoning
4.1. The Semantic Rule Index
Based on Definition 3.5, given a query set Q, the Rule Containment Query
problem seeks all the rules that can be fired. Our goal is to design an index
structure to efficiently solve the lookup problem as well as the rule
recursion problem. The Semantic Rule Index has two layers: the Fact Literal
Trie and the Semantic Rule Graph.

Let FR = ⋃_{r∈R} Fr be the set of distinct fact literals of all rules in R,
where Fr is the set of rule r's fact literals in its antecedent. Let f be a
bijective mapping f : FR → I, where I = {1, 2, 3, ..., |FR|}. By f, we assign
every fact literal p ∈ FR a unique ID. Since a precondition is a conjunction
of literals, all fact literals of a rule can be represented by a set of IDs
in ascending order, so we can work directly with the IDs for indexing
purposes.
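A direct way to realize the bijection f and the sorted ID encoding is
sketched below (the class and method names are ours, not the paper's):

import java.util.*;

class LiteralIds {
    private final Map<String, Integer> id = new LinkedHashMap<>();

    // f: assign IDs 1, 2, ... in first-seen order (any fixed bijection works).
    int idOf(String literal) { return id.computeIfAbsent(literal, l -> id.size() + 1); }

    // Encode a rule's fact literals as IDs in ascending order,
    // the form required by the Fact Literal Trie below.
    List<Integer> encode(Collection<String> factLiterals) {
        List<Integer> ids = new ArrayList<>();
        for (String l : factLiterals) ids.add(idOf(l));
        Collections.sort(ids);
        return ids;
    }
}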
The Fact Literal Trie

A data structure for fast superset and subset queries, named set-trie, is
presented in [34]. A set-trie is a tree storing a set of words, each word
represented by a path from the root of the set-trie to a node, corresponding
to the indices of the elements of the word. Next we show how we adapt this
data structure to solve our lookup problem.

The Fact Literal Trie (FLT) is a tree-based data structure built from all
fact literals' IDs with the properties of a set-trie. It has a root node with
key {}, and all its child nodes have literal IDs k as keys, where
k ∈ {1, 2, ..., |FR|}. A node k has ordered children with unique keys j in
ascending order, where ∀j, j > k. Therefore, a rule's fact literals can be
uniquely represented by a path in the FLT. The fact literal with the largest
ID of each rule has the rule's ID associated with it.
[Figure 3(a) repeats the rule set of Figure 1. Figure 3(b) shows the
corresponding Semantic Rule Index: the Fact Literal Trie over literal IDs
1–8 with rule IDs r1–r6 attached to the ends of their paths, and the Semantic
Rule Graph with support edges labelled by dependent literals (¬d1, ¬d2, d5)
and a defeat edge labelled (<, d7).]

Figure 3: An Example of a Rule Set and its Rule Index
Example 4.1. The upper part of Figure 3(b) shows the Fact Literal Trie for
the rule set in Figure 3(a), where I = {1, 2, ..., 8} represents
{p1, ..., p7, ¬p4} respectively. The path {} → 1 → 2 in the FLT represents
r4's fact literals. r2, r3 and r7 share the common path {} → 3, which
indicates that they all have fact literal p3 in their preconditions. 8 is the
largest fact literal ID of r6, so rule ID r6 is associated with it.

The FLT construction is similar to that in [34], except that it deals with
literals rather than words. To be self-contained, we present the complete FLT
construction method in Algorithms 1 and 2.
Algorithm 1: FactLiteralTrieConstruction
Input:  Rule set R
Output: Fact Literal Trie rootNode
1  create rootNode with key {};
2  foreach r ∈ R do
3      sort r.getFr() in ascending order;
4      node ← rootNode;
5      RuleInsertion(node, r);
6  return rootNode

To guarantee the property that all children's keys are larger than that of
their parent, we pre-sort the literal IDs k of each rule in ascending order
(Line 3 in Algorithm 1). Therefore, in the RuleInsertion method, a node with
a smaller key is always inserted before a node with a larger key. Later we
will show that, by this feature, the FLT is able to filter irrelevant rules
efficiently given a query.
Algorithm 2: RuleInsertion
Input:  node, r
Output: Fact Literal Trie rootNode
1  if r.getFr().getCurrentP() ≠ null then
2      if there exists a child of node with key k = r.getFr().getCurrentP() then
3          nextNode ← child of node with key k;
4      else
5          nextNode ← create child of node with key k;
6      RuleInsertion(nextNode, r.getFr().getNextP());
7  else
8      node.setRuleID(r.getID());
9  return

In Algorithm 2, we first check whether all the fact literals p of rule r have
been visited (Lines 1–7). Lines 2–5 build a new child node to hold key k if k
does not exist among the children of the current node; otherwise, the
algorithm is ready to examine the next literal based on the child of the
current node with key k (Line 6). If all the fact literals have been inserted
into the trie, the corresponding rule ID is assigned to the literal with the
largest literal ID (Line 8).
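In Java, Algorithms 1 and 2 amount to a few lines over a sorted child map. In
the sketch below (ours; a TreeMap keeps the children's keys in ascending
order, which the search in Algorithm 3 relies on):

import java.util.*;

class TrieNode {
    final TreeMap<Integer, TrieNode> children = new TreeMap<>(); // ordered child keys
    String ruleId;              // set on the node holding the rule's largest literal ID

    // Insert a rule whose fact-literal IDs are pre-sorted ascending (Algorithm 1, Line 3).
    void insert(List<Integer> sortedIds, int pos, String ruleId) {
        if (pos == sortedIds.size()) { this.ruleId = ruleId; return; } // Algorithm 2, Line 8
        children.computeIfAbsent(sortedIds.get(pos), k -> new TrieNode())
                .insert(sortedIds, pos + 1, ruleId);
    }
}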
The Semantic Rule Graph

A Semantic Rule Graph (SRG) is a labelled directed acyclic graph
g = (V, E, L(v), L(e)), where V is a set of vertices, each vertex v
representing a rule, and L(v) is a vertex labelling function,
L(v) ∈ {r1, ..., r|R|}. E ⊆ V × V is a set of directed edges, each edge
<vi, vj> representing a semantic relationship between ri and rj. L(vi, vj) is
an edge labelling function, where L(vi, vj) = d if rj is a support rule of ri
and d is the corresponding dependent literal, or L(vi, vj) = (<, d′) if rj is
ri's defeat rule and d′ is an obligation in ri's conclusion, indicating that
rj has ¬d′ in its conclusion.

The SRG models the support and defeat relationships among rules. Given a rule
ri, all the vertices that can be reached directly from ri following the
edges' directions are its support rules or defeat rules. All the vertices
that can reach ri against the edges' directions are rules that have ri as a
support rule or defeat rule. The recursive rule problem is solved through
graph traversal. Note that the Semantic Rule Graph constructed may consist of
a set of graphs, in which each graph captures the rules that have semantic
relations among them.
Example 4.2. The lower part of Figure 3(b) shows an example of the Semantic
Rule Graph. By the graph, we can see that r2 is the support rule of r1, r3
and r4, because each of them has a path to r2. Similarly, r3 is the support
rule of r1. r6 is the defeat rule of r3, and r5 is r6's support rule. Since
r7 does not have any semantic relationship with the rest of the rules, it is
not present in the Semantic Rule Graph.
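One possible encoding of the labelled SRG is sketched below (ours, not the
paper's; sealed interfaces and records require Java 17):

import java.util.*;

class SemanticRuleGraph {
    sealed interface Label permits Support, Defeat {}
    record Support(String dependentLiteral) implements Label {}  // L(vi, vj) = d
    record Defeat(String obligation) implements Label {}         // L(vi, vj) = (<, d′)

    // out.get(ri) maps each rj reachable directly from ri to the edge label,
    // i.e., rj supports or defeats ri.
    final Map<String, Map<String, Label>> out = new HashMap<>();

    void addEdge(String ri, String rj, Label label) {
        out.computeIfAbsent(ri, k -> new HashMap<>()).put(rj, label);
    }
}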
4.2. Querying the Semantic Rule Index
As introduced in Section 2.1, for defeasible deontic logic, a rule r can be
fired if and only if: (a) all its fact literals are provable, (b) all its
dependent literals are provable, and (c) the rules for the opposite
conclusion are either defeated or do not fire. Based on the Semantic Rule
Index constructed, the query algorithm aims at identifying the candidate
rules and their corresponding semantic rules efficiently.

Given a query Q, there are two steps to query the Semantic Rule Index. First,
we search the Fact Literal Trie for all the candidate rules whose fact
literals are contained in Q (Algorithm 3). Then the candidate rules are
pushed into the Semantic Rule Graph to identify their corresponding rules
that have semantic relations (Algorithm 4).
Fact Literal Trie Search

Algorithm 3: FactLiteralTrieSearch
Input:  Trie root node, Query Q
Output: CQ
1  if node.getRuleID() ≠ null then
2      CQ ← node.getRuleID();
3  if node.hasChild() == false or node.getKey() ≥ Q.getMaxID() then
4      return
5  foreach node's child c with key k ∈ Q do
6      node ← c;
7      FactLiteralTrieSearch(node, Q);

Algorithm 3 shows the key steps. The algorithm works in a recursive
depth-first search manner. It identifies the paths whose keys are all
contained in Q (Lines 5–7). The current recursion stops when it reaches an
FLT leaf node or the key of the node is greater than or equal to the maximum
key in Q (Lines 3–4). Recall that one of the main features of the Fact
Literal Trie is that the keys of children nodes are always larger than that
of their parent; therefore, there is no need to search the children of the
current node further. The rules whose fact literals are contained in Q can be
identified through the nodes with rule IDs that are visited by the algorithm
(Lines 1–2).
Example 4.3. Consider query Q = {1, 3, 5, 7} and the Fact Literal Trie in
Figure 3(b). Algorithm 3 first identifies path 1: {} → 1, path 2: {} → 3 and
path 3: {} → 5 as the candidate paths to be further examined. For path 1,
since r1 is associated with key 1, r1 can be added to the candidate result
set CQ immediately; since key 1's child key 2 is not contained in Q, there is
no need to examine its children further. Path 2 can be extended to
{} → 3 → 5, which yields r3. For path 3, r5 is associated with key 5 and is
added to CQ; the path extends to key 7, but since key 7's child key 8 is not
contained in Q, there is no need to examine it further and the path is pruned
straight away. The output of the trie search is CQ = {r1, r3, r5}.

In Example 4.3, we can see that there is no need to consider the children
paths once the current ID is not contained in Q. The algorithm prunes the
paths as early as possible, which leads to an efficient trie search.
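Continuing the TrieNode sketch above, the containment search can exploit the
ordered child keys to prune early (again a sketch under our naming; Q is a
sorted set of literal IDs):

import java.util.*;

class TrieSearch {
    // Collect the IDs of rules whose fact-literal sets are contained in q.
    static void search(TrieNode node, NavigableSet<Integer> q, List<String> out) {
        if (node.ruleId != null) out.add(node.ruleId);  // every key on this path is in q
        if (node.children.isEmpty() || q.isEmpty()) return;
        int max = q.last();
        // children are in ascending key order, so keys beyond max(q) cannot match
        for (Map.Entry<Integer, TrieNode> e : node.children.headMap(max, true).entrySet())
            if (q.contains(e.getKey()))
                search(e.getValue(), q, out);
    }
}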
Semantic Rule Graph Search

The Semantic Rule Graph search algorithm identifies all the rules that have
semantic relations with the candidate rules generated by Algorithm 3. For
each candidate rule ri, a semantic rule search starting from ri is conducted
on the Semantic Rule Graph. All the vertices rj that can be reached from ri
are its support rules or defeat rules, and all the vertices rk that can reach
ri are the rules that have ri as their support rule or defeat rule.
Algorithm 4: SemanticGraphSearch
Input:  a set of semantic rule graphs {G}, Candidate Rule Set CQ
Output: A set of subgraphs Crules
1  Set all vertices and edges in the corresponding G "Unvisited";
2  Crules ← null;
3  foreach r ∈ CQ do
4      if G.getLabel(r) = Unvisited then
5          SemanticRuleSearch(G, r, Crules);

Algorithm 4 seeks all the rules that have semantic relations with the
candidate rules. First, all the vertices and edges are set to Unvisited and
the candidate rule graph set Crules is initialized to null (Lines 1–2). For
each candidate rule r returned by the FLT search, the SemanticRuleSearch
algorithm is executed (Lines 4–20).

Algorithm 5: SemanticRuleSearch
Input:  Semantic Rule Graph G, vertex v, a set of subgraphs Crules
Output: the updated Crules by v
1   overlapVertex ← null;
2   newSubGraph = false;
3   G.set(v, Visited);
4   if v does not belong to any graph g ∈ Crules then
5       Graph g ← new Graph();
6       g.addVertex(v);
7       Crules ← g;
8       newSubGraph = true;
9   foreach e ∈ G.outgoingEdges(v)   // backward chaining
10  do
11      if G.getLabel(e) = Unvisited then
12          G.set(e, Visited);
13          w ← e.getTarget();
14          g.addEdge(v, w);
15          if G.getLabel(w) = Unvisited then
16              SemanticRuleSearch(G, w, Crules);
17      else
18          if newSubGraph = true then
19              overlapVertex.add(w);
20  Similarly, search all ingoing edges of v   // forward chaining;
21  if !overlapVertex.isEmpty() then
22      merge subgraphs in Crules that share the same vertex w ∈ overlapVertex;
The main idea of Algorithm 5 is that, during the traversal of the Semantic
Rule Graph G started from a given candidate rule ri, all the vertices and
edges reached from ri in G are the rules that need to be computed, and they
are marked as "Visited" (Lines 9–19). Since G will be searched again for
other candidate rules rj in CQ, we can stop traversing G when rj reaches a
vertex rk and/or an edge ek that is already marked "Visited": it means that
rj shares some semantic rules with ri, and the semantic relations starting
from rk/ek have been discovered before, so there is no need to traverse G
from rk/ek further. Otherwise, the recursive rule search (Line 15) will not
stop until it reaches rules that do not have any semantic relation with any
other rule (no outgoing/ingoing edge).

Second, a new graph g is formed to represent r's part of the semantic rules
if the rule currently being visited has not been discovered (it is marked
"Unvisited") by previous rules (Lines 4–8). Compared with the traditional
depth-first search algorithm, instead of returning a set of visited vertices
or a path, our algorithm needs to capture the full semantic relations among
the candidate rules, which are essentially a set of directed acyclic labelled
sub-graphs of the original Semantic Rule Graph. The newly generated g is put
in the Crules set.

Similarly, the algorithm traverses ri's ingoing edges (Line 20) as well, to
identify all potential rules that have ri as their support rule or defeat
rule.

Last, as mentioned above, since rj may share some semantic rules with ri, we
need to merge the two subgraphs together to represent the overall semantic
relations among the rules.
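A simplified reachability pass in the spirit of Algorithms 4–5 is sketched
below (ours; subgraph construction, the ingoing-edge search and subgraph
merging from the paper are omitted):

import java.util.*;

class SemanticSearch {
    // Collect every rule reachable from a candidate along outgoing
    // (support/defeat) edges of the SRG adjacency map `out`.
    static Set<String> reachable(Map<String, List<String>> out, String start) {
        Set<String> visited = new HashSet<>();
        Deque<String> stack = new ArrayDeque<>();
        stack.push(start);
        while (!stack.isEmpty()) {
            String v = stack.pop();
            if (!visited.add(v)) continue;   // each vertex handled at most once
            for (String w : out.getOrDefault(v, List.of())) stack.push(w);
        }
        return visited;
    }
}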
Example 4.4. Continuing Example 4.3, given query Q = {1, 3, 5, 7} and the
output of the FLT search CQ = {r1, r3, r5}, Figure 4(a) shows the semantic
rule search result. By the graph, we can see that to fire r1 we need to check
whether its support rules r3 and r2 can be fired. However, to fire r3, not
only its support rule r2 but also its defeat rule r6 needs to be checked.
Furthermore, r6 cannot be fired unless its support rule r5 is fired. In
summary, to check whether the candidate rules r1 and r3 can be fired, we
potentially need to examine r2, r6 and r5 as well. The algorithm also does
the forward chaining check: since r2 is a support rule of r4, r4 may be fired
if r2 is fired.
Figure 4: An Example of the Semantic Graph Search
The size of the candidate rule graph set |Crules| ≤ |CQ|. By the algorithm,
every vertex and edge in G is visited at most once, no matter how many
candidate rules are present. The algorithm therefore guarantees that the
search can be done in O(V + E) time, where V and E are the numbers of
vertices and edges contained in the Semantic Rule Graph. As stated before, V
is smaller than the total number of rules, since the graph only contains the
rules that have semantic relations with other rules.
4.3. The Inference Controller
To fire a rule, it is necessary to reason about its support rules and defeat
rules, if any. In other words, if a rule's support rules cannot be fired, its
corresponding dependent literal is not provable and therefore the rule cannot
be fired. Furthermore, if a rule's defeat rule concludes the opposite, the
rule cannot be fired either. Therefore, if not only the semantic relations
but also the ordering of the rules to be reasoned about can be identified, we
can avoid unnecessary computations, including both query executions and
reasoning.
The inference controller first sets the reasoning order among the rules by
topological sorting. The topological sorting problem is: given a digraph
G = (V, E), find a linear ordering of the vertices such that, for all edges
(v, w) ∈ E, v precedes w in the ordering. It has been proved that a directed
acyclic graph (DAG) can be topologically sorted, and that constructing a
topological sort of any DAG can be done in linear time; however, the solution
is not necessarily unique. Since the candidate rule graphs identified by
Algorithm 4 are DAGs, we are able to order the rules. Figure 4(b) shows the
corresponding topologically sorted candidate rule graph for Figure 4(a).
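A standard linear-time construction is Kahn's algorithm, sketched below
(ours, not the paper's code). Here an edge (v, w) means "v must precede w";
if the SRG edges point from a rule to its support/defeat rules, they should
be reversed first so that those rules come out earlier in the order:

import java.util.*;

class TopoSort {
    // Kahn's algorithm over a candidate rule graph; runs in O(V + E).
    static List<String> order(Set<String> vertices, Map<String, List<String>> edges) {
        Map<String, Integer> indeg = new HashMap<>();
        for (String v : vertices) indeg.put(v, 0);
        for (List<String> ws : edges.values())
            for (String w : ws) indeg.merge(w, 1, Integer::sum);
        Deque<String> ready = new ArrayDeque<>();
        indeg.forEach((v, d) -> { if (d == 0) ready.add(v); });
        List<String> result = new ArrayList<>();
        while (!ready.isEmpty()) {
            String v = ready.poll();
            result.add(v);
            for (String w : edges.getOrDefault(v, List.of()))
                if (indeg.merge(w, -1, Integer::sum) == 0) ready.add(w);
        }
        return result; // for a DAG, this contains all the vertices
    }
}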
By this order, the inference controller guides the reasoning and database
queries to avoid unnecessary computations. It is able to filter rules and
stop the graph traversal as early as possible, based on the query and
reasoning results. This essentially leads to identifying the rules that can
be fired efficiently. Another benefit of the algorithm is that not only are
unnecessary reasoning steps and query executions avoided, but the inference
processes for multiple candidate rules can also be done in one traversal of a
sorted candidate rule graph. Therefore, the queries and reasoning involved
are executed only once for multiple rules, which further reduces the overall
response time. Algorithm 6 shows the main idea.
Algorithm 6: InferenceController
Input:  Candidate Rule Graphs Crules, Candidate Rule Set CQ
Output: A set of rules that can be fired
1   Topological sort each subgraph in Crules; Ctrue ← null;
2   foreach candidate rule graph g ∈ Crules do
3       foreach r ∈ g, backward traversing g, do
4           if there are dependent literals d ∈ A(r) then
5               foreach d do
6                   Get d's corresponding rule r′; c ← get(Ctrue, r′);
7                   if !L(r, r′).equals(c) then
8                       g.update(r);
9                       if ∃r″ ∈ CQ and r″ ∈ g then
10                          Go to next r in g;
11          if there are fact literals f ∈ A(r) then
12              foreach f do
13                  Get f's SQL query q from Predicates, Execute q;
14                  if q returns false then
15                      g.update(r);
16                      if ∃r″ ∈ CQ and r″ ∈ g then
17                          Go to next r in g;
18          Get rule r from the Rule System;
19          Send r and A(r) to the Rule Interpreter; c ← r's reasoning result;
20          if r has defeat rule r′ on c then
21              c′ ← get(Ctrue, r′);
22              if ¬c′.equals(c) then
23                  g.update(r);
24                  if ∃r″ ∈ CQ and r″ ∈ g then
25                      Go to next r in g;
26          Ctrue ← Ctrue ∪ {(r, c)};
27  Find all the paths P in g that CQ reaches;
28  return P and Ctrue

The Inference Controller performs the reasoning by backward traversing each
candidate rule graph g ∈ Crules (Lines 2–26). This ensures that a candidate
rule's support rules and/or defeat rules are always computed first, by which
the number of unnecessary queries and reasoning steps can be minimized. All
the rules that can be fired, together with their consequences, are stored in
Ctrue. Checking whether a rule r can be fired involves three steps:

• First, all its dependent literals must be provable (Lines 4–10). If a
conclusion of one of its support rules r′ does not match the corresponding
dependent literal (Line 7), r cannot be fired. Note that, since there may be
a reparation chain in r′'s conclusion, it is necessary to check not only that
r′ can be fired, but also that r′'s conclusion makes r's corresponding
dependent literal provable. Furthermore, the inference control method takes a
proactive approach that updates g to remove all the rules that are dependent
on r if r cannot be fired (Line 8). The update algorithm backward traverses
the candidate rule graph and removes all vertices that have a dependency path
to r. Therefore, the algorithm removes rules that cannot be fired anyway as
early as possible, to avoid unnecessary computations. After g is updated, the
algorithm continues the computation if there are vertices left in g that have
not been tested (Lines 9–10).

• Second, all its fact literals must be provable (Lines 11–17). In the
framework, the Predicates database stores all fact literals' corresponding
SQL queries. The algorithm retrieves r's SQL queries from Predicates and
performs the query execution (Line 13; a minimal sketch of this step is shown
after this list). If any query returns false, the corresponding literal is
not provable and r cannot be fired. The algorithm again updates g to remove
all the impacted rules that cannot be fired (Lines 15–17).

• Last, if r's dependent literals and fact literals are all provable, we
retrieve rule r from the Rule System. r, with all the literals in its
precondition, is sent to the inference engine for reasoning (Lines 18–19).
Then the algorithm checks whether there is a defeat rule r′ that provides the
opposite conclusion (Lines 20–25). If r′ does not defeat r, then r is
successfully fired and saved in Ctrue (Line 26). The result can be used to
compute subsequent rules that have semantic relationships with r.
Through the above process, the algorithm is able to efficiently identify the
rules that can be fired. The algorithms also generate the corresponding support
rules and defeat rules to explain why the rules can be fired.
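As flagged in the second step above, evaluating one fact literal reduces to
fetching its stored SQL from the Predicates store and treating a non-empty
result as "provable". The sketch below is ours; the connection and the SQL
text are assumptions supplied by the caller.

import java.sql.*;

class FactLiteralEvaluator {
    // Returns true iff the literal's SQL query yields at least one row.
    static boolean provable(Connection db, String sql) throws SQLException {
        try (Statement st = db.createStatement();
             ResultSet rs = st.executeQuery(sql)) {
            return rs.next();   // non-empty result => the fact literal holds
        }
    }
}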
Example 4.5. Continuing Example 4.4, the Inference Controller starts with
rule r5. If r5 cannot be fired, or r5 can be fired but d5 is not provable
based on the reasoning of r5, we can conclude that r6 cannot be fired. The
sorted candidate rule graph is updated by removing vertices r5 and r6 and
edges r6 → r5 and r3 → r6. Next, the algorithm evaluates r2. If r2 cannot be
fired because some of its fact literals are not provable, all the vertices
that have paths leading to r2 can be removed from the sorted candidate rule
graph, meaning that no rule can be fired. By this example, we can see that
the response time can be improved by avoiding the querying and reasoning
times for r6, r3, r1 and r4.

Example 4.6. In Example 4.4, assume that both r5 and r6 can be fired and the
conclusion of r6 is ¬d7. This means r6 defeats r3. The algorithm then updates
the sorted candidate rule graph by removing vertices r3 and r1 and edges
r3 → r6, r1 → r3, r1 → r2 and r3 → r2. Eventually only r2 and r4 are left
unevaluated, and the algorithm continues the process.
4.4. Database Query
Large-scale legal reasoning often needs to access distributed relational
databases that are held by different organizations. These databases may
contain a large amount of data which provides evidence as facts. For
rule-based legal reasoning, we need to query these databases to identify
facts to determine what obligations, prohibitions and permissions are
derivable.

Disk-based databases can pose several challenges to achieving low latency and
scalability: (a) slow query processing due to the data retrieval speed from
disk plus the added query processing times, (b) costly scaling for high read
loads, and (c) the need to simplify data access [35]. In-memory database
caching can be one of the most effective strategies for improving the overall
query performance. Frequently used data can be stored locally, which makes
data retrieval faster because it removes the network traffic associated with
retrieving data. There are two popular caching methods.
• Materialized View: A materialized view is a database object that contains
the results of a query. Materialized views are local copies of data located
remotely, or are used to create summary tables based on aggregations of a
table's data. Index structures can be built on a materialized view. Hence,
accessing and querying a materialized view is much faster than accessing and
computing based on remote tables.

The major goal when selecting an appropriate set of views is to reduce the
entire query response time as well as the maintenance cost. Materialized view
selection involves the query frequency, the query processing and storage
costs, and the materialized view maintenance cost. It is cheaper in many
cases where a query is complex (e.g., involving many tables and complex
calculations) or base tables contain a huge number of records to compute.
Mohod et al. [36] presented an extensive survey on using effective
materialized view selection and maintenance to improve query performance.

It is time-consuming for an RDBMS to materialize a view and its indices.
Hence, in the presence of updates to the base tables referenced by a
materialized view, it is kept up to date incrementally: the approach computes
the changes to the materialized view and applies them to bring it up to date.
• Key-Value Store: A key-value store maintains key-value pairs consisting of
a unique identifier (key) associated with some arbitrary value. For a
rule-based legal reasoning system, a key is a query ID and the value is the
result of the query computed from the remote databases. Similar to a
materialized view, a key-value store improves query performance because a
cache lookup is much faster than executing a complex SQL query on remote
distributed tables. In the presence of updates to the original data, we need
to keep the cached key-value pairs consistent transparently [37, 38, 39].
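For the key-value option, a minimal cache-aside sketch keyed by query ID
looks as follows (ours; the remote execution function is an assumption
supplied by the caller):

import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

class QueryCache {
    private final ConcurrentHashMap<String, Boolean> cache = new ConcurrentHashMap<>();

    // Cache-aside: return the cached truth value for queryId,
    // executing the query remotely only on a miss.
    boolean lookup(String queryId, Function<String, Boolean> remoteExec) {
        return cache.computeIfAbsent(queryId, remoteExec);
    }

    // Invalidate when the base data changes, so consistency is preserved.
    void invalidate(String queryId) { cache.remove(queryId); }
}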
Materialized views and key-value stores are suitable for different
applications. Key-value stores enhance the performance of queries that
repeatedly read a very small portion of the entire dataset, whereas the SQL
queries used to compute a materialized view typically retrieve many rows
[40]. For a large-scale rule-based reasoning system, various query results
may be present, depending on the remote databases to be accessed. Therefore,
for the cache component in our rule-based legal reasoning system, we adopt a
generic approach in which both the materialized view and key-value store
methods can be applied. Each record in the Cache has a unique query ID. The
query selection criteria and update strategies can be defined based on use
cases.

For the Database Query component, when receiving a query q, the system checks
whether q is cached. If yes, the results can be retrieved directly from the
Cache; otherwise, q is executed on the corresponding remote databases
(Algorithm 6, Line 13). In most cases, we only care about whether the
returned result is true or false, which indicates whether the corresponding
fact literal is provable. The algorithm then continues accordingly.
5. Experiment
In this section, we evaluate the proposed framework. For the reasoning
process, the major concern is the efficiency problem for large-scale legal
rule sets as well as large-scale facts. We perform queries on the rule sets
and compute the response times. Regarding the overhead, we measure the index
storage required. All the experiments were performed on a Mac machine with an
Intel Core i7 CPU @ 2.7 GHz and 16 GB RAM. The algorithms were implemented in
Java using JDK 8.

The performance of the whole legal reasoning framework could be impacted by
the rule set size, the size of the rules' preconditions, and the amount of
semantic relationships among rules. Therefore, the actual content of the
underlying rules and databases can be ignored. In the experiments, we set up
the following parameters to simulate real-world rule set situations with
different characteristics:

• Size: the number of rules in a legal system

• minP/minQ: the minimum number of predicates/literals of a rule/query

• maxP/maxQ: the maximum number of predicates/literals of a rule/query

• numP: the total number of distinct predicates/literals contained in a rule
set/query set

• rel%: the percentage of rules that have either support relations or defeat
relations with other rules in a rule set

In the following, the whole framework is first evaluated using synthetic
datasets. Then we study rule sets with different characteristics in detail
for each main component. Last, we evaluate the caching performance using real
datasets.
5.1. Overall Performance Evaluation on Inference Engine
In this sub-section, we examine the filtering power of the components in the
proposed inference engine as well as the query response time and the storage
required.
5.1.1. Datasets
Three rule sets with different sizes (Table 1) are used to test the overall
framework performance. Each rule can have between 1 and 100
predicates/literals, to reflect the characteristics of real-world rule sets.
In D1 and D2, 5% of the rules have either support relations or defeat
relations with other rules, and 20% in D3. These relations may cause the
recursive problem introduced in the previous section. The query set Q
contains 1000 queries, in which the number of predicates/literals per query
is between 2 and 200.

Rule Set   Size     minP   maxP   numP   rel%
D1         10000    1      100    1000   5%
D2         100000   1      100    1000   5%
D3         100000   1      100    1000   20%

Table 1: Rule Sets A
5.1.2. Filtering Power by the Fact Literal Trie (FLT) and the Semantic Rule
Graph (SRG)

To simulate the actual query and reasoning results, a parameter f,
failedRatio, is set to 0.5 to represent the probability that a predicate
fails to return "true" due to the conditions stated at Lines 9, 17 or 26 in
Algorithm 6, which lead to the failure of rule activation.
Table 2 shows the summary results of 1000 queries for each dataset. For
example, in D1, among all the rules that can be fired by a query, on average
84.6% of the activated rules are identified by the FLT. These rules were sent
to the Inference Controller for database query execution directly. Of the
15.4% of rules that need to be further examined by the SRG, only 50% can be
fired (equivalent to 7.7% of the total activated rules), based on the
computations of the rule interpreter and database queries. For D2 and D3,
95.2% and 88.9% of the activated rules are recognized by the trie.
Dataset   Rules fired by FLT   Candidate rules to be tested by SRG   Rules fired by SRG
D1        84.6%                15.4%                                 7.7%
D2        95.2%                4.8%                                  1.2%
D3        88.9%                11.1%                                 1.2%

Table 2: Average Filtering Results of 1000 Queries
Given the fact that in many legal regulations most of the rules do not have support and/or defeat relations with other rules, the Fact Literal Trie is able to quickly identify the majority of the rules that can be fired. Since the number of rules that have relations with other rules is larger in D3 than in D2, the number of candidate rules to be tested by the SRG is larger (11.1% vs 4.8%, as shown in Table 2).
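To make the trie-based filtering concrete, below is a minimal sketch of how a fact literal trie over sorted integer literal ids could enumerate the rules whose precondition literals are all covered by a query. This is our own illustrative reconstruction under assumed data representations; FactLiteralTrie, addRule and candidates are hypothetical names, not the paper's API.

```java
import java.util.*;

// Minimal sketch of a fact literal trie: each rule's precondition literals,
// sorted by integer id, form one root-to-node path; the rule id is stored at
// the node where its path ends. A query "covers" a rule when every literal on
// the rule's path occurs among the query's literals.
class FactLiteralTrie {
    private static class Node {
        final TreeMap<Integer, Node> children = new TreeMap<>();
        final List<Integer> ruleIds = new ArrayList<>(); // rules ending here
    }

    private final Node root = new Node();

    // Index one rule: its literals are inserted in ascending order.
    void addRule(int ruleId, int[] literals) {
        int[] sorted = literals.clone();
        Arrays.sort(sorted);
        Node cur = root;
        for (int lit : sorted) {
            cur = cur.children.computeIfAbsent(lit, k -> new Node());
        }
        cur.ruleIds.add(ruleId);
    }

    // Return all rules whose precondition literals are a subset of the query.
    List<Integer> candidates(int[] queryLiterals) {
        int[] q = queryLiterals.clone();
        Arrays.sort(q);
        List<Integer> result = new ArrayList<>();
        collect(root, q, 0, result);
        return result;
    }

    // Depth-first search that only follows edges labelled with query literals,
    // so paths not covered by the query are never explored.
    private void collect(Node node, int[] q, int from, List<Integer> result) {
        result.addAll(node.ruleIds);
        for (int i = from; i < q.length; i++) {
            Node child = node.children.get(q[i]);
            if (child != null) {
                collect(child, q, i + 1, result); // literal ids stay ascending
            }
        }
    }
}
```

For instance, indexing a rule with precondition literals {3, 7} and querying with literals {1, 3, 5, 7} reports the rule as a candidate, since every rule literal occurs in the query.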
5.1.3. Filtering Power by the Inference Controller (IC)
Since the IC uses the candidate rule graphs generated by the SRG as inputs, in the following we use two queries to explain the filtering power of the SRG and the IC.
For query 321, Figure 5 shows the partial candidate rule graph of candidate rule 940 in dataset D3. During query processing, rule 940 was first recognized by the Fact Literal Trie as a candidate rule. Then its corresponding candidate rule graph, shown in Figure 5, was searched using the Semantic Rule Graph. Based on the topological order computed by Algorithm 6, the IC updated the candidate graph after each rule was computed, including database querying and/or rule reasoning. For example, if rule 981 cannot be fired because one of its predicates failed, then the algorithm will remove nodes 981, 1136, 1244 and 940 based on the dependency relations among them.
Figure 5: Partial Candidate Rule Graph for Query 321
The algorithm then concludes that rule 940 cannot be fired without computing the rest of the rules. On the other hand, if rule 1217 can be fired, since 1217 is a defeat rule for rule 1227, the algorithm will remove nodes 1227, 1216, 1244 and 940. Therefore, the algorithm is able to avoid unnecessary computations and reach the conclusion as early as possible. In our experiment, rule 940 was fired for query 321, which means all its support rules could be activated and they were not defeated by the corresponding defeat rules.
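The pruning step just illustrated can be sketched as follows. This is our own simplified reconstruction, not the paper's Algorithm 6; the class and method names are hypothetical, and the dependency maps below are one possible encoding of the example.

```java
import java.util.*;

// Simplified reconstruction of the Inference Controller's pruning step.
// supportDependents.get(r) lists the rules whose firing depends on r
// (support edges); defeats.get(r) lists the rules that r defeats when it fires.
class CandidateGraphPruner {
    private final Map<Integer, List<Integer>> supportDependents;
    private final Map<Integer, List<Integer>> defeats;
    private final Set<Integer> alive; // rules still worth computing

    CandidateGraphPruner(Map<Integer, List<Integer>> supportDependents,
                         Map<Integer, List<Integer>> defeats,
                         Set<Integer> rules) {
        this.supportDependents = supportDependents;
        this.defeats = defeats;
        this.alive = new HashSet<>(rules);
    }

    // A rule failed (e.g. one of its predicates returned false): everything
    // that needs it can no longer fire, so remove the whole dependent cone.
    void onRuleFailed(int rule) {
        removeCone(rule);
    }

    // A rule fired: the rules it defeats (and their dependants) are removed.
    void onRuleFired(int rule) {
        for (int beaten : defeats.getOrDefault(rule, Collections.emptyList())) {
            removeCone(beaten);
        }
    }

    boolean stillCandidate(int rule) {
        return alive.contains(rule);
    }

    private void removeCone(int start) {
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(start);
        while (!stack.isEmpty()) {
            int r = stack.pop();
            if (!alive.remove(r)) continue; // already pruned
            for (int dep : supportDependents.getOrDefault(r, Collections.emptyList())) {
                stack.push(dep);
            }
        }
    }
}
```

With supportDependents mapping, say, 981 to {1136}, 1136 to {1244} and 1244 to {940}, calling onRuleFailed(981) removes exactly the four nodes discussed above, and no further database queries are issued for them.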
Figure 6: Partial Candidate Rule Graphs for Query 444
For query 444, there are 3 candidate rules generated by the FLT in dataset D3: rules 4375, 3896 and 4721. After searching the Semantic Rule Graph, only two candidate rule graphs were constructed by the algorithm (Figure 6). This is because the candidate rule graph of rule 4721 is a sub-graph of rule 3896's candidate rule graph. By the algorithm, every node in the Semantic Rule Graph is examined only once regardless of the number of candidate rules. Furthermore, the execution results stored in the Working Memory can also be shared among candidate rules, which further avoids unnecessary computations. With the above two examples, we showed how the algorithm is able to minimize computation to improve reasoning efficiency.
Given a candidate rule, by comparing the number of executed support and defeat rules with the size of its corresponding candidate rule graph, we can derive how many computations are saved by the IC.
Dataset  failedRatio = 0.5  failedRatio = 0.1
D1       62%                34%
D2       55%                36%
D3       59%                35%
Average  58.7%              35%

Table 3: Average Saved Computations
Using the same datasets and queries, Table 3 shows the results using two different failedRatios. As mentioned before, when failedRatio is set to 0.5, a rule has a 50% possibility of failing. In this case, for all three datasets, our algorithm saves 58.7% of computations on average. Even when we set failedRatio as low as 0.1, 35% of computations can still be saved. This experiment demonstrated the filtering power of the Inference Controller.
5.1.4. Query Response Time
Table 4 shows the response times for 1000 queries. In general, the FLT accounts for most of the response time on all three datasets. Dataset D2 contains 100K rules; its query response time (2469ms) is larger than that for D1 with 10K rules (972ms). Overall, the response time does not grow exponentially as the rule set size grows.
The FLT response time on D3 is similar to that for D2. Since a larger proportion of rules in D3 have support or defeat relations (20%) with other rules compared with D2 (5%), the number of semantic rule graphs in
Dataset  FLT (ms)  SRG (ms)  Total Time (ms)
D1       875       97        972
D2       1938      531       2469
D3       2518      1758      4276

Table 4: Response Time for 1000 Queries
each dataset, as well as the size of each semantic rule graph, can become larger. By closely examining the graphs involved in D3, we found that one of its semantic rule graphs contains about 10K rules. Therefore, the SRG response time (1758ms) on D3 is larger than that (531ms) for D2.
Overall, even in the rare case of a legal rule system that contains a large number of rules with complex semantic relations, the proposed framework and algorithms can perform queries efficiently.
5.1.5. Index Size
We further examine the index sizes for the three datasets. From Table 5, we can see that in general the FLT uses more storage space than the SRG. Since D2 and D3 contain large numbers of rules, their FLT sizes (591MB, 593MB) are larger than that for D1 (94.2MB). D3 requires slightly more storage for its SRG because it has a larger number of rules with semantic relations. We can conclude that with increasing rule set sizes and increasing complexity of the relations among rules, the proposed index method remains scalable.
Dataset  FLT (MB)  SRG (MB)  Total Size (MB)
D1       94.2      0.07      94.27
D2       591       0.91      591.91
D3       593       3.43      596.43

Table 5: Index Size for the Three Datasets
5.2. Performance Evaluation on the Fact Literal Trie
From the above experiments, we can see that the Fact Literal Trie plays a significant role in the overall performance. In this sub-section, we examine rule sets with various characteristics to analyze how the corresponding FLTs perform.
5.2.1. Datasets
To understand the impact of the distribution of rules' predicates/literals on performance, we generate 9 datasets (Table 6) with different maxP and numP while fixing the rule set size at 10000. The density of a rule set can be defined through its sparsity matrix, in which each row corresponds to a rule and each column to a distinct predicate; the density of a rule set is then the fraction of non-zero cells in its sparsity matrix. Based on density, the 9 rule sets can be classified into 3 groups: dense, medium, and sparse. Within each group, there are 3 rule sets with different maxP. For example, rule set DD1 contains 10K rules with a total of 200 distinct predicates, and its largest rule could have up to 100 predicates. The FLT generated from DD1 is TD1. The density of an FLT depends on the density of its corresponding rule set.
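On our reading of this definition, the densities reported in Table 6 equal the expected number of predicates per rule divided by numP; for example, assuming rule lengths are uniform between minP and maxP:

\[
\mathrm{density}(DD_1) \approx \frac{(\mathit{minP} + \mathit{maxP})/2}{\mathit{numP}} = \frac{(1 + 100)/2}{200} \approx 0.25,
\]

which matches the value listed for DD1; the other entries follow the same pattern.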
Group   Rule Set (10K)  Density  minP  maxP  numP  FLT
Dense   DD1             0.25     1     100   200   TD1
Dense   DD2             0.125    1     50    200   TD2
Dense   DD3             0.05     1     20    200   TD3
Medium  DM4             0.05     1     100   1000  TM4
Medium  DM5             0.025    1     50    1000  TM5
Medium  DM6             0.01     1     20    1000  TM6
Sparse  DS7             0.01     1     100   5000  TS7
Sparse  DS8             0.005    1     50    5000  TS8
Sparse  DS9             0.002    1     20    5000  TS9

Table 6: 9 Rule Sets
We also generated 12 query sets with different minQ, maxQ and numP to study the performance. Each query set contains 100 queries. Table 7 shows the 12 query sets, classified into 4 groups: the small query group with each query having 2-50 predicates, the medium query group with each query having between 80 and 120 predicates, the large query group with 150-200 predicates, and the mix group containing both small and large queries.
Group   Query Set (100)  minQ  maxQ  numP
Small   QS1              2     50    200
Small   QS2              2     50    1000
Small   QS3              2     50    5000
Medium  QM4              80    120   200
Medium  QM5              80    120   1000
Medium  QM6              80    120   5000
Large   QL7              150   200   200
Large   QL8              150   200   1000
Large   QL9              150   200   5000
Mix     QX10             2     200   200
Mix     QX11             2     200   1000
Mix     QX12             2     200   5000

Table 7: 12 Query Sets
5.2.2. Query Efficiency
In this sub-section, we evaluate the query response times using the query sets of different groups on the corresponding rule sets. Specifically, a query set can be performed on the FLTs that have the same predicate range numP. For example, query sets QS1, QM4, QL7 and QX10 can be performed on rule sets DD1, DD2 and DD3 because they all have the same predicate range numP = 200. In the following, the query response times are reported on the basis of 100 queries.
Impact of FLT Density. First we examine how rule sets with different densities perform regarding query response time. Figure 7 shows the results. Recall that density(TM4) > density(TM5) > density(TM6) and density(TS7) > density(TS8) > density(TS9). When QS2 is executed on TM4, TM5 and TM6, the query response time on TM6 is slower than that on TM5, which is slower than that on TM4. The reason is that the search space on a sparse FLT (e.g. TM6) could be larger than that on a dense FLT (e.g. TM4). This leads to a slower response time. A similar pattern can be found on all datasets. Therefore, we can conclude that given the same query set, the sparser a rule set is, the sparser its FLT is, and the slower the query response time is.
Figure 7: Impact of Rule Set Density (response time in ms for query sets QS2, QM5, QL8, QX11 on TM4/TM5/TM6 with numP = 1000, and QS3, QM6, QL9, QX12 on TS7/TS8/TS9 with numP = 5000)
The above pattern is also observed when the query group QS1, QM4, QL7, QX10 with numP = 200 is performed on TD1, TD2 and TD3. But the overall query response times are much larger than those with numP = 1000 and numP = 5000. We explain this case next.
Impact of numP vs Query Size. In most real world cases, a query size is much smaller than the total number of predicates contained in a rule set, i.e. |Q| << numP. From the above analysis, we know that the sparser an FLT is, the slower the query response time is. However, this can change if a query size becomes very large.
Figure 7 only shows the performance of query sets with numP = 1000 and 5000, in which the longest response time is 600ms. If we compare the performance between query sets with numP = 200 and query sets with numP = 1000 and 5000 (Figure 8), we can see that the query response times of query sets QS1 and QM4 over TD1, TD2 and TD3 follow the same pattern as discussed above. However, for QL7, its response times over TD1, TD2 and TD3 are all above 3000ms, which is significantly larger than that of any other query set. This is because the average size of QL7, 175, approaches the total number of predicates (200) of the rule sets. It means that the predicates contained in a query could cover nearly all the predicates contained in an FLT. Therefore, the search algorithm has little capacity to quickly filter unnecessary paths in the FLT and may have to traverse nearly the entire space.
Furthermore, although the density of DD3 (d = 0.05) is smaller than that of DD2 (d = 0.125), which is smaller than that of DD1 (d = 0.25), the query response times do not follow the above pattern. This is because DD3's maxP = 20 is smaller than that of DD2 (maxP = 50), so the search depth on TD3 is smaller than that on TD2 given a query size approaching 200. Therefore, a search can be stopped earlier on each path, which leads to an overall lower query response time. For a similar reason, the response time of QL7 over TD2 is smaller than that of QL7 over TD1.
Since QX10 contains both small and very large queries, its performance follows a trend similar to that of QL7.
Figure 8: Impact of numP vs Query Size (response time in ms on DD1 (maxP=100), DD2 (maxP=50) and DD3 (maxP=20) with numP = 200, for query sets of average size |QS1| = 25, |QM4| = 100, |QL7| = 175 and |QX10| = 100)
5.2.3. FLT Storage Size and Construction Time
We measured the FLT storage size for the above 9 rule sets and the corresponding FLT construction times.
FLT Size. Figure 9 shows the index sizes. Given the same numP, the larger the number of predicates (maxP) a rule can have, the larger the index size. This is because longer paths need to be constructed in the trie to capture a larger rule, which in turn occupies more space. For example, given the same numP = 200 but with different maxP = 100, 50, 20, the index size of TD1 is much larger than that of TD2, which is larger than that of TD3.
Figure 9: FLT Storage Size Comparison (index size in MB for TD1-TD3 with numP = 200, TM4-TM6 with numP = 1000, and TS7-TS9 with numP = 5000)
However, if numP is larger, it means that a rule set is more sparse. The probability that predicates of different rules share the same FLT paths is lower, which means more storage is required to construct an FLT. Therefore, the larger numP is, the larger the index size is. For example, rule sets DD1, DM4 and DS7 all have the same maxP = 100 but different numP = 200, 1000, 5000 respectively. The index size of TS7 is larger than that of TM4, and the index size of TM4 is larger than that of TD1. A similar pattern can be found in the other corresponding rule sets.
FLT Construction Time. Figure 10 shows the index construction times for the 9 rule sets. In general, all the FLTs can be constructed very quickly. Similar to the index size, given the same numP (e.g. numP = 200 for DD1, DD2, DD3), the larger the maxP (maxP = 100, 50, 20 for DD1, DD2, DD3 respectively), the longer the time required to construct the FLT.
Figure 10: FLT Construction Time (in ms for the same 9 FLTs, grouped by numP = 200, 1000, 5000)
5.3. Performance Evaluation on the Caching Technique
As mentioned before, the overall response time consists of two components: the time required to execute database queries for the corresponding predicates, and the time required by reasoning, which has been discussed in the previous sub-sections. This sub-section aims to evaluate whether the adopted caching techniques offer a practical solution to the database query performance issues given a large scale of facts stored in databases. We use the same case studies presented in [13] to demonstrate the efficiency.
5.3.1. Dataset
The ChildSafe Care Online Management System (ChildSafeOMS) studied in [13] contains three sources of information: (1) a child care database (ChildSafeDB) recording child enrolment, attendance, abuse and/or neglect data, etc., (2) a set of defeasible rules encoding the decision trees, definitions and (normative) guidelines of the New South Wales Mandatory Reporter Guidelines (NSW Government 2016), and (3) a set of bridging statements relating the terms of the regulations and the fields (and data) in the ChildSafeDB.
SQL queries define the meaning of the literals used in a defeasible theory in terms of the data and the schema of a database. Since query response times depend on the structure and size of the databases as well as on the structure of the SQL queries, the actual content of the underlying database is not relevant. Therefore, in [13], to increase the scale of the databases as well as the query complexity, the datasets from stat-computing (http://stat-computing.org/dataexpo/2009/supplemental-data.html) were extracted into many different tables that are linked to ChildSafeDB. The resultant databases contain more than 11 million records in total. The data stored in the databases are queried to provide facts for the reasoner in the form of predicates.
5.3.2. Result
We compare the query response times with and without the caching technique. For some simple queries, the response times are short even when using the remote databases. But for some queries involving multiple joins and large tables, we had to stop the execution due to the long waiting time. 30 predicates were tested in [13] and the average response time was 171.19ms. In the following, three example predicates are used to show the performance.
Predicate 1

SELECT t2.child_crn, t2.year, t2.month
FROM tbl2008 t2
LEFT JOIN tblchildren ON t2.child_crn = tblchildren.child_crn

Predicate 2

SELECT t1.child_crn, t1.year, t1.month
FROM tbl2007 t1
LEFT JOIN tblchildren ON t1.child_crn = tblchildren.child_crn
UNION
SELECT t2.child_crn, t2.year, t2.month
FROM tbl2008 t2
LEFT JOIN tblchildren ON t2.child_crn = tblchildren.child_crn

Predicate 3

SELECT t1.child_crn, t2.child_crn
FROM tbl2007 t1, tbl2007 t2
WHERE t1.child_crn <> t2.child_crn
LIMIT 1
The materialized views for the above 3 queries are built and used to compare their performance against the local and remote databases. Table 8 shows the comparison of query response times.
Predicate ID  Remote DB  Local DB   Materialized View
1             12s 167ms  925ms      99ms
2             25s 51ms   2s 428ms   457ms
3             48s 418ms  14s 226ms  50ms

Table 8: Query Response Time on 3 Predicates
From Table 8, we can see that the query response times for all 3 predicates are very slow using the remote databases but are improved using local databases. We had to limit the number of returned records to 1 for predicate 3; otherwise it caused the system to hang when using the remote database. The response times are orders of magnitude faster with the corresponding materialized views constructed. The materialized view for predicate 3 is especially useful, reducing the query response time from 48 seconds on the remote database to 50 milliseconds. As mentioned before, how to select materialized views depends on the specific use case.
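As an illustration of this caching step, the following JDBC sketch materializes predicate 1 once and serves later evaluations from the view. It assumes a PostgreSQL backend (which supports CREATE MATERIALIZED VIEW natively); the connection URL, credentials, the view name mv_predicate1 and the integer type of child_crn are our placeholders, not details from the paper.

```java
import java.sql.*;

// Sketch: cache an expensive predicate query as a materialized view and
// answer subsequent predicate evaluations from the view. Assumes PostgreSQL;
// the connection URL and credentials below are placeholders.
public class PredicateViewCache {
    public static void main(String[] args) throws SQLException {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost:5432/childsafedb", "user", "pass");
             Statement st = conn.createStatement()) {

            // One-off (or periodic) materialization of predicate 1.
            st.execute("CREATE MATERIALIZED VIEW IF NOT EXISTS mv_predicate1 AS " +
                       "SELECT t2.child_crn, t2.year, t2.month " +
                       "FROM tbl2008 t2 " +
                       "LEFT JOIN tblchildren ON t2.child_crn = tblchildren.child_crn");

            // Predicate evaluation now reads the precomputed result instead of
            // re-running the join over the base tables.
            try (ResultSet rs = st.executeQuery(
                    "SELECT child_crn, year, month FROM mv_predicate1")) {
                while (rs.next()) {
                    System.out.println(rs.getInt("child_crn")); // assumes integer column
                }
            }

            // When the base tables change, the view is refreshed explicitly.
            st.execute("REFRESH MATERIALIZED VIEW mv_predicate1");
        }
    }
}
```

The trade-off is the usual one for materialized views: reads become cheap at the cost of explicit refreshes when the underlying facts change.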
6. Literature Review
In this section, we review the most related work from two research streams: the rule matching problem and the argumentation framework. We also analyze how the proposed work advances the state of the art in supporting efficient legal reasoning.
The Rule Matching Problem. Legal reasoning consists of a series of complex reasoning activities [14], and "Rule-Based Reasoning" is one of the most crucial types (see https://lawprofessors.typepad.com/legal_skills/2011/08/tip-of-the-week-five-methods-of-legal-reasoning.html, accessed on 22 February 2019). Intuitively, a rule-based system consists of three essential components: a set of rules (rule base), a fact base (knowledge base) and an interpreter for the rules (inference engine). In rule-based reasoning, rules that reflect the content of knowledge-based sources are applied and matched with a set of facts to deduce conclusions using an inference engine [41]. For many applications, the facts needed to run the application are stored in (relational) databases. Therefore, rule-based systems need to be coupled to database systems to allow interactions between rules and facts.
Since database schemas and vocabularies do not change often, this motivates an approach of encoding relationships between and among schemas as rules in a language called Datalog. A basic rule is an expression of the form

\[
r : p_1(t_{11}, \ldots, t_{1n}) \wedge \ldots \wedge p_k(t_{k1}, \ldots, t_{kn}) \rightarrow c(t_1, \ldots, t_n) \qquad (1)
\]

where p_1, ..., p_k are names of relations and the t_i's are variables or constants. The p_i's and c are called the predicates and the conclusion, respectively.
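For instance, an illustrative rule of this form (our generic example, not one of the paper's rules) derives grandparent tuples from a stored parent relation:

\[
r : \mathit{parent}(X, Y) \wedge \mathit{parent}(Y, Z) \rightarrow \mathit{grandparent}(X, Z)
\]

Here parent is an EDB relation stored in the database, and grandparent is derived by the rule.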
When a new tuple is added to the database, an inference engine cycles through three sequential steps: (1) match rules with the tuple, (2) run queries on all relations involved in the rules' preconditions, and (3) solve the rule by reasoning to generate resultant tuples. These steps iterate, as each derived tuple could trigger additional rules. Therefore, the efficiency of a rule-based system primarily depends on the efficiency of these 3 steps. Given a rule-based system consisting of a large number of rules and facts, rule matching is time consuming but crucial, as it influences the performance of the inference engine. The efficiency problem has been extensively studied in the past. Stonebraker [42] summarizes four techniques in principle: brute force, marking, discrimination networks and query rewrite.
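This cycle can be sketched as a worklist loop; the following is a minimal reconstruction under our own naming (Tuple, Rule, matchRules, queryPreconditions and derive are illustrative stubs, not an API from the paper or from any system cited below):

```java
import java.util.*;

// Minimal worklist sketch of the match / query / solve cycle that a
// forward-chaining inference engine runs when new tuples arrive.
abstract class InferenceCycle<Tuple, Rule> {
    // Step 1: rules whose preconditions mention the new tuple's relation.
    abstract List<Rule> matchRules(Tuple tuple);
    // Step 2: evaluate the rule's precondition queries against the database.
    abstract boolean queryPreconditions(Rule rule, Tuple tuple);
    // Step 3: reason over the rule to produce its resultant tuples.
    abstract List<Tuple> derive(Rule rule, Tuple tuple);

    void onInsert(Tuple newTuple) {
        Deque<Tuple> worklist = new ArrayDeque<>();
        worklist.add(newTuple);
        while (!worklist.isEmpty()) {       // derived tuples may trigger
            Tuple t = worklist.poll();      // further rules
            for (Rule r : matchRules(t)) {
                if (queryPreconditions(r, t)) {
                    worklist.addAll(derive(r, t));
                }
            }
        }
    }
}
```

The four techniques surveyed next differ mainly in how they make step (1), and to a lesser extent step (2), cheap.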
Brute force builds indexes on the attributes of relations. DLV (http://www.dlvsystem.com) is a deductive database system based on disjunctive logic programming; it constructs an index on the first attribute of each EDB relation. Jena (http://jena.apache.org), GraphDB (http://graphdb.ontotext.com) and SwiftOWLIM (https://lists.w3.org/Archives/Public/semantic-web/2010Jul/0411.html) build a full index on each attribute of each EDB relation for triples. Both YAP [43], a Prolog-based system, and Ontobroker (https://www.w3.org/2001/sw/wiki/OntoBroker), a deductive database, create indexes on the fly by analyzing the activation rules. Another Prolog-based system, XSB [44], offers a repertoire of indexing techniques, which must be specified manually. The AMEM index for LEAPS (https://www.leap.com.au) records the static part of the EDB, and the Filter Index for DATEX takes advantage of both TREAT and the marking method explained below.
The basic idea of the marking method, used in the POSTGRES rules system (PRS2), is that each rule is processed against the database and every record satisfying the event qualification is identified. Each such record is marked with a flag identifying the rule to be awakened [42]. There are difficult problems with keeping the markings correct as updates are made to the database.
Discrimination networks, such as RETE [45], TREAT [46] and their variations, are mainly used by production and reactive rule systems, including Drools (https://www.drools.org) and Jess (https://jessrules.com/jess/index.shtml). RETE adopts a rule-centric approach that uses rules to construct a network that efficiently locates tuples based on constant constraints between rules and tuples. TREAT is a tuple-centric network that records all potentially useful tuples in a network. A recently published work, the Yield Index [47], is similar to discrimination networks: it connects the matching tuples through a semantic index organized on top of its data index to speed up updates.
The above three techniques are designed for forward chaining rules. The fourth technique, query rewrite, is popular in backward chaining implementations: each applicable rule is substituted into user commands to produce a modified command. [48] shows how to extend query rewrite to also support forward chaining implementations.
All the above approaches improve the efficiency of rule matching, but they are attribute-based methods. For legal reasoning using defeasible deontic logic, each predicate itself could be a simple or complex SQL query. Therefore, those techniques are not practical for an inference engine for legal systems. Furthermore, long rules with many predicates create a query efficiency problem called the long chain effect. Previous studies mainly focus on the rule matching step but neglect the time taken by the query execution step. [13] is the only work that investigated the integration problem between databases and a rule-based legal system with a focus on the query efficiency problem. However, the efficiency of an inference engine for legal systems was not studied there.
The Argumentation Framework. Legal reasoning, at its core, is a process of argumentation, with opposing sides attempting to justify their own interpretations through appeals to precedent, principle, policy and purpose. AI and law research has addressed this with formal argumentation based on Dung's Abstract Argumentation Frameworks (AAFs) [49]. An AAF is a pair (A, D), where A is a set of arguments and D ⊆ A × A is a binary relation of defeat [50]. We say that A strictly defeats B if A defeats B while B does not defeat A. A semantics for AAFs returns sets of arguments called extensions, which are internally coherent and defend themselves against each attack.
Compared with AAFs, in which arguments are interpreted as abstract entities and only logical relationships between arguments are taken into account, structured argumentation considers an argument's internal structure. The authors of [51] overviewed some concrete algorithmic approaches to structured argumentation, including the ASPIC+ formalism [52] (e.g. the TOAST system [53]), Defeasible Logic Programming (DeLP) (e.g. the Tweety system [54]), Assumption-Based Argumentation, Carneades [55], etc. In a previous study [56], some representative approaches were compared regarding the representation of, and reasoning over, legal rules. The comparative analysis showed that Defeasible Deontic Logic provides the largest set of features and was the most efficient system among those tested.
DeLP offers a computational reasoning system that uses Abstract Dialectical Frameworks (ADFs) [57] to obtain answers from a knowledge base represented using an extended logic programming language with defeasible rules. This combination generates a computationally effective system together with a reasoning model similar to the one used by humans, which facilitates its use in real-world applications [58]. ADF is regarded as a powerful generalization of Dung's AAFs. An ADF is a directed graph whose nodes represent arguments, statements or positions. The nodes can be arbitrary items which can be accepted or not, and the links represent dependencies. However, unlike a link in an abstract argumentation framework (AAF), in which only the defeat relation is modelled, the meaning of an ADF link can vary. Bipolar ADF [59] is a sub-class of general ADFs in which only defeat and support relations are defined.
Some strategies [60, 61, 62, 63] were proposed to efficiently construct dialectical trees, or structured argumentation in general, by pruning the search space to speed up the inference process. This is realized by only expanding the dialectical tree until the evaluation status of the query is decided. For example, if an argument possesses multiple attackers, and it can already be decided that the first attacker is ultimately accepted and defeats the argument, then there is no need to evaluate the acceptance status of the remaining attackers, as it can already be decided that the argument under consideration is not acceptable [51].
Next we provide a comparative analysis between the proposed Semantic Rule Index (SRI) and structured argumentation frameworks from three perspectives: conception, computation and functionality.
Conception: As discussed, in a dialectical framework, nodes represent arguments and edges represent the relationships between arguments. Each argument may contain a rule or a set of rules that reach a particular conclusion. For the SRI, nodes represent rules and edges represent the relationships between rules. Informally, a sub-graph in the Semantic Rule Graph could be regarded as an argument, and some edges between sub-graphs could be regarded as relationships between arguments. How to incorporate the semantics defined in argumentation frameworks into the SRI is out of the scope of this work, but it deserves a formal study in future. However, formal studies of the relationships between argumentation systems and defeasible logic have been carried out [64, 15, 65, 66], showing that several argumentation frameworks proposed for legal reasoning correspond to (fragments of) variants of defeasible logic. Accordingly, we have strong reasons to believe that the SRI can be adapted to argumentation systems.
Computation:
For DeLP, to build a dialectical tree, the starting points for constructing an argument are the facts in the knowledge base. They are arguments themselves and on top of them new arguments can be made. The facts must be specified in the form true → fact. This implies that users have to have pre-knowledge of which facts are available. Different facts provided will reach different conclusions due to the different dialectical trees generated. Given a large scale of rules and frequently updated databases, it is hard for users to know every fact that should be considered for inferencing.
Our framework enables users to provide some predicates of interest. It is not necessary for users to know in advance whether the predicates are facts or not. The framework is able to identify which conclusions can be reached based on the predicates provided, as well as on the databases at querying time, which may impact what reasoning processes need to be followed. Therefore, the proposed solution is more practical compared with the traditional ADF approach.
A dialectical tree uses arguments as nodes. An argument itself is an inference tree that is constructed through rules. Conceptually, formalisms for structured argumentation often follow the steps of the so-called argumentation process or argumentation pipeline: (a) argument construction, which builds arguments composed of a claim and a derivation of that claim (e.g. a proof tree) from the given knowledge base; (b) determining conflicts among arguments; (c) evaluation of the acceptability of arguments; and (d) drawing conclusions from the acceptable arguments [67]. From a computational point of view, each of these steps taken individually can be quite computationally expensive: for instance, even the construction of single arguments may be computationally complex (NP-hard in some cases); a large number of arguments may be constructed; finding conflicts can be non-trivial; and the evaluation of acceptability has, in general, a high complexity, as in the case of abstract argumentation [51].
For the proposed framework, the Semantic Rule Index uses rules as nodes, which makes it straightforward to construct. The SRI can be built as a pre-processing step by scanning the rule system once. The index captures the defeat and support relations among the rules. At querying time, the framework searches the index to identify all the candidate rules and evaluates the acceptability of a rule conclusion through database queries. As discussed in Section 4.2, the search can be done in O(V + E) time. Therefore, the proposed approach better supports efficient large scale rule-based reasoning.
Functionality: The reparation chain feature of defeasible deontic logic brings many dynamics into the argumentation process and, to the best of our knowledge, there is no work considering this feature. Therefore, it is not clear how current structured argumentation frameworks could address it. However, this semantics can be captured through the proposed Semantic Rule Index by enriching the semantic representation of its edges.
The proposed inference engine achieved its goal of answering Rule Containment/Intersection queries efficiently. Although the setting is different from the previous work, as future work we will explore the semantics (e.g. grounded extension, admissible extension, etc.) defined by structured argumentation frameworks and Dung's work within the proposed framework. Based on that, a large-scale and systematic implementation comparison between these two approaches will be conducted.
7. Conclusion
In summary, we proposed the first unified framework that seamlessly integrates database systems with inference engines for the legal domain. It is able to answer the Rule Containment Query efficiently through the proposed inference engine, in which the search space for rule matching can be reduced efficiently through the Semantic Rule Index, and unnecessary reasoning computations and database queries can be avoided by the proposed Inference Controller. Moreover, query and reasoning results can be shared by multiple candidate rules, which further reduces the overall response time. Furthermore, by adopting caching techniques, the database query performance can be improved significantly. The framework and techniques can be adapted to address the Rule Intersection Query and the backward chaining strategy. In future, we will extend the work to support constitutive rules.
References

[1] G. Sartor, Legal Reasoning: A Cognitive Approach to the Law, Springer, 2005.

[2] T. F. Gordon, G. Governatori, A. Rotolo, Rules and norms: Requirements for rule interchange languages in the legal domain, in: G. Governatori, J. Hall, A. Paschke (Eds.), RuleML 2009, no. 5858 in LNCS, Springer, Heidelberg, 2009, pp. 282-296.
[3] G. Governatori, F. Olivieri, A. Rotolo, S. Scannapieco, Computing strong and weak permissions in defeasible logic, Journal of Philosophical Logic 42 (6) (2013) 799-829.

[4] T. F. Gordon, H. Prakken, D. Walton, The Carneades model of argument and burden of proof, Artificial Intelligence 171 (10-15) (2007) 875-896. doi:10.1016/j.artint.2007.04.010.

[5] H. Prakken, A. Z. Wyner, T. J. M. Bench-Capon, K. Atkinson, A formalization of argumentation schemes for legal case-based reasoning in ASPIC+, Journal of Logic and Computation 25 (5) (2015) 1141-1166. doi:10.1093/logcom/ext010.

[6] H. Herrestad, Norms and formalization, in: ICAIL '91, ACM, 1991, pp. 175-184. doi:10.1145/112646.112667.

[7] G. Sartor, The structure of norm conditions and nonmonotonic reasoning in law, in: R. E. Susskind (Ed.), Proceedings of the Third International Conference on Artificial Intelligence and Law, ICAIL '91, Oxford, England, June 25-28, 1991, ACM, 1991, pp. 155-164. doi:10.1145/112646.112665.

[8] H. Prakken, G. Sartor, Law and logic: A review from an argumentation perspective, Artif. Intell. 227 (2015) 214-245. doi:10.1016/j.artint.2015.06.005.

[9] A. J. I. Jones, M. J. Sergot, Deontic logic in the representation of law: Towards a methodology, Artif. Intell. Law 1 (1) (1992) 45-64. doi:10.1007/BF00118478.
[10] NSW Government, The NSW mandatory reporter guide (2016). URL http://www.keepthemsafe.nsw.gov.au/reporting_concerns/mandatory_reporter_guide

[11] S. Liang, P. Fodor, H. Wan, M. Kifer, OpenRuleBench: An analysis of the performance of rule engines, in: Proceedings of the 18th International Conference on World Wide Web, WWW '09, ACM, New York, NY, USA, 2009, pp. 601-610. doi:10.1145/1526709.1526790.

[12] R. Agrawal, Alpha: an extension of relational algebra to express a class of recursive queries, IEEE Transactions on Software Engineering 14 (7) (1988) 879-885. doi:10.1109/32.42731.

[13] M. B. Islam, G. Governatori, RuleRS: a rule-based architecture for decision support systems, Artificial Intelligence and Law 26 (4) (2018) 315-344.

[14] P. Wahlgren, Legal reasoning - a jurisprudential description, in: Proceedings of the 2nd International Conference on Artificial Intelligence and Law, ICAIL '89, ACM, New York, NY, USA, 1989, pp. 147-156. doi:10.1145/74014.74034.

[15] G. Governatori, On the relationship between Carneades and defeasible logic, in: Proceedings of ICAIL 2011, ACM, 2011, pp. 31-40.

[16] G. Antoniou, D. Billington, G. Governatori, M. J. Maher, Representation results for defeasible logic, ACM Transactions on Computational Logic 2 (2) (2001) 255-287.

[17] G. Governatori, Representing business contracts in RuleML, International Journal of Cooperative Information Systems 14 (2-3) (2005) 181-216. doi:10.1142/S0218843005001092.
[18] G. Governatori, S. Shek, Regorous: A business process compliance checker, in: Proceedings of the Fourteenth International Conference on Artificial Intelligence and Law, 2013, pp. 245-246. doi:10.1145/2514601.2514638.

[19] G. Governatori, The Regorous approach to process compliance, in: 2015 IEEE 19th International Enterprise Distributed Object Computing Workshop, IEEE Press, 2015, pp. 33-40. doi:10.1109/EDOC.2015.28.

[20] G. Governatori, M. Hashmi, No time for compliance, in: Enterprise Distributed Object Computing Conference (EDOC), 2015 IEEE 19th International, IEEE, 2015, pp. 9-18.

[21] G. Governatori, A. Rotolo, BIO logical agents: Norms, beliefs, intentions in defeasible logic, Autonomous Agents and Multi-Agent Systems 17 (1) (2008) 36-69.

[22] D. Nute, Defeasible logic, in: D. M. Gabbay, C. H. Hogger, J. Robinson (Eds.), Handbook of Logic in Artificial Intelligence and Logic Programming, Vol. 3, Oxford University Press, 1994, pp. 353-395.

[23] G. Antoniou, D. Billington, G. Governatori, M. J. Maher, On the modeling and analysis of regulations, in: Australian Conference on Information Systems, 1999.

[24] B. N. Grosof, Representing e-commerce rules via situated courteous logic programs in RuleML, Electronic Commerce Research and Applications 3 (1) (2004) 2-20.

[25] S. Sadiq, G. Governatori, Managing regulatory compliance in business processes, in: J. vom Brocke, M. Rosemann (Eds.), Handbook of Business Process Management, 2nd Edition, Vol. 2, Springer, 2015, pp. 265-288. doi:10.1007/978-3-642-45103-4_11.

[26] G. Governatori, A. Rotolo, A conceptually rich model of business process compliance, in: Proceedings of APCCM 2010, no. 110 in CRPIT, ACS, 2010, pp. 3-12.
[27] T. Skylogiannis, G. Antoniou, N. Bassiliades, G. Governatori, A. Bikakis, DR-NEGOTIATE: a system for automated agent negotiation with defeasible logic-based strategies, Data & Knowledge Engineering 63 (2007) 362-380. doi:10.1016/j.datak.2007.03.004.

[28] M. J. Maher, Propositional defeasible logic has linear complexity, Theory and Practice of Logic Programming 1 (06) (2001) 691-711.

[29] D. Billington, G. Antoniou, G. Governatori, M. J. Maher, An inclusion theorem for defeasible logics, ACM Transactions on Computational Logic 12 (1) (2010) 1-27.

[30] G. Governatori, Business process compliance: An abstract normative framework, IT - Information Technology 55 (6) (2013) 231-238. doi:10.1515/itit.2013.2003.

[31] M. Hashmi, G. Governatori, M. T. Wynn, Normative requirements for regulatory compliance: An abstract formal framework, Information Systems Frontiers (2015). doi:10.1007/s10796-015-9558-1.

[32] G. Governatori, A. Rotolo, Logic of violations: A Gentzen system for reasoning with contrary-to-duty obligations, Australasian Journal of Logic 4 (2006) 193-215.

[33] G. Governatori, A. Rotolo, Defeasible logic: Agency, intention and obligation, in: Proceedings of DEON 2004, no. 3065 in LNCS, Springer, 2004, pp. 114-128.

[34] I. Savnik, Index data structure for fast subset and superset queries, in: A. Cuzzocrea, C. Kittl, D. E. Simos, E. Weippl, L. Xu (Eds.), Availability, Reliability, and Security in Information Systems and HCI, Springer Berlin Heidelberg, Berlin, Heidelberg, 2013, pp. 134-148.

[35] M. Labib, Database caching strategies using Redis, Amazon Web Services (May 2017).
[36] A. P. Mohod, M. Chaudhari, Improve query performance using effective materialized view selection and maintenance: A survey, International Journal of Computer Science & Mobile Computing 2 (2013) 485-490.

[37] J. Challenger, A. Iyengar, P. Dantzig, A scalable system for consistently caching dynamic web data, in: Proceedings of IEEE INFOCOM '99, Vol. 1, 1999, pp. 294-303.

[38] A. Datta, K. Dutta, H. M. Thomas, D. E. VanderMeer, K. Ramamritham, D. Fishman, A comparative study of alternative middle tier caching solutions to support dynamic web content acceleration, in: Proceedings of the 27th International Conference on Very Large Data Bases, VLDB '01, 2001, pp. 667-670.

[39] L. Degenaro, A. Iyengar, I. Lipkind, I. Rouvellou, A middleware system which intelligently caches query results, in: IFIP/ACM International Conference on Distributed Systems Platforms, Middleware '00, 2000, pp. 24-44.

[40] S. Ghandeharizadeh, J. Yap, Materialized views and key-value pairs in a cache augmented SQL system: Similarities and differences, Technical report, University of Southern California, Database Laboratory Technical Report (2012).

[41] M. D. Soufi, T. Samad-Soltani, S. S. Vahdati, P. Rezaei-Hachesu, Decision support system for triage management: A hybrid approach using rule-based reasoning and fuzzy logic, International Journal of Medical Informatics 114 (2018) 35-44. doi:10.1016/j.ijmedinf.2018.03.008.

[42] M. Stonebraker, The integration of rule systems and database systems, IEEE Trans. on Knowl. and Data Eng. 4 (5) (1992) 415-423. doi:10.1109/69.166984.
55
[43] V. S. Costa, R. Rocha, L. Damas, The yap prolog system, Theory Pract.1255
Log. Program. 12 (1-2) (2012) 5–34. doi:10.1017/S1471068411000512.
URL http://dx.doi.org/10.1017/S1471068411000512
[44] K. Sagonas, T. Swift, D. S. Warren, Xsb as an efficient deductive database
engine, in: Proceedings of the 1994 ACM SIGMOD International Confer-
ence on Management of Data, SIGMOD ’94, ACM, New York, NY, USA,1260
1994, pp. 442–453. doi:10.1145/191839.191927.
URL http://doi.acm.org/10.1145/191839.191927
[45] C. L. Forgy, Expert systems, IEEE Computer Society Press, Los Alamitos,
CA, USA, 1990, Ch. Rete: A Fast Algorithm for the Many Pattern/Many
Object Pattern Match Problem, pp. 324–341.1265
URL http://dl.acm.org/citation.cfm?id=115710.115736
[46] D. P. Miranker, TREAT: A New and Efficient Match Algorithm for AI
Production Systems, Morgan Kaufmann Publishers Inc., San Francisco,
CA, USA, 1990.
[47] Y. Qin, X. Tao, Y. Huang, J. L¨u, An index structure supporting rule1270
activation in pervasive applications, World Wide Web 22 (1) (2019) 1–
37. doi:10.1007/s11280-017-0517-2.
URL https://doi.org/10.1007/s11280-017-0517-2
[48] M. Stonebraker, A. Jhingran, J. Goh, S. Potamianos, On rules, procedure,
caching and views in data base systems, in: Proceedings of the 1990 ACM1275
SIGMOD International Conference on Management of Data, SIGMOD ’90,
ACM, New York, NY, USA, 1990, pp. 281–290. doi:10.1145/93597.
98737.
URL http://doi.acm.org/10.1145/93597.98737
[49] P. Dung, On the acceptability of arguments and its fundamental role in1280
nonmonotonic reasoning, logic programming and n-person games, Artif.
Intell. 77 (1995) 321–358.
56
[50] H. Prakken, G. Sartor, Law and logic: A review from an argumentation perspective, Artif. Intell. 227 (2015) 214-245.

[51] F. Cerutti, S. Gaggl, M. Thimm, J. Wallner, Foundations of implementations for formal argumentation, FLAP 4 (2017).

[52] S. Modgil, H. Prakken, The ASPIC+ framework for structured argumentation: a tutorial, Argument Comput. 5 (2014) 31-62.

[53] M. Snaith, C. Reed, TOAST: online ASPIC+ implementation, Vol. 245, 2012. doi:10.3233/978-1-61499-111-3-509.

[54] M. Thimm, Tweety: A comprehensive collection of Java libraries for logical aspects of artificial intelligence and knowledge representation, in: KR, 2014.

[55] T. Gordon, D. Walton, Formalizing balancing arguments, in: COMMA, 2016.

[56] S. Batsakis, G. Baryannis, G. Governatori, I. Tachmazidis, G. Antoniou, Legal representation and reasoning in practice: A critical comparison, in: JURIX, 2018.

[57] G. Brewka, S. Woltran, Abstract dialectical frameworks, in: KR, 2010.

[58] M. A. Leiva, G. I. Simari, S. Gottifredi, A. García, G. R. Simari, DAQAP: Defeasible argumentation query answering platform, in: FQAS, 2019.

[59] C. Cayrol, M. Lagasquie-Schiex, On the acceptability of arguments in bipolar argumentation frameworks, in: ECSQARU, 2005.

[60] C. I. Chesñevar, G. R. Simari, A. García, Pruning search space in defeasible argumentation, arXiv: Artificial Intelligence (2004).

[61] A. Cohen, S. Gottifredi, A. García, A heuristic pruning technique for dialectical trees on argumentation-based query-answering systems, in: FQAS, 2019.
[62] N. D. Rotstein, S. Gottifredi, A. García, G. R. Simari, A heuristics-based pruning technique for argumentation trees, in: SUM, 2011.

[63] B. Testerink, D. Odekerken, F. Bex, A method for efficient argument-based inquiry, in: FQAS, 2019.

[64] G. Antoniou, M. J. Maher, D. Billington, Defeasible logic versus logic programming without negation as failure, J. Log. Program. 42 (1) (2000) 47-57. doi:10.1016/S0743-1066(99)00060-6.

[65] G. Governatori, M. J. Maher, D. Billington, G. Antoniou, Argumentation semantics for defeasible logics, Journal of Logic and Computation 14 (5) (2004) 675-702. doi:10.1093/logcom/14.5.675.

[66] H.-P. Lam, G. Governatori, R. Riveret, On ASPIC+ and defeasible logic, in: P. Baroni, T. F. Gordon, T. Scheffler, M. Stede (Eds.), Proceedings of COMMA 2016, Vol. 287 of Frontiers in Artificial Intelligence and Applications, IOS Press, Amsterdam, 2016, pp. 359-370. doi:10.3233/978-1-61499-686-6-359.

[67] M. Caminada, L. Amgoud, On the evaluation of argumentation formalisms, Artificial Intelligence 171 (5) (2007) 286-310. doi:10.1016/j.artint.2007.02.003.
... The development of mobile hybrid system technology in this new framework system model can make transactions and medicine distribution supervision done online and real time [14]. The purpose of this research is to create a new framework system model by combining supply chain [15] and expert system [16] regarding medicine distribution using the rule-based reasoning method [17]. The rule-based reasoning method is very suitable to be used in this research because it can adopt the regulations and knowledge of pharmaceutical experts into a system in the form of an algorithm, even the rule-based reasoning method allows experts to be directly involved in research [18]. ...
... Likewise, the output in the system itself can be the input for the supply chain information system. Combined framework of supply chain and expert system using rule-based reasoning method can encourage better system work [17], [41]. The combined architecture in this research can be seen in Figure 5. ...
Article
Full-text available
The medicine distribution supply chain is important, especially during the COVID-19 pandemic, because delays in medicine distribution can increase the risk for patients. So far, the distribution of medicines has been carried out exclusively and even some medicines are distributed on a limited basis because they require strict supervision from the Medicine Supervisory Agency in each department. However, the distribution of this medicine has a weakness if at one public Health center there is a shortage of certain types of medicines, it cannot ask directly to other public Health center, thus allowing the availability of medicines not to be fulfilled. An integrated process is needed that can accommodate regulations and leadership policies and can be used for logistics management that will be used in medicine distribution. This study will create a new model by combining supply chains with information systems and expert systems using the rule-based reasoning method as an inference engine that can be developed for medicine distribution based on a mobile hybrid system in the Demak District Health Office, Indonesia. So that a new framework model based on a mobile hybrid system can facilitate the distribution of medicines effectively and efficiently.
... More recently, authors proposed a theoretical framework of legal rule-based system for the criminal domain, named CORBS (El Ghosh et al., 2017). The system is founded on a homogeneous integration of a criminal domain ontology with a set of logic rules (Liu et al., 2021). Thus, CORBS stands as a unified framework that supports efficient legal reasoning. ...
Article
Full-text available
Decisions made by legal adjudicators and administrative decision-makers often found upon a reservoir of stored experiences, from which is drawn a tacit body of expert knowledge. Such expertise may be implicit and opaque, even to the decision-makers themselves, and generates obstacles when implementing AI for automated decision-making tasks within the legal field, since, to the extent that AI-powered decision-making tools must found upon a stock of domain expertise, opacities may proliferate. This raises particular issues within the legal domain, which requires a high level of accountability, thus transparency. This requires enhanced explainability, which entails that a heterogeneous body of stakeholders understand the mechanism underlying the algorithm to the extent that an explanation can be furnished. However, the “black-box” nature of some AI variants, such as deep learning, remains unresolved, and many machine decisions therefore remain poorly understood. This survey paper, based upon a unique interdisciplinary collaboration between legal and AI experts, provides a review of the explainability spectrum, as informed by a systematic survey of relevant research papers, and categorises the results. The article establishes a novel taxonomy, linking the differing forms of legal inference at play within particular legal sub-domains to specific forms of algorithmic decision-making. The diverse categories demonstrate different dimensions in explainable AI (XAI) research. Thus, the survey departs from the preceding monolithic approach to legal reasoning and decision-making by incorporating heterogeneity in legal logics: a feature which requires elaboration, and should be accounted for when designing AI-driven decision-making systems for the legal field. It is thereby hoped that administrative decision-makers, court adjudicators, researchers, and practitioners can gain unique insights into explainability, and utilise the survey as the basis for further research within the field.
... More recently, authors proposed a theoretical framework of legal rule-based system for the criminal domain, named CORBS (El Ghosh et al., 2017). The system is founded on a homogeneous integration of a criminal domain ontology with a set of logic rules (Liu et al., 2021). Thus, CORBS stands as a unified framework that supports efficient legal reasoning. ...
Preprint
Full-text available
Decisions made by legal adjudicators and administrative decision-makers often found upon a reservoir of stored experiences, from which is drawn a tacit body of expert knowledge. Such expertise may be implicit and opaque, even to the decision-makers themselves, and generates obstacles when implementing AI for automated decision-making tasks within the legal field, since, to the extent that AI-powered decision-making tools must found upon a stock of domain expertise, opacities may proliferate. This raises particular issues within the legal domain, which requires a high level of accountability, thus transparency. This requires enhanced explainability, which entails that a heterogeneous body of stakeholders understand the mechanism underlying the algorithm to the extent thatanexplanationcanbefurnished.However,the’black-box’nature ofsomeAIvariants,suchas deep learning, remains unresolved, and many machine decisions therefore remain poorly understood. This survey paper, based upon a unique interdisciplinary collaboration between legal and AI experts, provides a review of the explainability spectrum, as informed by a systematic survey of relevant research papers, and categorises the results. The article establishes a novel taxonomy, linking the differing forms of legal inference at play within particular legal sub-domains to specific forms of algorithmic decision-making. The diverse categories demonstrate different dimensions in explainable AI (XAI) research. Thus, the survey departs from the preceding monolithic approach to legal reasoning and decision-making by incorporating heterogeneity in legal logics: a feature which requires elaboration, and should be accounted for when designing AI-driven decision-making systems for the legal field. It is thereby hoped that administrative decision-makers, court adjudicators, researchers, and practitioners can gain unique insights into explainability, and utilise the survey as the basis for further research within the field.
... Several works on knowledge graph reasoning do not involve input text for inferring new relation edges. These tasks have been solved using different approaches including logic rules [2,33,41,63], bayesian models [47,58,70], distributed representations [36,55,79], neural networks [38,45,60,74], and reinforcement learning [19,37,64]. Relation prediction, as a special class of knowledge graph reasoning, predicts missing relation edges between two entities [12,13,44,50,65]. ...
Article
Full-text available
Contextual Path Generation (CPG) refers to the task of generating knowledge path(s) between a pair of entities mentioned in an input textual context to determine the semantic connection between them. Such knowledge paths, also called contextual paths, can be very useful in many advanced information retrieval applications. Nevertheless, CPG involves several technical challenges, namely, sparse and noisy input context, missing relations in knowledge graphs, and generation of ill-formed and irrelevant knowledge paths. In this paper, we propose a transformer-based model architecture. In this approach, we leverage a mixture of pre-trained word and knowledge graph embeddings to encode the semantics of input context, a transformer decoder to perform path generation controlled by encoded input context and head entity to stay relevant to the context, and scaling methods to sample a well-formed path. We evaluate our proposed CPG models derived using the above architecture on two real datasets, both consisting of Wikinews articles as input context documents and ground truth contextual paths, as well as a large synthetic dataset to conduct larger-scale experiments. Our experiments show that our proposed models outperform the baseline models, and the scaling methods contribute to better quality contextual paths. We further analyze how CPG accuracy can be affected by different amount of context data, and missing relations in the knowledge graph. Finally, we demonstrate that an answer model for knowledge graph questions adapted for CPG could not perform well due to the lack of an effective path generation module.
... Due to its simplicity in codifying the knowledge of human experts, rule-based reasoning systems have been widely used in various knowledge-intensive expert systems. For example, a rule-based system has been used for legal reasoning (Liu et al., 2021), safety assessment (Tang et al., 2020), emergency management (Jain et al., 2021), and online communication (Akbar et al., 2014). Specifically, in the biodiversity research area, rulebased systems are also widely used, for example, for predicting the impact of land-use changes on biodiversity (Scolozzi & Geneletti, 2011), molecular biodiversity database management (Pannarale et al., 2012), or for generating linked biodiversity data (Akbar et al., 2020). ...
Article
Full-text available
Aim/Purpose Although the significance of data provenance has been recognized in a variety of sectors, there is currently no standardized technique or approach for gathering data provenance. The present automated technique mostly employs workflow-based strategies. Unfortunately, the majority of current information systems do not embrace the strategy, particularly biodiversity information systems in which data is acquired by a variety of persons using a wide range of equipment, tools, and protocols. Background This article presents an automated technique for producing temporal data provenance that is independent of biodiversity information systems. The approach is dependent on the changes in contextual information of data items. By mapping the modifications to a schema, a standardized representation of data provenance may be created. Consequently, temporal information may be automatically inferred. Methodology The research methodology consists of three main activities: database event detection, event-schema mapping, and temporal information inference. First, a list of events will be detected from databases. After that, the detected events will be mapped to an ontology, so a common representation of data provenance will be obtained. Based on the derived data provenance, rule-based reasoning will be automatically used to infer temporal information. Consequently, a temporal provenance will be produced. Contribution This paper provides a new method for generating data provenance automatically without interfering with the existing biodiversity information system. In addition to this, it does not mandate that any information system adheres to any particular form. Ontology and the rule-based system as the core components of the solution have been confirmed to be highly valuable in biodiversity sci�ence. Findings Detaching the solution from any biodiversity information system provides scalability in the implementation. Based on the evaluation of a typical biodiver�sity information system for species traits of plants, a high number of temporal information can be generated to the highest degree possible. Using rules to en�code different types of knowledge provides high flexibility to generate temporal information, enabling different temporal-based analyses and reasoning. Recommendations for Practitioners The strategy is based on the contextual information of data items, yet most in�formation systems simply save the most recent ones. As a result, in order for the solution to function properly, database snapshots must be stored on a fre�quent basis. Furthermore, a more practical technique for recording changes in contextual information would be preferable. Recommendations for Researchers The capability to uniformly represent events using a schema has paved the way for automatic inference of temporal information. Therefore, a richer represen�tation of temporal information should be investigated further. Also, this work demonstrates that rule-based inference provides flexibility to encode different types of knowledge from experts. Consequently, a variety of temporal-based data analyses and reasoning can be performed. Therefore, it will be better to in�vestigate multiple domain-oriented knowledge using the solution. Impact on Society Using a typical information system to store and manage biodiversity data has not prohibited us from generating data provenance. Since there is no restriction on the type of information system, our solution has a high potential to be widely adopted. 
Future Research: The data analysis of this work was limited to species traits data. However, there are other types of biodiversity data, including genetic composition, species population, and community composition. In the future, this work will be expanded to cover all those types of biodiversity data. The ultimate goal is to have a standard methodology or strategy for collecting provenance from any biodiversity data regardless of how the data was stored or managed.
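As a rough illustration of the three-stage pipeline this abstract describes (event detection, event-schema mapping, temporal inference), the following minimal Python sketch diffs two database snapshots and applies one temporal rule. All table, event, and rule names are invented for illustration and are not the paper's actual ontology or rule base.

```python
# Minimal sketch of the three-stage pipeline described above:
# (1) detect events from database snapshots, (2) map them to a common
# schema, (3) infer temporal information with a simple rule.
# All names (tables, event types, rules) are illustrative placeholders.

from datetime import date

# (1) Event detection: diff two snapshots of the same table.
def detect_events(old_rows, new_rows, key="id"):
    old = {r[key]: r for r in old_rows}
    events = []
    for row in new_rows:
        if row[key] not in old:
            events.append({"type": "INSERT", "row": row})
        elif row != old[row[key]]:
            events.append({"type": "UPDATE", "row": row, "before": old[row[key]]})
    return events

# (2) Event-schema mapping: normalise events to provenance records.
def to_provenance(event, observed_on):
    return {"activity": event["type"].lower(),
            "entity": event["row"]["id"],
            "time": observed_on}

# (3) Rule-based temporal inference: a hypothetical rule saying that an
# update observed at time t implies the previous value was valid until t.
def infer_valid_until(provenance_records):
    inferred = []
    for rec in provenance_records:
        if rec["activity"] == "update":
            inferred.append({"entity": rec["entity"],
                             "valid_until": rec["time"]})
    return inferred

snap_v1 = [{"id": 1, "trait": "leaf_area", "value": 12.0}]
snap_v2 = [{"id": 1, "trait": "leaf_area", "value": 14.5}]
events = detect_events(snap_v1, snap_v2)
prov = [to_provenance(e, date(2021, 4, 29)) for e in events]
print(infer_valid_until(prov))   # [{'entity': 1, 'valid_until': ...}]
```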
... Still, there exist only comparably few systems that, in fact, automate reasoning processes based on normative knowledge. Notable examples are provided by Liu et al., who interpret legal norms in a defeasible deontic logic and provide automation for it [26], and by the SPINdle prover [24] for propositional (modal) defeasible reasoning, which has been used in multiple works in the normative application domain. ...
Preprint
LegalRuleML is a comprehensive XML-based representation framework for modeling and exchanging normative rules. The TPTP input and output formats, on the other hand, are general-purpose standards for the interaction with automated reasoning systems. In this paper we provide a bridge between the two communities by (i) defining a logic-pluralistic normative reasoning language based on the TPTP format, (ii) providing a translation scheme between relevant fragments of LegalRuleML and this language, and (iii) proposing a flexible architecture for automated normative reasoning based on this translation. We exemplarily instantiate and demonstrate the approach with three different normative logics.
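To make the translation idea concrete, here is a minimal sketch that renders a simplified (not schema-valid) LegalRuleML-style rule as a TPTP-flavoured formula. The XML layout, the `obl(...)` encoding of obligation, and the trailing annotation comment are assumptions for illustration, not the paper's actual translation scheme.

```python
# Illustrative sketch only: render a simplified, LegalRuleML-inspired
# rule as a TPTP-flavoured formula. The XML shape and the deontic
# encoding below are assumptions, not the paper's translation scheme.

import xml.etree.ElementTree as ET

RULE_XML = """
<rule key="r1" strength="defeasible">
  <if>invoice_received</if>
  <then deontic="obligation">pay_within_30_days</then>
</rule>
"""

def to_tptp(xml_text):
    rule = ET.fromstring(xml_text)
    body = rule.find("if").text
    head = rule.find("then")
    # Encode the obligation with a modal-style wrapper predicate.
    conclusion = (f"obl({head.text})"
                  if head.get("deontic") == "obligation" else head.text)
    return (f"fof({rule.get('key')}, axiom, "
            f"{body} => {conclusion}).  % {rule.get('strength')} rule")

print(to_tptp(RULE_XML))
# fof(r1, axiom, invoice_received => obl(pay_within_30_days)).  % defeasible rule
```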
... Due to their simplicity in codifying the knowledge of human experts, rule-based reasoning systems have been widely used in various knowledge-intensive expert systems. For example, rule-based systems have been used for legal reasoning (Liu et al., 2021), safety assessment (Tang et al., 2020), emergency management (Jain et al., 2021), and online communication (Akbar et al., 2014). Specifically, in the biodiversity research area, rule-based systems are also widely used, for example, for predicting the impact of land-use changes on biodiversity (Scolozzi & Geneletti, 2011), molecular biodiversity database management (Pannarale et al., 2012), or for generating linked biodiversity data (Akbar et al., 2020). ...
Article
Aim/Purpose: Although the significance of data provenance has been recognized in a variety of sectors, there is currently no standardized technique or approach for gathering data provenance. The present automated techniques mostly employ workflow-based strategies. Unfortunately, the majority of current information systems do not embrace that strategy, particularly biodiversity information systems, in which data is acquired by a variety of persons using a wide range of equipment, tools, and protocols.
Background: This article presents an automated technique for producing temporal data provenance that is independent of biodiversity information systems. The approach depends on changes in the contextual information of data items. By mapping the modifications to a schema, a standardized representation of data provenance may be created. Consequently, temporal information may be automatically inferred.
Methodology: The research methodology consists of three main activities: database event detection, event-schema mapping, and temporal information inference. First, a list of events is detected from databases. The detected events are then mapped to an ontology, so that a common representation of data provenance is obtained. Based on the derived data provenance, rule-based reasoning is used to automatically infer temporal information. Consequently, a temporal provenance is produced.
Contribution: This paper provides a new method for generating data provenance automatically without interfering with the existing biodiversity information system. In addition, it does not mandate that any information system adhere to any particular form. Ontology and the rule-based system, as the core components of the solution, have been confirmed to be highly valuable in biodiversity science.
Findings: Detaching the solution from any biodiversity information system provides scalability in the implementation. Based on the evaluation of a typical biodiversity information system for species traits of plants, a large amount of temporal information can be generated to the highest degree possible. Using rules to encode different types of knowledge provides high flexibility to generate temporal information, enabling different temporal-based analyses and reasoning.
Recommendations for Practitioners: The strategy is based on the contextual information of data items, yet most information systems simply save the most recent values. As a result, for the solution to function properly, database snapshots must be stored on a frequent basis. Furthermore, a more practical technique for recording changes in contextual information would be preferable.
Recommendations for Researchers: The capability to uniformly represent events using a schema has paved the way for automatic inference of temporal information. Therefore, a richer representation of temporal information should be investigated further. This work also demonstrates that rule-based inference provides the flexibility to encode different types of knowledge from experts, so that a variety of temporal-based data analyses and reasoning can be performed. It would therefore be worthwhile to investigate multiple domain-oriented kinds of knowledge using the solution.
Impact on Society: Using a typical information system to store and manage biodiversity data has not prevented us from generating data provenance. Since there is no restriction on the type of information system, our solution has a high potential to be widely adopted.
Future Research : The data analysis of this work was limited to species traits data. However, there are other types of biodiversity data, including genetic composition, species population, and community composition. In the future, this work will be expanded to cover all those types of biodiversity data. The ultimate goal is to have a standard methodology or strategy for collecting provenance from any biodiversity data regardless of how the data was stored or managed. Keywords: temporal data provenance, biodiversity, ontology, rule-based reasoning
Article
Full-text available
This study aims to build a system for evaluating the level of legal understanding using a fuzzy logic algorithm implemented in Python. The system uses an input component to receive data on legal-understanding groups and an output component to produce a legal-understanding score. The fuzzy logic algorithm is used to process respondents' answer data by data group, and system tests show that the system is accurate. These findings confirm the system's ability to provide an accurate baseline assessment of a person's level of legal understanding. The implications of this system show its potential as an effective evaluation tool for measuring legal understanding across samples of groups at different levels of understanding, particularly in the legal field. By taking participant diversity into account, this research can provide a strong foundation for the development of similar systems in a variety of future evaluation contexts.
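A minimal sketch of the fuzzy scoring idea described in this abstract: the membership functions, cut-off values, and defuzzification step below are invented for illustration and are not the study's actual rule base.

```python
# Minimal sketch of fuzzy scoring for legal-understanding evaluation.
# Membership functions and bounds are invented for illustration.

def triangular(x, a, b, c):
    """Triangular membership function on [a, c] with peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def fuzzy_level(score):  # score: 0-100 from a questionnaire
    memberships = {
        "low":    triangular(score, -1, 0, 50),
        "medium": triangular(score, 25, 50, 75),
        "high":   triangular(score, 50, 100, 101),
    }
    # Defuzzify by picking the label with maximal membership.
    return max(memberships, key=memberships.get), memberships

label, degrees = fuzzy_level(62.0)
print(label, degrees)  # medium {'low': 0.0, 'medium': 0.52, 'high': 0.24}
```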
Chapter
LegalRuleML is a comprehensive XML-based representation framework for modeling and exchanging normative rules. The TPTP input and output formats, on the other hand, are general-purpose standards for the interaction with automated reasoning systems. In this paper we provide a bridge between the two communities by (i) defining a logic-pluralistic normative reasoning language based on the TPTP format, (ii) providing a translation scheme between relevant fragments of LegalRuleML and this language, and (iii) proposing a flexible architecture for automated normative reasoning based on this translation. We exemplarily instantiate and demonstrate the approach with three different normative logics. Keywords: Automated reasoning, LegalRuleML, Deontic logics
Conference Paper
Full-text available
Representation and reasoning over legal rules is an important application domain, and a number of related approaches have been developed. In this work, we investigate legal reasoning in practice based on three use cases of increasing complexity. We consider three representation and reasoning approaches: (a) Answer Set Programming, (b) Argumentation and (c) Defeasible Logic. The representation and reasoning approaches are evaluated with respect to semantics, expressiveness, efficiency, complexity and support.
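The kind of defeasibility all three formalisms must capture can be shown with a toy example: a general prohibition overridden by a more specific permission. The naive priority-based evaluator below is an illustration only, not any of the systems evaluated in that work.

```python
# Toy illustration of the defeasibility the compared approaches must
# capture: a general rule overridden by a more specific, higher-priority
# rule. Not ASP, argumentation, or an actual defeasible-logic prover.

facts = {"vehicle_in_park"}
rules = [
    # (name, premises, conclusion, priority) - higher priority wins
    ("r1", {"vehicle_in_park"}, "prohibited", 1),
    ("r2", {"vehicle_in_park", "is_ambulance"}, "permitted", 2),
]

def conclude(facts, rules):
    applicable = [r for r in rules if r[1] <= facts]  # premises hold
    if not applicable:
        return None
    return max(applicable, key=lambda r: r[3])[2]     # strongest rule

print(conclude(facts, rules))                      # prohibited
print(conclude(facts | {"is_ambulance"}, rules))   # permitted
```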
Article
Full-text available
Decision-makers in governments, enterprises, businesses and agencies, as well as individuals, typically make decisions according to various regulations, guidelines and policies, based on existing records stored in various databases, in particular relational databases. To assist decision-makers, an expert system encompasses interactive computer-based systems or subsystems that support the decision-making process. Typically, most expert systems are built on top of transaction systems, databases, and data models; their support for decision-making is restricted to analysing, processing and presenting data and information, and they do not provide support for the normative layer. This paper provides a solution to one specific problem that arises from this situation, namely the lack of a tool/mechanism to demonstrate how an expert system is well-suited for supporting decision-making activities drawn from existing records and the relevant legal requirements aligned with those records. We present a Rule-based (pre and post) reporting system (RuleRS) architecture, which is intended to integrate databases, in particular relational databases, with a logic-based reasoner and rule engine to assist in decision-making or create reports according to legal norms. We argue that the resulting RuleRS provides an efficient and flexible solution to the problem at hand using defeasible inference. To this end, we have also conducted empirical evaluations of RuleRS performance.
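The integration pattern described here, relational data feeding a normative rule layer, can be sketched as follows. The table, the rule, and the report format are invented examples, not RuleRS's actual schema or rule language.

```python
# Sketch of the integration pattern: fetch case facts from a relational
# database and feed them to a (here: trivial) rule layer that checks a
# normative requirement. Table and rule are invented examples.

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payments (invoice_id INT, days_late INT)")
conn.executemany("INSERT INTO payments VALUES (?, ?)",
                 [(1, 0), (2, 45), (3, 12)])

# Normative rule (illustrative): a payment more than 30 days late
# triggers an obligation to pay a penalty.
def report(conn):
    rows = conn.execute("SELECT invoice_id, days_late FROM payments")
    for invoice_id, days_late in rows:
        if days_late > 30:
            yield f"invoice {invoice_id}: obligation(pay_penalty)"

print(list(report(conn)))  # ['invoice 2: obligation(pay_penalty)']
```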
Article
Full-text available
Rule mechanisms have been widely used in many areas, such as databases, artificial intelligence and pervasive computing. In a rule mechanism, rule activation decides which rules are activated, when the rules are activated, and which tuples can be generated through the activation. Rule activation determines the efficiency of the rule mechanism. In this article, we define the semantic constraints of a rule, the constant constraint and the variable constraint, according to the semantics of Datalog rules. Based on these constraints, we propose an index structure, named Yield index, to support rule activation effectively. Yield index consists of a data index and a semantic index, and records the complete information of a rule, including the matching relationships among the tuples of the different relations in the rule body. The index integrates tuple insertion and rule activation to directly determine whether matching tuples for a newly inserted tuple exist. Due to this characteristic, we perform only effective rule activation, avoiding ineffective rule activations that cannot generate new tuples, so that the efficiency of rule activation is improved. The article describes the structure of Yield index, its construction and maintenance algorithms, and the rule activation algorithm based on Yield index. The experimental results show that Yield index has better performance, improving activation efficiency by one order of magnitude compared with other index structures. In addition, we also discuss possible extensions of Yield index to other applications.
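A much-reduced sketch of the constant-constraint idea: index rules by the constants their bodies require, so that a newly inserted tuple activates only the rules it can possibly match. This illustrates the filtering principle only and is not the Yield index structure itself.

```python
# Simplified sketch of constant-constraint filtering for rule
# activation. Rules, relations, and attributes are invented examples.

from collections import defaultdict

# Rules with a single-atom body: (rule_name, relation, {attr: constant}).
rules = [
    ("r1", "order", {"status": "paid"}),
    ("r2", "order", {"status": "cancelled"}),
    ("r3", "customer", {}),          # no constant constraint
]

index = defaultdict(list)            # relation -> rules over it
for rule in rules:
    index[rule[1]].append(rule)

def candidate_rules(relation, tuple_):
    """Return rules whose constant constraints the new tuple satisfies."""
    return [name for name, _, consts in index[relation]
            if all(tuple_.get(k) == v for k, v in consts.items())]

print(candidate_rules("order", {"id": 7, "status": "paid"}))   # ['r1']
print(candidate_rules("order", {"id": 8, "status": "open"}))   # []
```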
Article
Full-text available
A data warehouse (DW) can be defined as a set of data cubes defined over the source relations. To avoid complex query evaluation against the master table and to increase the speed of queries posed to a data warehouse, we can store snapshot results of query processing in the data warehouse; these are called materialized views. Appropriate materialized view selection is one of the most crucial decisions in designing a data warehouse for high efficiency, as well as a basic requirement of a successful business application. Materialized views are extremely useful for quick query processing. In this paper, we first focus on various techniques, past and recent, that have been implemented for the selection of materialized views. Second, the most critical issues related to maintaining materialized views and effective query maintenance strategies are discussed, along with a comparison of all the discussed systems.
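The trade-off this survey discusses can be illustrated with a toy materialized SUM view that is maintained incrementally instead of being recomputed from the base table on every change; the schema is invented for illustration.

```python
# Toy illustration: a materialized view answers the query instantly but
# must be refreshed when base data changes. Incremental maintenance is
# shown for a SUM(amount) GROUP BY category view over invented data.

base_table = [("books", 120.0), ("books", 80.0), ("toys", 60.0)]

# Materialize the aggregate once, ...
view = {}
for category, amount in base_table:
    view[category] = view.get(category, 0.0) + amount

# ... then maintain it incrementally on insert instead of recomputing.
def insert(row):
    base_table.append(row)
    category, amount = row
    view[category] = view.get(category, 0.0) + amount

insert(("books", 50.0))
print(view["books"])   # 250.0, answered from the view, no base scan
```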
Chapter
Arguments in argumentation-based query-answering systems can be associated with a set of evidence required for their construction. This evidence might have to be retrieved from external sources such as databases or the web, and each attempt to retrieve a piece of evidence comes with an associated cost. Moreover, a piece of evidence may be available at one moment but not at others, and this is not known beforehand. As a result, the set of active arguments (those whose entire set of evidence is available) that can be used by the argumentation machinery of the system may vary from one scenario to another. In this work we propose a heuristic pruning technique for building dialectical trees in argumentation-based query-answering systems, with the aim of minimizing the cost of retrieving the pieces of evidence associated with the arguments that need to be accounted for in the reasoning process.
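A simplified sketch of the pruning idea: since each evidence lookup has a cost, stop probing counter-arguments as soon as the query's status is settled. The argument structures and costs below are invented and far simpler than dialectical trees.

```python
# Simplified sketch of cost-aware pruning: evidence retrieval has a
# cost, so stop once the answer can no longer change. Arguments and
# costs are invented for illustration.

def evidence_available(piece, world):
    return piece in world             # stands in for a costly lookup

def query(claim, counterargs, world):
    """Accept claim unless some counter-argument's evidence is active."""
    cost = 0
    for arg in counterargs:           # cheapest evidence probed first
        cost += arg["cost"]
        if all(evidence_available(e, world) for e in arg["evidence"]):
            return False, cost        # defeated: remaining args pruned
    return True, cost

counterargs = sorted([
    {"evidence": ["witness_b"], "cost": 5},
    {"evidence": ["dna_report"], "cost": 20},
], key=lambda a: a["cost"])

# The first active counter-argument settles the query; the costlier
# lookup for dna_report is never paid.
print(query("innocent", counterargs, world={"witness_b"}))   # (False, 5)
```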
Chapter
In this paper we describe a method for efficient argument-based inquiry. In this method, an agent creates arguments for and against a particular topic by matching argumentation rules with observations gathered by querying the environment. To avoid making superfluous queries, the agent needs to determine if the acceptability status of the topic can change given more information. We define a notion of stability, where a structured argumentation setup is stable if no new arguments can be added, or if adding new arguments will not change the status of the topic. Because determining stability requires hypothesizing over all future argumentation setups, which is computationally very expensive, we define a less complex approximation algorithm and show that this is a sound approximation of stability. Finally, we show how stability (or our approximation of it) can be used in determining an optimal inquiry policy, and discuss how this policy can be used to, for example, determine a strategy in an argument-based inquiry dialogue.
Chapter
In this paper we present the DAQAP, a Web platform for Defeasible Argumentation Query Answering, which offers a visual interface that facilitates the analysis of the argumentative process defined in the Defeasible Logic Programming (DeLP) formalism. The tool presents graphs that show the interaction of the arguments generated from a DeLP program; this is done in two different ways: the first focuses on the structures obtained from the DeLP program, while the second presents the defeat relationships from the point of view of abstract argumentation frameworks, with the possibility of calculating the extensions using Dung’s semantics. Using all this data, the platform provides support for answering queries regarding the states of literals of the input program.
Article
Objectives: Fast and accurate patient triage is a critical first step of the response process in emergency situations. Triage is often performed using a paper-based mode, which intensifies workload and difficulty, wastes time, and is prone to human error. This study aims to design and evaluate a decision support system (DSS) to determine the triage level. Methods: A combination of the Rule-Based Reasoning (RBR) and Fuzzy Logic Classifier (FLC) approaches was used to predict the triage level of patients according to the triage specialist's opinions and Emergency Severity Index (ESI) guidelines. RBR was applied to model the first to fourth decision points of the ESI algorithm. Data relating to vital signs were used as input variables and modeled using fuzzy logic. Narrative knowledge was converted to if-then rules using XML. The extracted rules were then used to create the rule-based engine and predict the triage levels. Results: Fourteen RBR and 27 fuzzy rules were extracted and used in the rule-based engine. The performance of the system was evaluated using three methods with real triage data. The accuracy of the clinical decision support system (CDSS) on the test data was 99.44%. The evaluation of the error rate revealed that, when using the traditional method, 13.4% of the patients were mis-triaged, which is statistically significant. The completeness of the documentation also improved from 76.72% to 98.5%. Conclusions: The designed system was effective in determining the triage level of patients, and it proved helpful for nurses as they made decisions and generated nursing diagnoses based on triage guidelines. The hybrid approach can reduce triage misdiagnosis in a highly accurate manner and improve triage outcomes.
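A minimal sketch of the hybrid RBR/FLC idea: fuzzify one vital sign and combine it with crisp ESI-style decision rules. The membership bounds and rules are invented for illustration and are not the study's validated rule base.

```python
# Sketch of the hybrid idea: fuzzify a vital sign, then apply crisp
# if-then triage rules. Bounds and rules are invented for illustration
# and are not the study's validated ESI rule base.

def heart_rate_high(hr):
    """Degree to which a heart rate counts as 'high' (ramp 100-130)."""
    return min(1.0, max(0.0, (hr - 100) / 30))

def triage_level(hr, responsive):
    high = heart_rate_high(hr)        # FLC part: fuzzy membership
    if not responsive:                # RBR part: ESI-style decision point
        return 1                      # immediate, life-threatening
    if high >= 0.8:
        return 2
    return 3 if high > 0.2 else 4

print(triage_level(hr=140, responsive=True))   # 2
print(triage_level(hr=95, responsive=True))    # 4
```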