Towards an Efficient Rule-based Framework for Legal
Reasoning
Qing Liu (corresponding author), Mohammad Badiul Islam, Guido Governatori
Software & Computational Systems, Data61, CSIRO, Australia
{Q.Liu, Badiul.Islam, Guido.Governatori}@data61.csiro.au
Abstract
A rule-based knowledge system consists of three main components: a set of
rules, facts to be fed to the reasoning corresponding to the data of a case,
and an inference engine. In general, facts are stored in (relational)
databases that represent knowledge in a first-order based formalism. However,
legal knowledge is represented in defeasible deontic logic due to its
particular features, which cannot be supported by first-order logic. In this
work, we present a unified framework that supports efficient legal reasoning.
In the framework, a novel inference engine is proposed in which the Semantic
Rule Index can identify candidate rules together with their corresponding
semantic rules, if any, and an inference controller is able to guide the
executions of queries and reasoning. It can eliminate rules that cannot be
fired, avoiding unnecessary computations in early stages. The experiments
demonstrate the effectiveness and efficiency of the proposed framework.
Keywords: Rule-based Legal Reasoning, Query Processing, Index,
Integration
1. Introduction
Normative systems can be understood as a set of norms, where each norm can be
represented in an "IF precondition THEN conclusion" structure: the IF part
represents the precondition of applicability of the norm and the THEN part
corresponds to the normative conclusion of the norm [1, 2]. Accordingly,
rule-based systems provide an adequate framework for the representation of
norms, normative systems and legal knowledge (see, for example, [3, 4, 5] for
some rule-based frameworks for legal reasoning).
Typically, a rule-based knowledge system consists of three main components: a
set of rules (encoding the norms and principles to be used to perform the
required reasoning), facts to be fed to the reasoning corresponding to the
data of a case, and an inference engine. In general, for many applications,
the data needed to run the application is stored in (relational) databases.
Relational databases essentially represent knowledge in a first-order based
formalism, and query languages mostly exploit first-order logic features.
However, legal knowledge has some particular features that make first-order
logic not fully suitable to represent it [6]. In general, the proper
representation of norms and legal knowledge requires:

• defeasible reasoning [7, 8], and

• reasoning about and with deontic concepts [9].
We use rules converted from the New South Wales mandatory reporter guidelines
[10] as an example.

Example 1.1. Consider the provision prescribing to report to Community
Services (CS): if there is a situation where a child has been sexually
abused, the initiating person has continuing or imminent contact with the
victim, and there was some coercion or the victim is in a situation of
inferiority, then the situation has to be reported immediately to Community
Services (CS). Otherwise, the normal procedure is to file a formal report to
CS. In case of a problematic behaviour without further conditions, the
educator has to continue to monitor the situation. The above legal
requirement can be formally represented by the following rules [3]:
r1: SexualBehaviourVsOther ∧ CoercionOrInferior ∧ ContactWithVictim
    ⇒ [OANP]reportToCSImmediately
r2: SexualBehaviourVsOther ∧ CoercionOrInferior
    ∧ ¬[OANP]reportToCSImmediately ⇒ [OANP]reportToCS
r3: PersistentSexualBehaviourVsOther ∧ CoercionOrInferior
    ∧ ¬[OANP]reportToCSImmediately ∧ ¬[OANP]reportToCS
    ⇒ [OANP]consultWithCWU ⊗ [OM]monitor

where ⇒ indicates that the rule is a defeasible rule that can be defeated by
contrary evidence, [OANP] and [OM] are deontic operators, and the operator ⊗
in a conclusion is used for expressing a preference ordering (as in this
case) for alternatives, or a reparation chain in which, if an obligation
(e.g. consultWithCWU in r3) is violated, the violation can be compensated by
the next obligation (e.g. monitor).
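To make the structure of such rules concrete, the following Java sketch (our
illustration, not the authors' implementation; all type and field names are
assumptions) shows one possible in-memory representation, with an antecedent
of plain or modal literals and a reparation-chain conclusion:

import java.util.List;

// A minimal sketch of a defeasible deontic rule. "OANP", "OM", etc. are the
// deontic modalities from the text; a null modality denotes a plain literal.
record Literal(String name, boolean negated, String modality) {}

record Rule(String id,
            List<Literal> antecedent,      // conjunction of (modal) literals
            List<Literal> reparationChain  // [O1]c1 ⊗ [O2]c2 ⊗ ...
) {}

class RuleExample {
    // r3 from Example 1.1, transcribed into the sketch above.
    static final Rule R3 = new Rule("r3",
        List.of(new Literal("PersistentSexualBehaviourVsOther", false, null),
                new Literal("CoercionOrInferior", false, null),
                new Literal("reportToCSImmediately", true, "OANP"),
                new Literal("reportToCS", true, "OANP")),
        List.of(new Literal("consultWithCWU", false, "OANP"),
                new Literal("monitor", false, "OM")));
}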
By the above example, we can see that rule-based legal knowledge systems have
their own features compared with other rule-based knowledge systems:

• Logic: Defeasible logic with deontic operators is used to model legal
rules. Most rule-based systems consider strict rules only, meaning that
whenever a precondition is indisputable, then so is the conclusion. A legal
knowledge-based system, however, involves not only strict rules but also
defeasible rules and defeaters (see Section 2 for details). This indicates
that the semantic relations among rules are more complex in a rule-based
legal system than in a general rule-based system.

• Precondition: A set of predicates forms the precondition of a rule. Each
predicate name represents a legal term in a legal system, and the predicate
corresponds to a simple or complex database query that may involve multiple
selections, projections and joins across tables and databases. Hence, a
precondition can be represented by a set of DB queries. In general rule-based
systems, by contrast, each predicate corresponds to a database attribute, so
a precondition represents only one DB query in a "Select-From-Where" form,
not a set of queries with richer semantics to describe preconditions as legal
rules do.

• Conclusion: Given the same precondition, the conclusion of a rule could be
a reparation chain in which, if an obligation is violated, the violation can
be compensated by the next obligation and so forth using the ⊗ operator
(e.g. r3 in Example 1.1), whereas in general rule-based systems a conclusion
is indisputable.
The process of determining which rules should be applied and how they should
be interpreted is often referred to as legal reasoning. To allow
knowledge-based systems to support large-scale legal reasoning for decision
making, and even to be able to explain or legally justify the conclusions
reached, it is critical to understand how reasoning is achieved. In general,
there are two main strategies of rule-based reasoning: (a) forward chaining,
which starts with existing facts and applies rules to derive all possible
facts, and (b) backward chaining, which starts with the desired conclusion
and works backwards to find supporting facts. There are some challenges that
must be overcome for large-scale legal reasoning due to its own
characteristics:

• Complex semantic relations: Rules in the legal domain have not only
dependency relationships but also defeater relationships. Together with the
reparation chain situation, they introduce more dynamics during the reasoning
process, which essentially leads to a performance challenge.

• Inference efficiency: Rule matching determines how to match facts stored in
databases with rules. It is a crucial issue that influences the efficiency of
reasoning. A well-defined index may mean the difference between hours and a
few seconds in rule matching [11]. In many rule-based systems, the
precondition of a rule is constructed from predicates that directly
correspond to attributes in relational databases. Indexes are then built over
those attributes to facilitate fast rule matching. However, given a large
number of legal rules and complex predicates, it is impossible for the
existing methods to build an index by decomposing each predicate to the
attribute level, due to huge memory consumption and the complex relations
among rules. Moreover, the whole index may need to be updated if a database
relation changes.

• Recursive problem: When modelling real-world applications, some rules
depend on other rules. In Example 1.1, r3 is dependent on r1 and r2 because
two of its predicates depend on the conclusions of r1 and r2 respectively,
and r2 is in turn dependent on r1. Agrawal et al. show that recursive rules
can be converted into solving the transitive closure on the database
relations [12]. Due to the limitations of the attribute-based approach
mentioned above, new methods that do not rely on attributes and database
relations are needed to address the recursive problem for legal reasoning.

• Reactive inference: In the current reasoning process, the inference engine
looks for rules which match facts stored in the working memory or provided by
users. One rule is selected from the "conflict set" and executed to generate
a new fact. The inference engine then continues the reasoning based on the
new fact together with the previously given facts. We call this reactive
inference because the inference engine only reasons over what is given. In a
Big Data environment, it is impossible for users to know all the facts in
advance.

• Database query efficiency: Most of the existing work focuses on reducing
the response time for rule matching but neglects the query execution time and
the reasoning time. [13] is the only work that studied the query efficiency
problem between databases and rule-based legal systems. Its empirical
experiments suggest that query executions are the most expensive processes in
rule-based legal decision making. However, how to map queries to the
corresponding rules is not studied in that work.
In this paper, we study the integration problem between database systems and
rule-based legal systems. Given a set of queries/predicates that users are
interested in, we present a unified framework with the aim of minimizing the
overall response time for legal reasoning. In the framework, a novel
inference engine is proposed in which the Semantic Rule Index can identify
candidate rules together with their corresponding semantic rules, if any, and
an inference controller is able to guide the executions of queries and
reasoning. It can eliminate rules that cannot be fired, avoiding unnecessary
computations in early stages. To the best of our knowledge, this is the first
work that provides a seamless integration between the inference engine and
databases for rule-based legal reasoning. The contributions are summarised as
follows:

• We formally define the "Rule Containment Query" problem: given a set of
predicates, it returns both the strict rules and the defeasible rules that
can be fired, as well as their semantic rules, which explain why those rules
are fired.

• A novel inference engine for legal reasoning is proposed. It includes:

  – The two-layer Semantic Rule Index, which is able to identify candidate
  rules efficiently. It also establishes the semantic relationships among
  rules to solve the recursive problem.

  – An inference controller that guides reasoning and queries to avoid
  unnecessary query and rule computations.

• Database caching techniques are adopted in the unified framework to improve
the overall database query performance, which further reduces query response
times.

• The experiments conducted demonstrate the effectiveness and efficiency of
the proposed framework.
The rest of the paper is organized as follows. We introduce Defeasible Logic
and Defeasible Deontic Logic in Section 2. In Section 3, we formally define
the problem. The framework with the novel inference engine and database
caching techniques is presented in Section 4. Evaluation results are shown in
Section 5. We review related work in Section 6. Finally, Section 7 concludes
the work.
2. Defeasible Logic
Legal reasoning can be viewed as rule-guided² activities and processes
involving series of actions leading to a legal decision [14]. Indeed,
Defeasible Logic (DL) is a formalism that has been successfully used for
legal reasoning to generate a legal decision (and it has been proved that
other formalisms successful in legal reasoning correspond to variants of DL
[15]). Defeasible Deontic Logic has been successfully used for applications
in legal reasoning [16, 17, 18, 19], and it has been shown that it does not
suffer from the problems affecting other logics used for reasoning about
norms and compliance [20, 19]. Thus Defeasible Deontic Logic is a
conceptually sound approach for the representation of regulations and, at the
same time, it offers a computationally feasible environment to reason about
them ([21] proved that the logic is computationally feasible, since we can
compute the extension of a theory in linear time).

Defeasible Logic, a "skeptical" nonmonotonic logic (meaning that it does not
support contradictory conclusions), was originally proposed by Donald Nute
[22]. Since then it has been widely used in the legal domain or closely
related areas, such as modelling regulations [23], e-contracting [17, 24],
business process compliance [25, 26] and automatic negotiation systems [27].
The modelling of regulations in DL also offers support for decision support,
explanation, anomaly detection, hypothetical reasoning and debugging tasks.
Decision support is used to infer a correct answer from given rules and
regulations. DL is one of the possible solutions, since regulations may
contradict one another: defeasible rules are not necessarily in force;
instead, they may be blocked by other rules with contrary conclusions [23].

² Rules are encoded according to the legal requirements described in legal
documents.
A defeasible theory D (a knowledge base in defeasible logic, or a defeasible
logic program [28]) consists of five different kinds of knowledge: facts,
strict rules, defeasible rules, defeaters, and a superiority relation. D is a
triple (F, R, >) where F and R are finite sets of facts and rules
respectively, and > is a superiority relation on R.

The language of DL consists of a finite set of literals, where a literal is
either an atomic proposition or its negation. Given a literal l, ∼l denotes
its complement. That is, if l = p then ∼l = ¬p, and if l = ¬p then ∼l = p.

Facts are logical statements describing indisputable facts, represented
either in the form of states of affairs (a literal or a modal literal, see
Section 2.1) or actions that have been performed, and are considered to be
always true. For example, "John is a human" is represented by: human(John).
A rule r, on the other hand, describes the relations between a set of
literals (the precondition A(r), which can be empty) and a literal (the
conclusion C(r)). We can specify the strength of the rule relation using the
three kinds of rules supported by DL, namely: strict, defeasible, and
defeater.

Strict rules are rules in the classical sense: whenever the premises are
indisputable (e.g. a fact) then so is the conclusion. For example,

human(X) → mammal(X)

which means "Every human is a mammal".

It is worth mentioning that strict rules with an empty precondition can be
interpreted the same way as facts. However, in practice, facts are more
likely to be used to describe contextual information, while rules are more
likely to be used to represent the reasoning underlying the context.

Defeasible rules are rules that can be defeated by contrary evidence. For
example, typically mammals cannot fly, written formally:

mammal(X) ⇒ ¬flies(X)
The idea is that if we know that X is a mammal, then we may conclude that it
cannot fly, unless there is other, not defeated, evidence suggesting that it
may fly (for example, that the mammal is a bat). A defeasible rule with an
empty precondition can be considered as a presumption.

Defeaters are rules that cannot be used, on their own, to draw any
conclusions. Their only use is to prevent some conclusions, i.e., to defeat
some defeasible rules by producing evidence to the contrary. For example, the
rule:

heavy(X) ⇝ ¬flies(X)

states that the fact that an animal is heavy is not sufficient to conclude
that it does not fly. It is only evidence against the conclusion that a heavy
animal flies. In other words, we do not wish to conclude ¬flies if heavy; we
simply want to prevent a conclusion flies.
A full definition of the proof theory can be found in [16, 29]. Roughly, the
rules with conclusion p form a team that competes with the team consisting of
the rules with conclusion ¬p. If the former team wins, p is defeasibly
provable, whereas if the opposing team wins, p is non-provable. To conclude,
let us consider D as a theory in DL (as described above). A conclusion of D
is a tagged literal and can have one of the following four forms: +Δq,
meaning that q is definitely provable in D (i.e. using only facts and strict
rules); −Δq, meaning that we have proved that q is not definitely provable in
D; +∂q, meaning that q is defeasibly provable in D; and −∂q, meaning that we
have proved that q is not defeasibly provable in D.

Strict derivations are obtained by forward chaining of strict rules, while a
defeasible conclusion p can be derived if there is a rule whose conclusion is
p, whose prerequisites (precondition) have either already been proved or are
given in the case at hand (i.e., facts), and any stronger rule whose
conclusion is ¬p has a precondition that fails to be derived. In other words,
a conclusion p is (defeasibly) derivable when: p is a fact, or there is an
applicable strict or defeasible rule for p, and either all the rules for ¬p
are discarded (i.e., not applicable) or every rule for ¬p is weaker than an
applicable rule for p.
2.1. Defeasible Deontic Logic

It has been argued that legal reasoning requires two types of rules:
constitutive rules and prescriptive rules. Constitutive rules are used to
model the definitions of terms and parameters specific to legal documents,
for example, the definitions of terms in an act, whereas prescriptive rules
are applied for encoding the obligations, prohibitions, permissions, . . . ,
and the conditions under which they enter into force according to a specific
legal document. To correctly model the provisions corresponding to
prescriptive norms, we have to supplement the language with deontic
operators. In this respect we follow the classification proposed by [30, 31].
In addition, the logic has mechanisms to terminate and remove obligations
(see [26] for full details). For obligations and permissions we use the
following notation:

• [P]p: p is permitted;

• [OM]p: there is a maintenance obligation for p;³

• [OAPP]p: there is an achievement preemptive and perduring obligation for p;

• [OAPNP]p: there is an achievement preemptive and non-perduring obligation
for p;

• [OANPP]p: there is an achievement non-preemptive and perduring obligation
for p;

• [OANPNP]p: there is an achievement non-preemptive and non-perduring
obligation for p.
Compensations are implemented based on the notion of 'reparation chain' [32].
A reparation chain is an expression

[O1]c1 ⊗ [O2]c2 ⊗ · · · ⊗ [On]cn,

where each [Oi] is an obligation, and each ci is the content of the
obligation (modelled by a literal). The meaning of a reparation chain is that
c1 is obligatory, but if the obligation of c1 is violated, i.e., we have ¬c1,
then the violation is compensated by c2 (which is then obligatory). But if
even [O2]c2 is violated, then this violation is compensated by c3 which,
after the violation of c2, becomes obligatory, and so on.

³ Prohibitions can be expressed as maintenance obligations with a negated
content, i.e., [OM]¬p.
Defeasible Deontic Logic allows deontic expressions (but not reparation
chains) to appear in the body of rules. Thus we can have rules like:

restaurant, [P]sellAlcohol ⇒ [OM]showLicense ⊗ [OANPP]payFine

The rule above means that if a restaurant has a license to sell alcohol
(i.e., it is permitted to sell it, [P]sellAlcohol), then it has a maintenance
obligation to expose the license ([OM]showLicense); if it does not, then it
has to pay a fine ([OANPP]payFine). The obligation to pay the fine is
non-preemptive (meaning that it cannot be paid before the violation). The
logic is equipped with a binary relation over rules, called the superiority
relation, that allows us to handle rules with conflicting conclusions: for
example, a rule r setting a general prohibition and a second rule s that
derogates the prohibition, permitting the conclusion. This type of situation
is common in legal reasoning and can be modelled by saying that s is
"stronger" than r, in symbols s > r. If both rules apply, we say that s
defeats r. For example, continuing the restaurant example above, we can have
the rules

r1: restaurant ⇒ [OM]¬sellAlcohol
r2: restaurant, license ⇒ [P]sellAlcohol
r2 > r1

The first rule (r1) prescribes the general prohibition for a restaurant to
sell alcohol, and the second rule (r2), in conjunction with the superiority
relation, derogates the prohibition, permitting the sale if the restaurant
has a license to sell alcohol.

For a full description of the logic and its features, see [17, 26, 3].
The reasoning to determine what obligations, prohibitions, and permissions
are derivable from a set of facts and a set of rules is as follows.

An obligation [O]p (where [O], [Ox] and [Dy], in the description below, are
placeholders for the obligations described above) is derivable if:

1. [O]p is given as one of the facts, or

2. there is a rule

r: a1, . . . , an ⇒ [O1]p1 ⊗ . . . ⊗ [Om]pm ⊗ [O]p ⊗ . . .

such that

(a) for all 1 ≤ i ≤ n, ai is provable, and

(b) for all 1 ≤ j ≤ m, [Oj]pj and ¬pj are provable, and

(c) for all rules

s: b1, . . . , bk ⇒ [D1]q1 ⊗ . . . ⊗ [Dl]ql ⊗ [D]p′

such that p′ is the negation of p, either

i. there exists 1 ≤ i ≤ k such that bi is not provable, or

ii. there exists 1 ≤ j ≤ l such that either [Dj]qj or ¬qj is not provable, or

iii. r defeats s.

The idea is that there must be a rule that fires: all the elements in the
antecedent are provable (a), and, in case the conclusion is an obligation for
a reparation, all the obligations before it have to be violated. Thus, the
violated obligations were in force (the obligations were provable) and we
have evidence that they were violated (the negation of the content of each
violated obligation is provable) (b). Also, we have to ensure that no rules
for the opposite fire (c), and if they do, these rules are weaker than the
rule for the obligation we want to conclude.

For permission, we have the same conditions, but we use [P]p instead of [O]p;
also, we conclude [P]p if we can conclude [O]p. Due to space reasons, readers
interested in the semantics, deontic operator conversions, conflict
detection, conflict resolution, and the algorithm implementing this
rule-based system are referred to [33, 21, 26, 3] for details.
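As a concrete reading of conditions (a)–(c), the following Java sketch is our
simplification, not the paper's algorithm: it assumes literals are plain
strings, that a set `provable` holds everything already established
(including modal literals written as "[O]p"), and it collapses conditions i.
and ii. into a single body-provability test.

import java.util.*;

class DeonticDerivation {
    // chain: the contents c1..cn of the rule's reparation chain, in order.
    record Rule(String id, List<String> body, List<String> chain) {}

    static boolean obligationDerivable(String p, List<Rule> rules,
                                       Set<String> provable,
                                       Map<String, Set<String>> defeats) {
        if (provable.contains("[O]" + p)) return true;      // condition 1: given as a fact
        for (Rule r : rules) {                              // condition 2
            int k = r.chain().indexOf(p);
            if (k < 0 || !provable.containsAll(r.body())) continue;  // (a)
            boolean priorViolated = true;                   // (b): earlier obligations violated
            for (int j = 0; j < k && priorViolated; j++) {
                String cj = r.chain().get(j);
                priorViolated = provable.contains("[O]" + cj) && provable.contains(neg(cj));
            }
            if (!priorViolated) continue;
            boolean opposersFail = true;                    // (c): rules for ¬p discarded or defeated
            for (Rule s : rules) {
                if (!s.chain().contains(neg(p))) continue;
                boolean sFires = provable.containsAll(s.body());  // i./ii. collapsed in this sketch
                boolean rDefeatsS = defeats.getOrDefault(r.id(), Set.of()).contains(s.id()); // iii.
                if (sFires && !rDefeatsS) { opposersFail = false; break; }
            }
            if (opposersFail) return true;
        }
        return false;
    }

    static String neg(String l) { return l.startsWith("¬") ? l.substring(1) : "¬" + l; }
}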
3. Problem Statement
Throughout the discussion in the previous sections, we can see that
predicates and literals have the same meaning. In the rest of the article, we
refer to them interchangeably based on the context.

A rule can be fired only if all the literals in its antecedent are provable.
We classify literals into two types based on how they are derived:

Definition 3.1 (Fact Literal). A literal in a rule's antecedent that provides
indisputable facts through an SQL query statement is a fact literal.

Definition 3.2 (Dependent Literal). A literal in a rule's antecedent that is
dependent on another rule's conclusion is a dependent literal.

The semantic relationships between two rules can be defined as:

Definition 3.3 (Support Rule). Rule rj is rule ri's support rule if ri has a
dependent literal on rj.

Definition 3.4 (Defeat Rule). Rule rj is rule ri's defeat rule if ri < rj.

Example 3.1. Figure 1 shows an example of a rule set. Among the literals of
all the rules' antecedents, pi (i ∈ 1..7) and ¬p4 are fact literals. ¬d1, ¬d2
and d5 are dependent literals because they are dependent on the conclusions
of rules r2, r3 and r5 respectively.
r1: p1 ∧ [O]¬d1 ∧ [O]¬d2 ⇒ [O]d3
r2: p3 ∧ p4 ∧ p5 ⇒ [O]¬d1
r3: p3 ∧ p5 ∧ [O]¬d1 ⇒ [O]¬d2 ⊗ [O]d7
r4: p1 ∧ p2 ∧ [O]¬d1 ⇒ [O]d4
r5: p5 ⇒ [O]d5
r6: p5 ∧ p7 ∧ ¬p4 ∧ [O]d5 ⇒ [O]¬d7
r7: p3 ∧ p6 ⇒ [O]d6
r3 < r6

Figure 1: A Rule Set Example
Example 3.2. Rule r1 in Figure 1 cannot be fired unless fact literal p1 is
true and ¬d1 and ¬d2 are provable. Since ¬d1 and ¬d2 are dependent literals,
we have to examine r1's support rules r2 and r3. At the same time, r2 is also
the support rule of r3. This is an example of the recursive rule problem
mentioned above. Furthermore, since r6 is a defeat rule of r3, r6 needs to be
reasoned about as well, to decide whether ¬d7 is derivable, which would
violate d7 in r3 in case ¬d2 in r3 is violated.
Next, we formally define the problems.

Definition 3.5 (Rule Containment Query (RCQ)). Given a query set
Q = {q1, q2, ..., qx} and a rule set R = {r1, r2, ..., rm}, where
Fr = {p1, p2, ..., pn} is the fact literal set of r (r ∈ R), the Rule
Containment Query returns all the rules RQ = {r | Fr ⊆ Q ∧ r ∈ R} that can be
fired, as well as their corresponding support rules and/or defeat rules.

By the above definition, users can apply an RCQ to a set of legal literals of
interest instead of using a set of all pre-known facts. We need to address
the recursive rule problem to decide whether a rule can be fired given a
query set. The sets of support rules and defeat rules provide an explanation
of the reasons why the rules are fired. Furthermore, an RCQ could involve
both backward chaining and forward chaining, since we want to identify all
the rules that can be fired.

Based on users' interests, we also define the Rule Intersection Query problem
as follows. It has a more relaxed constraint on rules compared with the Rule
Containment Query.

Definition 3.6 (Rule Intersection Query). Given a query set
Q = {q1, q2, ..., qx} and a rule set R = {r1, r2, ..., rm}, where
Fr = {p1, p2, ..., pn} is the fact literal set of r (r ∈ R), the Rule
Intersection Query returns all the rules RQ = {r | ∃p ∈ Fr, p ∈ Q ∧ r ∈ R}
that can be fired, with their corresponding support rules and/or defeat
rules.

In this paper, we focus on the containment query. The principles can be
adapted to address the intersection query easily.
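Before introducing the index, note that a naive baseline for the containment
test (our sketch, for illustration only; the types are assumptions) simply
scans every rule and checks Fr ⊆ Q. The index in Section 4 exists precisely
to avoid this linear scan and the subsequent recursive chasing of support and
defeat rules:

import java.util.*;

class NaiveRcq {
    // factLiterals: rule ID -> F_r; returns the rules with F_r ⊆ Q.
    // Support and defeat rules would still have to be chased recursively afterwards.
    static <L> List<String> containedRules(Map<String, Set<L>> factLiterals, Set<L> q) {
        List<String> fired = new ArrayList<>();
        for (Map.Entry<String, Set<L>> e : factLiterals.entrySet())
            if (q.containsAll(e.getValue()))   // the containment test F_r ⊆ Q
                fired.add(e.getKey());
        return fired;
    }
}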
4. System Framework
For a rule-based system to support decision-making activities drawn from
existing records stored in various databases and relevant legal requirements,
it is important that the two systems, database systems and rule systems, can
interact with each other but also work independently. In this section, we
present a novel unified framework that integrates the two systems seamlessly
for legal reasoning and answers the Rule Containment Query efficiently.

Given a query set Q that represents users' interests and a rule set R, the
overall response time for users receiving reasoning results is influenced by
three components: the time for identifying relevant rules, the query
execution time to compute facts, and the reasoning time. A straightforward
method is, for every rule r ∈ R: (a) compute whether Fr is contained in Q,
(b) execute all the fact literals p (p ∈ Fr) represented by SQL statements to
generate facts, (c) identify its support rules to decide whether its
dependent literals are provable, which is a recursive process, (d) send the
rule, the computed fact literals and the dependent literals to an inference
engine to conduct reasoning, and (e) check whether its defeat rules, if any,
are provable, which may itself again involve the recursive problem. All the
steps, apart from step (d) which is reasonably fast, require expensive
processing.
We propose a novel framework for rule-based legal reasoning that has the
following advantages. First, by incorporating a two-layer Semantic Rule Index
in the inference engine, we are able to search efficiently for candidate
rules based on the literals provided by users, as well as for their
corresponding support rules and defeat rules. Second, the Inference
Controller designed for the inference engine can use intermediate query
results and reasoning results to remove rules that cannot be fired in early
stages. It avoids unnecessary query computations and reasoning, which further
improves the performance significantly. Third, the framework allows for the
adoption of database caching strategies to reduce the response time.

Figure 2 shows the overall rule-based framework. The three main components
are: the Semantic Rule Index (SRI), the Inference Controller and the Database
Query.
[Figure 2 depicts the framework: a Rule Containment Query enters the
Inference Engine, whose Semantic Rule Index (Fact Literal Trie and Semantic
Rule Graph) and Inference Controller interact with the Rule System, the Rule
Interpreter, the Working Memory and the Database Query component (Predicates,
Cache and remote DBs), returning all triggered rules and explanations.]

Figure 2: The Rule-based Framework for Legal Reasoning
4.1. The Semantic Rule Index
Based on Definition 3.5, given a query set Q, the Rule Containment Query
problem seeks all the rules that can be fired. Our goal is to design an index
structure to efficiently solve the lookup problem as well as the rule
recursion problem. The Semantic Rule Index has two layers: the Fact Literal
Trie and the Semantic Rule Graph.

Let FR = ⋃_{r∈R} Fr be the set of distinct fact literals of all rules in R,
where Fr is the set of rule r's fact literals in its antecedent. Let f be a
bijective mapping f : FR → I, where I = {1, 2, 3, ..., |FR|}. By f, we assign
every fact literal p ∈ FR a unique ID. Since a precondition is a conjunction
of literals, all fact literals of a rule can be represented by a set of IDs
in ascending order, so we can work directly with the IDs for indexing
purposes.
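A direct way to realize the bijection f and the sorted ID encoding is
sketched below (the class and method names are ours, not the paper's):

import java.util.*;

class LiteralIds {
    private final Map<String, Integer> id = new LinkedHashMap<>();

    // f: assign IDs 1, 2, ... in first-seen order (any fixed bijection works).
    int idOf(String literal) { return id.computeIfAbsent(literal, l -> id.size() + 1); }

    // Encode a rule's fact literals as IDs in ascending order,
    // the form required by the Fact Literal Trie below.
    List<Integer> encode(Collection<String> factLiterals) {
        List<Integer> ids = new ArrayList<>();
        for (String l : factLiterals) ids.add(idOf(l));
        Collections.sort(ids);
        return ids;
    }
}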
The Fact Literal Trie

A data structure for fast superset and subset queries, named set-trie, is
presented in [34]. A set-trie is a tree storing a set of words, each word
represented by a path from the root of the set-trie to a node, corresponding
to the indices of the elements of the word. Next we show how we adapt this
data structure to solve our lookup problem.

The Fact Literal Trie (FLT) is a tree-based data structure built from all
fact literals' IDs with the properties of a set-trie. It has a root node with
key {}, and all its child nodes have literal IDs k as keys, where
k ∈ {1, 2, ..., |FR|}. A node k has ordered children with unique keys j in
ascending order, where ∀j, j > k. Therefore, a rule's fact literals can be
uniquely represented by a path in the FLT. The fact literal with the largest
ID of each rule has the rule's ID associated with it.
[Figure 3(a) repeats the rule set of Figure 1. Figure 3(b) shows the
corresponding Semantic Rule Index: the Fact Literal Trie over literal IDs
1–8 with rule IDs r1–r6 attached to the ends of their paths, and the Semantic
Rule Graph with support edges labelled by dependent literals (¬d1, ¬d2, d5)
and a defeat edge labelled (<, d7).]

Figure 3: An Example of a Rule Set and its Rule Index
Example 4.1. The upper part of Figure 3(b) shows the Fact Literal Trie for
the rule set in Figure 3(a), where I = {1, 2, ..., 8} represents
{p1, ..., p7, ¬p4} respectively. The path {} → 1 → 2 in the FLT represents
r4's fact literals. r2, r3 and r7 share the common path {} → 3, which
indicates that they all have fact literal p3 in their preconditions. 8 is the
largest fact literal ID of r6, so rule ID r6 is associated with it.

The FLT construction is similar to that in [34], except that it deals with
literals rather than words. To be self-contained, we present the complete FLT
construction method in Algorithms 1 and 2.
Algorithm 1: FactLiteralTrieConstruction
Input:  Rule set R
Output: Fact Literal Trie rootNode
1  create rootNode with key {};
2  foreach r ∈ R do
3      sort r.getFr() in ascending order;
4      node ← rootNode;
5      RuleInsertion(node, r);
6  return rootNode

To guarantee the property that all children's keys are larger than that of
their parent, we pre-sort the literal IDs k of each rule in ascending order
(Line 3 in Algorithm 1). Therefore, in the RuleInsertion method, a node with
a smaller key is always inserted before a node with a larger key. Later we
will show that, by this feature, the FLT is able to filter irrelevant rules
efficiently given a query.
Algorithm 2: RuleInsertion
Input:  node, r
Output: Fact Literal Trie rootNode
1  if r.getFr().getCurrentP() ≠ null then
2      if there exists a child of node with key k = r.getFr().getCurrentP() then
3          nextNode ← child of node with key k;
4      else
5          nextNode ← create child of node with key k;
6      RuleInsertion(nextNode, r.getFr().getNextP());
7  else
8      node.setRuleID(r.getID());
9  return

In Algorithm 2, we first check whether all the fact literals p of rule r have
been visited (Lines 1–7). Lines 2–5 build a new child node to hold key k if k
does not exist among the children of the current node; otherwise, the
algorithm is ready to examine the next literal based on the child of the
current node with key k (Line 6). If all the fact literals have been inserted
into the trie, the corresponding rule ID is assigned to the literal with the
largest literal ID (Line 8).
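In Java, Algorithms 1 and 2 amount to a few lines over a sorted child map. In
the sketch below (ours; a TreeMap keeps the children's keys in ascending
order, which the search in Algorithm 3 relies on):

import java.util.*;

class TrieNode {
    final TreeMap<Integer, TrieNode> children = new TreeMap<>(); // ordered child keys
    String ruleId;              // set on the node holding the rule's largest literal ID

    // Insert a rule whose fact-literal IDs are pre-sorted ascending (Algorithm 1, Line 3).
    void insert(List<Integer> sortedIds, int pos, String ruleId) {
        if (pos == sortedIds.size()) { this.ruleId = ruleId; return; } // Algorithm 2, Line 8
        children.computeIfAbsent(sortedIds.get(pos), k -> new TrieNode())
                .insert(sortedIds, pos + 1, ruleId);
    }
}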
The Semantic Rule Graph

A Semantic Rule Graph (SRG) is a labelled directed acyclic graph
g = (V, E, L(v), L(e)), where V is a set of vertices, each vertex v
representing a rule, and L(v) is a vertex labelling function,
L(v) ∈ {r1, ..., r|R|}. E ⊆ V × V is a set of directed edges, each edge
<vi, vj> representing a semantic relationship between ri and rj. L(vi, vj) is
an edge labelling function, where L(vi, vj) = d if rj is a support rule of ri
and d is the corresponding dependent literal, or L(vi, vj) = (<, d′) if rj is
ri's defeat rule and d′ is an obligation in ri's conclusion, indicating that
rj has ¬d′ in its conclusion.

The SRG models the support and defeat relationships among rules. Given a rule
ri, all the vertices that can be reached directly from ri following the
edges' directions are its support rules or defeat rules. All the vertices
that can reach ri against the edges' directions are rules that have ri as a
support rule or defeat rule. The recursive rule problem is solved through
graph traversal. Note that the Semantic Rule Graph constructed may consist of
a set of graphs, in which each graph captures the rules that have semantic
relations among them.
Example 4.2. The lower part of Figure 3(b) shows an example of the Semantic
Rule Graph. By the graph, we can see that r2 is the support rule of r1, r3
and r4, because each of them has a path to r2. Similarly, r3 is the support
rule of r1. r6 is the defeat rule of r3, and r5 is r6's support rule. Since
r7 does not have any semantic relationship with the rest of the rules, it is
not present in the Semantic Rule Graph.
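One possible encoding of the labelled SRG is sketched below (ours, not the
paper's; sealed interfaces and records require Java 17):

import java.util.*;

class SemanticRuleGraph {
    sealed interface Label permits Support, Defeat {}
    record Support(String dependentLiteral) implements Label {}  // L(vi, vj) = d
    record Defeat(String obligation) implements Label {}         // L(vi, vj) = (<, d′)

    // out.get(ri) maps each rj reachable directly from ri to the edge label,
    // i.e., rj supports or defeats ri.
    final Map<String, Map<String, Label>> out = new HashMap<>();

    void addEdge(String ri, String rj, Label label) {
        out.computeIfAbsent(ri, k -> new HashMap<>()).put(rj, label);
    }
}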
4.2. Querying the Semantic Rule Index
As introduced in Section 2.1, for defeasible deontic logic, a rule r can be
fired if and only if: (a) all its fact literals are provable, (b) all its
dependent literals are provable, and (c) the rules for the opposite
conclusion are either defeated or do not fire. Based on the Semantic Rule
Index constructed, the query algorithm aims at identifying the candidate
rules and their corresponding semantic rules efficiently.

Given a query Q, there are two steps to query the Semantic Rule Index. First,
we search the Fact Literal Trie for all the candidate rules whose fact
literals are contained in Q (Algorithm 3). Then the candidate rules are
pushed into the Semantic Rule Graph to identify their corresponding rules
that have semantic relations (Algorithm 4).
Fact Literal Trie Search

Algorithm 3: FactLiteralTrieSearch
Input:  Trie root node, Query Q
Output: CQ
1  if node.getRuleID() ≠ null then
2      CQ ← node.getRuleID();
3  if node.hasChild() == false or node.getKey() ≥ Q.getMaxID() then
4      return
5  foreach node's child c with key k ∈ Q do
6      node ← c;
7      FactLiteralTrieSearch(node, Q);

Algorithm 3 shows the key steps. The algorithm works in a recursive
depth-first search manner. It identifies the paths whose keys are all
contained in Q (Lines 5–7). The current recursion stops when it reaches an
FLT leaf node or the key of the node is greater than or equal to the maximum
key in Q (Lines 3–4). Recall that one of the main features of the Fact
Literal Trie is that the keys of children nodes are always larger than that
of their parent; therefore, there is no need to search the children of the
current node further. The rules whose fact literals are contained in Q can be
identified through the nodes with rule IDs that are visited by the algorithm
(Lines 1–2).
Example 4.3. Consider query Q = {1, 3, 5, 7} and the Fact Literal Trie in
Figure 3(b). Algorithm 3 first identifies path 1: {} → 1, path 2: {} → 3 and
path 3: {} → 5 as the candidate paths to be further examined. For path 1,
since r1 is associated with key 1, r1 can be added to the candidate result
set CQ immediately; since key 1's child key 2 is not contained in Q, there is
no need to examine its children further. Path 2 can be extended to
{} → 3 → 5, which yields r3. For path 3, r5 is associated with key 5 and is
added to CQ; the path extends to key 7, but since key 7's child key 8 is not
contained in Q, there is no need to examine it further and the path is pruned
straight away. The output of the trie search is CQ = {r1, r3, r5}.

In Example 4.3, we can see that there is no need to consider the children
paths once the current ID is not contained in Q. The algorithm prunes the
paths as early as possible, which leads to an efficient trie search.
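Continuing the TrieNode sketch above, the containment search can exploit the
ordered child keys to prune early (again a sketch under our naming; Q is a
sorted set of literal IDs):

import java.util.*;

class TrieSearch {
    // Collect the IDs of rules whose fact-literal sets are contained in q.
    static void search(TrieNode node, NavigableSet<Integer> q, List<String> out) {
        if (node.ruleId != null) out.add(node.ruleId);  // every key on this path is in q
        if (node.children.isEmpty() || q.isEmpty()) return;
        int max = q.last();
        // children are in ascending key order, so keys beyond max(q) cannot match
        for (Map.Entry<Integer, TrieNode> e : node.children.headMap(max, true).entrySet())
            if (q.contains(e.getKey()))
                search(e.getValue(), q, out);
    }
}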
Semantic Rule Graph Search

The Semantic Rule Graph search algorithm identifies all the rules that have
semantic relations with the candidate rules generated by Algorithm 3. For
each candidate rule ri, a semantic rule search starting from ri is conducted
on the Semantic Rule Graph. All the vertices rj that can be reached from ri
are its support rules or defeat rules, and all the vertices rk that can reach
ri are the rules that have ri as their support rule or defeat rule.
Algorithm 4: SemanticGraphSearch
Input:  a set of semantic rule graphs {G}, Candidate Rule Set CQ
Output: A set of subgraphs Crules
1  Set all vertices and edges in the corresponding G "Unvisited";
2  Crules ← null;
3  foreach r ∈ CQ do
4      if G.getLabel(r) = Unvisited then
5          SemanticRuleSearch(G, r, Crules);

Algorithm 4 seeks all the rules that have semantic relations with the
candidate rules. First, all the vertices and edges are set to Unvisited and
the candidate rule graph set Crules is initialized to null (Lines 1–2). For
each candidate rule r returned by the FLT search, the SemanticRuleSearch
algorithm is executed (Lines 4–20).

Algorithm 5: SemanticRuleSearch
Input:  Semantic Rule Graph G, vertex v, a set of subgraphs Crules
Output: the updated Crules by v
1   overlapVertex ← null;
2   newSubGraph = false;
3   G.set(v, Visited);
4   if v does not belong to any graph g ∈ Crules then
5       Graph g ← new Graph();
6       g.addVertex(v);
7       Crules ← g;
8       newSubGraph = true;
9   foreach e ∈ G.outgoingEdges(v)   // backward chaining
10  do
11      if G.getLabel(e) = Unvisited then
12          G.set(e, Visited);
13          w ← e.getTarget();
14          g.addEdge(v, w);
15          if G.getLabel(w) = Unvisited then
16              SemanticRuleSearch(G, w, Crules);
17      else
18          if newSubGraph = true then
19              overlapVertex.add(w);
20  Similarly, search all ingoing edges of v   // forward chaining;
21  if !overlapVertex.isEmpty() then
22      merge subgraphs in Crules that share the same vertex w ∈ overlapVertex;
The main idea of Algorithm 5 is that, during the traversal of the Semantic
Rule Graph G started from a given candidate rule ri, all the vertices and
edges reached from ri in G are the rules that need to be computed, and they
are marked as "Visited" (Lines 9–19). Since G will be searched again for
other candidate rules rj in CQ, we can stop traversing G when rj reaches a
vertex rk and/or an edge ek that is already marked "Visited": it means that
rj shares some semantic rules with ri, and the semantic relations starting
from rk/ek have been discovered before, so there is no need to traverse G
from rk/ek further. Otherwise, the recursive rule search (Line 15) will not
stop until it reaches rules that do not have any semantic relation with any
other rule (no outgoing/ingoing edge).

Second, a new graph g is formed to represent r's part of the semantic rules
if the rule currently being visited has not been discovered (it is marked
"Unvisited") by previous rules (Lines 4–8). Compared with the traditional
depth-first search algorithm, instead of returning a set of visited vertices
or a path, our algorithm needs to capture the full semantic relations among
the candidate rules, which are essentially a set of directed acyclic labelled
sub-graphs of the original Semantic Rule Graph. The newly generated g is put
in the Crules set.

Similarly, the algorithm traverses ri's ingoing edges (Line 20) as well, to
identify all potential rules that have ri as their support rule or defeat
rule.

Last, as mentioned above, since rj may share some semantic rules with ri, we
need to merge the two subgraphs together to represent the overall semantic
relations among the rules.
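A simplified reachability pass in the spirit of Algorithms 4–5 is sketched
below (ours; subgraph construction, the ingoing-edge search and subgraph
merging from the paper are omitted):

import java.util.*;

class SemanticSearch {
    // Collect every rule reachable from a candidate along outgoing
    // (support/defeat) edges of the SRG adjacency map `out`.
    static Set<String> reachable(Map<String, List<String>> out, String start) {
        Set<String> visited = new HashSet<>();
        Deque<String> stack = new ArrayDeque<>();
        stack.push(start);
        while (!stack.isEmpty()) {
            String v = stack.pop();
            if (!visited.add(v)) continue;   // each vertex handled at most once
            for (String w : out.getOrDefault(v, List.of())) stack.push(w);
        }
        return visited;
    }
}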
Example 4.4. Continuing Example 4.3, given query Q = {1, 3, 5, 7} and the
output of the FLT search CQ = {r1, r3, r5}, Figure 4(a) shows the semantic
rule search result. By the graph, we can see that to fire r1 we need to check
whether its support rules r3 and r2 can be fired. However, to fire r3, not
only its support rule r2 but also its defeat rule r6 needs to be checked.
Furthermore, r6 cannot be fired unless its support rule r5 is fired. In
summary, to check whether the candidate rules r1 and r3 can be fired, we
potentially need to examine r2, r6 and r5 as well. The algorithm also does
the forward chaining check: since r2 is a support rule of r4, r4 may be fired
if r2 is fired.
Figure 4: An Example of the Semantic Graph Search
The size of the candidate rule graph set |Crules| ≤ |CQ|. By the algorithm,
every vertex and edge in G is visited at most once, no matter how many
candidate rules are present. The algorithm therefore guarantees that the
search can be done in O(V + E) time, where V and E are the numbers of
vertices and edges contained in the Semantic Rule Graph. As stated before, V
is smaller than the total number of rules, since the graph only contains the
rules that have semantic relations with other rules.
4.3. The Inference Controller
To fire a rule, it is necessary to reason about its support rules and defeat
rules, if any. In other words, if a rule's support rules cannot be fired, its
corresponding dependent literal is not provable and therefore the rule cannot
be fired. Furthermore, if a rule's defeat rule concludes the opposite, the
rule cannot be fired either. Therefore, if not only the semantic relations
but also the ordering of the rules to be reasoned about can be identified, we
can avoid unnecessary computations, including both query executions and
reasoning.
The inference controller first sets the reasoning order among the rules by
topological sorting. The topological sorting problem is: given a digraph
G = (V, E), find a linear ordering of the vertices such that, for all edges
(v, w) ∈ E, v precedes w in the ordering. It has been proved that a directed
acyclic graph (DAG) can be topologically sorted, and that constructing a
topological sort of any DAG can be done in linear time; however, the solution
is not necessarily unique. Since the candidate rule graphs identified by
Algorithm 4 are DAGs, we are able to order the rules. Figure 4(b) shows the
corresponding topologically sorted candidate rule graph for Figure 4(a).
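A standard linear-time construction is Kahn's algorithm, sketched below
(ours, not the paper's code). Here an edge (v, w) means "v must precede w";
if the SRG edges point from a rule to its support/defeat rules, they should
be reversed first so that those rules come out earlier in the order:

import java.util.*;

class TopoSort {
    // Kahn's algorithm over a candidate rule graph; runs in O(V + E).
    static List<String> order(Set<String> vertices, Map<String, List<String>> edges) {
        Map<String, Integer> indeg = new HashMap<>();
        for (String v : vertices) indeg.put(v, 0);
        for (List<String> ws : edges.values())
            for (String w : ws) indeg.merge(w, 1, Integer::sum);
        Deque<String> ready = new ArrayDeque<>();
        indeg.forEach((v, d) -> { if (d == 0) ready.add(v); });
        List<String> result = new ArrayList<>();
        while (!ready.isEmpty()) {
            String v = ready.poll();
            result.add(v);
            for (String w : edges.getOrDefault(v, List.of()))
                if (indeg.merge(w, -1, Integer::sum) == 0) ready.add(w);
        }
        return result; // for a DAG, this contains all the vertices
    }
}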
By this order, the inference controller guides the reasoning and database
queries to avoid unnecessary computations. It is able to filter rules and
stop the graph traversal as early as possible, based on the query and
reasoning results. This essentially leads to identifying the rules that can
be fired efficiently. Another benefit of the algorithm is that not only are
unnecessary reasoning steps and query executions avoided, but the inference
processes for multiple candidate rules can also be done in one traversal of a
sorted candidate rule graph. Therefore, the queries and reasoning involved
are executed only once for multiple rules, which further reduces the overall
response time. Algorithm 6 shows the main idea.
Algorithm 6: InferenceController
Input:  Candidate Rule Graphs Crules, Candidate Rule Set CQ
Output: A set of rules that can be fired
1   Topological sort each subgraph in Crules; Ctrue ← null;
2   foreach candidate rule graph g ∈ Crules do
3       foreach r ∈ g, backward traversing g, do
4           if there are dependent literals d ∈ A(r) then
5               foreach d do
6                   Get d's corresponding rule r′; c ← get(Ctrue, r′);
7                   if !L(r, r′).equals(c) then
8                       g.update(r);
9                       if ∃r″ ∈ CQ and r″ ∈ g then
10                          Go to next r in g;
11          if there are fact literals f ∈ A(r) then
12              foreach f do
13                  Get f's SQL query q from Predicates, Execute q;
14                  if q returns false then
15                      g.update(r);
16                      if ∃r″ ∈ CQ and r″ ∈ g then
17                          Go to next r in g;
18          Get rule r from the Rule System;
19          Send r and A(r) to the Rule Interpreter; c ← r's reasoning result;
20          if r has defeat rule r′ on c then
21              c′ ← get(Ctrue, r′);
22              if ¬c′.equals(c) then
23                  g.update(r);
24                  if ∃r″ ∈ CQ and r″ ∈ g then
25                      Go to next r in g;
26          Ctrue ← Ctrue ∪ {(r, c)};
27  Find all the paths P in g that CQ reaches;
28  return P and Ctrue

The Inference Controller performs the reasoning by backward traversing each
candidate rule graph g ∈ Crules (Lines 2–26). This ensures that a candidate
rule's support rules and/or defeat rules are always computed first, by which
the number of unnecessary queries and reasoning steps can be minimized. All
the rules that can be fired, together with their consequences, are stored in
Ctrue. Checking whether a rule r can be fired involves three steps:

• First, all its dependent literals must be provable (Lines 4–10). If a
conclusion of one of its support rules r′ does not match the corresponding
dependent literal (Line 7), r cannot be fired. Note that, since there may be
a reparation chain in r′'s conclusion, it is necessary to check not only that
r′ can be fired, but also that r′'s conclusion makes r's corresponding
dependent literal provable. Furthermore, the inference control method takes a
proactive approach that updates g to remove all the rules that are dependent
on r if r cannot be fired (Line 8). The update algorithm backward traverses
the candidate rule graph and removes all vertices that have a dependency path
to r. Therefore, the algorithm removes rules that cannot be fired anyway as
early as possible, to avoid unnecessary computations. After g is updated, the
algorithm continues the computation if there are vertices left in g that have
not been tested (Lines 9–10).

• Second, all its fact literals must be provable (Lines 11–17). In the
framework, the Predicates database stores all fact literals' corresponding
SQL queries. The algorithm retrieves r's SQL queries from Predicates and
performs the query execution (Line 13; a minimal sketch of this step is shown
after this list). If any query returns false, the corresponding literal is
not provable and r cannot be fired. The algorithm again updates g to remove
all the impacted rules that cannot be fired (Lines 15–17).

• Last, if r's dependent literals and fact literals are all provable, we
retrieve rule r from the Rule System. r, with all the literals in its
precondition, is sent to the inference engine for reasoning (Lines 18–19).
Then the algorithm checks whether there is a defeat rule r′ that provides the
opposite conclusion (Lines 20–25). If r′ does not defeat r, then r is
successfully fired and saved in Ctrue (Line 26). The result can be used to
compute subsequent rules that have semantic relationships with r.
Through the above process, the algorithm is able to efficiently identify the
rules that can be fired. The algorithms also generate the corresponding support
rules and defeat rules to explain why the rules can be fired.
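As flagged in the second step above, evaluating one fact literal reduces to
fetching its stored SQL from the Predicates store and treating a non-empty
result as "provable". The sketch below is ours; the connection and the SQL
text are assumptions supplied by the caller.

import java.sql.*;

class FactLiteralEvaluator {
    // Returns true iff the literal's SQL query yields at least one row.
    static boolean provable(Connection db, String sql) throws SQLException {
        try (Statement st = db.createStatement();
             ResultSet rs = st.executeQuery(sql)) {
            return rs.next();   // non-empty result => the fact literal holds
        }
    }
}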
Example 4.5. Continuing Example 4.4, the Inference Controller starts with
rule r5. If r5 cannot be fired, or r5 can be fired but d5 is not provable
based on the reasoning of r5, we can conclude that r6 cannot be fired. The
sorted candidate rule graph is updated by removing vertices r5 and r6 and
edges r6 → r5 and r3 → r6. Next, the algorithm evaluates r2. If r2 cannot be
fired because some of its fact literals are not provable, all the vertices
that have paths leading to r2 can be removed from the sorted candidate rule
graph, meaning that no rule can be fired. By this example, we can see that
the response time can be improved by avoiding the querying and reasoning
times for r6, r3, r1 and r4.

Example 4.6. In Example 4.4, assume that both r5 and r6 can be fired and the
conclusion of r6 is ¬d7. This means r6 defeats r3. The algorithm then updates
the sorted candidate rule graph by removing vertices r3 and r1 and edges
r3 → r6, r1 → r3, r1 → r2 and r3 → r2. Eventually only r2 and r4 are left
unevaluated, and the algorithm continues the process.
4.4. Database Query
Large-scale legal reasoning often needs to access distributed relational
databases that are held by different organizations. These databases may
contain a large amount of data which provides evidence as facts. For
rule-based legal reasoning, we need to query these databases to identify
facts to determine what obligations, prohibitions and permissions are
derivable.

Disk-based databases can pose several challenges to achieving low latency and
scalability: (a) slow query processing due to the data retrieval speed from
disk plus the added query processing times, (b) costly scaling for high read
loads, and (c) the need to simplify data access [35]. In-memory database
caching can be one of the most effective strategies for improving the overall
query performance. Frequently used data can be stored locally, which makes
data retrieval faster because it removes the network traffic associated with
retrieving data. There are two popular caching methods.
• Materialized View: A materialized view is a database object that contains
the results of a query. Materialized views are local copies of data located
remotely, or are used to create summary tables based on aggregations of a
table's data. Index structures can be built on a materialized view. Hence,
accessing and querying a materialized view is much faster than accessing and
computing based on remote tables.

The major goal when selecting an appropriate set of views is to reduce the
entire query response time as well as the maintenance cost. Materialized view
selection involves the query frequency, the query processing and storage
costs, and the materialized view maintenance cost. It is cheaper in many
cases where a query is complex (e.g., involving many tables and complex
calculations) or base tables contain a huge number of records to compute.
Mohod et al. [36] presented an extensive survey on using effective
materialized view selection and maintenance to improve query performance.

It is time-consuming for an RDBMS to materialize a view and its indices.
Hence, in the presence of updates to the base tables referenced by a
materialized view, it is kept up to date incrementally: the approach computes
the changes to the materialized view and applies them to bring it up to date.
• Key-Value Store: A key-value store maintains key-value pairs consisting of
a unique identifier (key) associated with some arbitrary value. For a
rule-based legal reasoning system, a key is a query ID and the value is the
result of the query computed from the remote databases. Similar to a
materialized view, a key-value store improves query performance because a
cache lookup is much faster than executing a complex SQL query on remote
distributed tables. In the presence of updates to the original data, we need
to keep the cached key-value pairs consistent transparently [37, 38, 39].
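For the key-value option, a minimal cache-aside sketch keyed by query ID
looks as follows (ours; the remote execution function is an assumption
supplied by the caller):

import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

class QueryCache {
    private final ConcurrentHashMap<String, Boolean> cache = new ConcurrentHashMap<>();

    // Cache-aside: return the cached truth value for queryId,
    // executing the query remotely only on a miss.
    boolean lookup(String queryId, Function<String, Boolean> remoteExec) {
        return cache.computeIfAbsent(queryId, remoteExec);
    }

    // Invalidate when the base data changes, so consistency is preserved.
    void invalidate(String queryId) { cache.remove(queryId); }
}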
Materialized views and key-value stores are suitable for different
applications. Key-value stores enhance the performance of queries that
repeatedly read a very small portion of the entire dataset, whereas the SQL
queries used to compute a materialized view typically retrieve many rows
[40]. For a large-scale rule-based reasoning system, various query results
may be present, depending on the remote databases to be accessed. Therefore,
for the cache component in our rule-based legal reasoning system, we adopt a
generic approach in which both the materialized view and key-value store
methods can be applied. Each record in the Cache has a unique query ID. The
query selection criteria and update strategies can be defined based on use
cases.

For the Database Query component, when receiving a query q, the system checks
whether q is cached. If yes, the results can be retrieved directly from the
Cache; otherwise, q is executed on the corresponding remote databases
(Algorithm 6, Line 13). In most cases, we only care about whether the
returned result is true or false, which indicates whether the corresponding
fact literal is provable. The algorithm then continues accordingly.
5. Experiment
In this section, we evaluate the proposed framework. For the reasoning
process, the major concern is the efficiency problem for large-scale legal
rule sets as well as large-scale facts. We perform queries on the rule sets
and compute the response times. Regarding the overhead, we measure the index
storage required. All the experiments were performed on a Mac machine with an
Intel Core i7 CPU @ 2.7 GHz and 16 GB RAM. The algorithms were implemented in
Java using JDK 8.

The performance of the whole legal reasoning framework could be impacted by
the rule set size, the size of the rules' preconditions, and the amount of
semantic relationships among rules. Therefore, the actual content of the
underlying rules and databases can be ignored. In the experiments, we set up
the following parameters to simulate real-world rule set situations with
different characteristics:

• Size: the number of rules in a legal system

• minP/minQ: the minimum number of predicates/literals of a rule/query

• maxP/maxQ: the maximum number of predicates/literals of a rule/query

• numP: the total number of distinct predicates/literals contained in a rule
set/query set

• rel%: the percentage of rules that have either support relations or defeat
relations with other rules in a rule set

In the following, the whole framework is first evaluated using synthetic
datasets. Then we study rule sets with different characteristics in detail
for each main component. Last, we evaluate the caching performance using real
datasets.
5.1. Overall Performance Evaluation on Inference Engine
In this sub-section, we examine the filtering power of the components in the
proposed inference engine as well as the query response time and the storage
required.
5.1.1. Datasets
Three rule sets with different sizes (Table 1) are used to test the overall
framework performance. Each rule can have between 1 and 100
predicates/literals, to reflect the characteristics of real-world rule sets.
In D1 and D2, 5% of the rules have either support relations or defeat
relations with other rules, and 20% in D3. These relations may cause the
recursive problem introduced in the previous section. The query set Q
contains 1000 queries, in which the number of predicates/literals per query
is between 2 and 200.

Rule Set   Size     minP   maxP   numP   rel%
D1         10000    1      100    1000   5%
D2         100000   1      100    1000   5%
D3         100000   1      100    1000   20%

Table 1: Rule Sets A
5.1.2. Filtering Power by the Fact Literal Trie (FLT) and the Semantic Rule
Graph (SRG)

To simulate the actual query and reasoning results, a parameter f,
failedRatio, is set to 0.5 to represent the probability that a predicate
fails to return "true" due to the conditions stated at Lines 9, 17 or 26 in
Algorithm 6, which lead to the failure of rule activation.
Table 2 shows the summary results of 1000 queries for each dataset. For
example, in D1, among all the rules that can be fired by a query, on average
84.6% of the activated rules are identified by the FLT. These rules were sent
to the Inference Controller for database query execution directly. Of the
15.4% of rules that need to be further examined by the SRG, only 50% can be
fired (equivalent to 7.7% of the total activated rules), based on the
computations of the rule interpreter and database queries. For D2 and D3,
95.2% and 88.9% of the activated rules are recognized by the trie.
Dataset   Rules fired by FLT   Candidate rules to be tested by SRG   Rules fired by SRG
D1        84.6%                15.4%                                 7.7%
D2        95.2%                4.8%                                  1.2%
D3        88.9%                11.1%                                 1.2%

Table 2: Average Filtering Results of 1000 Queries
Given the fact that in many legal regulations most of the rules do not have support and/or defeat relations with other rules, the Fact Literal Trie is able to quickly identify the majority of the rules that can be fired. Since the number of rules that have relations with other rules is larger in D3 than in D2, the number of candidate rules to be tested by the SRG is larger (11.1% vs 4.8%, as shown in Table 2).
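To make the trie-based filtering concrete, below is a minimal sketch of how a fact literal trie over sorted integer literal ids could enumerate the rules whose precondition literals are all covered by a query. This is our own illustrative reconstruction under assumed data representations; FactLiteralTrie, addRule and candidates are hypothetical names, not the paper's API.

```java
import java.util.*;

// Minimal sketch of a fact literal trie: each rule's precondition literals,
// sorted by integer id, form one root-to-node path; the rule id is stored at
// the node where its path ends. A query "covers" a rule when every literal on
// the rule's path occurs among the query's literals.
class FactLiteralTrie {
    private static class Node {
        final TreeMap<Integer, Node> children = new TreeMap<>();
        final List<Integer> ruleIds = new ArrayList<>(); // rules ending here
    }

    private final Node root = new Node();

    // Index one rule: its literals are inserted in ascending order.
    void addRule(int ruleId, int[] literals) {
        int[] sorted = literals.clone();
        Arrays.sort(sorted);
        Node cur = root;
        for (int lit : sorted) {
            cur = cur.children.computeIfAbsent(lit, k -> new Node());
        }
        cur.ruleIds.add(ruleId);
    }

    // Return all rules whose precondition literals are a subset of the query.
    List<Integer> candidates(int[] queryLiterals) {
        int[] q = queryLiterals.clone();
        Arrays.sort(q);
        List<Integer> result = new ArrayList<>();
        collect(root, q, 0, result);
        return result;
    }

    // Depth-first search that only follows edges labelled with query literals,
    // so paths not covered by the query are never explored.
    private void collect(Node node, int[] q, int from, List<Integer> result) {
        result.addAll(node.ruleIds);
        for (int i = from; i < q.length; i++) {
            Node child = node.children.get(q[i]);
            if (child != null) {
                collect(child, q, i + 1, result); // literal ids stay ascending
            }
        }
    }
}
```

For instance, indexing a rule with precondition literals {3, 7} and querying with literals {1, 3, 5, 7} reports the rule as a candidate, since every rule literal occurs in the query.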
5.1.3. Filtering Power by the Inference Controller (IC)
Since the IC uses the candidate rule graphs generated by the SRG as inputs, in the following we use two queries to explain the filtering power of the SRG and the IC.
For query 321, Figure 5 shows the partial candidate rule graph of candidate rule 940 in dataset D3. During query processing, rule 940 was first recognized by the Fact Literal Trie as a candidate rule. Then its corresponding candidate rule graph, shown in Figure 5, was searched using the Semantic Rule Graph. Based on the topological order computed by Algorithm 6, the IC updated the candidate graph after each rule was computed, including database querying and/or rule reasoning. For example, if rule 981 cannot be fired because one of its predicates failed, then the algorithm will remove nodes 981, 1136, 1244 and 940 based on the dependency relations among them.
Figure 5: Partial Candidate Rule Graph for Query 321
The algorithm then concludes that rule 940 cannot be fired without computing the rest of the rules. On the other hand, if rule 1217 can be fired, since 1217 is a defeat rule for rule 1227, the algorithm will remove nodes 1227, 1216, 1244 and 940. Therefore, the algorithm is able to avoid unnecessary computations and reach the conclusion as early as possible. In our experiment, rule 940 was fired for query 321, which means all its support rules could be activated and they were not defeated by the corresponding defeat rules.
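The pruning step just illustrated can be sketched as follows. This is our own simplified reconstruction, not the paper's Algorithm 6; the class and method names are hypothetical, and the dependency maps below are one possible encoding of the example.

```java
import java.util.*;

// Simplified reconstruction of the Inference Controller's pruning step.
// supportDependents.get(r) lists the rules whose firing depends on r
// (support edges); defeats.get(r) lists the rules that r defeats when it fires.
class CandidateGraphPruner {
    private final Map<Integer, List<Integer>> supportDependents;
    private final Map<Integer, List<Integer>> defeats;
    private final Set<Integer> alive; // rules still worth computing

    CandidateGraphPruner(Map<Integer, List<Integer>> supportDependents,
                         Map<Integer, List<Integer>> defeats,
                         Set<Integer> rules) {
        this.supportDependents = supportDependents;
        this.defeats = defeats;
        this.alive = new HashSet<>(rules);
    }

    // A rule failed (e.g. one of its predicates returned false): everything
    // that needs it can no longer fire, so remove the whole dependent cone.
    void onRuleFailed(int rule) {
        removeCone(rule);
    }

    // A rule fired: the rules it defeats (and their dependants) are removed.
    void onRuleFired(int rule) {
        for (int beaten : defeats.getOrDefault(rule, Collections.emptyList())) {
            removeCone(beaten);
        }
    }

    boolean stillCandidate(int rule) {
        return alive.contains(rule);
    }

    private void removeCone(int start) {
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(start);
        while (!stack.isEmpty()) {
            int r = stack.pop();
            if (!alive.remove(r)) continue; // already pruned
            for (int dep : supportDependents.getOrDefault(r, Collections.emptyList())) {
                stack.push(dep);
            }
        }
    }
}
```

With supportDependents mapping, say, 981 to {1136}, 1136 to {1244} and 1244 to {940}, calling onRuleFailed(981) removes exactly the four nodes discussed above, and no further database queries are issued for them.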
Figure 6: Partial Candidate Rule Graphs for Query 444
For query 444, there are 3 candidate rules generated by the FLT in dataset D3: rules 4375, 3896 and 4721. After searching the Semantic Rule Graph, only two candidate rule graphs were constructed by the algorithm (Figure 6). This is because the candidate rule graph of rule 4721 is a sub-graph of rule 3896's candidate rule graph. By the algorithm, every node in the Semantic Rule Graph is examined only once regardless of the number of candidate rules. Furthermore, the execution results stored in the Working Memory can also be shared among candidate rules, which further avoids unnecessary computations. With the above two examples, we showed how the algorithm is able to minimize computation to improve reasoning efficiency.
Given a candidate rule, by comparing the number of executed support and defeat rules with the size of its corresponding candidate rule graph, we can derive how many computations are saved by the IC.
Dataset  failedRatio = 0.5  failedRatio = 0.1
D1       62%                34%
D2       55%                36%
D3       59%                35%
Average  58.7%              35%

Table 3: Average Saved Computations
Using the same datasets and queries, Table 3 shows the results using two different failedRatios. As mentioned before, when failedRatio is set to 0.5, a rule has a 50% possibility of failing. In this case, for all three datasets, our algorithm saves 58.7% of computations on average. Even when we set failedRatio as low as 0.1, 35% of computations can still be saved. This experiment demonstrated the filtering power of the Inference Controller.
5.1.4. Query Response Time
Table 4 shows the response times for 1000 queries. In general, the FLT accounts for most of the response time on all three datasets. Dataset D2 contains 100K rules; its query response time (2469ms) is larger than that for D1 with 10K rules (972ms). Overall, the response time does not grow exponentially as the rule set size grows.
The FLT response time on D3 is similar to that for D2. Since a larger proportion of rules in D3 have support or defeat relations (20%) with other rules compared with D2 (5%), the number of semantic rule graphs in
Dataset  FLT (ms)  SRG (ms)  Total Time (ms)
D1       875       97        972
D2       1938      531       2469
D3       2518      1758      4276

Table 4: Response Time for 1000 Queries
each dataset, as well as the size of each semantic rule graph, can become larger. By closely examining the graphs involved in D3, we found that one of its semantic rule graphs contains about 10K rules. Therefore, the SRG response time (1758ms) on D3 is larger than that (531ms) for D2.
Overall, even in the rare case of a legal rule system that contains a large number of rules with complex semantic relations, the proposed framework and algorithms can perform queries efficiently.
5.1.5. Index Size
We further examine the index sizes for the three datasets. From Table 5, we can see that in general the FLT uses more storage space than the SRG. Since D2 and D3 contain large numbers of rules, their FLT sizes (591MB, 593MB) are larger than that for D1 (94.2MB). D3 requires slightly more storage for its SRG because it has a larger number of rules with semantic relations. We can conclude that with increasing rule set sizes and increasing complexity of the relations among rules, the proposed index method remains scalable.
Dataset  FLT (MB)  SRG (MB)  Total Size (MB)
D1       94.2      0.07      94.27
D2       591       0.91      591.91
D3       593       3.43      596.43

Table 5: Index Size for the Three Datasets
5.2. Performance Evaluation on the Fact Literal Trie
From the above experiments, we can see that the Fact Literal Trie plays a significant role in the overall performance. In this sub-section, we examine rule sets with various characteristics to analyze how the corresponding FLTs perform.
5.2.1. Datasets
To understand the impact of the distribution of rules' predicates/literals on performance, we generate 9 datasets (Table 6) with different maxP and numP while fixing the rule set size at 10000. The density of a rule set can be defined through its sparsity matrix, in which each row corresponds to a rule and each column to a distinct predicate; the density of a rule set is then the fraction of non-zero cells in its sparsity matrix. Based on density, the 9 rule sets can be classified into 3 groups: dense, medium, and sparse. Within each group, there are 3 rule sets with different maxP. For example, rule set DD1 contains 10K rules with a total of 200 distinct predicates, and its largest rule could have up to 100 predicates. The FLT generated from DD1 is TD1. The density of an FLT depends on the density of its corresponding rule set.
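On our reading of this definition, the densities reported in Table 6 equal the expected number of predicates per rule divided by numP; for example, assuming rule lengths are uniform between minP and maxP:

\[
\mathrm{density}(DD_1) \approx \frac{(\mathit{minP} + \mathit{maxP})/2}{\mathit{numP}} = \frac{(1 + 100)/2}{200} \approx 0.25,
\]

which matches the value listed for DD1; the other entries follow the same pattern.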
Group   Rule Set (10K)  Density  minP  maxP  numP  FLT
Dense   DD1             0.25     1     100   200   TD1
Dense   DD2             0.125    1     50    200   TD2
Dense   DD3             0.05     1     20    200   TD3
Medium  DM4             0.05     1     100   1000  TM4
Medium  DM5             0.025    1     50    1000  TM5
Medium  DM6             0.01     1     20    1000  TM6
Sparse  DS7             0.01     1     100   5000  TS7
Sparse  DS8             0.005    1     50    5000  TS8
Sparse  DS9             0.002    1     20    5000  TS9

Table 6: 9 Rule Sets
We also generated 12 query sets with different minQ, maxQ and numP to study the performance. Each query set contains 100 queries. Table 7 shows the 12 query sets, classified into 4 groups: the small query group with each query having 2-50 predicates, the medium query group with each query having between 80 and 120 predicates, the large query group with 150-200 predicates, and the mix group containing both small and large queries.
Group   Query Set (100)  minQ  maxQ  numP
Small   QS1              2     50    200
Small   QS2              2     50    1000
Small   QS3              2     50    5000
Medium  QM4              80    120   200
Medium  QM5              80    120   1000
Medium  QM6              80    120   5000
Large   QL7              150   200   200
Large   QL8              150   200   1000
Large   QL9              150   200   5000
Mix     QX10             2     200   200
Mix     QX11             2     200   1000
Mix     QX12             2     200   5000

Table 7: 12 Query Sets
5.2.2. Query Efficiency
In this sub-section, we evaluate the query response times using the query sets of different groups on the corresponding rule sets. Specifically, a query set can be performed on the FLTs that have the same predicate range numP. For example, query sets QS1, QM4, QL7 and QX10 can be performed on rule sets DD1, DD2 and DD3 because they all have the same predicate range numP = 200. In the following, the query response times are reported on the basis of 100 queries.
Impact of FLT Density. First we examine how rule sets with different densities perform regarding query response time. Figure 7 shows the results. Recall that density(TM4) > density(TM5) > density(TM6) and density(TS7) > density(TS8) > density(TS9). When QS2 is executed on TM4, TM5 and TM6, the query response time on TM6 is slower than that on TM5, which is slower than that on TM4. The reason is that the search space on a sparse FLT (e.g. TM6) could be larger than that on a dense FLT (e.g. TM4). This leads to a slower response time. A similar pattern can be found on all datasets. Therefore, we can conclude that given the same query set, the sparser a rule set is, the sparser its FLT is, and the slower the query response time is.
Figure 7: Impact of Rule Set Density (response time in ms for query sets QS2, QM5, QL8, QX11 on TM4/TM5/TM6 with numP = 1000, and QS3, QM6, QL9, QX12 on TS7/TS8/TS9 with numP = 5000)
The above pattern is also observed when the query group QS1, QM4, QL7, QX10 with numP = 200 is performed on TD1, TD2 and TD3. But the overall query response times are much larger than those with numP = 1000 and numP = 5000. We explain this case next.
Impact of numP vs Query Size. In most real world cases, a query size is much smaller than the total number of predicates contained in a rule set, i.e. |Q| << numP. From the above analysis, we know that the sparser an FLT is, the slower the query response time is. However, this can change if a query size becomes very large.
Figure 7 only shows the performance of query sets with numP = 1000 and 5000, in which the longest response time is 600ms. If we compare the performance between query sets with numP = 200 and query sets with numP = 1000 and 5000 (Figure 8), we can see that the query response times of query sets QS1 and QM4 over TD1, TD2 and TD3 follow the same pattern as discussed above. However, for QL7, its response times over TD1, TD2 and TD3 are all above 3000ms, which is significantly larger than that of any other query set. This is because the average size of QL7, 175, approaches the total number of predicates (200) of the rule sets. It means that the predicates contained in a query could cover nearly all the predicates contained in an FLT. Therefore, the search algorithm has little capacity to quickly filter unnecessary paths in the FLT and may have to traverse nearly the entire space.
Furthermore, although the density of DD3 (d = 0.05) is smaller than that of DD2 (d = 0.125), which is smaller than that of DD1 (d = 0.25), the query response times do not follow the above pattern. This is because DD3's maxP = 20 is smaller than that of DD2 (maxP = 50), so the search depth on TD3 is smaller than that on TD2 given a query size approaching 200. Therefore, a search can be stopped earlier on each path, which leads to an overall lower query response time. For a similar reason, the response time of QL7 over TD2 is smaller than that of QL7 over TD1.
Since QX10 contains both small and very large queries, its performance follows a trend similar to that of QL7.
Figure 8: Impact of numP vs Query Size (response time in ms on DD1 (maxP=100), DD2 (maxP=50) and DD3 (maxP=20) with numP = 200, for query sets of average size |QS1| = 25, |QM4| = 100, |QL7| = 175 and |QX10| = 100)
5.2.3. FLT Storage Size and Construction Time
We measured the FLT storage size for the above 9 rule sets and the corresponding FLT construction times.
FLT Size. Figure 9 shows the index sizes. Given the same numP, the larger the number of predicates (maxP) a rule can have, the larger the index size. This is because longer paths need to be constructed in the trie to capture a larger rule, which in turn occupies more space. For example, given the same numP = 200 but with different maxP = 100, 50, 20, the index size of TD1 is much larger than that of TD2, which is larger than that of TD3.
Figure 9: FLT Storage Size Comparison (index size in MB for TD1-TD3 with numP = 200, TM4-TM6 with numP = 1000, and TS7-TS9 with numP = 5000)
However, if numP is larger, it means that a rule set is more sparse. The probability that predicates of different rules share the same FLT paths is lower, which means more storage is required to construct an FLT. Therefore, the larger numP is, the larger the index size is. For example, rule sets DD1, DM4 and DS7 all have the same maxP = 100 but different numP = 200, 1000, 5000 respectively. The index size of TS7 is larger than that of TM4, and the index size of TM4 is larger than that of TD1. A similar pattern can be found in the other corresponding rule sets.
FLT Construction Time. Figure 10 shows the index construction times for the 9 rule sets. In general, all the FLTs can be constructed very quickly. Similar to the index size, given the same numP (e.g. numP = 200 for DD1, DD2, DD3), the larger the maxP (maxP = 100, 50, 20 for DD1, DD2, DD3 respectively), the longer the time required to construct the FLT.
Figure 10: FLT Construction Time (in ms for the same 9 FLTs, grouped by numP = 200, 1000, 5000)
5.3. Performance Evaluation on the Caching Technique
As mentioned before, the overall response time consists of two components: the time required to execute database queries for the corresponding predicates, and the time required by reasoning, which has been discussed in the previous sub-sections. This sub-section aims to evaluate whether the adopted caching techniques offer a practical solution to the database query performance issues given a large scale of facts stored in databases. We use the same case studies presented in [13] to demonstrate the efficiency.
5.3.1. Dataset
The ChildSafe Care Online Management System (ChildSafeOMS) studied in [13] contains three sources of information: (1) a child care database (ChildSafeDB) recording child enrolment, attendance, abuse and/or neglect data, etc., (2) a set of defeasible rules encoding the decision trees, definitions and (normative) guidelines of the New South Wales Mandatory Reporter Guidelines (NSW Government 2016), and (3) a set of bridging statements relating the terms of the regulations and the fields (and data) in the ChildSafeDB.
SQL queries define the meaning of the literals used in a defeasible theory in terms of the data and the schema of a database. Since query response times depend on the structure and size of the databases as well as on the structure of the SQL queries, the actual content of the underlying database is not relevant. Therefore, in [13], to increase the scale of the databases as well as the query complexity, the datasets from stat-computing (http://stat-computing.org/dataexpo/2009/supplemental-data.html) were extracted into many different tables that are linked to ChildSafeDB. The resultant databases contain more than 11 million records in total. The data stored in the databases are queried to provide facts for the reasoner in the form of predicates.
5.3.2. Result
We compare the query response times with and without the caching technique. For some simple queries, the response times are short even when using the remote databases. But for some queries involving multiple joins and large tables, we had to stop the execution due to the long waiting time. 30 predicates were tested in [13] and the average response time was 171.19ms. In the following, three example predicates are used to show the performance.
Predicate 1

SELECT t2.child_crn, t2.year, t2.month
FROM tbl2008 t2
LEFT JOIN tblchildren ON t2.child_crn = tblchildren.child_crn

Predicate 2

SELECT t1.child_crn, t1.year, t1.month
FROM tbl2007 t1
LEFT JOIN tblchildren ON t1.child_crn = tblchildren.child_crn
UNION
SELECT t2.child_crn, t2.year, t2.month
FROM tbl2008 t2
LEFT JOIN tblchildren ON t2.child_crn = tblchildren.child_crn

Predicate 3

SELECT t1.child_crn, t2.child_crn
FROM tbl2007 t1, tbl2007 t2
WHERE t1.child_crn <> t2.child_crn
LIMIT 1
The materialized views for the above 3 queries are built and used to compare their performance against the local and remote databases. Table 8 shows the comparison of query response times.
Predicate ID  Remote DB  Local DB   Materialized View
1             12s 167ms  925ms      99ms
2             25s 51ms   2s 428ms   457ms
3             48s 418ms  14s 226ms  50ms

Table 8: Query Response Time on 3 Predicates
From Table 8, we can see that the query response times for all 3 predicates are very slow using the remote databases but are improved using local databases. We had to limit the number of returned records to 1 for predicate 3; otherwise it caused the system to hang when using the remote database. The response times are orders of magnitude faster with the corresponding materialized views constructed. The materialized view for predicate 3 is especially useful, reducing the query response time from 48 seconds on the remote database to 50 milliseconds. As mentioned before, how to select materialized views depends on the specific use case.
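As an illustration of this caching step, the following JDBC sketch materializes predicate 1 once and serves later evaluations from the view. It assumes a PostgreSQL backend (which supports CREATE MATERIALIZED VIEW natively); the connection URL, credentials, the view name mv_predicate1 and the integer type of child_crn are our placeholders, not details from the paper.

```java
import java.sql.*;

// Sketch: cache an expensive predicate query as a materialized view and
// answer subsequent predicate evaluations from the view. Assumes PostgreSQL;
// the connection URL and credentials below are placeholders.
public class PredicateViewCache {
    public static void main(String[] args) throws SQLException {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost:5432/childsafedb", "user", "pass");
             Statement st = conn.createStatement()) {

            // One-off (or periodic) materialization of predicate 1.
            st.execute("CREATE MATERIALIZED VIEW IF NOT EXISTS mv_predicate1 AS " +
                       "SELECT t2.child_crn, t2.year, t2.month " +
                       "FROM tbl2008 t2 " +
                       "LEFT JOIN tblchildren ON t2.child_crn = tblchildren.child_crn");

            // Predicate evaluation now reads the precomputed result instead of
            // re-running the join over the base tables.
            try (ResultSet rs = st.executeQuery(
                    "SELECT child_crn, year, month FROM mv_predicate1")) {
                while (rs.next()) {
                    System.out.println(rs.getInt("child_crn")); // assumes integer column
                }
            }

            // When the base tables change, the view is refreshed explicitly.
            st.execute("REFRESH MATERIALIZED VIEW mv_predicate1");
        }
    }
}
```

The trade-off is the usual one for materialized views: reads become cheap at the cost of explicit refreshes when the underlying facts change.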
6. Literature Review
In this section, we review the most related work from two research streams: the rule matching problem and the argumentation framework. We also analyze how the proposed work advances the state of the art in supporting efficient legal reasoning.
The Rule Matching Problem. Legal reasoning consists of a series of complex reasoning activities [14], and "Rule-Based Reasoning" is one of the most crucial types (see https://lawprofessors.typepad.com/legal_skills/2011/08/tip-of-the-week-five-methods-of-legal-reasoning.html, accessed on 22 February 2019). Intuitively, a rule-based system consists of three essential components: a set of rules (rule base), a fact base (knowledge base) and an interpreter for the rules (inference engine). In rule-based reasoning, rules that reflect the content of knowledge-based sources are applied and matched with a set of facts to deduce conclusions using an inference engine [41]. For many applications, the facts needed to run the application are stored in (relational) databases. Therefore, rule-based systems need to be coupled to database systems to allow interactions between rules and facts.
Since database schemas and vocabularies do not change often, this motivates an approach of encoding relationships between and among schemas as rules in a language called Datalog. A basic rule is an expression of the form

\[
r : p_1(t_{11}, \ldots, t_{1n}) \wedge \ldots \wedge p_k(t_{k1}, \ldots, t_{kn}) \rightarrow c(t_1, \ldots, t_n) \qquad (1)
\]

where p_1, ..., p_k are names of relations and the t_i's are variables or constants. The p_i's and c are called the predicates and the conclusion, respectively.
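For instance, an illustrative rule of this form (our generic example, not one of the paper's rules) derives grandparent tuples from a stored parent relation:

\[
r : \mathit{parent}(X, Y) \wedge \mathit{parent}(Y, Z) \rightarrow \mathit{grandparent}(X, Z)
\]

Here parent is an EDB relation stored in the database, and grandparent is derived by the rule.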
When a new tuple is added to the database, an inference engine cycles through three sequential steps: (1) match rules with the tuple, (2) run queries on all relations involved in the rules' preconditions, and (3) solve the rule by reasoning to generate resultant tuples. These steps iterate, as each derived tuple could trigger additional rules. Therefore, the efficiency of a rule-based system primarily depends on the efficiency of these 3 steps. Given a rule-based system consisting of a large number of rules and facts, rule matching is time consuming but crucial, as it influences the performance of the inference engine. The efficiency problem has been extensively studied in the past. Stonebraker [42] summarizes four techniques in principle: brute force, marking, discrimination networks and query rewrite.
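This cycle can be sketched as a worklist loop; the following is a minimal reconstruction under our own naming (Tuple, Rule, matchRules, queryPreconditions and derive are illustrative stubs, not an API from the paper or from any system cited below):

```java
import java.util.*;

// Minimal worklist sketch of the match / query / solve cycle that a
// forward-chaining inference engine runs when new tuples arrive.
abstract class InferenceCycle<Tuple, Rule> {
    // Step 1: rules whose preconditions mention the new tuple's relation.
    abstract List<Rule> matchRules(Tuple tuple);
    // Step 2: evaluate the rule's precondition queries against the database.
    abstract boolean queryPreconditions(Rule rule, Tuple tuple);
    // Step 3: reason over the rule to produce its resultant tuples.
    abstract List<Tuple> derive(Rule rule, Tuple tuple);

    void onInsert(Tuple newTuple) {
        Deque<Tuple> worklist = new ArrayDeque<>();
        worklist.add(newTuple);
        while (!worklist.isEmpty()) {       // derived tuples may trigger
            Tuple t = worklist.poll();      // further rules
            for (Rule r : matchRules(t)) {
                if (queryPreconditions(r, t)) {
                    worklist.addAll(derive(r, t));
                }
            }
        }
    }
}
```

The four techniques surveyed next differ mainly in how they make step (1), and to a lesser extent step (2), cheap.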
Brute force builds indexes on the attributes of relations. DLV (http://www.dlvsystem.com) is a deductive database system based on disjunctive logic programming; it constructs an index on the first attribute of each EDB relation. Jena (http://jena.apache.org), GraphDB (http://graphdb.ontotext.com) and SwiftOWLIM (https://lists.w3.org/Archives/Public/semantic-web/2010Jul/0411.html) build a full index on each attribute of each EDB relation for triples. Both YAP [43], a Prolog-based system, and Ontobroker (https://www.w3.org/2001/sw/wiki/OntoBroker), a deductive database, create indexes on the fly by analyzing the activation rules. Another Prolog-based system, XSB [44], offers a repertoire of indexing techniques, which must be specified manually. The AMEM index for LEAPS (https://www.leap.com.au) records the static part of the EDB, and the Filter Index for DATEX takes advantage of both TREAT and the marking method explained below.
The basic idea of the marking method, used in the POSTGRES rules system (PRS2), is that each rule is processed against the database and every record satisfying the event qualification is identified. Each such record is marked with a flag identifying the rule to be awakened [42]. There are difficult problems with keeping the markings correct as updates are made to the database.
Discrimination networks, such as RETE [45], TREAT [46] and their variations, are mainly used by production and reactive rule systems, including Drools (https://www.drools.org) and Jess (https://jessrules.com/jess/index.shtml). RETE adopts a rule-centric approach that uses rules to construct a network that efficiently locates tuples based on constant constraints between rules and tuples. TREAT is a tuple-centric network that records all potentially useful tuples in a network. A recently published work, the Yield Index [47], is similar to discrimination networks: it connects the matching tuples through a semantic index organized on top of its data index to speed up updates.
The above three techniques are designed for forward chaining rules. The fourth technique, query rewrite, is popular in backward chaining implementations: each applicable rule is substituted into user commands to produce a modified command. [48] shows how to extend query rewrite to also support forward chaining implementations.
All the above approaches improve the efficiency of rule matching, but they are attribute-based methods. For legal reasoning using defeasible deontic logic, each predicate itself could be a simple or complex SQL query. Therefore, those techniques are not practical for an inference engine for legal systems. Furthermore, long rules with many predicates create a query efficiency problem called the long chain effect. Previous studies mainly focus on the rule matching step but neglect the time taken by the query execution step. [13] is the only work that investigated the integration problem between databases and a rule-based legal system with a focus on the query efficiency problem. However, the efficiency of an inference engine for legal systems was not studied there.
The Argumentation Framework. Legal reasoning, at its core, is a process of argumentation, with opposing sides attempting to justify their own interpretations through appeals to precedent, principle, policy and purpose. AI and law research has addressed this with formal argumentation based on Dung's Abstract Argumentation Frameworks (AAFs) [49]. An AAF is a pair (A, D), where A is a set of arguments and D ⊆ A × A is a binary relation of defeat [50]. We say that A strictly defeats B if A defeats B while B does not defeat A. A semantics for AAFs returns sets of arguments called extensions, which are internally coherent and defend themselves against each attack.
Compared with AAFs, in which arguments are interpreted as abstract entities and only logical relationships between arguments are taken into account, structured argumentation considers an argument's internal structure. The authors of [51] overviewed some concrete algorithmic approaches to structured argumentation, including the ASPIC+ formalism [52] (e.g. the TOAST system [53]), Defeasible Logic Programming (DeLP) (e.g. the Tweety system [54]), Assumption-Based Argumentation, Carneades [55], etc. In a previous study [56], some representative approaches were compared regarding the representation of, and reasoning over, legal rules. The comparative analysis showed that Defeasible Deontic Logic provides the largest set of features and was the most efficient system among those tested.
DeLP offers a computational reasoning system that uses Abstract Dialectical Frameworks (ADFs) [57] to obtain answers from a knowledge base represented using an extended logic programming language with defeasible rules. This combination generates a computationally effective system together with a reasoning model similar to the one used by humans, which facilitates its use in real-world applications [58]. ADF is regarded as a powerful generalization of Dung's AAFs. An ADF is a directed graph whose nodes represent arguments, statements or positions. The nodes can be arbitrary items which can be accepted or not, and the links represent dependencies. However, unlike a link in an abstract argumentation framework (AAF), in which only the defeat relation is modelled, the meaning of an ADF link can vary. Bipolar ADF [59] is a sub-class of general ADFs in which only defeat and support relations are defined.
Some strategies [60, 61, 62, 63] were proposed to efficiently construct dialectical trees, or structured argumentation in general, by pruning the search space to speed up the inference process. This is realized by only expanding the dialectical tree until the evaluation status of the query is decided. For example, if an argument possesses multiple attackers, and it can already be decided that the first attacker is ultimately accepted and defeats the argument, then there is no need to evaluate the acceptance status of the remaining attackers, as it can already be decided that the argument under consideration is not acceptable [51].
Next we provide a comparative analysis between the proposed Semantic Rule Index (SRI) and structured argumentation frameworks from three perspectives: conception, computation and functionality.
Conception: As discussed, in a dialectical framework, nodes represent arguments and edges represent the relationships between arguments. Each argument may contain a rule or a set of rules that reach a particular conclusion. For the SRI, nodes represent rules and edges represent the relationships between rules. Informally, a sub-graph in the Semantic Rule Graph could be regarded as an argument, and some edges between sub-graphs could be regarded as relationships between arguments. How to incorporate the semantics defined in argumentation frameworks into the SRI is out of the scope of this work, but it deserves a formal study in future. However, formal studies of the relationships between argumentation systems and defeasible logic have been carried out [64, 15, 65, 66], showing that several argumentation frameworks proposed for legal reasoning correspond to (fragments of) variants of defeasible logic. Accordingly, we have strong reasons to believe that the SRI can be adapted to argumentation systems.
Computation:
For DeLP, to build a dialectical tree, the starting points for constructing an argument are the facts in the knowledge base. They are arguments themselves and on top of them new arguments can be made. The facts must be specified in the form true → fact. This implies that users have to have pre-knowledge of which facts are available. Different facts provided will reach different conclusions due to the different dialectical trees generated. Given a large scale of rules and frequently updated databases, it is hard for users to know every fact that should be considered for inferencing.
Our framework enables users to provide some predicates of interest. It is not necessary for users to know in advance whether the predicates are facts or not. The framework is able to identify which conclusions can be reached based on the predicates provided, as well as on the databases at querying time, which may impact what reasoning processes need to be followed. Therefore, the proposed solution is more practical compared with the traditional ADF approach.
A dialectical tree uses arguments as nodes. An argument itself is an inference tree that is constructed through rules. Conceptually, formalisms for structured argumentation often follow the steps of the so-called argumentation process or argumentation pipeline: (a) argument construction, which builds arguments composed of a claim and a derivation of that claim (e.g. a proof tree) from the given knowledge base; (b) determining conflicts among arguments; (c) evaluation of the acceptability of arguments; and (d) drawing conclusions from the acceptable arguments [67]. From a computational point of view, each of these steps taken individually can be quite computationally expensive: for instance, even the construction of single arguments may be computationally complex (NP-hard in some cases); a large number of arguments may be constructed; finding conflicts can be non-trivial; and the evaluation of acceptability has, in general, a high complexity, as in the case of abstract argumentation [51].
For the proposed framework, the Semantic Rule Index uses rules as nodes, which makes it straightforward to construct. The SRI can be built as a pre-processing step by scanning the rule system once. The index captures the defeat and support relations among the rules. At querying time, the framework searches the index to identify all the candidate rules and evaluates the acceptability of a rule conclusion through database queries. As discussed in Section 4.2, the search can be done in O(V + E) time. Therefore, the proposed approach better supports efficient large scale rule-based reasoning.
Functionality: The reparation chain feature of defeasible deontic logic brings many dynamics into the argumentation process and, to the best of our knowledge, there is no work considering this feature. Therefore, it is not clear how current structured argumentation frameworks could address it. However, this semantics can be captured through the proposed Semantic Rule Index by enriching the semantic representation of its edges.
The proposed inference engine achieved its goal of answering Rule Containment/Intersection queries efficiently. Although the setting is different from the previous work, as future work we will explore the semantics (e.g. grounded extension, admissible extension, etc.) defined by structured argumentation frameworks and Dung's work within the proposed framework. Based on that, a large-scale and systematic implementation comparison between these two approaches will be conducted.
7. Conclusion
In summary, we proposed the first unified framework that seamlessly integrates database systems with inference engines for the legal domain. It is able to answer the Rule Containment Query efficiently through the proposed inference engine, in which the search space for rule matching can be reduced efficiently through the Semantic Rule Index, and unnecessary reasoning computations and database queries can be avoided by the proposed Inference Controller. Moreover, query and reasoning results can be shared by multiple candidate rules, which further reduces the overall response time. Furthermore, by adopting caching techniques, the database query performance can be improved significantly. The framework and techniques can be adapted to address the Rule Intersection Query and the backward chaining strategy. In future, we will extend the work to support constitutive rules.
References

[1] G. Sartor, Legal Reasoning: A Cognitive Approach to the Law, Springer, 2005.

[2] T. F. Gordon, G. Governatori, A. Rotolo, Rules and norms: Requirements for rule interchange languages in the legal domain, in: G. Governatori, J. Hall, A. Paschke (Eds.), RuleML 2009, no. 5858 in LNCS, Springer, Heidelberg, 2009, pp. 282-296.
[3] G. Governatori, F. Olivieri, A. Rotolo, S. Scannapieco, Computing strong and weak permissions in defeasible logic, Journal of Philosophical Logic 42 (6) (2013) 799-829.

[4] T. F. Gordon, H. Prakken, D. Walton, The Carneades model of argument and burden of proof, Artificial Intelligence 171 (10-15) (2007) 875-896. doi:10.1016/j.artint.2007.04.010.

[5] H. Prakken, A. Z. Wyner, T. J. M. Bench-Capon, K. Atkinson, A formalization of argumentation schemes for legal case-based reasoning in ASPIC+, Journal of Logic and Computation 25 (5) (2015) 1141-1166. doi:10.1093/logcom/ext010.

[6] H. Herrestad, Norms and formalization, in: ICAIL '91, ACM, 1991, pp. 175-184. doi:10.1145/112646.112667.

[7] G. Sartor, The structure of norm conditions and nonmonotonic reasoning in law, in: R. E. Susskind (Ed.), Proceedings of the Third International Conference on Artificial Intelligence and Law, ICAIL '91, Oxford, England, June 25-28, 1991, ACM, 1991, pp. 155-164. doi:10.1145/112646.112665.

[8] H. Prakken, G. Sartor, Law and logic: A review from an argumentation perspective, Artif. Intell. 227 (2015) 214-245. doi:10.1016/j.artint.2015.06.005.

[9] A. J. I. Jones, M. J. Sergot, Deontic logic in the representation of law: Towards a methodology, Artif. Intell. Law 1 (1) (1992) 45-64. doi:10.1007/BF00118478.
[10] NSW Government, The NSW mandatory reporter guide (2016). URL http://www.keepthemsafe.nsw.gov.au/reporting_concerns/mandatory_reporter_guide

[11] S. Liang, P. Fodor, H. Wan, M. Kifer, OpenRuleBench: An analysis of the performance of rule engines, in: Proceedings of the 18th International Conference on World Wide Web, WWW '09, ACM, New York, NY, USA, 2009, pp. 601-610. doi:10.1145/1526709.1526790.

[12] R. Agrawal, Alpha: an extension of relational algebra to express a class of recursive queries, IEEE Transactions on Software Engineering 14 (7) (1988) 879-885. doi:10.1109/32.42731.

[13] M. B. Islam, G. Governatori, RuleRS: a rule-based architecture for decision support systems, Artificial Intelligence and Law 26 (4) (2018) 315-344.

[14] P. Wahlgren, Legal reasoning - a jurisprudential description, in: Proceedings of the 2nd International Conference on Artificial Intelligence and Law, ICAIL '89, ACM, New York, NY, USA, 1989, pp. 147-156. doi:10.1145/74014.74034.

[15] G. Governatori, On the relationship between Carneades and defeasible logic, in: Proceedings of ICAIL 2011, ACM, 2011, pp. 31-40.

[16] G. Antoniou, D. Billington, G. Governatori, M. J. Maher, Representation results for defeasible logic, ACM Transactions on Computational Logic 2 (2) (2001) 255-287.

[17] G. Governatori, Representing business contracts in RuleML, International Journal of Cooperative Information Systems 14 (2-3) (2005) 181-216. doi:10.1142/S0218843005001092.
[18] G. Governatori, S. Shek, Regorous: A business process compliance checker, in: Proceedings of the Fourteenth International Conference on Artificial Intelligence and Law, 2013, pp. 245-246. doi:10.1145/2514601.2514638.

[19] G. Governatori, The Regorous approach to process compliance, in: 2015 IEEE 19th International Enterprise Distributed Object Computing Workshop, IEEE Press, 2015, pp. 33-40. doi:10.1109/EDOC.2015.28.

[20] G. Governatori, M. Hashmi, No time for compliance, in: Enterprise Distributed Object Computing Conference (EDOC), 2015 IEEE 19th International, IEEE, 2015, pp. 9-18.

[21] G. Governatori, A. Rotolo, BIO logical agents: Norms, beliefs, intentions in defeasible logic, Autonomous Agents and Multi-Agent Systems 17 (1) (2008) 36-69.

[22] D. Nute, Defeasible logic, in: D. M. Gabbay, C. H. Hogger, J. Robinson (Eds.), Handbook of Logic in Artificial Intelligence and Logic Programming, Vol. 3, Oxford University Press, 1994, pp. 353-395.

[23] G. Antoniou, D. Billington, G. Governatori, M. J. Maher, On the modeling and analysis of regulations, in: Australian Conference on Information Systems, 1999.

[24] B. N. Grosof, Representing e-commerce rules via situated courteous logic programs in RuleML, Electronic Commerce Research and Applications 3 (1) (2004) 2-20.

[25] S. Sadiq, G. Governatori, Managing regulatory compliance in business processes, in: J. vom Brocke, M. Rosemann (Eds.), Handbook of Business Process Management, 2nd Edition, Vol. 2, Springer, 2015, pp. 265-288. doi:10.1007/978-3-642-45103-4_11.

[26] G. Governatori, A. Rotolo, A conceptually rich model of business process compliance, in: Proceedings of APCCM 2010, no. 110 in CRPIT, ACS, 2010, pp. 3-12.
[27] T. Skylogiannis, G. Antoniou, N. Bassiliades, G. Governatori, A. Bikakis, DR-NEGOTIATE: a system for automated agent negotiation with defeasible logic-based strategies, Data & Knowledge Engineering 63 (2007) 362-380. doi:10.1016/j.datak.2007.03.004.

[28] M. J. Maher, Propositional defeasible logic has linear complexity, Theory and Practice of Logic Programming 1 (06) (2001) 691-711.

[29] D. Billington, G. Antoniou, G. Governatori, M. J. Maher, An inclusion theorem for defeasible logics, ACM Transactions on Computational Logic 12 (1) (2010) 1-27.

[30] G. Governatori, Business process compliance: An abstract normative framework, IT - Information Technology 55 (6) (2013) 231-238. doi:10.1515/itit.2013.2003.

[31] M. Hashmi, G. Governatori, M. T. Wynn, Normative requirements for regulatory compliance: An abstract formal framework, Information Systems Frontiers (2015). doi:10.1007/s10796-015-9558-1.

[32] G. Governatori, A. Rotolo, Logic of violations: A Gentzen system for reasoning with contrary-to-duty obligations, Australasian Journal of Logic 4 (2006) 193-215.

[33] G. Governatori, A. Rotolo, Defeasible logic: Agency, intention and obligation, in: Proceedings of DEON 2004, no. 3065 in LNCS, Springer, 2004, pp. 114-128.

[34] I. Savnik, Index data structure for fast subset and superset queries, in: A. Cuzzocrea, C. Kittl, D. E. Simos, E. Weippl, L. Xu (Eds.), Availability, Reliability, and Security in Information Systems and HCI, Springer Berlin Heidelberg, Berlin, Heidelberg, 2013, pp. 134-148.

[35] M. Labib, Database caching strategies using Redis, Amazon Web Services (May 2017).
[36] A. P. Mohod, M. Chaudhari, Improve query performance using effective materialized view selection and maintenance: A survey, International Journal of Computer Science & Mobile Computing 2 (2013) 485-490.

[37] J. Challenger, A. Iyengar, P. Dantzig, A scalable system for consistently caching dynamic web data, in: Proceedings of IEEE INFOCOM '99, Vol. 1, 1999, pp. 294-303.

[38] A. Datta, K. Dutta, H. M. Thomas, D. E. VanderMeer, K. Ramamritham, D. Fishman, A comparative study of alternative middle tier caching solutions to support dynamic web content acceleration, in: Proceedings of the 27th International Conference on Very Large Data Bases, VLDB '01, 2001, pp. 667-670.

[39] L. Degenaro, A. Iyengar, I. Lipkind, I. Rouvellou, A middleware system which intelligently caches query results, in: IFIP/ACM International Conference on Distributed Systems Platforms, Middleware '00, 2000, pp. 24-44.

[40] S. Ghandeharizadeh, J. Yap, Materialized views and key-value pairs in a cache augmented SQL system: Similarities and differences, Technical report, University of Southern California, Database Laboratory Technical Report (2012).

[41] M. D. Soufi, T. Samad-Soltani, S. S. Vahdati, P. Rezaei-Hachesu, Decision support system for triage management: A hybrid approach using rule-based reasoning and fuzzy logic, International Journal of Medical Informatics 114 (2018) 35-44. doi:10.1016/j.ijmedinf.2018.03.008.

[42] M. Stonebraker, The integration of rule systems and database systems, IEEE Trans. on Knowl. and Data Eng. 4 (5) (1992) 415-423. doi:10.1109/69.166984.
55
[43] V. S. Costa, R. Rocha, L. Damas, The yap prolog system, Theory Pract.1255
Log. Program. 12 (1-2) (2012) 5–34. doi:10.1017/S1471068411000512.
URL http://dx.doi.org/10.1017/S1471068411000512
[44] K. Sagonas, T. Swift, D. S. Warren, Xsb as an efficient deductive database
engine, in: Proceedings of the 1994 ACM SIGMOD International Confer-
ence on Management of Data, SIGMOD ’94, ACM, New York, NY, USA,1260
1994, pp. 442–453. doi:10.1145/191839.191927.
URL http://doi.acm.org/10.1145/191839.191927
[45] C. L. Forgy, Expert systems, IEEE Computer Society Press, Los Alamitos,
CA, USA, 1990, Ch. Rete: A Fast Algorithm for the Many Pattern/Many
Object Pattern Match Problem, pp. 324–341.1265
URL http://dl.acm.org/citation.cfm?id=115710.115736
[46] D. P. Miranker, TREAT: A New and Efficient Match Algorithm for AI
Production Systems, Morgan Kaufmann Publishers Inc., San Francisco,
CA, USA, 1990.
[47] Y. Qin, X. Tao, Y. Huang, J. L¨u, An index structure supporting rule1270
activation in pervasive applications, World Wide Web 22 (1) (2019) 1–
37. doi:10.1007/s11280-017-0517-2.
URL https://doi.org/10.1007/s11280-017-0517-2
[48] M. Stonebraker, A. Jhingran, J. Goh, S. Potamianos, On rules, procedure,
caching and views in data base systems, in: Proceedings of the 1990 ACM1275
SIGMOD International Conference on Management of Data, SIGMOD ’90,
ACM, New York, NY, USA, 1990, pp. 281–290. doi:10.1145/93597.
98737.
URL http://doi.acm.org/10.1145/93597.98737
[49] P. Dung, On the acceptability of arguments and its fundamental role in1280
nonmonotonic reasoning, logic programming and n-person games, Artif.
Intell. 77 (1995) 321–358.
56
[50] H. Prakken, G. Sartor, Law and logic: A review from an argumentation perspective, Artif. Intell. 227 (2015) 214-245.

[51] F. Cerutti, S. Gaggl, M. Thimm, J. Wallner, Foundations of implementations for formal argumentation, FLAP 4 (2017).

[52] S. Modgil, H. Prakken, The ASPIC+ framework for structured argumentation: a tutorial, Argument Comput. 5 (2014) 31-62.

[53] M. Snaith, C. Reed, TOAST: online ASPIC+ implementation, Vol. 245, 2012. doi:10.3233/978-1-61499-111-3-509.

[54] M. Thimm, Tweety: A comprehensive collection of Java libraries for logical aspects of artificial intelligence and knowledge representation, in: KR, 2014.

[55] T. Gordon, D. Walton, Formalizing balancing arguments, in: COMMA, 2016.

[56] S. Batsakis, G. Baryannis, G. Governatori, I. Tachmazidis, G. Antoniou, Legal representation and reasoning in practice: A critical comparison, in: JURIX, 2018.

[57] G. Brewka, S. Woltran, Abstract dialectical frameworks, in: KR, 2010.

[58] M. A. Leiva, G. I. Simari, S. Gottifredi, A. García, G. R. Simari, DAQAP: Defeasible argumentation query answering platform, in: FQAS, 2019.

[59] C. Cayrol, M. Lagasquie-Schiex, On the acceptability of arguments in bipolar argumentation frameworks, in: ECSQARU, 2005.

[60] C. I. Chesñevar, G. R. Simari, A. García, Pruning search space in defeasible argumentation, arXiv: Artificial Intelligence (2004).

[61] A. Cohen, S. Gottifredi, A. García, A heuristic pruning technique for dialectical trees on argumentation-based query-answering systems, in: FQAS, 2019.
[62] N. D. Rotstein, S. Gottifredi, A. García, G. R. Simari, A heuristics-based pruning technique for argumentation trees, in: SUM, 2011.

[63] B. Testerink, D. Odekerken, F. Bex, A method for efficient argument-based inquiry, in: FQAS, 2019.

[64] G. Antoniou, M. J. Maher, D. Billington, Defeasible logic versus logic programming without negation as failure, J. Log. Program. 42 (1) (2000) 47-57. doi:10.1016/S0743-1066(99)00060-6.

[65] G. Governatori, M. J. Maher, D. Billington, G. Antoniou, Argumentation semantics for defeasible logics, Journal of Logic and Computation 14 (5) (2004) 675-702. doi:10.1093/logcom/14.5.675.

[66] H.-P. Lam, G. Governatori, R. Riveret, On ASPIC+ and defeasible logic, in: P. Baroni, T. F. Gordon, T. Scheffler, M. Stede (Eds.), Proceedings of COMMA 2016, Vol. 287 of Frontiers in Artificial Intelligence and Applications, IOS Press, Amsterdam, 2016, pp. 359-370. doi:10.3233/978-1-61499-686-6-359.

[67] M. Caminada, L. Amgoud, On the evaluation of argumentation formalisms, Artificial Intelligence 171 (5) (2007) 286-310. doi:10.1016/j.artint.2007.02.003.
... The development of mobile hybrid system technology in this new framework system model can make transactions and medicine distribution supervision done online and real time [14]. The purpose of this research is to create a new framework system model by combining supply chain [15] and expert system [16] regarding medicine distribution using the rule-based reasoning method [17]. The rule-based reasoning method is very suitable to be used in this research because it can adopt the regulations and knowledge of pharmaceutical experts into a system in the form of an algorithm, even the rule-based reasoning method allows experts to be directly involved in research [18]. ...
... Likewise, the output in the system itself can be the input for the supply chain information system. Combined framework of supply chain and expert system using rule-based reasoning method can encourage better system work [17], [41]. The combined architecture in this research can be seen in Figure 5. ...
Article
Full-text available
The medicine distribution supply chain is important, especially during the COVID-19 pandemic, because delays in medicine distribution can increase the risk for patients. So far, the distribution of medicines has been carried out exclusively and even some medicines are distributed on a limited basis because they require strict supervision from the Medicine Supervisory Agency in each department. However, the distribution of this medicine has a weakness if at one public Health center there is a shortage of certain types of medicines, it cannot ask directly to other public Health center, thus allowing the availability of medicines not to be fulfilled. An integrated process is needed that can accommodate regulations and leadership policies and can be used for logistics management that will be used in medicine distribution. This study will create a new model by combining supply chains with information systems and expert systems using the rule-based reasoning method as an inference engine that can be developed for medicine distribution based on a mobile hybrid system in the Demak District Health Office, Indonesia. So that a new framework model based on a mobile hybrid system can facilitate the distribution of medicines effectively and efficiently.
... More recently, authors proposed a theoretical framework of legal rule-based system for the criminal domain, named CORBS (El Ghosh et al., 2017). The system is founded on a homogeneous integration of a criminal domain ontology with a set of logic rules (Liu et al., 2021). Thus, CORBS stands as a unified framework that supports efficient legal reasoning. ...
Article
Full-text available
Decisions made by legal adjudicators and administrative decision-makers often found upon a reservoir of stored experiences, from which is drawn a tacit body of expert knowledge. Such expertise may be implicit and opaque, even to the decision-makers themselves, and generates obstacles when implementing AI for automated decision-making tasks within the legal field, since, to the extent that AI-powered decision-making tools must found upon a stock of domain expertise, opacities may proliferate. This raises particular issues within the legal domain, which requires a high level of accountability, thus transparency. This requires enhanced explainability, which entails that a heterogeneous body of stakeholders understand the mechanism underlying the algorithm to the extent that an explanation can be furnished. However, the “black-box” nature of some AI variants, such as deep learning, remains unresolved, and many machine decisions therefore remain poorly understood. This survey paper, based upon a unique interdisciplinary collaboration between legal and AI experts, provides a review of the explainability spectrum, as informed by a systematic survey of relevant research papers, and categorises the results. The article establishes a novel taxonomy, linking the differing forms of legal inference at play within particular legal sub-domains to specific forms of algorithmic decision-making. The diverse categories demonstrate different dimensions in explainable AI (XAI) research. Thus, the survey departs from the preceding monolithic approach to legal reasoning and decision-making by incorporating heterogeneity in legal logics: a feature which requires elaboration, and should be accounted for when designing AI-driven decision-making systems for the legal field. It is thereby hoped that administrative decision-makers, court adjudicators, researchers, and practitioners can gain unique insights into explainability, and utilise the survey as the basis for further research within the field.
... More recently, authors proposed a theoretical framework of legal rule-based system for the criminal domain, named CORBS (El Ghosh et al., 2017). The system is founded on a homogeneous integration of a criminal domain ontology with a set of logic rules (Liu et al., 2021). Thus, CORBS stands as a unified framework that supports efficient legal reasoning. ...
Preprint
Full-text available
Decisions made by legal adjudicators and administrative decision-makers often found upon a reservoir of stored experiences, from which is drawn a tacit body of expert knowledge. Such expertise may be implicit and opaque, even to the decision-makers themselves, and generates obstacles when implementing AI for automated decision-making tasks within the legal field, since, to the extent that AI-powered decision-making tools must found upon a stock of domain expertise, opacities may proliferate. This raises particular issues within the legal domain, which requires a high level of accountability, thus transparency. This requires enhanced explainability, which entails that a heterogeneous body of stakeholders understand the mechanism underlying the algorithm to the extent thatanexplanationcanbefurnished.However,the’black-box’nature ofsomeAIvariants,suchas deep learning, remains unresolved, and many machine decisions therefore remain poorly understood. This survey paper, based upon a unique interdisciplinary collaboration between legal and AI experts, provides a review of the explainability spectrum, as informed by a systematic survey of relevant research papers, and categorises the results. The article establishes a novel taxonomy, linking the differing forms of legal inference at play within particular legal sub-domains to specific forms of algorithmic decision-making. The diverse categories demonstrate different dimensions in explainable AI (XAI) research. Thus, the survey departs from the preceding monolithic approach to legal reasoning and decision-making by incorporating heterogeneity in legal logics: a feature which requires elaboration, and should be accounted for when designing AI-driven decision-making systems for the legal field. It is thereby hoped that administrative decision-makers, court adjudicators, researchers, and practitioners can gain unique insights into explainability, and utilise the survey as the basis for further research within the field.
... Several works on knowledge graph reasoning do not involve input text for inferring new relation edges. These tasks have been solved using different approaches including logic rules [2,33,41,63], bayesian models [47,58,70], distributed representations [36,55,79], neural networks [38,45,60,74], and reinforcement learning [19,37,64]. Relation prediction, as a special class of knowledge graph reasoning, predicts missing relation edges between two entities [12,13,44,50,65]. ...
Article
Full-text available
Contextual Path Generation (CPG) refers to the task of generating knowledge path(s) between a pair of entities mentioned in an input textual context to determine the semantic connection between them. Such knowledge paths, also called contextual paths, can be very useful in many advanced information retrieval applications. Nevertheless, CPG involves several technical challenges, namely, sparse and noisy input context, missing relations in knowledge graphs, and generation of ill-formed and irrelevant knowledge paths. In this paper, we propose a transformer-based model architecture. In this approach, we leverage a mixture of pre-trained word and knowledge graph embeddings to encode the semantics of input context, a transformer decoder to perform path generation controlled by encoded input context and head entity to stay relevant to the context, and scaling methods to sample a well-formed path. We evaluate our proposed CPG models derived using the above architecture on two real datasets, both consisting of Wikinews articles as input context documents and ground truth contextual paths, as well as a large synthetic dataset to conduct larger-scale experiments. Our experiments show that our proposed models outperform the baseline models, and the scaling methods contribute to better quality contextual paths. We further analyze how CPG accuracy can be affected by different amount of context data, and missing relations in the knowledge graph. Finally, we demonstrate that an answer model for knowledge graph questions adapted for CPG could not perform well due to the lack of an effective path generation module.
... Due to its simplicity in codifying the knowledge of human experts, rule-based reasoning systems have been widely used in various knowledge-intensive expert systems. For example, a rule-based system has been used for legal reasoning (Liu et al., 2021), safety assessment (Tang et al., 2020), emergency management (Jain et al., 2021), and online communication (Akbar et al., 2014). Specifically, in the biodiversity research area, rulebased systems are also widely used, for example, for predicting the impact of land-use changes on biodiversity (Scolozzi & Geneletti, 2011), molecular biodiversity database management (Pannarale et al., 2012), or for generating linked biodiversity data (Akbar et al., 2020). ...
Article
Full-text available
Aim/Purpose Although the significance of data provenance has been recognized in a variety of sectors, there is currently no standardized technique or approach for gathering data provenance. The present automated technique mostly employs workflow-based strategies. Unfortunately, the majority of current information systems do not embrace the strategy, particularly biodiversity information systems in which data is acquired by a variety of persons using a wide range of equipment, tools, and protocols. Background This article presents an automated technique for producing temporal data provenance that is independent of biodiversity information systems. The approach is dependent on the changes in contextual information of data items. By mapping the modifications to a schema, a standardized representation of data provenance may be created. Consequently, temporal information may be automatically inferred. Methodology The research methodology consists of three main activities: database event detection, event-schema mapping, and temporal information inference. First, a list of events will be detected from databases. After that, the detected events will be mapped to an ontology, so a common representation of data provenance will be obtained. Based on the derived data provenance, rule-based reasoning will be automatically used to infer temporal information. Consequently, a temporal provenance will be produced. Contribution This paper provides a new method for generating data provenance automatically without interfering with the existing biodiversity information system. In addition to this, it does not mandate that any information system adheres to any particular form. Ontology and the rule-based system as the core components of the solution have been confirmed to be highly valuable in biodiversity sci�ence. Findings Detaching the solution from any biodiversity information system provides scalability in the implementation. Based on the evaluation of a typical biodiver�sity information system for species traits of plants, a high number of temporal information can be generated to the highest degree possible. Using rules to en�code different types of knowledge provides high flexibility to generate temporal information, enabling different temporal-based analyses and reasoning. Recommendations for Practitioners The strategy is based on the contextual information of data items, yet most in�formation systems simply save the most recent ones. As a result, in order for the solution to function properly, database snapshots must be stored on a fre�quent basis. Furthermore, a more practical technique for recording changes in contextual information would be preferable. Recommendations for Researchers The capability to uniformly represent events using a schema has paved the way for automatic inference of temporal information. Therefore, a richer represen�tation of temporal information should be investigated further. Also, this work demonstrates that rule-based inference provides flexibility to encode different types of knowledge from experts. Consequently, a variety of temporal-based data analyses and reasoning can be performed. Therefore, it will be better to in�vestigate multiple domain-oriented knowledge using the solution. Impact on Society Using a typical information system to store and manage biodiversity data has not prohibited us from generating data provenance. Since there is no restriction on the type of information system, our solution has a high potential to be widely adopted. 
Future Research: The data analysis of this work was limited to species traits data. However, there are other types of biodiversity data, including genetic composition, species population, and community composition. In the future, this work will be expanded to cover all those types of biodiversity data. The ultimate goal is to have a standard methodology or strategy for collecting provenance from any biodiversity data regardless of how the data was stored or managed.
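As a rough illustration of the three-stage pipeline this abstract describes (event detection, event-schema mapping, temporal inference), the following minimal Python sketch diffs two database snapshots and applies one temporal rule. All table, event, and rule names are invented for illustration and are not the paper's actual ontology or rule base.

```python
# Minimal sketch of the three-stage pipeline described above:
# (1) detect events from database snapshots, (2) map them to a common
# schema, (3) infer temporal information with a simple rule.
# All names (tables, event types, rules) are illustrative placeholders.

from datetime import date

# (1) Event detection: diff two snapshots of the same table.
def detect_events(old_rows, new_rows, key="id"):
    old = {r[key]: r for r in old_rows}
    events = []
    for row in new_rows:
        if row[key] not in old:
            events.append({"type": "INSERT", "row": row})
        elif row != old[row[key]]:
            events.append({"type": "UPDATE", "row": row, "before": old[row[key]]})
    return events

# (2) Event-schema mapping: normalise events to provenance records.
def to_provenance(event, observed_on):
    return {"activity": event["type"].lower(),
            "entity": event["row"]["id"],
            "time": observed_on}

# (3) Rule-based temporal inference: a hypothetical rule saying that an
# update observed at time t implies the previous value was valid until t.
def infer_valid_until(provenance_records):
    inferred = []
    for rec in provenance_records:
        if rec["activity"] == "update":
            inferred.append({"entity": rec["entity"],
                             "valid_until": rec["time"]})
    return inferred

snap_v1 = [{"id": 1, "trait": "leaf_area", "value": 12.0}]
snap_v2 = [{"id": 1, "trait": "leaf_area", "value": 14.5}]
events = detect_events(snap_v1, snap_v2)
prov = [to_provenance(e, date(2021, 4, 29)) for e in events]
print(infer_valid_until(prov))   # [{'entity': 1, 'valid_until': ...}]
```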
... Still, there exist only comparably few systems that, in fact, automate reasoning processes based on normative knowledge. Notable examples are provided by Liu et al., who interpret legal norms in a defeasible deontic logic and provide automation for it [26], and by the SPINdle prover [24] for propositional (modal) defeasible reasoning, which has been used in multiple works in the normative application domain. ...
Preprint
LegalRuleML is a comprehensive XML-based representation framework for modeling and exchanging normative rules. The TPTP input and output formats, on the other hand, are general-purpose standards for the interaction with automated reasoning systems. In this paper we provide a bridge between the two communities by (i) defining a logic-pluralistic normative reasoning language based on the TPTP format, (ii) providing a translation scheme between relevant fragments of LegalRuleML and this language, and (iii) proposing a flexible architecture for automated normative reasoning based on this translation. We exemplarily instantiate and demonstrate the approach with three different normative logics.
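To make the translation idea concrete, here is a minimal sketch that renders a simplified (not schema-valid) LegalRuleML-style rule as a TPTP-flavoured formula. The XML layout, the `obl(...)` encoding of obligation, and the trailing annotation comment are assumptions for illustration, not the paper's actual translation scheme.

```python
# Illustrative sketch only: render a simplified, LegalRuleML-inspired
# rule as a TPTP-flavoured formula. The XML shape and the deontic
# encoding below are assumptions, not the paper's translation scheme.

import xml.etree.ElementTree as ET

RULE_XML = """
<rule key="r1" strength="defeasible">
  <if>invoice_received</if>
  <then deontic="obligation">pay_within_30_days</then>
</rule>
"""

def to_tptp(xml_text):
    rule = ET.fromstring(xml_text)
    body = rule.find("if").text
    head = rule.find("then")
    # Encode the obligation with a modal-style wrapper predicate.
    conclusion = (f"obl({head.text})"
                  if head.get("deontic") == "obligation" else head.text)
    return (f"fof({rule.get('key')}, axiom, "
            f"{body} => {conclusion}).  % {rule.get('strength')} rule")

print(to_tptp(RULE_XML))
# fof(r1, axiom, invoice_received => obl(pay_within_30_days)).  % defeasible rule
```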
... Due to their simplicity in codifying the knowledge of human experts, rule-based reasoning systems have been widely used in various knowledge-intensive expert systems. For example, rule-based systems have been used for legal reasoning (Liu et al., 2021), safety assessment (Tang et al., 2020), emergency management (Jain et al., 2021), and online communication (Akbar et al., 2014). Specifically, in the biodiversity research area, rule-based systems are also widely used, for example, for predicting the impact of land-use changes on biodiversity (Scolozzi & Geneletti, 2011), molecular biodiversity database management (Pannarale et al., 2012), or for generating linked biodiversity data (Akbar et al., 2020). ...
Article
Aim/Purpose: Although the significance of data provenance has been recognized in a variety of sectors, there is currently no standardized technique or approach for gathering data provenance. The present automated techniques mostly employ workflow-based strategies. Unfortunately, the majority of current information systems do not embrace that strategy, particularly biodiversity information systems, in which data is acquired by a variety of persons using a wide range of equipment, tools, and protocols.
Background: This article presents an automated technique for producing temporal data provenance that is independent of biodiversity information systems. The approach depends on changes in the contextual information of data items. By mapping the modifications to a schema, a standardized representation of data provenance may be created. Consequently, temporal information may be automatically inferred.
Methodology: The research methodology consists of three main activities: database event detection, event-schema mapping, and temporal information inference. First, a list of events is detected from databases. The detected events are then mapped to an ontology, so that a common representation of data provenance is obtained. Based on the derived data provenance, rule-based reasoning is used to automatically infer temporal information. Consequently, a temporal provenance is produced.
Contribution: This paper provides a new method for generating data provenance automatically without interfering with the existing biodiversity information system. In addition, it does not mandate that any information system adhere to any particular form. Ontology and the rule-based system, as the core components of the solution, have been confirmed to be highly valuable in biodiversity science.
Findings: Detaching the solution from any biodiversity information system provides scalability in the implementation. Based on the evaluation of a typical biodiversity information system for species traits of plants, a large amount of temporal information can be generated to the highest degree possible. Using rules to encode different types of knowledge provides high flexibility to generate temporal information, enabling different temporal-based analyses and reasoning.
Recommendations for Practitioners: The strategy is based on the contextual information of data items, yet most information systems simply save the most recent values. As a result, for the solution to function properly, database snapshots must be stored on a frequent basis. Furthermore, a more practical technique for recording changes in contextual information would be preferable.
Recommendations for Researchers: The capability to uniformly represent events using a schema has paved the way for automatic inference of temporal information. Therefore, a richer representation of temporal information should be investigated further. This work also demonstrates that rule-based inference provides the flexibility to encode different types of knowledge from experts, so that a variety of temporal-based data analyses and reasoning can be performed. It would therefore be worthwhile to investigate multiple domain-oriented kinds of knowledge using the solution.
Impact on Society: Using a typical information system to store and manage biodiversity data has not prevented us from generating data provenance. Since there is no restriction on the type of information system, our solution has a high potential to be widely adopted.
Future Research : The data analysis of this work was limited to species traits data. However, there are other types of biodiversity data, including genetic composition, species population, and community composition. In the future, this work will be expanded to cover all those types of biodiversity data. The ultimate goal is to have a standard methodology or strategy for collecting provenance from any biodiversity data regardless of how the data was stored or managed. Keywords: temporal data provenance, biodiversity, ontology, rule-based reasoning
Article
Full-text available
This study aims to build a system for evaluating the level of legal understanding using a fuzzy logic algorithm implemented in Python. The system uses an input component to receive data on legal-understanding groups and an output component to produce a legal-understanding score. The fuzzy logic algorithm is used to process respondents' answer data by data group, and system tests show that the system is accurate. These findings confirm the system's ability to provide an accurate baseline assessment of a person's level of legal understanding. The implications of this system show its potential as an effective evaluation tool for measuring legal understanding across samples of groups at different levels of understanding, particularly in the legal field. By taking participant diversity into account, this research can provide a strong foundation for the development of similar systems in a variety of future evaluation contexts.
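A minimal sketch of the fuzzy scoring idea described in this abstract: the membership functions, cut-off values, and defuzzification step below are invented for illustration and are not the study's actual rule base.

```python
# Minimal sketch of fuzzy scoring for legal-understanding evaluation.
# Membership functions and bounds are invented for illustration.

def triangular(x, a, b, c):
    """Triangular membership function on [a, c] with peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def fuzzy_level(score):  # score: 0-100 from a questionnaire
    memberships = {
        "low":    triangular(score, -1, 0, 50),
        "medium": triangular(score, 25, 50, 75),
        "high":   triangular(score, 50, 100, 101),
    }
    # Defuzzify by picking the label with maximal membership.
    return max(memberships, key=memberships.get), memberships

label, degrees = fuzzy_level(62.0)
print(label, degrees)  # medium {'low': 0.0, 'medium': 0.52, 'high': 0.24}
```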
Chapter
LegalRuleML is a comprehensive XML-based representation framework for modeling and exchanging normative rules. The TPTP input and output formats, on the other hand, are general-purpose standards for the interaction with automated reasoning systems. In this paper we provide a bridge between the two communities by (i) defining a logic-pluralistic normative reasoning language based on the TPTP format, (ii) providing a translation scheme between relevant fragments of LegalRuleML and this language, and (iii) proposing a flexible architecture for automated normative reasoning based on this translation. We exemplarily instantiate and demonstrate the approach with three different normative logics. Keywords: Automated reasoning, LegalRuleML, Deontic logics
Conference Paper
Full-text available
Representation and reasoning over legal rules is an important application domain, and a number of related approaches have been developed. In this work, we investigate legal reasoning in practice based on three use cases of increasing complexity. We consider three representation and reasoning approaches: (a) Answer Set Programming, (b) Argumentation and (c) Defeasible Logic. The representation and reasoning approaches are evaluated with respect to semantics, expressiveness, efficiency, complexity and support.
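The kind of defeasibility all three formalisms must capture can be shown with a toy example: a general prohibition overridden by a more specific permission. The naive priority-based evaluator below is an illustration only, not any of the systems evaluated in that work.

```python
# Toy illustration of the defeasibility the compared approaches must
# capture: a general rule overridden by a more specific, higher-priority
# rule. Not ASP, argumentation, or an actual defeasible-logic prover.

facts = {"vehicle_in_park"}
rules = [
    # (name, premises, conclusion, priority) - higher priority wins
    ("r1", {"vehicle_in_park"}, "prohibited", 1),
    ("r2", {"vehicle_in_park", "is_ambulance"}, "permitted", 2),
]

def conclude(facts, rules):
    applicable = [r for r in rules if r[1] <= facts]  # premises hold
    if not applicable:
        return None
    return max(applicable, key=lambda r: r[3])[2]     # strongest rule

print(conclude(facts, rules))                      # prohibited
print(conclude(facts | {"is_ambulance"}, rules))   # permitted
```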
Article
Full-text available
Decision-makers in governments, enterprises, businesses and agencies, as well as individuals, typically make decisions according to various regulations, guidelines and policies, based on existing records stored in various databases, in particular relational databases. To assist decision-makers, an expert system encompasses interactive computer-based systems or subsystems that support the decision-making process. Typically, most expert systems are built on top of transaction systems, databases, and data models; their support for decision-making is restricted to analysing, processing and presenting data and information, and they do not provide support for the normative layer. This paper provides a solution to one specific problem that arises from this situation, namely the lack of a tool/mechanism to demonstrate how an expert system is well-suited for supporting decision-making activities drawn from existing records and the relevant legal requirements aligned with those records. We present a Rule-based (pre and post) reporting system (RuleRS) architecture, which is intended to integrate databases, in particular relational databases, with a logic-based reasoner and rule engine to assist in decision-making or create reports according to legal norms. We argue that the resulting RuleRS provides an efficient and flexible solution to the problem at hand using defeasible inference. To this end, we have also conducted empirical evaluations of RuleRS performance.
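The integration pattern described here, relational data feeding a normative rule layer, can be sketched as follows. The table, the rule, and the report format are invented examples, not RuleRS's actual schema or rule language.

```python
# Sketch of the integration pattern: fetch case facts from a relational
# database and feed them to a (here: trivial) rule layer that checks a
# normative requirement. Table and rule are invented examples.

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payments (invoice_id INT, days_late INT)")
conn.executemany("INSERT INTO payments VALUES (?, ?)",
                 [(1, 0), (2, 45), (3, 12)])

# Normative rule (illustrative): a payment more than 30 days late
# triggers an obligation to pay a penalty.
def report(conn):
    rows = conn.execute("SELECT invoice_id, days_late FROM payments")
    for invoice_id, days_late in rows:
        if days_late > 30:
            yield f"invoice {invoice_id}: obligation(pay_penalty)"

print(list(report(conn)))  # ['invoice 2: obligation(pay_penalty)']
```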
Article
Full-text available
Rule mechanisms have been widely used in many areas, such as databases, artificial intelligence and pervasive computing. In a rule mechanism, rule activation decides which rules are activated, when the rules are activated, and which tuples can be generated through the activation. Rule activation determines the efficiency of the rule mechanism. In this article, we define the semantic constraints of a rule, the constant constraint and the variable constraint, according to the semantics of Datalog rules. Based on these constraints, we propose an index structure, named Yield index, to support rule activation effectively. Yield index consists of a data index and a semantic index, and records the complete information of a rule, including the matching relationships among the tuples of the different relations in the rule body. The index integrates tuple insertion and rule activation to directly determine whether matching tuples for a newly inserted tuple exist. Due to this characteristic, we perform only effective rule activation, avoiding ineffective rule activations that cannot generate new tuples, so that the efficiency of rule activation is improved. The article describes the structure of Yield index, its construction and maintenance algorithms, and the rule activation algorithm based on Yield index. The experimental results show that Yield index has better performance, improving activation efficiency by one order of magnitude compared with other index structures. In addition, we also discuss possible extensions of Yield index to other applications.
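A much-reduced sketch of the constant-constraint idea: index rules by the constants their bodies require, so that a newly inserted tuple activates only the rules it can possibly match. This illustrates the filtering principle only and is not the Yield index structure itself.

```python
# Simplified sketch of constant-constraint filtering for rule
# activation. Rules, relations, and attributes are invented examples.

from collections import defaultdict

# Rules with a single-atom body: (rule_name, relation, {attr: constant}).
rules = [
    ("r1", "order", {"status": "paid"}),
    ("r2", "order", {"status": "cancelled"}),
    ("r3", "customer", {}),          # no constant constraint
]

index = defaultdict(list)            # relation -> rules over it
for rule in rules:
    index[rule[1]].append(rule)

def candidate_rules(relation, tuple_):
    """Return rules whose constant constraints the new tuple satisfies."""
    return [name for name, _, consts in index[relation]
            if all(tuple_.get(k) == v for k, v in consts.items())]

print(candidate_rules("order", {"id": 7, "status": "paid"}))   # ['r1']
print(candidate_rules("order", {"id": 8, "status": "open"}))   # []
```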
Article
Full-text available
A data warehouse (DW) can be defined as a set of data cubes defined over the source relations. To avoid complex query evaluation against the master table and to increase the speed of queries posed to a data warehouse, we can store snapshot results of query processing in the data warehouse; these are called materialized views. Appropriate materialized view selection is one of the most crucial decisions in designing a data warehouse for high efficiency, as well as a basic requirement of a successful business application. Materialized views are extremely useful for quick query processing. In this paper, we first focus on various techniques, past and recent, that have been implemented for the selection of materialized views. Second, the most critical issues related to maintaining materialized views and effective query maintenance strategies are discussed, along with a comparison of all the discussed systems.
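The trade-off this survey discusses can be illustrated with a toy materialized SUM view that is maintained incrementally instead of being recomputed from the base table on every change; the schema is invented for illustration.

```python
# Toy illustration: a materialized view answers the query instantly but
# must be refreshed when base data changes. Incremental maintenance is
# shown for a SUM(amount) GROUP BY category view over invented data.

base_table = [("books", 120.0), ("books", 80.0), ("toys", 60.0)]

# Materialize the aggregate once, ...
view = {}
for category, amount in base_table:
    view[category] = view.get(category, 0.0) + amount

# ... then maintain it incrementally on insert instead of recomputing.
def insert(row):
    base_table.append(row)
    category, amount = row
    view[category] = view.get(category, 0.0) + amount

insert(("books", 50.0))
print(view["books"])   # 250.0, answered from the view, no base scan
```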
Chapter
Arguments in argumentation-based query-answering systems can be associated with a set of evidence required for their construction. This evidence might have to be retrieved from external sources such as databases or the web, and each attempt to retrieve a piece of evidence comes with an associated cost. Moreover, a piece of evidence may be available at one moment but not at others, and this is not known beforehand. As a result, the set of active arguments (those whose entire set of evidence is available) that can be used by the argumentation machinery of the system may vary from one scenario to another. In this work we propose a heuristic pruning technique for building dialectical trees in argumentation-based query-answering systems, with the aim of minimizing the cost of retrieving the pieces of evidence associated with the arguments that need to be accounted for in the reasoning process.
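A simplified sketch of the pruning idea: since each evidence lookup has a cost, stop probing counter-arguments as soon as the query's status is settled. The argument structures and costs below are invented and far simpler than dialectical trees.

```python
# Simplified sketch of cost-aware pruning: evidence retrieval has a
# cost, so stop once the answer can no longer change. Arguments and
# costs are invented for illustration.

def evidence_available(piece, world):
    return piece in world             # stands in for a costly lookup

def query(claim, counterargs, world):
    """Accept claim unless some counter-argument's evidence is active."""
    cost = 0
    for arg in counterargs:           # cheapest evidence probed first
        cost += arg["cost"]
        if all(evidence_available(e, world) for e in arg["evidence"]):
            return False, cost        # defeated: remaining args pruned
    return True, cost

counterargs = sorted([
    {"evidence": ["witness_b"], "cost": 5},
    {"evidence": ["dna_report"], "cost": 20},
], key=lambda a: a["cost"])

# The first active counter-argument settles the query; the costlier
# lookup for dna_report is never paid.
print(query("innocent", counterargs, world={"witness_b"}))   # (False, 5)
```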
Chapter
In this paper we describe a method for efficient argument-based inquiry. In this method, an agent creates arguments for and against a particular topic by matching argumentation rules with observations gathered by querying the environment. To avoid making superfluous queries, the agent needs to determine if the acceptability status of the topic can change given more information. We define a notion of stability, where a structured argumentation setup is stable if no new arguments can be added, or if adding new arguments will not change the status of the topic. Because determining stability requires hypothesizing over all future argumentation setups, which is computationally very expensive, we define a less complex approximation algorithm and show that this is a sound approximation of stability. Finally, we show how stability (or our approximation of it) can be used in determining an optimal inquiry policy, and discuss how this policy can be used to, for example, determine a strategy in an argument-based inquiry dialogue.
Chapter
In this paper we present the DAQAP, a Web platform for Defeasible Argumentation Query Answering, which offers a visual interface that facilitates the analysis of the argumentative process defined in the Defeasible Logic Programming (DeLP) formalism. The tool presents graphs that show the interaction of the arguments generated from a DeLP program; this is done in two different ways: the first focuses on the structures obtained from the DeLP program, while the second presents the defeat relationships from the point of view of abstract argumentation frameworks, with the possibility of calculating the extensions using Dung’s semantics. Using all this data, the platform provides support for answering queries regarding the states of literals of the input program.
Article
Objectives: Fast and accurate patient triage is a critical first step of the response process in emergency situations. Triage is often performed using a paper-based mode, which intensifies workload and difficulty, wastes time, and is prone to human error. This study aims to design and evaluate a decision support system (DSS) to determine the triage level. Methods: A combination of the Rule-Based Reasoning (RBR) and Fuzzy Logic Classifier (FLC) approaches was used to predict the triage level of patients according to the triage specialist's opinions and Emergency Severity Index (ESI) guidelines. RBR was applied to model the first to fourth decision points of the ESI algorithm. Data relating to vital signs were used as input variables and modeled using fuzzy logic. Narrative knowledge was converted to if-then rules using XML. The extracted rules were then used to create the rule-based engine and predict the triage levels. Results: Fourteen RBR and 27 fuzzy rules were extracted and used in the rule-based engine. The performance of the system was evaluated using three methods with real triage data. The accuracy of the clinical decision support system (CDSS) on the test data was 99.44%. The evaluation of the error rate revealed that, when using the traditional method, 13.4% of the patients were mis-triaged, which is statistically significant. The completeness of the documentation also improved from 76.72% to 98.5%. Conclusions: The designed system was effective in determining the triage level of patients, and it proved helpful for nurses as they made decisions and generated nursing diagnoses based on triage guidelines. The hybrid approach can reduce triage misdiagnosis in a highly accurate manner and improve triage outcomes.
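A minimal sketch of the hybrid RBR/FLC idea: fuzzify one vital sign and combine it with crisp ESI-style decision rules. The membership bounds and rules are invented for illustration and are not the study's validated rule base.

```python
# Sketch of the hybrid idea: fuzzify a vital sign, then apply crisp
# if-then triage rules. Bounds and rules are invented for illustration
# and are not the study's validated ESI rule base.

def heart_rate_high(hr):
    """Degree to which a heart rate counts as 'high' (ramp 100-130)."""
    return min(1.0, max(0.0, (hr - 100) / 30))

def triage_level(hr, responsive):
    high = heart_rate_high(hr)        # FLC part: fuzzy membership
    if not responsive:                # RBR part: ESI-style decision point
        return 1                      # immediate, life-threatening
    if high >= 0.8:
        return 2
    return 3 if high > 0.2 else 4

print(triage_level(hr=140, responsive=True))   # 2
print(triage_level(hr=95, responsive=True))    # 4
```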