Effective and efficient forgetting of learned knowledge
in Soar’s working and procedural memories
Action editor: David Peebles
Nate Derbinsky, John E. Laird
University of Michigan, 2260 Hayward Street, Ann Arbor, MI 48109-2121, USA
Available online 5 January 2013
Abstract
Effective management of learned knowledge is a challenge when modeling human-level behavior within complex, temporally extended
tasks. This work evaluates one approach to this problem: forgetting knowledge that is not in active use (as determined by base-level acti-
vation) and can likely be reconstructed if it becomes relevant. We apply this model to the working and procedural memories of Soar.
When evaluated in simulated, robotic exploration and a competitive, multi-player game, these policies improve model reactivity and scal-
ing while maintaining reasoning competence. To support these policies for real-time modeling, we also present and evaluate a novel algo-
rithm to efficiently forget items from large memory stores while preserving base-level fidelity.
© 2013 Elsevier B.V. All rights reserved.
Keywords: Large-scale cognitive modeling; Working memory; Procedural memory; Cognitive architecture; Soar
1. Introduction
Typical cognitive models persist for short periods of
time (seconds to a few minutes) and have modest learning
requirements. For these models, current cognitive architec-
tures, such as Soar (Laird, 2012) and ACT-R (Anderson
et al., 2004), executing on commodity computer systems,
are sufficient. However, prior work (e.g. Kennedy & Traf-
ton, 2007) has shown that cognitive models of complex,
protracted tasks can accumulate large amounts of knowl-
edge, and that the computational performance of existing
architectures degrades as a result.
This issue, where more knowledge can harm problem-
solving performance, has been dubbed the utility problem,
and has been studied in many contexts, such as explana-
tion-based learning (Minton, 1990; Tambe, Newell, &
Rosenbloom, 1990), case-based reasoning (Smyth &
Keane, 1995; Smyth & Cunningham, 1996), and language
learning (Daelemans, van den Bosch, & Zavrel, 1999).
Markovitch and Scott (1988) have characterized strategies
for dealing with the utility problem in terms of information
filters applied at different stages in the problem-solving
process. One common strategy that is relevant to cognitive
modeling is selective retention, or forgetting, of learned
knowledge. The benefit of this approach, as opposed to
selective utilization, is that the agent does not have to
expend computational resources at run time to decide
whether to utilize knowledge or not, a property that may
be crucial for real-time modeling in temporally extended,
complex tasks. However, it can be challenging to devise
forgetting policies that work well across a variety of prob-
lem domains, effectively balancing the task performance of
models with reductions in retrieval time and storage
requirements of learned knowledge.
In the context of this challenge, we present two tasks where
effective behavior requires that the model accumulate large
amounts of information from the environment, and where
over time this learned knowledge overwhelms reasonable
computational limits. In response, we present and evaluate
novel policies to forget learned knowledge in the working
and procedural memories of Soar. These policies investi-
gate a common hypothesis: it is rational for the architec-
ture to forget a unit of knowledge when there is a high
degree of certainty that it is not of use, as calculated by
base-level activation (Anderson et al., 2004), and that it
can be reconstructed in the future if it becomes relevant.
We demonstrate that these task-independent policies
improve model reactivity and scaling, while maintaining
problem-solving competence. To support these policies
for real-time modeling, we also present and evaluate a
novel algorithm to efficiently forget items from large mem-
ory stores while preserving fidelity of base-level activation.
2. Related work
Previous cognitive-modeling research has investigated
forgetting in order to account for human behavior and
experimental data. As a prominent example, memory decay
has long been a core commitment of the ACT-R theory
(Anderson et al., 2004), as it has been shown to account
for a class of memory retrieval errors (Anderson, Reder,
& Lebiere, 1996). Similarly, research in Soar investigated
task-performance effects of forgetting short-term (Chong,
2003) and procedural (Chong, 2004) knowledge. By con-
trast, the motivation for this work is to discover the degree
to which forgetting can support long-term, real-time mod-
eling in complex tasks.
Prior work has demonstrated that there are potential
cognitive benefits to using memory decay, such as in
task-switching (Altmann & Gray, 2002) and heuristic
inference (Schooler & Hertwig, 2005). In this paper, we
focus on improving reactivity and scaling.
We extend prior investigations of long-term symbolic
learning in Soar (Kennedy & Trafton, 2007), where the
source of learning was internal problem solving. In this
paper, the evaluation domains accumulate information
from interaction with an external environment.
Prior work has addressed many of the computational
challenges associated with retrieving a single memory
according to the base-level activation (BLA) model (Pet-
rov, 2006; Derbinsky, Laird, & Smith, 2010; Derbinsky &
Laird, 2011). However, efficiently removing items from
memory, while preserving BLA fidelity, presents a different
challenge. As such, before presenting the empirical evalua-
tion domains, we formally describe this computational
problem; present a novel algorithm to forget according to
BLA in large memories; and evaluate our approach with
synthetic data.
3. The Soar cognitive architecture
Soar is a cognitive architecture that has been used for
developing intelligent agents and modeling human cogni-
tion. Historically, one of Soar’s main strengths has been
its ability to efficiently represent and bring to bear large
amounts of symbolic knowledge to solve diverse problems
using a variety of methods (Laird, 2012).
Fig. 1. The Soar cognitive architecture (Laird, 2012).
Fig. 1 shows the structure of Soar. At the center is a
symbolic working memory that represents the agent’s cur-
rent state. It is here that perception, goals, retrievals from
long-term memory, external action directives, and struc-
tures from intermediate reasoning are jointly represented
as a connected, directed graph. The primitive representa-
tional unit of knowledge in working memory is a symbolic
triple (identifier, attribute, value), termed a working-memory element, or WME. The first symbol of a WME (identifier) must be an existing node in the graph, whereas the second (attribute) and third (value) symbols may be either terminal constants or non-terminal graph nodes. Multiple WMEs that share the same identifier are termed an object, and each individual WME sharing that identifier is termed an augmentation of that object.
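To illustrate this representation, here is a minimal Python sketch of WMEs as triples; the identifiers and attributes are invented for illustration and are not from the original text:

```python
# Working-memory elements as (identifier, attribute, value) triples.
wmes = [
    ("R1", "type", "room"),    # value is a terminal constant
    ("R1", "id", 12),
    ("R1", "neighbor", "R2"),  # value is another (non-terminal) graph node
    ("R2", "type", "room"),
]

# All WMEs sharing identifier "R1" form one object; each is an augmentation.
obj_r1 = [w for w in wmes if w[0] == "R1"]
```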
Procedural memory stores the agent’s knowledge of
when and how to perform actions, both internal, such as
querying long-term declarative memories, and external,
such as controlling robotic actuators. Knowledge in this
memory is represented as if-then rules. The conditions of
rules test patterns in working memory and the actions of
rules add and/or remove working-memory elements. Soar
makes use of the Rete algorithm for efficient rule matching
(Forgy, 1982) and retrieval time scales to large stores of
procedural knowledge (Doorenbos, 1995). However, in
the worst case, the Rete algorithm scales linearly with the
number of elements in working memory, a computational
issue that motivates maintaining a relatively small working
memory.
Soar learns procedural knowledge via chunking (Laird,
Rosenbloom, & Newell, 1986) and reinforcement-learning
(RL; Nason & Laird, 2005) mechanisms. Chunking creates
new rules: it converts deliberate subgoal processing into
reactive rules by compiling over rule-firing traces, a form
of explanation-based learning (EBL). If subgoal processing
does not interact with the environment, the chunked rule is
redundant with existing knowledge and serves to improve
performance by reducing deliberate processing. However,
memory usage in Soar scales linearly with the number of
rules, typically at a rate of 1–5 KB/rule, which provides a
motivation for forgetting under-utilized rules.
Reinforcement learning incrementally tunes existing rule
actions: it updates the expectation of action utility, with
respect to a subset of state (represented in rule conditions)
and an environmental and/or intrinsic reward signal. A
rule that can be updated by the RL mechanism (termed
an RL rule) must satisfy a few simple criteria related to
its actions, and is thus distinguishable from other rules.
This distinction is relevant to forgetting rules. When an
RL rule that was learned via chunking is updated, that rule
is no longer redundant with the knowledge that led to its
creation, as it now incorporates information from environ-
mental interaction that was not captured in the original
subgoal processing.
Soar incorporates two long-term declarative memories,
semantic and episodic (Derbinsky & Laird, 2010). Seman-
tic memory stores working-memory objects, independent
of overall working-memory connectivity (Derbinsky
et al., 2010), and episodic memory incrementally encodes
and temporally indexes snapshots of working memory,
resulting in an autobiographical history of agent experience
(Derbinsky, Li, & Laird, 2012; Nuxoll & Laird, 2012).
Agents retrieve knowledge from one of these memory sys-
tems by constructing a symbolic cue in working memory;
the intended memory system then interprets the cue,
searches its store for the best matching memory, and if it
finds a match, reconstructs the associated knowledge in
working memory. For episodic memory, the time to recon-
struct knowledge depends on the size of working memory
at the time of encoding, another motivation for a concise
agent state (Derbinsky & Laird, 2009).
Agent reasoning in Soar consists of a sequence of deci-
sions, where the aim of each decision is to select and apply
an operator in service of the agent’s goal(s). The primitive
decision cycle consists of the following phases: encode per-
ceptual input; fire rules to elaborate agent state, as well as
propose and evaluate operators; select an operator; fire
rules that apply the operator; and then process output
directives and retrievals from long-term memory. Unlike
ACT-R, multiple rules may fire in parallel during a single
phase. The time to execute the decision cycle, which pri-
marily depends on the speed with which the architecture
can match rules and retrieve knowledge from episodic
and semantic memories, determines agent reactivity. We
have found that 50 ms is an acceptable upper bound on this
response time across numerous domains, including robot-
ics, video games, and human–computer interaction (HCI)
tasks.
There are two types of persistence for working-memory
elements added as the result of rule firing. Rules that fire to
apply a selected operator create operator-supported struc-
tures. These WMEs will persist in working memory until
deliberately removed. In contrast, rules that do not test a
selected operator create instantiation-supported structures,
which persist only as long as the rules that created them
match. This distinction is relevant to forgetting WMEs.
As evident in Fig. 1, Soar has additional memories and
processing modules; however, they are not pertinent to this
paper and are not discussed further.
4. Efficient forgetting via base-level activation
In later sections, we present and evaluate forgetting pol-
icies in the working and procedural memories of Soar.
Both of these policies use base-level activation (BLA;
Anderson et al., 2004) as a heuristic for identifying memo-
ries that may not be useful to the agent. In this section, we
formally describe the computational problem of forgetting
according to the BLA model; present a novel approach that
scales efficiently in large memories; and evaluate our
approach using synthetic data.
4.1. Problem formulation
Let memory $M$ be a set of elements, $\{m_1, m_2, \ldots\}$. Let each element $m_i$ be defined as a set of pairs $(a_{ij}, k_{ij})$, where $k_{ij}$ refers to the number of times element $m_i$ was activated at time $a_{ij}$. We assume $|m_i| \le c$: the number of activation events for any element is bounded. These assumptions are consistent with the ACT-R declarative memory when bounding chunk-history size (Petrov, 2006). This is also consistent with the semantic memory in Soar (Laird, 2012).
We assume that activation of an element $m$ at time $t$ is computed according to the BLA model (Anderson et al., 2004), where $d$ is a fixed decay parameter:

$$B(m, t, d) = \ln\left( \sum_{j=1}^{|m|} k_j \, [t - a_j]^{-d} \right)$$
We define an element as decayed, with respect to a threshold parameter $\Theta$, if $B(m, t, d) < \Theta$. Given a static element $m$, we define $L$ as the fewest number of time steps required for the element to decay, relative to time step $t$:

$$L(m, t, d, \Theta) := \inf\{\, t_d \in \mathbb{N} : B(m, t + t_d, d) < \Theta \,\}$$

For example, element $x = \{(3, 1), (5, 2)\}$ was activated once at time step three and twice at time step five. Assuming decay rate 0.5 and threshold $-2$, $x$ has activation of about 0.649 at time step 7 and is not decayed: $L(x, 7, 0.5, -2) = 489$.
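These definitions translate directly into code; the following minimal Python sketch (our naming, not Soar's) reproduces the worked example above:

```python
import math

def bla(history, t, d):
    """B(m, t, d) = ln(sum_j k_j * (t - a_j)^(-d)) over pairs (a_j, k_j)."""
    return math.log(sum(k * (t - a) ** (-d) for (a, k) in history))

def decay_steps(history, t, d, theta):
    """L(m, t, d, theta): fewest future steps until activation < theta
    (brute force; Section 4.2 replaces this scan with prediction)."""
    td = 1
    while bla(history, t + td, d) >= theta:
        td += 1
    return td

x = [(3, 1), (5, 2)]               # once at step 3, twice at step 5
print(round(bla(x, 7, 0.5), 3))    # 0.649, as in the text
print(decay_steps(x, 7, 0.5, -2))  # 489
```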
During a model time step $t$, the following actions can occur with respect to memory $M$:

S1. A new element is added to $M$.
S2. An existing element is removed from $M$.
S3. An existing element is activated $y$ times.

If S3 occurs with respect to element $m_i$, a new pair $(t, y)$ is added to $m_i$. To maintain a bounded history size, if $|m_i| > c$, the pair with the smallest $a$ (i.e. the oldest) is removed from $m_i$.
Thus, given a memory $M$, we define the forgetting problem, at each time step $t$, as identifying the subset of elements, $D \subseteq M$, that have decayed since the last time step.
4.2. Efficient approach
Given this problem definition, a naïve approach is to determine the decay status of each element at every time step. This test requires computation $O(|M|)$, scaling linearly with average memory size. The computation expended upon each element, $m_i$, will be linear in the number of time steps where $m_i \in M$, estimated as $O(L)$ for a static element.
Our approach draws inspiration from the work of Nuxoll, Laird, and James (2004): rather than checking memory elements for decay status, predict the future time step when the element will decay. First, at each time step, examine elements that either (S1) were not previously in the memory or (S3) were activated. The number of items requiring inspection is bounded by the total number of elements ($|M|$), but is likely to be a small subset, assuming few memory elements are created or tested by the model at each time step. For each of these elements, predict the time of future decay (discussed shortly) and add the element to a map, where the map key is the predicted time step and the value is the set of elements predicted to decay at that time. If the element was already within the map (S3), remove it from its old location before adding it to its new location. All insertions/removals require time at most logarithmic in the number of distinct decay time steps, which is bounded by the total number of elements ($|M|$). At any time step, the set $D$ is those elements in the set indexed by the current time step that are decayed.
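As an illustration of this bookkeeping, here is a minimal Python sketch; all names are ours, and a hash map stands in for the ordered map with logarithmic insertion described above. The prediction function is taken as a parameter:

```python
from collections import defaultdict

class ForgettingIndex:
    """Elements are bucketed by their predicted decay step, so each step
    inspects one bucket instead of all of M."""

    def __init__(self, predict, c=10):
        self.predict = predict           # callable: (history, t) -> steps until decay
        self.c = c                       # bound on activation-history size
        self.buckets = defaultdict(set)  # predicted decay step -> element ids
        self.scheduled = {}              # element id -> its current bucket key

    def on_activation(self, elem_id, history, t, y=1):
        """Handle S1/S3: record y activations at time t and (re)schedule."""
        history.append((t, y))
        if len(history) > self.c:
            history.pop(0)               # drop the oldest pair (smallest a)
        old = self.scheduled.get(elem_id)
        if old is not None:
            self.buckets[old].discard(elem_id)
        step = t + self.predict(history, t)
        self.buckets[step].add(elem_id)
        self.scheduled[elem_id] = step

    def due(self, t):
        """Elements predicted to decay at step t; each must still be checked
        exactly, since the first-phase prediction may underestimate."""
        return self.buckets.pop(t, set())
```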
To predict element decay, we present a novel, two-phase process. After a new activation (S3), first employ an approximation that is guaranteed to underestimate the true value of $L$. If, at a future time step, an element is in $D$ and it has not decayed, then compute the exact prediction using a binary parameter search.

We approximate $L$ for an element $m$ as the sum of $L$ for each independent pair $(a, k) \in m$. Here we derive the closed-form calculation: given a single element pair at time $t$, we solve for $t_p$, the future time of element decay:

$$\ln\left(k \, [t_p + (t - a)]^{-d}\right) = \Theta$$
$$\ln(k) - d \, \ln(t_p + (t - a)) = \Theta$$
$$t_p = e^{\frac{\ln(k) - \Theta}{d}} - (t - a)$$

Since $k$ refers to a single time point, $a$, we rewrite the summed terms as a product. Furthermore, we time shift the decay term by the difference between the current time step, $t$, and that of the element pair, $a$, thereby predicting $L$.

Computing this approximation for a single pair takes constant time (and common values can be cached). The overall approximation computation is linear in the number of pairs, which is bounded by $c$, and is therefore $O(1)$. The computation required for a binary parameter search of an element is $O(\log_2 L)$. However, this computation is only necessary if the element has neither decayed nor been removed from $M$.
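A sketch of both prediction phases, reusing bla() from the earlier snippet (signatures and names are again our own):

```python
import math

def approx_decay_steps(history, t, d, theta):
    """Phase one: underestimate L(m) as the sum over pairs (a, k) of the
    closed form t_p = exp((ln(k) - theta) / d) - (t - a)."""
    total = sum(math.exp((math.log(k) - theta) / d) - (t - a)
                for (a, k) in history)
    return max(1, int(total))  # truncation only deepens the underestimate

def exact_decay_steps(history, t, d, theta, lo, hi):
    """Phase two: binary parameter search for the true L in [lo, hi].
    The caller must supply hi large enough that decay has occurred by
    t + hi; this runs only when the phase-one prediction proved early."""
    while lo < hi:
        mid = (lo + hi) // 2
        if bla(history, t + mid, d) < theta:  # decayed by t + mid
            hi = mid
        else:
            lo = mid + 1
    return lo
```

For the worked example above, the first phase yields 266 steps, an underestimate of the true value of 489, which the second phase recovers exactly.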
4.3. Synthetic evaluation
In later sections, we empirically evaluate this approach
with it embedded within the working and procedural mem-
ories of Soar; here we focus on the quality and efficiency of
our prediction approach and utilize synthetic data. This
synthetic data set comprises 50,000 memory elements, each
with a randomly generated pair set. The size of each ele-
ment was randomly selected from between 1 and 10, the
number of activations per pair (k) was randomly selected
between 1 and 10, and the time of each pair (a) was ran-
domly selected between 1 and 999. We verified that each ele-
ment had a valid history with respect to time step 1000,
meaning that each element would not have decayed before
t= 1000. In addition, each element contained a pair with at
least one access at time point 999, which simulated a fresh
activation (S3). For this evaluation, we used decay rate $d = 0.8$ and threshold $\Theta = -1.6$. Given these constraints, the largest possible value of $L$ for an element was 3332.
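The data-set construction can be sketched as follows; this generator is our reconstruction from the stated constraints, as the original is not given:

```python
import random

def make_element(now=999):
    """One synthetic element: 1-10 pairs, k in [1, 10], a in [1, 999],
    plus a fresh activation at step 999 (S3)."""
    history = [(random.randint(1, 999), random.randint(1, 10))
               for _ in range(random.randint(1, 10) - 1)]
    history.append((now, random.randint(1, 10)))
    return sorted(history)

random.seed(0)
elements = [make_element() for _ in range(50_000)]
# The paper also verifies each element has a valid history (not decayed
# before t = 1000 with d = 0.8, theta = -1.6); that check is omitted here.
```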
We first evaluated the quality of the decay approxima-
tion. In Fig. 2, the y-axis is the cumulative proportion of
the elements and the x-axis plots absolute temporal error of
the approximation, where a value of 0 indicates that the
approximation was correct, and non-zero indicates how
many time steps the approximation under-predicted. We
see that the approximation was correct for over 60% of
the elements, but did underestimate over 500 time steps
for 20% of the elements and over 1000 time steps for 1%
of the elements. Under the constraints of this data set, it
was possible for this approximation to underestimate up
to 2084 time steps.
Fig. 2. Evaluation of decay-approximation quality.
We also compared the prediction time, in microseconds (μs), of the approximation to an exact calculation using binary parameter search. The maximum computation time across the data set was more than 19× faster for the approximation (1.37 vs. 26.28 μs/element) and the average time was more than 15× faster (0.31 vs. 4.73 μs/element). We did not compare these results with a naïve algorithm, wherein computation time at each time step depends upon the number of memory elements ($|M|$). This comparison would have required a model
ments (jMj). This comparison would have required a model
of memory size across a variety of tasks and such a model
would have been difficult to develop, as prior work (Doo-
renbos, 1995; Derbinsky et al., 2012) has shown that while
the number of memory changes tends to be small across a
variety of problem domains, absolute size can vary drasti-
cally between tasks.
In summary, this two-phase forgetting approach main-
tains fidelity to the BLA model (due to the second phase
of prediction) and scales to large memories. Results from
synthetic data show that the first phase of our approach
is a high-quality approximation and is an order of magni-
tude less costly than the exact calculation in the second
phase.
5. Forgetting in Soar’s working memory
The core intuition of our working-memory forgetting
policy is to remove the augmentations of objects that are
not actively in use and that the model can later reconstruct
from long-term semantic memory, if they become relevant.
As defined earlier, we characterize WME usage via the
base-level activation model (BLA; Anderson et al., 2004),
which estimates future usefulness of memory based upon
prior usage. The primary activation event for a working-
memory element is the firing of a rule that tests or creates
that WME. In addition, when a rule first adds an element
to working memory, the activation of the new WME is ini-
tialized to reflect the aggregate activation of the set of
WMEs responsible for its creation. This model of activa-
tion sources, events, and decay is task independent.
At the end of each decision cycle, Soar removes from working memory each element that satisfies all of the following requirements, with respect to τ, a static architectural threshold parameter:

R1. The WME was not encoded directly from perception.
R2. The WME is operator-supported.
R3. The activation level of the WME is less than τ.
R4. The WME augments an object, o, in semantic memory.
R5. The activation levels of all augmentations of o are less than τ.
We adopted requirements R1–R3 from Nuxoll et al. (2004), whereas R4 and R5 are novel. Requirement R1 distinguishes between the decay of representations of perception and any dynamics that may occur with actual sensors, such as refresh rate, fatigue, noise, or damage. Requirement R2 is a conceptual optimization: operator-supported WMEs are persistent, whereas instantiation-supported structures are direct entailments of them. Thus, if we properly remove operator-supported WMEs, any instantiation-supported structures that depend on them will also be removed, so our mechanism need only manage operator-supported structures. The concept of a fixed lower
bound on activation, as defined by R3, was adopted from
activation limits in ACT-R (Anderson et al., 1996), and
dictates that working-memory elements will decay in a
task-independent fashion as their use for reasoning becomes
less recent/frequent.
Requirement R4 dictates that our mechanism only
removes elements from working memory that augment
objects in semantic memory. This requirement serves to
balance the degree of working-memory decay with support
for sound reasoning. Knowledge in Soar’s semantic mem-
ory is persistent, though it may change over time. Depend-
ing on the task and the model’s knowledge-management
strategies, it is possible that forgotten working-memory
knowledge may be recovered via deliberate reconstruction
from semantic memory. Additionally, augmentations of
objects that are not in semantic memory can persist indef-
initely to support model reasoning.
Requirement R5 supplements R4 by providing partial
support for the closed-world assumption. R5 dictates that
either all object augmentations are removed, or none. This
policy leads to an object-oriented representation whereby
procedural knowledge can distinguish between objects that
have been completely cleared of substructure, and those
that simply are not augmented with a particular feature
or relation. R5 makes an explicit tradeoff, weighting model competence more heavily at the expense of the speed of working-memory decay. This requirement resembles the
declarative module of ACT-R, where activation is associ-
ated with each chunk and not individual slot values.
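To make the conjunction of requirements concrete, a minimal sketch follows; the field names and object structure are hypothetical stand-ins for Soar internals, and bla() is the function from the Section 4 sketch:

```python
def should_forget_wme(wme, tau, now, d):
    """Test requirements R1-R5 for a single working-memory element."""
    if wme.from_perception:                    # R1: leave percepts alone
        return False
    if not wme.operator_supported:             # R2: i-support follows by entailment
        return False
    if bla(wme.history, now, d) >= tau:        # R3: still active enough
        return False
    obj = wme.parent_object
    if not obj.in_semantic_memory:             # R4: must be reconstructable
        return False
    return all(bla(w.history, now, d) < tau    # R5: all-or-none per object
               for w in obj.augmentations)
```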
5.1. Empirical evaluation
We extended an existing system where Soar controls a
simulated mobile robot (Laird, Derbinsky, & Voigt,
2011). Our evaluation uses a simulation instead of a real
robot because of the practical difficulties in running numer-
ous, long experiments in large physical spaces. However,
the simulation is quite accurate and the Soar rules (and
architecture) used in the simulation are exactly the same
as the rules used to control the real robot.
The robot’s task is to visit every room on the third floor
of the Bob & Betty Beyster building at the University of
Michigan. For this task, the robot visits over 100 rooms
and takes about 1 h of real time. During exploration, it
incrementally builds an internal topological map, which,
when completed, requires over 10,000 WMEs to represent
and store. In addition to storing information, the model
reasons about and plans using the map in order to find effi-
cient paths for moving to distant rooms it has sensed but
not visited. The model uses episodic memory to recall
objects and other task-relevant features during exploration.
In our experiments, we aggregate working-memory size
and maximum decision time for each 10 s of elapsed time,
all of which is performed on an Intel i7 2.8 GHz CPU, run-
ning Soar v9.3.1. Because each experimental run takes 1 h,
we did not duplicate our experiments sufficiently to estab-
lish statistical significance and the results we present are
from individual experimental runs. However, we found
qualitative consistency across our runs, such that the vari-
ance between runs is small as compared to the trends we
focus on below.
We make use of the same model for all experiments, but
modify small amounts of procedural knowledge and
change architectural parameters, as described here. The
baseline model (A0) maintains all declarative map informa-
tion in Soar’s working memory. A modification to this
baseline (A1) maintains the declarative map in both work-
ing and semantic memories, and additionally includes
hand-coded rules to prune away rooms in working memory
that are not required for immediate reasoning or planning,
as well as to reconstruct these structures from semantic
memory when they are needed. The experimental model
(A2) also maintains the declarative map in both working
and semantic memories, but rather than task-specific rules,
it makes use of our task-independent working-memory for-
getting policy to prune working-memory structures and
task-independent rules to reconstruct knowledge, as
needed, from semantic memory. For this experimental con-
dition, we held constant the activation-history size (c = 10) and the base-level threshold (τ = −2), but explored a set of decay-rate values (d ∈ {0.3, 0.4, 0.5}). For more aggressive decay rates (d ≥ 0.6), the model was unable to maintain sufficient declarative-map data in working memory to complete planning in this task.
Fig. 3 compares working-memory size between condi-
tions A0, A1, and A2 over the duration of the experiment.
We note first the major difference in working-memory size
between A0 and A1 after one hour, when the working memory of A1 contains more than 11,000 fewer elements than that of A0, a reduction of more than 90%. We also find that the greater
the decay-rate parameter for A2, the smaller the working-
memory size, where a value of 0.5 qualitatively tracks
A1. This finding suggests that our policy, with an appropri-
ate decay, keeps working-memory size comparable to that
maintained by hand-coded rules.
Fig. 3. Model working-memory size comparison.
Fig. 4 compares maximum decision-cycle time in ms,
between conditions A0, A1, and A2 as the simulation pro-
gresses. The dominant cost reflected by this data is time to
reconstruct prior episodes that are retrieved from episodic
memory. We see a growing difference in time between A0
and A2 as working memory is more aggressively managed
(i.e. greater decay rate), demonstrating that episodic reconstruction, which scales with the size of working memory at
the time of episodic encoding, benefits from forgetting. We
also find that with a decay rate of 0.5, our mechanism per-
forms comparably to A1. We note that without sufficient
working-memory management (A0; A2 with decay rate
0.3), episodic-memory retrievals are not tenable for a
model that must reason with this amount of acquired infor-
mation, as the maximum required processing time exceeds
the reactivity threshold of 50 ms.
Fig. 4. Model maximum decision time comparison.
5.2. Discussion
It is possible to write rules that prune Soar’s working
memory; however, this task-specific knowledge is difficult
to encode and learn.
In this work, we presented and evaluated a novel, task-
independent approach that utilizes a memory hierarchy to
bound working-memory size while maintaining sound rea-
soning. This approach assumes that the amount of knowl-
edge required for immediate reasoning is small relative to
the overall amount of knowledge accumulated by the
model. Under this assumption, as demonstrated in the
robotic evaluation task, our policy scales even as learned
knowledge grows large over long trials.
We note that since Soar’s semantic memory can change
over time and is independent of working memory, our for-
getting policy does admit a class of reasoning errors
wherein the contents of semantic memory are changed so
as to be inconsistent with decayed WMEs. However, this
corruption requires deliberate reasoning in a relatively
small time window and has not arisen in our models. While
the model completed this task for all conditions reported
here, at larger decay rates (d ≥ 0.6) the model thrashed
because map information was not held in working memory
long enough to complete deep look-ahead planning. Based
upon this finding, we expect that if the agent had to per-
form deeper searches within this task, then the model
would thrash with even less aggressive decay rates (e.g.
d= 0.5), but we do not have data for such circumstances.
This line of reasoning suggests that additional research is
needed on either adaptive decay-rate settings or
approaches to planning, and other forms of temporally
extended reasoning, that are robust in the face of memory
decay.
6. Forgetting in Soar’s procedural memory
The core intuition of our procedural-memory forgetting
policy is to remove rules that are not actively used and that
the model can later reconstruct via deliberate subgoal rea-
soning, if the knowledge embedded in them is relevant to a
given situation. As with working-memory forgetting, we
characterize rule usage via the base-level activation model,
where the activation event is the firing of an instantiation
of a rule. As with our working-memory forgetting policy,
the activation sources, events, and decay are task independent:
we utilize the base-level activation model to summarize the
history of rule firing.
At the end of each decision cycle, Soar removes from procedural memory each rule that satisfies all of the following requirements, with respect to parameter τ:

R1. The rule was learned via chunking.
R2. The rule is not actively firing.
R3. The activation level of the rule is less than τ.
R4. The rule has not been updated by RL.
We adopted R1–R3 from Chong (2004), whereas R4 is
novel. Chong was modeling human skill decay, and did
not delete rules, so as to not lose each rule’s activation his-
tory. Instead, decayed rules were prevented from firing,
similar to below-utility-threshold rules in ACT-R. R1 is a
practical consideration to distinguish learned knowledge from innate rules developed by the modeler, which, if modified, would likely break the model. R2 recognizes that matched rules are in active use and thus should not be forgotten. R3 dictates that rules will decay in a task-independent fashion as their use for reasoning becomes less recent/frequent. We note that for fixed parameters (d and τ) and a
single activation, the BLA model is equivalent to the use-
gap heuristic of Kennedy and Trafton (2007). However,
the time between sequential rule firings ignores firing fre-
quency, which the BLA model incorporates.
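To make this comparison concrete, the reduction (our derivation, not from the original text) is one line: with a single recorded pair $(a, k)$,

$$B(m, t, d) = \ln\left(k \, [t - a]^{-d}\right) = \ln(k) - d\,\ln(t - a),$$

so for fixed $k$, $d$, and threshold $\tau$, $B$ falls below $\tau$ exactly when the gap $t - a$ exceeds the constant $e^{(\ln(k) - \tau)/d}$; thresholding base-level activation then coincides with thresholding the time since last use.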
Requirement R4 attempts to retain rules that include
information that cannot be regenerated in the future. Rules
learned by chunking can be regenerated if they have not
been updated by RL; however, once they have been
updated, they encode expected-utility information, which
is not recorded by any other learning mechanism and can-
not be regenerated if the rule is removed.
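The rule-level test parallels the WME test above; a minimal sketch, where the metadata fields are hypothetical stand-ins for Soar's procedural-memory bookkeeping and bla() is the Section 4 function:

```python
def should_forget_rule(rule, tau, now, d):
    """Test requirements R1-R4 for a single learned rule."""
    return (rule.learned_by_chunking              # R1: never touch innate rules
            and not rule.actively_firing          # R2: matched rules are in use
            and bla(rule.history, now, d) < tau   # R3: decayed below threshold
            and not rule.updated_by_rl)           # R4: RL updates cannot be rebuilt
```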
6.1. Empirical evaluation
We extended an existing system (Laird, Derbinsky, &
Tinkerhess, 2011) where Soar plays Liar’s Dice, a multi-
player game of chance. The rules of the game are numerous
and complex, yielding a task that has rampant uncertainty
and a large state space (millions-to-billions of relevant
states for games of 2–4 players). Prior work has shown that
RL allows Soar models to significantly improve perfor-
mance after playing a few thousand games. However, this
involves learning large numbers of RL rules to represent
the value function spanning this state space.
The model we use for all experiments learns two classes
of rules: RL rules that capture expected action utility; and
non-RL rules that capture symbolic game heuristics. Our
experimental baseline (B0) does not forget knowledge.
The first experimental modification (B1) implements our
forgetting policy, but does not enforce requirement R4
and is thereby comparable to prior work (Kennedy & Traf-
ton, 2007; Chong, 2004). The second modification (B2)
fully implements our policy. We experiment with a range
of representative decay rates (d), including 0.999, where
rules not immediately updated by RL are deleted (c = 10, τ = −2 for all).
We alternated 1000 2-player games of training then test-
ing, each against a non-learning version of the model. After
each testing session, we recorded maximum memory usage
(Mac OS v10.7.3; dominated, in this task, by the rules in
procedural memory), task performance (% games won),
and average decisions/task action. We do not report max-
imum decision time, as this was below 6 ms for all condi-
tions (Intel i7 2.8 GHz CPU, Soar v9.3.1). We collected
data for all conditions in at least three independent trials
of 40,000 games. For conditions that forget knowledge,
we were able to gather more data in parallel, due to
reduced memory consumption (six trials for d= 0.35, seven
for remaining).
Fig. 5 presents average memory growth, in megabytes,
as the model trains (within each experimental condition,
error bars of 1 standard deviation are too small to be con-
sistently visible on this plot and thus variance data is not
included in Fig. 5). For all models, the memory growth
of games 1–10 K follows a power law (R² ≥ 0.96), whereas for 11–40 K, growth is linear (R² ≥ 0.99). These plots indi-
cate that memory usage for the baseline (B0) and the slowly
decaying model (B2, d= 0.3) is much greater, and faster
growing, than models that more aggressively decay. It also
shows that there is a diminishing benefit from faster decay
(e.g. d= 0.5 and d= 0.999 for B2 are indistinguishable).
Fig. 5. Avg. memory usage vs. games played.
Fig. 6 presents average task performance after 1000
games of training, where the error bars represent ±1 stan-
dard deviation. This data shows that given the inherent sto-
chasticity of the task, there is little, if any, difference
between the performance of the baseline (B0) and decay
levels of B2. However, by comparing B0 and B2 to B1, it
is clear that without R4, the model suffers a dramatic loss
of task competence. For clarity, the model begins by play-
ing a non-learning copy of itself and learns from experience
with each training session. While the B0 and B2 models
improve from winning 50% of games to 75–80%, the B1
model improves to below 55%. We conclude that a forget-
ting policy that only incorporates rule-firing history (e.g.
Chong, 2004; Kennedy & Trafton, 2007) will negatively
impact performance in tasks that involve informative inter-
action with an external environment. Our policy incorpo-
rates both rule-firing history and rule reconstruction, and
thus retains this source of feedback.
Fig. 6. Avg. task performance ±1 std. dev.
Finally, Fig. 7 presents average number of decisions for
the model to take an action in the game after training for
10,000 games. In prior work (e.g. Kennedy & Trafton,
2007), this value was a major performance metric, as it
reflected the primary reason for learning new rules. In this
work, each decision takes very little time, and so the num-
ber of decisions to choose an action is not as crucial to task
performance as the correctness of the selected action. How-
ever, these data show that there exists a space of decay val-
ues (e.g. d= 0.35) in which memory usage is relatively low
and grows slowly (Fig. 5), task performance is relatively
high (Fig. 6), and the model makes decisions relatively
quickly (Fig. 7).
Fig. 7. Avg. decisions/task action ±1 std. dev.
6.2. Discussion
This work contributes evidence that we can develop
models that improve using RL in tasks with large state
spaces. Currently, it is typical to explicitly represent the
entire state space, which is not feasible in complex prob-
lems. Instead, Soar learns rules to represent only those por-
tions of the space it experiences, and our policy retains only
those rules that include feedback from environmental
reward. Future work needs to validate this approach in
other domains.
7. Concluding remarks
This paper presents and evaluates policies and algo-
rithms for effective and efficient forgetting of learned
knowledge in complex environments. While forgetting
mechanisms are common in cognitive modeling, this work
pursues this line of research for functional reasons: improv-
ing computational resource usage while maintaining rea-
soning competence. We have presented compelling results
from applying these policies in two complex, temporally
extended tasks, but there is additional work to evaluate
these policies, and their parameters, across a wider variety
of problem domains.
Acknowledgment
We acknowledge the funding support of the Air Force
Office of Scientific Research, contract FA2386-10-1-4127.
References
Altmann, E. M., & Gray, W. D. (2002). Forgetting to remember: The
functional relationship of decay and interference. Psychological
Science, 13, 27–33.
Anderson, J. R., Bothell, D., Byrne, M. D., Douglass, S., Lebiere, C., &
Qin, Y. (2004). An integrated theory of the mind. Psychological
Review, 111, 1036–1060.
Anderson, J. R., Reder, L. M., & Lebiere, C. (1996). Working memory:
Activation limits on retrieval. Cognitive Psychology, 30, 221–256.
Chong, R. (2003). The addition of an activation and decay mechanism to
the Soar architecture. In Proceedings of the fifth international confer-
ence on cognitive modeling (pp. 45–50). Bamberg, Germany.
Chong, R. (2004). Architectural explorations for modeling procedural skill
decay. In Proceedings of the sixth international conference on cognitive
modeling. Pittsburgh, PA, USA.
Daelemans, W., van den Bosch, A., & Zavrel, J. (1999). Forgetting
exceptions is harmful in language learning. Machine Learning, 34,
11–41.
Derbinsky, N., & Laird, J. E. (2009). Efficiently implementing episodic
memory. In Proceedings of the 8th international conference on case-
based reasoning (pp. 403–417). Seattle, WA, USA.
Derbinsky, N., & Laird, J. E. (2010). Extending soar with dissociated
symbolic memories. In Proceedings of the 1st symposium on human
memory for artificial agents (pp. 31–37). Leicester, UK.
Derbinsky, N., & Laird, J. E. (2011). A functional analysis of historical
memory retrieval bias in the word sense disambiguation task. In
Proceedings of the 25th AAAI conference on artificial intelligence (pp.
663–668). San Francisco, CA, USA.
Derbinsky, N., Laird, J. E., & Smith, B. (2010). Towards efficiently
supporting large symbolic declarative memories. In Proceedings of the
10th international conference on cognitive modeling (pp. 49–54).
Philadelphia, PA, USA.
Derbinsky, N., Li, J., & Laird, J. E. (2012). A multi-domain evaluation of
scaling in a general episodic memory. In Proceedings of the 26th AAAI
conference on artificial intelligence (pp. 193–199). Toronto, Canada.
Doorenbos, R. B. (1995). Production matching for large learning systems.
Ph.D. Thesis. Carnegie Mellon University.
Forgy, C. L. (1982). Rete: A fast algorithm for the many pattern/many
object pattern match problem. Artificial Intelligence, 19, 17–37.
Kennedy, W. G., & Trafton, J. G. (2007). Long-term symbolic learning.
Cognitive Systems Research, 8, 237–247.
Laird, J. E., Derbinsky, N., & Tinkerhess, M. (2011). A case study in
integrating probabilistic decision making and learning in a symbolic
cognitive architecture: Soar plays dice. In Papers from the 2011 AAAI
fall symposium series: advances in cognitive systems (pp. 162–169).
Arlington, VA, USA.
Laird, J. E., Derbinsky, N., & Voigt, J. (2011). Performance evaluation of
declarative memory systems in Soar. In Proceedings of the 20th
behavior representation in modeling and simulation conference (pp. 33–
40). Sundance, UT, USA.
Laird, J. E. (2012). The Soar Cognitive Architecture. Cambridge: MIT
Press.
Laird, J. E., Rosenbloom, P. S., & Newell, A. (1986). Chunking in Soar:
The anatomy of a general learning mechanism. Machine Learning, 1,
11–46.
Markovitch, S., & Scott, P. D. (1988). The role of forgetting in learning. In
Proceedings of the fifth international conference on machine learning
(pp. 459–465). Ann Arbor, MI, USA.
Minton, S. (1990). Qualitative results concerning the utility of explana-
tion-based learning. Artificial Intelligence, 42, 363–391.
Nason, S., & Laird, J. E. (2005). Soar-RL: Integrating reinforcement
learning with Soar. Cognitive Systems Research, 6, 51–59.
Nuxoll, A. M., Laird, J. E., & James, M. (2004). Comprehensive working
memory activation in Soar. In Proceedings of the sixth international
conference on cognitive modeling (pp. 226–230). Pittsburgh, PA, USA.
Nuxoll, A. M., & Laird, J. E. (2012). Enhancing intelligent agents with
episodic memory. Cognitive Systems Research, 17–18, 34–48.
Petrov, A. A. (2006). Computationally efficient approximation of the base-
level learning equation in ACT-R. In Proceedings of the seventh
international conference on cognitive modeling (pp. 391–392). Trieste,
Italy.
Schooler, L. J., & Hertwig, R. (2005). How forgetting aids heuristic
inference. Psychological Review, 112, 610–628.
Smyth, B., & Cunningham, P. (1996). The utility problem analysed – A
case-based reasoning perspective. In Proceedings of the third European
workshop on case-based reasoning (pp. 392–399). Lausanne,
Switzerland.
Smyth, B., & Keane, M. T. (1995). Remembering to forget: A competence-
preserving case deletion policy for case-based reasoning systems. In
Proceedings of the fourteenth international joint conference on artificial
intelligence (pp. 377–383). Montreal, Quebec, Canada.
Tambe, M., Newell, A., & Rosenbloom, P. S. (1990). The problem of
expensive chunks and its solution by restricting expressiveness.
Machine Learning, 5, 299–349.