Complexity equals change

Action editor: Gregg Oden

Aleksandar Aksentijevic (a,*), Keith Gibson (b)

(a) Roehampton University, London, UK
(b) Birkbeck College, London, UK

* Corresponding author. Address: Department of Psychology, Roehampton University, Whitelands College, Holybourne Avenue, London SW15 4JD, UK. Tel.: +44 208 392 5756; fax: +44 208 392 3527. E-mail: a.aksentijevic@roehampton.ac.uk.

Received 26 July 2010; received in revised form 30 December 2010; accepted 3 January 2011
Available online 21 January 2011
Abstract

Traditionally, models of complexity used in psychology have been based on probabilistic and algorithmic paradigms. While these models have inspired a great deal of research, they are generally opaque about the relationship between complexity and the cost of information processing. We argue that psychological complexity is easily defined and quantified in terms of change and support this argument with a measure of complexity for binary patterns. We extend our measure to 2-D binary arrays, and show that it correlates well with a number of existing complexity and randomness measures, both subjective and objective. We suggest that measuring change represents an intuitively and mathematically transparent way of defining and quantifying psychological complexity which provides the missing link between subjective and objective approaches to complexity.
© 2011 Elsevier B.V. All rights reserved.

Keywords: Complexity; Entropy; Pattern; Structure; Change
1. Introduction
We propose a measure of pattern complexity, C(s),
which connects the subjective, first-person perspective of
the human observer and the third-person perspective of
mathematics and science. Such a measure should be based
on a simple, primitive notion, which underpins perception
and cognition. Perhaps the most primitive is the notion of
change. Change is of fundamental importance for psychol-
ogy as well as science and computation. Study of changes
in sensation represents the basis of experimental psychol-
ogy and psychophysics. Any textbook on the subject
clearly demonstrates that enquiry into perception and cog-
nition must begin with change. Any account of sensory
processing stresses the importance of changes in physical
parameters of the stimulus for its encoding.
Change allows direct quantification of the relationship
between pattern elements. One of the reasons for the failure
of information theory (Shannon, 1948) to capture the
essence of psychological complexity (see Chater, 1996;
Luce, 2003) has been its focus on individual symbols and
their frequencies at the expense of a description of their
relationships (structure). In the words of Luce (2003), "...the stimuli of psychological experiments are to some degree structured, and so, in a fundamental way, they are not in any sense interchangeable" (p. 185). In other words,
different levels of structural organization within a pattern
should not be treated as mutually independent statistical
events. It is here that the importance of change becomes
clear. Structural information is contained in the transition
from one symbol (or element) to another and not in the
symbols themselves (see e.g. Attneave, 1954).
An alternative to information theory was offered by
transformational approaches to pattern goodness/com-
plexity (Palmer, 1983) that begin with certain prescribed
forms of invariance assuming that these govern the percep-
tion of structure. Many influential mathematical theories
are founded on the notion of invariance (e.g. Weyl, 1952)
and invariance has been given a special place in psycholog-
ical theorizing on complexity and goodness (e.g. Garner,
1974; Leyton, 1986a, 1986b; Palmer, 1977).
While acknowledging the importance of invariance, we
start from change because change has not been sufficiently
considered in complexity literature thus far. Change repre-
sents the inverse of invariance and as we show here, it offers
a viable and potentially useful way of quantifying complex-
ity. Another important motivation for equating psycholog-
ical complexity with change has to do with the ease with
which change links psychological, physical and computa-
tional interpretations. The traditional approaches to com-
plexity have ignored the crucial fact that information
processing involves effort and cost (Falk & Konold,
1997). Our approach avoids the dichotomy between "message" and "process" complexity (a pattern containing little change is easy to describe AND compress). Consequently,
we focus on the information in a pattern (message) and
quantify the effort needed to describe and/or compress it.
We then argue that our measure is meaningfully related
to the way in which the brain processes change and
complexity.
An important attempt to quantify pattern structure has
been Algorithmic Information Theory (AIT; Chaitin, 1969;
Kolmogorov, 1965; Solomonoff, 1964; see Li & Vitanyi,
1997 for an overview). The development of AIT represents an attempt to reconcile structural complexity with
the probabilistic nature of information theory. Algorithmic
complexity represents the length of the shortest algorithm
in any programming language, which computes a particu-
lar binary string. A string of length x is incompressible if
the shortest program that can produce it is at least x bits
long.
Algorithmic complexity provides a more intimate link
between the observer and the observed by introducing
structure into computational complexity. Specifically, this
approach makes it possible, at least theoretically, to com-
pute the complexity of individual, albeit infinite, strings.
Many pattern-coding languages based on algorithmic compression have been proposed and are widely used in psychology for describing structure (e.g. Leeuwenberg, 1969; Restle, 1970; Simon & Kotovsky, 1963; Vitz & Todd, 1969). These models involve different kinds of algorithmic
notation aimed at providing compact description of pat-
terns (Simon, 1972). While many of these approaches have
faced criticisms (see Simon, 1972), others have advanced
complexity research by incorporating perceptually and psy-
chologically relevant forms of invariance into their coding
scheme (e.g. van der Helm, 2000).
Theoretically, the notion of algorithmic complexity is
useful because it describes the relationship between an
algorithm and its output. However, it is difficult if not
impossible to apply this idea to human observers. For
sequences beyond a certain level of complexity, it is not possible to know whether a better (more efficient) algorithm exists (Chaitin, 2001). In addition, algorithmic compression is
complex and often irreversible. Any simple pattern can
be encoded in a complex way and at the same time, any
encoding can hide almost infinitely many meanings. Fur-
thermore, simple algorithms can produce highly complex
outputs (e.g. Wolfram, 2001, p. 27). Consequently, algo-
rithms can shed little light on the human response to
complexity.
Understanding involves effort and cost and a meaning-
ful measure of psychological complexity should reflect this.
We propose that any perception, cognition or action
involves change, and is accompanied by an irreversible
expenditure (conversion) of energy. Change equals increase
in entropy, and this in turn equals cost. Any action, how-
ever trivial, must incur cost and this cost is reflected in
an increase in (physical or computational) entropy. Regis-
tering change always costs more than registering no
change. This means that in its interaction with the environ-
ment, the agent converts a certain amount of available
energy, irrespective of its scale. In the context of what fol-
lows, we are assuming that all computing (human or other-
wise) is thermodynamically irreversible. This formulation
brings together physical, computational and psychological
meanings of entropy.
In order to apply the concept of information cost, we
describe a measure of complexity of a binary string based
on the amount of change present in the string. We have
restricted ourselves to discussing binary patterns because
we believe that binary representation offers the most trans-
parent way of encoding and describing change. In the
words of Vitz (1968), "the simplicity of binary representation presumably exposes the process of perceptual organization more clearly than other patterns" (p. 275). In
addition, the complexity of structures encoded by larger
alphabets cannot be judged or interpreted without intro-
ducing an appropriate metric which might differ from con-
text to context. This requires additional assumptions,
which might have little to do with the structure of the
pattern.
Our account of complexity has parallels with the Gestalt
approach to perception. The notion of interdependence of
different levels of structure represents one of the corner-
stones of Gestalt psychology, whose impact on complexity
research in psychology cannot be overestimated (e.g.
Hochberg & McAlister, 1953; see also van der Helm,
van Lier, & Leeuwenberg, 1992). The first serious attempt
to define and explain pattern goodness was offered by
Gestalt psychologists in the 1920s and 1930s. They pro-
posed that human observers organized sensory/perceptual
and cognitive information according to a number of simple
rules. Fundamentally, an individual perceptual scene is
organized in such a way as to minimize the expenditure
(or rather conversion) of energy. This is what the Gestaltists named the "Law of Prägnanz" or "Minimum Principle" (e.g., Koffka, 1935). Patterns are considered "good" if they are compact, symmetrical, repetitive, or predictable.
Such patterns contain little change and have low entropy.
They are predictable, resistant to disruption and easy to
process, assimilate and memorize. Part of the aesthetic
appeal of periodicity and symmetry might lie in the fact
that they appear to defy entropy (increase in disorder). In
the words of Garner (1970), "good patterns have few alternatives". By contrast, "poor" patterns – those that contain
a great deal of change – are complex, asymmetrical and
defy easy description. They contain more information than
simple patterns and are consequently unpredictable and
more changeable when subjected to spatial transforma-
tions. The observer searches for informational "oases" provided by symmetry, similarity and regularity in order to
minimize the expenditure (conversion) of energy associated
with understanding.
2. The measure
In order to quantify our intuition, we propose an index
C(s) of structural complexity, based on the amount of
change present in one- or two-dimensional binary arrays.
In this section we describe its mathematical properties and introduce a form of generalized structural invariant that emerges from the properties of the model.
2.1. Informal description and motivation
Let S be a binary string of length L ≥ 2. We could scan S two symbols at a time, and register a change every time we encountered a substring "01" or "10" of S. The crudest measure of the complexity of S would then be the number of changes registered. However this will not do for our purposes, because the maximum complexity would then be ascribed to the string "0101...01", which is clearly very simple for a human observer. We therefore conduct a scan j symbols at a time, j = 2 to L, and ask which substrings of length j of S should register a change. Suppose we encounter the substring x = "010" of S. Then x will register a change if there is in some sense a change in passing from the first two symbols of x to the last two, that is, in passing from "01" to "10". These two strings are not equal, but are essentially the same. They both display a change, and anyway differ only in the encoding of their symbols. Thus x should register no change. On the other hand, a substring y = "001" should register a change, because the first and last two symbols of y are the strings "00" and "01", and these strings are essentially different. It is not true that they differ only in the encoding of their symbols, and one displays a change while the other does not. If we look again at the string "0101...01" we will find that no substring T of length j > 2 will register a change, because the strings formed from the first and last j−1 symbols of T will differ only in the encoding of their symbols.
We proceed therefore to define a change function on binary strings that determines when a substring of S should register a change, and then use it to construct the change profile P = (p_2, p_3, ..., p_L) of S, where p_j is the number of substrings of length j of S that register a change. We then obtain the complexity C of S as a suitably weighted average of the coordinates of P. Our choice of change function and complexity are as follows. A substring of S of length 2 registers a change if its two symbols are not the same. A substring of S of length j > 2 registers a change if the strings formed from its first and last j−1 symbols do not have the same change profile. Noting that there are L−j+1 substrings of S of length j, we define C to be the sum of the quantities p_j/(L−j+1), j = 2 to L. The string "0101...01" has a complexity value of 1 under this definition, and only strings of all ones or all zeros, with complexity value 0, have a smaller complexity value.
It is not clear a priori that our choice of change function
is a good one. It succeeds because it provides a strong con-
nection between change and symmetry, even though sym-
metry is not mentioned in the definition, and it is this
connection that we now explore. A number of lemmas
and theorems are needed, the proofs of which are not
required for a basic understanding of this paper. They
are given in Appendix A.
2.2. Symmetric equivalence
We define two operators, r (reverse) and c (complement), on binary strings, and use them to define the notions of symmetric equivalence and palindromicity. In the following definition the complement of 0 is 1 and the complement of 1 is 0.
Definition 1. Let S = (s_1, s_2, ..., s_L) be a binary string, and let t_i be the complement of s_i, i = 1 to L.
rS = (s_L, ..., s_2, s_1) = the reverse of S.
cS = (t_1, t_2, ..., t_L) = the complement of S.
Note that r and c commute, i.e. rcS = crS.
Definition 2. Let S and T be two binary strings of the same length.
S is symmetrically equivalent to T, written S ~ T, if T is one of S, rS, cS, rcS.
Definition 3. A binary string S is a symmetric palindrome if S is one of rS, rcS.
If the length L of S is odd we cannot have S = rcS, so if S is a symmetric palindrome then we must have S = rS, i.e. S is a palindrome in the classical sense. However if L is even we can have S = rcS, and then S is not a palindrome in the classical sense. For example, "0101" is a symmetric palindrome, but not a classical palindrome. Note that all strings of length 2 are symmetric palindromes. For the rest of this paper the words equivalence and palindrome will refer to symmetric equivalence and symmetric palindrome.
It is easy to see that ~ is an equivalence relation, and that if a binary string S is a palindrome then its equivalence class has two members, S and cS, both of which are palindromes, while if S is not a palindrome its equivalence class has four members, S, rS, cS, rcS, none of which are palindromes.
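As a minimal illustration of the operators and predicates just defined, here is a sketch in Python (the function names are ours, not the authors'; strings are represented as lists of 0s and 1s):

def r(s):
    # Reverse of S (Definition 1).
    return s[::-1]

def c(s):
    # Complement of S (Definition 1).
    return [1 - x for x in s]

def equivalent(s, t):
    # Symmetric equivalence (Definition 2): T is one of S, rS, cS, rcS.
    return t in (s, r(s), c(s), r(c(s)))

def is_palindrome(s):
    # Symmetric palindrome (Definition 3): S equals rS or rcS.
    return s == r(s) or s == r(c(s))

# Examples from the text: "0101" is a symmetric palindrome although it
# is not a classical palindrome, and 001 ~ 110 (complementation).
assert is_palindrome([0, 1, 0, 1])
assert [0, 1, 0, 1] != r([0, 1, 0, 1])
assert equivalent([0, 0, 1], [1, 1, 0])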
The following lemma relates palindromicity of a string of length L to symmetry between its first and last L−1 symbols, paving the way for the change function of the next section.
Lemma 1. Let S = (s_1, s_2, ..., s_L) be a binary string, L > 1, and let U = (s_1, s_2, ..., s_{L−1}), V = (s_2, s_3, ..., s_L).
Then S is a palindrome if and only if U ~ V.
2.3. The change function
Definition 4. A change function on binary strings is any map from binary strings of length >1 to {0, 1}.
We denote a change function by [ ], so for a binary string S of length >1, [S] is either 0 or 1. We will refer to [S] as the change in S.
Definition 5. Let S be the binary string (s_1, s_2, ..., s_L), L > 1, and let [ ] be a change function.
Let X_ij be the change in the substring of length j of S starting at s_i, j = 2 to L, i = 1 to L−j+1.
Let p_j be the number of substrings of length j of S whose change is 1, j = 2 to L. Thus

X_ij = [s_i, s_{i+1}, ..., s_{i+j−1}], j = 2 to L, i = 1 to L−j+1    (1a)

p_j = Σ_{i=1}^{L−j+1} X_ij, j = 2 to L    (1b)

The change matrix of S is the matrix X whose entry in row i and column j is X_ij.
The change profile of S is the array P = (p_2, p_3, ..., p_L). Note that p_L = [S].
Our task is to define a suitable change function. Although we are not going to refer to symmetry in the definition, there are nevertheless two symmetry properties that any reasonable change function can be expected to have. The first is that equivalent strings have the same change profile. The second is that if S, U, and V are as in Lemma 1, then [S] = 0 if U ~ V, since there is then really no change in passing from the first to the last L−1 characters of S. It is tempting therefore to define [S] to be 0 if U ~ V, and 1 otherwise, which in view of Lemma 1 makes [S] = 0 if and only if S is a palindrome. This would certainly guarantee equivalent strings had the same change, and as we shall see later, that would be enough to guarantee they had the same change profile. However as we shall also see later, such a definition does not pick up periodicity in a string. We therefore make the following recursive definition:
Definition 6. Let S, U, V be as in Lemma 1.
Define [s_1 s_2] to be 0 if and only if s_1 = s_2.
If L > 2 then define [S] to be 0 if and only if U and V have the same change profile.
The first thing we do is remove the recursion from this
definition, and show how to compute the change function
efficiently. The next result is central to our theory, and in
view of its importance we state it first for a string of length
5, so as to make the general statement easier to grasp.
We have [s_1 s_2 s_3 s_4 s_5] = 0 if and only if [s_1 s_2] = [s_4 s_5], [s_1 s_2 s_3] = [s_3 s_4 s_5], and [s_1 s_2 s_3 s_4] = [s_2 s_3 s_4 s_5].
Theorem 1. Let the binary string S = (s_1, s_2, ..., s_L), L > 2. Then [S] = 0 if and only if

[s_1, s_2, ..., s_j] = [s_{L−j+1}, s_{L−j+2}, ..., s_L], j = 2 to L−1    (2)
Numerical example 1:
As an example we use Theorem 1 to show [1 1 0 1 1] = 0. We have to check that [1 1] = [1 1], [1 1 0] = [0 1 1], and [1 1 0 1] = [1 0 1 1]. The first is vacuously true, and applying Theorem 1 shows [1 1 0] = [0 1 1] = 1. To check the third, we have by Theorem 1 to check [1 1] = [1 1] and [1 1 0] = [0 1 1], both now proven.
Theorem 1 allows us to calculate the change function for all substrings of a string. The following algorithm computes the change matrix, and hence the change profile, of a binary string S of length L, at a cost of L(L²−1)/6 binary comparisons and space for L(L−1)/2 binary values. In particular it calculates [S].

Algorithm 1.
Let S and X_ij be as in Definition 5.
Initialize all the X_ij to 0.
For i = 1 to L−1:
  If s_i is not equal to s_{i+1} then set X_{i2} = 1.
For j = 3 to L, i = 1 to L−j+1, k = 2 to j−1:
  If X_ik ≠ X_rk, where r is i+j−k, then set X_ij = 1, and continue with the next value of i.
Numerical example 2:
As an example we use Algorithm 1 to calculate the change matrix X and change profile P of the string S = 11011. Recall that X_ij = [substring of length j starting at position i].

Substrings of length 2:
X_12 = [11] = 0, X_22 = [10] = 1, X_32 = [01] = 1, X_42 = [11] = 0
Column 2 of X: 0, 1, 1, 0

Substrings of length 3:
X_13 = [110]: X_12 ≠ X_22. Set X_13 = 1
X_23 = [101]: X_22 = X_32. Set X_23 = 0
X_33 = [011]: X_32 ≠ X_42. Set X_33 = 1
Column 3 of X: 1, 0, 1

Substrings of length 4:
X_14 = [1101]: X_12 ≠ X_32. No need to check whether X_13 = X_23. Set X_14 = 1
X_24 = [1011]: X_22 ≠ X_42. No need to check whether X_23 = X_33. Set X_24 = 1
Column 4 of X: 1, 1

Substrings of length 5:
X_15 = [11011]: X_12 = X_42, X_13 = X_33, X_14 = X_24. Set X_15 = 0
Column 5 of X: 0

Writing row i of X as (X_i2, X_i3, ...), the change matrix is

X = 0 1 1 0
    1 0 1
    1 1
    0

P = (2, 2, 2, 0), obtained by summing the entries in each column of X.
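For readers who prefer an executable form, here is a short Python sketch of Algorithm 1 together with the complexity C of Eq. (3), which is defined formally in Section 2.5. The function names are ours; positions follow the paper's 1-based convention, and the snippet reproduces the worked example above.

def change_matrix(s):
    # X[(i, j)] = change in the substring of length j of s starting at
    # 1-based position i, computed as in Algorithm 1.
    L = len(s)
    X = {}
    for i in range(1, L):  # substrings of length 2
        X[(i, 2)] = 1 if s[i - 1] != s[i] else 0
    for j in range(3, L + 1):  # longer substrings, via Theorem 1
        for i in range(1, L - j + 2):
            X[(i, j)] = 0
            for k in range(2, j):
                if X[(i, k)] != X[(i + j - k, k)]:
                    X[(i, j)] = 1  # a prefix/suffix pair of equal length differs
                    break
    return X

def change_profile(s):
    # P = (p_2, ..., p_L): the column sums of the change matrix.
    L = len(s)
    X = change_matrix(s)
    return [sum(X[(i, j)] for i in range(1, L - j + 2))
            for j in range(2, L + 1)]

def complexity(s):
    # Unnormalized complexity of Eq. (3): C = sum over j of p_j / (L - j + 1).
    L = len(s)
    return sum(p / (L - j + 1)
               for j, p in zip(range(2, L + 1), change_profile(s)))

print(change_profile([1, 1, 0, 1, 1]))  # [2, 2, 2, 0], as computed above
print(complexity([1, 1, 0, 1, 1]))      # 2/4 + 2/3 + 2/2 + 0/1 = 2.1666...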
The next two results show that the change profile is invariant under equivalence, the first of the two symmetry properties we expect our change function to have. Lemma 2 states that if equivalent strings always have the same change then they always have the same change profile, and Theorem 2 states that equivalent strings do always have the same change.

Lemma 2. Let S, T be binary strings of length L ≥ 2, and let [ ] be any change function.
If S ~ T implies [S] = [T], then S ~ T implies that S and T have the same change profile.

Theorem 2. Let S, T be binary strings of length L ≥ 2. If S ~ T then [S] = [T].

Corollary (using Lemma 2): If S ~ T then S and T have the same change profile.

The converse is not true for strings of length ≥ 5. The strings "01110" and "00100" have the same change profile but are not symmetrically equivalent. This is encouraging, because these two strings could be regarded as being perceptually equivalent in respect of simplicity, so that the change profile is capturing something beyond what symmetric equivalence can capture.

We note that the change matrix is not invariant under equivalence. If S ~ T, it will not normally be the case that S and T have the same change matrix. Their change matrices will have the same set of entries in each column, but the entries will be in different places.
2.4. Generalized palindromes
The next lemma relates change to palindromicity, and together with Lemma 1 is sufficient to guarantee the second symmetry property expected of our change function.

Lemma 3. Let S be a binary string of length L > 2. If S is a palindrome then [S] = 0.

Note that L > 2 is needed since "01" is a palindrome but we have defined [01] to be 1.

Corollary 1 (using Lemma 1). If the first and last L−1 symbols of S form equivalent strings, then [S] = 0.

The converse is not true for L ≥ 9. Consider the string S = 111011000 of length 9. Then [S] = 0, but S is not a palindrome. However if 2 < L < 9, then [S] = 0 does imply S is a palindrome. In view of this we make the following definition.

Definition 7. Let S be a binary string of length L > 2. We call S a generalized palindrome if [S] = 0.

Thus a palindrome is a generalized palindrome, but for L ≥ 9 there are generalized palindromes that are not palindromes, and we shall see that these strings play an important part in the ability of the change function to pick up periodicity.
Theorem 1 highlights a kind of palindromic structure in generalized palindromes: reading the values of the square brackets from the left is the same as reading them from the right.
2.5. Complexity
Definition 8. Let S be a binary string of length L ≥ 2, with change profile P = (p_2, p_3, ..., p_L).
The complexity C of S is defined to be

C = Σ_{j=2}^{L} p_j w_j, where w_j = 1/(L−j+1)    (3)
The change profile P of S is an array of integers that forms a representation of the amount of change in S, and a natural way to derive a single number C from P is to take a weighted average of its coordinates. We can use the weights to regulate the contribution of different lengths of substrings of S to the overall complexity of S, and this allows us to model different levels of structure in our complexity measure.
Psychological complexity research has shown that the
importance of different levels of structure in judging com-
plexity depends on a number of factors. Here we refer to
the hypothesis put forward by Chipman (1977) and
supported by Ichikawa (1985), according to which subjec-
tive appreciation of pattern complexity is underpinned by
two processes: quantitative and structural. Ichikawa sug-
gests that quantitative processes rely on enumeration of
distinct elements in a pattern, which might include the
number of dots, runs or clusters. This can be contrasted
to structural processes, which evaluate periodicity and
symmetry within the pattern. He proposed a model of the
processing of structure consisting of two stages: primary
and higher cognitive. With short stimulus presentation
times, quantitative factors dominate the judgment and
the structural aspects are "discovered" only given sufficient
time. We model the quantitative aspect with weights that
favour short substrings of S, and the structural aspect with
weights that favour all lengths equally.
Adjusting weights to favour the contribution of short substrings can be achieved by taking all the w_j in (3) to be 1, but that provides only L(L−1)/2 different complexity values over 2^L strings. The low number of distinct values could be avoided by viewing P as a mixed radix representation of C, with p_j being a digit in radix L−j+2, p_2 being the most significant digit. To understand mixed radix representations, think of time as a triple (days, hours, minutes). The number of minutes represented is minutes + 60·hours + 60·24·days. The weights for our "mixed radix" complexity are given by w_L = 1, w_{L−j} = (j+1)·w_{L−j+1}, j = 1 to L−2, which makes w_{L−j} = (j+1)! = (j+1)·j·(j−1)·...·3·2·1. There would then be as many different complexity values as there are different change profiles, and the ranking of complexities would be intuitively good. Moreover comparison of complexity values would be easy, carried out by comparing profile entries from the left. The contribution to the complexity from long substrings would be negligible however. We will see later that both "profile sum" complexity (w_j = 1) and mixed radix complexity can be useful in modelling the "quantitative" mode of complexity perception.
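The two alternative weightings just described can be sketched in the same style (a hedged illustration reusing change_profile from the earlier snippet; under the stated recurrence the column for length-j substrings carries weight (L−j+1)!):

def profile_sum(s):
    # "Profile sum" complexity: all weights w_j = 1.
    return sum(change_profile(s))

def mixed_radix(s):
    # "Mixed radix" complexity: w_L = 1 and w_{L-j} = (j+1) * w_{L-j+1},
    # making p_2 the most significant digit (its radix is L).
    L = len(s)
    P = change_profile(s)      # P[0] = p_2, ..., P[L-2] = p_L
    total, w = 0, 1
    for j in range(L, 1, -1):  # j = L down to 2
        total += P[j - 2] * w
        w *= L - j + 2         # w_{j-1} = (L - j + 2) * w_j
    return total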
The actual choice of w_j = 1/(L−j+1) in (3) derives from the fact that there are L−j+1 substrings of S of length j, so we have defined C to be the sum over j ≥ 2 of the proportion of substrings of length j of S whose change is 1. This choice gives equal weight to all substring lengths, and models the "structural" mode of perception. Furthermore it provides a sufficient number of complexity values. Indeed for medium length strings it makes complexity a very efficient representation of profile. As an example, for L = 16 there are 11,889 different complexity values and 13,420 different profiles, and the ratio of these two numbers is 0.886. For L ≤ 25 this ratio is ≥ 0.725.
Since each p_j in (3) is in the range 0 to L−j+1, we see that the complexity C of a binary string S of length L is in the range 0 to L−1. We therefore define the normalized complexity N of S to be C/(L−1), so N is in the range 0 to 1. Where necessary we refer to C as the unnormalized complexity of S. The upper bounds for C and N are not attained, but we can expect that as L gets large both the mean and maximum values of N approach 1, and the standard deviation of N approaches 0, so that the normalized complexity of almost all long strings is near to the maximum value of 1.
For strings of length up to 32 we have determined the
distribution of complexity values by exhaustive search,
and for strings of length up to 128 we have estimated it
from a sample of one million strings chosen uniformly at
random. The distribution of unnormalized complexity for
strings of length 24 is shown in Fig. 1. A feature of the dis-
tribution is that the values are tightly bunched at the right
end of the distribution curve, which could be seen as
reflecting the decreasing ability of our measure to discrim-
inate between increasingly complex structures that is also
characteristic of human observers.
Fig. 1. The right-hand tail of the population frequency distribution of C(s) for binary strings of length 24 (x-axis: unnormalized complexity; y-axis: frequency, millions). Mean unnormalized complexity is 20.14 with a standard deviation of 0.43. The mode is 20.37. Maximum complexity equals 20.85, and mean normalized complexity is 0.88. Inset: A plot of the entire distribution from C(s) = 0 to 20.8.

Generalized palindromes tend to have low complexity. The reason is that if S is a binary string of length L with unnormalized complexity C, then [S] contributes 1 to C if S is not a generalized palindrome, and contributes 0 if S is a generalized palindrome. If L ≥ 5 a difference of 1 in unnormalized complexity translates to more than 1.2 standard deviations, and for L ≥ 21 it translates to more than 2 standard deviations.
2.6. Periodicity
One of the most striking features of our complexity mea-
sure and change function is their ability to pick up period-
icity in a string. This is of considerable interest since our
definition of complexity is not based on structure, and in
particular does not attempt to encode any specific form
of structure. Rather we have taken the view that change
can quantify complexity, and based our definitions on
change accordingly. We have already seen that generalized
palindromes have low complexity. That our complexity
measure is related to both palindromic and periodic struc-
ture is a partial vindication of our approach, and is one of
the most important features of our model. Coupled with
the results in the final part of this paper, which show that
our measure performs well alongside other measures in the
literature, it strongly suggests that one of the primary activ-
ities performed by the brain when it assimilates structure in
a pattern is to process the elements of change present in the
pattern.
First, a periodic string with period T and period length t is a string S that takes the form TTT..., where T is a string of length t, and t is minimal with this property, i.e. S is not of the form RRR..., where R is a string of length r < t. An ultimately periodic string is one that is periodic from some point onwards. We give results for periodic strings, but similar results can be observed for ultimately periodic strings.

Let S be a periodic binary string with period length t, and let S_n be the string formed from the first n characters of S. Numerical calculations indicate that the percentile complexity of S_n quite rapidly approaches zero as n gets large. It is usually zero for n > t²/2. This behaviour could perhaps have been expected. A partial explanation is that our calculations also indicated that a periodic string S tends to have an above average number of large substrings that are generalized palindromes, and these depress the complexity of S for much the same reasons as generalized palindromes tend to have low complexity. As an example, let S be the periodic string with period length 9 and period 101101110. Then the percentile complexity of S_n is 0 for n ≥ 33.
Even more striking, and unexpected, is the following, which, while almost certainly true, must remain a conjecture until we find a proof. It says that the change function is itself an indicator of structure, which would provide further evidence that processing change is fundamental to the detection of structure.

Conjecture 1. Let S be a periodic binary string with period T and period length t ≥ 2.
Let S_n be the string formed from the first n characters of S, let u_n = [S_n], n > 1, and u_1 = 0.
Then the string U = (u_1, u_2, u_3, ...) takes the form IXX..., where X has just one zero entry.
If T is of the form UV, where V is the complement of U, then I and X are of length t/2; otherwise they are of length t.

Shown below are the string S with period length 9 we considered earlier, and the associated string U.

S 101101110 101101110 101101110 101101110...
U 010110111 111110111 111110111 111110111...
It is important to note that if we had defined [S] to be 0 if S is a palindrome, periodic strings would not have low complexity, and the sequence U would not detect periodicity in periodic strings. In the example just given, we would find with this definition of [S] that [S_n] is 1 for every n ≥ 7, and that the percentile complexity of S_n is more than 90% for n ≥ 9.
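Using the earlier sketch, the reader can generate the U sequence for this example directly (a small illustration with change_matrix as defined above; u_1 = 0 by convention):

# u_n = [S_n] for the periodic string with period 101101110.
period = [1, 0, 1, 1, 0, 1, 1, 1, 0]
S = period * 4                               # first four periods (n = 36)
U = [0] + [change_matrix(S[:n])[(1, n)]      # [S_n] is the entry X_{1,n}
           for n in range(2, len(S) + 1)]
print(''.join(map(str, U)))                  # expected: 010110111 111110111 ...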
2.7. Instability and non-intuitive low complexity
A small change to a string can result in a large percentile
change in its complexity, with the consequence that some
strings that do not appear at first sight to be intuitively sim-
ple can have low complexity. Two examples of strings dif-
fering in only one position are
101100 010001011101110010 (C = 18.97, p = 2%)
101100 110001011101110010 (C = 20.49, p = 86%)
101 1011101011011101011011 (C = 20.04, p = 2%)
101 0011101011011101011011 (C = 21.33, p = 66%)
Here C denotes unnormalized complexity and p denotes
percentile. In the first example, the string with low com-
plexity is a palindrome of length 24, and in the second
example it consists of the first 25 symbols of a periodic
string of period length 9. A brief subjective view of these
strings will probably not see these structures, and will con-
sequently assess the strings as intuitively complex. Our
complexity measure is acting as an agent with sufficient
resources to pick up structure which is not immediately
obvious to a human observer on a brief glance, and so gives
these strings low complexity.
Although it is not surprising that a one bit change to a
string with low complexity can destroy internal structure to
the extent that our measure can no longer detect the struc-
ture, it is not true that a one bit change always turns low
complexity to high complexity. If we change the bit in posi-
tion 7 of the first string in the second example from 1 to 0,
we obtain the string 1011010101011011101011011, which
still has C = 20.04 and p = 2%. This string is not a palin-
drome, nor even a generalized palindrome, and it is not
the start of a periodic string. Its first 24 characters do form
a generalized palindrome, and this has helped to depress its
complexity. We are unable to specify exactly when a string
will get low complexity, but we believe that low complexity
will generally indicate some kind of internal structure. The
existence of strings whose internal structure is not immedi-
ately obvious does however mean there will be an imperfect
correlation between our measure and brief subjective
assessment of intuitive complexity.
We have found that strings which are intuitively simple
do get low complexity, and that the complexity ranking of
strings with low complexity is intuitively correct, though
there will always be room for argument. Thus the string "010101010101" has complexity 1.0 and percentile complexity 0%, while the string "001100110011" has complexity 5.45 and percentile complexity 1%. Intuitively these
strings are little different in complexity, and indeed at the
percentile level our measure barely distinguishes them.
2.8. Array complexity
It is possible to extend our theory of binary strings to
binary arrays, though the results and their proofs are con-
siderably more intricate, and the best change function may
not be binary. In this paper we have taken an easier road,
and defined the complexity of a binary array in terms of the
complexities of its rows, columns, and diagonals, showing
in the final part of the paper that this rather crude
approach works surprisingly well. We experimented with
several ways of doing it, and describe the one that worked
best.
Let A be an m × n binary array. We consider four sets of strings: the rows of A, the columns of A, the main diagonals of A (left to right slanting), and the back diagonals of A (right to left slanting), and calculate the average unnormalized complexity of each of these four sets of strings. Diagonals of length 1 are included and taken to have zero complexity. We define the normalized complexity N of A to be the sum of these four averages, scaled so that N < 1. We found it convenient to work with U = (L−1)N, where L is the average length of all the rows, columns, and diagonals of A, and call U the unnormalized complexity of A, because N and U reduce to normalized and unnormalized string complexity when m or n is 1. The details of the calculation of N and U are given in Appendix B.
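A rough sketch of this construction follows, reusing complexity from the earlier snippet. The exact scaling of N is given in Appendix B and is not reproduced here, so the function below returns only the sum of the four averages:

def array_complexity_sum(A):
    # A is an m x n binary array (list of lists of 0/1). Returns the sum
    # of the average unnormalized complexities of rows, columns, main
    # diagonals and back diagonals; length-1 diagonals count as 0.
    m, n = len(A), len(A[0])
    rows = A
    cols = [[A[i][j] for i in range(m)] for j in range(n)]
    diag = [[A[i][j] for i in range(m) for j in range(n) if i - j == d]
            for d in range(-(n - 1), m)]      # left-to-right slanting
    back = [[A[i][j] for i in range(m) for j in range(n) if i + j == d]
            for d in range(m + n - 1)]        # right-to-left slanting
    def avg(strings):
        return sum(complexity(s) if len(s) > 1 else 0.0
                   for s in strings) / len(strings)
    return avg(rows) + avg(cols) + avg(diag) + avg(back)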
Fig. 2 shows the sample distribution of unnormalized complexity for 24 × 24 arrays, obtained from a sample of 100,000 arrays chosen uniformly at random. The figure
is a bit misleading because it hides a long left hand tail of
outliers which do not show up in a sample of any feasible
size, and there is in fact even tighter bunching of values
at the right hand end of the distribution of array com-
plexity than there is for string complexity. As with
strings, most sufficiently large arrays have nearly the same
complexity, so that array complexity does not distinguish
well between arrays of average (perceptually high) com-
plexity. The outliers with low complexity correspond to
images possessing some structure, and array complexity
does provide good discriminability for such images. This
means that for our purposes the distribution of array
complexity is of limited use, but that the ranking of intuitively simple arrays provided by array complexity is useful.

Fig. 2. Sample distribution of C(s) for 24 × 24 arrays (N = 100,000; x-axis: unnormalized complexity; y-axis: frequency, thousands). Mean unnormalized complexity is 12.97 with standard deviation of 0.04. Average length of rows, columns and diagonals is 16.22. Mean normalized complexity is 0.85. The left tail of the distribution is omitted.
3. Comparisons with existing measures
We have examined the relationship between C(s) and several subjective and objective complexity measures
reported in the psychological literature. The purpose of this
overview is to demonstrate how well C(s) correlates with
different measures of complexity based on apparently
diverse principles. Given the assumed ordinal level of our
measure, all reported correlations involving C(s) are non-
parametric.
3.1. String complexity: simultaneous presentation
A well-known study of the effects of pattern complexity
on recall was carried out by Glanzer and Clark (1962) who
presented participants with the exhaustive set of binary
patterns of length 8. The stimuli were arrays of symbols
(diamond, circle, square, spade, diagonal cross, club, heart
and triangle) which were patterned in all possible combina-
tions of black and white. Two complementary subsets of 128 patterns were presented for 500 ms to two different groups
with the dependent variable being the accuracy of repro-
duction. Specifically, participants were asked to reproduce
the pattern by writing "B" for black and "W" for white in the blank arrays. The authors reported that patterns which
appeared simple were also reproduced more correctly.
They measured the complexity of the patterns using differ-
ent methods including the number of runs, Attneave’s
(1959) redundancy measure and an ad hoc measure based
on the Gestalt principles. They obtained a significant corre-
lation between the number of runs and subjects' accuracy scores (r = −.771) but rejected the measure because (a) it had no theoretical basis and (b) there was a curvilinear
relationship between the number of runs and accuracy
scores. This was caused by the fact that subjects processed
patterns with few runs (e.g. BWWWWWWW) and those
with many runs but regular alternations (BWBWBWBW)
with comparable ease. Note that similar curvilinear rela-
tionships are obtained in distinct but related contexts such
as subjective randomness research (e.g. Falk & Konold,
1997). Our measure addresses this issue by removing
periodic strings from the right-hand tail of the distribution
and placing them among strings which have few
alternations.
Not satisfied by Attneave’s redundancy measure (low
correlation) or ad hoc Gestalt-inspired measures (no
theoretical rationale), the authors used the length of the
subjects’ verbal descriptions of the patterns (Mean Verbal-
ization Length; MVL) as a measure of stimulus complexity.
Perhaps not surprisingly, they obtained a very high correlation between MVL and accuracy scores (−.826).
Although the result seems impressive, there is a serious
objection to using MVL as a measure (as opposed to a cor-
relate) of complexity. Given the complexity of the underly-
ing psychological processes, one can justifiably view the
ease or difficulty with which subjects verbally describe a
pattern as a consequence rather than as a predictor of
stimulus complexity (see Vitz, 1968 for a related criticism).
In a sense, MVL must be highly correlated with other
performance indicators because it can be seen as one of
them. More seriously, MVL tells us nothing about the
qualities and structural properties of the pattern which
are associated with subjective perception of complexity.
A similarly "agnostic" position on pattern complexity was adopted by Lordahl (1970), who proposed a
40-parameter model of sequential prediction for binary
patterns.
In order to examine the relationship between our mea-
sure and subjects’ performance, we collapsed the two sub-
sets of 128 complementary patterns and calculated a mean
of mean accuracy scores for each pattern. We obtained a
highly significant correlation between C(s) and accuracy scores (r = −.349, p < .001). Following Glanzer and Clark's example, however, we had to conclude that our measure accounted for only about 10% of the variance in response accuracy, hardly a satisfactory result.
The explanation for the low correlation was found in the distinction between the quantitative and structural processes described in Section 2.5. Out of all studies analyzed
here, Glanzer and Clark’s study is the only one to have
included short presentation time (500 ms). The dominance
of the quantitative factors can be illustrated by the fact that
a very simple pattern (01010101) was recalled very
poorly (mean accuracy score of around .350 out of the
maximum of 1).
After considering the high task demands (fast stimulus
presentation and possible distraction caused by the symbol
outlines), we correlated the profile sums with mean accuracy scores and, after removing three outliers (including 01010101), obtained a correlation of −.695 (p < .001), accounting for almost 50% of the variance. Restricting the profiles to the first three levels increased the correlation to −.744. We then used the mixed radix representation of the pattern profile, which gives even more weight to changes at the lowest level of structure. Interestingly, the correlation (−.830) was higher in magnitude than that between the number of runs and accuracy (−.771) and slightly higher than
the correlation between MVL and accuracy. This evidence
allows us to interpret the quantitative/structural distinction
in terms of levels of change. Quantitative factors are related
to low-level change (e.g. number of runs and alternations
and second-order entropy) whereas structural processes
require the system to ascend the hierarchy of change to
higher levels. As we show below, in its current form our
measure appears suited to situations in which observers
have sufficient time to consider the structural properties
of a pattern.
An influential study using black and white 1-D patterns
of length 7 was carried out by Alexander and Carey (1968),
who investigated the effect of pattern complexity on the
performance of different tasks. Participants were presented
with 35 patterns – linear arrangements of three black and
four white squares differing in complexity – and were asked
to complete four tasks (search, reconstruction, memoriza-
tion and verbal description). The data for the four experi-
ments were ranked with respect to the patterns and
correlations showed remarkable agreement. The authors
pointed out that simple concepts such as overall symmetry
or number of blocks could not account for the agreement.
They proposed the concept of subsymmetry, that is, sym-
metry of the parts of the pattern. Without offering a theo-
retical rationale, they suggested that patterns possessing
more symmetry at all levels would be perceived as simpler.
The measure was highly correlated with the average ranking of the patterns (r = .808, p < .001). Although ad hoc, subsymmetry is related to our measure because patterns containing symmetries at all levels also contain less change at all levels. Instead of counting instances of symmetry at every level, we count instances of (recursively-defined) change. As expected, the average number of subsymmetries was significantly correlated with C(s) (r = −.672, p < .001). The correlation between pattern goodness rank averaged across the experiments and the corresponding values of C(s) was highly significant (r = .694, p < .001). Interest-
ingly, the number of runs was weakly correlated with the
number of subsymmetries and not correlated with subjec-
tive pattern ranks. This result should be contrasted to that
reported by Glanzer and Clark (1962), whose subjects
showed clear reliance on low-level quantitative informa-
tion. The difference can be explained by the fact that in
Alexander and Carey’s study, subjects had sufficient time
to inspect the patterns and consider higher-level structural
information.
A context closely related to complexity is subjective ran-
domness. Although in psychology the two have not been
explicitly linked, algorithmic information theory considers
random those patterns that possess high Kolmogorov com-
plexity. Consequently, we could predict that our measure
would correlate significantly with subjective randomness
judgments. An influential study on the relationship
between complexity and subjective randomness was carried
out by Falk and Konold (1997), in which objective, infor-
mation-theoretic predictors of complexity (e.g. second-
order entropy and probability of alternation) were
compared with subjective ones (apparent randomness,
copying difficulty and memorization time) using a set of
40 binary strings of length 21. The strings were selected
so that although most were perceptually different, four sets
of ten strings were matched for number of runs and sec-
ond-order entropy. Our measure was highly correlated with all three performance variables, namely, apparent randomness (r = .720), copying difficulty (r = .796) and memorization time (r = .865; all p < .001). In addition, distributions of apparent randomness and copying difficulty were negatively skewed. In contrast to Glanzer and Clark's study, low-level change did not seem to be important to Falk and Konold's subjects, since the number of runs correlated modestly only with apparent randomness (r = .364, p = .021) but not with the other measures. Again,
this can be interpreted in terms of the tasks employed by
Falk and Konold. Copying and memorization tasks give
subjects time to consider higher levels of structure and our
measure in its current form appears to be particularly well
suited to such tasks.
In another study of subjective randomness, Griffiths and
Tenenbaum (2003) asked subjects to rank 128 binary pat-
terns of length 8 according to how random they perceived
them to be. The authors also provided a multi-parameter
model of subjective randomness which corresponded clo-
sely to the subjective scores. The predictions of the model
correlated highly with the subjective rankings (r = .806, p < .001). We observed highly significant correlations between our measure and the two variables (subjective rankings and the model; r = .663, p < .001 and r = .695, p < .001 respectively). Controlling for complementary and mirror symmetrical strings increased the correlations somewhat (.715 and .713 respectively).
From the above examples, we can conclude that C(s)
correlates highly with disparate measures of complexity
and randomness, both subjective and objective. Further-
more, these results support the view, originally put forward
by Kolmogorov, that subjective randomness equals high
complexity.
3.2. String complexity: sequential presentation
C(s) does not take into account sequential effects which
have been demonstrated with serial presentation (e.g.
Feldman & Hanna, 1966; Restle, 1970) and as such it does
not appear suited to quantifying complexity in this context.
Yet, if complexity is viewed even partly as a property of a
pattern, the relationship between change and processing
difficulty should hold irrespective of the mode of presenta-
tion. Two objective (statistical) models of binary pattern
complexity, called H(k-span) and H(run-span), were pro-
posed by Vitz (1968) as elaborations of an earlier model
(Vitz & Todd, 1967). The former is based on the transitional uncertainties between pattern elements of different length and equals log₂ of the length of a string, whereas the latter is log₂ of the total number of runs in the pattern.
Vitz correlated these two measures with subjective judged
complexity (JC) and mean verbalization length (MVL)
for 26 binary patterns of varying length, obtained over
two experiments. In order to increase sample size, three
measures [H(k-span), H(run-span) and judged complexity]
were collapsed across experiments and MVL was excluded
because it had been used with only eight patterns in Exper-
iment 1. This did not inflate the result, because the correla-
tions were highly significant within individual experiments.
Again, correlation coefficients between C(s) and the three measures were highly significant (C(s)/H(k-span) = .838, p < .001; C(s)/H(run-span) = .786, p < .001; C(s)/JC = .861, p < .001). Although C(s) was very highly correlated
with the participants’ performance, so were Vitz’s mea-
sures. Given that those measures were based on quantita-
tive aspects of information (string length and number of
runs), there appeared to be no reason for claiming that
C(s) was more psychologically valid.
Vitz (1968), Vitz and Todd (1967) and Leeuwenberg
(1969) reported high correlations between their measures
and subjective data as evidence of the validity of their
approaches. However, high correlations can be obtained
between variables even when one of them possesses few dis-
tinct values. Thus, correlation on its own does not guaran-
tee that a measure is a realistic index of psychological
complexity. Humans are reasonably (if not infinitely) discriminating with regard to complexity, and a subjective complexity scale can assume a large number of distinct values. Given the importance of subjective judgment, what is needed in addition to correlation is some way of knowing how sensitive a measure is, that is, how close it is to the sensitivity of the human observer. Vitz (1968) was aware of this problem and stated that "... probably ... coding into runs is too simple a description of the coding process" (p. 280).
One possibility is to compute the ratio of the number of distinct subjective responses to the number of distinct stimuli and compare this with the ratios obtained from different measures. Clearly, a measure should be capable of assuming a number of distinct values which is sufficiently large to approximate the complexity of the underlying psychologi-
cal process. The sensitivity of C(s) (.92) was virtually indis-
tinguishable from subjective judgment (.96) and much
higher than the sensitivity of Vitz and Todd’s measures
(.23 and .17 for k-code and run-code measures respec-
tively). With the number of subjective judgments used as
the baseline, Vitz’s measures appear highly insensitive
and significantly different from subjective judgment
(p = .001 and p < .001 respectively; Binomial test). By contrast, there was no statistical difference between the number of subjective judgments and C(s) (p = 1). This suggests that C(s) corresponds better to the subjective complexity judgments than do Vitz's probabilistic measures.
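On our reading, the sensitivity ratio used here can be computed as follows (a hypothetical helper, not the authors' code):

def sensitivity(values):
    # Proportion of distinct values a measure assigns over a stimulus
    # set; e.g. 24 distinct values over 26 patterns gives roughly the
    # .92 reported for C(s).
    return len(set(values)) / len(values)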
Vitz and Todd (1969) extended their approach by devel-
oping a measure of complexity called Hcode. Based on the
hierarchical coding of (binary or trinary) pattern elements
and the application of Garner’s (1962) multivariate uncer-
tainty analysis, the measure was developed specifically to
account for sequential pattern learning. The authors
obtained very high correlations between their measure
and judged complexity of the 20 binary patterns 1–8 symbols long (.941, p < .001). C(s) was significantly correlated with both Hcode and judged complexity scores (.605 and .581, respectively, both p = .005). We reiterate that unlike Hcode, which assumes sequential pattern processing, our model makes no such assumptions.
Psotka (1975) proposed a measure of sequential pattern structure called syntely, based on expectancies created by runs and alternations within a pattern. He investigated 35 binary patterns of length 8, comparing six measures of pattern structure: an information-theoretic complexity measure by Vitz and Todd (Hcode; 1969; discussed in Simon, 1972), judged complexity, measured and judged symmetry, and measured and judged syntely. As can be seen from Table 1, C(s) showed no correlation with Vitz and Todd's
measure or with judged syntely. In contrast, it correlated
significantly with judged complexity, measured symmetry,
judged symmetry and measured syntely. Interestingly, mea-
sured syntely correlated only with judged syntely, indicat-
ing that these measures were not adequate models of
subjective complexity. Furthermore, Vitz and Todd’s mea-
sure showed no correlation with judged symmetry. Table 1
shows correlations between the measures reported by Pso-
tka and C(s). In order to enable a comprehensive compar-
ison, the table also includes the following complexity
measures: number of runs, Leeuwenberg’s (1969) original
SIT code and van der Helm’s (2000) SIT code. It should
be noted that out of eight measures (seven objective and
one subjective) C(s) exhibited the highest correlation both
with subjective complexity and subjective symmetry scores
(highlighted). This is particularly surprising given that our
measure is based on a simple theoretical premise and has
not been adapted to the requirements of sequential
presentation.
Fig. 3 illustrates the relationship between four different measures and subjective complexity judgments. It can be seen that our measure, C(s), not only shows the highest correlation with subjective judgments but also provides more distinct values. By contrast, runs, which take into account only low-level structure, correlate poorly. Hcode, which was specifically designed for sequential patterns, performs slightly better but shows a pronounced clustering at the upper end of the response distribution. Van der Helm's SIT code shows an even higher correlation (.55), but it possesses only five different values over a range of 35 stimuli. Although C(s) appeared to be non-linearly related to the responses, a quadratic fit was only slightly better than a linear one (R² = .542 and .527 respectively).
Our claim that C(s) performed better than the other measures was confirmed in terms of sensitivity. As can be seen from Table 2, C(s) is closer than the other objective models to the subjective complexity judgments in terms of the number of distinct values. There was no difference in sensitivity between C(s) and the subjective complexity judgments (p = .560 on a binomial test). By contrast, Vitz's probabilistic model, for example, provides significantly fewer distinct values (p = .001).
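For reference, the sensitivity ratios in Table 2 appear to be simply the number of distinct values a measure assigns divided by the number of patterns (21/35 = .60 for C(s), 7/35 = .20 for runs, and so on). A minimal Python sketch of this reading; the 35-pattern score vector below is purely illustrative:

    def sensitivity_ratio(values):
        # Number of distinct values a measure assigns to a pattern set,
        # divided by the number of patterns (our reading of Table 2).
        return len(set(values)) / len(values)

    # Illustrative only: a measure taking 21 distinct values over
    # Psotka's 35 patterns reproduces the C(s) row of Table 2.
    scores = list(range(21)) + [0] * 14  # 21 distinct values, 35 patterns
    print(round(sensitivity_ratio(scores), 2))  # 0.6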
Finally, we conducted a visual inspection of the complexity distributions for the different measures. Traditionally, invariance has been viewed as the opposite of change. This is implicitly represented in the entropy/redundancy distinction, which views the former as the inverse of the latter (entropy = 1 − redundancy; Attneave, 1954). Yet psychological research has indicated that change is more difficult to process than the absence of change, and that symmetrical information-theoretic probability distributions, which treat change and its absence as equiprobable, do not agree with the distributions of scores on a number of complexity- and randomness-related tasks, which are negatively skewed (Falk & Konold, 1997, p. 306). Briefly, observers consider as most random or complex those patterns which contain a large amount of change, as long as the change is not regular.
An examination of distributions of different complexity
measures in Fig. 4 shows that C(s) behaves reasonably well
in modelling the distribution of subjective judgments. By
contrast, Hcode contains too few distinct values to repre-
sent a plausible model of subjective complexity. Note the
negative skew of the distribution of subjective complexity
judgments, which agrees with our theoretical distribution.
From this, it appears that, at least for short binary pat-
terns, our measure successfully approximates the shape of
the subjective complexity scaling distribution.
A point worth mentioning is that the reason our measure produced identical values for different strings is that in most cases the strings in question were structurally equivalent (or 'logically equivalent' in Vitz's terms), a point not generally raised by other authors. According to our definition, mirror-symmetrical (001 = 100) and complementary (001 = 110) strings possess the same complexity. In other words, mirror-symmetrical and complementary strings contain an equal amount of change or information. Clearly, subjective judgments will not agree with this all of the time.
Table 1
Rank order correlations between C(s) and other complexity measures on the data provided by Psotka (1975).

Measure   JC       MS       JS       Msyn     Jsyn     C(s)     Runs     SIT(L)   SIT(vdH)
MC        .468**   .560***  .056     .006     .101     .247     .673**   .084     .419**
JC        –        .605***  .601***  .273     .186     .668***  .338***  .307     .550***
MS        –        –        .356*    .185     .117     .389*    .431**   .339*    .642***
JS        –        –        –        .246     .062     .800***  .120     .422*    .345*
Msyn      –        –        –        –        .677***  .388*    .100     .433**   .221
Jsyn      –        –        –        –        –        .263     .063     .243     .193
C(s)      –        –        –        –        –        –        .052     .290     .442**
Runs      –        –        –        –        –        –        –        .250     .451**
SIT(L)    –        –        –        –        –        –        –        –        .385*

Note. MC = measured complexity; JC = judged complexity; MS = measured symmetry; JS = judged symmetry; Msyn = measured syntely; Jsyn = judged syntely; C(s) = our measure; Runs = number of runs; SIT(L) = SIT code (Leeuwenberg, 1969); SIT(vdH) = SIT code (van der Helm, 2000).
* p < .05.
** p < .01.
*** p < .001.
To illustrate, in Psotka's study, the strings 01101010 and 01010110 have very different subjective complexity values (525 and 400, respectively), despite the fact that the latter is a mirror-symmetrical version of the former. The latter string could have been judged less complex because it began with a sequence of 01 pairs, an important consideration in the context of sequential presentation. Nevertheless, the high correlation between our measure and subjective complexity indicates that local variations in complexity are less important than the overall structure of the pattern.
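The structural equivalence invoked here can be made concrete: a string, its mirror image (r), its complement (c) and its mirrored complement (rc) all receive the same C(s). A minimal Python sketch, in which canonical is our own illustrative helper rather than anything defined in the paper:

    def complement(s):
        # Swap 0s and 1s (the operator c of Appendix A).
        return s.translate(str.maketrans("01", "10"))

    def equivalence_class(s):
        # The structurally equivalent variants of a binary string:
        # identity, reversal r, complementation c, and rc.
        return {s, s[::-1], complement(s), complement(s)[::-1]}

    def canonical(s):
        # One representative per class (lexicographically smallest);
        # structurally equivalent strings share a representative.
        return min(equivalence_class(s))

    # 001, its mirror image 100 and its complement 110 all reduce to
    # the same representative, so any measure that respects the
    # equivalence must assign them the same complexity.
    print(canonical("001"), canonical("100"), canonical("110"))  # 001 001 001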
The supramodal nature of perceptual organization (e.g.
Aksentijevic, Elliott, & Barber, 2001) suggests that the
effectiveness of C(s) in quantifying complexity/information
might extend to auditory pattern perception. In a famous
study, Royer and Garner (1966) investigated the ability
of participants to synchronize their tapping responses with
19 repeated binary auditory sequences of length 8. The
sequences were played continuously and the subjects were
required to tap in synchrony with the perceived start of
the pattern. As expected, the more complex patterns were
more difficult to organize and were associated with higher
response uncertainty, longer response delays and higher
error rate. Response uncertainty (in bits), median delay
and mean number of errors for each pattern were signifi-
cantly correlated with C(s)(r= .638, p= .003; .595,
p= .007; and .580, p= .009, respectively). While the corre-
lations are not very high, they are comparable to the ones
reported above. The number of runs in individual patterns
was not correlated with any dependent variables, suggest-
ing that the subjects relied substantially on higher-order
structural information.
Another well-known study of sequential pattern processing, this time using visual presentation, was carried out by Garner and Gottwald (1967). Subjects were presented with two binary patterns of length 5. The patterns were labeled 'simple' (RRRLL) and 'complex' (LLRLR) and presented to subjects via two lights (left or right).
[Fig. 3 appears here: four scatterplots of subjective complexity against runs (r = .34), Hcode (r = .47), SIT code (van der Helm) (r = .55) and C(s) (r = .67).]
Fig. 3. Correlations between four objective measures of complexity and subjective responses for sequentially presented patterns of length 8 (Psotka, 1975). See text for details.
Table 2
Sensitivity of measures employed by Psotka (1975) and C(s).

Measure     Distinct values   Sensitivity ratio
MC          7                 .20
JC          26                .74
MS          8                 .23
JS          34                .97
Msyn        17                .48
Jsyn        11                .31
C(s)        21                .60
Runs        7                 .20
SIT(L)      4                 .11
SIT(vdH)    4                 .11
[Fig. 4 appears here: frequency distributions of judged complexity, Hcode and C(s) values.]
Fig. 4. Frequency distributions of the complexity values for patterns studied by Psotka (1975). Hcode corresponds to the measure proposed by Vitz and Todd (1969) and structural complexity refers to our measure.
The patterns were presented from each possible starting point, giving 10 different patterns. Subjects were asked either to predict the location of the next light from the beginning of the sequence (immediate responding) or once they were confident about the form of the sequence (delayed responding). The authors examined a number of dependent variables, including the mean number of trials needed to reach criterion and the number of errors for each starting point. C(s) correlated significantly with the former variable (r = .749, p = .013) and there was a marginally significant correlation with the latter (r = .625, p = .054). Although the number of data points was small (10), the significant correlation indicates the importance of change in sequential pattern prediction. In contrast to the Royer and Garner study, the number of runs was significantly correlated with both variables.
3.3. Array complexity

Although C(s) is proposed primarily as an index of string complexity, preliminary investigations suggest that it could be useful in determining the complexity of binary arrays. In a large and comprehensive study, Chipman (1977) investigated the influence of different structural properties on judgments of the complexity of black and white patterns (6 × 6 matrices containing 12 black squares). Judging the complexity of 2-D arrays is a highly complex process which might involve a large number of different quantitative and structural factors. To illustrate, Chipman examined the following: the number of turns, (perimeter)²/area, horizontal and vertical symmetry, diagonal symmetry, opposition symmetry and the number of repetitions. We examined the complexity of the 45 patterns used in Experiment 1 (of seven) because these patterns were assigned judged complexity values, making them accessible for analysis. The correlation between C(s) and judged complexity was highly significant (r = .754, p < .001). Interestingly, the sensitivity of C(s) (.98) exceeded that of the subjective judgment (.87).
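Several of Chipman's quantitative factors can be computed directly from a binary matrix. As an illustration, a Python sketch of the (perimeter)²/area statistic under one common reading (our assumption, not a transcription of Chipman's procedure): perimeter counts the black-cell edges that face a white cell or the border of the grid.

    def perimeter_sq_over_area(grid):
        # (perimeter)^2 / area for the black cells (1s) of a binary
        # matrix, with perimeter taken as the number of exposed edges.
        rows, cols = len(grid), len(grid[0])
        area = sum(sum(row) for row in grid)
        perimeter = 0
        for i in range(rows):
            for j in range(cols):
                if grid[i][j] == 1:
                    for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                        ni, nj = i + di, j + dj
                        if not (0 <= ni < rows and 0 <= nj < cols) or grid[ni][nj] == 0:
                            perimeter += 1
        return perimeter ** 2 / area

    # A solid 2 x 2 block: area 4, perimeter 8, statistic 16.
    print(perimeter_sq_over_area([[1, 1], [1, 1]]))  # 16.0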
As discussed earlier, symmetry represents one of the primary determinants of pattern goodness. Howe (1980) conducted a large-scale study on the effects of partial symmetry (symmetry of parts of an object) on different tasks such as exposure duration, masking, immediate memory and reproduction. Participants were presented with 60 dot patterns containing varying degrees of symmetry. The stimulus set was constructed by gradually reducing the amount of symmetry in 12 randomly chosen 'good' patterns. Subjects' performance was clearly governed by the amount of symmetry. To illustrate, subjective judgments of goodness correlated highly with the change in the degree of partial symmetry, and performance on the exposure, masking, memory and reproduction tasks was inversely proportional to the complexity (absence of symmetry) of the stimuli. To test the validity of C(s) in the context of the relationship between goodness and symmetry, each of Howe's patterns was assigned a structural complexity value and these values were correlated with the mean subjective goodness ratings given by the subjects. The correlation between subjective judgment and C(s) was highly significant (r = .673, p < .001), confirming that C(s) in its current form represents a good predictor of the subjective perception of goodness. The sensitivity ratios for the subjective judgments and our measure were .91 and .65, respectively.
Related to this, Yodogawa (1982) proposed an objective measure of symmetry for 2-D binary patterns, based on the two-dimensional discrete Walsh transform. It should be noted that Yodogawa's measure is very successful in predicting judged goodness. He compared his measure, 'symmetropy', with goodness judgments using the patterns previously employed by Howe (1980). The rank order correlation between symmetropy and C(s) for the ten binary patterns presented in the paper was .790 (p = .007).
4. Discussion

The fact that our approach is based on a simple premise (amount of change), together with its excellent correspondence with empirical data, suggests that it could be more powerful than other comparable models (see Table 3). One of the interesting properties of the model is its ability to detect periodicities in binary strings, which appear inaccessible to a measure defined in terms of symmetry. Although partly motivated by physical and computational accounts of entropy, our measure successfully models the well-documented negative skew of subjective complexity and randomness distributions. In addition, our measure appears to be more sensitive (i.e. closer to subjective performance in terms of distinct complexity values) than other measures, both probabilistic and algorithmic.
Our approach elucidates the relationship between informational entropy on the one hand and MacKay's (1950) metric/structural distinction on the other. Informational entropy is a metric and frequentist concept, defined by the size of the source or the frequencies of occurrence of different outcomes. This is why information theory is not well suited to quantifying structure, which refers to relationships between elements/symbols/objects. Our measure integrates the quantitative and structural aspects of information by quantifying the occurrence of the simplest and most general of relationships: same or different.
As the complexity of a pattern increases, more effort is needed to encode, assimilate or compress the pattern, and this is reflected in our measure. As expected, simple patterns are few and represent statistical aberrations of high order. An overwhelming proportion of possible structural arrangements is far too complex (contains too much change) for the observer to relate to immediately. In order to assimilate such patterns, the observer has to consider structural relations at all levels of the pattern. Our approach sheds light on the well-documented relationship between the amount of information present in a pattern and the amount of time needed to assimilate it. A brief exposure to a stimulus compels the perceptual/cognitive system to rely on surface, quantitative information. This information is represented,
for example, by the number of uniform patches, turns or angles (Ichikawa, 1985). The more time observers are given to study the pattern, the more their judgment is influenced by the relations between the elements and by different levels of structure. A broad analogy could be drawn here with evidence that visual perception follows a similar course, proceeding from coarse, surface information to finer structural detail.
In agreement with Falk and Konold (1997), we propose that psychological complexity reflects the effort (or cost) required to convert available information into useful information. In a sense, C(s) measures the distance between the two domains. For simple, orderly patterns, this distance is small and this is reflected in our measure. Such patterns are easily discriminated and little effort is required to assimilate them. The increase in the distance between the two domains caused by the limitations of the human observer is reflected in the fact that complex patterns are increasingly more difficult to discriminate. When complexity reaches a certain level, different patterns become indistinguishable. This observation can be linked to the current debate on the nature of randomness, with some authors equating randomness with simplicity (e.g. Adami & Cerf, 2000; Gell-Mann, 1995). We suggest that the perceived simplicity of random patterns is due to a fundamental limitation of the human observer, who has to abandon trying to assimilate highly complex contexts and is compelled to treat them as undifferentiated noise.
To summarize, the notion of cost allows us to relate the psychological understanding of complexity to physical and computational contexts. The fact that C(s) correlates reliably with a wide range of disparate measures of complexity suggests that change represents the conceptual core of complexity. An increase in change implies an increase in entropy. Symmetry and periodicity denote transformational invariance, that is, absence of change under transformation.
Table 3
Summary of comparisons of C(s) with different complexity measures.

Study                           N (patterns)  Modality  Presentation  Dimensions   Measure                S/O  Correlation with C(s)
Glanzer and Clark (1962)        128           V         Sim           1 × 8        Reproduction accuracy  O    .83***
Alexander and Carey (1968)      35            V         Sim           1 × 7        Perceived goodness     S    .69***
                                                                                   N subsymmetries        O    .67***
Falk and Konold (1997)          40            V         Sim           1 × 21       Apparent randomness    S    .72***
                                                                                   Copying difficulty     S    .80***
                                                                                   Memorization time      S    .86***
Griffiths and Tenenbaum (2003)  128           V         Sim           1 × 8        Perceived randomness   S    .71***
                                                                                   G & T model            O    .71***
Vitz (1968)                     26            V         Seq           1 × 1–1 × 8  H(k-span)              O    .84***
                                                                                   H(run-span)            O    .79***
                                                                                   Judged complexity      S    .86***
Vitz and Todd (1969)            20            V         Seq           1 × 1–1 × 8  H(code)                O    .58**
                                                                                   Judged complexity      S    .61**
Psotka (1975)                   35            V         Seq           1 × 8        H(code)                O    .25
                                                                                   Judged complexity      S    .67***
                                                                                   Measured symmetry      O    .39*
                                                                                   Judged symmetry        S    .80***
                                                                                   Measured syntely       O    .39*
                                                                                   Judged syntely         S    .26
Van der Helm (2000)             35            V         N/A           1 × 8        SIT code               O    .44**
Garner and Gottwald (1967)      10            V         Seq           1 × 5        Trials to criterion    S    .75*
                                                                                   Number of errors       S    .65†
Royer and Garner (1966)         19            A         Seq           1 × 8        Response uncertainty   S    .64**
                                                                                   Response delay         S    .59**
                                                                                   Error rate             S    .58**
Chipman (1977)                  45            V         Sim           6 × 6        Judged complexity      S    .75***
Howe (1980)                     60            V         Sim           5 × 5        Perceived goodness     S    .67***
Yodogawa (1982)                 10            V         Sim           5 × 5        Symmetropy             O    .79**

Note. Modality: A = auditory, V = visual. Presentation: Sim = simultaneous, Seq = sequential. S/O = subjective/objective measure.
* p < .05.
** p < .01.
*** p < .001.
† p < .1.
Sequences (visual and auditory) and arrays that contain little change are easier to compress, perceive, analyze or describe. This is true of computers as well as human observers.
Acknowledgments
The authors wish to thank Slobodan Markovic for his help in surveying the psychological complexity literature. We
also thank Ruma Falk and Cliff Konold for providing
the complete data set from their 1997 study and for their
encouraging suggestions as well as Mike Oaksford for his
editorial advice. Finally, we extend special thanks to Peter
van der Helm for his exhaustive and helpful comments as
well as for his help with data coding.
Appendix A. Proofs of the lemmas and theorems in Sections 2.2 and 2.3

Lemma 1. Let S = (s_1, s_2, ..., s_L) be a binary string, L > 1, and let U = (s_1, s_2, ..., s_{L-1}), V = (s_2, s_3, ..., s_L). Then S is a palindrome if and only if U ≡ V.

Proof. Suppose S is a palindrome. Then S = σS, where σ is one of r, rc. It is straightforward to verify that in both cases U = σV, so that U ≡ V.

Conversely, suppose U ≡ V. Then V is one of U, cU, rU, rcU. It is straightforward to verify the following statements. If V = U then S is one of 000..., 111.... If V = cU then S is one of 0101..., 1010.... If V = rU then S = rS. If V = rcU then S = rcS with L even. In all cases S is a palindrome. □
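Lemma 1 is easy to verify exhaustively for short strings. A minimal Python sketch, with is_palindrome implementing the definition used here (S = rS or S = rcS) and equivalent implementing the relation ≡:

    from itertools import product

    def complement(s):
        return s.translate(str.maketrans("01", "10"))

    def is_palindrome(s):
        # S is a palindrome if S = rS or S = rcS.
        return s == s[::-1] or s == complement(s)[::-1]

    def equivalent(u, v):
        # U ≡ V means V is one of U, cU, rU, rcU.
        return v in {u, complement(u), u[::-1], complement(u)[::-1]}

    # Check the lemma for every binary string of length 2 to 10.
    for L in range(2, 11):
        for bits in product("01", repeat=L):
            s = "".join(bits)
            assert is_palindrome(s) == equivalent(s[:-1], s[1:])
    print("Lemma 1 verified for all binary strings up to length 10")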
Theorem 1. Let the binary string S = (s_1, s_2, ..., s_L), L > 2. Then [S] = 0 if and only if [s_1, s_2, ..., s_j] = [s_{L-j+1}, s_{L-j+2}, ..., s_L], j = 2 to L - 1.

Proof. Let (s_1, s_2, ..., s_{L-1}) and (s_2, s_3, ..., s_L) have change profiles P = (p_2, p_3, ..., p_{L-1}) and Q = (q_2, q_3, ..., q_{L-1}) respectively. By Definition 6, [S] = 0 if and only if P = Q. From Eq. (1b), we have that for j = 2 to L - 1,

p_j = [s_1, s_2, ..., s_j] + [s_2, s_3, ..., s_{j+1}] + ... + [s_{L-j}, s_{L-j+1}, ..., s_{L-1}]
q_j = [s_2, s_3, ..., s_{j+1}] + ... + [s_{L-j}, s_{L-j+1}, ..., s_{L-1}] + [s_{L-j+1}, s_{L-j+2}, ..., s_L]

Thus P = Q if and only if [s_1, s_2, ..., s_j] = [s_{L-j+1}, s_{L-j+2}, ..., s_L], j = 2 to L - 1, proving the theorem. □
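The profile computation used in this proof is mechanical: for a string W, the j-th profile entry sums the change values of all length-j substrings of W. A minimal Python sketch, in which change stands for the paper's change function [·] of Eq. (1b); the alternation-counting stand-in below is ours, for illustration only:

    def change_profile(w, change):
        # (p_2, ..., p_len(w)): p_j sums the change values of all
        # length-j substrings of w, as in the proof of Theorem 1.
        return [sum(change(w[i:i + j]) for i in range(len(w) - j + 1))
                for j in range(2, len(w) + 1)]

    # Stand-in change function (ours, not the paper's): the number
    # of adjacent unequal symbols in a string.
    def alternations(s):
        return sum(a != b for a, b in zip(s, s[1:]))

    s = "01100"
    P = change_profile(s[:-1], alternations)  # profile of (s_1 ... s_{L-1})
    Q = change_profile(s[1:], alternations)   # profile of (s_2 ... s_L)
    print(P, Q)  # [S] = 0 would require P = Q (Definition 6)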
Lemma 2. Let S, T be binary strings of length L ≥ 2, and let [·] be any change function. If S ≡ T implies [S] = [T], then S ≡ T implies that S and T have the same change profile.

Proof. Write S = (s_1, s_2, ..., s_L), T = (t_1, t_2, ..., t_L). Suppose S ≡ T. Then S = σT, where σ is one of e, r, c, rc. Here e denotes the identity operator. Consider the following one-to-one correspondence φ between the substrings of S and T.

Let X_ij be the substring of S of length j that starts at s_i, j = 2 to L, i = 1 to L - j + 1.

If σ is e or c then φ maps X_ij to the substring of T of length j that starts at t_i.

If σ is r or rc then φ maps X_ij to the substring of T of length j that finishes at t_{L-i+1}.

It is straightforward to verify that in all cases φX_ij ≡ X_ij.

Now suppose that for all string lengths ≥ 2, U ≡ V implies [U] = [V]. Then we have that [φX] = [X] for all substrings X of S of length ≥ 2. By the definition of change profile, this means S and T have the same change profile. □
Theorem 2. Let S, T be binary strings of length L ≥ 2. If S ≡ T then [S] = [T].

Proof. The proof is by induction on L. We first observe that the theorem is true for L = 2. We then make the induction hypothesis that the theorem is true for string lengths j = 2 to L - 1, L > 2, and use the induction hypothesis to show that the theorem is true for string length j = L.

For L = 2 just note that [00] = [11] = 0, and [01] = [10] = 1. Suppose therefore L > 2 and that the theorem holds for string lengths j = 2 to L - 1. Let U_j, V_j be formed from the first and last j symbols of S, and W_j, Z_j be formed from the first and last j symbols of T, j = 2 to L - 1.

Suppose S ≡ T. Then S = σT, where σ is one of e, r, c, rc. Here e denotes the identity operator. It is straightforward in each case to verify that U_j = σW_j and V_j = σZ_j, j = 2 to L - 1. In other words U_j ≡ W_j and V_j ≡ Z_j for each j = 2 to L - 1. By the induction hypothesis this means that [U_j] = [W_j] and [V_j] = [Z_j] for each j = 2 to L - 1. By Theorem 1 this makes [S] = [T]. □
Lemma 3. Let S be a binary string of length L > 2. If S is a palindrome then [S] = 0.

Proof (using Theorems 1 and 2). Let U_j, V_j be the strings formed from the first and last j symbols of S, j = 2 to L. Suppose S is a palindrome. Then S = σS, where σ is one of r, rc. It is straightforward to verify that in both cases U_j = σV_j for each j, i.e. U_j ≡ V_j, so that [U_j] = [V_j] from Theorem 2. The result now follows from Theorem 1. □
Appendix B. The calculation of array complexity
Let A be an m by n binary array, and let R, C, M, B be the sums of the unnormalized complexities of the rows, columns, main diagonals, and back diagonals of A. To calculate the normalized and unnormalized complexities N and U of A, compute
d = m + n − 1
S = R/m + C/n + M/d + B/d
X = d − 1 + 2(m − 1)(n − 1)/d
L = 4mn/(3d + 1)
N = S/X
U = (L − 1)N
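This recipe translates directly into code. A minimal Python sketch, in which string_complexity stands for the unnormalized change complexity of a 1-D binary sequence as defined in Section 2 (treated here as a black box supplied by the reader); we also assume that single-element corner diagonals contribute zero:

    def array_complexity(A, string_complexity):
        # Normalized (N) and unnormalized (U) complexities of an
        # m-by-n binary array A, following the recipe above.
        m, n = len(A), len(A[0])
        d = m + n - 1
        rows = [list(r) for r in A]
        cols = [[A[i][j] for i in range(m)] for j in range(n)]
        mains = [[A[i][j] for i in range(m) for j in range(n) if i - j == k]
                 for k in range(-(n - 1), m)]   # top-left to bottom-right
        backs = [[A[i][j] for i in range(m) for j in range(n) if i + j == k]
                 for k in range(d)]             # top-right to bottom-left
        def cx(seq):
            # Assumption: diagonals of length 1 contribute no change.
            return string_complexity(seq) if len(seq) > 1 else 0
        R = sum(cx(s) for s in rows)
        C = sum(cx(s) for s in cols)
        M = sum(cx(s) for s in mains)
        B = sum(cx(s) for s in backs)
        S = R / m + C / n + M / d + B / d
        X = d - 1 + 2 * (m - 1) * (n - 1) / d
        L = 4 * m * n / (3 * d + 1)
        N = S / X
        U = (L - 1) * N
        return N, U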
References
Adami, C., & Cerf, N. J. (2000). Physical complexity of symbolic
sequences. Physica D, 137, 62–69.
Aksentijevic, A., Elliott, M. A., & Barber, P. J. (2001). Dynamics of
perceptual grouping similarities in the organization of visual and
auditory groups. Visual Cognition, 8, 349–358.
Alexander, C., & Carey, S. (1968). Subsymmetries. Perception and
Psychophysics, 4, 73–77.
Attneave, F. (1954). Some informational aspects of visual perception.
Psychological Review, 61, 183–193.
Attneave, F. (1959). Applications of information theory to psychology: A
summary of basic concepts, methods, and results. New York, NY:
Henry Holt.
Chaitin, G. J. (1969). On the length of the programs for computing finite
binary sequences: Statistical considerations. Journal of the Association
for Computing Machinery, 16, 145–159.
Chaitin, G. J. (2001). Exploring randomness. London: Springer.
Chater, N. (1996). Reconciling simplicity and likelihood principles in
perceptual organization. Psychological Review, 103, 566–581.
Chipman, S. F. (1977). Complexity and structure in visual patterns.
Journal of Experimental Psychology: General, 106, 269–301.
Falk, R., & Konold, C. (1997). Making sense of randomness: Implicit
encoding as a bias for judgment. Psychological Review, 104, 301–318.
Feldman, J., & Hanna, J. F. (1966). The structure of responses to a
sequence of binary events. Journal of Mathematical Psychology, 3,
371–387.
Garner, W. R. (1962). Uncertainty and structure as psychological concepts.
New York, NY: Wiley.
Garner, W. R. (1970). Good patterns have few alternatives. American
Scientist, 58, 34–42.
Garner, W. R. (1974). The processing of information and structure.
Potomac, MD: Lawrence Erlbaum.
Garner, W. R., & Gottwald, R. L. (1967). Some perceptual factors in the
learning of sequential patterns of binary events. Journal of Verbal
Learning and Verbal Behavior, 6, 582–589.
Gell-Mann, M. (1995). What is complexity? Complexity, 1, 1–9.
Glanzer, M., & Clark, W. H. (1962). Accuracy of perceptual recall: An
analysis of organization. Journal of Verbal Learning and Verbal
Behavior, 1, 289–299.
Griffiths, T. L., & Tenenbaum, J. B. (2003). Probability, algorithmic
complexity and subjective randomness. In: Proceedings of the 25th
Annual Conference of the Cognitive Science Society (pp. 480–485).
Hochberg, J., & McAlister, E. (1953). A quantitative approach to figural goodness. Journal of Experimental Psychology, 46, 361–364.
Howe, E. S. (1980). Effects of partial symmetry, exposure time, and
backward masking on judged goodness and reproduction of visual
patterns. Quarterly Journal of Experimental Psychology, 32, 27–55.
Ichikawa, S. (1985). Quantitative and structural factors in the judgment of
pattern complexity. Perception and Psychophysics, 38, 101–109.
Koffka, K. (1935). Principles of gestalt psychology. London: Lund
Humphries.
Kolmogorov, A. N. (1965). Three approaches to the quantitative
definition of information. Problems in Information Transmission, 1,
1–7.
Leeuwenberg, E. L. J. (1969). Quantitative specification of information in
sequential patterns. Psychological Review, 76, 216–220.
Leyton, M. (1986a). A theory of information structure. I. General
principles. Journal of Mathematical Psychology, 30, 103–160.
Leyton, M. (1986b). A theory of information structure. II. A theory of
perceptual organization. Journal of Mathematical Psychology, 30,
257–305.
Li, M., & Vitanyi, P. (1997). An introduction to Kolmogorov complexity and
its applications. New York, NY: Springer.
Lordahl, D. S. (1970). An hypothesis approach to sequential prediction of
binary events. Journal of Mathematical Psychology, 7, 339–361.
Luce, R. D. (2003). Whatever happened to information theory in
psychology? Review of General Psychology, 7, 183–188.
MacKay, D. M. (1950). Quantal aspects of scientific information. Philosophical Magazine, 41, 289–301.
Palmer, S. (1977). Hierarchical structure in perceptual representation.
Cognitive Psychology, 9, 441–474.
Palmer, S. E. (1983). The psychology of perceptual organization: A
transformational approach. In J. Beck, B. Hope, & A. Rosenfeld
(Eds.), Human and machine vision (pp. 269–339). New York: Academic
Press.
Psotka, J. (1975). Simplicity, symmetry, and syntely. Memory and
Cognition, 3, 434–444.
Restle, F. (1970). Theory of serial pattern learning: Structural trees.
Psychological Review, 77, 481–495.
Royer, F. L., & Garner, W. R. (1966). Response uncertainty and
perceptual difficulty of auditory temporal patterns. Perception and
Psychophysics, 1, 41–47.
Shannon, C. (1948). A mathematical theory of communication. Bell System Technical Journal, 27, 623–656.
Simon, H. A. (1972). Complexity and the representation of patterned
sequences of symbols. Psychological Review, 79, 369–382.
Simon, H. A., & Kotovsky, K. (1963). Human acquisition of concepts for
sequential patterns. Psychological Review, 70, 534–546.
Solomonoff, R. J. (1964). A formal theory of inductive inference, part 1
and part 2. Information and Control, 7, 224–254.
van der Helm, P. A. (2000). Simplicity versus likelihood in visual
perception: From surprisals to precisals. Psychological Bulletin, 126,
770–800.
van der Helm, P. A., van Lier, R. J., & Leeuwenberg, E. L. J. (1992). Serial
pattern complexity: Irregularity and hierarchy. Perception, 21,
517–544.
Vitz, P. C. (1968). Information, run structure and binary pattern
complexity. Perception and Psychophysics, 3, 275–280.
Vitz, P. C., & Todd, R. C. (1967). A model of learning for simple
repeating binary patterns. Journal of Experimental Psychology, 75,
108–117.
Vitz, P. C., & Todd, R. C. (1969). A coded element model of the
perceptual processing of sequential stimuli. Psychological Review, 76,
433–449.
Weyl, H. (1952). Symmetry. Princeton, NJ: Princeton University Press.
Wolfram, S. (2002). A new kind of science. Champaign, IL: Wolfram Media.
Yodogawa, E. (1982). Symmetropy, an entropy-like measure of visual
symmetry. Perception and Psychophysics, 32, 230–240.