Word Classes

November 2007
Language and Linguistics Compass 1(6):709-726

DOI:10.1111/j.1749-818X.2007.00030.x

Authors:

Aarhus University

Language and Linguistics Compass 1/6 (2007): 709–726, 10.1111/j.1749-818x.2007.00030.x

Word Classes

Jan Rijkhoff*

University of Aarhus

Abstract

This article1 provides an overview of recent literature and research on word classes,

focusing in particular on typological approaches to word classification. The cross-

linguistic classification of word class systems (or parts-of-speech systems) presented

in this article is based on statements found in grammatical descriptions of some

50 languages, which together constitute a representative sample of the world’s

languages. It appears that there are both quantitative and qualitative differences

between word class systems of individual languages. Whereas some languages

employ a parts-of-speech system that includes the categories verb, noun, adjective

and adverb, other languages may use only a subset of these four lexical categories.

Furthermore, quite a few languages have a major word class whose members

cannot be classified in terms of the categories verb–noun–adjective–adverb, because

they have properties that are strongly associated with at least two of these four

traditional word classes (e.g. adjective and adverb). Finally, this article discusses

some of the ways in which word class distinctions interact with other grammatical

domains, such as syntax and morphology.

1. Introduction

Differentiation between word classes can be regarded as an instance of one

of the most fundamental traits of human cognition: putting people or

things, but also more abstract entities such as words, into groups on the

basis of certain shared characteristics (categorization). Traditionally, the

following ten word classes are distinguished: verb (sit, go, read, etc.), noun

(dog, tree, table, etc.), adjective (blue, cheap, nice, etc.), adverb (here, today,

well, often; more on adverbs below), preposition (in, on, below, before, after,

etc.),2 numeral (one, two, etc.), article (the, a/an), pronoun (you, they;

someone, anyone; who, whose etc.), conjunction (and, or; if, because, etc.)

and interjection (shh, oh no, phew, hey, hmm, etc.). This categorization

is, however, rather biased towards word classes in the familiar European

languages and several typological studies have suggested that the traditional

set of categories mentioned above needs to be revised so as to be able to

account for certain word classes attested in other, often more ‘exotic’

languages (see, for example, Kuipers 1968; Broschart 1997).

The term ‘word class’ is used in two ways. In the wide sense it covers

both grammatical and lexical word classes, in the narrow sense it only includes

lexical word categories. This article is mostly concerned with lexical

word classes in the languages of the world (parts-of-speech systems).

Words that belong to a grammatical word class (also called ‘function

words’ or ‘empty words’) have little or no identifiable meaning and belong

to a closed paradigm with very few members (i.e. there are few other

words that belong to the same word class). Articles, various kinds of

pronouns and conjunctions are all examples of grammatical words.

(1) Definition of the grammatical word ‘article’ according to the Long-

man Dictionary of Contemporary English (LDCE):

a word used before a noun to show whether the noun refers to a particular

example of something or to a general example of something. In English, ‘the’

is called the definite article and ‘a’ and ‘an’ are called the indefinite article.

Lexical words, on the other hand, have a definable semantic content

(hence, they are also referred to as ‘content words’) and typically belong

to an open word class, that is, the number of words belonging to a lexical

word class is not fixed. English has four major lexical word classes: verbs,

nouns, adjectives and adverbs. In (2) all major lexical word classes are

represented.

(2) The BrazilianAdj studentN workedV hardAdv.

Often a lexical word has several meanings (senses), as is shown in the

definition of the noun ‘tree’:

(3) Definition of the lexical word ‘tree’ (LDCE):

1. a very tall plant that has branches and leaves, and lives for many years;

2. a drawing that connects things with lines to show how they are

related to each other.

It can be shown for many grammatical words, however, that they originated

as lexical words or phrases (a phenomenon known as grammaticalization;

Heine et al. 1991; Hopper and Traugott 2003). For example, verbs are often

the source of prepositions (e.g. Yoruba fiV ‘use’ → fiPrep ‘with’), complementizers

(e.g. Ewe béV ‘say’ → béCompl ‘that’, as in ‘He saw that ...’),

subordinators or quotation markers (Lord 1989: 307–8; Heine and Kuteva

2002: 263), but as the grammaticalization of a lexical word or phrase does

not happen overnight, there is a long period in which it is not always

possible to draw a hard and fast line between lexical and grammatical words

(on gradience, see, for example, Sasse 2001; Aarts 2007). In English,

auxiliary verbs are good examples of words that fall somewhere between

the lexical and the grammatical end of the word spectrum:

(4) I am going to go to Amsterdam in an hour (or: I’m GOnna GO to Amsterdam

in an hour)

Whereas the second instance of the verb go in (4) is clearly lexical (LDCE:

‘to move or travel to a place that is away from where you are or where

you live’), the first occurrence (in the phrase be going to or be gonna) is

clearly less than completely lexical and mainly serves to indicate future

time reference.

As was already mentioned at the outset, this article is concerned with

word classes in the narrow sense, that is, we will only discuss the four

major lexical word classes verb, noun, adjective and adverb.

2. Some Recent Approaches to Lexical Word Classes (Parts-of-Speech Systems)

More than 2000 years ago, words were already categorized into distinct

groups by ancient Greek philosophers, but it has proven to be a rather

difficult task to come up with definitions for the various lexical word

classes that have cross-linguistic validity (Anward et al. 1997). This is

mainly due to the fact that word classes in individual languages are often

distinguished on the basis of certain language-specific criteria, which do

not necessarily apply to other languages. For example, to say that a noun

is a word that is inflected for number is quite irrelevant for the many

languages across the globe in which number marking is absent (Rijkhoff

2004: 146–52). Notice also that even languages with nominal number

marking have nouns that do not inflect for number (cf. treeN – treesN, but

goldN – *golds). It is not possible to use purely semantic criteria either.

Although verbs are typically associated with actions or situations (i.e.

temporal entities) and nouns with things (spatial entities), there are many

nouns that are used for non-spatial entities such as events (meetingN, wed-

dingN, funeralN and gameN), feelings (loveN, hateN) and other abstract entities

(linguisticsN, politicsN). Furthermore, a concept that is lexicalized as a noun

in one language may be lexicalized as a verb in another language (Evans

2000a). Recently, however, some new ideas have been proposed to deal

with word classes across languages by Croft (1991, 2000), Anward et al.

(1997), Baker (2003) and Hengeveld (1992) (see also Hengeveld et al.

2004, Hengeveld and Rijkhoff 2005).

Croft argues that verb, noun and adjective are not categories of par-

ticular languages but rather language universals in the sense that they

constitute what he calls ‘typological prototypes’. He focuses on the con-

structions that are used for the three universally attested communicative

functions, predication, reference and modification, rather than the word

classes associated with these functions (verb, noun and adjective, respec-

tively).3 In his view (Croft 2000: 84–5, 87):

Categories in a particular language are defined by the constructions of the

language. Moreover, the constructions are the primitive elements of syntactic

representation; categories are derived from constituents. [ . . . ] Constructions

define grammatical categories.

The range of constructions in the universal-typological theory of parts of

speech covers constructions for predication, reference and modification. Most

important, it explicitly recognises that predication, reference and modification are

pragmatic (communicative) functions or, as Searle described them, propositional

acts [ . . . ]

Croft then argues that one should focus on the unmarked combination

of pragmatic function and lexical class and that in the case of parts-of-speech

the unmarked combinations are (Croft 2000: 88):

(5) lexical class    pragmatic function
    Verb             predication of an action
    Noun             reference to an object
    Adjective        modification by a property

Any other combination of pragmatic function and lexical class is marked.

Thus, the English object word vehicle is unmarked for reference but

marked for predication (be a vehicle) and modification (vehicular, vehicle’s).

Croft’s approach has much in common with the ideas proposed earlier by

Hopper and Thompson (1984), who investigated properties of word

classes from a discourse perspective. They stated that the basic categories

‘noun’ and ‘verb’ are best viewed as ‘universal lexicalizations of the pro-

totypical discourse functions of “discourse-manipulable participant” and

“reported event”, respectively’, and concluded that ‘categoriality itself is

another fundamental property of grammars which may be directly derived

from discourse function’ (Hopper and Thompson 1984: 703).

Anward et al. (1997) propose a multi-dimensional method for the

cross-linguistic study of word classes. In their view, a combination of

phonological (lexical), morphological and syntactic criteria should be used

(feature-clustering) to distinguish between word classes. In the case of

maximal part-of-speech differentiation, we find that the members of dif-

ferent word classes can be distinguished on phonological, morphological

and syntactic grounds. At the opposite extreme, in the case of no part-

of-speech differentiation, there are no phonological, morphological or

syntactic differences between words. In between these two extremes, there

are six other logical possibilities (Anward et al. 1997: 173–8):4

(6) PHON  MORPH  SYNT
    +     +      +      Maximal part-of-speech differentiation
    +     −      −      Exclusively phonological part-of-speech differentiation
    −     +      −      Exclusively morphological part-of-speech differentiation
    −     −      +      Exclusively syntactic part-of-speech differentiation
    +     +      −      Non-syntactic part-of-speech differentiation

This template would then provide the basis of more complex parts-of-

speech systems, also allowing for subclasses, superclasses and overlap. Addition-

ally, it should be investigated how the various parts of speech cluster in a

language (e.g. Are there languages with nouns but without verbs?). The

authors point out that if we were to take into account all dimensions of

variation (notably the possibility of allowing for overlap and partial or total

inclusion), we should find an ‘enormous spectrum of logically possible

part-of-speech-related variation among languages.’ Because it is unlikely

that all logical possibilities actually occur, they state that it is the task of

typology to establish the limits on the cross-linguistic variation of parts-

of-speech systems (Anward et al. 1997: 179).
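The number of logical possibilities in this template can be checked mechanically. The following Python sketch is an illustration added here rather than part of the proposal by Anward et al.; the names CRITERIA and differentiation_patterns are hypothetical. It enumerates every +/− combination of the three criteria and separates the two extremes from the intermediate patterns.

```python
from itertools import product

# The three criteria used for feature-clustering.
CRITERIA = ("PHON", "MORPH", "SYNT")


def differentiation_patterns():
    """Return every +/- combination of the three criteria as a dict."""
    return [
        dict(zip(CRITERIA, values))
        for values in product((True, False), repeat=len(CRITERIA))
    ]


patterns = differentiation_patterns()

# The two extremes: all criteria apply (maximal) or none applies (no differentiation).
extremes = [p for p in patterns if all(p.values()) or not any(p.values())]
intermediate = [p for p in patterns if p not in extremes]

for p in patterns:
    print(" ".join("+" if p[c] else "-" for c in CRITERIA))
print(len(patterns), "patterns,", len(intermediate), "of them intermediate")
```

Three binary criteria yield eight patterns in total, so removing the two extremes leaves the six intermediate possibilities mentioned above.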

Baker’s (2003) proposals are formulated within Chomsky’s generative

grammar. In this syntactocentric approach, verbs are characterized as

words that always project a subject (or rather specifier in generative ter-

minology). Nouns are characterized by the fact that they bear a referential

index; they are the only words about which it makes sense to ask whether

or not its referent [or rather the referent of the phrase it is the head of,

as only noun phrases (NPs) can be used to refer] is identical with another

entity. Adjectives, finally, are regarded as the unmarked lexical category:

they lack both a specifier and a referential index. In Baker’s view, the

verb–noun–adjective distinction is universal, leaving no room for languages

whose parts-of-speech system deviates from the traditional lexical categories

that have dominated European linguistics since antiquity.

In Hengeveld’s approach crucial reference is made to the function(s) that

a lexical item can fulfill in certain linguistic structures without the speaker

having to resort to special grammatical measures such as relative clause

formation, as in (9) or (14), or the use of medial verb constructions, as in

(15) below. Here are Hengeveld’s (slightly reformulated) definitions of the

four categories verb, noun, adjective and (manner) adverb:

a verb is a word that can ONLY be used as the head of a clause (or rather ‘predicate

phrase’ in Hengeveld’s terminology; see Figure 2);

a noun is a word that can be used as the head of a noun phrase or ‘NP’ (called

‘referential phrase’ in the original publication; more on the label ‘noun phrase’

below);

an adjective is a word that can be used as a modifier of the head of a noun phrase;

a manner adverb is a word that can be used as a modifier of the head of a clause.5

(Hengeveld 1992: 68; see also Hengeveld and Rijkhoff 2005: 406–7)

(6) (continued)
    PHON  MORPH  SYNT
    −     +      +      Non-phonological part-of-speech differentiation
    +     −      +      Non-morphological part-of-speech differentiation
    −     −      −      No part-of-speech differentiation

The reason Hengeveld restricts himself to manner adverbs (such as hard

in She works hard) is that other kinds of adverbs, such as yesterday or hopefully,

do not modify the head of the clause, but larger units within the sentence.

Notice that Hengeveld (contrary to Croft) claims that word classes con-

stitute true categories of particular languages, giving special status to verbs,

which are defined as words that can only occur as predicates (head of the

clause). This is because in many languages members of other word classes

can also be used as the head of a clause, but for verbs this is the only

unmarked option.
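The logic of these definitions can be made explicit in a small Python sketch. It is only an illustration under simplifying assumptions: the names Slot and matching_definitions are hypothetical, and each word is reduced to the set of functions it can fulfill without extra measures. The asymmetry between verbs and the other classes shows up in the code as an equality test for verbs and a membership test for the rest.

```python
from enum import Enum


class Slot(Enum):
    """The four functions referred to in the definitions."""
    HEAD_OF_CLAUSE = "head of clause"
    HEAD_OF_NP = "head of noun phrase"
    MODIFIER_OF_NP_HEAD = "modifier of the head of a noun phrase"
    MODIFIER_OF_CLAUSE_HEAD = "modifier of the head of a clause"


def matching_definitions(slots):
    """Return the definitions satisfied by a word that fills `slots`
    without extra measures (copula, relativization, etc.)."""
    slots = set(slots)
    labels = []
    # A verb can ONLY be used as the head of a clause.
    if slots == {Slot.HEAD_OF_CLAUSE}:
        labels.append("verb")
    # The other definitions only require that the function is available.
    if Slot.HEAD_OF_NP in slots:
        labels.append("noun")
    if Slot.MODIFIER_OF_NP_HEAD in slots:
        labels.append("adjective")
    if Slot.MODIFIER_OF_CLAUSE_HEAD in slots:
        labels.append("manner adverb")
    return labels


# Slot sets assumed for the two Mojave words discussed in this section.
mojave_homi = {Slot.HEAD_OF_CLAUSE, Slot.MODIFIER_OF_NP_HEAD}  # 'tall'
mojave_supaw = {Slot.HEAD_OF_CLAUSE}                           # 'know'

print(matching_definitions(mojave_homi))   # ['adjective']
print(matching_definitions(mojave_supaw))  # ['verb']
```

A word that satisfies more than one definition in this sketch would not belong to any of the four classes as such, which is the situation found with the flexible word classes discussed in Section 3.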

For example, in Dutch an adjective such as langAdj ‘long, tall’ requires

the presence of a copula (i.e. an extra measure is necessary; in the example

below a form of zijn ‘be’) when it appears as the head of the clause:

Dutch

(7) [Die manN]NP isCop langAdj
    that manN be:3.SG.PRES tallAdj
    ‘That man is tall’

Because many other languages do not require an extra measure in the

case of a non-verbal predicate, Hengeveld’s definition of adjectives and other

non-verbal word classes (nouns, adverbs) leaves open the possibility that

they can also be used as the head of the clause without extra measures, as

in Mojave (a native American language of the Yuman family, spoken in

Arizona and California):

Mojave (Schachter 1985: 19)

(8) [i:paN-c]NP homi:Adj-k
    manN-SUBJ tallAdj-PRES
    ‘The man is tall’

This does not mean that Mojave does not distinguish between verbs and

adjectives. The following examples show that verbs but not adjectives must be

relativized (REL) when they modify the head of the NP (Hengeveld 1992: 75):

Mojave (Schachter 1985: 19)

(9) [i:paN kw-su:pawV-ny-c]NP . . .
    manN REL-knowV-DEM-SUBJ . . .
    ‘The man who knows . . .’

(10) [i:paN homi:Adj-ny-c]NP . . .
     manN tallAdj-DEM-SUBJ . . .
     ‘The tall man . . .’

The next section gives an overview of lexical parts-of-speech systems

in the languages of the world.

3. Parts-of-Speech Systems in the Languages of the World

The overview provided in the current section focuses on two important

aspects of parts-of-speech systems: (i) the number of lexical word classes in a

language (a quantitative consideration), and (ii) the nature of a lexical word

class, that is, whether its members are ‘rigid’ or ‘flexible’ (a qualitative

consideration).

3.1. LANGUAGES WITH A SINGLE LEXICAL WORD CLASS

Whether or not there are languages with just one lexical word class is

a controversial issue, which is at least partly due to the fact that such

languages are so unlike the familiar, well-studied languages of Western

Europe and contradict the widely accepted claim that all languages have

distinct classes of nouns and verbs.6 Nevertheless, several linguists working

with lesser known languages have argued that there are indeed languages

with only one lexical word class, which comes in two varieties. On the

one hand, it has been claimed that certain languages only have verbs and

that reference to participants in an event is achieved by clause-like con-

structions, as in this example from Cayuga (an American Indian language

from the Iroquoian family):

Cayuga (Iroquoian; Sasse 1993b: 657)

(11) a-hó-hto:’ ho-tkwe’t-a’ n e:kyÈh-okweh
     PAST-it:to_him-become_lost it:him-wallet-be this he:it-man
     ‘This man lost his wallet’

The literal meaning, however, would be something like (Sasse 1993b:

657): ‘It became lost to him, it is his wallet, he is this man’ or ‘it losted

him, it wallets him, the one who mans.’

On the other hand, there are languages that are deemed to have a

single lexical category whose members are extremely flexible in that

the same word can fulfill all the major lexical functions (head of clause,

head of NP, modifier of head of clause or modifier of head of NP)

without requiring any special measures. Thus, in Samoan (a Polynesian

language) ‘there are no lexical or grammatical constraints on why a

particular word cannot be used in the one or the other function’ (Mosel

and Hovdhaugen 1992: 73–4, 77; see also, for example, Broschart 1997

on Tongan):

. . . the categorization of full words is not given a priori in the lexicon. [ . . . ]

It is only their actual occurrence in a particular environment which gives them

the status of a verb or a noun. [ . . . ] What is given in the lexicon is not a

particular word class assignment, but the potential to be used in certain syn-

tactic environments as a noun or a verb.

[ . . . ] all full words which function as noun and verb phrase nucleus can also

be used as attributive modifiers.

The following examples show that la can occur both as the head of an

NP (12) and the head of a clause (13):

Samoan (Austronesian; Mosel and Hovdhaugen 1992: 80)

(12) ‘Ua malosi le la
     PERF strong ART sun
     ‘The sun is strong.’ (lit. ‘The sun strongs’)

(13) ‘Ua la le aso.
     PERF sun ART day
     ‘The sun is shining today.’ (lit. ‘The day suns’)

3.2. LANGUAGES WITH TWO LEXICAL WORD CLASSES

Languages that have two lexical word classes also come in the two varieties

rigid and flexible. Many languages only have distinct classes of verbs and

nouns; adjectival notions and manners are expressed in various ways. For

example, Galela (a Papuan language spoken on the island of Halmaheira)

has verbs and nouns but no separate class of adjectives. Instead it uses a

construction headed by a participialized verb to express a concept such as

English ‘big’ (notice that the participle is formed by reduplicating the first

syllable of the verb lamo ‘be big’).

Galela (van Baarda 1908: 35)

(14) awi dohu i lalamo
     his foot it be_big:PRT
     ‘his big foot’

The Australian language Ngiyambaa also seems to have just two major

lexical word classes. Like Galela, it has a class of true verbs, but in Ngiyambaa

the other word class does not consist of nouns, but comprises a group of

words that can serve both as the head of an NP and as a modifier. Even

though members of this word class do not all behave in exactly the same

way morphologically (cf. English, where not all nouns can occur in the

plural), the author sees no reason to distinguish more classes, because such

a distinction ‘would serve no descriptive purpose elsewhere in the grammar’

(Donaldson 1980: 70).

3.3. LANGUAGES WITH THREE LEXICAL WORD CLASSES

Wambon (a Papuan language spoken in Southern Irian Jaya, Indonesia)

has separate classes of verbs, nouns and adjectives, but apart from one or two

exceptions it has no (manner) adverbs. Instead it employs medial verb

constructions (de Vries 1989: 49):

The category of manner adverbs can be so marginal because Wambon prefers

to use medial verbs as modifiers of other verbs in serial verb constructions in

which the modifying verb immediately precedes the modified verb. [...] Very

often the medial verbs specifying manner, are verbs which are derived from

adjectives by -mo [ . . . ].

For example, in (15) the verb matetmo ‘be good’ is derived from the

adjective matet ‘good’:

Wambon (de Vries 1989: 49)

(15) Jakhov-e matet-mo ka-lembo?
     they-CN good-SUPP.SS go-3PL.PAST
     ‘Did they travel well?’

Ngiti, a Nilo-Saharan language (Central Sudanic branch) spoken in Zaire,

also has three major word classes, but in addition to verbs and nouns it

has a word class that combines the modifier function of adjectives and

adverbs in other languages (Kutsch Lojenga 1994: 336):

There is no morphological nor a clear syntactic distinction between a class of

adjectives and a class of adverbs in Ngiti. The functional term modifiers is

therefore used [...] to cover a fairly large . . . class of words, containing about

150 items, which are neither nouns nor verbs and which all have a modifying

function in relation to different constituents.

In (16) the word àn

is used adjectivally to modify a noun meaning

‘light (of weight)’, whereas in (17) it serves as a manner adverb (modifying

the verb carry) meaning ‘easily, without effort’.

Ngiti (Kutsch Lojenga 1994: 338)

(16) ngbángba n6-ìtdù 6s àn
     child RSM-carry:PERF.PRES light load
     ‘The child carried a light load’

(17) 6s ngbángba n6-ìtdù àn
     light child RSM-carry:PERF.PRES load
     ‘The child carried the load easily’

3.4. LANGUAGES WITH FOUR LEXICAL WORD CLASSES

Languages with four distinct lexical classes (verb, noun, adjective, adverb)

are attested across the globe, English being one of them. Adverbs differ

from the other word classes in that it is generally not possible to establish

which members of this word class are more or less prototypical. Instead

we find a variety of subtypes, each of which is equally ‘representative’ for

the word class adverb.

(18) a. I will give you the keys tomorrow. [tomorrow = adverb of time]
     b. She often goes swimming in the morning. [often = adverb of frequency]
     c. Does she speak German well? [well = manner adverb]
     d. He was not there. [there = adverb of place]

In English many adverbs are actually derived from adjectives: beautifulAdj

– beautifullyAdv, nice – nicely, polite – politely.

3.5. CLASSIFICATION OF PARTS-OF-SPEECH SYSTEMS

The cross-linguistic overview of parts-of-speech systems provided above

is based on statements in language descriptions rather than ideas put

forward in theoretical discussions. Because Hengeveld (1992) collected

information about parts-of-speech systems (henceforth PoS systems) in a

representative sample of the world’s languages, it is hardly surprising that

his classification of PoS systems covers all the PoS systems exemplified in

Sections 3.1–3.4 (recall that we are only taking into account manner

adverbs here). It must be emphasized, however, that the various types of

PoS systems in Figure 1 should be regarded as reference points on a scale

rather than distinct categories. Because languages are dynamic entities,

they can only approximate the ideal types in this classification.

Hengeveld makes a distinction between languages with a flexible PoS

system and languages with a rigid PoS system. In languages with a flexible

PoS system, some or all of the functions that are typically associated with

the four traditional (rigid) lexical categories are performed by members

of the same word class (Types 1–3). In languages with a rigid PoS system

(Types 4–7), these functions are distributed over distinct, non-overlapping

groups of words (for details see Hengeveld et al. 2004). The three flexible

word classes are called contentive, non-verb and modifier. A contentive

is the most flexible kind of lexical item. It can be used in all major lexical

functions: head of a clause, head of an NP, modifier of the head of the

NP and modifier of the head of the clause. Notice, however, that strictly

speaking there is no NP in languages with contentives (Type 1) or non-

verbs (Type 2), as the head of the phrase is not a true noun but a member

of a flexible word class. Members of the flexible class modifier (Type 3)

can be used to modify the head of a clause or the head of an NP, but for

the other lexical functions the language employs members of specialized

or rigid word classes (verbs and nouns).
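The seven basic types can be summarized as a small lookup table. The Python sketch below is a reconstruction from the prose of this section, not a transcription of Figure 1: the class inventories of Types 1–3 follow the description of contentives, non-verbs and modifiers given above, while the inventories of Types 4–7 rest on the assumption that the rigid systems drop one class at a time from the end of the sequence verb, noun, adjective, manner adverb. The names POS_SYSTEMS, FLEXIBLE_CLASSES and is_flexible are hypothetical.

```python
# Word classes per PoS system type (reconstructed; see the assumptions above).
POS_SYSTEMS = {
    1: ("contentive",),
    2: ("verb", "non-verb"),
    3: ("verb", "noun", "modifier"),
    4: ("verb", "noun", "adjective", "manner adverb"),
    5: ("verb", "noun", "adjective"),
    6: ("verb", "noun"),
    7: ("verb",),
}

# The three flexible word classes named in the text.
FLEXIBLE_CLASSES = {"contentive", "non-verb", "modifier"}


def is_flexible(pos_type):
    """A system is flexible if it contains at least one flexible word class."""
    return any(cls in FLEXIBLE_CLASSES for cls in POS_SYSTEMS[pos_type])


for pos_type, classes in POS_SYSTEMS.items():
    kind = "flexible" if is_flexible(pos_type) else "rigid"
    print(f"Type {pos_type} ({kind}): {', '.join(classes)}")
```

Under these assumptions the sketch groups Types 1–3 as flexible and Types 4–7 as rigid, in line with the division described above.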

A flexible word class is not some kind of union of two or more rigid

word classes, but a distinct category in itself, just like a rigid word class.

Fig. 1. Basic classification of parts-of-speech (PoS) systems (based on Hengeveld 1992; adverb,

manner adverb).

Perhaps the main semantic difference between rigid and flexible word

classes resides in the fact that members of a flexible word class seem to

lack properties that are highly characteristic for members of a rigid word

class. In other words, the fact that a contentive can be used in verbal,

nominal and adjectival function does not imply that it combines the

typical properties of verbs, nouns and adjectives. Rather, a contentive is

neither a verb nor a noun or an adjective, precisely because it lacks the

characteristic properties of these rigid word classes (Rijkhoff 2003, forth-

coming). For example, Samoan (Figure 2: Type 1) has no words that are

coded as transitive, a prototypical verbal feature (see also below).

Hengeveld’s classification indicates that rigid word classes adhere to the

following hierarchy (Hengeveld 1992: 68):

(19) Parts-of-speech hierarchy: rigid word classes

Verb > Noun > Adjective > (manner) Adverb

According to this hierarchy, a language that employs members of a

certain rigid word class ‘down’ the hierarchy (e.g. adjective) must also

employ members of the rigid word classes ‘up’ the hierarchy (verb, noun).
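Read as an implicational statement, the hierarchy in (19) says that the rigid classes of a language form an unbroken stretch starting at the left end. A minimal Python sketch of this reading follows; the names HIERARCHY and respects_hierarchy are hypothetical, and the check ignores flexible classes, which fall outside the hierarchy.

```python
# The parts-of-speech hierarchy for rigid word classes, as given in (19).
HIERARCHY = ("verb", "noun", "adjective", "manner adverb")


def respects_hierarchy(rigid_classes):
    """True if every rigid class present is accompanied by all classes
    further 'up' (to its left in) the hierarchy."""
    present = [cls in rigid_classes for cls in HIERARCHY]
    # For each adjacent pair: a class further down implies the one above it.
    return all(upper or not lower for upper, lower in zip(present, present[1:]))


print(respects_hierarchy({"verb", "noun"}))               # True
print(respects_hierarchy({"verb", "noun", "adjective"}))  # True
print(respects_hierarchy({"verb", "adjective"}))          # False: adjective without noun
```

The last call illustrates the kind of system the hierarchy rules out: a rigid class of adjectives in a language that lacks a rigid class of nouns.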

How can we explain this hierarchy? Notice, for example, that some lan-

guages have distinct classes of nouns and adjectives (Type 3 in Figure 2),

whereas other languages have nouns but no adjectives (Type 4). Interest-

ingly, it seems that adjectives are only attested in languages whose nouns

denote a property that is specified as having a spatial outline (shape),

which means that these nouns do not require some extra measure (such

as the employment of a numeral classifier) to make them countable.7

Dutch

(20) drieNum oudeAdj paraplu-s
     three old umbrella-PL
     ‘three old umbrellas’

In contrast, Thai has no adjectives and its transnumeral nouns do not

seem to denote properties that are specified for a spatial contour (shape):

‘[Thai nouns] purely denote concepts and, for this reason, are incompatible

with direct quantification’ (Hundius and Kölver 1983: 166).8 In other

words, the concepts denoted by Thai nouns first need to be ‘individuated’

(by a numeral classifier) before they can be counted.

Thai (Hundius and Kölver 1983: 172; CLF = numeral classifier)

(21) rôm saam khan
     umbrella(s) three CLF:long, handled object
     ‘three umbrellas’

There is also evidence to suggest that a language can only have a true

(rigid) class of nouns if it also has a group of words that are coded as being

transitive (i.e. verbs).9 For example, all languages with nouns have transitive

words, but they are claimed to be absent in Samoan, which does not

distinguish between nouns and verbs (Mosel 1991: 188; Mosel and

Hovdhaugen 1992: 724).

Apparently, a language can only have distinct classes of verbs, nouns and

adjectives if the words in that language somehow encode the prototypical

properties of temporal and spatial entities, that is, events and things

(Rijkhoff 2003). The prototypical event is an activity that involves an

agent and a patient; the prototypical thing is a concrete object. In other

words, it seems that a language can only have major, distinct classes of

verbs, nouns and adjectives if the lexicon contains (i) words that designate

a dynamic relationship between an agent and a patient (+transitive), and

(ii) words that designate a property that is specified as having a boundary

in the spatial dimension (+shape).

(22) Necessary conditions in the hierarchy of rigid word classes:

     VERB  →  NOUN  →  ADJECTIVE  →  MANNER ADVERBS
     [+TRANSITIVE]  [+SHAPE]  [+GRADABLE?]

So far no attempts have been made to explain why some languages have

a distinct class of (manner) adverbs, whereas others do not (hence the question

mark in the hierarchy above), but it has been suggested that the adjectival

feature gradable might play a role here (Rijkhoff forthcoming).10

It was already mentioned at the outset that claims about languages with

a single lexical word class, whether flexible (Type 1) or rigid (Type 7), are

controversial (Section 3.1). For example, Mithun (2000: 397) has argued

that Cayuga, Tuscarora and other Iroquoian languages do distinguish

between verbs and nouns. In her view, words that can be identified as

verbs on morphological grounds, function as nouns semantically or syn-

tactically. The situation is complicated by the fact that verbs have been

lexicalized in different degrees:

Some morphological verbs have been so fully lexicalized as nominals that

speakers no longer use them as predicates and may even be unaware of their

literal verbal meanings. Others are never used as nominals. Still others have

two uses, one as a referential nominal, one as a predicate. (Mithun 2000: 419)

If it is true that Cayuga has a major class of verbs and a minor class of

nouns, this would mean its PoS system falls somewhere between Types 6

and 7 at the rigid end of the scale.

As to languages at the flexible end of the scale (Type 1 or 2), the most

detailed discussion of such a language concerns Mundari, an Austroasiatic

language of India (Munda family). Whereas Evans and Osada (2005) have

tried to show that Mundari has distinct classes of verbs and nouns (and a

closed adjective class), Hengeveld and Rijkhoff (2005) have argued that

Mundari is one of the languages with a single class of contentives, at least

Word Classes 721

as far as its basic, non-derived words are concerned. They add that the

Mundari PoS system would occupy a position between Types 1 and 2 if

one also takes into account derived words, which can be used for all major

lexical functions, except head of the clause (i.e. derived words are non-verbs).

Details such as these illustrate the usefulness of a more refined classifi-

cation of PoS systems, as shown in Figure 2. The classification in Figure 1

has been expanded with two kinds of intermediate systems. A language is

considered to have an intermediate PoS system of the flexible type when

it has a word class that is compatible with two contiguous systems of the

hierarchy. This situation obtains, for example, in Mundari (Type 1/2), which has a class of derived words (non-verbs) with fewer functional possibilities than the class of basic words (contentives). A rigid language is classified as having an intermediate PoS system when the last word class in the PoS hierarchy consists of a minor (smallish, closed) class of items. This is true, for example, for Babungo, Bambara, Gude, Kisi and many other sub-Saharan African languages with a minor class of adjectives, which all have PoS systems of Type 5/6 (Rijkhoff 2004: 142).

Fig. 2. Parts-of-speech systems, including intermediate types (Smit 2001; Hengeveld and Rijkhoff 2005).
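The rule for rigid intermediate systems can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions: the function name `classify_rigid`, the "major"/"minor" labels and the dictionary encoding are hypothetical, and the numbering (Type 4 with four rigid classes, down to Type 7 with verbs only) is assumed from the basic classification in Figure 1. Flexible systems are not covered.

```python
# Rigid end of the parts-of-speech hierarchy, left to right.
RIGID_HIERARCHY = ("verb", "noun", "adjective", "manner adverb")


def classify_rigid(classes):
    """Return the PoS type label of a rigid language.

    `classes` maps each rigid word class the language has to "major"
    or "minor" (a smallish, closed class). Hypothetical encoding.
    """
    n = len(classes)
    # A class may only be present if all classes to its left are present.
    if n == 0 or set(classes) != set(RIGID_HIERARCHY[:n]):
        raise ValueError("classes must form an initial stretch of the hierarchy")
    base_type = 8 - n  # assumed numbering: 4 classes -> Type 4, 1 class -> Type 7
    last_class = RIGID_HIERARCHY[n - 1]
    if n > 1 and classes[last_class] == "minor":
        # Minor final class: the system sits between two contiguous types.
        return f"{base_type}/{base_type + 1}"
    return str(base_type)


# Class sizes as described in the text above.
print(classify_rigid({"verb": "major", "noun": "major", "adjective": "minor"}))  # 5/6
print(classify_rigid({"verb": "major", "noun": "minor"}))  # 6/7
```

On this encoding, a language with a minor class of adjectives comes out as Type 5/6, and a language with a major class of verbs and a minor class of nouns (Cayuga, if that analysis holds) as Type 6/7, matching the labels used in the text.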

4. Parts-of-Speech Systems and Grammar

Recent studies indicate that there are certain grammatical features that

correlate with the presence of a flexible word class in the PoS system of

a language (Hengeveld et al. 2004; Rijkhoff 2004; Hengeveld and

Rijkhoff 2005). It was already mentioned that words in a language with

an extremely flexible PoS system are not specified for the typical verbal

feature transitive. One might say that this is only to be expected: if

languages such as Samoan had words that were specified as being transitive

(denoting a dynamic action between an agent and a patient), this would

immediately characterize these words as verbs (i.e. a rigid word class8).

Neither do we expect to find words codified for grammatical number or

gender in languages without a distinct class of nouns, as this would make

the words ‘unflexible’. In other words, one may hypothesize that flexible

words in languages with a PoS system of Type 1 or 2 are always transnu-

meral and genderless. This appears to be the case for all languages with

the relevant PoS systems that we are aware of (Rijkhoff 2004: 42). Other

grammatical properties that are deemed to correlate with the flexibility in

the PoS system of a language are the absence of copulas and suppletion

(Hengeveld and Rijkhoff 2005: 421).

Finally, it has been shown on the basis of data from a representative

sample of the world’s languages that the word order possibilities of a

language are partly determined by the PoS system of that language. As

flexible words are not coded for a particular lexical function (head of

the clause, head of the NP and modifier of a head), it is often not clear

for the hearer how these words should be interpreted. It turns out

that in languages with a flexible word class, word order constraints

are used to signal what kind of function a flexible word has in an

actual utterance. For example, in quite a few languages with a distinct

class of adjectives, some adjectives occur before the noun, whereas others

follow the noun (doubling). This kind of word order variation appears

to be absent in languages with a flexible word class (Types 1–3),

apparently because it would lead to processing difficulties (Hengeveld

et al. 2004: 563–4).

5. Recent Publications

In addition to the more or less recent publications on word classes that

were already mentioned (Croft 1991; Hengeveld 1992; Anward et al. 1997; Baker 2003), there are various overview articles in anthologies, handbooks or encyclopedias (see, for example, Schachter 1985; Sasse 1993b; Evans 2000b; Vogel and Comrie 2000; Haspelmath 2001). Individual word classes are discussed in, for example, Bybee (2000), Lehmann and Moravcsik (2000), and Bhat and Pustet (2000). Several monographs have been devoted to adjectives (e.g. Pustet 1989; Bhat 1994; Wetzer 1996; Beck 2002; Dixon and Aikhenvald 2004). Plank (1997) provides a useful

bibliography on word classes in linguistic typology.

Short Biography

Jan Rijkhoff’s main areas of research are linguistic typology, parts-of-

speech, lexical semantics (especially nominal aspect and Seinsart) and

grammatical theory, in particular semantic and morpho–syntactic parallels

between the NP and the sentence within the theoretical framework of

Dik’s Functional Grammar and its successor Functional Discourse Grammar

(Hengeveld and Mackenzie 2006). He has authored or co-authored papers

in these areas for Journal of Linguistics, Journal of Semantics, Linguistics, Stud-

ies in Language, Linguistic Typology, Functions of Language, Acta Linguistica

Hafniensia, Italian Journal of Linguistics (Rivista di Linguistica), Belgian Journal

of Linguistics, as well as various anthologies such as Approaches to the Typol-

ogy of Word Classes (Vogel and Comrie eds. 2000) and International Hand-

book of Typology (Haspelmath et al. 2001). His book The Noun Phrase

(Oxford University Press 2002/2004) investigates NPs in a representative

sample of the world’s languages and proposes a four-layered, semantic

model to describe their underlying structure in any language. It examines

the semantic and morpho-syntactic properties of the constituents of NPs,

and in doing so it shows that the NP word order patterns of any language

can be derived from three universal ordering principles. His current

research is concerned with the parts-of-speech hierarchy, the semantics of

flexible word classes, the relation between form and function of noun

modifiers, and various aspects of NPs in Functional Discourse Grammar. From

1990 to 1994 Rijkhoff was a core member of the EuroTyp project (funded

by the European Science Foundation) and in 1995 he held a fellowship from

the Alexander von Humboldt Stiftung at the University of Konstanz (Ger-

many). Before coming to the University of Aarhus (Denmark), where he

presently teaches, Rijkhoff was a visiting scholar at the University of Texas

at Austin (1997–1999). He holds a BA in Dutch language and literature

from the Free University and an MA and a PhD in Linguistics from the

University of Amsterdam (both in the Netherlands).

Notes

*Correspondence address: Dr. Jan Rijkhoff, Department of Linguistics, University of Aarhus,

Nordre Ringgade 1, Building 1410, DK-8000 Aarhus C, Denmark. Email: linjr@hum.au.dk.

1 Abbreviations: 3, 3rd person; Adj, adjective; Adv, adverb; ART, article; CLF, classifier; Compl,

complementizer; CN, connector; Cop, copula; DEM, demonstrative; N, noun; NP, noun

phrase; Num, numeral; PAST, past tense; PERF, perfective aspect; PL, plural; PoS system, parts-

of-speech system; Prep, preposition; PRES, present tense; PRT, participle; REL, relativizer;

RSM, resumptive marker; SG, singular; SS, same subject; SUBJ, subject; SUPP, support verb;

V, verb.

2 The more general term is ‘adposition’, which also includes postpositions (Dutch de tuin IN

‘into the garden’) and circumpositions (Dutch OM de tuin HEEN ‘around the garden’).

3 As adverbs constitute a rather mixed bag of different subcategories, they are often ignored in

discussions of lexical word classes (see also below).

4 For a recent application of the multi-dimensional approach, see Francis and Matthews (2005).

5 The original definitions have ‘lexeme’ instead of ‘word’.

6 Cf. Evans (2000b) or Mithun (2000); see also the collection of articles in a special issue on word

classes in the journal Linguistic Typology (volume 9, number 3; 2005).

7 Only entities with a definite outline can be counted. Notice that this does not mean that

adjectives cannot occur in a classifier language. In many languages numeral classifiers have

developed into markers of other grammatical categories such as definiteness, specificity or

topicality (Rijkhoff 2000, 2004: 51). In such cases the erstwhile classifiers no longer serve as

‘individualizers’ in the sense of Lyons (1977: 462).

8 A transnumeral noun is neutral with respect to number, hence the same form can be used to

talk about one or more entities.

9 Note that the presence of a set of transitive words in the basic lexicon is a necessary and

sufficient condition for a language to have a major, distinct class of verbs, but only a necessary

condition for a language before it can have a major, distinct class of nouns.

10 When we go from left to right in the hierarchy of rigid word classes, we see increased

specialization. In contrast, the hierarchy of flexible word classes shows a decrease in specializa-

tion (but an increase in flexibility): (i) Parts-of-speech hierarchy (flexible word classes): modifier

> non-verb > contentive.

Works Cited

Aarts, Bas. 2007. Syntactic gradience: the nature of grammatical indeterminacy. Oxford, UK:

Oxford University Press.

Anward, Jan, Edith Moravcsik, and Leon Stassen. 1997. Parts of speech: a challenge for typology.

Linguistic Typology 1–2.167–83.

Baker, Mark C. 2003. Lexical categories: verbs, nouns, and adjectives. Cambridge, UK: Cam-

bridge University Press.

van Baarda, M. J. 1908. Leiddraad bij het bestuderen van’t Galela’sch dialekt, op het eiland

Halmaheira [Manual for the study of the Galela dialect, on the island of Halmahera]. The

Hague, The Netherlands: Nijhoff.

Beck, David. 2002. The typology of parts of speech systems: the markedness of adjectives. New

York, NY: Routledge.

Bhat, Darbhe N. S. 1994. The adjectival category: criteria for differentiation and identification

(Studies in Language Companion Series 24). Amsterdam, The Netherlands: Benjamins.

Bhat, Darbhe N. S., and Regina Pustet. 2000. Adjective. Morphology: a handbook on inflec-

tion and word formation (Part 1), ed. by Geert Booij, Christian Lehmann and Joachim

Mugdan, 757–69. Berlin, Germany: Walter de Gruyter.

Broschart, Jürgen. 1997. Why Tongan does it differently: categorial distinctions in a language

without nouns and verbs. Linguistic Typology 1–2.123–65.

Bybee, Joan. 2000. Verb. Morphology: a handbook on inflection and word formation (Part 1),

ed. by Geert Booij, Christian Lehmann and Joachim Mugdan, 794–808. Berlin, Germany:

Walter de Gruyter.

Croft, William. 1991. Syntactic categories and grammatical relations: the cognitive organization

of information. Chicago, IL: University of Chicago Press.

Croft, William. 2000. Parts of speech as language universals and as language-particular categories.

Approaches to the typology of word classes (Empirical Approaches to Language Typology 23),

ed. by Petra M. Vogel and Bernard Comrie, 65–102. Berlin, Germany/New York, NY:

Mouton de Gruyter.

Dixon, Robert M. W., and Alexandra Y. Aikhenvald (eds.). 2004. Adjective classes: a cross-

linguistic typology. Oxford, UK: Oxford University Press.

Donaldson, Tamsin. 1980. Ngiyambaa: the language of the Wangaaybuwan of New South

Wales. Cambridge, UK: Cambridge University Press.

Evans, Nicholas. 2000a. Kinship verbs. Approaches to the typology of word classes (Empirical

Approaches to Language Typology 23), ed. by Petra M. Vogel and Bernard Comrie, 103–

72. Berlin, Germany/New York, NY: Mouton de Gruyter.

Evans, Nicholas. 2000b. Word classes in the world’s languages. Morphology: a handbook on

inflection and word formation (Part 1), ed. by Geert Booij, Christian Lehmann and Joachim

Mugdan, 708–32. Berlin, Germany: Walter de Gruyter.

Evans, Nicholas, and Toshiki Osada. 2005. Mundari: the myth of a language without word

classes. Linguistic Typology 9–3.352–90.

Francis, Elaine J., and Stephen Matthews. 2005. A multi-dimensional approach to the category

‘verb’ in Cantonese. Journal of Linguistics 41.269–305.

Haspelmath, Martin. 2001. Word classes and parts of speech. International Encyclopedia of the

Social and Behavioral Sciences, ed. by Paul B. Baltes and Neil J. Smelser, 16538–45. Amster-

dam, The Netherlands: Pergamon.

Haspelmath, Martin, Ekkehard König, Wulf Oesterreicher, and Wolfgang Raible (eds.). 2001.

Language typology and linguistic universals: an international handbook, vol. 1. Berlin,

Germany/New York, NY: Walter de Gruyter.

Heine, Bernd, and Tania Kuteva. 2002. World lexicon of grammaticalization. Cambridge, UK:

Cambridge University Press.

Heine, Bernd, Ulrike Claudi, and Friederike Hünnemeyer. 1991. Grammaticalization: a con-

ceptual framework. Chicago, IL: The University of Chicago Press.

Hengeveld, Kees. 1992. Non-verbal predication: theory, typology, diachrony. Berlin, Germany/

New York, NY: Mouton de Gruyter.

Hengeveld, Kees, and J. Lachlan Mackenzie. 2006. Functional discourse grammar. Encyclopedia

of language and linguistics, 2nd edn, vol. 4, ed. by Keith Brown, 668–76. Oxford, UK:

Elsevier.

Hengeveld, Kees, and Jan Rijkhoff. 2005. Mundari as a flexible language. Linguistic Typology

9–3.406–31.

Hengeveld, Kees, Jan Rijkhoff, and Anna Siewierska. 2004. Parts-of-speech systems and word

order. Journal of Linguistics 40–3.527–70.

Hopper, Paul J., and Elizabeth Closs Traugott. 2003. Grammaticalization, 2nd edn. Cambridge,

UK: Cambridge University Press.

Hopper, Paul J., and Sandra A. Thompson. 1984. The discourse basis for lexical categories in

universal grammar. Language 60–3.703–52.

Hundius, Harald, and Ulrike Kölver. 1983. Syntax and semantics of numeral classifiers in Thai.

Studies in Language 7–2.164–214.

Kuipers, Aert. 1968. The categories verb-noun and transitive-intransitive in English and Squa-

mish. Lingua 21.610–26.

Kutsch Lojenga, Constance. 1994. Ngiti: a Central-Sudanic language of Zaire. Köln, Germany:

Rüdiger Köppe.

Lehmann, Christian, and Edith A. Moravcsik. 2000. Noun. Morphology: a handbook on

inflection and word formation (Part 1), ed. by Geert Booij, Christian Lehmann and Joachim

Mugdan, 732–57. Berlin, Germany: Walter de Gruyter.

Lord, Carol Diane. 1989. Syntactic reanalysis in the historical development of serial verb

constructions in languages of West Africa. PhD dissertation, University of California, Los

Angeles. Ann Arbor, MI: University Microfilms International.

Lyons, John. 1977. Semantics (2 volumes). Cambridge, UK: Cambridge University Press.

Mithun, Marianne. 2000. Noun and verb in Iroquoian languages: multicategorisation from

multiple criteria. Approaches to the typology of word classes (Empirical Approaches to

Language Typology 23), ed. by Petra M. Vogel and Bernard Comrie, 397–420. Berlin,

Germany/New York, NY: Mouton de Gruyter.

Mosel, Ulrike. 1991. Transitivity and reflexivity in Samoan. Australian Journal of Linguistics

11.175–94.

Mosel, Ulrike, and Even Hovdhaugen. 1992. Samoan reference grammar. Oslo, Norway:

Scandinavian University Press.

Plank, Frans. 1997. Word classes in typology: recommended readings (a bibliography). Linguistic

Typology 1–2.185–92.

Pustet, Regina. 1989. Die Morphosyntax des ‘Adjektivs’ im Sprachvergleich. Frankfurt am

Main, Germany: Lang.

Rijkhoff, Jan. 2000. When can a language have adjectives? Approaches to the typology of word

classes (Empirical Approaches to Language Typology 23), ed. by Petra M. Vogel and Bernard

Comrie, 217–57. Berlin, Germany/New York, NY: Mouton de Gruyter.

Rijkhoff, Jan. 2003. When can a language have nouns and verbs? Acta Linguistica Hafniensia

35.7–38.

Rijkhoff, Jan. 2004. The noun phrase (expanded paperback edition of 2002 hardback publica-

tion). Oxford, UK: Oxford University Press.

Rijkhoff, Jan. Forthcoming. On flexible and rigid nouns. Studies in Language.

Sasse, Hans-Jürgen. 1993a. Syntactic categories and subcategories. Syntax: an international

handbook of contemporary research (2 vols.), ed. by Joachim Jacobs, Arnim von Stechow,

Wolfgang Sternefeld and Theo Vennemann, 646–86. Berlin, Germany: Walter de Gruyter.

Sasse, Hans-Jürgen. 1993b. Das Nomen – eine universale Kategorie? Sprachtypologie und

Universalienforschung (STUF) 46–3.187–221.

Sasse, Hans-Jürgen. 2001. Scales of nouniness and verbiness. Language typology and linguistic

universals: an international handbook, vol. ii, ed. by Martin Haspelmath, Ekkehard König,

Wulf Oesterreicher and Wolfgang Raible, 495–509. Berlin, Germany: Walter de Gruyter.

Schachter, Paul. 1985. Parts-of-speech systems. Language typology and syntactic description (3 vols.).

Volume I: Clause structure, ed. by Timothy Shopen, 3–61. Cambridge, UK: Cambridge

University Press.

Smit, Niels. 2001. De rol van derivatie bij lexicale specialisatie. MA thesis, Department of

Linguistics, University of Amsterdam.

Vogel, Petra M., and Bernard Comrie (eds.). 2000. Approaches to the typology of word classes

(Empirical Approaches to Language Typology 23). Berlin, Germany/New York, NY: Mouton

de Gruyter.

de Vries, Lourens. 1989. Studies in Wambon and Kombai: aspects of two Papuan languages of

Irian Jaya. PhD dissertation, University of Amsterdam.

Wetzer, Harrie. 1996. The typology of adjectival predication (Empirical Approaches to Language

Typology 17). Berlin, Germany: Mouton de Gruyter.
