Online dictionaries of English
Robert Lew
Adam Mickiewicz University
Abstract
In this paper I present an overview of the spectrum of available online English language
dictionaries, and then offer some general comments on a few selected key issues. Given the
current explosion of web content, it is quite pointless to try to list every single dictionary
available. It makes better sense to identify the salient categories of online dictionaries and
selectively focus on their prominent and typical representatives. The first notable category, so
important to the many learners of English worldwide, comprises the famous British monolingual
learners’ dictionaries (the Big Five). Here, it is interesting to observe the gradual transition to
the online medium in what has sometimes been called the freemium approach. Quality general
English dictionaries aimed at the native speaker are not so well represented, but there is a
wide choice of specialized (subject) dictionaries of varying quality and provenance.
Special-purpose dictionaries include pronouncing dictionaries and onomasiological dictionaries.
Diachronic dictionaries have also established a presence on the internet. As one guise of the
Web 2.0 experience, we witness the emergence of bottom-up (or user-involvement)
lexicography, with such prominent exemplars as the Urban Dictionary or Wiktionary.
Hyperlinking is a fundamental feature of the web, but it is, arguably, overused in the so-called
dictionary aggregators: dictionary portals which put together entries from several online
dictionaries. This creates highly redundant assemblages of lexicographic data. How to tap the
richness of the Web but present the results in a user-friendly manner without laborious human
intervention is a tough question. Another issue that still awaits satisfactory answers is the
organization of access to data in online dictionaries. Even in highly respected dictionaries,
there remain basic problems of access, such as with locating multi-word units,
notwithstanding the upbeat tone of metalexicographers who often just pronounce the problem
as essentially solved in the electronic medium. Other issues related to new technologies are
the use of graphics, multimedia and alternative presentation modes, and these receive some
attention. Finally, I play with the idea of the dictionary as an advanced query system sitting on
top of a text corpus. Using collocation dictionaries as an example, I demonstrate that the
difference between a sophisticated corpus query system and a more traditional lexicographic
product may soon become something of a technical subtlety.
Introductory
The present paper is intended as an overview of online dictionaries of English, often seen, and
probably rightly, as the leading lexicographic tradition of the present. Although a balanced
overview is my primary goal, I will also touch upon some general issues and adopt a more
evaluative position here and there. However, this will only be a secondary perspective, as the
specific issues are covered in greater depth in some of the other papers in the present volume.
Obviously, given the sheer number of the currently available online dictionaries, no-one
can hope to produce a complete catalogue, and this is not the purpose here. Rather, the idea is
to present prominent and representative exemplars of specific types of dictionaries and focus
on their properties of interest. But what are those types of dictionaries? As dictionaries can be,
and have been, compared on a number of different levels, classifying them has traditionally
been problematic. This has become even more of a challenge in the age of electronic
dictionaries. What, then, could be the basic classifying criteria for online dictionaries?
Clearly, most of the traditional criteria can still be applied to online products. Here, of
course, we find the complex (and at times confusing) network of overlapping oppositions:
general/specialized subject, general/special purpose, L1/L2/FL speaker, expert/layman,
contemporary/historical, etc.
There do appear, however, to be some criteria or oppositions that have not been inherited
from printed dictionaries but rather are specific to online dictionaries.
1 Some additional criteria for classifying online dictionaries
1.1 Institutional vs. collective
A variety of overlapping classification criteria have been used to categorize online
dictionaries. For example, in terms of user involvement, there is the institutional versus
collective opposition (Fuertes-Olivera 2009); the latter category signifies a collaborative effort
by a community of non-professionals, who can themselves be dictionary users; an earlier
paper by Carr (1997) also used the terms bottom-up and collaborative. User-involvement
is yet another designation for a similar concept, while open stresses a slightly different aspect
of what might again be a fairly similar formula.
1.2 Free vs. paid
Collective dictionaries would normally be free to use. Conversely, institutional dictionaries
need not necessarily involve fee-based access, so the free versus paid contrast is an
independent one. It is also increasingly difficult to demarcate clearly between free and paid,
with the clear cases leaving a substantial grey area in the middle, as revenue to the publisher
can take different forms. For example, individual pay-per-view or subscription-based access is
a clear case, but when syndicated as part of a more comprehensive service and sold, say, to
libraries, the end user often does not bear the direct cost. Then there are cases where online
access is offered (perhaps for a limited time) as a bonus for buyers of paper editions. Still
closer to the free end of the cline are ad-supported dictionaries, and this appears to be a rather
popular model at the moment.
1.3 Number of dictionaries
In terms of how many dictionaries are offered by the specific services, at least the following
four options come to mind:
1. individual dictionaries: just like the traditional printed dictionaries, there are
standalone, single online dictionaries;
2. dictionary sets consisting of clusters of related dictionaries may be offered from a
single landing page; a good example is the Cambridge dictionaries online page;1
3. dictionary portals only include hyperlinks to actual dictionaries (examples will be
presented below);
4. dictionary aggregators excel at pasting together the content of various dictionaries and
serving them on a single page (again, examples will be discussed below).
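The difference between the last two options can be made concrete with a minimal sketch. Everything in it is hypothetical: the class `OnlineDictionary`, the two lookup functions, the `example` URLs and the toy definitions are invented for illustration and do not describe any actual service. The point is only that a portal returns pointers to entries, whereas an aggregator returns the entries themselves, one per source.

```python
from dataclasses import dataclass, field


@dataclass
class OnlineDictionary:
    """A hypothetical online dictionary: a name, an address, and its entries."""
    name: str
    url: str
    entries: dict = field(default_factory=dict)  # headword -> definition text


def portal_lookup(dictionaries, headword):
    """A portal only points outward: it returns links, never content."""
    return [d.url for d in dictionaries if headword in d.entries]


def aggregator_lookup(dictionaries, headword):
    """An aggregator pastes the entries from every source onto one page."""
    return [(d.name, d.entries[headword])
            for d in dictionaries if headword in d.entries]


# Invented toy data, for illustration only.
SOURCES = [
    OnlineDictionary("Dictionary A", "https://a.example/",
                     {"cline": "a continuum with gradual gradations"}),
    OnlineDictionary("Dictionary B", "https://b.example/",
                     {"cline": "a graded series",
                      "portal": "a site that links to other sites"}),
]

print(portal_lookup(SOURCES, "cline"))      # two links
print(aggregator_lookup(SOURCES, "cline"))  # two full, overlapping entries
```

Note that the aggregator result for a single headword already contains two largely overlapping definitions; with more sources, the redundancy grows accordingly.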
In my overview below, I will begin with some notable representatives of institutional
dictionaries offered free of charge to the world internet community.
2 Institutional Dictionaries
2.1 General English Dictionaries
General English dictionaries are traditional general-purpose dictionaries whose word list is not
restricted by domain or register, and which provide the relatively rich microstructural
treatment of (primarily) contemporary English that is traditionally expected of general
reference desk dictionaries.
2.1.1 American
Traditional US dictionary publishers seem to have embraced the web: as many as three of the
major American players on the market of general desk and college dictionaries make their
dictionaries available online free of charge. These are the Merriam-Webster Online
Dictionary, American Heritage Dictionary and Random House Unabridged Dictionary, the
last one being included only as part of the Dictionary.com service (on which see 3.1 below).
2.1.2 British
Until recently, the range of online general-purpose dictionaries on the British scene
had been less complete, with the traditional and most prestigious publishers (notably Oxford
University Press) apparently hesitant about placing their products online for free. Only very
recently, OUP created the new oxforddictionaries.com2 lexicographic portal, built around two
of the publisher’s recent dictionaries: the newest (third) edition of the Oxford Dictionary of
English (under the heading World English), and its American counterpart, the New Oxford
American Dictionary, also in its third edition. A premium subscription service is available,
with one year free for buyers of the printed copy.
The availability of the free/premium combination for these Oxford dictionaries exemplifies
rather well the new business model that is currently being followed by a number of
publishers: the model known by the linguistic blend freemium. The approach works on the
principle that basic content and functionality are offered essentially free of charge (in response,
we might say, to the free-lunch mindset of today’s netizens). The free offer, however, is used
as an opportunity to market and sell extra content, which might be richer lexicographic data
and/or non-lexicographic content, such as exercises or language testing materials. To continue
with our example, the premium oxforddictionaries.com service offers the following extra
features (Judy Pearsall, personal communication):
• sense-linked thesaurus of 600,000 synonyms and antonyms;
• advanced search and browse features;
• 1.9 million sense-linked examples from the Oxford English Corpus;
• audio pronunciations;
• My Oxford Dictionary personalization features;
• browsing and search by subject area, meaning category, part of speech, etc.;
• four additional zones fully linked to dictionary content, including Writing Skills zone,
Writers and Editors zone, Example sentences zone, and Puzzles zone.
To some extent, free online versions may drive the sales of paper copies — but of course this
argument could be reversed, with online access deterring some potential buyers from
purchasing a printed copy.
Apart from the two Oxford dictionaries, there are also other notable British dictionaries
offered free of charge. Collins offers what it refers to as the Collins English Free Dictionary.3
A closer examination reveals that this is not the same as the authoritative Collins English
Dictionary; the latter, however, does seem to be available, but only as part of
TheFreeDictionary service (on which see 3.1 below). The venerable Scottish publisher
Chambers offers on its website its Chambers 21st Century Dictionary.4 Again, though not
really the same as the renowned Chambers English Dictionary, the 21st Century is still a
usable, solid reference work for general consultation.
The Encarta World English Dictionary,5 having originated in a cooperation between the
London-based Bloomsbury publisher and Microsoft, actually comes in two versions, and both
are available via the same website; there is the World English version, marketed as the
dictionary that provides unrivalled treatment of the regional varieties of English, and the
localized US version; the site provides an option to switch quickly between the two, and it is
fascinating to observe, by switching back and forth, the differences in the coverage of
regional terms and meaning, spelling and pronunciation.
2.2 Learners' dictionaries: the Big Five
According to data from internetworldstats,6 English is the foreign language of some 86% of
Europe’s active internet users. Now, given that English is today’s de facto lingua franca and
that WWW content in English dwarfs that in any other language, it becomes clear that
non-native speakers are a significant category of online dictionary users, present or future. In
this context, the category of English learners’ dictionaries comes to the focus, since these are
the reference works designed specifically with the non-native speaker in mind. English
learners’ dictionaries enjoy a long-standing tradition, which goes back to around the 1940s
or, as some claim, the 1930s (cf. Cowie 1999). Their content has been meticulously reworked
over numerous successive editions, and because of the worldwide customer base and the
corresponding sales volumes, publishers of monolingual English learners’ dictionaries have
been able to take advantage of select teams of expert lexicographers. These dictionaries have
enjoyed high levels of prestige, and so have their traditionally British publishers.
The last few years have seen free versions of British monolingual dictionaries for advanced
learners appear online, one by one. On the whole, the major British MLDs have followed a
pattern of remarkable similarity (Yamada 2010), perhaps as part of the competitive drive, and
this is also reflected in the features offered in their online versions. There is also a more
down-to-earth reason for the similarities found in a number of British MLDs: they tend to use
the same software dictionary production platform from IDM.
The range of available English MLDs opens with the pioneer in this segment, Oxford
Advanced Learner's Dictionary,7 a free version now roughly based on the 7th print edition. A
long-time competitor, Longman Dictionary of Contemporary English, currently in its fifth
edition, has also offered a free online version8 for some time. The dictionary’s landing page
specifically mentions a limitation of the free version: recordings of spoken pronunciation are
only available for a subset of headwords and example sentences (more specifically, the audio
is available for the entries in the letter stretches D and S). The note further states that audio
recordings for all entries are available in “the CD-ROM version”: this is not quite accurate, as
the optical disk version is actually offered on a DVD-ROM. But the free version is not the
only online version of this dictionary: there is also a radically different premium online
edition9, which offers essentially the same content as the off-line DVD-ROM version.
Cambridge Dictionaries Online10 represents an example of an institutional dictionary set (as
defined in 1.3 above): apart from the flagship Cambridge Advanced Learner’s Dictionary,
four other learners’ dictionaries from the publisher are available at the same address.
Amongst the major British learners’ dictionaries, Macmillan English Dictionary may well
be the one to have offered the most complete set of lexicographic content online11 free of
charge, including audio pronunciations of all headwords and a sense-linked thesaurus.
The one member of the Big Five set which has remained apparently sceptical when it comes
to offering free online access of any kind is COBUILD. Although it has offered subscription-
based access for some time,12 none of this is available freely, if we disregard an outdated 4th
edition being hosted on a third-party service.13 Recently, it looked as if COBUILD was set to
become the most widely used learner’s dictionary when, in autumn 2009, Google apparently
obtained a licence for COBUILD content and placed it online as the main Google dictionary
for English. This was a questionable choice, as COBUILD is not really well-suited for the
type of uses that Google users were most likely to need the dictionary for, i.e. problems with
text reception: of all the major learners’ dictionaries, COBUILD has the smallest coverage
(Rundell 2006). On the other hand, the features supporting text production would remain
underused. Google’s half-hearted implementation of the interface certainly would not have
made users more sympathetic towards the dictionary. For example, Google dictionary
included COBUILD’s syntactic codes, but without a word of explanation anywhere. Surely, it
is a long shot to assume that a casual user of the Google dictionary will appreciate the
significance of a code such as “NVAR” (in this case, an indication that a noun in the sense so
marked has both mass and individuated uses). Considering all this, it is not at all surprising
that in August 2010, COBUILD was replaced as the database for Google dictionary with The
Oxford American College Dictionary (Judy Pearsall, personal communication).
2.2.1 American learners' dictionaries
Although it is the British publishers that lead the market of monolingual English learners’
dictionaries, such dictionaries have also been published elsewhere, and one particular
dictionary that made a premiere recently with quite a bit of publicity is the Merriam-Webster's
Learner's Dictionary.14 What is unusual about this dictionary is that the launch of its
online version coincided with the publication of the first paper edition. The free online content
includes audio pronunciation, and the user interface is at least as good as those of the British
dictionaries, but despite the marketing claims, the lexicographic content itself is not
groundbreaking, and still lacks a number of modern features now taken for granted in the
leading British products (Bogaards 2010; Hanks 2009). The dictionary does have more
examples than the competition, but their quality has been questioned (Hanks 2009).
Despite what some might be led to believe, the Merriam-Webster's Learner's Dictionary is
by no means the first American dictionary of its type: several have already been published,
and at least one of them, Heinle's Newbury House Dictionary of American English,15 is freely
available online. However, the latter is a rather small dictionary and not a particularly
impressive one. All in all, learners of American English may actually be better off using
British-published dictionaries of American English, such as the Cambridge Dictionary of
American English.16
2.2.2 Louvain EAP Dictionary (LEAD)
Apart from the established publishers, some academic centres are also trying to enter the field
of learners’ dictionaries. One particularly promising project currently in progress (not yet
publicly accessible) is the Louvain EAP dictionary (LEAD), which is being developed as a
dictionary for non-native writers. Its main novelty is that it is customizable in terms of field
domain (business, medicine) and mother tongue (French, Dutch). In consequence, usage notes
and equivalents match the L1 of the user, and some of the examples are domain-specific. The
dictionary will also have (as you might expect from a product created at the Centre for
English Corpus Linguistics) a solid grounding in corpora, and integrated corpus access.
2.3 User-involvement (bottom-up) lexicography
In the democratic world of the internet, users can play lexicographer as well and create their
own online dictionaries. There is quite an impressive range of these, but let us have a look at
three representative exemplars:
2.3.1 Urban dictionary
A success story in its own right, the Urban dictionary17 is a true bottom-up initiative which
recently celebrated its 10th anniversary. One of the community features exemplified here is
that users vote on the “best” definitions. But such democracy does not necessarily serve
lexicography well: as it turns out, the most liked definitions are not of the type that would
really help someone who does not already know the meaning. Clearly, true explanatory
definitions are too predictable and thus not “interesting” enough, and are being pushed back to
the bottom of the list. Instead, collaborative dictionary entries, unless properly moderated,
tend to become the playing ground for showing off wit, marking in-group membership and
venting prejudice. For example, one entry at the headword BOOTYISM runs as follows: “The
gospel according to Beyonce. Often confused with Buddhism.” This entry is written in an
abbreviated style posing as lexicographese, and manages to allude rather cleverly to the
semantics as well as origins of the slang term, but it would probably not be of much help to a
user who has no clue about the meaning. In this case, the author seems to be aware of this
deficiency, and makes up for it in the (entirely invented) example exchange:
Todd: I'm thinking about converting to Bootyism.
Michael: Nah man, it's BUDDHISM.
Todd: No, cause in Bootyism all you do is worship ass.
2.3.2 Wiktionary
Wiktionary18 may be the ultimate collaborative dictionary. A recent in-depth analysis of this
resource (Fuertes-Olivera 2009) presents a number of interesting findings. It is observed that,
contrary to what is often claimed, Wiktionary is not a multilingual dictionary but an English
dictionary with a translation overlay for several other languages. It is also noted that very
similar items may receive radically different treatments, lacking internal consistency and
contradicting the Wiktionary guidelines.
2.3.3 Wordnik
Wordnik19 presents an interesting blend of online dictionary genres, involving a collaborative
community-driven component built around a “professional” core. According to the founder
Erin McKean (personal communication), user-generated content is encouraged here but in
"guided" ways, with less emphasis on user-created definitions than is usual in collaborative
projects. Wordnik embeds content from other datasets: at this time, Twitter and Flickr are
being tapped for real-time citations and relevant images, respectively. The service employs
modern data mining techniques to identify in corpora citations of the self-defining and
exemplar types (McKean, personal communication). Overall, there is less reliance on
traditional definitions and the emphasis is shifted to citations.
2.3.4 Collaborative-institutional dictionaries
Commercial publishers also try to get their users actively interested and involved in
lexicography, perhaps in an effort to persuade them to stay on the site and come back for
more. Examples of collaborative sections hosted on institutional dictionary sites suggest that
the opposition institutional versus collective dictionary (Fuertes-Olivera 2009) may no longer
be a sharp one. Two such examples from well-known institutional publishers are the
Merriam-Webster Open Dictionary20 and Macmillan Open Dictionary.21 A perusal of the
user-added entries reveals that most of the entries added would not meet the criteria for
inclusion in the regular edition of the dictionary, and their presence merely provides evidence
of the conventional wisdom that “the dictionary” is a collection of “all the words” of a
language.
Apart from adding open dictionary components, online dictionaries sometimes offer other
extras aimed at involving the users. Recent add-ons include social networking features, such
as the award-winning Macmillan Dictionary blog.22
So far we have discussed general dictionaries of contemporary English, aimed at both
native speakers of English and foreign learners. Let us now move beyond these common
types, to diachronic and specialized dictionaries.
2.4 Diachronic (historical) dictionaries
Users of diachronic dictionaries are most typically language scholars, and so their level of
sophistication and language awareness is normally far beyond that of lay users. As language
experts, they can reasonably be trusted to make choices that a non-expert user will not be in a
position to make, such as the explicit selection of microstructural data categories (and we will
revisit the issue of customization in a later part of this article). The makers of academic
diachronic dictionaries appear to be aware of these ramifications, as exemplified by the online
version of what is perhaps the most famous dictionary world-wide (at least for English), the
Oxford English Dictionary. Access to the OED is subscription-based, and affiliated scholars
would most of the time rely on their institutional subscription rather than a personal one. In
contrast, a more restricted (in terms of period) but no less voluminous Middle English
Dictionary23 has been freely available online since 2007, when the University of Michigan
completed the digitization process with the help of a government grant. The dictionary offers
a rather large number of technically complex search options but these should be manageable
for language scholars and their students.
2.5 Subject field dictionaries
There are countless online specialized dictionaries out there on the web, most of them fairly
small in size, dealing with the vocabulary of a specific subject field (as well as narrower sub-
fields). Because of the sheer number, many users will find it useful to consult online
directories of such dictionaries, one of the most comprehensive being Glossarist.com: an
example of a dictionary portal as listed in my provisional taxonomy under 1.3 above.
Indexing portals of this type only include links to dictionaries on external pages, without
themselves hosting or displaying actual lexicographic content.
The lexicographic wisdom that content and presentation are largely two separate aspects is
strengthened by those products where there is a sharp contrast in quality between one and the
other. One case in point is Dorland's Medical Dictionary24 from the respectable pair Merck
Medicus and Elsevier, where solid content is marred by the uninspired (to say the least)
access interface. Users are presented with a long chain of alphabetic stretches which have to
be navigated linearly in a fashion resembling page-turning, only much slower (although there
is a term search window, it does not apply to the dictionary itself, but to other services). To
metalexicographers, this dictionary serves as a warning against sweeping generalizations
about electronic dictionaries being faster and superior in terms of access: apparently, it is
perfectly possible to produce an online dictionary where access is more cumbersome than in a
paper book.
2.6 Dictionaries with restricted macrostructure
One way to think of special-purpose dictionaries is that they often involve systematically
restricted treatment in either macrostructure or microstructure. In the former case, only a
distinct subset of the vocabulary is included in the wordlist. Field dictionaries, already
covered in 2.5 above, may be included here. Another exemplar of a restricted macrostructure
dictionary is the well-known and successful Acronym Finder25, which aims to include
acronyms, including those pronounced as one word and letter by letter (sometimes called
initialisms). Although Acronym Finder does not limit its headword list to English acronyms, it
is a fact that English very clearly dominates.
2.7 Dictionaries with restricted microstructure
In contrast to dictionaries with restricted macrostructure, restricted-microstructure dictionaries
are characterized by a systematic reduction, not in the word list itself, but in the lexicographic
data categories presented at each entry, compared to a general dictionary. The free Online
Etymology Dictionary26 is a representative of the genre: the lexicographic data for a given
headword is restricted to an explanation of the word’s origins.
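The contrast between the two kinds of restriction can be sketched in a few lines of code, treating a dictionary as a mapping from headwords to records of data categories. This is an illustrative model only: the function names, the field names and the three toy entries are assumptions made for the sketch, not a description of how any of the dictionaries discussed here is actually built. Restricting the macrostructure removes whole entries; restricting the microstructure keeps every entry but removes data categories from each.

```python
# A toy "general dictionary": headword -> record of data categories.
# The entries are invented for illustration.
GENERAL = {
    "cat": {
        "definition": "a small domesticated feline",
        "etymology": "Old English catt",
        "acronym": False,
    },
    "radar": {
        "definition": "a system for detecting objects by radio waves",
        "etymology": "from 'radio detection and ranging'",
        "acronym": True,
    },
    "laser": {
        "definition": "a device producing a coherent beam of light",
        "etymology": "from 'light amplification by stimulated "
                     "emission of radiation'",
        "acronym": True,
    },
}


def restrict_macrostructure(dictionary, keep):
    """Keep only the entries for which keep(headword, record) is true."""
    return {hw: rec for hw, rec in dictionary.items() if keep(hw, rec)}


def restrict_microstructure(dictionary, categories):
    """Keep every entry, but only the listed data categories in each."""
    return {hw: {c: rec[c] for c in categories if c in rec}
            for hw, rec in dictionary.items()}


# An acronym dictionary: fewer headwords, full records.
acronyms = restrict_macrostructure(GENERAL, lambda hw, rec: rec["acronym"])

# An etymological dictionary: all headwords, one data category each.
etymologies = restrict_microstructure(GENERAL, ["etymology"])

print(sorted(acronyms))
print(etymologies["cat"])
```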
Pronouncing dictionaries are another major category of restricted-microstructure
dictionaries, where the chief lexicographic data given indicates the phonetic form of the entry
word. Semantic information is only given in exceptional cases, such as to disambiguate
between graphemically identical words that are pronounced differently (i.e. homographs that
are not homophones). There is the question of the exact form in which information on
pronunciation is conveyed. In printed books, transcription (in one of a number of standards,
the most universal being the IPA) used to be the only option, but in the multimedia
environment of the web, the expectation of users is to be able to hear an audio rendition of an
item’s pronunciation. This expectation is met by the popular free online talking English
dictionary howjsay.com,27 which provides recorded audio clips, but no written transcription.
At the other end of the cline are academic pronouncing dictionaries such as the Carnegie
Mellon University Pronouncing Dictionary,28 which presents transcriptions in the ARPAbet
respelling system, or Péter Szigetvári’s English Pronouncing Dictionary,29 which employs a
variant of the SAMPA respelling system.
There is no denying that being able to hear what the word or phrase sounds like is an asset,
but does this mean, as most people seem to assume, that phonetic transcription is now
dispensable? It probably is for native speakers of English, but hardly so for speakers of other
languages looking up English pronunciation. For them, it is an illusion to believe that just
hearing a word pronounced in a foreign language is enough to register, let alone learn, its
correct pronunciation. Due to the effect known as categorical perception, speakers of a
language tend to hear foreign language sounds through the filter of their native language
phonology. Consequently, foreigners will mostly hear their native language sounds
and tend to miss the distinctions not present in their own language. For example, a speaker of
Polish may easily miss the difference between met and mat. The important advantage of
phonemic transcription is that it provides an explicit graphic representation of the phonemes
involved, drawing attention to the phonemes as entities. (This is not to say that the two
academic dictionaries cited above do this in a very user-friendly way: they do not.) Of course,
it is also true that efficient use of phonetic transcription does not usually come naturally for a
language learner and requires guided training.
But that is not the end of the story. Apart from pure phonemic identity, there is the
important subphonemic phonetic detail, including positional allophony which, again, is very
hard to hear for the untrained learner. Although traditional printed pronouncing dictionaries
tend not to give subphonemic detail, there is no principled reason why future online
dictionaries should not be able to offer a choice of the level of transcription, including a
narrow-phonetic rendition for those who might want or need it. Technically, it should not be
terribly difficult to take stock of at least the rule-based variants.
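As a minimal sketch of what deriving rule-based variants might involve, the Python fragment below produces a narrower rendition from a phonemic one by applying a single, deliberately simplified rule: a voiceless stop is marked as aspirated when it is the first segment of a stressed syllable. The input format (pre-syllabified phoneme lists with a stress flag) and the function names are assumptions made for illustration; a realistic system would need many more rules and a proper syllabifier.

```python
VOICELESS_STOPS = {"p", "t", "k"}

def aspirate(syllables):
    """Return a narrower transcription with aspirated stops marked.

    `syllables` is a list of (phonemes, stressed) pairs, where `phonemes`
    is a list of phoneme symbols and `stressed` is a bool.
    Simplified rule: a voiceless stop that is the first segment of a
    stressed syllable is aspirated. In an /s/ + stop onset the stop is
    not syllable-initial, so it is left untouched.
    """
    narrow = []
    for phonemes, stressed in syllables:
        out = list(phonemes)
        if stressed and out and out[0] in VOICELESS_STOPS:
            out[0] = out[0] + "ʰ"
        narrow.append(out)
    return narrow

def render(syllables):
    """Join syllables with dots for display."""
    return ".".join("".join(s) for s in syllables)

# Hand-syllabified phonemic input (an assumption of this sketch)
potato = [(["p", "ə"], False), (["t", "eɪ"], True), (["t", "əʊ"], False)]
stop = [(["s", "t", "ɒ", "p"], True)]

print(render(aspirate(potato)))  # prints pə.tʰeɪ.təʊ
print(render(aspirate(stop)))    # prints stɒp
```

An interface could then offer the phonemic and the narrow rendition side by side and let the user choose the level of detail.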
As noted by Sobkowiak (2009), phonetic transcription has a representational function and
an indexical function. The former has to do with the representation of the phonetic form of a
word (or, more generally, other linguistic string). The indexical function allows the user to use
symbols for accessing (sets of) lexical items, such as when looking for words that exhibit a
given phonetic pattern. A systematic transcription system is at present a prerequisite for the
indexical function, although not all dictionaries that include transcription
offer ‘sound search’ options. Clearly, of the three free pronouncing dictionaries presented
here, Szigetvári’s English Pronouncing Dictionary is the most sophisticated in this
respect.
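The indexical function can be illustrated with a small sketch of a ‘sound search’ over transcriptions. The toy lexicon below is hand-entered in an ARPAbet-style notation, and the query syntax (space-separated symbols, with * standing for any run of phonemes) is an assumption of this example, not the interface of any of the dictionaries discussed.

```python
import re

# Toy lexicon: hand-entered ARPAbet-style transcriptions (stress digit on vowels).
LEXICON = {
    "met": "M EH1 T",
    "mat": "M AE1 T",
    "team": "T IY1 M",
    "dream": "D R IY1 M",
    "steam": "S T IY1 M",
    "tooth": "T UW1 TH",
}

def sound_search(pattern):
    """Return the words whose whole transcription matches `pattern`.

    `pattern` is a space-separated sequence of phoneme symbols in which
    '*' stands for any run of zero or more phonemes.
    """
    parts = []
    for symbol in pattern.split():
        if symbol == "*":
            parts.append(r"(?:\S+ )*")      # any number of whole symbols
        else:
            parts.append(re.escape(symbol) + " ")
    regex = re.compile("".join(parts))
    # A trailing space is added so that every symbol ends in a space.
    return sorted(w for w, t in LEXICON.items() if regex.fullmatch(t + " "))

print(sound_search("* IY1 M"))  # words ending in the same rhyme as "team"
print(sound_search("M * T"))    # words framed by M ... T
```

Because the symbols are systematic, the same index supports rhyme search, minimal-pair search (met versus mat) and similar queries without any change to the data.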
2.8 Onomasiological dictionaries
Onomasiological dictionaries are those that are specifically designed to take the user from a
concept or idea to linguistic form, rather than explaining the meaning or use of a given form.
A traditional paper dictionary of this type would most typically be a thesaurus or synonym
dictionary. Thesaurus.com30 is a companion site to the popular Dictionary.com aggregator
(see 3.1 below). A more interesting online example of such a dictionary is RhymeZone,31
which started off as a synonym dictionary calling itself the Semantic Rhyming Dictionary.
Somewhat predictably, probably because of the phrase “rhyming dictionary” in the name,
users arrived at the dictionary from search engines looking for traditional phonetic rhymes,
and this is what the default search mode now offers. In fact, searching for rhyming words is
also an onomasiological query, albeit in a broader sense. In the more restricted sense of
onomasiological, the dictionary offers lists of synonyms, antonyms and “related words”. For
these, RhymeZone relies on data from the English WordNet32 lexical database, just as so
many other lexical resources do these days: it has become one of the favourite datasets for
many online dictionaries, because it is free and NLP-tractable in ways that make such
integration relatively easy.
One interesting use of WordNet data is in graphic visualization engines such as
VisuWords33 or Visual Thesaurus,34 where the idea is to represent WordNet’s lexical relations
in a visually appealing graphical form. The latter now shows up in Cambridge Dictionaries
Online entries.
Having completed a quick tour of the representative online dictionaries of English, we will
move on to a number of overarching issues that are relevant and topical for online dictionaries
of today and tomorrow.
3 Some issues in online dictionaries
3.1 The dictionary web
The World Wide Web is built around the concept of hypertext, where texts, documents and
media make up an interconnected network. Like most other sites, online dictionaries
hyperlink, interlink, embed and integrate, and it will not take long for a careful user of online
dictionaries to start noticing that quite a lot of the same content crops up again and again on a
variety of dictionary sites. For example, the very same Visual Thesaurus images which feature
in Cambridge Dictionaries Online are also present at the Dictionary.com35 site. The latter is an
example of a dictionary resource which does not rely on its own data, but instead aggregates
lexicographic content from other electronic (online) dictionaries. Dictionary.com is a
particularly popular such aggregator. The popularity, one might suspect, has a lot to do with
the attractive domain name, which to many users (and search engines?) strongly suggests that
this is the Dictionary (see e.g. Béjoint 2010 on the popular image of the dictionary). As of this
writing, the resource aggregates lexicographic content from 15 dictionaries, including the
American favourites Random House Dictionary and American Heritage Dictionary, as well as
half a dozen special-purpose and special-subject dictionaries.
Another aggregator is TheFreeDictionary, with American Heritage Dictionary (again!),
WordNet (again!), and Collins English Dictionary (and Thesaurus). The resource is worth
consulting for this last one, as this time (compare 2.1.2 above), it is indeed the respectable
Collins English Dictionary, which is generally not freely available elsewhere.
While the ability to hyperlink and embed is one that lies at the heart of the World Wide
Web, in dictionary aggregators the idea is taken to extremes, with the result that such
dictionary portals produce absurdly long entries by mechanically pasting together, back-to-
back, entries from several online dictionaries. These individual entries are often very similar,
which results in highly unhelpful, many-times redundant, tortuous assemblages of
disconnected lexicographic data.
3.2 Access
Electronic dictionaries, including online dictionaries, are often praised for their access
functionality, which is claimed to be superior to that of the paper book. Clearly, the
electronic interface is by definition more flexible and has a potential for efficiency that is not
achievable in static printed form, but it is also true that this potential is not always properly
utilized, especially if the online dictionary is retrospectively digitalized (Wiegand et al. 2010:
209). One example of a respectable online dictionary with paper-like access is the American
Heritage Dictionary, which has no search facility at all. Worse still is Dorland's Medical
Dictionary (see 2.5 above), where outer access is even slower and more cumbersome than in a
printed book. However, some online dictionaries do take advantage of the electronic medium
and explore alternative access routes. As an illustration, let us consider
access in cases where a search term potentially returns large amounts of data.
3.2.1 The step-wise approach to outer access?
Over ten years ago, Hulstijn and Atkins (1998) proposed what they called “step-wise access”
for electronic dictionaries. In this connection, it is interesting to observe how this proposal
stands up in view of the practical implementations in online English dictionaries. For this, we
need to examine the volume of data that a dictionary presents to the user in those cases where
a search term matches more than a single treatment unit, for instance multiple lemmata (such as
items of different parts of speech) or multi-word expressions (MWEs), such as fixed
phrases, idioms or phrasal verbs. The spectrum of actual solutions seen in English online
dictionaries can essentially be reduced to three options:
1. a menu of target items is presented;
2. a menu is presented, but the most likely choice opens by default;
3. partial entries are listed.
The first option, by far the most common, can be illustrated using Macmillan Dictionary
Online as an example. Here, a search on the single-word string team returns a vertical menu of
nine matches, each one hyperlinked to an entry or subentry. The top of the menu looks like
this:
team NOUN
team VERB
dream team NOUN
sales team NOUN
...
Option 2. features in the Merriam-Webster's Advanced Learner's English Dictionary, where
a search for team produces a similar list of seven items, but the first of these (here again, team
NOUN) is already given as the full entry immediately below the list.
Option 3. is implemented in the online dictionary at myCOBUILD.com,36 available to
buyers of the printed copy of the Collins COBUILD Advanced Dictionary. The approach is an
intermediate one between a bare lemma list (Option 1.) and complete entries (Option 2.). As
seen in Figure 1, showing the entry TEAM in myCOBUILD.com, the dictionary interface alerts
the user that multiple entries have been found, and then displays the top of each lemma with a
More link leading to the complete entry for that lemma.
Figure 1: The entry for TEAM in myCOBUILD.com as an example of a stepwise interface
Which of the three options is best? A universal answer, ignoring lexicographically relevant
details such as the nature of the lookup situation and specific user needs and skills, rarely
makes sense in lexicography, but let us offer some observations that might have a more
universal appeal. Option 2. looks attractive, but there is a danger here that users may fail to
recognize that the default choice (as here team NOUN) is the wrong one in their case. In
contrast, Option 1. seems relatively safe in terms of the risk of missing the right option, but
the problem here lies in the economy of effort (aka laziness): users may lack the patience to
navigate through the menu to actual full treatment, and may decide instead to ditch a tool
which requires too much clicking. In view of the above reservations, Option 3. might
perhaps be optimal (other things being equal), and it is surprising that so few dictionaries have
adopted it.
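The three options can be contrasted in a short sketch. The entry records below reuse the menu items quoted earlier for team, but the entry bodies are placeholders, and the class and function names are hypothetical; the code is meant only to make the difference in what the user sees after one search explicit, not to reproduce any dictionary's implementation.

```python
from dataclasses import dataclass

@dataclass
class Entry:
    headword: str
    pos: str
    body: str  # placeholder text, not quoted from any dictionary

MATCHES = [
    Entry("team", "NOUN", "<full entry for team, noun>"),
    Entry("team", "VERB", "<full entry for team, verb>"),
    Entry("dream team", "NOUN", "<full entry for dream team>"),
    Entry("sales team", "NOUN", "<full entry for sales team>"),
]

def label(entry):
    return f"{entry.headword} {entry.pos}"

def option1(matches):
    """Menu only: every entry is one further click away."""
    return [label(e) for e in matches]

def option2(matches):
    """Menu, plus the first (assumed most likely) entry shown in full."""
    menu = [label(e) for e in matches]
    return menu + ([matches[0].body] if matches else [])

def option3(matches, preview=12):
    """The top of each entry, followed by a More link."""
    return [f"{label(e)}: {e.body[:preview]}... [More]" for e in matches]

for line in option3(MATCHES):
    print(line)
```

In this framing, Option 3 costs more screen space than Option 1 but gives the user some content with which to judge each candidate before clicking.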
3.3 Customization and profiling in online English dictionaries
A recent study by Tono (2011), the first dictionary use study ever to employ eye tracking,
confirms the suspicion that dictionary users differ greatly in their consultation habits and
strategies. The realization that different users have different needs and expectations lies
behind efforts to vary or customize e-dictionaries (De Schryver 2009; Verlinde et al. 2010),
and, indeed, in some online dictionaries of English we have reviewed above, users do have
some ability to control the presentation of lexicographic data.
Oxford English Dictionary online has control buttons to display or hide away the following
data types: Pronunciation, Spellings, Etymology, Quotations, Date Chart, Additions. It should
be observed that this solution is not really lexicographic-function-driven (Tarp 2008), as the
user here is required to explicitly select the data fields included in the dictionary. However,
the users of an academic dictionary such as this usually represent a high level of
sophistication (many being language scholars), and so they are much more likely than naive
users to know directly and explicitly what data types they actually need.
Macmillan English Dictionary Online offers two pre-packaged presentation modes which
can be selected by flipping the Show Less/Show More control button located next to the
lemma sign. The choice is suggestive of the difference between a text reception mode and a
text production mode, respectively. Switching to the more basic mode hides away the
phonetic transcription, collocations (with examples), grammar labels and some of the
examples.
However, synonym links are still included, even though, arguably, a synonym list is not
very useful for text reception. Only a minority of dictionary users will be aware that the
dictionary has a third, even simpler mode, available via the so-called interstitial page,
accessible from collaborating news sites37 by double-clicking on any word in the text (luckily,
the engine includes lemmatization, so the word-form stealing takes the user to the lemma
STEAL). In this mode, all examples and synonyms are now absent, as one would expect in true
reception mode.
User profiling is one of the highlights in the new Louvain EAP dictionary (see also 2.2.2
above), now in development, where the content presented depends on the user-selected native
language and discipline (field domain) of interest.
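The difference between letting users pick individual data fields and offering pre-packaged, function-driven modes can be sketched as follows. The field names, the two profiles and the placeholder values are assumptions made for this illustration and do not describe the data model of any of the dictionaries mentioned.

```python
# Hypothetical entry record: field names are assumptions, values are placeholders.
ENTRY = {
    "definition": "<definition>",
    "pronunciation": "<transcription>",
    "grammar": "<grammar labels>",
    "collocations": "<collocations>",
    "examples": "<examples>",
    "synonyms": "<synonym links>",
}

# Pre-packaged profiles: the fields shown in each consultation situation.
PROFILES = {
    "reception": ["definition"],
    "production": ["definition", "pronunciation", "grammar",
                   "collocations", "examples", "synonyms"],
}

def present(entry, profile, extra=()):
    """Select the fields for `profile`, plus any the user explicitly adds."""
    wanted = list(PROFILES[profile])
    wanted += [f for f in extra if f not in wanted]
    return {f: entry[f] for f in wanted if f in entry}

print(list(present(ENTRY, "reception")))
print(list(present(ENTRY, "reception", extra=["examples"])))
```

The `extra` argument corresponds to explicit field selection layered on top of a profile, so that a naive user can rely on the default while an expert can still override it.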
3.4 Multimedia in online dictionaries
Online dictionaries can potentially include a range of multimedia content. The potential is
utilized in online dictionaries of English to varying degrees.
3.4.1 Graphics
Graphical elements are not the sole domain of electronic dictionaries, as drawings, and (to a
lesser extent) photographs, diagrams and tables have been used for a long time in paper
dictionaries. However, pictorials are more easily and cheaply included in electronic
dictionaries (Lew 2010). For example, illustrations are present in some entries in Cambridge
Dictionaries Online or the free online version of Longman Dictionary of Contemporary
English.
Thanks to the linkability of the web, it is quite possible to embed media from other
providers. However, one has to reckon with the ramifications of limited control over
hyperlinked content. For example, between (roughly) November 2009 and June 2010, the
Google Dictionary used to display popular images from Google’s own image search service
next to some entries. As a consequence, the Google Dictionary entry for KILT included a
photograph which, likely without conscious intent, conveyed all too clearly the cultural
information that kilts need no accompanying underwear (in the interest of propriety, no
screenshot is included in this article). As of this writing, the Google Dictionary has
discontinued the inclusion of images.
3.4.2 Audio
It is becoming increasingly popular for online dictionaries of English to offer audio recordings
of entry words. However, recordings of other verbal elements (definition, examples) are rarely
included: of the dictionaries discussed in this article, it is only the subscription version of
LDOCE which offers spoken recordings of all example sentences. One novel use of audio is
to present characteristic sounds associated with the entry word: an interesting subgenre of
ostensive defining. Proposals to include such elements in electronic dictionaries have been
made by Dodd (1989: 91) and Ooi (1998: 112). Dodd called them sound effects, and such
recordings are now available in the free Macmillan English Dictionary Online. There, the user
can hear the sounds produced by musical instruments under their relevant headwords, both
popular ones (GUITAR, PIANO, VIOLIN, RECORDER) and less well-known ones (SITAR).
Animal noises and bird calls are likewise included (ROAR, HOOT: perhaps also worth linking
under the entries LION and OWL), as well as sounds made by humans (CLAP, LAUGH, HICCUP),
and noisy machines (TRAIN, HELICOPTER).
3.4.3 Video and animation
With the speed of the internet steadily on the increase, video content is becoming mainstream
on the web. However, English online dictionaries have not really embraced the video
technology so far. This caution may, in fact, be well-motivated: Chun and Plass (1996) point
out that video sequences are too transient to allow the spectator to build a stable mental
model. Thus, videos may not make good cognitive sense, because the viewer may be unable
to pace the information processing at the rate that works for them.
Similar reservations can be raised for animated graphics, and there is at least one empirical
study which appears to substantiate the pessimistic view of the effectiveness of animations, at
least for dictionary-induced vocabulary learning. Lew and Doroszewska’s recent study (2009)
found a strong and significant negative impact of viewing animations on vocabulary retention.
3.5 Dictionaries, corpora and lexical databases
We have repeatedly seen above that online dictionaries use WordNet data. In fact, WordNet is
often loosely referred to as a “dictionary”, even though, in more careful usage, it is a lexical
database rather than a dictionary. I suspect that for the average user, the distinction is too fine
a point. Yet, if we look at the recent history of dictionary-making, we see the growing role of
information technology and structured data: corpora, databases, the use of structured markup
such as XML. The current trend then is towards a clearer separation of the data layer from
presentation, in line with Sue Atkins’ visionary proposal (1996). Increasingly, the dictionary
as the user sees it is likely to be but an epiphenomenon on a structured lexical database or
corpus, and the presentation layer is set to become an automated procedure, requiring little or
no human intervention (De Schryver 2009; Atkins et al. 2010; Kilgarriff and Rychlý 2010)
(also see Almind and Nielsen, this volume, Gouws, this volume?).
Indeed, as corpus interfaces and wrappers get increasingly sophisticated, they can be used
in some ways similar to dictionaries, so that even a more cultured user may not care what’s
“under the hood” as long as the interface can be used as a sort-of dictionary. As an example,
consider the fully automatic collocations dictionary ForBetterEnglish.com,38 which uses the
SketchEngine and GDEX technologies (Kilgarriff et al. 2008) on server-resident corpora to
automatically produce entries such as the one in Figure 2. Clearly, it takes quite an expert to
tell that this is not your usual human-made dictionary entry. The illusion would have been
even better if the type-of-collocation indicators (object_of, etc.) had been given less technical
and more user-friendly names.
Figure 2: Entry for TOOTH in the ForBetterEnglish.com automated collocations dictionary
Another corpus-based online resource, also having to do with English collocations,
JustTheWord,39 is even capable of correcting unnatural word combinations. Figure 3 shows
the output for the query POWERFUL TEA with the “find alternatives” option selected. The
interface indicates whether the word combination is “good” (green bar on the right, colours
not shown in print), or “bad” (red bar), and the length of the bar indicates the (un)typicality of
the word combination. Further, the narrow blue bar directly underneath each combination
indicates the degree of meaning similarity between the combination to be replaced and each
candidate for replacement. Here, the collocation strong tea has the longest blue bar, and
indeed this is the idiomatic phrase that a learner of English would have wanted to use instead
of the non-idiomatic powerful tea, had they known any better themselves. All in all, the
information provided is very useful and relevant, and it may actually be hard to believe that
this output has been computed fully automatically.
Figure 3: JustTheWord alternative collocation suggestions for ‘powerful tea’
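To give a rough idea of how such suggestions might be computed automatically, the sketch below ranks a word combination against alternatives obtained by swapping in similar words, scoring each with pointwise mutual information. This is not a description of JustTheWord's actual algorithm, which is not documented here: the counts are invented, the similarity list is hand-made, and PMI is simply one familiar association measure chosen for illustration.

```python
import math

# Invented toy counts, for illustration only (not from any real corpus).
PAIR_COUNTS = {
    ("strong", "tea"): 120,
    ("powerful", "tea"): 1,
    ("powerful", "engine"): 90,
    ("strong", "engine"): 10,
}
WORD_COUNTS = {"strong": 5000, "powerful": 3000, "tea": 2000, "engine": 1500}
TOTAL = 1_000_000  # assumed number of word pairs in the toy corpus

# Hand-made similarity list standing in for a thesaurus lookup.
SIMILAR = {"powerful": ["strong"], "strong": ["powerful"]}

def pmi(w1, w2):
    """Pointwise mutual information; -inf if the pair is unattested."""
    joint = PAIR_COUNTS.get((w1, w2), 0)
    if joint == 0:
        return float("-inf")
    return math.log2(joint * TOTAL / (WORD_COUNTS[w1] * WORD_COUNTS[w2]))

def alternatives(w1, w2):
    """Rank the query and its substitutes, most typical combination first."""
    candidates = [(w1, w2)] + [(s, w2) for s in SIMILAR.get(w1, [])]
    return sorted(candidates, key=lambda pair: pmi(*pair), reverse=True)

print(alternatives("powerful", "tea"))
print(alternatives("strong", "engine"))
```

With these toy figures, strong tea outranks powerful tea and powerful engine outranks strong engine, which mirrors the kind of contrast the interface displays as green and red bars.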
There exist other “smart” interfaces to corpora. One of them is http://corpus.byu.edu,
created and maintained by Mark Davies, and it offers free access to several corpora, including
the Corpus of Contemporary American English (COCA),40 currently the largest publicly
available corpus of English. Another one is the SketchEngine,41 available by subscription. A
subset of the British Academic Spoken English corpus is available through IBM’s Many
Eyes,42 a clever visualizing interface allowing the user to investigate the syntagmatic
relationships of the most common words, though it is not all that useful for the less common
combinations, due to small corpus size. A rich and comprehensive lexical database of English
with a dictionary-like interface will very soon become publicly available online as part of the
DANTE43 project.
These resources represent a high level of sophistication and so there is not much hope that
their popularity will extend much beyond a relatively small group of power users; the others
will just increasingly Google for any answers, irrespective of the nature of the problem, and I
fear that this tendency presents a real threat to more specialized reference tools, including
dictionaries.
4 Summary and conclusion
In our necessarily sketchy overview of English online dictionaries, we have seen that a great
variety of dictionaries exist, but without proper guidance users run the risk of getting lost in
the riches. It is surprising to see so many of the online dictionaries (including some from
respectable publishers) still largely constrained by the paper model, with access mechanisms
to lexicographic data often being substandard for today’s technology. Furthermore, users may
get flooded with irrelevant and highly repetitive information, especially by dictionary
aggregators. And even if hyperlinking to external sources embodies the best practice in
hypertext philosophy, it is not without danger, as it relinquishes much of the control over the
content of “our” dictionary page. More generally, the universal use of search engines (or one
dominant search engine) presents a risk of dictionaries (or any specialized online works of
reference) being marginalized. Finally, learners of English are still waiting for a function-
driven lexical resource of the type represented by the excellent Base lexicale du français44
(Verlinde et al. 2010 and this volume).
Notes
1 http://dictionary.cambridge.org
2 http://oxforddictionaries.com
3 http://www.collinslanguage.com
4 http://www.chambersharrap.co.uk/chambers/features/chref/chref.py/main
5 http://encarta.msn.com/encnet/features/dictionary/dictionaryhome.aspx
6 http://www.internetworldstats.com
7 http://www.oup.com/elt/catalogue/teachersites/oald7/lookup?oup_jspFileName=document.jsp&cc=pl
8 http://www.ldoceonline.com
9 http://ldoce.longmandictionariesonline.com/dict/SearchEntry.html
10 http://dictionary.cambridge.org
11 http://www.macmillandictionary.com
12 http://www.mycobuild.com
13 http://dictionary.reverso.net/english-cobuild
14 http://www.learnersdictionary.com
15 http://nhd.heinle.com/home.aspx
16 http://dictionary.cambridge.org/Default.asp?dict=A
17 http://www.urbandictionary.com
18 http://en.wiktionary.org
19 http://www.wordnik.com
20 http://www3.merriam-webster.com/opendictionary/
21 http://www.macmillandictionary.com/open-dictionary/latestEntries.htm
22 http://www.macmillandictionaryblog.com, winner of the 2009 Edublog award for best education blog on the
web
23 http://quod.lib.umich.edu/m/med
24 http://www.merckmedicus.com/pp/us/hcp/thcp_dorlands_content_split.jsp?pg=/ppdocs/us/common/dorlands/
drlnd/misc/dmd-a-b-000.htm
25 http://www.acronymfinder.com
26 http://www.etymonline.com
27 http://www.howjsay.com, the domain name being an eye-dialect rendition of the casual pronunciation of the
phrase ‘how do you say?’
28 http://www.cmu.edu
29 http://seas3.elte.hu/epd.html
30 http://thesaurus.com/?regHome=true
31 http://www.rhymezone.com
32 http://wordnetweb.princeton.edu
33 http://www.visuwords.com
34 http://www.visualthesaurus.com
35 http://dictionary.reference.com
36 http://www.myCobuild.com
37 One example is http://www.shanghaidaily.com
38 http://forbetterenglish.com
39 http://193.133.140.102/justTheWord, Sharp Laboratories
40 http://www.americancorpus.org
41 http://www.sketchengine.co.uk
42 http://manyeyes.alphaworks.ibm.com/manyeyes/visualizations/3e335458358611de909d000255111976
43 http://www.webdante.com
44 http://ilt.kuleuven.be/blf
References
Atkins, Beryl T. Sue. 1996. ‘Bilingual Dictionaries - Past, Present and Future’ in Gellerstam,
Martin, Jerker Jarborg, Sven-Göran Malmgren, Kerstin Noren, Lena Rogström and
Catarina Röjder Papmehl (eds.), EURALEX '96 Proceedings. Göteborg: Department of
Swedish, Göteborg University, 515-546.
Atkins, Beryl T. Sue, Adam Kilgarriff and Michael Rundell. 2010. ‘Database of Analysed
Texts of English (Dante): The Neid Database Project’ in Dykstra, Anne and Tanneke
Schoonheim (eds.), Proceedings of the XIV Euralex International Congress. Ljouwert:
Afûk, 549-556.
Béjoint, Henri. 2010. The Lexicography of English. From Origins to Present. Oxford:
Oxford University Press.
Bogaards, Paul. 2010. ‘The Evolution of Learners' Dictionaries and Merriam-Webster's
Advanced Learner's English Dictionary’ in Kernerman, Ilan and Paul Bogaards (eds.),
English Learners' Dictionaries at the DSNA 2009. Tel Aviv: K Dictionaries, 11-27.
Carr, Michael. 1997. ‘Internet Dictionaries and Lexicography.’ International Journal of
Lexicography 10.3: 209-230.
Chun, Dorothy M. and Jan L. Plass. 1996. ‘Effects of Multimedia Annotations on
Vocabulary Acquisition.’ Modern Language Journal 80.2: 183-198.
Cowie, Anthony Paul. 1999. English Dictionaries for Foreign Learners: A History. Oxford:
Clarendon Press.
De Schryver, Gilles-Maurice. 2009. ‘State-of-the-Art Software to Support Intelligent
Lexicography’ in Zhu, R. (ed.), Proceedings of the International Seminar on Kangxi
Dictionary & Lexicology. Beijing: Beijing Normal University, 565–580.
Dodd, W. Steven. 1989. ‘Lexicomputing and the Dictionary of the Future’ in James, Gregory
(ed.), Lexicographers and Their Works. Exeter Linguistic Studies 14. Exeter: Exeter
University Press, 83-93.
Fuertes-Olivera, Pedro A. 2009. ‘The Function Theory of Lexicography and Electronic
Dictionaries: Wiktionary as a Prototype of Collective Free Multiple-Language
Internet Dictionary’ in Bergenholtz, Henning, Sandro Nielsen and Sven Tarp (eds.),
Lexicography at a Crossroads: Dictionaries and Encyclopedias Today,
Lexicographical Tools Tomorrow. Linguistic Insights - Studies in Language and
Communication, Vol.90. Bern: Peter Lang, 99-134.
Hanks, Patrick. 2009. ‘Review of Stephen J. Perrault (Ed.). 2008. Merriam-Webster's
Advanced Learner's English Dictionary.’ International Journal of Lexicography 22.3:
301-315.
Hulstijn, Jan H. and Beryl T. Sue Atkins. 1998. ‘Empirical Research on Dictionary Use in
Foreign-Language Learning: Survey and Discussion’ in Atkins, Beryl T. Sue (ed.),
Using Dictionaries. Studies of Dictionary Use by Language Learners and Translators.
Lexicographica Series Maior 88. Tübingen: Niemeyer, 7-19.
Kilgarriff, Adam, Milos Husak, Katy McAdam, Michael Rundell and Pavel Rychlý.
2008. ‘GDEX: Automatically Finding Good Dictionary Examples in a Corpus’ in
Bernal, Elisenda and Janet DeCesaris (eds.), Proceedings of the XIII EURALEX
International Congress. Barcelona: Universitat Pompeu Fabra, 425-432.
Kilgarriff, Adam and Pavel Rychlý. 2010. ‘Semi-Automatic Dictionary Drafting’ in De
Schryver, Gilles-Maurice (ed.), A Way with Words: Recent Advances in Lexical
Theory and Analysis. A Festschrift for Patrick Hanks. Kampala: Menha Publishers,
299-312.
Lew, Robert. 2010. ‘Multimodal Lexicography: The Representation of Meaning in Electronic
Dictionaries.’ Lexikos 20.
Ooi, Vincent Beng Yeow. 1998. Computer Corpus Lexicography. Edinburgh: Edinburgh
University Press.
Rundell, Michael. 2006. ‘More Than One Way to Skin a Cat: Why Full-Sentence Definitions
Have Not Been Universally Adopted’ in Corino, Elisa, Carla Marello and Cristina
Onesti (eds.), Atti Del XII Congresso Di Lessicografia, Torino, 6-9 Settembre 2006.
Alessandria: Edizioni dell'Orso, 323-337.
Sobkowiak, Włodzimierz. 2009. ‘Review of Wells, John C., Longman Pronunciation
Dictionary (3rd Edition).’ International Journal of Lexicography 22.2: 191-209.
Tarp, Sven. 2008. Lexicography in the Borderland between Knowledge and Non-Knowledge:
General Lexicographical Theory with Particular Focus on Learner’s Lexicography.
(Lexicographica Series Maior 134.). Tübingen: Max Niemeyer Verlag.
Tono, Yukio. 2011. ‘Application of Eye-Tracking in EFL Learners’ Dictionary Look-up
Process Research.’ International Journal of Lexicography 23.1.
Verlinde, Serge, Patrick Leroyer and Jean Binon. 2010. ‘Search and You Will Find. From
Stand-Alone Lexicographic Tools to User Driven Task and Problem-Oriented
Multifunctional Leximats.’ International Journal of Lexicography 23.1: 1-17.
Wiegand, Herbert Ernst, Michael Beißwenger, Rufus H. Gouws, Matthias Kammerer,
Angelika Storrer and Werner Wolski. 2010. Wörterbuch Zur Lexikographie und
Wörterbuchforschung. Dictionary of Lexicography and Dictionary Research. Vol. 1
(A-C). Berlin: Walter de Gruyter.
Yamada, Shigeru. 2010. ‘EFL Dictionary Evolution: Innovations and Drawbacks’ in
Kernerman, Ilan and Paul Bogaards (eds.), English Learners' Dictionaries at the
DSNA 2009. Tel Aviv: K Dictionaries, 147-168.
... Sözlükbilim dünyasında e-sözlüklerin basılı sözlüklere üstünlükleri hakkında çok şey söylenir (bkz. de Schryver, 2003;Bergenholtz & Gouws, 2007;Lew, 2011Lew, , 2012Dziemianko, 2016;Sutter, 2017;Atli, 2021a). Bu konuya ilişkin söylenenlere şöyle bir bakıldığında, aşağıdaki üstünlüklerin özellikle vurgulandığı görülür: 1) Basılı sözlüklere eklen(e)meyen modern multimedya araçlarının yalnızca e-sözlüklere özgü olması, ...
... İşaret dili sözlükleri için video kayıtları neredeyse vazgeçilmezdir. Ne var ki, yabancı dildeki seslerin ana dilin süzgecinden geçirildiği ve ana dilde var olmayan ses ayrımlarının tespit edilemediği unutulur (Lew, 2011). E-sözlükler, maliyetleri büyük ölçüde düşürür. ...
Article
Full-text available
The use of computers in practical lexicography in the 1960s introduced a fundamental change in traditional lexicography. Especially in the 1990s, the accumulating of electronic language data and the increasing use of low-cost high-speed broadband internet at the beginning of the 21st century is leading to a rapid growth of e- lexicography studies. Along with these developments, many printed dictionaries became available as e-dictionaries within a short time. The advantage of electronically published dictionaries over printed dictionaries catches the attention of many lexicographers immediately. But nobody noticed the problems caused by hybridizations. Likewise, issues such as corpus coherence, data reliability, access path, quality, and utility attract few researchers' attention. Nevertheless, modern developments in e- lexicography have led to an unbalanced growth in expectations of the discipline, as well as a radical change in its functional field. In this study, the definition of the term computer lexicography is emphasized in addition to the historical advances of e-lexicography. And the phases of e-dictionary making were also discussed. In addition, an attempt was made to determine whether there are problems with today's e-dictionaries in terms of hybridization, corpus, data reliability, access path, personalization and quality. It is known that electronic devices require speech technology software to analyse linguistic and lexicographical data, and language technology applications require lexical data to be readable/understandable. Language needs to be standardized, because e-devices are more sensitive than humans to detect mistakes and errors. Therefore, this study makes some suggestions to identify and get rid of problems with e-dictionaries.
... This is why we take advantage of the possibilities offered by the electronic medium, heeding the plea made by many researchers for the development of e-dictionaries in which to custom display data depending on the user profile and choices (cf. Lew 2011 andTrap-Jensen 2010). ...
Article
Full-text available
This article is intended as the first of a series of papers designing an electronic linguistic resource made up of three modules: (1) a phrase-based active dictionary thought of as a first attempt to implement John Sinclair’s vision of the “ultimate dictionary”; (2) a grammar / construction describing not only the morphologic and syntactic rules of a language but also its systematic (semantic) alternations and the derivations generating new meaningful constructions from old ones; (3) a phrase thesaurus / phraseological conceptual ontology taking WordNet into the modern “age of phraseology”. After introducing the theoretical framework of our project, we present the microstructure of our Phrase-based Active Dictionary (PAD) model and describe it in a general theory of lexicography.
... The data was collected by the CED project, constructed in a crowdsourcing mode (Kistner, 2013;Lew, 2011;Nagle, 2014;Qin, 2015;Qin & Gao, 2020) Table 3. ...
Article
Full-text available
This study aims to cast light on the nature and features of word formation in the lexis of Chinese English and provide a synchronic formational analysis of Chinese English neologisms by adopting a sequential data‐driven approach with a combination of qualitative and quantitative methods. The study has 1) constructed a hierarchical and quantificational four‐level structure for Chinese English word formation through meticulous coding, categorization, and calculation of 3522 headwords collected in the Chinese English Dictionary; and 2) revealed departures of the present taxonomy with extant models of indigenized varieties in both qualitative (coherence, inclusiveness, refinedness, adaptability) and quantitative measures (the changing status of formation processes, source languages, and transliteration systems). The study sheds significant insights into the motivation of Chinese English words, the status of the Chinese English variety, the sociolinguistic conditions of the variety, and word formation in the Expanding Circle where a dictionary of the variety is available.
Article
This paper aims to foster debate about the language of racist hate speech in online English lexicography. For this purpose, it presents a study on the treatment of ethnophaulisms, or ethnic slurs, in Google’s English dictionary, “powered by Oxford Languages”. The focus is on the perspective of the general Internet user, in light of the connection between two facets of this digital age. The first is the strong and growing tendency among Internet users to ‘google’ their language issues. The second is the alarming increase in cases of hate speech online, most of which are based on ethnicity and nationality, according to reports by the United Nations. Consequently, the free and pervasive content of Google’s English dictionary represents a case in point to investigate whether and how online users are warned against the power of these hate words. A selected sample of 285 English ethnic slurs was looked up in the dictionary and, where recorded, their entries were scrutinised to identify lexicographic data regarding their semantic relevance and offensiveness. Findings show that the majority are included and that they mostly present ethnicity-related senses, but less than half of the total are treated as ethnophaulisms. In this respect, the major dictionary markers indicating offensiveness are effect labels, predominantly alone or combined with definitions. Relative to their size, ethnophaulisms in Google’s English dictionary are clearly described as offensive or derogatory expressions, thus making online users aware of their hurtful nature.
Article
This paper is the first to describe the application of electronic lexicography methods to teaching a foreign language for professional purposes and to developing communicative competence in future teachers of the secondary vocational education system. It gives a detailed description of the procedure for creating a learning dictionary using the Dictionary Specification Language (DSL), a markup language. The compilation of a bilingual electronic dictionary is considered both as an object of study and as a teaching tool. The study included an analysis of regulatory and legal documentation and of educational and methodological literature, and it presents an analytical review of existing bilingual electronic dictionaries. The results have been introduced into the training of secondary vocational education teachers, and the ideas and practical recommendations set out in the paper can be used in classes on foreign language for professional purposes.
Article
The search for the optimal and most comprehensible ways of explaining the meaning of lexical units to the dictionary user has long been a key problem of lexicography and metalexicography, and it remains one today. The introduction of computer technology has radically changed the concept of the dictionary, solved many problems of the lexicographic genre, and expanded the capabilities of the dictionary user. At the same time, the abundance of lexicographic works, including electronic ones and dictionary platforms that act as aggregators of several online dictionaries, together with an overabundance of reference information, disorients the inexperienced user. The objective of this article is therefore a typology of existing online dictionaries and dictionary information platforms, based on German-language material, according to their content and general purpose. The most accurate multi-aspect interpretation of a word takes various features and connotations into account. Nevertheless, the reference nature of a dictionary requires a concise description of the material, free of formalized metalanguage, and the ability to find the necessary explanation quickly. With this in mind, the article analyses the application of the principles of multimodality to the problems of meaning representation in electronic dictionaries. The author also underlines the significance of multimodal meaning representation for the effective interpretation of foreign vocabulary in all its nuances and, therefore, for effective foreign-language vocabulary acquisition and for overcoming translation difficulties.
Article
Lexicography faces a pivotal transition in the 21st century. In a context of disruptive innovation, publishing companies confront a big challenge due to a plethora of open-access online dictionaries, mature free platforms such as WordReference, Reverso, Linguee or Wiktionary, and ever-progressing automatic translators. In order to analyse the critical moment that lexicography is going through, this paper deals with the technological impact and the advantages and disadvantages of paper and electronic dictionaries. Moreover, transversal issues such as the quality of dictionaries and the satisfaction of users’ needs are tackled. Our purpose is to provide a picture of present-day metalexicography and to indicate the challenges and obstacles that lexicography faces in the digital era.
Article
Twenty-first-century lexicography faces a decisive transition. In a context of technological disruption, the proliferation of open-access online dictionaries, the consolidation of free platforms such as WordReference, Reverso, Linguee or Wiktionary, and the advance of automatic translators pose a major challenge to publishers. To analyse the critical moment that lexicography is going through, this article addresses the impact of technology, comparing the advantages and disadvantages of paper dictionaries with those of electronic dictionaries. It also discusses cross-cutting issues such as the quality of dictionaries and the satisfaction of users' needs. The aim is to offer a map of the current state of metalexicography and to point out the challenges and obstacles that lexicography faces in the digital era.
Article
Dictionaries are tools of extraordinary importance in foreign language learning, both because of their long tradition in education and because of their markedly pedagogical character. Despite this, their use is limited and, in most cases, confined to the bilingual dictionary, even though monolingual works designed for this purpose exist: learners' dictionaries. In order to deepen our knowledge of how foreign language students perceive and use the different types of dictionaries, this article presents the results obtained from the responses to a questionnaire distributed to 41 students in the third year of Educación Secundaria Obligatoria (E.S.O., compulsory secondary education) at the C. D. P Madre del Divino Pastor in Andújar (Jaén). The aim is to determine the status of dictionaries at this educational level and to check whether regular use is made of works that can contribute to a drastic improvement in grades in this subject.
Article
This paper aims to relate two linguistic phenomena: neology (along with sources for its study) and collaborative lexicography. A pair of case studies is presented concerning two thematically defined groups of recent Czech neologisms: those abusing the Czech ex-president V. Havel’s name and those reflecting the Covid-19 pandemic. An initial dataset was provided by the user-generated content web dictionary of non-standard Czech Čeština 2.0 and the Neomat neology database, fostered by professional linguists. The objective data from a monitor corpus of Czech is used in contrast with the initial dataset and thereby leads to some open questions, especially with regard to the extent to which the two branches of lexicography, amateur and professional, can inspire and enrich each other.
Conference Paper
This paper presents a proposal for a revolutionary type of electronic dictionary, one in which the potential is explored to link an automatically derived dynamic user profile to the proffered multimedia lexicographic output. Such adaptive and intelligent dictionaries may use the TshwaneLex dictionary production system at their core, to which a string of artificial intelligence components is added. This proposal is illustrated by means of the description of a project to compile an online Swahili to English dictionary. Swahili is both the most widely spoken African language, and the one sub-Saharan language most commonly taught throughout the world. As a theoretical framework for the development of this new type of electronic dictionary, the ‘fuzzy answer set programming’ framework (Van Nieuwenborgh et al. 2007) is advanced.
Article
Regardless of their name (dictionary, glossary, encyclopaedia, or even ‘leximat’, in the case of a new generation of online, semi-automated lexicographic tools), subject-field, purpose, or medium (paper or cyber), lexicographic reference works should be regarded as functional information tools that are solely designed to cater to the information needs of their users in different usage situations and that consequently help them solve specific communication (reading, writing, translation) or knowledge problems (acquiring new knowledge or verifying existing knowledge, learning a language or a subject field). In this article, we briefly outline the evolution of lexicographic reference works from stand-alone to multifunctional lexicographic tools, and we describe the theoretical principles and innovative functionalities of a new task and problem-oriented lexical database, the Base Lexicale du Français. In line with Tarp (2006), this tool should truly be regarded as a ‘leximat’.
Book
This is the first history of dictionaries of English for foreign learners, from their origins in Japan and East Asia in the 1920s to the computerized compilations of the present. Monolingual dictionaries for foreign speakers were a revolutionary development at their outset, and now represent a coming-together of intellectual, technological and commercial forces almost unequalled in book publishing. As the author shows, the early history of EFL dictionaries was research-driven, arising directly from research in linguistic theory and language pedagogy; now it is user-driven, determined by what users require or are thought to require. The pioneering dictionaries were the work of individuals. Current dictionaries are the products of huge databases manipulated by sophisticated processing, as publishers strive to share an immense and constantly growing global market. The book has both a thematic and a chronological structure. Three chapters describe the historical sequence over a period of some sixty years. These alternate with chapters dealing with phraseology, computers and corpus linguistics, and research into dictionary users and uses - three subjects central to the development of ELT dictionaries over the last thirty years. Dr Cowie examines the way in which availability of massive computing power has transformed the recording and analysis of current speech, and shows how the growth of research into the users and uses of dictionaries has led to developments both in ELT lexicography and method. This readable and non-technical account is directed both at professionals in applied linguistics and English language teaching, and at lexicographers, but it will interest and fascinate everyone concerned with the analysis of English and faced with the challenge of recording the subtleties of its grammar and meaning.
Article
The present study aims to apply eye-tracking technologies to analyse the process of dictionary look-up by learners of English as a foreign language. An experiment was conducted to examine detailed processes of look-up in the microstructure. Several variables (the availability of supporting devices such as signposts or menus, different types of grammar codes, positions of target definitions) were carefully controlled to see how look-up behaviour would change in both monolingual and bilingual dictionary interfaces. The findings show that look-up processes within a microstructure are very complex, showing interactive effects among positions of target information within the microstructure, functions of supporting devices, and users’ proficiency levels. Pedagogical and methodological implications will be discussed.
Article
At the last Euralex Congress, John Sinclair reiterated the case for full-sentence definitions (FSDs), and questioned why the COBUILD approach to defining had not been generally adopted by other dictionary publishers. This paper answers his question. The theoretical case for FSDs is reviewed (and in general not challenged), and it is shown how the full-sentence model often results in definitions that are more effective and more readable than could be achieved using traditional styles. But the FSD is not always the most appropriate strategy: the approach has several disadvantages, and a rigid adherence to this style does not always serve the best interests of dictionary users (especially language learners). Rather, it will be argued, the goals that gave rise to the FSD may often be achieved through other means. The paper concludes with proposals for a range of defining strategies (including FSDs), along with suggestions as to when each is likely to be most effective.
Article
Cyberlexicography is definable as “employing the Internet to compile or create a dictionary.” Modern lexicographers can use the Net in various ways: participating in electronic conferences, consulting dictionaries and encyclopedias, or searching word usages within the ultimate corpus. An Appendix contains Figures 1–8. Submitted October 1996.
Article
Research on second language (L2) vocabulary acquisition has revealed that words associated with actual objects or imagery techniques are learned more easily than those without. With multimedia applications, it is possible to provide, in addition to traditional definitions of words, different types of information, such as pictures and videos. Thus, one of the fundamental research questions posed in the use of multimedia systems is: How effective are annotations with different media types for vocabulary acquisition? This article discusses the results of three studies done with 160 university German students using CyberBuch, a hypermedia application for reading German texts that contains a variety of annotations for words in the form of text, pictures, and video. The issues examined are related to (a) how well vocabulary is learned incidentally when the goal is reading comprehension, (b) the effectiveness of different types of annotations for vocabulary acquisition, and (c) the relationship between look‐up behavior and performance on vocabulary tests. The results showed a higher rate of incidental learning than expected (25% accuracy on production tests, 77% on recognition tests), significantly higher scores for words that were annotated with pictures + text than for those with video + text or text only, and a correlation between looking up a certain annotation type and using this type as the retrieval cue for remembering words.