
Plan-based vs. template-based NLG: a false opposition?
Kees van Deemter, Emiel Krahmer & Mariët Theune
ITRI, Brighton and IPO, Eindhoven
January 11, 2001
Abstract
This paper uses the algorithm employed in a number of recent template-based NLG systems to challenge the widespread assumption that template-based methods are inherently less well-founded than plan-based methods.
Keywords: NLG paradigms, D2S, templates for NLG, plan-based NLG
1 Introduction: a caricature
Natural Language Generation (NLG) systems are sometimes partitioned into two mutually exclusive, jointly exhaustive classes [1,11,13]: (A) theoretically well-founded systems, which embody generic linguistic insights and are, as a result, easily maintainable (sometimes the term ‘(full-blown) NLG’ has been narrowed down to denote this class only); and (B) application-dependent systems, which lack a proper theoretical foundation. Systems of the latter kind may be relatively easy to deploy but they are difficult to maintain. The following equalities tend to be stated or suggested: A = plan-based NLG systems; B = template-based NLG systems.¹ We will argue against these two identifications. We start out by sketching a class of systems that are template-based, while at the same time being as theoretically well-founded as any existing plan-based system.
¹ This identification may have originated when the term ‘template’ approach was used (‘for lack of a better name’) to refer to ‘programs that simply manipulate character strings, in a way that uses little, if any, linguistic knowledge’ [11]. In the present paper, ‘template-based’ will be taken to mean “making extensive use of a mapping between semantic structures and representations of linguistic surface structure that contain gaps”.
2 NLG with syntactically structured templates
In this section a brief description of a data-to-speech method called D2S is given. D2S is the foundation of a number of language generation applications for different domains (Mozart compositions, soccer reports, route descriptions, train information) and languages (Dutch, English, German). As a running example we use the GoalGetter system, which generates Dutch soccer reports. (See http://iris19.ipo.tue.nl:9000/english.html for an on-line demonstration.) D2S consists of two modules: (1) a language generation module (LGM), which converts a typed data structure into enriched text, i.e., a text annotated with information about the placement of accents and boundaries, and (2) a speech generation module (SGM), which turns the enriched text into a speech signal. Here we focus on the LGM and in particular on its use of syntactically structured templates, an example of which is given in Figure 1.
  S = [CP [NP time↓]
          [C' [C [V liet]]
              [IP [NP player↓]
                  [VP [NP [DET playergen↓]
                          [N' [ADJ ordinal↓] [N doelpunt]]]
                      [V aantekenen]]]]]
  E = time      ← ExpressTime (currentgoal.time)
      player    ← ExpressObject (currentgoal.player, P, nom)
      playergen ← ExpressObject (currentgoal.player, P, gen)
      ordinal   ← ExpressOrdinal (ordinalnumber)
  C = Known (currentmatch.result) ∧ currentgoal = First (notknown, goallist)
      ∧ currentgoal.type ≠ owngoal
  T = goalscoring
Figure 1: Sample syntactic template from the GoalGetter system. The collocation liet een doelpunt aantekenen (literally ‘let a goal be noted’) means ‘put a goal on the scoresheet’.
Formally, a syntactic template σ = ⟨S, E, C, T⟩, where S is a syntactic tree (typically for a sentence) with open slots in it, E is a set of links to additional syntactic structures (typically NPs and PPs) which may be substituted in the gaps of S, C is a condition on the applicability of σ, and T is a set of topics. We discuss the four components in more detail, beginning with the syntactic tree, S. All interior nodes of the tree are labeled by non-terminal symbols, while the nodes on the frontier are labeled by terminal or non-terminal symbols, where the non-terminal nodes are the gaps which are open for substitution and are marked by a ↓. Many templates contain only one (group of) lexical node(s), which may be thought of as the head of the construction, while the gaps are to be filled by its arguments. An example is the template in Figure 1, whose head is the collocation een doelpunt laten aantekenen (put a goal on the scoresheet).
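The four components described above (tree, slot fillers, condition, topics) can be pictured as a small data structure. The following is only a minimal sketch in Python, not the D2S implementation: the names Node, SyntacticTemplate and gaps are hypothetical, and the bracketing of the Figure 1 tree is one plausible reading of the figure.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set, Union


@dataclass
class Node:
    """A tree node; a non-empty `gap` marks a frontier slot open for substitution."""
    label: str
    children: List[Union["Node", str]] = field(default_factory=list)
    gap: str = ""


@dataclass
class SyntacticTemplate:
    """Hypothetical container for the four components of a template."""
    tree: Node                                   # syntactic tree with open slots
    fillers: Dict[str, Callable[[dict], list]]   # slot name -> Express-style function
    condition: Callable[[dict], bool]            # applicability condition
    topics: Set[str]                             # topics the template belongs to


def gaps(node: Node) -> List[str]:
    """Return the names of all open slots, left to right."""
    if node.gap:
        return [node.gap]
    found: List[str] = []
    for child in node.children:
        if isinstance(child, Node):
            found.extend(gaps(child))
    return found


# Assumed bracketing of the tree in Figure 1 (head: liet ... doelpunt aantekenen).
figure1_tree = Node("CP", [
    Node("NP", gap="time"),
    Node("C'", [
        Node("C", [Node("V", ["liet"])]),
        Node("IP", [
            Node("NP", gap="player"),
            Node("VP", [
                Node("NP", [
                    Node("DET", gap="playergen"),
                    Node("N'", [Node("ADJ", gap="ordinal"),
                                Node("N", ["doelpunt"])]),
                ]),
                Node("V", ["aantekenen"]),
            ]),
        ]),
    ]),
])

print(gaps(figure1_tree))
```

Representing gaps as named frontier nodes keeps the lexical head (liet, doelpunt, aantekenen) inside the tree while leaving its arguments open, which is the sense in which such a template is ‘minimal’.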
The second element of a syntactic template is E: the slot fillers. Each open slot in the tree S is associated with a call of some Express function, which generates the set of possible slot fillers. This process is handled by the function ApplyTemplate, shown in Figure 2. ApplyTemplate first calls FillSlots to obtain the set of all possible trees that can be generated from the template, using all possible combinations of slot fillers generated by the Express functions associated with the slots. Figure 2 also shows an example Express function, namely ExpressObject, which generates a set of NP-trees and is used to generate fillers for the player and playergen slots in the template of Figure 1. The first of the two, for example, leads to the generation of NPs such as ‘Atteveld’ (proper name), ‘the defender Atteveld’, ‘Vitesse player Atteveld’, ‘Vitesse’s Atteveld’, etc., depending on the context in which the NP is generated.² Once all the gaps in the template are filled, the set all_trees results. For each tree in this set, it is checked (i) whether it obeys Chomsky’s Binding Theory and (ii) whether it is compatible with the Context Model, which is a record containing all the objects introduced so far and the anaphoric relations among them. From the resulting set of allowed_trees, one is selected randomly and returned to the main generation algorithm.
² For a more sophisticated version of the way in which nominals are generated in context, see [8].
ApplyTemplate(template)
  allowed_trees ← {}
  chosen_tree ← nil
  all_trees ← FillSlots(template)
  for each member t of all_trees do
    if ViolateBindingTheory(t) = false ∧ Wellformed(UpdateContext(t)) = true
    then allowed_trees ← allowed_trees ∪ {t}
  if allowed_trees = nil
  then return false
  else chosen_tree ← PickAny(allowed_trees)
       return chosen_tree

ExpressObject(r, P, case)
  PN, PR, RE ← nil
  trees ← {}
  PN ← MakeProperName(r)
  PR ← MakePronoun(r, case)
  RE ← MakeReferringExp(r, P)
  trees ← PN ∪ PR ∪ RE
  return trees

Figure 2: Functions ApplyTemplate and ExpressObject.
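The generate-and-filter behaviour of ApplyTemplate can be illustrated with a short Python sketch. It rests on several simplifying assumptions: a ‘tree’ is reduced to a dictionary from slot names to filler strings, the Binding Theory and Context Model checks are passed in as arbitrary predicates, and failure is signalled with None rather than false. The function names fill_slots and apply_template are hypothetical.

```python
import itertools
import random


def fill_slots(slot_fillers):
    """Return every combination of one filler per slot (the role of FillSlots)."""
    names = sorted(slot_fillers)
    combos = itertools.product(*(slot_fillers[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]


def apply_template(slot_fillers, violates_binding, wellformed, rng=random):
    """Filter all filled candidates, then pick one of the survivors at random.

    `violates_binding` and `wellformed` are stand-ins for the Binding Theory
    and Context Model checks; here they are plain predicates on a candidate.
    """
    allowed = [
        candidate
        for candidate in fill_slots(slot_fillers)
        if not violates_binding(candidate) and wellformed(candidate)
    ]
    if not allowed:
        return None  # no acceptable tree: the template cannot be applied
    return rng.choice(allowed)


# Toy example: the context does not yet license a pronoun for the player.
fillers = {"player": ["Atteveld", "hij"], "ordinal": ["eerste"]}
result = apply_template(
    fillers,
    violates_binding=lambda candidate: False,
    wellformed=lambda candidate: candidate["player"] != "hij",
)
print(result)
```

Because every combination of fillers is generated before filtering, the number of candidates grows with the product of the filler sets; the sketch makes no attempt to prune early.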
The third ingredient of a syntactic template is C: the Boolean condition. A template σ is applicable if and only if its associated condition is true. Several kinds of conditions can be distinguished including, most notably perhaps, conditions on the knowledge state. An example is a condition saying that one piece of information should not be conveyed to the user before another has been conveyed. In the template of Figure 1, such a condition implies that the template can only be used if the result of the current match has been conveyed to the user (i.e., is known) and the current goal is the first one which has not been conveyed (is not known). Finally, each template σ contains a set of topics T, which the LGM algorithm uses to group sentences together into coherent chunks of text.
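A condition on the knowledge state of the kind just described can be written as an ordinary Boolean function. The sketch below is an illustration only: the dictionary layout of the state, the identifiers g1 and g2, and the function names are all assumptions, and the own-goal test reflects one reading of the condition shown in Figure 1.

```python
def first_not_known(goals, known):
    """Return the first goal that has not yet been conveyed, or None."""
    for goal in goals:
        if goal["id"] not in known:
            return goal
    return None


def goal_template_applicable(state):
    """True if the match result is known and the current goal is next in line."""
    current = state["currentgoal"]
    return (
        "result" in state["known"]                       # result already conveyed
        and current is first_not_known(state["goals"], state["known"])
        and current["type"] != "owngoal"                 # assumed reading of Figure 1
    )


goals = [{"id": "g1", "type": "normal"}, {"id": "g2", "type": "normal"}]
state = {"known": {"result"}, "goals": goals, "currentgoal": goals[0]}
print(goal_template_applicable(state))
```

Keeping the condition separate from the tree means the same syntactic structure could be reused under different conditions, for example in another domain.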
3 The caricature exposed
Taking our inspiration from D2S [4,6,7], we will argue that the caricature from the introduction is precisely this: a caricature. For starters, D2S’ application across domains and languages (cf. Section 2) has revealed a remarkable genericity. Important parts of the system (e.g., the basic generation algorithm and such functions as ApplyTemplate and ExpressObject) turned out to be independent of application domain (Classical Music / Soccer games) and output language (English / Dutch). This is, of course, not true for the templates themselves, many of which have to be written anew for each new domain as well as for each language. Based on these experiences, however, it seems fair to say that D2S is as generic and maintainable as any plan-based system, which will have to adapt its grammar, for example, whenever a new application or a new output language comes along.
But is D2S also well-founded? This depends on what it means for an NLG system to be well-founded. If it means that every decision made by the system (e.g., expressing a proposition in one or in two sentences, using passive or active voice; lexical choice [2]) should be based on sound linguistic principles, then no NLG system we are aware of qualifies as being even remotely well-founded: the gap between raw data and text is bridged in ways that are often arbitrary. Many NLG systems use linguistic principles, but typically such sophistication is reserved for a few aspects of the generated text. D2S is no exception, as may be seen from Section 2. For example, D2S uses well-established rules for constraining the use of anaphors (see e.g., ViolateBindingTheory and Wellformed in ApplyTemplate), and a new variant of Dale and Reiter’s algorithm [3] for the generation of referring expressions that takes contextual salience into account (MakeReferringExp in ExpressObject) [7]. Other choices (most notably, perhaps, the choice of a pool of templates from which the generator can pick a candidate) are made on less principled grounds. The main limiting factor for the deployment of linguistic rules in D2S is not that the method does not allow it, but simply that not enough good linguistic rules are known. In sum: D2S, though it is a template-based system, is as well-founded as any plan-based system.
In fact, we believe that the terminology itself is misleading. Few if any NLG systems are plan-based in the full sense in which this term is used in artificial intelligence: in NLG, there usually is no place for logical inference (e.g., avoiding a certain wording because of some explicitly represented common-sense knowledge) or even backtracking. (Whether or not this limitation reflects a property of human speaking and writing is a different matter.) If, as has become usual in NLG, the notion of planning is stretched to cover, say, Moore and Paris-style NLG [10], then the system described in Section 2 could be described as implementing a distributive, reactive (‘situated’) planner. (See also the Conclusion of this paper.)
It is worth noting that D2S rather resembles an approach to NLG that is sometimes omitted in discussions about practical versus applied systems, namely Tree Adjoining Grammar (TAG) (e.g., [5,9,14]). The trees in D2S are similar to the ‘initial trees’ of TAG. Joshi [5:234] points out that “The initial (...) trees are not constrained in any manner other than as indicated above. The idea, however, is that [they] will be minimal in some sense.” The minimalism constraint is usually interpreted as: the tree should not contain more than the lexical head plus its arguments. The comparison with TAG-based NLG suggests that it is not the choice of a template-based approach that makes an NLG system theoretically ill-founded, but the choice of non-minimal templates / elementary trees in these systems (or the use of canned text in plan-based systems, for that matter). Of course, non-minimal templates / elementary trees are essential for the treatment of any phenomena where compositionality breaks down, such as idioms, special collocations, etc. (cf. the treatment of collocations in [14]). But, generally speaking, the larger the templates / elementary trees, the less systematic the treatment, the less insight it gives into the compositional structure of language, and the larger the number of templates / elementary trees needed. Unlike the earliest D2S-based NLG systems (e.g., [4]), GoalGetter can be argued to use templates that are minimal except where there is a good reason to make them larger [6].
4 Conclusion
We have argued against the caricature presented in Section 1, according to which template-based NLG systems are always linguistically less interesting than so-called plan-based systems. We have illustrated our claim by sketching a template-based generation system that is theoretically as well-founded as any plan-based system, as well as being practically useful (deployable, maintainable, etc.). Of course, there are genuine and interesting differences between the two paradigms. For example, template-based systems do not conform to the well-known pipeline model for NLG [12], which starts from the assumption that the entire semantic content of a discourse is known at the beginning of the pipeline, after which this content is processed by the next module and so on until the document comes out at the end of the pipeline. This could point the way to an understanding of what makes plan-based systems more suitable for one type of application and template-based systems for another. We hypothesize that their pipeline structure (in a different jargon, their top-down orientation) makes plan-based systems unsuitable for the modeling of ‘spontaneous’ types of speaking/writing, in which the speaker/writer does not always have a plan for the complete discourse before the first word is uttered. The incremental setup of D2S’s language generation module, which lets templates ‘fire’ until a topic (e.g. the topic of goalscoring) is exhausted, without a preconceived plan about the order in which this must happen [4,6], illustrates how such a spontaneous manner of speaking/writing can be modeled using a template-based method.
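The incremental setup described above, in which templates ‘fire’ until a topic is exhausted, can be sketched as a simple loop. This is a hedged illustration and not the LGM algorithm itself: templates are modelled as dictionaries with hypothetical keys (topics, condition, apply), each application is assumed to update the knowledge state so that the loop terminates, and the random choice among applicable templates is an assumption.

```python
import random


def generate_topic(templates, state, topic, rng=random):
    """Let applicable templates fire until none is left for the given topic."""
    sentences = []
    while True:
        candidates = [
            template for template in templates
            if topic in template["topics"] and template["condition"](state)
        ]
        if not candidates:
            return sentences          # topic exhausted
        chosen = rng.choice(candidates)
        # Applying a template must update the state, or the loop would not end.
        sentences.append(chosen["apply"](state))


def make_goal_template(goal_id, text):
    """Hypothetical template that conveys one goal exactly once."""
    def condition(state):
        return goal_id not in state["known"]

    def apply(state):
        state["known"].add(goal_id)
        return text

    return {"topics": {"goalscoring"}, "condition": condition, "apply": apply}


templates = [
    make_goal_template("g1", "Atteveld scored the first goal."),
    make_goal_template("g2", "Vitesse added a second goal."),
]
state = {"known": set()}
print(generate_topic(templates, state, "goalscoring"))
```

No order of sentences is fixed in advance: which template fires next depends only on the current state, which is the sense in which the process is reactive rather than planned top-down.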
References
1. S. Busemann and H. Horacek. A Flexible Shallow Approach to Text Generation. In Proceedings of the 9th International Workshop on Natural Language Generation (IWNLG’98), Niagara-on-the-Lake, 238-247, 1998.
2. L. Cahill. Lexicalisation in applied NLG systems. ITRI report ITRI-99-04, obtainable via http://www.itri.brighton.ac.uk/projects/rags/, 1998.
3. R. Dale and E. Reiter. Computational Interpretations of the Gricean Maxims in the Generation of Referring Expressions. Cognitive Science 18, 233-263, 1995.
4. K. van Deemter and J. Odijk. Context Modelling and the Generation of Spoken Discourse. Speech Communication 21(1/2), 101-121, 1997.
5. A.K. Joshi. The Relevance of Tree Adjoining Grammar to Generation. In G. Kempen (ed.), Natural Language Generation, Martinus Nijhoff, Dordrecht, The Netherlands, 233-252, 1987.
6. E. Klabbers, E. Krahmer, and M. Theune. A Generic Algorithm for Generating Spoken Monologues. In Proceedings of the 5th International Conference on Spoken Language Processing (ICSLP’98), Sydney, 2759-2762, 1998.
7. E. Krahmer and M. Theune. Context Sensitive Generation of Descriptions. In Proceedings of the 5th International Conference on Spoken Language Processing (ICSLP’98), Sydney, 1151-1154, 1998.
8. E. Krahmer and M. Theune. Efficient Generation of Descriptions in Context. To appear in Proceedings of ESSLLI workshop Generation of Nominals, Utrecht, August 1999.
9. D. McDonald and J. Pustejovsky. TAGs as a Grammatical Formalism for Generation. In Proceedings of the 23rd Annual Meeting of the Association for Computational Linguistics (ACL’85), Chicago, 94-103, 1985.
10. J.D. Moore and C.L. Paris. Planning Text for Advisory Dialogues: Capturing Intentional and Rhetorical Information. Computational Linguistics 19(4), 652-694, 1994.
11. E. Reiter. NLG vs. Templates. In Proceedings of the 5th European Workshop on Natural Language Generation (EWNLG’95), Leiden, 95-106, 1995.
12. E. Reiter. Has a Consensus NLG Architecture Appeared and is it Psychologically Plausible? In Proceedings of the 7th International Workshop on Natural Language Generation, 163-170, 1994.
13. E. Reiter and R. Dale. Building Applied Natural Language Generation Systems. Natural Language Engineering 3(1), 57-87, 1997.
14. M. Stone and C. Doran. Paying Heed to Collocations. In Proceedings of the 8th International Workshop on Natural Language Generation (IWNLG’96), Herstmonceux, 91-100, 1996.
... Often, the text planner has no reason to prefer one alternative over another. Rather than picking an arbitrary option within the text planner (as did, e.g., van Deemter et al. (1999)), we instead defer the choice and send all of the valid alternatives to the realizer, in a packed representation . This makes the implementation of the text planner more straightforward.Figure 9 shows an example of such a logical form, incorporating both of the above options under a <one-of> element. ...
... By generalized, we mean that, rather than manipulating flat strings with no underlying linguistic representation, these systems instead work with structured fragments, which are often processed recursively . Other systems that fall into this category include EXEMPLARS (White and Caldwell, 1998), D2S (van Deemter et al., 1999), Interact <xsl:template match="one-of"> <!--Recursive pruning step --> <xsl:variable name="pruned-alts"> <xsl:for-each select="*"> <xsl:variable name="pruned-alt"> <xsl:apply-templates select="."/> </xsl:variable> <xsl:if test="not(xalan:nodeset($pruned-alt)//fail)"> <xsl:copy-of select="$pruned-alt"/> </xsl:if> </xsl:for-each> </xsl:variable> <xsl:variable name="num-remaining" select="count(xalan:nodeset($pruned-alts)/*)"/> <!--Propagation step --> <xsl:choose> <!--keep one-of when multiple alts succeed --> <xsl:when test="$num-remaining &gt; 1"> <one-of> <xsl:copy-of select="$pruned-alts"/> </one-of> </xsl:when> <!--filter out one-of when just one choice remains --> <xsl:when test="$num-remaining = 1"> <xsl:copy-of select="$pruned-alts"/> </xsl:when> <!--fail if none remain --> <xsl:otherwise> <fail/> </xsl:otherwise> </xsl:choose> </xsl:template>Figure 10: Failure-pruning template (Wilcock, 2001; Wilcock, 2003), and SmartKom (Becker, 2002). ...
... The main novel contribution of the text-planning approach described here is in its use of an external realizer that processes logical forms with embedded alternatives. This eliminates the need to use a backtracking AI planner (Becker, 2002) or to make arbitrary choices when multiple alternatives are available (van Deemter et al., 1999 ). The realizer also uses a completely different algorithm than the XSLT template processing—bottom-up, chartbased search rather than top-down rule expansion— which allows it to deal with those aspects of NLG that are more easily addressed using this kind of processing strategy. ...
... Following van Deemter et al. (1999), we take template-based to mean "making extensive use of a mapping between semantic structures and representations of linguistic surface structures that contain gaps". On this interpretation, templates can clearly include linguistic knowledge and the ways in which the gaps can be filled can be highly flexible. ...
... The aggregation templates are quite similar to the syntactic templates described by van Deemter et al. (1999), who argue that this approach resembles generation with Tree-Adjoining Grammars, and that the approach is fundamentally well-founded. ...
... However, we suggest that even quite complex reordering can be performed in XSL, provided it is broken down into separate stages which can be organised on some principled basis. As van Deemter et al. (1999) argue, the syntactic template-based approach rather resembles generation with Tree-Adjoining Grammars, and should be considered fundamentally well-founded. White and Caldwell (1999) compare their Java-based generation system EXEM-PLARS with XSL and suggest that their system has advantages because it is more object-oriented. ...
Article
The paper discusses a number of ways in which XML can be used in natural language generation, including XML-based pipeline architectures, template-based generation with XSL templates, and tree-totree transformations. The ideas are based on practical experience in building an experimental XMLbased generation component for a spoken dialogue system. Prototype implementations using DOM, XSL and Translets are briefly compared.
... The user " s knowledge base is turned into useful conversational utterances through a template-driven utterance generation system (e.g. Van Deemter, Krahmer et al. 2005). A large set of templates has been authored, using the SimpleNLG programming interface, which turn data from the onotlogy into natural language utterances . ...
Article
We detail the design, development and evalua-tion of Augmentative and Alternative Com-munication (AAC) software which encourages rapid conversational interaction. The system uses Natural Language Generation (NLG) technology to automatically generate conver-sational utterances from a domain knowledge base modelled from content suggested by a small AAC user group. Findings from this work are presented along with a discussion about how NLG might be successfully applied to conversational AAC systems in the future.
... Indeed, the increase in variability and flexibility that deep generation systems provide is often touted as a major advantage over simpler, more easily implemented template generators (Reiter, 1995). Multilinguality from deep linguistic representations (Paris et al., 1995; Stede, 1996; Bateman and Sharoff, 1998; Scott, 1999; Kruijff et al., 2000) is generally considered to be one of the advantages that deep generation systems possess over templates (although this depends heavily on the definitions of deep and template methods, such as in (Deemter et al., 1999) ). By applying multilingual lexica and grammars to a single initial knowledge base, multilingual generators hope to leverage reusable components to produce texts in multiple languages with substantially less work than implementing an equivalent number of monolingual template or deep generators. ...
Article
Full-text available
Natural Language Generation has made great strides towards multilingual gen-eration from large-scale knowledge sources. Meanwhile, current research in revision has vastly improved the qual-ity of text that NLG systems produce. However, to-date there has been no at-tempt at combining revision and mul-tilingual NLG. This paper presents re-search in multilingual revision, the last major pipelined NLG component to be studied from a multilingual perspective. We describe the linguistic difficulties in achieving multilingual revision, re-view recent work, and present an imple-mented framework for multilingual revi-sion rules.
... 01). Based on input representation, any NLG technique can be broadly classified into two paradigms viz. Template based Approach and Plan based approach. The template-based approach does not need large linguistic knowledge resource but it cannot provide the expressiveness or flexibility needed for many real domains (Langkilde and Knight, 1998). In (Deemter et. al., 1999 ), it has been tried to prove with the example of a system (D2S: Direct to Speech) that both of the approaches are equally powerful and theoretically well founded. The D2S system uses a tree structured template organization that resembles Tag Adjoining Grammar (TAG) structure . The template-based approach that has been taken in the syst ...
Conference Paper
Natural Language Generation (NLG) is a way to automatically realize a correct ex- pression in response to a communicative goal. This technology is mainly explored in the fields of machine translation, re- port generation, dialog system etc. In this paper we have explored the NLG tech- nique for another novel application- assisting disabled children to take part in conversation. The limited physical ability and mental maturity of our intended users made the NLG approach different from others. We have taken a flexible ap- proach where main emphasis is given on flexibility and usability of the system. The evaluation results show this tech- nique can increase the communication rate of users during a conversation.
... There has been much debate as to what exactly is a template-based natural language generation system, as opposed to a real NLG system. According to (van Deemter, Krahmer, and Theune, 2005), "template-based systems are natural language generating systems that map their non-linguistic input directly (i.e. without intermediate representations) to the surface linguistic surface structure." ...
Article
Full-text available
Este artículo describe investigación en curso sobre generación de lenguaje natural para sistemas de diálogo. Una serie de plantillas se encargan de la fase de planificación clausal, mientras que unos módulos de transferencia y generación desarrollados previamente para un sistema de TA llevan a cabo los procesos de lexicalización y realización morfosintáctica. Este enfoque favorece la generación multilingüe, al separar claramente la información dependiente del lenguaje de aquella que no lo es. This paper describes ongoing research on NLG for dialogue systems. Sentence planning is performed by selecting the appropriate template, while previously developed transfer and generation components of a transfer–based MT architecture perform the lexicalization and linguistic realization processes in the generation process. This approach allows for multilingual generation since there is a clear division between language dependent and language independent information. This research has been funded by the Spanish Ministry of Education under Grant TIC2002-00526, and the European FP6 IST Talk Project (507802).
Chapter
Today, the behavioral culture on social networks is a painful issue. State agencies have been trying to clean up the network environment of country. Many policies are proposed to process videos and clips with offensive content. However, it is a small part of cleaning up the network environment. We often see hateful comments on social media sites. It exists anywhere from social media to online games that are difficult to control and punish because of their big data. There are not too many social networking sites and online games until now. Therefore, it is not too difficult for communities to limit inappropriate words. Therefore, we offer a chatbot model to manage the comments that helps to clean the network environment in the paper. The results show that the proposal model achieves up to 75% accuracy with 100,000 comments.
Article
En este artículo se propone la división del proceso de construcción de sistemas de Generación de Lenguajes Natural (GLN) en dos etapas: planificación del contenido (EPC), que es dependiente del dominio de la aplicación a desarrollar, y estructuración del documento (EED). Esta división permite que personas no expertas en GLN puedan desarrollar sistemas de generación de lenguajes natural enfocándose en construir representaciones abstractas de la información que se desea comunicar (denominadas mensajes). Adicionalmente se presenta una arquitectura específica para la etapa EED que permite a investigadores en GLN trabajar ortogonalmente en técnicas y metodologías específicas para la transformación de los mensajes en texto gramatical y sintácticamente correcto.
Conference Paper
Dialog System is an important research area in natural language processing. This paper presents a research work on the multi-talk dialog system, which is supported by a large scaled knowledge base called Pangu. Our experience in the design, improvement, xible and natural dialog, modularized knowledge evaluation of the multi-talk system are introduced is presented in this paper as well as the experimental results. To realize flebase, formalized knowledge representation, various efficient reasoners and assessment of system were used. Finally, this paper ends with some detailed conclusions and looks into the future research directions.
Article
this paper we propose an NLG strategy that takes into account multiple dimensions of the user's context. The input to the NLG process is annotated with both local and global information about the context of the user. We are interested in a hybrid approach: combining local context values (e.g., indicating how much a user is interested in a particular topic of the domain, or whether a specific object has been mentioned recently to the user) with global context parameters (esp. how much the user is available for new information). We experiment with different mechanisms for producing variable output within a template based approach. Our parameters for context sensitivity are application independent, but for every application domain it has to be made explicit how they influence the output text. The COMRIS environment is an interesting test-bed for these ideas on context-sensitive NLG.
Article
Full-text available
Tree Adjoining Grammars, or "TAG's", (Joshi, Levy & Takahashi 1975; Joshi 1983; Kroch & Joshi 1985) were developed as an alternative to the standard syntactic formalisms that are used in theoretical analyses of language. They are attractive because they may provide just the aspects of context sensitive expressive power that actually appear in human languages while otherwise remaining context free.This paper describes how we have applied the theory of Tree Adjoining Grammars to natural language generation. We have been attracted to TAG's because their central operation-the extension of an "initial" phrase structure tree through the inclusion, at very specifically constrained locations, of one or more "auxiliary" trees-corresponds directly to certain central operations of our own, performance-oriented theory.We begin by briefly describing TAG's as a formalism for phrase structure in a competence theory, and summarize the points in the theory of TAG's that are germaine to our own theory. We then consider generally the position of a grammar within the generation process, introducing our use of TAG's through a contrast with how others have used systemic grammars. This takes us to the core results of our paper: using examples from our research with weft-written texts from newspapers, we walk through our TAG inspired treatments of raising and wh-movement, and show the correspondence of the TAG 'adjunction" operation and our "attachment" process.In the final section we discuss extensions to the theory, motivated by the way we use the operation corresponding to TAG's" adjunction in performance. This suggests that the competence theory of TAG's can be profitably projected to structures at the morphological level as well as the present syntactic level.
Conference Paper
Full-text available
Tree Adjoining Grammars, or "TAG's", (Joshi, Levy & Takahashi 1975; Joshi 1983; Kroch & Joshi 1985) were developed as an alternative to the standard syntactic formalisms that are used in theoretical analyses of language. They are attractive because they may provide just the aspects of context sensitive expressive power that actually appear in human languages while otherwise remaining context free.This paper describes how we have applied the theory of Tree Adjoining Grammars to natural language generation. We have been attracted to TAG's because their central operation---the extension of an "initial" phrase structure tree through the inclusion, at very specifically constrained locations, of one or more "auxiliary" trees---corresponds directly to certain central operations of our own, performance-oriented theory.We begin by briefly describing TAG's as a formalism for phrase structure in a competence theory, and summarize the points in the theory of TAG's that are germaine to our own theory. We then consider generally the position of a grammar within the generation process, introducing our use of TAG's through a contrast with how others have used systemic grammars. This takes us to the core results of our paper: using examples from our research with weil-written texts from newspapers, we walk through our TAG inspired treatments of raising and wh-movement, and show the correspondence of the TAG "adjunction" operation and our "attachment" process.In the final section we discuss extensions to the theory, motivated by the way we use the operation corresponding to TAG's adjunction in performance. This suggests that the competence theory of TAG's can be profitably projected to structures at the morphological level as well as the present syntactic level.
Article
To participate in a dialogue, a system must be capable of reasoning about its own previous utterances. Follow-up questions must be interpreted in the context of the ongoing conversation, and the system's previous contributions form part of this context. Furthermore, if a system is to be able to clarify misunderstood explanations or to elaborate on prior explanations, it must understand what it has conveyed in prior explanations. Previous approaches to generating multisentential texts have relied solely on rhetorical structuring techniques. In this paper, we argue that, to handle explanation dialogues successfully, a discourse model must include information about the intended effect of individual parts of the text on the hearer, as well as how the parts relate to one another rhetorically. We present a text planner that records this information, and show how the resulting structure is used to respond appropriately to a follow-up question.
Article
Probably the best current algorithm for generating definite descriptions is the Incremental Algorithm due to Dale and Reiter. If we want to use this algorithm in a Concept-to-Speech system, however, we encounter two limitations: (i) the algorithm is insensitive to the linguistic context and thus always produces the same description for an object, (ii) the output is a list of properties which uniquely determine one object from a set of objects: how this list is to be expressed in spoken natural language is not addressed. We propose a modification of the Incremental Algorithm based on the idea that a definite description refers to the most salient element in the current context satisfying the descriptive content. We show that the modified algorithm allows for the context-sensitive generation of both distinguishing and anaphoric descriptions, while retaining the attractive properties of Dale and Reiter's original algorithm.
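The idea summarised in this abstract can be illustrated with a minimal sketch. It is a simplified reading of the Incremental Algorithm, not the authors' implementation: the function name `incremental_description`, the toy entities, and the rule that only objects at least as salient as the target count as distractors are all assumptions made for illustration, and the original algorithm's special handling of the type attribute is left out.

```python
from typing import Dict, List, Optional, Tuple

Entity = Dict[str, str]


def incremental_description(
    target: str,
    entities: Dict[str, Entity],
    preferred: List[str],
    salience: Optional[Dict[str, float]] = None,
) -> Optional[List[Tuple[str, str]]]:
    """Select properties that distinguish `target` from its distractors.

    Without `salience`, every other entity is a distractor (context-insensitive
    behaviour). With `salience`, only entities at least as salient as the
    target are distractors (an assumed simplification of the modification).
    """
    if salience is None:
        distractors = {name for name in entities if name != target}
    else:
        threshold = salience.get(target, 0.0)
        distractors = {
            name
            for name in entities
            if name != target and salience.get(name, 0.0) >= threshold
        }

    description: List[Tuple[str, str]] = []
    for attr in preferred:  # attributes are tried in a fixed preference order
        if not distractors:
            break
        value = entities[target].get(attr)
        if value is None:
            continue
        ruled_out = {d for d in distractors if entities[d].get(attr) != value}
        if ruled_out:  # keep a property only if it removes some distractor
            description.append((attr, value))
            distractors -= ruled_out

    # None signals that no distinguishing description was found.
    return description if not distractors else None


# Hypothetical toy domain.
entities = {
    "d1": {"type": "dog", "colour": "black", "size": "small"},
    "d2": {"type": "dog", "colour": "white", "size": "small"},
    "c1": {"type": "cat", "colour": "black", "size": "small"},
}
preferred = ["type", "colour", "size"]
salience = {"d1": 1.0, "d2": 0.2, "c1": 1.0}

full = incremental_description("d1", entities, preferred)
reduced = incremental_description("d1", entities, preferred, salience)
print(full)     # [('type', 'dog'), ('colour', 'black')]
print(reduced)  # [('type', 'dog')]
```

In this toy domain the context-insensitive call yields the properties for "the black dog", whereas the salience-aware call, in which the white dog is less salient than the target, yields only the type, corresponding to a reduced, anaphoric "the dog". The output remains a list of properties; how that list is realised in spoken language is a separate step that the sketch does not address.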
Article
Grammatical formalisms can be viewed as neutral with respect to comprehension or generation, or they can be investigated from the point of view of their suitability for comprehension or generation. Tree Adjoining Grammars (TAG) is a formalism that factors recursion and dependencies in a special way, leading to a kind of locality and the possibility of incremental generation. We will examine the relevance of these properties from the point of view of sentence generation.
Article
This paper presents the Dial-Your-Disc (dyd) system, an interactive system that supports browsing through a large database of musical information and generates a spoken monologue once a musical composition has been selected. The paper focuses on the generation of spoken monologues and, more specifically, on the various ways in which the generation of an utterance at a given point in the monologue requires modeling of the linguistic context of the utterance.
Article
This paper discusses some issues relating to the task of lexicalisation within applied NLG systems. We restrict ourselves here to the systems covered in "A Survey of Applied Natural Language Systems" [Pai98], that are fully implemented, complete