Todd Ward's research while affiliated with IBM Research and other places

What is this page?


This page lists the scientific contributions of an author who either does not have a ResearchGate profile or has not yet added these contributions to their profile.

ResearchGate generated this page automatically to maintain a record of this author's body of work. We create such pages to advance our goal of building and maintaining the most comprehensive scientific repository possible. In doing so, we process publicly available (personal) data relating to the author as a member of the scientific community.


Publications (36)


[Figure 4: Histogram of different kinds of errors. Tables: dev-set Smatch for AMR2.0 and AMR1.0; test-set Smatch for AMR1.0.]
Bootstrapping Multilingual AMR with Contextual Word Alignments
  • Preprint
  • File available

February 2021 · 153 Reads

Janaki Sheth · [...] · Ramon Fernandez Astudillo · [...] · Todd Ward

We develop high performance multilingual Abstract Meaning Representation (AMR) systems by projecting English AMR annotations to other languages with weak supervision. We achieve this goal by bootstrapping transformer-based multilingual word embeddings, in particular those from cross-lingual RoBERTa (XLM-R large). We develop a novel technique for foreign-text-to-English AMR alignment, using the contextual word alignment between English and foreign language tokens. This word alignment is weakly supervised and relies on the contextualized XLM-R word embeddings. We achieve a highly competitive performance that surpasses the best published results for German, Italian, Spanish and Chinese.
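The alignment idea in this abstract, matching English and foreign-language tokens by the similarity of their contextual embeddings, can be sketched as follows. This is a minimal illustration assuming the Hugging Face transformers API and greedy cosine matching; the paper's weakly supervised aligner is more involved, and the example sentences are arbitrary.

```python
# Sketch: contextual word alignment between an English sentence and its
# translation via cosine similarity over XLM-R token embeddings.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-large")
model = AutoModel.from_pretrained("xlm-roberta-large")
model.eval()

def embed(sentence):
    """Contextual embeddings for each subword token of a sentence."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]  # (seq_len, hidden_dim)
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
    return tokens, hidden

def align(src_sentence, tgt_sentence):
    """Greedy alignment: each source token maps to its most similar target token."""
    src_toks, src_h = embed(src_sentence)
    tgt_toks, tgt_h = embed(tgt_sentence)
    src_n = torch.nn.functional.normalize(src_h, dim=-1)
    tgt_n = torch.nn.functional.normalize(tgt_h, dim=-1)
    sim = src_n @ tgt_n.T               # cosine similarity matrix
    best = sim.argmax(dim=-1).tolist()  # best target index per source token
    return [(s, tgt_toks[j]) for s, j in zip(src_toks, best)]

print(align("The cat sleeps .", "Die Katze schläft ."))
```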



[Tables: results on the TechQA-RC task, where each row with a + adds a step to the previous row, HA F1 is F1 for answerable questions, and parentheses show standard deviation; hyperparameters for LM training, the TechQA-RC task, and the TechQA-DR task.]
Multi-Stage Pre-training for Low-Resource Domain Adaptation

October 2020 · 107 Reads

Transfer learning techniques are particularly useful in NLP tasks where a sizable amount of high-quality annotated data is difficult to obtain. Current approaches directly adapt a pre-trained language model (LM) on in-domain text before fine-tuning to downstream tasks. We show that extending the vocabulary of the LM with domain-specific terms leads to further gains. To an even greater effect, we utilize structure in the unlabeled data to create auxiliary synthetic tasks, which helps the LM transfer to downstream tasks. We apply these approaches incrementally on a pre-trained RoBERTa-large LM and show considerable performance gain on three tasks in the IT domain: Extractive Reading Comprehension, Document Ranking and Duplicate Question Detection.
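The vocabulary-extension step the abstract mentions can be sketched with the Hugging Face transformers API. The domain terms below are hypothetical stand-ins for terms mined from the unlabeled IT corpus.

```python
# Sketch: add domain-specific terms to a pre-trained RoBERTa tokenizer and
# resize the embedding matrix so the new tokens get trainable vectors.
from transformers import RobertaForMaskedLM, RobertaTokenizer

tokenizer = RobertaTokenizer.from_pretrained("roberta-large")
model = RobertaForMaskedLM.from_pretrained("roberta-large")

# Hypothetical in-domain terms; a real list would be mined from the corpus.
domain_terms = ["WebSphere", "heapdump", "classloader"]
num_added = tokenizer.add_tokens(domain_terms)

# New embedding rows are appended (randomly initialized) and then learned
# during continued in-domain LM pre-training.
model.resize_token_embeddings(len(tokenizer))
print(f"Added {num_added} domain tokens; vocab size is now {len(tokenizer)}")
```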





[Figure 1: Examples of questions in the TechQA dataset.]
The TechQA Dataset

November 2019 · 188 Reads

We introduce TechQA, a domain-adaptation question answering dataset for the technical support domain. The TechQA corpus highlights two real-world issues from the automated customer support domain. First, it contains actual questions posed by users on a technical forum, rather than questions generated specifically for a competition or a task. Second, it has a real-world size (600 training, 310 dev, and 490 evaluation question/answer pairs), thus reflecting the cost of creating large labeled datasets with actual data. Consequently, TechQA is meant to stimulate research in domain adaptation rather than being a resource to build QA systems from scratch. The dataset was obtained by crawling the IBM Developer and IBM DeveloperWorks forums for questions with accepted answers that appear in a published IBM Technote, a technical document that addresses a specific technical issue. We also release a collection of the 801,998 Technotes that were publicly available as of April 4, 2019 as a companion resource that might be used for pretraining, to learn representations of the IT domain language.
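Extractive QA corpora like TechQA are typically scored with SQuAD-style token-overlap F1 (the "HA F1" in the table captions earlier refers to F1 over answerable questions). A minimal sketch of that metric, assuming whitespace tokenization and none of the usual answer normalization:

```python
# SQuAD-style token-overlap F1 between a predicted and a gold answer span,
# the standard metric family for extractive QA.
from collections import Counter

def token_f1(prediction: str, gold: str) -> float:
    pred_toks = prediction.lower().split()
    gold_toks = gold.lower().split()
    common = Counter(pred_toks) & Counter(gold_toks)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_toks)
    recall = num_same / len(gold_toks)
    return 2 * precision * recall / (precision + recall)

print(token_f1("restart the WebSphere server", "restart the server"))  # ≈ 0.857
```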


Improving Coreference Resolution by Using Conversational Metadata

January 2009 · 79 Reads · 11 Citations

In this paper, we propose the use of metadata contained in documents to improve coreference resolution. Specifically, we quantify the impact of speaker and turn information on the performance of our coreference system, and show that the metadata can be effectively encoded as features of a statistical resolution system, which leads to a statistically significant improvement in performance.
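A minimal sketch of what "encoding metadata as features" can look like for a mention pair. The mention representation and the feature names here are assumptions for illustration, not the paper's actual feature set.

```python
# Sketch: speaker/turn metadata features for a candidate coreference pair,
# which would feed a statistical (e.g. maximum-entropy) resolver.
def metadata_features(mention, antecedent):
    """Binary/ordinal features over speaker and turn metadata for a mention pair."""
    return {
        "same_speaker": mention["speaker"] == antecedent["speaker"],
        "turn_distance": abs(mention["turn"] - antecedent["turn"]),
        # "I" said by one speaker and "you" addressed in the next turn often corefer.
        "i_you_adjacent_turns": (
            antecedent["text"].lower() == "i"
            and mention["text"].lower() == "you"
            and mention["turn"] == antecedent["turn"] + 1
        ),
    }

m1 = {"text": "I", "speaker": "A", "turn": 1}
m2 = {"text": "you", "speaker": "B", "turn": 2}
print(metadata_features(m2, m1))
```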


Automatic Recognition of Spontaneous Speech for Access to Multilingual Oral History Archives

August 2004 · 242 Reads · 133 Citations · IEEE Transactions on Speech and Audio Processing

Much is known about the design of automated systems to search broadcast news, but it has only recently become possible to apply similar techniques to large collections of spontaneous speech. This paper presents initial results from experiments with speech recognition, topic segmentation, topic categorization, and named entity detection using a large collection of recorded oral histories. The work leverages a massive manual annotation effort on 10 000 h of spontaneous speech to evaluate the degree to which automatic speech recognition (ASR)-based segmentation and categorization techniques can be adapted to approximate decisions made by human annotators. ASR word error rates near 40% were achieved for both English and Czech for heavily accented, emotional and elderly spontaneous speech based on 65-84 h of transcribed speech. Topical segmentation based on shifts in the recognized English vocabulary resulted in 80% agreement with manually annotated boundary positions at a 0.35 false alarm rate. Categorization was considerably more challenging, with a nearest-neighbor technique yielding F=0.3. This is less than half the value obtained by the same technique on a standard newswire categorization benchmark, but replication on human-transcribed interviews showed that ASR errors explain little of that difference. The paper concludes with a description of how these capabilities could be used together to search large collections of recorded oral histories.
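The topical segmentation described here, placing boundaries where the recognized vocabulary shifts, is in the spirit of TextTiling. A small sketch under that assumption; the window size and toy transcript are arbitrary, and the paper's actual segmenter differs.

```python
# Sketch: score candidate topic boundaries in a transcript by the cosine
# similarity of word distributions in adjacent windows; deep minima suggest
# topic shifts.
import math
from collections import Counter

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def boundary_scores(words, window=50):
    """Lower similarity between adjacent windows = stronger topic shift."""
    scores = []
    for i in range(window, len(words) - window):
        left = Counter(words[i - window:i])
        right = Counter(words[i:i + window])
        scores.append((i, cosine(left, right)))
    return scores

# Toy transcript: 120 war-topic words followed by 120 farm-topic words.
words = ("war camp soldier " * 40 + "farm harvest crop " * 40).split()
print(min(boundary_scores(words, window=30), key=lambda s: s[1]))  # dip near 120
```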


TIPS: A Translingual Information Processing System

March 2004 · 43 Reads · 6 Citations

Searching online information is increasingly a daily activity for many people. The multilinguality of online content is also increasing (e.g. the proportion of English web users, which has been decreasing as a fraction of the growing population of web users, dipped below 50% in the summer of 2001). To improve the ability of an English speaker to search multilingual content, we built a system that supports cross-lingual search of an Arabic newswire collection and provides on-demand translation of Arabic web pages into English. The cross-lingual search engine supports a fast search capability (sub-second response for typical queries) and achieves state-of-the-art performance in the high precision region of the result list. The on-demand statistical machine translation uses the Direct Translation model along with a novel statistical Arabic Morphological Analyzer to yield state-of-the-art translation quality. The on-demand SMT uses an efficient dynamic programming decoder that achieves reasonable speed for translating web documents.
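The cross-lingual search component can be illustrated by the standard query-translation approach: expand English query terms into weighted Arabic terms via a translation lexicon, then score documents against the weighted terms. The toy lexicon, weights, and scoring below are illustrative, not the TIPS implementation.

```python
# Sketch: dictionary-based cross-lingual retrieval via query translation.
def translate_query(query_terms, lexicon):
    """Expand each English term into (arabic_term, probability) pairs."""
    expanded = {}
    for term in query_terms:
        for tgt, prob in lexicon.get(term, []):
            expanded[tgt] = expanded.get(tgt, 0.0) + prob
    return expanded

def score_document(doc_terms, weighted_query):
    """Simple weighted term-match score; a real system would use tf-idf or BM25."""
    return sum(weight for term, weight in weighted_query.items()
               if term in doc_terms)

lexicon = {"peace": [("سلام", 0.9)], "treaty": [("معاهدة", 0.8), ("اتفاقية", 0.2)]}
query = translate_query(["peace", "treaty"], lexicon)
print(score_document({"سلام", "معاهدة"}, query))  # 1.7
```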


Citations (29)


... That is, we can represent the semantics in other languages using the corresponding AMR graph of the semantic equivalent in English. A number of cross-lingual AMR parsers (Damonte and Cohen, 2018; Blloshmi et al., 2020; Sheth et al., 2021; Procopio et al., 2021; Cai et al., 2021) have been developed to transform non-English texts into AMR graphs. Most of them rely on pre-trained multilingual language models and synthetic parallel data. ...

Reference:

Retrofitting Multilingual Sentence Embeddings with Abstract Meaning Representation
Bootstrapping Multilingual AMR with Contextual Word Alignments
  • Citing Conference Paper
  • January 2021

... Learning robust and transferable representation has been the core of language model pre-training (Peters et al., 2019). For a general-purpose PLM to generalize well on domain-specific tasks, endowing the model with domain knowledge via in-domain training remains the go-to approach (Gururangan et al., 2020; Whang et al., 2020; Zhang et al., 2020; Li et al., 2023). In this section, we show that without any in-domain pre-training, PlugLM could flexibly adapt to multiple domains with domain-specific DPM. ...

Multi-Stage Pre-training for Low-Resource Domain Adaptation

... Here, semantically related concepts are grouped at a higher level. The statistical parser examines the training sentences to find the combination of sentential clues that works best across all the training sentences [10]. ...

The IBM conversational telephony system for financial applications

... Our intuition was that a new topic might be strengthened if it were grouped with other, later, stories on the same topic. Other groups have had more success with this approach, though not to a huge degree [6]. • It seemed that additional supporting information would be helpful, so we tried to decide to which cluster a story should be assigned using the top n > 1 most similar stories rather than just the top n = 1. ...

Segmentation and Detection at IBM
  • Citing Chapter
  • January 2002

... Contrarily, the BLEU score between s 1 and s 2 will decrease for each n-gram that is in s 2 but is not in s 1 . We report cumulative BLEU scores for 1-grams to 4-grams for the model's top five candidate predictions, as they have reported correlations with human judgements (Ward and Reeder 2002). Finally, expert emergency physicians evaluated the algorithm's performance subjectively. ...

Corpus-based comprehensive and diagnostic MT evaluation: Initial Arabic, Chinese, French, and Spanish results
  • Citing Article
  • January 2002
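Cumulative BLEU-1 through BLEU-4, as described in the snippet above, can be computed with NLTK. A small sketch with an arbitrary candidate/reference pair:

```python
# Sketch: cumulative BLEU-1..4 for one candidate against one reference.
from nltk.translate.bleu_score import SmoothingFunction, sentence_bleu

reference = [["the", "cat", "sat", "on", "the", "mat"]]
candidate = ["the", "cat", "is", "on", "the", "mat"]
smooth = SmoothingFunction().method1  # avoid zero scores on short sentences

for n in range(1, 5):
    weights = tuple(1.0 / n for _ in range(n))  # uniform weights over 1..n-grams
    score = sentence_bleu(reference, candidate, weights=weights,
                          smoothing_function=smooth)
    print(f"cumulative BLEU-{n}: {score:.3f}")
```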

... Stop-word removal and Porter stemming are performed before producing a vector representation of words in these documents. The Okapi BM25 relevance scoring formula [47] is employed to generate a text vector, where tf is the term frequency of a word occurring in a shot document and the constants k1 and b are set to 2.0 and 0.75, respectively. ...

Segmentation and detection at IBM: Hybrid statistical models and two-tiered clustering for the broadcast news domain
  • Citing Article
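For reference, the standard Okapi BM25 term weight with the constants the snippet names (k1 = 2.0, b = 0.75); the citing paper's exact variant, e.g. its IDF form, may differ in details.

```python
# Sketch: Okapi BM25 score contribution of one query term for one document.
import math

def bm25_term_weight(tf, df, doc_len, avg_doc_len, num_docs, k1=2.0, b=0.75):
    """IDF times a saturating, length-normalized term-frequency component."""
    idf = math.log((num_docs - df + 0.5) / (df + 0.5) + 1.0)
    norm = tf * (k1 + 1) / (tf + k1 * (1 - b + b * doc_len / avg_doc_len))
    return idf * norm

# A term occurring 3 times in an average-length document, in 100 of 10,000 docs.
print(bm25_term_weight(tf=3, df=100, doc_len=120, avg_doc_len=120, num_docs=10_000))
```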

... In recent years, MAP adaptation has been successfully applied to n-gram language models (Bacchiani and Roark, 2003) and lexicalized PCFG models (Roark and Bacchiani, 2003). Luo et al. have proposed transformation-based approaches based on the Markov transform (Luo et al., 1999) and the Householder transform (Luo, 2000) to adapt statistical parsers. However, the optimization processes for the latter are complex and it is not clear how general they are. ...

Unsupervised adaptation of statistical parsers based on Markov transform
  • Citing Article
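The MAP adaptation the snippet refers to can be sketched as count mixing: in-domain n-gram counts are smoothed toward an out-of-domain prior, in the style of Bacchiani and Roark (2003). The parameter value below is illustrative.

```python
# Sketch: MAP-adapted n-gram probability, mixing in-domain counts with an
# out-of-domain prior model; tau controls the strength of the prior.
def map_ngram_prob(count_hw, count_h, prior_prob, tau=100.0):
    """P_adapt(w|h) = (c(h,w) + tau * P_prior(w|h)) / (c(h) + tau)."""
    return (count_hw + tau * prior_prob) / (count_h + tau)

# With little in-domain data the estimate stays close to the prior ...
print(map_ngram_prob(2, 10, prior_prob=0.05))        # ≈ 0.064
# ... and approaches the in-domain relative frequency as counts grow.
print(map_ngram_prob(2000, 10000, prior_prob=0.05))  # ≈ 0.199
```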

... Quality of translations. Before evaluating our NMT-based approach to cross-lingual text similarity, Table 1 compares the performance of Japanese-to-English and English-to-Japanese translations by our NMT models and Google Translate [37] in bilingual evaluation understudy (BLEU) scores [38]. BLEU is a commonly used MT evaluation metric, computed by comparing machine-generated translations and reference translations (the gold standard) by n-gram overlaps. ...

Corpus-based comprehensive and diagnostic MT evaluation: Initial Arabic, Chinese, French, and Spanish results

... In order to account for this variability, we developed a probabilistic log-linear model that fuses multiple similarity signals. Probabilistic log-linear models have found success in domains such as natural language processing (NLP) [24,25,26] and Information Retrieval (IR) [27], particularly point-wise learning to rank (LTR) [28]. Log-linear models permit a rich variety of feature representations to influence the probabilistic estimation of relevance, making it ideal for our endpoint ranking task. ...

Feature-based language understanding
  • Citing Conference Paper
  • September 1997
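The log-linear fusion the snippet describes reduces, in the binary-relevance case, to a logistic model over weighted similarity features. A minimal sketch with illustrative feature names and weights (in practice the weights are learned from labeled data):

```python
# Sketch: binary log-linear relevance model over fused similarity features.
import math

def log_linear_relevance(features, weights):
    """P(relevant | features) = sigmoid(w · f) in the two-class case."""
    score = sum(weights[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-score))

features = {"cosine_sim": 0.8, "title_match": 1.0, "bias": 1.0}
weights = {"cosine_sim": 2.0, "title_match": 1.5, "bias": -2.0}
print(log_linear_relevance(features, weights))  # ≈ 0.75
```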