Todd Ward's research while affiliated with IBM Research and other places

What is this page?


This page lists the scientific contributions of an author who either does not have a ResearchGate profile or has not yet added these contributions to their profile.

ResearchGate generated this page automatically to maintain a record of this author's body of work. We create such pages to advance our goal of building and maintaining the most comprehensive scientific repository possible. In doing so, we process publicly available (personal) data relating to the author as a member of the scientific community.


Publications (36)


[Figure 4: Histogram of different kinds of errors. Tables: dev-set Smatch for AMR2.0 and AMR1.0; test-set Smatch for AMR1.0.]
Bootstrapping Multilingual AMR with Contextual Word Alignments
  • Preprint
  • File available

February 2021 · 153 Reads

Janaki Sheth · [...] · Ramon Fernandez Astudillo · [...] · Todd Ward

We develop high performance multilingual Abstract Meaning Representation (AMR) systems by projecting English AMR annotations to other languages with weak supervision. We achieve this goal by bootstrapping transformer-based multilingual word embeddings, in particular those from cross-lingual RoBERTa (XLM-R large). We develop a novel technique for foreign-text-to-English AMR alignment, using the contextual word alignment between English and foreign language tokens. This word alignment is weakly supervised and relies on the contextualized XLM-R word embeddings. We achieve a highly competitive performance that surpasses the best published results for German, Italian, Spanish and Chinese.
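The alignment idea in this abstract, matching English and foreign-language tokens by the similarity of their contextual embeddings, can be sketched as follows. This is a minimal illustration assuming the Hugging Face transformers API and greedy cosine matching; the paper's weakly supervised aligner is more involved, and the example sentences are arbitrary.

```python
# Sketch: contextual word alignment between an English sentence and its
# translation via cosine similarity over XLM-R token embeddings.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-large")
model = AutoModel.from_pretrained("xlm-roberta-large")
model.eval()

def embed(sentence):
    """Contextual embeddings for each subword token of a sentence."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]  # (seq_len, hidden_dim)
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
    return tokens, hidden

def align(src_sentence, tgt_sentence):
    """Greedy alignment: each source token maps to its most similar target token."""
    src_toks, src_h = embed(src_sentence)
    tgt_toks, tgt_h = embed(tgt_sentence)
    src_n = torch.nn.functional.normalize(src_h, dim=-1)
    tgt_n = torch.nn.functional.normalize(tgt_h, dim=-1)
    sim = src_n @ tgt_n.T               # cosine similarity matrix
    best = sim.argmax(dim=-1).tolist()  # best target index per source token
    return [(s, tgt_toks[j]) for s, j in zip(src_toks, best)]

print(align("The cat sleeps .", "Die Katze schläft ."))
```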



[Tables: results on the TechQA-RC task, where each row with a + adds a step to the previous row, HA F1 is F1 for answerable questions, and parentheses show standard deviation; hyperparameters for LM training, the TechQA-RC task, and the TechQA-DR task.]
Multi-Stage Pre-training for Low-Resource Domain Adaptation

October 2020 · 107 Reads

Transfer learning techniques are particularly useful in NLP tasks where a sizable amount of high-quality annotated data is difficult to obtain. Current approaches directly adapt a pre-trained language model (LM) on in-domain text before fine-tuning to downstream tasks. We show that extending the vocabulary of the LM with domain-specific terms leads to further gains. To an even greater effect, we utilize structure in the unlabeled data to create auxiliary synthetic tasks, which helps the LM transfer to downstream tasks. We apply these approaches incrementally on a pre-trained RoBERTa-large LM and show considerable performance gain on three tasks in the IT domain: Extractive Reading Comprehension, Document Ranking and Duplicate Question Detection.
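The vocabulary-extension step the abstract mentions can be sketched with the Hugging Face transformers API. The domain terms below are hypothetical stand-ins for terms mined from the unlabeled IT corpus.

```python
# Sketch: add domain-specific terms to a pre-trained RoBERTa tokenizer and
# resize the embedding matrix so the new tokens get trainable vectors.
from transformers import RobertaForMaskedLM, RobertaTokenizer

tokenizer = RobertaTokenizer.from_pretrained("roberta-large")
model = RobertaForMaskedLM.from_pretrained("roberta-large")

# Hypothetical in-domain terms; a real list would be mined from the corpus.
domain_terms = ["WebSphere", "heapdump", "classloader"]
num_added = tokenizer.add_tokens(domain_terms)

# New embedding rows are appended (randomly initialized) and then learned
# during continued in-domain LM pre-training.
model.resize_token_embeddings(len(tokenizer))
print(f"Added {num_added} domain tokens; vocab size is now {len(tokenizer)}")
```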





[Figure 1: Examples of questions in the TechQA dataset.]
The TechQA Dataset

November 2019 · 188 Reads

We introduce TechQA, a domain-adaptation question answering dataset for the technical support domain. The TechQA corpus highlights two real-world issues from the automated customer support domain. First, it contains actual questions posed by users on a technical forum, rather than questions generated specifically for a competition or a task. Second, it has a real-world size (600 training, 310 dev, and 490 evaluation question/answer pairs), thus reflecting the cost of creating large labeled datasets with actual data. Consequently, TechQA is meant to stimulate research in domain adaptation rather than being a resource to build QA systems from scratch. The dataset was obtained by crawling the IBM Developer and IBM DeveloperWorks forums for questions with accepted answers that appear in a published IBM Technote, a technical document that addresses a specific technical issue. We also release a collection of the 801,998 Technotes that were publicly available as of April 4, 2019 as a companion resource that might be used for pretraining, to learn representations of the IT domain language.
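Extractive QA corpora like TechQA are typically scored with SQuAD-style token-overlap F1 (the "HA F1" in the table captions earlier refers to F1 over answerable questions). A minimal sketch of that metric, assuming whitespace tokenization and none of the usual answer normalization:

```python
# SQuAD-style token-overlap F1 between a predicted and a gold answer span,
# the standard metric family for extractive QA.
from collections import Counter

def token_f1(prediction: str, gold: str) -> float:
    pred_toks = prediction.lower().split()
    gold_toks = gold.lower().split()
    common = Counter(pred_toks) & Counter(gold_toks)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_toks)
    recall = num_same / len(gold_toks)
    return 2 * precision * recall / (precision + recall)

print(token_f1("restart the WebSphere server", "restart the server"))  # ≈ 0.857
```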


Improving Coreference Resolution by Using Conversational Metadata

January 2009 · 79 Reads · 11 Citations

In this paper, we propose the use of metadata contained in documents to improve coreference resolution. Specifically, we quantify the impact of speaker and turn information on the performance of our coreference system, and show that the metadata can be effectively encoded as features of a statistical resolution system, which leads to a statistically significant improvement in performance.
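A minimal sketch of what "encoding metadata as features" can look like for a mention pair. The mention representation and the feature names here are assumptions for illustration, not the paper's actual feature set.

```python
# Sketch: speaker/turn metadata features for a candidate coreference pair,
# which would feed a statistical (e.g. maximum-entropy) resolver.
def metadata_features(mention, antecedent):
    """Binary/ordinal features over speaker and turn metadata for a mention pair."""
    return {
        "same_speaker": mention["speaker"] == antecedent["speaker"],
        "turn_distance": abs(mention["turn"] - antecedent["turn"]),
        # "I" said by one speaker and "you" addressed in the next turn often corefer.
        "i_you_adjacent_turns": (
            antecedent["text"].lower() == "i"
            and mention["text"].lower() == "you"
            and mention["turn"] == antecedent["turn"] + 1
        ),
    }

m1 = {"text": "I", "speaker": "A", "turn": 1}
m2 = {"text": "you", "speaker": "B", "turn": 2}
print(metadata_features(m2, m1))
```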


Automatic Recognition of Spontaneous Speech for Access to Multilingual Oral History Archives

August 2004 · 242 Reads · 133 Citations · IEEE Transactions on Speech and Audio Processing

Much is known about the design of automated systems to search broadcast news, but it has only recently become possible to apply similar techniques to large collections of spontaneous speech. This paper presents initial results from experiments with speech recognition, topic segmentation, topic categorization, and named entity detection using a large collection of recorded oral histories. The work leverages a massive manual annotation effort on 10 000 h of spontaneous speech to evaluate the degree to which automatic speech recognition (ASR)-based segmentation and categorization techniques can be adapted to approximate decisions made by human annotators. ASR word error rates near 40% were achieved for both English and Czech for heavily accented, emotional and elderly spontaneous speech based on 65-84 h of transcribed speech. Topical segmentation based on shifts in the recognized English vocabulary resulted in 80% agreement with manually annotated boundary positions at a 0.35 false alarm rate. Categorization was considerably more challenging, with a nearest-neighbor technique yielding F=0.3. This is less than half the value obtained by the same technique on a standard newswire categorization benchmark, but replication on human-transcribed interviews showed that ASR errors explain little of that difference. The paper concludes with a description of how these capabilities could be used together to search large collections of recorded oral histories.
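The topical segmentation described here, placing boundaries where the recognized vocabulary shifts, is in the spirit of TextTiling. A small sketch under that assumption; the window size and toy transcript are arbitrary, and the paper's actual segmenter differs.

```python
# Sketch: score candidate topic boundaries in a transcript by the cosine
# similarity of word distributions in adjacent windows; deep minima suggest
# topic shifts.
import math
from collections import Counter

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def boundary_scores(words, window=50):
    """Lower similarity between adjacent windows = stronger topic shift."""
    scores = []
    for i in range(window, len(words) - window):
        left = Counter(words[i - window:i])
        right = Counter(words[i:i + window])
        scores.append((i, cosine(left, right)))
    return scores

# Toy transcript: 120 war-topic words followed by 120 farm-topic words.
words = ("war camp soldier " * 40 + "farm harvest crop " * 40).split()
print(min(boundary_scores(words, window=30), key=lambda s: s[1]))  # dip near 120
```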


TIPS: A Translingual Information Processing System

March 2004 · 43 Reads · 6 Citations

Searching online information is increasingly a daily activity for many people. The multilinguality of online content is also increasing (e.g. the proportion of English web users, which has been decreasing as a fraction of the growing population of web users, dipped below 50% in the summer of 2001). To improve the ability of an English speaker to search multilingual content, we built a system that supports cross-lingual search of an Arabic newswire collection and provides on-demand translation of Arabic web pages into English. The cross-lingual search engine supports a fast search capability (sub-second response for typical queries) and achieves state-of-the-art performance in the high precision region of the result list. The on-demand statistical machine translation uses the Direct Translation model along with a novel statistical Arabic Morphological Analyzer to yield state-of-the-art translation quality. The on-demand SMT uses an efficient dynamic programming decoder that achieves reasonable speed for translating web documents.
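The cross-lingual search component can be illustrated by the standard query-translation approach: expand English query terms into weighted Arabic terms via a translation lexicon, then score documents against the weighted terms. The toy lexicon, weights, and scoring below are illustrative, not the TIPS implementation.

```python
# Sketch: dictionary-based cross-lingual retrieval via query translation.
def translate_query(query_terms, lexicon):
    """Expand each English term into (arabic_term, probability) pairs."""
    expanded = {}
    for term in query_terms:
        for tgt, prob in lexicon.get(term, []):
            expanded[tgt] = expanded.get(tgt, 0.0) + prob
    return expanded

def score_document(doc_terms, weighted_query):
    """Simple weighted term-match score; a real system would use tf-idf or BM25."""
    return sum(weight for term, weight in weighted_query.items()
               if term in doc_terms)

lexicon = {"peace": [("سلام", 0.9)], "treaty": [("معاهدة", 0.8), ("اتفاقية", 0.2)]}
query = translate_query(["peace", "treaty"], lexicon)
print(score_document({"سلام", "معاهدة"}, query))  # 1.7
```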


Citations (29)


... That is, we can represent the semantics in other languages using the corresponding AMR graph of the semantic equivalent in English. A number of cross-lingual AMR parsers (Damonte and Cohen, 2018; Blloshmi et al., 2020; Sheth et al., 2021; Procopio et al., 2021; Cai et al., 2021) have been developed to transform non-English texts into AMR graphs. Most of them rely on pre-trained multilingual language models and synthetic parallel data. ...

Reference:

Retrofitting Multilingual Sentence Embeddings with Abstract Meaning Representation
Bootstrapping Multilingual AMR with Contextual Word Alignments
  • Citing Conference Paper
  • January 2021

... Learning robust and transferable representation has been the core of language model pre-training (Peters et al., 2019). For a general-purpose PLM to generalize well on domain-specific tasks, endowing the model with domain knowledge via in-domain training remains the go-to approach (Gururangan et al., 2020; Whang et al., 2020; Zhang et al., 2020; Li et al., 2023). In this section, we show that without any in-domain pre-training, PlugLM could flexibly adapt to multiple domains with domain-specific DPM. ...

Multi-Stage Pre-training for Low-Resource Domain Adaptation

... Here, semantically related concepts are grouped at a higher level. The statistical parser examines the training sentences to find the combination of sentential clues that works best across all the training sentences [10]. ...

The IBM conversational telephony system for financial applications

... Our intuition was that a new topic might be strengthened if it were grouped with other, later, stories on the same topic. Other groups have had more success with this approach, though not to a huge degree [6]. • It seemed that additional supporting information would be helpful, so we tried to decide to which cluster a story should be assigned using the top n > 1 most similar stories rather than just the top n = 1. ...

Segmentation and Detection at IBM
  • Citing Chapter
  • January 2002

... Contrarily, the BLEU score between s 1 and s 2 will decrease for each n-gram that is in s 2 but is not in s 1 . We report cumulative BLEU scores for 1-grams to 4-grams for the model's top five candidate predictions, as they have reported correlations with human judgements (Ward and Reeder 2002). Finally, expert emergency physicians evaluated the algorithm's performance subjectively. ...

Corpus-based comprehensive and diagnostic MT evaluation: Initial Arabic, Chinese, French, and Spanish results
  • Citing Article
  • January 2002
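Cumulative BLEU-1 through BLEU-4, as described in the snippet above, can be computed with NLTK. A small sketch with an arbitrary candidate/reference pair:

```python
# Sketch: cumulative BLEU-1..4 for one candidate against one reference.
from nltk.translate.bleu_score import SmoothingFunction, sentence_bleu

reference = [["the", "cat", "sat", "on", "the", "mat"]]
candidate = ["the", "cat", "is", "on", "the", "mat"]
smooth = SmoothingFunction().method1  # avoid zero scores on short sentences

for n in range(1, 5):
    weights = tuple(1.0 / n for _ in range(n))  # uniform weights over 1..n-grams
    score = sentence_bleu(reference, candidate, weights=weights,
                          smoothing_function=smooth)
    print(f"cumulative BLEU-{n}: {score:.3f}")
```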

... Stop-word removal and Porter stemming are performed before producing a vector representation of words in these documents. The Okapi BM25 relevance scoring formula [47] is employed to generate a text vector, where tf is the term frequency of a word occurring in a shot document and the constants k1 and b are set to 2.0 and 0.75, respectively. ...

Segmentation and detection at IBM: Hybrid statistical models and two-tiered clustering for the broadcast news domain
  • Citing Article
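For reference, the standard Okapi BM25 term weight with the constants the snippet names (k1 = 2.0, b = 0.75); the citing paper's exact variant, e.g. its IDF form, may differ in details.

```python
# Sketch: Okapi BM25 score contribution of one query term for one document.
import math

def bm25_term_weight(tf, df, doc_len, avg_doc_len, num_docs, k1=2.0, b=0.75):
    """IDF times a saturating, length-normalized term-frequency component."""
    idf = math.log((num_docs - df + 0.5) / (df + 0.5) + 1.0)
    norm = tf * (k1 + 1) / (tf + k1 * (1 - b + b * doc_len / avg_doc_len))
    return idf * norm

# A term occurring 3 times in an average-length document, in 100 of 10,000 docs.
print(bm25_term_weight(tf=3, df=100, doc_len=120, avg_doc_len=120, num_docs=10_000))
```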

... In recent years, MAP adaptation has been successfully applied to n-gram language models (Bacchiani and Roark, 2003) and lexicalized PCFG models (Roark and Bacchiani, 2003). Luo et al. have proposed transformation-based approaches based on the Markov transform (Luo et al., 1999) and the Householder transform (Luo, 2000) to adapt statistical parsers. However, the optimization processes for the latter are complex and it is not clear how general they are. ...

Unsupervised adaptation of statistical parsers based on Markov transform
  • Citing Article
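The MAP adaptation the snippet refers to can be sketched as count mixing: in-domain n-gram counts are smoothed toward an out-of-domain prior, in the style of Bacchiani and Roark (2003). The parameter value below is illustrative.

```python
# Sketch: MAP-adapted n-gram probability, mixing in-domain counts with an
# out-of-domain prior model; tau controls the strength of the prior.
def map_ngram_prob(count_hw, count_h, prior_prob, tau=100.0):
    """P_adapt(w|h) = (c(h,w) + tau * P_prior(w|h)) / (c(h) + tau)."""
    return (count_hw + tau * prior_prob) / (count_h + tau)

# With little in-domain data the estimate stays close to the prior ...
print(map_ngram_prob(2, 10, prior_prob=0.05))        # ≈ 0.064
# ... and approaches the in-domain relative frequency as counts grow.
print(map_ngram_prob(2000, 10000, prior_prob=0.05))  # ≈ 0.199
```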

... Quality of translations. Before evaluating our NMT-based approach to cross-lingual text similarity, Table 1 compares the performance of Japanese-to-English and English-to-Japanese translations by our NMT models and Google Translate [37] in bilingual evaluation understudy (BLEU) scores [38]. BLEU is a commonly used MT evaluation metric, computed by comparing machine-generated translations and reference translations (the gold standard) by n-gram overlaps. ...

Corpus-based comprehensive and diagnostic MT evaluation: Initial Arabic, Chinese, French, and Spanish results

... In order to account for this variability, we developed a probabilistic log-linear model that fuses multiple similarity signals. Probabilistic log-linear models have found success in domains such as natural language processing (NLP) [24,25,26] and Information Retrieval (IR) [27], particularly point-wise learning to rank (LTR) [28]. Log-linear models permit a rich variety of feature representations to influence the probabilistic estimation of relevance, making it ideal for our endpoint ranking task. ...

Feature-based language understanding
  • Citing Conference Paper
  • September 1997
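The log-linear fusion the snippet describes reduces, in the binary-relevance case, to a logistic model over weighted similarity features. A minimal sketch with illustrative feature names and weights (in practice the weights are learned from labeled data):

```python
# Sketch: binary log-linear relevance model over fused similarity features.
import math

def log_linear_relevance(features, weights):
    """P(relevant | features) = sigmoid(w · f) in the two-class case."""
    score = sum(weights[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-score))

features = {"cosine_sim": 0.8, "title_match": 1.0, "bias": 1.0}
weights = {"cosine_sim": 2.0, "title_match": 1.5, "bias": -2.0}
print(log_linear_relevance(features, weights))  # ≈ 0.75
```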