Veselin Stoyanov's scientific contributions

Publications (24)

Preprint
Full-text available
Current methods for few-shot fine-tuning of pretrained masked language models (PLMs) require carefully engineered prompts and verbalizers for each new task to convert examples into a cloze-format that the PLM can score. In this work, we propose PERFECT, a simple and efficient method for few-shot fine-tuning of PLMs without relying on any such handc...
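To make the setup concrete, the sketch below shows the kind of hand-engineered prompt and verbalizer that PERFECT is designed to remove: an example is rewritten as a cloze template and a masked PLM scores the verbalizer tokens at the mask position. The checkpoint, template, and verbalizer words are illustrative assumptions, not the paper's configuration.

```python
# Illustrative sketch of prompt-and-verbalizer few-shot scoring with a masked PLM.
# Model name, template, and verbalizers are assumptions for demonstration only.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForMaskedLM.from_pretrained("roberta-base")

text = "The movie was surprisingly good."
template = f"{text} It was {tokenizer.mask_token}."           # hand-crafted prompt
verbalizers = {"positive": " great", "negative": " terrible"}  # hand-crafted verbalizer

inputs = tokenizer(template, return_tensors="pt")
mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero()[0, 1]

with torch.no_grad():
    logits = model(**inputs).logits[0, mask_pos]

# Score each label by the logit of its verbalizer token at the mask position.
scores = {label: logits[tokenizer.encode(tok, add_special_tokens=False)[0]].item()
          for label, tok in verbalizers.items()}
print(max(scores, key=scores.get))
```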
Preprint
Full-text available
Retrieving relevant contexts from a large corpus is a crucial step for tasks such as open-domain question answering and fact checking. Although neural retrieval outperforms traditional methods like tf-idf and BM25, its performance degrades considerably when applied to out-of-domain data. Driven by the question of whether a neural retrieval model ca...
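The contrast with lexical methods can be illustrated with a minimal dense-retrieval sketch: queries and passages are embedded by an off-the-shelf bi-encoder and ranked by inner product, rather than by term overlap as in tf-idf or BM25. The encoder checkpoint and toy corpus are assumptions, not the paper's setup.

```python
# Minimal dense-retrieval sketch: rank passages by embedding similarity to the query.
import numpy as np
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed off-the-shelf bi-encoder

passages = [
    "The Eiffel Tower is located in Paris.",
    "BM25 is a classical lexical ranking function.",
    "Neural retrievers embed text into dense vectors.",
]
query = "Where is the Eiffel Tower?"

p_emb = encoder.encode(passages, normalize_embeddings=True)
q_emb = encoder.encode([query], normalize_embeddings=True)[0]

scores = p_emb @ q_emb          # cosine similarity via normalized dot product
print(passages[int(np.argmax(scores))])
```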
Preprint
The state of the art on many NLP tasks is currently achieved by large pre-trained language models, which require a considerable amount of computation. We explore a setting where many different predictions are made on a single piece of text. In that case, some of the computational cost during inference can be amortized over the different tasks using...
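A toy sketch of the amortization idea, under an assumed architecture and assumed dimensions: the expensive encoder runs once per text, and several lightweight task heads reuse the cached representation.

```python
# Toy sketch of amortizing encoder cost across multiple predictions on one text.
# Architecture, dimensions, and task heads are illustrative assumptions.
import torch
import torch.nn as nn

encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=256, nhead=4, batch_first=True), num_layers=4)
heads = {"sentiment": nn.Linear(256, 2),
         "topic": nn.Linear(256, 10),
         "toxicity": nn.Linear(256, 2)}

tokens = torch.randn(1, 128, 256)            # stand-in for the embedded input text

with torch.no_grad():
    shared = encoder(tokens).mean(dim=1)     # expensive encoding, done once

    # Cheap per-task predictions reuse the cached representation.
    predictions = {task: head(shared).argmax(-1) for task, head in heads.items()}

print(predictions)
```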
Conference Paper
Full-text available
We study the problem of multilingual masked language modeling, i.e. the training of a single model on concatenated text from multiple languages, and present a detailed study of several factors that influence why these models are so effective for cross-lingual transfer. We show, contrary to what was previously hypothesized, that transfer is possible...
Preprint
Recent breakthroughs of pretrained language models have shown the effectiveness of self-supervised learning for a wide range of natural language processing (NLP) tasks. In addition to standard syntactic and semantic NLP tasks, pretrained models achieve strong improvements on tasks that involve real-world knowledge, suggesting that large-scale langu...
Preprint
This paper shows that pretraining multilingual language models at scale leads to significant performance gains for a wide range of cross-lingual transfer tasks. We train a Transformer-based masked language model on one hundred languages, using more than two terabytes of filtered CommonCrawl data. Our model, dubbed XLM-R, significantly outperforms m...
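The released XLM-R checkpoints can be loaded through Hugging Face transformers for multilingual masked-token prediction; the snippet below is an assumption about tooling for trying the model, not part of the paper itself.

```python
# Quick sketch: multilingual fill-mask with the public xlm-roberta-base checkpoint.
from transformers import pipeline

fill = pipeline("fill-mask", model="xlm-roberta-base")

# The same model handles masked prediction across languages.
print(fill("The capital of France is <mask>.")[0]["token_str"])
print(fill("La capitale de la France est <mask>.")[0]["token_str"])
```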
Preprint
Full-text available
We study the problem of multilingual masked language modeling, i.e. the training of a single model on concatenated text from multiple languages, and present a detailed study of several factors that influence why these models are so effective for cross-lingual transfer. We show, contrary to what was previously hypothesized, that transfer is possible...
Preprint
The scarcity of labeled training data often prohibits the internationalization of NLP models to multiple languages. Recent developments in cross-lingual understanding (XLU) have made progress in this area, trying to bridge the language barrier using language-universal representations. However, even if the language problem were resolved, models traine...
Preprint
Language model pretraining has led to significant performance gains but careful comparison between different approaches is challenging. Training is computationally expensive, often done on private datasets of different sizes, and, as we will show, hyperparameter choices have significant impact on the final results. We present a replication study of...
Preprint
Traditional language models are unable to efficiently model entity names observed in text. All but the most popular named entities appear infrequently in text, providing insufficient context. Recent efforts have recognized that context can be generalized between entity names that share the same type (e.g., person or location) and have...
Preprint
State-of-the-art natural language processing systems rely on supervision in the form of annotated data to learn competent models. These models are generally trained on data in a single language (usually English), and cannot be directly used beyond that language. Since collecting data in every language is not realistic, there has been a growing inte...
Preprint
Neural Machine Translation (NMT) typically leverages monolingual data in training through backtranslation. We investigate an alternative simple method to use monolingual data for NMT training: We combine the scores of a pre-trained and fixed language model (LM) with the scores of a translation model (TM) while the TM is trained from scratch. To ach...
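One way to read the described combination is as a log-linear fusion of the translation model's and the fixed language model's next-token distributions; the weighted sum below is a common variant and an assumption about the exact formulation, not necessarily the paper's.

```python
# Sketch of fusing a fixed LM with a translation model at next-token prediction time.
# The weighting scheme (lam) is an assumption, not the paper's exact formulation.
import torch
import torch.nn.functional as F

def fused_next_token_logprobs(tm_logits, lm_logits, lam=0.5):
    """Combine translation-model and fixed language-model scores over the vocabulary."""
    tm_logp = F.log_softmax(tm_logits, dim=-1)
    lm_logp = F.log_softmax(lm_logits, dim=-1)
    return F.log_softmax(tm_logp + lam * lm_logp, dim=-1)

vocab = 32000
fused = fused_next_token_logprobs(torch.randn(1, vocab), torch.randn(1, vocab))
print(fused.argmax(-1))   # most likely next target token under the fused score
```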

Citations

... If irrelevant images and texts are retrieved, the accuracy of the model-generated answers will decrease significantly. Recent research has shown that the parameters of large pre-trained models carry world knowledge [7,8], which can be applied to downstream tasks through appropriate prompt learning [9,10]. Mining question-related background knowledge from the information stored in pre-trained model parameters can mitigate the issue of answer generation being heavily dependent on the quality of the retrieved information sources. ...
... Dense passage retriever. The dense retrieval sub-system is a neural network that learns from Wikipedia data to encode the citation context into a dense query vector [22][23][24][25][26] . This vector is then matched against the vector encodings of all passages in Sphere and the closest ones are returned. ...
... The noising-based methods add discrete or continuous noise to texts, which has little effect on semantics (Wei and Zou 2019; Coulombe 2018). The sampling-based methods sample novel data under the current data distributions (Kang et al. 2018; Du et al. 2021). ...
... Previous efforts attempt to map language inputs into intermediate distributed features, such as word embeddings (Mikolov et al., 2013; Kiros et al., 2015; Pennington et al., 2014; Peters et al., 2018), sentence embeddings (Conneau et al., 2017; Reimers and Gurevych, 2019; Gao et al., 2021), and document embeddings (Dai et al., 2015; Wu et al., 2018), which are further used as inputs to downstream task-specific models that generate the final task-specific document representations. Furthermore, some researchers have made preliminary explorations into decoupling document encoding from tasks by freezing part of the layers of document encoders (Du et al., 2020; Saad-Falcon et al., 2022). However, these works achieve only a semi-decoupling of document encoding from tasks and can only be used in the plugging-during-tuning setting. ...
... The fused embeddings are then passed through a series of transformer encoder layers to pool the embedding, denoted emb_pool. Finally, the pooled embedding is passed to an autoregressive pointer-generator decoder, which produces tokens that are either 'pointers' to input tokens or newly 'generated' tokens (Aghajanyan et al., 2020; See et al., 2017). ...
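For reference, a compact sketch of the pointer-generator output distribution (See et al., 2017) that the excerpt refers to: a generation distribution over the vocabulary is mixed with a copy distribution over input positions, weighted by a learned gate. Shapes and values below are illustrative assumptions.

```python
# Pointer-generator mixture: p_final = p_gen * P_vocab + (1 - p_gen) * copy(attention).
import torch

vocab_size, src_len = 100, 6
p_vocab = torch.softmax(torch.randn(vocab_size), dim=-1)   # "generate" distribution
attention = torch.softmax(torch.randn(src_len), dim=-1)    # "point/copy" distribution
src_token_ids = torch.randint(0, vocab_size, (src_len,))   # vocab IDs of input tokens
p_gen = torch.sigmoid(torch.randn(()))                     # learned mixing gate

p_final = p_gen * p_vocab
p_final = p_final.index_add(0, src_token_ids, (1 - p_gen) * attention)
print(p_final.argmax())   # either a generated token or a copied input token
```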
... To explore the potential of using transformers for NER, we adapted an existing NER system (Luoma et al., 2023). Specifically, we built upon the RoBERTa-large-PM-M3-Voc model (RoBERTa-bio hereafter), which has demonstrated the best performance in several NER tasks (Lewis et al., 2020;Miranda-Escalada et al., 2023). We trained the model for multiclass classification of the nine categories of LSFs using LSF200 without OOC annotations. ...
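A minimal sketch of that token-classification setup, with the generic roberta-large checkpoint standing in for RoBERTa-large-PM-M3-Voc (whose hub identifier is not given in the excerpt) and an assumed label scheme of nine LSF categories plus an outside tag.

```python
# Sketch of multiclass token classification for NER with a RoBERTa encoder.
# Checkpoint, label count, and example sentence are assumptions.
from transformers import AutoTokenizer, AutoModelForTokenClassification

num_labels = 10  # assumed: nine LSF categories + outside tag
tokenizer = AutoTokenizer.from_pretrained("roberta-large")
model = AutoModelForTokenClassification.from_pretrained("roberta-large",
                                                        num_labels=num_labels)

inputs = tokenizer("Patient reports difficulty walking and social isolation.",
                   return_tensors="pt")
pred = model(**inputs).logits.argmax(-1)   # per-token label IDs (head untrained here)
print(pred)
```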
... Widely adopted voice assistive technologies such as Siri and Alexa exemplify the widespread acceptance of language-based interaction. Recent advancements in natural language processing have significantly expanded the capabilities of speech interfaces, encompassing understanding spoken text [2], [3], [4], processing natural language [5], [6], [7], generating text [8], [9], and producing spoken words [10], [11], [12]. Within the realm of virtual reality (VR), speech technologies are gaining traction, being employed in conjunction with gestures for 3D scene navigation [13], [14], [15], [16], multimodal data exploration [17], [18], and as a control feature in various systems [19], [20]. ...
... We used the OSM Nominatim API (Clemens 2015) to geocode the extracted toponyms. • For sentiments: these were derived using the XLM-RoBERTa language model (Conneau et al. 2019). More precisely, we used the version that was fine-tuned for sentiment analysis (Barbieri et al. 2022). ...
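A short sketch of that sentiment step; the specific checkpoint identifier below (cardiffnlp/twitter-xlm-roberta-base-sentiment) is an assumption about which Barbieri et al. (2022) model was used.

```python
# Sketch of multilingual sentiment analysis with an XLM-RoBERTa model fine-tuned
# for sentiment; the exact model ID is an assumption.
from transformers import pipeline

sentiment = pipeline("sentiment-analysis",
                     model="cardiffnlp/twitter-xlm-roberta-base-sentiment")

print(sentiment("I love this place!"))         # e.g. positive
print(sentiment("Ce quartier est horrible."))  # multilingual input, e.g. negative
```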
... Recently, pre-trained models such as BERT Devlin et al. [2019] have changed the landscape of cross-lingual representation research. These models have enabled the generation of sentence encoders on multilingual unlabeled corpora without the need for parallel data Conneau et al. [2020], Feng et al. [2022], Goswami et al. [2021], Litschko et al. [2022]. Concurrently, certain studies have leveraged pre-trained multilingual transformers for cross-lingual information retrieval (IR). ...
... In a similar manner, Ref. [25] suggests utilizing knowledge bases to provide distant labels, which can then be employed to enhance the training of supervised Named Entity Recognition (NER) models. Ref. [26] utilizes a knowledge base to train a NER model called KALM, which distinguishes whether a term in a sentence is derived from the knowledge base or a conventional dictionary. ...