Context in source publication

Context 1
... are the statistics from a five-word window before/after the Given words and the Similar words on Wikipedia, used to compute AP or PP. Figure 4 illustrates an example of a generated syntax tree using the linguistic resources shown in Figure 3. ...

Citations

... To build models capable of reading, deciphering, and making sense of human language, NLP researchers apply MCMC to many downstream tasks, such as text generation and sentiment analysis. For text generation, Kumagai et al. (2016) propose a probabilistic text generation model which generates human-like text from input semantic syntax and situational content. Since human-like text requires grammatically correct word alignment, they employed Monte Carlo Tree Search to optimize the structure of the generated text. ...
Article
Full-text available
Recent studies on adversarial examples expose vulnerabilities of natural language processing models. Existing techniques for generating adversarial examples are typically driven by deterministic hierarchical rules that are agnostic to the optimal adversarial examples, a strategy that often results in adversarial samples with a suboptimal balance between magnitudes of changes and attack successes. To this end, in this research we propose two algorithms, Reversible Jump Attack (RJA) and Metropolis–Hasting Modification Reduction (MMR), to generate highly effective adversarial examples and to improve the imperceptibility of the examples, respectively. RJA utilizes a novel randomization mechanism to enlarge the search space and efficiently adapts to a number of perturbed words for adversarial examples. With these generated adversarial examples, MMR applies the Metropolis–Hastings sampler to enhance the imperceptibility of adversarial examples. Extensive experiments demonstrate that RJA-MMR outperforms current state-of-the-art methods in attack performance, imperceptibility, fluency, and grammatical correctness.
... For text generation, Kumagai et al. [23] propose a probabilistic text generation model which generates human-like text from input semantic syntax and situational content. Since human-like text requires grammatically correct word alignment, they employed Monte Carlo Tree Search to optimize the structure of the generated text. ...
Preprint
Full-text available
Recent studies on adversarial examples expose vulnerabilities of natural language processing (NLP) models. Existing techniques for generating adversarial examples are typically driven by deterministic heuristic rules that are agnostic to the optimal adversarial examples, a strategy that often results in attack failures. To this end, this research proposes Fraud's Bargain Attack (FBA), which utilizes a novel randomization mechanism to enlarge the search space and enables high-quality adversarial examples to be generated with high probabilities. FBA applies the Metropolis-Hastings sampler, a member of the Markov Chain Monte Carlo family of samplers, to enhance the selection of adversarial examples from all candidates proposed by a customized stochastic process that we call the Word Manipulation Process (WMP). WMP perturbs one word at a time via insertion, removal or substitution in a context-aware manner. Extensive experiments demonstrate that FBA outperforms the state-of-the-art methods in terms of both attack success rate and imperceptibility.
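The WMP step described in this abstract maps naturally onto a short proposal function. The following is a minimal sketch, not the authors' implementation; `candidate_words` stands in for the context-aware candidate generator (e.g. a masked language model) that the actual method uses.

```python
import random

def wmp_proposal(tokens, candidate_words):
    """One WMP-style proposal step: insert, remove, or substitute a single
    word at a random position. `candidate_words` is a hypothetical stand-in
    for the context-aware candidate generator the paper describes."""
    tokens = list(tokens)
    op = random.choice(["insert", "remove", "substitute"])
    if op == "insert":
        tokens.insert(random.randrange(len(tokens) + 1),
                      random.choice(candidate_words))
    elif op == "remove" and len(tokens) > 1:
        tokens.pop(random.randrange(len(tokens)))
    else:  # substitute (also the fallback when removal would empty the text)
        pos = random.randrange(len(tokens))
        tokens[pos] = random.choice(candidate_words)
    return tokens
```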
... MCTS extends the celebrated multi-armed bandit algorithm (Auer, 2003) to tree-structured search spaces. It has already been applied to AutoML in 'classic' machine learning (Rakotoarison et al.), to natural language processing (Kumagai et al., 2016), and to NAS. But to our knowledge, our approach is the first to combine formal grammars and MCTS and apply them to NAS. ...
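For context, the bandit-to-tree extension works by applying the UCB1 rule at every internal node during MCTS selection. A minimal sketch, assuming each child is summarized by a (visits, total_reward) pair:

```python
import math

def ucb1_select(children, exploration=1.4):
    """Pick the child maximizing the UCB1 score used in MCTS selection:
    average reward plus an exploration bonus that shrinks as a child is
    visited more often. `children` is a list of (visits, total_reward)
    pairs; a real tree node would carry more state than this."""
    total_visits = sum(v for v, _ in children)
    def score(child):
        visits, reward = child
        if visits == 0:
            return float("inf")  # always try unvisited children first
        return reward / visits + exploration * math.sqrt(
            math.log(total_visits) / visits)
    return max(range(len(children)), key=lambda i: score(children[i]))
```

Raising the `exploration` constant favours rarely visited branches over those with high average reward.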
Thesis
Automated Machine Learning (AutoML) aims at rendering the application of machine learning (ML) methods as devoid of human intervention as possible. This ambitious goal has been the object of much research and engineering since the outset of ML. The objective of this thesis is to put a formal framework around this multi-faceted problem, to benchmark existing methods, and to explore new directions. To formulate the AutoML problem in a rigorous way, we first introduce a mathematical framework that: (1) categorizes all involved algorithms into three levels (alpha, beta and gamma levels); (2) concretely defines the concept of a task (especially in a supervised learning setting); (3) formally defines HPO and meta-learning; (4) introduces an any-time learning metric that allows learning algorithms to be evaluated not only by their accuracy but also by their learning speed, which is crucial in settings such as hyperparameter optimization (including neural architecture search) or meta-learning. This mathematical framework unifies different sub-fields of ML (e.g. transfer learning, meta-learning, ensemble learning), allows us to systematically classify methods, and provides us with formal tools to facilitate theoretical developments (e.g. the link to the No Free Lunch theorems) and future empirical research. In particular, it serves as the theoretical basis of a series of challenges that we organized. Indeed, our principal methodological approach to tackle AutoML with Deep Learning has been to set up an extensive benchmark, in the context of a challenge series on Automated Deep Learning (AutoDL), co-organized with ChaLearn, Google, and 4Paradigm. These challenges provide a benchmark suite of baseline AutoML solutions with a repository of around 100 datasets, over half of which are released as public datasets to enable research on meta-learning. At the end of these challenges, we carried out extensive post-challenge analyses which revealed that: (1) Winning solutions generalize to new unseen datasets, which validates progress towards a universal AutoML solution; (2) Despite our efforts to encourage generic solutions, the participants adopted specific workflows for each modality; (3) Any-time learning was addressed successfully, without sacrificing final performance; (4) Although some solutions improved over the provided baseline, it strongly influenced many; (5) Deep learning solutions dominated, but Neural Architecture Search was impractical within the time budget imposed; (6) Ablation studies revealed the importance of meta-learning, ensembling, and efficient data loading, while data augmentation is not critical. All code and data are available at autodl.chalearn.org. Besides introducing a novel general formulation of the AutoML problem and setting up and analyzing the AutoDL challenge, the contributions of this thesis include: (1) Developing our own solutions to the problems we posed to the participants. Our work GramNAS tackles the neural architecture search (NAS) problem by using a formal grammar to encode neural architectures. Two alternative search strategies have been experimentally investigated: one based on Monte-Carlo Tree Search (MCTS), which achieves 94% accuracy on the CIFAR-10 dataset, and another based on an evolutionary algorithm that beats state-of-the-art packages AutoGluon and AutoPytorch on four large, well-known datasets; (2) Laying the basis for a future challenge on meta-learning.
The AutoDL challenge series revealed the importance of meta-learning, but the challenge setting did not evaluate meta-learning properly. With an intern, we experimented with various meta-learning challenge protocols; (3) Making several theoretical contributions. During the course of this thesis, we entered several collaborations to tackle problems of transfer learning and the expressiveness of neural networks. Investigations of the Universal Approximation Theorem helped us understand the theoretical guarantees behind the Deep Learning systems we deploy.
... Monte Carlo Tree Search in NLG. Despite important successes in games [30,37], very few works have attempted to apply MCTS to NLG. Kumagai et al. [15] proposed to employ context-free grammar rules combined with an n-gram language model and to explore the space of grammatically correct texts via MCTS. In the context of commercial e-commerce agents, Mukherjee [20] proposed to optimise, with MCTS, a scoring function designed to reward grammatical correctness. ...
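As a hedged sketch of that combination (a toy grammar and a placeholder `bigram_logprob`, not the papers' actual resources), an MCTS rollout can expand CFG rules at random and score the resulting string with an n-gram model:

```python
import random

GRAMMAR = {  # toy context-free grammar; the real systems use far richer rules
    "S":  [["NP", "VP"]],
    "NP": [["the", "N"]],
    "VP": [["V", "NP"]],
    "N":  [["dog"], ["cat"]],
    "V":  [["sees"], ["chases"]],
}

def rollout(symbol="S"):
    """Randomly expand a nonterminal into a terminal string (an MCTS rollout).
    Every string produced this way is grammatical by construction."""
    if symbol not in GRAMMAR:
        return [symbol]  # terminal word
    production = random.choice(GRAMMAR[symbol])
    return [word for part in production for word in rollout(part)]

def ngram_score(tokens, bigram_logprob):
    """Score a rollout with an n-gram language model; in MCTS this value
    would be backed up the tree to bias search toward fluent expansions."""
    return sum(bigram_logprob(a, b) for a, b in zip(tokens, tokens[1:]))
```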
Preprint
Full-text available
Due to the discrete nature of words, language GANs must be optimized from rewards provided by discriminator networks via reinforcement learning methods. This is a much harder setting than for continuous tasks, which enjoy gradient flows from discriminators to generators, and it usually leads to dramatic learning instabilities. However, we claim that this can be solved by making discriminator and generator networks cooperate to produce output sequences during training. These cooperative outputs, inherently built to obtain higher discrimination scores, not only provide denser rewards for training, but also form a more compact artificial set for discriminator training, hence improving its accuracy and stability. In this paper, we show that our SelfGAN framework, built on this cooperative principle, outperforms Teacher Forcing and obtains state-of-the-art results on two challenging tasks, Summarization and Question Generation.
... MHA is an adversarial example generator based on Metropolis-Hastings (M-H) sampling (Metropolis et al., 1953; Hastings, 1970; Chib and Greenberg, 1995). M-H sampling is a classical MCMC sampling approach, which has been applied to many NLP tasks, such as natural language generation (Kumagai et al., 2016), constrained sentence generation (Miao et al., 2018), guided open story generation (Harrison et al., 2017), etc. We propose two variants of MHA, namely a black-box MHA (b-MHA) and a white-box MHA (w-MHA). ...
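The M-H sampler these works build on reduces to a simple accept/reject rule. A minimal sketch, with `log_target` and `log_proposal` as user-supplied log densities (the argument order is an assumption of this sketch):

```python
import math
import random

def mh_step(x, propose, log_target, log_proposal):
    """One Metropolis-Hastings transition. `log_proposal(a, b)` is assumed
    to return log q(a | b), the log-probability of proposing state `a`
    from state `b`; `log_target` is the unnormalized log density pi."""
    x_new = propose(x)
    # Acceptance ratio: alpha = [pi(x') q(x | x')] / [pi(x) q(x' | x)]
    log_alpha = (log_target(x_new) - log_target(x)
                 + log_proposal(x, x_new) - log_proposal(x_new, x))
    if random.random() < math.exp(min(0.0, log_alpha)):
        return x_new  # accept the candidate
    return x          # reject: the chain stays where it is
```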
Preprint
Efficiently building an adversarial attacker for natural language processing (NLP) tasks is a real challenge. Firstly, as the sentence space is discrete, it is difficult to make small perturbations along the direction of gradients. Secondly, the fluency of the generated examples cannot be guaranteed. In this paper, we propose MHA, which addresses both problems by performing Metropolis-Hastings sampling, whose proposal is designed with the guidance of gradients. Experiments on IMDB and SNLI show that our proposed MHA outperforms the baseline model on attacking capability. Adversarial training with MHA also leads to better robustness and performance.
... A common point of criticism is that there can be little effective difference between a learned metric and one that is used as an actual model for performing utterance selection. Put another way, one can easily maximize a metric by employing methods like ranking all utterances according to a learned metric, or using Monte Carlo Tree Search (Kumagai et al., 2016) during generation to naïvely optimize the automatic metric. In this manner, the problem of learning an automatic metric is difficult to disentangle from the rest of dialogue research. ...
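The "ranking" loophole mentioned above is easy to make concrete; a hypothetical one-liner (with `learned_metric` standing in for any trained scoring model) suffices to maximize the metric without modelling dialogue at all:

```python
def rerank_by_metric(candidate_utterances, learned_metric):
    """Trivially 'optimize' a learned dialogue metric: score every candidate
    with the metric itself and return the top scorer. This is why a learned
    metric is hard to disentangle from an utterance-selection model."""
    return max(candidate_utterances, key=learned_metric)
```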
Preprint
We present our view of what is necessary to build an engaging open-domain conversational agent: covering the qualities of such an agent, the pieces of the puzzle that have been built so far, and the gaping holes we have not filled yet. We present a biased view, focusing on work done by our own group, while citing related work in each area. In particular, we discuss in detail the properties of continual learning, providing engaging content, and being well-behaved -- and how to measure success in providing them. We end with a discussion of our experience and learnings, and our recommendations to the community.
... Natural Language Generation: generation of Social Media profiles using probabilistic CFGs on Facebook [31], human-like language generation using Monte Carlo Tree Search employing context-free grammars [32], sentence generation for probabilistic TAG grammars [33,34], generating narratives of SQL queries using context-free grammars [35], and an in-game text generator using expressive free-text markup and CFGs [36]. ...
Conference Paper
Full-text available
Social Media has been recognised as a supportive tool in Education, creating benefits that supplement student collaboration, class interactions and communication between instructors and students. Active informal interactions and feedback between instructors and students outside class is one of the main reasons behind Social Media pedagogy. The many innovative ways of using Social Media in Education create new opportunities, one being automatic feedback for students. Despite the prevalence of traditional email methods of providing feedback to students, many studies show that they do not check their emails as frequently as they check their Social Media accounts. In this paper, we present the automatic generation of feedback messages and tweets using a Context-free Grammar (CFG). Our design takes a class list of students and their mark sheets and automatically composes tweets (using the CFG rules) about statistical “fun facts” regarding programming problems, exercises and class performances, and private messages about individual student performances. These tweets and messages are then pushed to Twitter using the Twitter Application Programming Interface (API). A survey of 116 student participants at a South African university showed that the majority of students would love to get such notifications on Social Media rather than check their emails, and that lecturers also find this initiative to be a forward-thinking one.
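As a rough illustration of the approach (the rules and wording below are invented for this sketch, not the authors' grammar), a CFG-driven "fun fact" tweet can be composed by expanding rules until only terminals remain:

```python
import random

RULES = {  # hypothetical miniature grammar in the spirit of the paper's CFG
    "TWEET":    [["Fun fact:", "FACT", "TAIL"]],
    "FACT":     [["the class average for", "EXERCISE", "was", "SCORE"]],
    "EXERCISE": [["Lab 3"], ["the recursion exercise"]],
    "SCORE":    [["72%"], ["85%"]],
    "TAIL":     [["Keep it up!"], ["Check the course page for details."]],
}

def compose(symbol="TWEET"):
    """Expand grammar rules recursively until only terminal text remains."""
    if symbol not in RULES:
        return symbol
    return " ".join(compose(part) for part in random.choice(RULES[symbol]))

print(compose())  # e.g. "Fun fact: the class average for Lab 3 was 85% Keep it up!"
```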
... MHA is an adversarial example generator based on Metropolis-Hastings (M-H) sampling (Metropolis et al., 1953; Hastings, 1970; Chib and Greenberg, 1995). M-H sampling is a classical MCMC sampling approach, which has been applied to many NLP tasks, such as natural language generation (Kumagai et al., 2016), constrained sentence generation (Miao et al., 2018), guided open story generation (Harrison et al., 2017), etc. We propose two variants of MHA, namely a black-box MHA (b-MHA) and a white-box MHA (w-MHA). ...
Conference Paper
Full-text available
Efficiently building an adversarial attacker for natural language processing (NLP) tasks is a real challenge. Firstly, as the sentence space is discrete, it is difficult to make small perturbations along the direction of gradients. Secondly, the fluency of the generated examples cannot be guaranteed. In this paper, we propose MHA, which addresses both problems by performing Metropolis-Hastings sampling, whose proposal is designed with the guidance of gradients. Experiments on IMDB and SNLI show that our proposed MHA outperforms the baseline model on attacking capability. Adversarial training with MHA also leads to better robustness and performance.
... Given that the vectors of the events (Section 3.1) in the corpus constitute only a limited subset of values in that vector space, we should be able to generate novel events mapped from within that space once we have a way to map from narrative to surface text. In (Kumagai et al., 2016), the authors present a system that can generate language given syntactic structure as well as semantic information. Our event vector representation maintains syntactic structure data, which could be combined with that work to generate surface text. ...
Article
Recent research has revealed that natural language processing (NLP) models are vulnerable to adversarial examples. However, the current techniques for generating such examples rely on deterministic heuristic rules, which fail to produce optimal adversarial examples. In response, this study proposes a new method called the Fraud's Bargain Attack (FBA), which uses a randomization mechanism to expand the search space and produce high-quality adversarial examples with a higher probability of success. FBA uses the Metropolis-Hastings sampler, a type of Markov Chain Monte Carlo sampler, to improve the selection of adversarial examples from all candidates generated by a customized stochastic process called the Word Manipulation Process (WMP). The WMP method modifies individual words in a context-aware manner through insertion, removal, or substitution. Through extensive experiments, this study demonstrates that FBA outperforms other methods in terms of attack success rate, imperceptibility and sentence quality.