Jônatas Wehrmann
Pontifícia Universidade Católica do Rio Grande do Sul | PUCRS · Programa de Pós-Graduação em Ciência da Computação

Doctor of Philosophy

81,852 Reads

1,140 Citations

Skills and Expertise

Neural Networks

Classification

Natural Language Processing

Computer Vision

Pattern Recognition

Machine Learning

Neural Networks and Artificial Intelligence

August 2016 - present

PUCRS/Motorola

Porto Alegre, Brazil

Position

Researcher

Description

Deep Learning for image and text understanding.

January 2015 - August 2016

PUCRS/Motorola

Porto Alegre, Brazil

Position

Researcher

Description

Deep learning for sentiment analysis, video classification, adult content detection.

August 2016 - August 2020

Pontifícia Universidade Católica do Rio Grande do Sul

Field of study

Machine Learning / Deep Learning

January 2015 - July 2016

Pontifícia Universidade Católica do Rio Grande do Sul

Field of study

Machine Learning / Deep Learning

Publications

A character-based convolutional neural network for language-agnostic Twitter sentiment analysis

Conference Paper

Full-text available

May 2017

Adult Content Detection in Videos with Convolutional and Recurrent Neural Networks

Article

Full-text available

Jul 2017

The amount of adult content on the Internet grows daily. Much of the pornographic content is unconstrained and freely available to all users, requiring parents to make use of parental control strategies for protecting their children. Current parental control devices depend on human intervention, and hence there is a need for computational approac...

Bidirectional Retrieval Made Simple

Article

Full-text available

Apr 2018

This paper provides a very simple yet effective character-level architecture for learning bidirectional retrieval models. Aligning multimodal content is particularly challenging considering the difficulty in finding semantic correspondence between images and descriptions. We introduce an efficient character-level inception module, designed to learn...

Language-Agnostic Visual-Semantic Embeddings

Article

Full-text available

Aug 2019

This paper proposes a framework for training language-invariant cross-modal retrieval models. We also introduce a novel character-based word-embedding approach, allowing the model to project similar words across languages into the same word-embedding space. In addition, by performing cross-modal retrieval at the character level, the storage require...

Hierarchical Multi-Label Classification Networks

Conference Paper

Full-text available

Sep 2019

One of the most challenging machine learning problems is a particular case of data classification in which classes are hierarchically structured and objects can be assigned to multiple paths of the class hierarchy at the same time. This task is known as hierarchical multi-label classification (HMC), with applications in text classification, image a...

Looks Like Magic: Transfer Learning in GANs to Generate New Card Illustrations

Conference Paper

Jul 2022

COBE: A Natural Language Code Search Robustness Benchmark

Conference Paper

Jul 2022

Fig. 1. Illustrations generated by the proposed approach. Captions of...

Fig. 2. Subsets for the MTG dataset. Upper-left: samples from...

Fig. 3. Comparison of training results for the "MTG-Island" (left) and...

Fig. 6. Per epoch evolution of the generated image for the model...

Looks Like Magic: Transfer Learning in GANs to Generate New Card Illustrations

Preprint

Full-text available

May 2022

In this paper, we propose MAGICSTYLEGAN and MAGICSTYLEGAN-ADA - both incarnations of the state-of-the-art models StyleGan2 and StyleGan2 ADA - to experiment with their capacity for transfer learning into a rather different domain: creating new illustrations for the vast universe of the game "Magic: The Gathering" cards. This is a challenging task es...

Language-Agnostic Visual-Semantic Embeddings

Conference Paper

Jul 2021

We propose a framework for training language-invariant cross-modal retrieval models. We introduce four novel text encoding approaches, as well as a character-based word-embedding approach, allowing the model to project similar words across languages into the same word-embedding space. In addition, by performing cross-modal retrieval at the characte...

Efficient Neural Architecture for Text-to-Image Synthesis

Conference Paper

Jul 2020

Component Analysis for Visual Question Answering Architectures

Conference Paper

Jul 2020

Efficient Neural Architecture for Text-to-Image Synthesis

Preprint

Full-text available

Apr 2020

Text-to-image synthesis is the task of generating images from text descriptions. Image generation, by itself, is a challenging task. When we combine image generation and text, we bring complexity to a new level: we need to combine data from two different modalities. Most recent works in text-to-image synthesis follow a similar approach when it c...

Adaptive Cross-Modal Embeddings for Image-Text Alignment

Article

Full-text available

Apr 2020

Component Analysis for Visual Question Answering Architectures

Preprint

Full-text available

Feb 2020

Recent research advances in Computer Vision and Natural Language Processing have introduced novel tasks that are paving the way for solving AI-complete problems. One of those tasks is called Visual Question Answering (VQA). A VQA system must take an image and a free-form, open-ended natural language question about the image, and produce a natural l...

Adaptive Cross-modal Embeddings for Image-Text Alignment

Conference Paper

Full-text available

Nov 2019

In this paper, we introduce a novel approach for training image-text alignment models, namely ADAPT. Image-text alignment methods are often used for cross-modal retrieval, i.e., to retrieve an image given a query text, or captions that successfully label an image. ADAPT is designed to adjust an intermediate representation of instances from a modali...

Language-Agnostic Visual-Semantic Embeddings

Conference Paper

Oct 2019

Supplementary Material: Language-Agnostic Visual-Semantic Embeddings 1. Hyper-Parameters and Training Details

Experiment Findings

Full-text available

Aug 2019

We choose hyper-parameters based on the results over validation data. We employ Adam for optimization, with an initial learning rate of $6 \times 10^{-4}$, which is further decreased 10× at the 15th epoch for Flickr and M30K datasets, and at the 10th epoch for COCO and YJ Captions datasets. We use a batch size of 128, leading us to select the hard-contra...
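
The step schedule described in this excerpt can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' code: the names `INITIAL_LR`, `DECAY_EPOCH`, `learning_rate`, and the dataset keys are hypothetical, epochs are assumed to be counted from 1, and the decay is assumed to take effect from the listed epoch onward. Only the numeric values (initial rate 6e-4, 10× decay, decay epochs 15 and 10, batch size 128) come from the text.

```python
# Hypothetical names; only the numeric values are taken from the excerpt.
INITIAL_LR = 6e-4      # initial Adam learning rate
DECAY_FACTOR = 0.1     # "decreased 10x"
BATCH_SIZE = 128

# Epoch at which the learning rate is decreased, per dataset.
# Assumption: epochs are 1-indexed.
DECAY_EPOCH = {
    "flickr": 15,
    "m30k": 15,
    "coco": 10,
    "yj_captions": 10,
}


def learning_rate(dataset: str, epoch: int) -> float:
    """Return the learning rate used at a given (1-indexed) epoch."""
    if epoch >= DECAY_EPOCH[dataset]:
        return INITIAL_LR * DECAY_FACTOR
    return INITIAL_LR


if __name__ == "__main__":
    for name in DECAY_EPOCH:
        boundary = DECAY_EPOCH[name]
        print(name, learning_rate(name, boundary - 1), learning_rate(name, boundary))
```

In a deep learning framework the same behaviour would normally be delegated to a built-in step scheduler rather than written by hand; the plain function is used here only to keep the example free of dependencies.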

GADIS: A Genetic Algorithm for Database Index Selection (S)

Conference Paper

Full-text available

Jul 2019

Fast and Efficient Text Classification with Class-based Embeddings

Conference Paper

Full-text available

Mar 2019

Current state-of-the-art approaches for Natural Language Processing tasks such as text classification are either based on Recurrent or Convolutional Neural Networks. Notwithstanding, those approaches often require a long time to train, or large amounts of memory to store the entire trained models. In this paper, we introduce a novel neural network...

Attention-based Adversarial Training for Seamless Nudity Censorship

Conference Paper

Full-text available

Mar 2019

The amount of digital pornographic content over the Internet grows daily and accessing such content has become increasingly easy. Hence, there is a real need for mechanisms that can protect particularly-vulnerable audiences (e.g., children) from browsing the web. Recently, object detection methods based on deep neural networks such as CNNs have...

Real-Time Detection of Pedestrian Traffic Lights for Visually-Impaired People

Conference Paper

Jul 2018

Seamless Nudity Censorship: an Image-to-Image Translation Approach based on Adversarial Training

Conference Paper

Full-text available

Jun 2018

Easy access to the Internet and its widespread availability make it easier than ever to reach content of any kind at any moment, and while that poses several advantages, there is also the fact that sensitive audiences may be inadvertently exposed to nudity content they did not ask for. Virtually every work on nudity and pornography censorship focuses solely on p...

Bidirectional Retrieval Made Simple

Conference Paper

Jun 2018

Fast Self-Attentive Multimodal Retrieval

Conference Paper

Full-text available

Jan 2018

Multimodal bidirectional retrieval is a very challenging task that consists of semantically aligning two distinct modalities such as images and textual descriptions, allowing the retrieval of content from one of the modalities given the other. The goal is to find a common semantic space for both modalities in order to discover the correspondences b...

A Multi-Task Neural Network for Multilingual Sentiment Classification and Language Detection on Twitter

Conference Paper

Full-text available

Jan 2018

In this paper, we propose a novel approach for classifying both the sentiment and the language of tweets. Our proposed architecture comprises a convolutional neural network (ConvNet) with two distinct outputs, each of which is designed to minimize the classification error of either sentiment assignment or language identification. Results show that our...

Multi-label movie genre predictions for Rush Hour.

Multi-label movie genre predictions for Sicario.

Movie Genre Classification: A Multi-Label Approach based on Convolutions through Time

Article

Aug 2017

Jônatas Wehrmann

The task of labeling movies according to their corresponding genre is a challenging classification problem, bearing in mind that genre is an immaterial feature that cannot be directly pinpointed in any of the movie frames. Hence, off-the-shelf image classification approaches are not capable of handling this task in a straightforward fashion. Moreove...

Order embeddings and character-level convolutions for multimodal alignment

Article

Full-text available

Jun 2017

With the novel and fast advances in the area of deep neural networks, several challenging image-based tasks have been recently approached by researchers in pattern recognition and computer vision. In this paper, we address one of these tasks, which is to match image content with natural language descriptions, sometimes referred to as multimodal conten...

An Efficient Deep Neural Architecture for Multilingual Sentiment Analysis in Twitter

Conference Paper

Full-text available

May 2017

Sentiment analysis of tweets is often monolingual and the models provided by machine learning classifiers are usually not applicable across distinct languages. Cross-language sentiment classification usually relies on machine translation strategies in which a source language is translated to the desired target language. Machine translation is costl...

Leveraging deep visual features for content-based movie recommender systems

Conference Paper

Full-text available

May 2017

Hierarchical multi-label classification with chained neural networks

Conference Paper

Full-text available

Apr 2017

In classification tasks, an object usually belongs to one class within a set of disjoint classes. In more complex tasks, an object can belong to more than one class, in what is conventionally termed multi-label classification. Moreover, there are cases in which the set of classes is organised in a hierarchical fashion, and an object must be associ...

Convolutions through time for multi-label movie genre classification

Conference Paper

Apr 2017

In this paper, we explore the suitability of employing Convolutional Neural Networks (ConvNets) for multi-label movie trailer genre classification. Assigning genres to movies is a particularly challenging task because genre is an immaterial feature that is not physically present in a movie frame, so off-the-shelf image detection models cannot be ea...

(Deep) Learning from Frames

Conference Paper

Oct 2016

Learning content from videos is not an easy task and traditional machine learning approaches for computer vision have difficulties in doing it satisfactorily. However, in the past couple of years the machine learning community has seen the rise of deep learning methods that significantly improve the accuracy of several computer vision applications,...

Movie genre classification with Convolutional Neural Networks

Conference Paper

Full-text available