Workflow diagram of our proposed method

Source publication

Multi-scale Attention U-Net (MsAUNet): A Modified U-Net Architecture for Scene Segmentation

Preprint

Sep 2020

Soham Chattopadhyay
Hritam Basak

Despite the growing success of convolutional neural networks (CNNs) in scene segmentation in recent years, the standard models lack some important features, which can result in sub-optimal segmentation outputs. The widely used encoder-decoder architecture extracts and uses several redundant and low-level features at different ste...
