Fig 1 - uploaded by Ichiro Ide
Composition of broadcast news video 

Source publication
Conference Paper
Recently, for reusing large quantities of accumulated news video, technology for news topic searching and tracking has become necessary. Moreover, since we need to understand a certain topic from various viewpoints, we focus on identical event detection in various news programs from different countries. Currently, text information is generall...

Similar publications

Article
Recently, data on websites has come to be regarded as meaningful for analyzing our society. It is used to see trends in a specific field. In addition, it can be used to predict the next trend or current hidden needs. This is because the data is natural (unintentional). We run a multilingual machine translation service site. In fact, in mor...
Chapter
We describe experiments with Czech-to-English phrase-based machine translation. Several techniques for improving translation quality (in terms of the well-established BLEU measure) are evaluated. In total, we are able to achieve BLEU of 0.36 to 0.41 on the examined corpus of Wall Street Journal texts, outperforming all other systems evaluated on this l...
Article
In this work, we examine the quality of several statistical machine translation systems constructed on a small amount of parallel Serbian-English text. The main bilingual parallel corpus consists of about 3k sentences and 20k running words from an unrestricted domain. The translation systems are built on the full corpus as well as on a reduced cor...
Conference Paper
Evaluating translation models is a trade-off between effort and detail. On the one end of the spectrum there are automatic count-based methods such as BLEU, on the other end linguistic evaluations by humans, which arguably are more informative but also require a disproportionately high effort. To narrow the spectrum, we propose a general approach o...
Preprint
Current work on multimodal machine translation (MMT) has suggested that the visual modality is either unnecessary or only marginally beneficial. We posit that this is a consequence of the very simple, short and repetitive sentences used in the only available dataset for the task (Multi30K), rendering the source text sufficient as context. In the ge...

Citations

... Since shots are characterized by the coherence of their "low-level" visual features, segmenting videos into shots is a relatively easy task (Ogawa, et al., 2008). Moreover, shots are considered the fundamental units for organizing the content of video sequences and the basis (primitives) for semantic annotation and higher-level retrieval tasks (Hu, et al., 2011). ... [Footnote 5 in the citing article: the English term "shot" is rendered in Spanish as "toma", which the authors consider the Spanish term closest to the definition; other works have also translated it as "plano temporal", etc.]
Article
This article analyzes the different ways in which documentalists segment video in order to identify the minimal discursive unit of analysis in the documentation departments of television networks. In parallel, and in view of the digital paradigm shift in television production, it analyzes the main approaches to automatic video segmentation developed by means of artificial intelligence. Once both realities have been established, it attempts to implement the best segmentation method between the human and the automatic, so that it is useful in the documentary information systems of television broadcasters. To that end, the main lines of work in semantic video segmentation are presented.
Article
As one tool for structuring a massive volume of archived news videos based on their semantic contents, this paper proposes a method to detect scene duplicates in news videos. A scene duplicate is a pair of video segments taken at the same event from different viewpoints. Referring to the audio channel is effective for detecting scene duplicates regardless of viewpoint, but it cannot be relied on when external audio sources (e.g. narration, sound effects) overlap the original one. In contrast, the image channel can be useful in most cases, although significant differences in viewpoint affect the detection. The proposed method integrates the information from these two channels in order to improve the accuracy of scene duplicate detection in news videos. The performance of the proposed method was evaluated through an experiment with actual broadcast news videos. As a result, we obtained higher detection accuracy in both recall and precision, which confirmed the effectiveness of the proposed method.
Conference Paper
Near-duplicate video detection is becoming a core-technology for analyzing the structure of a large-scale video archive. It, however, is naturally an O(n²) problem, where n is a value proportional to the total length of an input video stream. We have previously challenged this time-consuming task by reducing the cost required for each of the O(n²) comparisons. This paper, on the other hand, proposes a method that reduces the number of comparisons by adaptively dividing the feature space according to the distribution of feature points.
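
To illustrate the general idea in the abstract above (fewer comparisons through a data-dependent division of the feature space), here is a minimal Python sketch. It is not the paper's method: the median split along alternating axes (k-d-tree style), the function names, the `leaf_size` and `radius` parameters, and the random 2-D points standing in for video features are all assumptions made for illustration. The sketch also only compares points that fall in the same cell, so it can miss near-duplicate pairs that straddle a cell boundary.

```python
import random
from itertools import combinations


def dist2(a, b):
    """Squared Euclidean distance between two feature points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def brute_force_pairs(points, radius):
    """Baseline: compare every pair, i.e. n*(n-1)/2 distance computations."""
    hits, comparisons = set(), 0
    for i, j in combinations(range(len(points)), 2):
        comparisons += 1
        if dist2(points[i], points[j]) <= radius * radius:
            hits.add((i, j))
    return hits, comparisons


def adaptive_cells(points, indices, leaf_size, depth=0):
    """Split at the median of one axis until each cell holds <= leaf_size points.

    Splitting at the median adapts cell boundaries to where the points are:
    dense regions end up with small cells, sparse regions with large ones.
    """
    if len(indices) <= leaf_size:
        return [indices]
    axis = depth % len(points[0])
    ordered = sorted(indices, key=lambda i: points[i][axis])
    mid = len(ordered) // 2
    return (adaptive_cells(points, ordered[:mid], leaf_size, depth + 1)
            + adaptive_cells(points, ordered[mid:], leaf_size, depth + 1))


def partitioned_pairs(points, radius, leaf_size=16):
    """Compare only pairs inside the same cell (approximate: misses cross-cell pairs)."""
    if leaf_size < 1:
        raise ValueError("leaf_size must be at least 1")
    hits, comparisons = set(), 0
    for cell in adaptive_cells(points, list(range(len(points))), leaf_size):
        for i, j in combinations(sorted(cell), 2):
            comparisons += 1
            if dist2(points[i], points[j]) <= radius * radius:
                hits.add((i, j))
    return hits, comparisons


if __name__ == "__main__":
    random.seed(0)
    pts = [(random.random(), random.random()) for _ in range(200)]
    _, full = brute_force_pairs(pts, 0.05)
    _, reduced = partitioned_pairs(pts, 0.05)
    print(f"brute force: {full} comparisons, partitioned: {reduced} comparisons")
```

A complete detector would additionally check neighbouring cells (or use overlapping cells) to recover pairs near cell boundaries; the sketch leaves that out to keep the comparison-count effect visible.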
Conference Paper
We propose a method that analyzes the structure of a large volume of general broadcast video data by the appearance patterns of near-duplicate video segments. We define six classification rules based on the appearance patterns of near-duplicate video segments according to their roles, and evaluate them over more than 1,000 hours of actual broadcast video data.