Figure 2 - uploaded by Zhi-Hong Deng
Schematic illustration.


Source publication
Conference Paper
Full-text available
As an effective technology for navigating a large number of images, image summarization is becoming a promising task with the rapid development of image sharing sites and social networks. Most existing summarization approaches use visual-based features for image representation without considering tag information. In this paper, we propose a...

Context in source publication

Context 1
... propose a joint framework for image summarization. The schematic illustration is shown in Figure 2. Assume α and β ∈ R^n are the indicator vectors, measuring the "representativeness" of each image from the visual and textual viewpoints respectively. ...
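The role of the two indicator vectors can be sketched numerically. The toy objective below (a centrality reward per view, a coupling penalty ||α − β||², and a quadratic shrinkage term) and every function name in it are illustrative assumptions, not the formulation from the paper:

```python
import numpy as np

def joint_scores(S_vis, S_txt, lam=1.0, mu=1.0, iters=100):
    """Alternating closed-form updates for the toy objective
    -s_v.a - s_t.b + lam*||a - b||^2 + mu*(||a||^2 + ||b||^2),
    where s_v, s_t are per-image centrality scores (similarity row sums)."""
    s_v = S_vis.sum(axis=1).astype(float)
    s_t = S_txt.sum(axis=1).astype(float)
    a = np.zeros_like(s_v)
    b = np.zeros_like(s_t)
    for _ in range(iters):
        a = (s_v + 2 * lam * b) / (2 * lam + 2 * mu)  # minimize over a, b fixed
        b = (s_t + 2 * lam * a) / (2 * lam + 2 * mu)  # minimize over b, a fixed
    return a, b

def summarize(S_vis, S_txt, k):
    """Return the k images scoring highest in the two views combined."""
    a, b = joint_scores(S_vis, S_txt)
    return np.argsort(-(a + b))[:k].tolist()
```

The coupling term pushes the two score vectors toward agreement, so an image ranks high only when it is representative in both the visual and the textual view.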

Similar publications

Article
Full-text available
We explore the use of the recently proposed "total nuclear variation" (TNV) as a regularizer for reconstructing multi-channel, spectral CT images. This convex penalty is a natural extension of the total variation (TV) to vector-valued images and has the advantage of encouraging common edge locations and a shared gradient direction among image chann...

Citations

... For example, a photo of Icelandic landscapes and a photo of British landscapes may look similar, but text tags can help us distinguish their underlying concepts. In [103], [106], [107], tags are clustered based on semantic features and images are re-assigned to each cluster according to their tags' semantic clusters. ...
Preprint
Recent years have witnessed a resurgence of knowledge engineering, featured by the fast growth of knowledge graphs. However, most existing knowledge graphs are represented with pure symbols, which hurts the machine's capability to understand the real world. The multi-modalization of knowledge graphs is an inevitable key step towards the realization of human-level machine intelligence. The results of this endeavor are Multi-modal Knowledge Graphs (MMKGs). In this survey on MMKGs constructed from texts and images, we first give definitions of MMKGs, followed by preliminaries on multi-modal tasks and techniques. We then systematically review the challenges, progress, and opportunities in the construction and application of MMKGs, respectively, with detailed analyses of the strengths and weaknesses of different solutions. We conclude this survey with open research problems relevant to MMKGs.
... These text summarization techniques have been used in diverse tasks such as micro-blog summarization [35], query-based summarization [13], timeline summarization [38,41], comparative summarization [6], medical report summarization [1,46], etc. Other than text summarization, researchers have also explored the areas of image summarization [36,45], and video summarization [44,47]. Recent years have also shown growth in the area of multi-modal summarization. ...
Conference Paper
Large amounts of multi-modal information online make it difficult for users to obtain proper insights. In this paper, we introduce and formally define the concepts of supplementary and complementary multi-modal summaries in the context of the overlap of information covered by different modalities in the summary output. A new problem statement of combined complementary and supplementary multi-modal summarization (CCS-MMS) is formulated. The problem is then solved in several steps through a novel unsupervised framework that utilizes the concepts of multi-objective optimization. An existing multi-modal summarization data set is further extended by adding outputs in different modalities to establish the efficacy of the proposed technique. The results obtained by the proposed approach are compared with several strong baselines; ablation experiments are also conducted to empirically justify the proposed techniques. Furthermore, the proposed model is evaluated separately for different modalities, both quantitatively and qualitatively, demonstrating the superiority of our approach.
... Providing accurate recommendations has been widely recognized as a cornerstone task for various applications, e.g., in the fields of recreation, shopping, academia, etc. [1,2]. Collaborative filtering (CF) has been a well-adopted method in the literature in recent years due to its simplicity and effectiveness. ...
Conference Paper
The primary purpose of a recommender system is to actively provide valuable and pertinent information according to users' preferences. Collaborative filtering (CF) is a successful approach among different recommendation techniques. However, CF-based methods usually suffer from limited performance due to the sparsity of rating data. To address such issues, utilizing auxiliary information such as content information and social networks is a very promising approach. This paper proposes an effective Bayesian hierarchical framework called Deep Trust Recommender (DTR), which aims to improve the quality of recommender systems by integrating user-item rating information and a social trust network into the embedding representation. Specifically, we first apply deep learning methods such as stacked denoising auto-encoders to learn an approximate representation of users' preferences from the users' trust network, and then train this representation collaboratively with rating information. Extensive experiments on two real-world datasets reveal that our approach performs much better than several widely adopted state-of-the-art recommendation methods.
... Besides [365,372], other works in the literature rely on external knowledge to accomplish the task of image summarization [60, 353,439]. Camargo et al. [60] combine textual and visual contents of a collection in the same latent semantic space, using Convex Non-Negative Matrix Factorization, to generate multimodal image collection summaries. ...
... They provide the knowledge about the concepts in a domain and are used to derive a set of ontology-based features for measuring the semantic similarity between images. Finally, [439] jointly leverages image content and associated tags and encodes the selection of images in two vectors, for the visual and textual domain respectively, whose non-zero elements represent the images to be included in the summary. The optimization process makes use of a similarity-inducing regularizer imposed on the two vectors to encourage the summary images to be representative in both visual and textual domains. ...
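The selection-by-sparsity idea described in this context (non-zero entries of the two vectors mark the summary images, with a similarity-inducing coupling between them) can be illustrated with a small proximal-gradient sketch. The ℓ1 weight, coupling strength, and step size below are arbitrary illustrative choices, not values from the cited work:

```python
import numpy as np

def soft_threshold(x, t):
    """Proximal operator of the l1 norm: shrink toward zero by t."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def sparse_indicators(s_v, s_t, lam=0.5, mu=1.0, tau=0.8, step=0.1, iters=300):
    """Proximal gradient on a toy objective: centrality rewards s_v, s_t,
    a similarity-inducing coupling lam*||a - b||^2, quadratic shrinkage mu,
    and an l1 penalty tau that zeroes out non-summary images."""
    a = np.zeros_like(s_v, dtype=float)
    b = np.zeros_like(s_t, dtype=float)
    for _ in range(iters):
        grad_a = -s_v + 2 * lam * (a - b) + 2 * mu * a
        grad_b = -s_t + 2 * lam * (b - a) + 2 * mu * b
        a = soft_threshold(a - step * grad_a, step * tau)
        b = soft_threshold(b - step * grad_b, step * tau)
    return a, b
```

Images whose entries stay at exactly zero in both vectors are excluded; the shared non-zero support is the summary, representative in both domains at once.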
Chapter
Full-text available
Thanks to the spread of digital photography and available devices, taking photographs has become effortless and tolerated nearly everywhere. This makes it easy for people to end up with hundreds or thousands of photos, for example, when returning from a holiday trip or taking part in ceremonies, concerts, and other events. Furthermore, photos are also taken of more mundane motives, such as food and aspects of everyday life, further increasing the number of photos to be dealt with. The decreased prices of storage devices make dumping the whole set of photos common and affordable. However, this practice frequently turns the stored collections into a kind of dark archive, which is rarely accessed and enjoyed again in the future. The big size of the collections makes revisiting them time-demanding. This suggests identifying, with the support of automated methods, the sets of most important photos within the whole collections and investing some preservation effort in keeping them accessible over time. Evaluating the importance of photos to their owners is a complex process, which is often driven by personal attachment, memories behind the content, and personal tastes that are difficult to capture automatically. Therefore, to better understand the selection process for photo preservation and future revisiting, the first part of this chapter presents a user study on a photo selection task where participants selected subsets of the most important pictures from their own collections. In the second part of this chapter, we present methods to automatically select important photos from personal collections, in light of the insights that emerged from the user study. We model a notion of photo importance driven by user expectations, which represents what photos users perceive as important and would have selected. We present an expectation-oriented method for photo selection, where information at both the photo and collection levels is considered to predict the importance of photos.
... They propose a joint optimization method for image summarization. They use an objective function to optimize representativeness of images in both visual and textual space [67]. ...
Article
Full-text available
With the advent of digital cameras, the number of digital images is on the increase. As a result, image collection summarization systems have been proposed to provide users with a condensed set of summary images as a representative set for the original high-volume image set. In this paper, a semantic knowledge-based approach for image collection summarization is presented. Although ontology- and knowledge-based systems have been applied in other areas of image retrieval and image annotation, most current image summarization systems make use of visual or numeric metrics for conducting the summarization. Also, some image summarization systems jointly model the visual data of images together with their accompanying textual or social information, while these side data are not available outside the context of web or social images. The main motivation for using an ontology approach in this study is its ability to improve the results of computer vision tasks through the additional knowledge it provides to the system. We defined a set of ontology-based features to measure the amount of semantic information contained in each image. A semantic similarity graph was built based on semantic similarities. Summary images were then selected based on graph centrality on the similarity graph. Experimental results showed that the proposed approach worked well and outperformed current image summarization systems.
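The final selection step this abstract describes (pick summary images by centrality on a semantic similarity graph) can be sketched in a few lines. The greedy neighbor down-weighting used here for diversity is an assumption of this sketch, not necessarily the paper's exact procedure:

```python
import numpy as np

def centrality_summary(S, k):
    """Greedy summary selection on a semantic similarity graph.
    S is a symmetric similarity matrix; degree centrality picks the
    most representative node, then neighbors of already-chosen nodes
    are down-weighted so the summary stays diverse."""
    S = S.astype(float).copy()
    np.fill_diagonal(S, 0.0)
    scores = S.sum(axis=1)          # degree centrality
    chosen = []
    for _ in range(k):
        i = int(np.argmax(scores))
        chosen.append(i)
        scores = scores - S[i]      # penalize items similar to i
        scores[i] = -np.inf         # never pick i twice
    return chosen
```

On a graph with two tight clusters, this picks one representative from each cluster rather than two near-duplicates from the denser one.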
... Secondly, in tag enrichment, synonyms of existing tags are likely to be selected due to their co-occurrence in other images (e.g., low-rank tag enrichment in [16]). Although the synonyms are "correct" tags, they do not provide additional textual information. ...
Conference Paper
Full-text available
Images without annotations are ubiquitous on the Internet, and recommending tags for them has become a challenging open task in image understanding. A common bottleneck of related work is the semantic gap between the image and text representations. In this paper, we bridge the gap by introducing a semantic layer, the space of word embeddings that represents the image tags as the word vectors. Our model first learns the optimal mapping from the visual space to the semantic space using training sources. Then we annotate test images by decoding the semantic representations of the visual features. Extensive experiments demonstrate that our model outperforms the state-of-the-art approaches in predicting the image tags.
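A minimal version of such a visual-to-semantic mapping can be sketched with ridge regression. The closed-form mapping and cosine decoding below are standard tools used here only as an illustration, not the paper's learned model:

```python
import numpy as np

def learn_mapping(V, W, reg=1e-2):
    """Ridge regression from visual features V (n x d) to the word
    embeddings W (n x e) of the training tags: M = (V'V + reg*I)^-1 V'W."""
    d = V.shape[1]
    return np.linalg.solve(V.T @ V + reg * np.eye(d), V.T @ W)

def predict_tags(v, M, vocab_vecs, vocab, topk=1):
    """Map a visual feature into the semantic space, then decode by
    cosine similarity against the tag vocabulary's word vectors."""
    z = v @ M
    sims = vocab_vecs @ z / (
        np.linalg.norm(vocab_vecs, axis=1) * np.linalg.norm(z) + 1e-9)
    return [vocab[i] for i in np.argsort(-sims)[:topk]]
```

The key point of the semantic-layer idea survives even in this toy form: test images are annotated by decoding their mapped representation in the word-vector space, so tags never seen together with a test image can still be recovered.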
... Rank models merely provide important sentences by score, and hence are not able to cover all aspects of the original corpus. Inspired by data compression and reconstruction (Simon, Snavely, and Seitz 2007a; Yang et al. 2013; Yu et al. 2014), a good summary should recover the whole documents, or in other words, reconstruct the whole documents. Based on this assumption, we think a good summary should meet three key requirements: Coverage, Sparsity, and Diversity. ...
Article
Multi-document summarization is of great value to many real-world applications since it can help people get the main ideas within a short time. In this paper, we tackle the problem of extracting summary sentences from multi-document sets by applying sparse coding techniques and present a novel framework for this challenging problem. Based on the data reconstruction and sentence denoising assumption, we present a two-level sparse representation model to depict the process of multi-document summarization. Three requisite properties are proposed to form an ideal reconstructable summary: Coverage, Sparsity, and Diversity. We then formalize the task of multi-document summarization as an optimization problem according to the above properties, and use a simulated annealing algorithm to solve it. Extensive experiments on the summarization benchmark data sets DUC2006 and DUC2007 show that our proposed model is effective and outperforms state-of-the-art algorithms.
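The reconstruction intuition behind this line of work (a good summary should let you rebuild the whole document set) can be sketched with a greedy least-squares selector. This simple greedy search stands in for the paper's two-level sparse model and simulated annealing, purely for illustration:

```python
import numpy as np

def reconstruction_error(X, idx):
    """Least-squares error of reconstructing every sentence vector in X
    as a linear combination of the selected sentences X[idx]."""
    B = X[idx]                                       # candidate summary
    C, *_ = np.linalg.lstsq(B.T, X.T, rcond=None)    # coefficients
    return float(np.linalg.norm(X - (B.T @ C).T))

def greedy_summary(X, k):
    """Greedily add the sentence that most reduces reconstruction error."""
    chosen = []
    for _ in range(k):
        rest = [i for i in range(len(X)) if i not in chosen]
        best = min(rest, key=lambda i: reconstruction_error(X, chosen + [i]))
        chosen.append(best)
    return chosen
```

Coverage falls out naturally: a redundant sentence adds nothing to the span of the selected basis, so the greedy step prefers sentences covering directions not yet represented.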
... Due to the abundance of online resources like articles, movies, and music, tagging systems (Yu et al. 2014) have become increasingly important for organizing and indexing them. For example, CiteULike 1 uses tags to help categorize millions of articles online and Flickr 2 allows users to use tags to organize their photos. ...
Conference Paper
Full-text available
Tag recommendation has become one of the most important ways of organizing and indexing online resources like articles, movies, and music. Since tagging information is usually very sparse, effective learning of the content representation for these resources is crucial to accurate tag recommendation. Recently, models proposed for tag recommendation, such as collaborative topic regression and its variants, have demonstrated promising accuracy. However, a limitation of these models is that, by using topic models like latent Dirichlet allocation as the key component, the learned representation may not be compact and effective enough. Moreover, since relational data exist as an auxiliary data source in many applications, it is desirable to incorporate such data into tag recommendation models. In this paper, we start with a deep learning model called stacked denoising autoencoder (SDAE) in an attempt to learn more effective content representation. We propose a probabilistic formulation for SDAE and then extend it to a relational SDAE (RSDAE) model. RSDAE jointly performs deep representation learning and relational learning in a principled way under a probabilistic framework. Experiments conducted on three real datasets show that both learning more effective representation and learning from relational data are beneficial steps to take to advance the state of the art.
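A single denoising-autoencoder layer, the building block that SDAE stacks, can be sketched in a few lines of NumPy. The tied weights, masking-noise rate, and learning rate below are illustrative choices, not the RSDAE model itself:

```python
import numpy as np

def train_dae(X, hidden=4, noise=0.3, lr=0.5, epochs=300, seed=0):
    """One denoising-autoencoder layer with tied weights and sigmoid
    units: corrupt the input with masking noise, then train to
    reconstruct the clean input by gradient descent on squared error."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W = rng.normal(0.0, 0.1, (d, hidden))
    b = np.zeros(hidden)                        # encoder bias
    c = np.zeros(d)                             # decoder bias
    sig = lambda z: 1.0 / (1.0 + np.exp(-z))
    losses = []
    for _ in range(epochs):
        Xn = X * (rng.random(X.shape) > noise)  # masking corruption
        H = sig(Xn @ W + b)                     # encode corrupted input
        R = sig(H @ W.T + c)                    # decode with tied weights
        losses.append(0.5 * np.mean((R - X) ** 2))
        dZ2 = (R - X) * R * (1 - R) / n         # backprop through decoder
        dZ1 = (dZ2 @ W) * H * (1 - H)           # backprop through encoder
        W -= lr * (Xn.T @ dZ1 + dZ2.T @ H)      # tied-weight gradient
        b -= lr * dZ1.sum(axis=0)
        c -= lr * dZ2.sum(axis=0)
    return (lambda Z: sig(Z @ W + b)), losses
```

Stacking means training a second such layer on the codes produced by the first; the paper's probabilistic formulation goes further by treating these weights within a Bayesian framework and tying them to relational data.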
... Rank models merely provide important sentences by score, and hence are not able to cover all aspects of the original corpus. Inspired by data compression and reconstruction (Simon, Snavely, and Seitz 2007a; Yang et al. 2013; Yu et al. 2014), a good summary should recover the whole documents, or in other words, reconstruct the whole documents. Based on this assumption, we think a good summary should meet three key requirements: Coverage, Sparsity, and Diversity. ...
Conference Paper
Full-text available
Multi-document summarization is of great value to many real-world applications since it can help people get the main ideas within a short time. In this paper, we tackle the problem of extracting summary sentences from multi-document sets by applying sparse coding techniques and present a novel framework for this challenging problem. Based on the data reconstruction and sentence denoising assumption, we present a two-level sparse representation model to depict the process of multi-document summarization. Three requisite properties are proposed to form an ideal reconstructable summary: Coverage, Sparsity, and Diversity. We then formalize the task of multi-document summarization as an optimization problem according to the above properties, and use a simulated annealing algorithm to solve it. Extensive experiments on the summarization benchmark data sets DUC2006 and DUC2007 show that our proposed model is effective and outperforms state-of-the-art algorithms.