Structure of Gram layer: The feature map, the output of the convolution layer, is used as the input to the Gram layer. The Gram layer uses the input feature map to create a gram matrix and uses it to create a style weight. The style weight is multiplied by the input feature map so that the weighted input tensor becomes the output of the Gram layer.

Structure of Gram layer: The feature map, the output of the convolution layer, is used as the input to the Gram layer. The Gram layer uses the input feature map to create a gram matrix and uses it to create a style weight. The style weight is multiplied by the input feature map so that the weighted input tensor becomes the output of the Gram layer.

Source publication
Article
Full-text available
As the scale of the film industry grows, the demand for well-established movie databases is also growing. The genre of a movie supplies information on its overall content and has multiple values. Therefore, it should be well classified utilizing the characteristics of movies, without omissions in the database. In this study, we extract the optimal...

Context in source publication

Context 1
... Gram layer is a single layer, broadly divided into two steps, which can be easily added to the bottleneck block and basic block of VGGNet [36] or ResNet [38]. Fig. 4 illustrates the overall internal structure of the Gram layer. The operations inside the Gram layer are two-fold. The first step is to create a gram matrix with an input feature map X l to extract the style features from the input feature maps X l . X l ∈ R B×C×H×W is a feature map that is calculated by l's convolution layers. B is the ...

Similar publications

Technical Report
Full-text available
Opinion analysis has become a flourishing frontier as of late. In this paper, we exhaustively study movie reviews from a popular online database. We randomly sample more than 1000 reviews with titles to train our model. It is capable of suggesting words during the process of appraising a film. It can intelligently anticipate the words that an appra...
Article
Full-text available
The aim of this paper is the definition of a daily index representing the risk-return on investments in the American film industry. The index should be used to predict the riskiness and the expected return of movie projects at the level of the overall industry and then to determine a premium for insurance for such an investment. Such an index can i...

Citations

... According to these studies, deep learning models have made significant progress in their ability to extract intricate and sophisticated information from videos. [23] suggested a crucial CNN layer that gathers information on style features from posters. They determined the relationship between the movie's genre and the aesthetic elements of the posters. ...
... К перечню «гибридных» методов мы можем отнести: аналитику визуальных изображений -промопостеров [Wi et al, 2020]; анализ маркетинговой среды и прогнозов показателей кинопроката с помощью рекуррентных нейронных сетей [Yu, Liu, 2022]; анализ неформальной коммуникации и «сарафанного радио» в отношении информации о будущих фильмах [Yun Kyung, 2017]. ...
Article
The subject of the research is the results of distribution of Russian national films. The purpose of the study is to classify projects according to the principle of their success/failure at the box office and predict the characteristics of the box office. The objectives of the study are to create algorithms for selecting (classifying) potentially successful projects into an investment portfolio and predicting (regression) rental characteristics: the number of views, payback, viewer rating. The technique is based on the application of ensemble machine learning models. The empirical base of the study is the entire set of Russian national films in distribution from 2004 to April 2022 (N=1469) and from May 2022 to April 2023 (N=194). Achieved accuracy of 0.95 and 0.89 for two and four class classification and high performance ROC_AUC = 0.97 for two class model and 0.94 – 0.98 for four class model. More complex metamodels (superensembles) can achieve an accuracy of 0.97-0.98 for a two-class classification and 0.96 for a four-class one. Complex regression metamodels predict the absolute values of payback, fees, views with a coefficient of determination (R2) in the range of 0.97-0.98 using synthetic data. As a result, it became possible to form investment portfolios of film projects with an annual historical return of up to 139%. The scope of application is to ensure the selection of films for investment "portfolios of film projects" of state (Ministry of Culture, Cinema Fund) and private investors. Machine learning models can be adapted to the conditions of global and foreign markets by increasing the number of model features, expanding the arsenal of machine learning methods, including the analysis of texts, images, videos, and user data of social networks.
... Movie genre prediction studies are made by using visual, auditory, and textual features [9]. Classification of posters with visual features according to their genres by a machine-learning algorithm is one of these studies [10]. Posters are important because they create a first impression of the movie content and genre. ...
... Wi et al. [10] carried out a multilabel genre classification study from movie posters using the Gram layer in convolutional neural networks. ResNet architecture was used as the reference model. ...
Article
Full-text available
In this study, transfer learning has been used to overcome multilabel classification tasks. As a case study, movie genre classification by using posters has been chosen. Six state-of-the-art pretrained models, VGG16, ResNet, DenseNet, Inception, MobileNet, and ConvNeXt, have been employed for this experiment. The movie posters have been obtained from Internet Movie Database (IMDB). The dataset has been divided using an iterative stratification technique. A sequence of dense layers has been added on top of each model and these models have been trained and fine-tuned. All the results of the models compared considered accuracy, loss, Hamming loss, F1-score, precision, and AUC metrics. When the metrics used were evaluated, the most successful result regarding accuracy has been obtained from the modified DenseNet architecture at 90%. Also, the ConvNeXt, which is the newest model among all, performed quite satisfactorily, reaching over 90% accuracy. This study uses an iterative stratification method to split an unbalanced dataset which provides more reliable results than the classical splitting method which is the common method in the literature. Also, the feature extraction capabilities of the six pretrained models have been compared. The outcome of this study shows promising results regarding multilabel classification. As for future work, it is planned to enhance this study by using natural language processing and ensemble methods.
... The genre of a film is an essential indicator of its central themes and serves many functions. This factor heavily impacts moviegoers' preferences (Wi et al., 2020). Accurate movie genre classification has been added to the recommended systems (Kundalia et al., 2020). ...
... Wi et al. (Wi et al., 2020) explored using a Gram layer in a CNN and extracted the ideal details and traits from movie posters to help classify films into genres. 2664 movie film posters with 12 multi-genres comprised the dataset in the experiment. ...
Article
Full-text available
India experienced a 23% rise in podcast listening after the Covid‐19 pandemic. The pandemic and screen fatigue led people to seek their favourite old audio podcasts. Podcast genre classification allows listeners to compile a playlist of their favourite tracks; it also helps podcast streaming services provide recommendations to users based on the genre of the podcasts they enjoy. Since the COVID‐19 pandemic, the need for educational content in all forms, including podcasts, has skyrocketed, making it even more crucial to anticipate the genre of educational podcasts. Educational podcasts are a sub‐genre of the broader education genre and typically involve audio recordings of discussions, lectures, or interviews on educational topics. Education podcast genre prediction is required to efficiently classify and arrange educational content and make it simpler for listeners to access and absorb pertinent information. This study focuses on Podcast Genre Prediction, specifically for the Hindi language. In this study, our developed PodGen dataset was used, which consists of 550 five‐minute podcasts with 26,867 sentences, where every podcast was manually annotated into one of the four genre categories (Horror, Motivational, Crime, and Romance). The performance comparison of state‐of‐the‐art machine learning techniques on the PodGen dataset was used to demonstrate accuracy. The best performance on testing data was observed in the Support Vector Classifier model with balanced accuracy: 82.42%, precision (weighted): 83.09%, recall (weighted): 82.42%, and F1 score (weighted): 82.39%.
... Little research has been conducted concerning movie posters to assess their value in predicting box office income. Most of the studies mainly deal with movie genre classification using posters [3,7,14,15,22,25,30,36,37,41] and extract the most useful information and traits from movie posters to aid in the genre classification of films. Reference [41] is the first paper to recommend using movie poster features and movie metadata as a multimodal neural network to predict movie box-office revenue. ...
Article
Full-text available
Demand forecasting a film’s opening weekend box office revenue is a difficult and complex task that decision-makers face due to a lack of historical data and various complex factors. We proposed a novel Deep Multimodal Feature Classifier Neural Network model (DMFCNN) for predicting a film’s opening weekend box office revenue using deep multimodal visual features extracted from movie posters and movie metadata. DMFCNN is an end-to-end predictive model that fuses two different feature classifiers’ predictive power in estimating the movie box office revenue. Initially, a pre-trained residual convolutional neural network (ResNet50) architecture using transfer learning techniques extracts visual, and object representations learned from movie posters. The movie posters’ discriminative and financial success-related features are combined with other movie metadata to classify the movie box office revenue. The proposed DMFCNN aided in developing a robust predictive model that jointly learns and defines useful revenue-related poster features and objects semantics, which strongly correlates with movie box office revenue and aesthetic appearance. Although our main task was classification, we also analyzed regressions between our exogenous variables as a regularizer to avoid the risk of overfitting. We evaluated DMFCNN’s performance and compared it to various state-of-the-art models on the Internet Movie Database by collecting 49,857 movies metadata and posters from 2006 to 2019. The learned information on movie posters and predicted outcomes outperformed existing models, achieving 59.30% prediction accuracy. The proposed fusion strategy outperformed the existing fusion schemes in precision, Area Under Cover, sensitivity, and specificity by achieving 80%, 81%, 79%, and 78%, respectively.
... The genre of a film is an essential indicator of its central themes and serves many functions. Moviegoers' preferences are heavily impacted by this factor 14 . Accurate movie genre classification has been added to the recommended systems 15 . ...
Preprint
India sees a 23% rise in podcast listening after the Covid-19 pandemic. The pandemic and screen fatigue led people to seek out old favourite audio podcasts. Podcast genre classification allows listeners to compile a playlist of their favourite tracks; this also helps podcast streaming services provide recommendations to users based on the genre of the podcasts they enjoy. After the COVID-19 pandemic, the need for educational content in all forms, including podcasts, has skyrocketed, making it even more crucial to anticipate the genre of educational podcasts. Educational podcasts are a sub-genre of the broader education genre and typically involve audio recordings of discussions, lectures, or interviews on educational topics. Education podcast genre prediction is required to efficiently classify and arrange educational content and make it simpler for listeners to access and absorb pertinent information. This study focuses on Podcast Genre Prediction, specifically for the Hindi language. In our study, our developed PodGen dataset is used, which consists of 550 podcasts of 5 minutes each and have a total of 26,867 sentences, where every podcast has been manually annotated into one of the four genre categories (Horror, Motivational, Crime, and Romance). The performance comparison of state-of-the-art machine learning techniques on the PodGen dataset is used to demonstrate accuracy. The best performance was observed in the case of the Support Vector Classifier model with balanced accuracy:82.42%, precision (weighted):83.09%, recall (weighted):82.42%, and F1 score(weighted):82.39% on testing data.
... Therefore, touching people's emotions through images is a prominent feature of movie posters. It shows that the organic unity of the content and form of the movie poster image is the key to realize the function of the movie poster [6]. Tulbure et al. (2022) revealed that accurate image recognition is of great research significance, and image recognition technology plays a vital part in medicine, aerospace, military, industry and agriculture, and many other fields. ...
Article
Full-text available
With the development of science and technology and the continuous changes of social environment, the development prospect of traditional cinema is worrying. This work aims to improve the publicity effect of movie posters and optimize the marketing efficiency of movie posters and promote the development of film and television industry. First, the design concept of high grossing movie posters is discussed. Then, the concept of movie poster analysis based on Deep Learning (DL) technology is analyzed under Big Data Technology. Finally, a movie poster analysis model is designed based on Convolutional Neural Network (CNN) technology under DL and is evaluated. The results demonstrate that the learning curve of the CNN model reported here is the best in the evaluation of model performance in movie poster analysis. Besides, the learning rate of the model is basically stable when the number of iterations is about 500. The final loss value is around 0.5. Meanwhile, the accuracy rate of the model is also stable at the number of iterations of about 500, and the accuracy rate of the model is around 0.9. In addition, the recognition accuracy of the model designed here in movie poster classification recognition is generally between 60% and 85% in performing theme, style, composition, color scheme, set, and product recognition of movie posters. Moreover, the evaluation of the model in the movie poster style composition suggests that the style composition of movie poster production dramatically varies in different films, in which movie posters focus most on movie product, style, and theme. Compared with other models, the performance of this model is more outstanding in all aspects, which shows that this work has achieved a great technical breakthrough. This work provides a reference for the optimization of the design method of movie posters and contributes to the development of the movie industry.
... In addition, there are many similar genres, in which one film may include several genres in it, making accurate classification difficult [4] . To overcome this problem and perform genre classification efficiently, many previous studies have used Machine Learning and Deep Learning to try automatic multi-label classification of English film genres, based on various data such as movie posters [5], [6], plot summary [7], [8], synopsis [9]- [11], and trailers [12] used separately or in combination [13]. However, the use of the film's plot summary is limited by the fact that it only reveals the introductory part of the film and not the entire content. ...
... Trailers contain various types of information, such as picture frames and audio. However, trailers require high computational capacity, due to the large data size [6]. Likewise, film posters are single images that have a high degree of variability and lack of pattern formation [4]. ...
Conference Paper
Full-text available
Movies were still a very popular means ofentertainment. The current distribution of internet userscauses a large amount of movie data to be created anddistributed online. The emergence of movie streaming servicesmakes consumers very interested in using automatic film genreclassification. In this study, a multi-label film genreclassification will be carried out based on an English synopsis.Data were collected from the Internet Movie Database (IMDb)website. The amount of data used in this study was 10,432 linesof data obtained using scraping techniques on June 7, 2022.Researchers divided the dataset labels into 18 labelsrepresenting each genre. Feature extraction using TF-IDF andStemming. The multi-label classification algorithm used is theSupport Vector Machine, Logistic Regression, and Naïve BayesAlgorithms. Optimal parameter search using GridSearch ofeach algorithm. The optimum result in this study was obtainedf1-score value of 0.58 using the SVM algorithm with TF-IDFfeature extraction with stemming dataset, followed by NB withthe f1-score value of 0.48 and LR with an f1-score value of 0.43
... As another example, Wi et al. [51] used movie posters to assist in the classification of movies into genres using convolutional neural network (CNN). They extracted style features from the posters and identified correlations between the genres. ...
Preprint
Full-text available
In the last decades, global awareness towards the importance of diverse representation has been increasing. Lack of diversity and discrimination toward minorities did not skip the film industry. Here, we examine ethnic bias in the film industry through commercial posters, the industry's primary advertisement medium for decades. Movie posters are designed to establish the viewer's initial impression. We developed a novel approach for evaluating ethnic bias in the film industry by analyzing nearly 125,000 posters using state-of-the-art deep learning models. Our analysis shows that while ethnic biases still exist, there is a trend of reduction of bias, as seen by several parameters. Particularly in English-speaking movies, the ethnic distribution of characters on posters from the last couple of years is reaching numbers that are approaching the actual ethnic composition of US population. An automatic approach to monitor ethnic diversity in the film industry, potentially integrated with financial value, may be of significant use for producers and policymakers.
... After that, they reproduced the movie poster dataset for multilevel classification with up to 9 genres. They compared their model with the existing model [13]. ...
Conference Paper
Full-text available
Technological breakthroughs and the interest of business entities have made the categorization of media products gradually conventional in this digital environment. This is usually a multi-label scenario in which an object might be labeled with several categories. Most of the literature addresses the classification of movie genre as a mono-labeling task, generally based on audio-visual features. This study addressed a multilabel movie genre classification model using supervised machine learning techniques to classify the movies into their corresponding genres. The novelty of this work lies in its attempt to optimize the classifier and combine the classifier to make a hybrid classification system. The parameter optimized hybrid classification technique for multilabel movie genre classification has been proposed as a hybrid classification technique that combines SVM and DT. The performance of the classifiers is compared with respect to feature vectors with TF-IDF and BOW representation methods. Dimensionality has been reduced using the chi-square feature selection technique. For performance comparison, we measured the recall, precision and F1-measure for the classifiers. As a result, we recommend the parameter optimized hybrid classification technique because it shows high degree of accuracy regardless of the dataset and the feature vector. If we need to use traditional classifiers, we recommend KNN because it promises high accuracy after selecting the absolute value of parameter K. In order to use SVM, robust scaling will be needed to resolve unbalanced dataset. If we use DT, we need to use the N-gram practice to improve the accuracy.