Article
PDF Available
... However, these representations are low-level, handcrafted visual features, which are not discriminative enough for large-scale applications. In addition, previous public patent retrieval datasets, such as the one in [29], contain only a limited number of ID categories and images. The evaluation protocols for image-based patent retrieval are also inconsistent across most existing approaches. ...
... Patent retrieval [14] is the task of finding relevant patents for a given image query, which can be a sketch, a photo or a patent image. An image and text analysis approach automatically extracts concept information describing the patent image content [29]. A retrieval system uses a hybrid approach that combines feature extraction, indexing and similarity matching to retrieve patent images based on text and shape features [18]. ...
Preprint
Full-text available
Patent retrieval has been attracting tremendous interest from researchers in intellectual property and information retrieval communities in the past decades. However, most existing approaches rely on textual and metadata information of the patent, and content-based image-based patent retrieval is rarely investigated. Based on traits of patent drawing images, we present a simple and lightweight model for this task. Without bells and whistles, this approach significantly outperforms other counterparts on a large-scale benchmark and noticeably improves the state of the art by 33.5% in mean average precision (mAP). Further experiments reveal that this model can be elaborately scaled up to achieve a surprisingly high mAP of 93.5%. Our method ranks first in the ECCV 2022 Patent Diagram Image Retrieval Challenge.
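The abstract reports retrieval quality as mean average precision (mAP). Below is a minimal sketch of how such a score can be computed for embedding-based drawing retrieval; the cosine-similarity ranking and all variable names are illustrative assumptions, not the authors' actual model.

# Minimal sketch: rank a gallery of drawing embeddings by cosine similarity
# and score the ranking with mean average precision (mAP). The embeddings are
# assumed to come from some feature extractor; this is not the paper's model.
import numpy as np

def average_precision(ranked_relevance):
    # ranked_relevance: 0/1 flags of the gallery, sorted by decreasing similarity.
    hits, precisions = 0, []
    for rank, rel in enumerate(ranked_relevance, start=1):
        if rel:
            hits += 1
            precisions.append(hits / rank)
    return float(np.mean(precisions)) if precisions else 0.0

def mean_average_precision(query_embs, gallery_embs, query_ids, gallery_ids):
    q = query_embs / np.linalg.norm(query_embs, axis=1, keepdims=True)
    g = gallery_embs / np.linalg.norm(gallery_embs, axis=1, keepdims=True)
    sims = q @ g.T  # cosine similarity between L2-normalised embeddings
    aps = []
    for i, qid in enumerate(query_ids):
        order = np.argsort(-sims[i])
        relevance = [int(gallery_ids[j] == qid) for j in order]
        aps.append(average_precision(relevance))
    return float(np.mean(aps))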
... Jiang et al. (2022) develop a multimodal system to predict IPC classes of patents based on patent titles, abstracts and drawings. Vrochidis et al. (2012) combine figure descriptions manually associated with patent images and train detectors that identify global concepts in patent figures in order to classify patent documents according to those concepts. Most literature on the applications of multimodality in patents focuses on the tasks of patent retrieval and patent classification. ...
Article
Full-text available
Images provide concise representations of design artifacts and emerge as the primary mode of communication among innovators, engineers, and designers. The advancement of Artificial Intelligence tools that integrate image and textual information can significantly support the Engineering Design process. In this paper we create five different datasets combining both images and text from patents, and we develop a set of text-based metrics to assess the quality of text for multimodal applications. Finally, we discuss the challenges arising in the development of multimodal patent datasets.
... This technique is most commonly used in mail software to distinguish between important and unimportant emails by classifying the latter as spam [11]. [14]. ...
Conference Paper
Full-text available
Digitalization has a significant impact on various fields, including business, research, and community. One of the areas influenced by automation is patent administration, specifically searching for and assessing patents. Traditional methods of patent prior art search involve using Boolean logic, keywords, synonym selection, classifiers, multilingualism, and other techniques. However, this manual process can be time-consuming and inefficient. This study explores how artificial intelligence (AI) can aid patent assessors in their prior art search by suggesting search keywords, retrieving relevant documents, ranking them, and visualizing their content. The findings reveal that AI can reduce the time and cost involved in sifting through large numbers of patents. The study highlights the importance of human-in-the-loop methods and the need for better tools that support human-centered decision-making in prior art searches.
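As one illustration of the keyword-suggestion and document-ranking steps mentioned above, here is a minimal TF-IDF sketch; the toy corpus, the scikit-learn pipeline, and the helper names are assumptions, not the system evaluated in the study.

# Hedged sketch: TF-IDF retrieval plus naive keyword suggestion for prior-art search.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

patents = [
    "rotor blade assembly for a wind turbine",
    "battery cell thermal management system",
    "wind turbine blade with noise reduction serrations",
]
vectorizer = TfidfVectorizer(stop_words="english")
doc_matrix = vectorizer.fit_transform(patents)

def rank_patents(query, top_k=3):
    # Rank the corpus by cosine similarity to the query.
    scores = cosine_similarity(vectorizer.transform([query]), doc_matrix)[0]
    return sorted(zip(patents, scores), key=lambda p: -p[1])[:top_k]

def suggest_keywords(query, top_k=5):
    # Suggest the highest-weighted terms of the best-matching document.
    best_doc = rank_patents(query, top_k=1)[0][0]
    weights = vectorizer.transform([best_doc]).toarray()[0]
    terms = vectorizer.get_feature_names_out()
    return [terms[i] for i in weights.argsort()[::-1][:top_k] if weights[i] > 0]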
... According to a recent survey paper on patent analysis [14], there has been a lot of progress in tasks like patent retrieval [24,35,30,23] and patent image classification [9,32,15] due to the advancements in deep learning. We mainly focus on image classification since visualizations contain important information about patents [14,5,11]. ...
Preprint
Full-text available
Due to the swift growth of patent applications each year, information and multimedia retrieval approaches that facilitate patent exploration and retrieval are of utmost importance. Different types of visualizations (e.g., graphs, technical drawings) and perspectives (e.g., side view, perspective) are used to visualize details of innovations in patents. The classification of these images enables a more efficient search and allows for further analysis. So far, datasets for image type classification miss some important visualization types for patents. Furthermore, related work does not make use of recent deep learning approaches including transformers. In this paper, we adopt state-of-the-art deep learning methods for the classification of visualization types and perspectives in patent images. We extend the CLEF-IP dataset for image type classification in patents to ten classes and provide manual ground truth annotations. In addition, we derive a set of hierarchical classes from a dataset that provides weakly-labeled data for image perspectives. Experimental results have demonstrated the feasibility of the proposed approaches. Source code, models, and dataset will be made publicly available.
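A minimal sketch of fine-tuning a transformer image classifier for ten patent image types follows; the torchvision ViT backbone, hyperparameters, and data handling are assumptions standing in for whatever configuration the paper actually uses.

# Hedged sketch: ViT fine-tuning for patent image type classification (ten classes).
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 10  # e.g. technical drawing, graph, flowchart, table, ...
model = models.vit_b_16(weights=models.ViT_B_16_Weights.IMAGENET1K_V1)
model.heads.head = nn.Linear(model.heads.head.in_features, NUM_CLASSES)

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

def train_step(images, labels):
    # images: (B, 3, 224, 224) float tensors; labels: (B,) class indices.
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()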
... The difficulty of reviewing many inventions is compounded by the sets of black-and-white drawings used in patents. Moreover, the patent and licence search software systems currently in distribution offer users only search capabilities based on standard (attribute-based) query formation or on so-called "substring" search [9]. Most drawing-based search methods, in turn, rely on image comparison. ...
Article
This paper analyses existing automated systems for searching and classifying images in patent databases, highlighting their advantages and disadvantages. Before searching, all known systems first analyse the uploaded images and the images stored in the database, after which the search of the patent database is performed. As the number of patents increases, so does the time needed to examine an application for patent registration. A patent office expert must establish the uniqueness of the technology being patented.
Chapter
We introduce a new large-scale patent dataset termed PDTW150K for patent drawing retrieval. The dataset contains more than 150,000 patents associated with text metadata and over 850,000 patent drawings. We also provide a set of bounding box positions of individual drawing views to support constructing object detection models. We design some experiments to demonstrate the possible ways of using PDTW150K, including image retrieval, cross-modal retrieval, and object detection tasks. PDTW150K is available for download on GitHub [1].
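A minimal sketch of how the bounding-box annotations could be turned into detection targets (e.g. for a torchvision detector) is shown below; the JSON field names are hypothetical, since the actual PDTW150K schema is documented in its repository.

# Hedged sketch: converting drawing-view bounding boxes into detection targets.
import json
import torch

def load_targets(annotation_file):
    with open(annotation_file) as f:
        records = json.load(f)  # assumed: list of {"image": ..., "boxes": [[x1, y1, x2, y2], ...]}
    targets = []
    for rec in records:
        boxes = torch.tensor(rec["boxes"], dtype=torch.float32)
        targets.append({
            "boxes": boxes,
            "labels": torch.ones((len(boxes),), dtype=torch.int64),  # single "drawing view" class
        })
    return targets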
Article
Full-text available
Recent advances in computer vision (CV) and natural language processing have been driven by exploiting big data in practical applications. However, these research fields are still limited by the sheer volume, versatility, and diversity of the available datasets. CV tasks such as image captioning, which has primarily been carried out on natural images, still struggle to produce accurate and meaningful captions for the sketched images often included in scientific and technical documents. The advancement of other tasks such as 3D reconstruction from 2D images requires larger datasets with multiple viewpoints. We introduce DeepPatent2, a large-scale dataset providing more than 2.7 million technical drawings with 132,890 object names and 22,394 viewpoints extracted from 14 years of US design patent documents. We demonstrate the usefulness of DeepPatent2 with conceptual captioning. We further discuss the potential of our dataset to facilitate other research areas such as 3D image reconstruction and image retrieval.
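As a small illustration of how the per-figure metadata (object names and viewpoints) could be used for retrieval, here is a sketch with hypothetical record fields; the released dataset's actual schema may differ.

# Hedged sketch: indexing DeepPatent2-style records by object name and viewpoint.
from collections import defaultdict

records = [
    {"figure_id": "D0900001-fig1", "object_name": "chair", "viewpoint": "perspective"},
    {"figure_id": "D0900002-fig3", "object_name": "chair", "viewpoint": "side view"},
    {"figure_id": "D0900003-fig2", "object_name": "lamp", "viewpoint": "top view"},
]
index = defaultdict(list)
for rec in records:
    index[(rec["object_name"], rec["viewpoint"])].append(rec["figure_id"])
print(index[("chair", "perspective")])  # all perspective drawings of chairs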
Chapter
Due to the swift growth of patent applications each year, information and multimedia retrieval approaches that facilitate patent exploration and retrieval are of utmost importance. Different types of visualizations (e.g., graphs, technical drawings) and perspectives (e.g., side view, perspective) are used to visualize details of innovations in patents. The classification of these images enables a more efficient search in digital libraries and allows for further analysis. So far, datasets for image type classification miss some important visualization types for patents. Furthermore, related work does not make use of recent deep learning approaches including transformers. In this paper, we adopt state-of-the-art deep learning methods for the classification of visualization types and perspectives in patent images. We extend the CLEF-IP dataset for image type classification in patents to ten classes and provide manual ground truth annotations. In addition, we derive a set of hierarchical classes from a dataset that provides weakly-labeled data for image perspectives. Experimental results have demonstrated the feasibility of the proposed approaches. Source code, models, and datasets are publicly available (https://github.com/TIBHannover/PatentImageClassification).
Conference Paper
Full-text available
The aim of this document is to describe the methods we used in the Patent Image Classification and Image-based Patent Retrieval tasks of the CLEF-IP 2011 track. The patent image classification task consisted of categorizing patent images into pre-defined categories such as abstract drawing, graph, flowchart, table, etc. Our main aim in participating in this sub-task was to test how our image categorizer performs on this type of categorization problem. Therefore, we used SIFT-like local orientation histograms as low-level features, and on top of them we built a visual vocabulary specific to patent images using a Gaussian mixture model (GMM). This allowed us to represent images with Fisher Vectors and to train linear one-versus-all classifiers. As the results show, we obtain very good classification performance. Concerning the Image-based Patent Retrieval task, we kept the same image representation as for the Image Classification task and used the dot product as similarity measure. Nevertheless, in the case of patents the aim was to rank patents based on patent similarities, which for pure image-based retrieval requires comparing a set of images against another set of images. Therefore, we investigated different strategies such as averaging the Fisher Vector representations of an image set or considering the maximum similarity between pairs of images. Finally, we also built runs where the predicted image classes were considered in the retrieval process.
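The two set-versus-set comparison strategies mentioned at the end of the abstract (averaging descriptors per patent, or taking the maximum pairwise similarity) can be sketched as follows; generic descriptor vectors stand in for the actual Fisher Vectors.

# Hedged sketch of patent-level similarity from per-image descriptors.
import numpy as np

def patent_similarity_avg(desc_a, desc_b):
    # desc_a, desc_b: (n_images, d) descriptor matrices of two patents.
    return float(np.dot(desc_a.mean(axis=0), desc_b.mean(axis=0)))

def patent_similarity_max(desc_a, desc_b):
    # Maximum dot-product similarity over all image pairs.
    return float((desc_a @ desc_b.T).max())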
Article
Full-text available
In this paper a new approach to video event detection is presented, combining visual concept detection scores with a new dimensionality reduction technique. Specifically, a video is first decomposed into a sequence of shots, and trained visual concept detectors are used to represent video content with model vector sequences. Subsequently, an improved subclass discriminant analysis method is used to derive a concept subspace for detecting and recognizing high-level events. In this space, the median Hausdorff distance is used to implicitly align and compare event videos of different lengths, and the nearest neighbor rule is used for recognizing the event depicted in the video. Evaluation results obtained by our participation in the Multimedia Event Detection Task of the TRECVID 2010 competition verify the effectiveness of the proposed approach for event detection and recognition in large-scale video collections.
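A minimal sketch of one common formulation of the median Hausdorff distance between two model-vector sequences is given below; the exact variant used in the paper may differ.

# Hedged sketch: median Hausdorff distance between two shot-level model-vector sequences.
import numpy as np

def median_hausdorff(seq_a, seq_b):
    # seq_a: (m, d) and seq_b: (n, d) model-vector sequences of two videos.
    dists = np.linalg.norm(seq_a[:, None, :] - seq_b[None, :, :], axis=2)
    forward = np.median(dists.min(axis=1))   # median nearest-neighbour distance a -> b
    backward = np.median(dists.min(axis=0))  # median nearest-neighbour distance b -> a
    return max(forward, backward)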
Article
Full-text available
Efficient and effective handling of video documents depends on the availability of indexes. Manual indexing is infeasible for large video collections. In this paper we survey several methods aiming to automate this time- and resource-consuming process. Good reviews on single-modality video indexing have appeared in the literature. Effective indexing, however, requires a multimodal approach in which either the most appropriate modality is selected or the different modalities are used in collaborative fashion. Therefore, instead of treating the different information sources and their specific algorithms separately, we focus on the similarities and differences between the modalities. To that end we put forward a unifying multimodal framework, which views a video document from the perspective of its author. This framework forms the guiding principle for identifying index types for which automatic methods are found in the literature. It furthermore forms the basis for categorizing these different methods.
Conference Paper
Full-text available
This paper proposes a novel binary image descriptor, namely the Adaptive Hierarchical Density Histogram, that can be utilized for complex binary image retrieval. The descriptor exploits the distribution of the image points over a two-dimensional area. To reflect this distribution effectively, we propose an adaptive pyramidal decomposition of the image into non-overlapping rectangular regions and the extraction of the density histogram of each region. This hierarchical decomposition algorithm is based on the recursive calculation of geometric centroids. The presented technique is experimentally shown to combine efficient performance, low computational cost and scalability. Comparison with other prevailing approaches demonstrates its high potential.
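A rough sketch of the recursive centroid-based decomposition follows: each region is split at the geometric centroid of its foreground points, and the point count (density) of every visited region is recorded. Normalisation, splitting depth, and feature ordering here are simplifications, not the paper's exact formulation.

# Hedged sketch of an adaptive hierarchical density histogram for a binary image.
import numpy as np

def region_features(points, x0, y0, x1, y1, depth, feats):
    # points: (n, 2) array of (x, y) foreground coordinates inside [x0, x1) x [y0, y1).
    feats.append(len(points))
    if depth == 0 or len(points) == 0:
        return
    cx, cy = points.mean(axis=0)  # geometric centroid defines the split
    for qx0, qx1 in ((x0, cx), (cx, x1)):
        for qy0, qy1 in ((y0, cy), (cy, y1)):
            mask = ((points[:, 0] >= qx0) & (points[:, 0] < qx1) &
                    (points[:, 1] >= qy0) & (points[:, 1] < qy1))
            region_features(points[mask], qx0, qy0, qx1, qy1, depth - 1, feats)

def ahdh_descriptor(binary_image, depth=3):
    ys, xs = np.nonzero(binary_image)
    points = np.stack([xs, ys], axis=1).astype(float)
    feats = []
    region_features(points, 0.0, 0.0, binary_image.shape[1], binary_image.shape[0], depth, feats)
    return np.array(feats, dtype=float) / max(len(points), 1)  # counts -> densities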
Article
Taking as a starting point the actual and the perceived value of models of claimed inventions in the 19th century, the author explores the proposition that non-text disclosures of this type and, more significantly, their modern equivalents as 3D electronic images, as well as many other forms of electronic non-text material, will have an increasing place in fully communicating the inventions filed in 21st century patent applications. He argues that moving to a patent application process in which supplementary machine-readable material can be supplied, and can form part of the total disclosure of an invention without needing to be convertible to paper form, would increase the intelligibility of invention disclosures in existing problem areas such as genetic sequence patents, as well as forestall problems likely to arise in other areas as the 21st century unfolds. He notes that in this way the limitations inherent in having to describe 21st century inventions using the 15th century technology of the printed page could be overcome. Analogies with the situation for certain types of trademark applications for registration, e.g. sound marks, are also noted.
Article
The automatic removal of suffixes from words in English is of particular interest in the field of information retrieval. An algorithm for suffix stripping is described, which has been implemented as a short, fast program in BCPL. Although simple, it performs slightly better than a much more elaborate system with which it has been compared. It effectively works by treating complex suffixes as compounds made up of simple suffixes, and removing the simple suffixes in a number of steps. In each step the removal of the suffix is made to depend upon the form of the remaining stem, which usually involves a measure of its syllable length.
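The algorithm described is the Porter stemmer; one quick way to try it from Python is NLTK's implementation (assuming NLTK is installed).

# Suffix stripping with NLTK's Porter stemmer implementation.
from nltk.stem import PorterStemmer

stemmer = PorterStemmer()
for word in ["connection", "connected", "connecting", "relational", "generalizations"]:
    print(word, "->", stemmer.stem(word))
# "connection", "connected" and "connecting" all reduce to the same stem "connect".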
Article
LIBSVM is a library for support vector machines (SVM). Its goal is to help users easily use SVM as a tool. In this document, we present all its implementation details. For the use of LIBSVM, the README file included in the package and the LIBSVM FAQ provide the information.
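For quick experimentation from Python, scikit-learn's SVC class wraps LIBSVM internally, so a toy classification run looks like this (data and parameters are illustrative):

# Toy SVM classification via scikit-learn's SVC, which wraps LIBSVM.
from sklearn.svm import SVC

X = [[0, 0], [1, 1], [1, 0], [0, 1]]
y = [0, 1, 1, 0]
clf = SVC(kernel="rbf", C=1.0)
clf.fit(X, y)
print(clf.predict([[0.9, 0.9], [0.1, 0.1]]))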
Chapter
With the ever-increasing number of registered trademarks, it is becoming increasingly difficult for trademark offices to ensure the uniqueness of all trademarks registered. Trademarks are complex patterns consisting of various image and text patterns, called the device-mark and word-in-mark respectively. Due to the diversity and complexity of the image patterns occurring in trademarks, and to multi-lingual word-in-marks, no highly successful computerized trademark registration system is in operation. We have tackled the key technical issues: multiple feature extraction methods to capture shape, similarity of multi-lingual word-in-marks, matching of device-mark interpretations using a fuzzy thesaurus, and fusion of multiple feature measures for conflicting trademark retrieval. A prototype System for Trademark Archival and Registration (STAR) has been developed. An initial test run has been conducted using 3000 trademarks, and the results have satisfied trademark officers and specialists.
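A minimal sketch of the kind of score-level fusion of multiple feature similarities that the abstract describes is given below; the feature names and equal weights are illustrative assumptions, not the STAR system's actual configuration.

# Hedged sketch: weighted fusion of per-feature similarity scores for trademark retrieval.
def fused_similarity(scores, weights=None):
    # scores: dict of per-feature similarities in [0, 1],
    # e.g. {"shape": 0.8, "word_in_mark": 0.6, "device_mark_concept": 0.4}
    weights = weights or {name: 1.0 for name in scores}
    total = sum(weights[name] * scores[name] for name in scores)
    return total / sum(weights[name] for name in scores)

print(fused_similarity({"shape": 0.8, "word_in_mark": 0.6, "device_mark_concept": 0.4}))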