
Tagging and Automation: Challenges and Opportunities for Academic Libraries

Abstract and Figures

Purpose: comparing and examining the quality of the results of tagging, intellectual and automated indexing processes. Design/methodology/approach: analysis and graphical representation of annotation sets using the software "Semtinel". Findings: a combination of tagging, intellectual and automatic indexing is probably best suited to shape the annotation of literature more efficiently without compromising quality. Research limitations/implications: exploratory study on the basis of three journals. Originality/value: the paper presents the open source software Semtinel, which offers a highly optimized toolbox for analysing thesauri and classifications.
... Popular academic applications of social tagging include its use in bookmarking academic articles on CiteULike (http://www.citeulike.org/) or BibSonomy (http://www.bibsonomy.org/) (Eckert, Hänger and Niemann 2009) and the personal online cataloging of books on LibraryThing (https://www.librarything.com/), Shelfari (http://www.shelfari.com/), ...
Article
The purpose of this study was to investigate the characteristics and effectiveness of tags in public library online public access catalogues (OPACs). Three public libraries that have adopted BiblioCommons' OPAC system (Edmonton Public Library, Seattle Public Library, and Christchurch City Libraries) were selected for the study. In the OPAC of each of these libraries, fifty queries were searched using tags, keywords, and subjects as access points. The results of the study showed that a large number of items in public libraries are still not being tagged, while for those items that have been tagged, the tags were mostly made up of one or two words and were subject related. In terms of effectiveness, the precision level of a tag search was found acceptable and somewhat comparable to the precision levels of keyword and subject searches, but of the three access points, tags retrieved the fewest items. © 2015 The Canadian Journal of Information and Library Science.
... In , the methodology was adapted to classification systems and automatic classification. In (Eckert, Hänger, & Niemann, 2009), we evaluated tagging results and compared them to intellectual indexing and automatic indexing using the ICE-Map Visualization. ...
Article
In this paper, we describe in detail the Information Content Evaluation Map (ICE-Map Visualization, formerly referred to as IC Difference Analysis). The ICE-Map Visualization is a visual data mining approach for all kinds of concept hierarchies that uses statistics about the concept usage to help a user in the evaluation and maintenance of the hierarchy. It consists of a statistical framework that employs the notion of information content from information theory, as well as a visualization of the hierarchy and the result of the statistical analysis by means of a treemap.
Article
The objective of this study is to compare the retrieval effectiveness of using tags as an access point against subject headings or keywords in a public library OPAC. Thirty queries were searched at Oakville Public Library, and tags retrieved fewer items per query than keywords or subjects. However, there was no significant difference in the average precision values.
Article
A library, including an academic library, is considered a trinity of collections, users and staff. Among users, faculty members are the most important category, since they guide other categories of users, such as students, research scholars and staff, in the proper use of the library for study and research. The basic objective of the study is to examine the use of library resources by faculty members; its scope is limited to the faculty members of private engineering colleges of Odisha. A survey method with a questionnaire technique was followed. The study analyses and interprets the collected data according to its scope and summarizes the findings. The important findings are that faculty members of these institutions use their library, that they prefer print resources to e-resources, and that their satisfaction level is average. The study concludes that the management of these institutions needs to take appropriate measures to build their library collections, develop infrastructure facilities and provide library services properly by allocating a larger library budget and the required professional manpower, so that the quality of teaching and education at these institutions improves to a great extent. Key Words: Library resources, Engineering College, BPUT, Odisha.
Article
Early literature on tagging has been enthusiastic about the potential that it holds for libraries. Theorists have thoroughly analyzed the nature of tags, as well as the benefits and the problems for libraries: the positives and the negatives of tags compared to subject headings, how tagging can help libraries increase the findability of documents, what the benefits from user-created vocabulary are, and so on. However, there is a gap in the knowledge of how tags actually work within the professional context of libraries. More evidence is needed if the library community is to understand whether tags present an exciting opportunity for libraries. The purpose of this paper is to review the literature regarding the implementation of the tagging process primarily in library catalogs. The aim is to document evidence regarding this particular service within the range of library services provided to users.
Article
Purpose: The purpose of this paper is to explore the growth and development of periodical literature on Web 2.0 technologies and their other fields. Design/methodology/approach: Bibliographic data of the articles published in the 13 leading peer-reviewed journals are obtained from the Emerald database (www.emeraldinsight.com) directly using such keywords as "Web 2.0", "blogs", "wikis", "RSS", "social networking sites", "podcasts", "Mashup", and multimedia sharing tools, i.e. YouTube and Flickr. The bibliographical surrogates such as author, title, subtitle, source, issue, volume, pages, etc. were recorded in an MS-Excel (2010) sheet for the analysis and interpretation of data. A bibliography of selected articles is provided. Findings: The study found 206 research articles on the subject published in 13 leading library and information science journals of Emerald for the period 2007-2011. Further, the study found that 2009 was the most productive year with 69 articles. The study observed that Online Information Review published 49 articles, and hence can be considered the core journal on the topic. Mike Thelwall from the UK was found to be the most prolific author, having authored or co-authored five articles. Research limitations/implications: The study was based on 206 research articles published during the years 2007-2011. The study was restricted to this period because the Web 2.0 concept originated during 2004-2005 and the period under study has sufficient published literature on the topic. Originality/value: The paper provides reliable and authentic information on the subject. This is the first study on this topic.
Conference Paper
To make digital archives more accessible to industries, this study used tags given by cultural creative industries to items in digital archives to analyze the differences between the terms used by commercial users and scholars from the archive agency. We analyzed the self-created commercial tags (60%) and the tags adopted from academic terms (40%). The results showed that terms provided by the archive agency were still more likely to be based on domain knowledge. In comparison, the superordinate terms are more likely to be needed by the commercial users. This study suggests that the research findings of six types of semantic relationship and eight types of linked properties could be further transformed into metadata best practices for the digital archive agency, thus bridging the divide between the commercial tags and academic terms.
Article
In Web 2.0 services "prosumers" - producers and consumers - collaborate not only for the purpose of creating content, but to index these pieces of information as well. Folksonomies permit actors to describe documents with subject headings, "tags", without regard to any rules. Despite their many benefits, folksonomies have many shortcomings (e.g., lack of precision). In order to solve some of these problems we propose interpreting tags as natural language terms. Accordingly, we can introduce methods of NLP to solve the tags' linguistic problems. Additionally, we present criteria for tagged documents to create a ranking by relevance (tag distribution, collaboration and actor-based aspects). Besides recommending similar documents ("more like this!") folksonomies can be used for the recommendation of similar users and communities ("more like me!").
Conference Paper
One of the uses of social tagging is to associate freely selected terms (tags) with resources in order to share those resources among tag consumers. This enables tag consumers to locate new resources through the collective intelligence of other tag creators, and offers a new avenue for resource discovery. This paper investigates the effectiveness of tags as resource descriptors, determined through text categorisation using Support Vector Machines. Two text categorisation experiments were done for this research, using tags and web pages from del.icio.us. The first study concentrated on the use of terms as its features. The second study used both terms and tags as part of its feature set. The results indicate that the tags were not always reliable indicators of the resource contents. At the same time, the results from the terms-only experiment were better than those from the experiment with terms and tags. A deeper analysis of a sample of tags and documents was also conducted, and implications of this research are discussed.
Conference Paper
The use of thesaurus-based indexing is a common approach for increasing the performance of document retrieval. With the growing amount of documents available, manual indexing is not a feasible option. Statistical methods for automated document indexing are an attractive alternative. We argue that the quality of the thesaurus used as a basis for indexing in regard to its ability to adequately cover the contents to be indexed is of crucial importance in automatic indexing because there is no human in the loop that can spot and avoid indexing errors. We propose a method for thesaurus evaluation that is based on a combination of statistical measures and appropriate visualization techniques that supports the detection of potential problems in a thesaurus. We describe this method and show its application in the context of two automatic indexing tasks. The examples show that the method indeed eases the detection and correction of errors, leading to a better indexing result. Please refer to http://www.kaiec.org for high resolution media of all figures used in this paper, as well as an animated presentation of the interactive tool.
Conference Paper
Information Content (IC) is an important dimension of word knowledge when assessing the similarity of two terms or word senses. The conventional way of measuring the IC of word senses is to combine knowledge of their hierarchical structure from an ontology like WordNet with statistics on their actual usage in text as derived from a large corpus. In this paper we present a wholly intrinsic measure of IC that relies on hierarchical structure alone. We report that this measure is consequently easier to calculate, yet when used as the basis of a similarity mechanism it yields judgments that correlate more closely with human assessments than other, extrinsic measures of IC that additionally employ corpus analysis.
Article
The importance of social bookmarking, folksonomies, and Web 2.0 tools, and of the Web services these tools provide, is presented. These tools help users complete jobs, pursue interests and hobbies, and keep track of information they have already found useful. Every user has unique tags by which an image or Web page is tagged, identified, and linked with identical Web pages and images. Social bookmarking and folksonomy tools allow users to tag Websites and links and to share their searches with other users. They serve academic and scientific interests through Websites such as CiteULike and Connotea. These Websites share, store, and organize academic papers and find bibliographic information from scientific articles and journals. Web 2.0 works on the 'architecture of participation', in which the user adds to the value of the services. It harnesses the collective intelligence of the Web and uses this information to make the service better.
Article
Collaborative tagging systems—systems where many casual users annotate objects with free-form strings (tags) of their choosing—have recently emerged as a powerful way to label and organize large collections of data. During our recent investigation into these types of systems, we discovered a simple but remarkably effective algorithm for converting a large corpus of tags annotating objects in a tagging system into a navigable hierarchical taxonomy of tags. We first discuss the algorithm and then present a preliminary model to explain why it is so effective in these types of systems.
Conference Paper
This work shows how the content of a digital library can be enhanced to better satisfy its users' needs. Missing content is identified by finding missing content topics in the system's query log or in a pre-defined taxonomy of required knowledge. The ...