Article

Collaborative Tagging as a Knowledge Organisation and Resource Discovery Tool

Abstract

Purpose: The purpose of the paper is to provide an overview of the collaborative tagging phenomenon and explore some of the reasons for its emergence.

Design/methodology/approach: The paper reviews the related literature and discusses some of the problems associated with, and the potential of, collaborative tagging approaches for knowledge organisation and general resource discovery. A definition of controlled vocabularies is proposed and used to assess the efficacy of collaborative tagging. An exposition of the collaborative tagging model is provided and a review of the major contributions to the tagging literature is presented.

Findings: There are numerous difficulties with collaborative tagging systems (e.g. low precision, lack of collocation, etc.) that originate from the absence of properties that characterise controlled vocabularies. However, such systems cannot be dismissed. Librarians and information professionals have lessons to learn from the interactive and social aspects exemplified by collaborative tagging systems, as well as their success in engaging users with information management. The future co-existence of controlled vocabularies and collaborative tagging is predicted, with each appropriate for use within distinct information contexts: formal and informal.

Research limitations/implications: Librarians and information professional researchers should be playing a leading role in research aimed at assessing the efficacy of collaborative tagging in relation to information storage, organisation, and retrieval, and in influencing the future development of collaborative tagging systems.

Practical implications: The paper indicates clear areas where digital libraries and repositories could innovate in order to better engage users with information.

Originality/value: At the time of writing there were no literature reviews summarising the main contributions to the collaborative tagging research or debate.

... Although the most obvious use of this practice is that users can organise information in their own virtual space, it is also useful when all these tags are harnessed and managed collectively to search for information within the applications or websites where they were generated (Macgregor and McCulloch, 2006). The set of tags that users have applied to describe content in an information system gives rise to folksonomies (Rolla, 2009), which are non-hierarchical information classification systems based on users' natural language, where the relationships between terms are not predetermined. ...
... This way of describing content is gaining popularity, since many websites use it, and it has opened an important debate about its usefulness for information retrieval, especially in comparison with the controlled languages traditionally used in catalogues and bibliographic databases (thesauri, classifications…). Its advantages and disadvantages, as well as its differences from controlled languages, have been widely analysed in the scientific literature (Macgregor and McCulloch, 2006; Noruzi, 2006; Porter, 2013; Rolla, 2009; Spiteri, 2006; Steele, 2009) and can be summarised in the following table: Table 1. Differences between folksonomies and controlled languages. ...
... • They complement catalogue descriptions but do not replace them. The shortcomings of natural language mean that tag-based searches are neither exhaustive nor precise, but tags can be a secondary access point that provides other kinds of information relevant to users (Macgregor and McCulloch, 2006; Pecoskie, Spitery and Traulli, 2014; Rolla, 2009). • Favourable attitude among librarians. ...
... With the rapid growth of literature, it is essential for subject indexers and cataloguers to enhance subject access to library materials so that subject-based retrieval is possible [4]. Subject access can be enhanced through the efficient exploitation of subject metadata derived from subject cataloguing or subject indexing, which provides a direct way to find a document or group of documents by subject [5][6][7][8][9][10]. To make the cataloguing process effective, subject cataloguing first determines the content of documents and then has cataloguers or indexers apply controlled vocabularies (lists of standard terms used as subject descriptors, such as lists of subject headings) in order to ensure the uniformity, universality and discoverability of records in library catalogues and other bibliographic databases [11][12][13]. ...
... In that context, social tagging (a form of social cataloguing) has emerged as a parallel concept with the appearance of Web 2.0 applications, which allow natural-language keywords to be used for document indexing. Social tagging allows users to assign keywords as tags, and other kinds of metadata, according to their needs, which helps them retrieve those resources in future (sample social tags are shown in Fig. 1) [7][18][19]. Social tagging differs from traditional subject cataloguing in many ways, but the core distinction is that any user can assign tags to any web resource using any keywords (uncontrolled terms), whereas subject cataloguing is carried out only by librarians and trained cataloguers using controlled terms [20]. ...
Article
Full-text available
The paper comparatively investigates the relation between controlled vocabulary terms assigned by experts at the Library of Congress and tags assigned by users in the LibraryThing database in three subjects, Economics, History and Sociology, within the Social Science domain. Based on term matching (S = 14.80 %, E = 12.77 % and H = 8.06 %) and the Jaccard similarity coefficient (E = 0.15, S = 0.15 and H = 0.11), we found little overlap between the two vocabularies. We also found that experts mostly use double-word and multi-word specific topical terms (S = 73.14 %, E = 72.89 % and H = 61.05 %), whereas social taggers mostly use single-word general non-topical terms (E = 54.88 %, H = 54.21 % and S = 48.55 %), with few topical and few personal terms. In comparison with LCSH subfields, we found that experts prefer topical terms for all subjects, whereas taggers prefer them only for Economics and prefer geographic subdivision terms for History and Sociology; taggers do not favour chronological terms. Experts also use few title-based terms (H = 196 terms, S = 195 terms and E = 175 terms), whereas taggers mostly prefer title-based terms (H = 673 terms, S = 564 terms and E = 444 terms) in all three subjects. The study concludes that the two vocabularies are different, but that libraries can exploit uncontrolled vocabularies and introduce a 'hybrid metadata ecology' that combines controlled vocabularies, classification and folksonomies for better subject access and retrieval of social science documents.
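The two overlap measures named in this abstract can be illustrated with a minimal sketch. The abstract does not define "term matching", so the function below assumes it means the percentage of one vocabulary's terms that also appear in the other. The vocabularies, the `normalise` helper and all function names are hypothetical and are not taken from the study.

```python
from __future__ import annotations


def normalise(term: str) -> str:
    """Lower-case a term and collapse whitespace so trivial variants match."""
    return " ".join(term.lower().split())


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard coefficient |A & B| / |A | B|; taken as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def term_matching(source: set[str], target: set[str]) -> float:
    """Assumed definition: percentage of terms in `source` that also occur in `target`."""
    return 100.0 * len(source & target) / len(source) if source else 0.0


# Hypothetical toy vocabularies for one book, not data from the study.
lcsh_terms = {normalise(t) for t in ["Economic history", "Capitalism", "Economics", "Finance"]}
social_tags = {normalise(t) for t in ["economics", "capitalism", "to-read", "non-fiction", "history", "finance"]}

print(round(jaccard(lcsh_terms, social_tags), 2))        # 3 shared terms out of 7 distinct terms
print(round(term_matching(lcsh_terms, social_tags), 1))  # 3 of the 4 LCSH terms are also tags
```

Note that the Jaccard coefficient is symmetric, while term matching under this assumed definition depends on which vocabulary is taken as the source, which is one reason the two figures in a comparison of this kind need not agree.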
... But we were forced to note that the very concept of an "enriched" folksonomy had received very little attention (Alemanne et al., 2008; Mathes, 2004; Munk & Mørk, 2007), and even less in a professional context (Comtet, 2007; du Hommet et al., 2014, 2015; Macgregor & McCulloch, 2006). ...
... While seven studies have shown a strong link between use of an enterprise social network (ESN) and productivity, only one argues that there is no significant link between adoption of an ESN and professional performance, since many other variables also influence it. (Macgregor & McCulloch, 2006). 31 Quotation: "everyone is expert in their own vocabulary". ...
Thesis
Full-text available
Our thesis takes place in a global video game (VG) company. In this world, where there are many restrictions on information accessibility and where projects once managed locally are now developed across studios, its actors face multiple knowledge silos, and KM becomes an even more strategic challenge. What are the sources used by VG employees, the criteria they apply and the barriers they encounter? How are Web 2.0 tools positioned within this digital workplace? What should be the mandatory pillars for deploying a corporate folksonomy? Using Savolainen's information horizon methodology, we conducted an exploratory study based on interviews organized at the Ubisoft Montreal studio. Our 29 participants had to place their sources of information on mind maps. These maps and the interview transcripts are supplemented by data related to the deployment of the company's tags. We propose a new categorization of sources and confirm Savolainen's typology of criteria. We shed light on the impact of a variable temporality between production and support teams on their information practices. We find that the conditions of the folksonomy deployment did not allow the initial associated vision to be applied. The results of our survey on the informational practices of video game employees could find application in strategic knowledge management. We affirm that folksonomies can still play a major role if they fully benefit from the technological support necessary for their deployment. They could thus help to facilitate the information paths of employees, for whom the primary sources of information remain other employees and the tools that allow them to communicate with one another.
... Also, this task yields important information, as information retrieval depends to a large extent on the quality of indexing [4]. Typically, subject indexing has been done manually, i.e., subjects are assigned to documents by a curator according to their content or aboutness [16]. However, manual subject indexing demands a great deal of curators' time and effort. ...
... In 2020, we downloaded the 2020 version 16 of MeSH and the biomedical articles indexed by the MeSH terms from BioASQ. From these articles, we only chose the articles which have been published in the last 5 years and also indexed by the terms under the diseases subtaxonomy. ...
Article
In digital repositories, it is crucial to refine existing subject terms and exploit a taxonomy of subject terms in order to support information retrieval tasks such as indexing, cataloging and searching of digital documents. In this paper, we address how to refine an existing set of subject terms used to index digital documents, which often contains irrelevant or noisy terms. Further, we present how to automatically induce a subject term taxonomy to capture and utilise the semantic relations among subject terms. Most related works have paid little attention to these problems, focusing mostly on creating subject terms or building a taxonomy of key terms from text documents. We propose a methodology for refining an existing set of subject terms in a digital repository by identifying their semantics, as well as for inducing a taxonomy of subject terms by analysing their mutual usage and maximising their semantic relatedness. Then, we present a case study using the Analysis & Policy Observatory (APO) digital repository to analyse the proposed methodology and demonstrate its applicability. Further, to validate the generalisability of the proposed taxonomy induction method, we evaluate it using a gold-standard taxonomy in life sciences, Medical Subject Headings (MeSH), in comparison with the state-of-the-art taxonomy induction method, TaxoFinder. Our evaluation shows that our methodology has high potential for refining an existing set of subject terms and capturing their semantic relationships by inducing a subject term taxonomy.
... This seems surprising. However, one possible explanation for these results is brought forward by Macgregor and McCulloch (2006) that small businesses are unwilling to adopt ICT because they consider it complex. Firm size was negative and significant. ...
... This appears unexpected. One likely reason for these results is put forward by Macgregor and McCulloch (2006), who argue that small businesses are unwilling to adopt ICT because they assume it to be complex. The location of firms was significantly and negatively associated with internet integration (Table 3). ...
Article
Full-text available
The study examines the extent of ICT adoption and determines the effects of technological competencies and human capital on the adoption of ICT among small and medium enterprises (SMEs) in Southern Punjab, Pakistan. We collected cross-sectional data from 170 firms. The ordered probit model was employed. ICT adoption includes ICT intentions, ICT infrastructure, internet integration, e-sales, and e-procurement. Results showed that research collaboration was the only variable having a significant and positive effect on all ICT adoption measures. This implied that investment in research collaboration by the firms could lead to an increase in ICT adoption. The study also found that research and development were significantly related to ICT infrastructure, e-sales, and e-procurement. Results also suggest that the firms need to enhance their R&D activities, innovations, and research collaborations to increase their ICT intentions. The latter two should also be emphasized to promote Internet integration in firms. Disciplinary: Business Management (SME) and Information Technology.
... Collaborative tagging systems [28,38,57] allow sites to distribute the work of categorization across the entire community. It has now been utilized by many sites to help organize large online repositories. ...
... Tagging is to summarize the content with several compact keywords [38,57] and collaborative tagging allows individual users to create and apply tags to online items. Collaborative tagging systems are widely adopted by different sharing sites including Q&A (e.g., Quora [11]), picture sharing (e.g., Flickr [7]), web bookmarking sharing (e.g., Delicious [9]), etc. ...
Preprint
Design sharing sites provide UI designers with a platform to share their works and also an opportunity to get inspiration from others' designs. To facilitate management and search of millions of UI design images, many design sharing sites adopt collaborative tagging systems by distributing the work of categorization to the community. However, designers often do not know how to properly tag one design image with compact textual description, resulting in unclear, incomplete, and inconsistent tags for uploaded examples which impede retrieval, according to our empirical study and interview with four professional designers. Based on a deep neural network, we introduce a novel approach for encoding both the visual and textual information to recover the missing tags for existing UI examples so that they can be more easily found by text queries. We achieve 82.72% accuracy in the tag prediction. Through a simulation test of 5 queries, our system on average returns hundreds more results than the default Dribbble search, leading to better relatedness, diversity and satisfaction.
... Having many advantages, it suffers from disadvantages too. It has semantic ambiguity, synonymous issues, lack of controlled vocabulary [12][13] and use of many personal tags ('to-read', 'read', 'read in 2007') for personal use rather than public benefit 14 . ...
... Likewise, overlapping terms comprise a near to half of the LCSH descriptors (47.39 %), that means there is near about 50 per cent probability that it can be adopted by users as social tags 4 . terms (08), eight non-subject terms (08) and four personal tags (04) e.g., 'to-read', 'read', 'unread', and 'wishlist' whereas LCSH vocabulary contains thirteen subject-based terms (13), seven non-subject terms (07). Table 3 also shows that out of both datasets only seven terms (07) are common. ...
Article
Full-text available
The concept of 'social tagging' has gained popularity due to the emergence of Web 2.0 technologies. Those technologies led to the practice of users collaboratively or socially associating metadata with digital resources for their own information retrieval. Many researchers have argued that social tags can enhance the use of library collections. The present study compares social tags collected from the LibraryThing website with Library of Congress Subject Heading (LCSH) descriptors collected from the Library of Congress Online Catalogue for one thousand book titles in the field of Economics. The study also aims to determine whether social tags can be applied in the library database. The findings show that users use descriptors as tags (47.39 %) more than experts use tags as descriptors (12.77 %). Spearman's correlation suggests a 75 per cent chance that tags and descriptors can be used simultaneously in overlapping terms. The Jaccard similarity coefficient shows that users and experts use different terminologies to annotate the books. Users and experts use at least one common keyword for the majority of book titles (908). Users mostly choose title-based keywords, whereas experts mostly use subject-based terminology. The study further clarifies that social tags may be incorporated into library databases but cannot replace LCSH. The accessibility and usage of documents, especially in the field of economics, may be enhanced once social tags are incorporated into the library OPAC.
... For example, Thung et al. [12] used collaborative tags for similarity detection which are hardly found associated with software applications in a repository. Very often collaborative tags give the wrong impression about software applications as these tags are provided by a user's experience which may vary from user to user and may cover a small part of the whole application [37]. Thus, figuring out a universal solution for detecting and searching similar software applications across different programming languages for any open source software repositories is still a challenge to solve. ...
... Our evaluation with Repopal(readme) and CLAN on our repository experienced an average 2-3% error from the evaluation results shown in the original publication which is acceptable for reproducing any previous model. We did not compare with collaborative tagging [12] because it requires manual tagging and at the same time has a performance challenge [37]. Thus, finally we ended up comparing with the two closely related state of the art approaches with ours. ...
... Once they know the terms, their meanings, and the relationships among terms, they should be able to apply this vocabulary in a variety of ways and contexts. A disciplinary taxonomy, "a subject-based classification that arranges terms in a controlled vocabulary into a hierarchy" (Garshol, 2004), provides a professional structure for classifying and finding resources (Macgregor & McCulloch, 2006). "Folksonomy," a term coined by Thomas Vander Wal to describe collaborative informal categorization of materials (Vander Wal, 2007), typically provides a more informal context for tagging and finding resources. ...
... Folksonomies, on the other hand, cost almost nothing to produce; anyone can tag materials without learning a controlled vocabulary (Macgregor & McCulloch, 2006). The trade-off is that consumers of those resources incur a perpetual cost in searching and discovery. ...
Book
With the advent of Web 2.0, e-learning has the potential to become far more personal, social, and flexible. Collective Intelligence and E-Learning 2.0: Implications of Web-Based Communities and Networking provides a valuable reference to the latest advancements in the area of educational technology and e-learning. This innovative collection includes a selection of world-class chapters addressing current research, case studies, best practices, pedagogical approaches, and strategies related to e-learning resources and projects.
... Social Interactionism Theory: the theory of social interactionism influenced the development of collaborative learning (Macgregor & McCulloch, 2006). It stresses the importance of social interaction in cognitive development and holds that learning is more effective when it takes place in a social and cultural setting. ...
Article
Anti-pandemic policies, through strategies of confinement and distancing of people, led to the emergence of collaborative learning which lies in expectations of interaction and contribution. In this sense, the objective of the study was to compare the theoretical structure of eleven factors with respect to an observed four-factor model. An exploratory, transversal and psychometric work was carried out with a sample of 100 students assigned to health institutes. The results demonstrate that four factors prevail, related to experience, expectations, contributions and interactions, which explained 79% of the total variance. In relation to the state of the art, it was recommended not to reject the hypothesis that predicts significant differences between theory and empirical testing. The limits of the sample and the instrument are recognized, although the contrast of the four factors found is recommended.
... The process of cataloguing library resources involves the arrangement of metadata for information resources in a manner that will facilitate easy access and retrieval when needed (Macgregor & McCulloch, 2006). This could be attained by the use of international standards, such as Anglo American Cataloguing Rules (AACR), Resource Description and Access (RDA), Machine Readable Catalogue (MARC), subject headings schemes, classification schemes and other metadata standards used. ...
... Therefore library staff in Federal Polytechnic libraries in Southwest Nigeria possessed high information literacy skills based on the overall mean scores. This result shows that most of the library staff did not acquire information literacy skills through the training organized by their institution libraries. This finding is inconsistent with the position of Macgregor and McCulloch (2006) who reported in their finding that the goal of library training is to enable users' community to discriminate between useful and irrelevant information as well as engaging users with information management. In addition, the University of Auckland Academic Plan 2005-2007 (2004) canvassed that the polytechnic (library) aims are to provide its users with key, high-level generic skills like the capacity for lifelong critical, conceptual and reflective thinking, and attributes such as creativity and originality. ...
Article
This study assessed the information literacy and ICT skills of library staff in Federal Polytechnics in southwest Nigeria. The study adopted a survey research design with a total population of 154, spread across the six states of the Southwest geopolitical zone and five federal polytechnics. The study adopted a stratified sampling technique, from which a sample of 136 library staff was drawn. The major instrument used for data collection was a questionnaire. A total of 154 copies of the questionnaire were sent out, of which 136 were found to be valid and fit for analysis. The data were analyzed using descriptive frequency tables and means with the aid of Statistical Products for Service Solutions (SPSS). The study established, among other things, that the library staff acquired basic information literacy skills by attending workshops/seminars, through trial and error, through the help of their colleagues, and through guidance from library staff. It also established that library staff possessed high information literacy skills, which include the ability to recognise a need for information resources, distinguish potential information resources, construct strategies for locating information, compare and evaluate information obtained from different sources, locate and access information resources, organise, apply and communicate information, and synthesize and build on existing information. The study concluded that library staff possess information literacy and ICT skills and that they can recognize a need for information resources, distinguish potential information and deploy the resources appropriately. In addition, the research shows that Federal Polytechnics in the Southwest have information resources. The study recommended that the Federal government should continuously fund the federal polytechnic libraries to enhance productivity, and that polytechnic management should provide more computers with Internet access in their polytechnics. The bandwidth for Internet connectivity should also be increased to improve the speed of accessing information from the Internet.
... Sixth, due to having uncontrolled terms, folksonomies face lack of recall and precision while retrieving information. 2,7,[17][18] Further, social tags suffer from personal tags such as 'wishlist', 'kindle', 'own', 'hardcover' etc. which are not useful for retrieving resources. Users only assign personal tags just to meet their own needs. ...
Article
Full-text available
The study compares user-generated social tags with expert-generated LCSH descriptors for one thousand sociology books. The objective is to examine whether social tags can be used to enhance the accessibility of library collections. The study found that the two datasets do not follow the same vocabulary. However, Spearman's rank correlation (0.89) indicates a good association between the terms common to both vocabularies. The Jaccard similarity coefficient (J = 0.13, 0.14, 0.17, 0.15 and 0.16) in different word clusters shows that the most frequent social tags and the most frequent LCSH descriptors used by users and experts are different. Book-by-book comparison also reveals that 555 books (55.5%) have 50 to 100 percent matching between the two vocabularies. Among the top thirty most frequent terms, the LCSH descriptor vocabulary contains more subject terms (24) than the social tag vocabulary (12). Comparison of social tags with MARC subfields ($a, $x, $y, $z, $v) reveals that users use terms from more or less all the subfields as tags, but either do not use chronological terms ($y) as tags or use different terms from experts for chronological information. Further, comparison with each book title reveals that social tags used alongside LCSH descriptors can enhance title-based search in libraries. Moreover, the study suggests that the use of social tags will not only enhance the accessibility of library resources in sociology but also complement controlled vocabularies by supplying a variety of terms beyond those chosen by experts.
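As an illustration of how a Spearman rank correlation over shared terms might be computed, the sketch below ranks the frequency of each common term in the two vocabularies (tied values receive their mean rank) and then takes the Pearson correlation of the ranks. The term frequencies are hypothetical, and the study's own procedure for selecting and counting common terms is not described in the abstract, so this is a sketch under those assumptions rather than a reproduction of the authors' analysis.

```python
from __future__ import annotations

from math import sqrt


def average_ranks(values: list[float]) -> list[float]:
    """Rank values from 1 (smallest) upward; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x: list[float], y: list[float]) -> float:
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long sequences with at least 2 items")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("rank correlation is undefined for constant input")
    return cov / (sx * sy)


# Hypothetical frequencies of terms shared by both vocabularies:
# term -> (times used as a social tag, times used as an LCSH descriptor).
common_terms = {
    "sociology": (120, 300),
    "social classes": (15, 40),
    "race relations": (40, 35),
    "poverty": (25, 60),
    "education": (60, 90),
}
tag_freq = [float(t) for t, _ in common_terms.values()]
lcsh_freq = [float(d) for _, d in common_terms.values()]

print(round(spearman(tag_freq, lcsh_freq), 2))
```

A high value here says only that terms used often by taggers also tend to be used often by cataloguers among the shared terms; it says nothing about how many terms are shared, which is what the Jaccard coefficient measures.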
... Despite many advantages, social tags still suffer from quality issues. Being generated from uncontrolled vocabulary social tags suffer from homonyms, synonyms, lack of controlled vocabularies and semantic ambiguities [13][14] . Besides social tags contain many personal tags ('read', 'to-read', 'unread', 'read in 2015' etc.) that neither define any subject nor help in information retrieval. ...
Article
Full-text available
Social tagging allows users to assign free-form keywords as tags to digital resources in a decentralised way. Many information scientists have found similarities between user-generated social tags and librarian-generated subject headings. The present study was conducted to identify the similarities and dissimilarities between user-generated social tags and librarian-generated subject terms for 1000 books in the domain of History, and to determine whether social tags can replace controlled vocabularies. The study finds that only a small portion of terms overlap (3.54 % of social tags and 56.07 % of LCSH terms), and Spearman's rank correlation indicates a good association between the overlapping terms. The Jaccard similarity coefficient shows that users and librarians use different terminologies (J = 0.13, 0.12 and 0.11). Title-by-title comparison shows that users and librarians share at least one common term for nearly 90 per cent (88.4 %) of book titles. Users use fewer subject and non-subject terms and add some personal tags for their own benefit, whereas librarians use only subject and non-subject terms. Matching against each book title shows that users mostly use title-based keywords (696) to describe resources, whereas librarians use very few (113). The study concludes that social tags can enhance the experience of library users. If exploited properly, they can complement controlled vocabularies but cannot replace the controlled vocabularies that libraries have long used. Overall, the study demonstrates the viability of adopting social tags in library databases through which resources in the field of history are accessed.
... When restricted vocabularies and categories for record annotation are not available or practical, users are often allowed to assign uncontrolled keywords to a record, a process referred to as "Tagging". This allows concepts to be derived freely in the course of work, as repeated and cross-contextual usage, often among multiple users, leads to a naturally-arising set of useful, domain-specific concepts [17][18][19][20]. This freedom implies that "tags" have not been directly controlled, that is, picked from a fixed list known ahead of time. ...
Article
Recovering a system’s underlying structure from its historical records (also called structure mining) is essential to making valid inferences about that system’s behavior. For example, making reliable predictions about system failures based on maintenance work order data requires determining how concepts described within the work order are related. Obtaining such structural information is challenging, requiring system understanding, synthesis, and representation design. This is often either too difficult or too time consuming to produce. Consequently, a common approach to quickly elicit tacit structural knowledge from experts is to gather uncontrolled keywords as record labels—i.e., “tags.” One can then map those tags to concepts within the structure and quantitatively infer relationships between them. Existing models of tag similarity tend to either depend on correlation strength (e.g., overall co-occurrence frequencies) or on conditional strength (e.g., tag sequence probabilities). A key difficulty in applying either model is understanding under what conditions one is better than the other for overall structure recovery. In this paper, we investigate the core assumptions and implications of these two classes of similarity measures on structure recovery tasks. Then, using lessons from this characterization, we borrow from recent psychology literature on semantic fluency tasks to construct a tag similarity measure that emulates how humans recall tags from memory. We show through empirical testing that this method combines strengths of both common modeling paradigms. We also demonstrate its potential as a preprocessor for structure mining tasks via a case study in semi-supervised learning on real excavator maintenance work orders.
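The two classes of tag similarity contrasted in the abstract above can be sketched on toy data. This is only an illustration under stated assumptions: the work-order tags are invented, the function names are hypothetical, raw pair counts stand in for "correlation strength", and a first-order transition probability stands in for "conditional strength". The paper's own models are not reproduced here.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(records):
    """Correlation-style measure: how often two tags appear in the same record."""
    counts = Counter()
    for tags in records:
        # Each unordered pair is counted once per record, order ignored.
        for a, b in combinations(sorted(set(tags)), 2):
            counts[(a, b)] += 1
    return counts


def transition_probability(records, first, second):
    """Conditional-style measure: P(next tag == second | current tag == first)."""
    total = 0
    hits = 0
    for tags in records:
        # Walk adjacent pairs, so the order in which tags were entered matters.
        for current, following in zip(tags, tags[1:]):
            if current == first:
                total += 1
                hits += following == second
    return hits / total if total else 0.0


# Hypothetical maintenance work orders, tags in the order they were entered.
records = [
    ["hydraulic", "leak", "hose"],
    ["hydraulic", "hose", "replace"],
    ["engine", "leak"],
]

print(cooccurrence_counts(records)[("hose", "hydraulic")])
print(transition_probability(records, "hydraulic", "leak"))
```

The first measure is symmetric and ignores ordering, while the second is directional, which is one way to see why the two can disagree about which tags are related.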
... Unfortunately, most of the time, developers or users of a software application do not associate it with tags available in the repository. Moreover, collaborative tagging mechanisms face a number of challenges [45], which further hinders the acceptance of this model. Further, this work does not fit with our work as it works with external information rather than focusing on the information available within the software application itself. ...
Article
While there are novel approaches for detecting and categorizing similar software applications, previous research focused on detecting similarity in applications written in the same programming language and not on detecting similarity in applications written in different programming languages. Cross-language software similarity detection is inherently more challenging due to variations in language, application structures, support libraries used, and naming conventions. In this paper we propose a novel model, CroLSim, to detect similar software applications across different programming languages. We define a semantic relationship among cross-language libraries and API methods (both local and third party) using functional descriptions and a word-vector learning model. Our experiments show that CroLSim can successfully detect cross-language similar software applications, which outperforms all existing approaches (mean average precision rate of 0.65, confidence rate of 3.6, and 75% highly rated successful queries). Furthermore, we applied CroLSim to a source code repository to see whether our model can recommend cross-language source code fragments if queried directly with source code. From our experiments we found that CroLSim can recommend cross-language functional similar source code when source code is directly used as a query (average precision=0.28, recall=0.85, and F-Measure=0.40).
... Many studies have shown that structured knowledge can emerge from social tagging systems [55][56][57]. Hierarchical clustering techniques have been applied to induce taxonomies from collaborative tagging systems [58,59], and from software project hosting site Freecode [60]. Schmitz analyzes association rule mining results to infer a subsumption based model from Flickr tags [61]. ...
Thesis
With software penetrating into all kinds of traditional or emerging industries, there is a great demand for software development. Given the limited number of developers, one important way to meet such urgent needs is to significantly improve developers' productivity. As the most popular Q&A site, Stack Overflow has accumulated abundant software development knowledge. Effectively leveraging such big data can help developers reuse the experience there to further improve their working efficiency. However, the rich yet unstructured large-scale data in Stack Overflow is difficult to search, for two reasons. First, there are too many questions and answers within the site, and there may be a lexical gap (the same meaning can be written in different languages) between the query and content in Stack Overflow. In addition, the decay of information quality, such as misspelling, inconsistency, and abuse of domain-specific abbreviations, degrades search performance. Second, some higher-order knowledge in Stack Overflow is implicit for searching and needs to be distilled from existing raw data. In this thesis, I present methods for supporting developers' information search over Stack Overflow. To overcome the lexical gap and information decay, I also develop an edit recommendation tool to ensure the post quality of Stack Overflow so that posts can be more easily found by a query. But such explicit information search still requires developers to read, understand and summarize, which is time-consuming. So I propose to shift from document (information) search to entity (knowledge) search by mining the implicit knowledge from tags in Stack Overflow to render direct answers to developers instead of asking them to read lengthy documents. I first build a basic software-specific knowledge graph including thousands of software-engineering terms and their associations by association rule mining and community detection. Then, I enrich the knowledge graph with more fine-grained relationships, i.e., analogy among different third-party libraries. Finally, I combine both semantic and lexical information to infer morphological forms of software terms so that the knowledge graph is more robust for knowledge search.
... Indexing is difficult to do well and even professional indexers are inconsistent when applying keywords from controlled vocabularies (Funk and Reid, 1983). More recently the phenomenon of collaborative tagging has emerged on the Web (Macgregor and McCulloch, 2006). Creators and users can assign descriptors, commonly referred to as tags, to describe a variety of electronic resources such as web pages, bibliographic references, and photo collections. ...
... Recently, Web 2.0 applications have presented "tagging" as an incremental, user-centered strategy for organizing personal information in a public space [Rashmi, 2005; Shirky, 2005; Vander Wal, 2007; MacGregor and McCulloch, 2006]. The tagging classification model grew out of the need to address the long obvious inadequacy of the traditional filesystem models to manage the ever growing range of kinds of information pieces (as seen in personal information management systems [Malone, 1983; Dourish et al., 1999; Whittaker, 1996]) in a cohesive, intuitive and user-centered fashion. ...
Article
This paper describes an innovative tagging model incorporated into a web 2.0 social and personal information management application. Our work utilizes web 2.0 tagging concepts in a new way in an effort to provide better support for users’ needs for contextualization and personalization of their information spaces for both personal…
... Collaborative tagging systems build on the long-time practice of using keywords to describe and label content, standard in most libraries and repositories [14,26]. However, collaborative tagging systems democratize the role of categorization. ...
Article
Q&A websites compile useful knowledge through user-generated questions and responses. Many Q&As use collaborative tagging systems to improve search and discovery while distributing the work of categorizing and organization throughout the community. Although early work on collaborative tagging questioned whether consistent categorization schemes could emerge from large groups with little to no coordination, empirical studies have found surprising coherence among users' tags. We build on this research by testing whether coherence emerges in tag usage on Q&As, a more challenging context, focusing in particular on mismatches in the specificity of tags (basic level disagreement). We found that some users shifted toward more specific tag usage over time slightly increasing conflict, but that moderators were instrumental in helping to resolve some of this conflict. This study highlights the importance of learning and moderation in the development of coherence in collaborative tagging systems.
... And much like some today think that collaborative tagging can bring about emancipating effects in relation to the controlled vocabulary 'oppressions' of formal knowledge organisation executed by a select few professionals (c.f. e.g., Macgregor & McCulloch, 2006), hypertext was also thought to make possible a sort of constantly changing and evolving canon of consensus (Moulthrop, 1991). In hindsight it is safe to say that hypertext did not bring about this kind of paradigm shift that some had hoped, but the example provides a testament to the ways in which interpretations of technology permeate both common understandings of 'problems of representation' and hopes and perceptions of solutions to the same. ...
Book
The aim of this study is to analyse mutual enactments of critical literacies and social visualisation tools as information resources. The central concept of critical literacies as used here extends and redefines prior critical literacy definitions to denote the pluralistic situated enactments of meaning through which study participants identify, question and transform bias, restrictions and power related aspects of access, control and use in relation to the tools. The study is based on two critical ethnography inspired case studies involving observations, interviews, and contextual inquiry and located in professional settings. Case 1 is centred on how a geographic information system (MapInfo) is used for analysing and preventing traffic accidents. Case 2 is centred on how a dynamic time series animating chart (Trendalyzer) is used for analysing and spreading knowledge about the world’s development. The results demonstrate co-existing critical literacies described in terms of three main directionalities as reactive, proactive, and adaptive, of which the adaptive varieties seem thus far largely overlooked. On the basis of these findings, it is suggested that dominant cognitivist and positivist narratives of visualisations should be replaced with more nuanced alternatives that emphasise the potentials of visualisation tools as evocative and non-blackboxed information resources; i.e., as encouraging new questions and allowing alternative analyses, rather than constructing them as enunciative tools providing true answers. As theoretical contributions, the dissertation argues for a conceptualisation of visualisation tools as representational artefacts and a species of documents actuating information organisation related problems of representation. 
It also presents a new theoretical construct for the analysis and understanding of the mutual shaping of critical literacies and information resources that includes both cultural practices and actor interests through a combination of sociocultural theories on tools and sociotechnical theories on inscriptions.
... Control of lexical anomalies. CVs control lexical anomalies by minimizing any superfluous vocabulary or grammatical variations that could potentially create noise in the users' results set (Chamis, 1991;Garshol, 2004), e.g., removing leading articles, prepositions, conjunctions, etc., or ensuring consistency (Macgregor & McCulloch, 2006). ...
Article
Controlled Vocabularies for DDI 3: Enhancing Machine-Actionability
... Even though it is more effective than a Boolean model, one of the drawbacks of these models is that it is impossible to estimate the probabilities used to assess the relevance of results if large training corpora are not available. An alternative to the use of controlled vocabularies is collaborative indexing, whose main characteristic is the use of a free vocabulary containing the everyday words of the language (Macgregor & McCulloch, 2006 ...
Thesis
The main objective of this thesis is to show that lexical information drawn from a language dictionary, such as the Trésor de la langue française informatisé (TLFi), can improve image indexing and retrieval processes. The problem with using such a resource is that it is not sufficiently formalised to be exploited directly in such an application domain. To solve this problem, we first propose an approach for automatically constructing semantic hierarchies from the TLFi. After defining a quantitative (measurable) and comparable characteristic of the nouns appearing in lexicographic definitions, through a weighting formula that selects the noun of maximum weight as a good hypernym candidate for a given TLFi lexeme, we propose an algorithm for automatically constructing semantic hierarchies for the lexemes of the TLFi's headwords. Once our approach has been validated through manual evaluations, we then show that the semantic hierarchies obtained from the TLFi can be used to enrich a manually built thesaurus and to index images automatically from their associated textual descriptions. We also demonstrate that exploiting such a resource in image retrieval improves search precision by structuring the results according to the domains to which the concepts of the search query may refer. Implementing a prototype allowed us to evaluate and validate the proposed approaches.
... Individual user's tagging of items shows great potential in the organization of knowledge within and across information systems [28]. However, the usefulness of such annotations is conditioned by the development of a shared terminology that leads to a meaningful description of resources [38]. ...
Conference Paper
In online social learning environments, tagging has demonstrated its potential to facilitate search, to improve recommendations and to foster reflection and learning. Studies have shown that as a prerequisite for learning, shared understanding needs to be established in the group. We hypothesise that this can be fostered through tag recommendation strategies that contribute to semantic stabilization. In this study, we investigate the application of two tag recommenders that are inspired by models of human memory: (i) the base-level learning equation BLL and (ii) Minerva. BLL models the frequency and recency of tag use while Minerva is based on frequency of tag use and semantic context. We test the impact of both tag recommenders on semantic stabilization in an online study with 51 students completing a group-based inquiry learning project in school. We find that displaying tags from other group members contributes significantly to semantic stabilization in the group, as compared to a strategy where tags from the students' individual vocabularies are used. Testing for the accuracy of the different recommenders revealed that algorithms using frequency counts such as BLL performed better when individual tags were recommended. When group tags were recommended, the Minerva algorithm performed better. We conclude that tag recommenders, exposing learners to each other's tag choices by simulating search processes on learners' semantic memory structures, show potential to support semantic stabilization and thus, inquiry-based learning in groups.
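The abstract above states that BLL models the frequency and recency of tag use. A minimal sketch of that idea follows, assuming the standard ACT-R form of the base-level learning equation, B = ln(sum over past uses of age^(-d)), with the conventional decay d = 0.5. The study's own implementation and parameter choices are not given in the abstract, and the function names, tags and timestamps below are hypothetical.

```python
import math


def base_level_activation(use_times, now, decay=0.5):
    """Base-level activation of a tag: ln of the sum of (now - t)^(-decay) over past uses."""
    # Only uses strictly in the past contribute; each is weighted by its age.
    ages = [now - t for t in use_times if t < now]
    if not ages:
        return float("-inf")  # a tag never used before has no activation
    return math.log(sum(age ** -decay for age in ages))


def rank_tags(tag_history, now, decay=0.5):
    """Order tags from highest to lowest activation, as a recommender might."""
    scores = {
        tag: base_level_activation(times, now, decay)
        for tag, times in tag_history.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical usage timestamps (arbitrary time units) for one learner.
history = {
    "climate": [1, 5, 9],  # used often, including recently
    "energy": [8, 9],      # used recently but less often
    "water": [1],          # used once, long ago
}

print(rank_tags(history, now=10))
```

Tags that were used both often and recently rank first, which matches the description of BLL as combining frequency and recency.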
... When users annotate resources without drawing on a controlled vocabulary, it is not assured that they will reach a common understanding of terminology to describe resources or resource attributes. Such a common understanding is an essential criterion for the useful application of tagging systems as a means to organize content (Macgregor and McCulloch, 2006). The implicit agreement of users on a vocabulary that is stable over time is called semantic stability (Wagner et al., 2014). ...
Article
In recent years, various recommendation algorithms have been proposed to support learners in technology-enhanced learning environments. Such algorithms have proven to be quite effective in big-data learning settings (massive open online courses), yet successful applications in other informal and formal learning settings are rare. Common challenges include data sparsity, the lack of sufficiently flexible learner and domain models, and the difficulty of including pedagogical goals into recommendation strategies. Computational models of human cognition and learning are, in principle, well positioned to help meet these challenges, yet the effectiveness of cognitive models in educational recommender systems remains poorly understood to this date. This thesis contributes to this strand of research by investigating i) two cognitive learner models (CbKST and SUSTAIN) for resource recommendations that qualify for sparse user data by following theory-driven top down approaches, and ii) two tag recommendation strategies based on models of human cognition (BLL and MINERVA2) that support the creation of learning content meta-data. The results of four online and offline experiments in different learning contexts indicate that a recommendation approach based on the CbKST, a well-founded structural model of knowledge representation, can improve the users' perceived learning experience in formal learning settings. In informal settings, SUSTAIN, a human category learning model, is shown to succeed in representing dynamic, interest based learning interactions and to improve Collaborative Filtering for resource recommendations. The investigation of the two proposed tag recommender strategies underlined their ability to generate accurate suggestions (BLL) and in collaborative settings, their potential to promote the development of shared vocabulary (MINERVA2). 
This thesis shows that the application of computational models of human cognition holds promise for the design of recommender mechanisms and, at the same time, for gaining a deeper understanding of interaction dynamics in virtual learning systems.
Conference Paper
In this review article, our main goal is to understand how networked learning is used for professional development. Networked learning can be defined as a form of learning where information and communication technology (ICT) is used to promote connections between learners and their peers, learners and tutors, and learners and learning resources. Such networks play an important role in the professional development of employees in different sectors, from high tech industries to traditional businesses, and in both formal teaching and educational programs and informal learning activities. In this review, we explore how networked learning contexts, domains, and levels of scale are practiced and reported in the academic literature. Finally, we investigate support technologies that have been used to facilitate networked learning for professional development.
Article
We seek to guide the design, development, and adoption of Renewable Assignments by testing ways learners can contribute to Open Educational Resources (OER). We design, test, and iterate four assignment structures to this end. Testing was completed in an upper-division undergraduate endocrinology course, taught as emergency remote instruction due to COVID-19. Using mixed methods (surveys, focus groups, and iterations), we assessed assignment structures and created design guidance for renewable assignments and open pedagogy. We find that in a remote course, these assignments were effective in advancing learning goals. Both students and teachers favored their inclusion in the course. Analysis revealed six design principles to maximize the effectiveness of renewable assignments and courses and to empower teachers and learners to contribute to open knowledge. These principles also provide insight into praxis related to theories of open pedagogy, scaffolding, peer interaction, and active learning.
Article
Purpose This paper aims to identify and understand changing research themes within this field and apply a novel technique for text mining. Design/methodology/approach Statistical text mining methods are applied as an approach to bibliographic analysis to nearly 30 years of papers published in Library Review and Global Knowledge, Memory and Communication to identify key research themes and analyse how they have evolved over this period. Findings Key stable research themes include students, literacy, learning, research, while emerging research themes include social media, networking and knowledge sharing through information and communication technology. Originality/value A novel approach to bibliometric analysis is applied to a large collection of texts published in the library field.
Chapter
Linked Data (LD) emerged as an innovation in libraries over a decade ago. It refers to a set of best practices for publishing and linking structured data using existing Semantic Web technologies. Knowledge organisation in academic libraries can use the advantages of LD technologies to increase availability of library resources on the world wide web. Existing methods of descriptive cataloguing are based on describing metadata and constructing unique authorized access points as text strings. However, this strings-based approach works well in the closed environment of a traditional library catalogue and not in an open environment where data are shared and linked. This chapter investigates the introduction of LD in the organization of knowledge in academic libraries, as literature shows that students prefer to search the internet for their information needs. Secondary literature was reviewed and analysed. Findings indicated that libraries that adopted LD increased the visibility of their products on the internet.
Chapter
Digital libraries build on classifying contents by capturing their semantics and (optionally) aligning the description with an underlying categorization scheme. This process is usually based on human intervention, either by the content creator or a curator. As such, this procedure is highly time-consuming and - thus - expensive. In order to support the human in data curation, we introduce an annotation tagging system called “AnnoTag”. AnnoTag aims at providing concise content annotations by employing entity-level analytics in order to derive semantic descriptions in the form of tags. In particular, we are generating “Semantic LOD Tags” (linked open data) that allow an interlinking of the derived tags with the LOD cloud. Based on a qualitative evaluation on Web news articles we prove the viability of our approach and the high-quality of the automatically extracted information.
Chapter
Digital curation requires substantial human expertise in order to achieve and maintain document collections of high quality. This necessitates usually expert knowledge of a librarian or curator in order to interpret the content and categorize it accordingly. This process is at the same time expensive and time-consuming. With the advent of knowledge bases and the plenitude of information contained within them new opportunities emerge at the horizon. In particular, entity-level analytics allows to semantically enrich contents via linked open data (LOD). To this end, we assess in this paper the approach of concise content annotation as a means of supporting the process of digital curation. In particular, we compare various entity-level annotation methods and highlight the importance of concise semantic tagging based on qualitative as well as quantitative evaluations.
Article
The cataloguing of web resources is compared to the cataloguing of library resources. One method of systematising the content of web resources is social tagging. For several years researchers have been considering whether this method is useful for libraries and how its usability can be increased and its drawbacks eliminated, so that it becomes an equal tool among library information retrieval systems. The author presents the state of research on this problem and proposals for transforming folksonomies into controlled tagging systems and, comparing the tag system with the uniterm system, puts forward the thesis that folksonomies tend to develop towards faceted thesauri.
Article
Design sharing sites provide UI designers with a platform to share their works and also an opportunity to get inspiration from others' designs. To facilitate management and search of millions of UI design images, many design sharing sites adopt collaborative tagging systems by distributing the work of categorization to the community. However, designers often do not know how to properly tag one design image with compact textual description, resulting in unclear, incomplete, and inconsistent tags for uploaded examples which impede retrieval, according to our empirical study and interview with four professional designers. Based on a deep neural network, we introduce a novel approach for encoding both the visual and textual information to recover the missing tags for existing UI examples so that they can be more easily found by text queries. We achieve 82.72% accuracy in the tag prediction. Through a simulation test of 5 queries, our system on average returns hundreds more results than the default Dribbble search, leading to better relatedness, diversity and satisfaction.
Chapter
User support services face complex problems in the efficient and satisfactory delivery of services to users. Knowledge management (KM) principles can be effectively implemented within this organizational context to support efficient and effective problem solving to improve service delivery to the users. A KM system with good information retrieval capabilities is critical in empowering front line employees to utilize organizational knowledge repositories for better service delivery. The purpose of this chapter is to present different aspects of the major information retrieval techniques that can be used in a user services environment and to propose new models to enhance retrieval capabilities of KM systems. The authors discuss the basic elements that make up an information retrieval system including metadata, controlled, and uncontrolled vocabularies. The authors also propose three experimental search interfaces for enhancing information retrieval capabilities of a KM system. The first uses a thesaurus for enhanced retrieval features through better query formulation and browsing of search results; the second uses the tag cloud concept to present thesaurus terms; and the third combines the structure of a controlled vocabulary with the flexibility of a folksonomy and tag cloud, thus incorporating the beneficial aspects of both uncontrolled and controlled vocabularies to support retrieval within a heterogeneous corporate environment.
Thesis
The tag system of a knowledge organisation system centralises and provides the tags that can be used to classify, share and search for knowledge on the web for personal or organisational use. Although previous studies have sought to improve visual tag systems by using icons, this approach raises problems of recognition, memorisation and disorientation. Our research is devoted to finding a new approach to improving the representation of tags, and above all of their structure, in a system where well-structured icons can improve tagging efficiency in terms of quality and speed. This iconic tag system is organised around an LVD (Langage Visuel Distinctif, a distinctive visual language), itself based on the Hypertopic model for representing multi-viewpoint topic maps developed by the Tech-CICO team. This solution is proposed mainly to improve the semiotic interpretation of an icon's meaning and to strengthen the understanding and use of the tag structure in a computerised knowledge-sharing system, in particular for managing and sharing iconic tags on a collaborative platform.
Article
Web 2.0 represents a set of emerging tools with great capacity to enrich communication, enable collaboration and foster innovation. Nevertheless, little research has so far been carried out on the applications of Web 2.0 in library websites. By examining the extent to which Web 2.0 tools are used in libraries in Iran and the Middle East, the present study seeks to investigate the relationship between the age of websites and the extent of use of these tools. To this end, 146 Middle Eastern academic library websites were selected from the "list of top world universities" published by the Cybermetrics Lab in Spain and analysed using content analysis. According to the findings, the use of Web 2.0 tools in the libraries of the Middle East and Iran is low, particularly for dissemination (10/6%) and information organisation (0%), while information-sharing tools, especially instant messaging, are more widely adopted (13/80%) among these libraries. In addition, only the libraries of two universities, Guilan University of Medical Sciences and Ferdowsi University of Mashhad (6/06%), among the 146 academic libraries in the Middle East use the information-gathering indicator and tools. The results showed that, in the academic library websites of Middle Eastern countries, there is no significant relationship between website age and the availability of Web 2.0 tools.
Article
The Simplified Knowledge Organization System (SKOS) is a Semantic Web framework, based on the Resource Description Framework (RDF) for thesauri, classification schemes, and simple ontologies. It allows for machine-actionable description of the structure of these knowledge organization systems (KOS) and provides an excellent tool for addressing interoperability and vocabulary control problems inherent to the rapidly expanding information environment of the Web. This paper discusses the foundations of the SKOS framework and reviews the literature on a variety of SKOS implementations. The limitations of SKOS that have been revealed through its broad application are addressed with brief attention to the proposed extensions to the framework intended to account for them.
Article
Purpose The purpose of this paper is to provide a bibliometric review of the journal Library Review (LR) from 1989 until its relaunch in 2018 as Global Knowledge, Memory and Communication. Design/methodology/approach Bibliometric analysis of 1,084 articles published in LR in the period 1989–2017. Findings Authors from 69 different countries have published in the journal, with Scotland providing the largest single contribution in terms of authors and institutions. Articles in the journal have been extensively cited, with the citations coming not only from the core library and information science literature but also from journals in a very broad range of disciplines. Originality/value This paper extends previous work on articles published in the journal and provides the first detailed study of citations to those published articles.
Article
Purpose The purpose of this paper is to explore the potential of enriching library subject headings with folksonomy in order to enhance their visibility and usability. Design/methodology/approach The WorldCat-million data set and SocialBM0311 are preprocessed, and over 210,000 library catalog records and 124,482 non-repeating tags are adopted to construct the matrix used to observe the semantic relation between library subject headings and folksonomy. The proposed system is compared with the state-of-the-art methods and the parameters are fixed to obtain effective performance. Findings The results demonstrate that by integrating different semantic relations from library subject headings and folksonomy, the system's performance can be improved compared to the benchmark methods. The evaluation results also show that the folksonomy can enrich library subject headings through the semantic relationship. Originality/value The proposed method, simultaneous weighted matrix factorization, can integrate the semantic relation from the library subject headings and folksonomy into one semantic space. The observation of the semantic relation between library subject headings and social tags from folksonomy can help enrich the library subject headings and improve their visibility.
Article
Purpose-There has been a significant rise in the use of web 2.0 social network websites and online applications in recent years. One of the most popular is Flickr: an online image management application. This paper investigates general patterns of tag usage and determines the usefulness of the tags used within university image groups to the wider Flickr community. Design/Methodology/Approach-This study uses a webometric data collection, classification and informetric analysis. Findings-The results show that members of university image groups tend to tag in a manner that is of use to users of the system as a whole rather than merely for the tag creator. Originality/Value-This paper gives a valuable insight into the tagging practices of image groups in Flickr.
Article
This article examines tagging as knowledge organization. Tagging is a kind of indexing, a process of labelling and categorizing information made to support resource discovery for users. Social tagging generally means the practice whereby internet users generate keywords to describe, categorise or comment on digital content. The value of tagging comes when social tags within a collection are aggregated and shared through a folksonomy. This article examines definitions of tagging and folksonomy, and discusses the functions, advantages and disadvantages of tagging systems in relation to knowledge organization before discussing studies that have compared tagging and conventional library-based knowledge organization systems. Approaches to disciplining tagging practice are examined and tagger motivation discussed. Finally, the article outlines current research fronts. © 2018 International Society for Knowledge Organization. All rights reserved.
Article
Organizational research describes the inherent tension between innovation, as a means to adapt to environmental change, and continuing to do what one does well and what current customers appreciate. Managing this tension successfully leads to so-called ambidexterity. How to achieve it is still a matter of debate: proponents of structural approaches recommend a separation of exploration and exploitation, while proponents of so-called contextual ambidexterity suggest that contextual factors such as culture and process are equal if not more critical in leading the organization to ambidexterity. Based on the findings of empirical ambidexterity research, many more factors are suggested, though they are rarely researched in an ambidexterity context nor are the interdependencies between the factors and the known ambidexterity strategies described. To guide future research, this paper develops an expanded and system-focused framework for achieving ambidexterity. It is used to review and integrate findings from organizational theory and neighboring disciplines, including project management theories, knowledge management theories, human resource management theories, and open and distributed innovation theories. Managerial implications are discussed and illustrated with a case example. The resulting work provides the basis for explicitly modeling the drivers and inhibitors of exploration and exploitation and their interdependencies. In future research, this can be used to better understand and overcome conflicting objectives, devise new approaches for achieving ambidexterity, and ultimately design more successful organizations.
Chapter
Allowing users to organize content by tagging resources in web-based systems has led to the emergence of the so-called Social Web. Tags turned out to be helpful not only for giving recommendations and improving search in social tagging systems but also for enhancing information access by navigating. In this chapter, we will cover much of the pioneering research work that has studied tag-based navigation and visualization. After giving a short overview of the social tagging process and its specifics, we provide an extensive description of the typical user interfaces and visualization techniques characteristic of social tagging systems. As the efficiency of tag-based navigation depends on structuring tagging data, we also provide a review of the state-of-the-art algorithms for tag clustering. Before we conclude, we demonstrate how tag-based navigation can be modeled and discuss the intrinsic navigability of social tagging systems from various theoretic perspectives.
Article
The term 'social library' is a new concept inspired by the term Web 2.0 and social networks. The aim of this study was to introduce and define social library (library 2.0 plus social networks) and social catalog (catalog 2.0 plus social networking, also known as SOPAC). This descriptive research also examined the extent of use of SOPAC in Iranian library automation systems. This study attempts to answer the following question: "to what extent do Iranian library automation systems use the capabilities of SOPAC?" The results show that Iranian library automation systems did not utilize SOPAC in library catalogs. This study suggests applying the power of the social networks and library 2.0 in library automation systems and catalogs. Keywords: Social cataloging, Catalog, Social software, Web 2.0, Library 2.0, Social library
Article
This paper surveys classification research literature, discusses various classification theories, and shows that the focus has traditionally been on establishing a scientific foundation for classification research. This paper argues that a shift has taken place, and suggests that contemporary classification research focus on contextual information as the guide for the design and construction of classification schemes.
Article
This report analyzes the methodologies used in establishing interoperability among knowledge organization systems (KOS) such as controlled vocabularies and classification schemes that present the organized interpretation of knowledge structures. The development and trends of KOS are discussed with reference to the online era and the Internet era. Selected current projects and activities addressing KOS interoperability issues are reviewed in terms of the languages and structures involved. The methodological analysis encompasses both conventional and new methods that have proven to be widely accepted, including derivation/modeling, translation/adaptation, satellite and leaf node linking, direct mapping, co-occurrence mapping, switching, linking through a temporary union list, and linking through a thesaurus server protocol. Methods used in link storage and management, as well as common issues regarding mapping and methodological options, are also presented. It is concluded that interoperability of KOS is an unavoidable issue and process in today's networked environment. There have been and will be many multilingual products and services, with many involving various structured systems. Results from recent efforts are encouraging.
Article
To be faced with a document collection and not to be able to find the information you know exists somewhere within it is a problem as old as the existence of document collections. Information architecture is the discipline dealing with the modern version of this problem: how to organize web sites so that users can actually find what they are looking for. Information architects have so far applied known and well-tried tools from library science to solve this problem, and now topic maps are sailing up as another potential tool for information architects. This raises the question of how topic maps compare with the traditional solutions, and that is the question this paper attempts to address. The paper argues that topic maps go beyond the traditional solutions in the sense that they provide a framework within which they can be represented as they are, but also extended in ways which significantly improve information retrieval.
Article
Erik Duval, Katholieke Universiteit Leuven, Belgium (Erik.Duval@cs.kuleuven.ac.be); Wayne Hodgins, Strategic Futurist, Autodesk (wayne.hodgins@autodesk.com); Stuart Sutton, Associate Professor, The Information School, University of Washington (sasutton@u.washington.edu); Stuart L. Weibel, Executive Director, Dublin Core Metadata Initiative (Weibel@oclc.org)
Article
Collaborative tagging describes the process by which many users add metadata in the form of keywords to shared content. Recently, collaborative tagging has grown in popularity on the web, on sites that allow users to tag bookmarks, photographs and other content. In this paper we analyze the structure of collaborative tagging systems as well as their dynamical aspects. Specifically, we discovered regularities in user activity, tag frequencies, kinds of tags used, bursts of popularity in bookmarking and a remarkable stability in the relative proportions of tags within a given URL. We also present a dynamical model of collaborative tagging that predicts these stable patterns and relates them to imitation and shared knowledge.
Article
This paper examines user-generated metadata as implemented and applied in two web services designed to share and organize digital media to better understand grassroots classification. Metadata - data about data - allows systems to collocate related information, and helps users find relevant information. The creation of metadata has generally been approached in two ways: professional creation and author creation. In libraries and other organizations, creating metadata, primarily in the form of catalog records, has traditionally been the domain of dedicated professionals working with complex, detailed rule sets and vocabularies. The primary problem with this approach is scalability and its impracticality for the vast amounts of content being produced and used, especially on the World Wide Web. The apparatus and tools built around professional cataloging systems are generally too complicated for anyone without specialized training and knowledge. A second approach is for metadata to be created by authors. The movement towards creator-described documents was heralded by SGML, the WWW, and the Dublin Core Metadata Initiative. There are problems with this approach as well - often due to inadequate or inaccurate description, or outright deception. This paper examines a third approach: user-created metadata, where users of the documents and media create metadata for their own individual use that is also shared throughout a community.
Article
An issue currently at the forefront of digital library research is the prevalence of disparate terminologies and the associated limitations imposed on user searching. It is thought that semantic interoperability is achievable by improving the compatibility between terminologies and classification schemes, enabling users to search multiple resources simultaneously and improve retrieval effectiveness through the use of associated terms drawn from several schemes. This column considers the terminology issue before outlining various proposed methods of tackling it, with a particular focus on terminology mapping.
Article
Purpose There is a lack of efficiency when dealing with information and searching for the right content. Aims to present a procedural model which in essence is a generalized approach to terminology management, with which to build and maintain glossaries and taxonomies. Design/methodology/approach In addition to an extensive literature review, analysis of three action research cases with several corporate partners is presented. The first case focuses on the introduction of a glossary for a Swiss insurance company. The second illustrates the results from setting up a corporate taxonomy at an international professional services firm. The third case combines glossary and taxonomy for document classification and retrieval. Findings Glossary and taxonomy are suitable for solving a wide range of terminological defects. Usage and maintenance processes play a central role in the management of terms and should be well defined. Only a well‐suited trade‐off between centralized and decentralized terminology management will be sustainable. Research limitations/implications Other means besides clearly defined processes have to be defined to clearly eliminate certain issues. Furthermore, there is the question of whether the implementation of terminology management could benefit certain types of companies in certain industry branches more than others. Practical implications Concrete actions that have to be taken into consideration when introducing glossary and taxonomy systems. Originality/value Proposes a procedural model for the introduction of glossary and taxonomy as well as the cultivation of a corporate terminology.
Article
From the Publisher: Databases and public access catalogs are being used extensively by the public and the academic and business communities as major sources of information. Most users want to access these databases directly to locate the information they need. Increasingly, users are demanding user-friendly databases that will assist them in finding conceptual information effectively. The lack of compatibility or standardization among many different indexing vocabularies and thesauri makes it difficult to find related information in information retrieval systems containing many different online databases. This book provides a thought-provoking new perspective on the role of vocabulary control in providing access to the conceptual information found in online databases and catalogs.
Article
Brant Cruz is a vice president and team general manager at Chadwick Martin Bailey Inc., a Boston market strategy firm dedicated to creating competitive advantage through advanced, custom market research deliverables. He can be reached at bcruz@chadwickmartinbailey.com. Smart companies listen carefully and adapt to their marketplace. What separates companies that survive from ones in overdrive? One answer is a corporate taxonomy. This new twist on the practice of strategic segmentation helps a diverse company keep its market segments in an overall, strategic perspective, permitting it to adapt more easily to new opportunities. It affords the big picture that allows companies to seize new opportunities. Classic market segmentation remains useful in defining and targeting separate parts of a larger market. When it arrived in the 1950s, it was a leap over Henry Ford's promise to sell customers cars of any color as long as the color was black. It divides markets into identifiable groups, or segments, that have different goals or needs and may respond differently to promotions, advertising and other marketing tools. Aided by recent advances in technology and marketing theory, this basic custom marketing approach has been updated many times and now allows companies to reach potential buyers with highly customized "niche" offerings. The downside is that market segmentation can put customers in separate named buckets ("Guilty Moms", "…urs and SUVs", "Have-it-all Singles"), where, through inertia, they may remain. That can fragment a customer base into disconnected parts. Yet the basic idea is so valuable that, in some cases, product-level segmentation has become entrenched and crept up the ladder to become a permanent corporate strategy.
Article
Does human intellectual indexing have a continuing role to play in the face of increasingly sophisticated automatic indexing techniques? In this two-part essay, a computer scientist and long-time TREC participant (Pérez-Carballo) and a practitioner and teacher of human cataloging and indexing (Anderson) pursue this question by reviewing the opinions and research of leading experts on both sides of this divide. We conclude that human analysis should be used on a much more selective basis, and we offer suggestions on how these two types of indexing might be allocated to best advantage. Part I of the essay critiques the comparative research, then explores the nature of human analysis of messages or texts and efforts to formulate rules to make human practice more rigorous and predictable. We find that research comparing human vs automatic approaches has done little to change strongly held beliefs, in large part because many associated variables have not been isolated or controlled. Part II focuses on current methods in automatic indexing, its gradual adoption by major indexing and abstracting services, and ways for allocating human and machine approaches. Overall, we conclude that both approaches to indexing have been found to be effective by researchers and searchers, each with particular advantages and disadvantages. However, automatic indexing has the over-arching advantage of decreasing cost, as human indexing becomes ever more expensive.
Article
Subjectivity is discussed in the context of information processing, and its properties are considered in relation to Popper’s three Worlds model of information. The uncertainties that subjectivity creates are seen as central to some problematic issues of information handling, including classification and retrieval. An appreciation of problems relating to subjectivity also has relevance in several subject areas of interest to information science, including understanding, relevance and significance, knowledge management, and creativity.
Article
This paper reports on the automatic metadata generation applications (AMeGA) project's metadata expert survey. Automatic metadata generation research is reviewed and the study's methods, key findings and conclusions are presented. Participants anticipate greater accuracy with automatic techniques for technical metadata (e.g., ID, language, and format metadata) compared to metadata requiring intellectual discretion (e.g., subject and description metadata). Support for implementing automatic techniques paralleled anticipated accuracy results. Metadata experts are in favour of using automatic techniques, although they are generally not in favour of eliminating human evaluation or production for the more intellectually demanding metadata. Results are incorporated into Version 1.0 of the Recommended Functionalities for automatic metadata generation applications (Appendix A).
Article
Thesis (Ph. D.)--University of California, Los Angeles, 1994. Vita. Includes bibliographical references (leaves 226-232).
Article
For many years metadata has been recognised as a significant component of the digital information environment. Substantial work has gone into creating complex metadata schemes for describing digital content. Yet increasingly Web search engines, and Google in particular, are the primary means of discovering and selecting digital resources, although they make little use of metadata. This article considers how digital libraries can gain more value from their metadata by adapting it for Google users, while still following well-established principles and standards for cataloguing and digital preservation.
Essential Classification
  • V. Broughton
“Why tagging is expensive”, Silkworm Blog
  • I Davis
Information Intelligence: Content Classification and the Enterprise Taxonomy Practice
  • Delphi Group
Metadata extraction and harvesting: a comparison of two automatic metadata generation applications
  • J. Greenberg
“Clay Shirky's viewpoints are overrated”, Peterme.com: links, thoughts, and essays from Peter Merholz
  • P Merholz
HILT: High Level Thesaurus Project – Final Report to RSLP & JISC
  • D Nicholson
  • S Neill
  • S Currier
  • L Will
  • A Gilchrist
  • R Russell
  • M. Day
Ontology is overrated: categories, links and tags
  • C Shirky
“Explaining and showing broad and narrow folksonomies”, vanderwal.net, available at: www.vanderwal.net/random/entrysel
  • T Vander Wal
Pre-Print: Accepted for, and to appear in, Library Review, Vol. 55 No. 5, pp. XXX
The Structure of Collaborative Tagging Systems, Information Dynamics Lab: HP Labs, Palo Alto, USA, available at: http://www.hpl.hp.com/research/idl/papers/tags/tags.pdf (accessed 20 February 2006)
  • S. A. Golder
  • B. A. Huberman
“Folksonomies are a forced move: a response to Liz”, Many2Many: A group Weblog of social software
  • C Shirky
“A cognitive analysis of tagging”, Rashmi Sinha's weblog, available at: www
  • R Sinha
Semi-structured meta-data has a posse: a response to Gene Smith
  • C Shirky
“Order out of chaos”, Wired Magazine
  • B Sterling