Figure 2 - uploaded by Robin Aly
4: The positive and negative examples are training data. Assuming a good SVM model, the positive examples will be more densely distributed in the positive range of the confidence score o. The posterior probability follows a sigmoid function. The figure is similar to the one in Platt (2000).
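The sigmoid relationship described in the caption can be sketched as follows. This is a minimal illustration, not the method used in the source publication: it assumes the two-parameter form P(positive | o) = 1 / (1 + exp(a·o + b)) and fits the parameters with plain gradient descent, whereas Platt's procedure uses regularized targets and a more careful optimizer. The names `platt_posterior` and `fit_platt`, the learning rate, and the toy data are all hypothetical.

```python
import math

def platt_posterior(o, a, b):
    """Posterior probability of the positive class for confidence score o."""
    return 1.0 / (1.0 + math.exp(a * o + b))

def fit_platt(scores, labels, lr=0.1, epochs=2000):
    """Fit a and b by gradient descent on the mean negative log-likelihood."""
    a, b = 0.0, 0.0
    n = len(scores)
    for _ in range(epochs):
        grad_a = grad_b = 0.0
        for o, y in zip(scores, labels):
            p = platt_posterior(o, a, b)
            # For this parameterization the gradient of the
            # negative log-likelihood w.r.t. (a*o + b) is (y - p).
            grad_a += (y - p) * o
            grad_b += (y - p)
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b

# Toy training data (assumed): positives are denser at positive scores.
scores = [-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0]
labels = [0, 0, 0, 1, 0, 1, 1, 1]

a, b = fit_platt(scores, labels)
for o in (-2.0, 0.0, 2.0):
    print(o, round(platt_posterior(o, a, b), 3))
```

With data like this the fitted `a` comes out negative, so the posterior rises monotonically with the confidence score, which is the shape the caption describes.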


Source publication
Article
Representing multimedia documents by means of concept labels attached to parts of these documents has great potential for improving retrieval performance. The reason is that concepts are independent of how users refer to them and of the modality in which they occur. For example, a Flower and une Fleur refer to the same concept and a singing b...

Similar publications

Conference Paper
Automatic urban sound classification is a growing area of research with applications in multimedia retrieval and urban informatics. In this paper we identify two main barriers to research in this area – the lack of a common taxonomy and the scarceness of large, real-world, annotated data. To address these issues we present a taxonomy of urban sound...

Citations

... The URR framework was originally proposed in the PhD thesis of the first author (Aly 2010). ...
Article
Concept based video retrieval often relies on imperfect and uncertain concept detectors. We propose a general ranking framework to define effective and robust ranking functions, through explicitly addressing detector uncertainty. It can cope with multiple concept-based representations per video segment and it allows the re-use of effective text retrieval functions which are defined on similar representations. The final ranking status value is a weighted combination of two components: the expected score of the possible scores, which represents the risk-neutral choice, and the scores' standard deviation, which represents the risk or opportunity that the score for the actual representation is higher. The framework consistently improves the search performance in the shot retrieval task and the segment retrieval task over several baselines in five TRECVid collections and two collections which use simulated detectors of varying performance.
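The combination of expected score and standard deviation described in this abstract can be sketched as below. This is only an illustration under stated assumptions: the possible representations of a segment are given as an explicit list of (probability, score) pairs, and the weight on the standard deviation (here called `risk_weight`) is a free parameter. The function name, the parameter name, and the example numbers are hypothetical and are not taken from the paper.

```python
import math

def ranking_status_value(outcomes, risk_weight):
    """Combine the expected score with the weighted standard deviation.

    outcomes: list of (probability, score) pairs, one per possible
              representation of the segment; probabilities sum to 1.
    risk_weight: weight on the standard deviation (0 is risk-neutral).
    """
    expected = sum(p * s for p, s in outcomes)
    variance = sum(p * (s - expected) ** 2 for p, s in outcomes)
    return expected + risk_weight * math.sqrt(variance)

# Two equally likely representations with scores 0.0 and 2.0 (assumed values).
outcomes = [(0.5, 0.0), (0.5, 2.0)]
print(ranking_status_value(outcomes, 0.0))   # risk-neutral: expected score only
print(ranking_status_value(outcomes, 0.5))   # rewards the chance of a higher score
print(ranking_status_value(outcomes, -0.5))  # penalizes uncertain scores
```

A positive weight treats the spread of scores as an opportunity, a negative weight treats it as a risk, and a weight of zero reduces the function to the expected score.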
... In this year's experiments, we set the weights λ1, λ2, λ3 equally, modeling a situation where text, concepts, and image similarity are equally important. In future participations we plan to replace this straightforward fusion scheme with a more sophisticated scheme, such as the probabilistic scheme described in [18], or using a scheme that models the uncertainty of the detected objects (for example, words in transcripts or concept occurrences) [1]. ...
Conference Paper
The AXES project participated in the interactive known-item search task (KIS) and the interactive instance search task (INS) for TRECVid 2011. We used the same system architecture and a nearly identical user interface for both the KIS and INS tasks. Both systems made use of text search on ASR, visual concept detectors, and visual similarity search. The user experiments were carried out with media professionals and media students at the Netherlands Institute for Sound and Vision, with media professionals performing the KIS task and media students participating in the INS task. This paper describes the results and findings of our experiments.
Article
Entity resolution (ER) is the problem of identifying duplicate tuples, that is, tuples that represent the same real-world entity. There are many real-life applications in which the ER problem arises. These applications range from news aggregation websites, which must identify news items that cover the same story to avoid presenting one story several times to the user, to the integration of two companies' customer databases in the case of a merger, where identifying the tuples that refer to the same customer is crucial. Due to its diverse applications, the ER problem has been formulated in different ways in the literature. The two main ER-related problem formulations are 1) identity resolution and 2) deduplication. In identity resolution, the aim is to find duplicate(s) of a given tuple in a given database, while in deduplication, the aim is to find groups of duplicate tuples in a given database and merge them in order to increase the quality of the database itself. The ER problem is, however, not limited to deterministic (ordinary) databases; it also arises in applications that deal with probabilistic databases, i.e. databases in which each tuple or attribute value is associated with a probability value that, for instance, indicates its confidence level. In this thesis, we study the ER problem in probabilistic databases. More specifically, we study the following problems: 1) identity resolution in probabilistic data, 2) identity resolution in distributed probabilistic data, 3) deduplication in probabilistic data, and 4) schema matching in a fully automated setting.
Thesis
Today, pupils at the age of 15 have spent their entire life surrounded by and interacting with diverse forms of computers. It is a routine part of their day-to-day life, and by now computer literacy is common at a very early age. Over the past five years, technology for teens has become predominantly mobile and ubiquitous within every aspect of their lives. To them, being online is a matter of course. In Germany, 88% of youth aged 12 to 19 own a smartphone and about 20% use the Internet via tablets. Meanwhile, more and more young learners bring their devices into the classroom, and pupils increasingly demand innovative and motivating learning scenarios that strongly respond to their habits of using media. With this development, a paradigm shift is slowly under way with regard to the use of mobile technology in education. By now, a large body of literature exists that reports concepts, use cases and practical studies for effectively using technology in education. Within this field, a steadily growing body of research has developed that especially examines the use of digital games as an instructional strategy. The core concern of this thesis is the design of mobile games for learning. The conditions and requirements that are vital in order to make mobile games suitable and effective for learning environments are investigated. The basis for exploration is the pattern approach as an established form of templates that provide solutions for recurrent problems. Building on this acknowledged form of exchanging and re-using knowledge, patterns for game design are used to classify the many gameplay rules and mechanisms in existence. This research draws upon pattern descriptions to analyze learning game concepts and to abstract possible relationships between gameplay patterns and learning outcomes. The linkages that surface are the starting point for a series of game design concepts, whose implementations are subsequently evaluated with regard to learning outcomes.
The findings and resulting knowledge from this research are made accessible by way of implications and recommendations for future design decisions.
Article
Computer games are more and more often used for training purposes. These virtual training games are exploited to train competences like leadership, negotiation or social skills. In such virtual training, a human trainee interacts with one or more virtual characters playing the trainee’s team members, colleagues or opponents. To learn from virtual training, it is important that the virtual characters display realistic human behavior. This can be achieved by human players who control the virtual game characters, or by intelligent software agents that generate the behavior of virtual characters automatically. Using intelligent agents instead of humans allows trainees to train independently of others, which gives them more training opportunities. A potential problem of using intelligent agents is that trainees do not always understand why the agents behave the way they do. For instance, virtual team members (played by intelligent agents) that do not follow the instructions of their leader (a human trainee) may have misunderstood the instructions, or may disobey them on purpose. After playing the scenario, the trainee does not know whether he should communicate more clearly, or give better or safer instructions. A solution is to let virtual agents explain the reasons behind their behavior. When trainees can ask their co-players to explain the motivations for their actions, they are given the opportunity to better understand played scenarios and their own performance. This thesis proposes an approach to automatically generate explanations of the behavior of virtual agents in training games. Psychological research shows that people usually explain and understand human (or human-like) behavior in terms of mental concepts like beliefs, goals and intentions. In the proposed approach, actions of virtual agents are also explained by mental concepts. To generate such explanations in an efficient way, agents are implemented in a BDI-based (Belief Desire Intention) programming language.
The behavior of BDI agents is represented by beliefs, goals, plans and intentions, and their actions are determined by a reasoning process on their mental concepts. Thus, the mental concepts that are responsible for the generation of an action can be reused to explain that action. The approach can generate different types of explanations. Empirical studies with instructors, experts and novices, respectively, showed that people generally prefer explanations that contain a combination of the belief that triggered an action and the goal that is achieved by the action. In a validation study in the domain of virtual negotiation training, subjects indicated that the agent’s explanations increased their understanding of the motivations behind its behavior. In a validation study in the domain of human-agent teamwork, subjects better understood the agent’s behavior and preferred the amount of information provided by the agent when the agent explained its behavior. Finally, the approach was extended to make agents capable of providing explanations containing predictions about the behavior of other agents. For that, the explainable agents were equipped with a theory of mind, that is, the ability to attribute mental states such as beliefs and goals to others and, based on that, make predictions about their behavior.
Article
In this thesis we investigate the possibility to integrate domain-specific knowledge into biomedical information retrieval (IR). Recent decades have shown a fast growing interest in biomedical research, reflected by an exponential growth in scientific literature. An important problem for biomedical IR is dealing with the complex and inconsistent terminology encountered in biomedical publications. Dealing with the terminology problem requires domain knowledge stored in terminological resources: controlled indexing vocabularies and thesauri. The integration of this knowledge is, however, far from trivial. The first research theme investigates heuristics for obtaining word-based representations from biomedical text for robust retrieval. We investigated the effect of choices in document preprocessing heuristics on retrieval effectiveness. Document preprocessing heuristics such as stop word removal, stemming, and breakpoint identification and normalization were shown to strongly affect retrieval performance. An effective combination of heuristics was identified to obtain a word-based representation from text for the remainder of this thesis. The second research theme deals with concept-based retrieval. We compared a word-based to a concept-based representation and determined to what extent a manual concept-based representation can be automatically obtained from text. Retrieval based on only concepts was demonstrated to be significantly less effective than word-based retrieval. This deteriorated performance could be explained by errors in the classification process, limitations of the concept vocabularies and limited exhaustiveness of the concept-based document representations. Retrieval based on a combination of word-based and automatically obtained concept-based query representations did significantly improve word-only retrieval. In the third and last research theme we propose a cross-lingual framework for monolingual biomedical IR. 
In this framework, the integration of a concept-based representation is viewed as a cross-lingual matching problem involving a word-based and concept-based representation language. This framework gives us the opportunity to adopt a large set of established crosslingual information retrieval methods and techniques for this domain. Experiments with basic term-to-term translation models demonstrate that this approach can significantly improve word-based retrieval. Directions for future work are using these concepts for communication between user and retrieval system, extending upon the translation models and extending CLIR-enhanced concept-based retrieval outside the biomedical domain. Available online from http://purl.utwente.nl/publications/72481.
Article
The history of software engineering in general and programming languages in particular is marked by the introduction of high-level engineering concepts, abstracting away from the rather low-level principles that are used by the machine on which the software is executed. Such high-level abstractions allow us to focus only on a few essential concepts at the same time by factoring out details. The abstractions by which we engineer complex software systems are more often than not inspired by metaphorical concepts by which we understand and structure the complex world around us. A well-known example is the concept of folder for archiving our files. Introducing the concepts of agent as the metaphorical counterpart of humans and multi-agent system as the metaphorical counterpart of a society, the field of agent-oriented software engineering brings the use of abstractions in programming to an even higher level. Agents are autonomous entities that are typically programmed in terms of beliefs modeling the information they have about their world, goals denoting the situations they desire to establish and plans describing how to reach their goals. Agents participating in a multi-agent system may have been engineered by different parties with differing design objectives, implying that agents may encounter and interact with agents having conflicting goals. An illustrative example is an online marketplace on which agents interact with (unknown) parties to sell and buy their goods. Because little can be assumed about the behavior the interacting agents exhibit, and nobody directly wrote the whole program encompassing the multi-agent system, it is hard to predict the emerging behavior of the system as a whole. To increase the likelihood of the design objectives of the system being met, coordination media are put in place to regulate the individual agents' behavior.
Some of these media are based on constructs that resemble physical every-day structures, such as a tube's entrance gate and traffic lights. Yet others use more abstract concepts we use for organizing our society, such as norms that should be followed and roles that agents can play. Getting back to the online marketplace example, agents play the role of seller and buyer, and are expected to abide by certain norms, e.g. paying the price agreed within a certain time. In this thesis we will focus on organization-oriented coordination media. In this thesis we show that research on individual agents progressed rather independently from research on agent organizations, leaving a gap between agent-oriented and organization-oriented programming. We identify what we consider the root causes underlying this gap and develop an organization-oriented programming language whose constructs accord better with the key concepts and characteristics associated with agents. Constructs for programming roles, norms and constructs for changing the norms at run time will be investigated in particular. To understand what our programming language can (or cannot) offer, a precise description of its meaning (semantics) is indispensable. For example, to use an obligation properly, we need to know exactly when it is fulfilled or violated, and when sanctions will be imposed. Therefore, in this thesis, we formally describe the semantics of the programming constructs in a mathematically rigorous manner.