Figure 1 - uploaded by Yannis Tzitzikas
A faceted taxonomy for indexing hotel Web pages  

Source publication
Article
A materialized faceted taxonomy is an information source where the objects of interest are indexed according to a faceted taxonomy. This paper shows how, from a materialized faceted taxonomy, we can mine an expression of the Compound Term Composition Algebra that specifies exactly those compound terms (conjunctions of terms) that have non-empty in...

Contexts in source publication

Context 1
... that we want to build a Catalog of hotel Web pages and suppose that we want to provide access to these pages according to the Location of the hotels, the Sports that are possible in these hotels, and the Facilities they offer. To do so, we can design a faceted taxonomy, i.e. a set of taxonomies, each describing the domain from a different aspect, or facet, like the one shown in Figure 1. Now each object (here, a Web page) can be indexed using a compound term, i.e., a set of terms from the different facets. ...
Context 2
... as the number of associations of this kind can be very big, we employ CTCA for encoding and representing them compactly. [Table 1: The valid and invalid compound terms of the example of Figure 1] This paper generalizes and extends the work first sketched in [24]. In particular, in [24] we introduced the problem of expression mining and described some basic and unoptimized algorithms for solving it. ...
Context 3
... compound term over $T$ is any subset of $T$. For example, the following sets of terms are compound terms over the taxonomy Sports of Figure 1: $s_1 = \{\mathit{SeaSki}, \mathit{Windsurfing}\}$, $s_2 = \{\mathit{SeaSports}, \mathit{WinterSports}\}$, $s_3 = \{\mathit{Sports}\}$, and $s_4 = \emptyset$. We denote by $P(T)$ the set of all compound terms over $T$ (i.e. the powerset of $T$). ...
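Since the set of all compound terms over T is simply the powerset of T, it can be enumerated directly. The sketch below assumes a terminology made up of only the five Sports terms named in this excerpt (the taxonomy in Figure 1 may contain more), and `compound_terms` is a hypothetical helper name.

```python
from itertools import chain, combinations


def compound_terms(terminology):
    """Return P(T): every subset of the terminology, as frozensets."""
    terms = sorted(terminology)
    subsets = chain.from_iterable(
        combinations(terms, size) for size in range(len(terms) + 1)
    )
    return {frozenset(subset) for subset in subsets}


# Assumed terminology: only the Sports terms named in the excerpt.
T = {"Sports", "SeaSports", "WinterSports", "SeaSki", "Windsurfing"}
P_T = compound_terms(T)

# The four example compound terms from the excerpt are all members of P(T).
s1 = frozenset({"SeaSki", "Windsurfing"})
s2 = frozenset({"SeaSports", "WinterSports"})
s3 = frozenset({"Sports"})
s4 = frozenset()
print(all(s in P_T for s in (s1, s2, s3, s4)))
```

With five terms the powerset already has 2^5 = 32 members, so the number of candidate compound terms doubles with every term added.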
Context 4
... particular, compound terms can be derived from a faceted taxonomy. For example, the set $S = \{\{\mathit{Greece}\}, \{\mathit{Sports}\}, \{\mathit{SeaSports}\}, \{\mathit{Greece}, \mathit{Sports}\}, \{\mathit{Greece}, \mathit{SeaSports}\}, \emptyset\}$ is a compound terminology over the terminology $T$ of the faceted taxonomy shown in Figure 1. The set $S$ together with the compound ordering of $T$ (restricted to $S$) is a compound taxonomy over $T$. ...
Context 5
... the computation of $V^+$ and $V^-$ in $\mathit{FindShortestExpressionO}(M)$ does not add any extra cost with respect to $\mathit{FindShortestExpression}(M)$, as the cost of computing $V^+$ and $V^-$ is smaller than the cost of $\mathit{SpecifyParams}(\oplus_P(T_1, ..., T_k), M)$ and $\mathit{SpecifyParams}(\ominus_N(T_1,$ .. ...
Context 6
... 4 Let $e$ and $e'$ be two space-minimal and equivalent (i.e. $e \equiv e'$) expressions. If $e$ has the form $e = \ominus_N(e_1, ..., e_n)$, for $n \geq 3$, and $e' = \ominus_{N_1}(\ominus_{N_2}(e_1, ..., e_{n-1}), e_n)$, or $e' = \ominus_{N_1}(\ominus_{N_2}(e_1, ..., e_{l-1}), \ominus_{N_3}(e_l, ..., e_n))$, where $2 < l < n$ ...

Citations

... Faceted search has emerged as a foundation for interactive information browsing and retrieval and has become increasingly prevalent in online information access systems, particularly for e-commerce and site search [7, 70, 71, 72, 74]. Especially significant is combining browsing and searching in more flexible ways to support non-professional end-users in finding information. ...
Article
The success of the use of ontology-based systems depends on efficient and user-friendly methods of formulating queries against the ontology. We propose a method to query a class of ontologies, called facet ontologies (fac-ontologies), using a faceted human-oriented approach. A fac-ontology has two important features: (a) a hierarchical view of it can be defined as a nested facet over this ontology and the view can be used as a faceted interface to create queries and to explore the ontology; (b) the ontology can be converted into an ontological database, the ABox of which is stored in a database, and the faceted queries are evaluated against this database. We show that the proposed faceted interface makes it possible to formulate queries that are semantically equivalent to $\mathcal{SROIQ}^{Fac}$, a limited version of the $\mathcal{SROIQ}$ description logic. The TBox of a fac-ontology is divided into a set of rules defining intensional predicates and a set of constraint rules to be satisfied by the database. We identify a class of so-called reflexive weak cycles in a set of constraint rules and propose a method to deal with them in the chase procedure. The considerations are illustrated with solutions implemented in the DAFO system (data access based on faceted queries over ontologies).
... These navigational trees can be used for indexing (for avoiding errors) and browsing. Additionally, if we have a materialized faceted taxonomy M (i.e., a corpus of objects indexed through a faceted taxonomy) then specific mining algorithms (such as those in [13]) can be used for expressing the extensionally valid compound terms of M in the form of an algebraic expression. Obviously, such mined algebraic expressions enable the user to take advantage of the aforementioned interaction scheme, without having to resort to the (possibly numerous) instances of M. Furthermore, algebraic expressions describing the valid compound terms of a faceted taxonomy can be exploited in other tasks, such as retrieval optimization [15], configuration management [1], consistency control [14], and compression [12]. ...
... The complexities of the compound term validity algorithms of CTCA and IFCA coincide in the case that F is a simple faceted taxonomy (i.e., an interrelated faceted taxonomy with < F = ∅). Issues for further research include: (i) generalizing the supported framework such that the relation ≤ between the terms of F is allowed to include nontrivial cycles, (ii) devising an algorithm for deciding whether an IFCA expression e is well-formed, (iii) devising mining algorithms (similar to those for CTCA [13]) that, given a materialized interrelated faceted taxonomy M, derive well-formed IFCA expressions defining the extensionally valid compound terms of M. Finally, we plan to implement our proposed IFCA framework. ...
Conference Paper
In previous work, we proposed an algebra whose operators allow specifying the valid compound terms of a faceted taxonomy in a flexible manner (by combining positive and negative statements). In this paper, we treat the same problem but in a more general setting, where the facet taxonomies are not independent but are (possibly) interrelated through narrower/broader relationships between their terms. The proposed algebra, called Interrelated Facet Composition Algebra (IFCA), is more powerful, as the valid compound terms of a faceted taxonomy can be derived through a smaller set of declared valid and/or invalid compound terms. An optimized (w.r.t. the naive approach) algorithm that checks compound term validity, according to a well-formed IFCA expression, and its worst-case time complexity are provided.
... This section introduces a formal model aiming at capturing all key notions appearing in [11], [14], and [3]. Table 1 introduces basic notions and notations, like terms, terminologies, taxonomies, faceted taxonomies, interpretations and materialized faceted taxonomies (for details refer to [14, 13]). An example of a materialized faceted taxonomy, i.e. a faceted taxonomy accompanied by a set of object indexes, is shown in Figure 1 ...
Conference Paper
Faceted and dynamic taxonomies are increasingly used nowadays in a plethora of applications. For developing user interfaces grounded on this interaction paradigm, it is advantageous to have a framework that enables the manipulation of the underlying information structure and provides the basic functionalities required. This paper introduces a formal model that captures faceted materialized taxonomies and the associated interaction-related notions. Subsequently, it discusses the current design (and implementation) of a general purpose framework grounded on this formal model. Finally, the paper reports some preliminary experimental and empirical results from using this framework and relational DBMSs.
... Solving this problem would be very useful during the process of valid compound term specification, i.e. it can enhance the robustness and usability of systems that are based on CTCA, like FASTAXON (Tzitzikas et al. 2004b). In addition, as a CTCA expression can also be used for exchanging compactly the compound terms that are extensionally valid according to a materialized faceted taxonomy (using the mining algorithms presented in (Tzitzikas and Analyti 2006)), this automation could be exploited in order to avoid reapplying these (computationally expensive) mining algorithms after an update of the faceted taxonomy. Moreover, as shown in (Tzitzikas 2006), CTCA can be used for compressing a Symbolic Data Table (Diday 2002). ...
... The answer is negative. First, note that previous work (Tzitzikas and Analyti 2006) has proved that if A is a subset of P(T) such that Br(A) = A, then there is always an expression e such that S e = A. Moreover, we have shown that this is true for every possible parse tree of e (i.e. for every possible order of operations, operands and parentheses). ...
Article
A faceted taxonomy is a set of taxonomies each describing the application domain from a different (preferably orthogonal) point of view. CTCA is an algebra that allows specifying the set of meaningful compound terms (meaningful conjunctions of terms) over a faceted taxonomy in a flexible and efficient manner. However, taxonomy updates may turn a CTCA expression e not well-formed and may turn the compound terms specified by e to no longer reflect the domain knowledge originally expressed in e. This paper shows how we can revise e after a taxonomy update and reach an expression e′ that is both well-formed and whose semantics (compound terms defined) is as close as possible to the semantics of the original expression e before the update. Various cases are analyzed and the revising algorithms are given. The proposed technique can enhance the robustness and usability of systems that are based on CTCA and allows optimizing several other tasks where CTCA can be used (including mining and compressing).
... CTCA is the only well-founded and flexible solution to this problem. Table 1 recalls in brief the basic notions around taxonomies, faceted taxonomies, and materialized faceted taxonomies (for more refer to [14,13]). ...
... Assuming a materialized faceted taxonomy M , the problem of automatically deriving an expression e (or the shortest expression e) that specifies all extensionally valid compound terms of M , V (M ), is elaborated in [13]. This problem is called expression mining and is illustrated in Figure 2. ...
Conference Paper
Faceted indexing and searching are being increasingly studied in the literature and used for real-life applications, e.g., for publishing heterogeneous museum collections on the Web. In this paper, we discuss in brief several aspects of managing (faceted) taxonomy-based information sources. Specifically, we discuss (i) the semantic description of faceted taxonomies, based on the compound term composition algebra (CTCA), (ii) the revision of CTCA expressions, as faceted taxonomies evolve, (iii) the dynamic generation of navigational trees (and other applications of CTCA), and (iv) the integration and personalization of taxonomy-based sources.
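The citation excerpts above refer to V(M), the extensionally valid compound terms of a materialized faceted taxonomy M. A minimal Python sketch of one way to enumerate them follows, assuming that a compound term counts as extensionally valid when at least one object is indexed under every one of its terms, or under narrower terms. The `PARENT` links, the `INDEX` contents and the function names are hypothetical toy data, and this brute-force enumeration is not the mining algorithm of the cited papers.

```python
from itertools import combinations

# Assumed toy data: child -> parent links and an object index (object -> terms).
PARENT = {"SeaSports": "Sports", "SeaSki": "SeaSports", "Greece": "Location"}
INDEX = {
    "page1": {"Greece", "SeaSki"},
    "page2": {"Greece"},
}


def broader_closure(term):
    """Return `term` together with all of its broader terms."""
    closure = {term}
    while term in PARENT:
        term = PARENT[term]
        closure.add(term)
    return closure


def valid_compound_terms(index):
    """Collect every compound term that at least one indexed object satisfies."""
    valid = set()
    for terms in index.values():
        # An object indexed under a term is assumed to count under its broader terms too.
        expanded = set().union(*(broader_closure(t) for t in terms))
        ordered = sorted(expanded)
        for size in range(len(ordered) + 1):
            valid.update(frozenset(c) for c in combinations(ordered, size))
    return valid


V = valid_compound_terms(INDEX)
print(frozenset({"Greece", "SeaSports"}) in V)
```

The enumeration grows exponentially with the number of terms attached to an object, which is in line with the excerpts' motivation for encoding the valid compound terms compactly as an algebraic expression.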
... We believe this is because of the current popularity of these systems. Other research (e.g., [21, 22]) takes meaning in the restricted zone as the seed in order to guide users to richer annotation. In the ontology world, we mention DOGMA-MESS [23], an ontology engineering methodology for communities. ...
Conference Paper
In this paper we give a brief overview of different metadata mechanisms (like ontologies and folksonomies) and how they relate to each other. We identify major strengths and weaknesses of these mechanisms. We claim that these mechanisms can be classified from restricted (e.g., ontology) to free (e.g., free text tagging). In our view, these mechanisms should not be used in isolation, but rather as complementary solutions, in a continuous process wherein the strong points of one increase the semantic depth of the other. We give an overview of early active research already going on in this direction and propose that methodologies to support this process be developed. We demonstrate a possible approach, in which we mix tagging, taxonomy and ontology. Keywords: tagging, folksonomy, community informatics, faceted classification, ontology, Semantic Web
... Jacob, Loehrlein, Lee & Yu, 2004). Research in the area of automated classification and taxonomy includes FASTAXON, an application designed to utilize compositional term algebra in knowledge mining and the creation and updating of faceted taxonomies as well as providing encoding and browsing support for domain knowledge (Tzitzikas, Launonen, Hakkarainen, Korhonen, Leppänen, Simpanen, Törnroos, Uusitalo, Vänskä, 2004; Tzitzikas, Analyti, 2006). 2.4 Review of similar studies. A brief survey of studies that inform the current research follows. These research studies provided guidelines for the development of website component categories, analyzed the relative importance of various website components upon website usability or functionality, or looked for evidence of FAST in website construction and design. ...
... These navigational trees can be used for indexing (for avoiding errors) and do not present the problem of missing terms or missing relationships that characterizes single taxonomies. Additionally, given a materialized faceted taxonomy M (i.e., a corpus of objects indexed through a faceted taxonomy), specific mining algorithms (such as those in [118]) can be used for expressing the extensionally valid compound terms of M in the form of an algebraic expression. Such mined algebraic expressions enable the user to take advantage of the aforementioned interaction scheme, without having to resort to the (possibly numerous) instances of M. Furthermore, algebraic expressions describing the valid compound terms of a faceted taxonomy can be exploited in other tasks, such as retrieval optimization, configuration management, consistency control, and compression. ...
... Solving this problem would be very useful during the process of valid compound term specification, i.e. it can enhance the robustness and usability of systems that are based on CTCA, like FASTAXON [26]. In addition, as a CTCA expression can also be used for exchanging compactly the compound terms that are extensionally valid according to a materialized faceted taxonomy (using the mining algorithms presented in [22] [21]), this automation can be exploited in order to avoid reapplying these (computationally expensive) mining algorithms after an update of the faceted taxonomy. Moreover, as shown in [20] [19], CTCA can be used for compressing a Symbolic Data Table [4]. ...
... The initial motivation for CTCA was to provide a well-founded method that is flexible, economical (in terms of required input), and computationally efficient. One system based on CTCA has already been built [26], while other applications of CTCA are described in [20] [22] [21]. The semantics of CTCA differ from that of Description Logics (DL) [5] mainly because each operation of CTCA makes either a positive or a negative closed world assumption at its range, as shown in detail in [24]. ...
... We will call this problem expression revision after taxonomy update. [Figure 2] Solving this problem would be very useful during the process of valid compound term specification, i.e. it can enhance the robustness and usability of systems that are based on CTCA, like FASTAXON [26]. In addition, as a CTCA expression can also be used for exchanging compactly the compound terms that are extensionally valid according to a materialized faceted taxonomy (using the mining algorithms presented in [22, 21]), this automation can be exploited in order to avoid reapplying these (computationally expensive) mining algorithms after an update of the faceted taxonomy. Moreover, as shown in [20, 19], CTCA can be used for compressing a Symbolic Data Table [4]. ...
Article
A faceted taxonomy is a forest of taxonomies each describing the application domain from a different (preferably orthogonal) point of view. CTCA is an algebra that allows specifying the set of meaningful compound terms (meaningful conjunctions of terms) over a faceted taxonomy in a flexible and efficient manner. However, taxonomy updates may turn a CTCA expression e not well-formed and may turn the compound terms specified by e to no longer reflect the domain knowledge originally expressed in e. This paper shows how we can revise e after a taxonomy update and reach an expression e′ that is both well-formed and whose semantics (compound terms defined) is as close as possible to the semantics of the original expression e before the update. Various cases are analyzed and the revising algorithms are given. The proposed technique can enhance the robustness and usability of systems that are based on CTCA and allows optimizing several other tasks where CTCA can be used (including mining and compressing).